Why can't I send UDP packets through a browser?

Introduction

In 2017, most popular web games, such as agar.io, use WebSockets over TCP for data transfer. If browsers had a built-in UDP equivalent of WebSockets, networking in these games would improve greatly.

Background

Web browsers are built on HTTP, a stateless request-response protocol that was originally designed to serve static web pages. HTTP runs on top of TCP, a lower-level protocol that guarantees reliable, in-order delivery of data sent over the Internet.

All of this worked fine for many years, but recently websites have become more interactive and no longer fit the request-response paradigm of HTTP. Modern web protocols such as WebSockets, WebRTC, HTTP/2, and QUIC were created to address this, and they have the potential to significantly improve the interactivity of the web.

Unfortunately, this new set of web standards either does not meet the needs of multiplayer games or is too complicated to implement.

This is frustrating for game developers, because they just want to be able to send and receive UDP packets through a browser.

Problem

The web is built on top of TCP, a reliable, ordered protocol. To deliver data reliably and in order despite packet loss, TCP must hold more recent data in a queue while it waits for lost packets to be resent. Otherwise, data would be delivered out of order.

This behavior is called head-of-line blocking. It creates a frustrating, almost tragicomic situation for game developers. The most recent data, which is what they actually need, is held up while old data is resent, and by the time the resent data arrives it is already out of date and useless.

Unfortunately, this behavior cannot be fixed within TCP, where all data must be received reliably and in order. For this reason, the standard solution in the game industry for the past 20 years has been to send game data over UDP.

In practice, this meant that each game built its own protocol on top of UDP, implementing whatever functionality it needed and sending most of its data unreliably and unordered. This ensured the fastest possible delivery of time-critical data, with no waiting for lost packets to be retransmitted.
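To make the cost of head-of-line blocking concrete, here is a minimal simulation sketch. The function name and the timing numbers are illustrative assumptions, not measurements: five packets are sent 50 ms apart, the second one is lost, and its retransmission only arrives at 325 ms.

```python
from itertools import accumulate


def delivery_times(arrivals, ordered):
    """Return the time at which each packet is handed to the application.

    arrivals[i] is the time (in ms) at which packet i reaches the receiver,
    after any retransmission. With ordered=True a packet is held back until
    every earlier packet has been delivered, which is what TCP does.
    """
    if not ordered:
        # UDP-style: each packet is usable the moment it arrives.
        return list(arrivals)
    # TCP-style: a packet cannot be delivered before its predecessors,
    # so its delivery time is the running maximum of the arrival times.
    return list(accumulate(arrivals, max))


# Hypothetical arrival times in milliseconds; packet 1 was lost and resent.
arrivals = [25, 325, 125, 175, 225]

print(delivery_times(arrivals, ordered=True))   # [25, 325, 325, 325, 325]
print(delivery_times(arrivals, ordered=False))  # [25, 325, 125, 175, 225]
```

With ordered delivery, packets 2, 3, and 4 arrive on time but sit in the queue until 325 ms. With unordered delivery, the game sees each of them as soon as it arrives and can simply ignore the stale packet 1.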

What about web games?

The main problem for web games today is that game developers cannot use the game industry's best practice in the browser. Instead, web games send game data over TCP, which hurts responsiveness.

This is completely unnecessary: the problem could be fixed in an instant if web games had a way to send and receive UDP packets.

What is WebSockets?

WebSockets is an extension to HTTP that upgrades an HTTP connection so that data can flow in both directions, rather than following the standard request-response pattern.

This elegantly solves the problem for websites that need to display dynamically changing content, because once a WebSocket connection is established, the server can push data to the browser without waiting for a request.

Unfortunately, because WebSockets is implemented on top of TCP, its data is still subject to head-of-line blocking.

What is QUIC?

QUIC is an experimental protocol built on top of UDP and designed as a replacement transport layer for HTTP. Currently it is supported only by Google Chrome.

The most important feature of QUIC is support for multiple data streams. A client or server can implicitly create new channels by increasing the channel id.

The concept of channels provides two big advantages:

It avoids a connection handshake each time a new request is made.
It eliminates head-of-line blocking between unrelated data streams.

Unfortunately, although head-of-line blocking is eliminated between streams, it still exists within each stream.

What is WebRTC?

WebRTC is a set of protocols that provide a peer-to-peer connection between browsers for applications such as streaming audio and video.

Notably, WebRTC also supports a data channel that can be configured in “unreliable” mode, which makes it possible to send unreliable, unordered data from the browser.

So why do browser games in 2017 still use WebSockets?

The reason is that multiplayer games are moving away from peer-to-peer networking toward the client-server model. And while WebRTC makes it easy to send unreliable, unordered data from one browser to another, it falls down when data has to be exchanged between a browser and a dedicated server.

The problem arises due to the extreme complexity of WebRTC. The reasons for this complexity are clear: WebRTC was primarily designed for peer-to-peer data exchange between browsers, therefore, in the worst case, it requires support for STUN, ICE and TURN to bypass NAT.

But from a game developer's point of view, all this complexity is dead weight, because STUN, ICE, and TURN are simply not needed to communicate with dedicated servers that have public IP addresses.

“I felt that we needed a UDP version of WebSockets. This is the only thing we dreamed of.”
Matheus Valadares, creator of agar.io

In short, game developers love simplicity, and a solution like WebSockets for UDP attracts them much more than the complexity of WebRTC.

Why not just allow sending UDP?

The last option to consider is simply to let users send and receive UDP packets directly from the browser. Of course, this is an absolutely terrible idea, and there are good reasons why it should never be allowed.

Websites could launch DDoS attacks by coordinating floods of UDP packets from browsers.
New security holes would appear, because JavaScript running in web pages could craft malicious UDP packets to probe the internals of corporate networks and report back over HTTPS.
UDP packets are not encrypted, so an attacker could easily sniff and read any data sent in them, or even modify it in transit. Allowing browsers to send unencrypted packets would be a huge step backward for web security.
UDP has no authentication, so a dedicated server reading packets sent from a browser would have to implement its own way of verifying the clients that connect to it. That is far more work than most game developers are willing to take on.

So it is absolutely clear that JavaScript should not be allowed to create UDP packets directly in the browser.

What could be the solution?

But what if we approach the problem from the other side? Instead of trying to build a bridge from the web to games, we can start with the networking techniques that games need and work toward a solution that suits the web.

My name is Glenn Fiedler, and I have been developing games for the past 15 years. For most of that time I have specialized in network programming, and I have a lot of experience working on fast-paced action games. The last game I worked on was Titanfall 2.

About a month ago I read this article on Hacker News: WebRTC: The Future of Web Games.

In it, the creator of agar.io, Matheus Valadares, said that WebRTC was too complicated for him and that he continues to use WebSockets in his games.

I wondered: surely there must be a simpler solution than WebRTC?

And if so, what would that solution look like?

In my opinion, the solution should have the following properties:

Connection-based, so that it cannot be used in DDoS attacks or to probe for security holes.
Encrypted, because in 2017 no game or application should send unencrypted packets.
Authenticated, because dedicated servers should accept connections only from clients authorized by the web backend.

I want to present my solution. I have no illusions that it will be adopted wholesale as a browser standard. I am not a web developer; I write games. But I hope it will at least help browser makers and web developers see what client-server games really need, and that it will go some way toward building bridges between games and the web.

Hopefully, as a result, multiplayer browser games will perform much better in the near future.

netcode.io

The solution I came to is netcode.io.

netcode.io is a simple network protocol that allows clients to securely connect to dedicated servers and exchange data over UDP. It is connection-oriented, encrypts and signs packets, and provides authentication support so that only authorized clients can connect to dedicated servers.

It is designed for games such as agar.io, which need to distribute players from a main website across a number of dedicated server instances. Each server has a limit on the number of players (up to 256 per server instance in the reference implementation).

The basic idea is that the web backend performs authentication. When a player wants to play, the client makes a REST call to the backend to obtain a connection token, which it then passes to the dedicated server as part of the connection handshake over UDP.

Connection tokens are short-lived and rely on a private key shared between the web backend and the dedicated server instances. The advantage of this approach is that only authorized clients can connect to dedicated servers.

netcode.io is simpler than WebRTC. It targets dedicated servers only, so ICE, STUN, and TURN are not required. By implementing encryption, signing, and authentication with libsodium, it avoids the complexity of a full DTLS implementation while providing the same level of security.

Over the past month, I created a reference implementation of netcode.io in C. It is released under the 3-clause BSD license. Over the next few months, I hope to refine this implementation, write a specification, and work with other developers to port netcode.io to other languages.

How it works

The client authenticates with the web backend using standard techniques (for example, OAuth). Once authenticated, the client makes a REST call to request to play a game. The REST call returns a base64-encoded connection token to the client over HTTPS.

The connection token consists of two parts:

The private part, which is encrypted and signed with the shared private key using an AEAD primitive from libsodium. It cannot be read, modified, or forged by the client.
The public part, which provides the information the client needs to connect: for example, the encryption keys for UDP packets and the list of server addresses to connect to, along with other information that makes up the “associated data” portion of the AEAD.

The client reads the connection token and obtains a list of N IP addresses to connect to, in order. Although N can be 1, it is best to give the client several server addresses in case the first server is already full by the time the client tries to connect.
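As a rough illustration of the two-part token described above, the sketch below models it as a small data structure with a base64 text encoding. The field names and the JSON layout are assumptions made for readability; the actual netcode.io token is a binary format that this example does not attempt to reproduce.

```python
import base64
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ConnectionToken:
    # Public part: the client reads these fields.
    expires_at: int                # Unix timestamp after which the token is rejected
    server_addresses: List[str]    # "host:port" entries, tried in order
    client_to_server_key: bytes    # encrypts packets the client sends
    server_to_client_key: bytes    # encrypts packets the server sends
    # Private part: opaque to the client, forwarded to the server as-is.
    private_data: bytes

    def encode(self) -> str:
        """Serialize to base64 text, as it might be returned by the REST call."""
        fields = {
            "expires_at": self.expires_at,
            "server_addresses": self.server_addresses,
            "client_to_server_key": self.client_to_server_key.hex(),
            "server_to_client_key": self.server_to_client_key.hex(),
            "private_data": self.private_data.hex(),
        }
        return base64.b64encode(json.dumps(fields).encode("utf-8")).decode("ascii")

    @classmethod
    def decode(cls, text: str) -> "ConnectionToken":
        """Parse the base64 text produced by encode()."""
        fields = json.loads(base64.b64decode(text))
        return cls(
            expires_at=fields["expires_at"],
            server_addresses=fields["server_addresses"],
            client_to_server_key=bytes.fromhex(fields["client_to_server_key"]),
            server_to_client_key=bytes.fromhex(fields["server_to_client_key"]),
            private_data=bytes.fromhex(fields["private_data"]),
        )
```

A client would decode the token, then walk `server_addresses` from first to last until one of the servers accepts the connection.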

While connecting to a dedicated server, the client repeatedly sends a connection request packet over UDP. This packet contains the private connection token data plus additional data for the AEAD, such as the netcode.io version information, the protocol identifier (a 64-bit number unique to each game), the expiry timestamp of the connection token, and the sequence number for the AEAD primitive.

When a dedicated server receives a connection request over UDP, it first checks the validity of the packet contents using the AEAD primitive. If any of the public data in the connection request packet has been changed, the signature check fails. This prevents clients from changing the expiry timestamp of a connection token, and it also lets the server reject expired tokens quickly.
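The sketch below shows why tampering with the public fields is detected. It is not the netcode.io implementation: it uses HMAC-SHA256 from the Python standard library as a stand-in for the libsodium AEAD, so it authenticates the data but does not encrypt the private part. The function names and field layout are assumptions.

```python
import hashlib
import hmac
import struct


def associated_data(version: bytes, protocol_id: int, expires_at: int) -> bytes:
    """Pack the public fields that travel in the clear but are still signed."""
    return struct.pack("<QQH", protocol_id, expires_at, len(version)) + version


def sign_request(shared_key, version, protocol_id, expires_at, private_data):
    """Backend side: compute a tag over the public fields and the private data."""
    message = associated_data(version, protocol_id, expires_at) + private_data
    return hmac.new(shared_key, message, hashlib.sha256).digest()


def check_request(shared_key, version, protocol_id, expires_at, private_data, tag, now):
    """Server side: return "ok", "expired", or "bad signature"."""
    # Cheap check first: an expired timestamp can be rejected immediately.
    if now >= expires_at:
        return "expired"
    # A client that pushed the timestamp into the future fails here, because
    # the tag was computed over the original timestamp.
    expected = sign_request(shared_key, version, protocol_id, expires_at, private_data)
    if not hmac.compare_digest(expected, tag):
        return "bad signature"
    return "ok"
```

For example, a request signed with an expiry of 1000 is accepted at time 500, but the same request with the expiry edited to 2000 is rejected as a bad signature.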

If the connection token is valid, it is decrypted. It contains the list of dedicated server addresses for which the token is valid. This prevents malicious clients from using a single token to connect to every available server.

The server also checks whether the connection token has already been used by looking up the token's HMAC in a short history of recently used tokens. If a match is found, the connection request is ignored. This prevents one token from being used to connect multiple clients.

In addition, the server allows only one client from a given IP address and port to be connected at any one time. Likewise, only one client with a given client id can be connected to the server at any one time. The client id is a 64-bit integer that uniquely identifies a client authenticated by the web backend.
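The replay and uniqueness checks described above can be sketched as follows. The class name, the method names, and the history size are hypothetical; the real implementation may organize these checks differently.

```python
from collections import OrderedDict


class ConnectionFilter:
    """Tracks recently used tokens and currently connected clients."""

    def __init__(self, history_size=1024):
        self.history_size = history_size
        self.used_token_macs = OrderedDict()   # token HMAC -> None, oldest first
        self.connected_addresses = set()       # (ip, port) pairs
        self.connected_client_ids = set()      # 64-bit client ids

    def accept(self, token_mac, address, client_id):
        """Return True and record the client if all checks pass."""
        if token_mac in self.used_token_macs:
            return False  # this token was already used to connect
        if address in self.connected_addresses:
            return False  # a client is already connected from this address
        if client_id in self.connected_client_ids:
            return False  # this client id is already connected
        self.used_token_macs[token_mac] = None
        if len(self.used_token_macs) > self.history_size:
            self.used_token_macs.popitem(last=False)  # forget the oldest token
        self.connected_addresses.add(address)
        self.connected_client_ids.add(client_id)
        return True

    def disconnect(self, address, client_id):
        """Free the address and client id; the token history is kept."""
        self.connected_addresses.discard(address)
        self.connected_client_ids.discard(client_id)
```

The history can stay short because connection tokens expire quickly: once a token has expired it is rejected by the timestamp check, so the server no longer needs to remember it.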

If the connection token has not expired, it decrypts successfully, the dedicated server's public IP address is in the token's list of server addresses, and all other checks pass, then the dedicated server sets up a mapping between the client's IP address and the encryption keys contained in the private data of the connection token.

From this point on, all packets exchanged between the client and the server are encrypted with these keys. If no UDP packets arrive from the address within a short period (for example, five seconds), the mapping between the address and the encryption keys expires.

Next, the server checks whether it has room for the client. Each server supports some maximum number of clients; for example, a 64-player game has 64 client slots. If the server is full, it responds with a connection request rejection packet. This lets clients find out quickly that the server is full and move on to the next server in the list.

If the server has room for the client, it does not assign a slot immediately. Instead, it stores the address and the HMAC of the client's connection token as a pending client. The server then responds with a connection challenge packet containing a challenge token. A challenge token is a block of data encrypted with a random key that is generated when the server starts.

Using a random key avoids the security problem that would otherwise arise when several servers encrypt challenge tokens with the same sequence number (servers do not coordinate with one another). In addition, the connection challenge packet is significantly smaller than the connection request packet, so the protocol cannot be used for DDoS amplification attacks.

The client receives the connection challenge packet over UDP and switches to a state in which it sends connection response packets to the server. A connection response packet simply reflects the challenge token back to the dedicated server, which proves that the client can actually receive packets at the source IP address it claims to be sending from. This rules out clients with spoofed source addresses.
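A minimal sketch of this challenge-response step is shown below. The class and method names are hypothetical, and the challenge token here is an HMAC-SHA256 value computed with a per-process random key rather than the encrypted block used by netcode.io; the point is only that a client cannot produce a valid response without having received the challenge at its claimed address.

```python
import hashlib
import hmac
import os


class ChallengeServer:
    def __init__(self):
        # Random key generated once at startup and never shared with anyone.
        self.challenge_key = os.urandom(32)
        # Pending clients, keyed by (address, token HMAC).
        self.pending = {}

    def _challenge_token(self, address, token_mac):
        message = repr(address).encode("utf-8") + token_mac
        return hmac.new(self.challenge_key, message, hashlib.sha256).digest()

    def on_connection_request(self, address, token_mac, client_id):
        """Remember the pending client and return the challenge token to send."""
        self.pending[(address, token_mac)] = client_id
        return self._challenge_token(address, token_mac)

    def on_connection_response(self, address, token_mac, challenge_token):
        """Return True if the client echoed the challenge we sent to this address."""
        if (address, token_mac) not in self.pending:
            return False
        expected = self._challenge_token(address, token_mac)
        return hmac.compare_digest(expected, challenge_token)
```

A sender that spoofs its source address never sees the challenge packet, so it cannot echo the token back, and the server never allocates a slot for it.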

When the server receives a connection response packet, it looks up the matching pending client entry and, if one exists, searches once more for a free slot for the client. If there are no free slots, it responds with a connection rejection packet, because the slot that was free when the connection request was first received has since been taken.

Otherwise, the server assigns the client to a free slot and responds with a connection keep-alive packet, which tells the client which slot it has been given. This slot is called the client index. In multiplayer games it is typically used to identify the clients connected to a server. For example, clients 0, 1, 2, and 3 in a four-player game correspond to players 1, 2, 3, and 4.

The server now considers the client connected and can send it payload packets. These packets carry game-specific data and are delivered without ordering guarantees. The only caveat is that the client must first receive a keep-alive packet in order to learn its client index and know that it is fully connected, so the server tracks, for each client slot, whether that client is confirmed.

The confirmed flag for each client starts out false and becomes true once the server receives a keep-alive packet or a payload packet from that client. Until a client is confirmed, every payload packet sent to it is preceded by a keep-alive packet. This makes it statistically very likely that the client knows its client index and is fully connected before it receives its first payload packet, which minimizes the number of round trips needed to establish the connection.
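The confirmed-flag logic can be sketched in a few lines. The names `ClientSlot`, `packets_to_send`, and `on_packet_from_client` are illustrative assumptions, and packets are represented as simple tuples rather than real wire formats.

```python
from dataclasses import dataclass


@dataclass
class ClientSlot:
    client_index: int
    confirmed: bool = False  # becomes True once the server hears from the client


def packets_to_send(slot, payload):
    """Return the packets the server sends for one payload, in sending order."""
    packets = []
    if not slot.confirmed:
        # The client may not know its index yet, so send a keep-alive first.
        packets.append(("keep_alive", slot.client_index))
    packets.append(("payload", payload))
    return packets


def on_packet_from_client(slot, kind):
    """Mark the client confirmed when a keep-alive or payload packet arrives."""
    if kind in ("keep_alive", "payload"):
        slot.confirmed = True
```

Once the client has been heard from, the extra keep-alive is dropped and only payload packets are sent.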

Once the client and server are fully connected, they can exchange UDP packets in both directions. Typically, game protocols send player input from client to server at a high rate, such as 60 times per second, and world state from server to client at a somewhat lower rate, such as 20 times per second. The most advanced AAA games, however, are raising the server update rate.

If the server or client is not sending a steady stream of packets, keep-alive packets are generated automatically so that the connection does not time out. If either side receives no packets from the other for a short period (for example, five seconds), the connection times out.
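One way to express the keep-alive and timeout rules is with two timestamps per connection. In this sketch the five-second timeout comes from the text, while the keep-alive interval and all names are assumptions.

```python
from dataclasses import dataclass

TIMEOUT_S = 5.0               # example timeout from the text
KEEP_ALIVE_INTERVAL_S = 0.1   # hypothetical: send something at least this often


@dataclass
class ConnectionTimers:
    last_sent: float       # time we last sent any packet to the peer
    last_received: float   # time we last received any packet from the peer

    def needs_keep_alive(self, now):
        """True if nothing has been sent recently, so a keep-alive is due."""
        return now - self.last_sent >= KEEP_ALIVE_INTERVAL_S

    def timed_out(self, now):
        """True if the peer has been silent for the whole timeout period."""
        return now - self.last_received >= TIMEOUT_S
```

Each side would update `last_sent` whenever it sends any packet and `last_received` whenever a valid packet arrives, then check both conditions once per update.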

If either side wants to terminate the connection explicitly, it sends a number of redundant disconnect packets, so that at least one of them is statistically very likely to get through even if some are lost. This closes the connection quickly, so the other side does not have to wait for a timeout.
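The benefit of redundancy is easy to quantify under a simplifying assumption: if each packet is lost independently with probability p, then at least one of n copies arrives with probability 1 - p^n. Real packet loss is often bursty, so this is an optimistic estimate; the function name and the numbers are illustrative.

```python
def delivery_probability(loss_rate, copies):
    """Probability that at least one of `copies` packets arrives,
    assuming each is lost independently with probability `loss_rate`."""
    return 1.0 - loss_rate ** copies


# Even with 50% packet loss, ten redundant packets almost always get through.
print(delivery_probability(0.5, 10))  # 0.9990234375
```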

Conclusion

Popular web games like agar.io transfer data via WebSockets over TCP, because WebRTC is difficult to use in a client-server architecture with dedicated servers.

One solution would be for Google to make it much easier for game developers to integrate WebRTC data channel support into their dedicated servers.

Alternatively, netcode.io offers a much simpler “WebSockets for UDP” style solution. If it were standardized and built into browsers, it could also solve the problem.

Glenn Fiedler is the founder and president of The Network Protocol Company, which provides game networking services. Before founding the company, Glenn was a lead programmer at Respawn Entertainment, where he worked on Titanfall 1 and 2.

Glenn is also the author of several popular article series on gafferongames.com about game networking and game physics. He is the creator of the open-source network libraries libyojimbo and netcode.io.

Source: https://habr.com/ru/post/322690/
