This is a translation of the fifth article in a series on the particulars of JS development (we recommend reading the other 19 parts of the series as well). Previous articles covered the main building blocks of the JavaScript ecosystem that developers of server and client code rely on. Each article lays out the basics of some aspect of JS and then gives recommendations on using it. The author notes that these principles are applied in the development of the SessionStack application. Today's users can choose from a wide range of libraries and frameworks, so any project that wants to hold its own against the competition has to squeeze everything it can out of the technologies it is built on.
This time we will discuss communication protocols and compare their features and building blocks. We will look at WebSocket and HTTP/2, touch on security, and share tips on choosing the appropriate protocol in various situations.
Introduction
Nowadays, complex web applications with rich, dynamic user interfaces are taken for granted. But the web had to come a long way to get where it is today.
In the beginning, the web was not designed to support such applications. It was conceived as a collection of HTML pages, a "web" of linked documents. Everything was built around the HTTP request/response paradigm. A client application loaded a page, and then nothing happened until the user clicked a link to go to the next page.
Around 2005, AJAX appeared, and many programmers began exploring ways to make communication between the client and the server bidirectional. Still, every HTTP exchange was initiated by the client, which required either user interaction or periodic polling of the server to load new data.
"Bidirectional" HTTP communication
Technologies that let the server proactively send data to the client have been around for quite some time. Among them are Push and Comet.
One of the most common techniques for creating the illusion that the server itself initiates the transfer is called long polling. The client opens an HTTP connection to the server, and the server keeps it open until it has a response to send. As soon as the server has data for the client, it sends it.
Here is a very simple code snippet that implements long polling:
    (function poll() {
      setTimeout(function() {
        $.ajax({
          url: 'https://api.example.com/endpoint',
          success: function(data) {
            // Process `data`, then schedule the next poll.
            poll();
          }
        });
      }, 10000);
    })();
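This construct is a self-invoking function: it runs automatically the first time and then calls itself. It sets a 10-second delay before each asynchronous Ajax call to the server, and once the server's response has been processed, the next call is scheduled. Note that the delay here comes from the client-side timer; in long polling proper, the wait comes from the server holding the request open until it has data.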
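For comparison, a loop with no client-side timer might look as follows. This is only a minimal sketch: `longPoll`, `requestOnce`, `onData`, and `keepGoing` are hypothetical names that do not come from the article or from any library, and `requestOnce` is assumed to return a Promise that settles only when the server finally answers the held-open request. Error handling and retry delays are left out.

```javascript
// Sketch of a long-polling loop. All names here are hypothetical.
function longPoll(requestOnce, onData, keepGoing) {
  // requestOnce() is assumed to resolve only when the server responds
  // to the request it has been holding open.
  return requestOnce().then(function (data) {
    onData(data);
    // Re-issue the request right away: no client-side timer is involved.
    if (keepGoing()) {
      return longPoll(requestOnce, onData, keepGoing);
    }
  });
}

// Usage with a stand-in for a real network call.
var received = [];
var counter = 0;
function fakeRequest() {
  counter += 1;
  return Promise.resolve({ update: counter });
}

var finished = longPoll(
  fakeRequest,
  function (data) { received.push(data.update); },
  function () { return received.length < 3; }
);
```

In a real client, `requestOnce` would wrap an Ajax or fetch call, and a failed request would typically be retried after a short pause.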
Other techniques used for the same purpose involve Flash, XHR multipart requests, and so-called htmlfiles.
All these techniques share the same problem: the overhead of HTTP, which makes them unsuitable for applications that need low-latency responses, such as a multiplayer browser-based shooter or any other online game with real-time action.
Introduction to WebSocket Technology
The WebSocket specification defines an API for establishing a socket connection between a web browser and a server. Simply put, it is a persistent connection over which the client and the server can send data to each other at any time.

The client establishes a connection through a process known as the WebSocket handshake. It begins with the client sending a regular HTTP request to the server. The request includes an Upgrade header, which tells the server that the client wants to establish a WebSocket connection.
Let's see how such a connection is opened on the client side:
    // Create a new WebSocket connection.
    var socket = new WebSocket('ws://websocket.example.com');
The URL of a WebSocket connection uses the ws scheme. There is also a wss scheme for secure WebSocket connections, the equivalent of HTTPS.
This code only starts the process of opening a WebSocket connection to the websocket.example.com server.
Here is a simplified example of the initial request headers.
    GET ws://websocket.example.com/ HTTP/1.1
    Origin: http://example.com
    Connection: Upgrade
    Host: websocket.example.com
    Upgrade: websocket
If the server supports the WebSocket protocol, it agrees to switch to it and says so in the Upgrade header of its response. Let's look at how this can be implemented in Node.js:
    // We use the WebSocket-Node library:
    // https://github.com/theturtle32/WebSocket-Node
    var WebSocketServer = require('websocket').server;
    var http = require('http');

    var server = http.createServer(function(request, response) {
      // Process the HTTP request.
    });
    server.listen(1337, function() { });

    // Create the server.
    var wsServer = new WebSocketServer({
      httpServer: server
    });

    // WebSocket server.
    wsServer.on('request', function(request) {
      var connection = request.accept(null, request.origin);

      // This is the most important callback: all messages
      // from users are handled here.
      connection.on('message', function(message) {
        // Process the WebSocket message.
      });

      connection.on('close', function(connection) {
        // The connection was closed.
      });
    });
Once the connection is established, the server's response confirms the switch to the WebSocket protocol:
    HTTP/1.1 101 Switching Protocols
    Date: Wed, 25 Oct 2017 10:07:34 GMT
    Connection: Upgrade
    Upgrade: WebSocket
After that, the open event fires on the client's WebSocket instance:
    var socket = new WebSocket('ws://websocket.example.com');

    // Show a message once the WebSocket connection is open.
    socket.onopen = function(event) {
      console.log('WebSocket is connected.');
    };
Now that the handshake is complete, the initial HTTP connection is replaced by a WebSocket connection that uses the same underlying TCP/IP connection. From this point on, both the client and the server can start sending data.
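The simplified headers shown earlier leave out two headers that a real handshake carries under RFC 6455: Sec-WebSocket-Key in the request and Sec-WebSocket-Accept in the response. The sketch below shows how a server could derive the second from the first using Node's built-in crypto module. The helper name `computeAcceptValue` is made up for this example, and libraries such as WebSocket-Node perform this step internally.

```javascript
var crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
var WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// computeAcceptValue is a hypothetical helper name used for this sketch.
// It derives the Sec-WebSocket-Accept response header from the
// Sec-WebSocket-Key request header sent by the client:
// base64(SHA-1(key + GUID)).
function computeAcceptValue(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Sample key taken from RFC 6455, section 1.3.
var accept = computeAcceptValue('dGhlIHNhbXBsZSBub25jZQ==');
console.log(accept);
```

The client checks this value to make sure the server actually understood the WebSocket handshake rather than treating it as an ordinary HTTP request.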
With WebSocket you can transfer as much data as you like without the overhead of traditional HTTP requests. Data travels over a WebSocket connection as messages, each consisting of one or more frames that carry the data being sent (the payload). So that the message can be reassembled correctly on arrival, each frame is prefixed with a short header describing the payload (2 to 14 bytes under RFC 6455). Frame-based messaging reduces the amount of non-payload data on the wire, which significantly reduces latency.
Note that the client is notified of a new message only after all of its frames have been received and the original payload has been reconstructed.
Different WebSocket protocol URLs
We mentioned above that WebSocket introduces a new URL scheme. In fact, there are two: ws:// and wss://.
URLs are built according to certain rules. One peculiarity of WebSocket URLs is that they do not support anchors (#sample_anchor).
Otherwise, the same rules apply to WebSocket URLs as to HTTP URLs. With ws the connection is unencrypted and the default port is 80; with wss, TLS encryption is required and the default port is 443.
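These rules can be checked with the standard URL class, which is available both in browsers and in Node.js. The following is a rough sketch; `describeWebSocketUrl` is a hypothetical helper name, not part of the WebSocket API.

```javascript
// describeWebSocketUrl is a hypothetical helper for this sketch; it relies
// only on the standard URL class.
function describeWebSocketUrl(input) {
  var url = new URL(input);
  if (url.protocol !== 'ws:' && url.protocol !== 'wss:') {
    throw new Error('Not a WebSocket URL: ' + input);
  }
  if (url.hash !== '') {
    throw new Error('WebSocket URLs must not contain an anchor: ' + input);
  }
  var secure = url.protocol === 'wss:';
  return {
    secure: secure,
    host: url.hostname,
    // URL.port is an empty string when the port is omitted or is the default.
    port: url.port !== '' ? Number(url.port) : (secure ? 443 : 80),
    path: url.pathname + url.search
  };
}

var plain = describeWebSocketUrl('ws://websocket.example.com/chat?room=1');
var secured = describeWebSocketUrl('wss://websocket.example.com');
console.log(plain.port, secured.port);
```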
The framing protocol
Let's take a closer look at the WebSocket framing protocol. Here is what the relevant RFC says about the frame structure:
As of the standardized RFC version of WebSocket, every frame starts with a small header. Its layout is fairly intricate, though. Here are its components:
- fin (1 bit): indicates whether this frame is the final frame of the message. Most of the time a message fits into a single frame, and then this bit is set. Experiments have shown that Firefox creates a second frame once a message exceeds 32 KB.
- rsv1, rsv2, rsv3 (1 bit each): these fields must be 0 unless an extension has been negotiated that defines the meaning of nonzero values. If a nonzero value is received and no negotiated extension defines its meaning, the receiver must fail the connection.
- opcode (4 bits): says what the frame represents. The following values are currently in use:
  - 0x00: this frame continues the payload of the previous frame.
  - 0x01: this frame contains text data.
  - 0x02: this frame contains binary data.
  - 0x08: this frame terminates the connection.
  - 0x09: this frame is a ping.
  - 0x0a: this frame is a pong.
  As you can see, plenty of values remain unused; they are reserved for future use.
- mask (1 bit): indicates whether the frame is masked. As things stand, every message from the client to the server must be masked, and the specification requires terminating the connection otherwise.
- payload_len (7 bits): the length of the payload. A value of 0-125 is the payload length itself. 126 means the next 2 bytes hold the length. 127 means the next 8 bytes hold the length. So the payload length is encoded in 7, 16, or 64 bits.
- masking-key (32 bits): all frames sent from the client to the server are masked with a 32-bit value contained in the frame.
- payload: the actual data, most likely masked. Its length is given by payload_len.
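The header fields listed above can be decoded with a handful of bit operations. Below is a minimal sketch for Node.js. `parseFrame` is a hypothetical helper that assumes the Buffer holds exactly one complete frame, and the sample bytes are the "Hello" examples from RFC 6455, section 5.7. A production parser would also have to deal with partial reads and validate the reserved bits.

```javascript
// parseFrame is a hypothetical helper for this sketch: it decodes a single,
// complete WebSocket frame held in a Node.js Buffer.
function parseFrame(buffer) {
  var first = buffer[0];
  var second = buffer[1];
  var frame = {
    fin: (first & 0x80) !== 0,
    rsv: (first & 0x70) >> 4,
    opcode: first & 0x0f,
    masked: (second & 0x80) !== 0
  };
  var length = second & 0x7f;
  var offset = 2;
  if (length === 126) {
    // The next 2 bytes hold the real length.
    length = buffer.readUInt16BE(offset);
    offset += 2;
  } else if (length === 127) {
    // The next 8 bytes hold the real length. Lengths above
    // Number.MAX_SAFE_INTEGER are not handled in this sketch.
    length = Number(buffer.readBigUInt64BE(offset));
    offset += 8;
  }
  var payload;
  if (frame.masked) {
    var key = buffer.subarray(offset, offset + 4);
    offset += 4;
    payload = Buffer.alloc(length);
    for (var i = 0; i < length; i++) {
      // Unmasking XORs each byte with the key byte at index i mod 4.
      payload[i] = buffer[offset + i] ^ key[i % 4];
    }
  } else {
    payload = buffer.subarray(offset, offset + length);
  }
  frame.payload = payload;
  return frame;
}

// The text "Hello": first unmasked (server to client),
// then masked (client to server).
var unmasked = parseFrame(Buffer.from([0x81, 0x05, 0x48, 0x65, 0x6c, 0x6c, 0x6f]));
var masked = parseFrame(Buffer.from(
  [0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58]));
console.log(unmasked.payload.toString('utf8'), masked.payload.toString('utf8'));
```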
Why is the WebSocket protocol frame-based rather than stream-based? If you know the answer, feel free to share it in the comments. There is also an interesting discussion of this topic on HackerNews.
Data in frames
As mentioned above, data can be split across multiple frames. The first frame of a transfer specifies, in its opcode field, the type of data being sent. This is necessary because JavaScript had practically no support for binary data when work on the WebSocket specification began. Opcode 0x01 indicates UTF-8 encoded text, and opcode 0x02 is used for binary data. JSON is often sent over WebSocket, and in that case the opcode is usually set to text. Binary data is exposed in the browser as Blob objects.
The API for sending data over WebSocket is very simple:
    var socket = new WebSocket('ws://websocket.example.com');
    socket.onopen = function(event) {
      socket.send('Some message'); // Send data to the server.
    };
When the WebSocket receives data on the client side, a message event fires. The event has a data property that holds the message content.
    // Handle messages sent by the server.
    socket.onmessage = function(event) {
      var message = event.data;
      console.log(message);
    };
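Since a message can carry either text or binary data, a handler usually has to look at the type of the data property before using it. The sketch below shows one possible way to do that. `decodeMessage` is a hypothetical helper, and it rests on two assumptions that are not part of the article: text messages contain JSON, and the socket was configured with `socket.binaryType = 'arraybuffer'` so that binary messages arrive as ArrayBuffer rather than Blob.

```javascript
// decodeMessage is a hypothetical helper; pass it the `data` property of a
// message event. Assumptions: text messages carry JSON, and binary messages
// arrive as ArrayBuffer (socket.binaryType = 'arraybuffer').
function decodeMessage(data) {
  if (typeof data === 'string') {
    return { kind: 'text', value: JSON.parse(data) };
  }
  if (data instanceof ArrayBuffer) {
    return { kind: 'binary', value: new Uint8Array(data) };
  }
  throw new Error('Unsupported message payload');
}

// In a browser it would be wired up like this:
//
//   socket.binaryType = 'arraybuffer';
//   socket.onmessage = function (event) {
//     var message = decodeMessage(event.data);
//   };

var textMessage = decodeMessage('{"type":"chat","text":"hi"}');
var binaryMessage = decodeMessage(new Uint8Array([1, 2, 3]).buffer);
console.log(textMessage.kind, binaryMessage.kind);
```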
To see the frames exchanged over a WebSocket connection, you can use the Network tab of the Chrome developer tools.

Data fragmentation
The payload can be split across several separate frames. The receiving side is expected to buffer the frames until one arrives with the fin bit set. So the message “Hello World” could, for example, be sent in 11 frames, each carrying 1 byte of payload and 6 bytes of header. Control frames must not be fragmented. However, the specification requires endpoints to cope with control frames interleaved between the fragments of a message, so that, for instance, a ping can be answered while a long message is still in flight.
The logic for joining frames is roughly this:
- Receive the first frame.
- Remember the value of its opcode field.
- Receive further frames and concatenate their payloads until a frame arrives with the fin bit set.
- Check that the opcode field of every frame except the first is set to zero.
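The steps above can be sketched as a small stateful function. `createAssembler` and the `{ fin, opcode, payload }` frame shape are assumptions made for this example; control frames (opcodes 0x8 and above) are expected to be routed elsewhere before frames reach it.

```javascript
// createAssembler is a hypothetical helper for this sketch. Frames are plain
// objects of the form { fin, opcode, payload }, where payload is a Buffer.
// Control frames are assumed to be handled elsewhere.
function createAssembler() {
  var opcode = null; // opcode of the first frame of the current message
  var chunks = [];   // payloads collected so far

  return function push(frame) {
    if (opcode === null) {
      if (frame.opcode === 0x0) {
        throw new Error('Continuation frame without a preceding first frame');
      }
      opcode = frame.opcode; // remember the type from the first frame
    } else if (frame.opcode !== 0x0) {
      throw new Error('Expected a continuation frame (opcode 0x0)');
    }
    chunks.push(frame.payload);
    if (!frame.fin) {
      return null; // the message is not complete yet
    }
    var message = { opcode: opcode, payload: Buffer.concat(chunks) };
    opcode = null;
    chunks = [];
    return message;
  };
}

// "Hello" sent as two fragments: "Hel" (text, fin clear) + "lo" (continuation, fin set).
var push = createAssembler();
var partial = push({ fin: false, opcode: 0x1, payload: Buffer.from('Hel') });
var complete = push({ fin: true, opcode: 0x0, payload: Buffer.from('lo') });
console.log(partial, complete.payload.toString('utf8'));
```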
The main purpose of fragmentation is to allow sending a message whose size is unknown when sending starts.
Thanks to fragmentation, the server can pick a buffer of reasonable size and send a fragment to the network whenever the buffer fills up. A second use of fragmentation is multiplexing, where it is undesirable for one large message to monopolize the logical channel. For multiplexing, messages need to be split into smaller fragments so that the channel can be shared better.
About heartbeat messages
At any point after the handshake, either the client or the server can choose to send a ping to the other party. On receiving a ping, the recipient must send back a pong as soon as possible. This exchange is a heartbeat. It can be used to make sure the client is still connected to the server.
Ping and pong messages are just control frames. Pings have an opcode of 0x9, and pongs have an opcode of 0xA. When you receive a ping, you must send back a pong with exactly the same payload as the ping (for such messages the maximum payload length is 125). You may also receive a pong without ever having sent a ping; such messages can simply be ignored.
This messaging scheme can be very useful. Some services (load balancers, for example) terminate idle connections.
Moreover, without extra effort one side cannot tell that the other has gone away. Only on the next attempt to send data would you find out that something went wrong.
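To make the ping/pong rules more concrete, here is a rough sketch that builds the two control frames byte by byte. `buildControlFrame` and `pongFor` are hypothetical helpers. They produce unmasked frames, as a server would send them; a client would additionally have to mask its frames.

```javascript
// buildControlFrame and pongFor are hypothetical helpers for this sketch.
// They build unmasked frames, i.e. frames as a server would send them.
var OPCODE_PING = 0x9;
var OPCODE_PONG = 0xA;

function buildControlFrame(opcode, payload) {
  if (payload.length > 125) {
    throw new Error('Control frame payload is limited to 125 bytes');
  }
  // First byte: fin bit set plus the opcode.
  // Second byte: mask bit clear plus the payload length.
  var header = Buffer.from([0x80 | opcode, payload.length]);
  return Buffer.concat([header, payload]);
}

// A pong must echo the payload of the ping it answers.
function pongFor(pingPayload) {
  return buildControlFrame(OPCODE_PONG, pingPayload);
}

var ping = buildControlFrame(OPCODE_PING, Buffer.from('Hello'));
var pong = pongFor(Buffer.from('Hello'));
console.log(ping.toString('hex'), pong.toString('hex'));
```

A typical heartbeat sends such a ping on a timer and treats the connection as dead if no pong arrives within some application-defined timeout.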
Error handling
You can handle errors on a WebSocket connection by listening for the error event. It looks like this:
    var socket = new WebSocket('ws://websocket.example.com');

    // Handle any error that occurs.
    socket.onerror = function(error) {
      console.log('WebSocket Error: ' + error);
    };
Connection closure
To close the connection, either the client or the server sends a control frame with the opcode field set to 0x8. On receiving such a frame, the other party sends a close frame in response, and the first party then closes the connection. Any data received after the connection has been closed is discarded.
Here is how to initiate closing a WebSocket connection on the client:
    // Close the connection if it is open.
    if (socket.readyState === WebSocket.OPEN) {
      socket.close();
    }
To clean up after the connection has closed, you can listen for the close event:
    // Do the necessary cleanup.
    socket.onclose = function(event) {
      console.log('Disconnected from WebSocket.');
    };
The server also needs to listen for the close event in order to handle it when necessary:
    connection.on('close', function(reasonCode, description) {
      // The connection is being closed.
    });
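Under RFC 6455, a close frame may carry a 2-byte status code followed by a UTF-8 reason string, which is presumably where values such as reasonCode and description in the handler above come from. The sketch below decodes such a payload. `parseClosePayload` is a hypothetical helper, not part of WebSocket-Node or of the browser API.

```javascript
// parseClosePayload is a hypothetical helper for this sketch. It decodes the
// payload of a close frame: an optional 2-byte status code (big-endian)
// followed by an optional UTF-8 reason.
function parseClosePayload(payload) {
  if (payload.length === 0) {
    // 1005 is the code reserved by RFC 6455 for "no status code present".
    return { code: 1005, reason: '' };
  }
  if (payload.length === 1) {
    throw new Error('A close payload cannot be exactly 1 byte long');
  }
  return {
    code: payload.readUInt16BE(0),
    reason: payload.subarray(2).toString('utf8')
  };
}

// 1000 (0x03E8) means normal closure.
var normal = parseClosePayload(
  Buffer.concat([Buffer.from([0x03, 0xe8]), Buffer.from('done')]));
var empty = parseClosePayload(Buffer.alloc(0));
console.log(normal, empty);
```

On the browser side, the same information is exposed through the code and reason properties of the close event.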
Comparison of WebSocket and HTTP/2
Although HTTP/2 has a lot to offer, it cannot completely replace existing push and streaming technologies.
The first thing to know about HTTP/2 is that it does not replace all of HTTP. Request methods, status codes, and most headers stay the same as in HTTP/1.1. What HTTP/2 changes is how efficiently data travels over the network.
If we compare HTTP/2 and WebSocket, we will see many similarities.
Indicator | HTTP/2 | WebSocket |
Header compression | Yes (HPACK) | No |
Transfer of binary data | Yes | Yes (binary or text) |
Multiplexing | Yes | Yes |
Prioritization | Yes | No |
Compression | Yes | Yes |
Direction | Client/Server and Server Push | Bidirectional data transfer |
Full duplex mode | Yes | Yes |
As already mentioned, HTTP/2 introduces Server Push, which lets the server proactively send data to the client's cache. It does not, however, allow pushing data to the application itself. Pushed data is handled by the browser, and there is no API that would, for instance, notify the application that data has arrived from the server so that it could react.
This is where Server-Sent Events (SSE) come in very handy. SSE is a mechanism that lets the server push data to the client asynchronously once the client-server connection has been established.
After that, the server can send data whenever it sees fit, for example when the next chunk of data becomes available. The mechanism can be thought of as a one-way publisher-subscriber model. It also comes with a standard JavaScript client API called EventSource, implemented in most modern browsers as part of the W3C HTML5 standard. For browsers that do not support the EventSource API, polyfills are available.
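On the wire, SSE is a plain-text stream of events served with the text/event-stream content type. The following sketch serializes one event in that format. `formatSseEvent` is a hypothetical helper, and the `/stream` endpoint and the `price` event name in the commented client code are assumptions made only for the example.

```javascript
// formatSseEvent is a hypothetical helper for this sketch. It serializes one
// event in the text/event-stream format that EventSource understands.
function formatSseEvent(event) {
  var lines = [];
  if (event.id !== undefined) { lines.push('id: ' + event.id); }
  if (event.type !== undefined) { lines.push('event: ' + event.type); }
  // Multi-line data is sent as several consecutive "data:" lines.
  String(event.data).split('\n').forEach(function (line) {
    lines.push('data: ' + line);
  });
  // A blank line terminates the event.
  return lines.join('\n') + '\n\n';
}

var wire = formatSseEvent({ id: 7, type: 'price', data: 'ACME\n10.5' });
console.log(JSON.stringify(wire));

// In a browser, the stream could be consumed like this ('/stream' is an
// assumed endpoint that responds with Content-Type: text/event-stream):
//
//   var source = new EventSource('/stream');
//   source.addEventListener('price', function (event) {
//     console.log(event.lastEventId, event.data);
//   });
```

A server would write such chunks to a response that it keeps open, one chunk per event.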
Because SSE is based on HTTP, it fits naturally with HTTP/2 and can be combined with it to get the best of both: HTTP/2 provides an efficient transport layer based on multiplexed streams, and SSE provides the API through which applications receive data pushed by the server.
To fully understand what streams and multiplexing are about, take a look at the IETF definition:
A "stream" is an independent, bidirectional sequence of frames exchanged between the client and the server within an HTTP/2 connection. One of its main characteristics is that a single HTTP/2 connection can contain several concurrently open streams, with either endpoint interleaving frames from multiple streams.

SSE is based on HTTP. This means that with HTTP/2, not only can several SSE streams share a single TCP connection, but the same can be done with a mix of several SSE streams (server-to-client push) and several client requests (client-to-server).
Thanks to HTTP/2 and SSE, it is now possible to set up bidirectional communication purely on top of HTTP, with a simple API that lets client applications process data pushed by the server. The lack of bidirectional capability was often seen as a major drawback of SSE compared with WebSocket. With HTTP/2, this drawback no longer exists, which makes it possible to build data exchange between the server and client parts of an application on HTTP alone, without WebSocket.
WebSocket and HTTP/2. What to choose?
Even as the HTTP/2 + SSE combination spreads, WebSocket will almost certainly not disappear, mainly because it is already well adopted and because, in very specific cases, it has an advantage over HTTP/2: it was built for two-way data exchange with less overhead (headers, for example).
Suppose you want to build an online game that needs to exchange a huge number of messages between clients and the server. In that case WebSocket is a much better fit than HTTP/2 combined with SSE.
In general, WebSocket is the recommended choice whenever you need truly low latency, close to real time, between the client and the server. Keep in mind that this may require rethinking how the server side of the application is built, and may call for other technologies, such as event queues.
If you need, for example, to show users real-time news or market data, or you are building a chat application, HTTP/2 + SSE gives you an efficient bidirectional communication channel while keeping the benefits of staying in the HTTP world. WebSocket can be a source of trouble with existing web infrastructure, because it upgrades an HTTP connection to a protocol that has nothing to do with HTTP. Web components (firewalls and load balancers, for example) are built and configured with HTTP in mind.
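Browser support is another thing to consider. For WebSocket it is quite good. For HTTP/2 the picture is less uniform:
- HTTP/2 is supported only over TLS.
- IE 11 supports it only on Windows 10.
- Safari supports it only on OSX 10.11+.
- HTTP/2 is used only if it can be negotiated via ALPN (which the server has to support).
SSE support is better: only IE/Edge lack it. (Opera Mini supports neither SSE nor WebSocket, so it can be left out of the comparison.) For IE/Edge there are polyfills.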
Results
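WebSockets and HTTP/2+SSE each have their place, and which one fits depends on the application. SessionStack, for example, uses both WebSockets and HTTP, depending on the scenario.
Once integrated into a web application, SessionStack records DOM changes, user interactions, JS exceptions, and similar events. This data is sent over HTTP. WebSocket is used in other scenarios, and where it is not available, SessionStack falls back to HTTP.
Dear readers! Do you use WebSocket or HTTP/2+SSE in your projects?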