Talk: an overview of the capabilities and architecture of the CppComet comet server
This is the text of my talk, together with a video of the presentation, from the rumeetup.ru conference, given in an easy-to-read form. I have also collapsed part of the introduction so as not to take up readers' time with lyrical digressions about what prompted me to start developing my open source comet server from scratch in C++. Video of the talk
Introduction without technically important details
Slide 1 - Greeting
Slide 2 - A little about the project
Five years ago I attended the HighLoad++ 2012 conference, where I got the idea to create a comet server. I had no objective reasons for doing this; I simply liked the idea, especially since I had some groundwork from past projects that I could reuse.
Since then, I have spent almost 2000 hours on this project. In that time I wrote a SaaS comet service and its open source counterpart with compatible APIs, so you can switch between the SaaS version and the open source version without code changes.
During this time I also managed to write detailed documentation in Russian and English.
Although I have been using CppComet in production for almost 4 years, I am still actively developing it, gradually adding new features. For example, last week I added clustering.
For 4 years I have also provided technical support to the project's users, and I have used each question to improve the API and documentation so that things are easier for the next users.
Slide 3 - Analogs
My project does not do anything revolutionary that could not be achieved with other tools, so on this slide I have compiled a list of competing projects.
Probably every developer sooner or later faces the task of building their own chat or implementing some kind of real-time notifications.
It is good if the project is new and you can decide at the design stage where and which data should be updated in real time.
But this is not always the case. Sometimes you need to add real-time data updates to an existing project.
Suppose we want new comments from users on the site's articles to appear immediately rather than after a page reload.
The first and simplest solution is to request new comments from the server via ajax once per second.
The solution is simple and it works, and some people do this, for example in an admin area used by one or two people at most.
Slide 5
But once there are more than a few visitors, this code is a great way to DDoS your own site, even with relatively modest traffic.
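To make the load concrete, here is a back-of-the-envelope sketch. The one-second interval is the one described above; the visitor counts are illustrative assumptions, not figures from the talk.

```python
def polling_requests_per_second(visitors: int, interval_seconds: float = 1.0) -> float:
    """Requests per second reaching the backend when every open page polls on a timer."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return visitors / interval_seconds


# Hypothetical audience sizes, chosen only for illustration.
for visitors in (2, 500, 5000):
    rate = polling_requests_per_second(visitors)
    print(f"{visitors} visitors -> {rate:.0f} requests/s")
```

Every one of those requests runs through the web server and usually the database, even when there is nothing new to return.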
Slide 6
Instead of sending many requests from the client to the server, it is more efficient to push new data to the client as soon as it appears on the server.
In this scheme, JavaScript clients connect to the comet server via WebSocket and wait for notifications from the backend.
The web server, in turn, sends event notifications to the comet server, expecting it to deliver them to the frontend.
In my experience, it is convenient to send data to the comet server immediately after that data has been written to the site's database.
Slide 7
The comet server has two APIs: one for WebSocket connections from browsers, and a second for requests from the backend.
For the JavaScript API there is not much choice of protocol: we work over WebSockets. The API for calls from the backend, however, could be implemented in different ways.
Slide 8 - API
At first, without much thought, I wrote a simple API on top of my own protocol, plus a PHP client for it. The resulting reinvented wheel was simple to design but inconvenient to maintain: keeping all API versions backward compatible across several platforms proved to be a cumbersome task.
Compared with a custom protocol, working through the familiar REST API approach looks very attractive. In practice, though, when working over HTTP, each request has to establish a network connection, execute the request, and close the connection. That is slower than a TCP approach in which all requests are performed within a single network connection.
In the end I decided to follow the path taken by SphinxSearch: I implemented the server side of the MySQL protocol, which made it possible to work with the comet server using the MySQL clients that exist for every more or less popular language.
Slide 9 - CometQL API
The comet server now pretends to be a MySQL server, and the API resembles database access.
INSERT sends data to the client, and SELECT retrieves information from the comet server. The table name refers to the object on which the operation is performed. The database name denotes the version of the API we are working with. So far this is version 1, since backward compatibility has not been broken in more than 2 years.
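As a rough sketch of what such a call might look like from a backend, the helper below only builds a parameterized INSERT statement. The database name `CometQL_v1`, the table `pipes_messages`, and the columns `name`, `event`, and `message` are assumptions made for illustration; check the CometQL documentation for the actual schema.

```python
import json

# Assumed database name that encodes API version 1 (hypothetical).
API_DATABASE = "CometQL_v1"


def build_channel_insert(channel: str, event: str, payload: dict) -> tuple[str, tuple]:
    """Return a parameterized INSERT and its arguments for a DB-API cursor."""
    # Table and column names are illustrative assumptions, not a confirmed schema.
    sql = "INSERT INTO pipes_messages (name, event, message) VALUES (%s, %s, %s)"
    return sql, (channel, event, json.dumps(payload))


sql, args = build_channel_insert("article_42_comments", "new_comment", {"text": "Hi"})
print(sql)
print(args)
```

The returned pair can be passed to the `execute` method of any Python DB-API cursor that uses the `%s` parameter style, so the values are escaped by the client rather than concatenated by hand.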
Slide 10 - JavaScript API
The JavaScript API hides a lot of error-handling code, clustering functions, and various optimizations. For example, no matter how many tabs of one site we open, only one network connection to the comet server will be established, and the remaining tabs will communicate with the comet server through that shared connection.
Slide 11 - Channels for message delivery
To get information about any event in JavaScript, you must subscribe to this event.
For example, if we want to receive new comments on the page we have just opened in the browser, we subscribe to the channel through which the server will notify us about them.
Once we have subscribed, the server can send data to the channel we subscribed to.
The channel name can be anything; what matters is that the server sends messages to the same channel we subscribed to from JS.
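The subscribe-then-publish flow can be modeled with a toy in-memory hub. This is only a sketch of the routing idea, not the CppComet implementation; the class, method, and channel names are hypothetical.

```python
from collections import defaultdict
from typing import Callable


class ChannelHub:
    """Toy in-memory stand-in for channel routing on a comet server."""

    def __init__(self) -> None:
        # channel name -> callbacks registered by subscribers
        self._subscribers = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver message to every subscriber of channel; return the delivery count."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


hub = ChannelHub()
received = []
hub.subscribe("article_42_comments", received.append)
delivered = hub.publish("article_42_comments", "New comment")
missed = hub.publish("article_43_comments", "Nobody is listening here")
print(delivered, missed, received)
```

The second publish reaches nobody because the channel name differs from the one the client subscribed to, which is exactly the mismatch to avoid.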
Slide 12 - Receive Private Messages
In the previous example, anyone who knew which channel to subscribe to could receive the message. That is not always convenient.
For example, if we are writing a chat and want to deliver a message to someone specifically so that other users cannot receive this message, it’s more convenient to use a private message mechanism instead of channels.
To do this, subscribe to a special channel named “msg”. Only messages addressed to us personally will be sent to it. This feature becomes available after the user has been authorized on the comet server.
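The difference from public channels can be illustrated with a small model in which a message is routed by user id and reaches only connections that have been authorized as that user. All names here are hypothetical, and the real authorization mechanism is not shown.

```python
class PrivateMessageRouter:
    """Toy model: a connection receives private messages only after authorization."""

    def __init__(self) -> None:
        self._user_by_connection = {}  # connection id -> authorized user id
        self.inbox = {}                # connection id -> messages delivered on "msg"

    def connect(self, connection_id: str) -> None:
        self.inbox[connection_id] = []

    def authorize(self, connection_id: str, user_id: int) -> None:
        self._user_by_connection[connection_id] = user_id

    def send_private(self, user_id: int, message: str) -> int:
        """Deliver to every connection authorized as user_id; return the delivery count."""
        delivered = 0
        for connection_id, owner in self._user_by_connection.items():
            if owner == user_id:
                self.inbox.setdefault(connection_id, []).append(message)
                delivered += 1
        return delivered


router = PrivateMessageRouter()
for connection_id in ("alice-tab", "bob-tab", "guest-tab"):
    router.connect(connection_id)
router.authorize("alice-tab", 1)
router.authorize("bob-tab", 2)
count = router.send_private(1, "Hello, Alice")
print(count, router.inbox)
```

Only the connection authorized as user 1 receives the message; the other authorized user and the unauthorized guest see nothing.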
Slide 13 - Performance
I measured performance and memory consumption with tsung: on average, memory consumption was less than 5 GB of RAM for 64,000 users. At the same time, because the comet server runs in multi-threaded mode, it manages to use all available cores almost evenly.
The test is of course synthetic; the real load in my production is much lower, so far up to 1500 people online. As a result, the servers run at only a few percent of their capacity.
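For a sense of scale, the figures above can be turned into a per-connection estimate. This is simple arithmetic under the assumption that 1 GB means 10^9 bytes, and it treats the "less than 5 GB" figure as an upper bound.

```python
def bytes_per_connection(total_bytes: float, connections: int) -> float:
    """Average memory per connection, given total memory and connection count."""
    return total_bytes / connections


TOTAL_BYTES = 5 * 10**9   # upper bound from the tsung test, assuming 1 GB = 10^9 bytes
TEST_USERS = 64_000       # users in the synthetic test
PRODUCTION_USERS = 1_500  # peak online users mentioned for production

per_user = bytes_per_connection(TOTAL_BYTES, TEST_USERS)
production_estimate = per_user * PRODUCTION_USERS

print(f"at most about {per_user / 1000:.1f} kB per connection")
print(f"about {production_estimate / 10**6:.0f} MB for {PRODUCTION_USERS} users")
```

The production figure is a linear extrapolation and ignores any fixed overhead of the server process.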
Slide 14
As a result, we are back to almost the same amount of code that I showed on the first slide, but done correctly. It can now sustain 64,000 users online at approximately 5 GB of RAM. In addition, there are options for clustering, scaling, and fault tolerance.
In the comet server I provided clustering in which each server in the cluster can accept requests and forward them to those servers in the cluster that need to be notified of the event.
Data insert operations (insert and set) are performed asynchronously, which means you do not wait until the request has been sent to all servers in the cluster.
Data fetch operations (select and show) work synchronously.
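The difference between the two kinds of operations can be sketched with a queue and a background thread. This is a toy model under stated assumptions: peers are plain lists standing in for remote servers, and `select` reads only local state, which is a simplification rather than a description of how CppComet answers queries.

```python
import queue
import threading


class ClusterNode:
    """Toy model of one cluster node with asynchronous forwarding to peers."""

    def __init__(self, peers: list) -> None:
        self._peers = peers
        self._outbox = queue.Queue()
        self._local = []
        threading.Thread(target=self._forward_loop, daemon=True).start()

    def insert(self, message: str) -> None:
        """Asynchronous: enqueue the message and return without waiting for peers."""
        self._local.append(message)
        self._outbox.put(message)

    def select(self) -> list:
        """Synchronous: the caller has the answer when the call returns."""
        return list(self._local)

    def wait_until_forwarded(self) -> None:
        """Demo helper only: block until background forwarding has caught up."""
        self._outbox.join()

    def _forward_loop(self) -> None:
        while True:
            message = self._outbox.get()
            for peer in self._peers:
                peer.append(message)
            self._outbox.task_done()


peer_a, peer_b = [], []
node = ClusterNode([peer_a, peer_b])
node.insert("new_comment")
local_view = node.select()
node.wait_until_forwarded()
print(local_view, peer_a, peer_b)
```

The caller of `insert` never blocks on the peers; the explicit wait exists only so the demo can print a settled state.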
Slide 16 - Fault tolerance
If something breaks and the comet server becomes unavailable, then at best messages will stop reaching users.
If we have no cluster, then problems with the server can obviously break everything at once. With a cluster of comet servers, the failure of one node can still be painful if that is the node through which we are trying to send a request.
Slide 17
In the real world, errors sometimes happen, and writing error-handling code like this is not the most pleasant task. More importantly, if some of the servers are unavailable, we will not connect on the first attempt. Therefore, you can add haproxy to the cluster layout to balance requests between the live nodes of the cluster.
Slide 18
haproxy has a mechanism for polling MySQL servers and excluding those that do not respond from the list. It also lets you set weights for distributing load between the nodes of the cluster. The JavaScript API likewise has a built-in ability to connect to another cluster node if one of the nodes is unavailable. As a result, if you shut down nodes of the comet server cluster, the system keeps working as long as at least one node is still running.
Moreover, clients in web browsers do not even need to refresh the page; everything is handled inside the JavaScript API.
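The retry logic that a balancer or the JavaScript API takes off your hands can be sketched as follows. This is a minimal illustration, not the actual haproxy or CppComet behavior; the node names and the `send` callback are hypothetical.

```python
from typing import Callable, Sequence


def send_with_failover(nodes: Sequence[str], send: Callable[[str], None]) -> str:
    """Try each node in order; return the first node that accepted the request."""
    last_error = None
    for node in nodes:
        try:
            send(node)
            return node
        except ConnectionError as error:
            last_error = error  # remember the failure and try the next node
    raise ConnectionError("no cluster node is reachable") from last_error


DOWN_NODES = {"node-a", "node-b"}  # hypothetical nodes that are switched off


def fake_send(node: str) -> None:
    """Stand-in for a real network call; fails for nodes marked as down."""
    if node in DOWN_NODES:
        raise ConnectionError(f"{node} is unavailable")


used = send_with_failover(["node-a", "node-b", "node-c"], fake_send)
print(used)
```

As long as one node in the list is alive, the request goes through; only when every node fails does the caller see an error.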
Slide 19 - Project Development Plans
The first is improvements to clustering. Since I implemented the clustering mechanism only this autumn, I need to accumulate user experience with it. The second is adding video chat and video conferencing functionality.
A video chat demo already exists, but the API has not fully settled. Although video chat rooms can already be used, there is no guarantee that future releases will be fully backward compatible for video chats.
In fact, my comet server is used on several dozen sites. Here are just some of the projects I know about. The first three have a simple chat shared by all participants.
The second row is a chat for a social network and a chat for a dating site.