
Handling thousands of requests per second using the XBT Tracker example

Recently we ran a test whose results showed that a single application handles 2000 requests per second on a modest server that was not dedicated to it alone, with the result of each request being written to 3-5 MySQL tables. Honestly, this result surprised me, so I decided to share a description of the application's architecture with the community. The approach is applicable to everything from banner networks to chats and microblogging; I hope someone finds it interesting.

First, the application is single-threaded. Everything is done by a single process; socket work uses non-blocking epoll/select, with no threads blocked waiting on I/O. As HTTP evolved (first Keep-Alive, then AJAX and the growing popularity of COMET), the number of persistent connections to a web server has kept growing; on loaded projects it is measured in thousands and even tens of thousands, and if each connection gets its own thread with its own stack and constant context switching between them, server resources run out very quickly.

The second key point: one SELECT ... WHERE pk IN (k1, k2, ..., kN) is faster than several SELECT ... WHERE pk = ... queries. By working with the database in large batches, you can reduce not only the number of queries per second but also the total load.

Subject area


XBT Tracker (XBTT) is a BitTorrent tracker. Please refrain from the topic of copyright: BitTorrent is officially used, for example, to distribute Linux distributions and World of Warcraft patches. Unlike ed2k and DC++, it is possible to put several files into one torrent without packing them into an archive, check the integrity of a file at any time, and, if necessary, repair it by re-downloading the broken pieces.

While downloading, the client periodically calls the tracker, reporting traffic statistics and getting the addresses of other seeders and downloaders. The more frequent these calls, the more accurate the traffic accounting (on a private tracker) and the sooner new participants in a swarm learn about each other.

XBT Tracker, which this post is about, is written in C++ and is used on many foreign trackers, both open and private, and even on a couple of Russian ones. Another high-performance tracker, OpenTracker, does not support traffic accounting for private trackers, so it does not need to write request results to the database, which makes it less interesting in this context.

Non-blocking I/O


In the 90s, blocking I/O was used when working with sockets: the current thread "hung" in calls to recv and send while waiting for results. For each accepted connection, a separate process was created (fork) in which the request was processed. But each process requires memory for its stack and processor time for context switching between processes. Under small loads this is not terrible, and the web was not interactive back then: it lived in request-response mode, and there was little dynamic content (CGI), mostly page-visit counters and primitive forums. Apache still works this way. In apache2 it is possible to use lighter threads instead of processes, but the essence remains the same.

As an alternative, non-blocking I/O appeared: one process can open many sockets, periodically poll their state, and, if there are events (for example, a new connection arrived or data became available for reading), service them. This is exactly how nginx works. In Java 1.4 and higher, NIO exists for this purpose.

Later there were refinements, for example TCP_DEFER_ACCEPT, which lets you "postpone" accepting a connection until data arrives over it, and SO_ACCEPTFILTER, which postpones it until a complete HTTP request has arrived. It is also possible to increase the queue length of unaccepted connections (by default only 128) using sysctl kern.ipc.somaxconn on BSD and sysctl net.core.somaxconn on Linux, which is especially important if there are pauses in socket processing.

Servicing requests


Requests in XBTT are very simple, and their processing does not require serious computational resources; the tracker keeps all the necessary data in memory, so there is no problem executing them in the same process that works with the sockets. For heavier tasks, separate threads still have to be created to serve them.

One solution is a thread pool: a request is handed to a pool thread for processing, after which the thread returns to the pool. If there are no free threads, the request waits in a queue. This approach reduces the total number of threads in use and avoids creating a new thread for every request and killing it afterwards.

An even better mechanism, actors, exists in the Erlang and Scala languages and, perhaps in the form of libraries, in other languages as well. Processing is performed via asynchronous message passing between actors, which can be imagined as everyone exchanging emails, each with their own inbox, but this topic is beyond the scope of this post (there is, for example, a recent post about it).

Batch database operation


The result of each request to the XBTT tracker is recorded in several tables: the user's downloaded and uploaded traffic counters are increased, the torrent's statistics are updated, the table of current peers in the swarm is refreshed, plus a couple of auxiliary tables with download history.

With traditional processing, each request to the tracker would execute at least 3 separate INSERTs or UPDATEs, the client would wait for them to finish, and the database server would thus have to execute 3 queries for each call to the tracker.

XBTT does not execute them immediately; instead it accumulates a large batch of INSERT ... VALUES (...), (...), ..., (...) ON DUPLICATE KEY UPDATE f1 = VALUES(f1), ..., fN = VALUES(fN) and executes it every few seconds. Due to this, the number of database queries drops from several per tracker request to several per minute. The tracker also periodically re-reads the data that may have changed from the outside (the web interface is independent of the tracker).

How critical are the writes


In this application, losing data is not critical at all: if a few seconds of torrent traffic statistics never make it to the database, nothing terrible happens. The tracker does write its accumulated buffers to the database on abnormal termination, and the server may have a UPS in case of a power outage, etc., but there is no hard guarantee that everything reported by clients ends up on disk. For a banner network this is also acceptable, but there are tasks where saving every piece of data is critical.

Likewise, not every application can keep all of its data in memory; processing a client request may require fetching data from the database.

But even in this case, batch processing is possible. A pipeline is organized (actors are perfectly suited for implementing one) out of several stages. Requests are collected into a group until a sufficient (configurable, of course) number has accumulated, or until some time has passed (for example, 10-100 milliseconds) without reaching that number; then a single batch query is sent to the database, with the condition "key IN (accumulated list)" instead of "key = value".

If you need to lock these records, you can add FOR UPDATE SKIP LOCKED to the query (you will, of course, need to write the results over the same database connection, within the same transaction). In databases that support it, you can use prepared statements, so that the query is parsed once and an optimal execution plan is chosen, and afterwards only the data is bound on each execution. To reduce the number of such prepared statements, the number of parameters can be rounded up to a power of two: 1, 2, 4, 8, 16, 32, and so on. It is also possible to batch the modifying queries themselves: instead of executing each one, only add it to a packet, then execute them all at once.

Source: https://habr.com/ru/post/53360/
