
The evolution of architecture: from homegrown services to HandlerSocket





Today we will talk about how the approach to designing high-load key-value services has changed at Badoo. You will learn how we built such services several years ago (with a database as the repository and a specialized daemon as the interface to the data), what difficulties we ran into, and what architecture we eventually arrived at to resolve them.



Modern Internet projects make heavy use of internal services that provide access to values by key. These can be off-the-shelf solutions or in-house developments. Since 2006, Badoo engineers have built a number of such services.





Despite their diversity, they all followed a unified design approach, under which a service consists of the following components:

  1. A database repository that stores the reference version of the data.
  2. A fast daemon in C or C++ that serves data requests and is updated alongside the repository database.
  3. PHP classes that work with the daemon and the repository database.


It was important to us that each service could handle a large number of simultaneous requests, so a solution based on MySQL alone was not suitable. Hence the additional component, a fast daemon. It could not be replaced by memcached because we needed specific indexes over the data.


At the end of 2010, the HandlerSocket plugin for MySQL, written by a Japanese developer, was rapidly gaining popularity; it provides a NoSQL interface to data stored in MySQL. In the spring of 2011, Badoo engineers turned their attention to the new technology, hoping it would simplify the development and support of the company's key-value services.



“First victim”



Badoo, a social network for meeting new people, has a great many registered users who receive various email notifications. Each user can choose which types of notifications they want to receive, so the email settings must be stored somewhere and made accessible. Over 99% of requests for them are reads. The main readers are the scripts that generate and send emails, which use this data to decide whether to send a message of a given type to a given user. At first only the database was used to store the data, but it stopped coping with the ever-growing number of read requests. To take this load off, a special daemon was created: EmailNotification.



“Old school” service implementation



The key component of the service was a C daemon that held all the settings and exposed them for reading and writing over a very simple TCP protocol. The daemon also collected request statistics, from which we built graphs.



Initially, the service architecture was fairly simple and looked like this:





The settings were stored persistently in a database on one DB server, and the C daemon ran on another. At startup, the daemon selected all the data from the database and built an index over it (Judy arrays). Initialization took about 15 minutes, but since it was needed only a few times over the service's entire lifetime, this was not a significant drawback. During operation, clients (CLI scripts, the web, and other services) called the daemon through a special API, asking, for example, whether a given email could be sent to a given user; the daemon's built-in logic looked up the setting in memory and replied. Writer clients sent the daemon commands to change particular settings for a particular user.
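The startup and lookup flow described above can be sketched roughly as follows. This is a minimal illustration in Python, not the original C daemon: a plain dict stands in for the Judy arrays, an in-memory SQLite database stands in for the MySQL repository, and the table name `email_settings`, its columns, the function names, and the "enabled unless explicitly disabled" default are all assumptions made for the example.

```python
import sqlite3

def load_index(conn):
    """Read every row once at startup and build an in-memory index.

    Maps user_id -> set of notification types the user has disabled.
    """
    index = {}
    rows = conn.execute("SELECT user_id, notif_type, enabled FROM email_settings")
    for user_id, notif_type, enabled in rows:
        if not enabled:
            index.setdefault(user_id, set()).add(notif_type)
    return index

def can_send(index, user_id, notif_type):
    """Answer a reader's question from memory, without touching the database."""
    # Assumption: a notification is allowed unless it was explicitly disabled.
    return notif_type not in index.get(user_id, set())

def set_setting(index, user_id, notif_type, enabled):
    """Apply a writer's change to the in-memory copy only."""
    if enabled:
        index.get(user_id, set()).discard(notif_type)
    else:
        index.setdefault(user_id, set()).add(notif_type)

# Hypothetical repository contents, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE email_settings (user_id INTEGER, notif_type TEXT, enabled INTEGER)"
)
conn.executemany(
    "INSERT INTO email_settings VALUES (?, ?, ?)",
    [(1, "new_message", 1), (1, "weekly_digest", 0), (2, "new_message", 0)],
)
index = load_index(conn)
```

Note that `set_setting` changes only the in-memory copy; persisting the same change to the database is a separate step performed by the caller.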



Writing the data to MySQL was left entirely to the EmailNotification API. This meant the data could fall out of sync, for example when a write reached the database but not the daemon, or vice versa. Nevertheless, the service worked fine until 2007, when Badoo ran into a “little trouble”: a second data center appeared, geographically distant from the first and intended to serve users in the New World. It was immediately clear that simply duplicating the architecture at the new site would not work. Since emails to the same user can be sent from either site, both services have to operate on the same data.



Fortunately, the company has a system for exactly such cases: CPQ events (CPQ stands for Cross Platform Queue; author's note). It transfers information about events between sites quickly and, most importantly, with guaranteed delivery and in a predetermined order. As a result, the two-site service architecture took the following form:





Now every write request went not only to the database and the C daemon of the local site but also to CPQ. CPQ passed the request to the corresponding queue at the other site, which then replayed the write through the EmailNotification API at that site.
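The write path across two sites might be modeled like this. It is only a toy sketch: CPQ is an internal Badoo system whose interface is not described here, so a `deque` stands in for an ordered queue, and the `Site` class, the `replicate` flag (used so that a replayed write is not queued again), and the site names are assumptions made for the example.

```python
from collections import deque

class Site:
    """One data center: a database, a daemon, and an outgoing event queue."""

    def __init__(self, name):
        self.name = name
        self.database = {}
        self.daemon = {}
        self.outgoing = deque()  # stand-in for the local CPQ queue

    def write(self, key, value, replicate=True):
        # The API writes to the local database and the local daemon...
        self.database[key] = value
        self.daemon[key] = value
        # ...and, for writes that originate here, queues an event for the other site.
        if replicate:
            self.outgoing.append((key, value))

def deliver(source, target):
    """Move queued events to the other site in FIFO order and replay them there."""
    while source.outgoing:
        key, value = source.outgoing.popleft()
        target.write(key, value, replicate=False)  # do not echo the event back

# Hypothetical site names and settings.
europe, america = Site("eu"), Site("us")
europe.write((7, "new_message"), False)
europe.write((7, "new_message"), True)
america.write((9, "weekly_digest"), False)
deliver(europe, america)
deliver(america, europe)
```

Because events are replayed in the order they were queued, the second write for user 7 wins at both sites. The model assumes every step succeeds; it does not cover the failure cases.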



The system became more complicated but still ran stably for several years. All would have been well had the data at the two sites not diverged: the two sites' databases held a different number of settings. Although the difference was less than 0.1%, the feeling of “cleanliness and security” was gone. Moreover, we found that the difference existed not only between the sites' databases but also between the database and the C daemon within a single site. We had to think about how to make the service more reliable.
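One plausible way such a difference arises inside a single site is a write that reaches one store but not the other. The sketch below illustrates this with two toy in-memory stores; the `Store` class, the `fail_next` switch, and the key layout are hypothetical, and the real causes of the discrepancies at Badoo are not specified beyond what is said above.

```python
class Store:
    """A toy key-value store that can be told to fail its next write."""

    def __init__(self):
        self.data = {}
        self.fail_next = False

    def write(self, key, value):
        if self.fail_next:
            self.fail_next = False
            raise ConnectionError("write failed")
        self.data[key] = value

def api_write(database, daemon, key, value):
    """Two independent writes with nothing tying them together."""
    database.write(key, value)
    daemon.write(key, value)  # if this raises, the database has already changed

database, daemon = Store(), Store()
api_write(database, daemon, (42, "weekly_digest"), True)

daemon.fail_next = True  # simulate a dropped connection to the daemon
try:
    api_write(database, daemon, (42, "weekly_digest"), False)
except ConnectionError:
    pass  # the caller sees an error, but the database write is not undone

# Keys whose values now differ between the two stores.
diverged = {k for k in database.data if database.data[k] != daemon.data.get(k)}
```

After the failed call, the database holds the new value and the daemon still holds the old one, and nothing in this scheme repairs the difference automatically.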



New approach



Initially, the EmailNotification service had two basic requirements: first, fast processing of read requests, which the C daemon handled perfectly well; second, identical data at both sites, which is where the problems were. Instead of fighting with synchronization, we decided to redo the service architecture completely, taking the path of simplification:





First, we installed the HandlerSocket plugin in MySQL and taught our API to work with the database through it. This let us drop the C daemon. Then, having simplified the API, we removed the CPQ service from the scheme, replacing it with well-proven master-master replication between the sites. The result is a very simple and reliable scheme with the following advantages:

  1. Replication is transparent; no code is needed to work with the internal CPQ service. At the same time, the delay in propagating updates between sites dropped from a few seconds to fractions of a second.
  2. Atomic writes (finally!). If the EmailNotification API's write request to HandlerSocket succeeds, the job is done: the write will be replicated exactly to the other site, and no other component needs to be informed.
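To give a feel for what "working with the database through HandlerSocket" means at the wire level, here is a small sketch of building requests and parsing responses for its text protocol: tab-separated tokens, one request per line, with bytes 0x00 to 0x0f escaped and NULL sent as a single NUL byte. The sketch does not open a socket, and the database, table, and column names are hypothetical; Badoo's actual PHP API is not shown in the article.

```python
def encode_token(value):
    """Encode one token: None becomes a single NUL byte; bytes 0x00-0x0f
    are prefixed with 0x01 and shifted by 0x40."""
    if value is None:
        return b"\x00"
    out = bytearray()
    for byte in str(value).encode("utf-8"):
        if byte <= 0x0F:
            out += bytes((0x01, byte + 0x40))
        else:
            out.append(byte)
    return bytes(out)

def decode_token(raw):
    """Reverse of encode_token."""
    if raw == b"\x00":
        return None
    out = bytearray()
    it = iter(raw)
    for byte in it:
        out.append(next(it) - 0x40 if byte == 0x01 else byte)
    return out.decode("utf-8")

def build_request(*tokens):
    """Join encoded tokens with TAB and terminate the line with LF."""
    return b"\t".join(encode_token(t) for t in tokens) + b"\n"

def open_index(index_id, db, table, index_name, columns):
    """open_index request: P <indexid> <dbname> <tablename> <indexname> <columns>"""
    return build_request("P", index_id, db, table, index_name, ",".join(columns))

def find_eq(index_id, key_values, limit=1, offset=0):
    """find request with the '=' operator: <indexid> = <vlen> <v1>...<vn> <limit> <offset>"""
    return build_request(index_id, "=", len(key_values), *key_values, limit, offset)

def parse_response(line):
    """Split a response line into (error_code, rows of column values)."""
    tokens = [decode_token(t) for t in line.rstrip(b"\n").split(b"\t")]
    error_code, num_columns = int(tokens[0]), int(tokens[1])
    values = tokens[2:]
    rows = []
    if num_columns > 0:
        rows = [values[i:i + num_columns] for i in range(0, len(values), num_columns)]
    return error_code, rows

# Hypothetical schema: look up one setting by a composite primary key.
request_open = open_index(0, "badoo", "email_settings", "PRIMARY",
                          ["user_id", "notif_type", "enabled"])
request_find = find_eq(0, [42, "weekly_digest"])
```

A client would send `request_open` once per connection and then reuse index id 0 for any number of `request_find` lookups, reading one response line per request.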


Did we have problems switching to the new scheme? No serious ones. The HandlerSocket plugin already supports AUTO_INCREMENT, composite indexes work, and so do ENUMs; all we had to do was give up the CURRENT_TIMESTAMP default value for one of the timestamp fields.
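The article does not say how that timestamp field was populated afterwards. One option, shown here purely as an assumption, is for the client to supply the value explicitly in the HandlerSocket insert request (`<indexid> + <vlen> <v1> ... <vn>`). The column layout, the function name, and the choice of UTC are hypothetical.

```python
from datetime import datetime, timezone

def insert_request(index_id, user_id, notif_type, enabled, now=None):
    """Build a HandlerSocket insert line with the timestamp filled in by the client.

    Assumes the index was opened with four columns:
    user_id, notif_type, enabled, updated_at (a hypothetical layout).
    """
    now = now or datetime.now(timezone.utc)
    updated_at = now.strftime("%Y-%m-%d %H:%M:%S")
    tokens = [
        str(index_id),
        "+",                # insert operator
        "4",                # number of values that follow
        str(user_id),
        notif_type,
        str(int(enabled)),
        updated_at,         # sent explicitly instead of relying on a column default
    ]
    # Simplification: these values contain no bytes in the 0x00-0x0f range,
    # so the protocol's escaping step is omitted here.
    return "\t".join(tokens).encode("ascii") + b"\n"

# Example with a fixed clock so the output is reproducible.
fixed = datetime(2012, 4, 10, 12, 30, 0, tzinfo=timezone.utc)
example = insert_request(1, 42, "weekly_digest", False, now=fixed)
```

Which time zone to use for the value depends on the server configuration, so treat the UTC choice as a placeholder.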



As you know, HandlerSocket's advantage lies not in its speed, which is acceptable rather than impressive, but in its ability to run stably under a large number of requests per unit of time. Given that the service currently handles only 2 to 2.5 thousand requests per second at one site, we have a large safety margin.



For those particularly interested in the speed of the HandlerSocket plugin, here is a graph of the average execution time of three commands: connecting to HandlerSocket, opening the index, and retrieving data through it (values on the Y axis are in milliseconds):





Final word



Over the past year, Badoo has increasingly used HandlerSocket as a key-value store with persistent data storage. This lets us write simpler and clearer code, frees C programmers from trivial tasks, and significantly simplifies support. So far, everything suggests that the movement in this direction will continue.



But do not think that the use of HandlerSocket inside Badoo is limited to the simplest tasks. For example, we have extensive experience using it for write-heavy workloads, where a number of nuances show up under really heavy load. If you are interested in the details, comment and ask, and we will certainly continue this topic in new articles.



Thank you!

Source: https://habr.com/ru/post/141945/


