
We really like Reddit's AMA (Ask Me Anything) format, where someone (in our case, a development team) comes to the AMA subreddit and announces that they are ready to answer questions. Among the most memorable Ask Me Anything sessions: the SpaceX engineering team, Google engineers, and even the current US President, Barack Obama, who answered questions on Reddit four years ago. Recently our Android team held an AMA and answered developers' questions online.
But there is no Reddit in Russia. There is, however, Habr. So we decided to bring the "ask us anything" format here, and not empty-handed, as the AMA rules dictate. To make it easier for you to get into the topic, we picked one of our teams, the "Platform", and asked the guys to tell us what they do, what they program in, and what they have achieved over the team's lifetime, summing up the results of the outgoing 2016 along the way. Let's go!
Table of contents
1. What does the “Platform” do
2. Services: Pinba, SoftMocks and others
3. System programming. How we started using Go and what it led to
4. Photos
5. Script cloud
6. LSD: Live Streaming Daemon
7. Cassandra Time Series: what it is and how it works
8. Badoo AMA: ask the “Platform” developers

Proof that it really is us.
What does the “Platform” do

Anton Povarov, einstein_man, head of the “Platform”
Mikhail Kurmaev, demi_urg, head of the A-Team
The “Platform” is an infrastructure team that helps the other divisions. Our main goal is to keep everything running smoothly, so that programmers are happy and can calmly write code without thinking about all the complicated things underneath. We are the backend for the backend.
The “Platform” consists of two teams: the C programmers (they write in C, and lately in Go as well) and the A-Team (they write in PHP, and in places in Go too). The C programmers write services and build PHP extensions. The A-Team handles the PHP and database infrastructure, as well as developing and supporting tools for the other teams.
If we talk about specific projects from the point of view of what the user sees (and uses), we are responsible for:
- photos: the entire storage and serving infrastructure is on us;
- our own script cloud (a "distributed cron" that lets us run offline handlers on a cloud of machines);
- all the "wrappers" around services: we give other Badoo teams convenient access to services, because we have sharding and our own home-grown replication mechanisms.
We own the "wrappers" because we want to hide all these internals from the backend developers of other teams, to simplify their work and to avoid unforeseen situations where something in our core infrastructure suddenly breaks underneath them.
Services: Pinba, SoftMocks and others
Some of our internal services have grown over time into full-fledged products and even become de facto standards in the PHP ecosystem. The best known is PHP-FPM, written by Andrey Nigmatulin and now part of the standard PHP distribution for the web (a report on the topic can be found here). Another is Pinba (http://pinba.org/), a service for getting realtime statistics from running applications without any collection overhead (data is sent over UDP), which makes it easy to see what is happening with your application's performance at any moment.
Pinba is convenient because it collects data all the time, so the data is already at hand whenever you need to find the cause of a problem. This significantly reduces the time spent locating and fixing issues. Just as importantly, Pinba helps you spot a problem early, before it has affected users.
We also invented and built SoftMocks, our own framework that makes unit testing easier by letting you replace classes and functions in tests. We had to create it during the transition to PHP 7, which heavily reworked the interpreter's internals, so our old Runkit simply stopped working. At that point we had around 50k unit tests, most of which use mocks in one way or another to isolate the external environment, and we decided to try a different approach, potentially more powerful than Runkit/Uopz.
One of the main advantages of SoftMocks is its independence from the interpreter's internals and the fact that it needs no third-party PHP extensions. This is achieved by the approach we chose: rewriting the program's source code on the fly rather than dynamically patching things inside the interpreter. The project is now open source, and anyone can use it.
You may know that Badoo has a very strong team of PHP developers. So it is not surprising that we were among the first companies to move a project of this scale to PHP 7, this year, in 2016. You can read about how we got there, what we ran into, and what we gained in this post.
System Programming. How we started using Go and what it led to
Marko Kevac, mkevac, programmer in the C/C++ department
In the C/C++ department, we develop high-performance in-memory daemons that process hundreds of thousands of requests per second and hold hundreds of gigabytes of data in memory. Among them you can find things like search daemons that use bitmap indexes and query them with a self-written JIT, or a smart proxy that handles connections and requests from all of our mobile clients. When needed, we extend PHP itself to our needs: some patches go upstream, some are too specific to us, and some things can be done as loadable modules. We also write and support NGINX modules for things like URL and data encryption and fast photo processing on the fly.
We are hardcore systems programmers, but at the same time we are well aware of all the drawbacks of C/C++ development: slow iteration, potential for errors, and the complexity of multithreaded programming.
Since the appearance of Go, the new-fangled, young, and promising language from Google, we have been interested in it. And almost immediately after the first stable release in 2012, we began to consider using it in production.
Go promised to be close in spirit and performance to our beloved C, but it let us build prototypes, and even final products, noticeably faster and with fewer errors. And the fact that Go, with its channels and goroutines, was practically synonymous with concurrency especially fired our imagination.
At that moment we had a cool new and very pressing task: finding intersections between people in the real world. After hearing the requirements, we almost exclaimed in chorus: "This is a job for Go!" We needed to stream a huge number of user coordinates, correctly intersect them along several "axes", including time, and produce a result. Lots of interaction between components, lots of parallel computation: in short, exactly the kind of task Go was made for.
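The shape of that task, many concurrent input streams fanned into one aggregation step, maps naturally onto channels and goroutines. Below is a minimal, hypothetical sketch of the idea (this is not Badoo's actual service or data format): coordinates are assumed to be pre-snapped to a grid cell and a time slot, and two users "intersect" when they land in the same cell and slot.

```go
package main

import (
	"fmt"
	"sync"
)

// Ping is one user coordinate sample. The field set is illustrative,
// not Badoo's real wire format: coordinates are already snapped to a
// grid cell, and the timestamp is rounded to a time slot.
type Ping struct {
	UserID       int64
	CellX, CellY int
	Slot         int64
}

type key struct {
	x, y int
	slot int64
}

// crossings fans several ping streams into one channel, then groups
// users by (cell, time slot). Users sharing a key were in the same
// place at the same time.
func crossings(streams ...<-chan Ping) map[key][]int64 {
	merged := make(chan Ping)
	var wg sync.WaitGroup
	for _, s := range streams {
		wg.Add(1)
		go func(s <-chan Ping) {
			defer wg.Done()
			for p := range s {
				merged <- p
			}
		}(s)
	}
	// Close the merged channel once every input stream is drained.
	go func() { wg.Wait(); close(merged) }()

	seen := make(map[key][]int64)
	for p := range merged {
		k := key{p.CellX, p.CellY, p.Slot}
		seen[k] = append(seen[k], p.UserID)
	}
	return seen
}

func main() {
	a := make(chan Ping, 1)
	b := make(chan Ping, 1)
	a <- Ping{UserID: 1, CellX: 10, CellY: 20, Slot: 100}
	b <- Ping{UserID: 2, CellX: 10, CellY: 20, Slot: 100}
	close(a)
	close(b)
	for k, users := range crossings(a, b) {
		fmt.Printf("cell (%d,%d) slot %d: users %v\n", k.x, k.y, k.slot, users)
	}
}
```

The fan-in plus WaitGroup-driven close is the idiomatic pattern here; each source can produce independently while one consumer aggregates.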
The prototype was built by three people in a week. It worked, and it worked well. And we understood that Go would take root here. In 2015, Anton Povarov talked about Go at Badoo in detail in his presentation.
But one cannot say our romance was perfect. Go at the time was a very young language with plenty of problems, and we immediately started writing products that handled tens of thousands of requests per second and consumed almost 100 gigabytes of memory.
We had to optimize our services to avoid making extra memory allocations ourselves, and to make sure the Go compiler didn't decide to make those allocations for us. And here the beauty and convenience of Go showed itself again: from the very beginning, Go had excellent tools for profiling performance and memory consumption, and for seeing when the compiler decided to allocate something on the heap rather than the stack. These tools turned optimization into an interesting and informative adventure rather than a torment.
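Concretely, escape analysis output (`go build -gcflags=-m`) and pprof show *where* allocations happen, and the standard library can even measure them directly. The sketch below (function names are our own, purely illustrative) contrasts a `fmt.Sprintf`-based formatter, whose result must be heap-allocated, with an append-into-reused-buffer version, using `testing.AllocsPerRun` to count allocations per call:

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// formatAlloc builds the string naively; fmt.Sprintf must heap-allocate
// its result on every call.
func formatAlloc(id int) string {
	return fmt.Sprintf("user-%d", id)
}

// formatNoAlloc appends into a caller-provided buffer, so a hot loop can
// reuse one buffer and avoid per-call allocations entirely.
func formatNoAlloc(buf []byte, id int) []byte {
	buf = append(buf[:0], "user-"...)
	return strconv.AppendInt(buf, int64(id), 10)
}

func main() {
	allocs := testing.AllocsPerRun(1000, func() { _ = formatAlloc(42) })
	fmt.Println("Sprintf allocs per call:", allocs)

	buf := make([]byte, 0, 32)
	allocs = testing.AllocsPerRun(1000, func() { buf = formatNoAlloc(buf, 42) })
	fmt.Println("append-based allocs per call:", allocs)
}
```

On daemons holding ~100 GB of live data, shaving even one allocation per request measurably reduces GC pressure, which is exactly the kind of optimization the built-in tooling made pleasant.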
In the first project we had to use an existing C library for geo computations, so we plunged straight into the thick of the problems and nuances of making these two languages interact.
Since Go was a bottom-up initiative, we had to make sure our colleagues and managers didn't reject the idea out of hand. We understood that, from the operations side, a Go project must not differ in any way from a C project: the same JSON configs, the same interaction protocols (protobuf as the main one, JSON as a fallback), the same standard statistics going to RRD. And the release engineering had to be identical too: the same Git + TeamCity flow, the same builds in TeamCity, the same deployment process. And we pulled it off.
Administrators, operations, and release engineers don't think about what a project is written in. We realized we could now use the new tool without hesitation, since it had proved itself in practice (on non-critical tasks, as it should be at first).
We didn't build anything from scratch: we fitted Go into an infrastructure that had existed for many years. That limited our use of some things that are standard in the Go world. But that very fact, coupled with starting straight away on a serious high-load project, let us dive into the language up to our ears. We got noticeably dirty, I can tell you, but that closeness helped us "grow together" with this beautiful language.
It was interesting to watch Go grow with each release, like a child turning into an adult. We watched the GC pauses on our daemons melt away with each new version, without changing a line of code on our side!
Now, after four years with the language, we have about a dozen different Go services across three teams, and several more in the plans. Go is firmly in our arsenal. We know how to "cook" it and when to use it. After all these years, it is nice to regularly hear programmers say things like "yes, let's quickly throw together a prototype in Go" or "there's so much parallelism and interaction here, this is a job for Go".
Photos
Artyom Denisov, bo0rsh201, senior PHP programmer
Photos are one of Badoo's key product components, so we simply have to pay a lot of attention to their storage and serving infrastructure. At the moment we store about 3 PB of photos; every day users upload about 3.5 million new images, and the read load is about 80k req/sec per site.
Conceptually, it is structured as follows: we have three points of presence in three data centers (Miami, Prague, and Hong Kong), which gives locality to most of our target markets.
The first layer of the infrastructure is cache servers with fast SSDs that handle 98% of incoming traffic. They run our own mini-CDN: a proxy cache optimized for our workload profile, with a lot of utility/product logic on top (ACLs, resizing, applying filters and watermarks on the fly, a circuit breaker, etc.).
The next layer is a cluster of server pairs responsible for long-term storage: some have local disks that store the photos directly, and some are connected over optics to the third layer, a Storage Area Network.
Each pair of machines serves the same user ranges and operates in master-master mode, fully replicating and backing up each other through an asynchronous queue. Such pairs give us fault tolerance not only at the level of hard drives but also at the level of physical hosts (kernel panic, reboot, blackout, etc.), and make it easy to carry out planned maintenance and survive failures without serious service degradation. At this scale, failures are not uncommon.
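To make the pair idea concrete, here is a small, entirely hypothetical sketch (hostnames, pair count, and the modulo mapping are all assumptions, not Badoo's real scheme) of routing a user to their storage pair and falling back to the twin host when the primary is down:

```go
package main

import "fmt"

// pair is two storage hosts that serve the same user range and
// replicate each other master-master (hypothetical hostnames).
type pair struct{ a, b string }

var pairs = []pair{
	{"photo-01a", "photo-01b"},
	{"photo-02a", "photo-02b"},
	{"photo-03a", "photo-03b"},
}

// pairFor maps a user to their storage pair. A real scheme would use
// user ranges or a lookup table; modulo keeps the sketch short.
func pairFor(userID int64) pair {
	return pairs[userID%int64(len(pairs))]
}

// hostFor prefers the first host of the pair and falls back to its
// twin, so losing one physical machine does not lose the user's photos.
func hostFor(userID int64, down map[string]bool) (string, bool) {
	p := pairFor(userID)
	if !down[p.a] {
		return p.a, true
	}
	if !down[p.b] {
		return p.b, true
	}
	return "", false
}

func main() {
	host, ok := hostFor(7, map[string]bool{"photo-02a": true})
	fmt.Println(host, ok)
}
```

The point of the structure is visible in `hostFor`: any single-host failure (reboot, kernel panic) degrades nothing, because the twin serves the same range.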
Artyom Denisov spoke about our work with photos in more detail this year at Highload++.
Script cloud
It is no secret that any project, besides the actions performed in the context of a user request, has a large number of background tasks that run deferred or on a specific schedule. Usually some kind of background worker scheduler is used to start them (in the simplest case, cron).
As the number of such tasks and the resources they consume grow, they gradually stop fitting onto one, and then onto even several dozen, physical machines, and it becomes ever harder to manage all these cron entries and balance the load manually for each node in the cluster. So we needed to build our own cloud: a flexible infrastructure for transparently running developers' tasks.
It works something like this:
1) The developer describes a job as a PHP class implementing one of several interfaces (cron script, queue consumer, database crawler, etc.).
2) They add it to the cloud via the web interface and choose the launch frequency, timeouts, and resource limits.
3) The system then runs the job on the distributed infrastructure allocated to the cloud, monitors its execution, and balances the load across the cluster. The developer only has to track the job's status and watch its logs via the web UI (how many instances are running, with what settings, which runs have completed).
At the moment the cloud has about 2000 hosts in two DCs (~48k CPU cores / 84 TB of memory), and 1800 user tasks generate about 4000 launches per second.
We talked about the cloud
here and
here .
LSD: Live Streaming Daemon
Everyone who works with large amounts of data sooner or later faces the task of streaming it. As a rule, this means streaming data from a large number of different sources to one place, in order to process it there centrally. The kind of data often doesn't even matter: we stream application logs, statistics, user events, and much more. Conceptually, we use two different approaches to this problem:
1) Our own queue-server implementation for delivering events tied to product/application logic.
2) A simpler mechanism for streaming various logs, statistical metrics, and just large volumes of data from many nodes, which need to be centrally aggregated and processed in large batches in one place.
For the second task we long used Scribe from Facebook, but as the volume of data pumped through it grew, it became less and less predictable, and the project itself had long been abandoned. Eventually it became more profitable for us to write our own solution (fortunately, the task does not look very complicated), one that would be easier to maintain.
We called our own event-streaming daemon LSD: Live Streaming Daemon.
Key features of LSD:
- transport in the form of lines in plain files (for the client, there is nothing more reliable than writing data to a file on the local FS);
- clients are not blocked on writes even when all receiving servers are unavailable (the buffer accumulates on the local FS);
- transparent control of and limits on network and disk resource consumption;
- a Scribe-compatible record format and file aggregation scheme on the receiver.
This year we published the LSD source code, and now you
can use it in your projects.
Cassandra Time Series: what it is and how it works
Evgeny Guguchkin, che, senior PHP programmer
Badoo is a complex system made up of many interconnected components, and assessing the state of this system is not a simple task. To do it, we collect more than 250 million metrics, at a rate of about 200,000 values per second, and this data takes up about 10 TB.
Historically, we stored and visualized time series with the well-known RRDtool, "wrapping" it in our own framework for convenience.
What we liked about RRDtool was its read speed. But it has serious drawbacks:
- a high disk load generated by a large number of random-access I/O operations (we mitigated this with SSDs and RRDcached);
- the inability to write backdated values: if we have recorded a value for 2016-01-01 00:00:00, we can no longer write a value for 2015-12-31 23:59:59;
- rather wasteful disk usage for sparse data;
- data access is local only: you cannot build a horizontally scaled distributed system out of the box.
The last point was the decisive one for us, because without distributed access we could not display metrics from different servers on one chart.
As a result, we did a detailed analysis of the existing time-series databases, made sure that none of them suited us, and wrote our own solution on top of Cassandra.
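A common way to store time series on Cassandra, and the kind of layout that enables the properties listed below, is to partition each metric's points by a fixed-width time bucket, so partitions stay bounded and a backdated write (within the allowed window) lands in a known partition. The sketch below shows only that key computation; the one-hour bucket width and key format are assumptions, not necessarily Badoo's exact schema:

```go
package main

import "fmt"

// bucketSeconds is the assumed partition width: one partition per
// metric per hour. The real window/width at Badoo may differ.
const bucketSeconds = 3600

// partitionKey groups the points of one metric into hour-sized
// partitions: the usual trick for keeping Cassandra partitions bounded
// and for making backdated writes land in the right partition.
func partitionKey(metric string, ts int64) string {
	bucket := ts - ts%bucketSeconds
	return fmt.Sprintf("%s:%d", metric, bucket)
}

func main() {
	// Two points within the same hour share a partition...
	fmt.Println(partitionKey("cpu.load", 1482190000))
	fmt.Println(partitionKey("cpu.load", 1482191000))
	// ...a point an hour later starts a new one.
	fmt.Println(partitionKey("cpu.load", 1482194000))
}
```

Bucketing also explains the configurable rewrite window mentioned below: only buckets newer than the window need to accept mutations.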
At the moment, half of our real data is duplicated in the new storage. In numbers, it looks like this:
- nine servers;
- 10 TB of data;
- 100,000 values per second;
- 140 million metrics.
At the same time, we solved almost all the tasks that we faced:
- the failure of one node in the cluster blocks neither reads nor writes;
- "raw" data can be added, and even rewritten, as long as it is no older than a week (the window in which data can be changed is configurable at runtime);
- simple cluster scaling;
- no need for expensive SSDs;
- moreover, there is no need for redundant RAID arrays of HDDs, because when a disk is replaced, the node can recover its data from neighboring replicas.
We are very proud of the analysis of existing solutions and of the system we built. We stepped on countless rakes while working with Cassandra, and we will be happy to answer your questions and share our experience.
Badoo AMA: Ask a Question to Platform Developers
And now, in fact, the reason we are publishing this post: today, from 12:00 to 19:00 (Moscow time), the "Platform" team will answer your questions. We have been through a lot during the team's existence: we grew, changed, learned, ran into problems, and adopted new programming languages. And we are ready to share that experience with you (including telling you about our failures, screw-ups, and pain).
For example, ask about:
- how our internal projects are structured (how our sharding works, how we collect and draw statistics, how we talk to services);
- which rakes we stepped on and why we made certain decisions (what we did when we no longer fit on one server, or in one DC);
- building a strong team of PHP/C programmers;
- the move to PHP 7 (and what problems you may run into);
- the specifics of working on a high-load project;
- recommendations for PHP and Go programmers;
- anything described above in this post.
But do not limit yourself to this!
UPD: Thank you all for the questions! We are wrapping up our AMA session, but we will keep answering, just not as quickly. So keep asking.