
This Wednesday our developers, all MIPT graduates, will meet with MIPT students and talk about how large projects are built and how to build your own Badoo.
No marketing, no PR and no other bullshit. Only development, only hardcore!
The students will be talking to developers from the A-team department, which specializes in the company's infrastructure projects. At Badoo, the A-team builds scalable, fault-tolerant application platforms, develops cluster management applications and utilities that automate code testing and deployment, and collects and analyzes tons of data to improve the quality and performance of multi-server production systems.
The work sits at the boundary between end-user applications and system software.
If you study at another university but want to come to the meeting, let us know in the comments to this post or in a private message before 15:00 on October 23. Send us your name, university, year of study and specialty.
Where: Dolgoprudny, MIPT, main building, room 117
When: Wednesday, October 23, at 19:00
Bonus: the chance to put tricky questions to fisher, antonstepanenko, youROCK and Demi (who has no Habr account).
We managed to steal the drafts the developers are using to prepare their talk, and we will share them with you.
What we are going to talk about at the meeting:
Badoo: do it yourself
1. Starting the project
- 1 server, LAMP;
- MySQL as the database: it is simple, fast and cheap to maintain, and we don't need Oracle's functionality because we don't want logic in the database;
- PHP, because development is fast, performance is good, developers are easier to find, and there were no alternatives when the project started;
- nginx + php-fpm, because slow clients are a problem;
- Launch it, it works.
Summary:
- LAMP: Linux / Apache / MySQL / PHP;
- Apache -> nginx + php-fpm.
2. Caching
- Heavy traffic, serious load, the database can't cope;
- We add memcache. There are other options (redis, cassandra, etc.), but memcache is simple, reliable and fast, and persistence is provided by the database;
- Shard the keys across a pool of servers, keeping all keys of one user on one server;
- Prolong the lifetime of cached entries;
- Invalidate on transaction commit;
- Daemons written in C for special cases, with a gpb (Google Protocol Buffers) interface.
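The "all keys of one user on one server" point can be sketched as follows. This is a minimal illustration, not Badoo's code: the server addresses, the `u:<id>:<name>` key format and the choice of crc32 are all assumptions made for the example.

```python
import zlib

# Hypothetical memcache pool; the addresses are placeholders.
CACHE_SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]


def server_for_user(user_id, servers=CACHE_SERVERS):
    """Choose a server from the user id alone.

    The key name is deliberately ignored, so every key that belongs to
    one user is routed to the same server.
    """
    # crc32 gives the same value in every process, which the built-in
    # hash() does not guarantee for strings.
    index = zlib.crc32(str(user_id).encode("ascii")) % len(servers)
    return servers[index]


def cache_key(user_id, name):
    """Build a namespaced key such as 'u:42:profile' (assumed format)."""
    return "u:{}:{}".format(user_id, name)


def locate(user_id, name):
    """Return (server, key) for one cached item of a user."""
    return server_for_user(user_id), cache_key(user_id, name)
```

Because the server is picked from the user id only, fetching several items for one user touches a single cache server, and dropping a user's cache means talking to one machine.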
3. Scaling the web tier
- The frontend can no longer cope;
- We add more frontends and put a balancer in front of them (the simplest option is nginx as a reverse proxy, but we have our own expensive piece of hardware with clever balancing and failover);
- Store sessions in memcache.
4. Monitoring and logs
- Pinba: real-time monitoring, with packets sent over UDP, data aggregation and graphs;
- Logs are collected via scribe into a database, with log search through sphinx and filters;
- There will always be errors; you just need to keep their number and severity under control.
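To show why UDP suits this kind of monitoring, here is a rough sketch of the fire-and-forget pattern: the application sends one datagram per request and never waits for a reply, and a collector aggregates whatever arrives. The JSON payload and the function names are invented for this sketch; this is not Pinba's actual wire format or API.

```python
import json
import socket
from collections import defaultdict


def send_timing(sock, address, script, seconds):
    """Send one datagram describing a request; no reply is expected."""
    payload = json.dumps({"script": script, "time": seconds})
    sock.sendto(payload.encode("utf-8"), address)


def aggregate(packets):
    """Turn raw datagrams into {script: (request_count, average_time)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for raw in packets:
        record = json.loads(raw.decode("utf-8"))
        entry = totals[record["script"]]
        entry[0] += 1
        entry[1] += record["time"]
    return {name: (count, total / count)
            for name, (count, total) in totals.items()}


def demo():
    """Send two timings to a local collector socket and aggregate them."""
    collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    collector.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    collector.settimeout(1.0)
    address = collector.getsockname()
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for seconds in (0.10, 0.30):
            send_timing(sender, address, "/profile.php", seconds)
        packets = [collector.recvfrom(4096)[0] for _ in range(2)]
    finally:
        sender.close()
        collector.close()
    return aggregate(packets)
```

If the collector is down, the datagrams are simply lost and the web request is not slowed down, which is the trade-off that makes this approach cheap enough to run on every request.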
5. Scaling the database
- We added more frontends, and now the database can't cope;
- We tried master-slave replication; it doesn't help write-intensive applications;
- We shard the database;
- Schemes based on the remainder of division (modulo) and the like are not suitable;
- UDB, spots;
- Selecting data on such an architecture; search (you can bolt on sphinx, but we have our own magic);
- Queues: sending events in the same transaction as the changes, the two-phase commit problem, deferred event handling, asynchrony.
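The outline only names "UDB, spots", so the following is a guess at the general idea rather than a description of Badoo's system. The sketch assumes a "spot" is a small virtual shard and that a central directory stores user -> spot and spot -> server explicitly; all class and function names are hypothetical. It also shows why a plain `user_id % N` scheme hurts when the number of shards changes.

```python
class ShardDirectory:
    """Toy lookup service: user -> spot -> database server (assumed design)."""

    def __init__(self, spot_to_server):
        self.spot_to_server = dict(spot_to_server)
        self.user_to_spot = {}

    def register_user(self, user_id):
        """Place a new user on the least populated spot and remember it."""
        counts = {spot: 0 for spot in self.spot_to_server}
        for spot in self.user_to_spot.values():
            counts[spot] += 1
        spot = min(counts, key=lambda s: (counts[s], s))
        self.user_to_spot[user_id] = spot
        return spot

    def server_for_user(self, user_id):
        return self.spot_to_server[self.user_to_spot[user_id]]

    def move_spot(self, spot, new_server):
        """Repoint one spot; only users on that spot change server."""
        self.spot_to_server[spot] = new_server


def remapped_by_modulo(user_ids, old_count, new_count):
    """Count users whose shard changes under user_id % N when N changes."""
    return sum(1 for u in user_ids if u % old_count != u % new_count)
```

With the directory, moving spot 2 to a new server affects only the users stored on spot 2. With modulo sharding, going from 4 to 5 shards relocates 800 of the user ids 0..999, so almost all data would have to be copied.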
6. Script framework
- We created a lot of queues, so there are a lot of scripts draining them, and we need a pool of machines;
- We built a pool and tied each script rigidly to a machine. That is bad: when the machine crashes, the script stops running;
- Existing solutions (Slurm, etc.) don't fit: either the balancing is poor, or they impose very specific requirements on the tasks;
- We build a cloud;
- On each machine, a special agent launches tasks and sends heartbeats;
- The central node manages the execution queue, hands tasks to live machines and monitors the load.
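The agent-plus-central-node scheme above might look roughly like this. It is a deliberately small sketch under assumptions of my own: time is passed in explicitly, "load" is just the number of tasks on a machine, and a machine counts as dead after a fixed heartbeat timeout. None of the names come from Badoo's framework.

```python
from collections import deque


class Scheduler:
    """Toy central node: tracks heartbeats and hands out queued tasks."""

    def __init__(self, heartbeat_timeout=30.0):
        self.heartbeat_timeout = heartbeat_timeout
        self.last_seen = {}  # machine -> time of its last heartbeat
        self.running = {}    # machine -> tasks currently assigned to it
        self.queue = deque()

    def heartbeat(self, machine, now):
        """Called by the agent on each machine."""
        self.last_seen[machine] = now
        self.running.setdefault(machine, [])

    def submit(self, task):
        self.queue.append(task)

    def alive(self, now):
        return [m for m, seen in self.last_seen.items()
                if now - seen <= self.heartbeat_timeout]

    def requeue_dead(self, now):
        """Put tasks of silent machines back at the front of the queue."""
        alive = set(self.alive(now))
        for machine, tasks in self.running.items():
            if machine not in alive and tasks:
                self.queue.extendleft(reversed(tasks))
                tasks.clear()

    def dispatch(self, now):
        """Give each queued task to the least loaded live machine."""
        self.requeue_dead(now)
        alive = self.alive(now)
        assigned = []
        while self.queue and alive:
            machine = min(alive, key=lambda m: (len(self.running[m]), m))
            task = self.queue.popleft()
            self.running[machine].append(task)
            assigned.append((task, machine))
        return assigned
```

Since no script is tied to a particular machine, a crash only delays its tasks until the next `dispatch` call moves them to a machine that is still sending heartbeats.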
7. Deploy
- We have more than 2000 machines and need some way to deploy code to them;
- The simplest solution is git pull: slow and non-atomic;
- The next step is rsync: faster, but still not atomic, plus it loads the network heavily, and 2000 forks are hard on the system;
- Our option is uftp + aio, that is, multicast: it is fast and does not load the network;
- Atomic symlink swap, libpssh on aio;
- Uimages: file-based versioning of static assets;
- Automatic builds, unit and other test runs, deploys twice a day.
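The atomic symlink swap is a general POSIX technique and can be sketched independently of uftp or libpssh. In this minimal example the directory names (`v1`, `v2`, `current`) are made up; the idea is that the web server always reads code through the `current` link, and the link is replaced with a single rename.

```python
import os
import tempfile


def activate_release(release_dir, current_link):
    """Point current_link at release_dir in one atomic step.

    A new symlink is created under a temporary name and then renamed
    over the old one. On POSIX, rename replaces the destination
    atomically, so readers see either the old or the new release.
    """
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # leftover from an interrupted deploy
    os.symlink(release_dir, tmp_link)
    os.replace(tmp_link, current_link)


def demo():
    """Create two fake releases and switch between them."""
    root = tempfile.mkdtemp()
    releases = []
    for version in ("1", "2"):
        path = os.path.join(root, "v" + version)
        os.mkdir(path)
        with open(os.path.join(path, "version.txt"), "w") as handle:
            handle.write(version)
        releases.append(path)
    current = os.path.join(root, "current")
    seen = []
    for path in releases:
        activate_release(path, current)
        with open(os.path.join(current, "version.txt")) as handle:
            seen.append(handle.read())
    return seen, os.readlink(current) == releases[-1]
```

Rolling back works the same way: call `activate_release` with the previous release directory, which is why old releases are usually kept on disk for a while.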
8. And we also have:
- Our own CDN;
- A code formatter;
- Replication in PHP;
- Antispam;
- The fast blitz template engine.
Come!