
23,000 people wrote the online dictation on April 8, 2017. How did it go?

This year, 200 thousand people from 858 cities around the world took part in the educational campaign “Total dictation”. The dictation has been written for seven years, mostly at offline venues; the option to write it online has been available since 2014. Having experienced all the pain of extreme load on the site, this year the organizers brought in a whole team of IT companies. Today we describe our part of the work.

image
photo: Valery Melnikov, RIA News

The “Total Dictation” Foundation began promoting the event in February, when the online preparation classes started and the first publications appeared in the media. Each lesson was watched by 13,990 people on average, so an even greater load on the site was expected on the day of the dictation. Last year a DDoS attack hit the server during the dictation, and the site was unavailable for some time. This year, responsibility for the site's performance was split among several teams.

Preparing the site for the load


Before we started work, the project ran on a simple virtual server with the following characteristics: 2 CPU cores, 4 GB of RAM, HDD.

Initially, the project team estimated that on the day of the event the server would receive 120 RPS and that 1000 visitors would arrive at the site every minute. To find out how many RPS the server could withstand at that point, and what server configuration would be needed for the peak load, we ran a load test against the server with Yandex.Tank. The final configuration of the main and backup servers was: 48 CPU cores, 128 GB of RAM, 250 GB SSD.

While preparing the project for the peak load, we upgraded the virtual server hosting the site so that all the necessary optimizations could be carried out, both in settings and in code.
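As a rough sanity check, the two initial estimates can be read together. The sketch below assumes both figures describe the same peak minute, which the text does not state, so the per-visitor number is only an order-of-magnitude guide.

```python
# Figures quoted in the text above
expected_rps = 120           # requests per second expected at peak
visitors_per_minute = 1000   # visitors expected to arrive each minute

# Assumption (not from the article): both estimates refer to the same minute
requests_per_minute = expected_rps * 60
requests_per_visitor = requests_per_minute / visitors_per_minute

print(requests_per_minute)    # 7200
print(requests_per_visitor)   # 7.2
```

Under that assumption, each arriving visitor accounts for about seven requests, which is the kind of ratio a load-test scenario would need to reproduce.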

In parallel with the load testing, an anti-DDoS provider was connected to the site. Here is what that involved:

  1. All A records of the site were switched to the anti-DDoS provider's IP addresses.
  2. The mail server settings were changed so that the real IP of the project never appears in the headers of outgoing emails.
  3. On the anti-DDoS side, filtering of all requests coming to the site and their subsequent proxying to the project servers were configured.
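Step 2 can be verified by inspecting a sent message. The following is a minimal sketch, not the tooling used on the project: the function name `headers_leaking_ip`, the sample message, and the addresses (taken from the documentation ranges) are all hypothetical.

```python
import re
from email import message_from_string


def headers_leaking_ip(raw_message: str, origin_ip: str) -> list[str]:
    """Return the names of headers whose value contains origin_ip."""
    # Digit lookarounds keep 203.0.113.10 from matching inside 203.0.113.100
    pattern = re.compile(r"(?<!\d)" + re.escape(origin_ip) + r"(?!\d)")
    msg = message_from_string(raw_message)
    return [name for name, value in msg.items() if pattern.search(str(value))]


ORIGIN_IP = "203.0.113.10"  # hypothetical real IP of the project server

# Hypothetical outgoing message as it might look BEFORE the mail server fix
sample = (
    "Received: from web01 (web01.example.org [203.0.113.10])\n"
    "\tby mail.example.org with ESMTP; Sat, 8 Apr 2017 05:00:00 +0000\n"
    "X-Originating-IP: 203.0.113.10\n"
    "From: noreply@example.org\n"
    "To: user@example.com\n"
    "Subject: Your result\n"
    "\n"
    "Body text\n"
)

print(headers_leaking_ip(sample, ORIGIN_IP))
# ['Received', 'X-Originating-IP']
```

An empty list for a message sent after the change would suggest the headers no longer expose the origin address; it does not cover other leak paths such as DNS history.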

Initially, the plan was to keep the new servers separate: one as the main server and the other as a backup in case the main one failed. But during load testing it was decided to increase the total capacity of the system by also using the backup server to process backend requests (in our case, php-fpm). Backend requests were balanced between the primary and backup servers by nginx on the primary server. MySQL was configured as shared session storage - “1C-Bitrix” allows this without changing server settings.

A week and a half before the day of the dictation, the project was switched to the new servers. To do this, we first created a complete copy of the old server, including all software settings, site files, and databases. The switching process itself looked like this:
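Why shared session storage matters once requests are spread over two backends can be shown with a toy model. This is an illustrative sketch only: it assumes simple round-robin dispatch (the article does not say which balancing method was used), and the `Backend` class and the dictionary standing in for the session store are hypothetical.

```python
from itertools import cycle


class Backend:
    """Toy stand-in for a php-fpm backend that reads and writes sessions."""

    def __init__(self, name, sessions):
        self.name = name
        self.sessions = sessions  # the same store object is given to every backend

    def handle(self, session_id):
        # Count the requests seen for this session in the shared store
        self.sessions[session_id] = self.sessions.get(session_id, 0) + 1
        return self.name, self.sessions[session_id]


shared_sessions = {}  # stands in for the shared session storage
backends = cycle([
    Backend("primary", shared_sessions),
    Backend("backup", shared_sessions),
])

# Four consecutive requests from one user alternate between the servers
results = [next(backends).handle("user-1") for _ in range(4)]
print(results)
# [('primary', 1), ('backup', 2), ('primary', 3), ('backup', 4)]
```

Because both backends see the same store, the counter keeps increasing no matter which server answers. With a separate store per server, the user's state would split in two.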

  1. Database replication and project file synchronization from the old server to the new one were configured.
  2. At the moment of switching, proxying of all requests from the old server to the new one was enabled using nginx.
  3. Database replication was disabled.

On the side of the anti-DDoS provider, the addresses of the target servers were changed so that all traffic would flow to the new servers.

After the site was transferred, the final load test was carried out. This time a load of 500 RPS was emulated, since the organizers now expected more visitors than originally estimated. The tests showed that storing sessions in MySQL put a fairly heavy load on the disks, which could cause problems at peak. It was therefore decided to move session storage to memcache; the load test carried out after that showed that, under the expected load, no bottlenecks should appear on the current hardware.

Load on the day of the event


In general, a project that involves several independent parties at once is always a challenge and always a risk. So before the start, despite all the preparatory work, load testing, code audits, and so on, some tension and jitters remained.

image

The dictation started on April 8, 2017 at 15:00 Vladivostok time (GMT +10). At the start the load was minimal: about 20 requests per second for dynamic content. But it was, of course, no time to relax. Before the last and largest broadcast, at 14:00 Moscow time, we allocated more memory to memcache for caching and moved the sessions there as well, so that the disks carried less load. The broadcast went by without any recorded problems: the load stayed under control, and everything worked quickly and correctly.

image

Here is how the load looked that day (the charts use Irkutsk time, GMT +8).
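Since the text quotes start times in Vladivostok and Moscow time while the charts use Irkutsk time, a small conversion helps when reading them. The sketch below uses fixed offsets; the GMT +3 offset for Moscow is an assumption added here, as the article does not state it.

```python
from datetime import datetime, timedelta, timezone

VLADIVOSTOK = timezone(timedelta(hours=10))  # GMT +10, as stated in the text
IRKUTSK = timezone(timedelta(hours=8))       # GMT +8, as stated in the text
MOSCOW = timezone(timedelta(hours=3))        # assumption: GMT +3, not stated in the text

first_broadcast = datetime(2017, 4, 8, 15, 0, tzinfo=VLADIVOSTOK)
last_broadcast = datetime(2017, 4, 8, 14, 0, tzinfo=MOSCOW)


def on_chart(moment):
    """Return the HH:MM position of a moment on the Irkutsk-time charts."""
    return moment.astimezone(IRKUTSK).strftime("%H:%M")


print(on_chart(first_broadcast))  # 13:00
print(on_chart(last_broadcast))   # 19:00
```

On these assumptions, the first broadcast appears at 13:00 on the charts and the last one at 19:00.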

The overall result


Everyone did their best! We handed the load-monitoring data over to our colleagues at “Total dictation” so that they can draw up a plan for next year.

image

On April 8, 90,000 viewers watched the online broadcast and 23,000 people wrote the dictation online. From 9 to 18 April, 454,798 unique users visited the site and viewed four million pages: participants checked their grades and watched webinars analyzing common mistakes. The organizers have already begun preparing for the dictation on April 14, 2018. Join in too (brush up on the rules of the Russian language in the online courses), and come write the dictation!

PS On October 13, we are holding a free Uptime Day conference in Moscow on accidents in IT infrastructure: details in yesterday's post, registration on the site.

Source: https://habr.com/ru/post/338506/

