
Experience developing a high-load system for the HighLoad Cup

Mail.Ru ran an interesting championship for backend developers: the HighLoad Cup. It was a chance not only to win good prizes, but also to level up as a backend developer. My experience with the development and the environment setup is described below.

1. Input data


You need to write a fast server that provides a web API for a travelers' service.

The initial data for the server contains three types of entities: User (a traveler), Location (a sight), and Visit (a traveler's visit to a sight). Each has its own set of fields.

You must implement the following queries:

 GET /<entity>/<id> to get an entity's data
 GET /users/<id>/visits to get the list of a user's visits
 GET /locations/<id>/avg to get the average mark of a sight
 POST /<entity>/<id> to update an entity
 POST /<entity>/new to create an entity

The maximum penalty time for a request equals the load tank's timeout: 2 seconds (2,000,000 microseconds).

The solution must be packed into a single Docker container.
Hardware used for testing: Intel Xeon x86_64, 2 GHz, 4 cores, 4 GB RAM, 10 GB HDD.

So the task is essentially simple, but my knowledge of Docker was zero, and my experience developing under high load was around 50%.
I chose php7 + nginx + mysql, because the experience gained could be reused later at work.

2. Docker


Let's figure out what Docker is.
Docker is software for automating the deployment and management of applications in an operating-system-level virtualization environment. It lets you "pack" an application with its whole environment and dependencies into a container that can be moved to any Linux system with cgroups support in the kernel, and it also provides an environment for managing containers.
That sounds great. In short, we no longer need to configure nginx / php / apache locally for every project, or drag in extra dependencies from other projects. For example, suppose a site is incompatible with php7: to work on it you have to switch the php module in apache2 to the right version. With Docker everything is simple: launch the project's container and develop. Switching to another project? Stop the current container and bring up a new one.

Docker's ideology is one process per container: nginx with php in its own container, mysql in its own. To combine and configure them, docker-compose is used.

Sample docker-compose.yml file
version: '2'
services:
  mysql:
    image: mysql:5.7                 # official image
    environment:
      MYSQL_ROOT_PASSWORD: 12345     # root password
    volumes:
      - ./db:/var/lib/mysql          # keep the database files on the host
    ports:
      - 3306:3306                    # host port : container port
  nginx:
    build:
      context: ./
      dockerfile: Dockerfile         # build from our Dockerfile
    depends_on: [mysql]              # start after mysql
    ports:
      - 80:80
    volumes:
      - ./:/var/www/html             # mount the project sources into the container


Run:

 docker-compose -f docker-compose.yml up 

Everything works, the connection is up. We try to submit the solution for review aaand... re-read the task carefully: everything must live in one container. A container, in turn, only runs while the process launched via CMD or ENTRYPOINT is alive. Since we have several services, we need a process manager: supervisord.

Dockerfile configuration
FROM ubuntu:17.10
RUN apt-get update && apt-get -y upgrade \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y mysql-server mysql-client mysql-common \
    && rm -rf /var/lib/mysql && mkdir -p /var/lib/mysql /var/run/mysqld \
    && chown -R mysql:mysql /var/lib/mysql /var/run/mysqld \
    && chmod 777 /var/run/mysqld \
    && rm /etc/mysql/my.cnf \
    && apt-get install -y curl supervisor nginx \
       php7.1-fpm php7.1-json \
       php7.1-mysql php7.1-opcache \
       php7.1-zip
ADD ./config/mysqld.cnf /etc/mysql/my.cnf
COPY config/www.conf /etc/php/7.1/fpm/pool.d/www.conf
COPY config/nginx.conf /etc/nginx/nginx.conf
COPY config/nginx-vhost.conf /etc/nginx/conf.d/default.conf
COPY config/opcache.ini /etc/php/7.1/mods-available/opcache.ini
COPY config/supervisord.conf /etc/supervisord.conf
COPY scripts/ /usr/local/bin/
COPY src /var/www/html
# (for local runs the data archive was copied into the image)
#RUN mkdir /tmp/data /tmp/db
#COPY data_full.zip /tmp/data/data.zip
ENV PHP_MODULE_OPCACHE on
ENV PHP_DISPLAY_ERRORS on
RUN chmod 755 /usr/local/bin/docker-entrypoint.sh /usr/local/bin/startup.sh
RUN chmod +x /usr/local/bin/docker-entrypoint.sh /usr/local/bin/startup.sh
WORKDIR /var/www/html
RUN service php7.1-fpm start
EXPOSE 80 3306
CMD ["/usr/local/bin/docker-entrypoint.sh"]


The CMD instruction ["/usr/local/bin/docker-entrypoint.sh"] runs a script that does a small environment setup once the container starts and then launches the process manager.
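The script itself lives in the repository's scripts/ directory; a minimal sketch of what such an entrypoint might look like (the exact contents here are an assumption, not the real script):

#!/bin/bash
# Hypothetical sketch: make sure MySQL's directories are usable,
# then hand PID 1 over to supervisord running in the foreground.
chown -R mysql:mysql /var/lib/mysql /var/run/mysqld
exec supervisord -c /etc/supervisord.conf -n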

Process Manager Setup
[unix_http_server]
file=/var/run/supervisor.sock

[supervisord]
logfile=/tmp/supervisord.log
logfile_maxbytes=50MB
logfile_backups=10
loglevel=info
pidfile=/tmp/supervisord.pid
nodaemon=false
minfds=1024
minprocs=200
user=root

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///var/run/supervisor.sock

[program:php-fpm]
command=/usr/sbin/php-fpm7.1
autostart=true
autorestart=true
priority=5
stdout_events_enabled=true
stderr_events_enabled=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0

[program:nginx]
command=/usr/sbin/nginx -g "daemon off;"
autostart=true
autorestart=true
priority=10
stdout_events_enabled=true
stderr_events_enabled=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0

[program:mysql]
command=mysqld_safe
autostart=true
autorestart=true
priority=1
stdout_events_enabled=true
stderr_events_enabled=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0

[program:startup]
command=/usr/local/bin/startup.sh
startretries=0
priority=1100
stdout_events_enabled=true
stderr_events_enabled=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0


The priority parameter controls the startup order, while stdout_logfile / stderr_logfile routes each service's logs into the container's log. The startup.sh script runs last of all; it fills the database with the data from the archive (a sketch of what it might do follows).
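A possible shape of startup.sh (assumed; the real script is in the repository, and the import step below is a guess at its mechanics):

#!/bin/bash
# Hypothetical sketch: unpack the organizers' data archive and load it into MySQL.
unzip -o /tmp/data/data.zip -d /tmp/db
# wait until mysqld (started by supervisord) accepts connections
until mysqladmin ping --silent; do sleep 1; done
# assumed import step: parse the unpacked JSON files and insert them
php /var/www/html/import.php /tmp/db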
Now you can finally submit your brainchild for its first check. Docker commands are similar to git's; to submit, we use:

 docker tag <image> stor.highloadcup.ru/travels/<login>
 docker push stor.highloadcup.ru/travels/<login>

You can also register on the official site https://cloud.docker.com and push the container there. It lets you set up automatic builds whenever a branch on GitHub or Bitbucket is updated, and then use the resulting image as a base in other projects.

3. Service development


To squeeze out performance, I decided to abandon frameworks altogether and use bare php + pdo. A framework makes development much easier, but it drags along a pile of dependencies that eat into script execution time.
The entry point is the index.php script, which routes requests and returns results (Router + Controller). URLs of the form:

 /<entity>/<id> 

naturally suggest regular expressions for extracting the route and parameters. That approach is very flexible and makes the service easy to extend, but the if-based variant turned out to be faster (although it leaves room for error; why? Read below). For comparison, a sketch of the rejected regex variant follows.
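A rough sketch of what the regex router might have looked like (the patterns and handler names are illustrative, not the contest code):

// Hypothetical regex-based router: each request pays for preg_match()
// on every pattern, which is why plain explode()/if checks won out.
$uri = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$routes = [
    '#^/(users|locations|visits)/(\d+)$#' => 'show',
    '#^/users/(\d+)/visits$#'             => 'visits',
    '#^/locations/(\d+)/avg$#'            => 'avg',
];
foreach ($routes as $pattern => $handler) {
    if (preg_match($pattern, $uri, $m)) {
        // dispatch to $handler with the captures from $m
        break;
    }
}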

index.php
$uri = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$routes = explode('/', $uri); // split the path into segments
$entity = $routes[1] ?? 0;
$id     = $routes[2] ?? 0;
$action = $routes[3] ?? 0;

$className = __NAMESPACE__.'\\'.ucfirst($entity);
if (!class_exists($className)) { // unknown entity
    header('HTTP/1.0 404 Not Found');
    die();
}

// database connection
$db = new \PDO(
    'mysql:dbname=travel;host=localhost;port=3306',
    'root',
    null,
    [
        \PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES \'UTF8\'',
        \PDO::ATTR_PERSISTENT => true
    ]
);

/** @var \Travel\AbstractEntity $class */
$class = new $className($db);

// POST requests (update / create)
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    // note: PHP exposes the request header as CONTENT_TYPE, not Content-Type
    if (isset($_SERVER['CONTENT_TYPE'])) { // only json is accepted
        $type = trim(explode(';', $_SERVER['CONTENT_TYPE'])[0]);
        if ($type !== 'application/json') {
            header('HTTP/1.0 400 Bad Values');
            die();
        }
    }
    $inputJSON = file_get_contents('php://input');
    $input = json_decode($inputJSON, true);
    // validate the fields
    if ($input && $class->checkFields($input, $id !== 'new')) {
        $itemId = (int)$id;
        if ($itemId > 0 && $class->hasItem($itemId)) {
            $class->update($input, $itemId);
            header('Content-Type: application/json; charset=utf-8');
            header('Content-Length: 2');
            echo '{}';
            die();
        }
        // create a new entity
        if ($id === 'new') {
            $class->insert($input);
            header('Content-Type: application/json; charset=utf-8');
            header('Content-Length: 2');
            echo '{}';
            die();
        }
        // nothing matched - not found
        header('HTTP/1.0 404 Not Found');
        die();
    }
    // invalid input data
    header('HTTP/1.0 400 Bad Values');
    die();
}

// GET requests
if ((int)$id > 0) {
    if (!$action) { // no action: return the entity itself
        $res = $class->findById($id);
        if ($res) {
            $val = json_encode($class->hydrate($res));
            header('Content-Type: application/json; charset=utf-8');
            header('Content-Length: '.strlen($val));
            echo $val;
            die();
        }
        header('HTTP/1.0 404 Not Found');
        die();
    }
    // an action is requested: first check that the entity exists
    $res = $class->hasItem($id);
    if (!$res) {
        header('HTTP/1.0 404 Not Found');
        die();
    }
    $filter = [];
    if (!empty($_GET)) { // build the filter
        $filter = $class->getFilter($_GET);
        if (!$filter) {
            header('HTTP/1.0 400 Bad Values');
            die();
        }
    }
    header('Content-Type: application/json; charset=utf-8');
    echo json_encode([$action => $class->{$action}($id, $filter)]);
    die();
}

header('HTTP/1.0 404 Not Found');
die();


It looks clumsy, but it works fast. Next comes the main data-handling class, AbstractEntity. I will not reproduce it here, because everything in it is trivial: insert / update / select (all the source code can be viewed on GitHub). Entity classes are built on top of it; as an example, take the Users entity. A rough sketch of the base class is below.
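A minimal sketch of what such a base class might look like (the method names follow those called from index.php; the bodies here are assumptions, not the real code):

// Hypothetical sketch of AbstractEntity; the real class is on GitHub.
abstract class AbstractEntity
{
    protected $_db;          // shared \PDO connection
    protected $table;        // table name, set by each entity subclass
    protected $fields = [];  // allowed columns for checkFields()/insert()/update()

    public function __construct(\PDO $db)
    {
        $this->_db = $db;
    }

    public function findById(int $id)
    {
        $rows = $this->_db->query('select * from '.$this->table.' where id = '.$id);
        return $rows ? $rows->fetch(\PDO::FETCH_ASSOC) : false;
    }

    public function hasItem(int $id): bool
    {
        return (bool)$this->findById($id);
    }

    // checkFields(), hydrate(), insert() and update() follow the same
    // plain build-the-SQL-string pattern.
}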

Filter
It validates the data from the GET request and builds a filter for the database query. The code below has no checks or escaping against injections and the like; do not repeat this at home or in production projects (a safer variant is sketched after the code).

public function getFilter(array $data)
{
    $columns = [
        'fromDate'   => 'visited_at > ',
        'toDate'     => 'visited_at < ',
        'country'    => 'country = ',
        'toDistance' => 'distance < ',
    ];
    $filter = [];
    foreach ($data as $key => $datum) {
        if (!isset($columns[$key])) {
            return false;
        }
        if (($key === 'fromDate' || $key === 'toDate' || $key === 'toDistance') && !is_numeric($datum)) {
            return false;
        }
        $filter[] = $columns[$key]."'".$datum."'";
    }
    return $filter;
}
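In a real project the same filter would be built with bound placeholders; a sketch of such a variant (not the contest code):

// Hypothetical injection-safe variant of getFilter(): conditions use '?'
// placeholders, and the raw values are returned separately for binding.
public function getFilterSafe(array $data)
{
    $columns = [
        'fromDate'   => 'visited_at > ?',
        'toDate'     => 'visited_at < ?',
        'country'    => 'country = ?',
        'toDistance' => 'distance < ?',
    ];
    $conditions = [];
    $params = [];
    foreach ($data as $key => $datum) {
        if (!isset($columns[$key])) {
            return false;
        }
        $conditions[] = $columns[$key];
        $params[] = $datum;
    }
    // the caller then runs: $stmt = $this->_db->prepare($sql); $stmt->execute($params);
    return ['where' => $conditions, 'params' => $params];
}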

Getting the places visited by a user
Returns the places and marks for a specific user; the filter built above can be applied here as well.

public function visits(int $id, array $filter = [])
{
    $sql = 'select mark, visited_at, place from visits LEFT JOIN locations ON locations.id = visits.location where user = '.$id;
    if (count($filter)) {
        $sql .= ' and '.implode(' and ', $filter);
    }
    $sql .= ' order by visited_at asc';
    $rows = $this->_db->query($sql);
    if (!$rows) {
        return false;
    }
    $items = $rows->fetchAll(\PDO::FETCH_ASSOC);
    foreach ($items as &$item) {
        $item['mark'] = (int)$item['mark'];
        $item['visited_at'] = (int)$item['visited_at'];
    }
    return $items;
}

Age calculation
This was probably the most discussed topic in the Telegram chat. A user's date of birth is given as a timestamp (the number of seconds since the Unix epoch), for example 12333444. But the epoch starts in 1970, and there are still people born before the 70s; for them the timestamp is negative, for example -123324. Queries can filter users by age, for example selecting everyone older than 18. To avoid computing the age on every database query, I calculated it once before inserting the user into the database and stored it in an extra field.

Age calculation function:

public static function getAge($y, $m, $d)
{
    if ($m > date('m', TEST_TIMESTAMP)
        || ($m == date('m', TEST_TIMESTAMP) && $d > date('d', TEST_TIMESTAMP))
    ) {
        return (date('Y', TEST_TIMESTAMP) - $y - 1);
    }
    return (date('Y', TEST_TIMESTAMP) - $y);
}

The TEST_TIMESTAMP "crutch" is needed to pass the tests, since the data and the expected answers are generated once and do not change over time. PHP's date function converts a negative timestamp to a date perfectly well, leap years included.
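For example, with an assumed TEST_TIMESTAMP in August 2017 (the real constant baked into the tests differs, and UTC is assumed here), a traveler born in October 1965 (a negative timestamp) comes out as 51:

// Illustrative only: the value of TEST_TIMESTAMP below is an assumption.
define('TEST_TIMESTAMP', 1503500000);  // ~ 2017-08-23 UTC

$birth = -132192000;                   // ~ 1965-10-24 UTC, before the epoch
echo Users::getAge(date('Y', $birth), date('m', $birth), date('d', $birth));
// 51: the October birthday has not yet passed in August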

Database
The database schema was created to match the entities exactly; all field sizes followed the spec. The engine was InnoDB. Indexes were added to the fields used in filtering and sorting. A sketch of such a table follows.
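As an illustration, here is what the visits table might look like (assumed DDL; the real schema is in the repository):

-- Hypothetical sketch of the visits table; field sizes follow the spec.
CREATE TABLE visits (
    id         INT UNSIGNED NOT NULL PRIMARY KEY,
    location   INT UNSIGNED NOT NULL,
    user       INT UNSIGNED NOT NULL,
    visited_at INT NOT NULL,
    mark       TINYINT UNSIGNED NOT NULL,
    KEY idx_user_visited (user, visited_at),  -- serves the visits-by-user query
    KEY idx_location (location)               -- serves the avg-by-location query
) ENGINE=InnoDB DEFAULT CHARSET=utf8;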

Web server and database setup
To improve performance I started from settings found on the Internet; they were meant as the baseline from which to start turning the tuning knobs of the services.

4. Processing reports, adjusting service settings


The php source code turned out to be minimal in size, and it quickly became clear that I was turning from a backend developer into a sysadmin. The quick tests run on a small amount of data and serve more to verify the correctness of the answers than to load-test the application. Full tests could be run only twice every 12 hours. Testing on my own computer did not always give clear results: a solution could run fast for me and still fail the check. Because of this I never managed to set up memcached, which should have sped up the server's responses.

The one clear win was using the MyISAM engine instead of InnoDB: tests gave 133 penalty seconds instead of 250 on InnoDB.
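The switch itself is a one-line statement, presumably run for each of the tables:

-- Convert an existing table to MyISAM:
ALTER TABLE visits ENGINE=MyISAM;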

What really prevented good tuning of nginx / mysql / php-fpm was the significant variation in the results of one and the same solution at different times of day. This thoroughly upset me, since my own solution scored differently in the evening and in the morning. I do not know how the "combat" checking infrastructure was arranged, but evidently something could interfere and load the machine (perhaps the preparation of the next solution's run). And when the rating comes down to milliseconds, fine-tuning the server becomes impossible.

Below are the configurations I settled on:

mysql
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0

[mysqld]
#
# * Basic Settings
#
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
#
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
bind-address = 127.0.0.1
#
# * Fine Tuning
#
key_buffer_size = 16M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 32
sort_buffer_size = 256K
read_buffer_size = 128K
read_rnd_buffer_size = 256K
myisam_sort_buffer_size = 64M
myisam_use_mmap = 1
myisam-recover-options = BACKUP
table_open_cache = 64
#
# * Query Cache Configuration
#
query_cache_limit = 10M
query_cache_size = 64M
query_cache_type = 1
join_buffer_size = 4M
#
# Error log - should be very few entries.
#
log_error = /var/log/mysql/error.log
expire_logs_days = 10
max_binlog_size = 100M
#
# * InnoDB
#
innodb_buffer_pool_size = 2048M
innodb_log_file_size = 256M
innodb_log_buffer_size = 16M
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency = 8
innodb_read_io_threads = 64
innodb_write_io_threads = 64
innodb_io_capacity = 50000
innodb_flush_method = O_DIRECT
transaction-isolation = READ-COMMITTED
innodb_support_xa = 0
innodb_commit_concurrency = 8
innodb_old_blocks_time = 1000


nginx
user www-data;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 2048;
    multi_accept on;
    use epoll;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    sendfile on;
    tcp_nodelay on;
    tcp_nopush on;
    access_log off;

    client_max_body_size 50M;
    client_body_buffer_size 1m;
    client_body_timeout 15;
    client_header_timeout 15;
    keepalive_timeout 2 2;
    send_timeout 15;

    open_file_cache max=2000 inactive=20s;
    open_file_cache_valid 60s;
    open_file_cache_min_uses 5;
    open_file_cache_errors off;

    gzip_static on;
    gzip on;
    gzip_vary on;
    gzip_min_length 1400;
    gzip_buffers 16 8k;
    gzip_comp_level 6;
    gzip_http_version 1.1;
    gzip_proxied any;
    gzip_disable "MSIE [1-6]\.(?!.*SV1)";
    gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript application/json image/svg+xml svg svgz;

    include /etc/nginx/conf.d/*.conf;
}


nginx-vhost
server {
    listen 80;
    server_name _;
    chunked_transfer_encoding off;
    root /var/www/html;
    index index.php index.html index.htm;
    error_log /var/log/nginx/error.log crit;

    location / {
        try_files $uri $uri/ /index.php?$args;
    }

    location ~ \.php$ {
        try_files $uri =404;
        include /etc/nginx/fastcgi_params;
        fastcgi_pass unix:/var/run/php/php7.1-fpm.sock;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_read_timeout 3s;
    }
}


In php-fpm I failed to achieve anything noticeable.

Before the final, the amount of data was increased; unfortunately, I did not have enough time to optimize my solution further. But after the final a sandbox was opened, where you can still run your solutions and compare the results with the top.

5. Conclusions


I am glad that I participated in this championship: I came to understand how Docker works and dug deeper into configuring servers under high load. I also enjoyed the competitive spirit and the chatter in the Telegram chat. For the whole championship the top was held by C++ and Go programmers. One could follow their example and also write in one of those languages, but I wanted to see my results with what I know and work with. Thank you, Mail.Ru, for this.

6. References


1. Source code
2. highloadcup.ru (the first round of the competition)

Source: https://habr.com/ru/post/337076/

