Testing at Yandex: building our own Lunapark

Sometimes a one-second glance at the response time graph is enough to say: the service won't fly. A couple more seconds, and the cause is found: the processor cores are loaded unevenly, and too few threads are running on the server. How do you build a convenient system for collecting and storing load test results? Today I'll share the experience we have gained with this at Yandex.

By the way, I will talk about Yandex.Tank and Graphite at the Test Environment event; registration is open until 18:00 on 18 November. You can ask your questions live there.

If you have read doctornkz's article on how load testing is organized at Yandex, you know that the results of a shooting (a load test run) are stored in a repository that displays them through a web interface. It is called Lunapark. It is very convenient to run a test and send a link to everyone interested (and to see everything on one page yourself). The service is a web application tailored to our internal processes (it has gamification and integration with other internal resources), and we do not plan to publish it. So I decided to show how to build a similar system using only open-source products.

System architecture

The automation system is the module that controls test launches, lets you parameterize them, and performs additional actions (downloading a production server log or bringing up a test environment). All of this can be done with tools such as Jenkins, Maven, or Rake. I won't cover it today; it is a topic for a separate long post.

The load generator is the workhorse, the module that creates load on the target (the test bench). This story is about Yandex.Tank, a modular and extensible shooter that lets you use different generators inside, including the familiar JMeter. Note that Tank is an open-source project published by Yandex in 2012. It is not for beginners: you need to be on first-name terms with Linux, and better still be able to write simple scripts.

Finally, the results repository. Tank shoots at the service and measures response times and other parameters. The result is a set of time series that need to be stored somewhere and then displayed and analyzed. We will use Graphite for this.

Graphite is a high-performance, scalable time series store written in Python. It is open source. Loading data into it is very easy (there are many ways to do it, to suit every taste), and retrieving the data through the web API is convenient (there are plenty of frontends for that too). For details on how Graphite is used at Yandex, its architecture, and its performance, see dkulikovsky's talk listed under Links.

Installing Yandex.Tank

If you have Ubuntu, you're in luck: just add the Tank repository and install it like any other package, and the dependencies will be pulled in automatically (from here on you may need root rights):

 # add to sources.list:
 deb http://ppa.launchpad.net/yandex-load/main/ubuntu precise main
 deb-src http://ppa.launchpad.net/yandex-load/main/ubuntu precise main

 sudo apt-get update && sudo apt-get install yandex-load-tank-base

If you do not have Ubuntu (for example, you want to try it on macOS), you can try downloading the .deb and converting it to .rpm, but the most universal way is to download the source from GitHub:

 git clone https://github.com/yandex-load/yandex-tank.git

You don't need to build Tank itself, since it is written in Python, but you will need to download and install its dependencies. Among them is Phantom, a high-performance web server, also from Yandex, which Tank uses as its load generator. A talk about it is also available.

The required Python libraries are installed as follows:

 pip install ipaddr lxml progressbar psutil mysqldb sqlalchemy

In addition, you have to build Phantom from source. I won't explain how here; if you need it, write to me and I'll tell you.

It's time to shoot

To fire the Tank, you need to write a configuration file for it (I'm a bit of a Captain Obvious today). I won't go into the details, of which there are many; here is the simplest example:

 [phantom]
 address=example.org
 rps_schedule=line(1, 100, 10m)
 headers = [Host: example.org]
   [Connection: close]
   [Bloody: yes]
 uris=/
   /list
   /img
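To get a feel for what this schedule asks of the target, here is a minimal Python sketch. It assumes that line(1, 100, 10m) means a linear ramp from 1 to 100 requests per second over ten minutes; the function names are illustrative and are not part of Tank.

```python
def line_rps(start_rps, end_rps, duration_s, t):
    """Target request rate at second t of a linear ramp (assumed semantics)."""
    if not 0 <= t <= duration_s:
        raise ValueError("t is outside the schedule")
    return start_rps + (end_rps - start_rps) * t / duration_s


def line_total_requests(start_rps, end_rps, duration_s):
    """Approximate request count: the area under the ramp (a trapezoid)."""
    return (start_rps + end_rps) / 2 * duration_s


# line(1, 100, 10m): ten minutes is 600 seconds
midpoint = line_rps(1, 100, 600, 300)
total = line_total_requests(1, 100, 600)
```

Under that assumption the run sends roughly 30,300 requests in total and passes 50.5 rps at the five-minute mark, which helps when estimating how much data ends up in the results store.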

After creating the file, just launch Tank with the command yandex-tank. By default it looks for a config named load.ini in the current directory. It will rumble, fire, and finish shooting; the output is a text file phout*.log with the data, which people usually advise feeding to gnuplot. But we're not like that, right?

Installing Graphite

Unfortunately, there is no official deb package for Graphite at the moment, so we will install it from the Python package index (PyPI):

 apt-get install python python-dev python-cairo
 pip install whisper carbon graphite-web django==1.5.1 Twisted==11.1.0 django-tagging

After installation, copy the default configuration files (*.conf.example -> *.conf), for example like this:

 for file in /opt/graphite/conf/*.example; \
   do cp "$file" "${file%.*}"; done

By default, Graphite stores data at one-minute resolution. That is not enough for us, of course: in load tests every second counts. Let's configure the storage policy (in storage-schemas.conf):

 [load]
 pattern = ^one_sec\.yandex_tank\.
 retentions = 1s:7d,5s:1y

What are these ancient runes? I told Graphite that every metric matching the regexp in the pattern parameter should be stored according to the policy in the retentions parameter:

 1s:7d, 5s:1y

It's simple: one-second resolution for seven days, then five-second resolution for a year.
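To see what such a policy costs in storage, here is a rough Python sketch that parses the retentions string and counts data points. The 12 bytes per point (a 4-byte timestamp plus an 8-byte value) is an assumption about the on-disk format, file headers are ignored, and the helper names are made up for this example.

```python
import re

# Seconds per unit suffix; a year is taken as 365 days.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}

# Assumption: one stored point takes 12 bytes (4-byte timestamp + 8-byte value).
BYTES_PER_POINT = 12


def to_seconds(token):
    """Convert a token such as '5s' or '7d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy])", token.strip())
    if match is None:
        raise ValueError("unsupported time token: %r" % token)
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_retentions(spec):
    """Return [(seconds_per_point, number_of_points)] for each archive."""
    archives = []
    for part in spec.split(","):
        precision, duration = part.split(":")
        step = to_seconds(precision)
        archives.append((step, to_seconds(duration) // step))
    return archives


archives = parse_retentions("1s:7d,5s:1y")
total_points = sum(points for _, points in archives)
approx_bytes = total_points * BYTES_PER_POINT
```

For 1s:7d,5s:1y this gives 604,800 one-second points plus 6,307,200 five-second points, or about 83 MB per metric under the stated assumption. That is worth keeping in mind before sending thousands of metrics at one-second resolution.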

One more thing. You need to set the time zone to your local one; otherwise, when you specify local time, you won't see your graphs and will simply miss your data. The time zone is set in the local_settings.py file, for example like this (the file does not exist by default):

 echo 'TIME_ZONE = "Europe/Moscow"' \
   > /opt/graphite/webapp/graphite/local_settings.py

Now create the Django database tables:

 cd /opt/graphite/webapp/graphite
 python manage.py syncdb

To start Graphite, you need to start carbon and the web frontend:

 /opt/graphite/bin/carbon-cache.py start
 /opt/graphite/bin/run-graphite-devel-server.py /opt/graphite/

By default, Carbon listens for data on port 2003. Let's try writing something to Graphite. It is very simple, for example:

 echo my.favourite.metric 1 $(date +%s) | nc -q0 localhost 2003

Here we simply send the value 1 with the current timestamp to the metric "my.favourite.metric". Now let's load the contents of /proc/vmstat into Graphite (this is already something useful):

 while read -r metric; \
   do echo one_sec.vmstat.$metric $(date +%s); \
 done < /proc/vmstat \
   | nc -q0 localhost 2003
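Both shell commands above send the same kind of text line: a metric path, a value, and a Unix timestamp, separated by spaces. For scripts that outgrow nc, the same thing can be sketched in Python. The function names and the one_sec.vmstat prefix default are choices made for this example, and the code assumes Carbon is listening on localhost:2003 as configured above.

```python
import socket
import time


def format_metric(path, value, timestamp):
    """One plaintext-protocol line: '<path> <value> <timestamp>' plus newline."""
    return "%s %s %d\n" % (path, value, int(timestamp))


def vmstat_metrics(vmstat_text, timestamp, prefix="one_sec.vmstat"):
    """Convert 'name value' rows of /proc/vmstat into protocol lines."""
    lines = []
    for row in vmstat_text.splitlines():
        fields = row.split()
        if len(fields) == 2:  # ignore anything that is not 'name value'
            lines.append(format_metric(prefix + "." + fields[0], fields[1], timestamp))
    return lines


def send_lines(lines, host="localhost", port=2003):
    """Send already formatted lines to Carbon over a single TCP connection."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# Example (needs a running Carbon on localhost:2003):
# send_lines([format_metric("my.favourite.metric", 1, time.time())])
# with open("/proc/vmstat") as source:
#     send_lines(vmstat_metrics(source.read(), time.time()))
```

Formatting is kept separate from sending so that the lines can be inspected or logged before they go over the wire.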

Of course, many tools have already been written to collect system resource data. Take a look, for example, at the Diamond and CollectD projects.

You can view the uploaded data through the web interface, which listens on port 8080 by default. Play with it a bit, and then we'll continue.

Connect Tank to Graphite

Well then! It's time to introduce our new friends to each other. That's easy too. Add the following section to the Tank configuration file:

 [graphite]
 address=localhost

That's it. Now you can shoot again and see the results in Graphite. In addition, in the results folder you will now find an HTML file that Tank has thoughtfully generated for us. It already contains the graphs for the right time intervals. Here are the graphs we see there:

Quantiles and average response time

The quantile chart shows the distribution of response times for each second.

Requests per second by marker

This graph shows how many requests fell on each marker. Before shooting, the cartridges (requests) can be labeled with markers to separate, for example, light requests from heavy ones.

Average times by marker

How does the server respond to different types of requests? The answer is here.

Response codes

If errors occur, you will see them on this graph.

Cumulative Quantiles

Unlike the first chart, here the quantiles accumulate from the start of the test. You can see when they stop changing, which means you have shot enough to get an idea of how response times are distributed overall.
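The difference between per-second and cumulative quantiles is easy to see on a toy data set. The sketch below uses the nearest-rank definition of a quantile, which is only one of several common definitions and is not necessarily the one Tank uses; the sample response times are invented for illustration.

```python
import math


def quantile(sorted_values, q):
    """Nearest-rank quantile of an already sorted, non-empty list (q in 0..100)."""
    rank = max(1, math.ceil(q / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def per_second_and_cumulative(samples_by_second, q):
    """Return [(second, quantile of that second, quantile since test start)]."""
    seen = []
    rows = []
    for second in sorted(samples_by_second):
        current = sorted(samples_by_second[second])
        seen = sorted(seen + current)
        rows.append((second, quantile(current, q), quantile(seen, q)))
    return rows


# Hypothetical response times in milliseconds, grouped by second of the test.
samples = {
    0: [10, 12, 11, 50],
    1: [10, 11, 12, 13],
    2: [200, 11, 10, 12],
}
rows = per_second_and_cumulative(samples, 90)
```

On this data the per-second 90th percentile jumps between 50, 13, and 200 ms, while the cumulative one stays at 50 ms in all three seconds. That plateau is the "stopped changing" signal described above.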

Report template

Remember that in the previous section we discussed how to load system data into Graphite? How do we see those metrics too? To generate the HTML with images, Tank uses a template that can be specified in the options:

 [graphite]
 template = ./my.tpl

The template is just HTML with variables that Tank substitutes. For example:

 <h2>RPS by marker</h2>
 <img src="" />
 <h2>Average response time by marker</h2>
 <img src="" />
 <h2>HTTP codes</h2>
 <img src="" />

A link to a single graph in Graphite looks like this:

 http://{host}:{web_port}/render/?
 width={width}&
 height={height}&
 from={start_time}&
 until={end_time}&
 target=aliasByMetric({prefix}.overall.quantiles.25_0)&
 target=aliasByMetric({prefix}.overall.quantiles.50_0)&
 target=aliasByMetric({prefix}.overall.quantiles.75_0)&
 target=aliasByMetric({prefix}.overall.quantiles.90_0)&
 target=aliasByMetric({prefix}.overall.quantiles.95_0)&
 target=aliasByMetric({prefix}.overall.quantiles.99_0)&
 target=aliasByMetric({prefix}.overall.quantiles.100_0)&
 target=aliasByMetric({prefix}.overall.avg_response_time)&
 areaMode=all

The fields in curly braces are substitution fields whose names speak for themselves. {host} is replaced by the host specified in the settings, and {start_time} and {end_time} by the start and end times of the shooting. You get the idea.
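To illustrate the substitution, here is a small Python sketch that fills a shortened version of such a link with str.format. How Tank performs the substitution internally is not shown in this article, so treat this as an analogy only. All values are hypothetical, and the prefix is borrowed from the retention pattern configured earlier.

```python
RENDER_TEMPLATE = (
    "http://{host}:{web_port}/render/?"
    "width={width}&height={height}&"
    "from={start_time}&until={end_time}&"
    "target=aliasByMetric({prefix}.overall.quantiles.50_0)&"
    "target=aliasByMetric({prefix}.overall.avg_response_time)&"
    "areaMode=all"
)


def render_url(**fields):
    """Substitute every {name} field; a missing field raises KeyError."""
    return RENDER_TEMPLATE.format(**fields)


# All values below are hypothetical examples.
url = render_url(
    host="localhost",
    web_port=8080,
    width=800,
    height=400,
    start_time=1384780000,
    end_time=1384780600,
    prefix="one_sec.yandex_tank",
)
```

The resulting address can be opened in a browser or placed in an img tag of your own template, next to the graphs that Tank generates.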

What is the result?

So, we have a shooter that sends its data to Graphite and generates HTML with links to that data. How do we now start shooting automatically? With cron? That would work. But it is more convenient to use Jenkins. More on that some other time. Stay tuned!

By the way, if you have read this far, the topic must interest you. Come discuss it at the Test Environment event. There my colleagues will tell you everything else about gamification and automation in load testing. Come listen and chat!

Links

tech.yandex.ru/events/yac/2013/talks/1122 - dkulikovsky's talk about Graphite
github.com/graphite-project/graphite-web - Graphite on github
github.com/yandex-load/yandex-tank - Yandex Tank on github
github.com/BrightcoveOS/Diamond - Diamond
collectd.org - CollectD
twitter.com/direvius - Follow me on Twitter

Source: https://habr.com/ru/post/202446/
