Hi Habr! A couple of months ago I wanted to benchmark several network frameworks to see how large the gap between them really is. Do I need Node.js where I would rather use Python with Gevent, or should I reach for Ruby with its EventMachine?

I want to draw your attention to the fact that this material is not a guide to choosing a framework and may contain controversial points. I was not going to publish the results of this study at all, but when I ran across them again I caught myself thinking they might be useful to someone. And now, let me start throwing graphs at you.
1. Text / Httperf / VPS 1 CPU, 512Mb RAM
I ran the first test on the cheapest DigitalOcean VPS (1 core, 512 MB RAM, 20 GB SSD). For performance testing I used the httperf utility. To produce the necessary load, five VPSes of the same configuration were used. To run the test on all clients simultaneously, I used the autobench utility with the following parameters:
autobench_admin --single_host --host1 example.com --port1 8080 --uri1 / --low_rate 50 --high_rate 600 --rate_step 10 --num_call 10 --num_conn 6000 --timeout 5 --clients XX.XX.XX.XX:4600,XX.XX.XX.XX:4600,XX.XX.XX.XX:4600,XX.XX.XX.XX:4600,XX.XX.XX.XX:4600 --file bench.tsv
This test starts at 50 connections per second (10 requests per connection) and climbs to 600 in steps of 10 connections per second. Each test point establishes a total of 6000 connections, and any request not served within 5 seconds is counted as an error.
All the HTTP servers do the same thing: they return the string "I am a stupid HTTP server!" for each request. The results were as follows (the X axis shows requests per second):
[Graphs: CPU load; RAM consumption (% of 512 MB); number of responses; response time (ms); number of errors]
As soon as we hit 100% CPU usage, memory consumption starts to grow, the number of responses drops, the response time of each request increases, and errors begin to appear. As I wrote above, every request that does not receive a response within 5 seconds counts as an error, and that is exactly what is happening here; you can trace it in the "Response time" graph.
Results (in brackets: the number of requests processed without errors):
- Gevent ( 4700 )
- Express.js ( 3600 )
- Eventlet ( 3200 )
- Tornado ( 2200 )
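Each test server was functionally equivalent to the minimal WSGI sketch below (a Gevent flavor is shown; this is an illustration written for this article, not the original benchmark source):

```python
def app(environ, start_response):
    """WSGI app matching the benchmark: every request gets the same line."""
    body = b'I am a stupid HTTP server!'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]

def serve():
    # With gevent installed, serving the app takes two lines.
    from gevent.pywsgi import WSGIServer
    WSGIServer(('0.0.0.0', 8080), app).serve_forever()
```

The Express.js, Tornado, Eventlet, Twisted and EventMachine variants differ only in framework boilerplate around the same one-line response.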
I am never completely satisfied with my work, so after a couple of hours I decided that testing on a VPS was not the best choice. The difference in performance between the frameworks is clear and some conclusions can be drawn, but we cannot find out how many clients we could serve on one core of this processor: it is one thing to share unknown resources with someone, and quite another when all the resources are known and at our disposal.
2. Text / Httperf / Intel Core i7-4770 Quad-Core Haswell, 32 GB DDR3 RAM
For the next test, I rented a dedicated server from Hetzner (EX40) with an Intel Core i7-4770 quad-core Haswell processor and 32 GB of DDR3 RAM.
This time I created 10 VPSes to generate the load and ran autobench with the following parameters:
autobench_admin --single_host --host1 example.com --port1 8080 --uri1 / --low_rate 50 --high_rate 1500 --rate_step 50 --num_call 10 --num_conn 15000 --timeout 5 --clients XX.XX.XX.XX:4600,XX.XX.XX.XX:4600,XX.XX.XX.XX:4600,XX.XX.XX.XX:4600,XX.XX.XX.XX:4600 ... --file bench.tsv
This test starts at 50 connections per second (10 requests per connection) and climbs to 1500 in steps of 50 connections per second. Each test point establishes a total of 15,000 connections, and any request not served within 5 seconds counts as an error.
The source code of the servers is the same as in the first test. A single server instance is running and uses only one core. In this test I added the Twisted 13.2 framework and EventMachine 1.0.3. I removed memory consumption from the results because the difference is, by modern standards, negligible. I won't drag this out; here are the results:
[Graphs: CPU load; number of responses; response time (ms); number of errors]
Here, as before, we ran up against the CPU, which was to be expected. On average, performance is 3 times higher than on the DigitalOcean VPS (1 core, 512 MB), which lets us draw some conclusions about the amount of resources we are actually allocated there.
Results (in brackets: the number of requests processed without errors):
- Eventmachine ( details below )
- Gevent ( 12500 )
- Express.js ( 11500 )
- Eventlet ( 9000 )
- Twisted ( 7000 )
- Tornado ( 6500 )
EventMachine
EventMachine surprised me with its performance and pulled far ahead of the competition, so I had to raise the load to 25,000 requests per second just for it. The results on the graphs:
[Graphs: CPU load; number of responses; response time (ms); number of errors]
I suspect it could have handled 30,000 requests per second, but I had to move on, so I could not verify that. By this point I already knew I would use Python for my project, so I needed frameworks in other languages only for comparison.
3. Files / Siege / Intel Core i7-4770 Quad-Core Haswell, 32 GB DDR3 RAM
As I wrote above, I am never completely satisfied with my work, so I went to bed with a sense of accomplishment but woke up with the thought "I need more tests!". Returning a line of text to each request is nice, but it is far from a web server's only job, so let's serve files.
For this test I used 10 VPSes to generate the load. I found out experimentally that each DigitalOcean VPS gets, on average, a 100 Mbps channel. My server had a 1 Gbps channel, and I wanted to saturate it. The files served were 10,000 images of various sizes from an online store. To create the load I used the siege utility with the following parameters:
siege -i -f fileslist.txt -c 55 -b -t1M
fileslist.txt holds the list of files; siege establishes 55 connections and starts hammering the server with requests through them for 1 minute, picking files at random from the list. It is definitely worth noting that this test runs on 10 machines at the same time, which means we establish not 55 but 550 simultaneous connections. I also varied this option from 5 to 55 in steps of 5, thereby raising the load on the server from 50 to 550 simultaneous connections.
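The file list itself is just URLs, one per line. Here is a sketch of how such a fileslist.txt can be generated (the host name and image directory are placeholders, not the actual ones from the test):

```python
import os

HOST = 'http://example.com:8080'   # the server under test (placeholder)
IMAGES_DIR = 'images'              # local copy of the image set (placeholder)

def build_file_list(root, host):
    """Walk `root` and return one URL per file, suitable for siege -f."""
    urls = []
    for dirpath, _, filenames in os.walk(root):
        for name in sorted(filenames):
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            urls.append('%s/%s' % (host, rel.replace(os.sep, '/')))
    return urls

if __name__ == '__main__':
    with open('fileslist.txt', 'w') as f:
        f.write('\n'.join(build_file_list(IMAGES_DIR, HOST)))
```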
Here is what we get (the X axis shows the number of simultaneous connections):
[Graphs: number of requests executed; requests processed per second; CPU load (%); RAM consumption (% of 32 GB); channel load (MB/s); average response time per request (s)]
In this test I added RAM consumption back, as well as the nginx web server for comparison. The bottleneck here is the communication channel, and a single core is enough to saturate the entire 1 Gbps channel.
Results (in brackets: the number of requests processed without errors):
- Nginx ( 100175 )
- Eventlet ( 97925 )
- Gevent ( 96918 )
- Express.js ( 96162 )
- Twisted ( 85733 )
- Tornado ( 83241 )
4. GridFS / Siege / Intel Core i7-4770 Quad-Core Haswell, 32 GB DDR3 RAM
That could have been the end of the article, but I wanted to use MongoDB GridFS in my project, so I decided to see how performance would change with it. This test is the same as the 3rd one, except that I uploaded all 10,000 images into MongoDB and rewrote the web servers to serve the files from the database. Here is what we get:
[Graphs: number of requests executed; requests processed per second; CPU load (%); RAM consumption (% of 32 GB); channel load (MB/s); average response time per request (s); number of errors]
During the test, Gevent returned some responses with errors, so I added the "Number of errors" graph. Overall, GridFS is quite usable, but keep in mind that the database itself puts a considerable load on the CPU; in my case it had 7 free cores at its disposal, whereas with the plain file system everything is much simpler.
Results (in brackets: the number of requests processed without errors):
- Express.js ( 88714 )
- Gevent ( 86182 )
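For reference, serving files out of GridFS takes very little code. Below is a minimal WSGI sketch of the idea (pymongo and gridfs are assumed to be installed; the database name and wiring are illustrative, not the original benchmark source):

```python
import mimetypes

def make_app(fs):
    # `fs` is expected to behave like a gridfs.GridFS instance, e.g.:
    #   from pymongo import MongoClient
    #   from gridfs import GridFS
    #   fs = GridFS(MongoClient()['benchmark'])  # db name is illustrative
    def app(environ, start_response):
        name = environ.get('PATH_INFO', '/').lstrip('/')
        try:
            # Fetch the most recent file stored under this name.
            gridout = fs.get_last_version(name)
        except Exception:  # gridfs raises NoFile for unknown names
            start_response('404 Not Found',
                           [('Content-Type', 'text/plain')])
            return [b'Not found']
        ctype = mimetypes.guess_type(name)[0] or 'application/octet-stream'
        start_response('200 OK', [('Content-Type', ctype),
                                  ('Content-Length', str(gridout.length))])
        return gridout  # GridOut is iterable and streams the file in chunks
    return app
```

Returning the GridOut object directly lets the WSGI server stream the file chunk by chunk instead of buffering it in memory.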
Findings
- A MacBook Pro Retina really does last 9 hours on one charge.
- Node.js is not the only tool for developing network applications, as some people believe.
- Gevent has a very good performance.
- Formatting an article takes more time than writing it.
- Performance testing is a complex process that takes time.
But seriously, it all depends on the conditions your project will run under. You can run a huge number of tests, but once the service is written, everything is likely to look very different. For example, if the number of images grows from 10,000 to 1,000,000, the hard disk becomes the bottleneck instead of the communication channel.
Materials
If you decide to conduct your own testing, or to study mine in more detail, this list should help you.
Reports
Full reports with individual graphs and figures can be downloaded from these links:
- Text / Httperf / VPS 1 CPU, 512Mb RAM
- Text / Httperf / Intel Core i7-4770 Quad-Core Haswell, 32 GB DDR3 RAM
- Files / Siege / Intel Core i7-4770 Quad-Core Haswell, 32 GB DDR3 RAM
- GridFS / Siege / Intel Core i7-4770 Quad-Core Haswell, 32 GB DDR3 RAM
Tools
In my tests I used:
Frameworks
The following frameworks took part in the tests:
Thank you all for your attention.
Follow me on Twitter, where I talk about working at a startup, my mistakes and right decisions, about Python, and about everything related to web development.
PS
I'm looking for developers for my company; details are in my profile.