This post compares the performance of the cores of HTTP servers built with seven C/C++ libraries, as well as (for general context) ready-made solutions in this area: nginx and node.js.
An HTTP server is a complex and interesting mechanism. There is an opinion that a programmer who has not written his own compiler is no good; I would replace “compiler” with “HTTP server”: it is a parser, plus networking, plus asynchrony with multithreading, and much more.
Testing across all possible parameters (static file serving, dynamic content, various encryption modules, proxying, etc.) would take far more than a month of hard work, so the task is simplified: we will compare the performance of the cores. The core of an HTTP server (as of any network application) is the socket event dispatcher plus some primary mechanism for processing those events (implemented as a pool of threads, processes, etc.). This also includes the HTTP parser and the response generator. At first glance everything should come down to testing the capabilities of one or another system mechanism for handling asynchronous events (select, epoll, etc.), of their meta-wrappers (libev, Boost.Asio, etc.), and of the OS kernel, but a specific implementation in the form of a ready-made solution can show a significant difference in performance.
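To make that concrete, below is a minimal sketch of such a core on libev. This is my own illustration, not the benchmarked code: error handling is trimmed and request parsing is stubbed out.

```c
/*
 * Minimal sketch of an HTTP-server core on libev: one listening
 * socket, one event loop dispatching readiness events to callbacks.
 * Not the benchmarked implementation; error handling is trimmed.
 */
#include <ev.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

static void on_readable(struct ev_loop *loop, ev_io *w, int revents)
{
    char buf[4096];
    ssize_t n = read(w->fd, buf, sizeof(buf));
    if (n <= 0) {               /* peer closed the connection or error */
        ev_io_stop(loop, w);
        close(w->fd);
        free(w);
        return;
    }
    /* ... parse the request and write the response here ... */
}

static void on_accept(struct ev_loop *loop, ev_io *w, int revents)
{
    int fd = accept(w->fd, NULL, NULL);
    if (fd < 0)
        return;
    ev_io *client = malloc(sizeof(*client));
    ev_io_init(client, on_readable, fd, EV_READ);
    ev_io_start(loop, client);  /* hand the new socket to the dispatcher */
}

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8000);       /* the port required by the test */
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(fd, SOMAXCONN);

    struct ev_loop *loop = EV_DEFAULT; /* select/kqueue/epoll under the hood */
    ev_io accept_watcher;
    ev_io_init(&accept_watcher, on_accept, fd, EV_READ);
    ev_io_start(loop, &accept_watcher);
    ev_run(loop, 0);                   /* dispatch events until stopped */
    return 0;
}
```

Which backend (select, kqueue, epoll) the loop ends up on is libev's choice at runtime; the point of the benchmark is precisely that cores which look identical on paper still differ noticeably in throughput.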
The hand-made version of the HTTP server was implemented on top of libev. It supports, of course, only a small subset of the notorious RFC 2616 (it is unlikely that any HTTP server implements it in full): just the minimum needed to satisfy the requirements for the participants of this test:
- Listen for requests on port 8000;
- Check the method (GET);
- Check the path in the request (/answer);
- The response must contain:
HTTP/1.1 200 OK
Server: bench
Connection: keep-alive
Content-Type: text/plain
Content-Length: 2

42
- For any other method/path, return a response with error code 404 (page not found).
As you can see, there are no extensions, no file access on disk, no gateway interfaces, etc.: everything is as simple as possible.
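For illustration, the matching logic fits in a few lines. The sketch below assumes the whole request line arrives in a single read (the real implementations buffer properly), and the exact 404 headers are my own assumption, since the spec above fixes only the status code.

```c
/*
 * Sketch of the required matching logic, assuming the whole request
 * line arrived in a single read (real servers must buffer input).
 * The 404 headers are an assumption; the spec fixes only the code.
 */
#include <string.h>
#include <unistd.h>

static const char RESP_200[] =
    "HTTP/1.1 200 OK\r\n"
    "Server: bench\r\n"
    "Connection: keep-alive\r\n"
    "Content-Type: text/plain\r\n"
    "Content-Length: 2\r\n"
    "\r\n"
    "42";

static const char RESP_404[] =
    "HTTP/1.1 404 Not Found\r\n"
    "Server: bench\r\n"
    "Content-Length: 0\r\n"
    "\r\n";

static void handle_request(int fd, const char *req)
{
    /* Only "GET /answer" earns a 200; any other method/path gets 404. */
    if (strncmp(req, "GET /answer ", 12) == 0)
        write(fd, RESP_200, sizeof(RESP_200) - 1);
    else
        write(fd, RESP_404, sizeof(RESP_404) - 1);
}
```

In the libev sketch above, something like this would be called from the read callback once a full request has been buffered.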
In cases where a server does not support keep-alive connections (cpp-netlib, by the way, was the only one with this distinction), testing was done in the corresponding non-keep-alive mode.
Background
Initially, the task was to implement an HTTP server handling a load of hundreds of millions of hits per day. It was assumed there would be a relatively small number of clients generating 90% of the requests and a large number of clients generating the remaining 10%. Each request had to be forwarded to several other servers, the responses collected, and the result returned to the client. The success of the project depended on the speed and quality of the response, so it was not possible to simply grab the first ready-made solution available. Answers to the following questions were needed:
- Should I reinvent my own wheel or use existing solutions?
- Is node.js suitable for high-load projects? If yes, then throw out the thickets of C++ code and rewrite everything in 30 lines of JS.
There were also less significant questions; for example, does HTTP keep-alive affect performance? (A year later the answer was voiced here: it does, and quite significantly.)
Of course, my own wheel was invented first; then node.js appeared (I learned about it two years ago), and then I wanted to know how much more efficient the existing solutions were than my own, and whether the time had been wasted. That is how this post came about.
Preparation
Hardware
- CPU: AMD FX(tm)-8120 Eight-Core Processor
- Network: localhost (see the TODO section for why)
Software
- OS: FreeBSD 9.1-RELEASE-p7
Tuning
When load-testing network applications, it is customary to change the following standard set of settings:
/etc/sysctl.conf:
kern.ipc.somaxconn = 65535
net.inet.tcp.blackhole = 2
net.inet.udp.blackhole = 1
net.inet.ip.portrange.randomized = 0
net.inet.ip.portrange.first = 1024
net.inet.ip.portrange.last = 65535
net.inet.icmp.icmplim = 1000
/boot/loader.conf:
kern.ipc.semmni = 256
kern.ipc.semmns = 512
kern.ipc.semmnu = 256
kern.ipc.maxsockets = 999999
kern.ipc.nmbclusters = 65535
kern.ipc.somaxconn = 65535
kern.maxfiles = 999999
kern.maxfilesperproc = 999999
kern.maxvnodes = 999999
net.inet.tcp.fast_finwait2_recycle = 1
However, in my testing they did not lead to any performance gain, and in some cases even caused a significant slowdown, so no changes were made to the system settings in the final tests (i.e. all settings at their defaults, GENERIC kernel).
Participants
Libraries
| Name | Version | Events | Keep-alive support | Mechanism |
|---|---|---|---|---|
| cpp-netlib | 0.10.1 | Boost.Asio | no | multithreaded |
| hand-made | 1.11.30 | libev | yes | multiprocess (one thread per process), asynchronous |
| libevent | 2.0.21 | libevent | yes | single-threaded*, asynchronous |
| mongoose | 5.0 | select | yes | single-threaded, asynchronous, with a list (more) |
| onion | 0.5 | libev | yes | multithreaded |
| Pion Network Library | 0.5.4 | Boost.Asio | yes | multithreaded |
| POCO C++ Libraries | 1.4.3 | select | yes | multithreaded (separate thread for incoming connections), with a queue (more) |
Ready-made solutions
| Name | Version | Events | Keep-alive support | Mechanism |
|---|---|---|---|---|
| node.js | 0.10.17 | libuv | yes | cluster module (multiprocess processing) |
| nginx | 1.4.4 | epoll, select, kqueue | yes | multiprocess processing |
* For the tests, reworked according to the “multiprocess, one thread per process” scheme.
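This “one single-threaded process per core” scheme is essentially the classic prefork pattern: fork after the listening socket exists, so that every worker accepts on the shared socket and runs its own event loop. A rough sketch of my own, where run_event_loop is a placeholder (e.g. the libev loop sketched earlier):

```c
/*
 * Rough sketch of the "multiprocess, one thread per process" scheme:
 * fork after the listening socket is created, so every single-threaded
 * worker accepts on the shared socket and runs its own event loop.
 */
#include <sys/wait.h>
#include <unistd.h>

void run_event_loop(int listen_fd);   /* assumed worker body, not shown */

void serve(int listen_fd, int nprocs)
{
    for (int i = 0; i < nprocs; i++) {
        if (fork() == 0) {            /* child: single-threaded worker */
            run_event_loop(listen_fd);
            _exit(0);
        }
    }
    while (wait(NULL) > 0)            /* parent: just reap the workers */
        ;
}
```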
Disqualified
As the client, weighttp, a tool from the lighttpd developers, was used. Originally httperf was planned as a more flexible tool, but it crashes constantly. Besides, weighttp is based on libev, which suits FreeBSD much better than httperf with its select. As the main test script (a wrapper over weighttp that also accounts for resource consumption, etc.), G-WAN's ab.c ported to FreeBSD was considered, but it was later rewritten from scratch in Python (bench.py in the appendix).
The client and server were running on the same physical machine.
The variable parameters were:
- Number of server threads (1, 2, and 3);
- Number of parallel open client requests (10, 100, 200, 400, 800).
In each configuration, 20-30 iterations were performed, 2 million requests per iteration.
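For reference, a single weighttp run for one of these configurations might look like this (an illustration rather than a transcript of the actual runs; the real commands are generated by bench.py from the appendix):

```
weighttp -n 2000000 -c 400 -t 3 -k "http://127.0.0.1:8000/answer"
```

Here -n is the total number of requests, -c the number of concurrent connections, -t the number of client threads, and -k enables HTTP keep-alive.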
Results
The first version of this article contained gross violations of testing methodology, which users VBart and wentout pointed out in the comments. In particular, tasks were not strictly pinned to processor cores, and the total number of server/client threads exceeded reasonable limits. Features affecting the measurements (AMD Turbo Core) were not disabled, and measurement errors were not reported. The current version of the article uses the approach described here.
For servers running in single-threaded mode, the following results were obtained (the maximum of the medians over the server/client thread combinations was taken):
| Place | Name | Client threads | % time, user | % time, system | Successful requests (per sec) | Unsuccessful (%) |
|---|---|---|---|---|---|---|
| 1 | nginx | 400 | 10 | 10 | 101210 | 0 |
| 2 | mongoose | 200 | 12 | 15 | 53255 | 0 |
| 3 | libevent | 200 | 16 | 33 | 39882 | 0 |
| 4 | hand-made | 100 | 20 | 32 | 38550 | 0 |
| 5 | onion | 10 | 22 | 33 | 29230 | 0 |
| 6 | POCO | 10 | 25 | 50 | 20943 | 0 |
| 7 | pion | 10 | 24 | 83 | 16526 | 0 |
| 8 | node.js | 10 | 23 | 173 | 9374 | 0 |
| 9 | cpp-netlib | 10 | 100 | 183 | 5362 | 0 |
Scalability:
In theory, with more cores we would observe a linear increase in performance. Unfortunately, this theory cannot be verified here: there are not enough cores.
nginx, frankly, surprised me: it is, after all, a ready-made, multifunctional, modular solution, yet its results exceeded those of the highly specialized libraries by an order of magnitude. Respect.
mongoose is still raw: version 5.0 has not been run in yet, and the branch is under active development.
cpp-netlib showed the worst result. Not only does it lack support for HTTP keep-alive connections, it also crashed somewhere in the bowels of Boost, and completing all the iterations in a row was problematic. The solution is definitely raw, and the documentation is outdated. A well-deserved last place.
node.js has already been scolded here; I will not be so categorical, but V8 still has a long way to go. What kind of high-load solution is it if, even with no payload, it consumes resources so greedily and delivers 10-20% of the performance of the top participants?
HTTP keep-alive on/off: where the post mentioned above saw a difference of up to 2x, in my tests the difference reached up to 10x.
Measurement error according to ministat: “No difference proven at 95.0% confidence.”
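For context, ministat is the stock FreeBSD utility for comparing two sets of measurements at a given confidence level; a typical invocation (file names here are hypothetical) is:

```
ministat -c 95 baseline.txt candidate.txt
```

where each file holds one measurement per line and -c selects the confidence level.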
TODO
- a benchmark in the “client and server on different machines” mode. Care is needed here: everything can bottleneck on the network hardware, and not just the network card models but the switches, routers, etc., the entire infrastructure between the real machines. For a start, a direct connection could be tried;
- testing client HTTP APIs (organized as a server plus a proxy). The problem is that not all libraries provide an API for implementing an HTTP client. On the other hand, some popular libraries (libcurl, for example) provide an exclusively client-side API;
- using other HTTP clients. httperf was not used for the reasons given above; ab, according to many reviews, is outdated and cannot sustain real loads. Many others were recommended; there are a couple dozen solutions here, some of which would be worth comparing;
- a similar benchmark in a Linux environment. That should be an interesting topic (at the very least, a new wave of flame-war discussions);
- running the tests on a top-end Intel Xeon with plenty of cores.
Links
Stress-testing httperf, siege, apache benchmark, and pronk: HTTP clients for load testing servers.
Performance Testing with Httperf: tips and tricks for benchmarking.
ApacheBench & HTTPerf: a description of the benchmarking process from G-WAN.
Warp: another high-performance HTTP server, written in Haskell.
Appendix
In the appendix you will find the source code and the results of all testing iterations, as well as detailed information on building and installing the HTTP servers.