
Simple and fast application stress testing framework


High-load web applications are being built more and more often, but there are surprisingly few frameworks that let you stress test them flexibly and plug in your own logic.
There are of course plenty of tools (see the poll at the end of the post), but some don't support cookies, some generate a weak load, some are very heavy, and most are suited mainly to very uniform requests. A tool that generates each request dynamically with custom logic, and does so as fast as possible (ideally in Java, so it can be patched if needed), was nowhere to be found.

So it was decided to sketch out my own, since it takes only 3-5 classes in this case. Basic requirements: speed and dynamic request generation. Here speed means not just thousands of rps but, ideally, a stress tool limited only by network bandwidth and able to run from any free machine.
Engine


The requirements are clear; now we need to decide what this will all run on, i.e. which HTTP/TCP client to use. We clearly don't want the outdated thread-per-connection model, because it caps out at a few thousand rps depending on the machine's power and the cost of context switching in the JVM. So apache-http-client and the like are out. Instead we should look at the so-called non-blocking network clients built on NIO .

Fortunately, the Java world has long had a de facto standard in this niche: the open-source Netty , which is also very versatile and low-level, and lets you work with both TCP and UDP.

Architecture


To create our request sender, we need a handler, a ChannelUpstreamHandler in Netty terms, from which the requests will be sent.

Next, we need a high-performance timer to fire the maximum possible number of requests per second (rps). The standard ScheduledExecutorService basically copes with this, but on weaker machines the HashedWheelTimer (bundled with Netty) is a better fit because of its lower overhead when adding tasks; it just requires some tuning. On powerful machines there is almost no difference between them.
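A minimal sketch of both options (Netty 3 API); sendOne() is a hypothetical method that opens one connection and writes one request, and in practice only one of the two timers would be used:

 // Sketch of the two timer options; names here are illustrative.
 import java.util.concurrent.Executors;
 import java.util.concurrent.ScheduledExecutorService;
 import java.util.concurrent.TimeUnit;

 import org.jboss.netty.util.HashedWheelTimer;
 import org.jboss.netty.util.Timeout;
 import org.jboss.netty.util.Timer;
 import org.jboss.netty.util.TimerTask;

 public class TimerSketch {

     // Option 1: the standard JDK scheduler, fine on powerful machines.
     static void withScheduler(final int rps) {
         final long periodNanos = TimeUnit.SECONDS.toNanos(1) / rps;
         ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
         scheduler.scheduleAtFixedRate(new Runnable() {
             @Override
             public void run() {
                 sendOne();
             }
         }, 0, periodNanos, TimeUnit.NANOSECONDS);
     }

     // Option 2: Netty's HashedWheelTimer. Its timeouts are one-shot, so
     // the task re-arms itself; the tick duration is the knob to tune.
     static void withWheelTimer(final int rps) {
         final long periodNanos = TimeUnit.SECONDS.toNanos(1) / rps;
         final Timer wheel = new HashedWheelTimer(1, TimeUnit.MILLISECONDS);
         wheel.newTimeout(new TimerTask() {
             @Override
             public void run(Timeout timeout) {
                 sendOne();
                 wheel.newTimeout(this, periodNanos, TimeUnit.NANOSECONDS);
             }
         }, periodNanos, TimeUnit.NANOSECONDS);
     }

     static void sendOne() { /* connect and write one request */ }
 }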

Lastly, to squeeze the maximum rps out of any machine, when neither the OS connection limits nor the machine's current load are known, the safest approach is trial and error: first set some outrageous value, say a million requests per second, and watch at what number of connections errors start appearing when new ones are created. Experiments showed that the achievable maximum rps is usually slightly below that figure.
So we take that figure as the initial rps value and then, if the errors recur, reduce it by 10-20%.
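A rough sketch of this rate control; the field and method names are illustrative, not taken from the project:

 // Illustrative sketch of trial-and-error rate control.
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.concurrent.atomic.AtomicLong;

 public class RateControl {

     // Start with an outrageous value, e.g. a million requests per second.
     private final AtomicInteger rps = new AtomicInteger(1000000);
     private final AtomicLong lastReduced = new AtomicLong();

     // Called on every connection error that looks like a limit being hit.
     void processLimitErrors() {
         long now = System.currentTimeMillis();
         long prev = lastReduced.get();
         // Back off at most once per second so that a burst of errors from
         // the same overload does not collapse the rate to zero.
         if (now - prev > 1000 && lastReduced.compareAndSet(prev, now)) {
             int cur = rps.get();
             rps.compareAndSet(cur, cur - cur / 10); // reduce by ~10%
         }
     }

     int currentRps() {
         return rps.get();
     }
 }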

Implementation

Request generation


To support dynamic request generation, we create an interface with a single method that our stress tool will call to get the contents of the next request:
 public interface RequestSource {

     /**
      * @return request contents
      */
     ChannelBuffer next();
 }


ChannelBuffer is Netty's abstraction over a stream of bytes, i.e. the entire contents of the request must be returned here as bytes. For HTTP and other text protocols this is simply the byte representation of the request string (text).
Also, in the case of HTTP, two newline characters must be put at the end of the request (\n\n); for Netty this also marks the end of the request (it will not send the request otherwise).
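For example, a minimal RequestSource for HTTP might look like this (the host and the counter-based URL are made up for illustration):

 // Sketch: a RequestSource that generates a unique GET request per call.
 import java.util.concurrent.atomic.AtomicLong;

 import org.jboss.netty.buffer.ChannelBuffer;
 import org.jboss.netty.buffer.ChannelBuffers;
 import org.jboss.netty.util.CharsetUtil;

 public class HttpGetSource implements RequestSource {

     private final AtomicLong counter = new AtomicLong();

     @Override
     public ChannelBuffer next() {
         // Each call produces a different URL; note the blank line (\n\n)
         // terminating the request, as described above.
         String req = "GET /item/" + counter.incrementAndGet() + " HTTP/1.1\n"
                 + "Host: localhost\n"
                 + "Connection: close\n\n";
         return ChannelBuffers.copiedBuffer(req, CharsetUtil.UTF_8);
     }
 }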

Sending

To send requests in Netty, you first have to connect to the remote server explicitly, so when the client starts we begin opening connections periodically, at a frequency matching the current rps:
 scheduler.startAtFixedRate(new Runnable() {
     @Override
     public void run() {
         try {
             ChannelFuture future = bootstrap.connect(addr);
             connected.incrementAndGet();
         } catch (ChannelException e) {
             if (e.getCause() instanceof SocketException) {
                 processLimitErrors();
             }
             ...
         }
     }
 }, rpsRate);
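For context, the bootstrap above is the usual Netty 3 client setup, roughly like this (standard wiring, not necessarily the project's exact code):

 // Sketch of the standard Netty 3 client bootstrap behind the snippet above.
 import java.net.InetSocketAddress;
 import java.util.concurrent.Executors;

 import org.jboss.netty.bootstrap.ClientBootstrap;
 import org.jboss.netty.channel.ChannelPipeline;
 import org.jboss.netty.channel.ChannelPipelineFactory;
 import org.jboss.netty.channel.Channels;
 import org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory;

 ClientBootstrap createBootstrap() {
     ClientBootstrap bootstrap = new ClientBootstrap(
             new NioClientSocketChannelFactory(
                     Executors.newCachedThreadPool(),    // "boss" threads
                     Executors.newCachedThreadPool()));  // worker threads

     // The pipeline consists of just our handler (shown below).
     bootstrap.setPipelineFactory(new ChannelPipelineFactory() {
         @Override
         public ChannelPipeline getPipeline() {
             return Channels.pipeline(new StressClientHandler());
         }
     });
     bootstrap.setOption("connectTimeoutMillis", 1000);
     return bootstrap;
 }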


After a successful connection we immediately send the request itself, so it's convenient for our Netty handler to inherit from SimpleChannelUpstreamHandler, which has a dedicated method for this. But there is one nuance: a new connection is handled by the so-called boss thread, which must not run long operations, and generating a new request may be one, so the work has to be handed off to another thread. As a result, sending the request itself looks roughly like this:

 private class StressClientHandler extends SimpleChannelUpstreamHandler {
     ....
     @Override
     public void channelConnected(ChannelHandlerContext ctx, final ChannelStateEvent e) throws Exception {
         ...
         requestExecutor.execute(new Runnable() {
             @Override
             public void run() {
                 e.getChannel().write(requestSource.next());
             }
         });
         ....
     }
 }

Error handling

Next comes handling the errors from creating new connections when the current request rate is too high. This is the trickiest part, or rather it is hard to make platform-independent, since different operating systems behave differently in this situation. For example, Linux throws a BindException, Windows a ConnectException, and Mac OS X either of those or even an InternalError (Too many open files). So on Mac OS the stress tool behaves the most unpredictably.
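One way to smooth this over is to classify all of these as the same "limit reached" condition (a sketch; the helper name is ours, not the project's):

 // Sketch: platform-independent check for connection-limit errors.
 import java.net.BindException;
 import java.net.ConnectException;
 import java.net.SocketException;

 static boolean isLimitError(Throwable t) {
     return t instanceof BindException        // typical on Linux
             || t instanceof ConnectException // typical on Windows
             || t instanceof SocketException
             || (t instanceof InternalError   // seen on Mac OS X:
                     && String.valueOf(t.getMessage()).contains("Too many open files"));
 }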

Because of this, in addition to handling errors at connect time, our handler also needs to do the following (counting the errors for the statistics along the way):
 private class StressClientHandler extends SimpleChannelUpstreamHandler {
     ....
     @Override
     public void exceptionCaught(ChannelHandlerContext ctx, ExceptionEvent e) throws Exception {
         e.getChannel().close();
         Throwable exc = e.getCause();
         ...
         if (exc instanceof BindException) {
             be.incrementAndGet();
             processLimitErrors();
         } else if (exc instanceof ConnectException) {
             ce.incrementAndGet();
             processLimitErrors();
         }
         ...
     }
     ....
 }

Server responses

Finally, we need to decide what to do with the responses from the server. Since this is a stress test and only throughput matters to us, all that's left is to count them for the statistics:

 private class StressClientHandler extends SimpleChannelUpstreamHandler {

     @Override
     public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) throws Exception {
         ...
         ChannelBuffer resp = (ChannelBuffer) e.getMessage();
         received.incrementAndGet();
         ...
     }
 }


One could also count the classes of HTTP responses (2xx, 4xx and so on).
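For instance, with no HTTP decoder in the pipeline, the status class can be pulled out of the raw status line by hand (a sketch, not part of the project's statistics):

 // Sketch: counting response status classes from the raw bytes.
 import java.util.concurrent.atomic.AtomicLongArray;

 import org.jboss.netty.buffer.ChannelBuffer;
 import org.jboss.netty.util.CharsetUtil;

 // index 1 = 1xx ... index 5 = 5xx; index 0 unused
 private final AtomicLongArray statusClasses = new AtomicLongArray(6);

 private void countStatus(ChannelBuffer resp) {
     // The status line starts like "HTTP/1.1 200 OK"; the first digit of
     // the status code is enough to classify the response.
     String head = resp.toString(resp.readerIndex(),
             Math.min(resp.readableBytes(), 12), CharsetUtil.US_ASCII);
     int space = head.indexOf(' ');
     if (space > 0 && space + 1 < head.length()) {
         char c = head.charAt(space + 1);
         if (c >= '1' && c <= '5') {
             statusClasses.incrementAndGet(c - '0');
         }
     }
 }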
All code

All the code, with extra goodies like reading HTTP templates from files, templating, timeouts and so on, is available as a ready maven project on github (ultimate-stress) . A ready-made distribution (jar file) can be downloaded there as well.

Conclusions


Everything, of course, runs into the limit on open connections. For example, on Linux, after raising some OS settings (ulimit, etc.), about 30K rps was achieved on a local machine with modern hardware. In theory, beyond the connection limit and the network there should be no other restrictions; in practice the JVM overhead is still felt, and the actual rps comes out 20-30% lower than the configured one.

Source: https://habr.com/ru/post/186100/

