
Log4j 1.x performance in synchronous and asynchronous mode

Good day, Habr readers!

At the moment I am working on improving the performance of a large corporate solution. Its nature allows many users to perform similar actions, which means that behind the scenes the application server keeps executing the same code. While profiling the long road to a faster application, we noticed that the top spots among the slowest code sections were occupied by log4j calls. The first thought was excessive logging, but further analysis refuted that: the data is extremely important for support engineers, and if we removed it now, either the code would come back or diagnosing production problems would become significantly harder.

This prompted the present study: is there a way to optimize the logging process?
Disclaimer: due to the abundance of diagrams in this article that provoke psychedelic associations (like: "this is not a comparison of synchronous and asynchronous logging modes, this is a dragon killing a snake!"), it is not recommended for minors, people with an unstable psyche, pregnant women, or engineers burned out on production whose nearest patch ships no earlier than next year.

Careful: heavy image traffic under the cut.

What is the reason?


As usual, the reason is trivial: contended access. After several experiments with varying numbers of parallel threads, it became clear that the duration of a logging call does not grow linearly, so the conclusion is that lock contention costs us a lot.
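To illustrate where the time goes, here is a minimal, self-contained model (not the project's actual code, just a sketch): like a synchronous FileAppender, every thread funnels through one shared lock, and the average cost of a "log" call, including the wait on that lock, grows as threads are added.

```java
import java.io.StringWriter;
import java.util.concurrent.atomic.AtomicLong;

public class ContentionDemo {
    // A shared sink guarded by one lock -- a stand-in for a synchronous file appender.
    private static final StringWriter SINK = new StringWriter();
    private static final AtomicLong NANOS = new AtomicLong();

    private static void log(String msg) {
        long t = System.nanoTime();
        synchronized (SINK) {            // every thread contends on this lock
            SINK.write(msg);
            SINK.write('\n');
        }
        NANOS.addAndGet(System.nanoTime() - t);
    }

    // Average nanoseconds per log() call, lock wait included.
    static long avgCallNanos(int threads, int callsPerThread) throws InterruptedException {
        SINK.getBuffer().setLength(0);
        NANOS.set(0);
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    log("One hundred bytes log message for performing some tests");
                }
            });
        }
        for (Thread t : ts) t.start();
        for (Thread t : ts) t.join();
        return NANOS.get() / ((long) threads * callsPerThread);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("1 thread, avg ns/call:  " + avgCallNanos(1, 50_000));
        System.out.println("8 threads, avg ns/call: " + avgCallNanos(8, 50_000));
    }
}
```

The exact numbers depend on the machine, but the per-call average under 8 threads is typically several times the single-thread figure, which matches the non-linear growth observed above.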

Here are the results of measurements:


What is the solution?


One could search for other logging approaches or upgrade libraries, but the task is to achieve the maximum result with minimum effort. I could also talk about the "log4j 2" library, but that deserves a separate article. Here we consider only the tools log4j 1.x provides out of the box.

Among the appenders shipped with the library is AsyncAppender, which uses an intermediate buffer to accumulate logging events and work with the file system asynchronously (if the final appender is a file one; it was originally intended for the SMTP appender). Generated logging events accumulate in the buffer, and only once a certain fill level is reached do they get written to the file.
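The mechanism can be sketched with a bounded queue and a single consumer thread. This is only a model of the AsyncAppender idea, not its actual implementation: producers enqueue events cheaply, and one background thread does the slow I/O (here an in-memory list stands in for the file).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of the AsyncAppender idea: a bounded buffer plus a dispatcher thread.
public class AsyncSink implements AutoCloseable {
    private static final String POISON = "\u0000EOF"; // shutdown marker

    private final BlockingQueue<String> buffer;
    private final List<String> out = new ArrayList<>(); // stands in for the log file
    private final Thread worker;

    public AsyncSink(int bufferSize) {
        buffer = new ArrayBlockingQueue<>(bufferSize);
        worker = new Thread(() -> {
            try {
                while (true) {
                    String msg = buffer.take();
                    if (msg.equals(POISON)) break;
                    out.add(msg); // the only thread touching the "file"
                }
            } catch (InterruptedException ignored) {
            }
        });
        worker.start();
    }

    // Cheap for the caller; blocks only when the buffer is full.
    public void log(String msg) throws InterruptedException {
        buffer.put(msg);
    }

    @Override
    public void close() throws InterruptedException {
        buffer.put(POISON);
        worker.join(); // drain everything before reporting results
    }

    public int written() {
        return out.size();
    }
}
```

After close() returns, every logged event has reached the sink, which mirrors why events "appear late" in the real log file: they sit in the buffer until the dispatcher flushes them.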

Measurements


Now that the approach is defined, we need to understand how effective it is; for that we will run the corresponding measurements.

We will measure this way:
0) A warning in advance: I did not "prettify" the data; in places you can see that the CPU switched to other work, and I left those places as they were. This, too, is part of how the system really behaves.
1) The tests are divided into 3 groups:
- 55 logging events in total (runs of 1 to 10 events, step 1)
- 550 logging events in total (runs of 10 to 100 events, step 10)
- 5500 logging events in total (runs of 100 to 1000 events, step 100)
2) Each group has 3 subgroups of tests, depending on the buffer size (we will try to find the optimal one):
- 500 events
- 1500 events
- 5000 events
3) The tests are run both synchronously and asynchronously.
Synchronous Logger Configuration
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration>
    <appender name="fileAppender" class="org.apache.log4j.FileAppender">
        <param name="file" value="st.log"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d{ABSOLUTE} %5p %t %c{1}:%M:%L - %m%n"/>
        </layout>
    </appender>
    <root>
        <priority value="info"/>
        <appender-ref ref="fileAppender"/>
    </root>
</log4j:configuration>


Asynchronous Logger Configuration
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration>
    <appender name="fileAppender" class="org.apache.log4j.FileAppender">
        <param name="file" value="st.log"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d{ABSOLUTE} %5p %t %c{1}:%M:%L - %m%n"/>
        </layout>
    </appender>
    <appender name="ASYNC" class="org.apache.log4j.AsyncAppender">
        <param name="BufferSize" value="500"/>
        <appender-ref ref="fileAppender"/>
    </appender>
    <root>
        <priority value="info"/>
        <appender-ref ref="ASYNC"/>
    </root>
</log4j:configuration>


4) The tests themselves are simple logging calls interleaved with "random work" (lasting 1 to 15 ms, so that the threads alternate their access to the file).
Test Source Code
package com.ice.logger_test;

import org.apache.commons.lang3.time.StopWatch;
import org.apache.log4j.Logger;

import java.util.Random;

public class SimpleTest {
    private static Logger logger = Logger.getLogger(SimpleTest.class);
    private static double NANOS_TO_SEC = 1000000000.0d;
    private static String LOG_MESSAGE =
            "One hundred bytes log message for performing some tests using sync/async appenders of log4j library";

    public static void main(String[] args) throws InterruptedException {
        //performTest("Single thread");
        ThreadRunner t1 = new ThreadRunner();
        for (int i = 0; i < 10; i++) {
            new Thread(t1).start();
        }
    }

    private static void performTest(String message) throws InterruptedException {
        logger.info("Warm-up...");
        logger.info("Warm-up...");
        logger.info("Warm-up...");
        StopWatch timer = new StopWatch();
        Random ran = new Random();
        for (int i = 1; i <= 10000; i += getIncrementator(i)) {
            timer.reset();
            timer.start();
            int iterations = 0;
            for (int j = 1; j <= i; j++) {
                timer.suspend();
                Thread.sleep(ran.nextInt(15) + 1); // some work
                timer.resume();
                logger.info(LOG_MESSAGE);
                timer.suspend();
                Thread.sleep(ran.nextInt(15) + 1); // some work
                timer.resume();
                iterations = j;
            }
            timer.stop();
            System.out.printf(message + " %d iteration(s) %f sec\n",
                    iterations, (timer.getNanoTime() / NANOS_TO_SEC));
        }
    }

    private static int getIncrementator(int i) {
        if (i >= 0 && i < 10) return 1;
        if (i >= 10 && i < 100) return 10;
        if (i >= 100 && i < 1000) return 100;
        if (i >= 1000 && i < 10000) return 1000;
        if (i >= 10000 && i <= 100000) return 10000;
        return 0;
    }

    static class ThreadRunner implements Runnable {
        @Override
        public void run() {
            try {
                performTest(Thread.currentThread().getName());
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}


Both sets of results, synchronous and asynchronous (plus the intermediate ones, so as not to lose the thread), will be visualized as diagrams for clarity.
So, let's begin.

Synchronous start, 2 threads
First, let's see how the logger behaves in a synchronous configuration. Run all scripts on 2 threads.
Here are the results:


Asynchronous start, 2 threads
Buffer = 500
Now let's switch to asynchronous mode and, along the way, try to find the optimal buffer size. We start with 500.


Buffer = 1500
Now we triple the buffer and run the same tests:


Buffer = 5000
Now we increase the buffer tenfold and run the same tests:


Total for 2 threads
Now, for clarity, let's collect all the asynchronous tests (to try to determine the optimal buffer) and the synchronous ones (out of curiosity: who won?) in one diagram:

I think the speed-up in asynchronous mode is now clearly visible.
But before drawing conclusions, let's repeat the tests with 5 and 10 threads.

Synchronous start, 5 threads


Asynchronous start, 5 threads
Buffer = 500


Buffer = 1500


Buffer = 5000


Total for 5 threads


Synchronous start, 10 threads

Asynchronous start, 10 threads
Buffer = 500


Buffer = 1500


Buffer = 5000


Total for 10 threads







The conclusion, as they say, is plain to see.
Since we can now speak confidently about the advantages of asynchronous logging, let's increase the test volume another 10 times: we add a test with 55,000 logging events in total (runs of 1,000 to 10,000 events in increments of 1,000). We take a buffer of 500, since at first glance (and as will be shown later) it is the optimal size in our tests.

Synchronous start, 10 threads on large volumes of data





Asynchronous start, 10 threads on large volumes of data





Total for 10 threads on large volumes of data





Optimal buffer
At the moment, we have already accumulated quite a lot of statistical data, so let's average them all and see which buffer is the most suitable.
2 threads






5 threads






10 threads






10 threads, large volume of tests



Total: 500 events. It is this buffer size that allows the most effective work in asynchronous mode.
And if we compare the total (or average) running time of all the tests, we get a constant reflecting how many times asynchronous mode beats synchronous; in my measurements it is 8.9 (times).

Conclusion


The material above makes it clear that the asynchronous logging strategy gives a performance benefit. Then the question arises: why not use it always? To choose one method over the other, you need to understand the mechanisms hidden inside. Below are a few points taken from the official site:
1) AsyncAppender runs on its own thread (while FileAppender runs in the context of the calling thread), so when using it, it is advisable to raise the application server's thread limit.
2) AsyncAppender adds memory overhead, since logging events are not flushed to the file immediately, but only after the buffer fills.
3) Each lock on the log file is held longer than with the synchronous approach, since the flushed buffer contains much more data to write.
In principle, everything is prosaic, but keep in mind that synchronization is itself a lock, so it is important that moving the lock from one place to another does not make things worse.
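The remark about moving the lock can be demonstrated with a bounded buffer (as used above, this is a model, not log4j's own code): once producers outrun the consumer and the buffer fills, enqueueing blocks or fails again, so the contention has not vanished, it has only relocated.

```java
import java.util.concurrent.ArrayBlockingQueue;

public class BackpressureDemo {
    public static void main(String[] args) {
        // A buffer of 3 with no consumer draining it: once it is full,
        // producers can no longer enqueue -- the lock has moved, not disappeared.
        ArrayBlockingQueue<String> buffer = new ArrayBlockingQueue<>(3);
        for (int i = 0; i < 3; i++) {
            System.out.println("accepted: " + buffer.offer("event " + i)); // true
        }
        System.out.println("accepted: " + buffer.offer("event 3")); // false: buffer full
    }
}
```

With AsyncAppender the equivalent situation is a sustained logging rate above what the dispatcher thread can write; depending on configuration, callers then either block or events are summarized/discarded.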

Use asynchrony where it really pays off:
- long-running appenders: SMTP, JDBC
- locks on a shared resource: FTP, local file storage
But above all, be sure to profile your code.

Excel version of this article:
docs.google.com/spreadsheet/ccc?key=0AkyN15vTZD-ddHV0Y3p4QVUxTXVZRldPcU0tNzNucWc&usp=sharing
docs.google.com/spreadsheet/ccc?key=0AkyN15vTZD-ddFhGakZsVWRjV1AxeGljdDczQjdNbnc&usp=sharing

Thanks for your attention. I hope the article proves useful to you.
Have a productive week!

Source: https://habr.com/ru/post/215147/

