📜 ⬆️ ⬇️

Java Logging: A Nightmare Story

Introduction


The path of the Java platform to the correct way to write lines to log files is stubborn and tortuous. The history of logging in Java is quite informative in terms of exploring the features of Open Source, including its interaction with corporations and individual programmers. I am going to tell as much as possible about the history of Java logging, as well as about what it all came to and how to live. My analysis of the situation will be quite subjective about the reason that logging is always a matter of taste, and my tastes have been formed by my adults. I think that this will not be informative in terms of some technical features of the entire zoo logging frameworks, but in terms of the policy and psychology of the developers in the Open Source model.

Start


It is clear that any logging library should allow at least to print a string to the console / log file.

In the beginning was, of course, System.err.println . In addition, the first version of the Servlet API consisted of a log function (though, rather primitive).
')
One of the more advanced solutions in 1999 was the Avalon project (and the subprojects called Excalibur and Fortress), which in addition to DI services offered the LogEnabled interface. The component that LogEnabled declared itself was injected (I use this word instead of “injected” to emphasize its connection with DI) an object of type Logger, where you could write: a) lines b) exceptions. This approach at that time seemed fresh and innovative, but from a modern point of view it is pure idiocy and over-engineering. There is no point in using DI for logging, and a static instance of this Logger itself would be fine for everyone. In Avalon, I had to think about where this damned Logger should be saved and what to do if the class does not use DI (that is, it is not controlled by the container), but I really want to log in it.

Around 1999, a new generation library appeared - log4j. The library's prototype was developed by IBM (back in the era when the blue giant tried to squeeze Java into OS / 2), then ASF picked up the baton. The product was already much more thoughtful and run-in on real needs. In general, it must be said that by that time server applications in Java were only about a year old, and logging was always required on the server. During this time, the Java community began to gradually understand what they needed and how.

log4j divided the concept of a logger or category (i.e., the application area that wants to write to the log), the actual entries in the log, which are carried out by the so-called appenders , and the formatting of records ( layout ). The log4j configuration determines which appenders are attached to which categories and which log level messages fall into each appender.

The cornerstone of log4j is the hierarchy of categories. For example, you can log all messages from org.hibernate and mute everything from org.hibernate.type . After some time, the de facto established practice of matching the hierarchy of categories and hierarchy of packages in an application.

The hierarchy of categories allows you to effectively cut off unnecessary messages, so log4j worked extremely fast. By the way, the principal for loggers is not so much the recording speed, as the filtering speed is unnecessary (and unnecessary, usually more than 90%) and formatting.

The principles laid down in log4j were rather well ported to other languages: log4cxx, log4net (and the new cub - log4php). The standard logging package in Python 2.x is a redesigned log4j (with a small addition of other libraries).

So, we summarize. A successful architecture, clear configuration scheme, the principle of fail-safe - why not include such a wonderful library in the platform?

Java Logging API


In fact, everything turned out strange. IBM, in the depths of which log4j arose, turned out to be quite brisk in questions of forming a new JSR47 (Java Logging API). In particular, the friend responsible for JSR47, Graham Hamilton, decided not to base log4j , but the original IBM logging toolkit . Moreover, the logging toolkit was used to its fullest: not only the names of all the main classes coincided, but also their implementations; The code was tried to finish as little as possible, apparently, in time for the next release of the platform. However, conceptually it was very similar to log4j, only instead of appenders it was called handlers, and instead of layout there was a formatter.

Since the main purpose of JSR47 is to determine the API, and not the implementation , there were only 4 output tools (by default in the platform) (in log4j more than 10), and the formatting tools were so poor that almost immediately I had to make my own formatters, because not enough. JSR47 offered to use the configuration in the form of .properties , and it was noted in brackets that not all can be described in the file. Thus, with the complexity of the configuration, the programmer unexpectedly discovered that it was again necessary to write code, since in the form of .properties its configuration is not realizable.

This is not to say that JSR47 is losing performance. In some places, he overtook log4j by keeping in memory a special representation of his configuration (which, by the way, at the same time complicated this very configuration). However, as it turned out, JSR47 compulsorily collected the so-called Caller Information, that is, "from where this message is logged." Getting Caller Information is quite an expensive operation, it proceeds using Native-code. Experienced uncles from log4j knew this, therefore they provided this opportunity with the proviso “don't turn it on”.

The developers of log4j made an open petition , where they demanded to “remove JSR47 from the conveyor”, while it was not yet part of the platform. The petition was signed by more than 100 people ... However, it was too late. The next JDK release was approved and the platform rushed into the future with a rudimentary java.util.logging , or abbreviated as JUL. The new logging was so underdeveloped and inconvenient that it was decided to use it only in a few application servers (among them Resin and Jetty). Sun, however, responded to the petition and most of the major problems with the original JSR47 were gradually resolved. Nevertheless, these manipulations were more like the installation of props to a wooden bridge, which by no means make this bridge reinforced. The developers of log4j made a nod toward the Sun, noting, however, that the degree of curvature of JUL is still quite high. In addition, the JDK 1.4 license did not allow using log4j as a JUL implementation. The last train for log4j is gone.

Not being able to support a large number of log writers (i.e. handlers), JUL showed up by identifying an incredible number of logging levels. For example, for debug messages, there were already 3 levels - FINE, FINER and FINEST. Seeing all this, the developers often did not understand at all which of the three levels, damn it, should be used.

The Java community was completely disoriented by the emergence of “standard” logging in parallel with the popular, stable and evolving log4j. No one understood which of them was a tenant. There were often situations when several libraries were assembled in the project, each of which used its own logging and its own settings, recording its log files completely at odds with each other.

Of course, the community tried to fix this problem. Wrapping epidemic began . Or, I would even say, a pandemic.

Wrapper hell


When you connect several libraries and try to connect their logs into one unit (and the code cannot be modified), it will be called Adapter. Adapters were written from JUL to log4j and vice versa. Unfortunately, adapters in terms of functionality are “the least common multiple”. Even when context support (NDC and MDC) appeared in log4j, it was lost during the transfusion into JUL. Worse, JUL only worked since JDK 1.4, while an incredible amount of enterprise applications still sat on 1.3. As a result, the community became obsessed with the idea of ​​creating a “common de facto standard” that everyone would use together and who worked always and everywhere.

Around 2002, a project called commons-logging (JCL = Jakarta Commons Logging) emerged from the Jakarta group. In fact, it was a wrapper around all the logging tools available at the time. It was suggested to write applications in such a way that they addressed the wrapper (interface called Log ), which selected the “suitable” logging system and connected to it itself. The wrapper was poorly functional and did not add any additions to the existing means of logging.

How was the appropriate logging system automatically selected? But this is the most interesting. First, you could explicitly set it by placing a special commons-logging.properties file somewhere in CLASSPATH. Secondly, through the system property (which, obviously, no one will do). Thirdly, if log4j was found somewhere in CLASSPATH, it was automatically activated. The implementations of all other libraries were sought by the same method, the first one was always connected.

Handsomely! Well, that would be nice if all the software in the world would use commons-logging. Then you could easily assemble JARs, put them into the application server, and there JCL will pick up the logging of this application server and voila!

In fact, as it turned out, a bunch of software usually uses “the favorite logging of its developer”. This means that a completely arbitrary library can, in the form of a dependency, pull up, for example, log4j, which thus falls into CLASSPATH and suddenly switches JCL to use log4j. Even worse with commons-logging.properties . If any figure thought of pushing it into his JAR, then when you connect this JAR, you understand, write it was gone. The situation was particularly piqued by the fact that it was completely incomprehensible from which JAR it came from. Sometimes it helped to enumerate all JARs in alphabetical order. Sometimes a tambourine.

The complete unpredictability of the choice of logging turned out to be the main and very fun feature of JCL. The log4j group broke out with an angry Think again article before adopting the commons-logging API , where it offered to stop the epidemic and focus on refining the existing solution - log4j.

Unfortunately, it was too late. From the submission of Jakarta to commons-logging, hundreds and then thousands of libraries were translated. Among them were Hibernate, Spring, Tomcat. After that, numerous users of these libraries were overwhelmed by a wave of problems, generally described as ClassLoader hell . Application servers use a rather complex hierarchy of ClassLoaders, often with serious deviations from the J2EE standard. Under these conditions , the JCL is sometimes initialized twice, and incorrectly, leading to a completely mystical stack traces, which do not even allow one to suspect that the problem is in the log wrapper.

Why, strictly speaking, Open Source worked in such a strange way, giving birth to this perversion? Why the developers did not decide to just take and use another mature and popular Open Source product - log4j? The point here is perhaps in some inertia of the community, accustomed to be led by either ASF (and the Jakarta group that gave rise to this nightmare has a part of ASF), or Sun. As soon as a critical mass of projects using JCL is formed, all the rest (and not the most stupid people, don't they, Gavin King?) Start using JCL (for Apache is cool!). This is generally reminiscent of the Brownian movement, where brands such as Apache or Sun are capable of creating low pressure areas where millions of developers rush. In the case of the JCL, the “success story” is described in the blog of Rod Waldhoff (one of the developers of the so-called Jakarta Commons) in 2003.

New round of progress


So, somewhere in 2004 we have in the set:
  1. Stable and functionally developed log4j
  2. Dull java.util.logging
  3. Problem commons-logging
  4. Several small loggers unworthy of mention

Note that in the draft log4j at this time conservative moods prevailed. Particular attention was paid to the issue of compatibility with the old JDK. It seems that the development of the new log4j branch - 1.3.x. begins. This version is a kind of compromise solution: yes, I want a new functionality, yes, I want to maintain backward compatibility, yes, we will try to please both ours and yours. Meanwhile, JDK 1.5 is on its way with varargs, JMX extensions and a bunch of other gifts. In the team log4j began the fermentation of minds. The 2.x branch is budded - incompatible with the main 1.2.x branch and created specifically for JDK 1.5. The Java community is impatient. It seems to be happening as something . But what exactly is impossible to understand - log4j 2.0 still remains an unattainable alpha, log4j 1.3 is wildly buggy and does not provide the promised drop-in compatibility. And only branch 1.2 is still stable and alive and well, jumping over several years - attention! - from version 1.2.6 to 1.2.12.

Somewhere in 2006, one of the founding fathers of log4j, Ceki Gülcü, decides to leave the rapidly dying team. This is how the next “wrapper of everything” called SLF4J (Simple Logging Facade for Java) is born. Now it wraps around: log4j, jul, commons-logging and a new logger called logback. As you can see, progress quickly reached the “wrapper around the wrapper” stage. It is easy to predict that according to the same scheme, the number of libraries to be wrapped will grow as a factorial. However, SLF4J offers other other quirks. These are special binary adapters: from log4j to SLF4J, from commons-logging to SLF4J, and the like. Such adapters are made for the code whose source is not available; at the same time, they should replace the original JARs of the log libraries. I can not imagine what kind of porridge is formed, but if you really want, then you can.

With all my hatred of wrappers, hand on heart, SLF4J is a well-made product. Were taken into account all the shortcomings of the predecessors. For example, instead of shaman dancing with the search for classes in CLASSPATH a more reliable scheme was invented. Now the whole wrapper is divided into two parts - API (which is used by applications) and the Implementation, which is represented by separate JAR files for each type of logging (for example, slf4j-log4j12.jar , slf4j-jdk14.jar , etc.). Now you just need to connect the necessary Realization file to the project, after which - oops! all project code and all used libraries (provided that they access the SLF4J API) will log in the right direction.

Functionally, the SLF4J supported all modern NDC and MDC freaks. In addition to actually wrapping calls, SLF4J offered a small but useful bonus when formatting strings. The bonus is here next. In the code it is often necessary to print constructions of the form:

 log.debug("User " + user + " connected from " + request.getRemoteAddr()); 

In addition to printing the string itself, the user.toString() conversion will be implicitly followed by string concatenation. All anything. In the debug mode, the speed of execution doesn’t worry us much. However, even if we set the level, say, to INFO, it turns out that the construction of the string will still occur! No miracles: the string is constructed before calling log.debug , so log.debug has no way to control this. If you imagine that this log.debug located in some critical internal loop ... well, you can't live like this. The log4j developers suggested to frame the debugging code like this:

 if (log.isDebugEnabled()) { log.debug("User " + user + " connected from " + request.getRemoteAddr()); } 

Not good at it. In theory, all these problems should be taken up by the logging library itself. This problem has become just the Achilles heel of log4j. The developers reacted sluggishly to kicks, telling that now you can add an object to logging calls (exactly one!) And describe how this object will be logged using the ObjectRenderer interface. By and large, all this was excuses and half measures.

SLF4J was not squeezed within the framework of compatibility with older versions of the JDK and API, so the move offered a more elegant solution:
  log.debug("User {} connected from {}", user, request.getRemoteAddr()); 

In general, everything is simple. In this line, {} are references to parameters that are passed separately. Conversion of parameters to the string and final formatting of the log record occurs only when DEBUG level is set. Parameters can pass a lot. Works! Do not write framing if and other stupidity!

In brackets, it should be noted that this possibility was also completely unexpectedly implemented by the Groovy language, where there is a GString concept, i.e view string
 "User ${user} connected from ${request.getRemoteAddr()}" 
which is implicitly associated with several context variables (here it is user , request ), with the string being calculated in a lazy way . This is very convenient for such log libraries as log4j - you can get a GString as input, and then either throw it away without a calculation, or convert it to a normal (static) string, String.

In short, the SLF4J was done right, with the future in mind. This led to a serious increase in its popularity among the community: now, SLF4J are using such significant projects as Jetty, Hibernate, Mina, Geronimo, Mule, Wicket, Nexus ... well, almost all the losers who had hanged on commons-logging at one time switched to SLF4J . Interestingly, what prevented the improvement of commons-logging to the desired state many years ago? But these are the realities of Open Source - the development of software in it is more revolutionary than evolution.

At the same time with SLF4J, a completely new logger was submitted to the table - Logback. It was made by a man who ate a dog at logging, and in fact turned out to be a good product. Logback was originally sharpened by JDK 1.5+, having at one stroke got rid of all the backward-compatibility senile diseases inherent in the log4j project. And that means - varargs, java.util.concurrent and other delights . For example, due to the built-in runtime filtering system, you can change the logging level depending on the user session, scatter users in different log files, and so on.

I'll put the mustard into the idyll drawn by the author. Most of these features can be implemented as additional appenders to log4j. It is necessary to bend and file the configuration, it is more difficult, but it is a fact that it is not necessary to switch to this for a new logger. Thus, all advertised Logback chips are convenient, but not unique.

As for the community, it is with caution to the Logback. First, in a few years he got to version 0.9.x, and that scares some programmers. Secondly, the Logback is neither under the Apache umbrella, nor in the scope of the Sun. It confuses people scrupulous. Third, the author needs to eat, so for some add-ons to the Logback and support, it requires money. This sometimes scares off students. Among other things, Logback has a rather complicated dual license (LGPL / EPL), while log4j is a universal Apache license. For libraries and generally redistributable software, licensing is a very subtle point.

, Logback — . Logback logging-, . , :

, Open Source « ». SLF4J «». SLF4J - . , SLF4J, « ». Logback ( , ) . (, log4j 2.0, .) , Logback Apache, .

Conclusion


It is interesting to look at the history of the issue from the perspective of the psychology of programmers. After all, in principle, all this is spiral (and it seems like a progressive !) — «reinvent the wheel». « » « » . ( «-»). «» , , . , . (, Graham Hamilton IBM), . Jakarta Commons « » « ».

-, . 10- , , Sun - Java-. , Sun Sun. , , . — , :)

Source: https://habr.com/ru/post/113145/


All Articles