
Identify all the classes that the Java application uses.

Surely everyone who lists Java development experience on their resume has, at least once, written the line
public static void main(String[] args)
compiled it, and run it with a command like java HelloWorld.
But how many know what happens inside the JVM between the moment this command is executed and the moment control passes to the main method, and how Java finds and loads the classes the program needs? A problem that once came up in production forced the author to figure this out. The results of that investigation are below. One caveat up front: this article does not claim to cover every existing JVM; testing was done only on the Sun HotSpot JVM.

Formulation of the problem


One day, a customer needed to find out which classes their application actually uses. The author already knew the application well: it was an explosive mixture of heterogeneous code that relies on inheritance, late binding, and dynamic compilation (to the credit of the system's developers, mostly competently and to the point). Information about the classes actually in use could therefore be a significant help in refactoring the application.
The task was stated as follows: while the application is running, a file must be generated containing the names of all classes the application directly uses. The application, incidentally, consists of two main parts: the application server, which hosts the application's web interface, and the processing server (a separate server on which various periodic tasks are run using Ant scripts). Naturally, information about classes must be collected from both parts of the application.
Let's start looking for a solution and, along the way, take a look at Java's class loading mechanisms.

Overriding Class Loader


The first idea that came to mind was to use the extension points of Java's class loading mechanism. Many articles have been written on this topic, including in Russian (links at the end of the article).
The essence of this mechanism is as follows:
  1. Classes are loaded by descendants of the abstract class java.lang.ClassLoader, through the method Class loadClass(String name). This method must find the byte array containing the bytecode of the requested class and pass it to the protected method Class defineClass(String name, byte[] b, int off, int len), which turns it into an instance of java.lang.Class. By implementing their own loaders, developers can thus load classes from any place that can supply an array of bytes;
  2. The platform's developers defined a clever hierarchy of loaders with a form of inheritance. Inheritance here is not class inheritance in the OOP sense but a separate hierarchy built with the getParent method of the ClassLoader class. When the JVM starts, the top of this hierarchy is created from three main loaders: the bootstrap loader (Bootstrap Classloader, responsible for loading the core platform classes), the extension loader (Extension Classloader, responsible for loading classes from lib/ext), and the system loader (System Classloader, responsible for loading user classes). Developers are free to extend this hierarchy downward from the system loader. By default, the HotSpot JVM uses the sun.misc.Launcher$AppClassLoader class as the system loader, but you can easily override it with the java.system.class.loader system property, using the command-line option java -Djava.system.class.loader=.. ;
  3. A delegation rule is declared: before attempting to load a class, any loader must first ask its parent, and only if the parent could not load the class should it try to load the class itself. Unfortunately, the beauty and convenience of this rule are offset by the fact that nothing enforces it. The author would soon run into the consequences of this rule being ignored.
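
The parent chain from item 2 can be inspected directly. The following is a minimal sketch; the class name LoaderChain and the method chainOf are illustrative and do not come from the article. The loader names it prints depend on the JVM version, and the bootstrap loader is represented by null rather than by an object:

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {

    /** Returns the given loader followed by its parents; the bootstrap loader (null) is not included. */
    static List<ClassLoader> chainOf(ClassLoader loader) {
        List<ClassLoader> chain = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            chain.add(cl);
        }
        return chain;
    }

    public static void main(String[] args) {
        // Walk upward from the system loader using getParent().
        for (ClassLoader cl : chainOf(ClassLoader.getSystemClassLoader())) {
            System.out.println(cl);
        }
        // Core classes are loaded by the bootstrap loader, so getClassLoader() returns null for them.
        System.out.println("Loader of java.lang.String: " + String.class.getClassLoader());
    }
}
```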

However, at this stage, the first concept of solving the problem has appeared:
  1. Implement our own class loader to replace the system one. When its loadClass method is called, it simply writes the class name to a file and forwards the load request to the real system loader. Provided the delegation rule described above is followed, this should catch all loaded user classes, even those loaded by other loaders;
  2. Force all JVMs running on the machine to use this class loader as the system one.
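
The first item could look roughly like the sketch below. It is not the author's loader: the class name LoggingClassLoader and the in-memory set (used here instead of a file to keep the example short) are assumptions. To be installed through java.system.class.loader, such a class would also have to be declared public and keep the public constructor that takes a single ClassLoader:

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

class LoggingClassLoader extends ClassLoader {

    // Names of all classes requested through this loader, in request order.
    private final Set<String> seen = Collections.synchronizedSet(new LinkedHashSet<String>());

    // The parent is the "real" loader that the request is forwarded to.
    public LoggingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        seen.add(name);                         // record the request
        return super.loadClass(name, resolve);  // standard parent-first delegation
    }

    Set<String> seenClassNames() {
        return seen;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        LoggingClassLoader loader = new LoggingClassLoader(ClassLoader.getSystemClassLoader());
        loader.loadClass("java.util.Date");
        System.out.println(loader.seenClassNames());
    }
}
```

A loader like this records only the requests that actually reach its loadClass method, which is why the approach depends so heavily on other loaders delegating to it.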

To implement the second item, the following tasks must be solved:

So, the job is done: the loader is compiled and placed in lib/ext, and the JAVA_TOOL_OPTIONS environment variable is set. We run the application, work with it, open the log and see... a meager list of a dozen classes, including system classes and a few third-party ones. This was the moment to recall that the delegation rule is not mandatory, and to look into the source code of Apache Ant and Tomcat. As it turned out, these applications use their own class loaders, which is partly what gives them their powerful functionality. However, for one reason or another, the developers of these products chose not to adhere to the recommended delegation rule, and their loaders do not always ask their parents before loading a class. That is why our system loader knows almost nothing about the classes loaded by Tomcat and Ant.
Thus, this method does not catch all the required classes, especially given the variety of application servers in use: who knows how the developers of the customer's application server treated the delegation rule.

Attempt number two. Apply class instrumentation


Sometimes knowledge and skill alone are not enough to solve a problem; intuition and a bit of luck are needed too. The author no longer remembers which search query about class loaders it was, but the search giant returned a link to an article about class instrumentation. As it turned out, this mechanism is designed to modify the bytecode of Java classes while they are being loaded (JProfiler, for example, uses it to inject performance-measurement code into classes). Wait, while they are being loaded? So this mechanism knows about every loaded class? Yes, it does, and as it turned out, even better than class loaders do: the method byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer) of the ClassFileTransformer interface is called on the registered transformer whenever any class is loaded. This method turned out to be the bottleneck through which every loaded class passes, with the exception of a very small number of system classes.
The task now looks as follows:
  1. Write our own transformer class implementing the ClassFileTransformer.transform method, which performs no transformation at all and only writes the name of the loaded class to a file.
  2. Once again, make sure that this class is attached to every Java application that is started.

The source code of the transformer class is presented below:
    package com.test;

    import java.io.File;
    import java.lang.instrument.ClassFileTransformer;
    import java.lang.instrument.IllegalClassFormatException;
    import java.lang.instrument.Instrumentation;
    import java.security.ProtectionDomain;

    public class LoggingClassFileTransformer implements ClassFileTransformer {

        // Called by the JVM before main(); registers the transformer.
        public static void premain(String agentArguments, Instrumentation instrumentation) {
            instrumentation.addTransformer(new LoggingClassFileTransformer());
        }

        /**
         * Logs the name of the class being loaded and leaves its bytecode untouched.
         * @param className name of the class being loaded
         * @return classfileBuffer, unmodified
         */
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer)
                throws IllegalClassFormatException {
            log(className);
            return classfileBuffer;
        }

        // The log file is placed in lib/ext
        private static final File logFile = new File(
                System.getProperty("java.ext.dirs").split(File.pathSeparator)[0]
                + "/LoggingClassFileTransformer.log");

        public static synchronized void log(String text) {
            // write the text to the log file
            // ...
        }
    }
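
The body of the log method is elided in the listing above. One possible way to fill it in is sketched below, as an assumption rather than the author's code: the class name ClassNameLog and the helper toBinaryName are hypothetical. The helper is there because transform receives class names in the JVM's internal form, with slashes instead of dots (for example java/util/Date):

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

class ClassNameLog {

    private final File logFile;

    ClassNameLog(File logFile) {
        this.logFile = logFile;
    }

    /** Converts an internal name such as java/util/Date to java.util.Date. */
    static String toBinaryName(String internalName) {
        return internalName.replace('/', '.');
    }

    /** Appends one line to the log file; synchronized because classes load on many threads. */
    synchronized void log(String text) {
        // Open in append mode so earlier entries are kept.
        try (PrintWriter out = new PrintWriter(new FileWriter(logFile, true))) {
            out.println(text);
        } catch (IOException e) {
            // Deliberately ignored: a logging failure must not break class loading.
        }
    }
}
```

Opening and closing the file on every call is slow but keeps the sketch simple and ensures each entry reaches the disk even if the JVM is killed.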

Here the mechanism for using transformer classes needs some explanation. To attach such a class to an application, we need a so-called premain class, i.e. a class containing the method public static void premain(String paramString, Instrumentation paramInstrumentation). As the name suggests, this method is called before the main method. At that point, transformer classes can be attached to the application using the addTransformer method of the Instrumentation interface. The class above is thus both a transformer class and a premain class. To use it, it must be packaged in a JAR file whose manifest (the META-INF/MANIFEST.MF file) contains the Premain-Class attribute with the fully qualified name of the premain class, in our case Premain-Class: com.test.LoggingClassFileTransformer. The full path to this archive must then be passed in the -javaagent option when starting the JVM. This is where the JAVA_TOOL_OPTIONS variable comes to the rescue.
So, the class is written, compiled, and packaged with its manifest into a JAR, the environment variable JAVA_TOOL_OPTIONS=-javaagent:"LoggingClassFileTransformer.jar" is set, the application has been run, and the log has been collected. PROFIT!
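
For illustration, the manifest described above can be produced with the standard java.util.jar API. This is only a sketch of what the manifest must contain, not the build procedure the author used; the class name AgentManifest and its methods are hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class AgentManifest {

    /** Builds a manifest that declares the given class as the agent's premain class. */
    static Manifest forAgent(String premainClass) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be present, otherwise the main attributes are not written out.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(new Attributes.Name("Premain-Class"), premainClass);
        return manifest;
    }

    /** Renders the manifest as the text that would go into META-INF/MANIFEST.MF. */
    static String render(Manifest manifest) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            manifest.write(out);
            return out.toString("UTF-8");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(render(forAgent("com.test.LoggingClassFileTransformer")));
    }
}
```

The same Manifest object can be passed to a java.util.jar.JarOutputStream constructor when the JAR is assembled in code rather than with the jar tool.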
upd3.
Habr user grossws suggested another instrumentation-based approach, AspectJ: “It is relevant if you need to cheaply instrument all the classes in your own part of the application without affecting the environment, but that is a slightly different task.”

The third way. Simple as a crowbar


Thanks to Habr users spiff and apangin, who reminded me in private messages of another approach, one that I had tried but undeservedly forgot to mention because in the end it did not fit.
This method is based on running the JVM with the -verbose:class or -XX:+TraceClassLoading option. With either of them, messages like [Loaded java.util.Date from shared objects file] are added to the JVM's standard output stream. However, despite its simplicity, this method has one major drawback: it is hard to control the format of the messages and where the output goes. The application in question already prints plenty of debugging information to stdout, and filtering the relevant messages out of that stream and redirecting them to a separate file for every JVM instance running on the server is very problematic.
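
To show what such filtering involves, here is a minimal sketch that extracts class names from captured output lines. The class name LoadedLineFilter is hypothetical, and the pattern covers only the message format quoted above; newer JVMs with unified logging print class loading events in a different format that this pattern does not match:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LoadedLineFilter {

    // Matches "[Loaded <class name> from <source>]" anywhere in a line.
    private static final Pattern LOADED = Pattern.compile("\\[Loaded (\\S+) from [^\\]]*\\]");

    /** Returns the distinct class names found in the given lines, in first-seen order. */
    static List<String> classNames(List<String> lines) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : lines) {
            Matcher m = LOADED.matcher(line);
            while (m.find()) {
                names.add(m.group(1));
            }
        }
        return new ArrayList<>(names);
    }
}
```

Even with a filter like this, something still has to capture stdout for every JVM on the server, which is the part the author found impractical.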
upd2.
Habr user apangin refined this method into the following JVM launch command:
java -XX:+TraceClassLoading -XX:+UnlockDiagnosticVMOptions -XX:+LogVMOutput -XX:LogFile=java_*.log -XX:-DisplayVMOutput HelloWorld
The * is automatically replaced with the process ID (pid).
Running this command produces a file named, for example, java_580.log, with contents roughly like this:
    <?xml version='1.0' encoding='UTF-8'?>
    <hotspot_log version='160 1' process='580' time_ms='1334248301214'>
    <vm_version>
    <name>
    Java HotSpot(TM) Client VM
    </name>
    <release>
    20.6-b01
    </release>
    <info>
    Java HotSpot(TM) Client VM (20.6-b01) for windows-x86 JRE (1.6.0_31-b05), built on Feb 3 2012 18:44:09 by "java_re" with MS VC++ 7.1 (VS2003)
    </info>
    </vm_version>
    <vm_arguments>
    <args>
    -XX:+TraceClassLoading -XX:+UnlockDiagnosticVMOptions -XX:+LogVMOutput -XX:LogFile=java_*.log -XX:-DisplayVMOutput
    </args>
    <command>
    test
    </command>
    <launcher>
    SUN_STANDARD
    </launcher>
    <properties>
    java.vm.specification.name=Java Virtual Machine Specification
    java.vm.version=20.6-b01
    java.vm.name=Java HotSpot(TM) Client VM
    java.vm.info=mixed mode, sharing
    java.ext.dirs=C:\Program Files\Java\jre6\lib\ext;C:\WINDOWS\Sun\Java\lib\ext
    java.endorsed.dirs=C:\Program Files\Java\jre6\lib\endorsed
    sun.boot.library.path=C:\Program Files\Java\jre6\bin
    java.library.path=C:\WINDOWS\system32;C:\WINDOWS\Sun\Java\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\Program Files\PC Connectivity Solution\;C:\Program Files\Rockwell Software\RSCommon;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\Program Files\Java\jdk1.6.0_06\bin;c:\hibernate;C:\WINDOWS\system32\WindowsPowerShell\v1.0;C:\Program Files\TortoiseSVN\bin;C:\Program Files\Nmap;;C:\PROGRA~1\COMMON~1\MUVEET~1\030625;C:\PROGRA~1\COMMON~1\MUVEET~1\030625;.
    java.home=C:\Program Files\Java\jre6
    java.class.path=.
    sun.boot.class.path=C:\Program Files\Java\jre6\lib\resources.jar;C:\Program Files\Java\jre6\lib\rt.jar;C:\Program Files\Java\jre6\lib\sunrsasign.jar;C:\Program Files\Java\jre6\lib\jsse.jar;C:\Program Files\Java\jre6\lib\jce.jar;C:\Program Files\Java\jre6\lib\charsets.jar;C:\Program Files\Java\jre6\lib\modules\jdk.boot.jar;C:\Program Files\Java\jre6\classes
    java.vm.specification.vendor=Sun Microsystems Inc.
    java.vm.specification.version=1.0
    java.vm.vendor=Sun Microsystems Inc.
    sun.java.command=test
    sun.java.launcher=SUN_STANDARD
    </properties>
    </vm_arguments>
    <tty>
    <writer thread='1504'/>
    [Loaded java.lang.Object from shared objects file]
    ...
    <writer thread='3092'/>
    <destroy_vm stamp='0.281'/>
    <tty_done stamp='0.283'/>
    </tty>
    <hotspot_log_done stamp='0.283'/>
    </hotspot_log>

Meanwhile, nothing extra is written to standard output, thanks to the -XX:-DisplayVMOutput option.
The main advantage of this method is its simplicity. Whether this file format would suit the customer is not certain, but that is debatable and beyond the scope of this article.

Conclusion


So, what conclusions can be drawn now that the project is complete:


List of used sources


Java class loader articles:

Other useful links:

Source: https://habr.com/ru/post/141699/

