
Java bytecode "Hello world"

There is already an article about Java bytecode on Habr. I decided to supplement it a little and develop the topic as far as I can. It seems logical to take apart the simplest possible Java application, and what could be simpler than "Hello world"?
For my experiment, I created a src directory and put the file App.java into its hello folder:

package hello;

public class App {

    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}



Compile the file with the command:

javac src/hello/App.java -d classes/

As a result, an App.class file appeared in the classes folder (under hello/, following the package name). First, let's compare the sizes of the java and class files.

App.java 139B
App.class 418B

That was unexpected. For some reason I thought the compiled file would be smaller. Let's look inside the class file:

hexdump App.class

 0000000 ca fe ba be 00 00 00 34 00 1d 0a 00 06 00 0f 09
 0000010 00 10 00 11 08 00 12 0a 00 13 00 14 07 00 15 07
 0000020 00 16 01 00 06 3c 69 6e 69 74 3e 01 00 03 28 29
 0000030 56 01 00 04 43 6f 64 65 01 00 0f 4c 69 6e 65 4e
 0000040 75 6d 62 65 72 54 61 62 6c 65 01 00 04 6d 61 69
 0000050 6e 01 00 16 28 5b 4c 6a 61 76 61 2f 6c 61 6e 67
 0000060 2f 53 74 72 69 6e 67 3b 29 56 01 00 0a 53 6f 75
 0000070 72 63 65 46 69 6c 65 01 00 08 41 70 70 2e 6a 61
 0000080 76 61 0c 00 07 00 08 07 00 17 0c 00 18 00 19 01
 0000090 00 0c 48 65 6c 6c 6f 20 77 6f 72 6c 64 21 07 00
 00000a0 1a 0c 00 1b 00 1c 01 00 09 68 65 6c 6c 6f 2f 41
 00000b0 70 70 01 00 10 6a 61 76 61 2f 6c 61 6e 67 2f 4f
 00000c0 62 6a 65 63 74 01 00 10 6a 61 76 61 2f 6c 61 6e
 00000d0 67 2f 53 79 73 74 65 6d 01 00 03 6f 75 74 01 00
 00000e0 15 4c 6a 61 76 61 2f 69 6f 2f 50 72 69 6e 74 53
 00000f0 74 72 65 61 6d 3b 01 00 13 6a 61 76 61 2f 69 6f
 0000100 2f 50 72 69 6e 74 53 74 72 65 61 6d 01 00 07 70
 0000110 72 69 6e 74 6c 6e 01 00 15 28 4c 6a 61 76 61 2f
 0000120 6c 61 6e 67 2f 53 74 72 69 6e 67 3b 29 56 00 21
 0000130 00 05 00 06 00 00 00 00 00 02 00 01 00 07 00 08
 0000140 00 01 00 09 00 00 00 1d 00 01 00 01 00 00 00 05
 0000150 2a b7 00 01 b1 00 00 00 01 00 0a 00 00 00 06 00
 0000160 01 00 00 00 03 00 09 00 0b 00 0c 00 01 00 09 00
 0000170 00 00 25 00 02 00 01 00 00 00 09 b2 00 02 12 03
 0000180 b6 00 04 b1 00 00 00 01 00 0a 00 00 00 0a 00 02
 0000190 00 00 00 06 00 08 00 07 00 01 00 0d 00 00 00 02
 00001a0 00 0e
 00001a2
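For readers without hexdump at hand, a rough Java equivalent might look like the sketch below. The names HexDumpSketch and dump are made up for illustration, and the layout (7-digit hex offset, 16 bytes per line) only imitates the listing above; it does not print the final offset line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class HexDumpSketch {
    /** Formats bytes as lines of "offset b0 b1 ... b15", one line per 16 bytes. */
    static String dump(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i += 16) {
            sb.append(String.format("%07x", i));
            for (int j = i; j < Math.min(i + 16, data.length); j++) {
                // Mask with 0xFF so that bytes above 0x7f are not printed as negative ints.
                sb.append(String.format(" %02x", data[j] & 0xFF));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: the path of a class file is passed as the first argument.
        System.out.print(dump(Files.readAllBytes(Paths.get(args[0]))));
    }
}
```

It could be run, for example, as java HexDumpSketch.java classes/hello/App.class on a JDK that supports single-file source launch.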


That looks rather unusual for Java code. Let's use the class file format specification to understand what is encoded here.

 ca fe ba be 


These 4 bytes are the magic number, which identifies the file format.

 00 00 

minor version - 2 bytes for the minor version.

 00 34 

major version - 2 bytes for the major version.
Together, minor version 0 and major version 0x34 = 52 say that I compiled this code with Java SE 8.
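The same three fields can be read programmatically. Below is a minimal sketch, assuming the first 8 bytes of the dump are available as an array; HeaderSketch and parse are illustrative names, not part of any library.

```java
import java.nio.ByteBuffer;

class HeaderSketch {
    // The first 8 bytes of the dump: magic, minor_version, major_version.
    static final byte[] HEADER = {
        (byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe,
        0x00, 0x00,
        0x00, 0x34,
    };

    /** Returns {magic, minor_version, major_version}. */
    static long[] parse(byte[] bytes) {
        // ByteBuffer is big-endian by default, which matches the class file byte order.
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long magic = buf.getInt() & 0xFFFFFFFFL; // u4, widened so it stays unsigned
        int minor = buf.getShort() & 0xFFFF;     // u2
        int major = buf.getShort() & 0xFFFF;     // u2
        return new long[] {magic, minor, major};
    }

    public static void main(String[] args) {
        long[] h = parse(HEADER);
        System.out.printf("magic=%08x minor=%d major=%d%n", h[0], h[1], h[2]);
    }
}
```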

 00 1d 

These two bytes are constant_pool_count, which determines the size of constant_pool. In my case the count is 0x1d = 29, so the pool holds 28 entries, indexed from 1 to 28. Each entry has the form:

cp_info {
u1 tag; // 1 byte for the tag
u1 info []; // array with the entry's data
}

Let's go through the elements of constant_pool.

1st element:

 0a 

This tag corresponds to CONSTANT_Methodref, so the rest of the entry has the following layout:

CONSTANT_Methodref_info {
u1 tag;
u2 class_index;
u2 name_and_type_index;
}
respectively:
 00 06 

class_index, points to the 6th element in constant_pool
 00 0f 

name_and_type_index, points to the 15th element in constant_pool

It is not yet clear which method this reference points to, so let's move on.
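Decoding such an entry amounts to reading one byte and two big-endian 16-bit indices. Here is a small sketch under that assumption; RefEntrySketch and parse are hypothetical names. CONSTANT_Fieldref (tag 9) has the same layout in the specification, so the sketch accepts both tags.

```java
import java.nio.ByteBuffer;

class RefEntrySketch {
    static final int CONSTANT_FIELDREF = 9;
    static final int CONSTANT_METHODREF = 10;

    /** Returns {tag, class_index, name_and_type_index} for a Fieldref or Methodref entry. */
    static int[] parse(byte[] entry) {
        ByteBuffer buf = ByteBuffer.wrap(entry); // big-endian by default
        int tag = buf.get() & 0xFF;
        if (tag != CONSTANT_FIELDREF && tag != CONSTANT_METHODREF) {
            throw new IllegalArgumentException("not a Fieldref/Methodref tag: " + tag);
        }
        int classIndex = buf.getShort() & 0xFFFF;
        int nameAndTypeIndex = buf.getShort() & 0xFFFF;
        return new int[] {tag, classIndex, nameAndTypeIndex};
    }

    public static void main(String[] args) {
        // The bytes of the 1st element from the dump: 0a 00 06 00 0f
        int[] r = parse(new byte[] {0x0a, 0x00, 0x06, 0x00, 0x0f});
        System.out.println("tag=" + r[0] + " class_index=" + r[1] + " name_and_type_index=" + r[2]);
    }
}
```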

2nd element:

 09 

This is CONSTANT_Fieldref, so we look at its layout:

CONSTANT_Fieldref_info {
u1 tag;
u2 class_index;
u2 name_and_type_index;
}

Everything here is very similar to the previous element, although it is not yet clear which field this is: I did not declare any fields in my class.
 00 10 

class_index points to the 16th element
 00 11 

name_and_type_index points to the 17th element

3rd element:
 08 

This is the tag for CONSTANT_String.

According to the layout:

CONSTANT_String_info {
u1 tag;
u2 string_index;
}

the most interesting part lies in the 18th element:
 00 12 


4th element:
 0a 

This tag is a method reference (CONSTANT_Methodref) again, whose class is described in the 19th element:
 00 13 

and whose name and type are in the 20th element:
 00 14 


5th element:
Tag for CONSTANT_Class
 07 

with the name in the 21st element
 00 15 


6th element:
CONSTANT_Class
 07 

with the name in the 22nd element
 00 16 

As we remember, the first constant_pool element refers to this class.

7th element:
The CONSTANT_Utf8 tag, our first string
 01 

It has the following layout:

CONSTANT_Utf8_info {
u1 tag;
u2 length;
u1 bytes [length];
}

So the length of our string is 6 bytes:
 00 06 

And its value is "<init>":
 3c 69 6e 69 74 3e 

This is a special name: it is the name every constructor has in a class file.
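A CONSTANT_Utf8 entry is convenient to decode in Java, because after the tag byte it is a 2-byte length followed by modified UTF-8 bytes, which is the format DataInputStream.readUTF reads. The sketch below relies on that; Utf8EntrySketch and parse are names I made up.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class Utf8EntrySketch {
    /** Decodes one CONSTANT_Utf8 entry (tag, u2 length, bytes) into a String. */
    static String parse(byte[] entry) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry))) {
            int tag = in.readUnsignedByte();
            if (tag != 1) {
                throw new IllegalArgumentException("not a CONSTANT_Utf8 tag: " + tag);
            }
            // readUTF consumes the u2 length and then that many bytes of modified UTF-8.
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The 7th element from the dump: 01 00 06 3c 69 6e 69 74 3e
        byte[] entry = {0x01, 0x00, 0x06, 0x3c, 0x69, 0x6e, 0x69, 0x74, 0x3e};
        System.out.println(parse(entry));
    }
}
```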

8th element:

CONSTANT_Utf8
 01 


A string of length 3, "()V":

 00 03 28 29 56 


This is the descriptor of our parameterless constructor, whose name is in the seventh element.

9th element:
CONSTANT_Utf8
 01 


The string "Code":

 00 04 43 6f 64 65 


10th element:
The string "LineNumberTable":
 01 00 0f 4c 69 6e 65 4e 75 6d 62 65 72 54 61 62 6c 65 


11th element:
"main":
 01 00 04 6d 61 69 6e 


12th element:
"([Ljava/lang/String;)V":
 01 00 16 28 5b 4c 6a 61 76 61 2f 6c 61 6e 67 2f 53 74 72 69 6e 67 3b 29 56 
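Descriptors such as "()V" and "([Ljava/lang/String;)V" pack the parameter types in parentheses, followed by the return type. A small sketch of how they could be turned into a readable signature is given below; DescriptorSketch, describe and the output format are my own choices, and the sketch does no validation beyond what it needs.

```java
import java.util.ArrayList;
import java.util.List;

class DescriptorSketch {
    /** Turns "([Ljava/lang/String;)V" into "void (java.lang.String[])". */
    static String describe(String descriptor) {
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("not a method descriptor: " + descriptor);
        }
        int[] pos = {1}; // current read position, shared with readType
        List<String> params = new ArrayList<>();
        while (descriptor.charAt(pos[0]) != ')') {
            params.add(readType(descriptor, pos));
        }
        pos[0]++; // skip ')'
        String returnType = readType(descriptor, pos);
        return returnType + " (" + String.join(", ", params) + ")";
    }

    /** Reads one field type (or V for void) starting at pos[0] and advances pos[0]. */
    private static String readType(String d, int[] pos) {
        char c = d.charAt(pos[0]++);
        switch (c) {
            case 'V': return "void";
            case 'Z': return "boolean";
            case 'B': return "byte";
            case 'C': return "char";
            case 'S': return "short";
            case 'I': return "int";
            case 'J': return "long";
            case 'F': return "float";
            case 'D': return "double";
            case '[': return readType(d, pos) + "[]"; // array of the following type
            case 'L': {
                // Object type: L<internal class name>;
                int end = d.indexOf(';', pos[0]);
                String name = d.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                return name;
            }
            default:
                throw new IllegalArgumentException("unexpected descriptor character: " + c);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("()V"));
        System.out.println(describe("([Ljava/lang/String;)V"));
    }
}
```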


13th element:
"SourceFile":
 01 00 0a 53 6f 75 72 63 65 46 69 6c 65 


14th element:
"App.java":
 01 00 08 41 70 70 2e 6a 61 76 61 


15th element:
This tag corresponds to CONSTANT_NameAndType
 0c 


which means we need

CONSTANT_NameAndType_info {
u1 tag;
u2 name_index;
u2 descriptor_index;
}

and then:
link to the 7th element
 00 07 

link to the 8th element
 00 08 


Since the first element refers to this one, we can conclude that the first element describes a constructor without parameters. Its class name is found in element 22 (through the class entry in element 6).
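The chain of references from element 1 can be followed mechanically. The sketch below models only the six entries involved, with values taken from the dump (element 22 is the string "java/lang/Object"); the two maps and the name ResolveSketch are a simplification invented for illustration, not how a real parser stores the pool.

```java
import java.util.HashMap;
import java.util.Map;

class ResolveSketch {
    // Hypothetical in-memory model: entries that hold indices, and entries that hold text.
    static final Map<Integer, int[]> REFS = new HashMap<>();
    static final Map<Integer, String> UTF8 = new HashMap<>();

    static {
        REFS.put(1, new int[] {6, 15}); // Methodref: class_index, name_and_type_index
        REFS.put(6, new int[] {22});    // Class: name_index
        REFS.put(15, new int[] {7, 8}); // NameAndType: name_index, descriptor_index
        UTF8.put(7, "<init>");
        UTF8.put(8, "()V");
        UTF8.put(22, "java/lang/Object");
    }

    /** Follows Methodref -> Class -> Utf8 and Methodref -> NameAndType -> Utf8. */
    static String resolveMethodref(int index) {
        int[] ref = REFS.get(index);
        String className = UTF8.get(REFS.get(ref[0])[0]);
        int[] nameAndType = REFS.get(ref[1]);
        return className + "." + UTF8.get(nameAndType[0]) + ":" + UTF8.get(nameAndType[1]);
    }

    public static void main(String[] args) {
        System.out.println(resolveMethodref(1));
    }
}
```

So the first element turns out to be a reference to the parameterless constructor of java.lang.Object.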

16th element:
Tag, for CONSTANT_Class
 07 

with the name in the 23rd element
 00 17 


17th element:
Tag, CONSTANT_NameAndType, with reference to 24 and 25 constant_pool elements
 0c 00 18 00 19 


18th element:
At last, "Hello world!"
 01 00 0c 48 65 6c 6c 6f 20 77 6f 72 6c 64 21 


19th element:
Tag for CONSTANT_Class, with the name in the 26th element
 07 00 1a 


20th element:
Tag CONSTANT_NameAndType with a link to elements 27 and 28
 0c 00 1b 00 1c 


21st element:
"hello/App"
 01 00 09 68 65 6c 6c 6f 2f 41 70 70 


22nd element:
"java/lang/Object"
 01 00 10 6a 61 76 61 2f 6c 61 6e 67 2f 4f 62 6a 65 63 74 


23rd element:
"java/lang/System"
 01 00 10 6a 61 76 61 2f 6c 61 6e 67 2f 53 79 73 74 65 6d 


24th element:
"out"
 01 00 03 6f 75 74 


25th element:
"Ljava/io/PrintStream;"
 01 00 15 4c 6a 61 76 61 2f 69 6f 2f 50 72 69 6e 74 53 74 72 65 61 6d 3b 


26th element:
"java/io/PrintStream"
 01 00 13 6a 61 76 61 2f 69 6f 2f 50 72 69 6e 74 53 74 72 65 61 6d 


27th element:
"println"
 01 00 07 70 72 69 6e 74 6c 6e 


28th element:
"(Ljava/lang/String;)V"
 01 00 15 28 4c 6a 61 76 61 2f 6c 61 6e 67 2f 53 74 72 69 6e 67 3b 29 56 
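What I did by hand for all 28 elements is a loop: read the tag, then skip as many bytes as that kind of entry occupies. Below is a sketch of such a walk. It handles only the tags that occur in this file (1, 7, 8, 9, 10, 12) and throws on anything else; in particular it ignores the rule that Long and Double entries occupy two pool slots. PoolWalkSketch is an invented name, and the sample bytes in main are a hypothetical 3-entry pool, not taken from App.class.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class PoolWalkSketch {
    /** Reads constant_pool_count and returns the kind of each of the count - 1 entries. */
    static List<String> walk(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes); // big-endian by default
        int count = buf.getShort() & 0xFFFF;
        List<String> kinds = new ArrayList<>();
        for (int i = 1; i < count; i++) { // entries are indexed 1 .. count - 1
            int tag = buf.get() & 0xFF;
            switch (tag) {
                case 1: {
                    int length = buf.getShort() & 0xFFFF;
                    buf.position(buf.position() + length); // skip the string bytes
                    kinds.add("Utf8");
                    break;
                }
                case 7:  buf.getShort(); kinds.add("Class"); break;       // u2 name_index
                case 8:  buf.getShort(); kinds.add("String"); break;      // u2 string_index
                case 9:  buf.getInt();   kinds.add("Fieldref"); break;    // two u2 indices
                case 10: buf.getInt();   kinds.add("Methodref"); break;   // two u2 indices
                case 12: buf.getInt();   kinds.add("NameAndType"); break; // two u2 indices
                default:
                    throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
            }
        }
        return kinds;
    }

    public static void main(String[] args) {
        // Hypothetical mini-pool: count = 4, then Class #2, Utf8 "App", String #2.
        byte[] pool = {0x00, 0x04, 0x07, 0x00, 0x02, 0x01, 0x00, 0x03, 0x41, 0x70, 0x70, 0x08, 0x00, 0x02};
        System.out.println(walk(pool));
    }
}
```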


This is where the constant_pool table ends. Let's move on.
access_flags (docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html#jvms-4.1-200-E.1); here 0x0021 is ACC_PUBLIC (0x0001) combined with ACC_SUPER (0x0020):
 00 21 
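The flags are a bit mask, so decoding them is a matter of testing each bit. The sketch below uses the class-level flag values from the access_flags table of the specification linked above; AccessFlagsSketch and decodeClassFlags are illustrative names. Method flags use a different table and are not covered here.

```java
import java.util.ArrayList;
import java.util.List;

class AccessFlagsSketch {
    // Class access flags and their masks, in ascending bit order.
    private static final int[] MASKS = {
        0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
        "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };

    /** Returns the names of all class-level flags set in the given mask. */
    static List<String> decodeClassFlags(int flags) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((flags & MASKS[i]) != 0) {
                result.add(NAMES[i]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(decodeClassFlags(0x0021));
    }
}
```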

this_class
 00 05 

super_class
 00 06 

 00 00 // interfaces_count
 00 00 // fields_count

methods_count: we have 2 methods in the class, the default constructor and the main method:
 00 02 

Method 1 - Constructor
 00 01 // access_flags
 00 07 // name_index
 00 08 // descriptor_index
 00 01 // attributes_count

Attribute 1
 00 09 // name_index (Code)
 00 00 00 1d // attribute_length
 00 01 // max_stack
 00 01 // max_locals
 00 00 00 05 // code_length

The most interesting part of this attribute is the method's code, code[code_length]. Analyzing the instructions is a large topic of its own:
 2a // aload_0
 b7 00 01 // invokespecial #1 (1st element of constant_pool)
 b1 // return


The code array has ended, and the Code attribute continues with its exception table and nested attributes.
 00 00 // exception_table_length
 00 01 // attributes_count
 00 0a // attribute_name_index (LineNumberTable, element 10)
 00 00 00 06 // attribute_length
 00 01 // line_number_table_length
 00 00 // start_pc
 00 03 // line_number


Method 2 - main
 00 09 // access_flags
 00 0b // name_index
 00 0c // descriptor_index
 00 01 // attributes_count

Attribute 1 - the code of the main method
 00 09 // name_index (Code)
 00 00 00 25 // attribute_length
 00 02 // max_stack
 00 01 // max_locals
 00 00 00 09 // code_length


code[code_length]
 b2 00 02 // getstatic #2 (the out field of java.lang.System)
 12 03 // ldc #3
 b6 00 04 // invokevirtual #4
 b1 // return
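The code arrays of both methods use only six opcodes, so a toy disassembler for them fits in a few lines. The sketch below knows just those six (aload_0, ldc, return, getstatic, invokevirtual, invokespecial) and fails on anything else; a real disassembler needs the full instruction table. DisasmSketch and its members are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

class DisasmSketch {
    // The code array of main from the dump.
    static final byte[] MAIN_CODE = {
        (byte) 0xb2, 0x00, 0x02, 0x12, 0x03, (byte) 0xb6, 0x00, 0x04, (byte) 0xb1
    };

    /** Reads a big-endian unsigned 16-bit operand at the given offset. */
    static int u2(byte[] b, int off) {
        return ((b[off] & 0xFF) << 8) | (b[off + 1] & 0xFF);
    }

    /** Returns lines of the form "pc: mnemonic #operand". */
    static List<String> disassemble(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x2a: out.add(pc + ": aload_0"); pc += 1; break;
                case 0xb1: out.add(pc + ": return"); pc += 1; break;
                case 0x12: out.add(pc + ": ldc #" + (code[pc + 1] & 0xFF)); pc += 2; break;
                case 0xb2: out.add(pc + ": getstatic #" + u2(code, pc + 1)); pc += 3; break;
                case 0xb6: out.add(pc + ": invokevirtual #" + u2(code, pc + 1)); pc += 3; break;
                case 0xb7: out.add(pc + ": invokespecial #" + u2(code, pc + 1)); pc += 3; break;
                default:
                    throw new IllegalArgumentException("opcode not handled in this sketch: 0x"
                            + Integer.toHexString(opcode));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        disassemble(MAIN_CODE).forEach(System.out::println);
    }
}
```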


 00 00 // exception_table_length
 00 01 // attributes_count
 00 0a // attribute_name_index (LineNumberTable, element 10)
 00 00 00 0a // attribute_length
 00 02 // line_number_table_length
 00 00 // start_pc
 00 06 // line_number
 00 08 // start_pc
 00 07 // line_number
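The body of a LineNumberTable attribute is a count followed by (start_pc, line_number) pairs, which map bytecode offsets back to source lines. A minimal sketch of reading it, assuming the bytes after attribute_length are passed in; LineNumberTableSketch is a hypothetical name and the output format is my own:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class LineNumberTableSketch {
    /** Parses line_number_table_length and the pairs that follow it. */
    static List<String> parse(byte[] body) {
        ByteBuffer buf = ByteBuffer.wrap(body); // big-endian by default
        int length = buf.getShort() & 0xFFFF;
        List<String> out = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            int startPc = buf.getShort() & 0xFFFF;
            int lineNumber = buf.getShort() & 0xFFFF;
            out.add("line " + lineNumber + ": " + startPc);
        }
        return out;
    }

    public static void main(String[] args) {
        // The table of main from the dump: 00 02 | 00 00 00 06 | 00 08 00 07
        byte[] body = {0x00, 0x02, 0x00, 0x00, 0x00, 0x06, 0x00, 0x08, 0x00, 0x07};
        parse(body).forEach(System.out::println);
    }
}
```

For main this says that the instruction at offset 0 comes from source line 6 and the one at offset 8 from line 7.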


The description of the methods is complete; next come the attributes of the class itself.
 00 01 // attributes_count
 00 0d // name_index (SourceFile)
 00 00 00 02 // attribute_length
 00 0e // sourcefile_index (App.java)


Now that we have gone through the class file byte by byte, it becomes clear how the following command works:

javap -c -s -verbose classes/hello/App.class

It automatically prints the same things that I wrote out by hand:

 Classfile /.../classes/hello/App.class
   Last modified Aug 14, 2015; size 418 bytes
   MD5 checksum e9d96126a9f5bbd95f154f1a40d46b53
   Compiled from "App.java"
 public class hello.App
   minor version: 0
   major version: 52
   flags: ACC_PUBLIC, ACC_SUPER
 Constant pool:
    #1 = Methodref          #6.#15         // java/lang/Object."<init>":()V
    #2 = Fieldref           #16.#17        // java/lang/System.out:Ljava/io/PrintStream;
    #3 = String             #18            // Hello world!
    #4 = Methodref          #19.#20        // java/io/PrintStream.println:(Ljava/lang/String;)V
    #5 = Class              #21            // hello/App
    #6 = Class              #22            // java/lang/Object
    #7 = Utf8               <init>
    #8 = Utf8               ()V
    #9 = Utf8               Code
   #10 = Utf8               LineNumberTable
   #11 = Utf8               main
   #12 = Utf8               ([Ljava/lang/String;)V
   #13 = Utf8               SourceFile
   #14 = Utf8               App.java
   #15 = NameAndType        #7:#8          // "<init>":()V
   #16 = Class              #23            // java/lang/System
   #17 = NameAndType        #24:#25        // out:Ljava/io/PrintStream;
   #18 = Utf8               Hello world!
   #19 = Class              #26            // java/io/PrintStream
   #20 = NameAndType        #27:#28        // println:(Ljava/lang/String;)V
   #21 = Utf8               hello/App
   #22 = Utf8               java/lang/Object
   #23 = Utf8               java/lang/System
   #24 = Utf8               out
   #25 = Utf8               Ljava/io/PrintStream;
   #26 = Utf8               java/io/PrintStream
   #27 = Utf8               println
   #28 = Utf8               (Ljava/lang/String;)V
 {
   public hello.App();
     descriptor: ()V
     flags: ACC_PUBLIC
     Code:
       stack=1, locals=1, args_size=1
          0: aload_0
          1: invokespecial #1                  // Method java/lang/Object."<init>":()V
          4: return
       LineNumberTable:
         line 3: 0

   public static void main(java.lang.String[]);
     descriptor: ([Ljava/lang/String;)V
     flags: ACC_PUBLIC, ACC_STATIC
     Code:
       stack=2, locals=1, args_size=1
          0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
          3: ldc           #3                  // String Hello world!
          5: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
          8: return
       LineNumberTable:
         line 6: 0
         line 7: 8
 }
 SourceFile: "App.java"


And here is an example of code that parses a class file:

 ClassFile(InputStream in, Attribute.Factory attributeFactory) throws IOException, ConstantPoolException {
     ClassReader cr = new ClassReader(this, in, attributeFactory);
     magic = cr.readInt();
     minor_version = cr.readUnsignedShort();
     major_version = cr.readUnsignedShort();
     constant_pool = new ConstantPool(cr);
     access_flags = new AccessFlags(cr);
     this_class = cr.readUnsignedShort();
     super_class = cr.readUnsignedShort();

     int interfaces_count = cr.readUnsignedShort();
     interfaces = new int[interfaces_count];
     for (int i = 0; i < interfaces_count; i++)
         interfaces[i] = cr.readUnsignedShort();

     int fields_count = cr.readUnsignedShort();
     fields = new Field[fields_count];
     for (int i = 0; i < fields_count; i++)
         fields[i] = new Field(cr);

     int methods_count = cr.readUnsignedShort();
     methods = new Method[methods_count];
     for (int i = 0; i < methods_count; i++)
         methods[i] = new Method(cr);

     attributes = new Attributes(cr);
 }

Source: https://habr.com/ru/post/264919/
