
Java serialization

Serialization is the process of saving the state of an object as a sequence of bytes; deserialization is the process of restoring an object from those bytes. The Java Serialization API provides a standard mechanism for creating serializable objects. In this article, you will see how to serialize an object and why serialization is sometimes necessary. You will learn about the serialization algorithm used in Java and see an example that illustrates the serialized format of an object. By the end, you should have a clear idea of how the serialization algorithm works and how the parts of an object are represented in serialized form.

Why is serialization needed?



In today's world, a typical enterprise application has many components and is distributed across various systems and networks. In Java, everything is represented as objects; if two Java components need to communicate with each other, they need a mechanism for exchanging data. There are several ways to implement this mechanism. One way is to define your own protocol and transfer the object with it. This means that the recipient must know the protocol used by the sender in order to recreate the object, which complicates the development of third-party components. Consequently, there must be a universal and efficient protocol for transferring objects between components. Serialization was created for this purpose, and Java components use this protocol to transfer objects.

Figure 1 demonstrates a high-level view of client-server communication, where an object is passed from client to server through serialization.

Figure 1.

How to serialize an object?



First you need to make sure that the class of the object being serialized implements the java.io.Serializable interface as shown in Listing 1.

Listing 1.

import java.io.Serializable;

class TestSerial implements Serializable {
    public byte version = 100;
    public byte count = 0;
}




Listing 1 differs from an ordinary class in only one respect: it implements the java.io.Serializable interface. Serializable is a marker interface; it declares no methods. It simply tells the serialization mechanism that instances of the class may be serialized.
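To see what the marker interface changes in practice, here is a minimal sketch. The class names MarkerDemo, Marked and Unmarked and the helper canSerialize() are hypothetical and exist only for this illustration. The two nested classes are identical except for the marker; writing an instance of the unmarked one makes writeObject() throw java.io.NotSerializableException.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class MarkerDemo {
    // Hypothetical classes: they differ only in the marker interface.
    static class Marked implements Serializable {
        byte version = 100;
    }

    static class Unmarked {
        byte version = 100;
    }

    // Returns true if the object can be written, false if its class lacks the marker.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Marked:   " + canSerialize(new Marked()));
        System.out.println("Unmarked: " + canSerialize(new Unmarked()));
    }
}
```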

Now we have everything we need to serialize the object. The next step is the serialization itself, which is done by calling the writeObject() method of the java.io.ObjectOutputStream class, as shown in Listing 2.

Listing 2.
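import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public static void main(String[] args) throws IOException {
    FileOutputStream fos = new FileOutputStream("temp.out");
    ObjectOutputStream oos = new ObjectOutputStream(fos);
    TestSerial ts = new TestSerial();
    oos.writeObject(ts);   // write the object's state to the stream
    oos.flush();
    oos.close();           // also closes the underlying file stream
}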




Listing 2 saves the state of a TestSerial instance to a file named temp.out.

To recreate an object from a file, you need to apply the code from Listing 3.

Listing 3.
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

public static void main(String[] args) throws IOException, ClassNotFoundException {
    FileInputStream fis = new FileInputStream("temp.out");
    ObjectInputStream oin = new ObjectInputStream(fis);
    TestSerial ts = (TestSerial) oin.readObject();
    oin.close();
    System.out.println("version=" + ts.version);
}



The object is restored by calling the oin.readObject() method. The method reads the bytes from the file and creates an exact copy of the original object graph. Because oin.readObject() can read any serialized object, it is declared to return Object, so you must cast the result to a specific type.
The executed code will output version=100 to standard output.
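The same round trip can be tried without touching the file system. The sketch below is an illustration, not part of the original listings: it swaps the file streams for in-memory byte array streams, and the names RoundTripDemo, toBytes() and fromBytes() are hypothetical. It also sets count to a non-default value to show that the restored object carries the saved state and is a distinct copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class RoundTripDemo {
    // Same fields as TestSerial in Listing 1, nested here to keep the example in one file.
    static class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

    // Serializes an object into a byte array (in memory instead of temp.out).
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Restores an object from a byte array; the caller casts the result.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TestSerial original = new TestSerial();
        original.count = 7;
        TestSerial copy = (TestSerial) fromBytes(toBytes(original));
        System.out.println("version=" + copy.version + ", count=" + copy.count);
        System.out.println("same instance: " + (copy == original));
    }
}
```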

The format of the serialized object


What does a serialized object look like? Recall the simple code from the previous section, which serializes an object of the TestSerial class and writes it to temp.out. Listing 4 shows the contents of the temp.out file in hexadecimal.

Listing 4.
AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 A0 0C 34 00 FE B1 DD F9 02 00 02 42 00 05
63 6F 75 6E 74 42 00 07 76 65 72 73 69 6F 6E 78
70 00 64

If you look at TestSerial again, you will see that it has only two byte members, as shown in Listing 5.

Listing 5.
public byte version = 100;
public byte count = 0;



The size of a byte variable is one byte, so the data of the object (without a header) occupies two bytes. But the serialized object is 51 bytes long. Surprised? Where did the extra bytes come from, and what do they mean? They are added by the serialization algorithm and are needed to recreate the object. The next section describes this algorithm in detail.
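You can reproduce a dump like Listing 4 yourself. The following is a small sketch with hypothetical names (SizeDemo, serialize(), toHex()); it serializes the object in memory and prints the byte count and a hex dump. Note one assumption that affects the numbers: the class is nested here, so its binary name is longer than in the article and the total length will differ from 51. The two data bytes at the very end of the stream are unaffected.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SizeDemo {
    // Same fields as TestSerial in Listing 1; nested, so the class name in the stream is longer.
    static class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Formats bytes as two-digit hex values, 16 per line, like the listings in this article.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            sb.append(String.format("%02X", data[i] & 0xFF));
            sb.append(i % 16 == 15 ? '\n' : ' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new TestSerial());
        System.out.println("serialized size: " + bytes.length + " bytes");
        System.out.println(toHex(bytes));
    }
}
```

The last two bytes of the output are the field values themselves (count, then version); everything before them is protocol header and class description.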

Java serialization algorithm


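At this point, you know enough to serialize an object. But how does the mechanism work? In outline, the serialization algorithm does the following:

It writes the description (metadata) of the class associated with the object.
It recursively writes the descriptions of the superclasses, moving up the hierarchy until no superclass remains to describe.
Once the descriptions are written, it writes the actual data associated with the object, starting with the topmost superclass.
It continues down the hierarchy, writing the data of each class until it reaches the most derived class.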
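The order in which the algorithm visits classes (descriptions from the object's class upward, then data from the top of the hierarchy downward) can be modelled with a few lines of reflection. This is a simplified sketch of the ordering only, not the JDK's implementation; OrderDemo, Parent, Child and both helper methods are hypothetical names.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class OrderDemo {
    // Hypothetical two-level hierarchy for illustration.
    static class Parent implements Serializable {
        int parentVersion = 10;
    }

    static class Child extends Parent {
        int version = 66;
    }

    // Class descriptions: start at the object's class and walk up
    // while the superclass is still serializable.
    static List<String> descriptorOrder(Class<?> type) {
        List<String> order = new ArrayList<>();
        for (Class<?> c = type;
             c != null && Serializable.class.isAssignableFrom(c);
             c = c.getSuperclass()) {
            order.add(c.getSimpleName());
        }
        return order;
    }

    // Field data: the same classes, visited in the opposite direction.
    static List<String> dataOrder(Class<?> type) {
        List<String> order = descriptorOrder(type);
        Collections.reverse(order);
        return order;
    }

    public static void main(String[] args) {
        System.out.println("descriptions: " + descriptorOrder(Child.class));
        System.out.println("data:         " + dataOrder(Child.class));
    }
}
```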



Listing 6 shows an example that covers the cases discussed here: a class with a serializable superclass and a field that refers to another serializable object.

Listing 6.
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class parent implements Serializable {
    int parentVersion = 10;
}

class contain implements Serializable {
    int containVersion = 11;
}

public class SerialTest extends parent implements Serializable {
    int version = 66;
    contain con = new contain();

    public int getVersion() {
        return version;
    }

    public static void main(String[] args) throws IOException {
        FileOutputStream fos = new FileOutputStream("temp.out");
        ObjectOutputStream oos = new ObjectOutputStream(fos);
        SerialTest st = new SerialTest();
        oos.writeObject(st);
        oos.flush();
        oos.close();
    }
}




In the example, an object of the SerialTest class is serialized. SerialTest inherits from parent and holds a reference to an object of the contain class. Listing 7 shows the serialized object.

Listing 7.
AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 05 52 81 5A AC 66 02 F6 02 00 02 49 00 07
76 65 72 73 69 6F 6E 4C 00 03 63 6F 6E 74 00 09
4C 63 6F 6E 74 61 69 6E 3B 78 72 00 06 70 61 72
65 6E 74 0E DB D2 BD 85 EE 63 7A 02 00 01 49 00
0D 70 61 72 65 6E 74 56 65 72 73 69 6F 6E 78 70
00 00 00 0A 00 00 00 42 73 72 00 07 63 6F 6E 74
61 69 6E FC BB E6 0E FB CB 60 C7 02 00 01 49 00
0E 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E 78
70 00 00 00 0B

Figure 2 outlines the steps of the serialization algorithm for this example.

Figure 2.

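Let's look at what each byte of the serialized object in Listing 7 represents. The stream begins with information about the serialization protocol:

AC ED: STREAM_MAGIC. Identifies the stream as using the serialization protocol.
00 05: STREAM_VERSION. The version of the serialization protocol.
73: TC_OBJECT. Marks the start of a new object.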
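The protocol header can be checked programmatically against the constants in java.io.ObjectStreamConstants. The sketch below serializes a throwaway object and reads back the first five bytes; HeaderDemo, Sample and readHeader() are hypothetical names used only here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class HeaderDemo {
    // Any serializable object will do; the header does not depend on the class.
    static class Sample implements Serializable {
        int value = 1;
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Returns {magic, version, first type tag} read from the start of the stream.
    static int[] readHeader(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int magic = in.readUnsignedShort();
            int version = in.readUnsignedShort();
            int tag = in.readUnsignedByte();
            return new int[] {magic, version, tag};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] h = readHeader(serialize(new Sample()));
        System.out.printf("magic=%04X version=%04X tag=%02X%n", h[0], h[1], h[2]);
        System.out.println("magic ok:   " + (h[0] == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF)));
        System.out.println("version ok: " + (h[1] == ObjectStreamConstants.STREAM_VERSION));
        System.out.println("tag ok:     " + (h[2] == ObjectStreamConstants.TC_OBJECT));
    }
}
```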

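In the first step, the serialization algorithm writes the description of the class associated with the object. In the example, an object of the SerialTest class was serialized, so the algorithm starts by writing the description of the SerialTest class:

72: TC_CLASSDESC. Marks the start of a new class description.
00 0A: the length of the class name (10).
53 65 72 69 61 6C 54 65 73 74: SerialTest, the name of the class.
05 52 81 5A AC 66 02 F6: serialVersionUID, the serial version identifier of this class.
02: flags. Here the flag SC_SERIALIZABLE indicates that the class supports serialization.
00 02: the number of fields in this class.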

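Now the algorithm writes the description of the field int version = 66;:

49: the field type code. 0x49 is the character I, which stands for int.
00 07: the length of the field name.
76 65 72 73 69 6F 6E: version, the name of the field.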

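Then the algorithm writes the description of the next field, contain con = new contain();. Because this field refers to an object, the algorithm also writes the canonical JVM signature of the field's type:

4C: the field type code. 0x4C is the character L, which stands for an object.
00 03: the length of the field name.
63 6F 6E: con, the name of the field.
74: TC_STRING. Marks the start of a new string.
00 09: the length of the string.
4C 63 6F 6E 74 61 69 6E 3B: Lcontain;, the canonical JVM signature.
78: TC_ENDBLOCKDATA. Marks the end of the (empty) annotation block of the SerialTest class description.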
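The class description that ends up in the stream (class name, serialVersionUID, field type codes and, for object fields, the JVM type signature) can also be inspected through the java.io.ObjectStreamClass API. The following sketch uses hypothetical names (DescriptorDemo, Holder, Contain, describe()); because the classes are nested here, the printed names and signatures are longer than the Lcontain; seen in the article's dump.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class DescriptorDemo {
    // Hypothetical stand-ins for contain and SerialTest from Listing 6.
    static class Contain implements Serializable {
        int containVersion = 11;
    }

    static class Holder implements Serializable {
        int version = 66;
        Contain con = new Contain();
    }

    // Builds a textual summary of the serialization descriptor of a class.
    static String describe(Class<?> type) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(type);
        StringBuilder sb = new StringBuilder();
        sb.append(desc.getName())
          .append(" serialVersionUID=")
          .append(Long.toHexString(desc.getSerialVersionUID()))
          .append('\n');
        for (ObjectStreamField field : desc.getFields()) {
            // Type code: 'I' for int, 'L' for an object reference, and so on.
            sb.append(field.getTypeCode()).append(' ').append(field.getName());
            if (!field.isPrimitive()) {
                // Object fields also carry the JVM signature of their type.
                sb.append(' ').append(field.getTypeString());
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(Holder.class));
    }
}
```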

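The next step of the algorithm is to write the description of the parent class, which is the immediate superclass of SerialTest:

72: TC_CLASSDESC. Marks the start of a new class description.
00 06: the length of the class name.
70 61 72 65 6E 74: parent, the name of the class.
0E DB D2 BD 85 EE 63 7A: serialVersionUID, the serial version identifier of this class.
02: flags. The flag SC_SERIALIZABLE indicates that the class supports serialization.
00 01: the number of fields in this class.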

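Now the algorithm writes the description of the fields of the parent class. The class has one field, int parentVersion = 10;:

49: the field type code. 0x49 is the character I, which stands for int.
00 0D: the length of the field name (13).
70 61 72 65 6E 74 56 65 72 73 69 6F 6E: parentVersion, the name of the field.
78: TC_ENDBLOCKDATA. Marks the end of the (empty) annotation block of the parent class description.
70: TC_NULL. There are no more superclasses to describe; the top of the hierarchy has been reached.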

Up to this point, the serialization algorithm has written the descriptions of the class associated with the object and of all its superclasses. Now it writes the actual data associated with the object. The members of the parent class are written first:

00 00 00 0A: 10, the value of parentVersion.

Next, the algorithm moves on to SerialTest:

00 00 00 42: 66, the value of version.
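That superclass data precedes subclass data can be observed directly in the bytes. The sketch below uses a reduced, hypothetical version of the hierarchy (DataOrderDemo, Parent, Child, with no contained object) so that the field values are the last eight bytes of the stream: the four bytes of parentVersion followed by the four bytes of version.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class DataOrderDemo {
    // Reduced version of Listing 6: only int fields, no contained object.
    static class Parent implements Serializable {
        int parentVersion = 10;
    }

    static class Child extends Parent {
        int version = 66;
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Returns the last n bytes of the array.
    static byte[] lastBytes(byte[] data, int n) {
        return Arrays.copyOfRange(data, data.length - n, data.length);
    }

    public static void main(String[] args) {
        byte[] tail = lastBytes(serialize(new Child()), 8);
        StringBuilder sb = new StringBuilder();
        for (byte b : tail) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        // First four bytes: parentVersion (superclass); last four: version (subclass).
        System.out.println(sb.toString().trim());
    }
}
```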

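The next few bytes are the most interesting. The algorithm needs to write information about the object of the contain class, shown in Listing 8, so it starts a new object:

73: TC_OBJECT. Marks the start of a new object.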

Listing 8.
contain con = new contain();


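The serialization algorithm has not yet written the description of the contain class. Now is the time to do so:

72: TC_CLASSDESC. Marks the start of a new class description.
00 07: the length of the class name.
63 6F 6E 74 61 69 6E: contain, the name of the class.
FC BB E6 0E FB CB 60 C7: serialVersionUID, the serial version identifier of this class.
02: flags. The flag SC_SERIALIZABLE indicates that the class supports serialization.
00 01: the number of fields in this class.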

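The algorithm then writes the description of the only field of the contain class, int containVersion = 11;:

49: the field type code. 0x49 is the character I, which stands for int.
00 0E: the length of the field name (14).
63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E: containVersion, the name of the field.
78: TC_ENDBLOCKDATA. Marks the end of the (empty) annotation block of the contain class description.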

Next, the algorithm checks whether contain has a parent class. If it did, the algorithm would start writing the description of that class; in our case contain has no superclass, so the algorithm writes TC_NULL:

70: TC_NULL.

Finally, the algorithm writes the actual data associated with the object of the contain class:

00 00 00 0B: 11, the value of containVersion.

Conclusion


In this article, you saw how to serialize an object and learned how the serialization algorithm works. I hope this article has helped you better understand what happens when you serialize an object.

About the author


Sathiskumar Palaniappan has more than 4 years of experience in the IT industry and has been working with Java technology for more than 3 years. He currently works as a system software engineer at the Java Technology Center, IBM Labs. He also has experience in the telecommunications industry.

Source: https://habr.com/ru/post/60317/

