Serialization (serialize, hereafter "saving") is the process of saving object data to external storage.
This operation is paired with the reverse one, data recovery, called deserialization (deserialize, hereafter "restoring").
Saving and restoring data are very common operations. Classical programming languages have no ready-made mechanisms for saving and restoring object data, so when the need arises you have to create them yourself.
Java has such ready-made mechanisms, and more than one of them. Let's see which mechanisms exist and what they offer to programs written in Kotlin.
The concept of serialization is in no way tied to the format in which the data is saved, so whatever the result is (a binary file with its own structure, XML, JSON, or even a plain text file), it is all serialization.
Classical programming languages have no ready-made facilities for saving structured objects, but since data structures are stored directly in memory, there is an easy way to save and restore them directly. On the one hand, you need to build your own saving tools only for complex objects that reference each other; on the other hand, even if you want to use some simple ready-made mechanism, implementing the saving of data in its format is quite a laborious operation.
In Java, the data of a single object's members is scattered across JVM memory, so even if the structure of the object as a whole could be saved, its data would not be saved with it. The only possible way in Java is therefore to save, element by element, the data of the elementary types that make up the object.
On the one hand, Java cannot save an object entirely in a single operation; on the other hand, thanks to its well-developed RTTI, ready-made saving tools are very easy to implement.
Many stream classes, such as Writer or PrintStream, provide ready-made methods for storing elementary data types, but using them is as inconvenient as in classical programming languages because of the very large amount of code that has to be written.
In addition to working with elementary types, however, Java has several ready-made mechanisms for storing class data, as well as many libraries that work with the same formats and differ from each other in performance, size, and capabilities.
Below we discuss typical ways to save data: the one built into the standard Java library, as well as saving to XML and JSON.
The simplest feature in the standard Java library is fully automatic saving and restoring of data in binary form. To use it, all you need to do is declare the Serializable interface on every class whose data should be saved and restored automatically. This is a "marker" interface that does not require implementing any methods; it simply indicates that the data of the class should be saved and restored.
Using such a class is elementary: one operation, without writing a single extra letter.
    import java.io.*

    // outn and Holder are helpers from the article's test project; they are not shown here.
    class DataClass(s : String) : Serializable {
        @JvmField var strField = ""
        @JvmField var intField = 0
        @JvmField var dbField = 0.0
        @JvmField protected var strProt = ""
        private var strPriv = ""
        @JvmField val valStr : String
        @JvmField protected val valProt : String

        init {
            valStr = s
            valProt = "prot=" + s
            strField = s + ":baseText"
            intField = s.hashCode()
            dbField = s.hashCode().toDouble() / 1000
            strProt = s + ":prot"
            strPriv = s + ":priv"
        }

        fun print() {
            outn("str = [%s]\nint = [%d]\ndb = [%f]", strField, intField, dbField)
            outn("prot = [%s]\npriv = [%s]", strProt, strPriv)
            outn("value = [%s]\nprot value = [%s]", valStr, valProt)
        }
    }

    fun Action() {
        outn("Simple object IO test")
        val a = DataClass("dataA")
        outn("Saved contents:")
        a.print()
        // Save the object to a file, then load it back as a new object.
        Holder(ObjectOutputStream(File("out.bin").outputStream())).h.writeObject(a)
        val b = Holder(ObjectInputStream(File("out.bin").inputStream())).h.readObject()
        outn("Class: %s", b.javaClass.name)
        if (b is DataClass) {
            outn("Loaded contents:")
            b.print()
        }
    }
As a result of this test, we get the following output:
    Simple object IO test
    Saved contents:
    str = [dataA:baseText]
    int = [95356375]
    db = [95356,375000]
    prot = [dataA:prot]
    priv = [dataA:priv]
    value = [dataA]
    prot value = [prot=dataA]
    Class: app.test.Externalize.Test$DataClass
    Loaded contents:
    str = [dataA:baseText]
    int = [95356375]
    db = [95356,375000]
    prot = [dataA:prot]
    priv = [dataA:priv]
    value = [dataA]
    prot value = [prot=dataA]
As the result shows, the object was saved and restored successfully, and after restoration the new object has exactly the same content as the saved one.
Running the program created the file "out.bin", 244 bytes in size, in binary format. A description of the format can be found in a variety of sources, but in my opinion there is no point in studying it; it is enough that the save and restore operations understand it.
If we consider the above example in more detail, we can see the following features.
Fields marked "private" and "protected" were saved and restored.
Fields declared as "val", i.e. immutable by Kotlin standards, were processed as well.
As a result, it is clear that the state of the object is saved and restored bypassing all the syntactic restrictions stated in the program text. Sometimes this feature of the implementation is a plus, but sometimes it can be a fundamental obstacle to using it.
To save data, a special stream, ObjectOutputStream, is used (with its counterpart, ObjectInputStream, for loading). This stream can work with any data types, including whole objects, which is what we used. The data it produces consists of independent blocks of information, so there are no restrictions on its use. You can save as many objects or elementary values as you like to one stream; the main thing is to read them back in the same order in which they were written.
An important feature of this mechanism is that the reading functions automatically check the boundaries of what was written and will not let you read data outside the block in which it was written.
If an object was written, its contents cannot be read byte by byte.
The byte-reading function throws an exception as soon as an attempt is made to read data from a block that was not saved as bytes. This is a rudimentary data protection mechanism, which is very useful because it provides automatic data integrity checking.
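The ordering and boundary checks described above can be tried with a small in-memory sketch. The function names here (writeSample, readInOrder, readingObjectAsByteFails) are made up for illustration; only the java.io stream classes come from the standard library.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.IOException
import java.io.ObjectInputStream
import java.io.ObjectOutputStream

// Writes an Int and then a String object into one in-memory stream.
fun writeSample(): ByteArray {
    val buffer = ByteArrayOutputStream()
    ObjectOutputStream(buffer).use { out ->
        out.writeInt(42)
        out.writeObject("hello")
    }
    return buffer.toByteArray()
}

// Reads the values back in the same order in which they were written.
fun readInOrder(bytes: ByteArray): Pair<Int, Any?> =
    ObjectInputStream(ByteArrayInputStream(bytes)).use { input ->
        val number = input.readInt()
        val text = input.readObject()
        number to text
    }

// Tries to read a byte where an object was written.
// Returns true if the stream refuses with an I/O exception.
fun readingObjectAsByteFails(bytes: ByteArray): Boolean =
    ObjectInputStream(ByteArrayInputStream(bytes)).use { input ->
        input.readInt()
        try {
            input.readByte()
            false
        } catch (e: IOException) {
            true
        }
    }

fun main() {
    val bytes = writeSample()
    println(readInOrder(bytes))
    println(readingObjectAsByteFails(bytes))
}
```

The sketch catches the general IOException rather than a specific subclass, so it does not depend on which exact exception the stream chooses to raise.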
The classes for saving and restoring data are streams, so their contents can be wrapped by any class that processes streams. You can compress the stored data, encrypt it, transfer it over the network, or save it to memory, an archive, or any internal container.
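As an illustration of wrapping, here is a minimal sketch that compresses the serialized data with GZIP and keeps it in memory. The helper names saveCompressed and loadCompressed are hypothetical; the stream classes are from the standard Java library.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.ObjectInputStream
import java.io.ObjectOutputStream
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Serializes the value through a GZIP stream into a byte array.
fun saveCompressed(value: Any): ByteArray {
    val buffer = ByteArrayOutputStream()
    // Closing the object stream also closes and finishes the GZIP stream.
    ObjectOutputStream(GZIPOutputStream(buffer)).use { it.writeObject(value) }
    return buffer.toByteArray()
}

// Restores a value that was written by saveCompressed.
fun loadCompressed(bytes: ByteArray): Any? =
    ObjectInputStream(GZIPInputStream(ByteArrayInputStream(bytes))).use { it.readObject() }

fun main() {
    val text = "abc".repeat(1000)
    val packed = saveCompressed(text)
    println("packed size: ${packed.size}")
    println(loadCompressed(packed) == text)
}
```

The same pattern should apply to any other stream wrapper, for example a cipher stream or a socket stream, as long as the reading side unwraps in the matching order.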
Despite its seeming simplicity, this method is a very powerful mechanism that is extremely easy to use. It has its drawbacks, which are described below, but its capabilities are often sufficient for everything a programmer may need.
Each unique object is saved to the stream only once. If several saved objects hold links to the same object, its data is saved only once, and for the rest only a link to the already saved copy is recorded.
When the data is restored, all links are restored in the same form in which they existed in the original objects.
When objects are saved, their references to each other are tracked automatically, and after restoration the corresponding objects refer to the same objects. That is, if you save object A and object B, and one of the fields of A is a reference to the saved object B, then not two different copies of B are saved but only one. When the fields are restored, the new object A still refers to the object B restored from the same stream, i.e. the connection between the objects is restored.
This feature lets you save, completely transparently, a connected hierarchy of objects that link to each other, without destroying relationships or duplicating data.
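The link tracking described above can be checked with a short sketch. The classes Person and Team and the helper roundTrip are invented for this illustration and are not part of the article's test project.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.ObjectInputStream
import java.io.ObjectOutputStream
import java.io.Serializable

class Person(val name: String) : Serializable

// Both fields may point to the same Person instance.
class Team(val lead: Person, val reviewer: Person) : Serializable

// Saves the value to memory and immediately restores it as a new object.
fun roundTrip(value: Any): Any {
    val buffer = ByteArrayOutputStream()
    ObjectOutputStream(buffer).use { it.writeObject(value) }
    return ObjectInputStream(ByteArrayInputStream(buffer.toByteArray())).use { it.readObject() }
}

// Returns true if the two fields still share one object after restoring,
// and that object is a new copy rather than the original instance.
fun sharedReferenceSurvives(): Boolean {
    val shared = Person("Ann")
    val restored = roundTrip(Team(shared, shared)) as Team
    return restored.lead === restored.reviewer && restored.lead !== shared
}

fun main() {
    println(sharedReferenceSurvives())
}
```

The identity comparison (===) is the point of the sketch: an equality check alone would not distinguish one shared copy from two equal copies.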
Classes of type «enum» are supported and restored correctly.
Saving and restoring is supported for any objects that carry the Serializable marker interface. In particular, all standard JDK collections based on List, Set and Map are saved automatically, since all their implementations have this marker.
That is, to save and restore all the elements of a list or even a tree, no additional code is needed; it is enough that the objects are marked with the "Serializable" interface.
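A minimal sketch of saving standard collections and enum values without any extra code. The Color enum and the roundTrip helper are assumptions made for this example.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.ObjectInputStream
import java.io.ObjectOutputStream

enum class Color { RED, GREEN }

// Saves the value to memory and immediately restores it as a new object.
fun roundTrip(value: Any): Any {
    val buffer = ByteArrayOutputStream()
    ObjectOutputStream(buffer).use { it.writeObject(value) }
    return ObjectInputStream(ByteArrayInputStream(buffer.toByteArray())).use { it.readObject() }
}

fun main() {
    // A HashMap with enum keys and ArrayList values; all of them are serializable.
    val original = hashMapOf(
        Color.RED to arrayListOf(1, 2, 3),
        Color.GREEN to arrayListOf(4)
    )
    val restored = roundTrip(original)
    // The restored map is a different object with equal contents.
    println(restored == original)
    // An enum constant is restored as the existing constant, not as a copy.
    println(roundTrip(Color.GREEN) === Color.GREEN)
}
```

Both comparisons are expected to hold: the collection is compared by content, while the enum constant is compared by identity.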
To more accurately control the process of saving and restoring data, you can use additional mechanisms.
Since object data is stored fully automatically, there must be a mechanism that controls the compatibility of the saved data with the current structure of the object. That is, if we add or delete a field in the class, or change the order of the fields or their types, then on restore the data must end up in the place for which it is intended.
Such a compatibility control mechanism exists. When saving, the library automatically calculates a code for the class that describes its state, and when restoring it checks whether the object is compatible with the data that was previously saved for it. If fields were added to the class being restored or their order was changed, the data can still be recovered from a saved copy; but if fields were deleted or their type was changed, such data can no longer be recovered, and an attempt to read such an object throws an exception.
The code describing the state of the class can be calculated automatically when the object is saved, but if the class is not going to change later, or you want to disable this check, you can use a special class field.
    class DataClass : Serializable {
        companion object {
            private const val serialVersionUID = 1L
        }
    }
This field must be a static constant of type Long, declared in the class.
In Kotlin, this constant must be declared with the @JvmStatic annotation or the const modifier, otherwise the loading library will not see it.
When data is restored, the code value from the stream is compared with the one calculated, or specified by the constant, for the required class, and if the values do not match an exception is thrown.
The access modifier of the serialVersionUID field does not matter; it can be either public or hidden.
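To check that a constant declared this way is actually picked up, one can ask the library which identifier it uses for the class. This is a small sketch; the class name Versioned and the value 42 are arbitrary choices for the illustration.

```kotlin
import java.io.ObjectStreamClass
import java.io.Serializable

class Versioned : Serializable {
    companion object {
        // A const in the companion object becomes a static field of Versioned.
        private const val serialVersionUID = 42L
    }
}

// Returns the identifier the serialization library will use for Versioned.
fun declaredVersion(): Long =
    ObjectStreamClass.lookup(Versioned::class.java).serialVersionUID

fun main() {
    println(declaredVersion())
}
```

If the constant were not visible to the library, the returned value would be the automatically calculated one instead of the declared one, so this is a quick way to verify a declaration.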
Once the class has taken its final form and no further changes are planned, it is recommended to declare this constant in the class, to avoid calculating it on every load and save. The value of the constant can reflect the real state of the class, in which case it must be calculated with library methods, or it can be any arbitrary value if matching the class is not important.
The class state value can be calculated with the «serialver» utility from the Java distribution, but it is inconvenient to use, so it is much easier to obtain the value programmatically. To do this, in a program that uses the required class, call the method that calculates its state and put the resulting value into the serialVersionUID field.
    fun Action() {
        // println takes a single argument, so the value is formatted first.
        println("ID: %d".format(ObjectStreamClass.lookup(DataClass::class.java).serialVersionUID))
        //…
    }
The output of the program:
ID: 991989581060349712
Often you need to save and restore not all of an object's data but only part of it, or to restore it in a form that does not match its actual type.
The first means of control is the exclusion mechanism.
To exclude a field from processing, it must be marked as "transient". Java uses a special keyword for this; Kotlin requires a special annotation, @Transient.
    class DataClass : Serializable {
        @JvmField var strField = ""
        @JvmField var intField = 0
        @Transient @JvmField var dbField = 0.0
    }
When processing objects of this class, the serialization library will neither save nor restore the value of the "dbField" field. All other fields are saved and restored as usual.
This mechanism is convenient for fields whose values are meaningless to save or cannot be saved.
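The effect of the annotation can be seen in a short sketch. The Session class and the restore helper are invented for the illustration. Note that the excluded field does not get its initializer value back after loading, because the constructor of a serializable class is not run during restoration; the field is left at its default value.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.ObjectInputStream
import java.io.ObjectOutputStream
import java.io.Serializable

class Session(val user: String) : Serializable {
    // Excluded from saving; after loading it holds the default value (null).
    @Transient
    var cachedToken: String? = "secret"
}

// Saves the session to memory and restores it as a new object.
fun restore(session: Session): Session {
    val buffer = ByteArrayOutputStream()
    ObjectOutputStream(buffer).use { it.writeObject(session) }
    return ObjectInputStream(ByteArrayInputStream(buffer.toByteArray()))
        .use { it.readObject() as Session }
}

fun main() {
    val loaded = restore(Session("ann"))
    println("user = ${loaded.user}, token = ${loaded.cachedToken}")
}
```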
The programmer must set the values of fields that are not processed automatically on their own after loading. The "readResolve" method, described below, can be used for this.
The second means of controlling saving is to specify the names and types of the fields that will be used for saving and loading.
As an object evolves, you sometimes need to be able to restore its contents even though its format has changed completely. To do this, you must be able to read fields that no longer exist in the class, or whose name or type has changed. This can be done with the field filtering mechanism, by declaring a static constant named serialPersistentFields.
    open class DataClass(s : String) : Serializable {
        @JvmField var strField = ""
        @JvmField var intField = 0
        @JvmField var dbField = 0.0

        companion object {
            private const val serialVersionUID1 = 1L
            // Only the fields listed here are saved and restored.
            @JvmStatic val serialPersistentFields = arrayOf(
                ObjectStreamField("strField", String::class.java),
                ObjectStreamField("intField", Int::class.java)
            )
        }

        init {
            strField = s + ":baseText"
            intField = s.hashCode()
            dbField = s.hashCode().toDouble() / 1000
        }

        fun print() = outn("str = [%s]\nint = [%d]\ndb = [%f]", strField, intField, dbField)
    }

    fun Action() {
        val a = DataClass("dataA")
        outn("Saved contents:")
        a.print()
        Holder(ObjectOutputStream(File("out.bin").outputStream())).h.writeObject(a)
        val b = Holder(ObjectInputStream(File("out.bin").inputStream())).h.readObject()
        if (b is DataClass) {
            outn("Loaded contents:")
            b.print()
        }
    }
Now our example saves only two of the three available fields.
    Saved contents:
    str = [dataA:baseText]
    int = [95356375]
    db = [95356,375000]
    Loaded contents:
    str = [dataA:baseText]
    int = [95356375]
    db = [0,000000]
Sometimes the structure, or the changes to it, cannot be described in a way that lets the automated tools work without errors. In this case the values of the class fields have to be saved or restored manually, ideally without losing all the advantages the automated tools provide.
Suppose one of the fields in our object has been renamed, and data must be saved and loaded under both the new and the old name. The automated tools cannot cope with such a change, but you can handle the fields of the object manually. For a serializable object you can declare writeObject and readObject functions, which are called to save and load its contents.
    open class DataClass(s : String) : Serializable {
        @JvmField var strField = ""
        @JvmField var intFieldChanged = 0
        @JvmField var dbField = 0.0

        companion object {
            private const val serialVersionUID1 = 1L
            // The stream still uses the old field name "intField".
            @JvmStatic val serialPersistentFields = arrayOf(
                ObjectStreamField("strField", String::class.java),
                ObjectStreamField("intField", Int::class.java)
            )
        }

        init {
            strField = s + ":baseText"
            intFieldChanged = s.hashCode()
            dbField = s.hashCode().toDouble() / 1000
        }

        fun print() = outn("str = [%s]\nint = [%d]\ndb = [%f]", strField, intFieldChanged, dbField)

        // Maps the stored "intField" value onto the renamed class field.
        private fun readObject(s : ObjectInputStream) {
            val fields = s.readFields()
            strField = fields.get("strField", "" as Any?) as String
            intFieldChanged = fields.get("intField", 0)
        }

        // Writes the renamed class field under its old name.
        private fun writeObject(s : ObjectOutputStream) {
            val fields = s.putFields()
            fields.put("strField", strField as Any?)
            fields.put("intField", intFieldChanged)
            s.writeFields()
        }
    }
In this example the class field is now called intFieldChanged, but in the saved data it still appears under the name intField. This makes it possible to load data saved under the old name and to save data in such a way that the old class can load it.
The writeObject and readObject methods can implement arbitrary logic for saving and loading data.
You can use the mechanisms provided by the library, as in the example above, or you can save and restore the object completely manually. In the latter case it is harder to keep the saved structure backward compatible, as the example does, but often there is no such need. With manual handling you must make sure that the data is restored in the same order in which it was saved.
    open class DataClass(s : String) : Serializable {
        @JvmField var strField = ""
        @JvmField var intField = 0

        companion object {
            private const val serialVersionUID1 = 1L
        }

        init {
            strField = s + ":baseText"
            intField = s.hashCode()
        }

        fun print() = outn("str = [%s]\nint = [%d]", strField, intField)

        // Values are read in exactly the order in which writeObject wrote them.
        private fun readObject(s : ObjectInputStream) {
            strField = s.readUTF()
            intField = s.readInt()
        }

        private fun writeObject(s : ObjectOutputStream) {
            s.writeUTF(strField)
            s.writeInt(intField)
        }
    }
When restoring data, the library automatically restores the links between objects, but this is not enough for objects that must exist in a single copy: loading creates a new object, whereas the loaded elements should refer to the one that already exists in the program.
This behavior can also be provided.
To do so, the class that must ensure uniqueness only needs to declare the readResolve method. This method is called after any object of this class is loaded and allows it to be replaced with another object.
    class LinkedData private constructor(@JvmField val value : Int) : Serializable {
        companion object {
            @JvmField val ZERO = LinkedData(0)
            @JvmField val NONZERO = LinkedData(1)
            @JvmStatic fun make(v : Int) = if (v == 0) ZERO else NONZERO
        }

        // Replaces the freshly loaded copy with one of the existing instances.
        private fun readResolve() : Any = if (value == 0) ZERO else NONZERO
    }

    open class DataClass(v : Int) : Serializable {
        @JvmField val link = LinkedData.make(v)
        @JvmField var intField = v

        companion object {
            private const val serialVersionUID1 = 1L
        }

        fun print() = outn("int = [%d]\nlink = [%s]", intField,
            if (link == LinkedData.ZERO) "ZERO"
            else if (link == LinkedData.NONZERO) "NONZERO"
            else "OTHER!")
    }
The result of the program:
    Saved contents:
    int = [100]
    link = [NONZERO]
    Loaded contents:
    int = [100]
    link = [NONZERO]
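As the output shows, thanks to the readResolve method in LinkedData, the restored link field refers to the existing NONZERO object rather than to a new copy.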
The ObjectInputStream and ObjectOutputStream classes can also be extended; ObjectOutputStream, for example, provides the overridable methods annotateClass and annotateProxyClass.
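For cases where full control is needed, Java provides a second interface, Externalizable. Its methods receive the ObjectInput and ObjectOutput interfaces. Unlike Serializable, nothing is saved automatically: the class reads all of its data in readExternal and writes it in writeExternal. The class must also have a public constructor without parameters, which is called before readExternal.
An example of a class that implements Externalizable: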
    class LinkedData private constructor(@JvmField val value : Int) {
        companion object {
            @JvmField val ZERO = LinkedData(0)
            @JvmField val NONZERO = LinkedData(1)
            @JvmStatic fun make(v : Int) = if (v == 0) ZERO else NONZERO
        }
    }

    open class DataClass : Externalizable {
        @JvmField var link : LinkedData
        @JvmField var intField : Int

        // The constructor without parameters is used when the object is loaded.
        constructor() { link = LinkedData.ZERO; intField = 0 }
        constructor(v : Int) { link = LinkedData.make(v); intField = v }

        fun print() = outn("int = [%d]\nlink = [%s]", intField,
            if (link == LinkedData.ZERO) "ZERO"
            else if (link == LinkedData.NONZERO) "NONZERO"
            else "OTHER!")

        override fun readExternal(s : ObjectInput) {
            link = if (s.readByte().toInt() == 0) LinkedData.ZERO else LinkedData.NONZERO
            intField = s.readInt()
        }

        override fun writeExternal(s : ObjectOutput) {
            s.writeByte(if (link == LinkedData.ZERO) 0 else 1)
            s.writeInt(intField)
        }
    }
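In this example the link field is written as a single byte, and on loading link is set to one of the existing LinkedData instances, so LinkedData itself does not have to be serializable.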
A sample of the generated test items:
    0) <noname> addCaller.t_ForOrGetPut org.sun.NotNotEmptyGetEach( SetEmptyCombineSplit )
    1) fNotRandom addCaller.For(void, addCaller.t_HasEmpty Has, addCaller.t_Combine GetCombineHas )
    2) fOrSet app.PutHasForGet( app.t_Set HasCombineAdd, app.t_ForNotOrAdd Combine )
    3) fHasSplit org.sun.Set( Set, sec.sun.t_JoinEmptyHasCombineCombine EmptyCombineOr )
    4) fJoinEach sec.sun.t_OrForEmptySet sec.sun.EachPutOrNot( org.sun.t_Combine SetHasSplitJoinEmpty, void )
    5) fEachSet org.sun.t_RandomSplit app.OrIsFor( sec.sun.t_Set CombineGetRandom, void )
    6) <noname> app.t_NotSetForForGet sun.NotHasForSplitAdd( org.sun.t_IsRandomOrHas Each, void)
    7) fNotSplit addCaller.t_NotAdd sec.sun.IsHasNot( app.t_HasForSplitHas ForGet, void )
    8) <noname> sun.SetForSplitSet( PutCombine, void, void )
    9) fCombineNot sun.t_SplitRandomGetRandom sun.AddAdd( void, org.sun.t_NotRandomHasEmpty AddPutNotSplit )
    10) <noname> addCaller.t_ForIs sun.EachIs( NotFor, void, void, PutSplitAddNot )
    11) fSetOr app.HasJoin( OrOr, void, void, addCaller.t_NotAddHas Each )
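The following tests were compared:
- SerialFull: saving with Serializable.
- Extern+Ser: Externalizable combined with Serializable.
- ExternFull: saving with Externalizable.
- JsJsonMini: the «minimal-json» library, JSON format. Version minimal-json-0.9.4.jar, source: https://github.com/ralfstx/minimal-json .
- JsJackAnn and JsJackSream: the fasterXML-jackson library, JSON format. Version 2.0.4, source: https://github.com/FasterXML/jackson . The first variant (JsJackAnn) uses annotations-databind; the second (JsJackSream) uses stream.
- XMLw3c: XML built with org.w3c.dom.Document and org.w3c.dom.Element.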
Name | Version | Source | Files
---|---|---|---
XML, Serializable, Externalizable | Java 1.8 | Java | Bundled with Java.
minimal-json | 0.9.4 | https://github.com/ralfstx/minimal-json | minimal-json-0.9.4.jar – 30
fasterXML-jackson | 2.0.4 | https://github.com/FasterXML/jackson | jackson-core-2.0.4.jar – 194, jackson-databind-2.0.4.jar – 847, jackson-annotations-2.0.4.jar – 34
The test program accepts the following options:
    USAGE: SerializableTest.jar [-opts]
    Where OPTS are:
        -Count=<number> - set number of items to generate
        -Retry=<number> - set number of iterations for each test
        -Out=<file>     - set file name to output
        -Nout           - disable items output
        -gc             - run gc after every test
    >sb -n "-c=100000" "-r=10"
    Output file : test_out
    Number of elements: 100000
    Number of retries : 10
    Tests complete in 0:01:28.050 sec :: Save 0:00:31.903, Load 0:00:30.149, Total 0:01:02.103, Waste 0:00:25.947
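The test was run on 100,000 elements with 10 retries. The whole run took 1 minute 28 seconds, of which 25.9 seconds were overhead (Waste).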
Place | Name | Save | Slower | Faster | Load | Slower | Faster | Total | Slower | Faster | File size
---|---|---|---|---|---|---|---|---|---|---|---
6 | SerialFull | 0:00:07.599 | 2.34 | | 0:00:04.217 | 1.05 | 1.45 | 0:00:11.826 | 1.56 | 0.41 | 18
1 | ExternFull | 0:00:02.550 | 0.12 | 1.98 | 0:00:02.061 | | 4.02 | 0:00:04.616 | | 2.60 | 16
5 | Extern+Ser | 0:00:05.744 | 1.52 | 0.32 | 0:00:04.112 | 1.00 | 1.51 | 0:00:09.862 | 1.14 | 0.69 | 22.5
7 | XMLw3c | 0:00:06.278 | 1.76 | 0.21 | 0:00:10.337 | 4.02 | | 0:00:16.620 | 2.60 | | 32
4 | JsJsonMini | 0:00:04.678 | 1.05 | 0.62 | 0:00:04.614 | 1.24 | 1.24 | 0:00:09.302 | 1.02 | 0.79 | 25.9
3 | JsJackAnn | 0:00:02.776 | 0.22 | 1.74 | 0:00:02.431 | 0.18 | 3.25 | 0:00:05.215 | 0.13 | 2.19 | 25.9
2 | JsJackSream | 0:00:02.278 | | 2.34 | 0:00:02.377 | 0.15 | 3.35 | 0:00:04.662 | 0.01 | 2.56 | 25.9

In each group, "Slower" is the relative lag behind the fastest test, (t − best) / best, and "Faster" is the relative lead over the slowest test, (worst − t) / t; the cell is empty for the fastest or the slowest test itself.
Source: https://habr.com/ru/post/319604/