JSON vs. XML and a bit of refactoring

Introduction

When you build RIAs, sooner or later you have to choose a format for transferring data between a server and its clients. I used to pick XML without a second thought, but lately I have been leaning more and more toward JSON. The plural "clients" above is not a slip: customers increasingly ask for a mobile version of their service, so we have to design a server infrastructure that can support several clients (a browser and, say, an iPhone application) and several versions of each. XML seems to tick all the boxes, and yet doubts arise.

Doubts

Over the past couple of years I have watched the ubiquitous use of XML with growing wariness. Yes, the format is universal, but everything in moderation. XML's main selling point is that it is a data storage format readable by both machines and humans, with the emphasis on being easy to read. The literature will tell you that XML is a hierarchical structure for storing arbitrary data, which can be visualized as a tree.

And here comes the first doubt: take a small tree of two hundred nodes, save it as XML, and see how easy it is to read. Without a dedicated editor with syntax highlighting, such files are almost impossible for a person to read (especially if they are written on a single line with no line breaks or indentation). It is hard to disagree with this.

Thus, one should discard human readability of the format as an advantage and consider XML primarily as a means of communication between services, programs or platforms, a kind of platform-independent means of communication between systems.
This position leads to a second doubt: isn't it expensive? The cost in question is processor time. In our last project we had to measure performance (profile it hard, you might say), and one of the bottlenecks turned out to be serializing objects to XML and restoring them from it. You could argue that we simply needed to clean up our own code, but after almost a month of such cleanup performance did improve, just not enough to satisfy the development team.

The last doubt is the simplest and, at the same time, fundamentally unfixable: the volume of transmitted data, that is, the traffic generated. While there were few clients, the problem did not exist. But once the service got a mobile client and crowds of corporate users poured into the system at the same time, the traffic volume made XML's redundancy (its built-in noise) obvious when it is used as a protocol for a large number of client applications.

The conclusion is that the complexity of using XML exceeds the complexity of the problems it solves. You can come up with plenty of pros and cons yourself, as even the creators of XML [TB, TB2] do. In XML's favor, I should note that it is almost ideal as a protocol for synchronizing servers, and development automation tools let you relax and enjoy the work. But the client traffic problem made us take a look at the JSON data format.

Alternative

The main source of information about JSON is the official site [JSON], which gives the following definition: JSON (JavaScript Object Notation) is a simple data exchange format that is easy for both humans and computers to read and write. Setting aside the part of the definition aimed at promoting JSON to the masses, the format is text-based, open, simple, and well documented [JSONdoc].

In my opinion, JSON is even harder for a human to read than XML. To check this claim, make any request to the twitter.com API, for example search.twitter.com/search.json?q=golodnyj, and try to interpret the result without first studying the API. You will have trouble, if only because JSON does not convey semantics; it is meant to represent content. JSON should therefore be defined as a text-based, open, lightweight protocol for communication between client and server applications, focused on transferring objects.

Switching to JSON noticeably reduced user traffic (by up to 25-30%) and considerably eased the load on the service's network infrastructure. Of course, the XML generation and parsing code had to be replaced with JSON equivalents. That sounds like bad news: significant refactoring, debugging, and lengthy testing. But we were lucky.

Hero of the day

Our savior, at least in this particular corner of the world, was the XStream framework [XS], which, by a happy coincidence, was already used for XML in the part of the system where the experiment took place.

As an example, consider the Client class (see the full source code in Appendix 1), which has three fields, a constructor, and getters and setters:

    package ru.golodnyj.lection.json;

    public class Client {
        private String userName;
        private boolean verified;
        private int userId;

        public Client(String userName, boolean verified, int userId) {
            this.userName = userName;
            this.verified = verified;
            this.userId = userId;
        }
        ...
    }

Also consider the ConsoleJSON class (Appendix 2), which serializes an instance of the Client class first to XML and then to JSON:

    package ru.golodnyj.lection.json;

    ...

    public class ConsoleJSON {
        public static void main(String[] args) {
            Client c = new Client("name", true, 5);

            // XML
            XStream xstream = new XStream(new DomDriver());
            String xml = xstream.toXML(c);
            System.out.println("xml " + xml.length() + " : \n" + xml);

            // JSON
            XStream xstream1 = new XStream(new JsonHierarchicalStreamDriver());
            String json = xstream1.toXML(c);
            System.out.println("json " + json.length() + " : \n" + json);
        }
    }

After running the program, you will see console output like this:

    xml 145 :
    <ru.golodnyj.lection.json.Client>
      <userName>name</userName>
      <verified>true</verified>
      <userId>5</userId>
    </ru.golodnyj.lection.json.Client>
    json 96 :
    {"ru.golodnyj.lection.json.Client": {
      "userName": "name",
      "verified": true,
      "userId": 5
    }}

The XML comes first, at 145 characters, followed by the JSON at 96 characters. The difference is significant even for such a simple object, let alone for complex structures. Of course, both outputs can be made more compact: the fully qualified class name and the pretty-printing are irrelevant for data transfer.
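The character counts above can be checked without running XStream. The sketch below is only an illustration: it rebuilds both outputs by hand, assuming two-space indentation and "\n" line endings (which is what makes the lengths come out to 145 and 96), and computes the relative saving. The class name SizeCheck and its methods are hypothetical and are not part of the article's code.

```java
public class SizeCheck {
    // Pretty-printed XML as in the console output
    // (assumption: 2-space indent, "\n" line endings).
    public static String prettyXml() {
        return String.join("\n",
            "<ru.golodnyj.lection.json.Client>",
            "  <userName>name</userName>",
            "  <verified>true</verified>",
            "  <userId>5</userId>",
            "</ru.golodnyj.lection.json.Client>");
    }

    // Pretty-printed JSON as in the console output (same assumptions).
    public static String prettyJson() {
        return String.join("\n",
            "{\"ru.golodnyj.lection.json.Client\": {",
            "  \"userName\": \"name\",",
            "  \"verified\": true,",
            "  \"userId\": 5",
            "}}");
    }

    // Share of characters saved when going from `before` to `after`, in percent.
    public static double savingPercent(int before, int after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        int xml = prettyXml().length();
        int json = prettyJson().length();
        System.out.println("xml " + xml + ", json " + json
            + ", saving " + savingPercent(xml, json) + "%");
    }
}
```

For this object the JSON form is roughly a third shorter than the XML form (49 of 145 characters saved).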

To tidy up the output, you can use XStream's alias mechanism, which maps a class name to a string of your choice. For JSON, you also need to switch the driver from JsonHierarchicalStreamDriver, which produces formatted output, to JettisonMappedXmlDriver, which is intended for streaming output.

    XStream xstream1 = new XStream(new JettisonMappedXmlDriver());
    xstream1.alias("client", Client.class);

In this case, the JSON for an instance of the Client class will be 57 characters long and look like this:

{"client":{"userName":"name","verified":true,"userId":5}}
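To see where the saving comes from, it helps to compare like with like: compact JSON against equally compact, aliased XML. The following sketch does not use XStream; it hand-builds both forms for a flat record, so CompactSize, toXml, toJson, and sampleClient are hypothetical helpers, and the XML length it reports is computed here rather than taken from the article. Escaping of special characters is deliberately ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompactSize {
    // Compact XML: every field name appears twice, in an opening and a closing tag.
    public static String toXml(String root, Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("<" + root + ">");
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            sb.append('<').append(e.getKey()).append('>')
              .append(e.getValue())
              .append("</").append(e.getKey()).append('>');
        }
        return sb.append("</").append(root).append('>').toString();
    }

    // Compact JSON: every field name appears once; only string values are quoted.
    public static String toJson(String root, Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("{\"" + root + "\":{");
        boolean first = true;
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(e.getKey()).append("\":");
            Object v = e.getValue();
            if (v instanceof String) {
                sb.append('"').append(v).append('"');
            } else {
                sb.append(v);
            }
        }
        return sb.append("}}").toString();
    }

    // The same field values as the Client instance used in the article.
    public static Map<String, Object> sampleClient() {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("userName", "name");
        m.put("verified", true);
        m.put("userId", 5);
        return m;
    }

    public static void main(String[] args) {
        String xml = toXml("client", sampleClient());
        String json = toJson("client", sampleClient());
        System.out.println("xml " + xml.length() + " : " + xml);
        System.out.println("json " + json.length() + " : " + json);
    }
}
```

For this record the sketch yields 85 characters of XML against 57 characters of JSON, a reduction of about 33%, which is of the same order as the 25-30% traffic reduction mentioned earlier.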

Deserializing JSON back into a POJO is just as simple: call the fromXML() method on the XStream instance, passing the JSON string as the argument, and cast the result to the target type. Note that this works with the Jettison-based driver; JsonHierarchicalStreamDriver can only write JSON.

Conclusions

If you look at the ConsoleJSON class again, you will notice how little it takes to switch the output from XML to JSON. The result is a significant saving in generated traffic. In this particular case, the server-side refactoring needed to replace XML with JSON as the data transfer protocol was very simple and took little effort.

References

[TB] Tim Bray www.tbray.org/ongoing/When/200x/2003/03/16/XML-Prog
[TB2] Tim Bray www.tbray.org/ongoing/When/200x/2003/03/24/XMLisOK
[JSON] www.json.org
[JSONdoc] www.json.org/json-ru.html
[XS] xstream.codehaus.org

Appendices

Appendix 1 dumpz.org/12012
Appendix 2 dumpz.org/12013

Source: https://habr.com/ru/post/68621/
