
JSON vs. YAML Comparison

The day has come: the configuration files for our application have grown so large that the managers hinted that our JSON configs contain suspiciously many curly and square brackets, and that they would like to get rid of them. A subtle hint followed that it would be nice to take a closer look at YAML, because it is rumored to be very human-readable. And there are no brackets. And the lists are beautiful. Naturally, we could not ignore our seniors, so we had to study the question and look for the differences, the pros and cons of both formats. Obviously, such comparisons are started only to confirm the leaders' opinion, and even if it is not confirmed, they will still find a reason why they are right and why you should make the change :)




I am sure that many people are familiar with these formats, but I’ll give you a short description from Wikipedia:
JSON (JavaScript Object Notation) is a text-based data exchange format based on JavaScript and commonly used with that language. Like many other text formats, JSON is easy for people to read. Despite being derived from JavaScript (more precisely, from a subset of the 1999 ECMA-262 language standard), the format is considered language-independent and can be used with almost any programming language. For many languages there is ready-made code for creating and processing data in JSON format.

YAML is a human-readable data serialization format, conceptually close to markup languages, but focused on convenient I/O of the typical data structures of many programming languages. The name YAML is a recursive acronym: YAML Ain't Markup Language (“YAML is not a markup language”). The name reflects the history of its development: in the early stages the language was called Yet Another Markup Language (“yet another markup language”) and was even considered a competitor to XML, but it was later renamed to put the focus on data rather than on document markup.

So, what do we need?


Obviously, we will not write our own parsers, so to begin with we will pick an existing parser for each format.
For JSON we'll use Gson (from Google), and for YAML, SnakeYAML (author unknown to me).

As you can see, everything is simple: we just need to create a fairly complex model that imitates the complexity of the config files, and write a module that tests the YAML and JSON parsers. Let's get started.
We need a model like this: 20 attributes of different types + 5 collections with 5-10 elements + 5 nested objects with 5-10 elements and 5 collections.
This stage of the comparison can safely be called the most boring and uninteresting one. Classes were created with unremarkable names like Model, Embedded1, and so on. But we are not chasing code readability here (at least not in this part), so we'll leave it at that.
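For orientation, here is a rough sketch of what one of the nested classes might look like. The field names and values are taken from the embedded2 excerpt shown in this article; the field types, the public fields, and the constructor are my assumptions, since the original classes are not shown.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reconstruction of one nested model object.
// Names and values follow the embedded2 excerpt; the types are guesses.
class Embedded2 {
    public String strel1 = "el1";
    public String strel2 = "el2";
    public String strel4 = "el4";
    public String strel5 = "el5";
    public String strel6 = "el6";
    public String strel7 = "el7";

    public int intel1 = 1;
    public int intel2 = 2;
    public int intel3 = 3;

    public List<Integer> list1 = Arrays.asList(1, 2, 3, 4, 5);
    public List<Integer> list2 = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
    public List<String> list3 = Arrays.asList("1", "2", "3", "4");
    public List<String> list4 = Arrays.asList("1", "2", "3", "4", "5", "6");

    // map1 has string keys and integer values, map2 the other way round
    // (this is how the YAML excerpt quotes them).
    public Map<String, Integer> map1 = new LinkedHashMap<>();
    public Map<Integer, String> map2 = new LinkedHashMap<>();

    public Embedded2() {
        map1.put("3", 3);
        map1.put("2", 2);
        map1.put("1", 1);
        map2.put(1, "1");
        map2.put(2, "2");
        map2.put(3, "3");
    }
}
```

Public fields keep the sketch short; getters and setters would do the same job.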
file.json
    "embedded2": {
        "strel1": "el1",
        "strel2": "el2",
        "strel4": "el4",
        "strel5": "el5",
        "strel6": "el6",
        "strel7": "el7",
        "intel1": 1,
        "intel2": 2,
        "intel3": 3,
        "list1": [ 1, 2, 3, 4, 5 ],
        "list2": [ 1, 2, 3, 4, 5, 6, 7 ],
        "list3": [ "1", "2", "3", "4" ],
        "list4": [ "1", "2", "3", "4", "5", "6" ],
        "map1": { "3": 3, "2": 2, "1": 1 },
        "map2": { "1": "1", "2": "2", "3": "3" }
    }

file.yml
    embedded2:
      intel1: 1
      intel2: 2
      intel3: 3
      list1:
      - 1
      - 2
      - 3
      - 4
      - 5
      list2:
      - 1
      - 2
      - 3
      - 4
      - 5
      - 6
      - 7
      list3:
      - '1'
      - '2'
      - '3'
      - '4'
      list4:
      - '1'
      - '2'
      - '3'
      - '4'
      - '5'
      - '6'
      map1:
        '3': 3
        '2': 2
        '1': 1
      map2:
        1: '1'
        2: '2'
        3: '3'
      strel1: el1
      strel2: el2
      strel4: el4
      strel5: el5
      strel6: el6
      strel7: el7

I agree that human readability is a rather subjective criterion. Still, in my opinion, YAML looks a bit more pleasant and is more intuitive.

Next, we write a small wrapper for each parser, with serialize() and deserialize() methods.
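Both wrappers implement a common Parser interface that is not shown in the article. Judging by the method signatures in the wrappers, it probably looks roughly like the sketch below. InMemoryParser is a made-up stand-in that only demonstrates the contract; it is not part of the original code.

```java
// Assumed shape of the shared interface, inferred from the two wrappers.
interface Parser<T> {
    // Writes the object to wherever the implementation stores it.
    void serialize(T value);

    // Reads the object back; the wrappers return null when reading fails.
    T deserialize();
}

// Hypothetical stand-in that keeps the value in memory instead of a file.
final class InMemoryParser<T> implements Parser<T> {
    private T stored;

    @Override
    public void serialize(T value) {
        stored = value;
    }

    @Override
    public T deserialize() {
        return stored;
    }
}
```

With such an interface, the test code can treat the JSON and YAML wrappers uniformly.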

yaml parser
    import org.yaml.snakeyaml.DumperOptions;
    import org.yaml.snakeyaml.Yaml;
    import org.yaml.snakeyaml.error.YAMLException;

    import java.io.FileInputStream;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.InputStream;

    public class BookYAMLParser implements Parser<Book> {
        private final String filename;

        public BookYAMLParser(String filename) {
            this.filename = filename;
        }

        @Override
        public void serialize(Book book) {
            // Block style puts every key and list item on its own line.
            DumperOptions options = new DumperOptions();
            options.setDefaultFlowStyle(DumperOptions.FlowStyle.BLOCK);
            Yaml yaml = new Yaml(options);
            // try-with-resources closes the writer even if dump() fails.
            try (FileWriter writer = new FileWriter(filename)) {
                yaml.dump(book, writer);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        @Override
        public Book deserialize() {
            try (InputStream input = new FileInputStream(filename)) {
                Yaml yaml = new Yaml();
                return (Book) yaml.load(input);
            } catch (IOException | YAMLException e) {
                // SnakeYAML reports syntax errors as the unchecked YAMLException.
                e.printStackTrace();
            } catch (RuntimeException e) {
                // Anything else is rethrown with the file name added to the message.
                String message = "Exception in file " + filename + ", ";
                throw new RuntimeException(message + e.getMessage(), e);
            }
            return null;
        }
    }


json parser
    import com.google.gson.Gson;
    import com.google.gson.GsonBuilder;
    import com.google.gson.stream.JsonReader;

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;

    public class BookJSONParser implements Parser<Book> {
        private final String filename;

        public BookJSONParser(String filename) {
            this.filename = filename;
        }

        @Override
        public void serialize(Book book) {
            Gson gson = new GsonBuilder().setPrettyPrinting().create();
            // try-with-resources closes the writer even if writing fails.
            try (FileWriter writer = new FileWriter(filename)) {
                writer.write(gson.toJson(book));
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        @Override
        public Book deserialize() {
            Gson gson = new Gson();
            // Closing the JsonReader also closes the underlying file reader.
            try (JsonReader jsonReader = new JsonReader(
                    new BufferedReader(new FileReader(filename)))) {
                return gson.fromJson(jsonReader, Book.class);
            } catch (IOException e) {
                e.printStackTrace();
            }
            return null;
        }
    }


As we can see, both formats are supported in Java. But for JSON the choice of libraries is undeniably much wider.
The parsers are ready, so now consider the comparison itself. Here, too, everything is extremely simple and obvious: a simple method deserializes objects from a file 30 times. If anyone is interested, the code is below.

testing code
    import org.apache.commons.lang3.time.StopWatch; // assumed: the StopWatch from Apache Commons Lang

    // The class name is illustrative; the original shows only main().
    public class ParserComparison {
        // Each file is deserialized 30 times, as described above.
        private static final int LOOPS = 30;

        public static void main(String[] args) {
            String jsonFilename = "file.json";
            String yamlFilename = "file.yml";

            // Write the same object in both formats so each parser has a file to read.
            BookJSONParser jsonParser = new BookJSONParser(jsonFilename);
            jsonParser.serialize(new Book(new Author("name", "123-123-123"), 123, "dfsas"));
            BookYAMLParser yamlParser = new BookYAMLParser(yamlFilename);
            yamlParser.serialize(new Book(new Author("name", "123-123-123"), 123, "dfsas"));

            // JSON deserialization
            StopWatch stopWatch = new StopWatch();
            stopWatch.start();
            for (int i = 0; i < LOOPS; i++) {
                Book e = jsonParser.deserialize();
            }
            stopWatch.stop();
            System.out.println("json worked: " + stopWatch.getTime());

            stopWatch.reset();

            // YAML deserialization
            stopWatch.start();
            for (int i = 0; i < LOOPS; i++) {
                Book e = yamlParser.deserialize();
            }
            stopWatch.stop();
            System.out.println("yaml worked: " + stopWatch.getTime());
        }
    }


This gives the following output (times in milliseconds):
    json worked: 278
    yaml worked: 669




As you can see, JSON is parsed about 2.4 times faster (278 vs. 669). But the absolute difference is not critical at our scale, so this is not a strong argument in favor of JSON.
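A caveat about the numbers: a plain 30-iteration loop also measures JVM warm-up (class loading, JIT compilation), and this can skew the ratio between the two parsers. A slightly more careful measurement could look like the sketch below. TimingSketch and averageNanos are made-up names, and this is only a rough illustration, not a proper benchmark harness.

```java
import java.util.function.Supplier;

// Hypothetical helper: runs a task a few times unmeasured, then times it.
final class TimingSketch {
    // Keeps the last result reachable so the JIT is less likely to drop the call.
    private static volatile Object sink;

    private TimingSketch() {
    }

    static <T> long averageNanos(Supplier<T> task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        // Warm-up: results are discarded, the time is not counted.
        for (int i = 0; i < warmupRuns; i++) {
            sink = task.get();
        }
        // Measured runs: average wall-clock time per call, in nanoseconds.
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            sink = task.get();
        }
        return (System.nanoTime() - start) / measuredRuns;
    }
}
```

With the wrappers from this article it could be called as TimingSketch.averageNanos(jsonParser::deserialize, 10, 30), and the same way for yamlParser.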
This happens because JSON is parsed "on the fly": it is read character by character and the values are stored in the object immediately, so the object is built in a single pass over the file. Admittedly, I don't know how this particular parser works internally, but this is the general scheme.
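To make the single-pass idea concrete, here is a toy sketch. It is not how Gson works internally. OnePassJsonSketch and parseFlatObject are made-up names, and the code only handles a flat object with unescaped string values and integer values. Each character is visited once, and every value goes into the result map as soon as it has been read.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy single-pass reader for a flat JSON object (strings and integers only).
final class OnePassJsonSketch {
    private final String text;
    private int pos = 0;

    private OnePassJsonSketch(String text) {
        this.text = text;
    }

    static Map<String, Object> parseFlatObject(String text) {
        return new OnePassJsonSketch(text).readObject();
    }

    private Map<String, Object> readObject() {
        Map<String, Object> result = new LinkedHashMap<>();
        expect('{');
        skipWhitespace();
        if (peek() == '}') {
            pos++;
            return result;
        }
        while (true) {
            String key = readString();
            expect(':');
            skipWhitespace();
            // The value is stored immediately; no intermediate tree is built.
            Object value = (peek() == '"') ? readString() : readInteger();
            result.put(key, value);
            skipWhitespace();
            char c = next();
            if (c == '}') {
                return result;
            }
            if (c != ',') {
                throw new IllegalArgumentException("Expected ',' or '}' at " + (pos - 1));
            }
        }
    }

    private String readString() {
        expect('"');
        int start = pos;
        while (peek() != '"') {
            pos++; // escape sequences are not handled in this sketch
        }
        String value = text.substring(start, pos);
        pos++; // consume the closing quote
        return value;
    }

    private Integer readInteger() {
        int start = pos;
        if (peek() == '-') {
            pos++;
        }
        while (pos < text.length() && Character.isDigit(text.charAt(pos))) {
            pos++;
        }
        return Integer.valueOf(text.substring(start, pos));
    }

    private void expect(char expected) {
        skipWhitespace();
        char c = next();
        if (c != expected) {
            throw new IllegalArgumentException("Expected '" + expected + "' at " + (pos - 1));
        }
    }

    private char peek() {
        if (pos >= text.length()) {
            throw new IllegalArgumentException("Unexpected end of input");
        }
        return text.charAt(pos);
    }

    private char next() {
        char c = peek();
        pos++;
        return c;
    }

    private void skipWhitespace() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
    }
}
```

The position only ever moves forward, which is the property the "one pass" description refers to.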
YAML, in turn, is more measured. Processing is split into three stages: first a tree of objects is built, then that tree is transformed in some way, and only then is it converted into the required data structures.
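For contrast, here is a toy staged pipeline. It is not SnakeYAML's code, and the stage boundaries are my assumption, loosely following the parse / compose / construct split described in the YAML specification. All names are made up, and only nested "key: value" mappings with space indentation are handled (no lists, no quoted scalars).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class StagedYamlSketch {

    // Stage 1 output: one event per line; scalar is null for a bare "key:".
    static final class Event {
        final int indent;
        final String key;
        final String scalar;

        Event(int indent, String key, String scalar) {
            this.indent = indent;
            this.key = key;
            this.scalar = scalar;
        }
    }

    // Stage 2 output: a generic node tree, still untyped text.
    static final class Node {
        final String key;
        final String scalar;
        final List<Node> children = new ArrayList<>();

        Node(String key, String scalar) {
            this.key = key;
            this.scalar = scalar;
        }
    }

    // Stage 1: text to events.
    static List<Event> parse(String text) {
        List<Event> events = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) {
                continue;
            }
            int indent = 0;
            while (line.charAt(indent) == ' ') {
                indent++;
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Not a mapping line: " + line);
            }
            String key = line.substring(indent, colon).trim();
            String rest = line.substring(colon + 1).trim();
            events.add(new Event(indent, key, rest.isEmpty() ? null : rest));
        }
        return events;
    }

    // Stage 2: events to a node tree, using indentation to find the parent.
    static Node compose(List<Event> events) {
        Node root = new Node(null, null);
        Deque<Node> parents = new ArrayDeque<>();
        Deque<Integer> indents = new ArrayDeque<>();
        parents.push(root);
        indents.push(-1);
        for (Event e : events) {
            while (e.indent <= indents.peek()) {
                parents.pop();
                indents.pop();
            }
            Node node = new Node(e.key, e.scalar);
            parents.peek().children.add(node);
            if (e.scalar == null) {
                parents.push(node);
                indents.push(e.indent);
            }
        }
        return root;
    }

    // Stage 3: node tree to native Java values (Map, Integer, String).
    static Object construct(Node node) {
        if (node.scalar != null) {
            return node.scalar.matches("-?\\d+")
                    ? (Object) Integer.valueOf(node.scalar)
                    : node.scalar;
        }
        Map<String, Object> map = new LinkedHashMap<>();
        for (Node child : node.children) {
            map.put(child.key, construct(child));
        }
        return map;
    }
}
```

Calling construct(compose(parse(text))) walks the data three times, once per stage. That is one plausible reason for the extra time compared with a single pass.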

A small comparison table ("+": advantage, "-": disadvantage, "+ -": no clear advantage):
                       JSON    YAML
    work speed         +       -
    human readability  -       +
    Java support       + -     + -


How can this be summarized?
Everything is obvious: if speed matters to you, choose JSON; if human readability matters, choose YAML. You just need to decide which is more important. For us it turned out to be the latter.
In fact, there are many other arguments in favor of each format, but I think these two are the most important.

Later on, while working with YAML, I had to deal with rather unpleasant exception handling, especially for syntax errors. I also had to try out different YAML libraries, and in the end some validation had to be written. Validation was tested both with schemas (which required calling Ruby gems) and with bean validation based on JSR-303. If you are interested in any of these topics, I will be happy to answer questions.
Thanks for your attention :)

PS
When I was already finishing this article, I came across the following comparison of YAML and JSON:
www.csc.kth.se/utbildning/kth/kurser/DD143X/dkand11/Group2Mads/victor.hallberg.malin.eriksson.report.pdf

Honestly, it felt as though I had repeated work that had already been done, and done far more thoroughly, but the main thing is the experience :)

PPS
I took the description of how the parsers work from there. My apologies if the translation is not accurate or clear.

Source: https://habr.com/ru/post/238603/

