
Mutual transformations of JSON, YAML, XML

JSON and YAML are popular now, while XML technologies are considered a relic of the past.





Let's try to use this "retro technology" to work with JSON and YAML data, and reflect on the reasons for still applying it today.



The task is to move data transformation logic into the application configuration, preferably in a declarative style that is uniform across formats. The data can arrive in various textual serialization formats: JSON, YAML, XML, Java properties, INI files. A Data Lake is too heavy an artillery for this. Loading the data into a document-oriented or object-relational database and querying it there is also over-engineering for the first stage of an ETL transformation.



JsonPath mirrors a subset of XPath, but applies only to JSON. It also does not let you write a declarative query without programming, because there is no JSON counterpart to XQuery. Alternatively, one could use an embedded JVM database with its own declarative query language, but that is a topic for a separate article, and besides, the original data model in JSON and YAML is not relational.


Approach to data requests from JSON / YAML



XQuery can be run against data held in a Document Object Model (DOM). How do we convert JSON / YAML data into a DOM object? You could use camel-xmljson or json2xml, but both accept only JSON as a source. So let's reinvent the wheel with our own dom-transformation library. It accepts a Map<String, Object> as input and turns it into an org.w3c.dom.Node; the inverse transformation is also available.



It remains to turn JSON and YAML into a Map<String, Object>. For example, this can be done with the com.fasterxml.jackson.databind.ObjectMapper class from Jackson.



Turn JSON into a Map:



    ObjectMapper mapper = new ObjectMapper();
    Map<String, Object> objectTree = mapper.readValue(json, new TypeReference<Map<String, Object>>() {});


Turn YAML into a Map:



    ObjectMapper mapper = new ObjectMapper(new YAMLFactory());
    Map<String, Object> objectTree = mapper.readValue(yaml, new TypeReference<Map<String, Object>>() {});


With the library added to the project, we turn the Map into a Document Object Model:



    // Wrap the map in a "root" element unless it already has a single top-level key
    Node document = new DomTransformer(new TypeAutoDetect()).transform(
            objectTree.size() == 1 ? objectTree : Collections.singletonMap("root", objectTree));
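To make the Map-to-DOM step less of a black box, here is a minimal sketch that uses only the JDK. It is not the actual implementation of the dom-transformation library: the class name `MapToDomSketch`, the `item` element used for list entries, and the sample data are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration; NOT the dom-transformation library's code.
class MapToDomSketch {

    /** Builds a DOM document from a map that has exactly one top-level key. */
    static Document toDocument(Map<String, Object> tree) throws ParserConfigurationException {
        if (tree.size() != 1) {
            throw new IllegalArgumentException("expected exactly one root key");
        }
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Map.Entry<String, Object> root = tree.entrySet().iterator().next();
        Element rootElement = doc.createElement(root.getKey());
        doc.appendChild(rootElement);
        fill(doc, rootElement, root.getValue());
        return doc;
    }

    // Assumed mapping: map key -> child element, list entry -> <item> element,
    // scalar -> text content. Keys that are not valid XML names make
    // createElement throw a DOMException; a real library has to escape them.
    private static void fill(Document doc, Element parent, Object value) {
        if (value instanceof Map<?, ?>) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                Element child = doc.createElement(String.valueOf(entry.getKey()));
                parent.appendChild(child);
                fill(doc, child, entry.getValue());
            }
        } else if (value instanceof List<?>) {
            for (Object item : (List<?>) value) {
                Element child = doc.createElement("item");
                parent.appendChild(child);
                fill(doc, child, item);
            }
        } else if (value != null) {
            parent.setTextContent(String.valueOf(value));
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data standing in for a parsed JSON / YAML document
        Map<String, Object> resource = new LinkedHashMap<>();
        resource.put("name", "git-repo");
        resource.put("type", "git");
        Map<String, Object> tree = new LinkedHashMap<>();
        tree.put("resources", Collections.singletonList(resource));

        // Same wrapping rule as in the article: add "root" unless there is one top-level key
        Map<String, Object> wrapped =
                tree.size() == 1 ? tree : Collections.singletonMap("root", tree);
        Document doc = toDocument(wrapped);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

The single-root check reflects the fact that an XML document has exactly one document element, which is presumably why the article wraps multi-key maps in a `root` entry.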


You can use any XQuery implementation to execute queries. I like BaseX because it is an open source project that is still actively developed. We add the dependency org.basex:basex:jar:9.0 to the project and execute a declarative query:



    // Read pipeline.yml from the test resources
    String yaml = IOUtils.toString(
            TranslateTest.class.getResource("/pipeline.yml").toURI(), StandardCharsets.UTF_8);
    // YAML -> Map
    ObjectMapper mapper = new ObjectMapper(new YAMLFactory());
    Map<String, Object> objectGraph = mapper.readValue(yaml, new TypeReference<Map<String, Object>>() {});
    // Map -> DOM, wrapped in a "root" element unless there is a single top-level key
    Node document = new DomTransformer(new TypeAutoDetect()).transform(
            objectGraph.size() == 1 ? objectGraph : Collections.singletonMap("root", objectGraph));
    // Bind the DOM as an external variable and select every element whose text is 'git-repo'
    try (QueryProcessor proc = new QueryProcessor(
            "declare variable $extDataset external; "
            + " $extDataset//*[text()='git-repo']", new Context())) {
        proc.bind("extDataset", document);
        Value queryResult = proc.value(); // execute the query
        queryResult.iter().forEach(System.out::println);
    }
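The path part of that query, `//*[text()='git-repo']`, is plain XPath, so its effect can be tried without BaseX. The sketch below evaluates it with the JDK's `javax.xml.xpath` API, which covers XPath 1.0 only and is not a substitute for XQuery. The class name `XPathOnDomSketch` and the sample XML are assumptions; the real contents of pipeline.yml are not shown in the article.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
class XPathOnDomSketch {

    /** Parses an XML string into a DOM document. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Evaluates an XPath 1.0 expression and returns the matching nodes. */
    static NodeList select(Document doc, String expression) throws XPathExpressionException {
        return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        // Made-up document, shaped like a YAML pipeline after conversion to DOM
        String xml = "<root>"
                + "<resources><item><name>git-repo</name><type>git</type></item></resources>"
                + "<jobs><item><get>git-repo</get></item></jobs>"
                + "</root>";
        NodeList hits = select(parse(xml), "//*[text()='git-repo']");
        for (int i = 0; i < hits.getLength(); i++) {
            // Print the element name and its text for every match
            System.out.println(hits.item(i).getNodeName() + "=" + hits.item(i).getTextContent());
        }
    }
}
```

XQuery adds what XPath alone lacks here: variables bound from outside, FLWOR expressions, and the construction of new result documents, which is what makes it usable as a declarative transformation language.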


Results of the BaseX query for the data from pipeline.yml







If you need to convert DOM / XML back to JSON / YAML using Jackson, the transform(Node currentNode) method can help with this.
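For intuition about the inverse direction, here is a JDK-only sketch of turning a DOM tree back into nested maps. It is not how the library's `transform(Node currentNode)` is implemented: the class name `DomToMapSketch` and the convention that repeated sibling elements become a `List` are assumptions, and attributes and mixed content are ignored.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; NOT the dom-transformation library's code.
class DomToMapSketch {

    /** Wraps the document element in a single-entry map keyed by its name. */
    static Map<String, Object> toMap(Document doc) {
        Element root = doc.getDocumentElement();
        return Collections.singletonMap(root.getNodeName(), toValue(root));
    }

    /** Returns a String for a leaf element, otherwise a Map of its child elements. */
    @SuppressWarnings("unchecked")
    static Object toValue(Element element) {
        Map<String, Object> children = new LinkedHashMap<>();
        NodeList nodes = element.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip text, comments and whitespace between elements
            }
            String name = node.getNodeName();
            Object value = toValue((Element) node);
            Object existing = children.get(name);
            if (existing == null) {
                children.put(name, value);
            } else if (existing instanceof List) {
                // toValue never returns a List, so a List here is always our accumulator
                ((List<Object>) existing).add(value);
            } else {
                // Assumed convention: a second sibling with the same name starts a list
                List<Object> list = new ArrayList<>();
                list.add(existing);
                list.add(value);
                children.put(name, list);
            }
        }
        return children.isEmpty() ? element.getTextContent() : children;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document
        String xml = "<pipeline><name>demo</name><step>build</step><step>test</step></pipeline>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        System.out.println(toMap(doc));
    }
}
```

A map produced this way could then be passed to Jackson's `ObjectMapper.writeValueAsString`, using a mapper built with `YAMLFactory` for YAML output. Note that all scalar values come back as strings in this sketch, so type information from the original JSON or YAML is lost unless it is detected separately.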



Conclusions



With XQuery, you can query more than just XML data. The language still does its job well, and this "old-timer" will live on in Java data transformation projects, even for data in JSON and YAML.



Of course, semi-structured data is not limited to JSON, YAML and XML, so it is too early to consider the topic of processing it closed ...







I hope the approach described here helps you run declarative queries over heterogeneous data in your application. If you have faced a similar task on the JVM and have better ideas, share them in the comments!



On April 26, my colleague and I are holding an open Java meetup at our Moscow office. Guests are welcome!
You can unwind after the workday, learn something new, debate, grab some pizza, and chat with developers.



The meetup takes place on April 26, 2018, 19:00-21:00, at the Moscow office of Align Technology, 9 Warsaw Street, building 4. Register here: Amazon Web Services and JVM for server side projects.



The program includes three talks.



Emulate Amazon web services in the JVM process: accelerating development, testing, and saving money.

Speaker: Igor Sukhorukov



How to develop Big Data applications on Amazon Web Services infrastructure effectively. We will try to make local development of a solution easy, with integration tests that run quickly and as cheaply as possible. We keep an eye on savings while Amazon's CEO Jeff Bezos launches rockets at Blue Origin and walks the SpotMini robot from Boston Dynamics. In the talk I will explain how we managed to emulate the S3 file system, the Redshift data warehouse, SQS queues, and the PostgreSQL RDS service inside a JVM process using open source projects. The talk will also compare popular Big Data solutions for BI analytics.



Code for production.

Speaker: Yuri Geinish



Simple but non-obvious tips for developers on building resilient and diagnosable server applications. The talk focuses on basic principles, so it may be useful to developers working with different programming languages and platforms.



Meldonium for Groovy: soaring in the "clouds" or diving into the "bloody enterprise"

Speaker: Igor Sukhorukov



In the "clouds", Groovy on steroids lets you do more, and do it more conveniently. We will talk about dynamically loading classes from Maven artifacts and how convenient it is to run scripts in Amazon Web Services and Docker containers, and about how scripts can survive in an isolated corporate environment. And of course, we will discuss why Groovy is still needed when there is Kotlin.

Source: https://habr.com/ru/post/352810/


