
Java XML API: choosing the right one. StAX: a pleasure to work with

Hello!
Although XML has lost popularity since the early 2000s, it has firmly held its niche. I have come across XML processing in 60% of my projects, and I devoted my Masterjava internship to it. Its most common uses are XHTML, SOAP, various configuration files (for example Tomcat, SoapUI, IntelliJ IDEA, Spring XML configuration), and data import and export.

Java offers several APIs for working with XML, and a developer needs to understand which one to choose in each situation. In this article I will briefly cover the Java XML APIs, their purpose, and examples of their use, and then dwell on StAX, a technology that is used rather rarely but is in some cases the only right choice. I assume you are already familiar with the basics of XML.

Java XML API: choosing the right one



In the comparison table of API capabilities, the Ease of Use rating for SAX / StAX suggests that its author does not know how to work with StAX. The rest of this article is about how to “cook it properly”.

StAX: a pleasure to work with


First of all, note that StAX can be used through two APIs: the low-level XMLStreamReader, which returns primitives, and the high-level XMLEventReader, which returns objects and consumes more memory. From here on I will work with XMLStreamReader; a thin wrapper around it makes working with XML simple and convenient. Let's look at a small example, a simple XML document with cities and users:
    <Payload>
        <Cities>
            <City id="spb">-</City>
            <City id="mow"></City>
            ...
        </Cities>
        ...
        <Users>
            <User city="mow">
                <email>gmail@gmail.com</email>
                <fullName>Gmail User</fullName>
            </User>
            <User city="spb">
                <email>admin@javaops.ru</email>
                <fullName>Admin</fullName>
            </User>
            ...
        </Users>
        ...
    </Payload>
In reality, this XML may contain hundreds of cities and hundreds of thousands or even millions of users. All we need to do is print the list of cities. In this case the StAX API is the only right choice. Let's add a helper class, StaxStreamProcessor:
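    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamReader;
    import java.io.InputStream;

    public class StaxStreamProcessor implements AutoCloseable {
        // The factory is created once and shared by all processors
        private static final XMLInputFactory FACTORY = XMLInputFactory.newInstance();

        private final XMLStreamReader reader;

        public StaxStreamProcessor(InputStream is) throws XMLStreamException {
            reader = FACTORY.createXMLStreamReader(is);
        }

        public XMLStreamReader getReader() {
            return reader;
        }

        @Override
        public void close() {
            if (reader != null) {
                try {
                    reader.close();
                } catch (XMLStreamException e) {
                    // ignored: nothing useful can be done if closing fails
                }
            }
        }
    }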
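One detail worth knowing: according to the XMLStreamReader Javadoc, close() frees the reader's resources but does not close the underlying input source. A minimal sketch of a variant that also closes the stream might look like this. The class name ClosingStaxStreamProcessor and the decision to let the processor own the stream are assumptions made for illustration; they are not part of the original helper.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical variant of StaxStreamProcessor that owns the InputStream
// and closes it together with the reader.
class ClosingStaxStreamProcessor implements AutoCloseable {
    private static final XMLInputFactory FACTORY = XMLInputFactory.newInstance();

    private final InputStream is;
    private final XMLStreamReader reader;

    ClosingStaxStreamProcessor(InputStream is) throws XMLStreamException {
        this.is = is;
        this.reader = FACTORY.createXMLStreamReader(is);
    }

    XMLStreamReader getReader() {
        return reader;
    }

    @Override
    public void close() throws IOException {
        try {
            reader.close(); // releases parser resources only
        } catch (XMLStreamException e) {
            // ignored, as in the original helper
        } finally {
            is.close();     // the stream has to be closed separately
        }
    }
}
```

With this variant, a try-with-resources block around the processor releases the file handle as well, at the cost of close() now declaring IOException.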
Next, we walk through the XML sequentially, read the events we are interested in, and print the required information:
    try (StaxStreamProcessor processor = new StaxStreamProcessor(Files.newInputStream(Paths.get("payload.xml")))) {
        XMLStreamReader reader = processor.getReader();
        while (reader.hasNext()) {       // while not end of XML
            int event = reader.next();   // read next event
            if (event == XMLEvent.START_ELEMENT && "City".equals(reader.getLocalName())) {
                System.out.println(reader.getElementText());
            }
        }
    }
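For comparison, here is a sketch of the same city listing written against the higher-level XMLEventReader mentioned earlier. The class name EventReaderCities, the method listCities, and the inline sample XML with its city names are assumptions made for illustration; only the javax.xml.stream calls come from the standard API.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class EventReaderCities {
    // Collects the text of every <City> element using the object-based API.
    static List<String> listCities(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        List<String> cities = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent(); // every event is an object
                if (event.isStartElement()
                        && "City".equals(event.asStartElement().getName().getLocalPart())) {
                    cities.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return cities;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical sample data, not the article's payload.xml
        String xml = "<Payload><Cities>"
                + "<City id=\"spb\">Saint Petersburg</City>"
                + "<City id=\"mow\">Moscow</City>"
                + "</Cities></Payload>";
        listCities(xml).forEach(System.out::println);
    }
}
```

The logic is the same, but each step allocates an XMLEvent object, which is the memory overhead that makes XMLStreamReader the more natural fit for very large files.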
To avoid duplicating the frequently repeated code that searches for the desired event in the XML, we can move it into StaxStreamProcessor:
    public boolean doUntil(int stopEvent, String value) throws XMLStreamException {
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == stopEvent && value.equals(reader.getLocalName())) {
                return true;
            }
        }
        return false;
    }
Using the utility class then becomes not just simple, but very simple:
    while (processor.doUntil(XMLEvent.START_ELEMENT, "City")) {
        System.out.println(reader.getElementText());
    }
The drawback of this code is that, instead of finishing, the program wastes resources scanning hundreds of thousands of users we do not need. We need a condition that stops the scan. Usually this is the closing tag of the parent element (Cities in our case). Let's add another utility method to StaxStreamProcessor that scans the XML until it reaches either the specified element or the closing tag of the parent:
    public boolean startElement(String element, String parent) throws XMLStreamException {
        while (reader.hasNext()) {
            int event = reader.next();
            if (parent != null && event == XMLEvent.END_ELEMENT && parent.equals(reader.getLocalName())) {
                return false;
            }
            if (event == XMLEvent.START_ELEMENT && element.equals(reader.getLocalName())) {
                return true;
            }
        }
        return false;
    }
Add attribute and text reading methods:
    public String getAttribute(String name) throws XMLStreamException {
        return reader.getAttributeValue(null, name);
    }

    public String getText() throws XMLStreamException {
        return reader.getElementText();
    }
The calling code stays very simple, and we stop processing the XML immediately after the closing Cities tag:
    while (processor.startElement("City", "Cities")) {
        System.out.println(processor.getAttribute("id") + ":" + processor.getText());
    }
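To try the early stop without a file on disk, the pieces can be assembled into one small program. This is only a sketch: the class name CitiesDemo, the static helpers open and readCities, and the inline sample XML are assumptions for illustration, while startElement repeats the logic of the helper shown earlier as a static method.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class CitiesDemo {
    // Returns "id:name" for each <City>, stopping at </Cities>.
    static List<String> readCities(XMLStreamReader reader) throws XMLStreamException {
        List<String> result = new ArrayList<>();
        while (startElement(reader, "City", "Cities")) {
            String id = reader.getAttributeValue(null, "id"); // attribute first
            result.add(id + ":" + reader.getElementText());   // then the text
        }
        return result;
    }

    // Same logic as the startElement helper, written as a static method.
    static boolean startElement(XMLStreamReader reader, String element, String parent)
            throws XMLStreamException {
        while (reader.hasNext()) {
            int event = reader.next();
            if (parent != null && event == XMLStreamConstants.END_ELEMENT
                    && parent.equals(reader.getLocalName())) {
                return false;
            }
            if (event == XMLStreamConstants.START_ELEMENT
                    && element.equals(reader.getLocalName())) {
                return true;
            }
        }
        return false;
    }

    static XMLStreamReader open(String xml) throws XMLStreamException {
        return XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical sample data, not the article's payload.xml
        String xml = "<Payload><Cities>"
                + "<City id=\"spb\">Saint Petersburg</City>"
                + "<City id=\"mow\">Moscow</City>"
                + "</Cities><Users><User city=\"mow\">"
                + "<email>gmail@gmail.com</email></User></Users></Payload>";
        XMLStreamReader reader = open(xml);
        System.out.println(readCities(reader));
        // The reader is now positioned on </Cities>; <Users> was never scanned.
        System.out.println(reader.isEndElement() + " " + reader.getLocalName());
        reader.close();
    }
}
```

Because the loop returns as soon as the closing Cities tag is seen, the cost of the scan depends on the number of cities and not on the number of users that follow.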
The StAX API requires care when reading events. If we swap the order in which the attribute and the text are read in the output statement, the code stops working: once the city name has been read from the XML, the attribute is left behind and is no longer available. Keep in mind, too, that which read methods are available depends on the reader's current position in the XML. With startElement you can reach XML elements at any nesting level and, as needed, add further utilities to StaxStreamProcessor. I hope that with this approach working with StAX will seem easy and convenient.
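The ordering pitfall can be reproduced in a few lines. The sketch below assumes the parser follows the XMLStreamReader Javadoc, which specifies an IllegalStateException when getAttributeValue is called outside START_ELEMENT or ATTRIBUTE; the class name ReadOrderDemo, its methods, and the sample XML are hypothetical.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;

class ReadOrderDemo {
    // Positions a fresh reader on the first <City> start tag.
    static XMLStreamReader atFirstCity(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "City".equals(reader.getLocalName())) {
                return reader;
            }
        }
        throw new IllegalArgumentException("no <City> element found");
    }

    // Correct order: the attribute is read while the reader is on START_ELEMENT.
    static String attributeThenText(String xml) throws XMLStreamException {
        XMLStreamReader reader = atFirstCity(xml);
        try {
            String id = reader.getAttributeValue(null, "id");
            String name = reader.getElementText(); // moves the reader to END_ELEMENT
            return id + ":" + name;
        } finally {
            reader.close();
        }
    }

    // Swapped order: after getElementText() the reader is on END_ELEMENT,
    // where attributes can no longer be read.
    static String textThenAttribute(String xml) throws XMLStreamException {
        XMLStreamReader reader = atFirstCity(xml);
        try {
            String name = reader.getElementText();
            try {
                return reader.getAttributeValue(null, "id") + ":" + name;
            } catch (IllegalStateException e) {
                return "attribute unavailable:" + name;
            }
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical sample data
        String xml = "<Cities><City id=\"spb\">Saint Petersburg</City></Cities>";
        System.out.println(attributeThenText(xml));
        System.out.println(textThenAttribute(xml));
    }
}
```

Reading everything that belongs to the start tag before asking for the element text keeps the calling code independent of this position rule.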

Thank you for your attention, and happy coding!

Source: https://habr.com/ru/post/339716/

