In the life of every analyst and programmer there comes a day when they learn that design patterns exist for XML schemas, and life is never the same. For me, this was the moment I began to appreciate the beauty of design.
Today I want to talk about what XSD design patterns are, the advantages and disadvantages of each, and why we chose the Garden of Eden for our tasks.
As an example, take the following XML document as the data source.
<?xml version="1.0" encoding="UTF-8"?>
<Customer>
  <CustomerId>100</CustomerId>
  <FirstName></FirstName>
  <LastName></LastName>
  <Address>
    <StreetAddress> 2</StreetAddress>
    <City></City>
    <Zip>115088</Zip>
  </Address>
</Customer>
Now let's see how the same XML document structure can be described in different ways.
The patterns differ in which elements and/or data types are declared globally in the XSD.
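To make the distinction concrete, here is a minimal sketch that is not part of the original example: the names Note, NoteType and Text and the namespace http://www.example.org/Note are made up for illustration. A declaration is global when it is a direct child of xsd:schema, and local when it is nested inside another declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/Note"
            xmlns:tns="http://www.example.org/Note"
            elementFormDefault="qualified">
  <!-- Global type: a direct child of xsd:schema, can be referenced by name -->
  <xsd:complexType name="NoteType">
    <xsd:sequence>
      <!-- Local element: nested, cannot be referenced from outside NoteType -->
      <xsd:element name="Text" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- Global element: a direct child of xsd:schema, can serve as a document root -->
  <xsd:element name="Note" type="tns:NoteType"/>
</xsd:schema>
```

Each of the four patterns is a different answer to the question of which elements and types are moved up to this top level.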
Matryoshka (Russian Doll)
The essence of the pattern is that the schema mirrors the XML document it describes: if complex elements contain other complex elements, which in turn contain simple ones, then the XSD declarations of those elements are nested inside one another in the same way. The pattern is named after the world-famous Russian nesting dolls: child elements are encapsulated in their parents just as the smaller dolls sit inside the larger ones.
The schema describing the structure of our source file using the “Matryoshka” pattern looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/Customer"
            xmlns:tns="http://www.example.org/Customer"
            elementFormDefault="qualified">
  <xsd:element name="Customer">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="CustomerId" type="xsd:int" />
        <xsd:element name="FirstName" type="xsd:string" />
        <xsd:element name="LastName" type="xsd:string" />
        <xsd:element name="Address">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="StreetAddress" type="xsd:string"/>
              <xsd:element name="City" type="xsd:string"/>
              <xsd:element name="Zip" type="xsd:string"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
Characteristics of the pattern:
- Opaque content. The content of the XSD is opaque to other schemas and even to other parts of the same schema. As a result, none of the types or elements nested inside the root element can be reused.
- Hidden scope. The scope of local elements (“City” and “Zip” in the example) is limited to their parent element (“Address”). As a result, if you set elementFormDefault="unqualified" in the schema, the namespaces of the local elements stay hidden inside the schema and do not appear in XML documents.
- Independence. With this design, each component of the schema is autonomous, that is, not connected to other components. Consequently, a change to one component has limited impact. For example, adding a “FlatNumber” element to the address does not affect the other elements of the schema.
- Compactness. All related data is grouped into self-contained components in the schema, so the components are compact.
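The point about hidden namespaces can be illustrated with an instance document. This is only a sketch: it assumes the “Matryoshka” schema above is switched to elementFormDefault="unqualified" (the example schema itself uses "qualified"), and the prefix c is arbitrary. Under that assumption only the global root element carries the target namespace, while the local elements appear without any namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumes elementFormDefault="unqualified" in the Matryoshka schema -->
<c:Customer xmlns:c="http://www.example.org/Customer">
  <!-- Local elements are in no namespace, so they take no prefix -->
  <CustomerId>100</CustomerId>
  <FirstName></FirstName>
  <LastName></LastName>
  <Address>
    <StreetAddress> 2</StreetAddress>
    <City></City>
    <Zip>115088</Zip>
  </Address>
</c:Customer>
```

With elementFormDefault="qualified", as the schema is actually written, every element would have to be in the http://www.example.org/Customer namespace, for example through a default xmlns declaration on Customer.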
Salami (Salami Slice)
The essence of the pattern is that the XML document is split into its constituent elements, each of which is declared globally in the XSD. The declared elements are then assembled by reference.
The schema describing the structure of the source file using the “Salami” pattern looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/Customer"
            xmlns:tns="http://www.example.org/Customer"
            elementFormDefault="qualified">
  <xsd:element name="CustomerId" type="xsd:int" />
  <xsd:element name="FirstName" type="xsd:string" />
  <xsd:element name="LastName" type="xsd:string" />
  <xsd:element name="StreetAddress" type="xsd:string"/>
  <xsd:element name="City" type="xsd:string"/>
  <xsd:element name="Zip" type="xsd:string"/>
  <xsd:element name="Customer">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="tns:CustomerId" />
        <xsd:element ref="tns:FirstName" />
        <xsd:element ref="tns:LastName" />
        <xsd:element name="Address">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element ref="tns:StreetAddress" />
              <xsd:element ref="tns:City" />
              <xsd:element ref="tns:Zip" />
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
Characteristics of the pattern:
- Transparent content. All elements are visible to other schemas as well as to other components of this XSD.
- Global scope. Since the schema elements are declared globally, their namespaces are exposed in the XML document regardless of the elementFormDefault value (the prefix itself may be empty if a default namespace is declared).
- Interdependence. With this design, complex elements refer to other parts of the schema, that is, depend on them. Consequently, a change to one component may require extensive changes throughout the schema.
- Compactness. All related data is grouped into self-contained components in the schema, so the components are compact.
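As an illustration of transparent content, here is a sketch of a second schema that reuses global elements from the “Salami” schema above. Everything specific to it is an assumption made for the example: the Supplier namespace, the SupplierName element, the cust prefix, and the file name customer-salami.xsd.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/Supplier"
            xmlns:cust="http://www.example.org/Customer"
            elementFormDefault="qualified">
  <!-- schemaLocation is a hypothetical file holding the Salami schema -->
  <xsd:import namespace="http://www.example.org/Customer"
              schemaLocation="customer-salami.xsd"/>
  <xsd:element name="Supplier">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="SupplierName" type="xsd:string"/>
        <!-- Global elements of the Customer schema, reused by reference -->
        <xsd:element ref="cust:City"/>
        <xsd:element ref="cust:Zip"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

In a Supplier document, City and Zip remain in the http://www.example.org/Customer namespace, which is the global scope property in practice. The same reference would not be possible against the “Matryoshka” schema, where City and Zip are local to Address.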
Venetian blinds (Venetian Blind)
The essence of the pattern is that the XML document is split into complex types, each of which is declared globally in the XSD. A root element of a global type is then declared, and it ties the schema together.
The schema describing the structure of the source file using the “Venetian blinds” pattern looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/Customer"
            xmlns:tns="http://www.example.org/Customer"
            elementFormDefault="qualified">
  <xsd:complexType name="AddressType">
    <xsd:sequence>
      <xsd:element name="StreetAddress" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string"/>
      <xsd:element name="Zip" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="CustomerType">
    <xsd:sequence>
      <xsd:element name="CustomerId" type="xsd:int" />
      <xsd:element name="FirstName" type="xsd:string" />
      <xsd:element name="LastName" type="xsd:string" />
      <xsd:element name="Address" type="tns:AddressType" />
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="Customer" type="tns:CustomerType" />
</xsd:schema>
Characteristics of the pattern:
- Transparent content. Data types are visible to other schemas as well as to other components of this XSD.
- Maximum name hiding. Element declarations are local, so the pattern has the greatest potential for hiding names.
- Simple control over namespace exposure. Whether the namespaces of local elements are hidden or exposed in XML documents is controlled by a single switch, elementFormDefault.
- Interdependence. With this design, complex data types refer to other parts of the schema, that is, depend on them. Consequently, a change to one component may require extensive changes throughout the schema.
- Compactness. All related data is grouped into self-contained components in the schema, so the components are compact.
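A sketch of what type reuse with local names can look like in this pattern. AddressType is copied from the schema above. OrderType, Order, BillingAddress and ShippingAddress are hypothetical names introduced only for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/Customer"
            xmlns:tns="http://www.example.org/Customer"
            elementFormDefault="qualified">
  <!-- Same AddressType as in the Venetian blinds schema above -->
  <xsd:complexType name="AddressType">
    <xsd:sequence>
      <xsd:element name="StreetAddress" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string"/>
      <xsd:element name="Zip" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- Hypothetical type: reuses AddressType twice under different local names -->
  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <xsd:element name="BillingAddress" type="tns:AddressType"/>
      <xsd:element name="ShippingAddress" type="tns:AddressType"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="Order" type="tns:OrderType"/>
</xsd:schema>
```

The type is shared, while each element name is chosen locally by the type that uses it. This is the combination of reuse and name hiding that the list above describes.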
Garden of Eden
The Garden of Eden is good because it declares every element and every composite data type globally. This lets you refer to any type or element from within the same XSD, from any other XSD, and even from WSDL. Of the four patterns, it is the only one that gives full control over the semantics of both types and elements.
The schema describing the structure of the source file using the “Garden of Eden” pattern looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/Customer"
            xmlns:tns="http://www.example.org/Customer"
            elementFormDefault="qualified">
  <xsd:element name="CustomerId" type="xsd:int"/>
  <xsd:element name="FirstName" type="xsd:string"/>
  <xsd:element name="LastName" type="xsd:string"/>
  <xsd:element name="StreetAddress" type="xsd:string"/>
  <xsd:element name="City" type="xsd:string"/>
  <xsd:element name="Zip" type="xsd:string"/>
  <xsd:element name="Address" type="tns:AddressType"/>
  <xsd:element name="Customer" type="tns:CustomerType"/>
  <xsd:complexType name="AddressType">
    <xsd:sequence>
      <xsd:element ref="tns:StreetAddress"/>
      <xsd:element ref="tns:City"/>
      <xsd:element ref="tns:Zip"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="CustomerType">
    <xsd:sequence>
      <xsd:element ref="tns:CustomerId"/>
      <xsd:element ref="tns:FirstName"/>
      <xsd:element ref="tns:LastName"/>
      <xsd:element ref="tns:Address"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
Characteristics of the pattern:
- Maximum content transparency. Both data types and elements are visible to other schemas as well as to other components of this XSD.
- Maximum name exposure. Nothing is declared locally, so the visibility of names is maximal.
- Interdependence. With this design, complex data types and elements refer to other parts of the schema, that is, depend on them. Consequently, a change to one component may require extensive changes throughout the schema.
- Bulkiness. Related data is spread across separate type and element declarations, which makes the schema harder to read.
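To show what maximum transparency allows, here is a sketch of another schema that reuses both a global element and a global type from the “Garden of Eden” schema above. The Order namespace, the names Order, OrderType and DeliveryAddress, the cust prefix, and the file name customer-eden.xsd are all assumptions made for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/Order"
            xmlns:tns="http://www.example.org/Order"
            xmlns:cust="http://www.example.org/Customer"
            elementFormDefault="qualified">
  <!-- schemaLocation is a hypothetical file holding the Garden of Eden schema -->
  <xsd:import namespace="http://www.example.org/Customer"
              schemaLocation="customer-eden.xsd"/>
  <!-- A global type from the Customer schema, reused under a new element name -->
  <xsd:element name="DeliveryAddress" type="cust:AddressType"/>
  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <!-- A global element from the Customer schema, reused by reference -->
      <xsd:element ref="cust:Customer"/>
      <xsd:element ref="tns:DeliveryAddress"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="Order" type="tns:OrderType"/>
</xsd:schema>
```

Neither “Salami” (no global types) nor “Venetian blinds” (no global elements apart from the root) would support both kinds of reference at once.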
Pattern selection
When choosing a pattern, it is important to consider several criteria:
- how far the schema components need to be reusable;
- how easy the schema is to work with;
- how interdependent or independent the schema components should be.
When choosing a design pattern, you often have to balance the ability to reuse schema components against how tightly those components are coupled. The figure shows the potential of each pattern in terms of these two aspects.
Schemas that support component reuse well also have strong interconnections between components. When something in such a schema has to change, its developer may find that many related elements and/or types have to change too. Such schemas become difficult to maintain over time.
In general, the following rules for choosing a pattern can be derived:
- if schema components do not need to be reused, convenience of the XSD for developers matters, and strict control over component names is not required, choose “Matryoshka”;
- if reuse of components is more important than convenience for developers, and the names of data elements need to be controlled throughout the system, choose “Salami”;
- if, in addition to the previous item, it is important to control the names of data types and to be able to reuse them, choose the “Garden of Eden”;
- “Venetian blinds” are suitable if both reuse of components and the freedom to define local names (or to hide them inside the schema) are important.
The most important thing for us in the project was reuse of the schema's types and elements, followed by full semantic control over names. The choice of pattern was obvious: the Garden of Eden.
A small lyrical digression. The most interesting use of XML design patterns I can remember has been, and remains, hypnotizing an audience. One of our distinguished analysts likes to seize the initiative by telling a story on this topic. I have timed it: after five minutes the listeners' eyes glaze over and they drift off somewhere deep inside themselves. By the “Garden of Eden”, most minds have switched off.
In conclusion, I want to add that the patterns can also be mixed; we have come across such schemas.