
Many services and applications (especially web services) accept tree-structured data. For example, data received via JSON-RPC, JSON-REST, or PHP GET/POST has this form. Naturally, the structure of such data needs to be validated. There are many ways to solve this problem, ranging from a tangle of if statements in controllers to classes that implement validation driven by various configurations. Most often, the problem calls for a recursive validator that works with data schemas described by some standard. One such standard is JSON-Schema; let's take a closer look.
JSON-Schema is a standard for describing data structures in JSON format, developed on the basis of XML-Schema; the draft can be found here (the description below corresponds to version 03). Schemas described by this standard have the MIME type "application/schema+json". The standard is useful for validating and documenting data structures consisting of numbers, strings, arrays, and key-value structures (which, depending on the programming language, may be called an object, dictionary, hash table, associative array, or map; the name "object" is used below). At the moment there are full and partial implementations for different platforms and languages, in particular JavaScript, PHP, Ruby, Python, and Java.
Schema
A schema is a JSON object intended to describe any data in JSON format. All properties of this object are optional; each one specifies a particular validation rule (hereinafter, a rule). First of all, a schema can restrict the data type (the type or disallow rule, whose value can be either a string or an array):
- string
- number (any number, including reals)
- integer (a whole number; a subset of number)
- boolean (true or false)
- object (called an associative array, hash, hash table, map, or dictionary in some languages)
- array
- null ("no data" or "unknown"; the only possible value is null)
- any (any type, including null)
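As a rough illustration of how the type and disallow rules could map onto JavaScript values, here is a minimal sketch. The helper names matchesType and checkType are hypothetical and not part of any library, and the sketch assumes type names are plain strings (draft 03 also allows schemas inside a type array, which is ignored here).

```javascript
// Hypothetical helper: does a JavaScript value match one draft-03 type name?
function matchesType(value, type) {
  switch (type) {
    case 'any':     return true;
    case 'null':    return value === null;
    case 'boolean': return typeof value === 'boolean';
    case 'string':  return typeof value === 'string';
    case 'number':  return typeof value === 'number' && isFinite(value);
    case 'integer': return typeof value === 'number' && isFinite(value) &&
                           Math.floor(value) === value;
    case 'array':   return Array.isArray(value);
    case 'object':  return typeof value === 'object' && value !== null &&
                           !Array.isArray(value);
    default:        return true; // assumption: unknown type names are accepted
  }
}

// "type" may be one name or an array of names (a union);
// "disallow" is treated as the inverse of "type".
function checkType(value, schema) {
  var toList = function (t) { return Array.isArray(t) ? t : [t]; };
  var matchesAny = function (types) {
    return types.some(function (t) { return matchesType(value, t); });
  };
  if (schema.type !== undefined && !matchesAny(toList(schema.type))) return false;
  if (schema.disallow !== undefined && matchesAny(toList(schema.disallow))) return false;
  return true;
}

console.log(checkType(42, { type: 'integer' }));            // true
console.log(checkType(4.2, { type: 'integer' }));           // false
console.log(checkType(null, { type: ['string', 'null'] })); // true
console.log(checkType([], { disallow: 'object' }));         // true
```

Note that an array is not an "object" in this sense, which is why the last call passes.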
Further rules apply depending on the type of the data being checked. For a number, the rules minimum, maximum, and divisibleBy can be applied. For an array, the rules are minItems, maxItems, uniqueItems, and items. For a string, they are pattern, minLength, and maxLength. For an object, the rules properties, patternProperties, and additionalProperties are considered.
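The way these rules are selected by data type can be sketched as follows. This is only an illustration under simplifying assumptions: checkRules is a hypothetical name, the items rule and the object rules are left out, divisibleBy is only reliable for integers here, and uniqueItems compares items by their JSON text.

```javascript
// Hypothetical sketch: returns the names of the rules a value violates.
function checkRules(value, schema) {
  var failed = [];
  var check = function (rule, ok) { if (!ok) failed.push(rule); };

  if (typeof value === 'number') {
    if ('minimum' in schema) check('minimum', value >= schema.minimum);
    if ('maximum' in schema) check('maximum', value <= schema.maximum);
    if ('divisibleBy' in schema) check('divisibleBy', value % schema.divisibleBy === 0);
  } else if (typeof value === 'string') {
    // "pattern" is not anchored implicitly; the schema must use ^ and $ itself.
    if ('pattern' in schema) check('pattern', new RegExp(schema.pattern).test(value));
    if ('minLength' in schema) check('minLength', value.length >= schema.minLength);
    if ('maxLength' in schema) check('maxLength', value.length <= schema.maxLength);
  } else if (Array.isArray(value)) {
    if ('minItems' in schema) check('minItems', value.length >= schema.minItems);
    if ('maxItems' in schema) check('maxItems', value.length <= schema.maxItems);
    if (schema.uniqueItems) {
      var seen = {};
      var unique = value.every(function (item) {
        var key = JSON.stringify(item);
        if (seen.hasOwnProperty(key)) return false;
        seen[key] = true;
        return true;
      });
      check('uniqueItems', unique);
    }
  }
  return failed;
}

console.log(checkRules(15, { minimum: 0, maximum: 10, divisibleBy: 5 }));
// [ 'maximum' ]
console.log(checkRules(' phone', { pattern: '^\\S.*\\S$', minLength: 3, maxLength: 50 }));
// [ 'pattern' ]
console.log(checkRules(['n', 'g', 'n'], { maxItems: 2, uniqueItems: true }));
// [ 'maxItems', 'uniqueItems' ]
```

Rules that do not concern the value's type are simply skipped, so a string schema applied to a number reports nothing in this sketch.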
In addition to the type-specific rules, there are general rules, such as required and format, as well as descriptive rules, such as id, title, description, and $schema. The specification defines several microformats: date-time (ISO 8601), date, time, utc-millisec, regex, color (W3C.CR-CSS21-20070719), style (W3C.CR-CSS21-20070719), phone, uri, email, ip-address (V4), ipv6, and host-name. A validator may check them if they are defined and supported by the current implementation. More details on these and other rules can be found in the specification.
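Because format checks are optional, a validator can treat them as a lookup table in which a missing entry means "no check". The sketch below assumes a hypothetical formatCheckers registry with deliberately simplified checks for three of the microformats; it is not how any particular library does it.

```javascript
// Hypothetical registry: format name -> simplified checker function.
var formatCheckers = {
  'utc-millisec': function (v) {
    return typeof v === 'number' && isFinite(v);
  },
  'ip-address': function (v) {
    if (typeof v !== 'string') return false;
    var parts = v.split('.');
    return parts.length === 4 && parts.every(function (p) {
      return /^\d{1,3}$/.test(p) && Number(p) <= 255;
    });
  },
  'regex': function (v) {
    if (typeof v !== 'string') return false;
    try { new RegExp(v); return true; } catch (e) { return false; }
  }
};

function checkFormat(value, schema) {
  var name = schema.format;
  // A format this implementation does not support is not an error.
  if (!Object.prototype.hasOwnProperty.call(formatCheckers, name)) return true;
  return formatCheckers[name](value);
}

console.log(checkFormat('192.168.0.1', { format: 'ip-address' })); // true
console.log(checkFormat('256.1.1.1', { format: 'ip-address' }));   // false
console.log(checkFormat('(', { format: 'regex' }));                // false
console.log(checkFormat('anything', { format: 'phone' }));         // true (unsupported here)
```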
Since a schema is itself a JSON object, it can also be validated against a corresponding schema. The schema that the current schema conforms to is recorded in the $schema attribute. It also tells you which version of the draft was used to write the schema. These schemas can be found here.
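For instance, the draft version could be read from $schema along these lines. The function name draftVersion is hypothetical, and the sketch assumes the URI follows the "http://json-schema.org/draft-NN/schema#" form used in the examples in this article.

```javascript
// Hypothetical helper: extracts the draft number from the $schema URI,
// or returns null when the attribute is missing or has another form.
function draftVersion(schema) {
  var uri = typeof schema.$schema === 'string' ? schema.$schema : '';
  var match = /^http:\/\/json-schema\.org\/draft-(\d+)\/schema#?$/.exec(uri);
  return match ? parseInt(match[1], 10) : null;
}

console.log(draftVersion({ $schema: 'http://json-schema.org/draft-03/schema#' })); // 3
console.log(draftVersion({ type: 'string' }));                                     // null
```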
One of the most powerful and attractive features of JSON-Schema is the ability to reference other schemas from a schema, as well as to inherit from (extend) schemas, using JSON-Ref links. This is done with id, extends, and $ref. When extending a schema, you cannot redefine its rules, only add to them. When the validator runs, all rules from both the parent and the child schema must be applied to the data being checked. Let's look at some examples.
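The "all rules from parent and child apply" behavior can be sketched as walking the chain of extended schemas and requiring every one of them to accept the value. Everything here is hypothetical: the registry, the urn:short_text# id, and the checkOne helper (which only knows minLength and maxLength). The sketch also assumes extends holds a single schema and that there are no reference cycles.

```javascript
// Hypothetical registry of schemas, keyed by their id.
var registry = {};
function register(schema) { registry[schema.id] = schema; return schema; }

// Follows a {"$ref": ...} link into the registry; other schemas are returned as-is.
function resolve(schema) {
  return schema.$ref !== undefined ? registry[schema.$ref] : schema;
}

// Collects the schema and all of its ancestors: [child, parent, grandparent, ...].
function schemaChain(schema) {
  var chain = [];
  var current = resolve(schema);
  while (current) {
    chain.push(current);
    current = current.extends ? resolve(current.extends) : null;
  }
  return chain;
}

// Deliberately tiny rule checker, just enough for the demonstration.
function checkOne(value, schema) {
  if (schema.minLength !== undefined && value.length < schema.minLength) return false;
  if (schema.maxLength !== undefined && value.length > schema.maxLength) return false;
  return true;
}

// A value is valid only if every schema in the chain accepts it.
function isValid(value, schema) {
  return schemaChain(schema).every(function (s) { return checkOne(value, s); });
}

register({ id: 'urn:short_text#', type: 'string', maxLength: 10 });
var title = { extends: { $ref: 'urn:short_text#' }, minLength: 3 };

console.log(isValid('hello', title));              // true
console.log(isValid('ok', title));                 // false (child rule: minLength)
console.log(isValid('far too long title', title)); // false (parent rule: maxLength)
```

Since rules are only ever added along the chain, the child can narrow what the parent accepts but never widen it.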
Examples
Suppose we have information about products. Every product has a name: a string of 3 to 50 characters with no whitespace at either end. Let's define a schema for the product name:
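{
    "$schema": "http://json-schema.org/draft-03/schema#",
    "id": "urn:product_name#",
    "type": "string",
    "pattern": "^\\S.*\\S$",
    "minLength": 3,
    "maxLength": 50
}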
This schema can now be used to describe a product name or to check whether any given string is a valid one. Next, a product has a non-negative price, a type ('phone' or 'notebook'), and optional Wi-Fi support for the n and g standards. Let's define a schema for the product:
{
    "$schema": "http://json-schema.org/draft-03/schema#",
    "id": "urn:product#",
    "type": "object",
    "additionalProperties": false,
    "properties": {
        "name": {
            "extends": {"$ref": "urn:product_name#"},
            "required": true
        },
        "price": {
            "type": "integer",
            "minimum": 0,
            "required": true
        },
        "type": {
            "type": "string",
            "enum": ["phone", "notebook"],
            "required": true
        },
        "wi_fi": {
            "type": "array",
            "items": {
                "type": "string",
                "enum": ["n", "g"]
            },
            "uniqueItems": true
        }
    }
}
This schema references the previous schema and extends it with the required rule. The rule cannot be placed in the previous schema itself, because elsewhere the name may be optional, and when a schema is extended all of its rules apply.
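To make the object-level rules of the product schema more concrete, here is a minimal sketch that checks only required, enum, and additionalProperties. The function checkObject is hypothetical, the schema is a trimmed-down copy of the one above (the name is inlined rather than extended), and nested rules, patternProperties, and type checks are left out.

```javascript
// Hypothetical sketch: reports violations of required, enum and
// additionalProperties for a flat object.
function checkObject(data, schema) {
  var errors = [];
  var props = schema.properties || {};
  var has = function (obj, key) {
    return Object.prototype.hasOwnProperty.call(obj, key);
  };

  Object.keys(props).forEach(function (key) {
    var rule = props[key];
    if (!has(data, key)) {
      if (rule.required) errors.push(key + ': required');
      return;
    }
    // Assumption: enum values are primitives, so indexOf is sufficient.
    if (rule.enum && rule.enum.indexOf(data[key]) === -1) {
      errors.push(key + ': enum');
    }
  });

  if (schema.additionalProperties === false) {
    Object.keys(data).forEach(function (key) {
      if (!has(props, key)) errors.push(key + ': additional property');
    });
  }
  return errors;
}

var productSchema = {
  type: 'object',
  additionalProperties: false,
  properties: {
    name:  { type: 'string', required: true },
    price: { type: 'integer', required: true },
    type:  { type: 'string', enum: ['phone', 'notebook'], required: true },
    wi_fi: { type: 'array' }
  }
};

console.log(checkObject({ name: 'Phone X', price: 100, type: 'phone' }, productSchema));
// []
console.log(checkObject({ name: 'Tablet Y', type: 'tablet', color: 'red' }, productSchema));
// [ 'price: required', 'type: enum', 'color: additional property' ]
```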
Performance
The performance of a JSON-Schema-based validator will, of course, depend on the validator's implementation and on how completely it supports the rules. Let's run a test on nodejs with JSV, the most "complete" validator (it can be installed via "npm install JSV"). First we generate a thousand different products, some of them with invalid properties, and then run them through the validator. After that we print the number of errors of each type.
Test source code:

var jsv = require('JSV').JSV.createEnvironment();

// Register both schemas in the JSV environment.
console.time('load schemas');
jsv.createSchema({
    "$schema": "http://json-schema.org/draft-03/schema#",
    "id": "urn:product_name#",
    "type": "string",
    "pattern": "^\\S.*\\S$",
    "minLength": 3,
    "maxLength": 50
});
jsv.createSchema({
    "$schema": "http://json-schema.org/draft-03/schema#",
    "id": "urn:product#",
    "type": "object",
    "additionalProperties": false,
    "properties": {
        "name": {
            "extends": {"$ref": "urn:product_name#"},
            "required": true
        },
        "price": {
            "type": "integer",
            "minimum": 0,
            "required": true
        },
        "type": {
            "type": "string",
            "enum": ["phone", "notebook"],
            "required": true
        },
        "wi_fi": {
            "type": "array",
            "items": {
                "type": "string",
                "enum": ["n", "g"]
            },
            "uniqueItems": true
        }
    }
});
console.timeEnd('load schemas');

// Generate 1000 random products; some of them are deliberately invalid.
console.time('prepare data');
var i, j;
var product;
var products = [];
var names = [];
for (i = 0; i < 1000; i++) {
    product = { name: 'product ' + i };
    // About 5% of the names are longer than maxLength allows.
    if (Math.random() < 0.05) {
        while (product.name.length < 60) {
            product.name += 'long';
        }
    }
    names.push(product.name);
    // About 5% of the products have no price; some prices are negative.
    if (Math.random() < 0.95) {
        product.price = Math.floor(Math.random() * 200 - 2);
    }
    // About 5% of the products have no type; 'something' is not in the enum.
    if (Math.random() < 0.95) {
        product.type = ['notebook', 'phone', 'something'][Math.floor(Math.random() * 3)];
    }
    // Half of the products get a wi_fi list, possibly with duplicates or 'a'.
    if (Math.random() < 0.5) {
        product.wi_fi = [];
        for (j = 0; j < 3; j++) {
            if (Math.random() < 0.5) {
                product.wi_fi.push(['g', 'n', 'a'][Math.floor(Math.random() * 3)]);
            }
        }
    }
    products.push(product);
}
console.timeEnd('prepare data');

var errors;
var results = {};
var schema;
var message;

// Validate the names on their own and count the errors by message.
schema = jsv.findSchema('urn:product_name#');
console.time('names validation');
for (i = 0; i < names.length; i++) {
    errors = schema.validate(names[i]).errors;
    for (j = 0; j < errors.length; j++) {
        message = errors[j].message;
        if (!results.hasOwnProperty(message)) {
            results[message] = 0;
        }
        results[message]++;
    }
}
console.timeEnd('names validation');
console.dir(results);

// Validate the whole products and count the errors by message.
results = {};
schema = jsv.findSchema('urn:product#');
console.time('products validation');
for (i = 0; i < products.length; i++) {
    errors = schema.validate(products[i]).errors;
    for (j = 0; j < errors.length; j++) {
        message = errors[j].message;
        if (!results.hasOwnProperty(message)) {
            results[message] = 0;
        }
        results[message]++;
    }
}
console.timeEnd('products validation');
console.dir(results);
The results for 1000 checks are quite satisfactory (although some libraries claim an order of magnitude greater speed).
On my laptop (MBA, OSX, 1.86 GHz Core2Duo):
names validation: 180ms
products validation: 743ms
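To put these totals in per-item terms, the arithmetic can be written out as below. The perItem helper is hypothetical; the inputs are simply the two timings reported above for 1000 validations each.

```javascript
// Converts a total time for `count` validations into per-item figures.
function perItem(totalMs, count) {
  return {
    msPerItem: totalMs / count,
    itemsPerSecond: Math.round(count / (totalMs / 1000))
  };
}

console.log(perItem(180, 1000)); // { msPerItem: 0.18, itemsPerSecond: 5556 }
console.log(perItem(743, 1000)); // { msPerItem: 0.743, itemsPerSecond: 1346 }
```

So on this machine a product validation costs roughly four times as much as a name validation, which is plausible given that each product check includes the name check plus the other properties.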
Conclusion
JSON-Schema is a fairly handy tool for documenting data structures and for configuring automatic validators of external data in applications. It looks simpler and more readable than XML Schema and takes up less text. It does not depend on a programming language and can be applied in many areas: validating forms and POST requests, JSON REST APIs, checking packets exchanged over sockets, validating documents in document-oriented databases, and so on. The main advantage of using JSON-Schema is standardization and, as a result, easier maintenance and better software integration.