jsonex - simplify complex client-server dialogs

Interaction between a client and a server is usually very simple and relies on a rather primitive toolkit. That is not a problem in itself, but even a slight complication of the task often doesn't fit the usual approaches and gives rise to inelegant ad hoc patches. Many tasks are solved anew in every project, haphazardly and independently of one another. Such tasks include, for example:

Batch requests
Transferring dates as part of a complex data structure
Designating custom data types
Forwarding round-trip data that the server should return in its response
Supplementing the request and response with metadata
Handling errors returned in a response

Developers spend a lot of time reinventing the wheel on the server side over and over again, and then have to support the result on the client side as well.
jsonex is an attempt to solve the above and many other tasks within a simple, unified approach based on the concept of computable data (callable data).

Content

jsonex
Call notation
Context
Communication jsonex and JS
Callable data
Customer perspective
Benefits of Computable Data
Work with HTTP and web sockets
Safety considerations
JSON representation
Asynchronous calls
Arc
Conclusion

jsonex

The concept of computable data is simple and can be applied to a wide variety of data formats. A little later I will show how to use it on top of plain JSON (see the "JSON representation" section). But to demonstrate the idea in its pure form, let's start from a distance.

We use JSON in many cases, and it is wonderful: simple, easy to read, able to represent hierarchical data structures, and widely supported. Still, a few things could be improved, for example:

Allow comments
Allow apostrophes as string delimiters
Allow omitting quotes around dictionary keys
Allow trailing commas in dictionaries and lists

Our extended version of JSON (let's call it jsonex) could look, for example, like this:

    {
        // a comment
        name: 'John',
        familyName: 'Smith',
        dateOfBirth: '1901-01-01',
        friendIds: [
            124124,
            283746, /* another comment */
        ],
        num: 123,
    }

Not bad already. I would give a lot to be able to write comments in JSON configs. But one line in this example is rather dubious:

  dateOfBirth: '1901-01-01',

What is it? A string? A date? A person can guess from the context, but a parser is unlikely to be as perceptive. The format provides no date type. There are two ways to recognize one: describe the data with a schema, or use some kind of annotation that hints to the parser what it is looking at.

JSON does not require a schema to begin with, and it would be strange to demand one just to tell the parser which fields hold dates. So let's take the second route. There are many ways to annotate a date, but it would be nice to have a simple and at the same time universal way to designate any data type. Call notation suits this purpose well.

Call notation

Let's write our field like this:

  dateOfBirth: Date('1901-01-01'),

Now the data type is obvious. But what exactly should a parser do when it encounters such an entry? The approach is quite straightforward. Having encountered a construction like SomeName(args...), the parser should:

Look up a predefined handler function named SomeName
Execute this function with the arguments args...
Use the result in the processed data in place of the original construction

Thus, the result of parsing depends entirely on the implementation of the handler function. Our Date('1901-01-01') call will turn into a Date object in JavaScript, a date or datetime in Python, and so on.

When the parser cannot find a handler with the specified name, it calls the default handler, which either returns something reasonable or throws an exception.
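The lookup, execution, and substitution steps, together with the default-handler fallback, can be sketched roughly as follows. The sketch assumes the text has already been parsed into a tree in which a call is represented as { call: 'Name', args: [...] }. That node shape and the names evaluate and defaultHandler are hypothetical and serve only as an illustration.

```javascript
// Sketch: evaluate a pre-parsed tree. A call node is assumed to look like
// { call: 'Name', args: [...] }; this shape is hypothetical.
function evaluate(node, handlers, ctx) {
  if (Array.isArray(node)) {
    return node.map(function (item) { return evaluate(item, handlers, ctx); });
  }
  if (node !== null && typeof node === 'object') {
    if (typeof node.call === 'string' && Array.isArray(node.args)) {
      // Arguments are evaluated first, so nested calls work.
      var args = node.args.map(function (arg) { return evaluate(arg, handlers, ctx); });
      // 1. Find the handler; 2. run it; 3. its result replaces the call.
      var known = Object.prototype.hasOwnProperty.call(handlers, node.call);
      if (known) {
        return handlers[node.call].apply(null, [ctx].concat(args));
      }
      return defaultHandler(ctx, node.call, args);
    }
    var result = {};
    Object.keys(node).forEach(function (key) {
      result[key] = evaluate(node[key], handlers, ctx);
    });
    return result;
  }
  return node; // strings, numbers, booleans, null
}

// Default handler: this one throws; it could also return a fallback value.
function defaultHandler(ctx, name, args) {
  throw new Error('Unknown handler: ' + name);
}

var handlers = {
  Date: function (ctx, v) { return new Date(v); }
};
var tree = { name: 'John', dateOfBirth: { call: 'Date', args: ['1901-01-01'] } };
var person = evaluate(tree, handlers, {});
console.log(person.dateOfBirth instanceof Date); // true
```

A real parser would use a dedicated node type, so that an ordinary dictionary with call and args keys could not be mistaken for a call.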

It may seem that in processing call notation we are simply executing an arbitrary piece of code contained in the data, but this is not so:

Data remains data and processing control is completely on our side.
The parser cannot use any functions other than the explicitly defined handlers.
Any handler function is defined by us and can perform any preliminary checks before performing active actions.

We will look at these points in more detail later. For now, note that call notation gives us remarkable flexibility. By adding new handlers we can easily extend the system:

    // Complex, Person and base64DecToArr are assumed to be defined elsewhere
    var handlers = {
        Date: function (ctx, v) {
            return new Date(v);
        },
        Complex: function (ctx, real, imag) {
            return new Complex(real, imag);
        },
        ArrayBuffer: function (ctx, v) {
            return base64DecToArr(v).buffer;
        },
        Person: function (ctx, personDataDict) {
            return new Person(personDataDict);
        }
    };

    // Parse a jsonex string; each call notation is replaced by its handler's result
    var person = new JsonexParser(handlers).parse(
        "Person({"+
        "    name: 'John',"+
        "    dateOfBirth: Date('1901-01-01'),"+
        "    i: Complex(0, 1),"+
        "    song: ArrayBuffer('Q2FsbCBub3RhdGlvbiBpcyBjb29sIQ=='),"+
        "})"
    );

Here we added parsing of dates, complex numbers, and base64-encoded binary data, all in literally a few lines of code and without changing the structure of our hypothetical parser.

Context

As you can see, each handler takes a ctx parameter in addition to its own arguments. This parameter carries the processing context. For now, we assume that ctx is initially an empty dictionary. Thanks to the context, we can:

Allow handlers to "communicate" with each other
Pass settings to handlers
Pass additional data to handlers from the external environment
Save additional information obtained during parsing

For example, using the context, it is easy to create get and set handlers that allow using previously computed objects:

    var handlers = {
        // Store a value in the context under the given key and return it
        set: function (ctx, key, data) {
            ctx.box = ctx.box || {};
            ctx.box[key] = data;
            return data;
        },
        // Return the value previously stored under the given key
        get: function (ctx, key) {
            return ctx.box ? ctx.box[key] : undefined;
        }
    };

    var data = new JsonexParser(handlers).parse(
        "[ set('x', { a: 'a' }), get('x') ]"
    );

    data[0].a = 5;
    // data[0] and data[1] refer to the same object
    console.log(JSON.stringify(data)); // [{"a":5},{"a":5}]

Communication jsonex and JS

Like JSON, jsonex is closely related to JS syntax. It is a syntactically valid JS expression, and in the simplest cases it can even be evaluated as one, provided that every name used in call notation is defined in the evaluation context. For example,

    [
        foo(),
        bar.baz() // yes, jsonex allows this too
    ]

is a valid and, moreover, computable JS expression, provided that foo and bar are suitably defined.

This does not mean, of course, that you should evaluate jsonex with eval() after defining the necessary variables in an enclosing closure. Besides the potential security problems, that approach gives up some flexibility: the ability to analyze the data as data rather than as something to be executed. However, in some cases jsonex really can be treated as a restricted subset of JS, and jsonex data as a JS expression.

Callable data

Data in jsonex is a computable expression that can easily be analyzed before it is evaluated or during evaluation itself. Why not use such expressions as server requests? For example, a request might look like this:

 getUsers([1, 15, 7])

The server could calculate it using the appropriate handler:

    var handlers = {
        getUsers: function (ctx, userIds) {
            var listOfUsers = getUsersFromDbOrWhatever(userIds);
            return listOfUsers;
        }
    };

Then serialize the result in jsonex and send a response to the client:

 [ User({id: 1, name: 'John', ...}), ...]

The client receives data containing real, ready-to-use objects. The server, in turn, becomes a simple evaluator of jsonex expressions. To extend the API it is enough to add a new handler: there is no need to fiddle with URLs, parse arguments, cast them to the right types, or distinguish between GET and POST. Everything just works.

Customer perspective

Let's think about how to organize a call from the client. Composing the jsonex representation by hand, as strings like "getUsers([1, 15, 7])", would be inconvenient. What we need is an auxiliary object that describes a call notation and is understood by the serializer. Its use might look like this:

    var getUsers = function (userIds) {
        return new jsonex.Call('getUsers', userIds); // describes the call
    };

    // The serializer turns the Call object into call notation
    jsonex.stringify(getUsers([1, 15, 7])); // 'getUsers([1,15,7])'
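To make the idea concrete, here is a minimal sketch of how such a helper object and serializer could work. The Call constructor and stringify function below are illustrative stand-ins written for this example, not the actual jsonex library. The sketch delegates primitive values to JSON.stringify (so strings come out double-quoted) and does not handle dates or other values that JSON cannot represent.

```javascript
// Sketch only: stand-ins for jsonex.Call and jsonex.stringify.
function Call(name) {
  this.name = name;
  this.args = Array.prototype.slice.call(arguments, 1);
}

function stringify(value) {
  if (value instanceof Call) {
    // Call notation: name(arg1,arg2,...)
    return value.name + '(' + value.args.map(stringify).join(',') + ')';
  }
  if (Array.isArray(value)) {
    return '[' + value.map(stringify).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    var pairs = Object.keys(value).map(function (key) {
      return JSON.stringify(key) + ':' + stringify(value[key]);
    });
    return '{' + pairs.join(',') + '}';
  }
  return JSON.stringify(value); // strings, numbers, booleans, null
}

var getUsers = function (userIds) { return new Call('getUsers', userIds); };
console.log(stringify(getUsers([1, 15, 7]))); // getUsers([1,15,7])
```

Because Call objects can be nested inside arrays, dictionaries, and other calls, the same serializer also covers batch and dependent requests.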

In this case, the request to the server might look like this:

    server.ask(
        getUsers([1, 15, 7]), // the request
        function (err, result) {
            // handle the response ...
        }
    );

server.ask() should do the following:

Serialize the first argument to jsonex
Send it to the server as a request
Wait for the jsonex response and parse it
Pass the result to the callback function

In our example, the first argument will be the value that getUsers() returns, that is, an object of type jsonex.Call , which is serialized to the string 'getUsers([1,15,7])'

It looks simple and neat. From the point of view of the code, all operations are performed on ready-to-use objects, and all transformations are hidden under the hood. The example uses a callback, but with Promises everything would look even nicer.

If the result received from the server is an instance of the Error class or of a class derived from it, server.ask() assumes that the server returned an error and invokes the callback with the appropriate arguments. This is possible because the result of parsing is a ready-to-use object of the desired class.

Accordingly, to inform the client about an error, the server only needs to return an object of the required class using the appropriate call notation. The client, for its part, must implement a handler that replaces this notation with an object of a class inherited from Error.

Example of server response with an error message:

 UnexpectedError('Error details message')

Handler example:

    handlers.UnexpectedError = function (ctx, msg) {
        return new ServerError(msg); // ServerError inherits from Error
    };
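The behaviour of server.ask() described in this section, including the Error check, might be sketched like this. createServer and its send, stringify, and parse parameters are hypothetical and supplied by the caller. The transport and parser used in the demonstration are fakes, and ServerError is defined here only so that the example is self-contained.

```javascript
// Sketch: `send`, `stringify` and `parse` are injected, hypothetical pieces.
function createServer(send, stringify, parse) {
  return {
    ask: function (request, callback) {
      // 1. Serialize the request; 2. send it.
      send(stringify(request), function (transportError, responseText) {
        if (transportError) { return callback(transportError); }
        var result;
        try {
          result = parse(responseText); // 3. parse the response
        } catch (parseError) {
          return callback(parseError);
        }
        // A parsed Error instance means the server reported a failure.
        if (result instanceof Error) { return callback(result); }
        callback(null, result); // 4. hand the result to the caller
      });
    }
  };
}

function ServerError(message) { this.name = 'ServerError'; this.message = message; }
ServerError.prototype = Object.create(Error.prototype);
ServerError.prototype.constructor = ServerError;

// Fake transport and parser for demonstration: the "server" always fails.
var fakeSend = function (text, done) { done(null, 'UnexpectedError'); };
var fakeParse = function (text) { return new ServerError('Error details message'); };
var server = createServer(fakeSend, String, fakeParse);

var outcome = {};
server.ask('getUsers([1,15,7])', function (err, result) {
  outcome.err = err;
  outcome.result = result;
});
console.log(outcome.err instanceof ServerError); // true
```

Transport failures, parse failures, and errors reported by the server all reach the callback through the same first argument, so the calling code needs only one error path.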

Benefits of Computable Data

What computable data gives us:

The ability to specify the types of transmitted objects
Standardized error handling: just return an object of the desired type
Batch requests are trivial to implement
Batch requests on steroids: the results of some calls can be used as arguments of others
Round-trip data is easy to forward
Additional information is easy to convey: any headers and metadata that should be taken into account during processing but should not be mixed with other data

A batch request, for example, might look like this:

 [ getFoo(), getBar(1, 2, 3), ]

The response will be an array with the results of the getFoo() and getBar() calls.

Use the result of one calculation in another:

    [
        set('x', getUserBooks(17)), // the books of user 17
        getAuthors(                 // the authors of those books
            getProps(               // take the 'authorId' property of each item in get('x')
                get('x'),
                'authorId'
            )
        ),
    ]

The answer will be an array with a list of books and a list of authors of these books.
Note: In this example, the getProps() call is potentially dangerous, since it may expose properties that you did not intend to disclose. Be careful when implementing such handlers.

Transfer round-trip data:

    [
        137, // round-trip data, for example a request id
        someRequest(...)
    ]

The response will be an array with the number 137 and the result of the call to someRequest() .
Note: In reality, we would have to use a more complex structure to ensure that round-trip data is returned, even if an exception is thrown during processing of someRequest() .

Transfer of additional data:

    last( // returns only its last argument
        metaInfo('...', 1, 3, 4),
        someRequest(...)
    )

Here, the metaInfo() call can change something in the context, trigger additional actions, or somehow affect processing, but its return value will not be returned, since last() will return only its last argument.
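As a rough illustration, the two handlers could be as small as this. Both implementations are hypothetical, and the evaluation order is simulated by calling the handlers directly instead of going through a parser.

```javascript
// Sketch: hypothetical implementations of last() and metaInfo().
var handlers = {
  // Records its arguments in the context and returns nothing useful.
  metaInfo: function (ctx) {
    ctx.meta = Array.prototype.slice.call(arguments, 1);
  },
  // Returns only its final argument; earlier ones matter for side effects.
  last: function (ctx) {
    return arguments.length > 1 ? arguments[arguments.length - 1] : undefined;
  }
};

// Arguments are evaluated before the call that contains them,
// so metaInfo() has already run by the time last() is invoked.
var ctx = {};
var response = handlers.last(
  ctx,
  handlers.metaInfo(ctx, 'trace', 1, 3, 4),
  'result of someRequest'
);
console.log(response); // result of someRequest
console.log(JSON.stringify(ctx.meta)); // ["trace",1,3,4]
```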

Work with HTTP and web sockets

An HTTP request contains, in addition to the main data (the request body), a path, a method, and headers. The HTTP response contains a status code. With jsonex it is convenient to use a single path for all requests, as is usually done for batch requests and for web sockets. You can spread the API across several paths, but that rarely makes sense.

We do not need the HTTP method as such, since each request can include any calls, both those that read data and those that change it. However, supporting the various HTTP methods can be useful or even necessary for correct operation with browsers, proxy servers, and the rest of the HTTP world. This is easy to implement: it is enough to add the request and response objects to the evaluation context, and our handlers can then take the subtleties of the HTTP protocol into account. The same applies to the status code. It is not needed for jsonex evaluation itself, but for proper interaction with the HTTP environment it is worth setting it correctly.

As for transferring the data itself, everything is fine when it travels in the body of the HTTP request. In most cases it will, since using the POST method for jsonex requests looks reasonable. But if GET, HEAD, or DELETE is used for some reason, the data has to be passed as part of the URL, since the HTTP specification defines no semantics for a body in these requests and servers or intermediaries may ignore or reject it. There is a simple and cheap way to do this: pass the jsonex in a single query string parameter, for example, query. Thus, the getUsers([1,2,3]) request turns into a call to example.com/api?query=getUsers%28%5B1%2C2%2C3%5D%29

It looks awful, but since this is an internal API call, only programmers will see it while debugging. The same approach is often used to transfer data in JSON format, since it greatly simplifies both packing on the client side and unpacking on the server side. If the scary addresses still bother you, it is easy to write tools that hide them under the hood.
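Such a tool can be only a few lines long. The sketch below uses the standard encodeURIComponent and decodeURIComponent functions. The encodeQuery helper and the https:// scheme are assumptions made for the example.

```javascript
// encodeURIComponent leaves ! ' ( ) * unescaped, so they are escaped here
// as well to reproduce the fully percent-encoded form shown above.
function encodeQuery(jsonexText) {
  return encodeURIComponent(jsonexText).replace(/[!'()*]/g, function (ch) {
    return '%' + ch.charCodeAt(0).toString(16).toUpperCase();
  });
}

var url = 'https://example.com/api?query=' + encodeQuery('getUsers([1,2,3])');
console.log(url);
// https://example.com/api?query=getUsers%28%5B1%2C2%2C3%5D%29

// On the server, the standard decoder restores the original text.
console.log(decodeURIComponent('getUsers%28%5B1%2C2%2C3%5D%29'));
// getUsers([1,2,3])
```

Escaping the parentheses is optional for most servers, but it keeps the URL unambiguous in logs and in tools that treat those characters specially.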

To transfer metadata, you can use both HTTP headers and jsonex features:

    last(
        authToken('myAuthToken'), // metadata, passed the way headers would be
        someOtherHeader('blah blah'),
        getUsers([1, 15])
    )

Since web sockets have no standard way to send headers with each message, the ability to embed such data in the request itself is very convenient. For web sockets it also matters that round-trip data is available and that there is no need to specify a path and method for each request.

For HTTP, the idempotency of requests also matters. This property is determined by the HTTP method: some methods are required to be idempotent, others are not. Since a jsonex request can mix idempotent and non-idempotent calls, we need a mechanism to deal with that. For example, you can set a flag that demands idempotency and check it inside the calls:

    var handlers = {
        idempotent: function (ctx) {
            ctx.mustBeIdempotent = true;
        },
        updateUser: function (ctx, userData) {
            if (ctx.mustBeIdempotent) {
                throw new NonIdempotentCallError('updateUser');
            }
            // ...
        }
    };

Request example:

    last(
        idempotent(),
        [
            getUsers([1,2]),
            updateUser({id:1, ...}) // throws NonIdempotentCallError
        ]
    )

Which calls are idempotent should be clear from the API documentation, and the idempotent() call gives us confidence that nothing dangerous is used in a complex request.

Safety considerations

When writing handlers, remember that a request can call them with any arguments and in any combination. To keep this from causing security problems, you should observe some restrictions:

Handlers must not perform potentially dangerous system actions.
Handlers are required to check their arguments and avoid unintended use.
Handlers are required to check the amount of data in their arguments, limiting the size of the processing package to a reasonable value.
Handlers should check access rights if this is provided by the system.
The total computational complexity of a request should be limited.

The last point restricts the request as a whole and deserves separate consideration. The easiest way to control the complexity of a request is to accumulate its cost as it is evaluated and stop once a certain threshold is exceeded:

    handlers.expensiveCall = function (ctx, ...args) {
        // Accumulate the cost of this call; calcCost() is application-specific
        ctx.cost = (ctx.cost || 0) + calcCost(...args);
        if (ctx.cost > ctx.costThreshold) {
            throw new TooExpensiveError();
        }
        // ...
    };

When calculating the cost, the arguments must be taken into account, since the nature and amount of the data passed in them can greatly affect the computational complexity.

Another option is a preliminary analysis of the request as a whole. This is possible because the request is just data. But analyzing complex requests can itself be a demanding task whose computational complexity also has to be kept in check.

Another approach is to run calculations in a sandbox with artificially limited access to resources. If the environment allows you to do it easily and cheaply, this can be a good option.

It is important to remember that a combination of handlers can have an effect that cannot be achieved by using the same handlers separately. This may conceal new threats, but they are easy to avoid. If some subset of handlers raises concerns about unexpected combinations with others, it can always be moved into a separate, isolated set. That set can be made available at a dedicated address, or activated by adding special metadata to the request, with the necessary restrictions imposed.

In the extreme case, some handlers can be allowed only on their own, or in combination with very primitive calls such as Date(). Such handlers essentially revert to the familiar model that allows no more than one call per request. Computable data offers ample opportunities, but they are easy to restrict when necessary.

JSON representation

If you want to use jsonex right now, you will run into one problem: unlike JSON, high-performance parsing and serialization libraries for jsonex are, to put it mildly, not yet widely available. But there is a solution: you can use jsonex through its JSON representation. jsonex is converted to JSON using three simple rules:

    // call notation
    f(...) => {"?": ["f", ...]}
    // ordinary dictionary with a '?' property
    {'?': value, ...} => {"?": ["?", value], ...}
    // call whose single argument is a dictionary
    f({...}) => {"?": "f", ...}

The first rule shows how to write a call notation in the JSON representation. With it, a dictionary property named '?' acquires a special meaning. The second rule answers the question of how to write an ordinary dictionary that has a '?' property so that it is not confused with a call notation. The third is syntactic sugar: a special form of call notation for the case when the call has a single argument and that argument is a dictionary. Here is an example of data in jsonex and in its JSON representation:

    Person({ // a Person object
        name: 'John',
        dateOfBirth: Date('1901-01-01'),
        i: Complex(0, 1),
        d: { '?': 123 }
    })

JSON representation:

    {
        "?": "Person",
        "name": "John",
        "dateOfBirth": {"?": ["Date", "1901-01-01"]},
        "i": {"?": ["Complex", 0, 1]},
        "d": {"?": ["?", 123]}
    }

The JSON representation looks more complicated, but it means the same thing. It can be parsed with the standard JSON.parse() and then evaluated in a second pass. Or, in some simple cases, it can be evaluated directly during parsing using the reviver function passed to JSON.parse(). The same goes for serialization to the JSON representation: it is easy to do with the replacer function passed to JSON.stringify().
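A second-pass evaluator for the three rules might look roughly like the sketch below. It is an illustration rather than a reference implementation: the function names are hypothetical, the Complex and Person handlers return plain objects as stand-ins, and the sketch assumes that a dictionary of the form {"?": ["f", ...]} carries no other keys.

```javascript
// Sketch of a second-pass evaluator for the JSON representation.
// Assumption: in the first form {"?": ["f", ...]} no other keys are expected.
function evaluate(value, handlers, ctx) {
  if (Array.isArray(value)) {
    return value.map(function (item) { return evaluate(item, handlers, ctx); });
  }
  if (value === null || typeof value !== 'object') {
    return value;
  }
  var hasMark = Object.prototype.hasOwnProperty.call(value, '?');
  var mark = value['?'];
  var dict = {};
  Object.keys(value).forEach(function (key) {
    if (key !== '?') { dict[key] = evaluate(value[key], handlers, ctx); }
  });
  if (!hasMark) {
    return dict; // ordinary dictionary
  }
  if (typeof mark === 'string') {
    // Third rule: {"?": "f", ...} is f({...}).
    return callHandler(mark, [dict], handlers, ctx);
  }
  if (Array.isArray(mark) && mark[0] === '?') {
    // Second rule: escaped '?' property of an ordinary dictionary.
    dict['?'] = evaluate(mark[1], handlers, ctx);
    return dict;
  }
  if (Array.isArray(mark) && typeof mark[0] === 'string') {
    // First rule: {"?": ["f", ...]} is f(...).
    var args = mark.slice(1).map(function (arg) { return evaluate(arg, handlers, ctx); });
    return callHandler(mark[0], args, handlers, ctx);
  }
  throw new Error('Malformed "?" entry');
}

function callHandler(name, args, handlers, ctx) {
  if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
    throw new Error('Unknown handler: ' + name);
  }
  return handlers[name].apply(null, [ctx].concat(args));
}

var handlers = {
  Date: function (ctx, v) { return new Date(v); },
  Complex: function (ctx, re, im) { return { re: re, im: im }; },      // stand-in
  Person: function (ctx, data) { return { type: 'Person', data: data }; } // stand-in
};
var text = '{"?":"Person","name":"John",' +
  '"dateOfBirth":{"?":["Date","1901-01-01"]},' +
  '"i":{"?":["Complex",0,1]},"d":{"?":["?",123]}}';
var person = evaluate(JSON.parse(text), handlers, {});
console.log(JSON.stringify(person.data.d)); // {"?":123}
```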

Generally speaking, the concept of computable data can be added as an extension to almost any data format.

Asynchronous calls

In the examples shown earlier, it was implicitly assumed that all calls are synchronous. Consider one of them more closely:

 [ set('x', getUserBooks(17)), getAuthors(getProps(get('x'), 'authorId')), ]

getUserBooks() and getAuthors() presumably have to go to some kind of data store, perform I/O and, accordingly, be asynchronous. So we cannot evaluate them on the spot. And even if we can (for example, using fibers), we would still like independent asynchronous calls to run in parallel rather than one after another.

The solution could be an evaluation engine that places asynchronous calls in an execution queue and then substitutes the results in the right places. Having evaluated all the synchronous parts, we would wait for the asynchronous ones to finish and then consider the evaluation complete. Something like async.queue(), which runs tasks with a given level of parallelism, could serve as the execution queue.

But in fact the task is harder. In our example, some calls depend on others: we cannot evaluate set() until getUserBooks() has been evaluated. So when we try to call set(), we must postpone it and tell every calculation it depends on that set() has to be evaluated once they are ready. Here set() depends on exactly one deferred calculation, getUserBooks(), but more complex dependencies are possible.

And that's not all. The get() call cannot return anything useful until set() has completed. Therefore get() must also be a deferred calculation, this time waiting for a signal from set(). In turn, getProps() depends on get(), and getAuthors() depends on getProps().

Our evaluation engine has to take all these dependencies into account and perform all deferred calculations in the right order. This does not look like a simple task, so it would be good to have an implementation that proves the approach viable.
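To get a feel for the dependency problem, here is a simplified sketch that uses Promises instead of an explicit queue: every call waits for its arguments, and get() waits on a per-key deferred that set() resolves. This is only an illustration of the idea, not a description of how any particular engine works. The node shape { call: 'name', args: [...] }, the slot helper, and the fake data sources are all hypothetical.

```javascript
// Sketch: Promise-based evaluation of a pre-parsed tree. A call node is
// assumed to look like { call: 'name', args: [...] } (hypothetical shape).
function evaluate(node, handlers, ctx) {
  if (Array.isArray(node)) {
    // Array items start together, so independent calls run in parallel.
    return Promise.all(node.map(function (item) {
      return evaluate(item, handlers, ctx);
    }));
  }
  if (node !== null && typeof node === 'object' && typeof node.call === 'string') {
    // A call runs only after all of its arguments have been computed.
    return evaluate(node.args, handlers, ctx).then(function (args) {
      return handlers[node.call].apply(null, [ctx].concat(args));
    });
  }
  return Promise.resolve(node); // plain values (dictionaries are left as-is)
}

// One deferred per key lets get() wait for a set() that has not run yet.
function slot(ctx, key) {
  ctx.box = ctx.box || {};
  if (!ctx.box[key]) {
    var entry = {};
    entry.promise = new Promise(function (resolve) { entry.resolve = resolve; });
    ctx.box[key] = entry;
  }
  return ctx.box[key];
}

var handlers = {
  set: function (ctx, key, data) { slot(ctx, key).resolve(data); return data; },
  get: function (ctx, key) { return slot(ctx, key).promise; },
  getProps: function (ctx, items, name) {
    return items.map(function (item) { return item[name]; });
  },
  // Fake asynchronous data sources, for demonstration only.
  getUserBooks: function (ctx, userId) {
    return new Promise(function (resolve) {
      setTimeout(function () { resolve([{ title: 'Book', authorId: 7 }]); }, 10);
    });
  },
  getAuthors: function (ctx, ids) {
    return Promise.resolve(ids.map(function (id) { return { id: id }; }));
  }
};

var request = [
  { call: 'set', args: ['x', { call: 'getUserBooks', args: [17] }] },
  { call: 'getAuthors', args: [
    { call: 'getProps', args: [{ call: 'get', args: ['x'] }, 'authorId'] }
  ] }
];
var pending = evaluate(request, handlers, {});
pending.then(function (result) { console.log(JSON.stringify(result)); });
// [[{"title":"Book","authorId":7}],[{"id":7}]]
```

One limitation of this sketch is that a get() for a key that is never set waits forever, so a real engine would need to detect such unresolved dependencies.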

Arc

The arc project was created to show that the concept of computable data is viable, including with asynchronous handlers and dependent calls. The engine itself is very simple: it consists of a few hundred lines of code and currently contains no clever optimizations, which makes it easy to understand.

The input to the engine is a stream of tokens. Currently, tokenization is available only for the JSON representation, but support for jsonex itself is not far off. Arc can be used both in node.js and in the browser via browserify. Writing asynchronous handlers for arc is very simple: just wrap the asynchronous call in the appropriate directive, and the engine will do all the hard work under the hood:

    handlers.getUserBooks = function (ctx, userId) {
        return ctx.async(function (cb) {
            doSomethingAsync(userId, cb);
        });
    };

So far only callbacks are supported; Promise support is planned. Examples of using the engine, as well as of serialization to jsonex and to the JSON representation, can be found in the corresponding section of the project.

Update: Surprisingly, just a few hours after the article was published, arc got an alternative. Habr user mayorovp presented his own version of a library for parsing jsonex in the JSON representation. This library consists of a single file, has no dependencies, uses the standard JSON.parse(), and supports Promises/A+. Unfortunately, due to the limitations of JSON.parse(), asynchronous calls cannot be processed as efficiently in this implementation, so asynchronous processing can be switched off. Here is an example of using the alternative library with jQuery:

    var parser = new JSONEX.parser({
        allow_async: true,
        functions: {
            Foo: function () {
                var result = $.Deferred();
                setTimeout(function () {
                    result.resolve("Hello, world!");
                }, 1000);
                return result.promise();
            }
        }
    });

    $.when(parser.parse('[{"?": ["Foo"]}]'))
        .done(function (result) {
            console.log(result);
        });

Conclusion

Both jsonex and arc are currently under development. The jsonex features described in this article will probably not change, but new ones will be added, such as namespaces, binary chunks, streams, and scopes. Arc is likely to change quite a bit.

Because the ideas presented here are new and not yet battle-tested, I would recommend using them with some caution. Nevertheless, I will be very glad if you like some of them and find them useful in your projects and experiments. I would also appreciate it if you shared your impressions, experience, and discoveries.

Source: https://habr.com/ru/post/224261/
