Spaghetti in the sequential invocation of asynchronous functions. Theory and practice

In the continuation of the article Sequential call asynchronous functions .

Part 1. Theory

Most traditional non-web programming languages are synchronous (blocking).
How can I determine if this language is synchronous or asynchronous? For example, by the presence / absence of the sleep function (it can also be called delay, pause, etc.) when the program completely stops for a certain amount of time.

In JavaScript, as you know, there is no such function. There are for example setTimeout , but it does something completely different. It may delay the execution of the command, but this does not mean that after setTimeout , the program stops and nothing can be done in it.
On the contrary, theoretically, after setTimeout has been called, some of the resources may even become free and the delayed callbacks (functions in the queue) can be executed faster.
')
It is recommended not to confuse synchronicity / asynchrony with single-threaded / multi-threaded. These concepts are loosely connected.
Implementing asynchronous JavaScript is just one approach to another concept — multitasking, for which there is the most traditional solution — multithreading.

Advantages of multithreading:

No need to change thinking, retrain from traditional blocking languages (C ++, Java, Python)
If you have a multiprocessor computer and the program is written using threads and they do not block each other, i.e. operate with independent data, then the program on such a processor will run somewhat faster

Disadvantages of multithreading:

Each thread needs a place to work with data, so each thread eats away the RAM even if this memory is not used and the stream is sleeping.
Previously, it was a big problem, now the memory is cheap, but it still somehow becomes unpleasant when simple but multi-threaded Java program takes 2 GB in memory.
If two streams use the same scarce resource (an object in memory, a network connection, etc.), a race condition for the resource (Race Condition) may occur. “It can happen” is even worse than if we said “it will happen”.
To combat the competition of threads, flags, semaphores, etc. are used, with one goal being to make other threads wait for a scarce resource. But then there may be a deadlock problem when thread A is waiting for thread B, and B is waiting for A.
These are the two biggest disadvantages of multi-threading; this is such a big problem that, for example, in an interview when hiring a Java programmer, there are necessarily questions about threads like " how does Thread.start () differ from Thread.run ()? ", " to deal with deadlocks? ", etc. Java programmers spend a huge amount of time on the heroic creation of threads and then on the no less heroic overcoming of the problems associated with this.
As you probably know, the development of traditional processors rested on a certain limit, and it is no longer possible to increase the productivity of the race with gigahertz. In this case, multi-streaming gives a big boost in processing a large amount of more or less independent data, for example, when encoding a video.
Despite this, the impression of multitasking through multithreading on the part of the programmer is this: "it is incredibly difficult to program so that the memory is spent so that 90% of the time threads wait for each other, and as soon as they stop waiting, they fight with each other deadlock . "

In Javascript, to create a parallel task you need to write only:

setTimeout(function () { console.log('Async'); }, 0);

 button.addEventListener('click', function () { console.log('Async click'); }, false)

However, the “parallel task” does not mean that your JavaScript will work faster on a 10-core processor. JavaScript is thread-neutral, the ECMA specification does not describe how a JavaScript machine implements multitasking. As far as I know, all existing implementations of JavaScript use a multitasking type of “Threads in user space” (the processor quickly and quickly switches tasks by interrupting with a timer inside one process), which however does not guarantee that nuclear multithreading can never appear in JavaScript in the future.
Looking ahead, I’ll say that in the end the threads were forcibly entered into JavaScript in a slightly strange way through the Web Worker, but this will be discussed later in the second part.

So, in standard JavaScript everything is done differently, through an endless loop of events (Event Loop) and non-blocking calls. In the main and only UI thread, this Event Loop runs, which accesses the callback queue and selects and sequentially executes them until the queue is cleared.
A call to setTimeout, onclick, and XmlHttpRequest with the true flag places a new callback on the event queue. When a callback is selected from the queue and executed, it can place another callback in the queue, etc.
If you want a fast working site with rich JavaScript, which doesn’t “ load in two hours ”, you should postpone as many operations as possible and jump into the main thread as soon as possible to release the UI, and the event manager will figure out when and what to call from the queue, and items will be loaded gradually.
The scanning process of a queue of callbacks is never interrupted but never waits. Although gradual loading of data will not change the final speed of the program itself, but an asynchronous site with gradually appearing elements will be perceived by the visitor as faster.

JavaScript is very well suited for asynchronous operation and was conceived like this.
Unfortunately, there are synchronous exceptions in the syntax - these are the alert , confirm , promt and xmlhttprequest commands with the false flag, which block everything.
It is strongly not recommended to use these commands in any cases, except, perhaps, one exception which will be discussed at the end of this article.
An asynchronous call is always better than a synchronous one in terms of performance. Look, for example, at nginx - it became super-popular precisely because of the high performance that is achieved, basically, asynchronous work.
To my great regret, node.js still could not resist and introduced another synchronous command - require. As long as this command is in node.js, I personally will never use it, because I am convinced that performance will always be lame.

Why, then, in an asynchronous language are introduced synchronous commands that spoil the whole ideology of the language?
First, the JavaScript machine does not work by itself, but in the browser in which the user sits, and the blocking commands were added not to the JavaScript language, but to the environment that surrounds it — the browser, although it is difficult for us to logically separate these concepts.
"On the other side of the browser," there are programmers, the most diverse, coming from different languages, most often synchronous. Writing asynchronous code is much more difficult, it requires a completely different way of thinking.
Therefore, " according to numerous requests from programmers, they have little understanding of asynchrony, " they added the usual vicious synchronous functions that completely stop the Event Loop.
It is always possible to perform a task asynchronously, but the temptation to simplify your life by replacing an asynchronous call with a synchronous call is too great.

What is the complexity of asynchronous development?
For example, sooner or later, any JavaScript programmer will encounter such a “bug” (One of the most popular questions on StackOverflow):

Server code

 <?php #booklist.php header('Content-type: application/json'); echo json_encode(array(1, 2, 88)); ?>

Client Code

 var getBookList = function () { var boolListReturn; $.ajax({ url : 'bookList.php', dataType : 'json', success : function (data) { boolListReturn = data; } }); return boolListReturn; }; console.log(getBookListSorted()); // -   :)

Here, of course, misunderstanding is caused by the fact that the ajax request went to the queue, and console.log remained in the main Ui thread.
When ajax is executed successfully, it will queue the success callback, which may also be executed sometime. Of course, console.log will already be far in the past with the fact that the function returned (undefined).

It is more correct to change the program a little, let's say by passing the console.log call inside the success callback.

 var getBookList = function (callback) { $.ajax({ url : 'bookList.php', dataType : 'json', success : function (data) { callback(data); } }); }; getBookList(function (bookList) { console.log(bookList); });

An even more modern way is to move to some convenient interface for working with callbacks, for example, the so-called concept of “promise” (promise, also known as Deferred) - a special object that stores its own queue of callbacks, flags of the current state and other goodies.

 var getBookList = function () { return $.ajax({ url : 'bookList.php', dataType : 'json', }).promise(); }; //     promise    done getBookList().done(function (bookList) { console.log(bookList); });

However, increasing the load on callbacks, there may be a second problem, which is that it is problematic to use more than one or two asynchronous commands.
Imagine that by the received id list we need to find the names of books using another service book.php

Server part:

 <?php #book.php $id = $_REQUEST['id']; $response = array( "id" => $id ); switch ($id) { case '1': $response['title'] = "Bobcat 1"; break; case '2': $response['title'] = "Lion 2"; break; case '88': $response['title'] = "Tiger 88"; break; } header('Content-type: application/json'); echo json_encode($response); ?>

Our client code will be:

 var getBookList = function () { return $.ajax({ url : 'bookList.php', dataType : 'json', }).promise(); }; var getBook = function (id) { return $.ajax({ url : 'book.php?id='+id, }).promise(); }; getBookList().done(function (bookList) { $.each(bookList, function (index, bookId) { getBook(bookId).done(function (book) { console.log(book.title); }); }); });

Here this three-story code is not very. Of course, it is possible to understand what is happening here, but a large level of nesting is very disturbing and becomes a place where bugs can potentially arise.
Here is one of the bugs: If the id list has been sorted, a sorting loss may occur. For example, if some requests are returned more slowly than others, or simply the user simultaneously runs the torrent to swing, the speed of issuing the results of the requests can “ride”.
On php, we emulate this with the sleep command:
...
case '2':
sleep (2);
$ response ['title'] = "Lion 2";
break;
...
our script will output
Bobcat 1
Tiger 88
Lion 2
The trouble is visible here, because our list is no longer sorted alphabetically! ..
How can we maintain the orderliness of the list, while the requests take different time?
This problem is not as simple as it seems, even promising objects will not help much here, try to solve this problem yourself and you will feel the drama of the situation on your skin.

Part 2. Practice

Look at this incomplete list of JavaScript libraries:
async.js, async, async-mini, atbar, begin, chainsaw, channels, Cinch, cloudd, deferred, each, EventProxy.js, fiberize, fibers, proms, asyncblock, first, flow-js, funk, futures, promise, groupie, Ignite, jam, Jscex, JobManager, jsdeferred, LAEH2, miniqueue, $ N, nestableflow, node.flow, node-fnqueue, node-chain, node-continuables, node-cron, node-crontab, node-inflow , node_memo, node-parallel, node-promise, narrow, neuron, noflo, observer, poolr, q, read-files, Rubberduck, SCION, seq, sexy, Signals, simple-schedule, Slide, soda.js, Step, stepc , streamline.js, sync, QBox, zo.js, pauseable, waterfall
All these bike libraries promise to solve approximately one problem "Write async code in sync form.". Those. allow to write asynchronous code as easily as in the synchronous style.
I tried most of them, but in reality they do not really help. I didn’t notice much comfort compared to standard jQuery.Deferred.
But still let's consider what are the options:

Option 1 "Synchronous calls"

The obvious solution to the problem of subqueries (get a list, go through the list items and execute another request for each item, while maintaining the orderliness of the original list) will be stupid to make all calls synchronous:

 var getBookList = function () { var strReturn; $.ajax({ url : '../bookList.php', dataType : 'json', success : function (html) { strReturn = html; }, async : false }); return strReturn; }; var getBook = function (id) { var strReturn; $.ajax({ url : '../book.php?id='+id, success : function (html) { strReturn = html; }, async : false }); return strReturn; }; var getBookTitles = function () { return $.map(getBookList(), function (val, i) { return getBook(val).title; }); }; var ul = $('<ul/>').appendTo($('body')); $.each(getBookTitles(), function (index, title) { $('<li>'+ title +'</li>').appendTo(ul); });

This solution is from the “very fast and dirty” series because requests not only block the browser but also make it take longer, because each next request is waiting for the previous one.

Virtues

Simple code, easy to catch bugs
Easy to test

Disadvantages:

Blocks browser
The resulting time is the sum of the time of all requests.

Option 2 “A promise that awaits the fulfillment of all the promises on his list”

The list of books in itself will be one promise (not a list), but inside it will contain a list of individual promises and only after all the promises in it are fulfilled,
the result will be returned as an array containing synchronous data

 var getBookTitles = function () { var listOfDeferreds = []; var listDeferred = $.Deferred(); getBookList().done(function (bookList) { $.each(bookList, function (i, val) { listOfDeferreds.push(getBook(val)); }); $.when.apply(null, listOfDeferreds).then(function () { listDeferred.resolve($.map(arguments, function (triple) { return triple[0].title; })); }); }); return listDeferred.promise(); }; getBookTitles().done(function (bookTitles) { $.each(bookTitles, function (index, title) { $('<li>'+ title +'</li>').appendTo('#ul'); }); });

The getBookTitles function code is quite heavy. The main problem is that it is mistaken, difficult to catch problems, difficult to debug.

Advantages of this option:

Does not block the browser
The resulting time is the longest of the requests.

Disadvantages:

Difficult, erratic code
Hard to test

Option 3 "Reservation of a place for ui result"

In this case, having received the id list, we iterate over the elements included in it, immediately create a UI object.
In the same iteration, we request the second portion of asynchronous data, while the UI element is visible through the closure and we fill it with the contents:

 getBookList().done(function (bookList) { $.each(bookList, function (index, id) { var li = $('<li>Loading...</li>'); li.appendTo('#ul'); getBook(id).done(function (book) { li.html(book.title); }); }); });

Virtues

Does not block the browser
The result is shown immediately as each individual request ends.
Requests go in parallel

Disadvantages:

Average readability code
Hard to test

Option 4 "Synchronous calls in a separate thread webworker"

In the process of writing this article, I came up with a slightly exotic option - to run synchronous requests but in a separate thread through WebWorker and modules. At the same time, the browser is not blocked, but the code is simplified.
To do this, we will write a file for the worker, plus there will be a synchronous function like require from node.js.

 // wwarpc.js var require = function () { // Only load the module if it is not already cached. var cache = {}; var gettext = function (url) { var xhr = new XMLHttpRequest(); xhr.open("GET", url, false); // sync xhr.send(null); if (xhr.status && xhr.status != 200) throw xhr.statusText; return xhr.responseText; }; return function (url) { if (!cache.hasOwnProperty(url)) { try { // Load the text of the module var modtext = gettext(url); // Wrap it in a function var f = new Function("require", "exports", "module", modtext); // Prepare function arguments var context = {}; // Invoke on empty obj var exports = cache[url] = {}; // API goes here var module = { id: url, uri: url }; // For Modules 1.1 f.call(context, require, exports, module); // Execute the module } catch (x) { throw Error("ERROR in require: Can't load module " + url + ": " + x); } } return cache[url]; } }(); onmessage = function(e){ if ( e.data.message !== "start" ) { return } var url = e.data.url; var funcname = e.data.funcname; var args = e.data.args; var module = require(url); postMessage(module[funcname].apply(null, args)); };

An auxiliary function for the convenient start of the worker will be as follows:
It will also cache both the worker and modules in order not to load the module from the server with each call.

 //   ,     ,  <script src="http://ie-web-worker.googlecode.com/svn/trunk/worker.js"></script> /** * Web Worker Asynchroneous Remote Procedure Call */ var wwarpc = function () { var worker; var getWorker = function () { // for lazy load if (worker === undefined) { worker = new Worker("wwarpc.js"); } return worker; }; return function (url, funcname) { var args = Array.prototype.slice.call(arguments, 2); var d = $.Deferred(); var worker = getWorker(); worker.onmessage = function (e) { d.resolve(e.data); }; worker.postMessage({ message : "start", url : url, funcname : funcname, args : args }); return d.promise(); }; }();

It is interesting that the modules will be node.js-like

 //modules/Books.js exports.getBookList = function () { var the_object = {}; var http_request = new XMLHttpRequest(); http_request.open( "GET", 'bookList.php', false ); http_request.send(null); if ( http_request.readyState == 4 && http_request.status == 200 ) { the_object = JSON.parse( http_request.responseText ); } return the_object; }; exports.getBook = function (id) { var the_object = {}; var http_request = new XMLHttpRequest(); http_request.open( "GET", 'book.php?id='+id, false ); http_request.send(null); if ( http_request.readyState == 4 && http_request.status == 200 ) { the_object = JSON.parse( http_request.responseText ); } return the_object; }; exports.getBookTitles = function () { var Books = exports; return Array.prototype.map.call(Books.getBookList(), function (val, i) { return Books.getBook(val).title; }); };

In this case, the same module code can be called both synchronously (during testing with unit tests) and asynchronously (in production).
Thanks to this, the main code will be much simpler, two-story instead of three:

 wwarpc('modules/Books.js', 'getBookTitles').done(function (bookTitles) { $.each(bookTitles, function (index, title) { $('<li>'+ title +'</li>').appendTo('#ul'); }); });

The ideology will be such that all operations will be performed in the worker but the worker himself will be called asynchronously, as a result, the nesting will always be minimal.

Advantages:

Does not block new browsers
Simple, easy to understand code.
Easy to test (can be tested in synchronous mode, and called asynchronous)

Disadvantages:

Blocks IE and old browsers that do not support workers
The resulting time is the sum of the time of all requests.

Conclusion:

Option 3 "Reservation of a place for the result in ui" is the fastest option, but a bit complicated code
If the readability and testability of the code is more important than the comfort of IE users, then Option 4 “Synchronous calls in a separate webworker thread” can be a good choice.
Option 2 should be strictly avoided (“A promise that awaits the fulfillment of all the promises on its list”)
The use of synchronous calls cannot be justified.
IE must die!

The article was written under the impression of the following materials:

Source: https://habr.com/ru/post/137778/

All Articles

Spaghetti in the sequential invocation of asynchronous functions. Theory and practice

Part 1. Theory

Advantages of multithreading:

Disadvantages of multithreading:

Server code

Client Code

Server part:

Our client code will be:

Part 2. Practice

Option 1 "Synchronous calls"

Virtues

Disadvantages:

Option 2 “A promise that awaits the fulfillment of all the promises on his list”

Advantages of this option:

Disadvantages:

Option 3 "Reservation of a place for ui result"

Virtues

Disadvantages:

Option 4 "Synchronous calls in a separate thread webworker"

Advantages:

Disadvantages:

Conclusion:

More articles: