Hello, my name is Dmitry Karlovsky and I am... a multi-tasking person. That is, I have a lot of tasks and not enough time to finish them all. In part this is for the better: there is always something to do. On the other hand, while you are torn between projects, the world keeps drifting somewhere it should not, and there is nobody to climb onto the armored car and urge the crowd to stop and think for a minute. And the question is serious: for a long time the JS world was stuck in callback hell, and not only did it not fight callbacks, it idolized them. Then it got bogged down, a little less than completely, in promises. Now crutches of varying degrees of crookedness are being wedged in from every side. Not everyone can see a light at the end of this tunnel yet. But first things first...
First, let's define the terms. As it runs, an application performs various tasks. For example, "download a file from a remote server" or "process a user request".
It is not uncommon for one task to spawn additional tasks, or "subtasks". For example, to process a user request, you need to download a file from a remote server.
We can launch a subtask synchronously, in which case the current task is blocked until the subtask completes. Or we can launch it asynchronously, in which case the current task continues executing without waiting for the subtask to finish.
However, completing the task usually still requires, sooner or later, that the subtask complete and its results be processed. Blocking one task while it waits for a signal from another is what we will call "synchronization". In general, the same tasks may synchronize many times and follow all kinds of logic, but from here on we will consider only the simplest and most common case: synchronization upon completion of the subtask.
In languages that support multithreading, each task usually runs in a separate thread. Each thread can run on a separate processor core, in parallel with other threads. Since there may be a great many threads while the number of cores is quite limited, the operating system implements "preemptive multithreading": any thread that runs for too long can be forcibly suspended to let other threads do their work.
Parallel execution of tasks leads to all sorts of problems around shared memory, and solving them requires non-trivial synchronization mechanisms. To simplify the programmer's job and make the resulting software more reliable, some languages abandon multithreading entirely and run all tasks in a single thread. In that case multitasking is implemented in one of the following ways:
Fibers, also known as coroutines. Essentially these are the same threads, but they implement "cooperative multitasking". Each fiber has its own stack, but all fibers execute within a single thread and therefore cannot run in parallel. The decision about when to yield the thread to another fiber is made by the fiber itself.
Chains of tasks. The essence of this approach is that instead of suspending the current task for the duration of a subtask, we split the task into many small subtasks and tell each one which task to run when it finishes.
State machines, also known as "generators", "asynchronous functions" (async functions), "semicoroutines" and "stackless coroutines". Essentially these are objects that store the local state of a single method, which begins with a branch jumping to the code of one of the steps of the original task. When a step completes, control returns to the caller; calling the asynchronous function again moves it on to the next step. A rough sketch of this idea follows right after this list.
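To make the state-machine idea more concrete, here is a purely illustrative, hand-rolled sketch of such an object (the names and the two-step task are invented for this example; real generators and async functions are produced by the engine or a transpiler):

// Illustrative only: a hand-rolled "stackless coroutine" for a two-step task.
var makeTask = () => {
    var step = 0   // which step to resume at on the next call
    var config     // a "local variable" that survives between steps
    return {
        next: value => {
            switch( step ) {
                case 0:
                    step = 1
                    // suspend: ask the caller to supply the config
                    return { done: false , value: 'need config' }
                case 1:
                    // resume with the value supplied by the caller
                    config = value
                    step = 2
                    return { done: true , value: config.name }
            }
        }
    }
}

var task = makeTask()
task.next()                          // { done: false, value: 'need config' }
task.next( { name: 'Anonymous' } )   // { done: true, value: 'Anonymous' }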
In the nin-jin/async-js repository, implementations of one simple application on top of the different multitasking models are collected in separate branches. The application itself is trivial and consists of 3 parts: a user module that reads the name from a JSON config, a greeter module that prints a greeting, and an entry script that greets the user twice and measures how long it takes.
The config is simple:
{ "name" : "Anonymous" }
user.js
var fs = require( 'fs' )

var config
var getConfig = () => {
    if( config ) return config
    var configText = fs.readFileSync( 'config.json' )
    return config = JSON.parse( configText )
}

module.exports.getName = () => {
    return getConfig().name
}
greeter.js
module.exports.say = ( greeting , user ) => {
    console.log( greeting + ', ' + user.getName() + '!' )
}
index.js
var user = require( './user' )
var greeter = require( './greeter' )

try {
    console.time( 'time' )
    greeter.say( 'Hello' , user )
    greeter.say( 'Bye' , user )
    console.timeEnd( 'time' )
} catch( error ) {
    console.error( error )
}
Extremely simple and straightforward. It is easy to understand and just as easy to change. But it has one major drawback: while this task is running, no other task can make progress, even if all we are doing is waiting for a file to arrive from a network drive. For a one-off script, as in the example above, that is fine, but if we need a web server that has to process many requests at the same time, a single-tasking solution will not do.
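For example (a hypothetical sketch, not part of the repository): a web server that reads the config synchronously on every request makes every other client wait while the file is being read:

// Hypothetical illustration: while readFileSync waits for the disk,
// the single thread cannot accept or answer any other request.
var http = require( 'http' )
var fs = require( 'fs' )

http.createServer( ( req , res ) => {
    var config = JSON.parse( fs.readFileSync( 'config.json' ) ) // blocks the whole process
    res.end( 'Hello, ' + config.name + '!' )
} ).listen( 8080 )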
Many synchronous methods in the NodeJS API have asynchronous counterparts whose last argument is the "continuation", that is, the function that should be called once the asynchronous task has finished.
user.js
var fs = require( 'fs' )

var config
var getConfig = done => {
    if( config ) return setImmediate( () => {
        return done( null , config )
    })
    fs.readFile( 'config.json' , ( error , configText ) => {
        if( error ) return done( error )
        try {
            config = JSON.parse( configText )
        } catch( error ) {
            return done( error )
        }
        return done( null , config )
    })
}

module.exports.getName = done => {
    getConfig( ( error , config ) => {
        if( error ) return done( error )
        try {
            var name = config.name
        } catch( error ) {
            return done( error )
        }
        return done( null , name )
    } )
}
greeter.js
module.exports.say = ( greeting , user , done ) => {
    user.getName( ( error , name ) => {
        if( error ) return done( error )
        console.log( greeting + ', ' + name + '!' )
        return done()
    })
}
index.js
var user = require( './user' )
var greeter = require( './greeter' )

var script = done => {
    console.time( 'time' )
    greeter.say( 'Hello' , user , error => {
        if( error ) return done( error )
        greeter.say( 'Bye' , user , error => {
            if( error ) return done( error )
            console.timeEnd( 'time' )
            done()
        } )
    } )
}

script( error => {
    if( !error ) return
    console.error( error )
} )
As you can see, the code has become noticeably more complicated. We had to rewrite all functions (even the synchronous ones) in the continuation style. Proper error handling is a particular pain: if you forget to handle an error somewhere, the application may or may not crash, or it may crash, but not immediately, somewhere far away from the place of the error. And if by some miracle it does not crash, the error will not be logged anywhere at all. Writing code in this style demands care and attention from the programmer, so most modules on NPM are loaded guns, ready to hand you unforgettable hours in the company of a debugger.
Chains implemented through promises take most of the work of propagating errors onto themselves. The only thing you still have to remember is to put an error handler at the end of the chain, otherwise the application may abandon the task without saying a word.
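A minimal illustration (hypothetical, not from the repository): without a terminal handler the rejection just disappears from the task's point of view, with a .catch at the end it is actually reported:

// Without a terminal handler the rejection surfaces, at best, as an
// "unhandled rejection" warning and the task silently stops.
Promise.reject( new Error( 'oops' ) )
    .then( () => console.log( 'never happens' ) )

// With a .catch at the end of the chain the error is handled explicitly.
Promise.reject( new Error( 'oops' ) )
    .then( () => console.log( 'never happens' ) )
    .catch( error => console.error( error ) )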
user.js
var fs = require( 'fs' )

var config
var getConfig = () => {
    return new Promise( ( resolve , reject ) => {
        if( config ) return resolve( config )
        fs.readFile( 'config.json' , ( error , configText ) => {
            if( error ) return reject( error )
            return resolve( config = JSON.parse( configText ) )
        } )
    } )
}

module.exports.getName = () => {
    return getConfig().then( config => {
        return config.name
    } )
}
greeter.js
module.exports.say = ( greeting , user ) => {
    return user.getName().then( name => {
        console.log( greeting + ', ' + name + '!' )
    } )
}
index.js
var user = require( './user' )
var greeter = require( './greeter' )

Promise.resolve()
.then( () => {
    console.time( 'time' )
    return greeter.say( 'Hello' , user )
} )
.then( () => {
    return greeter.say( 'Bye' , user )
} )
.then( () => {
    console.timeEnd( 'time' )
} )
.catch( error => {
    console.error( error )
} )
Compared to the continuation chains, the code turned out simpler, but it is still chopped up into many small functions. The advantage of this approach is that it works equally well in any environment, even one where promises are not available out of the box: they are easy to add with a small third-party library.
In general, both kinds of chains produce a lot of visual noise and make it harder to write non-linear algorithms that use loops, conditional branches, local variables and so on.
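For instance (a hypothetical sketch, not from the repository), reading several files in order is an ordinary loop in the synchronous style, but with continuations the loop has to be rebuilt as a recursive chain:

var fs = require( 'fs' )

// Synchronous style: an ordinary loop (here via map).
var readAllSync = paths =>
    paths.map( path => fs.readFileSync( path , 'utf8' ) )

// Continuation style: the loop becomes a recursive chain of callbacks.
var readAll = ( paths , done , result ) => {
    result = result || []
    if( paths.length === 0 ) return done( null , result )
    fs.readFile( paths[ 0 ] , 'utf8' , ( error , text ) => {
        if( error ) return done( error )
        result.push( text )
        readAll( paths.slice( 1 ) , done , result )
    } )
}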
Some JS engines support generators, which integrate rather elegantly with promises and make it possible to implement "suspendable" (awaitable) functions.
user.js
var fs = require( 'fs' )
var co = require( 'co' )

var config
var getConfig = () => {
    if( config ) return config
    return config = new Promise( ( resolve , reject ) => {
        fs.readFile( 'config.json' , ( error , configText ) => {
            if( error ) return reject( error )
            resolve( JSON.parse( configText ) )
        } )
    } )
}

module.exports.getName = co.wrap( function* () {
    return ( yield getConfig() ).name
} )
greeter.js
var co = require( 'co' )

module.exports.say = co.wrap( function* ( greeting , user ) {
    console.log( greeting + ', ' + ( yield user.getName() ) + '!' )
} )
index.js
var co = require( 'co' )
var user = require( './user' )
var greeter = require( './greeter' )

co( function*() {
    console.time( 'time' )
    yield greeter.say( 'Hello' , user )
    yield greeter.say( 'Bye' , user )
    console.timeEnd( 'time' )
} ).catch( error => {
    console.error( error )
} )
The code turned out to be almost as simple as the synchronous version, except that we had to turn every function into a generator and wrap it in a special runner, which takes the promise yielded by the generator, subscribes to its resolution, and then feeds the resolved value back into the generator so it can continue. This lets us once again use conditional branches, loops and other flow-control idioms.
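Such a runner is small. A simplified sketch of one (a stand-in for what co does, not its actual implementation) might look like this:

// Simplified sketch of a generator runner in the spirit of co:
// every yielded promise is awaited and its result is fed back into the generator.
// Propagating promise rejections back into the generator (generator.throw)
// is omitted here for brevity.
var run = makeGenerator => function() {
    var generator = makeGenerator.apply( this , arguments )
    return new Promise( ( resolve , reject ) => {
        var step = input => {
            var next
            try {
                next = generator.next( input )
            } catch( error ) {
                return reject( error )
            }
            if( next.done ) return resolve( next.value )
            Promise.resolve( next.value ).then( step , reject )
        }
        step()
    } )
}

// Usage, analogous to co.wrap:
// module.exports.getName = run( function* () {
//     return ( yield getConfig() ).name
// } )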
In fact, async/await is nothing more than syntactic sugar for generators. But this sugar is still supported in few places, so for the time being you have to use Babel to transpile it into generator-based code.
user.js
var fs = require( 'fs' )

var config
var getConfig = () => {
    if( config ) return config
    return config = new Promise( ( resolve , reject ) => {
        fs.readFile( 'config.json' , ( error , configText ) => {
            if( error ) return reject( error )
            resolve( JSON.parse( configText ) )
        } )
    } )
}

module.exports.getName = async () => {
    return ( await getConfig() ).name
}
greeter.js
module.exports.say = async ( greeting , user ) => {
    console.log( greeting + ', ' + ( await user.getName() ) + '!' )
}
index.js
var user = require( './user' )
var greeter = require( './greeter' )

async function app() {
    console.time('time')
    await greeter.say('Hello', user)
    await greeter.say('Bye', user)
    console.timeEnd('time')
}

app().catch( error => {
    console.error( error )
} )
A simple native extension for NodeJS (node-fibers) implements full-fledged fibers. All you need is to start the task in a fiber; after that, at any depth of nested function calls, you can suspend that fiber and hand control over to another one. The example below uses so-called "futures", which let one task synchronize with another at any moment.
user.js
var Future = require( 'fibers/future' )
var FS = Future.wrap( require( 'fs' ) )

var config
var getConfig = () => {
    if( config ) return config
    var configText = FS.readFileFuture( 'config.json' )
    return config = JSON.parse( configText.wait() )
}

module.exports.getName = () => {
    return getConfig().name
}
greeter.js
And it did not even need to be changed: it is still the same synchronous code.
index.js
var Future = require( 'fibers/future' )
var user = require( './user' )
var greeter = require( './greeter' )

Future.task( () => {
    try {
        console.time('time')
        greeter.say('Hello', user)
        greeter.say('Bye', user)
        console.timeEnd('time')
    } catch( error ) {
        console.error( error )
    }
} ).detach()
When using fibers, most of the code stays synchronous, but when we do have to wait, it is not the whole thread that is blocked, only the one fiber. The result is concurrent execution of synchronous-looking fibers.
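A small hypothetical illustration of that interleaving, using the same fibers/future API as above: two tasks started one after another make progress concurrently, because wait suspends only its own fiber:

var Future = require( 'fibers/future' )
var FS = Future.wrap( require( 'fs' ) )

// Both tasks look synchronous, but while one of them waits for its file,
// the other gets a chance to run: only the waiting fiber is suspended.
Future.task( () => {
    var text = FS.readFileFuture( 'config.json' ).wait()
    console.log( 'first task done, ' + text.length + ' bytes' )
} ).detach()

Future.task( () => {
    var text = FS.readFileFuture( 'package.json' ).wait()
    console.log( 'second task done, ' + text.length + ' bytes' )
} ).detach()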
Let's compare the execution time of the main task in each multitasking version on NodeJS v6.3.1:
Findings:
Let's see how our applications react to an exceptional situation. For example, suppose the config contains null instead of an object. Loading and parsing the config will go through normally, but the getName method should fail with an error. We have already made sure that the application does not crash and does not silently ignore the error, but logs it to the console. Here is what our implementations print:
TypeError: Cannot read property 'name' of null
    at Object.module.exports.getName (./user.js:13:23)
    at Object.module.exports.say (./greeter.js:2:41)
    at Object.<anonymous> (./index.js:7:13)
    at Module._compile (module.js:541:32)
    at Object.Module._extensions..js (module.js:550:10)
    at Module.load (module.js:456:32)
    at tryModuleLoad (module.js:415:12)
    at Function.Module._load (module.js:407:3)
    at Function.Module.runMain (module.js:575:10)
    at startup (node.js:160:18)
The stack trace seems to have captured a fair share of NodeJS internals, but the main thing is that the call sequence we care about, index.js:7 -> say@greeter.js:2 -> getName@user.js:13, is present, which means we can understand how the application arrived at this error.
TypeError: Cannot read property 'name' of null
    at error (./user.js:31:30)
    at fs.readFile.error (./user.js:20:16)
    at FSReqWrap.readFileAfterClose [as oncomplete] (fs.js:439:3)
The trace starts from the arrival of the file-read event. What happened before that, we will never know.
TypeError: Cannot read property 'name' of null
    at getConfig.then.config (./user.js:19:22)
A maximally minimalistic stack trace.
TypeError: Cannot read property 'name' of null
    at Object.<anonymous> (./user.js:18:33)
    at next (native)
    at onFulfilled (./node_modules/co/index.js:65:19)
Here the same promises are used with all the ensuing consequences.
TypeError: Cannot read property 'name' of null
    at Object.<anonymous> (user.js:18:12)
    at undefined.next (native)
    at step (C:\proj\async-js\user.js:1:253)
    at C:\proj\async-js\user.js:1:430
It would be strange to expect something else here.
TypeError: Cannot read property 'name' of null
    at Object.module.exports.getName (./user.js:14:23)
    at Object.module.exports.say (./greeter.js:2:41)
    at Future.task.error (./index.js:11:17)
    at ./node_modules/fibers/future.js:467:21
All that is needed and almost nothing extra.
In the debugger you will see the same picture: through synchronous and fiber-based code you can walk the stack, inspect the values of local variables, set breakpoints and step through the execution of your application. Code chopped into chains of functions, flavored with promises or wrapped in generators, on the other hand, is a real nightmare for the debugger. And if a crooked transpiler was also involved, then Mustafa Hussein himself would envy the developer's courage in debugging such code.
Objectively, by the sum of their qualities, fibers are better than the other solutions presented here. Their only drawback is that they are by no means a standard and are not even planned for implementation in browsers. But that is not so much a shortcoming of fibers as of a community that pushes promises, generators and asynchronous functions into the standards while completely ignoring much simpler and more direct solutions.
Source: https://habr.com/ru/post/307288/