Mojolicious Documentation: Lost Chapters

This article continues a series about Mojolicious, the Perl web framework; see also the first part.

The series assumes that the reader already has a passing familiarity with the framework and needs to understand details that the documentation either omits or covers too briefly or unclearly. The official documentation (in English) is ideal for a first introduction.

Asynchrony: synchronization with Mojo::IOLoop::Delay

Mojo::IOLoop::Delay provides the following for asynchronously executed callbacks:

describing sequential operations without callback "spaghetti"
passing results from the current callback(s) to the next one
shared data for callbacks that are combined into one task
synchronizing a group of callbacks
catching and handling exceptions in callbacks

Terms used:

operation - an (asynchronous) operation is usually a call to an asynchronous function, such as a timer or a URL download, which must be given a callback
step - a callback that analyzes the data received from the previous step (unless it is the first step) and either starts one or more new operations or returns the final result (if it is the last step)
task - a list of steps that must run sequentially (i.e. the next step is called only after all operations started in the previous step have completed)

An alternative to Promises

This is an alternative approach to the problem usually solved with Promise/Deferred or Future. Here is a rough comparison with the Promises/A+ specification:

Instead of the chain ->then(\&cb1)->then(\&cb2)->… a single call ->steps(\&cb1, \&cb2, …) is used.
Instead of passing the error handler as the second parameter of ->then(), it is set via ->catch(). Consequence: there can be only one error handler for all steps of a task.
The result is returned via ->pass(), but unlike ->resolve() it is in most cases called implicitly: the anonymous function generated by ->begin is given to the asynchronous operation as its callback, and that function automatically calls ->pass(), passing a slice of its own parameters (i.e. the result of the asynchronous operation) to the next step. Consequence: there is no need to write, for every asynchronous function, a callback that converts its result into ->resolve() or ->reject().
Errors are returned only through exceptions; there is no analogue of ->reject().
There is an additional step that runs at the very end, set via ->on(finish=>\&cb), which can also be reached from the error handler.
Groups of asynchronous operations are supported: if several operations are started in the current step, the next step is called once all of them have completed.
There is a user data store available to all steps of the current task.

These differences show an approach typical of Mojo: everything that can be simplified is simplified, and convenient shortcuts are provided for typical tasks.

What is left out

I will not describe how ->wait works; it is simple and clear from the official documentation.

In addition, there are synonyms/alternatives:

    # the same as Mojo::IOLoop::Delay->new->steps(@params)
    Mojo::IOLoop->delay(@params);

    # almost the same as $delay->on(error=>\&cb), except that it returns
    # $delay rather than \&cb, which is handier for call chaining
    $delay->catch(\&cb);

$delay->begin

This is the key function; Mojo::IOLoop::Delay cannot be used without it. Each call to ->begin increments the counter of running (usually asynchronous) operations and returns a reference to a new anonymous function. That function must be called exactly once when the operation completes: it decrements the counter of running operations and lets you pass the results of the operation on to the next step (which starts when the counter reaches zero).
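The counter mechanism can be pictured with a toy model in plain Perl. This is only a sketch of the idea, not the actual Mojo::IOLoop::Delay code: the names toy_begin and toy_next_step are hypothetical, and the "operations" complete synchronously instead of through an event loop.

```perl
use strict;
use warnings;

my $pending = 0;   # operations started but not yet completed
my @log;           # what happened, in order

# Hypothetical stand-in for "the next step of the task".
sub toy_next_step { push @log, 'next step' }

# Toy version of ->begin: count one more running operation and
# return the function that must be called when it completes.
sub toy_begin {
    $pending++;
    return sub {
        toy_next_step() if --$pending == 0;
    };
}

my @done = map { toy_begin() } 1 .. 3;   # three operations "started"
$done[0]->();                            # first one completes
$done[1]->();                            # second one completes
push @log, "pending=$pending";           # one operation is still running
$done[2]->();                            # last one completes: next step runs
push @log, "pending=$pending";
```

In this model the next step runs only when the third returned function is called, which is the behavior described above for the real counter.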

There are two ways to use ->begin: manually and automatically.

In the first variant, the function returned by ->begin is stored in a temporary variable and called manually when the operation completes:

    my $delay = Mojo::IOLoop->delay;
    for my $i (1 .. 10) {
        my $end = $delay->begin;
        Mojo::IOLoop->timer($i => sub {
            say 10 - $i;
            $end->();
        });
    }

In the second variant, the function returned by ->begin is used directly as the callback for the operation:

    my $delay = Mojo::IOLoop->delay;
    for my $i (1 .. 10) {
        Mojo::IOLoop->timer($i => $delay->begin);
    }

In both cases, if a next step is defined for $delay (here it is the first and only step), it is called after all 10 operations have completed:

    $delay->steps(sub { say "all timers done" });

This example has a catch: in the second variant, say 10 - $i is never executed. The timer passes no useful parameters to its callback, so the value of $i cannot be known there unless the callback closes over it, as in the first variant. And even if the timer did pass $i to the callback, it would not help much: we would only get the chance to run all ten say 10 - $i calls in the next step, which starts only after every timer has fired, so the countdown effect of one say per second would be lost.

In such rare situations you have to use the first, "manual" way of working with ->begin. In all other cases the second way is much better: it eliminates the temporary variable and the callback "spaghetti", and it makes it possible to use (more precisely, to catch) exceptions in callbacks. An exception thrown in an ordinary callback, as opposed to a step, does not reach $delay->catch; it goes to the event loop's exception handler and is ignored by default.

->begin accepts parameters, and at first glance (in the official documentation) their purpose may not be obvious. The point is that when the function returned by ->begin is not called manually (where you control which parameters it is called with) but is used directly as the callback of an operation, it is called with whatever parameters that operation passes to its callback. All of those parameters then arrive, as the result of the operation, in the parameters of the next step.

For example, $ua->get($url,\&cb) passes two parameters to the callback: ($ua,$tx). If you start downloading 3 URLs in one step, the next step receives 6 parameters (every step always receives $delay as its first parameter; why this example uses ->begin(0) is explained shortly):

    Mojo::IOLoop->delay(
        sub {
            my ($delay) = @_;
            $ua->get($url1, $delay->begin(0));
            $ua->get($url2, $delay->begin(0));
            $ua->get($url3, $delay->begin(0));
        },
        sub {
            my ($delay, $ua1,$tx1, $ua2,$tx2, $ua3,$tx3) = @_;
        },
    );

In this case all three $ua values received by the second step are the same object. Since this is a typical situation, ->begin lets you control which of the parameters passed by the operation are passed on to the next step. It takes two parameters for this, the index of the first parameter and the number of parameters, which together select the slice to pass on. By default ->begin works like ->begin(1), i.e. it passes on every parameter from the operation except the first:

    Mojo::IOLoop->delay(
        sub {
            my ($delay) = @_;
            $ua->get($url1, $delay->begin);
            $ua->get($url2, $delay->begin);
            $ua->get($url3, $delay->begin);
        },
        sub {
            my ($delay, $tx1, $tx2, $tx3) = @_;
        },
    );
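The slice selection can be illustrated with a small plain-Perl sketch. The helper toy_slice is hypothetical and only models the offset/length rule described above; the strings 'UA' and 'TX' stand in for the ($ua, $tx) pair that an operation would pass to its callback.

```perl
use strict;
use warnings;

# Toy model of the ->begin($offset, $len) rule: pick which of the
# callback's arguments are handed on to the next step.
sub toy_slice {
    my ($offset, $len, @args) = @_;
    $offset //= 1;    # default behaves like ->begin(1): drop the first argument
    return defined $len
        ? splice(@args, $offset, $len)
        : splice(@args, $offset);
}

my @cb_args = ('UA', 'TX');    # stand-ins for ($ua, $tx)

my @default = toy_slice(undef, undef, @cb_args);   # like ->begin
my @all     = toy_slice(0,     undef, @cb_args);   # like ->begin(0)
my @first   = toy_slice(0,     1,     @cb_args);   # like ->begin(0, 1)
```

Here @default holds only 'TX', @all holds both values, and @first holds only 'UA', mirroring the difference between the two download examples above.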

$delay->data

In principle, ->data is trivial: it is a hash accessible to all steps, an alternative to passing data from one step to the next through parameters.

    Mojo::IOLoop->delay(
        sub {
            my ($delay) = @_;
            $delay->data->{key} = 'value';
            ...
        },
        sub {
            my ($delay) = @_;
            say $delay->data->{key};
        },
    );

The alternative is to use a closure, which is lazier, more familiar and more readable:

    sub do_task {
        my $key;
        Mojo::IOLoop->delay(
            sub {
                $key = 'value';
                ...
            },
            sub {
                say $key;
            },
        );
    }

But an unpleasant surprise awaits here. Closures live only as long as something refers to them, and Mojo removes steps from memory as they complete. So once the last step that refers to a closed-over variable has run, that variable is destroyed as well. This has an unpleasant effect if the variable was, for example, a Mojo::UserAgent object:

    sub do_task {
        my $ua = Mojo::UserAgent->new->max_redirects(5);
        Mojo::IOLoop->delay(
            sub {
                my ($delay) = @_;
                $ua->get($url1, $delay->begin);
                $ua->get($url2, $delay->begin);
                $ua->get($url3, $delay->begin);
            },
            sub {
                my ($delay, $tx1, $tx2, $tx3) = @_;
                # every $tx arrives here with an error: the connections
                # were dropped when $ua was destroyed
            },
        );
    }

As soon as the first step has started the non-blocking downloads, it completes and is removed from memory, and the $ua variable is destroyed along with it, because no remaining step refers to it. As soon as $ua is destroyed, all of its open connections are dropped and their callbacks are called with an error in the $tx parameter.

One solution to this problem is to use ->data to guarantee that the closed-over variable lives at least as long as the whole task:
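The lifetime effect is ordinary Perl reference counting and can be reproduced without Mojolicious. In the sketch below, ToyAgent is a hypothetical stand-in for Mojo::UserAgent, and the two anonymous subs play the role of steps that are dropped as soon as they have run.

```perl
use strict;
use warnings;

my @events;    # order in which things happen

package ToyAgent {    # hypothetical stand-in for Mojo::UserAgent
    sub new     { bless {}, shift }
    sub DESTROY { push @events, 'agent destroyed' }
}

sub make_steps {
    my $ua = ToyAgent->new;
    return (
        sub { push @events, 'step 1 used ' . ref $ua },   # closes over $ua
        sub { push @events, 'step 2 ran' },                # does not
    );
}

my @steps = make_steps();

my $step = shift @steps;    # take the first step off the list...
$step->();                  # ...run it...
undef $step;                # ...and drop it, as the task does after each step

push @events, 'waiting for step 2';
$step = shift @steps;
$step->();
```

The agent is destroyed right after the first step is dropped, well before the second step runs, because the first step held the only remaining reference to it.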

    sub do_task {
        my $ua = Mojo::UserAgent->new->max_redirects(5);
        Mojo::IOLoop->delay->data(ua=>$ua)->steps(
            sub {
                my ($delay) = @_;
                $ua->get($url1, $delay->begin);
                $ua->get($url2, $delay->begin);
                $ua->get($url3, $delay->begin);
            },
            sub {
                my ($delay, $tx1, $tx2, $tx3) = @_;
                # $ua is still alive here, so the connections were not dropped
            },
        );
    }

finish

Setting a "finish" event handler is optional, but in many cases it is very convenient to define the last step not as the final entry in the list of steps but as the "finish" event handler. This gives you the following:

If an exception handler is set via ->catch and an error is not fatal, so that it still makes sense to finish the task normally by running the last step, the exception handler can hand control to the "finish" handler via ->emit("finish",@results), but it cannot hand control to an ordinary step.
If the final result is obtained at an intermediate step, delivering it to the last step means manually "threading" the finished result through every step in between. If a "finish" handler is used instead of the last step, you can call it right away via ->remaining([])->pass(@result).
- Bear in mind that if this step managed to start some operations before passing the results to "finish", the "finish" handler runs only after those operations complete, and it receives not only the @result parameters mentioned above but also everything those operations return.

ATTENTION! You can call ->emit("finish") only inside the exception handler, not in a normal step. Conversely, in a normal step the same is done via ->remaining([])->pass(@result), but that does not work in the exception handler.

$delay->pass

Very often a step starts operations conditionally: inside an if, or in a loop that may run 0 iterations. In that case the step usually has to call the following (typically at its very beginning or end):

    $delay->pass;

This command simulates starting a single operation that finishes immediately and returns an empty list as its result. Because the result is an empty list, this "start" does not affect the parameters that the next step receives.

The point is that if a step starts no operations at all, it is considered the last step (which is logical: the next step would have nothing to wait for, so there is no point in running it). Sometimes this is a suitable way to end the task, but if a "finish" handler is installed, it is called after this step and receives this step's parameters as its own, which is usually not what you want.

Complex parser example

Let's look at an example that uses almost everything described above. Suppose we need to download data from a site. First we log in ( $url_login ), then go to the page listing the records we need ( $url_list ). Some records may have a link to a details page, and the details page may link to several files attached to that record, which also need to be downloaded.

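    sub parse_site {
        my ($user, $pass) = @_;

        # the downloaded data is collected here in this format:
        # @records = (
        #   {
        #       key1 => "value1",
        #       …
        #       attaches => [ "content of file1", … ],
        #   },
        #   …
        # );
        my @records;

        # every request goes through the same $ua, because after logging
        # in with $user/$pass it is this $ua that holds the session
        my $ua = Mojo::UserAgent->new->max_redirects(5);

        # ->data keeps $ua alive until the whole task is done
        Mojo::IOLoop->delay->data(ua=>$ua)->steps(
            sub {
                $ua->post($url_login, form=>{user=>$user,pass=>$pass}, shift->begin);
            },
            sub {
                my ($delay, $tx) = @_;
                die $tx->error->{message} if $tx->error;
                # check that the login succeeded
                if (!$tx->res->dom->at('#logout')) {
                    die 'failed to login: bad user/pass';
                }
                # logged in, fetch the list of records
                $ua->get($url_list, $delay->begin);
            },
            sub {
                my ($delay, $tx) = @_;
                die $tx->error->{message} if $tx->error;
                # the page may hold no records, or none with a details link;
                # without this, finish would receive this step's parameters
                $delay->pass;
                # walk over all records
                for ($tx->res->dom('.record')->each) {
                    # parse the fields of the current record
                    my $record = {
                        key1 => $_->at('.key1')->text,
                        # …
                    };
                    # add it to the final result
                    push @records, $record;
                    # if the record links to a details page, download it
                    if (my $a = $_->at('.details a')) {
                        # The details page and then its attached files must be
                        # downloaded and stored in this particular $record, so
                        # the simplest way is a separate nested task that
                        # closes over $record. Its finish handler is the outer
                        # $delay->begin, so the outer task waits for it.
                        Mojo::IOLoop->delay(
                            sub {
                                $ua->get($a->{href}, shift->begin);
                            },
                            sub {
                                my ($delay, $tx) = @_;
                                die $tx->error->{message} if $tx->error;
                                # there may be 0 attached files
                                $delay->pass;
                                # download every attached file
                                $tx->res->dom('.file a')->each(sub {
                                    $ua->get($_->{href}, $delay->begin);
                                });
                            },
                            sub {
                                my ($delay, @tx) = @_;
                                die $_->error->{message} for grep {$_->error} @tx;
                                # store the downloaded files in the record
                                for my $tx (@tx) {
                                    # the content is on the response object
                                    push @{ $record->{attaches} }, $tx->res->body;
                                }
                                # make the finish handler receive an empty
                                # list rather than this step's @tx
                                $delay->pass;
                            },
                        )->catch(
                            sub {
                                my ($delay, $err) = @_;
                                warn $err;
                                # report the failure through finish
                                $delay->emit(finish => 'failed to get details');
                            }
                        )->on(finish => $delay->begin);
                    } ### if .details
                } ### for .record
            },
        )->catch(
            sub {
                my ($delay, $err) = @_;
                warn $err;
                # report the failure through finish
                $delay->emit(finish => 'failed to get records');
            }
        )->on(finish => sub {
                my ($delay, @err) = @_;
                if (!@err) {
                    process_records(@records);
                }
            }
        );
    }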

A slightly non-obvious point is the way errors are handled. Since no results need to be passed between steps (they accumulate in the closed-over @records), on success an empty list is passed on (via $delay->pass;), while on failure the error text is passed on. So if the last step, the finish handler, receives any parameters, an error occurred somewhere during downloading or parsing. The error itself has already been caught and handled (via warn) in the ->catch handlers; in fact, that is exactly what delivers the error text as a parameter to the finish handler.

If anyone knows how to solve such a problem more simply and/or more clearly, please write. An example of a similar solution using Promises would also be useful.

______________________

Text converted using habrahabr backend for AsciiDoc.

Source: https://habr.com/ru/post/228141/