In the previous article we examined the implementation of a two-threaded bridge_server: on one thread, incoming HTTP requests are processed asynchronously via RESTinio; on the second thread, asynchronous requests are made to delay_server via libcurl in the form of curl_multi, using the curl_multi_perform and curl_multi_wait functions.
Today we will examine another implementation of bridge_server, one that serves both incoming and outgoing HTTP requests asynchronously on the same thread pool. On the libcurl side, the curl_multi_socket_action function is used for this purpose.
This implementation took us the most time. We haven't had to rack our brains in a long while as much as we did while working through the documentation for this function and the examples of its use. At first it all looked like some kind of black magic, but then the light at the end of the tunnel appeared and the code started working. How exactly does it work? That is what we will talk about today.
What and how does bridge_server_2 do?
The bridge_server_2 considered in this article does the same thing as the bridge_server_1 considered earlier. It just does it differently. Instead of dividing the functionality between two different threads, as in bridge_server_1, where RESTinio worked on one thread and curl_multi on another, in bridge_server_2 all operations are performed on the same thread pool.
To do this, we take on the creation of the sockets that libcurl needs ourselves, and we track the moments when these sockets become ready for read/write operations.
The structure of this article
The material in this article is structured as follows:
- first, we will discuss what bridge_server_2 reuses from bridge_server_1, so as not to inflate the article by repeating what has already been explained earlier;
- then we will describe how the RESTinio part of bridge_server_2 differs from the RESTinio part of bridge_server_1;
- then we will move on to how the curl part is implemented in bridge_server_2.
So this article will not analyze the bridge_server_2 code from beginning to end. If you have not read the previous article with the analysis of bridge_server_1, it may make sense to do so, for example, so that you understand what the check_curl_op_completion() function is, what it does and how it works.
What is common between bridge_server_1 and bridge_server_2?
In bridge_server_2, the following fragments are reused (by way of copy & paste):
- configuration and parsing of command line arguments (the config_t structure and the parse_cmd_line_args() function);
- the request_info_t structure, which carries information about a received incoming request from the RESTinio part to the curl part, as well as the complete_request_processing() function, which sends a response to the received incoming HTTP request;
- the write_callback() and check_curl_op_completion() functions. This also means that the lifetime of request_info_t instances during the processing of an outgoing HTTP request is managed in the same way: a raw pointer to request_info_t is stored in the curl_easy instance, and later retrieved from there and deleted (see the sketch after this list).
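As a reminder of how this raw-pointer round-trip looks, here is a minimal sketch (only the libcurl calls are real API; the surrounding names are illustrative, the actual retrieval lives in check_curl_op_completion() from the previous article):

// The pointer is attached when the request is created...
auto info = std::make_unique<request_info_t>(/*...*/);
curl_easy_setopt(handle, CURLOPT_PRIVATE, info.get());
info.release(); // ownership goes to the curl_easy instance.

// ...and recovered when curl_multi reports the request as finished.
char * ptr = nullptr;
curl_easy_getinfo(handle, CURLINFO_PRIVATE, &ptr);
std::unique_ptr<request_info_t> finished{
    reinterpret_cast<request_info_t *>(ptr)}; // deleted on scope exit.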
Differences in the RESTinio part from bridge_server_1
In the RESTinio part of bridge_server_2 there are two main differences from bridge_server_1.
First, in bridge_server_1 a separate worker thread was dedicated to processing outgoing HTTP requests. Therefore, to exchange information between the RESTinio thread and the curl thread, a special thread-safe container was used, into which request_info_t instances were placed.
In bridge_server_2, the RESTinio and curl parts of the application share a common working context (a pool of worker threads), so no separate container is needed to transfer information between them. Instead, the curl part is represented by an object of the curl_multi_processor_t class, which is fed all received incoming HTTP requests. Accordingly, the handler() function, the actual handler of incoming HTTP requests, now receives a reference to curl_multi_processor_t:
// The actual handler for incoming HTTP requests.
restinio::request_handling_status_t handler(
    const config_t & config,
    curl_multi_processor_t & req_processor,
    restinio::request_handle_t req)
{
    if(restinio::http_method_get() == req->header().method() &&
        "/data" == req->header().path())
    {
        // Parse the query string parameters.
        const auto qp = restinio::parse_query(req->header().query());

        // Form the URL for the outgoing request which will be
        // performed via curl_multi.
        auto url = fmt::format("http://{}:{}/{}/{}/{}",
            config.target_address_,
            config.target_port_,
            qp["year"], qp["month"], qp["day"]);

        auto info = std::make_unique<request_info_t>(
            std::move(url), std::move(req));

        req_processor.perform_request(std::move(info));

        // The request has been accepted for processing; the actual
        // response will be generated later.
        return restinio::request_accepted();
    }

    // All other requests are rejected.
    return restinio::request_rejected();
}
Secondly, RESTinio now runs on a pool of worker threads. The code in the run_server() function has changed accordingly: instead of on_this_thread, on_thread_pool is now used:
// Helper for starting the server with the given traits.
template<typename Server_Traits, typename Handler>
void run_server(
    const config_t & config,
    restinio::asio_ns::io_context & ioctx,
    Handler && handler)
{
    restinio::run(
        ioctx,
        restinio::on_thread_pool<Server_Traits>(
                std::thread::hardware_concurrency())
            .address(config.address_)
            .port(config.port_)
            .request_handler(std::forward<Handler>(handler)));
}
Well, the main() function has also changed a bit, although not in any fundamental way.
The code of the main() function from bridge_server_2:

int main(int argc, char ** argv) {
    try {
        const auto cfg = parse_cmd_line_args(argc, argv);
        if(cfg.help_requested_)
            return 1;

        // Initialize curl.
        curl_global_init(CURL_GLOBAL_ALL);
        auto curl_global_deinitializer =
            cpp_util_3::at_scope_exit([]{ curl_global_cleanup(); });

        // The Asio io_context is created manually, since it is needed
        // both by the curl_multi_processor and by the HTTP server.
        restinio::asio_ns::io_context ioctx;

        // The curl part of the application.
        curl_multi_processor_t curl_multi{ioctx};

        // The actual handler for incoming HTTP requests.
        auto actual_handler = [&cfg, &curl_multi](auto req) {
                return handler(cfg.config_, curl_multi, std::move(req));
            };

        // Launch the HTTP server. Depending on whether tracing is
        // requested, different server traits are used.
        if(cfg.config_.tracing_) {
            // Traits for a server that traces its actions.
            struct traceable_server_traits_t : public restinio::default_traits_t {
                // Define the logger to be used by the server.
                using logger_t = restinio::shared_ostream_logger_t;
            };
            // Run the server with tracing.
            run_server<traceable_server_traits_t>(
                cfg.config_, ioctx, std::move(actual_handler));
        }
        else {
            // Run the server without tracing.
            run_server<restinio::default_traits_t>(
                cfg.config_, ioctx, std::move(actual_handler));
        }

        // If we get here, the server finished normally.
    }
    catch( const std::exception & ex ) {
        std::cerr << "Error: " << ex.what() << std::endl;
        return 2;
    }

    return 0;
}
Implementing the curl part of bridge_server_2
Before we proceed to the analysis of the curl part, we need to repeat the disclaimer from the previous article, which is even more relevant in the case of bridge_server_2: in order to simplify and shorten the code of the demo applications as much as possible, we did not do any error handling at all. If we properly checked the return codes of the curl functions, the code would swell to three times its size, significantly losing in clarity while gaining nothing in functionality. So in our demonstration we simply expect that all calls to libcurl succeed. This is a conscious decision for this particular experiment, but we would never do that in real production code.
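Just to show what we are omitting, here is a minimal sketch of what proper checking of a curl_multi call might look like (our illustration, not code from bridge_server_2):

// Every curl_multi_* call returns a CURLMcode that ought to be checked.
const CURLMcode rc = curl_multi_add_handle(curlm_, handle);
if(CURLM_OK != rc)
    throw std::runtime_error{
        std::string{"curl_multi_add_handle failed: "} +
        curl_multi_strerror(rc)};

The general idea of the curl part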
The curl part of bridge_server_2 is built around the curl_multi_socket_action function. And, we honestly admit, all of this looks obscure and leaves a feeling of some kind of magic :(
Perhaps the cause is the not-very-thorough documentation for the curl_multi_socket_action function itself on the official website. After reading it, many questions remain. Matters are confused even further by the stock libcurl example called asiohiper.cpp, which is supposed to demonstrate exactly how Asio and curl_multi can be integrated via curl_multi_socket_action, but which is written in such a way that you cannot figure it out with a sober head. If someone can read the source code of this example like an open book, we frankly envy you :)
On top of that, we wanted to put our demonstration together as quickly and simply as possible, so we simply could not afford to spend extra hours on a detailed investigation of curl_multi_socket_action. As a result, the variant we arrived at does work. But we are not sure that we did everything completely correctly and did not make a mistake somewhere. There is surely room to do better, and anyone who builds on our solution in their own code should keep this in mind.
So, here is the understanding of how to use curl_multi_socket_action that we arrived at after poring over the official information, the asiohiper example and fragmentary information from the Internet (the callback signatures involved are sketched right after this list):
- for the curl_multi instance, you need to set the CURLMOPT_TIMERFUNCTION property. This is a callback provided by us, which libcurl will call when it needs timeouts related to request processing to be tracked. Accordingly, we have to handle the timers ourselves;
- for the curl_multi instance, you need to set the CURLMOPT_SOCKETFUNCTION property. This is another callback provided by us, which libcurl will call (possibly several times) from inside curl_multi_socket_action;
- we have to somehow monitor the status of the I/O operations that are performed on the sockets created by libcurl. When a socket becomes ready for reading and/or writing, we have to call curl_multi_socket_action for that socket. Here the question arises: how do we find out exactly which sockets libcurl creates while processing our requests? This very good question, and the answer to it, are discussed below;
- periodically we have to call the curl_multi_socket_action function with the CURL_SOCKET_TIMEOUT parameter. This call makes libcurl check whether timeouts have expired for any pending requests. The time at which we have to call curl_multi_socket_action with CURL_SOCKET_TIMEOUT is set for us by libcurl itself, via the callback registered through the CURLMOPT_TIMERFUNCTION property;
- periodically we have to call the curl_multi_info_read function in order to determine which HTTP requests have completed (here we simply reuse check_curl_op_completion() from bridge_server_1).
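For reference, these are the callback prototypes that libcurl expects for the two CURLMOPT_* properties just mentioned (the signatures follow the libcurl documentation; the function names are ours):

// CURLMOPT_SOCKETFUNCTION: called from inside curl_multi_socket_action
// to tell us which operations to watch for on socket s.
// 'what' is one of CURL_POLL_IN / CURL_POLL_OUT / CURL_POLL_INOUT /
// CURL_POLL_REMOVE.
int socket_function(CURL * easy, curl_socket_t s, int what,
    void * userp,    // the value from CURLMOPT_SOCKETDATA
    void * socketp); // a per-socket value, if one was assigned

// CURLMOPT_TIMERFUNCTION: called when libcurl wants to change the
// timeout after which curl_multi_socket_action(CURL_SOCKET_TIMEOUT)
// must be invoked. timeout_ms == -1 means "cancel the timer".
int timer_function(CURLM * multi, long timeout_ms, void * userp);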
Does that sound unclear? Well, that's how it is, so bear with us. There will be a bit more obscurity now, but then we will move on to parsing the code and, hopefully, once we look at the code everything will fall into place.
But first we need to close one important question: how do we find out exactly which sockets libcurl creates to serve our requests? Indeed, without these sockets we cannot track their readiness for read/write operations.
Putting libcurl onto our own sockets
A libcurl feature that is extremely important in the context of our story is that there are separate curl_easy instances, and there are separate sockets that libcurl creates to serve the requests initiated through those curl_easy instances. Moreover, curl_easy instances and sockets are not related 1:1 at all. I.e., several curl_easy instances may well be served by the same socket if all the requests go to the same server (especially if keep-alive is also in play).
With curl_multi things get even more interesting. We push a bunch of curl_easy instances into one curl_multi instance, and curl_multi decides for itself at what moment and how many sockets it needs to create in order to serve the HTTP requests living inside it. And we need to know when these sockets are created, we need to get their descriptors, and we have to push those descriptors into some event-loop mechanism of our own (be it select, epoll, kqueue or IOCP).
Learning a socket descriptor is basically easy: libcurl itself passes it to us in the callback we registered through the CURLMOPT_SOCKETFUNCTION property. In addition, when calling this callback, libcurl tells us which operations on the socket we need to monitor (the values CURL_POLL_IN, CURL_POLL_OUT, CURL_POLL_INOUT and CURL_POLL_REMOVE are used for this). If we ran the event loop ourselves, that would be enough for us.
But the problem is that the real event loop lives inside Asio, and we need some way to make the sockets created by libcurl friends with our io_context instance. How do we do that?
We borrowed the solution from the asiohiper example; this was probably its most understandable and obvious part. The idea is to create the sockets ourselves, as asio::ip::tcp::socket objects. This way we can make io_context monitor their readiness for reading/writing, while libcurl gets the real descriptors of our sockets so that it can read and write.
It is done like this: for each curl_easy we additionally set the CURLOPT_OPENSOCKETFUNCTION and CURLOPT_CLOSESOCKETFUNCTION properties. These are two callbacks. The first is called by libcurl when it wants to create a new socket, and this callback must return the descriptor of the new socket. The second is called when libcurl no longer needs a socket and wants to close it. Accordingly, we give libcurl callbacks that create and destroy objects of type asio::ip::tcp::socket.
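The trick that makes this possible is that an Asio socket exposes its native descriptor, which is exactly what libcurl needs. A minimal sketch of the idea (outside of bridge_server_2's classes):

// We own the socket as an Asio object and can async_wait() on it...
restinio::asio_ns::ip::tcp::socket socket{
    ioctx, restinio::asio_ns::ip::tcp::v4()};

// ...while libcurl receives the raw descriptor and performs the
// actual reads and writes on it.
curl_socket_t for_libcurl = socket.native_handle();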
Parsing the curl part code
In addition to the functions already familiar to us and reused from bridge_server_1, the curl part of bridge_server_2 is represented by the two classes described below.
Class active_socket_t
The active_socket_t class is auxiliary; we need it because we create the sockets for libcurl ourselves, and the created sockets have to be stored somewhere. That is exactly what active_socket_t helps us do. Here is its definition:
// An actively used socket.
class active_socket_t final {
public:
    using status_t = std::int_fast8_t;

    static constexpr status_t poll_in = 1u;
    static constexpr status_t poll_out = 2u;

private:
    restinio::asio_ns::ip::tcp::socket socket_;
    status_t status_{0};

public:
    active_socket_t(restinio::asio_ns::io_service & io_service)
        : socket_{io_service, restinio::asio_ns::ip::tcp::v4()}
    {}

    auto & socket() noexcept { return socket_; }
    auto handle() noexcept { return socket_.native_handle(); }

    void clear_status() noexcept { status_ = 0; }
    auto status() noexcept { return status_; }
    void update_status( status_t flag ) noexcept { status_ |= flag; }
};
The active_socket_t class encapsulates a socket plus a set of flags that define which operations we want to monitor on this socket. For example, if we need to wait for read readiness, the active_socket_t::poll_in flag is set in the status. The statuses are updated inside the callbacks invoked while working with libcurl.
The very presence of active_socket_t with a status inside is the main simplification of our implementation compared to the asiohiper code. There, a separately allocated int stores the flags for each socket. That means extra overhead, an extra headache (this int must be freed), and extra implementation complexity. The latter is especially important, since the hardest part of asiohiper is figuring out which flag applies to the next operation on the socket, and which are the old flags from that extra int.
Class curl_multi_processor_t
The curl_multi_processor_t class is the central element of the curl_multi-based implementation. This is where all the magic related to serving the curl_multi instance and calling curl_multi_socket_action happens, along with almost all of the associated callbacks.
To begin with, here is the complete definition of the class; then we will go through all its parts in more detail. So, this is how the class looks:
class curl_multi_processor_t {
public:
    curl_multi_processor_t(restinio::asio_ns::io_context & ioctx);
    ~curl_multi_processor_t();

    // This class is neither Copyable nor Moveable.
    curl_multi_processor_t(const curl_multi_processor_t &) = delete;
    curl_multi_processor_t(curl_multi_processor_t &&) = delete;

    // The only public method: accepts the next request for processing.
    void perform_request(std::unique_ptr<request_info_t> info);

private:
    // The curl_multi instance that serves all the requests.
    CURLM * curlm_;

    // The Asio context on which everything runs.
    restinio::asio_ns::io_context & ioctx_;

    // The strand that protects us from parallel handler invocations.
    restinio::asio_ns::strand<restinio::asio_ns::executor> strand_{
        ioctx_.get_executor()};

    // The timer that serves the timer_function requests.
    restinio::asio_ns::steady_timer timer_{ioctx_};

    // The dictionary of the sockets we created and still own.
    std::unordered_map<curl_socket_t, std::unique_ptr<active_socket_t>>
        active_sockets_;

    // A helper that hides reinterpret_cast.
    static auto cast_to(void * ptr) {
        return reinterpret_cast<curl_multi_processor_t *>(ptr);
    }

    // The callback for CURLMOPT_SOCKETFUNCTION.
    static int socket_function(
        CURL *, curl_socket_t s, int what, void * userp, void *);

    // The callback for CURLMOPT_TIMERFUNCTION.
    static int timer_function(CURLM *, long timeout_ms, void * userp);

    // Checks expired timeouts.
    void check_timeouts();

    // The handler that Asio calls when a socket becomes ready for
    // an operation, or when an error occurs on the socket.
    void event_cb(
        curl_socket_t socket,
        int what,
        const restinio::asio_ns::error_code & ec);

    // The callback for CURLOPT_OPENSOCKETFUNCTION.
    static curl_socket_t open_socket_function(
        void * cbp, curlsocktype type, curl_sockaddr * addr);

    // The callback for CURLOPT_CLOSESOCKETFUNCTION.
    static int close_socket_function(void * cbp, curl_socket_t socket);

    // Helpers that ask Asio to wait for read/write readiness
    // of a socket.
    void schedule_wait_read_for(active_socket_t & act_socket);
    void schedule_wait_write_for(active_socket_t & act_socket);
};
And that's how it all works ...
Class data of curl_multi_processor_t
There are several members inside curl_multi_processor_t without which we cannot serve requests. They are:
- curlm_: the curl_multi instance into which individual curl_easy instances for outgoing requests are placed;
- ioctx_: the Asio context on which all the work is performed;
- strand_: a special Asio object that acts as a synchronization primitive; it does not allow Asio to run the event handlers of the curl_multi_processor_t class in parallel on several threads;
- timer_: the Asio timer we use to track the timeouts libcurl tells us about;
- active_sockets_: the dictionary of all created active_socket_t objects. The key in this dictionary is a socket descriptor. We need this dictionary because in some of the callbacks libcurl passes only a socket descriptor, and we have to find the corresponding active_socket_t object by that descriptor, so we look it up in this dictionary.
Constructor and destructor
In the constructor we need to create a curl_multi instance and configure it. In the destructor, accordingly, we need to destroy it:
curl_multi_processor_t::curl_multi_processor_t(
    restinio::asio_ns::io_context & ioctx)
    : curlm_{curl_multi_init()}
    , ioctx_{ioctx}
{
    // Set up our socket_function callback for the curl_multi instance.
    curl_multi_setopt(curlm_, CURLMOPT_SOCKETFUNCTION,
        &curl_multi_processor_t::socket_function);
    curl_multi_setopt(curlm_, CURLMOPT_SOCKETDATA, this);

    // Set up our timer_function callback as well.
    curl_multi_setopt(curlm_, CURLMOPT_TIMERFUNCTION,
        &curl_multi_processor_t::timer_function);
    curl_multi_setopt(curlm_, CURLMOPT_TIMERDATA, this);
}

curl_multi_processor_t::~curl_multi_processor_t() {
    curl_multi_cleanup(curlm_);
}
Since libcurl is a pure C library, we can only use static class methods as callbacks. And in order to access curl_multi_processor_t data from a static method, we pass this as an additional parameter to the corresponding callbacks. That is what the CURLMOPT_SOCKETDATA and CURLMOPT_TIMERDATA properties are for.
The only public method is perform_request
The curl_multi_processor_t class has only one public method, perform_request(), whose purpose is to let the RESTinio part hand a received incoming request over to the curl part for processing.
Here is its implementation:
void curl_multi_processor_t::perform_request(
    std::unique_ptr<request_info_t> info)
{
    // All actual work with curl_multi is done inside a callback
    // scheduled through Asio.
    restinio::asio_ns::post(strand_,
        [this, info = std::move(info)]() mutable {
            // Create a new curl_easy instance for the next
            // outgoing request.
            auto handle = curl_easy_init();

            // Tune the curl_easy instance: the URL and the write function.
            curl_easy_setopt(handle, CURLOPT_URL, info->url_.c_str());
            curl_easy_setopt(handle, CURLOPT_PRIVATE, info.get());
            curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, write_callback);
            curl_easy_setopt(handle, CURLOPT_WRITEDATA, info.get());

            // The sockets for this request must be created by us,
            // so set up the open-socket callback.
            curl_easy_setopt(handle, CURLOPT_OPENSOCKETFUNCTION,
                &curl_multi_processor_t::open_socket_function);
            curl_easy_setopt(handle, CURLOPT_OPENSOCKETDATA, this);

            // The sockets must also be closed by us.
            curl_easy_setopt(handle, CURLOPT_CLOSESOCKETFUNCTION,
                &curl_multi_processor_t::close_socket_function);
            curl_easy_setopt(handle, CURLOPT_CLOSESOCKETDATA, this);

            // The curl_easy instance is ready, hand it over to curl_multi.
            curl_multi_add_handle(curlm_, handle);

            // The unique_ptr must no longer own the request info.
            // It will be deleted when the request processing finishes.
            info.release();
        });
}
In general, this code is very similar to what we have already seen in bridge_server_1. But there are several important differences.
First, all the actions on a new request are gathered into a lambda, which is sent for processing via asio::post(). This is necessary because curl_multi is not a thread-safe object, and we must avoid working with it from different threads at the same time. It is exactly the fact that we post the lambda through the strand that protects us: if some part of curl_multi_processor_t is already running on another thread at the moment, our lambda will be executed only after that other thread has finished working with curl_multi_processor_t (a standalone illustration of this guarantee follows).
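The guarantee a strand gives can be shown with a tiny standalone program (our illustration using plain standalone Asio; bridge_server_2 itself goes through the restinio::asio_ns alias):

#include <asio.hpp>

#include <iostream>
#include <thread>
#include <vector>

int main() {
    asio::io_context ioctx;

    // Handlers posted through this strand never run in parallel,
    // even when the io_context is served by a pool of threads.
    asio::strand<asio::io_context::executor_type> strand{
        ioctx.get_executor()};

    int counter = 0; // deliberately not atomic: the strand is the protection.
    for(int i = 0; i != 100000; ++i)
        asio::post(strand, [&counter]{ ++counter; });

    // Serve the io_context on a pool of worker threads.
    std::vector<std::thread> pool;
    for(unsigned i = 0; i != std::thread::hardware_concurrency(); ++i)
        pool.emplace_back([&ioctx]{ ioctx.run(); });
    for(auto & t : pool)
        t.join();

    std::cout << counter << std::endl; // Always prints 100000.
}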
Secondly, we set the CURLOPT_OPENSOCKETFUNCTION and CURLOPT_CLOSESOCKETFUNCTION properties discussed above. It is through these properties that we can create our own sockets and push them into libcurl.
First tricky callback: socket_function
So we have reached the first tricky callback used in bridge_server_2: the static method socket_function, which we register through the CURLMOPT_SOCKETFUNCTION property in the curl_multi_processor_t constructor. This callback is called (possibly several times in a row) from inside curl_multi_socket_action so that we can make our event loop track the readiness of sockets for particular I/O operations.
Here is the socket_function code:
int curl_multi_processor_t::socket_function(
    CURL *, curl_socket_t s, int what, void * userp, void *)
{
    auto self = cast_to(userp);

    // Handle the socket only if it is still in our dictionary.
    // It may have been removed from there already.
    const auto it = self->active_sockets_.find(s);
    if(it != self->active_sockets_.end()) {
        auto & act_socket = *(it->second);

        // The old status flags are dropped.
        // New ones are computed from the what argument.
        act_socket.clear_status();

        // Determine what we have to wait for.
        if(CURL_POLL_IN == what || CURL_POLL_INOUT == what) {
            // Readiness for reading.
            act_socket.update_status(active_socket_t::poll_in);
            self->schedule_wait_read_for(act_socket);
        }
        if(CURL_POLL_OUT == what || CURL_POLL_INOUT == what) {
            // Readiness for writing.
            act_socket.update_status(active_socket_t::poll_out);
            self->schedule_wait_write_for(act_socket);
        }
    }

    return 0;
}
A few points need to be noted here.
The first point: we are given a socket descriptor with which we have to do something. But in principle there is no guarantee that this socket is still in our dictionary of active sockets. Therefore, we perform operations on the socket only if it is found in the dictionary.
The second point: we first forcibly reset all the flags for the active socket, and only then compute their new values and save the new flags. We will need them later inside event_cb().
And the third point: depending on what we have to wait for (read, write or both), we ask Asio to start tracking the readiness of our socket for the corresponding operation. This is done by calling the helper methods:
void curl_multi_processor_t::schedule_wait_read_for(
    active_socket_t & act_socket)
{
    act_socket.socket().async_wait(
        restinio::asio_ns::ip::tcp::socket::wait_read,
        restinio::asio_ns::bind_executor(strand_,
            [this, s = act_socket.handle()]( const auto & ec ){
                this->event_cb(s, CURL_POLL_IN, ec);
            }));
}

void curl_multi_processor_t::schedule_wait_write_for(
    active_socket_t & act_socket)
{
    act_socket.socket().async_wait(
        restinio::asio_ns::ip::tcp::socket::wait_write,
        restinio::asio_ns::bind_executor(strand_,
            [this, s = act_socket.handle()]( const auto & ec ){
                this->event_cb(s, CURL_POLL_OUT, ec);
            }));
}
Here, in the schedule_wait_*_for methods, we specify the lambda functions that Asio will call when the socket becomes ready for the corresponding operation. This is where we see the first call to event_cb, another tricky callback that will be discussed below.
A not very tricky callback: timer_function
In the curl_multi_processor_t constructor, one more callback is set for the curl_multi instance: the static method timer_function of the following form:

int curl_multi_processor_t::timer_function(
    CURLM *, long timeout_ms, void * userp)
{
    auto self = cast_to(userp);

    if(timeout_ms < 0) {
        // The current timer must be cancelled.
        self->timer_.cancel();
    }
    else if(0 == timeout_ms) {
        // The timeouts must be checked as soon as possible.
        self->check_timeouts();
    }
    else {
        // The timer must be rescheduled for a new time.
        self->timer_.cancel();
        self->timer_.expires_after(std::chrono::milliseconds{timeout_ms});
        self->timer_.async_wait(
            restinio::asio_ns::bind_executor(self->strand_,
                [self](const auto & ec) {
                    if( !ec )
                        self->check_timeouts();
                }));
    }

    return 0;
}
Everything is more or less simple with it. During request processing, libcurl needs to keep track of the timeouts of its operations. But since libcurl in our case does not have its own event loop, it needs our help. For this we set the timer callback, and libcurl calls it whenever it needs to control the timing of its operations.

The actions performed by the timer callback depend entirely on the value of the timeout_ms argument. If it is -1, the current timer must be cancelled. If the value is 0, curl_multi_socket_action must be called with the CURL_SOCKET_TIMEOUT parameter as soon as possible (we make this call right away). If timeout_ms is greater than 0, the timer must be rescheduled for the new time, and when that time comes, curl_multi_socket_action must be called. That is exactly what we do in the lambda passed to the Asio timer.

Let's also look at the helper function that is called when the timer fires (its body was truncated in the original text; here it is reconstructed from the description that follows):

void curl_multi_processor_t::check_timeouts() {
    int running_handles_count = 0;

    // Let libcurl check the timeouts of its operations.
    curl_multi_socket_action(curlm_,
        CURL_SOCKET_TIMEOUT, 0, &running_handles_count);

    // Some requests may have completed by now, so check for that.
    check_curl_op_completion(curlm_);
}

I.e., we call curl_multi_socket_action with CURL_SOCKET_TIMEOUT and then check_curl_op_completion(), since some of the requests may have finished by that point.
The trickiest callback: event_cb
This callback is not registered with libcurl at all. event_cb() is the handler we give to Asio, and Asio calls it when a socket becomes ready for reading/writing, or when some error occurs on the socket.
The event_cb code is not small, so we show it first and give detailed explanations afterwards:
void curl_multi_processor_t::event_cb(
    curl_socket_t socket,
    int what,
    const restinio::asio_ns::error_code & ec)
{
    // Handle the socket only if it is still in our dictionary.
    // It may have been removed from there already.
    auto it = active_sockets_.find(socket);
    if(it != active_sockets_.end()) {
        if( ec )
            what = CURL_CSELECT_ERR;

        int running_handles_count = 0;
        // Let curl process this socket.
        curl_multi_socket_action(curlm_, socket, what,
            &running_handles_count );

        // Check whether some requests have completed.
        check_curl_op_completion(curlm_);

        if(running_handles_count <= 0)
            // There is no more work. The timer is not needed.
            timer_.cancel();

        // The socket must be looked up again, since the dictionary
        // could have been modified inside curl_multi_socket_action.
        it = active_sockets_.find(socket);
        if(!ec && it != active_sockets_.end()) {
            // The socket is still in use.
            auto & act_socket = *(it->second);

            // Renew the waits for the operations that are still
            // expected on the socket.
            if(CURL_POLL_IN == what &&
                0 != (active_socket_t::poll_in & act_socket.status())) {
                schedule_wait_read_for(act_socket);
            }
            if(CURL_POLL_OUT == what &&
                0 != (active_socket_t::poll_out & act_socket.status())) {
                schedule_wait_write_for(act_socket);
            }
        }
    }
}
Now let's sort out what happens inside event_cb. First of all, as in socket_function, we check that the socket is still in our dictionary of active sockets: by the time the handler runs, libcurl may have already closed this socket. If the socket is gone, there is nothing to do.

Next we prepare the what value for curl_multi_socket_action. When the wait was scheduled, we recorded which operation we were waiting for; that is why event_cb receives CURL_POLL_IN or CURL_POLL_OUT from the schedule_wait_*_for lambdas. But if Asio reported an error (a non-empty ec parameter), what is replaced with CURL_CSELECT_ERR.

Then we call curl_multi_socket_action so that libcurl can perform all the operations it needs on the socket. And after curl_multi_socket_action returns, we traditionally check whether any requests have completed. One more detail: if after curl_multi_socket_action we find that there are no more operations in progress, we cancel the timer; it is not needed anymore (although, strictly speaking, this is not obligatory).

And here some outright magic begins... :) Inside curl_multi_socket_action() and check_curl_op_completion(), the curl_easy instances of completed requests can be extracted and destroyed. How is that possible? The point is that all the real work happens right inside libcurl: while we sit inside a call to curl_multi_socket_action(), libcurl itself reads from and writes to the sockets. We merely tell it, with Asio's help, which sockets are ready for which operations.

This has consequences that need to be kept in mind. In particular, after the calls to curl_multi_socket_action() and check_curl_op_completion(), we look the socket up in active_sockets_ again, because it may no longer be there: inside curl_multi_socket_action(), libcurl may invoke our callbacks, and those callbacks (close_socket_function among them) can remove the socket from active_sockets_.

Finally, after curl_multi_socket_action() returns, we may need to renew the wait on the socket; that is what the calls to schedule_wait_*_for at the end are about. We renew the wait only if libcurl, from inside curl_multi_socket_action() (via socket_function with CURL_POLL_IN or CURL_POLL_OUT), asked to keep watching the socket for the same kind of operation. This is exactly why socket_function saves the status flags and event_cb examines them.
The last two callbacks: open_socket_function and close_socket_function
It remains to parse the two callbacks that are used to create and destroy sockets for libcurl. These are the callbacks registered in every new curl_easy instance through the CURLOPT_OPENSOCKETFUNCTION and CURLOPT_CLOSESOCKETFUNCTION properties. Here is the code of these functions:

curl_socket_t curl_multi_processor_t::open_socket_function(
    void * cbp, curlsocktype type, curl_sockaddr * addr)
{
    auto self = cast_to(cbp);
    curl_socket_t sockfd = CURL_SOCKET_BAD;

    // Only IPv4 connections are supported.
    if(CURLSOCKTYPE_IPCXN == type && AF_INET == addr->family) {
        // Create a new socket and register it in the dictionary
        // of active sockets.
        auto act_socket = std::make_unique<active_socket_t>(self->ioctx_);
        const auto native_handle = act_socket->handle();

        // The socket goes into the dictionary.
        self->active_sockets_.emplace(
            native_handle, std::move(act_socket) );

        sockfd = native_handle;
    }

    return sockfd;
}

int curl_multi_processor_t::close_socket_function(
    void * cbp, curl_socket_t socket)
{
    auto self = cast_to(cbp);

    // Just remove the socket from the dictionary. The socket itself
    // is closed by the destructor of active_socket_t.
    self->active_sockets_.erase(socket);

    return 0;
}
We hope that this time there will be no difficulties with understanding. In close_socket_function, at least, everything is more than trivial.

open_socket_function, however, deserves an explanation. The outline of this function was taken from the already mentioned asiohiper.cpp example. It had a restriction to IPv4 only, and we kept that restriction in our code as well.

However, adding IPv6 support should not be difficult. To do this we would slightly modify the active_socket_t constructor:

active_socket_t(
    restinio::asio_ns::io_service & io_service,
    restinio::asio_ns::ip::tcp tcp)
    : socket_{io_service, tcp}
{}
And rework the corresponding part of open_socket_function, for example, like this:

if(CURLSOCKTYPE_IPCXN == type &&
    (AF_INET == addr->family || AF_INET6 == addr->family)) {
    // Create a new socket and register it in the dictionary
    // of active sockets.
    auto act_socket = std::make_unique<active_socket_t>(self->ioctx_,
        AF_INET == addr->family ?
            restinio::asio_ns::ip::tcp::v4() :
            restinio::asio_ns::ip::tcp::v6());
So how does it all work?
Now let's attempt the most difficult thing: to combine the general description of the principles from the beginning of the article with the explanations made while parsing the code, and to try once more to explain how this whole kitchen works.

So, there is a pool of worker threads. An Asio io_context runs on this pool, and this io_context serves both RESTinio and libcurl.

When RESTinio receives an incoming request, the request is converted into a request_info_t and handed over to curl_multi_processor_t. There, a new curl_easy instance is created for this request_info_t, and this curl_easy is added to the single curl_multi instance.

libcurl asks us to create a new socket for the curl_easy. We create it in open_socket_function and store it in the dictionary of active sockets.

Next, libcurl calls socket_function for the new socket, telling us which operation it wants to wait for on it. At this point CURL_POLL_OUT or CURL_POLL_INOUT is passed to socket_function. We update the socket's status and make Asio wait for the socket's readiness for writing.

When Asio detects that the socket is ready for writing, event_cb is called, in which we invoke curl_multi_socket_action. There, inside curl_multi_socket_action, libcurl can send the HTTP request to the remote server and call socket_function in order to ask us to wait for the socket's readiness for reading (this time CURL_POLL_IN or CURL_POLL_INOUT is passed to socket_function). We again update the socket's status and start waiting for read readiness. Note that at this moment we are still inside the curl_multi_socket_action that was called from event_cb :)

When Asio detects that the socket is ready for reading, event_cb is called again, in which we again invoke curl_multi_socket_action...

And all of this continues until we find out that the HTTP request has completed. Then we form the response to the incoming HTTP request, remove the corresponding curl_easy from curl_multi, and destroy both the curl_easy and the request_info_t.

And what about the socket? The socket continues to live inside libcurl. It can be reused by libcurl to handle other requests. And it will be deleted sometime later, if that is required at all.

Conclusion
That, in general, is all we wanted to tell. Thanks to everyone who managed to get through all three articles; we hope you found something interesting for yourself. The demos described in this series can be found in this repository.

The integration with pure C code was, let's put it this way, quite an experience. It confirmed the popular wisdom: if you want to write in C without pain, write in C++ ;) And one can only regret that there is no native C++ analogue of libcurl. For with all due respect to libcurl, it must be noted that its use requires very serious attention to detail.