In the previous article, we began to talk about how to implement asynchronous processing of incoming HTTP requests that requires performing asynchronous outgoing HTTP requests. We looked at a simulation of a slow third-party server, implemented in C++ with RESTinio, which takes a long time to respond to HTTP requests. Now we will talk about how asynchronous outgoing HTTP requests to this server can be performed using curl_multi_perform.
A few words about how to use curl_multi
The libcurl library is widely known in the world of C and C++. But it is probably best known for its so-called curl_easy interface. Using curl_easy is simple: first call curl_easy_init(), then call curl_easy_setopt() several times, then call curl_easy_perform() once. And that is, in general, all.
In the context of our story, the bad thing about curl_easy is that it is a synchronous interface. I.e. every call to curl_easy_perform() blocks the calling worker thread until the request completes. That categorically does not suit us, since we do not want to block our worker threads while the slow third-party server deigns to respond. What we need from libcurl is asynchronous handling of HTTP requests.
And libcurl does allow you to work with HTTP requests asynchronously, through the so-called curl_multi interface. When using curl_multi, the programmer still calls curl_easy_init() and curl_easy_setopt() to prepare each HTTP request, but does not call curl_easy_perform(). Instead, the user creates a curl_multi instance via curl_multi_init(), then adds the prepared curl_easy instances to this curl_multi instance via curl_multi_add_handle() and ...
And here curl_multi gives the programmer a choice: to drive the requests either through calls to curl_multi_perform(), or through calls to curl_multi_socket_action(), in both cases followed by
curl_multi_info_read() to determine which requests have finished.
We will show the use of both approaches. This article discusses working with curl_multi_perform; the final article in the series will deal with curl_multi_socket_action.
What will be discussed today?
In the previous article, we described a small demonstration consisting of delay_server and several bridge_servers, and went through the implementation of delay_server in detail. Today we will talk about bridge_server_1, which performs its requests to delay_server via curl_multi_perform.
bridge_server_1
What does bridge_server_1 do?
bridge_server_1 accepts HTTP GET requests for URLs like /data?year=YYYY&month=MM&day=DD. Each received request is transformed into an HTTP GET request to delay_server. When a response comes from delay_server, it is transformed accordingly into a response to the original HTTP GET request.
If you start delay_server first:
delay_server -p 4040
then run bridge_server_1:
bridge_server_1 -p 8080 -P 4040
and then query bridge_server_1, you can get something like the following:
curl -4 -v "http://localhost:8080/data?year=2018&month=02&day=25"
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /data?year=2018&month=02&day=25 HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Connection: keep-alive
< Content-Length: 111
< Server: RESTinio hello world server
< Date: Sat, 24 Feb 2018 10:15:41 GMT
< Content-Type: text/plain; charset=utf-8
<
Request processed.
Path: /data
Query: year=2018&month=02&day=25
Response:
===
Hello world!
Pause: 4376ms.
===
* Connection #0 to host localhost left intact
bridge_server_1 takes the values of the year, month and day parameters from the URL and passes them unchanged to delay_server. Therefore, if the value of one of the parameters is malformed, bridge_server_1 will pass this incorrect value to delay_server and the consequences will be visible in the response to the initial request:
curl -4 -v "http://localhost:8080/data?year=2018&month=Feb&day=25"
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /data?year=2018&month=Feb&day=25 HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Connection: keep-alive
< Content-Length: 81
< Server: RESTinio hello world server
< Date: Sat, 24 Feb 2018 10:19:55 GMT
< Content-Type: text/plain; charset=utf-8
<
Request failed.
Path: /data
Query: year=2018&month=Feb&day=25
Response code: 404
* Connection #0 to host localhost left intact
bridge_server_1 accepts only HTTP GET requests, and only for the URL /data. All other requests bridge_server_1 rejects.
How does bridge_server_1 work?
bridge_server_1 is a C++ application that runs on two threads. RESTinio (i.e., the embedded HTTP server) works on the main thread. On the second thread, started from the main() function, all the manipulations with curl_multi are performed (we will call this thread the curl thread). Information is passed from the main thread to the curl thread through a simple homemade thread-safe container.
When RESTinio accepts a new HTTP request, this request is handed to the callback that was set when the server was started. There the request's URL is checked and, if it is a request we are interested in, an object with a description of the received request is created. The created object is pushed into the thread-safe container, from which it will later be extracted by the curl thread.
The curl thread periodically retrieves objects with descriptions of received requests from the thread-safe container. For each retrieved request, a corresponding curl_easy instance is created on the curl thread and registered with the curl_multi instance.
The curl thread performs the processing through periodic calls to curl_multi_perform(),
curl_multi_wait() and curl_multi_info_read(), but more on this below. When the curl thread detects that another request has been processed (i.e. a response from delay_server has been received), a response to the original incoming HTTP request is generated right away. I.e. the incoming HTTP request is received on the main thread of the application, then it is handed over to the curl thread, and it is there that the response to that incoming HTTP request is formed.
Code parsing bridge_server_1
The analysis of the bridge_server_1 code will proceed as follows:
- first, the main () function will be shown with the necessary explanations;
- then the code for several functions that are related to RESTinio will be shown;
- then the code that works with curl_multi will be parsed.
A number of points, for example those related to RESTinio or to the parsing of command line arguments, were touched upon in the previous article, so we will not dwell on them in detail here.
The main() function
Here is the main() function code for bridge_server_1:
int main(int argc, char ** argv) {
    try {
        const auto cfg = parse_cmd_line_args(argc, argv);
        if(cfg.help_requested_)
            return 1;

        // The thread-safe container for passing request descriptions
        // to the curl thread.
        request_info_queue_t queue;

        // The actual handler for incoming HTTP requests.
        auto actual_handler = [&cfg, &queue](auto req) {
                return handler(cfg.config_, queue, std::move(req));
            };

        // Start the separate thread that will work with
        // curl_multi_perform.
        std::thread curl_thread{[&queue]{ curl_multi_work_thread(queue); }};
        // This thread must be properly stopped on exit from main(),
        // whatever the reason for that exit.
        auto curl_thread_stopper = cpp_util_3::at_scope_exit([&] {
                queue.close();
                curl_thread.join();
            });

        // Start the HTTP server.
        // Tracing support requires a different set of server traits.
        if(cfg.config_.tracing_) {
            // The HTTP server must be traceable, so the appropriate
            // traits are needed.
            struct traceable_server_traits_t : public restinio::default_single_thread_traits_t {
                // Define the logger to be used by the server.
                using logger_t = restinio::single_threaded_ostream_logger_t;
            };
            // Run the server with tracing enabled.
            run_server<traceable_server_traits_t>(
                    cfg.config_, std::move(actual_handler));
        }
        else {
            // Run the server without tracing.
            run_server<restinio::default_single_thread_traits_t>(
                    cfg.config_, std::move(actual_handler));
        }

        // All exceptions are caught and reported here.
    }
    catch( const std::exception & ex ) {
        std::cerr << "Error: " << ex.what() << std::endl;
        return 2;
    }

    return 0;
}
A significant part of main() repeats the main() of the delay_server described in the previous article: the same parsing of command line arguments, the same actual_handler variable storing a lambda that calls the real HTTP request handler, the same run_server() call choosing a specific Traits type depending on whether HTTP server tracing should be enabled.
But there are a few differences.
First, we need a thread-safe container to pass information about received requests from the main thread to the curl thread. This container is the queue variable of type request_info_queue_t. We will consider the container's implementation below.
Secondly, we need to start an additional worker thread on which we will work with curl_multi, and we also need to stop this thread when leaving main(), however we leave it. All this happens in these lines:
// Start the separate thread that will work with
// curl_multi_perform.
std::thread curl_thread{[&queue]{ curl_multi_work_thread(queue); }};
// This thread must be properly stopped on exit from main(),
// whatever the reason for that exit.
auto curl_thread_stopper = cpp_util_3::at_scope_exit([&] {
        queue.close();
        curl_thread.join();
    });
We hope the thread-starting code raises no questions. To shut the worker thread down, we need to perform two actions:
1. Signal the worker thread to finish its work. This is done by the queue.close() operation.
2. Wait for the worker thread to finish. This is done by curl_thread.join().
Both of these actions, in the form of a lambda, are passed to the helper function at_scope_exit() from our
utility library . This
at_scope_exit() is just an uncomplicated analogue of such well-known things as BOOST_SCOPE_EXIT from Boost, defer from Go, and scope(exit) from D. With at_scope_exit() we terminate the curl thread automatically, regardless of the reason we exit main().
Configuration and analysis of command line arguments
If anyone is interested, below you can see what the configuration for bridge_server_1 looks like and how it is formed as a result of parsing the command line arguments. Everything is very similar to what we did in delay_server, so the details are hidden under the spoiler so as not to distract attention.
The config_t structure and the parse_cmd_line_args() function
// The configuration needed for the application.
struct config_t {
    // The address on which to listen.
    std::string address_{"localhost"};
    // The port on which to listen.
    std::uint16_t port_{8080};
    // The address of the target server.
    std::string target_address_{"localhost"};
    // The port of the target server.
    std::uint16_t target_port_{8090};
    // Should tracing be enabled?
    bool tracing_{false};
};

// Parsing of command line arguments.
// Throws an exception in case of an error.
auto parse_cmd_line_args(int argc, char ** argv) {
    struct result_t {
        bool help_requested_{false};
        config_t config_;
    };
    result_t result;

    // Prepare the command line options parser.
    using namespace clara;
    auto cli = Opt(result.config_.address_, "address")["-a"]["--address"]
                ("address to listen (default: localhost)")
        | Opt(result.config_.port_, "port")["-p"]["--port"]
                (fmt::format("port to listen (default: {})", result.config_.port_))
        | Opt(result.config_.target_address_, "target address")["-T"]["--target-address"]
                (fmt::format("target address (default: {})", result.config_.target_address_))
        | Opt(result.config_.target_port_, "target port")["-P"]["--target-port"]
                (fmt::format("target port (default: {})", result.config_.target_port_))
        | Opt(result.config_.tracing_)["-t"]["--tracing"]
                ("turn server tracing ON (default: OFF)")
        | Help(result.help_requested_);

    // Do the parsing...
    auto parse_result = cli.parse(Args(argc, argv));
    // ...and check its result.
    if(!parse_result)
        throw std::runtime_error("Invalid command line: "
                + parse_result.errorMessage());

    if(result.help_requested_)
        std::cout << cli << std::endl;

    return result;
}
Interaction details between RESTinio and curl parts
Information about a received incoming HTTP request is passed from the RESTinio part of bridge_server_1 to the curl part via instances of the following structure:
// Information about a request to be performed via curl_multi_perform
// and the results of its processing.
struct request_info_t {
    // The URL for the outgoing request.
    const std::string url_;
    // The original incoming request, to be answered later.
    restinio::request_handle_t original_req_;
    // The result of performing the request by curl.
    CURLcode curl_code_{CURLE_OK};
    // The HTTP response code from the target server.
    // Makes sense only if curl_code_ == CURLE_OK.
    long response_code_{0};
    // The response data received from the target server.
    std::string reply_data_;

    request_info_t(std::string url, restinio::request_handle_t req)
        : url_{std::move(url)}, original_req_{std::move(req)}
    {}
};
Initially, only two fields are filled in: url_ and original_req_. The remaining fields are filled in after the request has been processed on the curl thread. First of all this is the curl_code_ field; if it contains CURLE_OK, then the response_code_ and reply_data_ fields also receive their values.
To pass request_info_t instances between the worker threads, the following homemade thread-safe container is used:
// A simple thread-safe container for passing request descriptions
// between worker threads.
// A queue of unique_ptr's is used so that ownership of a request
// description is transferred from thread to thread.
// The queue can be closed, which serves as a shutdown signal.
template<typename T>
class thread_safe_queue_t {
    using unique_ptr_t = std::unique_ptr<T>;

    std::mutex lock_;
    std::queue<unique_ptr_t> content_;
    bool closed_{false};

public:
    enum class status_t {
        extracted,
        empty_queue,
        closed
    };

    void push(unique_ptr_t what) {
        std::lock_guard<std::mutex> l{lock_};
        content_.emplace(std::move(what));
    }

    // A pop-like operation which extracts all the collected items
    // at once.
    // The mutex is locked only once, which is cheaper than locking
    // it again for every single item in a per-item pop().
    template<typename Acceptor>
    status_t pop(Acceptor && acceptor) {
        std::lock_guard<std::mutex> l{lock_};
        if(closed_) {
            return status_t::closed;
        }
        else if(content_.empty()) {
            return status_t::empty_queue;
        }
        else {
            while(!content_.empty()) {
                acceptor(std::move(content_.front()));
                content_.pop();
            }
            return status_t::extracted;
        }
    }

    void close() {
        std::lock_guard<std::mutex> l{lock_};
        closed_ = true;
    }
};

// The concrete queue type used in the demo.
using request_info_queue_t = thread_safe_queue_t<request_info_t>;
In principle, there was no need to make thread_safe_queue_t a template. It just so happened that the thread_safe_queue_t template class was written first, and only later did it become clear that it would be used with the request_info_t type only. But we did not bother to turn the template into an ordinary class.
The RESTinio part of bridge_server_1
There are only three functions in the bridge_server_1 code that interact with RESTinio. First, there is the run_server() template function, which is responsible for starting the HTTP server in the context of the main application thread:
// Runs the HTTP server with the given traits.
template<typename Server_Traits, typename Handler>
void run_server(
        const config_t & config,
        Handler && handler) {
    restinio::run(
        restinio::on_this_thread<Server_Traits>()
            .address(config.address_)
            .port(config.port_)
            .request_handler(std::forward<Handler>(handler)));
}
In bridge_server_1 it is even simpler than in delay_server. Generally speaking, one could do without it and just call restinio::run() directly in main(). But it is still better to have a separate run_server(), so that if the settings of the started HTTP server ever need to change, they have to be changed in just one place.
Secondly, there is the handler() function, which is the HTTP request handler itself. It is a bit more complicated than its counterpart in delay_server, but it is also unlikely to cause difficulties in understanding:
// The handler for incoming HTTP requests.
restinio::request_handling_status_t handler(
        const config_t & config,
        request_info_queue_t & queue,
        restinio::request_handle_t req) {
    if(restinio::http_method_get() == req->header().method()
            && "/data" == req->header().path()) {
        // Parse the query string parameters.
        const auto qp = restinio::parse_query(req->header().query());

        // Form the URL for the outgoing request to be performed
        // via curl_multi.
        auto url = fmt::format("http://{}:{}/{}/{}/{}",
                config.target_address_,
                config.target_port_,
                qp["year"], qp["month"], qp["day"]);

        auto info = std::make_unique<request_info_t>(
                std::move(url), std::move(req));
        queue.push(std::move(info));

        // The request is accepted for processing; the actual
        // response will be generated later.
        return restinio::request_accepted();
    }

    // All other requests are rejected.
    return restinio::request_rejected();
}
Here we first manually check the type of the incoming request and its URL. If it is not an HTTP GET for /data, we refuse to process the request. In bridge_server_1 we have to do this check by hand, whereas in delay_server, thanks to the Express router, there was no need for it.
Then, if it is a request we expect, we parse the query string into its components and form the URL of our own outgoing request to delay_server. After that we create a request_info_t object in which we store the generated URL and a smart handle to the received incoming request, and we pass this request_info_t to the curl thread for processing (by storing it in the thread-safe container).
And thirdly, there is the complete_request_processing() function, in which we respond to the received incoming HTTP request:
// Completes the processing of an incoming request.
// Takes the results collected by curl_multi, forms the http-response
// and sends it for the original http-request.
void complete_request_processing(request_info_t & info) {
    auto response = info.original_req_->create_response();

    response.append_header(restinio::http_field::server,
            "RESTinio hello world server");
    response.append_header_date_field();
    response.append_header(restinio::http_field::content_type,
            "text/plain; charset=utf-8");

    if(CURLE_OK == info.curl_code_) {
        if(200 == info.response_code_)
            response.set_body(
                fmt::format("Request processed.\nPath: {}\nQuery: {}\n"
                        "Response:\n===\n{}\n===\n",
                        info.original_req_->header().path(),
                        info.original_req_->header().query(),
                        info.reply_data_));
        else
            response.set_body(
                fmt::format("Request failed.\nPath: {}\nQuery: {}\n"
                        "Response code: {}\n",
                        info.original_req_->header().path(),
                        info.original_req_->header().query(),
                        info.response_code_));
    }
    else
        response.set_body("Target service unavailable\n");

    response.done();
}
Here we use the original incoming request, which was saved in the request_info_t::original_req_ field. The restinio::request_t::create_response() method returns an object that should be used to generate the HTTP response; we save this object into the response variable. The fact that the type of this variable is not spelled out explicitly is no accident: create_response() can return objects of different types (details can be found
here ), and in this case it does not matter to us what exactly the simplest form of create_response() returns.
Next we fill in the HTTP response depending on how our HTTP request to delay_server completed. And when the HTTP response is fully formed, we instruct RESTinio to deliver the response to the HTTP client by calling response.done().
Regarding the complete_request_processing() function, one very important thing needs to be emphasized: it is called in the context of the curl thread. But when we call response.done(), the delivery of the generated response is automatically delegated to the main thread of the application, on which the HTTP server is running.
The curl part of bridge_server_1
The curl part of bridge_server_1 includes several functions that work with curl_multi and curl_easy. We start the analysis of this part with its main function, curl_multi_work_thread(), and then consider the other functions called, directly or indirectly, from curl_multi_work_thread().
But first, a little explanation of why we used "bare" libcurl in our demonstration, without any C++ wrappers around it. The reason is more than prosaic: we did not want to waste time searching for a suitable wrapper and figuring out what that wrapper does and how. We had some experience with libcurl in the past, so we knew how to interact with it at the level of its native C API. We needed only a minimal subset of libcurl's features, and at the same time we wanted to keep everything under our complete control. Therefore, we decided not to use any third-party C++ add-ons over libcurl.
And one more important disclaimer must be made before parsing the curl code: in order to simplify and shorten the code of the demo applications as much as possible, we did no error checking at all. Had we properly checked the return codes of the curl functions, the code would have swollen threefold, significantly losing in clarity while gaining nothing in functionality. So in our demonstration we simply expect every libcurl call to succeed. This is our conscious decision for this particular experiment, but we would never do that in real production code.
Well, now, after all the necessary explanations, let's move on to how curl_multi_perform allowed us to organize work with asynchronous outgoing HTTP requests.
The curl_multi_work_thread() function
Here is the code of the main function that runs on the separate curl thread in bridge_server_1:
// The main function of the thread that works with
// curl_multi_perform.
void curl_multi_work_thread(request_info_queue_t & queue) {
    using namespace cpp_util_3;

    // Initialize curl.
    curl_global_init(CURL_GLOBAL_ALL);
    auto curl_global_deinitializer = at_scope_exit([]{
            curl_global_cleanup();
        });

    // Create the curl_multi instance that will serve all our
    // outgoing requests.
    auto curlm = curl_multi_init();
    auto curlm_destroyer = at_scope_exit([&]{
            curl_multi_cleanup(curlm);
        });

    // The number of requests currently being processed.
    int still_running{ 0 };

    while(true) {
        // Try to get new requests. The queue may already be closed,
        // in which case the work must be finished.
        auto status = try_extract_new_requests(queue, curlm);
        if(request_info_queue_t::status_t::closed == status)
            // The queue is closed, the application is shutting down.
            // Just leave, the cleanup is done automatically.
            return;

        // If there is something to process, let curl serve it
        // via curl_multi_perform.
        if(0 != still_running ||
                request_info_queue_t::status_t::extracted == status) {
            curl_multi_perform(curlm, &still_running);
            // Check whether some requests have been completed.
            check_curl_op_completion(curlm);
        }

        // If there are active requests, wait for IO-events
        // on curl_multi_wait.
        if(0 != still_running) {
            curl_multi_wait(curlm, nullptr, 0, 50 /*ms*/, nullptr);
        }
        else {
            // There is nothing to process right now, so just
            // sleep for a while.
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
        }
    }
}
It can be divided into two parts: in the first, libcurl is initialized and a curl_multi instance is created; in the second, the main loop that serves outgoing HTTP requests is executed.
The first part is quite simple. To initialize libcurl you call curl_global_init(), and at the very end of the work you call curl_global_cleanup(), which we do with the at_scope_exit trick shown above. A similar trick is used to create/destroy the curl_multi instance. Hopefully this code is straightforward.
The second part is more complicated. The idea is this:
- we spin the HTTP request servicing loop until we are given the command to shut down (in the main() function this is done by calling queue.close());
- at each iteration of the loop we first try to take new HTTP requests from the thread-safe container; every new request found is converted into a curl_easy instance, which is added to the curl_multi instance;
- after that we call curl_multi_perform() to serve the requests that already exist and/or the new requests that have just been added to the curl_multi instance; right after curl_multi_perform() we call curl_multi_info_read() to detect the HTTP requests that libcurl has completed (all this is done inside check_curl_op_completion());
- then we either call curl_multi_wait() to wait for readiness of IO operations, if some HTTP requests are currently being serviced, or simply sleep for 50ms if nothing is being processed right now.
Roughly speaking, the curl thread works in ticks. At the beginning of each tick, new requests are picked up and the results of the active ones are checked. After that the curl thread falls asleep, either until IO operations become ready or until the 50-millisecond pause expires; the wait for IO readiness is limited to the same 50ms interval as well.
The scheme is very simple, but it has a couple of drawbacks. Depending on the situation, these drawbacks may be fatal, or may not matter at all:
1. The curl_multi_info_read() function is called after every call to curl_multi_perform(). In principle, curl_multi_perform() returns the number of requests currently being processed, and by watching this value one could detect the moment when the number of requests decreases and only then call curl_multi_info_read(). However, we use the most primitive variant so as not to bother with the situation when one request completes and one new one is added, leaving the total number of running requests unchanged.
2. The latency of request processing increases. If there are no active requests at the moment and a new incoming HTTP request arrives, the curl thread will learn about it only after returning from the next this_thread::sleep_for() call. With a curl_multi_work_thread() tick of 50 milliseconds, this means up to +50ms added to the request processing latency in the worst case. In bridge_server_1 we do not care. But in the bridge_server_1_pipe implementation we tried to get rid of this drawback by means of an additional notification pipe for the curl thread. We did not plan to analyze bridge_server_1_pipe in detail, but if someone would like to see such an analysis, please say so in the comments; if there are such wishes, we will write an additional article.
That is, in general terms, how the curl thread in bridge_server_1 works. If you have questions, ask them in the comments, we will try to answer. In the meantime, let's move on to the remaining functions of the curl part of bridge_server_1.
Functions for receiving new incoming HTTP requests
At the beginning of each iteration of the main loop inside curl_multi_work_thread(), an attempt is made to pick up all new incoming HTTP requests from the thread-safe container, convert them into curl_easy instances, and add these new curl_easy instances to the curl_multi instance. This is done by a few auxiliary functions.
First, there is the try_extract_new_requests() function:
// Tries to extract new requests and hand them over to curl_multi.
// Returns status_t::closed if the queue is closed and the work
// must be finished.
auto try_extract_new_requests(request_info_queue_t & queue, CURLM * curlm) {
    return queue.pop([curlm](auto info) {
            introduce_new_request_to_curl_multi(curlm, std::move(info));
        });
}
In fact, its whole job is to call the pop() method of our thread-safe container and pass the proper lambda to pop(). By and large, all of this could have been written directly inside curl_multi_work_thread(), but initially try_extract_new_requests() was more voluminous, and its presence simplifies the curl_multi_work_thread() code.
Secondly, there is the introduce_new_request_to_curl_multi() function, in which, in fact, all the main work is done. Namely:
// Creates a curl_easy instance for a new request and adds this
// curl_easy instance to curl_multi.
void introduce_new_request_to_curl_multi(
        CURLM * curlm,
        std::unique_ptr<request_info_t> info) {
    // Create and tune the curl_easy instance.
    CURL * h = curl_easy_init();
    curl_easy_setopt(h, CURLOPT_URL, info->url_.c_str());
    curl_easy_setopt(h, CURLOPT_PRIVATE, info.get());
    curl_easy_setopt(h, CURLOPT_WRITEFUNCTION, write_callback);
    curl_easy_setopt(h, CURLOPT_WRITEDATA, info.get());

    // The curl_easy instance is ready, pass it to curl_multi.
    curl_multi_add_handle(curlm, h);

    // The unique_ptr must no longer control the object's lifetime.
    // The object will be destroyed manually later.
    info.release();
}
If you have worked with curl_easy before, you will not see anything new here, except perhaps the call to curl_multi_add_handle(). This is how control over the execution of a separate HTTP request is transferred to the curl_multi instance. If you have not worked with curl_easy before, you will need the official documentation to figure out why curl_easy_setopt() is called and what effect it has.
The key point in introduce_new_request_to_curl_multi() is the management of the lifetime of the request_info_t instance. The thing is that request_info_t is passed between worker threads via unique_ptr, and it arrives in introduce_new_request_to_curl_multi() as a unique_ptr too. So, if no special action is taken, the request_info_t instance will be destroyed on exit from introduce_new_request_to_curl_multi(). But we need request_info_t to live until libcurl has finished processing the request.
Therefore, we store the raw pointer to request_info_t as private data inside the curl_easy instance and call release() on the unique_ptr, so that the unique_ptr stops controlling the lifetime of our object. When the request completes, we manually retrieve the private data from the curl_easy instance and destroy the request_info_t object ourselves (this can be seen inside the check_curl_op_completion() function discussed below).
This, by the way, is related to one more thing we did not bother with in our demo application, but which will take some time when writing production code: when the application shuts down, the request_info_t objects whose pointers are stored inside curl_easy instances are not deleted. I.e. when we leave the main loop in curl_multi_work_thread(), we do not walk through the remaining live curl_easy instances and do not clean up the request_info_t objects behind us. Properly, this should be done.
And thirdly, there is the write_callback() function, responsible for processing the data received in response to our requests; a pointer to it is stored in each curl_easy instance:
// The callback that libcurl invokes when it receives data from
// the remote server in response to our request.
// Set via CURLOPT_WRITEFUNCTION.
std::size_t write_callback(
        char *ptr, size_t size, size_t nmemb, void *userdata) {
    auto info = reinterpret_cast<request_info_t *>(userdata);
    const auto total_size = size * nmemb;
    info->reply_data_.append(ptr, total_size);
    return total_size;
}
This function is called by libcurl when the remote server sends some data in response to our outgoing request. We accumulate this data in the request_info_t::reply_data_ field. Here, again, advantage is taken of the fact that the pointer to the request_info_t instance is stored as private data inside the curl_easy instance.
The check_curl_op_completion() function
Finally, let us consider one of the main functions of the curl part of bridge_server_1, the one responsible for detecting completed HTTP requests and finishing their processing.
The point is that inside a curl_multi instance there is a queue of messages generated by libcurl during curl_multi's work. When curl_multi finishes processing another request within curl_multi_perform(), a message with the special status CURLMSG_DONE is placed into this queue. The message contains information about the processed request. Our task is to run through this queue and handle all the CURLMSG_DONE messages found in it.
It looks like this:
// Checks for completed requests inside curl_multi and finishes
// their processing.
void check_curl_op_completion(CURLM * curlm) {
    CURLMsg * msg;
    int messages_left{0};

    // Walk through the whole queue of messages inside curl_multi
    // looking for CURLMSG_DONE items.
    while(nullptr != (msg = curl_multi_info_read(curlm, &messages_left))) {
        if(CURLMSG_DONE == msg->msg) {
            // This curl_easy instance is no longer needed.
            // Wrap it into a unique_ptr so that curl_easy_cleanup
            // is called automatically.
            std::unique_ptr<CURL, decltype(&curl_easy_cleanup)> easy_handle{
                    msg->easy_handle,
                    &curl_easy_cleanup};

            // curl_multi must not deal with this instance anymore.
            curl_multi_remove_handle(curlm, easy_handle.get());

            // Retrieve the pointer to the related request_info_t
            // object stored as private data.
            request_info_t * info_raw_ptr{nullptr};
            curl_easy_getinfo(easy_handle.get(),
                    CURLINFO_PRIVATE, &info_raw_ptr);
            // Put it back into a unique_ptr which will destroy it.
            std::unique_ptr<request_info_t> info{info_raw_ptr};

            info->curl_code_ = msg->data.result;
            if(CURLE_OK == info->curl_code_) {
                // The request succeeded, get the response code.
                curl_easy_getinfo(
                        easy_handle.get(),
                        CURLINFO_RESPONSE_CODE,
                        &info->response_code_);
            }

            // Form the response to the original incoming request.
            complete_request_processing(*info);
        }
    }
}
We simply pull curl_multi_info_read() in a loop as long as there is anything in the queue. When we retrieve a message of the CURLMSG_DONE type, we take the curl_easy instance from the message and:
- remove it from the curl_multi instance, since it is no longer needed there;
- retrieve the pointer to request_info_t from curl_easy and take control of its lifetime;
- extract the result of the outgoing request from curl_easy;
- form the response to the original incoming request (the complete_request_processing() function discussed above);
- delete everything that is no longer needed (via unique_ptr).
And so for all the requests that have been completed by this point.
The conclusion of the second part
In this part of the story, we looked at how incoming HTTP requests can be received on one thread and handed over for processing to a second worker thread, on which outgoing HTTP requests are performed using curl_multi_perform. We have tried to highlight the main points in the text of the article, but if something remains unclear, ask questions and we will try to answer them in the comments.
Also, if someone is interested in reading an analysis of the implementation of bridge_server_1_pipe, which uses a notification pipe, let us know. We will then make an article on this topic.
Well, it remains to consider bridge_server_2, which uses the more cunning curl_multi_socket_action mechanism. Everything is much more fun there. At least it seemed so while we were dealing with this very curl_multi_socket_action :)
To be continued...
In this part of the story, we looked at how you can receive incoming HTTP requests on one thread and transfer them to the processing of a second working thread, on which outgoing HTTP requests are executed using curl_multi_perform. We have tried to highlight the main points in the text of the article. But, if something remains unclear, then ask questions, we will try to answer them in the comments.Also, if someone is interested in reading the analysis of the implementation of bridge_server_1_pipe, which uses the notification pipe, then let us know. We will then make an article on this topic.Well, it remains to consider bridge_server_2, which uses a more cunning curl_multi_socket_action mechanism . Everything is much more fun there. At least it seemed so while we were dealing with this very curl_multi_socket_action :)To be continued ...