MySQL in NGINX: using blocking libraries in a non-blocking server

As you know, high-load servers often use an event-driven model for working with sockets. The key component of such a system is epoll (FreeBSD and Windows have their own equivalents, but we will focus on Linux). The epoll_wait function, the only blocking call, returns information about all the network events we are interested in. The well-known NGINX server works the same way.

Event-based programming makes the code rather peculiar, as if it were turned inside out. But that problem is not so terrible. There is another one: using existing libraries that were never designed for event-driven code. If such a library makes blocking calls (for example, connect or recv), the whole event model loses its meaning, since all other clients have to wait for that one call to finish, which is completely unacceptable in a serious product.

One library not originally intended for a non-blocking environment is the MySQL client library, libmysqlclient. Yet it is often needed in NGINX. There are several solutions that give NGINX access to MySQL, for example drizzle and HandlerSocket (whose trivial protocol is easy to implement with the standard NGINX upstream mechanism). Still, the most convenient option is to use the full power of the standard libmysqlclient library and the SQL language.

Context switching and interceptions

There is a simple solution to the problem of blocking calls. To turn blocking code into non-blocking code, it is enough to intercept each blocking call and replace it with a non-blocking one. When the call would have to block, we switch to the main server loop, arranging things so that when the awaited event occurs we return to the very place we left. In other words, we create a kind of user-space thread. It comes quite cheap, because it has no need for preemptive multitasking: we perform every context switch ourselves at the right moment.

First, find out which calls can block the flow of execution.
Here are the main ones:

accept
connect
read / recv
write / send
poll

The last function in this list, poll, looks a little out of place, since it is itself often a sign of non-blocking behavior. However, libmysqlclient uses it, so we have to intercept it. Obviously, epoll_wait blocks as well, but we hope that blocking code will not use it. There is also the select call, but it has a number of problems and is therefore (thank God!) used less and less, so we leave it out too.

These functions are defined in libc, so if our code is linked dynamically, we can intercept them in the standard way. Here is an example for read:

typedef ssize_t (*read_pt)(int fd, void *buf, size_t count);
static read_pt orig_read;

ssize_t read(int fd, void *buf, size_t count)
{
    ssize_t ret;

    for (;;) {
        /* try the original read first */
        ret = orig_read(fd, buf, count);

        if (!mtask_scheduled || ret != -1 || errno != EAGAIN)
            return ret;

        /* the call would block: yield to the event loop and
           report a failed wait as an I/O error */
        if (mtask_yield(fd, NGX_READ_EVENT)) {
            errno = EIO;
            return -1;
        }
    }
}

...

/* during initialization: look up the original libc function */
orig_read = (read_pt)dlsym(RTLD_NEXT, "read");

Here, mtask_yield is the function that switches context back to the main event loop. It is called at the point where ordinary blocking code would have blocked. mtask_scheduled is a macro that tells us whether to replace the blocking behavior with a context switch or to behave in the standard way. Obviously, outside our handler any context switches would only get in the way. Moreover, calls made by NGINX itself (for example, to receive requests and send responses) need no help from us, as they were designed for non-blocking behavior from the start.
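The retry loop described above can be tried outside NGINX. The following is a minimal sketch, not the module's code: read_with_wait and wait_readable_fn are hypothetical names, and the callback stands in for mtask_yield. In the demo the callback fakes the arrival of the awaited event by writing to the other end of a pipe.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for mtask_yield: returns 0 once fd may be
   readable, non-zero if waiting failed. */
typedef int (*wait_readable_fn)(int fd, void *arg);

/* Try a non-blocking read; on EAGAIN hand control to the wait hook
   and retry, the same loop shape as the intercepted read. */
ssize_t read_with_wait(int fd, void *buf, size_t count,
                       wait_readable_fn wait, void *arg)
{
    assert(wait != NULL);

    for (;;) {
        ssize_t ret = read(fd, buf, count);

        if (ret != -1 || errno != EAGAIN)
            return ret;

        if (wait(fd, arg) != 0) {
            errno = EIO;
            return -1;
        }
    }
}

/* Demo hook: simulates the awaited event by writing to the pipe's
   write end, whose descriptor is passed through arg. */
int fake_event(int fd, void *arg)
{
    int write_fd = *(int *) arg;

    (void) fd;
    return write(write_fd, "ok", 2) == 2 ? 0 : -1;
}
```

The first read fails with EAGAIN because the pipe is empty, the hook runs once, and the second read returns the two bytes.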

Note also that the socket on which this read is performed must be put into non-blocking mode, which requires the corresponding fcntl call in the intercepted connect and accept.
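Switching a descriptor to non-blocking mode with fcntl might look like the sketch below. The helper names set_nonblocking and check_nonblocking_pipe are illustrative and do not come from the module.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Add O_NONBLOCK to the descriptor's status flags, keeping the
   flags that are already set. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;

    if (flags & O_NONBLOCK)
        return 0;

    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Usage: a read from an empty non-blocking pipe fails with EAGAIN
   instead of blocking. */
void check_nonblocking_pipe(void)
{
    int fds[2];
    char byte;

    assert(pipe(fds) == 0);
    assert(set_nonblocking(fds[0]) == 0);
    assert(read(fds[0], &byte, 1) == -1 && errno == EAGAIN);

    close(fds[0]);
    close(fds[1]);
}
```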

Contexts

What is the execution context of a user-space thread? It is the stack plus the registers (there is also the signal mask, but we never jump out of signal handlers, so it does not interest us here). Since it is that simple, the context can evidently be switched within a single process whenever we like. There are standard tools for this:

makecontext - creates a context with the given stack and entry function
swapcontext - switches contexts
setcontext / getcontext - set / read the context
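To see these calls working together, here is a small standalone sketch, independent of NGINX. All names in it (demo_task, demo_yield, run_context_demo) are made up for illustration. A task runs on its own stack, yields to the main context twice, and then finishes.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define DEMO_STACK_SIZE (64 * 1024)

static ucontext_t main_ctx;   /* plays the role of the event loop */
static ucontext_t task_ctx;   /* the user-space thread */
static int task_done;
static int steps;

/* Save the task's context and resume the main one. */
static void demo_yield(void)
{
    swapcontext(&task_ctx, &main_ctx);
}

static void demo_task(void)
{
    steps++;
    demo_yield();
    steps++;
    demo_yield();
    steps++;
    task_done = 1;
    /* returning from here resumes the context in uc_link */
}

/* Runs the task to completion and returns how many times control
   came back to the main context. */
int run_context_demo(void)
{
    int returns = 0;
    void *stack = malloc(DEMO_STACK_SIZE);

    if (stack == NULL)
        return -1;

    steps = 0;
    task_done = 0;

    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp = stack;
    task_ctx.uc_stack.ss_size = DEMO_STACK_SIZE;
    task_ctx.uc_link = &main_ctx;
    makecontext(&task_ctx, demo_task, 0);

    while (!task_done) {
        swapcontext(&main_ctx, &task_ctx);   /* resume the task */
        returns++;
    }

    assert(steps == 3);
    free(stack);
    return returns;
}
```

Control comes back to the main context three times: after each of the two yields and once more when the task function returns. Unlike this sketch, the module sets uc_link to NULL, so the end of its thread is handled differently.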

Hooking into NGINX

In NGINX, the content of a response is generated by a handler of the following form:

static ngx_int_t ngx_http_mtask_handler(ngx_http_request_t *r);

This handler is added to the NGX_HTTP_CONTENT_PHASE phase handler list:

h = ngx_array_push(&cmcf->phases[NGX_HTTP_CONTENT_PHASE].handlers);
*h = ngx_http_mtask_handler;

In normal use, the handler builds chains (ngx_chain_t) of buffers to return to the client and then calls the functions

ngx_http_send_header - to send the HTTP header
ngx_http_output_filter - to send the body.

The body may, of course, not fit into the socket all at once, but no blocking occurs: NGINX itself takes care of sending the remaining data after the handler returns.

So, we want the handler to be able to perform blocking operations. To do this, we do the following:

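/* create the context for the user-space thread and give it its own stack */
getcontext(&ctx->wctx);
ctx->wctx.uc_stack.ss_sp = ngx_palloc(r->pool, mlcf->stack_size);
ctx->wctx.uc_stack.ss_size = mlcf->stack_size;
ctx->wctx.uc_stack.ss_flags = 0;
ctx->wctx.uc_link = NULL;
makecontext(&ctx->wctx, &mtask_proc, 0);

/* switch into the new context for the first time */
mtask_wake(r, MTASK_WAKE_NOFINALIZE);

/* tell NGINX that the request is still in use after the handler
   returns, so that it is not destroyed; it is finalized later */
r->main->count++;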

The mtask_wake function does the following basic things:

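/* remember the request being served, so that the intercepted
   calls know they are running inside a user-space thread */
mtask_setcurrent(r);

/* switch to the user-space thread; control returns here
   when the thread yields or finishes */
swapcontext(&ctx->rctx, &ctx->wctx);

/* we are back; if no thread is scheduled, it has finished */
if (!mtask_scheduled) {
    /* finalize the request */
    if (!(flags & MTASK_WAKE_NOFINALIZE))
        ngx_http_finalize_request(r, NGX_OK);
    return 1;
}

/* the thread has yielded and is waiting for an I/O event;
   clear the current-thread marker so that calls made
   outside the thread behave in the standard way */
mtask_resetcurrent();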

The most important work is done by the mtask_yield function: it converts the blocking call into NGINX events and returns control to the main thread:

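/* get an NGINX connection object for the descriptor
   and attach the current thread to it */
c = ngx_get_connection(fd, mtask_current->connection->log);
c->data = mtask_current;

/* choose the NGINX read or write event */
if (event == NGX_READ_EVENT)
    e = c->read;
else
    e = c->write;

e->data = c;
e->handler = &mtask_event_handler;
e->log = mtask_current->connection->log;

ngx_add_event(e, event, 0);

/* switch back to the main context; execution resumes
   here when the event fires */
swapcontext(&ctx->wctx, &ctx->rctx);

/* we are back: the descriptor is ready for reading/writing; clean up */
ngx_del_event(e, event, 0);
ngx_free_connection(c);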

The NGINX event handler does one main thing: it switches context when an I/O event occurs:

static void
mtask_event_handler(ngx_event_t *ev)
{
    ...
    mtask_wake(r, wf);
    ...
}
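The whole yield/wake cycle can be reproduced without NGINX. The sketch below is only an illustration under simplifying assumptions: poll stands in for the NGINX event loop, a pipe stands in for the MySQL socket, and every name (worker, yield_until_readable, run_yield_wake_demo) is hypothetical. The "blocking-style" worker reads from the pipe, and when the read would block it yields to the loop, which resumes it once the descriptor becomes readable.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

static ucontext_t loop_ctx;      /* the event loop side */
static ucontext_t worker_ctx;    /* the user-space thread */
static int waiting_fd = -1;      /* descriptor the worker waits on */
static int worker_finished;
static int demo_pipe[2];
static char reply[8];
static ssize_t reply_len;

/* Stand-in for mtask_yield: record the descriptor, go back to the loop. */
static void yield_until_readable(int fd)
{
    waiting_fd = fd;
    swapcontext(&worker_ctx, &loop_ctx);
    waiting_fd = -1;
}

/* Code written in blocking style: it simply reads until data arrives. */
static void worker(void)
{
    for (;;) {
        reply_len = read(demo_pipe[0], reply, sizeof(reply));
        if (reply_len != -1 || errno != EAGAIN)
            break;
        yield_until_readable(demo_pipe[0]);
    }
    worker_finished = 1;
}

/* Returns the number of wake-ups needed, or -1 on failure. */
int run_yield_wake_demo(void)
{
    int wakeups = 0;
    size_t stack_size = 64 * 1024;
    void *stack;

    if (pipe(demo_pipe) != 0
        || fcntl(demo_pipe[0], F_SETFL, O_NONBLOCK) == -1)
        return -1;

    stack = malloc(stack_size);
    if (stack == NULL)
        return -1;

    worker_finished = 0;
    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = stack;
    worker_ctx.uc_stack.ss_size = stack_size;
    worker_ctx.uc_link = &loop_ctx;
    makecontext(&worker_ctx, worker, 0);

    swapcontext(&loop_ctx, &worker_ctx);          /* first run */

    while (!worker_finished) {
        struct pollfd pfd = { .fd = waiting_fd, .events = POLLIN };

        assert(waiting_fd != -1);                 /* worker has yielded */
        if (write(demo_pipe[1], "row", 3) != 3)   /* the "server" replies */
            break;
        if (poll(&pfd, 1, 1000) != 1)
            break;
        wakeups++;
        swapcontext(&loop_ctx, &worker_ctx);      /* wake the worker */
    }

    close(demo_pipe[0]);
    close(demo_pipe[1]);
    free(stack);
    return worker_finished ? wakeups : -1;
}
```

Here the worker needs one wake-up: its first read finds the pipe empty, and after the loop resumes it the retry succeeds.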

It is also worth mentioning that NGINX's own functions, which were designed for non-blocking behavior, must not be called from a user-space thread. However, such functions are likely to be called from ngx_http_send_header and ngx_http_output_filter. To prevent these calls, we switch the current connection into buffering mode as follows:

c->write->delayed = 1;

When the thread finishes, this flag is reset and the data is sent to the client. Obviously, this solution is not suitable for outputting large amounts of data, but in most cases that need does not arise (and when it does, it can still be solved, in a slightly less elegant way).

Hooking up libmysqlclient

With a mechanism for executing blocking code in place, accessing MySQL becomes easy. First of all, we write the most ordinary CONTENT_PHASE handler. Recall that the prototype of a user-space thread function is exactly the same as that of an ordinary handler. So, forgetting about the blocking nature of our code, we use the standard libmysqlclient calls in a handler of the standard form:

ngx_int_t
ngx_http_mysql_handler(ngx_http_request_t *r)
{
    ...
    mysql_real_connect(...)
    ...
    mysql_query(...)
    ...
    mysql_store_result(...)
    ...
    mysql_fetch_row(...)
    ...
}

We output the data in simple text form, field by field, using one ngx_chain_t link per field. This gives us an easy way to use query results inside NGINX itself. For this purpose there is the mysql_subrequest directive, which executes the MySQL query described in another location and then assigns its results, in order, to the variables passed as arguments to the directive (see the example below). The variables can then be used, for example, to proxy the connection to the right backend (obtained from the database) or to pass their values to a script that has no access to the database.

The handler itself is not registered in the usual way (in CONTENT_PHASE, where we need "honest" non-blocking code), but in the configuration of the mtask module:

ngx_http_mtask_loc_conf_t *mlcf;

mlcf = ngx_http_conf_get_module_loc_conf(cf, ngx_http_mtask_module);
mlcf->handler = &ngx_http_mysql_handler;

Examples

An example nginx.conf demonstrating the capabilities of this module:

server {
    ...

    # connection settings for the whole server
    # unix socket access (default)
    mysql_host localhost;

    #mysql_user theuser;
    #mysql_password thepass;
    #mysql_port theport;

    mysql_database test;
    mysql_connections 32;

    # stack size for the user-space threads
    mtask_stack 65536;

    # the queries themselves
    location /select {
        mysql_query "SELECT name FROM users WHERE id='$arg_id'";
    }

    location /insert {
        mysql_query "INSERT INTO users(name) VALUES('$arg_name')";
    }

    location /update {
        mysql_query "UPDATE users SET name='$arg_name' WHERE id=$arg_id";
    }

    location /delete {
        mysql_query "DELETE FROM users WHERE id=$arg_id";
    }

    # using a query result inside NGINX
    location /pass {
        # store the name returned by the query in $name
        mysql_subrequest /select?id=$arg_id $name;

        # use $name in the proxied request
        proxy_pass http://myserver.com/path?name=$name;
    }

    ...
}

The modules can be viewed and downloaded at the following addresses:

github.com/arut/nginx-mtask-module
github.com/arut/nginx-mysql-module

Thanks to all!

Source: https://habr.com/ru/post/137454/