
The documentation for the non-blocking web server Tornado beautifully describes how well it copes with load, painting it as the crown of human creation among non-blocking servers. That is partly true. But when you build complex applications that go beyond "yet another chat", many unobvious and subtle points surface, and it is better to know about them before you step on the rake yourself. Below, the developers of the intellectual games club Trellis share their thoughts on these pitfalls.
Let us note upfront that we are talking about the Python 2 branch, the latest Tornado version at the time of writing (1.2.1), and PostgreSQL connected via psycopg2.
Application instance
Many programmers love to use the singleton pattern for quick access to the application instance. If you are thinking about further horizontal scaling, this is not recommended. The request object will bring you a thread-safe application instance on a platter, which you can use without the risk of shooting yourself in the foot in an unexpected place.
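A minimal sketch of the idea. The `db` attribute on the application is our own convention here, not a Tornado API: we attach shared resources to the application at startup and reach them through `self.application` from any handler.

```python
import tornado.web


class BaseHandler(tornado.web.RequestHandler):
    """Reach shared resources through self.application instead of a
    module-level singleton; Tornado hands every handler its application."""

    @property
    def db(self):
        # 'db' is an attribute we set ourselves at startup (see below)
        return self.application.db

# At startup, attach the shared resources to the application itself:
#   application = tornado.web.Application([(r"/", MainHandler)])
#   application.db = create_connection_pool()   # hypothetical helper
```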
')
Websockets
Alas. Everyone's favorite nginx (at the time of writing) does not know how to proxy the websocket protocol. For fans of lighttpd there is also little good news in this regard. Many praise HAProxy, but in our case it turned out to be more convenient to move all static content to a separate node with plain nginx, and to let the tornado server itself serve all dynamic content. Six months of production life, sometimes under stressful loads, have shown that this solution is quite viable. If a Flash shim is used to emulate the ws protocol, it should be served from the same domain to avoid falling back to the insecure version. The shim also requires a flash policy XML, which can be served on port 843 by the same nginx.
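A sketch of what the static-only node's nginx config might look like; the hostname and paths are placeholders, not taken from our actual setup. Dynamic and websocket traffic goes straight to the tornado host, bypassing nginx entirely.

```
server {
    listen       80;
    server_name  static.example.com;

    location / {
        root     /var/www/static;
        expires  30d;        # static content is safe to cache aggressively
    }
}
```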
Connect to database
Obviously, on loaded services there can be no talk of the terribly expensive operation of opening a database connection on every request. It is quite possible to use the simplest connection pool from psycopg2. We take the thread-safe ThreadedConnectionPool, borrow connections from it as needed, and after finishing our queries we do not forget to return them. By "do not forget to return" we mean never forget, no matter what exception happens inside. Python's finally construct is more than appropriate here.
Asynchronous requests
In single-threaded non-blocking servers everything looks beautiful exactly until you have to perform some relatively long action: send an email, run a database query, call an external web service, and so on. In that case all other connected clients will dutifully wait for the handler to get around to them.
If all you need in the request handler is to query the database and output something, you can use the asynchronous wrapper momoko. Then the simplest query looks something like this:
class MainHandler(BaseHandler):
    @tornado.web.asynchronous
    def get(self):
        self.db.execute('SELECT 4, 8, 15, 16, 23, 42;',
                        callback=self._on_response)

    def _on_response(self, cursor):
        self.write('Query results: %s' % cursor.fetchall())
        self.finish()
Secure multithreading
So, there are asynchronous means for the database and for external web services. But what to do if you need to perform a large piece of work consisting of many SQL queries, compute something cumbersome in the depths of the server, or even load the disk I/O subsystem? Of course, we could concoct bunches of asynchronous callbacks in the worst twisted traditions, but that is exactly what we want to get away from.
At first glance this looks like a job for standard threading. But naive use of regular Python threads will lead to monstrous glitches and catastrophic results in production under load. Yes, yes: on the development machines everything will work fine. Usually in such cases programmers start praying to the GIL and frantically wrapping everything possible and impossible in locks. The problem is that not all of Tornado is thread-safe. To get around this neatly, you need to process the HTTP request in several stages:
- Decorate your get/post method with tornado.web.asynchronous
- Accept the request, validate the input parameters if any, and store them on the request handler instance.
- Start a thread from a member function of the handler class.
- Do all the work inside that function, carefully applying locks when shared data changes.
- Call a callback that will perform the final finish() with the already prepared data.
For these purposes, you can write a small mixin:
class ThreadableMixin:
    def start_worker(self):
        threading.Thread(target=self.worker).start()

    def worker(self):
        try:
            self._worker()
        except tornado.web.HTTPError, e:
            self.set_status(e.status_code)
        except:
            logging.error("_worker problem", exc_info=True)
            self.set_status(500)
        tornado.ioloop.IOLoop.instance().add_callback(
            self.async_callback(self.results))

    def results(self):
        if self.get_status() != 200:
            self.send_error(self.get_status())
            return
        if hasattr(self, 'res'):
            self.finish(self.res)
            return
        if hasattr(self, 'redir'):
            self.redirect(self.redir)
            return
        self.send_error(500)
With it, safe multi-threaded request processing looks simple and elegant:
class Handler(tornado.web.RequestHandler, ThreadableMixin):
    def _worker(self):
        self.res = self.render_string("template.html",
            title=_("Title"),
            data=self.application.db.query(
                "select ... where object_id=%s", self.object_id)
        )

    @tornado.web.asynchronous
    def get(self, object_id):
        self.object_id = object_id
        self.start_worker()
If we need a redirect, then in _worker() we set self.redir to the desired URL. If an AJAX request needs JSON, then instead of a rendered page we assign self.res the prepared dict with the data.
Another point concerns Python C extensions. When using any external libraries in worker threads, be sure to verify that they are thread-safe.
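The step about guarding shared data deserves an illustration. A minimal sketch, assuming some mutable state (here a hypothetical hit counter, not part of the article's code) is touched by several worker threads at once:

```python
import threading


class SharedStats(object):
    """Mutable state shared between worker threads: every
    read-modify-write goes through the lock, otherwise concurrent
    updates of the same key can be lost."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hits = {}

    def bump(self, key):
        with self._lock:
            self._hits[key] = self._hits.get(key, 0) + 1

    def total(self):
        with self._lock:
            return sum(self._hits.values())
```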
Periodic processes
Often it is necessary to run a function after a specific time: user timeouts, game-process logic, system maintenance procedures, and much more. For these purposes, so-called "periodic processes" are used.
Traditionally, threading.Timer is used to organize periodic processes in Python. If we try to use it here, we will again run into a number of subtle problems. Tornado provides ioloop.PeriodicCallback for exactly these purposes. Always use it instead of regular timers; this will save a lot of time and nerves for the reasons described above.
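A sketch of the pattern; `reap_idle_users` is a hypothetical maintenance job, and note that the interval is given in milliseconds:

```python
import tornado.ioloop


def reap_idle_users():
    # Hypothetical maintenance job.  It runs inside the IOLoop thread,
    # so it may touch handler/application state without extra locks.
    pass

# callback_time is in milliseconds: fire every 30 seconds
ticker = tornado.ioloop.PeriodicCallback(reap_idle_users, 30 * 1000)
ticker.start()
# the callbacks begin firing once the usual
# tornado.ioloop.IOLoop.instance().start() is running
```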
Localization and other tips
In conclusion, a few tips that are not related to multi-threaded processing, but can sometimes significantly improve the performance of the finished application.
- Do not use Tornado's built-in localization stub. Tornado works perfectly well with standard gettext, which gives much better results on large volumes of translations.
- Cache everything you can in memory. Forget memcached & co; you do not need it. Already at the design stage you need to know which hardware platform your application will run on. An extra couple of gigabytes of memory in the server can fundamentally change the approach to a particular caching strategy.
- If page generation time depends entirely on the data in the system and you cannot know its limits in advance, always spin off a new thread for such a request.
- Even though Tornado is very fast, always serve static content by means intended for that, for example nginx. You simply cannot imagine what an i7/16Gb/SAS server with FreeBSD/amd64 and nginx on board is capable of when serving static files. Nothing can physically be faster.
Result
Under load testing, the server handles 5000 simultaneous connections actively playing on the site (and that means thousands of websocket messages per second) without problems (LA ~= 0.2, and the server process eats about 400Mb of memory out of a free 8Gb). 150 real players online, merrily playing their rounds, the server does not notice at all (near-zero load and a huge power margin).
At the front, it looks like this:

[screenshot of the game interface]
And may the force be with you!