
Examples of using asyncio: HTTPServer?!

Not so long ago a new version of Python, 3.4, was released, and its changelog included many "goodies". One of them is the asyncio module, which provides the infrastructure for writing asynchronous network applications. Thanks to the coroutine concept, the code of an asynchronous application stays easy to understand and maintain.

In this article, using a simple TCP echo server as an example, I will try to show what asyncio is all about, and then I will venture to fix the "fatal flaw" of this module, namely the absence of an asynchronous HTTP server implementation.

Intro


Its direct competitor and "brother" is the tornado framework, which has proven itself and enjoys well-deserved popularity. In my opinion, however, asyncio looks simpler, more logical and better thought out. That is hardly surprising, since we are dealing with the standard library of the language.

You might say that it was possible to write asynchronous services in Python before, and you would be right. But that required third-party libraries and/or a callback programming style. The coroutine concept, polished in this version of Python, allows you to write nearly linear asynchronous code using only the capabilities of the standard library.
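To illustrate the difference, here is a minimal sketch (the names follow the low-level protocol API, the bodies are made up): in the callback style one logical conversation is scattered across several functions, while a coroutine reads from top to bottom as if it were blocking code.

import asyncio

# Callback style: every step of the conversation lives in a separate
# function, invoked by the event loop (what third-party libraries used
# to force on us).
def connection_made(transport):
    transport.write(b'hello\n')

def data_received(transport, data):
    transport.write(data)  # echo back, then wait for the next callback

# Coroutine style (asyncio): the same conversation reads top to bottom.
@asyncio.coroutine
def conversation(reader, writer):
    writer.write(b'hello\n')
    data = yield from reader.readline()  # looks blocking, yields to the loop
    writer.write(data)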
Let me note right away that I wrote all of this under Linux, but every component used is cross-platform and should work under Windows as well. Python 3.4 is required, though.

Echo server


An example of an echo server can be found in the standard documentation, but it belongs to the low-level "Transports and protocols" API. For "everyday" use the high-level Streams API is recommended. It contains no TCP server example, but after studying the example from the low-level API and looking through the sources of both modules, it is easy to write a simple TCP server.

import asyncio
import logging
import concurrent.futures


@asyncio.coroutine
def handle_connection(reader, writer):
    peername = writer.get_extra_info('peername')
    logging.info('Accepted connection from {}'.format(peername))
    while True:
        try:
            data = yield from asyncio.wait_for(reader.readline(), timeout=10.0)
            if data:
                writer.write(data)
            else:
                logging.info('Connection from {} closed by peer'.format(peername))
                break
        except concurrent.futures.TimeoutError:
            logging.info('Connection from {} closed by timeout'.format(peername))
            break
    writer.close()


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    logging.basicConfig(level=logging.INFO)
    server_gen = asyncio.start_server(handle_connection, port=2007)
    server = loop.run_until_complete(server_gen)
    logging.info('Listening established on {0}'.format(server.sockets[0].getsockname()))
    try:
        loop.run_forever()
    except KeyboardInterrupt:
        pass  # Press Ctrl+C to stop
    finally:
        server.close()
        loop.close()

Everything here is fairly obvious, but there are a couple of nuances worth paying attention to.

server_gen = asyncio.start_server(handle_connection, port=2007)
server = loop.run_until_complete(server_gen)

The first line creates not the server itself but a generator; on the first call to it, somewhere in the depths of asyncio, the TCP server is created and initialized with the given parameters. The second line is an example of such a call.
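The same call can, of course, be made from inside another coroutine instead of via run_until_complete(); a minimal sketch, assuming the handle_connection coroutine from the listing above:

import asyncio

@asyncio.coroutine
def start():
    # 'yield from' is the call on which asyncio actually creates and
    # initializes the listening TCP server.
    server = yield from asyncio.start_server(handle_connection, port=2007)
    return server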

try:
    data = yield from asyncio.wait_for(reader.readline(), timeout=10.0)
    if data:
        writer.write(data)
    else:
        logging.info('Connection from {} closed by peer'.format(peername))
        break
except concurrent.futures.TimeoutError:
    logging.info('Connection from {} closed by timeout'.format(peername))
    break

The coroutine function reader.readline() reads data from the input stream asynchronously. The wait for data is not time-limited, though; if you need to abort it on a timeout, wrap the coroutine call in asyncio.wait_for(). Then, once the interval specified in seconds has elapsed, a concurrent.futures.TimeoutError exception is raised, which you can handle as needed.
The check that reader.readline() returned a non-empty value is required in this example. Otherwise, after the client drops the connection (connection reset by peer), the loop would keep reading and receiving an empty value indefinitely.
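An alternative to checking the returned value is to ask the stream itself whether the peer has closed it, via reader.at_eof(); the class-based example in the next section uses exactly this form:

# Equivalent loop using StreamReader.at_eof() instead of testing
# readline() for an empty result.
while not reader.at_eof():
    try:
        data = yield from asyncio.wait_for(reader.readline(), timeout=10.0)
        writer.write(data)
    except concurrent.futures.TimeoutError:
        break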

What about OOP?

OOP works just as well. It is enough to wrap the methods that call coroutine functions in the @asyncio.coroutine decorator. Which API functions run as coroutines is clearly indicated in the documentation. Below is an example implementing an EchoServer class.

import asyncio
import logging
import concurrent.futures


class EchoServer(object):
    """Echo server class"""

    def __init__(self, host, port, loop=None):
        self._loop = loop or asyncio.get_event_loop()
        self._server = asyncio.start_server(self.handle_connection, host=host, port=port)

    def start(self, and_loop=True):
        self._server = self._loop.run_until_complete(self._server)
        logging.info('Listening established on {0}'.format(self._server.sockets[0].getsockname()))
        if and_loop:
            self._loop.run_forever()

    def stop(self, and_loop=True):
        self._server.close()
        if and_loop:
            self._loop.close()

    @asyncio.coroutine
    def handle_connection(self, reader, writer):
        peername = writer.get_extra_info('peername')
        logging.info('Accepted connection from {}'.format(peername))
        while not reader.at_eof():
            try:
                data = yield from asyncio.wait_for(reader.readline(), timeout=10.0)
                writer.write(data)
            except concurrent.futures.TimeoutError:
                break
        writer.close()


if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    server = EchoServer('127.0.0.1', 2007)
    try:
        server.start()
    except KeyboardInterrupt:
        pass  # Press Ctrl+C to stop
    finally:
        server.stop()
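To try either variant, you can connect to port 2007 with any TCP client. For example, a minimal asyncio client (the message text is arbitrary):

import asyncio

@asyncio.coroutine
def tcp_echo_client(message):
    reader, writer = yield from asyncio.open_connection('127.0.0.1', 2007)
    writer.write((message + '\n').encode())  # the server echoes lines back
    data = yield from reader.readline()
    print('Received: {!r}'.format(data))
    writer.close()

loop = asyncio.get_event_loop()
loop.run_until_complete(tcp_echo_client('Hello, echo!'))
loop.close()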


As can be seen, in both the first and the second case the code is linear and easy to read, and in the second case it is also packaged into a self-contained class.

HTTP Server


Having dealt with all this, one inevitably wants to build something more substantial, and the asyncio module gives us the opportunity. Unlike tornado, for example, asyncio does not come with an HTTP server implementation. As they say, it would be a sin not to try to correct this omission :)

Writing an HTTP server from scratch, with all its classes like HTTPRequest and so on, is not very sporting, considering how many ready-made frameworks run on top of the WSGI protocol. Those in the know will rightly point out that WSGI is a synchronous protocol. True, but the data for environ and the request body can be read asynchronously. As for output, WSGI recommends returning the response body as a generator, and that fits nicely with the coroutine concept used in asyncio.
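To show the idea (this is only a sketch, not the actual code of my server): the synchronous WSGI result iterable can be consumed chunk by chunk inside a coroutine, yielding control to the event loop between writes.

import asyncio

@asyncio.coroutine
def send_response(app_iter, writer):
    # Push each chunk produced by the WSGI application to the client,
    # letting other connections run while the transport buffer drains.
    try:
        for chunk in app_iter:
            writer.write(chunk)
            yield from writer.drain()  # cooperative point between chunks
    finally:
        if hasattr(app_iter, 'close'):
            app_iter.close()  # required by the WSGI spec
        writer.close()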

One of the frameworks that handles content properly is bottle. For example, it serves the contents of a file not in one piece but in portions, through a generator. That is why I chose it for testing the WSGI server I developed, and I was pleased with the result: the demo application had no trouble streaming a large file to several client connections at once.
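This is the pattern in question; roughly, a bottle route streaming a file through a generator might look like this (the path and chunk size are arbitrary):

from bottle import route

@route('/big')
def big_file():
    def chunks(path, block_size=64 * 1024):
        # Yield the file in fixed-size blocks instead of reading it whole.
        with open(path, 'rb') as f:
            while True:
                block = f.read(block_size)
                if not block:
                    break
                yield block
    return chunks('files/big.bin')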

You can see the whole thing on my github. There are no tests or documentation there yet, but there is a demo application built on the bottle framework. It lists the files in a specific directory and serves the selected one asynchronously, regardless of its size. So if you put films into this directory, you can organize a small video hosting :)

A special thank you goes to the CherryPy development team: I often glanced at their code and took a few things verbatim, so as not to reinvent the wheel.

View sample application
import bottle
import os.path
from os import listdir
from bottle import route, template, static_file

root = os.path.abspath(os.path.dirname(__file__))


@route('/')
def index():
    tmpl = """<!DOCTYPE html>
<html>
<head><title>Bottle of Aqua</title></head>
<body>
<h3>List of files:</h3>
<ul>
% for item in files:
    <li><a href="/files/{{item}}">{{item}}</a></li>
% end
</ul>
</body>
</html>
"""
    files = [file_name for file_name in listdir(os.path.join(root, 'files'))
             if os.path.isfile(os.path.join(root, 'files', file_name))]
    return template(tmpl, files=files)


@route('/files/<filename>')
def server_static(filename):
    return static_file(filename, root=os.path.join(root, 'files'))


class AquaServer(bottle.ServerAdapter):
    """Bottle server adapter"""
    def run(self, handler):
        import asyncio
        import logging
        from aqua.wsgiserver import WSGIServer
        logging.basicConfig(level=logging.ERROR)
        loop = asyncio.get_event_loop()
        server = WSGIServer(handler, loop=loop)
        server.bind(self.host, self.port)
        try:
            loop.run_forever()
        except KeyboardInterrupt:
            pass  # Press Ctrl+C to stop
        finally:
            server.unbindAll()
            loop.close()


if __name__ == '__main__':
    bottle.run(server=AquaServer, port=5000)


When writing the WSGI server code, I did not run into any nuances attributable to the asyncio module itself. The only subtlety is a browser peculiarity (Chrome, for instance): it resets the request as soon as it sees that a large file is coming. Apparently this is done in order to switch to a more optimized way of downloading large files, because the request is then repeated and the file is received normally. But the first, dropped request raises a ConnectionResetError exception if data has already been sent to it with StreamWriter.write(). This case must be handled, and the connection closed with StreamWriter.close().
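A sketch of such handling (the surrounding names are illustrative; this fragment lives inside the coroutine that sends the response body):

try:
    writer.write(chunk)
    yield from writer.drain()
except ConnectionResetError:
    # The browser dropped this request (it will retry the download);
    # just close our side of the connection and stop sending.
    writer.close()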

Performance


For the comparative test I chose the siege utility. The test subjects were "our patient" (named aqua, by the way :) paired with bottle, the quite popular Waitress WSGI server also paired with bottle, and, of course, Tornado. The application was the smallest possible "hello world" (a sketch of one appears after the tables). The tests were run with 100 and 1000 concurrent connections and three response sizes: 13 bytes and 13 kilobytes with a test duration of 10 seconds, and 13 megabytes with a duration of 60 seconds. The results are below:
100 concurrent users    13 b (10 sec)         13 Kb (10 sec)        13 Mb (60 sec)
                        Avail.   Trans/sec    Avail.   Trans/sec    Avail.   Trans/sec
aqua + bottle           100.0%   835.24       100.0%   804.49       99.9%    26.28
waitress + bottle       100.0%   707.24       100.0%   642.03       100.0%   8.67
tornado                 100.0%   2282.45      100.0%   2071.27      100.0%   15.78

1000 concurrent users   13 b (10 sec)         13 Kb (10 sec)        13 Mb (60 sec)
                        Avail.   Trans/sec    Avail.   Trans/sec    Avail.   Trans/sec
aqua + bottle           99.9%    800.41       99.9%    777.15       60.2%    26.24
waitress + bottle       94.9%    689.23       99.9%    621.03       37.5%    8.89
tornado                 100.0%   1239.88      100.0%   978.73       55.7%    14.51

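The actual test application is not shown here, but a 13-byte "hello world" on bottle could be as trivial as this (a hypothetical sketch):

from bottle import route, run

@route('/')
def hello():
    return 'Hello, world!'  # exactly 13 bytes

run(port=5000)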
What can I say? Tornado certainly leads, but "our patient" pulls ahead on large files and improves its relative performance as the number of connections grows. Moreover, it confidently outpaced waitress (with its four child processes, one per CPU core), which is held in good standing among developers. I will not claim my testing is 100% rigorous, but as a rough estimate it should do.

Updated: I noticed the strange numbers for the 13-megabyte response body. Indeed, in a 10-second run the transfers probably did not even have time to complete :) I replaced them with the numbers obtained with a test duration of 60 seconds.

Example of running the siege utility and full results for the last column of the second table
$ siege -c 1000 -b -t 60S http://127.0.0.1:5000/
** SIEGE 2.70
** Preparing 1000 concurrent users for battle.
Transactions:                   1570 hits
Availability:                  60.18 %
Elapsed time:                  59.84 secs
Data transferred:           20410.00 MB
Response time:                  5.56 secs
Transaction rate:              26.24 trans/sec
Throughput:                   341.08 MB/sec
Concurrency:                  145.80
Successful transactions:        1570
Failed transactions:            1039
Longest transaction:           20.44
Shortest transaction:           0.00

$ siege -c 1000 -b -t 60S http://127.0.0.1:5001/
** SIEGE 2.70
** Preparing 1000 concurrent users for battle.
The server is now under siege...
Lifting the server siege...      done.
Transactions:                    526 hits
Availability:                  37.49 %
Elapsed time:                  59.20 secs
Data transferred:            6838.00 MB
Response time:                 16.05 secs
Transaction rate:               8.89 trans/sec
Throughput:                   115.51 MB/sec
Concurrency:                  142.58
Successful transactions:         526
Failed transactions:             877
Longest transaction:           42.43
Shortest transaction:           0.00

$ siege -c 1000 -b -t 60S http://127.0.0.1:5002/
** SIEGE 2.70
** Preparing 1000 concurrent users for battle.
The server is now under siege...
Lifting the server siege...      done.
Transactions:                    857 hits
Availability:                  55.65 %
Elapsed time:                  59.07 secs
Data transferred:           11141.00 MB
Response time:                 20.14 secs
Transaction rate:              14.51 trans/sec
Throughput:                   188.61 MB/sec
Concurrency:                  292.16
Successful transactions:         857
Failed transactions:             683
Longest transaction:           51.19
Shortest transaction:            3.26



Outro


An asynchronous web server built on asyncio has a right to exist. It is still too early to talk about using such servers in serious projects, but after testing and hardening, and with the arrival of asynchronous asyncio drivers for databases and key-value stores, it may well become feasible.

Source: https://habr.com/ru/post/217143/

