The Python programming language has recently become increasingly popular in web development. However, its wider adoption is hampered by the problem of launching Python applications efficiently: in most cases this is still the preserve of dedicated or virtual servers. Unlike PHP with its monolithic core, modular languages load at least a runtime library for each request, and often a few dozen more modules requested by the application. A classic mod_php-style approach is therefore a poor fit for Python and Perl, and keeping the application resident in memory used to be rather expensive. But times change, hardware has grown more powerful and cheaper, and for quite a while now it has been perfectly realistic to talk about persistently running application processes even on mass shared hosting.
What's out there
From time to time various suggestions for running Python applications appear on the web. For example, the hosting provider Gino recently patched mod_python in its own peculiar way and suggested hosting applications with it. Following them, a certain hosting provider called Locum rejected mod_python altogether over its security model (apparently that original sin of security is the only thing standing between an IT company and nirvana) and published a triumphant benchmark of mod_wsgi against FastCGI. The community, judging by my searching, is torn between mod_python and FastCGI; moreover, "FastCGI" usually means the implementation shipped with Django: flup. Being a popular host of Python applications ourselves, we could not pass by and decided to contribute to this holy war.
I sincerely believe that any technology should strike a sensible balance between soundness of implementation, performance, usability, and versatility, and each of the solutions below is described from that standpoint. I approached the choice of candidates subjectively, settling on the Apache web server as a universal process manager that everyone understands. From the list at www.wsgi.org/wsgi/Servers I picked flup (trac.saddi.com/flup), python-fastcgi (pypi.python.org/pypi/python-fastcgi), and mod_wsgi (www.modwsgi.org). I also took mod_python (www.modpython.org) as the most common way of running Python offered by the average hosting provider.
Getting down to business

I tried to create equally favorable conditions for all the contenders: no restarts after a set number of requests, everything configured in the standard, straightforward way. In effect, what is being tested is the efficiency and performance of the Apache -> publisher -> application chain. Many benchmarks of this kind also end up measuring interpreter performance, but I found it hard to justify comparing an interpreter against itself, and in the case of different interpreters, to decide which functionality to formalize and why it should be tested at all. I want to stress that all these tests provide only a comparative performance estimate, so no special tuning or optimization was done. And to forestall the inevitable remarks about PHP, mod_php is included in the test.
For all daemonized setups the same conditions were chosen: 2 pre-launched processes of 5 threads each (with one special case for flup, described below). All applications were tested with the ab utility at 100,000 (one hundred thousand) requests, 10 concurrent, plus an additional mod_python run of 10,000 (ten thousand) requests. The tests were run consecutively against Apache with 5, 30, and 100 pre-launched processes (MPM prefork) to identify trends.
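For reference, runs of this shape correspond to ab invocations along these lines (the URLs are placeholders for whichever application is being tested, not paths from the original setup):

```
# 100,000 requests, 10 concurrent, against the application under test
ab -n 100000 -c 10 http://localhost/app/

# the additional, shorter mod_python run
ab -n 10000 -c 10 http://localhost/mp-app/
```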
The test rig
A dual-processor Xeon E5335 2.00GHz machine, 4 GB of RAM, hard drives with a SCSI-3 interface. Installed software: FreeBSD 7.2-RELEASE amd64, Apache 2.2.11, PHP 5.2.10, Python 2.5.4, mod_wsgi 3.0, mod_python 3.3.1, mod_fastcgi 2.4.6, python-fastcgi 1.1 with FastCGI devkit 2.4.0, and flup 1.0.2. All tests were run locally on the server; the load average never exceeded 1.



flup
flup is a WSGI server with a FastCGI interface, and the main (indeed the only) documented way to launch a Django application: docs.djangoproject.com/en/dev/howto/deployment/fastcgi. For the tests I used the following program:
```python
#!/usr/local/bin/python

def my_wsgi_application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [output]

application = my_wsgi_application

from flup.server.fcgi import WSGIServer
wsgi_opts = {'maxSpare': 5, 'minSpare': 5, 'maxThreads': 5}
WSGIServer(application, **wsgi_opts).run()
```
Several difficulties are inherent in this launch method: you cannot restart the application without restarting the server, you cannot reload application code without a restart or third-party add-ons, and the application itself has to declare the FastCGI handler and its parameters. The same applies to python-fastcgi. As the results show, flup saturates already in the test with 5 pre-launched Apache processes. I also noticed that it simply drops whatever it cannot process immediately: I got up to 40% request errors in the tests. This test is genuinely saddening, because in my experience programmers rarely watch how their programs actually behave, and for many this will be a revelation. Quite surprised, I decided to see how flup behaves without a strict limit on running threads, and wrote the following program with the extra parameters removed:
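For completeness, a flup process like the one above is typically wired into Apache through mod_fastcgi as an external server. A minimal sketch, assuming the flup side is started with a Unix socket (e.g. `WSGIServer(application, bindAddress='/tmp/app.sock')`); the paths and socket name here are placeholders, not taken from the original setup:

```
# Register the externally started flup process with mod_fastcgi
FastCgiExternalServer /var/www/app.fcgi -socket /tmp/app.sock

# Route requests to it
Alias /app /var/www/app.fcgi
```

This split is exactly the "dependence of the server on the application" discussed later: Apache only proxies to the socket, and starting, stopping, and supervising the Python process is entirely your problem.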
```python
#!/usr/local/bin/python

def my_wsgi_application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [output]

application = my_wsgi_application

from flup.server.fcgi import WSGIServer
WSGIServer(application).run()
```
The result was as expected: no requests are lost, flup spawns threads on demand (I watched the ps output), but, as one might have guessed, throughput dropped almost by half.
So, my heartfelt greetings to the most popular way of launching Django today...
modwsgi
mod_wsgi is a WSGI server implemented as an Apache module. Its main mode of use is daemon mode, i.e. the web server acts only as an intermediary between resident processes that it creates and manages itself. It is the main recommended way to run Django: docs.djangoproject.com/en/dev/howto/deployment/modwsgi. Because it lives inside Apache, you get all the familiar Apache conveniences such as .htaccess, and it does not frighten system administrators. The same fact badly frightens developers who have heard of nginx and consider Apache evil incarnate. The program I used for the tests looks like this:
```python
def my_wsgi_application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [output]

application = my_wsgi_application
```
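The Apache side of a daemon-mode setup like the one tested might look roughly like this; the process and thread counts match the test conditions, while the paths and the process-group name are placeholders of my own:

```
# Two daemon processes of five threads each, as in the tests
WSGIDaemonProcess myapp processes=2 threads=5
WSGIProcessGroup myapp

# Map a URL prefix to the WSGI script shown above
WSGIScriptAlias /app /var/www/app.wsgi
```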
The test results show performance growing as the number of Apache handlers grows, i.e. no saturation, and mod_wsgi is clearly faster than flup.
Some features of mod_wsgi deserve a mention. First, it has its own setting for how many requests a process serves before being restarted, which lets it deal effectively with memory leaks. I did not set it in these tests (nor the equivalents for the other methods), since it would obviously cost a little performance. Second, unlike the other methods, it can be configured with an idle timeout after which the daemon is restarted, so an unused application, or one with leaked memory, does not have to sit fully loaded in memory while nobody needs it. Third, it automatically restarts the daemon when the application file is updated, so after modifying the program we can always be sure we will see the new version; none of the other methods can do this without special workarounds. Another important property is that the launch mechanics are removed from the application's area of responsibility entirely: note that in the example, the program really contains nothing but the WSGI interface.
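That interface is easy to exercise even without a server. A minimal sketch: we call the application by hand with a stub environ and start_response, much as any WSGI gateway (mod_wsgi, flup, python-fastcgi) would; the environ dictionary here is a bare-bones stand-in, not a complete CGI environment:

```python
# The same application as in the tests
def my_wsgi_application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [output]

# A stub start_response that just records what the app reports
captured = {}

def start_response(status, response_headers):
    captured['status'] = status
    captured['headers'] = response_headers

# Invoke the app the way a gateway would and collect the body
body = b''.join(
    chunk if isinstance(chunk, bytes) else chunk.encode('latin-1')
    for chunk in my_wsgi_application({'REQUEST_METHOD': 'GET'}, start_response)
)
print(captured['status'], body)  # → 200 OK b'Hello World!'
```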
python-fastcgi
It is... bingo! A WSGI server with a FastCGI interface. In fact, it is a wrapper around the standard C++ FastCGI library. The program looks like this:
```python
#!/usr/local/bin/python
import fastcgi

def my_wsgi_application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [output]

application = my_wsgi_application

s = fastcgi.ThreadedWSGIServer(my_wsgi_application, workers=5)
s.serve_forever()
```
The test results speak for themselves: as the number of server handlers grows, so does performance. python-fastcgi is plainly the leader of our tests (hello, Locum). In general, once I had gotten FastCGI under Apache working at all, this module raised the fewest questions and complaints. Naturally, it shares all the drawbacks of this launch method: setup complexity, the server's dependence on the application, the lack of standard restart tools (for example, to roll out a new version of the program), and so on.
mod_python
mod_python is an Apache server module in the spirit of mod_php. It provides several "hooks" and has no WSGI interface. Its main problem is considered to be security, because out of the box the code runs under the server's own user. In fairness, the same drawback afflicts any in-server module, including mod_php. For the tests I wrote the following program:
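A typical way to serve such a handler is mod_python's publisher handler, which maps URL path components to Python functions like the `index` below. A minimal sketch of the Apache directives (the directory path is a placeholder):

```
<Directory /var/www/mp-app>
    AddHandler mod_python .py
    PythonHandler mod_python.publisher
    PythonDebug Off
</Directory>
```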
```python
#!/usr/local/bin/python

def index(req):
    return "Hello World!\n"
```
Unexpectedly, the results were quite modest, and in the course of testing one more peculiarity surfaced. Here are the results for 10,000 requests:

As you can see, performance actually decreases as the number of handlers grows. The explanation is that Apache does not load the application at server start, but only after a request lands on one of the handlers; accordingly, the more handlers I configured, the more requests arrived "for the first time". Clearly, with 2-3 active applications such reloads would be quite frequent. Whether to choose a launch method that can only be configured server-wide is, of course, your business. mod_python also has trouble updating code: although it has the corresponding configuration option, we could not get it to reliably pick up changed application code without restarting the whole server. On some hosts the code-update problem is not visible thanks to the use of the diffpriv module, but then a second problem appears: the server loads the application on EVERY request, which, even extrapolating from our tests, means a serious drop in performance. And the choice of "publishers" and working with them is a separate sore point. All told, mod_python ends up at the very bottom of our ranking on the sum of its indicators.
mod_php
For comparison, I decided to run PHP through the same tests. The program is entirely obvious:
```php
<?php echo("Hello, World!"); ?>
```
The results are predictable but hardly staggering: given the absence of an extra hop and the monolithic nature of PHP itself, I expected a factor of 2 or more.
The summary is simple and obvious: the simpler the technology, the more efficient it is. The leader in the performance category is undoubtedly python-fastcgi; the leader in convenience is mod_wsgi. Moreover, mod_wsgi today clearly represents the best solution on the sum of its characteristics, even though it is neither the fastest nor the most bug-free.