A few tips on organizing a Python application on the server

In this article I want to share several convenient ways to organize your project on a working (even production) server.

I work mainly with the Python / Django stack, so all the examples will be, first of all, applied to this set. Also key technologies: Ubuntu (17.10), Python3 (3.6).

Content:

Logs (logrotate)
Daemons (systemd)
local settings

It is assumed that you are doing everything correctly, the application is stored in the repository, is deposited in a separate folder on the server, using, for example, virtualenv. To start, a separately created user is used, which has enough rights, but not too much (for example, it does not have sudo and login via ssh is not allowed).

To begin with, I will repeat the truisms that all that is stored in the repository is only pure code and static data. All settings contain only defaults, no passwords and keys, even in unused files.

Even on the work computer (laptop) where you write the code in your project folder there should not be anything that you could not upload to production. This refers to the vicious practice of using the file "local_settings.py" in the settings folder inside the project (as an option - development_settings.py). I will analyze this example below.

Logs

Surely you use logging. The built-in logging module is very good, but it is not always worth it to refine and use it for everything in the world.

For example, the rotation of the logs. The Internet comes across complex and sophisticated methods ranging from the standard RotatingFileHandler and ending with writing your own demon on sockets for recording logs from several sources. Problems begin because of the desire to do everything in "pure Python". This is stupid and inefficient, but it brings a bunch of possible places of occurrence of errors.

Use the logrotate service. Below is a simple config for celery logs.

Using standard tools, we write the file /var/log/myproject/celery.log, every day it is placed in the / var / log / myproject / archive / folder and the previous day suffix is added to the name.

/var/log/myproject/celery.log { size 1 su myuser myuser copytruncate create rotate 10 missingok postrotate timeext=`date -d '1 day ago' "+%Y-%m-%d"` # daily # timeext=$(date +%Y-%m-%d_%H) # hourly mv /var/log/myproject/celery.log.1 /var/log/myproject/archive/celery_$timeext.log endscript }

If your log is written very quickly and you want to rotate it every hour, then comment the lines "daily" and "hourly" in the config. You also need to configure logrotate so that it runs every hour (by default, usually daily). Run in bash:

 sudo cp /etc/cron.daily/logrotate /etc/cron.hourly/logrotate sudo sed -i -r "s/^[[:digit:]]*( .+cron.hourly)/0\1/" /etc/crontab

Config (file myservice) must be put in the logrotate folder

 sudo cp config/logrotate/myservice /etc/logrotate.d/myservice

important points:

the config should be copied exactly, simlinks will not work
in principle, the config is almost taken from the docks by logrotate, but it is very important to put the copytruncate directive
(added from rusnasonov comment) logrotate is stupid and uses the simplest rotation system, without buffers. The rotation takes place in two steps - the old file is first copied, and then it is cut off at the old place. This can lead to loss of logs that were recorded in between (as reflected in the documentation )

copytruncate is important because the file will not be closed when rotated. Since we rotate the file on a live system, the file is open and the services that write to it, do it on some file descriptor. If you just move the file and create a new one, it will not be used. copytruncate says that you need to copy the contents, and then clear the file, but not close it.

Services

How do you run your app? There are a lot of different ways. According to my observations, the main ones are:

we launch screen / tmux and inside we launch an interactive script
"-D" daemon mode for gunicorn or celery
supervisord
init.d script
Docker

All of them have their advantages, but, in my opinion, they have even more disadvantages.

I will not consider Docker here. First, I don't have much experience with it. And secondly, if you use containers, then you and the rest of the tips in this article are not very necessary. There is another approach.

I believe that if the system provides us with a convenient tool, then why not use it.

In Ubuntu, starting from version 15.04, systemd is supplied by default to manage services (and not only).

systemd is very convenient because it does everything right:

runs the application under the desired user and sets the environment variables, if necessary
when the application suddenly stops - restarts its specified number of times
very flexible and allows you to customize dependencies and launch order

Of course, if you do not have systemd, then you can look towards the supervisord, but I have a rather big dislike for this tool and I avoid using it.
I hope there will be no people who will doubt that in the presence of systemd it is harmful to use supervisord.

Below I will give an example of the config to run.

Run gunicorn (proxied through local nginx, but this is not important here).

 [Unit] Description=My Web service Documentation= StartLimitIntervalSec=11 [Service] Type=simple Environment=DJANGO_SETTINGS_MODULE=myservice.settings.production ExecStart=/opt/venv/bin/python3 -W ignore /opt/venv/bin/gunicorn -c /opt/myservice/config/gunicorn/gunicorn.conf.py --chdir /opt/myservice myservice.wsgi:application Restart=always RestartSec=2 StartLimitBurst=5 User=myuser Group=myuser ExecStop=/bin/kill -s TERM $MAINPID WorkingDirectory=/opt/myservice ReadWriteDirectories=/opt/myservice [Install] WantedBy=multi-user.target Alias=my-web.service

Here it is important not to use the demonization mode. We start gunicorn with a normal process, and systemd demonizes it itself, it also monitors the restart when it crashes.

Please note that we use the path to python and gunicorn relative to the virtualenv folder.

For celery, everything will be the same, but I recommend the start line (put your own paths and values):

 ExecStart=/opt/venv/bin/celery worker -A myservice.settings.celery_settings -Ofair --concurrency=3 --queues=celery --logfile=/var/log/myservice/celery.log --max-tasks-per-child 1 --pidfile=/tmp/celery_myservice.pid -n main.%h -l INFO -B

It is worth paying attention to the parameters for restarting:

 StartLimitIntervalSec=11 RestartSec=2 StartLimitBurst=5

In short, this means the following: if the service is down, then start it again after 2 seconds, but no more than 5 times in 11 seconds. It is important to understand that if the value in StartLimitIntervalSec is, for example, 9 seconds, then if the service stops 5 times in a row (immediately after launch), then after the fifth fall, systemd will give up and will not raise it anymore (2 * 5). The value 11 is chosen precisely in order to exclude such variations. If, for example, you had a network failure for 15 seconds and the application crashes immediately after the start (without a timeout), then let it be better to hammer to the victorious than it just stops.

To install this config into the system, you can simply make a symlink from the working folder:

 ~~sudo ln -s /opt/myservice/config/systemd/*.service /etc/systemd/system/~~ sudo systemctl daemon-reload

However, you need to be careful with the symlinks - if your project is not on the system disk, then there is a possibility that it can be mounted after the start of services (for example, a network disk or memory-mapped). In this case, it simply will not start. Here you will have to google how to properly configure dependencies, and indeed the config file is then best copied to the systemd folder.
Update: after the Andreymal remark , I think that it would be more correct to copy the configs to a folder and set them the correct rights:

 sudo chown root: /opt/myservice/config/systemd/*.service sudo chmod 770 /opt/myservice/config/systemd/*.service sudo cp /opt/myservice/config/systemd/*.service /etc/systemd/system/ sudo systemctl daemon-reload

I also advise you to disable the output to the console, otherwise everything will fall into the syslog.

When you have all the components in systemd, then using each of them comes down to:

 sudo systemctl stop my-web.service sudo systemctl stop my-celery.service sudo systemctl start my-web.service sudo systemctl start my-celery.service

When there are more than one component, it makes sense to write a script for mass management of the entire farm on the server.

 bash manage.sh migrate bash manage.sh start

For remote debugging via the console (run shell_plus from django-extensions):

 bash manage.sh debug

If you have more than one server, then, of course, you already use your own scripts for deployment, this is just an example.

Local settings

This is a topic for holivar and I want to note that this paragraph is exclusively my point of view, which is based on my experience. Do what you want, I just advise.

The essence of the problem is that no matter how you write the application, it will most likely have to deal with data that is stored somewhere. List of users, the path to the keys, passwords, the path to the database, and so on ... All these settings can not be stored in the repository. Someone keeps, but this is a vicious practice. Insecure and inflexible.

What are the most frequent ways to store such settings and what are the problems with them:

file local_settings.py, which is stored in the project folder next to the default setting.py
Problem: you can accidentally commit a file, you can wipe it when copying / updating a folder (rsync or archive)
environment variables. Also not very safe, not very convenient (for example, when soft-reload)
separate file outside the project folder

I recommend this method. Usually I create a yaml file for the project in the folder "/ usr / local / etc /". I have written a small module that using ~~the magic~~ Hacks loads variables from a file into the locals() or globals() importing module.

It is used very simply. Somewhere in the depths of settings.py for Django (better towards the end), just call:

 import_settings("/usr/local/etc/myservice.yaml")

And all the content will be mixed into global settings. I use merge for lists and dictionaries, it may not be convenient for everyone. It is important to remember that Django imports only UPPERCASE constants, that is, in the first level settings file you should immediately be in upper case.

That's all folks!

The rest is discussed in the comments.

Source: https://habr.com/ru/post/351566/

All Articles

A few tips on organizing a Python application on the server

Logs

Services

Local settings

More articles: