Ansible is not so simple

I have three servers, but I'm not a professional sysadmin. This means that despite four databases and applications, nothing is backed up anywhere, I approach every server problem with a loud sigh and a plate thrown at the wall, and the operating systems there reached EOL two years ago. I would be happy to upgrade, but I'd probably have to set aside a week to save and rebuild everything. It's easier to just forget about yum update and apt-get upgrade.

Of course, this is wrong. I have long been eyeing Chef and Puppet, which I thought would solve all my problems. But I looked at the configs of projects I know and put it off. You have to learn it, deal with Ruby, and fight with what the reviews say are numerous quirks and limitations. Two weeks ago, an article by George amarao gave me a life-giving kick. Not even the article itself, but its list of configuration management systems. After reading the comments and some light googling, I decided: I'll take Ansible. Because it's Python, and nobody complains about problems.

Well, then I will be the first.

First, I dug through a pile of Ansible documentation and tutorials, starting with the useless Quick Start video on the official website. There are plenty of them, of course, made for different tasks and written by different people, but they have one thing in common: the tutorials are written for people who already understand Ansible. For people with a spherical server in a vacuum, for whom it is enough to hint that there are roles, modules and tasks. I came with a clean slate and stepped on every rake I could find. I hope this article helps you get around them.

I expected miracles from configuration management systems, such as applications that update themselves from git. It turned out that Ansible is only a way to record the sequence of actions for setting up a new server. In Ansible you can do only what you could do from the console yourself. There are no miracles.

Start. Vagrant

The task: I'm not creating a new host, because I want to keep the IP. That is, I will wipe the droplet through the control panel, then initialize it with Ansible. The plan: write a playbook and debug it in Vagrant.

Getting started is very hard. All Ansible tutorials begin by describing the inventory, where you need to write the server address. But what is the IP of a Vagrant box? God knows. The Ansible documentation has instructions for running a playbook in Vagrant; the Vagrant documentation has instructions for connecting Ansible, and the two are not quite identical. In the end I gave up looking for the IP and took the common part: a minimal Vagrantfile that launches the playbook.

 Vagrant.configure(2) do |config|
   config.vm.box = "ubuntu/xenial64"
   config.vm.network "forwarded_port", guest: 80, host: 8080
   config.ssh.insert_key = false
   config.vm.provision "ansible" do |ansible|
     ansible.verbose = "v"
     ansible.playbook = "playbook.yml"
   end
 end

I sketched a draft playbook, created stubs for the roles, and ran vagrant up. It didn't work, because the official xenial image exists only for VirtualBox, while on Fedora Linux virtualization goes through libvirt. It took me a while to recall the right command: vagrant up --provider virtualbox. Then came YAML syntax errors (why are there three mandatory hyphens at the beginning?). Remember: once the box is running, rerun Ansible with vagrant provision.

And the first surprise: the Ubuntu 16.04 box has no python by default! Unthinkable on Fedora, where the package manager is written in python. Ansible, as I found out, uploads its modules to the server and runs them there. Off to StackOverflow, where we find the magic task (more precisely, ten variations of one task, with no clue which is best):

 - name: Install python for Ansible
   become: yes
   raw: test -e /usr/bin/python || (apt -qy update && apt install -y python-minimal)
   register: output
   changed_when: output.stdout != ""

Superuser, become!

Even with the documentation and examples, a lot remains unclear. I don't understand, for example, why Vagrant overrides remote_user, or how it happens that every box has a superuser. I will run the playbook on a clean server where there is only root, and I will need to create my own superuser. But apparently that has to be done differently under Vagrant than on a clean server. And in general it's unclear: will there have to be two playbooks, one for staging and one for production?

Or take become and become_user: one does not imply the other. Which of them should go into the top-level playbook if you constantly need root to configure the server? At first I set become: yes and wrote become_user: root in every second task. Then it turned out that everything runs as root without become_user too! Root is the default value, so I had in effect been doing sudo -i from the very start without realizing it.
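The difference can be sketched like this. It is a minimal illustration rather than the actual playbook, and the www-data account is only an assumed example of a non-root user that exists on the target:

```yaml
- hosts: all
  become: yes   # turn privilege escalation on for the whole play
  tasks:
    - name: Runs as root, because become_user defaults to root
      command: whoami
      changed_when: false

    - name: Runs as another user only when become_user is set explicitly
      command: whoami
      become_user: www-data   # assumed example account
      changed_when: false
```

In other words, become switches escalation on, and become_user only chooses whom to become.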

Somewhere around here I remembered that I hadn't updated the system on my laptop for a long time, and ran dnf update while continuing to fiddle with the playbook. Vagrant was running, and dnf in the next tab updated VirtualBox. Apparently you shouldn't do that, because the next vagrant provision said: “everything broke and it's not my fault.” It was missing VirtualBox, which “terminated unexpectedly during startup with exit code 1 (0x1)”, and nothing would budge it. The vboxheadless -h command (I'm not a real devops, I googled it) showed error -1912. Everyone on the Internet gives the same answer: reinstall VirtualBox. Like hell that helps. In desperation, I found a xenial box for libvirt and switched to it. It's good to have a choice.

From some example I copied a call to the apt module with a bunch of parameters, and later learned that it's better to do update_cache=yes as a separate task. And that task, annoyingly, returns "changed" every time. It turned out you need to set cache_valid_time=3600 so that updates are checked no more than once an hour. At first I thought of writing 86400 (a day), but I'm not going to run Ansible from cron, and I'll be lucky to run it once a month.
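As a separate task this might look roughly as follows; the task name is made up, while update_cache and cache_valid_time are the apt module parameters mentioned above:

```yaml
- name: Refresh the apt cache at most once an hour
  become: yes
  apt:
    update_cache: yes
    cache_valid_time: 3600   # seconds; the task stops reporting "changed" while the cache is fresh
```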

Deploy the database

Installing PostgreSQL is five lines in the console or a whole epic in Ansible. At some point you need to set become_user: postgres. And here the box produced a strange error mentioning "becoming an unprivileged user". Remember how Ansible uploads modules to the server and runs them there? Well, it uploads them as root or another superuser, and then the postgres user has no access to them. Tough luck.

StackOverflow to the rescue again: it turns out there are three ways out. One is to create ansible.cfg and write pipelining=True in it (and to solve some other problem that came up, I temporarily set pipelining=False). The second way out is, literally, “don't do this.” The third is the simplest: install the acl package and everything magically works. Or rather, it fails in a different way: "sudo: a password is required". What's going on, where do passwords even come into this when I log in with a key?

It turned out that I log into the virtual machine not with my own key but as the vagrant user, the one created before us and for us. Ansible with become_user apparently runs sudo -u postgres, but that requires the vagrant user's password. And there is no password.

I start going through the options. become_method: su times out, because the server asks for a password and Ansible doesn't notice. What it does there is unclear, because sudo su postgres doesn't ask me for a password. There is the option of writing “vagrant ALL=(ALL) ...” in the /etc/sudoers.d/vagrant file, because the word in parentheses allows sudo -u without a password. But then the playbook becomes tailored to Vagrant, and I still have to run it in production. Sloppy.

Out of desperation, I try removing become. Postgres predictably responds: "Peer authentication failed for user "postgres"". So much for that idea. New plan: run the role as the user zverik, who has every permission in the world. I split the playbook in two: the first installs python and creates the user, the second installs and configures everything else with remote_user: zverik. I run it. And again "sudo: a password is required". Why? Ah yes, Vagrant passes its own value of remote_user and doesn't let you change it. Well, damn.
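The third way out, installing acl, fits into one task. This is a sketch under the assumption that it runs as root before any task that uses become_user: postgres:

```yaml
- name: Install acl so Ansible can hand its temporary files to an unprivileged user
  become: yes
  apt:
    name: acl
    state: present
```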
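The split might be sketched as two plays. This is only an outline under assumptions: both plays are shown in one file for brevity, the role name database is hypothetical, and installing the SSH key and sudo rights for zverik is left out:

```yaml
# Play 1: bootstrap with the default login user
- hosts: all
  gather_facts: no   # fact gathering needs python, which is not installed yet
  become: yes
  tasks:
    - name: Install python for Ansible
      raw: test -e /usr/bin/python || (apt -qy update && apt install -y python-minimal)
      register: output
      changed_when: output.stdout != ""

    - name: Create the admin user
      user:
        name: zverik
        groups: sudo
        append: yes
        shell: /bin/bash

# Play 2: everything else, logging in as the new user
- hosts: all
  remote_user: zverik
  become: yes
  roles:
    - database   # hypothetical role name
```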

To distract myself, I opened a text editor and started writing these notes. By this point I had been working with Ansible for a week, an hour and a half to two hours at a time, and I had not even created a database in Postgres. In the tutorials it all looks so simple... I counted the Ansible-related tabs in Firefox: 48. Forty-eight. About one sixth of the total.

Then I disabled ansible.force_remote_user in the Vagrantfile and reran provision. Hooray, a new error! It says that logging in as zverik works only with a key. But I have the key, and vagrant ssh -p works and lets me in without a password. I googled a solution: specify the path to the key in ansible.cfg. That doesn't work, for the same reason as with remote_user: Vagrant wins. This time the override is easier: add the variable ansible_ssh_private_key_file: "{{ lookup('env', 'HOME') }}/.ssh/id_rsa", and everything works! Not very pretty, but hooray!

Once I had sorted out the users, writing the roles went smoothly. One role out of six is already done, sixty tasks. But getting started is harder than the tutorials make it seem.
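One place the override could live is the vars of the play. That placement is an assumption, since the text does not say exactly where the variable was added, and the ping task is there only to check the connection:

```yaml
- hosts: all
  remote_user: zverik
  vars:
    # point Ansible at the local private key instead of the one Vagrant supplies
    ansible_ssh_private_key_file: "{{ lookup('env', 'HOME') }}/.ssh/id_rsa"
  tasks:
    - name: Check that the login works
      ping:
```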

Useful stuff

While writing playbooks, you discover or google a lot of useful little things. Some are described in the documentation, some in articles (search for "Ansible" on Habr). Here are a few of them.

To run commands, use only the command or shell modules. The latter, as the documentation says, only as a last resort, so forget about output redirection and &&. The result is always "changed", which is bad. Control the result either with the creates parameter (more conveniently in the args block, together with chdir), or with register and changed_when. It is useful to check conditions before running: first a reconnaissance task with command + register + changed_when: False, and then a when that inspects the saved stdout to decide whether to run the command.

The fewer command module calls, the better. Google it: there is almost always a module. For example, at first I did command: npm install -g {{ item }}, and then discovered that you can write npm: name={{ item }} global=yes. A module is always better than a command, because you don't need to check the current state yourself, and because the result comes back not as a line of stdout but as a convenient structure.

Configuration files are almost always handled with lineinfile, which finds a line by regular expression and replaces it with another. The blockinfile module adds whole blocks of text. It has one catch: if several tasks write into the same file, you need to override the marker: # {mark} block name. Otherwise each task will overwrite the others' blocks.

Before modifying PostgreSQL tables, it is convenient to check their state through pg_tables. For example:
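Both approaches might look like this. The paths, the archive and the database name gis are hypothetical, and the second pair of tasks assumes that become_user: postgres already works:

```yaml
- name: Unpack the archive only once
  command: tar xzf app.tar.gz
  args:
    chdir: /opt/app          # hypothetical directory
    creates: /opt/app/bin    # the task is skipped when this path already exists

- name: Reconnaissance, never reported as changed
  command: psql -A -t -c "SELECT 1 FROM pg_database WHERE datname = 'gis'"
  become: yes
  become_user: postgres
  register: db_check
  changed_when: false

- name: Create the database only when the probe returned nothing
  command: createdb gis
  become: yes
  become_user: postgres
  when: db_check.stdout == ""
```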
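In full YAML form the module call could be written as below; the package names are placeholders, not the ones actually installed:

```yaml
- name: Install global npm packages
  become: yes
  npm:
    name: "{{ item }}"
    global: yes
  with_items:
    - first-package    # placeholder names
    - second-package
```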
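A sketch of both modules, using a hypothetical /etc/myapp.conf and made-up settings; the point is that each blockinfile task gets its own marker:

```yaml
- name: Replace a single setting in place
  lineinfile:
    path: /etc/myapp.conf          # hypothetical file
    regexp: '^#?\s*port\s*='
    line: 'port = 8080'

- name: Add a block for the tile server
  blockinfile:
    path: /etc/myapp.conf
    marker: "# {mark} tile server"   # {mark} becomes BEGIN and END
    block: |
      tile_dir = /var/tiles
      tile_cache = on

- name: Add a second block to the same file under its own marker
  blockinfile:
    path: /etc/myapp.conf
    marker: "# {mark} routing"
    block: |
      routing = on
```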

 command: psql -A -t -d {{ gisdb }} -c "SELECT tableowner FROM pg_tables WHERE schemaname = 'public' AND tablename = 'spatial_ref_sys'"

Reuse is everything: if, instead of two almost identical tasks, you can write one with conditional expressions and with_items, do so. A group of repeating tasks with similar parameters goes into a separate file and is called via include_role with vars. There should be something here about parameterizing roles, but I'm still learning and I have only one role.

One article advised not to reinvent the wheel but to look for suitable roles in the Ansible Galaxy catalog. Indeed, thousands of people have installed php-fpm and postfix before you, and often there is a well-written role with convenient defaults.

On the other hand, what's the point of downloading the geerlingguy.apache role when apt: pkg=apache2 covers everything I need? Or, say, I found a role that installs osm2pgsql from source, and it dates from 2014 and uses the outdated sudo: yes. So, of course, I put roles_path = roles.galaxy:roles in ansible.cfg and made a playbook to install all the roles, but there is nothing to install yet. Here is what it looks like:
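For illustration, one task with a loop and a condition, followed by a parameterized role call. The directory names, the create_dirs variable, the webapp role and its app_name parameter are all hypothetical:

```yaml
- name: Create several directories with one task
  file:
    path: "{{ item }}"
    state: directory
    mode: "0755"
  with_items:
    - /var/www/site
    - /var/www/tiles
  when: create_dirs | default(true)   # hypothetical switch

- name: Run the same group of tasks with different parameters
  include_role:
    name: webapp        # hypothetical role
  vars:
    app_name: tiles     # hypothetical role parameter
```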

 - hosts: localhost
   vars:
     galaxy_path: roles.galaxy
   tasks:
     - name: Remove old galaxy roles
       file: path={{ galaxy_path }} state=absent
     - name: Install Ansible Galaxy roles
       local_action: command ansible-galaxy install -r requirements.yml --roles-path {{ galaxy_path }}

And in requirements.yml, write an entry for each role from Galaxy:

 - src: .

Have you written a playbook and it ran to the end in Vagrant? Great, now do vagrant destroy and re-create the box. You are bound to find a few slip-ups: a forgotten sudo, a missing mode: 0755 for executable files, missing packages (dnf provides will help, or apt-file, which has to be installed first). Finally, the most important thing: a second run of vagrant provision should report "changed: 0".

***

Moving servers to a configuration management system is hard, whichever system you choose. But once you are past the initial field of rakes, the joy of programming the playbook kicks in. The main thing is to keep the goal in mind so as not to burn out: right now my target OS is Ubuntu 16.04, and in a month I will move the server to 18.04 without much trouble. And the pleasure of bringing up a full-featured server from scratch with a single console command will help along the way.

Source: https://habr.com/ru/post/352616/
