How to become a puppeteer, or Puppet for beginners

Hello.

This post opens a series of articles on using the Puppet configuration management system.

What is a configuration management system?

Suppose you have a fleet of servers that perform various tasks. As long as there are only a few servers and the fleet is not growing, you can easily configure each one by hand: install the OS (perhaps automatically), add users, install software by typing commands into the console, configure services, set up the configs of your favorite text editors (nanorc, vimrc), point every machine at the same DNS servers, install the monitoring agent, configure syslog for centralized log collection... In short, there is a lot of work, and none of it is particularly interesting.

I truly believe that a good admin is a lazy admin: one who does not like doing the same thing several times. The first thought is to write a couple of scripts along these lines:

servers.sh

    # Copy job.sh to every server in turn and run it there.
    servers="server00 server01 server02 server03 server04"
    for server in $servers ; do
        scp /path/to/job/file/job.sh "$server":/tmp/job.sh
        ssh "$server" sh /tmp/job.sh
    done

job.sh

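    #!/bin/bash
    # Refresh the package lists, install nginx and start it.
    apt-get update
    # -y: do not wait for confirmation, the script runs unattended over ssh
    apt-get install -y nginx
    service nginx start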

Everything seems to have become easy and pleasant. Need to do something? Write a new script and launch it. The changes reach every server, one after another. If the script is well debugged, everything will be fine. For the time being.

Now imagine that there are more servers: say, a hundred. And the change takes a long time, for example building something big and scary (a kernel, say) from source. The script will run for a hundred years, but that is only half the trouble.

Imagine that you need to do this only on a particular group of those hundred servers. And two days later you need to run another large task on a different slice of the fleet. Each time you will have to rewrite the scripts and check them over and over for errors that might cause trouble when they run.

The worst part is that such scripts describe the actions needed to bring the system into a certain state, not the state itself. So if the system was not in the initial state you assumed, things will almost certainly go wrong. Puppet manifests instead describe the required state of the system declaratively; working out how to reach it from the current state is the job of the configuration management system.

For comparison, here is a Puppet manifest that does the same work as the pair of scripts at the beginning of this post:

nginx.pp

    class nginx {
      package { 'nginx':
        ensure => latest,
      }
      service { 'nginx':
        ensure  => running,
        enable  => true,
        require => Package['nginx'],
      }
    }

    node /^server(\d+)$/ {
      include nginx
    }

If you set the servers up properly and invest some time in the initial configuration of the configuration management system, you can bring the fleet to a state where you no longer need to log in to the servers to get work done. All necessary changes will arrive automatically.
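The "specific group of servers" problem described earlier maps onto node definitions. The following is only a rough, standalone sketch: the host name patterns and both class names are invented for illustration, and nagios-nrpe-server is just one possible monitoring agent package.

```puppet
# Hypothetical classes, each describing one role.
class web_frontend {
  package { 'nginx':
    ensure => installed,
  }
}

class monitoring_agent {
  package { 'nagios-nrpe-server':
    ensure => installed,
  }
}

# Hosts whose names start with web00, web01, ... get both roles.
node /^web\d+/ {
  include web_frontend
  include monitoring_agent
}

# Every other host gets only the monitoring agent.
node default {
  include monitoring_agent
}
```

Moving a task to a different slice of the fleet then means changing a node pattern, not rewriting a script.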

What is Puppet?

Puppet is a configuration management system. Its architecture is client-server: the configs (called manifests in Puppet terms) are stored on the server, and clients contact the server, fetch them and apply them. Puppet is written in Ruby, and manifests are written in a special DSL that looks a lot like Ruby.

The first steps

Let's forget about clients, servers, their interaction and so on. Suppose we have a single server with a bare OS installed (from here on I work in Ubuntu 12.04; on other systems the steps will differ somewhat).

First install the latest version of puppet.

    wget http://apt.puppetlabs.com/puppetlabs-release-precise.deb
    dpkg -i puppetlabs-release-precise.deb
    apt-get update
    apt-get install puppet puppetmaster

Wonderful. Now we have puppet installed on our system and we can play with it.

Hello, world!

Create the first manifest:

/tmp/helloworld.pp

    file { '/tmp/helloworld':
      ensure  => present,
      content => 'Hello, world!',
      mode    => '0644',  # quoted: Puppet 4 and later require a string here
      owner   => 'root',
      group   => 'root',
    }

And apply it:

    $ puppet apply helloworld.pp
    /Stage[main]//File[/tmp/helloworld]/ensure: created
    Finished catalog run in 0.06 seconds

A little bit about the launch

The manifests in this post can be applied by hand with puppet apply. In later posts, however, we will work with the master-agent setup that is standard for Puppet.

Now look at the contents of the /tmp/helloworld file. It will (surprise!) contain the string "Hello, world!" that we specified in the manifest.

You might say that all this could be done with echo "Hello, world!" > /tmp/helloworld: it would be faster and simpler, there would be no need to think or to write scary manifests, and in general ~~nobody needs any of this~~ it is all somehow too complicated. But think about it more seriously. In fact, you would have to write

touch /tmp/helloworld && echo "Hello, world!" > /tmp/helloworld && chmod 644 /tmp/helloworld && chown root /tmp/helloworld && chgrp root /tmp/helloworld


to ensure the same result.

Let's go through our manifest line by line:

/tmp/helloworld.pp

    file { '/tmp/helloworld':
      ensure  => present,          # the file must exist
      content => 'Hello, world!',  # its content must be the string "Hello, world!"
      mode    => '0644',           # file permissions: 0644
      owner   => 'root',           # file owner: root
      group   => 'root',           # file group: root
    }

In Puppet terms, this describes a resource of type file with the title /tmp/helloworld.

Resources

A resource is the smallest unit of abstraction in Puppet. Resources can be:

files;
packages (Puppet supports package systems of many distributions);
services;
users;
groups;
cron tasks;
etc.

The syntax of each resource type can be looked up in the documentation.
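As a rough illustration of the common shape (type, title, attributes), here is a sketch that declares a few of the resource types listed above. The user name, the package and the cron schedule are made up for the example; the resource types and attributes themselves are standard Puppet ones.

```puppet
# Every resource follows the same pattern:
#   type { 'title': attribute => value, ... }

group { 'deploy':
  ensure => present,
}

user { 'deploy':
  ensure     => present,
  gid        => 'deploy',        # primary group, declared above
  shell      => '/bin/bash',
  home       => '/home/deploy',
  managehome => true,            # create the home directory as well
}

package { 'htop':
  ensure => installed,
}

cron { 'refresh-package-lists':
  command => '/usr/bin/apt-get -qq update',
  user    => 'root',
  hour    => 3,
  minute  => 0,
}
```

Saved to a file, a sketch like this can be applied with puppet apply in the same way as the "Hello, world!" manifest.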

Puppet lets you add your own resource types. So, if you really get into it, you can end up with manifests like this:

webserver.pp

    include webserver

    webserver::vhost { 'example.com':
      ensure => present,
      size   => '1G',
      php    => false,
      https  => true,
    }

Puppet will then create a 1 GiB logical volume on the server, mount it where it belongs (for example, at /var/www/example.com), add the necessary entries to fstab, create the required virtual hosts in nginx and apache, restart both daemons, and add FTP and SFTP access for example.com with the password mySuperSecretPassWord and write permission on this virtual host.

Tasty? That is putting it mildly!
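To give an idea of what stands behind such a resource, here is a heavily simplified, hypothetical sketch of a defined type. It is not the implementation behind the manifest above: it handles only an nginx virtual host, it does not implement the size, php or https parameters, and the docroot parameter and the file layout are assumptions made for the example.

```puppet
# Hypothetical class that the defined type relies on.
class webserver {
  package { 'nginx':
    ensure => installed,
  }

  service { 'nginx':
    ensure  => running,
    enable  => true,
    require => Package['nginx'],
  }
}

# A defined type is a reusable, parameterised group of resources.
# $title is the resource title, e.g. 'example.com'.
define webserver::vhost (
  $ensure  = 'present',
  $docroot = "/var/www/${title}"
) {
  file { "/etc/nginx/conf.d/${title}.conf":
    ensure  => $ensure,
    content => "server { listen 80; server_name ${title}; root ${docroot}; }\n",
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    require => Package['nginx'],
    notify  => Service['nginx'],  # restart nginx when the vhost file changes
  }
}

include webserver

webserver::vhost { 'example.com': }
```

Each additional virtual host is then one more webserver::vhost declaration with a different title.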

Moreover, the tastiest part, in my opinion, is not the automation of routine. If, say, you are an idiot and keep reinstalling your production servers, Puppet lets you bring your old, lovingly assembled set of packages and configs back up from scratch, fully automatically. You simply install the Puppet agent, connect it to your Puppet master and wait. Everything arrives by itself. Packages appear on the server as if by magic (no, really, by magic!), your SSH keys are put in place, the firewall is configured, your personal bash settings arrive, the network is set up, and all the software you prudently installed through Puppet is installed and configured.
In addition, with some effort Puppet gives you a self-documenting system, because the configuration (the manifests) is itself the backbone of the documentation. The manifests are always up to date (they are what is actually running), they contain no errors (you test your settings before rolling them out), and they carry the minimum necessary detail (it works, after all).

Some more magic

A bit about cross-distribution

Puppet can handle cross-distribution manifests; that is one of the purposes it was created for. I have deliberately never used this and do not recommend it to you. A server fleet should be as homogeneous as possible in its system software: that way, at a critical moment you do not have to think "damn, it's rc.d here, not init.d" (a nod in the direction of ArchLinux), and in general you get to think less about routine tasks.

Many resources depend on other resources. For example, the "sshd service" resource needs the "sshd package" resource and, optionally, the "sshd config" resource.
Let's see how this is implemented:

    file { 'sshd_config':
      path    => '/etc/ssh/sshd_config',
      ensure  => file,
      content => "Port 22
    Protocol 2
    HostKey /etc/ssh/ssh_host_rsa_key
    HostKey /etc/ssh/ssh_host_dsa_key
    HostKey /etc/ssh/ssh_host_ecdsa_key
    UsePrivilegeSeparation yes
    KeyRegenerationInterval 3600
    ServerKeyBits 768
    SyslogFacility AUTH
    LogLevel INFO
    LoginGraceTime 120
    PermitRootLogin yes
    StrictModes yes
    RSAAuthentication yes
    PubkeyAuthentication yes
    IgnoreRhosts yes
    RhostsRSAAuthentication no
    HostbasedAuthentication no
    PermitEmptyPasswords no
    ChallengeResponseAuthentication no
    X11Forwarding yes
    X11DisplayOffset 10
    PrintMotd no
    PrintLastLog yes
    TCPKeepAlive yes
    AcceptEnv LANG LC_*
    Subsystem sftp /usr/lib/openssh/sftp-server
    UsePAM yes",
      mode    => '0644',
      owner   => 'root',
      group   => 'root',
      require => Package['sshd'],
    }

    package { 'sshd':
      ensure => latest,
      name   => 'openssh-server',
    }

    service { 'sshd':
      ensure    => running,
      enable    => true,
      name      => 'ssh',
      subscribe => File['sshd_config'],
      require   => Package['sshd'],
    }

The config is inlined here, which makes the manifest ugly. In practice this is almost never done: there is an ERB-based template mechanism, and you can also simply use external files. But that is not what interests us here.
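For the curious, here is a minimal sketch of what the external-file variant might look like. It assumes a hypothetical module named ssh on the module path, containing files/sshd_config (and, for the template variant, templates/sshd_config.erb); neither the module nor these files are part of this post.

```puppet
package { 'sshd':
  ensure => latest,
  name   => 'openssh-server',
}

file { 'sshd_config':
  ensure  => file,
  path    => '/etc/ssh/sshd_config',
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  # Static file, served from <modulepath>/ssh/files/sshd_config.
  source  => 'puppet:///modules/ssh/sshd_config',
  # Alternative (use instead of source, never both): render an ERB
  # template from <modulepath>/ssh/templates/sshd_config.erb.
  # content => template('ssh/sshd_config.erb'),
  require => Package['sshd'],
}
```

Either way the manifest shrinks to a few lines, and the config lives in a file of its own that is easier to edit and to review.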

The tastiest lines here are the dependency lines: require and subscribe.

Puppet supports many ways of describing dependencies. The details, as always, can be found in the documentation.

require means exactly what you would expect. If resource A depends on (requires) resource B, Puppet first processes resource B and only then comes back to resource A.
subscribe behaves in a slightly trickier way. If resource A subscribes to resource B, Puppet first processes B and then returns to A (just as with require), and whenever B changes, A is processed again. This is very convenient for services that depend on their configs (as in the example above): if the config changes, the service is restarted, and you do not have to take care of it yourself.

There are also notify and before, but we will not touch them here. If you are interested, see the documentation already mentioned.
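For readers who want a quick look anyway: before and notify are the mirror images of require and subscribe, written on the other resource of the pair. The sketch below is a cut-down version of the sshd example (the config content is omitted for brevity, so the file resource only ensures that the file exists) and should describe the same ordering and restart behaviour.

```puppet
package { 'sshd':
  ensure => latest,
  name   => 'openssh-server',
  before => File['sshd_config'],  # same ordering as require on the file
}

file { 'sshd_config':
  ensure => file,
  path   => '/etc/ssh/sshd_config',
  notify => Service['sshd'],      # same effect as subscribe on the service
}

service { 'sshd':
  ensure  => running,
  enable  => true,
  name    => 'ssh',
  require => Package['sshd'],
}

# The same relationships can also be written with chaining arrows
# in place of the before and notify attributes:
#   Package['sshd'] -> File['sshd_config'] ~> Service['sshd']
```

Which side carries the relationship is mostly a matter of taste; it reads best when written on the resource that "knows" about the other one.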

Summary

By now we have learned how to write simple manifests that specify dependencies between resources. A great many simple daemons fit the "package, config, service" model, so even in this form Puppet is already usable.
Subsequent posts will describe how to use Puppet's more powerful features while building a spherical LAMP hosting in a vacuum (if you have other ideas for a spherical training project, you are welcome to send a PM or leave a comment).