What is interesting here?

The recipe for a tasty and healthy PrestoDB cluster, cooked in a Terraform pressure cooker with SaltStack in the AWS public cloud. We will consider in detail the nuances of preparing the pressure cooker itself, the steps needed to properly cook the dish, and, naturally, say a little about consuming the finished dish. This part can also be used as study material on Terraform.
So, let's get started:
Ingredients for the recipe
- Terraform - 1 pc.
- SaltStack - 1 Master and 1+ Minions
- PrestoDB - 1 coordinator and 1+ worker
- AWS account - 1 pc.
- Ingenuity and a file (for fine-tuning) - to taste
Let's look at the ingredients in more detail (leaving aside the rules for their preparation):
1. Terraform - a wonderful tool from the folks at HashiCorp (who also made such very useful things as Vagrant, Consul, Packer, Vault, etc.), used to create and modify infrastructure in various cloud (and not only cloud) environments.
2. SaltStack - a tool for automated server provisioning and configuration management. Yours truly has already written about it here and here.
3. PrestoDB - an SQL layer on top of Big Data providers, making it possible to query them in plain, familiar SQL. Developed by the folks at Facebook, who released it as OSS, for which many thanks to them.
4. An AWS account (or any other public/private cloud supported by Terraform, for example GCE or OpenStack) in which our PrestoDB cluster will later run. We will use AWS because it is the most widespread public cloud platform and is understandable to many without a mass of additional explanation.
5. This article describes only the basic principles of how this bundle of products works together, plus a few tricks to ease the process; I will not dwell on the nuances of each component - a whole book could, in principle, be written about each of them. So adapting these techniques with your own head is very welcome. And one more thing: please do not write in the comments that something is not tuned optimally (PrestoDB in particular) - that is not the goal I am pursuing.
Preparing the pressure cooker!
Any culinary recipe assumes by default that the pots and pans are ready for cooking, but in our case the correct preparation of the pressure cooker (Terraform + SaltStack) is about 80% of the key to successful cooking.
So, let's start with Terraform. There are CloudFormation for AWS and SaltCloud from the creators of SaltStack, so why was Terraform chosen? Terraform's main feature is its simplicity and an understandable DSL: to create an instance (or 10), the following description is necessary and sufficient (we assume Terraform is downloaded and available in $PATH):
provider "aws" { access_key = "XXXXXXXXXXXXXXXXXXXXX"
and a simple sequence of commands:
terraform plan
terraform apply
The flow is clear and, it seems to me, requires no explanation for those who are familiar with AWS. You can learn more about the available AWS resources here. Naturally, we assume that the AWS account whose keys are specified in the Terraform configuration has the privileges to create the necessary resources.
Actually, the most interesting part lies in the Terraform commands themselves: terraform plan does a "dry run" against the last saved state (in our example, a new instance needs to be created) and shows which resources will be created, deleted, or modified; terraform apply actually starts the process of creating the planned resources. If Terraform has already been run and you have changed the configuration since (say, added instances), the planning stage will show which resources are missing, and apply will create them.
terraform destroy
helps to completely remove all resources created with Terraform (the .tfstate files in the current directory, which describe the state of the created infrastructure, are taken into account).
An important point not to forget: in most cases Terraform will not modify existing resources in place - it will simply delete the old ones and recreate them. This means, for example, that if you created an instance of type t2.medium and then changed the configuration to specify a new instance type, say m4.xlarge, then on apply Terraform will first destroy the previously created instance and then create a new one. This may seem strange to AWS users (you could stop an instance, change its type, and start it again without losing the data on disk), but it was done to guarantee the same predictable behavior on all platforms. One more thing: Terraform cannot (and by its nature should not) manage resources during their life cycle - it provides no commands like stop or reboot for the instances it creates; you must use other tools to manage the infrastructure once it is up.
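For example (a sketch with an illustrative AMI id), changing a single attribute like the instance type is enough to trigger a full destroy-and-recreate on the next apply:
resource "aws_instance" "example_node" {
  ami           = "ami-xxxxxxxx"
  # was "t2.medium"; after this edit, terraform apply will first destroy
  # the existing instance and then create a new m4.xlarge in its place
  instance_type = "m4.xlarge"
}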
Terraform provides an excellent set of functionality in its DSL: variables (https://www.terraform.io/docs/configuration/variables.html), interpolation functions (needed for iteration and for transforming variables), modules, etc. Here is one example using all of this:
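A sketch of what such a configuration might look like (the variable names, default values and AMI id are illustrative):
variable "cluster_node_count" {
  default = "4"
}
variable "cluster_node_type" {
  default = "t2.medium"
}
variable "cluster_node_ami" {
  default = "ami-xxxxxxxx"
}

resource "aws_instance" "worker_nodes" {
  ami           = "${var.cluster_node_ami}"
  instance_type = "${var.cluster_node_type}"
  # arithmetic in interpolation: one node of the total is reserved for the master
  count         = "${var.cluster_node_count - 1}"
  tags {
    # format + count.index give every node a unique, numbered name
    Name = "${format("presto-worker-%02d", count.index + 1)}"
  }
}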
This example shows the use of variables, arithmetic operations on them, interpolation with format, the index of the current element (when several instances of the same type are created), and resource tagging.
But it is not enough just to create and destroy instances - they must also be initialized somehow (copy files, install and configure specific software, update the system, perform cluster configuration, etc.). For this, Terraform introduces the concept of Provisioners. The main ones are file, remote-exec, chef and null_resource. Typical operations are copying files and running scripts on a remote instance.
Here is the previous example with provisioning operations enabled:
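Something like this (the key-path variable, SSH user name and script name are assumptions for the sketch):
resource "aws_instance" "worker_nodes" {
  ami           = "${var.cluster_node_ami}"
  instance_type = "${var.cluster_node_type}"
  count         = "${var.cluster_node_count - 1}"
  key_name      = "${var.aws_key_name}"

  # how Terraform should connect to the freshly created instance
  connection {
    user        = "ec2-user"
    private_key = "${file("${var.ssh_private_key_path}")}"
  }

  # copy the bootstrap script to the instance...
  provisioner "file" {
    source      = "bootstrap-script.sh"
    destination = "/tmp/bootstrap-script.sh"
  }

  # ...and run it there
  provisioner "remote-exec" {
    inline = [
      "chmod +x /tmp/bootstrap-script.sh",
      "/tmp/bootstrap-script.sh"
    ]
  }
}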
The main note: you must specify connection information for the remote host. For AWS this is most often key-based access, so you have to specify exactly where that key lives (a variable was introduced for convenience). Note that the private_key attribute of the connection section cannot accept a path to a file (only the key text itself) - instead, the ${file()} interpolation function is used to read the file from disk and return its contents.
We have now got to the point of creating a simple cluster of several instances (we will not go into the details of bootstrap-script.sh - let's assume the installation of the necessary software is scripted there). Let's look at how to cook a cluster with a dedicated master in our pressure cooker. In general, the worker nodes of the cluster need to know where the master node is in order to register with it and subsequently receive tasks (let's leave goodies like the Raft and Gossip protocols, used to elect a master and distribute information across a cluster, for other articles). For simplicity, let's just have every worker know the IP address of the master. How do we implement this in Terraform? First, we need to create a separate instance for the master:
resource "aws_instance" "master_node" { ami = "${var.cluster_node_ami}" instance_type = "${var.cluster_node_type}" count = "1" <...skipped...> provisioners { <...skipped...> } }
then add a dependency to the worker nodes:
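A sketch, abbreviated to the relevant attribute:
resource "aws_instance" "worker_nodes" {
  ami           = "${var.cluster_node_ami}"
  instance_type = "${var.cluster_node_type}"
  count         = "${var.cluster_node_count - 1}"
  # workers will not be created until the master node is fully ready
  depends_on    = ["aws_instance.master_node"]
}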
The depends_on resource modifier can be used to specify the order in which infrastructure is created: Terraform will not create a worker node until the master node is fully ready. As you can see from the example, a dependency (or a list of them) is specified as a string built from the resource type and its name, joined with a dot. In AWS you can create not only instances but also VPCs, networks, etc.; these will need to be listed as dependencies of the resources that use the VPC - this guarantees the correct order of creation.
But let's continue with passing the master node's address to all the worker nodes. For this, Terraform provides a mechanism for referencing previously created resources - that is, you can simply extract the IP address of the master node right in the worker's description:
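For instance, the worker's remote-exec provisioner might look like this (assuming bootstrap-script.sh accepts the master's address as its first argument):
provisioner "remote-exec" {
  inline = [
    # pass the master's private IP into the worker's bootstrap script
    "/tmp/bootstrap-script.sh ${aws_instance.master_node.private_ip}"
  ]
}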
That is, using variables of the form ${aws_instance.master_node.private_ip}, you can access almost any information about a resource. In this example we assume that bootstrap-script.sh can take the address of the master node as a parameter and use it later for internal configuration.
Sometimes such references are not enough - for example, you may need to run some scripts on the master side after the worker nodes have connected (accepting keys, running init tasks on the workers, etc.). For this, Terraform has a mechanism called null_resource: a fake resource that, using the dependency mechanism (see above), can be created only after all the master and worker nodes exist. Here is an example of such a resource:
resource "null_resource" "cluster_provision" { depends_on = [ "aws_instance.master_node", "aws_instance.worker_nodes" ]
A small explanation:
1. depends_on - we list the resources that must be ready beforehand.
2. triggers - we form a string (in our case, the ids of all instances separated by commas) whose change will trigger the execution of all the provisioners specified in this resource.
3. In the connection section we indicate which instance the provisioning scripts specified in this resource should be run on.
If you need to perform several steps on different servers, create several null_resources with the necessary dependencies.
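For example, a follow-up step on the master that must run only after the cluster-wide provisioning has finished might be chained like this (a sketch; the script name is illustrative):
resource "null_resource" "master_post_config" {
  # runs only after the previous null_resource has completed
  depends_on = ["null_resource.cluster_provision"]

  connection {
    host        = "${aws_instance.master_node.private_ip}"
    user        = "ec2-user"
    private_key = "${file("${var.ssh_private_key_path}")}"
  }

  provisioner "remote-exec" {
    inline = [
      "/tmp/master-post-config.sh"
    ]
  }
}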
In general, what has been described above is enough to create rather complex infrastructures using Terraform.
Here are a few more important tips for those who prefer to learn from other people's mistakes:
1. Do not forget to carefully store the .tfstate files in which Terraform keeps the latest state of the created infrastructure (besides, .tfstate is a JSON file that can serve as an exhaustive source of information about the created resources).
2. Do not modify resources created with Terraform by hand (via the management consoles of the services themselves or other external tools) - on the next plan & apply run you will get a re-creation of every resource that no longer corresponds to the current description, which can be very unexpected and often deplorable.
3. Try to test your configurations first on small instances / a small number of them - many errors are very hard to catch while writing configurations, and the validator built into Terraform shows only syntax errors (and not all of them).
In the second part we will continue preparing the pressure cooker: we will describe how to deploy a SaltStack master + minions on top of the created infrastructure in order to install PrestoDB.