Author: Oleg Gelbukh
There are several basic requirements for deploying the OpenStack platform in production, whether as a small cluster for a startup's development environments or as a large-scale installation for a cloud service provider. The following requirements are the most common and, as a result, the most important:
- Service continuity (HA) and redundancy
- Cluster scalability
- Automation of operations
Mirantis has developed an approach that satisfies all three of these requirements. This article, the first in a series describing our approach, provides an overview of the methods and tools used.
Service continuity (HA) and redundancy
In general, OpenStack services can be divided into several groups, based in this case on the approach used to ensure continuity for each service.
API services
The first group includes the API servers, namely:
- nova-api
- glance-api
- glance-registry
- keystone
Since these services use HTTP/REST protocols, redundancy is relatively simple to achieve with a load balancer added to the cluster. If the load balancer supports health checks, that is sufficient for basic API high availability. Note that in the 2012.1 (Essex) release of the OpenStack platform, only the Swift API supports a “health check” call; to verify the availability of the other services, API extensions that support such a call are required.
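As an illustration, here is a minimal sketch of what such an extension could look like: a WSGI middleware that answers a dedicated health-check path so the load balancer can probe the API server. The /healthcheck path, the class name, and the paste-deploy-style factory are assumptions for this example, not the actual OpenStack implementation.

```python
# Minimal sketch of a health-check WSGI middleware, similar in spirit to the
# one shipped with Swift. The /healthcheck path and the way it would be wired
# into an API server's pipeline are assumptions for illustration.


class HealthCheckMiddleware(object):
    """Answer GET /healthcheck with 200 OK so a load balancer can probe the API."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ.get('PATH_INFO') == '/healthcheck':
            body = b'OK'
            start_response('200 OK', [('Content-Type', 'text/plain'),
                                      ('Content-Length', str(len(body)))])
            return [body]
        # Anything else is passed through to the wrapped API application.
        return self.app(environ, start_response)


def filter_factory(global_conf, **local_conf):
    """Paste-deploy style factory so the middleware can be added to a pipeline."""
    def health_check_filter(app):
        return HealthCheckMiddleware(app)
    return health_check_filter
```

A balancer such as HAProxy could then be configured to probe this path on every API instance and drop unhealthy ones from rotation.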
Compute services
The second group includes the services that actually manage virtual servers and provide resources for them:
- nova-compute
- nova-network
- nova-volume
These services do not require special redundancy measures in a production environment. The approach to this group of services follows the basic paradigm of cloud computing: there are many interchangeable worker processes, and the loss of one of them causes only a temporary, local loss of control, not a failure of the service provided by the cluster. It is therefore enough to track these services with an external monitoring system and to have the main recovery scenarios implemented as event handlers. The simplest scenario is to send a notification to the administrator and attempt to restart the service that failed.
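The following is a minimal sketch of that "notify and restart" scenario, assuming the handler is a simple script run periodically on each compute node; the service names, the use of pgrep and the init system's service command, and the administrator address are assumptions for illustration, not part of any OpenStack component.

```python
# A minimal sketch of the "notify and restart" recovery scenario for compute
# services. Service names, the pgrep/service commands, and the admin address
# are assumptions; a real deployment would hook this into its monitoring system.

import smtplib
import subprocess
from email.mime.text import MIMEText

WATCHED_SERVICES = ['nova-compute', 'nova-network', 'nova-volume']
ADMIN_EMAIL = 'admin@example.com'   # hypothetical address


def is_running(name):
    """Return True if a process matching the given name is found."""
    return subprocess.call(['pgrep', '-f', name]) == 0


def notify_admin(name):
    """Send a plain-text alert about the failed service via the local MTA."""
    msg = MIMEText('Service %s is down, attempting restart.' % name)
    msg['Subject'] = 'OpenStack service failure: %s' % name
    msg['From'] = msg['To'] = ADMIN_EMAIL
    server = smtplib.SMTP('localhost')
    server.sendmail(ADMIN_EMAIL, [ADMIN_EMAIL], msg.as_string())
    server.quit()


def restart(name):
    """Try to restart the service through the init system."""
    subprocess.call(['service', name, 'restart'])


if __name__ == '__main__':
    for svc in WATCHED_SERVICES:
        if not is_running(svc):
            notify_admin(svc)
            restart(svc)
```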
High availability of the network connectivity provided by the multi-host feature of the nova-network service is described in the official OpenStack documentation. In real-world environments, however, this scheme is frequently modified so that routing is offloaded to an external hardware router. The nova-network service then performs only the functions of a DHCP server, and multi-host support ensures that the DHCP server is not a single point of failure.
Scheduler
Redundancy is an integral part of the nova-scheduler service. When the first instance of nova-scheduler starts, it begins receiving messages from the scheduler queue on the RabbitMQ server. It also creates an additional scheduler_fanout_ queue, which nova-compute services use to publish status updates; the suffix in the queue name is replaced with the identifier of the new scheduler instance. Every subsequently started nova-scheduler behaves the same way, which allows them all to work in parallel without additional effort.
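This parallel-consumption pattern can be illustrated with a short kombu sketch that declares a fanout exchange and a per-instance queue bound to it. The exchange and queue names follow the convention described above, but the code itself is an illustration rather than Nova's own implementation.

```python
# A sketch of how each scheduler instance gets its own queue on the
# "scheduler_fanout" fanout exchange, so several nova-scheduler processes can
# consume status updates in parallel. Illustrative only, not Nova's code.

import uuid
from kombu import Connection, Exchange, Queue

# Fanout exchange that compute services publish status updates to.
scheduler_fanout = Exchange('scheduler_fanout', type='fanout', durable=False)

# Each scheduler instance declares a queue named after its own identifier,
# so every running scheduler receives a copy of each fanout message.
instance_id = uuid.uuid4().hex
my_queue = Queue('scheduler_fanout_%s' % instance_id,
                 exchange=scheduler_fanout,
                 durable=False)

with Connection('amqp://guest:guest@localhost//') as conn:
    bound = my_queue(conn.channel())
    bound.declare()   # creates the queue and binds it to the fanout exchange
    print('Listening on %s' % bound.name)
```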
Queue server
The RabbitMQ queue server is the primary communication channel for all nova services, and it must be reliable in any production configuration. Clustering and queue mirroring are supported natively by RabbitMQ, and a load balancer can be used to distribute connections between RabbitMQ servers running in cluster mode. Mirantis has also developed a patch for the Nova RPC library that allows it to fail over to a backup RabbitMQ server if the primary server fails and stops accepting connections.
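As a rough illustration of client-side failover, the kombu library (which Nova's RPC layer is built on) can be given several broker URLs and will retry against the next one when the current broker refuses connections. The broker host names below are placeholders, and this sketch is not the Mirantis patch itself.

```python
# A sketch of client-side failover between two RabbitMQ brokers, in the spirit
# of the RPC library change described above. Broker addresses are placeholders;
# real Nova services read them from their configuration.

from kombu import Connection

# Both brokers are listed; if the first one is unreachable, kombu falls back
# to the next one according to the failover strategy.
conn = Connection(
    'amqp://guest:guest@rabbit-primary:5672//;'
    'amqp://guest:guest@rabbit-backup:5672//',
    failover_strategy='round-robin',
)

# ensure_connection() retries and switches brokers until one of them accepts
# the connection (or the retry limit is exceeded).
conn.ensure_connection(max_retries=3)
print('Connected to %s' % conn.as_uri())
```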
Database
MySQL is the database most often used in OpenStack deployments, and it is the one Mirantis most often uses in its installations as well. Several solutions exist today for making a MySQL database highly available and scalable. The most commonly used is the multi-master replication manager MySQL-MMM. It is used in several installations built by Mirantis and works quite reliably, apart from its known limitations.
Although we have had no serious problems with MMM, we are considering more modern open-source solutions for database high availability, in particular Galera, a WSREP-based clustering engine for MySQL. A Galera cluster provides a simple and transparent scalability mechanism and supports fault tolerance through synchronous multi-master replication implemented at the WSREP level.
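A simple way to confirm that a Galera node has joined a healthy cluster is to query its wsrep status variables. The sketch below uses pymysql purely as a convenient client library; the credentials and host are placeholders, and the wsrep variables exist only on a Galera-enabled server.

```python
# Check whether a Galera node is part of a healthy cluster by reading the
# wsrep status variables. Credentials are placeholders for illustration.

import pymysql

conn = pymysql.connect(host='127.0.0.1', user='root', password='secret')
try:
    with conn.cursor() as cur:
        cur.execute("SHOW STATUS LIKE 'wsrep_cluster_size'")
        _, cluster_size = cur.fetchone()
        cur.execute("SHOW STATUS LIKE 'wsrep_ready'")
        _, ready = cur.fetchone()
    # A node is usable when it reports wsrep_ready = ON and sees all its peers.
    print('cluster size: %s, ready: %s' % (cluster_size, ready))
finally:
    conn.close()
```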
Scalability
Now that we know how to balance the load or process it in parallel, we need a mechanism for adding service processes to the cluster so that it can handle more load, that is, for “horizontal” scaling. For most OpenStack components it is enough to add a server instance and include it in the load balancer configuration to scale out the cluster. However, this raises two problems in production installations:
Most clusters are scaled by node, not by service instance. This makes it necessary to define node roles that allow “smart” cluster scaling. A role essentially corresponds to the set of services running on a node, and the cluster is scaled by adding nodes with a given role.
Scaling the cluster horizontally by adding a controller node requires configuration changes in several places in a specific order: the node must be deployed, its services started, and only then the load balancer configuration updated to include the new node. For compute nodes the procedure is simpler, but it still requires a high degree of automation at every level, from hardware to service configuration.
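As a rough sketch of the load balancer update step, the backend section of an HAProxy-style configuration can be regenerated from the current list of controller nodes whenever a node is added. The addresses, port, and template below are assumptions for illustration, not a recommended production configuration.

```python
# Minimal sketch of the "update the balancer when a node is added" step:
# regenerate the backend section of an HAProxy-style configuration from the
# current list of controller nodes. Addresses, port, and template are illustrative.

CONTROLLERS = ['192.168.0.11', '192.168.0.12', '192.168.0.13']  # new node appended here

BACKEND_TEMPLATE = """backend nova-api
    balance roundrobin
    option httpchk GET /healthcheck
%s
"""


def render_backend(nodes, port=8774):
    """Render one 'server' line per controller node running nova-api."""
    servers = '\n'.join(
        '    server nova-api-%d %s:%d check' % (i, addr, port)
        for i, addr in enumerate(nodes, start=1))
    return BACKEND_TEMPLATE % servers


if __name__ == '__main__':
    # In a real cluster this output would be written to the balancer's
    # configuration file and the balancer reloaded by the orchestration engine.
    print(render_backend(CONTROLLERS))
```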
Nodes and roles
While OpenStack services can be distributed across servers with a high degree of flexibility, the most common deployment option is to have two types of nodes: a controller node and compute nodes. A typical development installation of OpenStack includes one controller node that runs all services except the compute group, and several compute nodes that run the compute services and host virtual servers.
It is clear that this architecture is not suitable for production installations. For small clusters, we recommend making cluster nodes as self-sufficient as possible by installing the API servers on the compute nodes, leaving only the database, the queue server, and the dashboard on the controller node. The controller node configuration must provide redundancy. The following node roles are defined for this architecture:
- Endpoint node. This node runs the load balancing and high-availability services, which may include load balancing and clustering software. A dedicated hardware load-balancing appliance on the network can also act as an endpoint node. A cluster should have at least two endpoint nodes for redundancy.
- Controller node. This node hosts the communication services that support the entire cloud, including the queue server, the database, the Horizon dashboard, and possibly a monitoring system. It can optionally run the nova-scheduler service and the API servers, with the endpoint node balancing the load across them. A cluster must have at least two controller nodes for redundancy. The controller node and the endpoint node can be combined on the same physical server, but the configuration of the nova services must then be changed to move them off the ports used by the load balancer.
- Compute node. This node hosts the hypervisor and the virtual instances that use its computing power. The compute node can also act as the network controller for the virtual instances it hosts, if the multi-host scheme is used.
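These role definitions can be captured in a simple data structure that an automation tool could consume. The service lists below are a simplified illustration of the architecture described above, not a complete manifest.

```python
# A sketch of the node roles described above, expressed as a mapping from role
# name to the services it runs. Simplified for illustration.

NODE_ROLES = {
    'endpoint': [
        'load-balancer',          # e.g. balancing plus clustering software
    ],
    'controller': [
        'rabbitmq-server',
        'mysql',
        'horizon',
        'nova-scheduler',         # optional, may also run elsewhere
        'nova-api', 'glance-api', 'glance-registry', 'keystone',
    ],
    'compute': [
        'nova-compute',
        'nova-network',           # network controller in the multi-host scheme
        'nova-volume',
    ],
}


def services_for(role):
    """Return the list of services a node with the given role should run."""
    return NODE_ROLES[role]
```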
Configuration management
Implementing the architecture described above requires a certain sequence of steps on each physical server. Some of the steps are quite complex, and some involve several nodes at once; for example, configuring the load balancer or setting up multi-master replication. Because of the complexity of the current OpenStack deployment process, scripting these operations is essential to a successful implementation. This has led to the emergence of several projects, including the well-known Devstack and Crowbar.
Simply scripting the installation process is not enough either to deploy OpenStack successfully in production or to ensure cluster scalability. Moreover, if you need to change the architecture or upgrade component versions, you will have to develop new scripts. These tasks call for tools designed specifically for them: configuration management systems. The best known among them are Puppet and Chef, and there are products built on top of them (for example, the above-mentioned Crowbar uses Chef as its engine).
We have used both Puppet and Chef to deploy OpenStack in various projects, and naturally each of them has its limitations. Our experience shows that the best results are achieved when the configuration management system is backed by a centralized orchestration engine, ensuring a smooth and successful deployment. Combined with a tool for configuring physical servers at the hardware level and a set of tests to verify the quality of the installation, this gives an integrated approach that allows the OpenStack platform to be installed quickly across a wide range of hardware configurations and logical architectures.
Automation of operations
Using an orchestration engine together with a configuration system that is aware of node roles allows us to automate the deployment process to a fairly high degree, and to automate scaling as well. All of this reduces the cost of operating and maintaining OpenStack. Most modern orchestration engines expose an API, which makes it possible to build command-line or web interfaces for operators who manage the entire cluster or its individual parts.
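As a purely hypothetical sketch, a thin operator CLI could sit on top of such an orchestration API. The endpoint URL, resource path, and parameters below are invented for illustration and do not correspond to any particular engine.

```python
# Hypothetical sketch of a thin operator CLI on top of an orchestration
# engine's HTTP API. The endpoint, resource path, and payload fields are
# made up for illustration; a real engine defines its own API.

import argparse
import json
import urllib.request

ORCHESTRATOR_URL = 'http://orchestrator.example.com:8000'  # hypothetical


def add_node(role, address):
    """Ask the orchestration engine to deploy a new node with the given role."""
    payload = json.dumps({'role': role, 'address': address}).encode('utf-8')
    req = urllib.request.Request(
        ORCHESTRATOR_URL + '/v1/nodes',
        data=payload,
        headers={'Content-Type': 'application/json'},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode('utf-8'))


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Cluster operator commands')
    parser.add_argument('role', choices=['endpoint', 'controller', 'compute'])
    parser.add_argument('address', help='IP address of the new node')
    args = parser.parse_args()
    print(add_node(args.role, args.address))
```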
We will cover this in more detail in the following articles.