
Following shell-operator, we introduce its older brother, addon-operator. It is an open source project for installing system components into a Kubernetes cluster; collectively, such components can be called add-ons.
Why add-ons at all?
It is no secret that Kubernetes is not an all-in-one finished product, and building a “grown-up” cluster requires various additions. Addon-operator helps to install and configure these add-ons and keep them up to date.
The need for additional components in a cluster is covered in the talk by our colleague driusha. In short, the current situation with Kubernetes is this: for a simple “play around” installation the out-of-the-box components are enough; for development and testing you can add Ingress; but for a full installation that you can call “production-ready” you need a dozen different add-ons: something for monitoring, something for logs, not forgetting ingress and cert-manager, dedicated node groups, network policies, all seasoned with sysctl and pod autoscaler settings...

What is specific about working with them?
As practice shows, installation is only the beginning. To work comfortably with a cluster, you will need to update add-ons and disable them (remove them from the cluster), and you will want to test changes before installing them in a production cluster.
So maybe Ansible is enough? Maybe. But full-fledged add-ons generally cannot live without settings. These settings may differ depending on the cluster flavor (aws, gce, azure, bare-metal, do, ...). Some settings cannot be set in advance and have to be obtained from the cluster. And the cluster is not static: some settings require you to track changes. This is where Ansible falls short: we need a program that lives in the cluster, i.e. a Kubernetes operator.
Those who have tried shell-operator will say that installing and updating add-ons and tracking settings can be handled with shell-operator hooks. You can write a script that runs something like kubectl apply and watches, for example, a ConfigMap where the settings are stored. This is roughly how addon-operator works.
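As an illustration, such a hook might look like the sketch below. It assumes the shell-operator convention that a hook prints its binding configuration when invoked with --config and is run again whenever a matching event fires. The binding fields follow the v1 hook configuration format as we understand it and may differ between versions; the manifest path and helper names are hypothetical.

```python
"""Sketch of a shell-operator style hook written in Python (illustrative only)."""
import json
import subprocess

# Hypothetical path to manifests baked into the operator image.
MANIFEST_PATH = "/addons/my-addon.yaml"


def hook_config() -> dict:
    """Binding configuration: run on startup and when a ConfigMap changes.

    Field names are an assumption based on the v1 hook configuration format;
    check the documentation of the shell-operator version you run.
    """
    return {
        "configVersion": "v1",
        "onStartup": 10,
        "kubernetes": [
            {
                "apiVersion": "v1",
                "kind": "ConfigMap",
                "executeHookOnEvent": ["Added", "Modified"],
            }
        ],
    }


def apply_command(manifest_path: str) -> list:
    """Build the kubectl command that re-applies the add-on manifests."""
    return ["kubectl", "apply", "-f", manifest_path]


def main(argv: list) -> int:
    if "--config" in argv:
        # The operator is asking which events this hook wants to receive.
        print(json.dumps(hook_config()))
        return 0
    # Any other invocation means an event fired: re-apply the manifests.
    return subprocess.call(apply_command(MANIFEST_PATH))

# In a real hook file, finish with: sys.exit(main(sys.argv[1:]))
```

The hook stays declarative about what it watches and imperative only in the final kubectl call, which is the part addon-operator replaces with Helm.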
How is this organized in addon-operator?
When designing the new solution, we started from the following principles:
- The add-on installer must support templating and declarative configuration. We do not write magic scripts that install add-ons; addon-operator uses Helm for that. To install an add-on, you create a chart and identify the values that will be used for configuration.
- Settings can be generated during installation, obtained from the cluster, or updated by watching cluster resources. These operations are implemented with hooks.
- Settings can be stored in the cluster. For this purpose, the ConfigMap/addon-operator resource is created, and addon-operator watches it for changes. Hooks get access to the settings through simple conventions.
- An add-on depends on its settings. If the settings change, addon-operator rolls out the Helm chart with the new values. We call the combination of a Helm chart, its values, and hooks a module (more on this below).
- Staging. No magic release scripts. Updates work as for an ordinary application: build the add-ons and addon-operator into an image and roll it out.
- Control of the result. Addon-operator can export metrics to Prometheus.
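The principle “an add-on depends on its settings” can be sketched as follows. This is a simplified model, not addon-operator's actual logic: the text does not say how changes are detected, so the checksum comparison, the Release class, and the values file name are assumptions made for illustration. The helm upgrade --install command itself is a standard Helm invocation.

```python
import hashlib
import json
from typing import List, Optional


def values_checksum(values: dict) -> str:
    """Stable digest of a values dictionary (key order does not matter)."""
    payload = json.dumps(values, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class Release:
    """Remembers the values a chart was last rolled out with (hypothetical)."""

    def __init__(self, name: str, chart_dir: str):
        self.name = name
        self.chart_dir = chart_dir
        self.last_checksum: Optional[str] = None

    def plan(self, values: dict, values_file: str = "values.yaml") -> Optional[List[str]]:
        """Return the helm command to run, or None if the values are unchanged."""
        checksum = values_checksum(values)
        if checksum == self.last_checksum:
            return None
        self.last_checksum = checksum
        return ["helm", "upgrade", "--install", self.name, self.chart_dir,
                "--values", values_file]
```

Calling plan() twice with the same values yields a command the first time and None the second, so the chart is rolled out only when the settings actually change.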
What counts as an add-on for addon-operator?
An add-on is anything that adds new functionality to the cluster. Installing Ingress is a great example. It can be any operator or controller with its own CRDs: prometheus-operator, cert-manager, kube-controller-manager, etc. Or it can be something small that simplifies operations: for example, secret copier, which copies registry secrets to new namespaces, or sysctl tuner, which configures sysctl parameters on new nodes.
To implement add-ons, addon-operator provides several concepts:
- A Helm chart is used to install various software in a cluster, for example Prometheus, Grafana, or nginx-ingress. If the component you need has a Helm chart, installing it with addon-operator is very easy.
- Values storage. Helm charts usually have many settings that can change over time. Addon-operator stores these settings and can watch for changes in order to re-install the Helm chart with new values.
- Hooks are executable files that addon-operator runs on events and that get access to the values storage. A hook can watch for changes in the cluster and update values in the storage. That is, hooks let you do discovery, collecting values from the cluster at startup or on a schedule, or continuous discovery, collecting values from the cluster as it changes.
- A module is the combination of a Helm chart, values storage, and hooks. Modules can be enabled and disabled. Disabling a module removes all of its Helm chart releases. Modules can also enable themselves dynamically, for example when all the modules they depend on are enabled or when discovery in hooks has found the necessary parameters; this is done with an auxiliary enabled script.
- Global hooks. These are standalone hooks: they do not belong to any module and have access to the global values storage, whose values are available to all hooks in modules.
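To make the notion of a module more concrete, here is a small data model with dependency-based enabling. It is only a sketch: the Module fields and the resolve_enabled function are hypothetical stand-ins for the enabled script, and addon-operator's real ordering and enabling rules may differ.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Module:
    """A module: a Helm chart plus its values and hooks (illustrative model)."""
    name: str
    chart_dir: str
    hooks: List[str] = field(default_factory=list)     # paths to executables
    values: dict = field(default_factory=dict)         # module values storage
    requires: List[str] = field(default_factory=list)  # modules it depends on

    def is_enabled(self, enabled: Set[str]) -> bool:
        """Stand-in for the enabled script: on when all dependencies are on."""
        return all(dep in enabled for dep in self.requires)


def resolve_enabled(modules: List[Module]) -> List[str]:
    """Repeat the check until no further module can be switched on."""
    enabled: Set[str] = set()
    order: List[str] = []
    changed = True
    while changed:
        changed = False
        for module in modules:
            if module.name not in enabled and module.is_enabled(enabled):
                enabled.add(module.name)
                order.append(module.name)
                changed = True
    return order
```

A module whose dependency never becomes available simply stays disabled, which in the terms above means none of its Helm releases are installed.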
How do these parts work together? Consider the diagram from the documentation:

There are two scenarios:
- A global hook is triggered by an event, for example when a resource changes in the cluster. The hook processes the change and writes new values to the global values storage. Addon-operator notices that the global storage has changed and runs all modules. Each module uses its own hooks to determine whether it should be enabled and updates its values storage. If the module is enabled, addon-operator starts installing its Helm chart. The chart has access to values from both the module storage and the global storage.
- The second scenario is simpler: a module hook is triggered by an event and changes values in the module's values storage. Addon-operator notices this and runs the Helm chart with the updated values.
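The two scenarios can be modelled with a toy class like the one below. It is not addon-operator's code: ToyOperator, its method names, and the layout of the values handed to the chart (a "global" key next to a key named after the module) are assumptions chosen to show the flow. Chart installation is only recorded, not executed.

```python
from typing import Callable, Dict, List, Tuple


class ToyOperator:
    """Toy model of the global-hook and module-hook scenarios."""

    def __init__(self, modules: Dict[str, Callable[[dict], bool]]):
        # Maps a module name to its "enabled" check over the global values.
        self.enabled_checks = modules
        self.global_values: dict = {}
        self.module_values: Dict[str, dict] = {name: {} for name in modules}
        self.installs: List[Tuple[str, dict]] = []  # recorded chart rollouts

    def _is_enabled(self, name: str) -> bool:
        return self.enabled_checks[name](self.global_values)

    def _install_chart(self, name: str) -> None:
        # The chart sees both the global values and the module's own values.
        values = {"global": dict(self.global_values),
                  name: dict(self.module_values[name])}
        self.installs.append((name, values))

    def on_global_hook(self, patch: dict) -> None:
        """Scenario 1: global values change, so every module is re-run."""
        self.global_values.update(patch)
        for name in self.module_values:
            if self._is_enabled(name):
                self._install_chart(name)

    def on_module_hook(self, name: str, patch: dict) -> None:
        """Scenario 2: only the affected module's chart is re-run."""
        self.module_values[name].update(patch)
        if self._is_enabled(name):
            self._install_chart(name)
```

The asymmetry is the point: a change in the global storage fans out to every enabled module, while a change in a module's storage touches only that module's chart.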
An add-on can be implemented as a single hook, as a single Helm chart, or even as several interdependent modules; it depends on the complexity of the component being installed and on the desired flexibility of configuration. For example, the repository (/examples) contains a sysctl-tuner add-on implemented in two ways: as a simple module with a hook and a Helm chart, and with values storage, which makes it possible to change settings by editing the ConfigMap.
Delivery of updates
A few words about how updates to the components installed by addon-operator are organized.
To run addon-operator in a cluster, you need to build an image containing the add-ons (hooks and Helm charts), the addon-operator binary, and everything the hooks need: bash, kubectl, jq, python, etc. This image can then be rolled out to the cluster like an ordinary application, and you will most likely want to set up some tagging scheme. If there are only a few clusters, the same approach as for applications may work: new release, new version, go through all the clusters and update the image in the Pods. When rolling out to a significant number of clusters, however, the concept of self-updating from a channel suited us better.
Here is how it works for us:
- A channel is essentially an identifier that can be anything you like (for example, dev / stage / ea / stable).
- The channel name is the image tag. When updates need to be rolled out to a channel, a new image is built and tagged with the channel name.
- When a new image appears in the registry, addon-operator restarts and runs with the new image.
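The channel scheme can be sketched like this. The text does not describe how addon-operator detects a new image, so comparing image digests is an assumption, and the repository name, SelfUpdater class, and its methods are hypothetical.

```python
from typing import Optional

# Hypothetical image repository, used only for illustration.
IMAGE_REPOSITORY = "registry.example.com/addon-operator"


def image_for_channel(channel: str) -> str:
    """The channel name doubles as the image tag."""
    return f"{IMAGE_REPOSITORY}:{channel}"


class SelfUpdater:
    """Decides when the operator should restart to pick up a new image."""

    def __init__(self, channel: str, running_digest: str):
        self.channel = channel
        self.running_digest = running_digest

    def needs_restart(self, registry_digest: Optional[str]) -> bool:
        """Compare the digest behind the channel tag with the running one.

        registry_digest is whatever the registry currently reports for the
        channel tag; None means the registry could not be queried.
        """
        if registry_digest is None:
            return False  # keep running rather than restart on a failed check
        return registry_digest != self.running_digest

    def switch_channel(self, channel: str) -> str:
        """Point the cluster at another channel and return its image."""
        self.channel = channel
        return image_for_channel(channel)
```

Because the tag is mutable, publishing to a channel is just a push of a new image under the same name; every cluster subscribed to that channel picks it up on its next check.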
This is not a best practice, as the Kubernetes documentation points out. The documentation advises against it, but that advice concerns an ordinary application living in a single cluster. In the case of addon-operator, the application is a multitude of Deployments scattered across clusters, and self-updating helps a lot and makes life easier.
Channels also help with testing: if you have an auxiliary cluster, you can set it to the stage channel and roll updates out to it before rolling them out to the ea and stable channels. If an error occurs in a cluster on the ea channel, you can switch it to stable while the problem with that cluster is investigated. If a cluster is taken out of active support, it is switched to its own “frozen” channel, for example freeze-2019-03-20.
Besides updating hooks and Helm charts, you may need to update a third-party component. Say you noticed a bug in node-exporter and even figured out how to patch it. You open a PR and wait for a new release so that you can go through all the clusters and bump the image version. To avoid waiting indefinitely, you can build your own node-exporter and switch to it until the PR is accepted.
In general, this can be done without addon-operator as well, but with addon-operator the module that installs node-exporter is visible in a single repository, the Dockerfile can be kept right there, and it is easier for everyone involved to understand what is going on... And if there are several clusters, it becomes easier both to test your PR and to roll out the new version!
This way of organizing component updates works well for us, but you can implement any other suitable scheme: after all, addon-operator is just a binary file.
Conclusion
The principles implemented in addon-operator let you build a transparent process for creating, testing, installing, and updating add-ons in a cluster, similar to the development process for ordinary applications.
Add-ons for addon-operator in the module format (Helm chart + hooks) can be made publicly available. We at Flant plan to publish our own work as such add-ons over the summer. Join the development on GitHub (shell-operator, addon-operator), try building your own add-on based on the examples and documentation, and watch for news on Habr and on our YouTube channel!
UPDATED (June 14): If you have English-speaking colleagues who may be interested in addon-operator, the corresponding announcement is available in our blog on Medium.