
Kubernetes 1.8: a review of major innovations



The large and well-organized open source community behind Kubernetes has taught us to expect numerous significant changes from each release. Kubernetes 1.8 is no exception: it brings DevOps engineers and everyone else involved improvements and new features in almost all of its components.

The official release of Kubernetes 1.8 was scheduled for last Wednesday, but the official announcements (on the project blog and from CNCF) have not yet been made. However, today at 3:35 am MSK a change to the CHANGELOG appeared in the project's Git repository, indicating that Kubernetes 1.8 is ready for download and use.


So, what did the new release of Kubernetes 1.8 bring?

Network


An alpha version of IPVS mode support has been added to kube-proxy for load balancing (instead of iptables). In this mode, kube-proxy watches the services and endpoints in Kubernetes and uses the netlink interface to create the corresponding IPVS objects (a virtual server and real servers, respectively). It also synchronizes them periodically to keep the IPVS state consistent. When a service is accessed, the traffic is redirected to one of the backend pods. IPVS also offers a choice of load-balancing algorithms (round-robin, least connection, destination hashing, source hashing, shortest expected delay, never queue). This feature was requested often in Kubernetes tickets, and we were very much looking forward to it too.
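To make the difference between two of the listed algorithms concrete, here is a minimal sketch of round-robin and least-connection selection. It is an illustration only, not the IPVS kernel implementation; the backend addresses and connection counts are made up.

```python
from itertools import cycle


def round_robin(backends):
    """Return an endless iterator that walks the backends in a fixed order."""
    return cycle(backends)


def least_connection(active):
    """Pick the backend with the fewest active connections.

    `active` maps a backend address to its current connection count.
    Ties are broken by address so the choice is deterministic.
    """
    return min(active, key=lambda addr: (active[addr], addr))


# Hypothetical pod endpoints behind one service.
backends = ["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"]
rr = round_robin(backends)
first_four = [next(rr) for _ in range(4)]

# Hypothetical connection counts per backend.
connections = {"10.0.0.1:80": 5, "10.0.0.2:80": 2, "10.0.0.3:80": 7}
chosen = least_connection(connections)
```

Round-robin ignores the current load and simply rotates, while least connection prefers the least busy backend, which tends to matter when connections are long-lived.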

Other network innovations include beta support for EgressRules for outgoing traffic in the NetworkPolicy API, as well as the ability (in the same NetworkPolicy) to apply rules by source / destination CIDR (via ipBlockRule).
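As a rough illustration of what such a policy might look like, the sketch below builds a NetworkPolicy manifest as a Python dictionary and prints it as JSON. The policy name, labels, and CIDR ranges are hypothetical; the field names follow the networking.k8s.io/v1 NetworkPolicy schema as I understand it, so check them against the API reference for your cluster version.

```python
import json


def egress_policy(name, pod_labels, cidr, except_cidrs=()):
    """Build a NetworkPolicy that only allows egress to one CIDR block."""
    ip_block = {"cidr": cidr}
    if except_cidrs:
        # Sub-ranges of `cidr` that stay blocked.
        ip_block["except"] = list(except_cidrs)
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            "podSelector": {"matchLabels": dict(pod_labels)},
            "policyTypes": ["Egress"],
            "egress": [{"to": [{"ipBlock": ip_block}]}],
        },
    }


# Hypothetical values, for illustration only.
policy = egress_policy(
    "allow-egress-to-db", {"app": "web"}, "10.0.0.0/24", ["10.0.0.5/32"]
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be saved to a file and passed to kubectl apply -f, since kubectl accepts JSON manifests as well as YAML.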

Scheduler


The main innovation in the scheduler is the ability to assign priorities to pods (in the pod specification, PodSpec, users set the PriorityClassName field, and Kubernetes sets Priority based on it). The goal is simple: to improve resource allocation when resources are scarce and both truly critical and less urgent or important tasks need to run. High-priority pods now have a better chance of being scheduled. In addition, when resources in the cluster have to be freed (preemption), lower-priority pods are affected before higher-priority ones. To support this, kubelet's strategy for choosing which pods to evict (eviction strategy) has changed: it now takes into account both the priority of pods and their resource consumption. All of these features are in alpha status. Priorities in Kubernetes and how to work with them are described in detail in the architecture documentation.
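The idea behind preemption can be sketched in a few lines. This is a deliberately simplified model for a single node with a single resource (CPU), not the real scheduler logic; the Pod class, the pick_victims helper, and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    priority: int
    cpu: int  # requested CPU in millicores


def pick_victims(running, incoming, free_cpu):
    """Return the pods to evict so that `incoming` fits.

    Only pods with a strictly lower priority are candidates, and the
    lowest priorities go first. Returns [] if nothing has to be evicted
    and None if the pod cannot fit even after preemption.
    """
    if incoming.cpu <= free_cpu:
        return []
    candidates = sorted(
        (p for p in running if p.priority < incoming.priority),
        key=lambda p: p.priority,
    )
    victims = []
    for pod in candidates:
        victims.append(pod)
        free_cpu += pod.cpu
        if incoming.cpu <= free_cpu:
            return victims
    return None


running = [Pod("batch", 0, 500), Pod("web", 100, 500), Pod("db", 1000, 500)]
victims = pick_victims(running, Pod("critical", 500, 800), free_cpu=100)
victim_names = [p.name for p in victims]
```

In this toy run the high-priority "db" pod is never considered, because its priority is above that of the incoming pod.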

Another interesting alpha innovation is a more sophisticated mechanism for handling the conditions field (Condition, see the documentation) on nodes. Traditionally, this field records problematic node states: for example, when the network is unavailable, the NetworkUnavailable condition is set to True, and as a result pods are no longer scheduled onto that node. With the new Taints Node by Condition approach, the same situation causes the node to be marked with a taint (for example, node.kubernetes.io/networkUnavailable=:NoSchedule), and the pod specification can then determine what happens next (for instance, whether to schedule the pod onto the problematic node anyway).
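The decision "can this pod still land on a tainted node?" comes down to matching the pod's tolerations against the node's taints. Below is a minimal sketch of that matching, based on my reading of the documented toleration semantics (the Exists and Equal operators); it is not the scheduler's actual code, and the taint key is the one quoted above.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    # An empty or missing effect in the toleration matches any effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        # An empty key with Exists matches every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (
        toleration.get("key") == taint["key"]
        and toleration.get("value", "") == taint.get("value", "")
    )


def can_schedule(pod_tolerations, node_taints):
    """A pod fits only if every NoSchedule taint on the node is tolerated."""
    return all(
        any(tolerates(t, taint) for t in pod_tolerations)
        for taint in node_taints
        if taint["effect"] == "NoSchedule"
    )


network_taint = {
    "key": "node.kubernetes.io/networkUnavailable",
    "value": "",
    "effect": "NoSchedule",
}
toleration = {
    "key": "node.kubernetes.io/networkUnavailable",
    "operator": "Exists",
    "effect": "NoSchedule",
}

plain_pod_fits = can_schedule([], [network_taint])
tolerant_pod_fits = can_schedule([toleration], [network_taint])
```

A pod without tolerations is kept off the node, exactly as with the old condition-based behavior, while a pod that declares the toleration opts in explicitly.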

Storage


Specifying mount options for volumes has become stable, and at the same time:


Kubernetes API metrics now include information about the available space in persistent volumes (PV), as well as success and latency metrics for all mount / unmount / attach / detach / provision / delete calls.

In the PersistentVolume specification for Azure File, CephFS, iSCSI, and GlusterFS, you can now reference namespaced resources.

Among the innovations that are not yet stable (in alpha and beta status):


kubelet


The kubelet has gained an alpha version of a new component, CPU Manager, which interacts directly with kuberuntime and lets you assign dedicated processor cores to pod containers (that is, CPU affinity policies at the container level). According to the documentation, it was introduced in response to two problems:

  1. poor or unpredictable performance compared to virtual machines (due to the large number of context switches and insufficiently efficient use of the cache),
  2. unacceptable delays related to the OS process scheduler, which is especially noticeable in the functions of virtual network interfaces.
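As a rough sketch of which containers would be candidates for dedicated cores: to my understanding, the CPU Manager's static policy pins cores only for containers whose CPU request equals the limit and is a whole number of cores (within a pod of the Guaranteed QoS class). The helper below models just that container-level check; the function names are hypothetical, and the pod-level QoS rules are not modeled.

```python
def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity ("2", "0.5", "500m") to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def exclusive_cores(request, limit):
    """Number of dedicated cores a container would get, or 0 for the shared pool.

    Assumption: dedicated cores need request == limit and a whole
    number of cores; everything else keeps running on shared CPUs.
    """
    req, lim = parse_cpu(request), parse_cpu(limit)
    if req == lim and req >= 1000 and req % 1000 == 0:
        return req // 1000
    return 0


pinned = exclusive_cores("2", "2")
fractional = exclusive_cores("1500m", "1500m")
burstable = exclusive_cores("1", "2")
```

Containers that fall into the shared pool keep the usual CFS-based sharing, so only latency-sensitive workloads need to be sized in whole cores.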

Dynamic kubelet configuration is another alpha feature; it lets you update the configuration of this agent on all nodes of a live cluster. It is expected to reach stable status (GA) only in release 1.10.

Metrics


Support for custom metrics in Horizontal Pod Autoscaler (HPA) has reached beta status, and the associated API has moved to v1beta1.
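For context, the core scaling rule documented for HPA is a simple proportion: the desired replica count is the current count scaled by the ratio of the observed metric to its target, rounded up. The sketch below shows only that proportion; it leaves out the tolerance band and the min/max replica bounds that the real controller applies, and the numbers are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_value, target_value):
    """Scale the replica count in proportion to metric / target, rounding up."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    return max(1, math.ceil(current_replicas * current_value / target_value))


# Hypothetical custom metric: requests per second per pod, target 100.
scale_up = desired_replicas(4, 150, 100)
scale_down = desired_replicas(3, 50, 100)
```

With a custom metric the same rule applies; only the source of current_value changes from CPU utilization to whatever the custom metrics API reports.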

metrics-server has become the recommended way to provide the resource metrics API. It is deployed as an add-on, similarly to Heapster. Fetching metrics directly from Heapster is deprecated.

Cluster Autoscaler


The Cluster Autoscaler utility, designed to automatically resize a Kubernetes cluster (when there are pods that cannot start due to lack of resources, or when some nodes have been unused for a long time), has reached stable status (GA) and supports up to 1000 nodes.

In addition, when deleting nodes, Cluster Autoscaler now gives pods 10 minutes to shut down gracefully (graceful termination). If a pod has not stopped within this time, the node is deleted anyway. Previously, this limit was 1 minute, or the autoscaler did not wait for graceful termination at all.

kubeadm and kops


An alpha implementation of a self-hosted control plane has appeared (kubeadm init with the flag --feature-gates=SelfHosting=true). Certificates can be stored on disk (hostPath) or in secrets. The new kubeadm upgrade subcommand (in beta status) lets you automatically upgrade a self-hosted cluster created with kubeadm.

Another new alpha feature of kubeadm is the ability to run individual subtasks instead of the whole kubeadm init cycle using the phase subcommand (currently available as kubeadm alpha phase; it will be brought to its official form in the next Kubernetes release). Its main purpose is better integration of kubeadm with provisioning tools such as kops and GKE.

kops, meanwhile, has gained two new alpha features: support for bare-metal machines as targets and the ability to run as a server (see Kops HTTP API Server). Finally, GCE support in kops has been promoted to beta status.

CLI


The kubectl command-line utility has received experimental (alpha) support for plugins. This means that its standard set of commands can now be extended with plugins.

The rollout and rollback commands in kubectl now support StatefulSet .

API


API changes include APIListChunking, a new approach to serving responses to LIST requests. Responses are now split into small chunks and returned to the client according to the limit it specifies. As a result, the server consumes less memory and CPU when returning very large lists; this behavior is expected to become the default for all informers in Kubernetes 1.9.
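From the client's point of view, chunking is a loop: request a page with a limit, read the continue token from the response metadata, and repeat until the token is empty. The sketch below simulates that loop against an in-memory list. The list_chunk function is a hypothetical stand-in for the API server, and real continue tokens are opaque strings rather than numeric offsets.

```python
def list_chunk(items, limit, continue_token=None):
    """Hypothetical server side: return one page plus a token for the next one."""
    start = int(continue_token) if continue_token else 0
    end = start + limit
    next_token = str(end) if end < len(items) else ""
    return {"items": items[start:end], "metadata": {"continue": next_token}}


def list_all(items, limit):
    """Client side: keep requesting pages until the continue token is empty."""
    collected, token, requests = [], None, 0
    while True:
        page = list_chunk(items, limit, token)
        requests += 1
        collected.extend(page["items"])
        token = page["metadata"]["continue"]
        if not token:
            return collected, requests


pods = [f"pod-{i}" for i in range(7)]
result, request_count = list_all(pods, limit=3)
```

Each individual response stays small, which is where the memory and CPU savings on the server come from.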

The CustomResourceDefinition API has learned to validate objects against a JSON schema (taken from the CRD specification). The alpha implementation is available as the CustomResourceValidation feature gate in kube-apiserver.
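A CRD with a validation schema might look roughly like the sketch below, built as a Python dictionary and printed as JSON. The CronTab resource, its group, and the replicas bounds are hypothetical; the layout (spec.validation.openAPIV3Schema under apiextensions.k8s.io/v1beta1) reflects my understanding of the API at the time of this release and should be checked against the reference.

```python
import json

# Hypothetical custom resource, for illustration only.
crd = {
    "apiVersion": "apiextensions.k8s.io/v1beta1",
    "kind": "CustomResourceDefinition",
    "metadata": {"name": "crontabs.example.com"},
    "spec": {
        "group": "example.com",
        "version": "v1",
        "scope": "Namespaced",
        "names": {"plural": "crontabs", "singular": "crontab", "kind": "CronTab"},
        "validation": {
            # JSON schema that every CronTab object must satisfy.
            "openAPIV3Schema": {
                "properties": {
                    "spec": {
                        "properties": {
                            "replicas": {"type": "integer", "minimum": 1, "maximum": 10}
                        }
                    }
                }
            }
        },
    },
}

replicas_schema = crd["spec"]["validation"]["openAPIV3Schema"]["properties"]["spec"]["properties"]["replicas"]
print(json.dumps(crd, indent=2))
```

With the feature gate enabled, an object that sets replicas outside the 1 to 10 range would be rejected by the API server instead of being stored.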

The garbage collector has received support for custom APIs added via CustomResourceDefinition or aggregated API servers. Because the controller refreshes periodically, expect a delay of about 30 seconds between adding an API and the garbage collector starting to work on it.

Workload API


The so-called Workload API is the core part of the Kubernetes API related to “workloads”; it includes DaemonSet, Deployment, ReplicaSet, and StatefulSet. These APIs have now been moved to the apps group, and with the release of Kubernetes 1.8 they have reached version v1beta2. Stabilizing the Workload API means placing these APIs in a separate group and making them as consistent as possible by removing, adding, and renaming fields, setting the same default values, and unifying validation. For example, the default spec.updateStrategy for StatefulSet and DaemonSet is now RollingUpdate, and defaulting of spec.selector for all Workload APIs has been disabled (due to incompatibility with kubectl apply and strategic merge patch), so the selector must now be defined explicitly by the user in the manifest. The summary ticket with details is #353.
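In practice, the selector change means that a manifest for apps/v1beta2 has to spell out spec.selector, and the selector has to match the labels in the pod template. The sketch below shows a minimal Deployment manifest as a Python dictionary together with a small pre-flight check; the application name, image, and the selector_matches_template helper are hypothetical.

```python
def selector_matches_template(manifest):
    """Check that spec.selector.matchLabels is set and matches the pod template."""
    spec = manifest["spec"]
    match_labels = spec.get("selector", {}).get("matchLabels")
    if not match_labels:
        # No defaulting any more: a missing selector is an error.
        return False
    labels = spec["template"]["metadata"].get("labels", {})
    return all(labels.get(key) == value for key, value in match_labels.items())


# Hypothetical application, for illustration only.
deployment = {
    "apiVersion": "apps/v1beta2",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [{"name": "web", "image": "nginx:1.13"}]},
        },
    },
}

# The same manifest written the old way, relying on selector defaulting.
legacy = {**deployment, "spec": {k: v for k, v in deployment["spec"].items() if k != "selector"}}

explicit_ok = selector_matches_template(deployment)
legacy_ok = selector_matches_template(legacy)
```

Manifests written for the older API groups often omit the selector, so a check like this can be useful when migrating them.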

Other


Among the other (quite numerous!) changes in the Kubernetes 1.8 release, I would note:


PS


During the preparation of Kubernetes 1.8, the project was tested with the following Docker versions: 1.11.2, 1.12.6, 1.13.1, and 17.03.2. For a list of known issues with them, see here. The same document, entitled “Introduction to v1.8.0”, contains a more complete list of all major changes.

We ourselves delayed upgrading the Kubernetes clusters we maintain from release 1.6 to 1.7 and carried out the main migration only 2 weeks ago (a few installations are still on version 1.6). A broad upgrade to the new release, 1.8, is planned for October.


Source: https://habr.com/ru/post/338230/

