
Kubernetes 1.10: a review of major innovations

Kubernetes 1.10 was released at the end of March. Keeping up our tradition of covering the most significant changes in each new Kubernetes release, we are publishing this review, based on CHANGELOG-1.10 as well as numerous issues, pull requests and design proposals. So what does the new K8s 1.10 bring?



Storage


Mount propagation - the ability of containers to mount volumes as rslave, so that directories mounted on the host become visible inside the container (the HostToContainer value), or as rshared, so that directories mounted in the container become visible on the host (the Bidirectional value). Status: beta (documentation is available on the site). Not supported on Windows.
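
To make the two values more concrete, here is a minimal sketch of a pod that uses HostToContainer propagation. The pod name, image and paths are hypothetical; only the mountPropagation field and its values come from the feature described above. Note that Bidirectional is only allowed for privileged containers.

```yaml
# Sketch only: the pod name, image and paths are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: mount-propagation-example
spec:
  containers:
  - name: app
    image: example.com/app:latest
    volumeMounts:
    - name: host-mnt
      mountPath: /mnt
      # rslave: mounts that appear under /mnt on the host after the
      # container has started become visible inside the container.
      mountPropagation: HostToContainer
  volumes:
  - name: host-mnt
    hostPath:
      path: /mnt
```
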
Added the ability to create local persistent storage (Local Persistent Storage), i.e. PersistentVolumes (PVs) can now be backed not only by network volumes but also by locally attached disks. The feature has two goals: a) to improve performance (local SSDs are faster than network drives), and b) to make it possible to use cheaper storage on bare-metal Kubernetes installations. This work goes hand in hand with Ephemeral Local Storage, whose limits (first introduced in K8s 1.8) were also improved in this release: they were promoted to beta and are now enabled by default.
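
A local PV might look roughly like the sketch below. The volume name, disk path, node name and storage class name are hypothetical. Because the disk physically lives on one node, the PV carries a node affinity that pins it there.

```yaml
# Sketch only: names, path and capacity are hypothetical.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    # Directory or mount point of the locally attached disk on the node.
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        # The volume is only usable from the node that owns the disk.
        - key: kubernetes.io/hostname
          operator: In
          values:
          - example-node
```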

Topology Aware Volume Scheduling became available (in beta). The idea is that the standard Kubernetes scheduler knows about (and takes into account) the topology constraints of volumes, and that the binding of PersistentVolumeClaims (PVCs) to PVs takes the scheduler's decisions into account. As a result, a pod can now request PVs that must be compatible with its other constraints: resource requirements, affinity / anti-affinity policies. At the same time, scheduling of pods that do not use constrained PVs should perform as before. Details are in design-proposals.
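
In practice this behavior is switched on per StorageClass through the volumeBindingMode field, which is not mentioned in the text above. A minimal sketch, with a hypothetical class name and a provisioner chosen for statically created local volumes:

```yaml
# Sketch only: the class name is hypothetical.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
# No dynamic provisioning: PVs of this class are created by hand.
provisioner: kubernetes.io/no-provisioner
# Delay PVC-to-PV binding until a pod using the claim is scheduled,
# so that the scheduler can take the volume's topology into account.
volumeBindingMode: WaitForFirstConsumer
```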

Among other improvements in volume / file system support:


Finally, additional metrics have been added (and declared stable) that expose the internal state of the storage subsystem in Kubernetes. They are intended for debugging and for getting a more detailed view of the cluster state. For example, for each volume (by volume_plugin) you can now find out the total time of mount / umount and attach / detach operations, the total time of provisioning and deletion, as well as the number of volumes in ActualStateofWorld and DesiredStateOfWorld, the number of bound / unbound PVCs and PVs, the number of PVCs in use, etc. For more details, see the documentation.

Kubelet, nodes and their management


Kubelet can now be configured through a versioned configuration file (instead of the traditional command-line flags), which has the KubeletConfiguration structure. For Kubelet to pick up the config, run it with the --config flag (see the documentation for details). This approach is recommended because it simplifies node deployment and configuration management. It was made possible by the new kubelet.config.k8s.io API group, which has beta status in Kubernetes 1.10. Example configuration file for Kubelet:

    kind: KubeletConfiguration
    apiVersion: kubelet.config.k8s.io/v1beta1
    evictionHard:
      memory.available: "200Mi"

With the new shareProcessNamespace option in PodSpec, the containers of a pod can now share a common process namespace (PID namespace). Previously this was not possible because Docker lacked the necessary support, which led to the emergence of an additional API that some container images have since come to rely on ... Now everything has been unified while maintaining backward compatibility. As a result, the Container Runtime Interface (CRI) supports three PID namespace modes: per container (each container has its own namespace), per pod (one namespace shared by the pod's containers) and per node. Readiness status: alpha.
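
A minimal sketch of the pod-level mode, with hypothetical pod, container and image names. Since the feature is alpha, it is assumed here that the PodShareProcessNamespace feature gate has been enabled on the cluster.

```yaml
# Sketch only: names and images are hypothetical.
# Assumes the alpha feature gate PodShareProcessNamespace is enabled.
apiVersion: v1
kind: Pod
metadata:
  name: shared-pid-example
spec:
  # All containers of this pod see each other's processes.
  shareProcessNamespace: true
  containers:
  - name: app
    image: example.com/app:latest
  - name: debug
    image: example.com/debug-tools:latest
```

With this setting, a tool such as ps run in the debug container would also list the processes of the app container.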

Another significant change in CRI is support for the Windows Container Configuration. Until now, only Linux containers could be configured through CRI, even though the Open Container Initiative (OCI) Runtime Specification also describes settings for other platforms, Windows in particular. Now CRI supports memory and processor limits for Windows containers (alpha version).

In addition, three features developed by the Resource Management Working Group reached beta status:

  1. CPU Manager (assigning specific processor cores to pods - read more about it in the article about K8s 1.8 );
  2. Huge Pages (the ability for pods to use 2Mi and 1Gi huge pages, which is important for applications that consume large amounts of memory);
  3. Device Plugin (a framework for vendors that allows them to advertise resources to kubelet, for example GPU, NIC, FPGA, InfiniBand, etc., without the need to modify the core Kubernetes code).
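
As a rough illustration of the first two items, the pod sketched below requests whole CPUs and 2Mi huge pages. All names are hypothetical. The sketch assumes that the node has pre-allocated 2Mi huge pages and that kubelet runs with the static CPU Manager policy (--cpu-manager-policy=static); under that policy, containers of Guaranteed pods with integer CPU requests are the ones that get exclusive cores.

```yaml
# Sketch only: names, image and sizes are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: resource-mgmt-example
spec:
  containers:
  - name: app
    image: example.com/app:latest
    resources:
      # Only limits are set, so requests default to the same values and
      # the pod falls into the Guaranteed QoS class.
      limits:
        cpu: "2"              # integer CPU count
        memory: 256Mi
        hugepages-2Mi: 100Mi  # backed by pre-allocated 2Mi huge pages
    volumeMounts:
    - name: hugepage
      mountPath: /hugepages
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages
```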

The number of processes running in a pod can now be limited using the --pod-max-pids flag of the kubelet command. The implementation has alpha status and requires enabling the SupportPodPidsLimit feature.

Since containerd 1.1 gained native support for CRI v1alpha2, Kubernetes 1.10 can work directly with containerd 1.1 without the cri-containerd intermediary (we wrote more about it at the end of this article ). CRI-O also updated its CRI version to v1alpha2, and the CRI (Container Runtime Interface) itself added support for specifying the container GID (in addition to the UID) in LinuxSandboxSecurityContext and in LinuxContainerSecurityContext. This support is implemented for dockershim and has alpha status.

Network


The option of using CoreDNS instead of kube-dns has reached beta status. In particular, this brought the ability to migrate to CoreDNS when using kubeadm to upgrade a cluster that runs kube-dns: in this case, kubeadm will generate a CoreDNS configuration (i.e. a Corefile) based on the ConfigMap from kube-dns.

Traditionally, /etc/resolv.conf in a pod is managed by kubelet, and its contents are generated based on pod.dnsPolicy. Kubernetes 1.10 provides (in beta status) support for customizing resolv.conf per pod. To do this, a dnsParams field has been added to PodSpec, which allows you to override the existing DNS settings. These are the names used in the design proposal; in the released API the field is called dnsConfig and is used together with dnsPolicy: None. Read more in design-proposals. Illustration of using dnsPolicy: Custom with dnsParams, as written in the design proposal:

    # Pod spec
    apiVersion: v1
    kind: Pod
    metadata: {"namespace": "ns1", "name": "example"}
    spec:
      ...
      dnsPolicy: Custom
      dnsParams:
        nameservers: ["1.2.3.4"]
        search:
        - ns1.svc.cluster.local
        - my.dns.search.suffix
        options:
        - name: ndots
          value: 2
        - name: edns0

An option has been added to kube-proxy that allows defining a range of IP addresses for NodePort, i.e. filtering the allowed addresses with --nodeport-addresses (the default value is 0.0.0.0/0, i.e. accept everything, which matches the current NodePort behavior). Implementations are provided for the iptables, Linux userspace, IPVS, Windows userspace and winkernel modes of kube-proxy. Status: alpha.

Authentication


Added new authentication methods (alpha version):

  1. external credential providers for clients: responding to long-standing requests from K8s users for exec-based plugins, kubectl (client-go) gained support for executable plug-ins that obtain authentication data by running an arbitrary command and reading its output (the GCP plugin can also be configured to invoke commands other than gcloud ). One application is that cloud providers will be able to build their own authentication systems (instead of using the standard Kubernetes mechanisms);
  2. TokenRequest API for obtaining JWTs (JSON Web Tokens) that are bound to specific clients (audience) and to a lifetime.
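
For the first item, a kubeconfig user entry that delegates to an exec plug-in might look roughly like this. The user name, the command and its arguments are hypothetical, and the alpha API version shown is the one assumed for this release.

```yaml
# Sketch only: the user name, command, arguments and env are hypothetical.
apiVersion: v1
kind: Config
users:
- name: example-user
  user:
    exec:
      # Alpha API group/version assumed for this release.
      apiVersion: client.authentication.k8s.io/v1alpha1
      # Executable that prints the credentials on stdout.
      command: example-credential-plugin
      args:
      - get-token
      env:
      - name: EXAMPLE_PROFILE
        value: dev
```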

In addition, the ability to restrict node access to certain APIs (using the Node authorization mode and the NodeRestriction admission plugin) has reached stable status. It grants nodes permissions only for a limited set of objects and their associated secrets.

CLI


Progress has been made on how the output of kubectl get and kubectl describe is produced. The overall goal of the initiative, which reached beta status in Kubernetes 1.10, is for the columns of the tabular output to be obtained on the server side (rather than on the client), which improves the user experience when working with extensions. The server-side work begun earlier (in K8s 1.8) has been brought to beta, and major changes have been made on the client side.

kubectl port-forward can now use a resource name to select a suitable pod (with the --pod-running-timeout flag to wait until at least one pod is running), and it also supports specifying a service for port forwarding (for example: kubectl port-forward svc/myservice 8443:443 ).

New short names for kubectl resources: cj for CronJobs and crds for CustomResourceDefinition. For example, the command kubectl get crds is now available.

Other changes



Compatibility




Source: https://habr.com/ru/post/353114/

