In this article, I would like to show you a cool technology that I successfully use with Kubernetes. It can be really useful for building large clusters.
From this point on, you no longer have to think about installing the OS and individual packages on each node. Why? You can do all of this automatically through a Dockerfile!
The fact that you can buy hundreds of new servers, add them to your working environment, and have them ready for use almost instantly is really amazing!
Intrigued? Let me walk you through how it all works.
To begin with, we need to understand exactly how this scheme works.
In short, for all the nodes we are preparing a single image with the OS, Docker, Kubelet and all the rest.
This system image, along with the kernel, is built automatically by CI using the Dockerfile.
The final nodes load the OS and the kernel from this image directly through the network.
The nodes use overlayfs as the root filesystem, so if you reboot, any changes will be lost (as is the case with docker containers).
There is a main config file that can describe mount points and commands that should be executed during node boot (for example, the command to add an ssh key, or kubeadm join).
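To give a feel for it, here is a minimal illustrative sketch of such a config (the variable names follow the lts.conf convention shown in full later in this article; the device path, key and join token are placeholders, not real values):

```ini
[default]
# Mount a persistent partition for ssh host keys (skipped if absent)
FSTAB_01_SSH = "/dev/data/ssh /etc/ssh ext4 nofail,noatime 0 0"
# Authorize an ssh key and join the cluster at boot
RCFILE_01_SSH_KEY = "mkdir -p /root/.ssh; echo 'ssh-rsa AAAA...' >> /root/.ssh/authorized_keys"
RCFILE_02_JOIN_K8S = "kubeadm join --token <token> <apiserver>:6443 --discovery-token-ca-cert-hash sha256:<hash>"
```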
We will use the LTSP project, because it gives us everything we need to organize network booting.
In general, LTSP is a bundle of shell scripts that makes our life much easier.
It provides the initramfs module, several helper scripts, and a configuration system that prepares the system at an early boot stage, even before init is called.
This is what the image preparation procedure looks like — we simply run the
ltsp-build-image
command. Immediately after this, you will get a compressed image of the chroot with all the installed software inside.
Each node will download this image at boot time and use it as rootfs.
To update, just restart the node: the new image will be downloaded and used as the rootfs.
The server part of LTSP in our case includes only two components:

- a TFTP server, which serves the bootloader, the kernel and the initramfs;
- an NBD server, which serves the squashed rootfs image.

You also need to have a DHCP server on your network; it hands out the next-server and filename options that point the booting nodes at the LTSP server.

Description of the node loading process

The node boots over the network: it receives an IP address from DHCP together with the next-server and filename options, loads the bootloader and the kernel with the initramfs over TFTP, and then mounts the squashed image from the NBD server as its rootfs.
As I said before, I prepare the LTSP server with the squashed image automatically using a Dockerfile. This method is quite good, because all the build steps are described in your git repository: you can manage versions, use tags, use CI and everything else you would use for ordinary Docker projects.
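For example, the CI part can be a single job that builds the image and pushes it to a registry on every tag. This is only a sketch of the idea (GitLab CI syntax; the job name and registry path are my assumptions, not from the original setup):

```yaml
# .gitlab-ci.yml (hypothetical)
build-ltsp-image:
  stage: build
  script:
    - docker build -t registry.example.org/example/ltsp:$CI_COMMIT_TAG .
    - docker push registry.example.org/example/ltsp:$CI_COMMIT_TAG
  only:
    - tags
```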
On the other hand, you can deploy an LTSP server by completing all the steps by hand; this can be good for training purposes and for understanding the basic principles.
Run the commands listed in the article manually if you just want to try LTSP without a Dockerfile.
At the moment, LTSP has some flaws, and the project's authors are not very willing to accept fixes. Luckily, LTSP is easily customizable, so I prepared a few patches for myself; I will include them here.
Maybe someday I will get around to maintaining a fork, if the community receives my approach warmly.
We will use multi-stage builds in our Dockerfile, so that only the necessary parts end up in the final docker image; the remaining unused parts will be excluded from it.
ltsp-base (the base image with the LTSP server)
  |
  |---basesystem
  |    (the chroot environment with the base system)
  |    |
  |    |---builder
  |    |    (an optional stage for building drivers and modules from source)
  |    |
  |    '---ltsp-image
  |         (the stage where docker and kubelet are installed and the squashed image is built)
  |
  '---final-stage
       (copies the squashed image, the kernel and the initramfs from the ltsp-image stage)
OK, let's get started, this is the first part of our Dockerfile:
FROM ubuntu:16.04 as ltsp-base

ADD nbd-server-wrapper.sh /bin/
ADD /patches/feature-grub.diff /patches/feature-grub.diff

RUN apt-get -y update \
 && apt-get -y install \
      ltsp-server \
      tftpd-hpa \
      nbd-server \
      grub-common \
      grub-pc-bin \
      grub-efi-amd64-bin \
      curl \
      patch \
 && sed -i 's|in_target mount|in_target_nofail mount|' \
      /usr/share/debootstrap/functions \
    # Add EFI support for Grub (#1745251)
 && patch -p2 -d /usr/sbin < /patches/feature-grub.diff \
 && rm -rf /var/lib/apt/lists \
 && apt-get clean
At this point, our docker image already has the following installed: the LTSP server, the TFTP server (tftpd-hpa), the NBD server, grub (with the EFI patch applied) and a patched debootstrap.
At this stage, we will prepare the chroot environment with the base system and install the main software with the kernel.
We will use plain debootstrap instead of ltsp-build-client to prepare the image, because ltsp-build-client would install a GUI and some other things that we obviously do not need for deploying servers.
FROM ltsp-base as basesystem

ARG DEBIAN_FRONTEND=noninteractive

# Prepare the base system
RUN mkdir -p /opt/ltsp/amd64/proc/self/fd \
 && touch /opt/ltsp/amd64/proc/self/fd/3 \
 && debootstrap --arch amd64 xenial /opt/ltsp/amd64 \
 && rm -rf /opt/ltsp/amd64/proc/*

# Install updates
RUN echo "\
deb http://archive.ubuntu.com/ubuntu xenial main restricted universe multiverse\n\
deb http://archive.ubuntu.com/ubuntu xenial-updates main restricted universe multiverse\n\
deb http://archive.ubuntu.com/ubuntu xenial-security main restricted universe multiverse" \
      > /opt/ltsp/amd64/etc/apt/sources.list \
 && ltsp-chroot apt-get -y update \
 && ltsp-chroot apt-get -y upgrade

# Install the LTSP client packages
RUN ltsp-chroot apt-get -y install ltsp-client-core

# Apply the initramfs patches:
# 1: read parameters from /etc/lts.conf during the initramfs stage (#1680490)
# 2: run PREINIT scripts from lts.conf
ADD /patches /patches
RUN patch -p4 -d /opt/ltsp/amd64/usr/share < /patches/feature_initramfs_params_from_lts_conf.diff \
 && patch -p3 -d /opt/ltsp/amd64/usr/share < /patches/feature_preinit.diff

# Write the LTSP_NBD_TO_RAM option to load the image into RAM:
RUN echo "[Default]\nLTSP_NBD_TO_RAM = true" \
      > /opt/ltsp/amd64/etc/lts.conf

# Install the needed packages
RUN echo 'APT::Install-Recommends "0";\nAPT::Install-Suggests "0";' \
      >> /opt/ltsp/amd64/etc/apt/apt.conf.d/01norecommend \
 && ltsp-chroot apt-get -y install \
      software-properties-common \
      apt-transport-https \
      ca-certificates \
      ssh \
      bridge-utils \
      pv \
      jq \
      vlan \
      bash-completion \
      screen \
      vim \
      mc \
      lm-sensors \
      htop \
      jnettop \
      rsync \
      curl \
      wget \
      tcpdump \
      arping \
      apparmor-utils \
      nfs-common \
      telnet \
      sysstat \
      ipvsadm \
      ipset \
      make

# Install the kernel
RUN ltsp-chroot apt-get -y install linux-generic-hwe-16.04
Please note that some packages, such as lvm2, may cause problems: they are not fully adapted to being installed in an unprivileged chroot, and their postinstall scripts try to invoke privileged commands that can fail with errors and block the installation of the whole package.

Solution:
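The original solution is not shown here, but one standard Debian workaround (my assumption, not taken from the article) is to add a policy-rc.d stub inside the chroot, so that postinstall scripts cannot start services. A sketch as an extra Dockerfile step:

```dockerfile
# Hypothetical workaround: forbid service invocations inside the chroot
# (invoke-rc.d treats exit code 101 as "action not allowed")
RUN printf '#!/bin/sh\nexit 101\n' > /opt/ltsp/amd64/usr/sbin/policy-rc.d \
 && chmod +x /opt/ltsp/amd64/usr/sbin/policy-rc.d \
 && ltsp-chroot apt-get -y install lvm2 \
 && rm -f /opt/ltsp/amd64/usr/sbin/policy-rc.d
```

This may not cover every failure mode (some postinstall scripts call udev or mount directly), but it handles the common case of service start attempts.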
At this stage, we can build all the necessary software and kernel modules from source; it is very cool that this can also be done here, in fully automatic mode.
Skip this step if you have nothing to build from source.
I will give an example of installing the latest version of the MLNX_EN driver:
FROM basesystem as builder

# Copy cpuinfo into the chroot (it is needed during the build)
RUN cp /proc/cpuinfo /opt/ltsp/amd64/proc/cpuinfo

# Build the Mellanox driver
RUN ltsp-chroot sh -cx \
   'VERSION=4.3-1.0.1.0-ubuntu16.04-x86_64 \
 && curl -L http://www.mellanox.com/downloads/ofed/MLNX_EN-${VERSION%%-ubuntu*}/mlnx-en-${VERSION}.tgz \
  | tar xzf - \
 && export \
      DRIVER_DIR="$(ls -1 | grep "MLNX_OFED_LINUX-\|mlnx-en-")" \
      KERNEL="$(ls -1t /lib/modules/ | head -n1)" \
 && cd "$DRIVER_DIR" \
 && ./*install --kernel "$KERNEL" --without-dkms --add-kernel-support \
 && cd - \
 && rm -rf "$DRIVER_DIR" /tmp/mlnx-en* /tmp/ofed*'

# Save the built modules
RUN ltsp-chroot sh -c \
   'export KERNEL="$(ls -1t /usr/src/ | grep -m1 "^linux-headers" | sed "s/^linux-headers-//g")" \
 && tar cpzf /modules.tar.gz /lib/modules/${KERNEL}/updates'
At this stage we will install what we built in the previous step:
FROM basesystem as ltsp-image

# Copy the modules built in the previous stage
COPY --from=builder /opt/ltsp/amd64/modules.tar.gz /opt/ltsp/amd64/modules.tar.gz

# Install the kernel modules
RUN ltsp-chroot sh -c \
   'export KERNEL="$(ls -1t /usr/src/ | grep -m1 "^linux-headers" | sed "s/^linux-headers-//g")" \
 && tar xpzf /modules.tar.gz \
 && depmod -a "${KERNEL}" \
 && rm -f /modules.tar.gz'
Now we will make additional changes to complete our LTSP image:
# Install docker
RUN ltsp-chroot sh -c \
   'curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - \
 && echo "deb https://download.docker.com/linux/ubuntu xenial stable" \
      > /etc/apt/sources.list.d/docker.list \
 && apt-get -y update \
 && apt-get -y install \
      docker-ce=$(apt-cache madison docker-ce | grep 18.06 | head -1 | awk "{print $ 3}")'

# Configure docker options
RUN DOCKER_OPTS="$(echo \
      --storage-driver=overlay2 \
      --iptables=false \
      --ip-masq=false \
      --log-driver=json-file \
      --log-opt=max-size=10m \
      --log-opt=max-file=5 \
    )" \
 && sed "/^ExecStart=/ s|$| $DOCKER_OPTS|g" \
      /opt/ltsp/amd64/lib/systemd/system/docker.service \
      > /opt/ltsp/amd64/etc/systemd/system/docker.service

# Install kubeadm, kubelet and kubectl
RUN ltsp-chroot sh -c \
   'curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - \
 && echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" \
      > /etc/apt/sources.list.d/kubernetes.list \
 && apt-get -y update \
 && apt-get -y install kubelet kubeadm kubectl cri-tools'

# Disable automatic updates
RUN rm -f /opt/ltsp/amd64/etc/apt/apt.conf.d/20auto-upgrades

# Disable apparmor profiles
RUN ltsp-chroot find /etc/apparmor.d \
      -maxdepth 1 \
      -type f \
      -name "sbin.*" \
      -o -name "usr.*" \
      -exec ln -sf "{}" /etc/apparmor.d/disable/ \;

# Write the kernel options (cmdline)
RUN KERNEL_OPTIONS="$(echo \
      init=/sbin/init-ltsp \
      forcepae \
      console=tty1 \
      console=ttyS0,9600n8 \
      nvme_core.default_ps_max_latency_us=0 \
    )" \
 && sed -i "/^CMDLINE_LINUX_DEFAULT=/ s|=.*|=\"${KERNEL_OPTIONS}\"|" \
      "/opt/ltsp/amd64/etc/ltsp/update-kernels.conf"
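The sed trick used for the docker options, which appends the flags to the end of the ExecStart line, can be checked in isolation. Here is a small standalone sketch with a stand-in unit file (the file content and option list are illustrative; only the sed expression is the same):

```shell
# Create a sample systemd unit with an ExecStart line
cat > /tmp/docker.service <<'EOF'
[Service]
ExecStart=/usr/bin/dockerd -H fd://
EOF

# Append the options to the end of the ExecStart line, as in the Dockerfile
DOCKER_OPTS="--storage-driver=overlay2 --iptables=false"
sed "/^ExecStart=/ s|$| $DOCKER_OPTS|g" /tmp/docker.service > /tmp/docker.service.patched

cat /tmp/docker.service.patched
```

Only lines matching `^ExecStart=` are touched; the rest of the unit file passes through unchanged.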
Now we will make a squashed image from our chroot:
# Cleanup caches
RUN rm -rf /opt/ltsp/amd64/var/lib/apt/lists \
 && ltsp-chroot apt-get clean

# Build the squashed image
RUN ltsp-update-image
At the final stage, we will save only our squashed image and the kernel with the initramfs:
FROM ltsp-base
COPY --from=ltsp-image /opt/ltsp/images /opt/ltsp/images
COPY --from=ltsp-image /etc/nbd-server/conf.d /etc/nbd-server/conf.d
COPY --from=ltsp-image /var/lib/tftpboot /var/lib/tftpboot
Great, now we have a docker image that includes: the LTSP server, the TFTP server, the NBD server, the kernel with the initramfs, and the squashed rootfs image.
OK, now that our Docker image with the LTSP server, the kernel, the initramfs and the squashed rootfs is fully ready, we can run the deployment with it.
We can do it as usual, but there is another issue that we have to solve.
Unfortunately, we cannot use an ordinary Kubernetes service for our deployment, because during boot the nodes are not yet part of the Kubernetes cluster and need to use an externalIP. However, Kubernetes always applies NAT to externalIPs, and currently there is no way to change this behavior.
I know two ways to avoid this: use hostNetwork: true
or use pipework. The second option will also give us fault tolerance: in case of failure, the IP address will move to another node together with the container. Unfortunately, pipework is not a native method, and it is less secure.
If you know of any more suitable solution, please tell us about it.
I will give an example of deployment with hostNetwork:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ltsp-server
  labels:
    app: ltsp-server
spec:
  selector:
    matchLabels:
      name: ltsp-server
  replicas: 1
  template:
    metadata:
      labels:
        name: ltsp-server
    spec:
      hostNetwork: true
      containers:
      - name: tftpd
        image: registry.example.org/example/ltsp:latest
        command: [ "/usr/sbin/in.tftpd", "-L", "-u", "tftp", "-a", ":69", "-s", "/var/lib/tftpboot" ]
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", "cd /var/lib/tftpboot/ltsp/amd64; ln -sf config/lts.conf ." ]
        volumeMounts:
        - name: config
          mountPath: "/var/lib/tftpboot/ltsp/amd64/config"
      - name: nbd-server
        image: registry.example.org/example/ltsp:latest
        command: [ "/bin/nbd-server-wrapper.sh" ]
      volumes:
      - name: config
        configMap:
          name: ltsp-config
As you may have noticed, a ConfigMap with the lts.conf file is also used here.
As an example, I will give part of my config:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ltsp-config
data:
  lts.conf: |
    [default]
    KEEP_SYSTEM_SERVICES = "ssh ureadahead dbus-org.freedesktop.login1 systemd-logind polkitd cgmanager ufw rpcbind nfs-kernel-server"

    PREINIT_00_TIME = "ln -sf /usr/share/zoneinfo/Europe/Prague /etc/localtime"
    PREINIT_01_FIX_HOSTNAME = "sed -i '/^127.0.0.2/d' /etc/hosts"
    PREINIT_02_DOCKER_OPTIONS = "sed -i 's|^ExecStart=.*|ExecStart=/usr/bin/dockerd -H fd:// --storage-driver overlay2 --iptables=false --ip-masq=false --log-driver=json-file --log-opt=max-size=10m --log-opt=max-file=5|' /etc/systemd/system/docker.service"

    FSTAB_01_SSH = "/dev/data/ssh /etc/ssh ext4 nofail,noatime,nodiratime 0 0"
    FSTAB_02_JOURNALD = "/dev/data/journal /var/log/journal ext4 nofail,noatime,nodiratime 0 0"
    FSTAB_03_DOCKER = "/dev/data/docker /var/lib/docker ext4 nofail,noatime,nodiratime 0 0"

    # Each command will stop script execution when fail
    RCFILE_01_SSH_SERVER = "cp /rofs/etc/ssh/*_config /etc/ssh; ssh-keygen -A"
    RCFILE_02_SSH_CLIENT = "mkdir -p /root/.ssh/; echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDBSLYRaORL2znr1V4a3rjDn3HDHn2CsvUNK1nv8+CctoICtJOPXl6zQycI9KXNhANfJpc6iQG1ZPZUR74IiNhNIKvOpnNRPyLZ5opm01MVIDIZgi9g0DUks1g5gLV5LKzED8xYKMBmAfXMxh/nsP9KEvxGvTJB3OD+/bBxpliTl5xY3Eu41+VmZqVOz3Yl98+X8cZTgqx2dmsHUk7VKN9OZuCjIZL9MtJCZyOSRbjuo4HFEssotR1mvANyz+BUXkjqv2pEa0I2vGQPk1VDul5TpzGaN3nOfu83URZLJgCrX+8whS1fzMepUYrbEuIWq95esjn0gR6G4J7qlxyguAb9 admin@kubernetes' >> /root/.ssh/authorized_keys"
    RCFILE_03_KERNEL_DEBUG = "sysctl -w kernel.unknown_nmi_panic=1 kernel.softlockup_panic=1; modprobe netconsole netconsole=@/vmbr0,@10.9.0.15/"
    RCFILE_04_SYSCTL = "sysctl -w fs.file-max=20000000 fs.nr_open=20000000 net.ipv4.neigh.default.gc_thresh1=80000 net.ipv4.neigh.default.gc_thresh2=90000 net.ipv4.neigh.default.gc_thresh3=100000"
    RCFILE_05_FORWARD = "echo 1 > /proc/sys/net/ipv4/ip_forward"
    RCFILE_06_MODULES = "modprobe br_netfilter"
    RCFILE_07_JOIN_K8S = "kubeadm join --token 2a4576.504356e45fa3d365 10.9.0.20:6443 --discovery-token-ca-cert-hash sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
A few notes on the variables:

- The FSTAB_* entries are written to the /etc/fstab file. Note the nofail option: it makes the boot continue without errors if the partition does not exist.
- The RCFILE_* commands are written to the rc.local file, which is called by systemd at boot time.
- The last of them runs the kubeadm join command, which adds the node to the Kubernetes cluster.

More detailed information on all the variables can be found on the lts.conf manual page.
Now you can configure your DHCP server. Essentially, all you need is to specify the next-server and filename options.

I use the ISC-DHCP server; here is an example dhcpd.conf:
shared-network ltsp-network {
  subnet 10.9.0.0 netmask 255.255.0.0 {
    authoritative;
    default-lease-time          -1;
    max-lease-time              -1;

    option domain-name          "example.org";
    option domain-name-servers  10.9.0.1;
    option routers              10.9.0.1;
    next-server                 ltsp-1;  # write ltsp-server hostname here

    if option architecture = 00:07 {
      filename "/ltsp/amd64/grub/x86_64-efi/core.efi";
    } else {
      filename "/ltsp/amd64/grub/i386-pc/core.0";
    }

    range 10.9.200.0 10.9.250.254;
  }
}
You can start with this, but as for me, I have several LTSP servers, and for each node I configure a separate static IP address and the necessary options via an Ansible playbook.
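For reference, such a per-node static assignment in ISC DHCP is just a host block; the MAC address, hostname and IP below are made up:

```
host ltsp-node-1 {
  hardware ethernet aa:bb:cc:dd:ee:ff;
  fixed-address 10.9.1.1;
  option host-name "ltsp-node-1";
}
```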
Try booting your first node. If everything was done correctly, you will get a running system on it, and the node will be added to the Kubernetes cluster.
Now you can try making your own changes.
If you need something more, please note that LTSP can be very easily customized for your needs. Feel free to look at the source, there you can find quite a lot of answers.
Join our Telegram channel: @ltsp_ru .
Source: https://habr.com/ru/post/423785/