Understanding Linux installation and booting, using Arch Linux as an example

First we will install Arch Linux and turn it into a boot server. From there we will prepare a new compact system, to which we will add a minimal graphical environment and the most essential functionality (for example, Firefox). We will teach our system to boot over the network, even on computers with UEFI. Then we will switch it completely to read-only mode (make it “live”), which will let us run the system simultaneously on fifty not-so-fast computers from a single boot server. All this will work even on a cheap 100 Mbit/s network, which we will additionally “overclock” roughly twofold.

No backdoors planted on hard disks will frighten you, because we will have no disks. No reckless user's hands will break anything, because after a reboot the system returns to its original state. Of course, you will also learn to modify the boot system yourself so that it contains only the functionality you need and nothing more. Along the way we will figure out how and in what order Linux boots, and what it consists of. Knowledge, as we know, is priceless, so I am sharing it for free.

I will try to explain what is happening without going too deep, sometimes running a little ahead, but everything will be sorted out properly later on. So that you have no trouble following along, I assume that you have already worked with some ready-made Linux distribution and have tried writing simple scripts in nano or another text editor. If you are new to Arch Linux, you will learn a lot; if you are an old hand, you will learn less, but I hope that either way you will come to love Linux even more.
There turned out to be a lot of material, so, in keeping with established Hollywood tradition, the series comes in several parts:
continuation;
ending.

Now we will install Arch Linux in VirtualBox; the resulting machine can be cloned and run on almost any computer with a legacy BIOS without additional configuration. Along the way we will learn the basic techniques for working with systemd and how to use it to start arbitrary services and programs at boot. We will also see what stages Linux goes through while booting, and write our own handler (hook) to place in the initramfs. Don't know what an initramfs is? Then read on.

There are several reasons why the choice fell on Arch Linux. First, it is my long-time nimble friend and loyal helper. Gentoo, as they say on the Internet, is even nimbler, but I do not want to build a system from source. Second, ready-made distributions always contain a lot of excess, and transferring large amounts of data can seriously hurt network performance. Third, nothing can be seen behind the broad back of an “automatic installer”. Fourth, systemd is gradually making its way into all distributions, even Debian, so using Arch Linux as an example lets us dig thoroughly into the future of the ready-made distributions. On top of that, the system we prepare later can be booted over the network not only from a server running in a virtual machine, but also from a regular computer, for example a Raspberry Pi, or even a Western Digital My Cloud (verified), which runs Debian.

Preparatory work

Download the latest image using the link on the official site. In Moscow, for example, the download from Yandex servers is very fast; if yours drags on, just try another mirror. I recommend remembering which mirror you used, because that information will come in handy later.

In VirtualBox, create a new virtual machine (for example, with 1 GB of RAM and 8 GB of storage). In the network settings, select the “Bridged Adapter” connection type and the network adapter that has Internet access. Attach the downloaded image to the virtual CD-ROM drive. If you cannot wait to start working with real hardware, take a flash drive, write the image to it with Win32 Disk Imager (if you work under Windows), and boot the future server directly from it.

Turn on the machine, wait for the command prompt to appear, and set a root password, without which SSH will not let us in:

passwd

Start the SSH server with the command:

 systemctl start sshd

It remains to find the IP address of the machine, examining the output of the command:

 ip addr | grep "scope global"

The address will be indicated immediately after the “inet”.
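
If you would rather not pick the address out by eye, the same filtering can be done with awk. The sketch below runs on a hard-coded sample line so that it can be tried anywhere; the address and interface name in it are invented for illustration.

```shell
# Sample line in the format printed by `ip addr` (address and interface are made up).
sample='    inet 192.168.1.50/24 brd 192.168.1.255 scope global enp0s3'

# Take the second field of the "scope global" line and strip the /prefix length.
addr=$(printf '%s\n' "$sample" | awk '/scope global/ { sub(/\/.*/, "", $2); print $2 }')
echo "$addr"
```

On the live system, feed the awk command from `ip addr` instead of the sample line.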

Now Windows users can connect to the machine using PuTTY, copy the commands from here, and paste them in with a right-click.

Basic setup

Next, I will briefly describe the standard Arch Linux installation. If you have questions, you will probably find the answers in the detailed installation guide for beginners. The wiki is simply wonderful, and the English-language wiki is also more up to date, so try to use it.

Prepare the disk using cfdisk (a console utility with a simple, intuitive interface). One partition is enough for us; just do not forget to mark it as bootable:

 cfdisk /dev/sda

Format it as ext4 and set a label, for example HABR:

 mkfs.ext4 /dev/sda1 -L "HABR"

Mount the future root partition at /mnt:

 export root=/mnt
 mount /dev/sda1 $root

Arch Linux is usually installed over the Internet, so immediately after installation you will have the newest version. The list of mirrors is in the file /etc/pacman.d/mirrorlist. Try to remember where you downloaded the image from and move those servers to the very top of the list; this will save you a lot of time in the next step. Usually these are the servers geographically closest to you.

 nano /etc/pacman.d/mirrorlist
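
The reordering can also be done without an editor. The following is only a sketch: it works on a temporary file with two sample entries rather than on the real mirrorlist, and the `preferred` substring is an assumption that you would replace with something identifying your own mirror.

```shell
# Work on a temporary copy; the two entries below are sample lines, not a real mirrorlist.
mirrorlist=$(mktemp)
cat > "$mirrorlist" <<'EOF'
Server = https://mirror.example.org/archlinux/$repo/os/$arch
Server = https://mirror.yandex.ru/archlinux/$repo/os/$arch
EOF

preferred='yandex'   # substring identifying the mirror you downloaded the image from

# Matching lines first, then everything else in its original order.
{ grep "$preferred" "$mirrorlist"; grep -v "$preferred" "$mirrorlist"; } > "$mirrorlist.new"
first=$(head -n 1 "$mirrorlist.new")
echo "$first"
```

After checking the result, the new file would be copied over /etc/pacman.d/mirrorlist.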

Install the basic set of packages and the set for developers:

 pacstrap -i $root base base-devel

Now we use the arch-chroot command, which temporarily changes the root directory to any other directory that has the structure of a Linux root file system. Programs launched from there will not know that anything exists outside it. In effect, we find ourselves inside our new system with administrator rights:

 arch-chroot $root

Notice how the command prompt has changed.

Choose the locales that we plan to use. I suggest enabling en_US.UTF-8 UTF-8 and ru_RU.UTF-8 UTF-8. In a text editor, simply uncomment the corresponding lines:

 nano /etc/locale.gen

Now we generate the selected localizations:

 locale-gen

If everything went well, then you will see something like this:

 Generating locales...
 en_US.UTF-8... done
 ru_RU.UTF-8... done
 Generation complete.

Set the language to be used by default:

 echo LANG=ru_RU.UTF-8 > /etc/locale.conf

As well as the layout and font in the console:

 echo -e "KEYMAP=ru\nFONT=cyr-sun16\nFONT_MAP=" > /etc/vconsole.conf

Specify the time zone (I use Moscow time):

 ln -s /usr/share/zoneinfo/Europe/Moscow /etc/localtime

Give our future server a name:

 echo "HabraBoot" > /etc/hostname

Now set the administrator password. We do this primarily because SSH will not let us connect to the system without a password. We will not dwell here on why running a system unprotected by a password is unwise.

 passwd

Enter the password twice and make sure you see the message “password updated successfully”.

Add a new user named username (you can choose any name), give them administrator rights, and set a password for them. We do this for the same reasons as before, and also because in the current version of Arch we cannot build packages from the AUR as root (the Arch User Repository is a community repository with programs that are not in the main repositories):

 useradd -m username

Edit the /etc/sudoers settings file using nano:

 EDITOR=nano visudo

Add one more line to it immediately after the line “root ALL=(ALL) ALL”:

 username ALL=(ALL) ALL

And set the password for username:

 passwd username

Now we need to install a bootloader on the internal drive so that the system can boot from it on its own. I suggest GRUB, because it will come in handy again later. Install the package with pacman, the standard Arch Linux package manager:

 pacman -S grub

Write the bootloader to the MBR (Master Boot Record) of our internal drive.

 grub-install --target=i386-pc --force --recheck /dev/sda

If everything went well, you will see “Installation finished. No error reported.”

Exit the chroot:

 exit

And notice how the command prompt has changed back.

We will use disk labels; a detailed explanation of why will follow later.

Uncomment the line GRUB_DISABLE_LINUX_UUID=true so that drive UUIDs are not used:

 nano $root/etc/default/grub

Generate the bootloader configuration file, again using arch-chroot. This time it enters the chroot, runs a single command, and exits automatically:

 arch-chroot $root grub-mkconfig --output=/boot/grub/grub.cfg

In the configuration file we need to replace all references to /dev/sda1 with LABEL=HABR:

 mv $root/boot/grub/grub.cfg $root/boot/grub/grub.cfg.autoconf && cat $root/boot/grub/grub.cfg.autoconf | sed 's/\(root=\)\/dev\/sda1/\1LABEL=HABR/g' > $root/boot/grub/grub.cfg
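
To see what the sed expression does before letting it loose on grub.cfg, it can be tried on a single line. The sample line below is modeled on the kernel line GRUB generates; `\(root=\)` captures the prefix and `\1` puts it back in front of the label.

```shell
# Sample kernel line as it might appear in the generated grub.cfg.
line='linux /boot/vmlinuz-linux root=/dev/sda1 rw quiet'

# Same substitution as in the command above, applied to the sample line only.
result=$(printf '%s\n' "$line" | sed 's/\(root=\)\/dev\/sda1/\1LABEL=HABR/g')
echo "$result"
```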

If you change the line set lang=en_US to set lang=ru_RU in the same file, the bootloader will speak to us in the great and mighty Russian language.

Generate the fstab file with the -L option, which makes the generator use disk labels:

 genfstab -p -L $root > $root/etc/fstab

This completes the basic Arch Linux installation. The system will boot on its own and greet you with a friendly Russian-language command-line interface. If we then run the dhcpcd command, most likely even the Internet will work. But let's not rush to reboot.

Starting services at boot with systemd, using NTP and SSH as examples

Since our system will communicate with other computers, we need to synchronize the time. If the time on the server and the client differs, there is a high probability that they will not be able to connect to each other at all. sudo, in turn, may start asking for a password after every action, thinking that the authorization timeout expired long ago. And who knows what else we might run into? Let's play it safe.

To synchronize time with Internet servers over the NTP protocol, we need to install the missing packages. We could use arch-chroot, but instead we will get by with options that tell the package manager where to install:

 pacman --root $root --dbpath $root/var/lib/pacman -S ntp

Configure the system to get the exact time from Russian servers:

 mv $root/etc/ntp.conf $root/etc/ntp.conf.old && cat $root/etc/ntp.conf.old | sed 's/\([0-9]\).*\(.pool.ntp.org\)/\1.ru\2/g' | tee $root/etc/ntp.conf
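
The substitution is easier to follow on one line of input. The sample server line below is an assumption about what the stock ntp.conf contains; the expression keeps the leading digit, drops whatever sits between it and “.pool.ntp.org”, and inserts “.ru” in its place.

```shell
# Sample server line (assumed to resemble the stock ntp.conf entries).
line='server 0.arch.pool.ntp.org'

# Same expression as above: keep the digit, replace the middle part with ".ru".
result=$(printf '%s\n' "$line" | sed 's/\([0-9]\).*\(.pool.ntp.org\)/\1.ru\2/g')
echo "$result"
```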

We only need to synchronize the time once, at boot. In the past we would have put the command that starts the time service into rc.local, but now there is systemd, a system and service manager that tries to start services (called units in the original terminology) in parallel to reduce boot time. Naturally, one service may depend on another being up. For example, it is useless to try to synchronize time over the Internet before the network is running. Simply naming an executable is no longer enough to describe all these relationships, so starting something via systemd has become a rather non-trivial exercise. For this purpose there are special files with the extension “.service”. They specify dependencies, executable names, and other parameters that must be taken into account for a successful start. In particular, systemd uses targets to control the stages of booting; in the tasks assigned to them they resemble runlevels. Read more on the wiki.

To the delight of beginners, the ntp package comes with a ready-made ntpdate.service. All files describing how services start are located in the $root/usr/lib/systemd/system/ folder, and they can be opened in any text editor or viewed in the usual way. For example, $root/usr/lib/systemd/system/ntpdate.service:

 [Unit]
 Description=One-Shot Network Time Service
 After=network.target nss-lookup.target
 Before=ntpd.service
 [Service]
 Type=oneshot
 PrivateTmp=true
 ExecStart=/usr/bin/ntpd -q -n -g -u ntp:ntp
 [Install]
 WantedBy=multi-user.target

In the [Unit] block, the Description line gives a brief description of the service, and the After and Before lines state the conditions under which it should be started (in this case, after the network is up, but before the NTP server, which we do not plan to start at all). The exact time is requested only once during boot, and the line Type=oneshot in the [Service] block is responsible for this. In the same block, the ExecStart line specifies the command to run to start the service. The [Install] block in our case states that our service is wanted by multi-user.target. It is recommended to use the same [Install] block for home-made services.

As a first practical example, we will slightly extend ntpdate.service, asking it to additionally write the time to the hardware clock. If you then boot Windows on the same computer, you will see the time in UTC, so do not be alarmed.

The standard behavior of any systemd service is changed as follows: in the /etc/systemd/system/ folder, create a new directory named after the full service name with the extension “.d”, add a file with an arbitrary name and the extension “.conf” to it, and put the necessary modifications there. Let's begin:

 mkdir -p $root/etc/systemd/system/ntpdate.service.d && echo -e '[Service]\nExecStart=/usr/bin/hwclock -w' > $root/etc/systemd/system/ntpdate.service.d/hwclock.conf

It simply says that immediately after the service's main command has run, the command “/usr/bin/hwclock -w” should be executed, which sets the hardware clock from the system time.

Enable the ntpdate service at boot (the syntax is the same for all services):
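
For anyone who wants to look at the resulting drop-in without touching the new system, here is a sketch that builds the same file in a throwaway directory. The `demo_root` variable is introduced only for this illustration and stands in for $root; printf is used in place of `echo -e` purely as a matter of portability.

```shell
# Build the same drop-in inside a throwaway directory standing in for $root.
demo_root=$(mktemp -d)
dropin_dir="$demo_root/etc/systemd/system/ntpdate.service.d"
mkdir -p "$dropin_dir"

# One printf argument per line of the file.
printf '%s\n' '[Service]' 'ExecStart=/usr/bin/hwclock -w' > "$dropin_dir/hwclock.conf"
cat "$dropin_dir/hwclock.conf"
```

Because ntpdate.service is Type=oneshot, the extra ExecStart line is added to the existing one. For other service types systemd accepts only a single ExecStart, so a drop-in would first have to clear the old value with an empty `ExecStart=` line.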

 arch-chroot $root systemctl enable ntpdate
 Created symlink from /etc/systemd/system/multi-user.target.wants/ntpdate.service to /usr/lib/systemd/system/ntpdate.service.

As you can see, an ordinary symbolic link to the ntpdate.service file was created in the multi-user.target.wants directory; we saw multi-user.target mentioned in the [Install] block of that same file. It turns out that for the system to reach multi-user.target, all services from the multi-user.target.wants directory must be started.
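
The effect of enabling a unit can be reproduced by hand, which makes it clear that nothing magical is going on. This sketch creates the same link inside a temporary directory (`demo_root` is a name made up for the example) instead of the real system.

```shell
# Throwaway directory tree standing in for the installed system.
demo_root=$(mktemp -d)
wants_dir="$demo_root/etc/systemd/system/multi-user.target.wants"
mkdir -p "$wants_dir"

# The link target is written as it would be seen from inside the installed system.
ln -s /usr/lib/systemd/system/ntpdate.service "$wants_dir/ntpdate.service"
target=$(readlink "$wants_dir/ntpdate.service")
echo "$target"
```

Deleting that link is, in essence, what `systemctl disable ntpdate` does.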

Now install the SSH package in a similar way (in ArchLinux it is called openssh):

 pacman --root $root --dbpath $root/var/lib/pacman -S openssh

But this time we will enable a socket instead, so that the SSH server starts only when a connection request arrives and does not sit in RAM as dead weight:

 arch-chroot $root systemctl enable sshd.socket

We did not change the standard port 22 and did not force the use of Protocol 2; let that remain on my conscience.

Looking ahead, or getting acquainted with handlers (hooks)

To be able to connect to our future server blindly, we need to know its IP address. It will be much easier if this address is static. The usual methods mentioned in the wiki do not suit us. The problem is that network adapters nowadays are named according to their physical location on the motherboard. For example, the device name enp0s3 means that it is an Ethernet adapter located on PCI bus 0 in slot 3 (see details here). This is done so that the device name in the system does not change when one adapter is replaced with another. This behavior is undesirable for us, because the position of the network card may differ between motherboard models, and when we try to move our boot server from VirtualBox to real hardware we will most likely have to boot with a keyboard and monitor attached just to configure the network properly. We need to make the name of the network adapter more predictable, for example eth0 (insert a smiley here).

Why do we do this?

I have no doubt that there are more elegant solutions to the device-naming problem, but the following one turned out to be well suited to demonstrating the general principle of booting Linux. Please share the methods you have verified with the community in the comments.

Install the mkinitcpio-nfs-utils package, and we will get a hook called “net”:

 pacman --root $root --dbpath $root/var/lib/pacman -S mkinitcpio-nfs-utils

By default, all handler files go into /usr/lib/initcpio/. They usually come in pairs with the same name, one file in the install subdirectory and the other in hooks. The files themselves are ordinary scripts. The file from the hooks folder usually ends up inside the initramfs file (we will get to it later) and runs when the system boots. The second file of the pair is in the install folder. It contains the build() function, which describes what must be done while the initramfs file is being generated, and the help() function, which describes what the handler is for. If you are confused, just read on, and everything said in this paragraph will fall into place.

An initcpio folder is also present in the /etc directory, and it too has install and hooks subdirectories. It has unconditional priority over /usr/lib/initcpio: if both folders contain files with the same name, the files from /etc/initcpio are used when generating the initramfs, not those from /usr/lib/initcpio.

We need to slightly change the net handler, so we simply copy its files from /usr/lib/initcpio to /etc/initcpio:
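
The priority rule can be pictured as a simple first-match search. The sketch below only illustrates the lookup order described above; it is not mkinitcpio's actual code, and `find_hook` and `demo_root` are hypothetical names used for the example.

```shell
# Throwaway tree with a hook of the same name in both locations.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/etc/initcpio/hooks" "$demo_root/usr/lib/initcpio/hooks"
: > "$demo_root/usr/lib/initcpio/hooks/net"    # stands in for the packaged hook
: > "$demo_root/etc/initcpio/hooks/net"        # stands in for our modified copy

# Hypothetical helper: print the first matching hook, checking /etc before /usr/lib.
find_hook() {
    for dir in "$demo_root/etc/initcpio" "$demo_root/usr/lib/initcpio"; do
        if [ -f "$dir/hooks/$1" ]; then
            printf '%s\n' "$dir/hooks/$1"
            return 0
        fi
    done
    return 1
}

chosen=$(find_hook net)
echo "$chosen"
```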

 cp $root/usr/lib/initcpio/hooks/net $root/etc/initcpio/hooks/ && cp $root/usr/lib/initcpio/install/net $root/etc/initcpio/install/

Bring the hooks/net file to the following form:

 cat $root/etc/initcpio/hooks/net
 # vim: set ft=sh:
 run_hook() {
     if [ -n "$ip" ]
     then
         ipconfig "ip=${ip}"
     fi
 }
 # vim: set ft=sh ts=4 sw=4 et:

Now open the $root/etc/initcpio/install/net file and see that the help() function describes well what the “ip” variable should look like:

 ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>

All that remains is to set the variable to a static IP address and network device name, for example “192.168.1.100::192.168.1.1:255.255.255.0::eth0:none” (here and below, use settings appropriate for your own network). In the next section you will learn exactly where the value is set.
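
To make the field order easier to read, the example value can be split on colons in the shell. This is just a way to inspect the string; the variable names are chosen here to mirror the placeholders in the help() text, and the addresses are the same sample ones as above.

```shell
ip='192.168.1.100::192.168.1.1:255.255.255.0::eth0:none'

# Split on ":"; the fields we left out (server-ip, hostname) simply stay empty.
IFS=: read -r client server gw netmask hostname device autoconf <<EOF
$ip
EOF

echo "client=$client gateway=$gw netmask=$netmask device=$device autoconf=$autoconf"
```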

In the meantime, remove everything unnecessary from the file $root/etc/initcpio/install/net. We keep the loading of network device modules, the ipconfig program used above, and, of course, the script from the hooks folder, which does all the main work. The result is roughly the following:

 cat $root/etc/initcpio/install/net
 #!/bin/bash
 build() {
     add_checked_modules '/drivers/net/'
     add_binary "/usr/lib/initcpio/ipconfig" "/bin/ipconfig"
     add_runscript
 }
 help() {
     cat <<HELPEOF
 This hook loads necessary modules for a network device.
 Manually configures network and freezes network device name.
 HELPEOF
 }
 # vim: set ft=sh ts=4 sw=4 et:

When the systemd-udevd device manager tries at boot time to rename our network device to its usual predictable interface name, for example enp0s3, it will fail. Why? Read on.

How does the system boot?

For simplicity, consider an ordinary BIOS. After power-up and initialization, the BIOS goes through the list of boot devices in order until it finds a bootloader, to which it hands over control of the rest of the boot.

This is exactly the kind of bootloader we wrote to the MBR of our drive. We used GRUB, and in its settings (the grub.cfg file) we indicated that the root partition is on the disk labeled HABR. Here is the entire line:

 linux /boot/vmlinuz-linux root=LABEL=HABR rw quiet

The vmlinuz-linux file mentioned here is the system's kernel, and the pointer to the root file system is its parameter. We ask the kernel to look for the root file system on the device labeled HABR. We could have used a UUID here instead, which is unique to each drive, but then we would certainly have to change it when moving the system to another drive. If we had specified the location of the root file system in the way familiar to Linux users, /dev/sda1, we would not be able to boot from a USB drive, because a USB drive would get that name only if it were the only drive in the computer. It is unlikely that the computer will contain another drive labeled HABR, but keep that possibility in mind.

The same line also sets the value of the global variable “ip” for our “net” handler (do not forget to change the addresses to those used in your network):

 linux /boot/vmlinuz-linux root=LABEL=HABR rw quiet ip=192.168.1.100::192.168.1.1:255.255.255.0::eth0:none
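
How does a value on the kernel line become a variable that the hook can read? The following is a simplified illustration of the idea, not the code that actually runs inside the initramfs: it walks over the words of a sample command line and picks out the two parameters we care about. The names `root_spec` and `ip_spec` are made up for the example.

```shell
# Sample kernel command line; on a running system it is available in /proc/cmdline.
cmdline='root=LABEL=HABR rw quiet ip=192.168.1.100::192.168.1.1:255.255.255.0::eth0:none'

root_spec=''
ip_spec=''
for word in $cmdline; do            # intentional word splitting on spaces
    case "$word" in
        root=*) root_spec=${word#root=} ;;
        ip=*)   ip_spec=${word#ip=} ;;
    esac
done
echo "root=$root_spec"
echo "ip=$ip_spec"
```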

The next line mentions the initramfs file, which I promised to explain:

 initrd /boot/initramfs-linux.img

So the following happens at boot: the GRUB loader takes the vmlinuz and initramfs files, tells them where to look for the root file system, and hands over control of the rest of the boot.

The name initramfs comes from “initial RAM file system”. It is actually an ordinary Linux root file system packed into an archive. It is unpacked into RAM at boot time, and its job is to find and prepare the root file system of the Linux we are ultimately trying to boot. The initramfs has everything needed for this, because it is a real “little Linux” that can execute many ordinary commands. Its capabilities are extended with handlers (hooks), which help prepare the root file system of our Linux.

After the programs from the initramfs have done their work, control passes to the init process of the prepared root file system. Arch Linux uses systemd as the init process.

The systemd-udevd device manager is part of systemd. Like its big brother, it tries to detect and configure all devices in the system in parallel. It is among the first to start, but only after our net handler has initialized the network card at the initramfs stage. As a result, systemd-udevd cannot rename a device that is already in use, and the network card keeps the name eth0 for as long as the system runs.

Cooking up the initramfs

To create the initramfs file we use the mkinitcpio program, which is part of the base group that we installed at the very beginning. The settings are in the $root/etc/mkinitcpio.conf file, and the presets are in the $root/etc/mkinitcpio.d folder. We need an initramfs that can find and prepare the root file system from which systemd will later start. There is no need to cover every possible option; the bare essentials are enough, so as not to inflate the initramfs file. More information is available here: wiki.archlinux.org/index.php/Mkinitcpio

Be sure to remove the autodetect handler. It checks the devices installed in this particular computer and leaves only the modules they need in the initramfs. We do not want this, since from the start we intend to move the system to another computer whose hardware will most likely differ from the virtual machine we are using.

The list of handlers sufficient for our purposes, including the net handler we created, is as follows:

 HOOKS="base udev net block filesystems"

Insert this line into the mkinitcpio.conf file and comment out the old one:

 nano $root/etc/mkinitcpio.conf
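
The same edit can be scripted. The sketch below works on a temporary stand-in for mkinitcpio.conf; the old HOOKS value in it is only an example of what the file might contain, so check your own file before adapting the commands.

```shell
conf=$(mktemp)
# Sample stand-in for mkinitcpio.conf; the old HOOKS value here is only an example.
printf '%s\n' 'MODULES=""' 'HOOKS="base udev autodetect modconf block filesystems keyboard fsck"' > "$conf"

# Comment out the existing HOOKS line, then append the new one.
sed 's/^HOOKS=/#&/' "$conf" > "$conf.new"
printf '%s\n' 'HOOKS="base udev net block filesystems"' >> "$conf.new"

active=$(grep '^HOOKS=' "$conf.new")
echo "$active"
```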

Based on the standard linux preset, we create our own habr preset:

 cp $root/etc/mkinitcpio.d/linux.preset $root/etc/mkinitcpio.d/habr.preset

And bring it to this form:

 cat $root/etc/mkinitcpio.d/habr.preset
 ALL_config="/etc/mkinitcpio.conf"
 ALL_kver="/boot/vmlinuz-linux"
 PRESETS=('default')
 default_image="/boot/initramfs-linux.img"

We do not need the 'fallback' branch, which removes autodetect from the handlers: we have already removed it ourselves, and there is no point in generating the same initramfs file twice under different names.

Generate the new initramfs using the habr preset:

 arch-chroot $root mkinitcpio -p habr

Writing a DNS update service for systemd

Our network card now receives all the settings needed for the local network and the Internet to work. But site names will not be translated into IP addresses, since our system does not know which DNS servers to use. We will write our own service for this purpose, which systemd will run at boot. And to learn something new and avoid getting bored by the monotony, we will pass the name of the network device as a parameter and keep the list of DNS servers in an external file.

DNS server information is updated by resolvconf. The following syntax suits us perfectly:

 resolvconf [-m metric] [-p] -a interface <file

In the file fed to it here, the IP address of each server is given on a new line after the nameserver keyword. You can list as many servers as you like, but only the first three will be used. As an example, we use the Yandex servers. In this case, the file passed to resolvconf should look like this:

 nameserver 77.88.8.8
 nameserver 77.88.8.1

We need to register the DNS servers before the system considers the network fully operational, that is, before network.target is reached. We assume that the server information needs to be updated only once, at boot. And, as usual, we declare that our service is wanted by multi-user.target. Create the service file in the $root/etc/systemd/system directory with the following content:

 cat $root/etc/systemd/system/update_dns@.service
 [Unit]
 Description=Manual resolvconf update (%i)
 Before=network.target
 [Service]
 Type=oneshot
 EnvironmentFile=/etc/default/dns@%i
 ExecStart=/usr/bin/sh -c 'echo -e "nameserver ${DNS0}\nnameserver ${DNS1}" | resolvconf -a %i'
 [Install]
 WantedBy=multi-user.target

In the ExecStart line we run the echo command, which generates the server list on the fly and passes it to resolvconf through a pipe. Generally, you cannot use several commands in an ExecStart line, let alone pipelines, but once again we have outwitted everyone by passing these commands as the -c argument to /usr/bin/sh.

Note that the name of the update_dns@.service file contains the @ symbol; whatever is specified after it ends up inside the file, replacing “%i”. Thus the line EnvironmentFile=/etc/default/dns@%i turns into EnvironmentFile=/etc/default/dns@eth0; this is the name of the external file that we will use to store the values of the variables DNS0 and DNS1. The syntax is the same as in ordinary scripts: “variable name=variable value”. Create the file:
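
The text that the pipeline feeds to resolvconf can be previewed without touching the system. This sketch uses printf rather than `echo -e`; that is a substitution made here for portability, not what the unit file uses, and resolvconf itself is deliberately left out so nothing is changed.

```shell
DNS0=77.88.8.8
DNS1=77.88.8.1

# printf repeats its format for each argument, producing one line per server.
resolv=$(printf 'nameserver %s\n' "$DNS0" "$DNS1")
printf '%s\n' "$resolv"
```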

 nano $root/etc/default/dns@eth0

And add the following lines:

 DNS0=77.88.8.8
 DNS1=77.88.8.1

Now enable the service at boot, not forgetting to specify the name of the network card after @:
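
The way an instance name flows from the unit name into the file path can be imitated in the shell. This is a simplified picture of the “%i” substitution (real systemd also deals with escaping of special characters, which does not matter for a name like eth0); the variable names are invented for the example.

```shell
unit='update_dns@eth0.service'

# Strip everything up to "@" and the ".service" suffix to get the instance name.
instance=${unit#*@}
instance=${instance%.service}

# Put the instance name in place of %i, as in the EnvironmentFile line.
template='/etc/default/dns@%i'
env_file=$(printf '%s\n' "$template" | sed "s/%i/$instance/")
echo "$env_file"
```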

 arch-chroot $root systemctl enable update_dns@eth0.service

We have just written a universal service file. It is universal in that if our system has several network adapters, we can specify separate DNS servers for each of them. You simply prepare a file with the server list for each device and enable the service for each adapter separately, specifying its name after @.

Before the first launch

This completes the initial setup. We need to boot the installed Arch Linux from the internal drive so that the changes we made take effect.

Unmount the finished root file system:

 umount $root

And turn off the virtual machine:

 poweroff

Now you can detach the boot image from the CD-ROM drive or remove the flash drive, then turn on the machine and make sure that everything works.

Continuation and ending.

Source: https://habr.com/ru/post/253256/
