Abstract: a description of reboot types, plus stories about SysRq, ipt_SYSRQ, IPMI, and PSUs.
How do you restart a server? This is a question usually put to complete novices, who get confused between halt, shutdown -r, reboot, init 6, and so on.
An experienced administrator will ask in return: "What is wrong with the server?" Different kinds of server failure call for different kinds of reboot, and the wrong choice has dire consequences, of which a visit to the IPMI / DRAC / iLO web interface to hit "reset" is the mildest. The worst in my own practice was a junior technician's business trip to a neighboring city, just to "push the reset button" on a lone server.
In this article: what prevents the server from rebooting and how to help it.
Let's start with reboot theory.
When a server is shut down or restarted, the init system (systemd in most modern distributions, upstart in the eccentric Ubuntu 14.04, sysv-init in archaic junk) tells all daemons, in a defined order, to stop. Most daemons (a DBMS such as mysql, for example) know how to shut down cleanly: finish all transactions, write all unsaved data to disk, and so on. For an in-memory DBMS such as redis this can be critical: what was not saved is lost.
Old init systems waited indefinitely for each init script. For example, if some joker added a "sleep 3600" to the "stop" branch of a script, your server would take a bit over an hour to reboot. And if the number is larger, or a program simply refuses to exit, the reboot never finishes.
New init systems (let's be frank: only systemd is left) give each service a timeout to save its data (usually a couple of minutes; upstream systemd defaults to 90 seconds) and then kill the process. In addition to stopping daemons, file systems are unmounted (meaning all block caches are flushed), the iSCSI target is stopped (again with its cache flushed), and so on. The shutdown time may be unpredictably long, but it is at least finite. Plus there is at least some hope that all daemons exit cleanly, file caches are flushed, and so on.
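The stop timeout mentioned above can be set per unit. Here is a minimal sketch, assuming a systemd host; `write_stop_timeout_dropin` is a hypothetical helper name, and the unit name and the 180-second value are only examples. `TimeoutStopSec=` itself is a standard systemd service option.

```shell
# Hypothetical helper: write a drop-in that caps the stop time of one unit.
# Arguments: base directory, unit name, timeout in seconds.
write_stop_timeout_dropin() {
    local base_dir="$1"
    local unit="$2"
    local seconds="$3"
    local dropin_dir="$base_dir/$unit.d"
    mkdir -p "$dropin_dir" || return 1
    # Once TimeoutStopSec= expires, systemd stops waiting and kills
    # whatever is left of the service (SIGKILL by default).
    printf '[Service]\nTimeoutStopSec=%s\n' "$seconds" > "$dropin_dir/stop-timeout.conf"
    # Print the path of the file that was written.
    echo "$dropin_dir/stop-timeout.conf"
}

# Real use (as root), followed by a reload of the unit files:
#   write_stop_timeout_dropin /etc/systemd/system mysql.service 180
#   systemctl daemon-reload
```

A drop-in is used rather than editing the packaged unit file, so the override survives package upgrades.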
Thus, on a healthy system the correct answer to "how do I reboot?" is to run the reboot command. In some cases it is the only correct one (one caveat: if you run reboot from a terminal inside a graphical session, the desktop environment treats it as an emergency reboot; to reboot from graphical mode, use the "reboot" item in the DE's own interface).
What can go wrong with a regular reboot? First, some daemon processes may stall - see above.
Second, there may be trouble unmounting file systems. The common belief is that once all processes are killed the disk unmounts easily, since nobody is using it. To put it mildly, that is not so. Here are some ways to nail a file system down so that it cannot be unmounted:
- fallocate -l 1G /fs/swap; mkswap /fs/swap; swapon /fs/swap
- dd if=/dev/sda of=/fs/image; kpartx -a /fs/image
- losetup --find --show /fs/image
and so on. In short: a file can be held not only by a user-space process but also by the kernel. And a kernel module may be busy pondering the meaning of life, with no intention of releasing the resource.
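To see whether the kernel is holding files on a given mount point, you can look at active swap files and loop devices. This is a minimal sketch: `swap_files_under` and `loop_files_under` are hypothetical helper names, /fs is the example mount point from the list above, and paths containing spaces are not handled.

```shell
# Print swap files located under a mount point.
# Reads /proc/swaps by default; a second argument overrides the file.
swap_files_under() {
    local mount_point="${1%/}"
    local swaps_file="${2:-/proc/swaps}"
    # Line 1 of /proc/swaps is a header; column 1 is the swap path.
    awk -v mp="$mount_point/" 'NR > 1 && index($1, mp) == 1 { print $1 }' "$swaps_file"
}

# Print loop devices whose backing file is under a mount point.
# Expects "NAME BACK-FILE" lines on stdin, for example from:
#   losetup --list --noheadings --output NAME,BACK-FILE
loop_files_under() {
    local mount_point="${1%/}"
    awk -v mp="$mount_point/" 'index($2, mp) == 1 { print $1, $2 }'
}

# Example:
#   swap_files_under /fs
#   losetup --list --noheadings --output NAME,BACK-FILE | loop_files_under /fs
```

Anything these print has to be released first (swapoff, losetup -d, kpartx -d) before umount has a chance.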
What does this lead to? A file system that cannot be unmounted. In this situation systemd tries, tries again, and gives up, leaving the file system mounted. The reboot will take a VERY long time, but it will still go through. That, however, is the case where umount returns an error.
It also happens that umount cannot complete at all because something is unreachable, for example a file on an NFS server. A process that touches such a file cannot be terminated (not even with kill -9), and in this situation 'reboot' simply hangs the server. Again, systemd covers the most typical cases, but it is still possible to run into TASK_UNINTERRUPTIBLE ('D' in ps aux).
What to do? You can reboot without syncing file systems or terminating anything: reboot -f. It can hang too, for reasons given below; first, the consequences. Processes are not stopped gracefully, they die instantly; TCP sessions are not closed; disk caches are not flushed. The kernel, however, still goes through some of its reboot routine (and part of the caches may get flushed). The key point is that most of the kernel takes part in the reboot. So if the kernel itself is in a bad way, we may never come back.
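Before choosing a reboot method it helps to check whether any processes are already stuck in the 'D' state. A minimal sketch follows; `d_state_processes` is a hypothetical helper name, and only the first word of the command name is printed.

```shell
# Filter "PID STAT COMMAND" lines and keep processes in uninterruptible
# sleep (STAT starts with D). Prints "PID COMMAND".
d_state_processes() {
    awk '$2 ~ /^D/ { print $1, $3 }'
}

# Example (the trailing "=" suppresses the ps header line):
#   ps -eo pid=,stat=,comm= | d_state_processes
```

If this list is non-empty and does not shrink over time, a plain reboot is likely to hang on those processes.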
The second, extremely unpleasant situation: problems with the root file system (/). Any attempt to run ls, grep, or even 'reboot' ends in either a hung console or an error. Problems with libc (including its deletion) fall into the same category: 'reboot' complains about linking problems and refuses to do anything. Or we have hit the PID limit and every process is in the 'D' state. Or something else of the same caliber from the "server is unwell" category.
It may happen that exactly one console remains open on the server (and a second one will not open). Why? Because someone did something to the disk driver. Or the RAID controller. Or something else, after which all that is left of '/' is memories in the disk cache. That leaves us with only bash built-in commands, which run without starting new processes.
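As an illustration of what can still be done with built-ins alone, here is a minimal sketch of stand-ins for ls and cat. `builtin_ls` and `builtin_cat` are hypothetical names; everything inside them (globbing, `[`, `read`, `printf`) is handled by bash itself, so no executable is loaded from disk.

```shell
# List directory entries using only globbing (no /bin/ls).
builtin_ls() {
    local dir="${1:-.}"
    local entry
    for entry in "$dir"/*; do
        # An unmatched glob stays literal, so skip names that do not exist.
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            printf '%s\n' "${entry##*/}"
        fi
    done
}

# Print a file using only read and printf (no /bin/cat).
builtin_cat() {
    local line
    # The "|| [ -n ... ]" part keeps a last line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
    done < "$1"
}

# Example:
#   builtin_ls /proc/sys/kernel
#   builtin_cat /proc/sys/kernel/sysrq
```

Reading from /proc and /sys works this way even when the real disk is gone, because those file systems live in the kernel.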
There is a way to reboot that requires no executable at all (and hence no reads from the missing disk). As root:
echo b >/proc/sysrq-trigger
The sysrq-trigger file lets you "press" any of the SysRq combinations (the kernel's emergency keys), including SysRq-b, the emergency reboot. Often, after you press Enter, there is not even time for the line feed to appear: the server is already rebooting before the syscall returns. This is the strongest software reboot method there is.
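The same write can be wrapped in a small function so that it can be rehearsed against an ordinary file before it is needed for real. A minimal sketch; `sysrq_trigger` is a hypothetical name, and the optional second argument exists only for such dry runs.

```shell
# Send one SysRq key through the trigger file.
# $1: key letter (b = immediate reboot), $2: trigger path (default: the real one).
sysrq_trigger() {
    local key="$1"
    local trigger="${2:-/proc/sysrq-trigger}"
    # echo is a shell builtin, so nothing is executed from disk here.
    echo "$key" > "$trigger"
}

# Rehearsal against a scratch file:
#   sysrq_trigger b /tmp/fake-trigger
# The real thing (as root; reboots at once, with no sync and no unmount):
#   sysrq_trigger b
```

For root, writes to /proc/sysrq-trigger are accepted regardless of the kernel.sysrq setting, which only restricts the keyboard combinations.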
Note: the "sync, then reboot" sequence that seems proper here, i.e. SysRq-s followed by SysRq-b, is a mistake: after SysRq-s the kernel may try to talk to the void and potentially panic or kill the last available console. If you are doing an emergency reboot, it should be an emergency one.
ipt_SYSRQ
All of this works if you have a console on the server. But what if login hangs and there is no open console? There is the ipt_SYSRQ module, which triggers a SysRq request on receipt of a specific network packet (more precisely, a packet matching an iptables rule). It works entirely in the kernel, i.e. it does not depend on the file system. The send_sysrq command comes with it.
A watchman for the watchman
One might think that is everything, but there are even nastier hangs. For example, the network card has hung, and a normal reboot (including via SysRq) does not help. Another example is a hung enclosure that got stuck on a bad disk and ignores every bus reset. The reboot appears to reset everything, yet the disks remain inaccessible.
In this case we need a power cycle (off / on). Physically running to the server is no fun, so let's look at what modern servers offer: IPMI. This is a built-in microcomputer (the BMC) that lets you control the big computer. Depending on the vendor it is called IPMI, DRAC, iLO, etc.
Meet ipmitool chassis power cycle. It demands more of the system's health (the kernel modules must be loaded, ipmitool itself must start successfully, the IPMI controller must be working, etc.). On the other hand, it power-cycles everything. More precisely, almost everything: if the server has JBODs attached, this command does not reach them. Still, it is a very thorough reboot.
If the kernel is in a really bad way, the command can be run remotely (ipmitool -H ipmi.server.local chassis power cycle).
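In practice the remote call also needs an interface and credentials. The sketch below assembles such a command and, by default, only prints it. `ipmi_power`, `IPMI_USER`, `DRY_RUN` and the `admin` fallback are assumptions made for illustration; `-I lanplus`, `-H`, `-U` and `-E` (take the password from the IPMI_PASSWORD environment variable) are standard ipmitool options.

```shell
# Build (and optionally run) a remote chassis power command.
# $1: BMC host name, $2: action (status, cycle, reset, on, off).
ipmi_power() {
    local host="$1"
    local action="$2"
    case "$action" in
        status|cycle|reset|on|off) ;;
        *) echo "unknown action: $action" >&2; return 2 ;;
    esac
    # Reuse the positional parameters as the argument vector.
    set -- ipmitool -I lanplus -H "$host" -U "${IPMI_USER:-admin}" -E chassis power "$action"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"      # dry run: show the command only
    else
        "$@"           # really talk to the BMC
    fi
}

# Example:
#   ipmi_power ipmi.server.local status                      # prints the command
#   DRY_RUN=0 IPMI_PASSWORD=... ipmi_power ipmi.server.local cycle
```

Defaulting to a dry run is a deliberate choice here: a power cycle sent to the wrong host is hard to take back.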
Another difficult situation is a hung IPMI controller. If the system is more or less alive, you can restart the BMC:
ipmitool mc reset cold
After that you can power-cycle the chassis. It sounds strange, but several times in my life I have "pulled" a server through to a normal reboot with exactly this sequence. (After mc reset cold, give the BMC a couple of minutes to boot.)
The next pain point is hung power supplies. Yes, it happens. Power supply firmware has bugs that get fixed, and the fixes have to be flashed. Of course, no soft reboot (the IPMI power cycle included) works in this situation. You need either to physically pull the cable or to cycle the power remotely. A network-controlled power socket (IP socket) helps here.
It looks like this (fragment of the control panel for servers.com/servers.ru):

Obviously, under these conditions the reboot follows a very harsh scenario, but it will definitely happen.
Conclusion
Brief summary
Situation | How to reboot |
Normal operation | reboot |
Software problems | reboot -f |
Problems with the kernel / mounts / libc | echo b > /proc/sysrq-trigger |
Problems with the kernel / mounts / libc and no open console | ipt_SYSRQ (must be prepared in advance) |
Kernel / hardware problems | ipmitool chassis power cycle |
Kernel / hardware problems without an open console | ipmitool -H ipmi.server.local chassis power cycle |
Problems with standalone peripherals / PSU / IPMI | power cycle through an IP socket |