
What does the CPU do when it has nothing to do?


A man applies for a job at a construction site. The foreman asks him:
- What can you do?
- I can dig.
- And what else?
- I can also not dig.

It is no secret that modern processors are very fast. Their job is to fetch instructions from memory continuously and perform the actions they prescribe. It turns out, however, that for one reason or another this process often has to be paused. Application programmers rarely have to think about what is going on with the processor, but for authors of system software this is no idle question.


A processor can be inactive not only to save energy, but also as a result of exceptional situations, during initialization protocols, or because of intentional actions by system software. Why is this interesting? When writing software models of computer systems (including virtual machines), you have to model the transitions between the states of virtual processors correctly. System software regularly runs into situations in which, for one reason or another, the CPU has to “hit the brakes”. Using and modeling these situations correctly requires knowing and understanding the specifications.


This article focuses on the software side of processor states. I will not dwell on implementation details (voltages, pins, frequencies, etc.), because 1) they differ significantly between generations and models of processors even within one architecture, while the software interface stays backward compatible; 2) they are not directly visible to programs and the OS. This is an attempt to summarize information scattered across many pages of the Intel 64 and IA-32 Architectures Software Developer's Manual.


Let's start with a simple and familiar situation: the processor is powered on, cheerful and wide awake.


Active state


This is the most common state, in which the processor keeps executing instructions one after another. Modern processors can also vary their clock frequency dynamically for power management. In the accepted terminology, a logical processor in the active mode stays in the C0 state but can move between P-states.


This process can be partly controlled by software: from the BIOS, the OS, or application programs. The final say, however, stays outside the control of programs running on the CPU.


In all the other modes described below, the processor does not execute instructions.


HLT


The first inactive mode, which dates back to the Intel 8086, the ancestor of the whole family, is tied to the processor instruction of the same name. After executing this instruction, the processor halts without executing the next one. Starting with the Intel 80486 DX4, power consumption in this mode is lower than in the active mode. How exactly that is achieved depends on the implementation.


The processor cannot leave this sleepy state on its own; an external event is required. It can be an ordinary device interrupt (provided interrupts are enabled), a non-maskable interrupt (NMI), a system management interrupt (SMI), or one of the initialization signals, INIT or RESET.


Is it possible to completely suspend the system using HLT?

Yes, if HLT is executed in SMM (system management mode), in which maskable and non-maskable interrupts are blocked by default. After that, only RESET can restart the execution of machine instructions.


Formally, the state entered after HLT is designated C1.
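
For simulator authors, the wake-up rules for the halted state fit in a few lines. Below is a minimal sketch in Python. The event names and the function `hlt_wakes` are my own illustration, not identifiers from any real API, and the rule that a masked interrupt (IF = 0) does not wake the processor is an assumption taken from the HLT description in the manual rather than from the text above.

```python
# Simplified model of the state after HLT. Event names are hypothetical labels.

# Events that are blocked by default while the processor is in SMM.
BLOCKED_IN_SMM = {"INTR", "NMI", "SMI", "INIT"}


def hlt_wakes(event: str, interrupts_enabled: bool = True, in_smm: bool = False) -> bool:
    """Return True if `event` resumes a processor that executed HLT."""
    if event == "RESET":
        return True                      # RESET always restarts execution
    if in_smm and event in BLOCKED_IN_SMM:
        return False                     # HLT inside SMM: only RESET helps
    if event == "INTR":
        # Assumption: an ordinary device interrupt wakes the CPU only if IF = 1.
        return interrupts_enabled
    return event in {"NMI", "SMI", "INIT"}
```

With these rules, `hlt_wakes("NMI", in_smm=True)` is false while `hlt_wakes("RESET", in_smm=True)` is true, which matches the "completely suspended" case described above.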


MWAIT and other power saving modes


The idea of a dedicated power-saving mode for the CPU was developed further in the form of the newer MWAIT instruction. Unlike HLT, which has no operands, MWAIT takes two values, in the EAX and ECX registers. EAX describes the desired power-saving state: the numeric values of the C-state and the C-sub-state.


The ECX register holds optional hints for the requested inactive mode. Currently only one such hint is defined: a flag in bit 0. Its purpose is discussed below.
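
To make the operand layout concrete, here is a small sketch of how the two hint values could be packed and unpacked. It assumes the encoding from the MWAIT description in the manual as I read it: EAX bits 7:4 hold the target C-state minus one (so the field value 0xF wraps around to C0) and bits 3:0 hold the sub-state. The function names are hypothetical; check the field layout against your edition of the manual before relying on it.

```python
def mwait_eax(c_state: int, sub_state: int = 0) -> int:
    """Pack the EAX hint: bits 7:4 = C-state minus 1, bits 3:0 = sub-state."""
    if not 1 <= c_state <= 15:
        raise ValueError("C-state must be in 1..15")
    if not 0 <= sub_state <= 15:
        raise ValueError("sub-state must be in 0..15")
    return ((c_state - 1) << 4) | sub_state


def mwait_ecx(bit0: bool = False) -> int:
    """Pack the ECX hints: only the flag in bit 0 is defined."""
    return 1 if bit0 else 0


def decode_mwait_eax(eax: int) -> tuple:
    """Unpack EAX into (C-state, sub-state); field value 0xF decodes to C0."""
    c_state = (((eax >> 4) & 0xF) + 1) & 0xF
    return c_state, eax & 0xF
```

For example, a request for C3 with sub-state 2 is encoded as `mwait_eax(3, 2)`, that is 0x22, and `decode_mwait_eax(0x22)` gives back `(3, 2)`.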


Otherwise the processor behaves much as it does after HLT: it stops working until external signals arrive. The power savings achievable with MWAIT can be greater, though. While HLT corresponds to the C1 state, MWAIT lets you ask the processor for a deeper sleep: states C2, C3 ... C6, and so on. Each such state may have sub-states. The allowed combinations vary; for a specific processor model they are reported by leaf 5 of the CPUID instruction.
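
A simulator has to report these combinations to the guest, so it helps to see how leaf 5 could be decoded. The sketch below works on raw register values and assumes the layout as I read it in the CPUID description: EAX[15:0] and EBX[15:0] hold the smallest and largest monitor line sizes, ECX bits 0 and 1 describe the MWAIT extensions, and EDX holds one 4-bit sub-state count per C-state. The function name, the dictionary keys, and the sample values are hypothetical.

```python
def decode_cpuid_leaf5(eax: int, ebx: int, ecx: int, edx: int) -> dict:
    """Decode raw CPUID.05H register values into a readable dictionary."""
    return {
        "smallest_monitor_line": eax & 0xFFFF,      # bytes
        "largest_monitor_line": ebx & 0xFFFF,       # bytes
        "extensions_enumerated": bool(ecx & 0x1),   # ECX bit 0
        "interrupt_break_event": bool(ecx & 0x2),   # ECX bit 1
        # EDX nibble n = number of sub-states supported for C-state n
        "sub_states": {f"C{n}": (edx >> (4 * n)) & 0xF for n in range(8)},
    }


# Made-up register values, for illustration only.
info = decode_cpuid_leaf5(eax=0x40, ebx=0x40, ecx=0x3, edx=0x00001120)
```

With these made-up values, `info["sub_states"]` reports no sub-states for C0, two for C1, and one each for C2 and C3.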


Besides fine-grained control over idle power consumption, MWAIT has a more interesting purpose: it makes synchronization on multiprocessor systems more efficient.


A typical situation in parallel algorithms: thread A waits for a readiness signal from thread B, after which both can continue computing. On a multiprocessor system, A and B run on different logical processors. How can this signal be delivered? There are two options:


  1. Put A into an inactive mode (for example, with HLT). B then sends an inter-processor interrupt that brings A out of sleep. However, sending and handling such an interrupt is fairly expensive in time: it takes several transitions between kernel and user mode, and the path of the interrupt signal is not short.


  2. Thread A checks the contents of some memory location in an “infinite” loop. Thread B, wishing to signal readiness, writes a new value to that location, which takes A out of the loop. The delivery latency is lower in this case. However, A is obviously not waiting in the most energy-efficient way: it burns cycles without making progress.

MWAIT, paired with the MONITOR instruction, is designed to remove the drawback of the second approach. MONITOR takes a memory address as its argument, after which the processor starts “monitoring” it, waiting for writes from other threads. If such a write occurs while the processor is asleep in MWAIT, the processor wakes up.


Thus, a sleep state entered with MWAIT can be ended for two reasons: an external interrupt or a write to the memory location armed with MONITOR. But what if interrupts were disabled when MWAIT was executed?


In the first implementations of MONITOR/MWAIT, an interrupt arriving while interrupts were disabled did not bring the processor out of sleep. This behavior turned out to be inconvenient. Modern processors therefore implement an MWAIT extension, enabled with bit ECX[0], that lets even masked interrupts take the processor out of the inactive state.
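
The resulting rule is easy to state as a predicate. This is a sketch for a software model, not a description of any particular processor: the event names and the function are hypothetical, and real hardware may wake up for other, implementation-dependent reasons as well.

```python
def mwait_wakes(event: str, interrupts_enabled: bool, ecx_bit0: bool) -> bool:
    """Return True if `event` ends a sleep state entered with MWAIT."""
    if event in {"MONITORED_WRITE", "NMI", "SMI", "INIT", "RESET"}:
        return True
    if event == "INTR":
        # A masked interrupt counts as a wake-up event only if ECX[0] was set.
        return interrupts_enabled or ecx_bit0
    return False
```

The only case where the flag matters is an ordinary interrupt with interrupts disabled: `mwait_wakes("INTR", False, False)` is false, while `mwait_wakes("INTR", False, True)` is true.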


I want to emphasize the somewhat “advisory” nature of MWAIT. The processor can leave the inactive state for various reasons, not all of which are under the control of the current application. Programs that use MWAIT must be designed to work correctly even if wakeups happen spontaneously. To a first approximation, therefore, MWAIT can be treated as a variant of NOP, the instruction that does nothing. This is quite typical of synchronization primitives of the condition variable class: algorithms that use them must work correctly in the presence of spurious wakeups.
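
The same discipline is visible in ordinary user-level code. The sketch below uses Python's `threading.Condition` as a stand-in for MONITOR/MWAIT (an analogy, not an emulation): the waiter re-checks the flag in a loop, so a wakeup that arrives without the flag being set is harmless. The class and function names are hypothetical.

```python
import threading


class ReadyFlag:
    """A readiness flag that tolerates spurious wakeups."""

    def __init__(self):
        self._cond = threading.Condition()
        self._ready = False

    def wait(self):
        with self._cond:
            while not self._ready:       # a wakeup alone proves nothing
                self._cond.wait()

    def signal(self):
        with self._cond:
            self._ready = True
            self._cond.notify_all()

    def spurious_wakeup(self):
        with self._cond:
            self._cond.notify_all()      # wake waiters without setting the flag


def demo():
    flag = ReadyFlag()
    log = []

    def thread_a():
        flag.wait()
        log.append("A continues")

    a = threading.Thread(target=thread_a)
    a.start()
    flag.spurious_wakeup()               # must not let A through
    log.append("B signals")
    flag.signal()
    a.join()
    return log
```

Because A can proceed only after the flag is set, `demo()` always records B's signal before A continues, no matter how many spurious wakeups happen in between.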


That concludes the power management functionality. Let us turn to how the processor behaves in the first moments after power-on and the last moments before a reboot. As it turns out, it can be in inactive modes then, too.


Wait-for-SIPI


This rather awkward name stands for “waiting for the SIPI signal”. SIPI, in turn, is an abbreviation of “Start-up IPI”, and IPI is an inter-processor interrupt. To understand why the wait-for-SIPI state was introduced, you need a general idea of how a multiprocessor system is initialized. The problem is this: if all cores, threads, and processors rushed to execute the same boot code after power-up, chaos would ensue. The process is fairly complex and differs in details between platforms, but in general it goes as follows.


  1. After power-up, all logical processors take part in a race that selects one primary processor, the so-called bootstrap processor (BSP). All the other processors are called application processors (APs).


  2. The BSP starts executing boot code from ROM at address 0xfffffff0.


  3. All APs are put into the wait-for-SIPI mode and wait for the BSP to send them a SIPI. That happens once the code running on the BSP has completed the critical part of system initialization: building the ACPI tables in memory, assigning unique APIC IDs. Alternatively, the BSP may never wake anyone up if, say, multiprocessing has been manually turned off in the BIOS.

In the wait-for-SIPI state, the processor executes no instructions. It also ignores external device interrupts and the INIT and NMI signals, and it holds SMIs pending. In effect, the only thing that should bring it out of this state is the SIPI signal. Note that the specifications say nothing about power consumption in this mode.
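
In a simulator, this state is a small event filter. Here is a minimal sketch with hypothetical class and event names. It also relies on one detail from the manual that is not spelled out above and should be treated as an assumption: the 8-bit vector carried by the SIPI selects the 4 KB page where the AP starts executing, so the start address is the vector shifted left by 12 bits.

```python
class ApplicationProcessor:
    """Models an AP only while it sits in the wait-for-SIPI state."""

    def __init__(self):
        self.state = "wait-for-SIPI"
        self.pending_smi = False
        self.start_address = None

    def deliver(self, event: str, vector: int = 0) -> str:
        if self.state != "wait-for-SIPI":
            raise RuntimeError("this sketch covers only the wait-for-SIPI state")
        if event == "SIPI":
            # Assumption: vector VV means "start at physical address 0x000VV000".
            self.start_address = (vector & 0xFF) << 12
            self.state = "active"
        elif event == "SMI":
            self.pending_smi = True      # delayed, not lost
        elif event in {"INTR", "NMI", "INIT"}:
            pass                         # ignored in this state
        else:
            raise ValueError(f"unknown event: {event}")
        return self.state
```

For instance, after `deliver("SIPI", vector=0x9A)` the model AP becomes active with `start_address == 0x9A000`, while any number of earlier NMI or INTR events leave it untouched.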


Note that later in the boot process all APs may be turned off and on again several times. For example, an OS loader may be written for a single thread only, and the OS itself usually prefers to bring processors into action one by one. The wait-for-SIPI state is no longer used at that point: HLT or simply an infinite loop on the AP takes its place.


Most programmers, even system programmers, will never meet the wait-for-SIPI mode in practice, simply because it occurs once, very early in the life of a system. There is an exception to this rule, however. What happens when a virtual machine with several logical processors runs on top of Intel VT-x hardware virtualization support? It turns out that in VMX non-root mode (the guest system) the processor can also be placed in different states. Besides the active state, the inactive states HLT, shutdown (more about it below), and wait-for-SIPI are supported. In the last of these, the processor behaves much as it does during normal AP initialization: it does nothing, ignores many incoming signals, and only when a SIPI arrives does it leave guest mode for the monitor (a VM exit occurs). Note that whether the SIPI mechanism is used at all depends on the particular virtual machine monitor; in practice, some of them implement their own protocol for waking the BSP and the APs inside the VM.


Shutdown


Alas, the code people write is not perfect. Serious errors in application programs usually end with the program being terminated under the watchful eye of the operating system. But who looks after the OS itself if it stumbles? Its supervisors can be virtual machine monitors or, if none are used, the hardware itself, that is, the processor and its special states. Those are what we will discuss.


A typical event in the life of any program is an exceptional situation (an interrupt). It does not necessarily indicate an error: the interruption of the current program may be temporary, caused by external devices, or initiated deliberately by the application itself to request a service from the OS (see the classification of such situations in my comments).


When an exceptional situation occurs, the processor switches state in a way that resembles a very elaborate procedure call. The details do not matter here (this article is not about exceptions); what matters is that something can go wrong in the process: an exception can occur while the processor is trying to deliver an exception. The Intel IA-32 specification calls this case a Double Fault. Like other exceptions, it has its own number (8) and its own entry in the interrupt descriptor table, so the OS can install a handler for it.


But what happens if an exception occurs while trying to invoke the Double Fault handler? No need to guess: this situation is called a Triple Fault. No handler is provided for it; instead, the processor enters shutdown mode.
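
Not every pair of nested exceptions escalates, and a simulator has to get the rule right. The sketch below follows the classification from the manual as I remember it (benign, contributory, and page-fault classes) and covers only the classic vectors; treat the sets and the function names as assumptions to verify against the exception-handling chapter of the manual.

```python
CONTRIBUTORY = {0, 10, 11, 12, 13}   # #DE, #TS, #NP, #SS, #GP
PAGE_FAULT = {14}                    # #PF
DOUBLE_FAULT = 8                     # #DF


def exception_class(vector: int) -> str:
    if vector in CONTRIBUTORY:
        return "contributory"
    if vector in PAGE_FAULT:
        return "page fault"
    return "benign"


def escalate(first: int, second: int):
    """`second` occurs while `first` is being delivered.

    Returns the vector the processor delivers next, or "shutdown".
    """
    c2 = exception_class(second)
    if first == DOUBLE_FAULT:
        # A fault while invoking the #DF handler is the Triple Fault case.
        return "shutdown" if c2 != "benign" else second
    c1 = exception_class(first)
    if c1 == "contributory" and c2 == "contributory":
        return DOUBLE_FAULT
    if c1 == "page fault" and c2 != "benign":
        return DOUBLE_FAULT
    return second                    # the two are simply handled one after another
```

For example, `escalate(13, 11)` yields 8 (a Double Fault), `escalate(13, 14)` yields 14 (the page fault is handled normally), and `escalate(8, 13)` yields `"shutdown"`.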


This mode resembles the state after HLT. The processor stops executing instructions until an NMI, SMI, RESET, or INIT signal arrives. What actually happens to a system in the shutdown state depends on the implementation. For example, a front-panel indicator may light up, a non-maskable interrupt may be generated to record diagnostic information, the system may be rebooted (warm or cold), or an SMI may be generated.


Perhaps the most common reaction to the processor entering shutdown mode is a restart of the whole system. In Linux, deliberately putting the processor into shutdown mode is one of six methods (the last one, as the most desperate) of handling a reboot request.


As with wait-for-SIPI, virtualization adds nuances to processor behavior in shutdown mode. A triple fault in non-root mode does not, of course, restart the whole system. It causes a VM exit, letting the VM monitor deal with the situation in the misbehaving guest. In addition, the monitor can launch a guest in non-root mode directly in the shutdown state (I don't know why that might be needed).
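
For monitor and simulator authors, the guest states mentioned so far map onto a handful of numbers. The sketch below lists the guest activity state encodings and three basic exit reasons as I recall them from the VMX chapters of the manual; verify the values before use. The helper `vm_exit_reason` is a hypothetical illustration that covers only the two cases discussed in this article.

```python
from enum import IntEnum


class GuestActivityState(IntEnum):
    ACTIVE = 0
    HLT = 1
    SHUTDOWN = 2
    WAIT_FOR_SIPI = 3


class BasicExitReason(IntEnum):
    TRIPLE_FAULT = 2
    INIT_SIGNAL = 3
    SIPI = 4


def vm_exit_reason(state: GuestActivityState, event: str):
    """Return the exit reason for the two cases in the text, else None."""
    if state == GuestActivityState.ACTIVE and event == "TRIPLE_FAULT":
        return BasicExitReason.TRIPLE_FAULT   # the monitor handles the broken guest
    if state == GuestActivityState.WAIT_FOR_SIPI and event == "SIPI":
        return BasicExitReason.SIPI           # a SIPI leaves guest mode
    return None                               # other combinations are not modeled
```

A monitor that wants to start a guest "already crashed" would, in these terms, enter it with the activity state set to `GuestActivityState.SHUTDOWN`.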


More about Shutdown

A very attentive reader of the documentation may notice that some VM exits with a corrupted processor state can put the processor into the so-called VMX-abort shutdown mode. It is so severe that only RESET can bring the processor out of it; all other signals are ignored.


Note that an ordinary Triple Fault in system code is quite easy to provoke: just leave the system tables unconfigured and wait a little. The very first interrupt after interrupts are enabled will produce the (un)desired effect and a reboot.


A VMX abort with the subsequent shutdown, by contrast, is not so easy to get. It can occur only on the way out of the guest into the monitor (the transition from non-root to root). Before you can exit, you have to enter (perform a VM entry). But VM entry itself performs a huge number of checks, including ones that forbid running with an inconsistent state. If something is configured incorrectly, the attempt to enter the guest returns immediately with an error code. While running, the guest has significantly restricted rights and usually cannot destroy the system structures on its own. In other words, an error in the monitor usually shows up earlier, at entry. You have to be very inventive (for example, by messing up memory isolation or model-specific registers) to get an error on VM exit.


Exotic: SENTER sleep and TXT shutdown


Finally, it is worth mentioning SMX (Safer Mode Extensions), the software interface to the Intel TXT (Trusted Execution Technology) set of platform technologies. Processors that support SMX get two more inactive modes.


  1. The first task of any security technology is to establish which entities (code, elements of the runtime environment) can be trusted, that is, to establish a root of trust. This is easiest when only one processor in the system is active: then the remaining processors cannot be running untrusted programs.


    Executing the GETSEC[SENTER] instruction on one logical processor puts the remaining processors into a new inactive state, SENTER sleep. The program running on the one active processor must then bring the system into the so-called “certified” environment (measured environment). Once the certified environment is ready, the other processors can work in it. They are brought out of the SENTER sleep state with the GETSEC[WAKEUP] instruction.


  2. As always, errors are possible while trusted code runs: accesses to the wrong resources, exceptions, or mismatches in the results of cryptographic checks. They are caused either by careless programmers or by deliberate attempts to disrupt the certified environment from outside. In the latter case, the goal is to compromise the environment by substituting untrusted code or by extracting secrets.


    When an invalid event is detected in the certified environment, the processor is put into a new state, TXT-shutdown. Its distinctive feature is that information about the cause of the shutdown is stored in platform registers and survives a reboot, so it can be analyzed later. If only the ordinary Triple Fault had something like that! It would help noticeably with diagnosing problems.
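
Both SMX states can be captured in a toy model. The sketch below is deliberately simplified and entirely hypothetical in its names: it tracks one state label per logical processor and one "sticky" error value that stands in for the platform registers. After a reset it marks every processor as active, which glosses over the normal BSP/AP start-up sequence.

```python
class TxtPlatform:
    """Toy model of SENTER sleep and TXT-shutdown."""

    def __init__(self, cpu_count: int):
        self.cpus = ["active"] * cpu_count
        self.error_code = 0              # stands in for a register that survives reset

    def senter(self, initiator: int):
        # GETSEC[SENTER] on one processor puts all the others to sleep.
        self.cpus = ["active" if i == initiator else "SENTER sleep"
                     for i in range(len(self.cpus))]

    def wakeup(self):
        # GETSEC[WAKEUP] brings the sleeping processors back.
        self.cpus = ["active" if s == "SENTER sleep" else s for s in self.cpus]

    def txt_shutdown(self, code: int):
        self.error_code = code           # record the cause before stopping
        self.cpus = ["TXT-shutdown"] * len(self.cpus)

    def reset(self):
        # Simplification: everything restarts, but error_code is kept on purpose.
        self.cpus = ["active"] * len(self.cpus)
```

The point of the model is the last method: after `txt_shutdown(code)` followed by `reset()`, the value of `error_code` is still there to be read.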



Thanks for your attention!



Source: https://habr.com/ru/post/283066/

