A person with one watch knows what time it is. A person with two watches is never sure of anything.
Segal's Law
Why would a program need to know the time at all? In fact, a great many algorithms used in practice do not depend on the time of day. And that is a good thing: history knows many cases where programs that worked on old hardware "broke" when run on newer, faster machines simply because they were tied to the characteristic durations of certain processes.
In everyday life, I can think of three kinds of tasks that require reading the current time.
- Determining the relative order of events. For this we use clocks, which measure time from the "beginning of time", an "epoch", or some other fixed event in the past.
- Measuring the duration of processes. For this we use stopwatches and timers.
- Not missing an important event in the future. For this we need alarm clocks.
Inside a computer the situation is similar: its timekeeping devices act as one of these three instruments, and sometimes as all three at once.
In this part of the article I will give a brief overview of the general properties of the timekeeping devices found in modern systems and describe their features and problems. In the second part I will discuss the peculiarities of modeling timers when building simulators and virtual machine monitors.

Time source requirements
The requirements for time devices are numerous and varied.
- A known and high resolution. Resolution is determined by the frequency of the oscillator used in the device: the higher it is, the shorter the minimum measurable interval.
- Low read latency. Querying a time source is not instantaneous in itself, and by the time its response reaches the processor, the value is already stale. Worse, the magnitude of this delay can vary, introducing additional uncertainty into the readings. The order of magnitude of the delay is determined by where the time source sits relative to the reading core: the delay for time obtained from a network synchronization server is far greater than for a reading taken from a counter located inside the core itself.
- The longest measurable interval. Since time is represented by fixed-width numbers, the counter may overflow between two measurements. The higher the timer's resolution, i.e. its frequency, the sooner the register storing the value overflows.
- Independence from external power. What happens to the contents of a typical device's registers when the system is powered off? Most often they lose their values and fill with "garbage", at best with zeros. Yet sometimes we want time to keep flowing even while the computer is off. For that, the time device can be made non-volatile, i.e. given its own battery with enough capacity to ride out the typical period the system spends powered off.
- Monotonicity of values. Each successive value read from the timer must be strictly greater than all previous ones, except when an overflow occurs; that case, however, can be signaled by the hardware and handled by software in a special way. The absence of a monotonicity guarantee forces the timer's users to account for the possibility of negative or zero interval durations (for example, one cannot divide by an interval's length without checking: it may turn out to be zero).
- Uniform rate of change. If the timer's oscillator frequency changes along the way, or the timer pauses its operation, for example on entering a power-saving mode, then events scheduled under the assumption of a uniform rate may fire later than expected. Another inherent source of non-uniformity is purely physical instability of the oscillator, caused by temperature fluctuations, material degradation, and other similarly odd phenomena.
- Consistency with the other timers in the system. If a computer has several time source devices, their readings may well disagree. Even if synchronized with each other at the start, they can drift apart during operation for many reasons, including those described above.
Of course, which set of source properties is sufficient depends on how it is used in programs. For example, one device may offer low resolution and high read latency yet be non-volatile and very stable, while another lets you measure very short intervals but overflows quickly and is not synchronized with anything at all.
PC Timer Overview
There may be several time sources in a system. Application programs rarely access any of them directly. Instead, various APIs are used, offered by the programming language (for example, C++11 <chrono>), by the runtime environment (for example, gettimeofday from POSIX or QueryPerformanceCounter on MS Windows), or directly by the operating system's system calls.
The OS itself also needs to know the time and to measure intervals in order to schedule user threads, account for the resources they consume, profile performance, manage power, and so on. For this, the OS works directly with the interfaces provided by the hardware. Since there are many timers, a modern OS chooses one to use "centrally" at boot, based on its notion of the "quality" of the detected devices (on some systems, certain timers may be blacklisted because of known problems) or on user settings (the clocksource kernel parameter in Linux; the useplatformclock, tscsyncpolicy, and disabledynamictick options of BCDEDIT in Windows).
Below I describe the devices most frequently encountered as clocks and timers in the PC.
Common
Real Time Clock (RTC) is the source of the current date and time for the needs of the OS. Its typical resolution is one second. All ACPI-compliant systems have an RTC chip compatible with the Motorola MC146818 that shipped in the original IBM PC/AT back in 1984. In modern systems the RTC is usually integrated into the chipset's south bridge on the motherboard (which implies a rather large read latency). Its non-volatility is provided by a dedicated battery. Programming the RTC evokes nostalgia for BCD numbers and the Y2K problem.
Is the RTC always available? Surprisingly, the very first IBM PC systems had no RTC at all: on every startup, MS-DOS prompted the user to enter the current date and time.
Even today, not every computing system can keep time across reboots. For example, the original Raspberry Pi has no built-in RTC (a cost-saving measure), and setting the correct date and time at boot depends on synchronization with network NTP servers.
Programmable Interval Timer (PIT), the Intel 8253 or 8254, is a standard counter/timer present in the PC since the very beginning of the platform (1981). Like the RTC, it was originally a separate chip and is now part of the chipset. It is quite an interesting device: it contains three timers (although the last two were traditionally reserved for DRAM refresh and the PC speaker, respectively) that can be programmed into various modes: periodic interrupts, one-shot timeouts, and so on.
The first PIT channel can still be used by the OS as the interrupt source driving the preemptive task scheduler. By modern standards, however, it is not very convenient to work with: a low oscillator frequency of 1193182 Hz (the odd value is a historical legacy of the NTSC color subcarrier frequency), a counter only 16 bits wide (frequent overflows) with status and command registers only eight bits wide (so values must be transferred and read in parts), and register access through the slow and inflexible PIO mechanism (the processor's IN/OUT instructions).
Local APIC (Advanced Programmable Interrupt Controller) is built into all modern Intel processors (starting with the P54C) and includes, among other things, a timer. Moreover, each logical processor has its own LAPIC, which is convenient for scheduling work local to the current core without any arbitration over a shared resource. However, this timer has no fixed, known frequency; it is typically tied to the bus or core frequency. Therefore, before using it, software must measure (calibrate) it, which requires an additional reference device. The modes supported by the LAPIC timer are a one-shot interrupt, a periodic interrupt, and a deadline defined in terms of the TSC.
The timer that is part of ACPI, for some reason named the Power Management Timer (PMTIMER), is another device supported by all systems implementing the ACPI standard since 1996. This timer runs at 3.579545 MHz, and its counter register can be 24 or 32 bits wide. The timer is always active while the system is powered on and does not depend on the CPU's operating mode.
High Precision Event Timer (HPET) is a device created as a replacement for the aging PIT. According to the specification, an HPET must contain an oscillator running at a fixed frequency of at least 10 MHz, whose value can be read programmatically from a capability register, and a 64-bit monotonically increasing counter. It must also contain at least three comparators, 32 or 64 bits wide, used to generate interrupts when programmed time values are reached. Like the PIT, it can operate in a periodic mode or a one-shot mode. At the same time, its programming interface (MMIO instead of PIO) is more convenient and faster than the PIT's, which, together with the higher resolution, allows intervals to be set more precisely and with lower latency. The required oscillator stability is 0.05% for intervals longer than 1 ms and 0.2% for intervals shorter than 100 μs; whether that is a lot or a little depends on the application.
Although the HPET has long been present in the PC (since 2005), operating systems have been in no hurry to adopt it. This is partly due to its not very convenient way of setting intervals using an up-counter rather than a down-counter: because the operations are not instantaneous, there is a risk of "being late" and programming an event in the past. The OS therefore often uses the APIC timer, the PMTIMER, or the TSC functionality, with processor cycles as the time source.
The hard luck of the RDTSC instruction
The history of the TSC is interesting and instructive enough to dwell on a little longer.
The idea itself is very transparent: use the processor itself, or rather its clock generator, as the time source. The number of the current clock cycle is stored in the TSC (timestamp counter) register.
With the TSC you can both learn the time elapsed since startup and measure intervals between two readings. In conjunction with the APIC in TSC-deadline mode, the TSC even works as an alarm clock.
- RDTSC (Read TimeStamp Counter) appeared in the Intel® Pentium™. It writes into the EDX:EAX register pair the 64-bit number of clock cycles elapsed since the last power-on/reset of the current processor core. Unlike all the previously described devices, which are accessible only to privileged code, RDTSC is by default allowed at any privilege level (although the OS can dynamically disable RDTSC support in user mode, in which case it raises an exception).
- RDMSR [0x10], a read of the model-specific register (MSR) IA32_TIME_STAMP_COUNTER, also returns the current TSC. This instruction is allowed only in privileged mode, and some operating systems actively use it to read the TSC (though I personally do not understand why). A useful feature: through the MSR, the TSC value can not only be read but also modified, using the WRMSR instruction.
- RDTSCP: its availability can be determined by checking the corresponding CPUID leaf. Its two differences from RDTSC are discussed below.
So the TSC is a perfectly natural device with simple logic and a simple usage scenario, and it ought to have many useful properties: high resolution (one CPU cycle), low read latency (tens of cycles), rare overflows (a 64-bit counter incremented at a few gigahertz takes decades to wrap), monotonic readings (after all, the counter only ever increases), uniformity (the processor is always running), and consistency with other timers (at system start the desired value can be set by writing the MSR).
What could have gone wrong? The subsequent evolution of processors stood in the way of the TSC becoming the primary means of measuring time in the PC. New features that appeared after the Pentium "spoiled" RDTSC and for many years prevented it from serving as the main timer in popular OSes. Thus, in 2006, Linux developer Ingo Molnar wrote:
We found that over 10 years not a single generally working TSC-based gettimeofday implementation was ever written (and I wrote the first one, for the Pentium, so I am guilty of this too), and that we are better off without one.
Snatching a bit of time from the Pentium did not always work out.
I note that over time the IA-32 architecture received corrections eliminating these flaws, and at the moment the TSC can (until something breaks it again) be used in the capacity for which it was intended.
- Out-of-order execution (OoO). Starting with the Intel® Pentium™ Pro (1995), the processor may execute machine instructions in an order different from the one in the program, or even in parallel (when they are independent of each other). This means that the execution of RDTSC can be delayed or, conversely, happen earlier than sequential program order requires. Because of this it is impossible, for example, to tell how many instructions were executed between two RDTSC calls, and thus impossible to reliably time a section of code. As a result, monotonicity of the readings is not guaranteed.
RDTSC is not a serializing instruction, so a "fence" of serializing instructions, such as CPUID, is usually placed around it. This, of course, does not look very elegant. Later updates to the architecture added RDTSCP, an instruction that partially serializes the execution flow and therefore needs no extra barriers. It has one more nice property, discussed a little later.
- Power management. The TSC is incremented on every processor clock cycle. Does a cycle always have the same period, and does the next cycle always follow immediately after the previous one? For the Intel® Pentium™ both were true; for modern processors the answer to both questions is no. A processor can spend a significant share of its time suspended to save energy (C-states), and while executing instructions it can change frequency dynamically, either to save energy (P-states) or, conversely, to maximize performance (Turbo mode). It follows that a naive cycle counter provides neither uniformity nor consistency.
A solution to this problem appeared (starting with Nehalem) in the form of the so-called invariant TSC, whose rate of change does not depend on the C- and P-states of individual cores.
- Multiprocessing and multi-core. In a system with multiple threads, cores, or processors, each logical processor has its own TSC. This creates not one but two difficulties.
First, the values returned by RDTSC on different logical processors may be offset relative to each other because the cores are not initialized at exactly the same moment. Moreover, due to unavoidable frequency drift of the individual counters, this difference can fluctuate unpredictably during operation.
Second, reliable time measurement in user applications breaks down. Without extra tweaks, such as pinning thread affinity, a program can be preempted from one processor at any moment and later resumed on another. If the OS moved a process from one core to another while it was measuring the duration between two events, the two RDTSC readings it took are unrelated to each other.
To compensate for the first problem, recent processor generations drive the TSCs of all cores from a single clock signal, so their readings should be identical.
To eliminate the second drawback, RDTSCP has an additional feature that lets a user application detect migration during the measurement of an interval. Besides the TSC value in EDX:EAX, it returns in ECX the value of the per-processor model-specific register IA32_TSC_AUX. Both reads happen atomically, i.e. TSC and TSC_AUX are always taken from the same logical processor. At startup the OS should set unique TSC_AUX values on all processors in the system. If the ECX values of two RDTSCP calls match, they were made on the same processor; otherwise the difference of the two TSC values cannot be trusted and the measurement should be repeated. This mechanism can have other uses as well: for example, it can notify an application not just of migration but of mere preemption, which can also distort time measurements. Nor is it limited to unprivileged applications: the Xen hypervisor uses this mechanism to notify DomU guests of migration between machines.
Another "shortcoming" of the first TSC implementations was that not all processors allowed writing all 64 bits of the TSC: only the lower 32 could be written, while the upper ones were zeroed. This problem, too, was later eliminated.
Other devices
Above I described the most common and widely used timekeeping devices. Specific systems may of course have additional ones, unique to a processor or a chipset, or even specialized peripherals (for example, ultra-precise atomic clocks). Whether they are accessible from programs depends on whether the chosen OS has a driver for the particular device. Skimming the Linux sources, I found at least two more time sources supported in x86 builds: the NatSemi SCx200 on AMD Geode systems and the Cyclone on IBM x440 systems. Unfortunately, documentation for them on the Internet is scarce.
Finally, a brief mention of timekeeping devices in processors of other architectures. The list is of course incomplete; if readers have interesting details about other systems, please mention them in the comments.
- PowerPC. The specifications for 32-bit and 64-bit systems postulate a 64-bit TB (time base) register, readable by user applications and readable/writable from the supervisor. The TB value must change monotonically, never decreasing, but not necessarily uniformly, and its frequency is implementation-dependent. A 32-bit DEC (decrementer) register is also available from supervisor mode, allowing an interrupt to be programmed after an interval: its value counts down to zero at the same frequency at which TB increments.
- ARM. In general, the available timekeeping facilities depend strongly on the particular family. On the ARM11 architecture the CCNT register can be used to read the current cycle number; however, it is only 32 bits wide, which means an overflow roughly every 10 seconds on a 400 MHz system. Cortex-M3 systems have a 24-bit SysTick device, and its calibration value for a 10 ms interval is given by the TENMS field.
- Intel® IA-64 (Itanium). On these systems, the 64-bit ar.itc (interval time counter) register serves as the cycle counter. For programming time intervals, the registers cr.itm (interval timer match) and cr.itv (interval timer vector) can be used: the first sets the ITC value at which an interrupt is generated, and the second determines its vector number.
- SPARC v9. The architecture defines a 63-bit TICK register. The remaining, most significant bit of the register controls whether unprivileged applications are allowed to read the time.
Conclusion
I hope this note has made it clear that working with time inside a computer at the system level is actually far from trivial. The requirements placed on timekeeping devices depend on the problem being solved, and it is not always easy to find a fully suitable option. Meanwhile, the devices themselves often harbor "architectural features" capable of making an unlucky programmer rack their brains.
However, all of this was just the architectural introduction to a simulation fairy tale. What I really want to tell you is how to model this whole zoo of devices. In the next article I will describe how the capricious nature of time manifests itself when building virtual environments: simulators and virtual machine monitors. Thanks for your attention!