
If the work of a hacker, or rather of a programmer-researcher, went the way it does in classic films (he comes in, taps a few keys, everything on the screen flashes green, passwords fall, and money suddenly moves from point A to point B), it would certainly be easier and more fun. In reality, any serious hack is always preceded by thorough and tedious analytical work. That work is what we will deal with here, and we will present its results for your judgment as a series of two articles. Make sure you have enough beer and cigarettes: reading this material is dangerous for an unprepared brain :).
The discovery of the bug that later received the number MS13-092 (an error in the Hyper-V component of Windows Server 2012 that allows sending the hypervisor into a BSOD from a guest OS or executing arbitrary code in other guest OSes running on the vulnerable host) was a very unpleasant surprise for Microsoft engineers. Before that, no vulnerabilities had been found in Hyper-V for almost three years: the previous one was MS10-102, discovered at the end of 2010. In that time the popularity of cloud services has grown considerably, and researchers are showing an increasing interest in the security of the hypervisors underlying cloud systems. However, the number of publicly available works is extremely small: researchers are reluctant to spend their time studying such complex and poorly documented architectural solutions. This article does not cover specific hypervisor vulnerabilities, but it should shed light on the inner workings of some Hyper-V mechanisms and thereby somewhat simplify future research.
INFO
Before reading the article, it is recommended that you look through the ERNW report, the "Hyper-V debugging for beginners" material, and the official Hypervisor TLFS document.
VMBus
At the time of writing, Windows Server 2012 R2 Update 1 was used both as the Hyper-V server and as the guest OS (machine type Generation 1), but other versions of Windows were used to illustrate some features of the bus; this is explicitly noted in the article. It is better to deploy the test environment in VMware Workstation 2014 July TechPreview or later, because a bug in earlier versions of Workstation prevents debugging virtual machines over the network (unless you force the use of UEFI in the virtual machine configuration). It is also assumed below that the stand is deployed on an Intel hardware platform and the hypervisor functions are implemented in hvix64.exe.
Terms and Definitions
- Root partition (parent partition, root OS) - Windows Server 2012 R2 with the Hyper-V component enabled;
- Guest OS - Hyper-V virtual machine with Windows Server 2012 R2 installed;
- TLFS - Hypervisor Top-Level Functional Specification: Windows Server 2012 R2;
- LIS - Linux Integration Services;
- ACPI - Advanced Configuration and Power Interface.
About VMBus
MSDN Hyper-V Architecture Article
In short, VMBus is a technology for interaction between guest operating systems and the root OS. Accordingly, both the guest and the root OS contain components that implement this interaction through the interfaces provided by the hypervisor and described in TLFS 4.0. Microsoft also develops guest components for Linux-based operating systems; they are integrated into the Linux kernel and published separately on GitHub:
github.com/LIS/LIS3.5.
Beginning with Windows Server 2008, the Windows kernel contains functions that optimize the operation of the OS in a Hyper-V virtual environment. For comparison: the Windows Server 2008 (x64) kernel implements only 25 functions with the Hvl prefix (which marks them as belonging to the hypervisor integration library), while Windows Server 2012 R2 already contains 109 Hvl functions.
Let us consider how the VMBus components interact with the hypervisor, the root OS, and the guest OS. First, look at the LIS source code, which shows that VMBus is an ACPI device. ACPI makes it possible to standardize the hardware platform for different operating systems and is implemented in Hyper-V (as in other popular virtualization platforms), which allows standard utilities to be used to obtain the information needed for research.
ACPI devices can be examined with the ACPI Tool utility included in older versions of AIDA64 (it was later removed). With it, two devices are found in _SB.PCI0.SBRG: VMB8 and VMBS (see Figure 1).

Fig. 1. VMB8 and VMBS devices
We will obtain the ACPI DSDT (Differentiated System Description Table), which contains information about peripheral devices and additional functions of the hardware platform, using the same ACPI Tool utility and then decompile the AML bytecode into ASL with a disassembler. The result is the dump shown in Fig. 2.

Fig. 2. ASL dump
A cursory reading of the Advanced Configuration and Power Interface Specification 5.0 makes it clear that if the guest OS is Windows 6.2 or higher, the VMB8 device is used, otherwise VMBS. The only difference between these devices is the _UID (Unique ID) object, which is present in VMB8. According to the ACPI specification, this object is optional and is required only if the device cannot otherwise provide the operating system with a persistent unique identifier. The resources used by the device also become known: interrupts 5 and 7.
For comparison: a Generation 2 virtual machine has only a VMBS device, located in _SB_.VMOD.VMBS (but with a _UID object), which uses only interrupt 5 (see Figure 3).

Fig. 3. Part of ASL Dump Gen2
Interrupt handling in a virtual environment
In Windows, interrupts are handled by routines registered in the Interrupt Dispatch Table (IDT). There is no direct connection between IRQs 5 and 7 from the ACPI DSDT and the handlers in the IDT; to match an interrupt to its handler, Windows uses an interrupt arbiter (in general, there are several arbiter classes: IRQ, DMA, I/O, memory).
WWW
Everything about arbiters in the MSDN blog:
goo.gl/FuvG4R
goo.gl/V3UV8z
goo.gl/h1vXaf
Information about the registered arbiters can be viewed in WinDBG with the !acpiirqarb command.
kd> !acpiirqarb (for the guest Windows Server 2012 R2 Gen1, see Fig. 4):

Fig. 4. !acpiirqarb in the Windows Server 2012 R2 Gen1 guest
The output of the command shows that for IRQ 7 the handler address will be in IDT entry 0x71, and for IRQ 5 in entry 0x81. The interrupt vector numbers are assigned in the acpi!ProcessorReserveIdtEntries function at the stage when the PnP manager builds the device tree, while the functional device driver is not yet loaded. Registration of an ISR in the IDT happens at a later stage, for example when the device driver itself calls the IoConnectInterrupt routine. However, looking at the IDT entries, we see that no ISRs are registered for vectors 0x71 and 0x81:
kd> !idt -a
...
71: fffff80323f73938 nt!KxUnexpectedInterrupt0+0x388
81: fffff80323f739b8 nt!KxUnexpectedInterrupt0+0x408
...
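For reference: when a functional device driver does register its ISR, it typically goes through the IoConnectInterrupt routine mentioned above. A minimal sketch of such a registration is shown below; the vector, IRQL, and other values here are placeholders for illustration (a real driver takes the translated values from the CM_RESOURCE_LIST it receives from the PnP manager), not the actual vmbus parameters.

#include <ntddk.h>

static PKINTERRUPT g_InterruptObject;

/* The ISR that ends up behind the IDT entry allocated by the arbiter */
static BOOLEAN SampleIsr(PKINTERRUPT Interrupt, PVOID ServiceContext)
{
    UNREFERENCED_PARAMETER(Interrupt);
    UNREFERENCED_PARAMETER(ServiceContext);
    /* acknowledge the device, queue a DPC, etc. */
    return TRUE;
}

NTSTATUS ConnectSampleInterrupt(VOID)
{
    /* Placeholder resource values; real ones come from the PnP-translated resources */
    return IoConnectInterrupt(&g_InterruptObject,
                              SampleIsr,
                              NULL,            /* ServiceContext      */
                              NULL,            /* no extra spin lock  */
                              0x81,            /* translated vector   */
                              (KIRQL)8,        /* Irql                */
                              (KIRQL)8,        /* SynchronizeIrql     */
                              LevelSensitive,  /* InterruptMode       */
                              TRUE,            /* ShareVector         */
                              1,               /* ProcessorEnableMask */
                              FALSE);          /* FloatingSave        */
}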
In Windows Server 2012 R2 Gen2, IDT entry 0x90 is mapped for IRQ 5:
kd> !acpiirqarb
Processor 0 (0, 0):
Device Object: 0000000000000000
Current IDT Allocation:
0000000000000000 - 0000000000000050 00000000 <Not on bus> A:0000000000000000 IRQ(GSIV):10
...
0000000000000090 - 0000000000000090 D ffffe001f35eb520 (vmbus) A:ffffc00133972660 IRQ(GSIV):5
...
However, as the debugger shows, the ISR procedure for vector 0x90 is also undefined:
kd> !idt -a
...
90: fffff8014a3daa30 nt!KxUnexpectedInterrupt0+0x480
...
In Windows 8.1 x86, we see a slightly different picture:
kd> !acpiirqarb
Processor 0 (0, 0):
Device Object: 00000000
Current IDT Allocation:
...
0000000000000081 - 0000000000000081 D 87f2f030 (vmbus) A:881642a8 IRQ(GSIV):fffffffe - MSI
...
00000000000000b2 - 00000000000000b2 SB 87f31030 (s3cap) A:8814b840 IRQ(GSIV):5
In this case, the ISR procedure vmbus!XPartPncIsr is defined for interrupt number 0x81:
kd> !idt
81: 81b18a0c vmbus!XPartPncIsr (KINTERRUPT 87b59e40)
b2: 81b18c58 nt!KiUnexpectedInterrupt130
s3cap is an auxiliary driver for working with the emulated Hyper-V S3 Trio video card.

Fig. 5. Vmbus interrupt object
Thus, the ISR vmbus!XPartPncIsr is registered in the IDT only in Windows 8.1 x86 (presumably the same method is used in the other x86 operating systems that Microsoft supports as guest OSes for Hyper-V). The vmbus!XPartPncIsr procedure handles interrupts generated by the hypervisor.
On x64 systems, starting with Windows 8 / Windows Server 2012, integration with the hypervisor is implemented somewhat differently: dedicated handlers for interrupts generated by the hypervisor were added to the operating system's IDT. Let us briefly look at how the IDT is formed during Windows startup.
After the Windows loader winload.efi has initialized, the IDT looks like this (the output of a pykd script run at a WinDBG breakpoint in winload.efi when booting the operating system with the /bootdebug option):
kd> !py D:\hyperv4\idt_winload_parse.py
isr 1 address = winload!BdTrap01
isr 3 address = winload!BdTrap03
isr d address = winload!BdTrap0d
isr e address = winload!BdTrap0e
isr 29 address = winload!BdTrap29
isr 2c address = winload!BdTrap2c
isr 2d address = winload!BdTrap2d
Then, during the execution of winload!OslArchTransferToKernel, the IDT is reset, control is transferred to the Windows kernel, and in the nt!KiInitializeBootStructures function the IDT is initialized with values from the KiInterruptInitTable table:
kd> dps KiInterruptInitTable L40
...
fffff800`1b9553c0 00000000`00000030
fffff800`1b9553c8 fffff800`1b377160 nt!KiHvInterrupt
fffff800`1b9553d0 00000000`00000031
fffff800`1b9553d8 fffff800`1b3774c0 nt!KiVmbusInterrupt0
fffff800`1b9553e0 00000000`00000032
fffff800`1b9553e8 fffff800`1b377810 nt!KiVmbusInterrupt1
fffff800`1b9553f0 00000000`00000033
fffff800`1b9553f8 fffff800`1b377b60 nt!KiVmbusInterrupt2
fffff800`1b955400 00000000`00000034
fffff800`1b955408 fffff800`1b377eb0 nt!KiVmbusInterrupt3
...
Accordingly, the 0x30-0x34 system interrupt handlers after the completion of initialization will look like this:
kd> !idt
...
30: fffff8001b377160 nt!KiHvInterrupt
31: fffff8001b3774c0 nt!KiVmbusInterrupt0
32: fffff8001b377810 nt!KiVmbusInterrupt1
33: fffff8001b377b60 nt!KiVmbusInterrupt2
34: fffff8001b377eb0 nt!KiVmbusInterrupt3
...
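Judging by the dps dump above, KiInterruptInitTable appears to be an array of (vector, handler) pairs. Purely for illustration, walking such a table to populate the IDT could be sketched as follows; this is a reconstruction under that assumption, not the actual nt!KiInitializeBootStructures code, and KiSetIdtEntry is a hypothetical helper.

#include <ntddk.h>

/* Reconstructed layout, inferred from the dps output: two qwords per entry */
typedef struct _KI_INTERRUPT_INIT_ENTRY {
    ULONG64 Vector;        /* e.g. 0x30 for nt!KiHvInterrupt      */
    PVOID   Handler;       /* address written into the IDT entry  */
} KI_INTERRUPT_INIT_ENTRY;

VOID KiSetIdtEntry(ULONG Vector, PVOID Handler);   /* hypothetical helper */

VOID InitInterruptsFromTable(KI_INTERRUPT_INIT_ENTRY *Table, ULONG Count)
{
    for (ULONG i = 0; i < Count; i++)
        KiSetIdtEntry((ULONG)Table[i].Vector, Table[i].Handler);
}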
A second-generation (Generation 2) virtual machine in Hyper-V can only be created from an OS whose kernel contains the five additional handlers described above. To deliver interrupts, Intel provides the hardware virtual-interrupt delivery feature, but Hyper-V does not use it to transfer control to these handlers. Instead, the hypervisor sets the bit corresponding to the vector number in a special memory area with an instruction like lock bts [rcx + 598h], rax, where rax is the interrupt vector number (0x30-0x32). Apparently the Hyper-V developers considered registering the vmbus!XPartPncIsr procedure as a handler to be a less efficient solution than generating interrupts through APIC virtualization based on the data in the virtual SINTx registers.
These handlers are registered in the IDT even if the operating system is running outside a Hyper-V environment. Each handler calls HvlRouteInterrupt, passing an index as a parameter (see Figure 6).

Fig. 6. Additional Windows System Handlers
HvlRouteInterrupt looks as follows (Fig. 7).

Fig. 7. HvlRouteInterrupt
This function calls the handler from the HvlpInterruptCallback pointer array, depending on the index value. In the root OS this array looks like this:
5: kd> dps HvlpInterruptCallback
fffff802`fff5cc30 fffff800`dc639d50 winhvr!WinHvOnInterrupt
fffff802`fff5cc38 fffff800`dd5a9ec0 vmbusr!XPartEnlightenedIsr
fffff802`fff5cc40 fffff800`dd5a9ec0 vmbusr!XPartEnlightenedIsr
fffff802`fff5cc48 fffff800`dd5a9ec0 vmbusr!XPartEnlightenedIsr
fffff802`fff5cc50 fffff800`dd5a9ec0 vmbusr!XPartEnlightenedIsr
fffff802`fff5cc58 00000000`00000000
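In pseudo-C, the dispatch performed by HvlRouteInterrupt can be sketched roughly as follows. This is a reconstruction for illustration based on the disassembly in Fig. 7, not actual Windows source code; the callback prototype is an assumption.

#include <ntddk.h>

/* Assumed callback prototype; the real one may take arguments */
typedef VOID (*HVL_INTERRUPT_CALLBACK)(VOID);

/* Filled in via nt!HvlRegisterInterruptCallback (see below) */
extern HVL_INTERRUPT_CALLBACK HvlpInterruptCallback[];

VOID HvlRouteInterrupt(ULONG Index)
{
    HVL_INTERRUPT_CALLBACK Callback = HvlpInterruptCallback[Index];

    if (Callback != NULL)
        Callback();    /* e.g. winhvr!WinHvOnInterrupt for index 0 */
}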
Depending on the index passed from KiVmbusInterruptX, XPartEnlightenedIsr queues one of two possible DPC functions from an array of DPC structures inside vmbusr: vmbusr!ParentInterruptDpc or vmbusr!ParentRingInterruptDpc (Figure 8).

Fig. 8. DPC objects
The number of DPC structures in the array is determined by the vmbusr!XPartPncPostInterruptsEnabledParent function and depends on the number of logical processors in the root OS. For each logical processor, one DPC with vmbusr!ParentInterruptDpc and one with vmbusr!ParentRingInterruptDpc are added. The vmbusr!ParentRingInterruptDpc function determines the address of the DPC procedure for nt!KeInsertQueueDpc based on which processor it is currently running on.
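The mechanism itself is the standard kernel DPC API. A minimal sketch of how an ISR-level routine such as XPartEnlightenedIsr could defer its work to a DPC is shown below; only KeInitializeDpc, KeSetTargetProcessorDpc, and KeInsertQueueDpc are real documented APIs, the structure and routine names are illustrative.

#include <ntddk.h>

/* Illustrative context; vmbusr keeps its own per-processor DPC array */
typedef struct _CHANNEL_CONTEXT {
    KDPC RingDpc;              /* one KDPC per logical processor in the real driver */
} CHANNEL_CONTEXT, *PCHANNEL_CONTEXT;

/* DPC routine: runs later at DISPATCH_LEVEL, outside the ISR */
VOID RingInterruptDpc(PKDPC Dpc, PVOID Context, PVOID Arg1, PVOID Arg2)
{
    UNREFERENCED_PARAMETER(Dpc);
    UNREFERENCED_PARAMETER(Arg1);
    UNREFERENCED_PARAMETER(Arg2);
    /* process the ring buffer signals for the channel described by Context */
}

VOID SetupDpc(PCHANNEL_CONTEXT Ctx)
{
    KeInitializeDpc(&Ctx->RingDpc, RingInterruptDpc, Ctx);
    /* bind the DPC to the current logical processor */
    KeSetTargetProcessorDpc(&Ctx->RingDpc, (CCHAR)KeGetCurrentProcessorNumber());
}

/* Called from the interrupt callback: defer the heavy work to the DPC queue */
VOID OnVmbusInterrupt(PCHANNEL_CONTEXT Ctx)
{
    KeInsertQueueDpc(&Ctx->RingDpc, NULL, NULL);
}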
In the guest OS, VMBus registers only one handler in the HvlpInterruptCallback array:
1: kd> dps HvlpInterruptCallback
fffff803`1d171c30 fffff800`6d7c5714 winhv!WinHvOnInterrupt
fffff803`1d171c38 fffff800`6d801360 vmbus!XPartEnlightenedIsr
fffff803`1d171c40 00000000`00000000
The HvlpInterruptCallback array is filled by the kernel-exported nt!HvlRegisterInterruptCallback function. The WinHvOnInterrupt handler is registered when the winhvr.sys driver is loaded (winhvr!WinHvpInitialize -> winhvr!WinHvReportPresentHypervisor -> winhvr!WinHvpConnectToHypervisor -> nt!HvlRegisterInterruptCallback).
The remaining four handlers are registered by the vmbusr.sys driver when it is loaded by the PnP manager (vmbusr!RootDevicePrepareHardwareParent -> nt!HvlRegisterInterruptCallback).
Let's try to figure out how the hypervisor transfers control to these system interrupt handlers. To do this, turn to the Virtual Interrupt Control section of the TLFS. In short, Hyper-V manages interrupts in the guest OS through a synthetic interrupt controller (SynIC), which is an extension of the virtualized local APIC and uses an additional set of memory-mapped registers. That is, each virtual processor has, in addition to the usual APIC, its own SynIC. A SynIC contains two pages: SIM (synthetic interrupt message) and SIEF (synthetic interrupt event flags). SIEF and SIM are arrays of 16 elements, each element 256 bytes in size. The physical addresses (to be precise, the GPAs) of these pages are kept in the SIEFP and SIMP MSRs, respectively, and differ for each logical processor. SynIC also defines 16 SINTx registers; each element of the SIEF and SIM arrays is associated with the corresponding SINTx register. WinDBG can display the contents of the SINTx registers with the !apic command (starting with WinDBG 6.3).

!apic in the root OS

!apic in the guest OS
The SINT0 and SINT1 registers are configured by the nt!HvlEnlightenProcessor function, which writes parameters to MSRs 40000090h and 40000091h, respectively. SINT4 and SINT5 are configured by the vmbusr.sys driver: vmbusr!XPartPncPostInterruptsEnabledParent -> winhvr!WinHvSetSint -> winhvr!WinHvSetSintOnCurrentProcessor. SINT2 in the guest OS is configured by the vmbus.sys driver in the winhv!WinHvSetSintOnCurrentProcessor function.
Each SINTx register has an 8-bit Vector field. Its value determines which interrupt handler will receive control when hypercalls whose parameters include a PortID (HvSignalEvent, HvPostMessage) are executed.
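The same information that !apic shows can also be read directly from a guest kernel driver via the synthetic MSRs. A minimal sketch follows; the MSR indices and the SIMP/SIEFP/SINT field layout are taken from the TLFS and the LIS headers, while the function name and output format are illustrative. Note that the values are per logical processor and that reading these MSRs outside a Hyper-V guest would fault.

#include <ntddk.h>
#include <intrin.h>

/* Synthetic MSR indices from the TLFS / LIS headers */
#define HV_X64_MSR_SIEFP  0x40000082   /* GPA of the SIEF page        */
#define HV_X64_MSR_SIMP   0x40000083   /* GPA of the SIM page         */
#define HV_X64_MSR_SINT0  0x40000090   /* SINT0..SINT15 follow        */

VOID DumpSynicState(VOID)
{
    ULONG64 simp  = __readmsr(HV_X64_MSR_SIMP);
    ULONG64 siefp = __readmsr(HV_X64_MSR_SIEFP);
    ULONG i;

    DbgPrint("SIMP:  GPA=%llx enabled=%llu\n", simp & ~0xFFFULL, simp & 1);
    DbgPrint("SIEFP: GPA=%llx enabled=%llu\n", siefp & ~0xFFFULL, siefp & 1);

    for (i = 0; i < 16; i++) {
        ULONG64 sint = __readmsr(HV_X64_MSR_SINT0 + i);
        DbgPrint("SINT%u: vector=0x%02llx masked=%llu auto_eoi=%llu\n",
                 i,
                 sint & 0xFF,        /* 8-bit Vector field */
                 (sint >> 16) & 1,   /* Masked bit         */
                 (sint >> 17) & 1);  /* AutoEOI bit        */
    }
}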
A SINTx can be specified implicitly (for example, intercept messages are always controlled by the SINT0 register, and the message is placed in the first element of the SIM page), explicitly (for timer messages), or set in the parameters of a port created with the HvCreatePort hypercall. One of its parameters is PortTypeInfo. If the port type is HvPortTypeMessage or HvPortTypeEvent, then PortTypeInfo contains a TargetSint field with the number of the SINT to which the port is bound; its value can range from 1 to 15 (SINT0 is reserved for messages from the hypervisor and cannot be specified as TargetSint when creating a port).
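For reference, the port parameters mentioned here are described in the TLFS. A C rendering of HV_PORT_INFO, paraphrased from the specification, looks roughly like this; treat the exact field order and reserved/padding layout as approximate.

#include <basetsd.h>

/* Paraphrased from the Hypervisor TLFS; layout is approximate */
typedef UINT32 HV_SYNIC_SINT_INDEX;
typedef UINT32 HV_VP_INDEX;
typedef UINT64 HV_GPA;

typedef enum _HV_PORT_TYPE {
    HvPortTypeMessage = 1,
    HvPortTypeEvent   = 2,
    HvPortTypeMonitor = 3
} HV_PORT_TYPE;

typedef struct _HV_PORT_INFO {
    HV_PORT_TYPE PortType;
    UINT32 ReservedZ;
    union {
        struct {
            HV_SYNIC_SINT_INDEX TargetSint;     /* 1..15, SINT0 not allowed   */
            HV_VP_INDEX         TargetVp;       /* virtual processor to signal */
            UINT64              ReservedZ;
        } MessagePortInfo;
        struct {
            HV_SYNIC_SINT_INDEX TargetSint;
            HV_VP_INDEX         TargetVp;
            UINT16              BaseFlagNumber; /* first flag bit in the SIEF slot */
            UINT16              FlagCount;
            UINT32              ReservedZ;
        } EventPortInfo;
        struct {
            HV_GPA MonitorAddress;
            UINT64 ReservedZ;
        } MonitorPortInfo;
    };
} HV_PORT_INFO;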
An analysis of the values of the active SINT registers in the root OS shows that only three of the five system interrupt handlers (KiHvInterrupt, KiVmbusInterrupt0, KiVmbusInterrupt1) are actually involved. We could not determine for what purpose the KiVmbusInterrupt2 and KiVmbusInterrupt3 handlers were added to the kernel. They may be needed on servers with a large number of logical processors (for example, 64), but unfortunately we were unable to verify this in the test environment. The values of the SINTx registers also show that the nt!KiHvInterrupt handler (vector 0x30) is called both for interrupts generated by the hypervisor and through ports created with the TargetSint parameter equal to 1.
Windows and TLFS
As an example, consider the parameters of the ports that are created when each of the Hyper-V integration services guest components is activated. Fig. 11 shows the characteristics of the ports created for the operation of the integration services (one port per component).

Fig. 11. Ports of integration services
The root OS and the guest OS interact during the operation of the Integration Services components through the 5th element of the SIEF array, that is, the handler in the root OS will be KiVmbusInterrupt1.
The number of each subsequently created port is equal to the previous one plus 1. That is, if you disable all integration services and then enable them again, the ports created for these services will have numbers in the range 0x22 to 0x27.
You can see the port parameters either by connecting the debugger directly to the hypervisor and monitoring the data passed to the HvCreatePort hypercall handler, or by connecting the debugger to the kernel and tracking the parameters of the WinHvCreatePort function in the winhvr.sys driver.
The remaining ports that are created when the guest OS is started (their number depends on the configuration of the guest operating system) are shown in Fig. 12. The numbering is given in the order in which the ports are created when a Windows Server 2012 R2 virtual machine with the default hardware configuration is powered on.

Fig. 12. Ports created when the virtual machine starts
It is important to note that the zero SIM slot, both in the guest and in the parent OS, is reserved for messages from the hypervisor. The format of such messages is documented in the TLFS. The data transferred through the remaining slots uses a different format: VMBus messages are not documented, but the information needed to work with them is present in the LIS source code.
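For the hypervisor messages placed in SIM slot 0, the TLFS (and the LIS header files) give the following layout; each of the 16 slots on the SIM page holds one such 256-byte structure. Rendered in C approximately as follows; field names are paraphrased from those sources.

#include <basetsd.h>

/* Layout of a hypervisor message in a SIM slot, per the TLFS / LIS headers */
#define HV_MESSAGE_PAYLOAD_QWORD_COUNT 30

typedef UINT32 HV_MESSAGE_TYPE;      /* 0 = HvMessageTypeNone (empty slot) */

typedef struct _HV_MESSAGE_HEADER {
    HV_MESSAGE_TYPE MessageType;
    UINT8  PayloadSize;              /* payload size in bytes              */
    UINT8  MessageFlags;             /* bit 0: MessagePending              */
    UINT8  Reserved[2];
    union {
        UINT64 Sender;               /* partition ID for some messages     */
        UINT32 PortId;               /* port that delivered the message    */
    };
} HV_MESSAGE_HEADER;                 /* 16 bytes */

typedef struct _HV_MESSAGE {
    HV_MESSAGE_HEADER Header;
    UINT64 Payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];  /* 240 bytes */
} HV_MESSAGE;                        /* 256 bytes total */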
A few words about how the vmbusr.sys driver processes VMBus messages (see Figure 13). In the root OS such messages are handled by the vmbusr!ChReceiveChannelMessage function, which analyzes the contents of the 4th SIM slot and determines the VMBus message code. If the code is 0 (CHANNELMSG_INVALID) or greater than 0x12, the function returns the error 0xC000000D (STATUS_INVALID_PARAMETER). Otherwise, it processes the message sent by the guest or root OS. For example, when the Guest Services component is enabled, a CHANNELMSG_OFFERCHANNEL message is sent, and in response the other side replies with CHANNELMSG_GPADL_HEADER and the other messages needed to set up the new channel. It is worth noting that before processing each valid message, ChReceiveChannelMessage validates it (ChpValidateMessage), in particular checking who the sender is (root OS or guest OS) and the minimum size of the message body; each message type has its own validation conditions. Fig. 13 marks the messages that will be processed if they are sent by the guest OS (this may be useful, for example, for writing a fuzzer).

Fig. 13. VMBus messages processed by the vmbusr.sys driver
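The message codes themselves can be taken from the LIS sources (hyperv.h); every VMBus channel message placed in the SIM slot starts with this code. Paraphrased in C below; types are simplified (the original header uses kernel __u32 types), and later Windows/LIS versions add a few codes beyond CHANNELMSG_UNLOAD, which is consistent with the 0x12 upper bound checked by ChReceiveChannelMessage.

#include <stdint.h>

/* VMBus channel message codes, paraphrased from the LIS sources */
enum vmbus_channel_message_type {
    CHANNELMSG_INVALID              = 0,
    CHANNELMSG_OFFERCHANNEL         = 1,
    CHANNELMSG_RESCIND_CHANNELOFFER = 2,
    CHANNELMSG_REQUESTOFFERS        = 3,
    CHANNELMSG_ALLOFFERS_DELIVERED  = 4,
    CHANNELMSG_OPENCHANNEL          = 5,
    CHANNELMSG_OPENCHANNEL_RESULT   = 6,
    CHANNELMSG_CLOSECHANNEL         = 7,
    CHANNELMSG_GPADL_HEADER         = 8,
    CHANNELMSG_GPADL_BODY           = 9,
    CHANNELMSG_GPADL_CREATED        = 10,
    CHANNELMSG_GPADL_TEARDOWN       = 11,
    CHANNELMSG_GPADL_TORNDOWN       = 12,
    CHANNELMSG_RELID_RELEASED       = 13,
    CHANNELMSG_INITIATE_CONTACT     = 14,
    CHANNELMSG_VERSION_RESPONSE     = 15,
    CHANNELMSG_UNLOAD               = 16
};

/* Every channel message starts with this header */
struct vmbus_channel_message_header {
    uint32_t msgtype;    /* enum vmbus_channel_message_type */
    uint32_t padding;
};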
In order to understand what messages the root OS and the guest OS exchange, we will write a driver that replaces the handler addresses in the HvlpInterruptCallback array of the root OS with its own handlers. But that is a topic for the next article.
Conclusion
In the first part of the article, we analyzed the changes Microsoft made to the operating system kernel to optimize its operation in a virtual environment, insofar as they affect the operation of VMBus. This issue covered the theory; the practical part of the study will be published in the next one, so please be patient.
First published in Hacker magazine, issue 11/2014. Subscribe to "Hacker".

