Christmas gifts, part two: Spectre

Part one: Meltdown.

For all the power of the Meltdown vulnerability, this New Year's joy would not have been complete without the second part of the discovery, which is not limited to Intel processors: Spectre.

Speaking very, very briefly, Spectre is a fundamental processor vulnerability that resembles Meltdown in that it also stems from a hardware feature and exploits side channels of data leakage. Spectre is harder to exploit in practice, but it is not limited to Intel processors: it extends, albeit with nuances, to all modern processors that have a cache and a branch prediction mechanism. That is, to all modern processors.
Strictly speaking, Spectre is not a single vulnerability: two different mechanisms were announced at the outset (CVE-2017-5753 and CVE-2017-5715), and the authors note that there may be many more, less obvious variants.

At its core, Spectre is similar to Meltdown, since it also relies on the fact that during speculative execution the processor can run instructions that it would not run under strictly sequential (non-speculative) execution, and although the results of those instructions are later discarded, their footprint remains in the processor cache and can be exploited.

Processors contain a Branch Prediction Unit, whose task is to estimate the probability that execution will take one path or the other after a condition, before the condition itself has been computed. The prediction unit works statistically: it accumulates data on similar branches executed so far and uses that data to predict the outcome of each subsequent branch.

For example, if the condition if (a < b), for which a and b have to be loaded long and sadly, has come out true a thousand times in a row, then on the thousand-and-first time the processor can decide with great confidence that it will be true again, before a and b have been loaded from memory and the check has actually been performed.
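The statistical behaviour described above can be sketched with the classic 2-bit saturating counter. This is a toy model under my own assumptions (the names toy_counter, toy_predict and toy_update are made up for illustration), not a description of any real predictor:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model: one 2-bit saturating counter per branch.
   States 0..1 predict "not taken", states 2..3 predict "taken". */
typedef struct {
    unsigned state; /* 0 = strongly not taken ... 3 = strongly taken */
} toy_counter;

/* Predict the outcome before the condition has been evaluated. */
static bool toy_predict(const toy_counter *c) {
    assert(c->state <= 3);
    return c->state >= 2;
}

/* Once the real outcome is known, nudge the counter toward it. */
static void toy_update(toy_counter *c, bool taken) {
    assert(c->state <= 3);
    if (taken && c->state < 3) c->state++;
    if (!taken && c->state > 0) c->state--;
}
```

After a long run of true outcomes the counter sits in its top state, and a single false outcome is not enough to flip the prediction; in this sketch it takes two.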

Even more interesting, processors do not distinguish between the processes in which the condition is evaluated. So if such an if has come out true a thousand times in a row in the malware.exe process, the processor will assume that the first similar if in the word.exe process will also come out true.

This is quite logical: different programs can contain many very similar constructs, so it is more efficient to train the prediction unit on the entire stream than on one specific process.

In effect, I have just described a mechanism by which malware.exe can influence the course of execution of word.exe without having any formally granted rights to do so.
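A minimal sketch of why this works, assuming a prediction table that is indexed only by the low bits of the branch address. The table size, the names and the pid parameter are all hypothetical; the pid is passed in only to show that this model never looks at it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SHARED_SLOTS 16u /* hypothetical table size */

/* One 2-bit counter per slot; note there is no process ID anywhere. */
static unsigned shared_table[SHARED_SLOTS];

static unsigned shared_slot(uintptr_t branch_addr) {
    return (unsigned)(branch_addr % SHARED_SLOTS);
}

/* 'pid' is accepted only to show that the model ignores it. */
static bool shared_predict(int pid, uintptr_t branch_addr) {
    (void)pid;
    return shared_table[shared_slot(branch_addr)] >= 2;
}

static void shared_train(int pid, uintptr_t branch_addr, bool taken) {
    (void)pid;
    unsigned *s = &shared_table[shared_slot(branch_addr)];
    assert(*s <= 3);
    if (taken && *s < 3) (*s)++;
    if (!taken && *s > 0) (*s)--;
}
```

If process 1 trains a branch at some address a thousand times, process 2 asking about a branch that lands in the same slot gets the prediction that process 1 prepared for it.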

Until January 3 of this year this was considered harmless: after all, if something goes wrong in word.exe, the processor eventually recognizes the speculative branch as invalid, resets the pipeline to its original state and recomputes everything from scratch, this time sequentially. Word.exe will not even notice anything, apart from a slight unevenness in the pace at which the processor executes its instructions.

Up to this point everything is still similar to Meltdown, but from here on the differences begin. As we remember, Meltdown does not work on processors that check the process's access rights to someone else's memory in time, before the sequence of instructions actually completes, even during speculative execution.

Spectre has no such problem, because Spectre involves no direct access to someone else's memory, not even speculatively. Instead, Spectre makes the attacked process (which can be either the OS kernel or another user program) give away information about the contents of its own memory.

Imagine the following code in the attacked process, where the variable x is derived from some user input that we can influence:

if (x < array1_size) { y = array2[array1[x]]; }

Now, in our own program that exploits the vulnerability, we write a construct as similar to this one as possible and execute it many, many times; each time the honestly computed condition comes out true, the array indices are entirely valid, and in general everything is fine. The branch prediction unit thus gathers statistics saying that this construct always evaluates to true, so on meeting it there is no need to wait for the condition to be computed, and execution can go straight to the body.

And now we pass the attacked program input such that x suddenly lands somewhere far beyond the bounds of array1. Without speculative execution, the processor would evaluate the condition x < array1_size, find it false and jump past the body. But speculative execution is there, and the prediction unit tells the processor that x < array1_size will almost certainly hold, so while the value of array1_size is slowly and sadly being pulled from somewhere in memory in order to actually perform the comparison, the processor starts executing the body of this piece of code.

An important point of the attack, by the way, is the “slowly and sadly” part: if array1_size is sitting ready in the cache, the processor may not bother with speculation at all and simply compute the condition quickly. So array1_size must be in RAM, from where it takes a long time to fetch.

The value of x is chosen so that array1[x] points to the address in the attacked program's memory that we want to read. Suppose the value k is stored there; we also need this value to have been loaded into the cache beforehand.

The latter is often not very hard to achieve: if, for example, we want to steal a private key from an encryption program, it is enough to ask the program beforehand, entirely legitimately, to encrypt something; it will then access the key itself, and the processor will accordingly pull the key into the cache. Let me remind you that the other condition was that the remaining variables involved must not be in the cache, but this can be arranged by simply clogging the cache with rubbish from the attacking process, or even by issuing an instruction that forcibly flushes the cache, if the attacked processor has one.
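The "clogging the cache with rubbish" step can be sketched in portable C as a walk over a buffer larger than the cache. The 32 MB buffer size and the 64-byte line size below are my assumptions, not values from any specification, and the function name is made up; on x86 the dedicated flush instruction mentioned above is CLFLUSH, which this sketch deliberately does not use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define EVICT_BYTES (32u * 1024u * 1024u) /* assumed to exceed the last-level cache */
#define LINE_BYTES 64u                    /* assumed cache line size */

/* Touch one byte per cache line of a large buffer so that older lines
   (such as the victim's array1_size) are likely pushed out of the cache.
   Returns the number of lines touched, or 0 if allocation failed. */
static size_t evict_by_walking(void) {
    volatile unsigned char *buf = malloc(EVICT_BYTES);
    if (buf == NULL) return 0;
    size_t touched = 0;
    for (size_t i = 0; i < EVICT_BYTES; i += LINE_BYTES) {
        buf[i] = (unsigned char)i; /* the write brings the line into the cache */
        touched++;
    }
    assert(touched == EVICT_BYTES / LINE_BYTES);
    free((void *)buf);
    return touched;
}
```

Whether a single pass really evicts a given line depends on the cache size and replacement policy of the particular processor, so a real attack would have to tune this.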

So, everything is arranged so that the processor reads array1[x], which will be equal to k, and that is what it does. Since k is in the cache, the processor gets it almost instantly, uses it as an index into array2 and requests the corresponding value array2[k] from RAM.

After this intricate chain, the value of array1_size finally arrives, the processor evaluates the condition, finds it false and throws away the results of all the computations described above. It has numbered the kingdom and put an end to it. Everywhere except the cache.
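The whole chain can be replayed in a toy software model. To be clear, this is a simulation under my own assumptions and not an exploit: sim_memory stands in for the victim's address space, sim_cached records which entries of array2 have been pulled into the "cache", and every sim_ name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not real hardware: array1 occupies the first 16 bytes of
   sim_memory, and a secret lives somewhere past its end. */
static uint8_t sim_memory[64];
static const size_t sim_array1_size = 16;
static uint8_t sim_array2[256];
static bool sim_cached[256]; /* which array2[...] entries are "in the cache" */
static int sim_y = -1;       /* architectural result, -1 = never written */

/* 'predicted_taken' is what the trained predictor says about x < array1_size. */
static void sim_victim(size_t x, bool predicted_taken) {
    assert(x < sizeof sim_memory);
    bool actually_taken = x < sim_array1_size;
    if (predicted_taken || actually_taken) {
        uint8_t k = sim_memory[x]; /* array1[x], possibly out of bounds */
        sim_cached[k] = true;      /* loading array2[k] leaves a cache footprint */
        if (actually_taken) sim_y = sim_array2[k]; /* committed only if the check passes */
    }
    /* On a misprediction the result is rolled back; sim_cached is not. */
}
```

Calling sim_victim with an out-of-bounds x and a "trained" prediction leaves sim_y untouched, yet sim_cached now has exactly one entry set, at the index equal to the secret byte.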

What is especially cynical is that the very possibility of the attack was handed to us by an array bounds check, which is there to keep the code secure.

It does not get much simpler from here, since we now have to find out what ended up in the cache, and it got there on behalf of another process; that is, unlike with Meltdown, we cannot touch that memory directly.

Nevertheless, there are ways here too. For example, if array2 has enough valid indices (at least k of them), and we can get the program to read it from outside in a more or less direct way, then the read at index k will be faster than at the other indices, since that element is already cached.
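The last step boils down to picking the fastest access. A minimal sketch, assuming the access times for every candidate index have already been measured somehow (how to measure them is exactly the hard part and is not shown); the function name is made up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Given one measured access time per candidate index of array2 (units are
   arbitrary), return the index with the fastest access, i.e. the likely k. */
static size_t fastest_index(const uint64_t *latency, size_t count) {
    assert(latency != NULL && count > 0);
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        if (latency[i] < latency[best]) best = i;
    }
    return best;
}
```

In practice the measurements are noisy, so one would probably repeat the whole attack many times and take the index that wins most often rather than trust a single pass.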

As is easy to see, the method is neither simple nor straightforward to implement. However, it is feasible when the attack targets specific software that is known to the attacker and, ideally, available in source form, in the same version and on the same system as the one to be attacked.

Unlike Meltdown, this method may leave traces in the system if the attacked program, for example, logs the input data that causes the out-of-bounds index in the array used for the attack.

Just as with Meltdown, the attacking program needs no special privileges beyond the ability to run on the attacked system. In theory the attack can even be carried out from a JS script in the browser, or from other interpreted languages, as long as they offer a timer accurate enough to tell a cache read from a RAM read.

That was the first Spectre variant, for which I find it hard to come up with a short name. The second one practically begs for a simple name familiar to the Russian ear: Gadgets.

No, the gadget in this case is not your iPhone. A gadget is something that can be used to pull your passwords out of your iPhone without your knowledge.

A gadget is a sequence of instructions in the address space of the attacked program that can be used for the attack. The task of such a sequence is to leak data, not directly but through the cache, using the mechanism described above. The sequence can therefore be quite short, and it need not be connected with any software vulnerability at all: let me remind you that no direct leakage of data outside the attacked program's memory takes place.

An important point: this sequence is neither created nor injected by the attacker, so, again, de jure no intrusion into the attacked program takes place. The attacker simply finds the piece of code he needs in the body of the attacked program or in any of the libraries it has loaded. Moreover, in some cases no prior analysis of the software is even needed: one can look for the desired sequence in situ, directly on the attacked system, in the commonly used system libraries, on the logical assumption that the attacked program uses those libraries too.

The researchers at Google went as far as using BPF. This is a mechanism that exists in Linux and FreeBSD and lets user applications attach their own filter to the kernel, for example to monitor I/O streams. In this case, of course, it does not matter at all what the filter does; what matters is that somewhere inside it there is the sequence of instructions we need.

Nota bene: this gave rise to the claim that the Spectre vulnerability does not apply when BPF is turned off. That is not so.

To divert execution of the attacked program to the desired sequence, an approach similar to the training of the branch prediction unit described above is used: the processor has a similar branch target prediction unit, which tries to guess the address that the next indirect branch instruction will jump to (we all remember these instructions from Meltdown, but here they play a different role).

To keep things simple, this unit performs no translation between virtual and physical addresses, which means it can be trained in the attacker's address space to perform certain actions in the address space of the attacked program.

Suppose we know that the instruction we need in the attacked program lies at address 123456, and that the program also contains a regularly executed indirect branch. In the attacking program we write a construct that resembles the attacked program's branch as closely as possible but always jumps to 123456. In our own address space this is, of course, an absolutely valid and legal branch. What exactly we have at address 123456 does not matter.

After a while the branch target prediction unit is absolutely sure that all branches of this kind lead to address 123456, so when the attacked program, at our prompting or on its own initiative, reaches a similar branch, the processor joyfully begins speculative execution of instructions from address 123456, this time in the address space of the attacked program.
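The training trick can be sketched with a toy branch target table. This is a simplified model under my own assumptions (256 entries, indexing by the low bits of the virtual address, no process tag, names invented for illustration); real predictors are more elaborate.

```c
#include <assert.h>
#include <stdint.h>

#define BTB_SLOTS 256u /* hypothetical number of entries */

/* Each entry remembers the last target seen for a branch. The slot is chosen
   from the low bits of the branch's virtual address; no process tag is kept. */
static uintptr_t btb_target[BTB_SLOTS];

static void btb_record(uintptr_t branch_addr, uintptr_t target) {
    assert(target != 0); /* 0 is reserved to mean "never trained" */
    btb_target[branch_addr % BTB_SLOTS] = target;
}

/* Returns the predicted target, or 0 if the slot has never been trained. */
static uintptr_t btb_lookup(uintptr_t branch_addr) {
    return btb_target[branch_addr % BTB_SLOTS];
}
```

An attacker branch at 0x801234 and a victim branch at 0x401234 share the same low bits, so after the attacker records target 123456 the lookup for the victim's branch returns 123456 as well.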

After some time the real branch target is computed, the processor realizes its mistake and discards the results of the speculative execution; however, as in all other cases of Meltdown and Spectre, traces of it remain in the cache.

And what to do with traces in the cache, you already know.

All in all, from the description of this whole mind-bending procedure it is quite obvious that Spectre is much harder to exploit than Meltdown; on the other hand, most existing processors are affected by it to one degree or another.

Who is affected?

We can assume that all processors newer than the Pentium MMX are affected, but there are nuances.

Intel processors are all susceptible.
The newer ARM cores are all susceptible. The last cores without speculative execution were the Cortex-A7 and Cortex-A53. In the wild, the Cortex-A7 is found mostly in embedded systems, from the Raspberry Pi 2 to system-on-modules based on the iMX6UL and iMX6ULL, while the Cortex-A53 powers many mid-range smartphones, in SoCs such as the Snapdragon 625, Snapdragon 410, Mediatek MT6752 and so on.
AMD processors, according to the company, are “virtually unaffected” by the gadget-based attack, officially called Branch Target Injection or Indirect Branch Poisoning.
There is no information about other cores, but most likely all of them are susceptible to the first Spectre variant (Bounds Check Bypass), while the second depends on how branch target prediction is implemented in the specific architecture.

Why AMD is “virtually unaffected” by the attack through redirected indirect branches, the company does not disclose. The MMU cannot be involved here, since all accesses happen strictly within the address space of the attacked program. One can assume that AMD uses a different branch target prediction mechanism, possibly one that is warier about carrying such predictions over between different processes. Characteristically, AMD does not claim that the attack is completely impossible, only that its probability is "almost zero".

At the same time AMD, just like Intel and ARM, is susceptible to the first type of Spectre attack, the one that works by training the branch prediction unit.

Is it true that AMD is susceptible to the second type of Spectre attack only on Linux and only when BPF is enabled?

No.

The attack on Linux with BPF enabled was shown in the Google Project Zero write-up, which also noted that it could not be carried out on an AMD processor without BPF. Apparently, however, this was only because the researchers could not find, in the compiled kernel, the instruction sequences they had chosen for the attack. In practice, first, attacks can target not only the kernel but also any program running on the system and the libraries it uses, and second, the required instruction sequences can vary. So although one specific attack in one particular case could be carried out only through BPF, this has no bearing on the general question of vulnerability to Spectre attacks.

Processor manufacturers promise an easy and fast fix

First, see the remark in part one about how to treat the manufacturers' current statements.

Second, Spectre, unlike Meltdown, is not one specific attack: these are merely the two most obvious of a whole spectrum (I know that "spectre" does not translate that way, but the word just begs to be used) of sophisticated attacks that rely on deliberately training the processor units that predict program execution.

What do we do next, how do we go on living?

That is not very clear yet.

First, processor manufacturers will probably refine their architectures to rule out the known attacks or make them harder. But we will see the result only in two or three years, and besides, the current liberties in processor algorithms exist for the sake of performance: in both Spectre cases we are dealing with the processor learning to run one process faster from the example of another process, thereby effectively letting the second process influence the execution of the first.

Globally, all the problems would be solved by more careful handling of the cache, for example by wiping the cache effects of speculative execution whenever the pipeline is flushed, or by keeping such results in a separate cache and moving them to the main one only after successful execution; but both options also entail additional overhead.
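The second option, a separate cache whose contents reach the main one only after the prediction is confirmed, can be sketched as a toy model. Everything here (names, sizes, the stall-when-full policy) is my own assumption for illustration, not a description of any shipping design:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAIN_LINES 256u
#define SHADOW_MAX 8u

static bool main_cache[MAIN_LINES];     /* lines visible to later timing probes */
static size_t shadow_lines[SHADOW_MAX]; /* lines fetched speculatively */
static size_t shadow_count;

/* Speculative load: remember the line, but do not expose it yet. */
static bool shadow_fetch(size_t line) {
    assert(line < MAIN_LINES);
    if (shadow_count == SHADOW_MAX) return false; /* buffer full: stall */
    shadow_lines[shadow_count++] = line;
    return true;
}

/* Branch resolved: on success publish the lines, on misprediction drop them. */
static void shadow_resolve(bool prediction_was_right) {
    if (prediction_was_right) {
        for (size_t i = 0; i < shadow_count; i++) main_cache[shadow_lines[i]] = true;
    }
    shadow_count = 0;
}
```

The overhead is visible even in the toy: every speculative load needs a slot in the extra buffer, and a correct prediction costs an extra copy step before the data becomes generally available.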

Compiler patches are also being developed to protect against the second Spectre attack: options for gcc and llvm based on Google's proposals have already been presented.

They are based on a rather simple idea: replacing an indirect branch with a return from a function. A function return works a little differently from an indirect branch, and the branch target prediction unit does not affect it. In essence, this is a cheat. It would be a double pleasure, but, as usual, there are nuances.

First, the fix does nothing about the first type of Spectre.

Second, all magic comes with a price. Although Google's official announcement carefully avoids quantifying the overhead, in practice one “protected” indirect branch is on average ten times more expensive. The effect on a particular application depends on its structure, language and compiler: for the Linux kernel it is within 2%, for other applications it can be much more.

For this reason it is currently “officially considered” that, for an “acceptable” level of protection, it is enough to rebuild the kernel and individual critical applications, and that everything else is unlikely to be attacked.

Third, there is no universal patch for Spectre at the moment, and none is expected, for either variant. Minimal protection against the second variant requires a complete recompilation of the kernel and, in most operating systems, will probably arrive no earlier than the next major release. Protection against the first variant currently consists solely of finding and removing from the Linux kernel code the sequence used in the Google Project Zero demonstration on Intel.

Processor manufacturers have slowly begun to update their microcode, but so far no one has properly evaluated either the effectiveness of these updates or the performance they cost. Intel has released two updates: IBRS, Indirect Branch Restricted Speculation, and IBPB, Indirect Branch Predictor Barrier; as the names make easy to see, both concern the second type of Spectre attack.

TL;DR

A global flaw that is present in approximately all existing processors. It would be a complete disaster if not for the high complexity of practical exploitation, because of which hackers are unlikely to bother with it.

AMD seems to be half as exposed as the others, although it is not clear why.

TL;DR - the difference from Meltdown

Meltdown exploits a flaw in Intel and ARM processors because of which the processor ignores memory access rights during speculative execution of instructions.

Spectre exploits the way branch prediction and branch target prediction algorithms work in modern processors, because of which one process can influence the likelihood of speculative execution of instructions in another process.

Part three: did we behave well?

Source: https://habr.com/ru/post/374147/
