Thanks to Edward Snowden, more and more people now know what the NSA is and what it does. The internal presentations that have been disclosed make it obvious that the NSA spends a great deal of effort not only on collecting traffic and planting the "right" programs in the networks of Internet providers and software giants, but also on analyzing cryptographic algorithms. A 178-page document with the national security budget for 2013 was made public; it shows that $11 billion was spent on the Consolidated Cryptologic Program. What can be done with that kind of money? Quite a lot, and usefully. For example, building a giant computing center in Utah for $2 billion, right in the heart of Mormon country. The center contains 2,300 m² of server space, has its own 65-megawatt power station and 60 thousand tons of refrigeration equipment to cool it all. In 2012 it was rather covertly stated, from official lips, that the NSA had recently achieved breakthrough success in cryptanalysis and in breaking complex systems. Was the new data center needed precisely for that? Crypto guru Bruce Schneier commented on these statements and suggested that the NSA is unlikely to be able to break a modern strong cipher such as AES any time soon. He then assumed that the NSA would focus its efforts not on "honest" breaking of the algorithms, but on finding vulnerabilities in the way those algorithms are implemented. Bruce identified several areas where success can be achieved:
- attacks on the key generation procedure, where less-than-perfect random number generators are at work
- attacks on a weak link in the data transmission path (for example, the channel itself is well protected, but the network switch is not)
- attacks on ciphers with weak keys that are still in use somewhere through the oversight of system administrators (a good candidate is RSA with a 1024-bit key)
- side-channel attacks
Let's try to figure out what side-channel attacks are.
A great variety of papers and studies is devoted to analyzing the security of various cryptosystems. Here "cryptosystem" is understood very broadly: it can be an encryption protocol, a hardware device, or a whole software-and-hardware system with servers, subscribers, and so on. What is analyzed first is the system's ability to withstand a certain kind of attacker (in the cryptographic literature it is customary to call them Eve or Mallory), who has a certain set of knowledge and tools at their disposal. They use these to break the system: to figure out the key, read encrypted messages, forge data or digital signatures. If a potential attacker cannot reach the secret information of the system with plausible computing power in a reasonable time, the system is considered secure. Good practice in cryptography is not to invent a proprietary algorithm, but to use methods that are robust and time-tested: they have been carefully studied, and their strength is usually backed by mathematics. The question arises: if my device uses an algorithm based, say, on some proven theorem, can I rest easy (at least until quantum computers arrive)? It turns out I cannot.
Traditional security models treat the object under test as a kind of black box that performs a certain operation, such as encryption: plaintext goes in, ciphertext comes out. Let a key also be stored inside this box (alternatively, it can be supplied from outside, be permanent, or be generated per session; it does not matter). The main thing is that the key is not accessible to the outside world at all, and Eve has to make do with the standard attacker's arsenal: intercepting input and output data, the ability to feed arbitrarily large amounts of input, exact knowledge of the algorithm and its parameters, and so on.
By interacting with the outside world through radiation, power consumption and any other measurable manifestations, electronic devices, and programs running on them, can give an attacker a lot of useful information. These manifestations are called side channels (the established term in the English-language literature is "side channel"). For an attacker this information can be worth its weight in gold, since it can spare them the unpleasant prospect of exhaustive search. By analogy with picking a door lock: why should a burglar spend ages fiddling with picks if it is enough to unscrew the hinges on which the door hangs?
Origins
The first serious mention of the "benefits" of side channels can be found in the autobiography of the British MI5 officer Peter Wright, which describes an attempt to break the cipher of a rotor cipher machine installed at the Egyptian embassy in London in the 1960s. The computing power available at the time was clearly not enough, so the author suggested planting a microphone and recording the clicks the machine made every morning while a technician set it up. In this way it was possible to work out the positions of two of the three wheels, which helped break the cipher.
Side channels began to be studied comprehensively in 1995, after the publication of Paul Kocher's research, in which he demonstrated a statistical relationship between the value of the secret key and the time a crypto device spends on the exponentiation operation. Since then it has become clear that implementing cryptography on a dedicated hardware device is not as "armored" as it looks. Crypto processors occupy an important place in securing various systems. They are optimized for cryptographic operations, execute their own isolated code that does not interact with a hostile environment, and keep the secret key in their own memory, which often has tamper protection: if unauthorized reading is detected, the key material is destroyed. With the development of side-channel analysis, crypto processors no longer look so impregnable, for at least two reasons:
- they are everywhere (bank cards with a chip, SIM cards, DRM chips in tablets and laptops, tokens: we run into all of these every day)
- mounting a side-channel attack does not require huge computing power or super-expensive equipment; for example, a digital oscilloscope with 100 MHz bandwidth costs around $2,000
Attack classification
In the literature, side-channel attacks are usually classified according to several independent criteria.
1. By interference with the computation
- passive attack: the attacker does not interfere with the operation of the device in any way and is merely an observer collecting information. In this case the device under test works exactly as it would if no attack were taking place.
- active attack: the attacker interferes with the operation of the device, acting on its external or internal interfaces, or at the very least powers the device on and initiates its operation
2. By the level of access to the device
- invasive attack: direct access to the internals of the device. This can range from simple electrical measurements on conductors to quite hardcore methods, such as creating conductive structures by ion irradiation, laser cutting of the chip, or deliberately injecting interference to influence the execution of instructions
- local non-invasive attack: the attacker measures everything that can be measured but does not interfere with the normal operation of the device. Typically power consumption and execution time are measured, along with signals on accessible conductors
- remote attack: similar to a non-invasive attack, but with no physical access to the device, so the set of measurable parameters is much narrower. On the other hand, the target of such an attack can be not only a standalone device, but also a network service or simply a program.
3. By the method of data analysis
- simple analysis: a small number of measurements is taken and each measurement is analyzed separately, trying to identify the relationship between the executed instructions and the "leaked" information. "Separately" means that no attempt is made to find correlations between different measurements, nor to see how changes in the input data affect the observed data. The main problem with this method is separating actual manifestations of the secret information from useless noise.
- differential analysis: an attempt to identify the relationship between the input data and the observed data. This is achieved by taking a large number of measurements and then analyzing the results statistically. The statistics help cancel out the influence of noise, separating the wheat from the chaff. Here the attacker starts from a certain simplified model of the device of interest:

Practice - a timing attack on RSA encryption
Let us describe a popular and well-studied timing attack on RSA encryption from the literature. The attack is based on precisely measuring the time spent on crypto operations. It is assumed that the attacker has the device itself, the equipment needed to feed data to its input and take time measurements, and knows for certain that the device uses RSA.
Timing attacks are possible in principle because the device under attack spends different amounts of time on crypto operations depending on the input data (be it ciphertext or plaintext) and the key. This difference is amplified by optimization: the processor "strives" to optimize as much as possible, and as a result some operations can complete much faster than theory would suggest.
As is known, the RSA algorithm (and Diffie-Hellman as well) is based on the modular exponentiation operation. I propose the following notation:
- m - plaintext (from message)
- c - ciphertext (from cipher)
- {e, n} - public key (e from encryption: used when encrypting)
- {d, n} - private key (d from decryption: used when decrypting)
- w - key width in bits (from width)
- d[i] - the i-th bit of the key
Then the decryption process will look like this:
m = c^d mod n
Here n is publicly available, as it is part of the public key, and c can be obtained by eavesdropping on the data channel. Our goal is to find d.
There are various algorithms for computing this formula, since computing it head-on is too expensive. Suppose our crypto device uses the simplest fast exponentiation algorithm, often called binary exponentiation or, as in foreign sources, square-and-multiply (exponentiation by squaring). Here is its pseudocode:
int squareAndMultiply(c, d, n) {
    R = array(0..w-1)
    s = array(0..w)
    s(0) = 1
    for (k = 0; k < w; k++) {
        if (d[k] == 1)
            R(k) = (s(k) * c) mod n    // multiply step: only for 1 bits
        else
            R(k) = s(k)                // plain assignment for 0 bits
        s(k+1) = (R(k) ^ 2) mod n      // squaring step: every iteration
    }
    return R(w-1)
}
Obviously, iterations for zero bits of the key take less time than for one bits: a zero bit requires only an assignment, while a one bit also requires a modular multiplication. Let us show how the whole key can be computed by measuring the execution time of the squareAndMultiply function. We will not give exact formulas, only the general idea.
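To make the leak concrete, here is a minimal Python sketch of the same routine (my own illustration with made-up numbers, not code from any real device), instrumented to count the modular multiplications spent on each key bit: a 1 bit costs one multiplication more than a 0 bit, and that difference is exactly what a timing measurement picks up.

# Sketch: square-and-multiply instrumented to show the per-bit cost difference.
def square_and_multiply(c, d_bits, n):
    """d_bits[0] is the most significant key bit, as in the pseudocode above."""
    s = 1
    cost = []                        # modular multiplications spent on each bit
    r = s
    for bit in d_bits:
        if bit == 1:
            r = (s * c) % n          # multiply step: only for 1 bits
            cost.append(2)           # multiply + the squaring below
        else:
            r = s                    # plain assignment for 0 bits
            cost.append(1)           # only the squaring below
        s = (r * r) % n              # squaring step, done for every bit
    return r, cost

# Quick check against Python's built-in modular exponentiation (d = 0b1011):
m, cost = square_and_multiply(123456789, [1, 0, 1, 1], 2**61 - 1)
assert m == pow(123456789, 0b1011, 2**61 - 1)
print(cost)                          # [2, 1, 2, 2] -- the key bits show through the costs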
The method proposed by Kocher is to compute the key bits iteratively: first bit 0, then bit 1, and so on, where each iteration has the following properties:
- some bits have already been computed (or none, if this is the first iteration)
- the remaining bits, including the bit currently under study, are unknown, but their values are uniformly distributed (if that is not the case, it is time to retire the random number generator used to generate the key)
At each iteration the attacker takes a large number of measurements, each aimed at obtaining three values:
- T1: the total execution time of the function
- T2: the time the system spends processing the already-known key bits
- T3: the time attributable to the (s(k) * c) mod n operation in our algorithm
T1 is the easiest to measure: just feed a ciphertext to the input. T2 and T3 are harder, but still realistic if you know the details of the physical implementation of the device or "prepare" it slightly. Then the time spent processing all the unknown bits that follow the bit under study is:
- T = T1 - T2, if the bit under study is 0
- T = T1 - T2 - T3, if the bit under study is 1
Having taken many measurements for the bit under study and accumulated a whole "pile" of T values, we can compute the probabilities that this bit is 1 or 0, respectively, using conditional probability formulas.
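A rough Python sketch of the decision step for a single bit, under strong simplifying assumptions (the lists T1, T2, T3 are hypothetical measurement data obtained as described above): for each hypothesis we subtract the time it would explain and keep the hypothesis that leaves the residuals less spread out, since subtracting a real component of the execution time reduces the variance, while subtracting an unrelated quantity only increases it.

# Sketch: pick the value of the current key bit from timing residuals.
from statistics import pvariance

def guess_next_bit(T1, T2, T3):
    """T1: total times; T2: times of the already-known bits;
    T3: times the multiply step would take for each ciphertext (model-based)."""
    resid_if_0 = [t1 - t2 for t1, t2 in zip(T1, T2)]               # bit = 0 hypothesis
    resid_if_1 = [t1 - t2 - t3 for t1, t2, t3 in zip(T1, T2, T3)]  # bit = 1 hypothesis
    # The correct hypothesis removes a real component of the execution time,
    # so the remaining "unknown bits + noise" residuals have the smaller variance.
    return 1 if pvariance(resid_if_1) < pvariance(resid_if_0) else 0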
We have just examined the simplest exponentiation algorithm suitable for a timing attack. In modern cryptosystems it is rarely used; more optimized methods are applied instead, typically an algorithm based on the Chinese Remainder Theorem, Montgomery multiplication, or the Karatsuba algorithm. But even these "advanced" algorithms, when applied in their pure form, are vulnerable to timing attacks, as a successful attack on an OpenSSL server demonstrated.
Attack on an OpenSSL server
Initially it was thought that timing attacks were a concern only for hardware devices: the attacker would take, say, a smart card, wire it up with sensors and instruments, and extract the secret keys. But as David Brumley and Dan Boneh of Stanford University showed, software is also vulnerable to timing attacks, in particular a web server accepting SSL connections using the OpenSSL 0.9.7 library. And this despite the fact that a typical server runs many parallel processes, and the network path to the server adds its own noise. Nevertheless, three attack variants were carried out successfully:
- a client attack on an Apache web server accessible over a local network; for greater realism, the channel between the client and the server included several routers and switches
- an attack by one process on another, both running on the same machine
- an attack on a distributed key storage system in which the private key is kept on an isolated virtual machine; here the web server does not decrypt the data itself, but queries the key machine
The SSL/TLS protocol is described in detail in RFC 6101 (SSL v3.0), 2246 (TLS v1.0), 4346 (TLS v1.1), 5246 (TLS v1.2) and 6176 (prohibiting SSL v2.0), so we will focus only on the part that is the target of the attack. A typical handshake includes the following steps:
- The client sends a ClientHello message containing a 28-byte random number and a list of supported ciphers.
- The server replies with a similar ServerHello message.
- The server sends a Certificate message with its certificate.
- The client, having received the server certificate, extracts the server's public key from it, uses it to encrypt a 48-byte random number, and sends the result to the server in a ClientKeyExchange message.
- The server decrypts the client's random number with its private key, after which the shared master secret is computed.
Note that in step 4 the block is formatted according to the PKCS#1 standard (block type 2):
[0x00] [0x02] [padding string] [0x00] [data]
after which it is encrypted.
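For reference, here is a minimal Python sketch of how such a block can be assembled (illustration only, with my own function and parameter names; real implementations live in crypto libraries). The padding bytes must be nonzero, and a zero byte separates the padding string from the data.

# Sketch: PKCS#1 v1.5 "type 2" padding of a block before RSA encryption.
import os

def pkcs1_v15_pad(data: bytes, modulus_len: int) -> bytes:
    pad_len = modulus_len - len(data) - 3
    if pad_len < 8:
        raise ValueError("data too long for this modulus")
    padding = b""
    while len(padding) < pad_len:
        chunk = os.urandom(pad_len - len(padding))
        padding += bytes(b for b in chunk if b != 0)   # padding bytes must be nonzero
    # [0x00] [0x02] [nonzero random padding] [0x00] [data]
    return b"\x00\x02" + padding + b"\x00" + data

block = pkcs1_v15_pad(os.urandom(48), 128)   # 48-byte pre-master secret, 1024-bit key
assert len(block) == 128 and block[:2] == b"\x00\x02"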
One iteration of the attack consists of sending probe data to the server in place of the encrypted block. After decryption the server naturally discovers that the data is not formatted according to PKCS#1. Aborting the handshake immediately at this stage would open a vulnerability (the one exploited by Daniel Bleichenbacher of Bell Laboratories), so modern SSL/TLS implementations "pretend" that there is no error and continue the handshake. As a result, the client and the server compute different master secrets, which will surface in the subsequent messages: one way or another, the client receives an Alert message and the connection is terminated, but that is of little interest to us. The main thing is that the time is measured from the moment ClientKeyExchange is sent until a response is received from the server. Recall that N = p·q is the RSA modulus, and that the private and public exponents are related by d·e ≡ 1 mod (p-1)(q-1). After a series of such measurements, q can be recovered bit by bit, and hence the private key can be found. Where do such miracles come from? An exact proof would require a whole pile of mathematical formulas that are beyond the scope of this article; instead I will describe the two general principles that underlie the analysis.
To begin with, note that OpenSSL uses the binary algorithm described earlier for modular exponentiation, but with a number of optimizations. First, by applying the Chinese Remainder Theorem, the problem m = c^d mod N is split into two subproblems, m1 = c1^d1 mod p and m2 = c2^d2 mod q (where c1 = c mod p, d1 = d mod (p-1), and similarly for q), and then m is obtained from m1 and m2 using a special formula. Second, the modular multiplication used in the algorithm is optimized by the Montgomery method. The essence of the method is to move away from calculations modulo the original number and instead work modulo a power of 2, which is much faster for the processor. Both factors are first converted into a special Montgomery form, then the fast multiplication is performed, after which the result is converted back to the usual form. All would be well, but in 2000 the German professor Werner Schindler discovered that the closer the input g (the attacker's probe value) is to a multiple of q, the longer the whole computation takes. This is the first principle: by measuring the time of the Montgomery computation, one can conclude that g is close to q, 2q, 3q, and so on.
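Here is a simplified Python sketch of Montgomery reduction (illustration only; the variable names and the toy numbers are mine, and real implementations work word by word). The point to notice is the final conditional subtraction, the so-called extra reduction: how often it fires depends on how close the operands are to a multiple of q, and that is the timing signal Schindler's observation exploits.

# Sketch: Montgomery reduction modulo q, with the data-dependent extra reduction.
def montgomery_reduce(T, q, R, q_inv_neg):
    """Return (T * R^(-1) mod q, whether the extra reduction fired)."""
    m = ((T % R) * q_inv_neg) % R
    t = (T + m * q) // R
    extra = t >= q
    if extra:                        # the data-dependent final subtraction:
        t -= q                       # its frequency grows as inputs approach
    return t, extra                  # a multiple of q -- the timing signal

# Tiny usage example with toy numbers:
q = 101
R = 128                              # power of two larger than q
q_inv_neg = (-pow(q, -1, R)) % R
a_mont = (57 * R) % q                # 57 in Montgomery form
b_mont = (23 * R) % q                # 23 in Montgomery form
prod, extra = montgomery_reduce(a_mont * b_mont, q, R, q_inv_neg)
assert prod == (57 * 23 * R) % q     # product, still in Montgomery form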
Moving on. The plain multiplication itself (without the modular reduction) is implemented in OpenSSL in two ways: the ordinary schoolbook method and the Karatsuba method. The complexity of the ordinary method is O(n·m), where n and m are the sizes of the factors. The Soviet scientist A. A. Karatsuba was the first to invent a fast multiplication method, with complexity O(n^1.58), and this is what OpenSSL uses when the factors are of the same size. From this follows the second principle: if g is slightly less than a multiple of q, then g mod q is almost as large as q itself, the factors have equal size, and the fast method is used; if g is slightly more than a multiple of q, then g mod q is small, the factors differ in size, and the multiplication takes longer.
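And a short recursive sketch of Karatsuba multiplication in Python (illustration only; OpenSSL splits numbers by machine words, here we split by bits for simplicity): three recursive multiplications instead of four is what gives the O(n^1.58) complexity.

# Sketch: Karatsuba multiplication of large integers.
def karatsuba(x, y):
    if x < 2**32 or y < 2**32:               # small enough: multiply directly
        return x * y
    n = max(x.bit_length(), y.bit_length()) // 2
    xh, xl = x >> n, x & ((1 << n) - 1)      # split both numbers in half
    yh, yl = y >> n, y & ((1 << n) - 1)
    a = karatsuba(xh, yh)
    b = karatsuba(xl, yl)
    c = karatsuba(xh + xl, yh + yl) - a - b  # the trick: 3 multiplications, not 4
    return (a << (2 * n)) + (c << n) + b

import random
x, y = random.getrandbits(1024), random.getrandbits(1024)
assert karatsuba(x, y) == x * y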
The authors of this attack on OpenSSL demonstrated that about a million requests, taking roughly 2 hours, were enough to recover a 1024-bit key. But do not rush to panic: starting with version 0.9.7b, OpenSSL contains protection against timing attacks. The protection, known as blinding, consists of computing x = r^e · c mod N before the decryption operation itself, where r is a random number, e is the public exponent, and c is the ciphertext; the exponentiation is then performed on x, and the blinding factor is removed from the result afterwards. Since the input of the exponentiation no longer depends on the attacker-chosen ciphertext, key information stops "leaking" into the side channel. The price of the protection is a 2% to 10% loss of performance.
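A Python sketch of this blinding with a toy key (illustration only; the function and variable names are mine): the exponentiation runs on r^e·c instead of c, and the blinding factor is divided out at the end, so the measured time no longer correlates with the attacker-chosen ciphertext.

# Sketch: RSA decryption with blinding, using a textbook toy key.
import secrets

def blinded_decrypt(c, d, e, N):
    while True:
        r = secrets.randbelow(N)
        if r > 1:
            try:
                r_inv = pow(r, -1, N)        # need gcd(r, N) == 1
                break
            except ValueError:
                continue
    x = (pow(r, e, N) * c) % N               # blind the ciphertext
    m_blinded = pow(x, d, N)                 # the "leaky" exponentiation runs on x
    return (m_blinded * r_inv) % N           # unblind: (r * m) * r^(-1) = m

# Toy key: p = 61, q = 53, N = 3233, e = 17, d = 2753
N, e, d = 3233, 17, 2753
m = 1234
c = pow(m, e, N)
assert blinded_decrypt(c, d, e, N) == m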
Power Attacks
Let us turn to another class of side-channel attacks: attacks on power consumption. In practice these attacks are applied mainly to hardware cryptographic modules, and the more isolated and narrow the function the module performs, the more successful the attack. Obviously, different instructions executed by the processor consume different amounts of energy, ultimately determined by the number of switching transistors. (From a physics course we all know that MOS transistors draw current mainly at the moment of switching, while at rest their current is negligible, which cannot be said of TTL logic.) Consequently, instructions or groups of instructions can be identified on a power consumption trace. Moreover, the same Kocher showed that by analyzing the power traces of smart cards one can extract the secret key.
Here is the power trace of a DES encryption of a single block: the initial permutation, the 16 rounds of encryption, and the final permutation are visible.

And here is a detailed trace of the 2nd and 3rd rounds:

Power consumption attacks are usually classified as simple and differential (SPA, Simple Power Analysis, and DPA, Differential Power Analysis). Traces like the ones above are analyzed in SPA attacks, where the main goal is to match parts of the trace to the executed instructions or functions and somehow determine the values that end up in the device's registers. An SPA attack assumes detailed knowledge of the internal implementation of the device. A DPA attack, on the other hand, is based on a statistical analysis of the results of many measurements. For example, the classic attack on a DES smart card, described by Kocher, looks like this:
- record the power consumption of the last few rounds for 1,000 DES encryptions, and record the result of each encryption as well
- split the trace of each encryption into 100,000 equal segments and assign each segment its power consumption
- analyze the resulting 1000 x 100000 matrix with statistical methods and recover the key, which has remained unchanged throughout the measurement period (for the exact algorithm the reader is referred to the original paper, since the article format is poorly suited for multi-storey formulas; a sketch of the statistical step follows this list)
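The core of that statistical step can be sketched in a few lines of Python. The data layout (the matrix of traces plus the recorded ciphertexts) follows the list above; selection_bit is a hypothetical helper standing in for the DES-specific prediction of one intermediate bit from the ciphertext and a guessed key fragment.

# Sketch: the classic DPA "difference of means" step (illustration only).
import numpy as np

def dpa_difference_of_means(traces, ciphertexts, key_guess, selection_bit):
    """traces: N x M array of power samples; ciphertexts: N recorded outputs."""
    group0, group1 = [], []
    for trace, ct in zip(traces, ciphertexts):
        if selection_bit(ct, key_guess) == 1:   # hypothetical DES-specific predictor
            group1.append(trace)
        else:
            group0.append(trace)
    # For the correct key_guess the split correlates with the real power
    # consumption, so the differential trace shows a clear spike at the moment
    # the target bit is handled; for wrong guesses it stays flat.
    return np.mean(group1, axis=0) - np.mean(group0, axis=0)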
Countermeasures
We have examined several kinds of side-channel attacks. It is now understood everywhere that such attacks must be taken into account already at the design stage of a cryptosystem. There are several directions:
- Creating obstacles to reading the side-channel information. A very limited measure unless it prevents physical access to the device altogether; in other cases it merely forces the attacker to use more expensive data-collection equipment, and if that cost exceeds the value of the break itself, the goal is achieved
- Mixing random noise into the "leaky" signals. This is a more effective measure, but it has its own pitfalls. First, obtaining random data in a computer system is a problem in itself: where to get it, and in sufficient quantity. If the noise is "not quite" random, its patterns can be worked out and the noise separated from the useful signal. Second, even for "good" random noise an attacker may find an averaging algorithm, and the more data they gather, the greater the likelihood of success
- Trying to make the system deterministic. For example, making all operations take the same time, so that a timing attack becomes useless. This is also not easy to achieve: processor and compiler optimizations and several levels of cache with different access times all enter the arena and play into the attacker's hands
- Designing cryptoalgorithms and protocols that are resistant to side-channel leakage from the start; this is the most reliable countermeasure. For example, using an algorithm for computing a given formula whose execution time depends neither on the width of the input data, nor on the number of zeros and ones in them, nor on any other property of the arguments (a simple sketch of this idea follows this list)
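As a simple illustration of the last two points, here is a sketch of a "square-and-always-multiply" variant of the earlier exponentiation routine: a multiplication is performed on every iteration regardless of the key bit, so the number of operations no longer depends on the key. This is a sketch only; real constant-time code must also avoid key-dependent branches and memory accesses, cache effects, and a non-constant-time multiplier.

# Sketch: square-and-always-multiply, with a key-independent operation count.
def square_and_always_multiply(c, d_bits, n):
    s = 1
    for bit in d_bits:                   # most significant bit first
        s = (s * s) % n                  # square on every iteration
        t = (s * c) % n                  # multiply on every iteration
        s = t if bit == 1 else s         # keep the product only for 1 bits
    return s

assert square_and_always_multiply(123456789, [1, 0, 1, 1], 2**61 - 1) == \
       pow(123456789, 0b1011, 2**61 - 1)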
A lot of scientific papers are devoted to designing side-channel attacks. And one should by no means think that only the older cryptoalgorithms, such as RSA, DES and Diffie-Hellman, are affected: there is already evidence of "leaks" from AES and from elliptic curve algorithms. Covering all of this within a single article is impossible. My goal is to get readers interested enough to dig deeper into this fascinating topic.
For those who want to dig deeper