RSA, is it really that simple?

Prelude

Good day, dear readers.
Most likely, most of you know what the asymmetric RSA encryption algorithm is. In fact, so many articles are devoted to this issue on the whole Internet and on this resource in particular, that it is almost impossible to say something new about it.
Well, there, by God, you can still come up with and so everything is understood long ago. The recipe is simple:
Two prime numbers P and Q.
Multiply to get the number N.
Choose arbitrary E.
Find D = E ^-1 (mod ( P-1 ) ( Q-1 )).
To encrypt a message, M is raised to the power of E modulo N. To decrypt the cryptotext C to the power of D along the same modulus N. All crypto-primitive is ready. We take and use, right? In fact, not so. The thing is that this is really nothing more than a crypto-primitive and in the real world everything is a bit more complicated.

A bit of theory

In order to understand why it is extremely not recommended to use RSA in its most simple form, we first note the following requirement put forward to asymmetric cryptosystems.
Requirement 1:
A modern asymmetric cryptosystem can (but this is not a fact) be considered resistant if an attacker, having two plaintext M ₁ and M ₂ , as well as one ciphertext C _b , with a probability greater than 0.5, cannot determine which of the two plaintext ciphertext C _b .
See if RSA meets this requirement. So let's imagine the attacker Malory listens to Alice and Bob's correspondence. On one wonderful day for himself, he sees that Bob, in the clear, asked Alice a very important question, knowing which answer would enrich, well, or at least immensely amuse Malori’s curiosity. Alice monosyllabically answers Bob to this personal question. She encrypts her response with Bob’s public key and sends a ciphertext. Further, Malory intercepts the ciphertext and suspects that either "Yes" or "No" is encrypted in it. All he now needs to do in order to find out Alice’s answer is to encrypt the word “Yes” with Bob’s public key and if the received cryptotext matches the intercepted, it means that Alice answered “Yes”, otherwise the attacker will understand that it was no.

As can be seen from this, let the somewhat contrived, but still not so utopian example, RSA is not as reliable as it is considered to be. Yes, of course, we can say that Alice is a fool herself and no one asked her to answer such a serious question for her in monosyllables. So now to prohibit the use of monosyllabic answers in cryptography? Of course not. Everything is not so bad, it is enough that the algorithm adds some random information to the text that it would be impossible to predict and insidious Mallory will be powerless. Indeed, in fact, he cannot predict that the answer “Yes” after processing by the algorithm will turn, for example, into “Yes4FE6DA54”, and therefore he will not be able to encrypt this message and will have nothing to compare with the intercepted cryptotext.

Thus, we can already say that RSA in all its manifestations, whether it be PGP or SSL, does not encrypt only the data sent to the encryption function. The algorithm first adds to this data blocks containing a random set of bits. And only after that the result is encrypted. Those. instead of the usual
C = M ^E (mod N )
we get closer to reality
C = (M || Rand) ^E (mod N ),
where rand is a random number.
This technique is called add-on schemes. Currently, using RSA without add-on schemes is not so much a bad tone as a direct violation of standards .
')
But that's not all. It is believed that even if a crypto system satisfies the requirement formulated above, this does not prove its suitability for practical purposes. We formulate another requirement for the stability of an asymmetric algorithm.

Requirement 2:
Let the malefactor have access to the decoding "black box". Those. Any cryptotext at the request of the attacker can be decrypted. Next, the attacker creates two clear text M ₁ and M ₂ . One of these texts is encrypted and the resulting cryptotext C _{b is} returned to the attacker. The task of the attacker to guess with a probability greater than 0.5 which of the messages M ₁ and M ₂ corresponds to the cryptotext C _b . At the same time, he may ask to decrypt any message, except C _b (otherwise the game does not make sense). It is said that a cryptosystem is resistant, if an attacker, even in such excellent conditions, cannot indicate to which source text C _b corresponds with a probability greater than 0.5.

In the light of the above, let's see how things are in RSA.
So, the attacker has two messages M ₁ and M ₂ . And also cryptotext
C _b = M ₁ ^E (mod N ).
He needs to specify which of the two texts corresponds to C _b . To do this, he can do the following. Knowing the public key E, he can create a message
C '= 2 ^E * C _b (mod N ).
Then he asks the decoding “black box” to decrypt the message C '. And then simple arithmetic to help him. We have:
M '= C' ^D (mod N ) = 2 ^ED * M ₁ ^ED (mod N ) = 2 * M ₁ (mod N ).
Those. calculating M '/ 2 the attacker will see M ₁ . And this means that he will understand that in our example the message M ₁ was encrypted, and therefore we were once again convinced of the unacceptability of using RSA in its naive form in practice.

Eliminate this trouble help add-on schemes. Only now they are being asked not only that the additional information be completely random and unpredictable. But also the fact that the additional blocks helped to determine whether the ciphertext was obtained as a result of the operation of the encryption function or was modeled by the attacker. Moreover, if it is detected that the ciphertext is modeled instead of the decrypted data, the attacker will receive a message about the inconsistency of the data with the real cryptotext.

It would seem that the implementation of such an add-on scheme is an intractable task, but after all, cryptography already has a ready-made tool for controlling data integrity. Of course, these are hash functions. All modern add-on schemes are implemented on the idea of using a hash function as a check for decrypted data for authenticity.

In RSA, when signing and encrypting data, two different add-on schemes are used. The scheme implemented for signing documents is called the RSA-PSS (probabilistic signature scheme) or probabilistic signature scheme. The scheme used for encryption - RSA-OAEP (Optimal asymmetric encryption padding) or optimized asymmetric encryption complement , using the example of OAEP and consider how message encryption actually happens in RSA.

RSA-OAEP

So, to encrypt absolutely any message in RSA-OAEP, the following is done:

Two hash functions G (x) and H (x) are selected so that the total length of the results of the hash functions does not exceed the RSA key length.
A random string of l is generated. The length of the string must be equal to the length of the result of the hash function H (x).
Message M is divided into blocks of k-bits. Then, to each received block m, append (nk) zeros. Where n is the length of the hash function G (x).
The following bit set is defined: {m || 0 ^(nk) ⊕G (l)} || {l⊕H (m || 0 ^(nk) G (l))}
The resulting bits are represented as an integer M ₁
Cryptotext receive the formula: C = M ₁ ^E (mod N )

The decryption process is as follows:

Find M ₁ by the formula M ₁ = C ^D (mod N )
In the resulting set of bits cut off the left side. In the sense: the left part is the n left bits of the number M ₁ where n is the length of the hash function G (x). We denote these bits conditionally by T. And note that T = {m || 0 ^(nk) G (l)}. All other bits are the right side.
Find H (T) = H (m || 0 ^(nk) G (l))
Knowing H (T) we get l, since we know l⊕H (T) is the right side of the block.
Calculating l, we find m from T⊕G (l) since T = {m || 0 ^(nk) G (l)}
If m ends with (nk) nulls, then the message is encrypted correctly. If not, this means that the ciphertext is incorrect, and therefore it is most likely forged by an intruder.

Conclusion

Those. RSA is not only exponentiation modulo a large number. It is also the addition of redundant data allowing for additional protection of your information. You may ask: why is all this necessary? Can it really happen that an attacker gets access to a decryption algorithm? On another occasion, it was somehow said: if any unpleasantness can occur, it will happen. Only with the use of add-on schemes this will no longer be considered a nuisance.

upd: transferred cryptography to the blog.
upd2:
Literature and links:
1. N. Ferguson, B. Schneier "Practical cryptography"
2. N. Smart "Cryptography"
3. RSA-OAEP specification (pdf)

Source: https://habr.com/ru/post/99376/

All Articles

RSA, is it really that simple?

Prelude

A bit of theory

RSA-OAEP

Conclusion

More articles: