Format preserving encryption or how to correctly encrypt credit card numbers

Hi% username%! Today we have a little Friday crypto. In March 2016, an interesting publication from NIST was numbered 800-38G (pdf) and with a very interesting naming. Two algorithms that do not change the data format when encrypting are written off. . That is, if this is the credit card number 1234-3456-4567-6678, then after encryption it will also remain a number, just different. For example 6243-1132-0738-9906. And this is not a simple xor, there is AES and everything is serious. Let's talk a little about FPE in general, and about one of the implementations in particular.

Yes, this is not some kind of focus, you can really properly encrypt and decrypt data from a limited set of input values. Of course, there are some pitfalls here, but the problem is solved in principle. The main thing is that the resulting design is not less resistant to cracking than the cryptoalgorithm that underlies it.

So, we have a predefined set of characters, for example, the numbers 0-9 and we do not want to go beyond it. To make it easier, they talk about the basis of the number system equal to the size of this initial set. For numbers, this is obviously 10. For the Russian alphabet, this will be 33, for English 26. The characters themselves are not taken into account, we work only with an array of indices, and then instead of indices we substitute characters from the original set. For example, for the string “Habr” and the Russian alphabet as an input set, the result will be an array [22, 0, 1, 17]. That was the easy part.

Comrades Black and Rogaway have come up with several ways to encrypt while preserving the format.
The simplest is the prefix cipher. Let's look at it:
')

Prefix cipher

We encrypt each index with a strong algorithm like AES.
Sort the encrypted values.
Use the sorted list as a new index list.

For example, we encrypt a set consisting of 4 elements.

Here I copy Wikipedia in order to demonstrate how it works in practice
Encrypt each index, for example, AES. We get, for example

weight(0) = 0x56c644080098fc5570f2b329323dbf62 weight(1) = 0x08ee98c0d05e3dad3eb3d6236f23e7b7 weight(2) = 0x47d2e1bf72264fa01fb274465e56ba20 weight(3) = 0x077de40941c93774857961a8a772650d

Sorting [0,1,2,3] by weight gives [3,1,2,0], so the cipher looks like this:

 F(0) = 3 F(1) = 1 F(2) = 2 F(3) = 0.

It is a working version, very simple to implement. But for large sets such a generator each time is long and therefore other algorithms were invented.

Cycling walking

If the size of the set is not much smaller than the size of the encryption block, then you can try to encrypt in a loop until the result is equal to the size of the set. Very similar to mining Bitcoins , by the way. But here the obvious minus is the unpredictable speed, which will fall the more, the smaller the set relative to the block size.

Feistel nets

In the description of Black and Rogaway, this is about the same as Cycle walking, only a Feistel network with a size equal to the size of the set is used.

The BPS algorithm, which I want to dwell on in more detail, is a further development of the idea of encrypting a limited set of a Feistel network with the difference that it uses addition modulo the size of a set . Thus, you do not need anything brutal, everything turns out very nice and pleasant. It is only necessary to work a little with baitikami.

I will give all the data here simply because in the original algorithm there are irrelevant steps like reversed bytes BE-> LE at each stage, which will only trash the description. Go!

Input data:

The number of elements in the set, it is also the base of the radix number system
Key for AES. There is still a tweak, this is type IV, but we'll omit it here for simplicity
A set of encryption indexes, it’s also our “plaintext”. Must be given a radix in one AES block

One of the most important and cunning functions here is the packaging of an array of indices into an array of bytes and unpacking back. The size of the set does not necessarily fit in 1 byte and vice versa, you can save space by packing indexes more tightly if the size allows.

This function is called num and in fact it works very simply:

 BigInteger num(int[] X, int radix) { // 1. Let x = 0. BigInteger x = BigInteger.ZERO; // type conversion for readability BigInteger r = BigInteger.valueOf(radix); // 2. For i from 1 to LEN(X) for (int i = 0; i < X.length; i++) { // check the value of X[i] if (X[i] < 0 || X[i] >= radix) throw ... // let x = x * radix + X[i] x = x.multiply(r).add(BigInteger.valueOf(X[i])); } return x; }

That is, we add to our large intu index multiplied by the base of the number system. And then we convert BigInteger into an array of bytes with standard tools and add zeroes to the required size, if necessary.

Further I will denote this operation num(x, radix)

Feistel Network

Since the construction is based on the Feistel network, we divide the initial set into 2 halves and work separately with each of them, changing places at the end of the round. Let's call these halves A and B.

We encrypt, for example, the combination [0 1 2 3] with the base (radix) 4.

A = [0 1]
B = [2 3]

Consider one simplified BPS round step by step:

1) We work with half B. We translate it into bytes. num (B, 4) = [14] (because 2 + 3 * 4) is padded to the block size with zeros, we get [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 , 0, 0, 0, 14]

2) Encrypt the block from AES A.1. We get, for example 3B20D6CC035B63C8CFD53C15D477BBF8

3) We translate 3B20D6CC035B73C8CFD53C15D477BBF8 to a number, we get 78594961850022499408352439646346001400

4) We translate half of A into a number, num (A, 4) 0 + 1 * 4 = 4

5) Fold up p.4 and p.3, we get 7859496185002249940835243964634600140 4

6) Now the trick.

6.1) In order to pack this value in half for the next round, we first calculate how many digits there are, raising the radix to a power equal to the number of elements in the half A
4 ^ 2 = 16

6.2) We take the remainder of the division of 7859496185002249940835243964634600140 4 by 16, we get 12

6.3) Convert 12 into an array of indices for base 4, we get [0, 3] (0 + 3 * 4)

7) Change halves in places, it turns out [2, 3] [0, 3]

And so 7 more times, only 8 rounds

In the opposite direction is not much more difficult. Just starting with A, not B

1) Repeat steps 1–3 to get 78594961850022499408352439646346001400

2) We translate [0, 3] into a number, we already know that it is 12

3) Subtract 78594961850022499408352439646346001400 from 12, we get a negative number -78594961850022499408352439646346001388, which is 16 in the module is four. Yes, negative numbers can also be taken modulo, no big deal.

4) Translate 4 into [0, 1]
Swapping halves, we get [0, 1] [2, 3]

Everything, we honestly encrypted and decrypted a half of our pleintext and did not go beyond the boundaries of the set. Agree, this is a very beautiful and elegant method.

The standard, by the way, is not described, and the BPS dock (pdf) describes a method of hooking blocks like CBC in case the size of the AES block is not enough to encrypt a single text. There, however, instead of xor, modulo addition is also used, only elementwise.

About credit cards. It is not necessary to encrypt in this way (base 10, of course) all 16 digits of the card. At least you can not touch the Cheksumma and honestly calculate it separately.
In addition, the card has an identifier of who released it (the first 6 digits), and the actual account number is the penultimate 9 digits. It makes sense to encrypt them in this way.

I have it all. This code on java, compatible with the new NIST standard, helped me to study the algorithm.

Source: https://habr.com/ru/post/311200/

All Articles

Format preserving encryption or how to correctly encrypt credit card numbers

Prefix cipher

Cycling walking

Feistel nets

Feistel Network

More articles: