Introduction to KDF on the example of solving a cryptographic rebus

One day, walking across the dark corners of the light Internet, I came across a software developer’s job with an impressive list of requirements and responsibilities with a focus on security systems for both software and hardware.

In addition to a long list of requirements, an even more fantastic list of expectations was attached: serious mathematical skills, experience in cryptography, analysis, and the like. But it was also proposed to solve the puzzle test: a coded message that needed to be decrypted.

Although the topic for me is absolutely new and I don’t come close to the requirements, but I decided, for the sake of simple interest, to try to solve the encrypted message. See what hackers are lured to work with.
')
The message was encoded in base64:
KDF function for use of this generic hash algorithm which has a 32 bytes output.
KDF output.
Here is the encrypted question: followed by the text of the encrypted message as a hexadecimal string.

KDF

The first thing that caught my eye was the unknown KDF function .

Key Derivation Function is a function that is used to obtain secret keys from some other secret value for information protection tasks.

The most common task for KDF is password hashing. Where it is necessary to complicate hacking compromised hashes.
update (thanks to frol ):
The task of KDF is to generate a conclusion that is not only difficult to reverse (the task of a hash function), but also satisfies the requirements of randomness of a sequence (this requirement comes from the encryption algorithms with which KDF is used for initialization parameters). Thus, yes, the most common KDF task is to “hash” passwords, but not for storing in the database (although it is possible already), but for further use of these “passwords” as input to encryption.

Isn't the usual sha-2 with salt suitable for backing up hashing, saving and protecting passwords?

Here KDF provides better protection for secret keys (such as passwords) than regular cryptographic hash functions. This is obtained by stretching the key ( key stretching ) and the ability to use different pseudo-random hash functions (pseudo-random hash-function).

If password hashes were obtained with ordinary cryptographic functions, like md5, sha256, and others, then dictionary attacks or brute force will happen orders of magnitude faster than if some key retrieval functions (KDF) were used based on the same hash functions general purpose.

The thing is that using the key stretching, namely, repeated hashing, allows you to control the difficulty of selecting the hash, the used memory and processor time. As a result, the repeated complication of hacking results.

When legal users need to perform a calculation only once (for example, getting a password hash and comparing with a stored value), then an attacker needs to make billions of such calculations, which, if properly configured, KDF gives almost no chance of success.

KDF Types

And now we have dealt with KDF, now we will return to the first hint.
KDF function for use of this generic hash algorithm which has a 32 bytes output.

Since KDF normally uses general-purpose cryptographic hash functions as pseudo-random, it is clear that there is some KDF that takes this sentence and uses some common function that returns 32 bytes.

What is the nature of the species KDF?
Here is a sample list:
KDF1, KDF2, KDF3, KDF4, MGF1, PBKDF-Schneier, PBKDF1, PBKDF2, bcrypt, scrypt, HKDF.

Too much, it would be necessary to exclude the obviously wrong.
After reviewing the description of each function, deleted from the list those that are either outdated and difficult to find implementation, or there is no information about pseudo-random functions.

The most modern and up-to-date remained: PBKDF2, bcrypt, scrypt, HKDF.

It seemed the most promising challenger was PBKDF2 . But the input gets, in addition to the hash function, also the salt, the number of iterations and the number of output bytes.

Then it turned out that bcrypt always requires salt to enter, and the number of iterations depends on the implementation. Since we don’t know anything about the salt from the message, bcrypt leaves the bidders.

Then comes scrypt - the most crypto-resistant modern KDF, which uses HMAC_SHA256 as a pseudo-random function (which suits us, since the output will be 32 bytes), but also takes the input salt, the number of iterations or computational complexity, block size and paralleling factor.

The HMAC-based Extract-and-Expand Key Derivation Function or HKDF also remains.

Key search

The second stage of decryption is to understand how to decipher.
The hint says: Encryption algorithm is a KDF output.

With the encryption algorithm, everything is clear - this is AES in CBC encryption mode.

For decryption, you need to know in addition to the key and the initialization vector, so what other variant of AES is used, with a block size of 128, 192 or 256 bits.

We were given a hint that the key and initialization vector (most likely) is the output of the KDF.
Have you already felt the number of unknown parameters?

Here dances begin, because functions such as PBKDF2 or scrypt can return count almost any number of bytes, and HKDF returns, depending on the pseudo-random function.

And the calculation complexity parameters for PBKDF2 and scrypt are unknown, and the salt is also unknown. And it is unclear what is the size of the AES key and what is the size of the initialization vector.

Fortunately, it turned out that the size of this vector should coincide with the size of the block. For example, for rijndael-128 (rijndael is another name of AES) - the block size is 128 bits or 16 bytes. Therefore, in this embodiment, the size of the initialization vector must be exactly 16 bytes.

It also turned out that the key cannot be greater than 32 bytes.

Decryption

Now you can proceed to the decoding itself.
And now it is clear that we need to take the result of the KDF, from 0 to 32 bytes is the key, and then, depending on the AES version, the initialization vector: 16, 24 or 32 bytes.

It turns out that the size of the result of the KDF should be no more than 64 bytes to accommodate 32 bytes of the key and 32 bytes of the initialization vector.

After checking on the basic parameters: empty salt, simple coefficients (ones), the result could not be obtained.
So I had to start to panic.

Panic

For PBKDF2, only one parameter was moved - encryption iterations, but with empty salt (since it was not clear where to get this salt).
Three parameters were scrolled for scrypt at once: block size, paralleling factor and complexity.

Despite the fact that the key and initialization vector were also selected, no results appeared.

I began to look at the hash-based message authentication code ( HMAC ) options hashes, but everywhere it was said that it is used as a pseudo-random function, and not as a KDF.
But, having tried all the options with HMAC, I did not get a result.

Light at the end

In one of the scrypt papers, I discovered that just KDFs could be used as general purpose hash functions that were not specifically designed to be keyed, which is of course bad practice.

Understanding that there can be no difficult answer in such rebuses, we must try to find the most suitable and simple option. Therefore, I decided to go through all 32 byte general-purpose hash functions, without any KDF, and just try to get the key and initialization vector from the result.

And it turned out that using rijndael-128 (16 bytes is the size of a vector and a block), the sha256 hash function, the result of which 32 bytes was divided into a key of 16 bytes, and the remaining 16 bytes per initialization vector, turned out to decrypt the original message. In which it was simply proposed to send a letter to a specific address.

With one hand I type, I wipe the other with tears. Everything turned out to be much simpler than was seen at first.

Source: https://habr.com/ru/post/259199/

All Articles