Electronic digital signature for dummies: what it is, and how not to choke. Part 1

More and more often, people who work with documents hear the words “electronic document” and, almost inseparably linked with them, “electronic digital signature”, or EDS.

This series of articles is intended to reveal the “secret knowledge”: what an EDS is, when and how it can and should be used, and what its pros and cons are.

Naturally, these articles are written not for cryptography specialists but for those who will use this cryptography, or who are just starting to study it and want to become experts. So I have tried to make the whole process as easy to understand as possible, giving analogies and working through examples.

Why do we need to sign anything at all? Naturally, to certify that we have read the content and agree (or, sometimes, disagree) with it. An electronic signature also protects the content from being changed.

So, naturally, it is worth starting with what an electronic digital signature is.
In the most primitive case, it is the result of a hash function. Wikipedia will explain what that is better than I can; for our purposes, the main points are that its result is, with a very high probability, different for different source data, and that the result is not only shorter than the source data but also cannot be used to restore the source data. The result of the function is called a hash, and applying the function to data is called hashing. Roughly speaking, you can think of a hash function as archiving that produces a very small sequence of bytes, except that you cannot restore the original data from such an “archive”.
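To make these properties concrete, here is a minimal sketch using Python's standard `hashlib` module. SHA-256 is just one possible hash function, chosen here as an assumption; the sample messages and the helper name `sha256_hex` are made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_hash = sha256_hex(b"Pay Bob 100")
other_hash = sha256_hex(b"Pay Bob 900")   # a single character differs
long_hash = sha256_hex(b"x" * 1_000_000)  # one megabyte of input

print(short_hash)
print(other_hash)
# The hash length does not depend on the size of the input.
print(len(short_hash), len(long_hash))
```

Both hashes are 64 hexadecimal characters (32 bytes) long, even though one input is a megabyte in size, and changing a single character of the message gives a completely different hash.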

So, we read a file into memory and hash it. Do we already have an EDS? Almost. With a great stretch our result can be called a signature, but it is still not a full-fledged one, because:

1. We do not know who made this signature.

2. We do not know when the signature was made.

3. The signature itself is not protected from substitution in any way.

4. And there are many hash functions: which one was used to create this particular hash?

Therefore, it is not right to apply the word “signature” to a hash; we will simply call it a hash.

You send your file to another person, say, by mail, confident that he will receive and read exactly what you sent. He, in turn, must hash the data he received and compare his result with yours. If they match, all is well. Does this mean that the data is protected? No.
After all, anyone can compute a hash at any time, and you can never prove that what was hashed is what you actually sent. That is, if the data is intercepted along the way by an attacker, or if the person you are sending it to is not a very good person, the data can be quietly replaced and rehashed. Your recipient (or you, if the recipient is that same bad person) will never know that he received something other than what you sent, or that he himself changed your information to use it for his own bad purposes.
Therefore, the place for pure hash functions is data transport within a program, or between programs that can communicate with each other. In fact, checksums are calculated with the help of hash functions. These mechanisms protect against accidental changes to the data, but not against deliberate substitution.
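The difference between accidental and deliberate changes can be sketched in a few lines. The function names `make_packet` and `recipient_accepts`, the messages, and the choice of SHA-256 are all assumptions made for this illustration.

```python
import hashlib


def make_packet(data: bytes) -> tuple:
    """Bundle the data with its SHA-256 hash, as the sender would."""
    return data, hashlib.sha256(data).hexdigest()


def recipient_accepts(data: bytes, claimed_hash: str) -> bool:
    """Rehash the received data and compare with the hash that came with it."""
    return hashlib.sha256(data).hexdigest() == claimed_hash


data, digest = make_packet(b"Pay Bob 100")

# Honest delivery: the hashes match.
honest_ok = recipient_accepts(data, digest)

# Accidental corruption: the data changed, the hash did not. Detected.
corrupted_ok = recipient_accepts(b"Pay Bob 1O0", digest)

# Deliberate substitution: the attacker replaces the data AND rehashes it.
forged_data, forged_digest = make_packet(b"Pay Mallory 100")
forged_ok = recipient_accepts(forged_data, forged_digest)

print(honest_ok, corrupted_ok, forged_ok)
```

The check rejects the corrupted message but accepts the forged one, because nothing ties the hash to the person who computed it.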

But let's go further. We want to protect our hashing result from substitution, so that not just anyone who gets hold of it can claim that theirs is the right result. What is the most obvious way to do that (apart from administrative measures)? Correct: encrypt it. What is more, encryption also lets us verify the identity of whoever hashed the data! And this is relatively easy to do, because there is asymmetric encryption. Yes, it is slow and heavy, but we only need to encrypt a small sequence of bytes. The advantage is obvious: to verify our signature, one needs our public key, and from that key the identity of whoever encrypted (and therefore created) the hash can easily be established.
The essence of this encryption is as follows: you have a private key, which you keep to yourself, and a public key. You can show and distribute the public key, but not the private one. Encryption is performed with the private key, and decryption with the public key.
To draw an analogy, you have a great lock and two keys to it. One key opens the lock (the public key), the other closes it (the private key). You take a box, put something in it and lock it with your lock. Since you want the recipient to be able to open the box, you freely hand out the key that opens the lock. But you do not want anyone else to lock a box with your lock, because it is your personal lock and everyone knows that it is yours. So you always keep the closing key with you, so that nobody can put some nasty muck into a box and then claim that you put it there and locked it with your lock.
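The lock-and-two-keys idea can be tried out with a toy example. The sketch below uses textbook RSA with no padding, and its key pair is built from two publicly known Mersenne primes, so it is completely insecure and only illustrates the principle; real signatures use a vetted cryptographic library. All names (`sign`, `verify`, `digest_as_int`) and the messages are assumptions made for this illustration.

```python
import hashlib

# Toy key pair. Assumption: P and Q are well-known Mersenne primes, so
# anyone can reconstruct the private key. Never use this for real data.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                          # modulus, part of both keys
E = 65537                          # public exponent (the "opening" key)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (the "closing" key), Python 3.8+


def digest_as_int(data: bytes) -> int:
    """Hash the data with SHA-256 and interpret the hash as an integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def sign(data: bytes) -> int:
    """'Encrypt' the hash with the private key."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key and compare the hashes."""
    return pow(signature, E, N) == digest_as_int(data)


signature = sign(b"Pay Bob 100")
print(verify(b"Pay Bob 100", signature))      # the original message
print(verify(b"Pay Mallory 100", signature))  # a substituted message
```

Only the holder of `D` can produce a value that passes `verify`, while anyone who knows `E` and `N` can run the check. Substituting the message now fails, because the attacker cannot re-encrypt the new hash without the private key.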

And everything would be fine, but a problem immediately arises, and in fact more than one.

1. We must somehow transfer our public key, and the receiving party must be able to read it.

2. We must somehow associate this public key with ourselves, so that nobody else can claim it as their own.

3. Not only does the key need to be tied to us, the recipient also needs to know which encrypted hash to decrypt with which key. And what if there is not one hash but, say, a hundred? Keeping a separate registry for this is a very laborious task.

All this leads us to the fact that both the public key and our hash must be stored in formats that are standardized, distributed as widely as possible and only then used, so that the sender and the recipient do not have “translation difficulties”.
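To see what such a format has to carry, here is a sketch of a made-up “envelope” that stores the encrypted hash together with the metadata discussed above. It is not any real standard: the field names, the `key_id` scheme (a truncated SHA-256 of the public key bytes) and the placeholder key and signature bytes are all assumptions made for this illustration.

```python
import hashlib
import json


def key_id(public_key: bytes) -> str:
    """Hypothetical short identifier of a public key: a truncated SHA-256."""
    return hashlib.sha256(public_key).hexdigest()[:16]


def make_envelope(signature: bytes, public_key: bytes,
                  signer: str, signed_at: str) -> str:
    """Pack the encrypted hash and its metadata into one JSON document."""
    return json.dumps({
        "hash_algorithm": "sha256",     # which hash function was used
        "signer": signer,               # who made the signature
        "signed_at": signed_at,         # when it was made
        "key_id": key_id(public_key),   # which key decrypts it
        "signature": signature.hex(),   # the encrypted hash itself
    }, sort_keys=True)


def find_key(envelope_json: str, known_keys: dict):
    """Pick the public key that matches the envelope, or None if unknown."""
    envelope = json.loads(envelope_json)
    return known_keys.get(envelope["key_id"])


# Placeholder bytes stand in for a real public key and a real signature.
alice_key = b"alice-public-key-bytes"
known_keys = {key_id(alice_key): alice_key}

envelope = make_envelope(b"\x01\x02\x03", alice_key, "Alice", "2010-06-21T12:00:00Z")
print(envelope)
print(find_key(envelope, known_keys) == alice_key)
```

With the key identifier stored next to each signature, the recipient no longer needs a separate registry to work out which key belongs to which hash, but only if both sides agree on the same field names and encodings.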

As usual with people, they could not agree on a single solution, and two large camps formed: the OpenPGP format and the S/MIME + X.509 format. But more about this in the next article.


Source: https://habr.com/ru/post/97066/
