GitHub introduces SHA-1 collision detection system

Since March 20, 2017, GitHub has been detecting and rejecting any content that shows signs of a SHAttered collision attack whenever it computes a SHA-1 hash. The company announced this on its official blog. As a result, nobody can upload either file from a pair that has identical hashes but different content. So far no such attack has been carried out in practice anywhere except on torrents, but GitHub decided to play it safe.

Almost a month ago, Google and the Center for Mathematics and Computer Science in Amsterdam presented the first practical method for generating collisions for the SHA-1 hash function. The SHAttered attack was the result of a two-year study that began shortly after cryptographer Marc Stevens of the same center published, in 2013, a paper on a theoretical approach to creating a SHA-1 collision. He then continued the search for a practical attack together with colleagues from Google. The resulting paper describes the general principles of generating documents whose message blocks produce SHA-1 collisions.

Shortly after the paper was published, the first example of its practical use appeared: the BitErrant attack, which produces two files that match the same .torrent file, with identical hashes, yet differ in content. One is a regular executable file, and the other is a malicious one carrying a Meterpreter payload for the Metasploit framework.
Linus Torvalds said that SHA-1 collisions are nothing to fear in Git repositories. He explained that there is a big difference between using a cryptographic hash for digital signatures in security systems and using it to generate “content identification” in a system like Git. In the first case, the hash is a kind of statement of trust: it acts as a source of trust that fundamentally protects you from people you cannot verify in any other way. In projects like Git, by contrast, the hash is not used for “trust”. Here, trust is placed in people, not in hashes, says Linus. SHA-1 hashes serve a completely different, technical purpose in Git: they simply avoid accidental conflicts and provide a really good way to detect errors.

Despite Torvalds's opinion, the GitHub developers decided that extra caution never hurts, so the platform now refuses to host files or code from a pair with identical hashes.

Unlike Torvalds, the GitHub developers believe that the correct operation of SHA-1 really does matter for Git. They do not dispute the logic of his reasoning; they simply point out that such an attack is feasible in practice against Git, and they even outline what it might look like. Keep in mind that the attack cannot target previously created objects. The attacker must deliberately create two new objects with the same hash at the same time. The two objects are identical except for a small section of differing data.

In this case, the attack can be carried out according to the following scenario:

1. Generate a pair of objects where one looks harmless and the other does something malicious. This works best with binary files, where people are unlikely to notice the difference.
2. Convince the project maintainers to accept the “innocent” half of the pair, and wait until they sign a tag or commit that contains this object.
3. Distribute a copy of the repository in which the innocent object is replaced with the malicious one (for example, by hosting it on a third-party server and pointing to the identical hash as proof that the copy is authentic). Anyone who verifies the signature will believe that the contents of this copy match the original repository.

The SHAttered attack always leaves traces: a specific byte pattern that is present in both halves of the pair. It can be detected while computing the SHA-1 of either half. GitHub now performs this check on every hash calculation.

The code that detects this byte pattern was written by Marc Stevens and Dan Shumow and released as open source.

So far, there has not been a single hash collision on GitHub, according to the company. The PDF files from the original attack have identical SHA-1 hashes on their own, but not as objects in a Git repository, because Git includes some additional technical information in the data it hashes.
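The additional information is a short header that Git prepends to the file contents before hashing. The following is a minimal Python sketch, assuming Git's standard blob format (the word `blob`, a space, the content size in decimal, and a NUL byte). The function names are illustrative and are not taken from Git or GitHub.

```python
import hashlib


def raw_sha1(data: bytes) -> str:
    """SHA-1 of the bytes exactly as they sit in the file."""
    return hashlib.sha1(data).hexdigest()


def git_blob_sha1(data: bytes) -> str:
    """SHA-1 the way Git derives a blob ID: header followed by contents."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    content = b"hello world\n"
    print("raw :", raw_sha1(content))
    print("blob:", git_blob_sha1(content))
```

A SHAttered-style collision is built for one specific prefix, so two files that collide as raw data stop colliding once the header changes what precedes the colliding blocks. An attacker would have to compute a new collision with the Git header taken into account.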

There is some chance that the automatic blocking will trigger on code or a file that was not uploaded with any intent of substitution or attack. Nothing can be done about that: the author will have to change the content for it to be accepted.

The GitHub developers even estimated how likely such an accident is. If five million programmers each generated one commit per second, the chance of an accidental collision by the time the Sun turns into a red giant would be about 50%.
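The order of magnitude of this estimate can be checked against the birthday bound for a 160-bit hash. The following is a back-of-the-envelope Python sketch, assuming SHA-1 behaves like a uniformly random 160-bit function and using the standard approximation p ≈ 1 − exp(−n² / 2N). The function names and the year length are illustrative choices, not taken from GitHub's post.

```python
import math

HASH_BITS = 160                  # SHA-1 output size in bits
COMMITS_PER_SECOND = 5_000_000   # five million programmers, one commit per second each
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def collision_probability(n_objects: float, bits: int = HASH_BITS) -> float:
    """Birthday-bound approximation: 1 - exp(-n^2 / (2 * 2^bits))."""
    return -math.expm1(-(n_objects ** 2) / (2.0 * 2.0 ** bits))


def objects_for_probability(p: float, bits: int = HASH_BITS) -> float:
    """Number of random hashes needed to reach collision probability p."""
    return math.sqrt(-2.0 * (2.0 ** bits) * math.log1p(-p))


def years_until_probability(p: float, rate: float = COMMITS_PER_SECOND) -> float:
    """Years of hashing at `rate` objects per second needed to reach probability p."""
    return objects_for_probability(p) / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_until_probability(0.5):.2e} years to a 50% chance")
```

Under these assumptions the answer comes out on the order of several billion years, which is the same scale as the Sun's remaining lifetime.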

The GitHub developers are now working with the Git project to incorporate the collision detection library into upstream Git. In addition, the Git project is developing a plan for moving from SHA-1 to a more secure hash function with minimal disruption to existing repositories.

Source: https://habr.com/ru/post/324600/
