
About hashes and the problem of distribution in torrents

Surely many of you have run into an annoying problem when using torrents: torrents with identical content have different hashes, so torrents from several trackers cannot be merged into a single multi-tracker distribution.

What is the reason for the difference in hashes?

As you know, a torrent hash, or info_hash, is the SHA-1 hash of the info section of a torrent file. This section usually holds the size of the distribution, the list of files and other information about the content being distributed. For example, a torrent created by our beloved uTorrent looks like this from the inside:

[image: info section of the torrent created by uTorrent]
Indeed, nothing superfluous. Now let's see what Azureus, the second or third most popular BitTorrent client in the world, produces from the same file:

[image: info section of the torrent created by Azureus]

Yeah... And we are already in trouble. As you can see, the info section has been padded with Azureus-specific additions such as the name.utf-8 item, which duplicates the preceding name item for no clear reason. Indeed, the protocol specification clearly states that the metainfo is already in UTF-8. Another item that has spoiled our hash is private = 0. Note that uTorrent does not add the private item at all when a torrent is marked as public, and rightly so, since private = 0 is equivalent to its absence. The piece size differs as well.

As a result, we already have two different hashes for the same file at the torrent-creation stage alone. And it gets worse.
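
To make the effect concrete, here is a minimal sketch in Python of how an info_hash is derived and how extra items change it. The bencode helper is a simplified encoder written for this illustration, and the info dictionaries (file name, size, dummy piece hashes) are hypothetical stand-ins, not the contents of the torrents shown above.

```python
import hashlib


def bencode(value):
    """Bencode ints, byte/text strings, lists and dicts (keys sorted as raw bytes)."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        pairs = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in pairs) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info):
    """SHA-1 of the bencoded info section, as a hex string."""
    return hashlib.sha1(bencode(info)).hexdigest()


# Hypothetical info section for one 1 MiB file; the piece hashes are dummy bytes.
plain_info = {
    "length": 1048576,
    "name": "example.bin",
    "piece length": 262144,
    "pieces": b"\x00" * 20 * 4,
}

# The same content with the two extra items discussed above.
extended_info = dict(plain_info)
extended_info["name.utf-8"] = "example.bin"
extended_info["private"] = 0

hash_plain = info_hash(plain_info)
hash_extended = info_hash(extended_info)
print(hash_plain)
print(hash_extended)
print("hashes match:", hash_plain == hash_extended)
```

Because the hash covers every byte of the bencoded info section, even an item that carries no new information, such as private = 0, yields a different info_hash.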

After the torrent is uploaded to a tracker, the situation usually gets worse. Many trackers force torrents to be private, i.e. add private = 1 to the info section, thereby changing the hash (by the way, on torrents.ru this item was for a long time inserted outside the info section, apparently by mistake). But that is only half the trouble. Many trackers add all sorts of rubbish to this all-important section, such as an item stating which tracker the torrent belongs to: tracker = ***.ru and so on, which again makes the torrent unique.

Why is the difference in hashes harmful?

The difference in hashes severely hampers the distribution of content in BitTorrent networks even though the distributed files are identical; the whole process could be significantly optimized by unifying the way torrents are created. This situation not only hurts distribution over DHT, but also prevents you from downloading the same distribution optimally from several trackers by adding new announce addresses. It turns out that you can seed the same file to several trackers, but you cannot download it from several.
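
The following Python sketch illustrates why announce addresses can be merged only when the info sections match. It assumes the announce and announce-list keys live at the top level of the torrent file, outside info; the tracker URLs, the file details and the bencode helper are hypothetical and serve only this illustration.

```python
import hashlib


def bencode(value):
    """Bencode ints, byte/text strings, lists and dicts (keys sorted as raw bytes)."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        pairs = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in pairs) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def torrent_info_hash(torrent):
    """Hash only the info section; everything else in the file is ignored."""
    return hashlib.sha1(bencode(torrent["info"])).hexdigest()


# Hypothetical info section with dummy piece hashes.
info = {
    "length": 1048576,
    "name": "example.bin",
    "piece length": 262144,
    "pieces": b"\x00" * 20 * 4,
}

# Hypothetical tracker URLs.
url_a = "http://tracker-a.example/announce"
url_b = "http://tracker-b.example/announce"

torrent_a = {"announce": url_a, "info": info}

# Extra announce addresses sit outside info, so the hash is unchanged.
torrent_merged = {"announce": url_a, "announce-list": [[url_a], [url_b]], "info": info}

# A tracker that writes its own item into info produces a different torrent.
torrent_b = {"announce": url_b, "info": dict(info, tracker="tracker-b.example")}

print("merged keeps hash:", torrent_info_hash(torrent_a) == torrent_info_hash(torrent_merged))
print("tagged keeps hash:", torrent_info_hash(torrent_a) == torrent_info_hash(torrent_b))
```

In this sketch the merged torrent still joins the same swarm as torrent_a, while the tagged torrent from the second tracker forms a separate swarm despite carrying identical files.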

Today many trackers, including torrents.ru, have abandoned the practice of making users re-download torrents after upload, which is very encouraging, but the above-mentioned rubbish such as tracker = torrents.ru in the info section negates all these efforts, because the hashes of torrents from that same torrents.ru remain unique.

By the way, I would also like to mention a problem of the BitTorrent protocol itself: the names of the distributed files (not to mention their relative location within the distribution) directly affect the hash. In my opinion, this is a very big flaw in an otherwise ingenious protocol. If I'm not mistaken, even ed2k does not have it.

What to do?

Undoubtedly, this universal problem must be fought. What are the ways to do this?
  1. Trackers, when a torrent is uploaded, automatically bring it to a standard form, clearing constructs like name.utf-8 from the info section and adding nothing superfluous of their own. And, if it does not contradict the ideology of the resource, they do not make the torrent private. That said, even private torrents from different trackers can be combined into a multi-tracker distribution, which is already good.
  2. Developers of BitTorrent clients agree on a common standard for generating torrents.
  3. We, ordinary users, create torrent files in a single client whenever possible; uTorrent, being the most widespread, is the best choice for this role.
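
The first option can be sketched as a small normalization step, shown here in Python. Which items count as superfluous is an assumption of this sketch: it drops name.utf-8, a tracker item and private = 0, the cases mentioned in this article. The function name and the sample dictionaries are hypothetical.

```python
# Items treated as superfluous in this sketch (an assumption, not a standard).
EXTRA_KEYS = {"name.utf-8", "tracker"}


def normalize_info(info):
    """Return a copy of the info dictionary without the extra items.

    private = 1 is kept, because removing it would change the torrent's
    behaviour; private = 0 is dropped, since it is equivalent to absence.
    """
    cleaned = {key: value for key, value in info.items() if key not in EXTRA_KEYS}
    if cleaned.get("private") == 0:
        del cleaned["private"]
    return cleaned


# Hypothetical info sections describing the same file with dummy piece hashes.
utorrent_like = {
    "length": 1048576,
    "name": "example.bin",
    "piece length": 262144,
    "pieces": b"\x00" * 20 * 4,
}
azureus_like = dict(utorrent_like)
azureus_like["name.utf-8"] = "example.bin"
azureus_like["private"] = 0

# Equal dictionaries bencode to the same bytes and so share one info_hash.
print("identical after cleanup:", normalize_info(azureus_like) == utorrent_like)
```

A cleanup like this cannot reconcile torrents created with different piece sizes, because the piece hashes themselves differ; that case needs the second or third option.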

Source: https://habr.com/ru/post/79952/

