
In light of recent laws, events and trends, the value of rutracker as a database of content, rather than as a particular site, is more obvious than ever. Unfortunately, all my appeals to the rutracker administration to publish a public, complete, convenient dump of their database met with total incomprehension on their part. Distributing what they call an encrypted "base" is not, in my view, a solution to the problem, for the reasons given in the discussion threads mentioned above and repeated below.
Unfortunately, I had neither the time nor, frankly, the knowledge to solve the problem myself. Fortunately, my words reached people who have both. They organized and together did what I had been writing about for so long: using scripts, they crawled rutracker, dumped the descriptions of all distributions along with their hashes, parsed them and put them into a database that is convenient to use. On top of that they wrote a "muzzle" (a front end): a program that lets end users who do not know which end of grep to hold work with the database comfortably.
Unfortunately, nobody on the team has a Habr account (other than read-only), and in the sandbox the article could get lost, so I was chosen as their mouthpiece on this site. Frankly, I did not hesitate for long and thought only about how best to do it. If you have questions, ask in the comments; I will either answer myself or pass them on to the developers. The technical text is written in the first person; my own relation to it is indirect, but it has been left that way for readability.
Before turning to the technical part and the links, I would like to add that the whole point of this undertaking is for as many people as possible to keep a copy of this database. So I beg you: download the data from the links below (preferably via torrent) and keep seeding for as long as you can. The database will most likely be updated in the future, but that part has not been thought through yet.
Storage format of the distribution database
Number of distributions in the database: 1411636
There are two stores: a table and a database of descriptions.
The table stores the distribution number on rutracker, the name of the distribution, its approximate size in bytes, the number of seeds, the number of peers, the hash in base32, the number of downloads and the date the distribution was last updated. The size is approximate because it was obtained by parsing strings like “2.05 GB”; unfortunately, we found no way to get the exact size from the source of the distribution page. Names are encoded in UTF-8, so on systems where this encoding is the default the file can be viewed with less without any extra steps. The hash is stored in base32 to take up less space; the graphical viewer can switch the displayed hash (including in magnet links) to hex. The field separator is TAB. All whitespace characters in names were replaced with spaces, and all HTML entities in names were replaced with the corresponding Unicode characters, which is another reason cp1251 was dropped in favor of UTF-8. Dates are written in the format "16-Jul-11 06:23"; English month names were chosen to cause fewer parsing problems.
Example: 4085734 [x86] Ubuntu 12.04 Classic Remix 1170378588 206 3 Y4R4DX74NPXBKU6NECLJLV2N733F2NBW 20911 06-Jun-12 13:02
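A minimal Python sketch of reading one table row, assuming the eight tab-separated fields appear in the order listed above. The names Row, parse_row, hash_hex and magnet are illustrative and are not taken from the program's sources. Since a 32-character base32 string decodes to 20 bytes, the hex form is 40 characters long.

```python
import base64
from typing import NamedTuple


class Row(NamedTuple):
    """One line of the table; field order is assumed from the description."""
    topic_id: int
    name: str
    size_bytes: int      # approximate, parsed from strings like "2.05 GB"
    seeds: int
    peers: int
    hash_b32: str        # info-hash in base32
    downloads: int
    updated: str         # e.g. "06-Jun-12 13:02", left unparsed here


def parse_row(line: str) -> Row:
    """Split a TAB-separated line into typed fields."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 8:
        raise ValueError(f"expected 8 fields, got {len(f)}")
    return Row(int(f[0]), f[1], int(f[2]), int(f[3]), int(f[4]),
               f[5], int(f[6]), f[7])


def hash_hex(hash_b32: str) -> str:
    """Convert the base32 hash to its hexadecimal form."""
    return base64.b32decode(hash_b32).hex().upper()


def magnet(hash_b32: str) -> str:
    """Build a bare magnet link (no tracker list) from the stored hash."""
    return "magnet:?xt=urn:btih:" + hash_hex(hash_b32)
```

For the example row, written with explicit \t separators, parse_row(...).hash_b32 is Y4R4DX74NPXBKU6NECLJLV2N733F2NBW and magnet(...) yields a link with the same hash in hex.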
The description database is a set of tar.gz files, each holding the descriptions for a block of 1000 distribution numbers. gzip was chosen for its speed and low memory requirements. The archives are grouped into folders of 100. The description of distribution number 1234567 is in the file 012/01234.tar.gz/01234567, in UTF-8.
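The layout can be sketched in Python as follows. The padding widths (3, 5 and 8 digits) are inferred from the single example path above, and the assumption that the archive member is named by the bare 8-digit number may not match the real archives; description_location and read_description are hypothetical helper names.

```python
import tarfile
from pathlib import Path


def description_location(topic_id: int) -> tuple[str, str, str]:
    """Return (folder, archive, member) for a distribution number.

    Assumed scheme: 1000 descriptions per archive, 100 archives per folder.
    """
    member = f"{topic_id:08d}"
    archive = f"{topic_id // 1000:05d}.tar.gz"
    folder = f"{topic_id // 100_000:03d}"
    return folder, archive, member


def read_description(base_dir: str, topic_id: int) -> str:
    """Read one UTF-8 description straight out of its tar.gz archive."""
    folder, archive, member = description_location(topic_id)
    with tarfile.open(Path(base_dir) / folder / archive, "r:gz") as tar:
        fileobj = tar.extractfile(member)  # raises KeyError if absent
        if fileobj is None:
            raise KeyError(member)
        return fileobj.read().decode("utf-8")
```

For example, description_location(1234567) gives ("012", "01234.tar.gz", "01234567"), matching the path in the text.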
Program
The sources are licensed under the GNU GPL v2. Pull requests are welcome.
The program is written in C++ using Qt and the kdelibs libraries (for working with archives). The main part of the program is a table that displays distributions (QTableWidget is used). Above it is a field for entering a search phrase. The search (reading the table file and selecting matching rows) runs in a separate thread; results are sent in batches to the main thread, which adds new rows to the table. Results are passed between threads through a Qt::QueuedConnection connection. When the file has been read to the end or the required number of results has been collected, the main thread is notified that the search is complete, and the table is then re-sorted. A search can be interrupted with the Stop button, which is shown at the top while a search is running.
The table file can be compressed with gzip, bzip2 or lzma/xz (unfortunately, the last is not supported in our Windows build). The file is decompressed and scanned on the fly, without unpacking it completely or creating temporary files; this is implemented with the KFilterDev class from the kdelibs library. gzip and xz turned out to decompress much faster than bzip2, so bzip2 was ruled out as the distribution format. gzip was in turn several times faster than xz and was available on Windows in the kdelibs version we used, so gzip was chosen even though it compresses about one and a half times worse. Users can unpack the table themselves or use the corresponding menu option to store it on disk uncompressed. This will not necessarily speed up the search, though: more data has to be read from the hard disk, and reading from disk may be slower than decompressing gzip.
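The on-the-fly scan can be illustrated with a short Python sketch. It is a stand-in for the idea, not the program's KFilterDev code. The matching rule (a case-insensitive substring of the name field) and the result limit are assumptions, since the text does not say how rows are matched; search_table is a hypothetical name.

```python
import gzip
from typing import Iterator


def search_table(path: str, phrase: str, limit: int = 1000) -> Iterator[list[str]]:
    """Stream a gzip-compressed table and yield rows whose name matches.

    The file is decompressed line by line, so nothing is unpacked to disk
    and memory use stays small regardless of the table size.
    """
    needle = phrase.casefold()
    found = 0
    with gzip.open(path, "rt", encoding="utf-8") as table:
        for line in table:
            fields = line.rstrip("\n").split("\t")
            # fields[1] is the distribution name in the assumed column order
            if len(fields) > 1 and needle in fields[1].casefold():
                yield fields
                found += 1
                if found >= limit:
                    return
```

Because the function is a generator, a caller can stop consuming results at any point, which plays the same role as the Stop button.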
Now for the table itself. I think the meaning of the columns needs no explanation. You can sort by any column; by default the results are sorted by the number of downloads. To implement sorting, I had to subclass QTableWidgetItem and define a comparison operator.
If you double-click any cell, its value is selected and can be copied.
To view the description of a distribution, left-click any field except the distribution number and the hash. The description is displayed below (using QWebView).
To download a distribution page and display it below, click on the distribution number. To copy the distribution URL, right-click on its number.
I did not manage to make a right click on the number or hash cell open a context menu with a “Copy link” item. Perhaps some readers know how to get this out of QTableView. Then again, it can stay as it is, since a single right click is faster than picking an item from a context menu.
Mouse events on cells are captured by subclassing QItemDelegate and overriding editorEvent. Descriptions are fetched from the corresponding tar.gz using the KTar class from the kdelibs library.
The program can be used without the description database; in that case a description can only be viewed through the site, by clicking the distribution number.
The program stores the settings in the dump_viewer.ini file located in the program folder.
Instructions for building the program on Debian GNU/Linux and Windows are in the INSTALL file.
A funny incident with date parsing came up during development. The date format “16-Jul-11 06:23” is non-standard, but it was kept because it is short, readable and similar to what rutracker itself shows. It turned out that QDateTime::fromString expects localized month names (in a Russian-language environment, the Russian abbreviation rather than "Jan"). So I had to write a workaround that converts the textual month names into numbers (Jan -> 01).
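The same pitfall exists outside Qt: Python's strptime with %b also depends on the current locale. A minimal locale-independent sketch of the month-table approach, assuming two-digit years mean 20xx (the text does not state this); parse_dump_date is a hypothetical name.

```python
from datetime import datetime

# Fixed English abbreviations, independent of the system locale.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def parse_dump_date(text: str) -> datetime:
    """Parse dates like "16-Jul-11 06:23" without consulting the locale."""
    date_part, time_part = text.split(" ")
    day, month_name, year = date_part.split("-")
    hour, minute = time_part.split(":")
    # Assumption: two-digit years belong to the 2000s.
    return datetime(2000 + int(year), MONTHS[month_name], int(day),
                    int(hour), int(minute))
```

An unknown month name raises KeyError, which makes malformed rows easy to spot.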
Why did we do this?
The database was prepared to make distributions easier to reach when the tracker site is unavailable, for example when it shows the message "forum is temporarily disabled". It is also useful if the tracker ends up on the list of blocked sites. I do not want even the slightest chance that everything we have built together over the years is lost at the whim of officials or because of a server crash. While this distribution is alive, all the distributions on the tracker are alive too. It will probably need to be updated about once a month.
rutracker wrote that the encrypted distribution on their tracker is better!
Answer (more here and here):
a) We have the descriptions of distributions. It is often hard to choose, say, a BDRip without looking at the description. The database of all descriptions compresses to ~2 gigabytes. It could have been compressed further, but we decided not to save space at the expense of the speed of the "muzzle". (There are in fact a few more optimization ideas, but for now we have decided that the best is the enemy of the good. Ideas and commits are welcome, though!)
b) Even if the group of people who know the password is spread around the world, it is still a finite group of people who can be identified and, given sufficient resources, bought or intimidated.
c) The rutracker administration, and intellect personally, are no doubt infinitely honest people, but until I see for myself that the distribution contains the rutracker database and not encrypted white noise, I will not take anyone's word for it. I'm sorry.
d) Fake sites and fake magnet links are not the issue here. A database can be built by anyone, not only the administration (ours is an example), so encrypting the database on rutracker does not protect against them. The validity of the hashes in a database is checked either by checksums (with a GPG signature) or simply by comparing against rutracker itself (while it is still available).
e) For the database to contain current distributions, it has to be updated, and the more often the better. If the rutracker administration really cares that users get up-to-date information, I hope they will not put obstacles in the way of updating our database. They might even help, who knows.
Future plans
The next logical step is an HTML generator [PHP]: a site that duplicates the functionality of the program and the database. After that we want to aim for a fully static implementation of every part of the site, that is, pure HTML/CSS/JS with no PHP or similar server-side logic. This would let the site be uploaded to almost any hosting, including free hosting, which would make it practically impossible to eradicate this database from the network. There are already ideas for implementing search in JavaScript (for example, build an index from words to distributions and split it into separate files, balancing the average file size against the total number of files). A full server-side search could be added as well. Unfortunately, we have no skilled web developers.
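One possible reading of the split-index idea, sketched in Python only to show the file layout; the client side would be JavaScript fetching a single shard per query word. The shard rule (CRC32 of the word modulo the shard count), the JSON format and all function names are assumptions made for illustration, not part of the project.

```python
import json
import re
import zlib
from collections import defaultdict
from pathlib import Path


def shard_of(word: str, shards: int) -> int:
    """Pick the shard file for a word; stable across runs and languages."""
    return zlib.crc32(word.encode("utf-8")) % shards


def build_sharded_index(rows, out_dir, shards: int = 256) -> None:
    """Write word -> [distribution numbers] maps, one JSON file per shard.

    More shards mean smaller files to download per query but more files
    to host, which is the trade-off mentioned in the text.
    """
    buckets = [defaultdict(list) for _ in range(shards)]
    for topic_id, name in rows:
        for word in set(re.findall(r"\w+", name.casefold())):
            buckets[shard_of(word, shards)][word].append(topic_id)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i, bucket in enumerate(buckets):
        (out / f"{i:03d}.json").write_text(
            json.dumps(bucket, ensure_ascii=False), encoding="utf-8")


def lookup(word: str, index_dir, shards: int = 256) -> list:
    """Load only the one shard that can contain the word."""
    word = word.casefold()
    path = Path(index_dir) / f"{shard_of(word, shards):03d}.json"
    return json.loads(path.read_text(encoding="utf-8")).get(word, [])
```

A static page would compute the same shard number in the browser and request just that file, so no server-side logic is needed.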
Do the same for other trackers. For The Pirate Bay this has already been done. Once the rutracker database is finished, we can move on to other domestic and foreign trackers. It is worth thinking about how to combine all the databases into one (apparently with one file per tracker, so that users can pick the trackers they want when downloading).
Distributed updating of the database. The database obviously has to be updated periodically: new distributions are added and old ones change. So why not hand the update task over to users, those who agree to it, of course? First, our bandwidth is not unlimited, and we cannot keep dumping the tracker(s) ourselves. Second, a tracker can detect a handful of spiders, which would lead to a ban and possibly a lawsuit, whereas if there are 100 spiders, each one fetches new distributions too slowly to be detected. For the user this would look like a “Take part in updating the database” item in the program and a prompt for account login details; the program then does everything itself. New distributions and changes to old ones are sent to the center, which checks them and adds the data to the common database.
By the way, an interesting probability problem: if M independent spiders each fetch X randomly chosen distributions per day out of N, after what (expected) time will they have fetched a share Y of all distributions?
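One way to approach it, under a simplifying assumption that is mine and not the article's: every fetch picks a distribution uniformly at random, with replacement and independently of all other fetches. After t days there have been M·X·t fetches, a given distribution has been missed by all of them with probability (1 - 1/N)^(M·X·t), so the expected share fetched is 1 - (1 - 1/N)^(M·X·t). Setting this equal to Y and solving for t gives the function below. Note that this is the time at which the expected share reaches Y, which is an approximation of the expected time to reach share Y.

```python
import math


def expected_share(n: int, spiders: int, per_day: float, days: float) -> float:
    """Expected fraction of n distributions fetched after `days` days."""
    fetches = spiders * per_day * days
    # 1 - (1 - 1/n) ** fetches, computed stably for large n
    return -math.expm1(fetches * math.log1p(-1.0 / n))


def days_to_share(n: int, spiders: int, per_day: float, share: float) -> float:
    """Days until the expected fetched share reaches `share` (0 <= share < 1)."""
    if n < 2 or not 0.0 <= share < 1.0:
        raise ValueError("need n >= 2 and 0 <= share < 1")
    return math.log1p(-share) / (spiders * per_day * math.log1p(-1.0 / n))
```

For large N this is close to t ≈ -N·ln(1 - Y) / (M·X): purely random fetching needs about 2.3 full passes' worth of requests to cover 90% of the distributions, which suggests that assigning spiders disjoint ranges would be far more efficient.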
Links and contacts
bitbucket (source and base distributions without descriptions)
mega.co.nz (description database only, unpack the main tar in the program folder)
Torrents (all in one):
i2p (in the process of filling and indexing)
Magnet link: magnet? %2ftracker.openbittorrent.com%3a80
opensharing
rutracker
sha256 hashes of all distribution files: sha256.txt
The current sha256.txt and sha256.txt.asc can be found in the torrent and here.
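A small Python sketch of checking downloaded files against sha256.txt. The file format is an assumption: one "<hex digest>  <file name>" pair per line, as produced by the sha256sum utility; the article does not describe it. If sha256.txt.asc is a detached signature, the list itself would first be checked with gpg --verify sha256.txt.asc sha256.txt. The names sha256_of and verify are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not have to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(list_path, root) -> list:
    """Return the names of files that are missing or whose hash differs."""
    bad = []
    for line in Path(list_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary mode with '*'
        target = Path(root) / name
        if not target.is_file() or sha256_of(target) != expected.lower():
            bad.append(name)
    return bad
```

An empty list from verify means every listed file is present and matches its checksum.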
GPG fingerprint: C567 227F 6D75 014E CDC0 FE7B E0F9 25D1 E020 95A4
e-mail: sir.ratnik@yandex.ru
Jabber: sir.ratnik@ya.ru
Jabber Conference: torrents-database@conference.jabber.no
OTR fingerprint: 7503B021 02E30FEA 88861B43 7AB21676 35704DBA
GPG key:
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.4.12 (GNU/Linux)

mQINBFJEN4IBEAD0CPv + nS / cmY3RUfVgFfjTWNHCUg / PVXZwz0bcEdS9MxfG4Orq
4bn80EHBWX0d9lfe2l6sKPLWb52OxLFTwqGvOqcII8DHI502PMupGfTB00FU1 / rt
BY5xHCQMYseUZQfM7M5egbVLh6dzh + koWU4Syl0xfMVh87HVahs6ZaDPvfpk478A
mR063bKroHIm2wtJwiTnJgjlI53C + 0dg0dqalfMnXEI7OFBorvmi3tR1Xvw551LF
/ uWZ6OhoO / KHHuqLtaiWFN1Mw9zYZAsEFV6OXomt9QXsg7VYDlQoWGFxjdBfuk5E
PyfUZu4EwsKuaJbffUoglTKpj2ecT2mU9G51l2ZMqJm + JQZYeAkczwrN0iz + 7Syg
hEdYFL8Pd3Rsq6ttwDzoSXw3uqWnyfosB8FXAHq2M4vhip8HR + tK7isDhAuoB2Mt
lLFxqBVy3W4pRHYMH6h3cNsRS676pt6CGxfisdh3sMtykSNZDDPAYUwloP32QA / U
ugArWB3cVVW2o47qZVt / HReU53N7Tq / s + g9WaokU + qE65Q549M9vE1xhgf5ivGEz
xS2KS35PxJ9spizHCE3OSUWP2bHDE + O + qTeX3v9hYPJREExwQwor + r8sheX2kMst
UV3GC + DFQT9X11eG1rMVB + U / 0l + Dri0EFmbyNLmE3vGpuuLnSeFkDj + xZwARAQAB
tCFNci4gUmF0bmlrIDxzaXIucmF0bmlrQHlhbmRleC5ydT6JAjgEEwECACIFAlJE
N4ICGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4BAheAAAoJEOD5JdHgIJWkliAP / 3ZQ
77pGYWKr12JY6QKE8hw4L3lj7qjLra8PWFiSwVkbJe3Vrb2oGG / + n3YsTNt7bdKY
PyG7lfVraMcekdEzuJevSt / Cp2NXwcHGyE3405KaymG + kyv3e7lWmXSFS5Nzo3ta
TQ9M + MLspVwxaT3jcW + nCbnml5TkvhSPEmOIe6gTlfXgRhngE6zvsxB1I0bxixEa
u0 + SOHVBrlzBPVOXbQyli99 / vsYAuf9xIhJtv2ySYYlZRXOYhj + eyYEu878Z87J1
jxTsYfoG3pMZ10rWWbh0rtCvHTeZjzb8G0gswyNlwPqVuU + nW6CQL8gb0kGUBtBR
pQkei02zY1RoE + cB3tddtZYb7hJzSyZD8Gvbwr03xJeYldwbOg9KIYvIvsrB3GP9
BhGAf + wEaZX56yFMmP6snqBUuJ3hdYqXswpnZB1Dt7y9CzdsANpETcys5ika2typ
vfpbxI27Ace1SOsoFRmFXzwaKCvKWoR4vfaU7YxDYJ7fbin07vdIEY + d0FozHHRT
o1Zr1DHmV5fYFA1iAn14IXwPaIocxTtjAOY55q9p9xFygUPKnFlVEX3mSIL9 + FJy
IQfqvWNvw4Z + PwNaNpFfWS5XAXrxiV0TJHXcmW8e6d12z9MEyRpUlndLPE37Q6iB
WAj3QKNM3gR / M / BNZ8d + 52V5kxZXtj5zi / O + fuGLuQINBFJEN4IBEAC5PyxaDHRA
DMUn5fuZnQZyJP37yiR5x4us6th6dBQFthpZQ8uso + x1YI9namQYxOZRPBr5IIpo
qmAmTVoskoTIGlMJ43IwuFO / fqxzba44cUahLyEWwQ8Q6L8JsU3KACdDRW1cfM8 +
9E0kLfXHxpY57tQmRpqczvXfF88G58309fnVd8HVPFg3Hp1DwB7sXoCO0NiyRc6i
o0r8WNQ3TJABQd76nw79aWDcIox1ayff8DBbzQI + Azefd + s1SaOlUrH568IaatFA
daGhXPHz2qhfnlPVbqK7HUWoNKBd3O4XGjogc8k / 9e4RlpBbinPzZMSr0AcPU65I
dMAizyh6UrluTmfK99ujxOloC0KJIYann26OPdCdHcj6YsdhiBpuxE03L7NmsBNP
QIOXva09WkD7vdoWRdRtLRAd / WzChmr0P7gTFLQqEmY + dq7nec2U70zoYtnhgB77
Csu6UYK04oVMX / ytHSJWDyr7IdrTOYRFAawX4ppyNxspT7mrK0Fv5qcoDenieSuP
X4klLnueIQQZbAfFGZE2Q + oq8Zm6v + pPHQ53zHYokY1M7kY / O4XhLiHwhMyUflPp
vXp2gdypYNc7p / eXne + hpEPcn9gzJcpJnqT6SzoAOxGOvnazGf9LlygJXQkAYeGa
ezWQKN5cOJe5S / 0OpPWKhJtggl9RWSWNywARAQABiQIfBBgggAJBQJSRDeCAhsM
AAoJEOD5JdHgIJWkBNYP / jI8eLjFJl / 5P8BTtV0dzODGu3492RAAlo6Ia6XBhTCg
lVJKs97TaJLQU0g8NrP2JWaMUVoDnvWldHDYBP0XF7iJqzjvxInY21joFEI2FBVY
uBibtZiPhRXX2wxAUrJCpzoWRZuoOPAucN24kESOt8QkRYvJu402WzE8n70 + Bhhd
kKHEvVPHwn + beNJo06dzRENuhS5Qc3lnr3rWyozFZzeZnHwqzztCvx1vM8bwWq + r
Vq / HeA + BjAGN / E7iK02xp / 2lpp / DT06pe2je1cdCDXO41w8lgUad4WsYhoPVZ7BA
TTyRqMVYIL69XkljgrUHRp9Dqj8ID6kl2u9L6oi4C4VQYTcgoUPXQuiebz5D / Fxi
fbox3VshqG + jk3tJaiiavO / TcENvmgqpMsvcvjfN / CEUz / H0 / c7idreRUTKc / 0Cg
KrUG0JOq3rinyfdQ69B / rIwAHCLErL6DgT0MLhH0H + s1dC2nWjZBbj8cn6VvVQTj
Fe0VLG3Rg5E8UPGTevaegN2gY5EPcgB6GKZIWn1Saoa7FEY / m5gVK0UMwB6wfnVC
MMLppPWvn6Ej76QZTPUYGZHnvKogEkQTa + PCVgJWDEcTADEoqF5S7wR / JJXshSwd
QofqYT1XrdI07u50bYv5X11H7yWfIdUhzYOGCm0hrZmzos + bMbMry2Y6v4KxFsib
=Peeh
-----END PGP PUBLIC KEY BLOCK-----
P.S. I would like to thank the LAVteam team for technical support.
UPD: Also, many thanks to init0 for the invite for the development team's direct representative, ratnik0. You two aren't namesakes, by the way? ;)
UPD2: If the program asks for ssleay32.dll under Windows, installing the OpenSSL libraries will help.
UPD3: Created a jabber conference to coordinate sympathizers and discuss future plans: torrents-database@conference.jabber.no
UPD4: To those who voted for a pornolab dump: we need your help, and we are waiting for you in the conference.
UPD5: rutor deleted the distribution without giving a reason.