
Intercepting and editing files in HTTP traffic, using torrents as an example

A couple of years ago we came up with the idea of setting up a local BitTorrent retracker for users of our "home" city network, so that users would download faster and we would carry less external traffic. Installing the retracker itself was only the start: it also had to be announced somehow in the torrents being downloaded. While working out the ways and mechanisms of announcing it, I arrived at a fairly general and universal algorithm, which I would like to present here.

So first:

What to do


There are three ways to announce the existence of a tracker in the local network:

Each of them has its advantages and disadvantages, respectively:

To begin with, we added support for the first two options, since that took no special effort: it was enough to add a few records to DNS ( retracker.local IN A and retracker.smarthome.spb.ru _SRV_ ). In this case one can turn a blind eye to the incompatibility of .local with zeroconf, since ideally DNS queries from clients with zeroconf enabled should never even reach our server. Update: an important remark from cadmi: for users to be able to work with both .local and our retracker, you should create the zone retracker.local rather than .local, which lets the two coexist.

Still, the third option looked the most interesting and tempting, so I went looking for information on general techniques for modifying downloaded files on the fly. The requirements for the server were simple:


What to do it with


To my surprise, I found that there are practically no techniques or open-source programs for this kind of interception and editing of files. By and large there are only two: Squid with its experimental ICAP/eCAP modules, and a filtering proxy called "MiddleMan", whose last release came out back in 2004 but which is still maintained in ports.

I gave up on Squid almost immediately: despite having two experimental modules for working with passing traffic, the solution turned out to be extremely "crooked" and unstable even during installation and configuration, let alone in operation.

I moved on to MiddleMan. Surprisingly, the old program turned out to be more functional and more convenient than the modern monster Squid. In essence it meets all the requirements except full transparency for the user: the source IP of the user's requests becomes the IP of the server running the proxy. I will note that only Squid with the TPROXY module under Linux can preserve the source IP. Moreover, MiddleMan has a unique option: if processing takes longer than a configured timeout, the proxy gives the user the unmodified original file.
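
The timeout behaviour described above can be illustrated with a small Python sketch. This is not MiddleMan's code, only an illustration of the "fail open" idea: the body is piped through an external filter, and if the filter is too slow or fails, the original bytes are returned unchanged. The function name filter_or_passthrough and the demo commands are hypothetical.

```python
import subprocess
import sys


def filter_or_passthrough(body: bytes, command: list, timeout: float) -> bytes:
    """Pipe body through an external command.

    If the command times out, exits with an error, or cannot be started,
    return the original body unchanged.
    """
    try:
        result = subprocess.run(
            command,
            input=body,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=True,
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        return body
    return result.stdout


if __name__ == "__main__":
    # Two stand-in "external scripts": one fast, one deliberately too slow.
    upper = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(filter_or_passthrough(b"abc", upper, timeout=10.0))
    print(filter_or_passthrough(b"abc", slow, timeout=0.2))
```

With a fallback like this, a hung or broken editing script costs the user a short delay at most and never a failed download.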

How to do it


1. Identify the most popular torrent servers

To begin with, I wrote a small Perl script that listens on port 80 via pcap and collects the IP addresses of servers that send responses with Content-Type: application/x-bittorrent. This is needed so that we intercept not all HTTP traffic, but only the traffic going to the major trackers.
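
The original Perl script is not shown, so here is a rough Python sketch of just its classification step. It assumes that some capture layer (pcap or otherwise) already delivers pairs of source IP and TCP payload; the function names and sample addresses are made up for illustration.

```python
from collections import Counter

TORRENT_MIME = b"application/x-bittorrent"


def is_torrent_response(payload: bytes) -> bool:
    """Return True if payload begins an HTTP response with a torrent Content-Type."""
    if not payload.startswith(b"HTTP/"):
        return False
    # Headers end at the first blank line; the body is ignored.
    head = payload.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-type":
            return value.strip().lower().startswith(TORRENT_MIME)
    return False


def count_torrent_servers(packets):
    """Count torrent responses per source IP from (src_ip, tcp_payload) pairs."""
    hits = Counter()
    for src_ip, payload in packets:
        if is_torrent_response(payload):
            hits[src_ip] += 1
    return hits


if __name__ == "__main__":
    # Hypothetical captured payloads; the addresses are documentation ranges.
    sample = [
        ("192.0.2.10", b"HTTP/1.1 200 OK\r\n"
                       b"Content-Type: application/x-bittorrent\r\n\r\nd8:announce"),
        ("192.0.2.20", b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>"),
        ("192.0.2.10", b"HTTP/1.0 200 OK\r\n"
                       b"content-type: application/x-bittorrent; name=x\r\n\r\n"),
    ]
    for ip, n in count_torrent_servers(sample).most_common():
        print(ip, n)
```

Only the first packet of a response carries the headers, so checking that the payload starts with "HTTP/" is enough for a rough tally; a stricter tool would reassemble TCP streams.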

Then, with a few simple manipulations, these IP addresses are entered into the ipfw table used for redirecting to our proxy:

${ipfw} add fwd ${proxy_ip},${proxy_port} tcp from $lan_customers to 'table(15)' dst-port 80 in via ${int_if}
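
The "simple manipulations" are not spelled out in the article. One possible way to do them, sketched in Python, is to turn the collected addresses into "ipfw table 15 add" commands. The table number 15 is taken from the rule above; the function name and the choice to skip invalid or duplicate entries are assumptions.

```python
import ipaddress


def ipfw_table_commands(addresses, table=15, ipfw="ipfw"):
    """Yield one 'ipfw table N add ADDR' command per unique valid IPv4 address."""
    seen = set()
    for raw in addresses:
        try:
            addr = ipaddress.IPv4Address(raw.strip())
        except ValueError:
            # Skip anything that is not a well-formed IPv4 address.
            continue
        if addr not in seen:
            seen.add(addr)
            yield f"{ipfw} table {table} add {addr}"


if __name__ == "__main__":
    # Hypothetical collector output, including a duplicate and a bad line.
    collected = ["192.0.2.10", "bogus", "192.0.2.10 ", "198.51.100.7"]
    for command in ipfw_table_commands(collected):
        print(command)
```

The generated lines can be reviewed by hand and then fed to a shell, which keeps the step that actually changes the firewall explicit.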

2. Configuring the MiddleMan proxy server
The section of mman.xml responsible for handing files over to the editing script is called external.

3. The "external" editing script


The mypatcher.pl script receives files with the MIME type "application/x-bittorrent", adds a local tracker entry (removing retracker.local if it is already there, which solves the clients' problems with zeroconf) and sends the contents back to the proxy, also saving the file to disk along the way, in this form:

# for i in `find /home/torrents/patched/ -type d`; do echo -n "$i " && ls -1 "$i" | wc -l; done | awk '{print $2" "$1}' | sort -rn | head -n 10

24103 /home/torrents/patched/dl.rutracker.org
7817 /home/torrents/patched/dl.torrents.ru
6184 /home/torrents/patched/tfile.ru
3744 /home/torrents/patched/kinozal.tv
2928 /home/torrents/patched/rutor.org
2872 /home/torrents/patched/torrents.thepiratebay.org
2583 /home/torrents/patched/www.tfile.ru
2582 /home/torrents/patched/www.torrentino.ru
2531 /home/torrents/patched/pornolab.net
1032 /home/torrents/patched/www.rutor.org
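
mypatcher.pl itself is not listed in the article. The patching step it performs might look roughly like the following Python sketch: decode the bencoded metainfo, drop retracker.local entries, append the local tracker as its own tier in announce-list, and encode the result again. The tracker URL is a hypothetical one built from the host name mentioned earlier, and the tiny bencode codec is a simplification, not the author's code.

```python
# Hypothetical announce URL; only the host name appears in the article.
LOCAL_TRACKER = b"http://retracker.smarthome.spb.ru/announce"


def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value at offset i; return (value, next_offset)."""
    c = data[i:i + 1]
    if c == b"i":
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":
        i += 1
        d = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            d[key], i = bdecode(data, i)
        return d, i + 1
    # Otherwise a byte string: <length>:<bytes>
    colon = data.index(b":", i)
    n = int(data[i:colon])
    start = colon + 1
    return data[start:start + n], start + n


def bencode(v) -> bytes:
    """Encode ints, bytes, lists and dicts (keys sorted, as bencoding requires)."""
    if isinstance(v, int):
        return b"i%de" % v
    if isinstance(v, bytes):
        return b"%d:%s" % (len(v), v)
    if isinstance(v, list):
        return b"l" + b"".join(bencode(x) for x in v) + b"e"
    if isinstance(v, dict):
        return b"d" + b"".join(bencode(k) + bencode(v[k]) for k in sorted(v)) + b"e"
    raise TypeError(f"cannot bencode {type(v).__name__}")


def add_local_tracker(torrent: bytes, tracker: bytes = LOCAL_TRACKER) -> bytes:
    """Return the torrent with the local tracker appended as the last tier."""
    meta, _ = bdecode(torrent)
    tiers = meta.get(b"announce-list")
    if tiers is None:
        # Only a single 'announce' URL: start the tier list from it.
        tiers = [[meta[b"announce"]]] if b"announce" in meta else []
    cleaned = []
    for tier in tiers:
        # Drop retracker.local entries and any earlier copy of our tracker.
        kept = [u for u in tier if b"retracker.local" not in u and u != tracker]
        if kept:
            cleaned.append(kept)
    cleaned.append([tracker])
    meta[b"announce-list"] = cleaned
    return bencode(meta)


if __name__ == "__main__":
    sample = b"d8:announce20:http://example.com/a4:infod6:lengthi1e4:name1:xee"
    print(add_local_tracker(sample))
```

One caveat: the sketch re-encodes the whole file, so the info dictionary keeps its hash only if it was canonically encoded (sorted keys) in the first place. A more careful patcher would copy the raw info bytes through untouched.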


The result: on a server that does NAT, shaping and routing for a network of 3000 users, the load from mman is essentially unnoticeable. About 200-400 files a day are currently being edited. In almost a year of operation there have been no complaints; everyone is happy.

Update 1. "Saving files to disk" is used solely for collecting anonymous statistics in the form shown above. Well, I simply forgot to turn off this debugging mechanism in the script. :) Cadm

Source: https://habr.com/ru/post/106131/

