How to avoid becoming a bot in the BitTorrent DHT and other P2P networks

The BitTorrent DHT network lets you find the sources of a torrent by the hash from a magnet link. The network consists of nodes, which can be BitTorrent clients but also malicious programs that interfere with the normal operation of the network. They prevent the client from getting the sources of a torrent, redirect requests to the nodes under attack, and fill the node list with useless addresses.

While working on a peer and seed counter (DHT Scrape) for this network, I came across the following kinds of attacks.

Port number 1

Some nodes returned node lists in which the port was set to 1. I found a recommendation on the Internet not to connect to ports 0 to 1024, because they are critical for the operation of Internet services. A node that sends addresses with a port in this range is ignored.
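A minimal sketch of this check in Python. The name `has_reserved_port` and the `(ip, port)` tuple layout are my assumptions for illustration, not part of any particular client.

```python
from typing import Iterable, Tuple

Addr = Tuple[str, int]  # (ip, port); hypothetical layout for this sketch

MIN_PORT = 1025   # ports 0..1024 are treated as reserved, as described above
MAX_PORT = 65535


def has_reserved_port(nodes: Iterable[Addr]) -> bool:
    """Return True if any address in a reply uses a port outside 1025..65535."""
    return any(not (MIN_PORT <= port <= MAX_PORT) for _, port in nodes)
```

If the function returns True for a reply, the node that sent the reply is ignored.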

Mirrors

There are nodes that simply send back the packet they received. In effect, we ask ourselves and answer ourselves. Since such a node appears to have responded correctly, some clients mark it as active and pass its address on to other nodes. To exclude such nodes from the network, you need to check for this case.
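One simple way to detect a mirror is to compare the raw bytes of the reply with the raw bytes of the request. This is a sketch of that idea only; the function name `is_mirror` is hypothetical.

```python
def is_mirror(sent_packet: bytes, received_packet: bytes) -> bool:
    """Return True if the reply is a byte-for-byte echo of our own request."""
    return received_packet == sent_packet
```

A reply that passes this check still has to be validated in the usual way; the comparison only filters out plain echoes.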

Port flooding

Some nodes return the same IP with a bunch of different ports. This can also happen legitimately when a node sits behind a NAT that changes its outgoing port. If a node with this IP and ID has already been confirmed (that is, we have communicated with it), the new information is discarded. Otherwise, the last entry or a random one is used for verification.
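A sketch of this rule, assuming a table keyed by IP and node ID and using the "last entry wins" variant. The names `Entry` and `offer` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Key = Tuple[str, bytes]  # (ip, node_id); hypothetical key for this sketch


@dataclass
class Entry:
    port: int
    confirmed: bool = False  # set to True once the node has answered us directly


def offer(table: Dict[Key, Entry], ip: str, node_id: bytes, port: int) -> bool:
    """Store an address learned from another node; return True if it was kept."""
    known = table.get((ip, node_id))
    if known is not None and known.confirmed:
        return False  # we already talked to this node, ignore the hearsay port
    table[(ip, node_id)] = Entry(port)  # unconfirmed: the last entry wins
    return True
```

However many ports are reported, the table holds at most one entry per IP and ID, so the flood cannot grow the node list.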

Token

Each packet carries a token that lets us confirm that our request reached the addressee and that the reply came from it, which rules out attacks with address substitution. However, you must check that the token (like every other string) does not extend past the end of the packet. Otherwise a malformed length could cause the data in memory following the packet to be read.
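The length check can be sketched as a bounds-checked reader for a bencoded string (`<length>:<bytes>`). In Python an oversized length would silently yield a truncated string rather than read foreign memory, but the explicit comparison is the same one a C or C++ client needs. The name `read_bstring` is hypothetical.

```python
from typing import Tuple


def read_bstring(buf: bytes, pos: int = 0) -> Tuple[bytes, int]:
    """Read one bencoded string at pos; return (value, position after it)."""
    colon = buf.find(b":", pos)
    if colon == -1:
        raise ValueError("missing ':' after string length")
    digits = buf[pos:colon]
    if not digits.isdigit():
        raise ValueError("string length is not a decimal number")
    start = colon + 1
    end = start + int(digits)
    if end > len(buf):
        # The declared length points past the end of the packet.
        raise ValueError("string runs past the end of the packet")
    return buf[start:end], end
```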

Timer

The token is no panacea against incoming requests with a spoofed source address. For this case, only 2 consecutive requests per second are allowed from one IP. Any requests beyond that are simply ignored.
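A sketch of such a timer as a sliding one-second window per IP. The class name `PerIpLimiter` is hypothetical, and the current time is passed in explicitly so the behaviour is easy to test.

```python
from collections import defaultdict, deque


class PerIpLimiter:
    """Allow at most max_requests per window (in seconds) from each IP."""

    def __init__(self, max_requests: int = 2, window: float = 1.0) -> None:
        self.max_requests = max_requests
        self.window = window
        self._seen = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: float) -> bool:
        stamps = self._seen[ip]
        # Forget requests that have left the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.max_requests:
            return False  # over the limit: the request is ignored
        stamps.append(now)
        return True
```

In a real client, `now` would come from something like `time.monotonic()`, and entries for IPs that have gone quiet would be purged from time to time.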

Local addresses

Some nodes return local addresses, which are of course unreachable from the Internet. This can also be the internal address of a router. Such addresses should be ignored as well, unless they were received from a node on the same local network.
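A sketch using Python's standard `ipaddress` module. As a simplification, I assume here that "same local network" can be approximated by "the sender's address is itself not public"; a real client would compare against its own interfaces. The name `is_usable` is hypothetical.

```python
import ipaddress


def is_usable(ip: str, sender_ip: str) -> bool:
    """Decide whether an address reported by sender_ip is worth keeping."""
    if ipaddress.ip_address(ip).is_global:
        return True
    # Private, loopback, link-local and similar addresses are accepted only
    # when the sender is itself on a non-public network (simplification).
    return not ipaddress.ip_address(sender_ip).is_global
```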

Publish only verified nodes

When we are asked for a list of nodes, we select from the node database only those that gave a correct answer to our own request (active). The rest (undefined) are polled gradually and removed from the database if they do not answer (dead).
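The three states can be sketched as follows. The node database is modelled as a plain dictionary, and all names (`State`, `publishable`, `on_reply`, `on_timeout`) are hypothetical.

```python
from enum import Enum
from typing import Dict, List, Tuple

Addr = Tuple[str, int]


class State(Enum):
    UNDEFINED = 0  # learned from someone else, not yet verified
    ACTIVE = 1     # answered our own request correctly


def publishable(db: Dict[Addr, State]) -> List[Addr]:
    """Addresses we are willing to hand out to other nodes."""
    return [addr for addr, state in db.items() if state is State.ACTIVE]


def on_reply(db: Dict[Addr, State], addr: Addr) -> None:
    db[addr] = State.ACTIVE


def on_timeout(db: Dict[Addr, State], addr: Addr) -> None:
    db.pop(addr, None)  # dead nodes are dropped, not published
```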

The G2 network has recently suffered greatly from the large number of dead node addresses circulating in it. This slows down both joining the network and searching it.

Store node database

After a long break in the client's operation, all entries in the node database become expired. The client must still use them to enter the network until it has collected a sufficient number of active nodes. If all nodes turn out to be dead, the client falls back to the bootstrap nodes. In my experience, even a very old database with a sufficiently large number of nodes allows you to enter the network.
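A sketch of this fallback. The `is_alive` callback stands in for whatever ping mechanism the client has, and the function name and the example host in the usage note are assumptions for illustration.

```python
from typing import Callable, Iterable, List, Tuple

Addr = Tuple[str, int]


def pick_entry_nodes(
    saved: Iterable[Addr],
    bootstrap: Iterable[Addr],
    is_alive: Callable[[Addr], bool],
) -> List[Addr]:
    """Try the stored nodes first, even if expired; fall back to bootstrap."""
    alive = [node for node in saved if is_alive(node)]
    return alive if alive else list(bootstrap)
```

For example, `bootstrap` might contain a well-known router such as `("router.bittorrent.com", 6881)`, which is then contacted only when nothing in the saved database answers.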

Filter bits

To count the peers and seeds of a torrent in the network, a Bloom filter is used. Fake nodes can fill it with ones and thus distort the numbers. Therefore, data from at least three nodes are compared.
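One possible way to do the comparison is to estimate the count from each filter separately and take the median, so that a single filter stuffed with ones cannot shift the result. The median is my choice for this sketch, not necessarily what any client does. The estimate follows the formula given in BEP 33 for a 256-byte filter with two hash functions, and a fully saturated filter is treated as an infinite estimate.

```python
import math
from statistics import median
from typing import Sequence

M_BITS = 256 * 8  # filter size in bits (BEP 33)
K_HASHES = 2      # hash functions per inserted item (BEP 33)


def estimate(bloom: bytes) -> float:
    """Estimate how many items were inserted into one Bloom filter."""
    zeros = sum(8 - bin(byte).count("1") for byte in bloom)
    if zeros == 0:
        return math.inf  # every bit is set: the filter carries no information
    c = min(M_BITS - 1, zeros)
    return math.log(c / M_BITS) / (K_HASHES * math.log(1 - 1 / M_BITS))


def combined_estimate(filters: Sequence[bytes]) -> float:
    """Median of the per-node estimates; needs at least three filters."""
    if len(filters) < 3:
        raise ValueError("need filters from at least three nodes")
    return median(estimate(f) for f in filters)
```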

Send ping before replying

So as not to take part in primitive DDoS attacks, we send a ping to the node before sending it a large packet. Only after a correct response to the ping do we send the large packet.
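A sketch of this handshake. The `send` callback is a hypothetical transport hook, and the `b"PING"` prefix with a random 4-byte transaction ID is a stand-in for the real ping message of the protocol.

```python
import os
from typing import Callable, Dict, Tuple

Addr = Tuple[str, int]


class GuardedResponder:
    """Hold back a large reply until the requester has answered a ping."""

    def __init__(self, send: Callable[[Addr, bytes], None]) -> None:
        self.send = send
        self.pending: Dict[Addr, Tuple[bytes, bytes]] = {}  # addr -> (tid, payload)

    def reply(self, addr: Addr, payload: bytes) -> bytes:
        """Queue the payload and send only a small ping; return its ID."""
        tid = os.urandom(4)
        self.pending[addr] = (tid, payload)
        self.send(addr, b"PING" + tid)
        return tid

    def on_pong(self, addr: Addr, tid: bytes) -> bool:
        """Release the queued payload if the pong matches our ping."""
        entry = self.pending.get(addr)
        if entry is None or entry[0] != tid:
            return False
        del self.pending[addr]
        self.send(addr, entry[1])
        return True
```

A victim whose address was spoofed receives only the small ping, and it never produces the matching pong, so the large packet is never sent to it.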

Conclusion

I hope this article will help you write more efficient and secure clients for P2P networks.

Source: https://habr.com/ru/post/263831/
