Basics of I2P Network Client Development

This article is intended for those who would like to develop their own I2P client from scratch. It assumes familiarity with the basic concepts of I2P. There are already plenty of documents and articles on that subject, including ones translated into Russian. There is also official documentation that describes the protocols and message formats quite well. Unfortunately, it is scattered, and many non-obvious details are missing. This article is based primarily on studying and debugging the official I2P Java client. The ultimate goal is an implementation written entirely in C++. The source code of the project in its current state is available on GitHub.

Encryption used

To build your own I2P router, you need implementations of the following cryptographic algorithms:

ElGamal. Asymmetric encryption based on modular exponentiation. The base and the modulus are fixed constants for the entire I2P network. In addition to standard blocks of 514 bytes, non-standard blocks of 512 bytes are also used.
Diffie-Hellman, to derive a shared symmetric encryption key by exchanging public keys. The same keys are used as for ElGamal.
DSA, for creating and verifying digital signatures.
AES in two modes: CBC, which takes an encryption key and an initialization vector (IV), and ECB, used to encrypt the 16-byte IV itself.
SHA-256, to calculate hashes.
Adler-32, to calculate message checksums.
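
Of the algorithms listed above, Adler-32 is the only one simple enough to write out in a few lines. The following is a minimal sketch of the checksum as defined in RFC 1950; the function name adler32 and its two overloads are choices made for this illustration, not names taken from any I2P code base.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Adler-32 (RFC 1950): two running sums, both taken modulo 65521.
std::uint32_t adler32(const std::uint8_t* data, std::size_t len)
{
    assert(data != nullptr || len == 0);
    const std::uint32_t kMod = 65521; // largest prime below 2^16
    std::uint32_t a = 1;              // sum of all bytes, plus one
    std::uint32_t b = 0;              // sum of all intermediate values of a
    for (std::size_t i = 0; i < len; ++i) {
        a = (a + data[i]) % kMod;
        b = (b + a) % kMod;
    }
    return (b << 16) | a;
}

// Convenience overload for text input.
std::uint32_t adler32(const std::string& s)
{
    return adler32(reinterpret_cast<const std::uint8_t*>(s.data()), s.size());
}
```

For example, adler32("Wikipedia") evaluates to 0x11E60398, the commonly quoted reference value for this checksum.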

Basic Protocols

The I2P network consists of 4 main levels:

Transport level. Encrypted TCP/IP or UDP connections over the Internet. Covers connection setup and encryption.
Tunnels. A node's “windows” to the outside world, located on other nodes and allowing it to hide its true location. A tunnel consists of a sequence of nodes connected by transport-level protocols. It can be pictured simply as a chain of proxy servers that anonymizes both the client and the server.
"Garlic". Delivery of messages, or sequences of messages, between two end nodes over an arbitrary route through tunnels. It is characterized by session identifiers and uses asymmetric encryption, and symmetric encryption once the session has been established.
Application-level protocols for transferring user data between nodes.

Each level adds its own encryption for a different purpose. Transport-level encryption hides traffic from the ISP; tunnel encryption hides the content and direction from the tunnel's intermediate nodes; "garlic" encryption hides the content from the tunnels' end nodes when messages are passed between tunnels.

Transport level

To establish a transport connection, you need to know the peer's IP address and port. The list of known nodes, called netDb, changes during operation as information about new nodes arrives from other nodes. Initially, the list is downloaded from special sites whose addresses are listed explicitly in the file router/networkdb/Reseeder.java. The protocol that runs over TCP/IP is called NTCP, and the one over UDP is called SSU. Besides some differences in connection setup, SSU, being packet-based, supports splitting long messages into several fragments. Transmitted messages consist of a header, an I2NP message (see the I2NP protocol section below) and a checksum. A special message carrying the current time is sent periodically for synchronization. When a connection is established, the routers exchange public keys, from which each side independently computes the shared AES key using the Diffie-Hellman algorithm.
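
The key agreement at the end of connection setup can be sketched with toy numbers. The example below shows only the arithmetic of Diffie-Hellman: the prime 23 and base 5 are textbook values, not the I2P constants, and the names mod_pow, DhParty, make_party and shared_secret are invented for this illustration. A real implementation needs big-integer arithmetic and the fixed network-wide constants.

```cpp
#include <cassert>
#include <cstdint>

// Square-and-multiply modular exponentiation: base^exp mod modulus.
// The modulus must stay below 2^32 so that products fit in 64 bits.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exp, std::uint64_t modulus)
{
    assert(modulus > 1 && modulus < (1ULL << 32));
    std::uint64_t result = 1;
    base %= modulus;
    while (exp > 0) {
        if (exp & 1) result = result * base % modulus;
        base = base * base % modulus;
        exp >>= 1;
    }
    return result;
}

// One side of the exchange: a private exponent and the public value g^priv mod p.
struct DhParty {
    std::uint64_t priv;
    std::uint64_t pub;
};

DhParty make_party(std::uint64_t priv, std::uint64_t g, std::uint64_t p)
{
    return DhParty{priv, mod_pow(g, priv, p)};
}

// Each side raises the peer's public value to its own private exponent.
std::uint64_t shared_secret(const DhParty& self, std::uint64_t peer_pub, std::uint64_t p)
{
    return mod_pow(peer_pub, self.priv, p);
}
```

With p = 23 and g = 5, private exponents 4 and 3 give public values 4 and 10, and both sides arrive at the same shared value 18 without ever sending a private exponent over the wire.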

Tunnels

Tunnels are always unidirectional: messages can travel only from the input node (Gateway) to the output node (Endpoint). Depending on which end belongs to the tunnel's owner, who holds all the information about the tunnel, tunnels are divided into incoming (the owner is the output node) and outgoing (the owner is the input node). Intermediate nodes do not know whether a tunnel is incoming or outgoing; the only thing an intermediate node does is encrypt the message with its own key and pass it to the next node. This has an important consequence: the sequential decryption of tunnel messages must be done by the owner, since only the owner has the encryption keys of all intermediate nodes. For incoming tunnels this is fairly obvious: having received a message, the output node decrypts it layer by layer. For outgoing tunnels, however, the original unencrypted message must be sequentially decrypted before it is sent, so that the encryption applied along the way cancels out and the plaintext emerges at the output node.
Tunnels that the node does not own are called transit tunnels. They carry other participants' traffic and are needed to keep the whole I2P network running; serving them is what turns a node into a router.
Tunnel nodes use AES encryption with three different keys. One encrypts the node's reply when the tunnel is created; the other two are used when passing data through the tunnel: one encrypts the data itself, and the other encrypts the initialization vector (IV) used for the data encryption. The IV is encrypted with the same key twice, once before the data is encrypted and once after; this is called double encryption. The node receives these two keys in the corresponding tunnel creation message, encrypted with its public key using ElGamal.
Only TunnelData messages travel inside tunnels; in general, each consists of several fragments. To pass a message from one tunnel to another, the TunnelGateway message is used. Although the official documentation says that a two-way connection requires at least 4 tunnels (2 incoming and 2 outgoing), in practice it is not necessary to send messages through outgoing tunnels: you can send a TunnelGateway message directly to the input node of the desired incoming tunnel.
In a TunnelData message, the checksum is computed over the data that follows the zero byte, with the unencrypted IV appended to it.
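
The order in which the owner removes encryption layers can be illustrated with a toy model. The sketch below deliberately does not use AES: toy_encrypt and toy_decrypt are hypothetical one-byte-key stand-ins chosen only because layers with different keys do not commute, and the function names are invented for this example. Each hop encrypts once; the owner applies the inverse operations starting with the last hop's key.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Toy stand-in for one hop's encryption layer (NOT AES):
// add the key to every byte, then rotate the byte left by 3 bits.
Bytes toy_encrypt(Bytes data, std::uint8_t key)
{
    for (auto& b : data) {
        std::uint8_t s = static_cast<std::uint8_t>(b + key);
        b = static_cast<std::uint8_t>((s << 3) | (s >> 5));
    }
    return data;
}

// Exact inverse of toy_encrypt: rotate right by 3 bits, then subtract the key.
Bytes toy_decrypt(Bytes data, std::uint8_t key)
{
    for (auto& b : data) {
        std::uint8_t r = static_cast<std::uint8_t>((b >> 3) | (b << 5));
        b = static_cast<std::uint8_t>(r - key);
    }
    return data;
}

// What the tunnel does: every hop, from gateway to endpoint, encrypts once.
Bytes pass_through_hops(Bytes data, const std::vector<std::uint8_t>& hop_keys)
{
    for (std::uint8_t k : hop_keys) data = toy_encrypt(std::move(data), k);
    return data;
}

// What the owner does: remove all layers, last hop's layer first.
// Inbound tunnel: applied to the received message.
// Outbound tunnel: applied to the plaintext before it is sent.
Bytes owner_decrypt_layers(Bytes data, const std::vector<std::uint8_t>& hop_keys)
{
    for (auto it = hop_keys.rbegin(); it != hop_keys.rend(); ++it)
        data = toy_decrypt(std::move(data), *it);
    return data;
}
```

In both directions the same owner-side routine is used: for an outgoing tunnel, pass_through_hops(owner_decrypt_layers(m, keys), keys) yields m at the endpoint, and for an incoming tunnel, owner_decrypt_layers(pass_through_hops(m, keys), keys) yields m at the owner.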

I2NP protocol

Data exchange within the I2P network is carried out with I2NP messages of various types. Each message has a header containing its type and length, which makes it possible to find the boundaries between messages. Depending on the type, the message length varies from 20 bytes to 64 KB. Each level uses “wrapper” messages that contain other, higher-level I2NP messages. For tunnels, the wrappers are TunnelData, for transmission inside a tunnel, and TunnelGateway, for transmission between tunnels. For "garlic" the wrapper is Garlic. Most I2P traffic consists of messages nested as follows:
Data -> Garlic -> TunnelData.
As a rule, messages are sent through tunnels, although they can also be sent directly between routers, in particular for the initial creation of new tunnels. Routers also exchange DatabaseStore messages immediately after a connection is established. Messages between destinations must be sent through garlic, since the corresponding field is present only there.
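
The idea of a type-and-length header that delimits messages, and of wrapper messages carrying other messages as payload, can be sketched as follows. This is a deliberately simplified model and not the actual I2NP wire format: the 3-byte header (1-byte type, 2-byte big-endian length), the Frame structure, the function names and the type codes used in the usage note are all assumptions made for this illustration. Real Garlic and TunnelData messages also encrypt and fragment their contents.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A message reduced to the two header fields the text mentions, plus payload.
struct Frame {
    std::uint8_t type;
    std::vector<std::uint8_t> payload;
};

// Hypothetical header: 1-byte type followed by a 2-byte big-endian payload length.
std::vector<std::uint8_t> encode_frame(const Frame& f)
{
    assert(f.payload.size() <= 0xFFFF);
    std::vector<std::uint8_t> out;
    out.push_back(f.type);
    out.push_back(static_cast<std::uint8_t>(f.payload.size() >> 8));
    out.push_back(static_cast<std::uint8_t>(f.payload.size() & 0xFF));
    out.insert(out.end(), f.payload.begin(), f.payload.end());
    return out;
}

// A wrapper message simply carries the encoded inner message as its payload.
Frame wrap(std::uint8_t outer_type, const Frame& inner)
{
    return Frame{outer_type, encode_frame(inner)};
}

// Split a byte stream into complete frames using the length field;
// parsing stops at the first incomplete frame.
std::vector<Frame> split_frames(const std::vector<std::uint8_t>& stream)
{
    std::vector<Frame> frames;
    std::size_t pos = 0;
    while (stream.size() - pos >= 3) {
        std::size_t len = (static_cast<std::size_t>(stream[pos + 1]) << 8) | stream[pos + 2];
        if (stream.size() - pos - 3 < len) break; // payload not fully received yet
        auto begin = stream.begin() + static_cast<std::ptrdiff_t>(pos + 3);
        auto end = begin + static_cast<std::ptrdiff_t>(len);
        frames.push_back(Frame{stream[pos], std::vector<std::uint8_t>(begin, end)});
        pos += 3 + len;
    }
    return frames;
}
```

With placeholder type codes 20 for Data, 11 for Garlic and 18 for TunnelData, the nesting from the text becomes wrap(18, wrap(11, Frame{20, {1, 2, 3}})); calling split_frames on the payload of each layer peels the wrappers off again one at a time.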

Routers and destinations

Working in the I2P network requires an I2P client, which consists of a router, providing access to the I2P network, and destinations, which exchange the actual content. Information about routers, including their IP addresses, is public; moreover, the current list of routers can be downloaded from special FTP sites. The location of destinations, by contrast, is confidential. Which destinations are hosted on a given router is known only to that router; nobody else can obtain this information, and this is one of the main mechanisms that ensure anonymity in the I2P network.
Since routers mostly run on the computers of network participants, their set changes all the time, so routers have to keep their list of other routers up to date continuously. This process is called exploration (“exploratory”): the router sends requests with a randomly chosen 32-byte address to special routers called floodfills. Floodfill routers are assumed to have all the information about the network. Among other things, floodfill routers constantly pass information about newly discovered nodes to one another.
To request information about a node, I2NP uses the DatabaseLookup message, and to transfer the information itself, the DatabaseStore message. As a rule, messages are sent through tunnels, but a node sends its DatabaseStore directly at the transport level immediately after a connection is established, thereby announcing its existence to the network. Otherwise, building tunnels would be impossible for new nodes.
A DatabaseStore can carry two kinds of information: if the structure stored for an address is a RouterInfo, the address belongs to a router, and if it is a LeaseSet, to a destination.
RouterInfo contains the router's public keys as well as assorted service information, the most important parts of which are the IP addresses, ports and supported transport protocols to connect with, and whether the router is a floodfill. Since RouterInfo can contain a fair amount of text, it is transmitted gzip-compressed.
LeaseSet contains a list of the destination's incoming tunnels, as well as a public key for encrypting garlic messages addressed to that destination.
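
The distinction between the two kinds of DatabaseStore content could be modeled roughly as below. This is only a sketch of the data relationships described above: every structure and field name here (RouterAddress, Lease, EntryKind, directly_reachable and so on) is hypothetical, keys and signatures are omitted or reduced to raw bytes, and nothing reflects the real binary layout.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using Hash = std::array<std::uint8_t, 32>; // 32-byte I2P address hash

struct RouterAddress {
    std::string transport; // e.g. "NTCP" or "SSU"
    std::string host;
    std::uint16_t port;
};

struct RouterInfo {
    std::vector<RouterAddress> addresses; // how to reach the router directly
    bool floodfill;                       // whether it acts as a floodfill
};

struct Lease {
    Hash gateway;            // input node of one incoming tunnel
    std::uint32_t tunnel_id; // tunnel identifier at that node
};

struct LeaseSet {
    std::vector<Lease> leases;                // incoming tunnels of the destination
    std::vector<std::uint8_t> encryption_key; // public key for garlic messages
};

enum class EntryKind { Router, Destination };

// A netDb entry: the key identifies either a router or a destination.
struct DatabaseStore {
    Hash key;
    EntryKind kind;
    RouterInfo router;  // meaningful only when kind == EntryKind::Router
    LeaseSet lease_set; // meaningful only when kind == EntryKind::Destination
};

DatabaseStore make_router_store(const Hash& key, RouterInfo info)
{
    return DatabaseStore{key, EntryKind::Router, std::move(info), LeaseSet{}};
}

DatabaseStore make_lease_store(const Hash& key, LeaseSet leases)
{
    return DatabaseStore{key, EntryKind::Destination, RouterInfo{}, std::move(leases)};
}

// Only routers publish IP addresses; a destination is reachable solely
// through the incoming tunnels listed in its LeaseSet.
bool directly_reachable(const DatabaseStore& entry)
{
    assert(entry.kind == EntryKind::Router || entry.router.addresses.empty());
    return entry.kind == EntryKind::Router && !entry.router.addresses.empty();
}
```

The point of the model is the asymmetry: a router entry exposes transport addresses, while a destination entry exposes only tunnel gateways, so its host never appears in the database.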

Application Services

Now consider what an I2P client actually does that is of use: anonymous hosting of network resources and, correspondingly, access to them. To begin with, let's try to fetch data from some website, for example Flibusta. At this point all we have is the 32-byte hash of its I2P address; our goal is to send an HTTP request and receive a response.
Naturally, there is no router with that address in the database (otherwise the resource's IP address would be visible to everyone), so the only way to deliver the request is to send it into one of the target's incoming tunnels that exists at that moment, and for that you first have to request and receive its LeaseSet. Unlike RouterInfo, which can be requested and received from a neighboring node at the transport level, a LeaseSet can be requested and received only through tunnels, which have to be built first. This leads to a disappointing conclusion: using the I2P network “on demand” will not work. The I2P router must be running and must be constantly busy building and maintaining tunnels. Because the network is decentralized, building tunnels is not easy: most attempts to create a tunnel fail.
Two conditions must hold for a tunnel to be built successfully:

Every node participating in the tunnel must be reachable at the transport level, at least by the previous node in the tunnel.
Every node participating in the tunnel must agree to build the new tunnel. A node may refuse to create a tunnel, for example because of its load.

The maximum lifetime of a tunnel is 10 minutes, and a tunnel can also cease to exist earlier if one of its nodes goes offline. Tunnel owners therefore constantly send test messages to keep their list of “live” tunnels up to date.
So, tunnels have been built and the required LeaseSet has been obtained. Now you can send the HTTP request, and it will even reach the addressee; however, we would also like to receive a response. To get one, we must include our own LeaseSet in the message; the reply will then be sent to us through one of our incoming tunnels and will most likely reach our node safely. Since several connections can run through our node at the same time, either each of them must be given its own I2P address with a LeaseSet made up of several incoming tunnels, or a “shared” address must be used, with connections multiplexed by a special protocol that has the appropriate fields and acts as a “wrapper” over the application protocol. This protocol is called I2CP. The official I2P client uses it exclusively, although it is not required for building your own services. Of course, to access Flibusta you have to use I2CP, since that is what it expects. However, to build your own torrent-like network, for example, you can get by with I2P alone.
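
The bookkeeping behind a list of “live” tunnels might look like the sketch below. It assumes only what the text states, a 10-minute maximum lifetime and periodic tests that can fail; the Tunnel structure, the test_failed flag and the functions is_alive and prune are hypothetical names, and how test messages are actually sent and matched to replies is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

using Clock = std::chrono::steady_clock;

struct Tunnel {
    std::uint32_t id;
    Clock::time_point created; // when the tunnel was built
    bool test_failed;          // set when a test message got no reply
};

// Maximum tunnel lifetime stated in the text.
const std::chrono::minutes kTunnelLifetime(10);

// A tunnel is usable while it is younger than the lifetime and still passes tests.
bool is_alive(const Tunnel& t, Clock::time_point now)
{
    return !t.test_failed && now - t.created < kTunnelLifetime;
}

// Drop expired or failed tunnels from the pool; returns how many were removed.
std::size_t prune(std::vector<Tunnel>& pool, Clock::time_point now)
{
    auto dead = std::remove_if(pool.begin(), pool.end(),
                               [now](const Tunnel& t) { return !is_alive(t, now); });
    std::size_t removed = static_cast<std::size_t>(pool.end() - dead);
    pool.erase(dead, pool.end());
    return removed;
}
```

A router would call prune on each pool at regular intervals and start building replacements whenever the number of remaining tunnels drops below what it needs, since most build attempts are expected to fail.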

The I2CP protocol and the protocol stack built on top of it are a separate topic, covered in a separate article.

Source: https://habr.com/ru/post/205320/
