Introduction |
Tor |
Tarzan and MorphMix |
Low-cost attack |
Low-cost attack on Tarzan and Morphmix |
Principles of building secure systems (conclusion)I present to your attention a part of the article by recognized researchers in the field of information security, the translation of which we (the participants of "
openPGP in Russia ") are now engaged in. Among the authors are
Willy Susilo (Co-Director,
Center for Computer and Information Security Research ).
The following "Introduction" from the article "
Principles of building anonymizing systems with low latency against counter-timing attacks " is addressed to all those interested in network anonymity. The authors in an accessible language reveal the basics of anonymizers and possible attacks against them, with an emphasis on the currently most popular anonymizing network "
Tor ".
Introduction
')
For the first time, anonymous communication systems were introduced in the seminal work of Chaum (Chaum 1981). The point is that anonymity is achieved by forwarding messages through a series of transfer nodes called mix nodes (mixing nodes). Each mix node performs two primary tasks. The first is to provide bitwise unlinkability of messages, the second is to mix the message flow.
To prevent the attacker from identifying the tracked message by its content at the exit from the node, all incoming messages are reduced to the same size (short messages are supplemented with random data) and are encrypted. This is “provide bit-wise indistinguishability.”
The mixing of the message flow is used so that the attacker cannot match the time the message entered the node with the time of its exit, and thus understand exactly which outgoing message corresponds to the desired one. The node accumulates incoming messages for a while, mixes them up and sends them further in random order.
Anonymous Internet communication systems can be divided into two categories:
- high latency systems (high-latency systems);
- systems with low latency (low-latency systems).
For example, e-mail (e-mail) is a system with long delays, since It can take a long time to deliver a message. If real-time interaction is required (or close to it), systems with low latency should be used. SSH and instant messenger are examples of low latency systems. The non-traceability of messages in the systems of both categories is achieved through the implementation of Chaum's ideas: a chain of transmission nodes between the sender and receiver, and encryption hiding the message data. Each node in the chain knows only its predecessor, from whom it received the message, and its successor, to whom it will transmit the message.
Systems with large delays specialize in sending single messages, and systems with low delays in maintaining a connection. This means that in systems with large delays a new path is created for each message, a new message is a new path. And in systems with small delays, one path is used for some time to send a whole stream of packets.
There is another significant difference with low latency systems. To meet the stringent requirements for message delivery time, we have to abandon the accumulation and mixing phase of messages. Therefore, such systems are more vulnerable to traffic analysis attacks and in particular timing-attacks. Simple timing attacks can be reduced to calculating the time it takes a packet to go through a network. More complex timing attacks may include an analysis of the distinctive features of the connection used by the victim - identifying the traffic pattern
1 .
Timing attacks use the fact that all nodes in the network introduce different delays. Knowing the time of delays, one can speculate about the connection between the incoming node and the outgoing flows. In other words, guess which outgoing flow corresponds to the monitored incoming flow. An attacker may observe connections between nodes for a while, and then, comparing the traffic patterns of all nodes, identify nodes with similar patterns — probably these nodes form a chain. Using statistical methods, an attacker can obtain information about the sender and receiver of the stream, and even identify the entire path of the stream. To protect against this attack, you need to make sure that the temporal characteristics of all threads are indistinguishable. However, this will require a significant amount of mixing operations
2 and a large amount of cover traffic (cover traffic)
3 , hence the delay time will increase. Finding the right balance between anonymity and delays is no easy task.
To successfully carry out the aforementioned attack, a global observer is needed, able to monitor all the streams passing through the network. Anonymizing nets with low latencies, such as Tor, are considered to successfully withstand a weaker threat model that does not include a global observer. In this weaker model, the attacker can only see part of the ties. If we talk about large public networks, such as the Internet, then it is quite possible to rely on such an assumption implying the absence of a global observer.
Recently, Murdoch and Danezis set an example of a successful attack on low latency systems without using a global observer. The attack is based on the analysis of traffic offered by Danesis in 2004. In their attack, the anonymity provided by Tor can be broken by an attacker who sees only part of the network or owns one Tor node. The attack works because Tor developers have removed the blending operation that was in the earlier version, and now the incoming thread queue is processed in round robin fashion
4 mode. The Tor node being controlled by the attacker creates connections with other network nodes and thus can indirectly estimate the amount of traffic passing through them at each time point. Estimates are based on the difference in delays in the streams sent and received on these connections.
Based on the fact that
- the amount of traffic passing through each Tor-node, or in other words, the load on the Tor-node, is the sum of the loads of all connections established with this node
- and the attacker was able to estimate these total loads for all nodes at each time point
the attacker can make a fairly good assumption about the flow routes.
The authors note that their attack is applicable to all anonymous networks with low latency. This loud statement hints at the inconsistency of the threat model used to create many anonymous systems with low latency.
Our contribution
We tested the Murdoch and Daneseis attacks (Murdoch & Danezis 2005) on other anonymous networks with low latency. Tarzan (Freedman & Morris 2002) and MorphMix (Rennhard & Plattner 2002) do not work like Tor. In particular, they follow the peer-to-peer architecture, and Tor uses dedicated servers. Tarzan also uses some blending operations and cover traffic that is not available in Tor. And in MorphMix, intermediate nodes can independently choose a part of the path for the transmitted stream, in contrast to Tor, where the client that initializes the stream itself sets the entire path.
Our analysis moved in two directions. Initially, we focused on the delays that occur in the system, to make sure that they can really be used to estimate the amount of traffic passing through the nodes. Then we analyzed the effectiveness of the attack for different architectures. The results of the study allowed us to identify the principles of creating anonymous networks with low latency capable of withstanding such attacks.
Document structure
In the second section, we analyze three anonymizing networks — Tor, Tarzan, and Morphmix
5 — and show how they differ from each other. In the third section, we consider the attack on Tor, described by Murdoch and Danesis (Murdoch & Danezis 2005). Note that the authors claim that all anonymous networks with low latency are subject to attack. In the 4th section, we will refute this statement with the help of Morphmix. In the fifth section, we derive several rules for constructing systems with small delays. And finally, in the 6th section we conclude.
Translator's Note1 Traffic Pattern
In the context of timing attacks, this is a characteristic form of a “spike” of activity on the “traffic volume versus time” graph, which can make it possible to distinguish between users or types of activity on the network.
2 Mixing threads
There are many options for mixing flows (mixing, mixing). For example, in the current Tor, in the absence of mixing, the picture is as follows: n chains entered the server X and the same number came out. What kind of user belongs as it is not clear, but according to the volume of traffic, bursts, and other indirect features, you can build hypotheses.
When mixing (only as the simplest example), all chains from server X to server Y can be encrypted and packed again into a common encrypted channel and it will be unclear: n has entered, and how many of which servers are released is not visible. Individual chains go from users to servers, and a continuous stream between servers.
Since separate chains will still appear at the exit from the Tor network, this is not very effective.
3 cover traffic
Fake traffic generated by transfer nodes, which has a special mark, due to which other nodes can distinguish it from real traffic
4 Round robin fashion
The simplest RR in a circle processes one packet of data from each stream (it is assumed that the priority of all threads is equal). In this way, “equality” of all data streams is achieved.
5 Tarzan and Morphmix existed as concepts at the time Tor began developing. Currently, these networks are not really used.