- Why are there asterisks in the traceroute output after node X?
- The service does not work, and the traceroute stops at node X - does that mean the problem is at node X?
- Why do seemingly identical traceroutes on Windows and Unix show different results?
- Why does traceroute show large delays at a particular node?
- Why does traceroute show private ("gray") addresses when tracing across the Internet?
- Why does a router reply to traceroute from an address other than the one I expect?
- Why does traceroute show some “wrong” domain names?
- Why does traceroute output differ from what intuition suggests more often than we would like?
Network engineers and administrators, in their relationship with traceroute, fall into two categories: those who regularly ask themselves and everyone around them these questions, and those who are reluctant to answer them.
This post does not answer the questions above. Or almost does not. Instead, it invites you to think about whether they should be asked at all and, if so, when and of whom.
On the subject of this relationship with traceroute, Richard Steenbergen gave a talk at the NANOG 47 conference (2009), which I recommend to everyone interested:
A Practical Guide to (Correctly) Troubleshooting with Traceroute (PDF, 222 KB; in English, naturally).
I will not retell the details here (those who want them can read the slides); I will dwell only on a set of arguments and conclusions worth keeping in mind before calling for help with a cry of “I have a trace showing that ...”
Some facts (without going into details)
- The delay a packet experiences crossing a network is made up of several components: serialization, buffering (queuing), and propagation. Each of them is more complicated than you think.
- The delay that traceroute shows you is more complicated still: routers handle packets addressed to themselves quite differently from transit packets. This gives the delay values in traceroute output a character of their own. It does not follow that you cannot rely on them, but you do need to know how to read them.
- Traffic on the Internet almost always takes different paths from client to server and from server to client. Traceroute always reports the combined delay of both directions, but shows the path in only one of them.
- Specifying the source address on a device with multiple interfaces (a router) does not affect which interface the probes are sent from. It affects the return path taken by the replies. That path is not visible in the output, but this way you can measure the difference in delay between parallel return routes.
- L3 load balancing somewhere on the Internet backbone is likely to send different packets of the same trace along different paths. The resulting output is hard to interpret, and not everyone can read it correctly.
- Modern routers do not comply with section 4.3.2.4 of RFC 1812, which requires the source IP address of an ICMP reply to be the address of the outgoing interface. Instead, they use the address of the interface on which the probe (the packet with TTL = 1) arrived. Had it been the other way around, though, traceroute output would be much harder to read.
- MPLS switching inside backbone networks (nowadays every self-respecting large provider uses it) leads to a counterintuitive way of delivering the replies to traceroute and an even less obvious way of computing the delays.
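To get a feel for the delay components from the first fact, here is a minimal back-of-the-envelope sketch. The function names are my own, and the signal speed in fibre (about 200,000 km/s) is a rough assumption, not a measured value. It ignores everything the slides warn about (control-plane processing of probes, asymmetric paths), so treat it as arithmetic, not as a model of real traceroute output.

```python
# Assumption: light in optical fibre travels at roughly 2/3 of its vacuum speed.
SPEED_IN_FIBER_KM_PER_S = 200_000


def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time needed to clock one packet onto a link, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000


def queuing_delay_ms(queued_bytes: int, link_bps: float) -> float:
    """Time a packet waits while the bytes queued ahead of it are sent."""
    return serialization_delay_ms(queued_bytes, link_bps)


def propagation_delay_ms(distance_km: float) -> float:
    """One-way time for the signal to cover the given fibre distance."""
    return distance_km / SPEED_IN_FIBER_KM_PER_S * 1000


if __name__ == "__main__":
    # A 1500-byte packet on a 1 Gbit/s link, and 1000 km of fibre.
    print(serialization_delay_ms(1500, 1e9))
    print(propagation_delay_ms(1000))
```

Under these assumptions a 1500-byte packet takes about 0.012 ms to serialize at 1 Gbit/s, while 1000 km of fibre costs about 5 ms one way. On fast links distance dominates, which is why a sudden jump of tens of milliseconds at an intercontinental hop is usually not a fault.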
Some of the most important conclusions (in my own creative interpretation)
- Traceroute is not as simple as it seems; you need to know how to use it, and for that you need to understand how it works.
- Most administrators and operations engineers, not to mention ordinary users, neither understand it nor know how to use it. This very often leads to false alarms, misdiagnoses, and so on.
- Not all traceroutes are alike. The standard tracert utility in Windows and traceroute in Linux are implemented differently and can give different results: by default Windows sends ICMP probes and Linux sends UDP probes, and firewalls along the path may filter the two protocols differently.
- Interpreting trace results requires experience and ingenuity. Some important conclusions can only be reached by guesswork based on indirect evidence, and others are never unambiguous, only "most likely".
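The point about ICMP versus UDP probes can be illustrated with a toy model. This is only a sketch: `Hop` and `simulate_trace` are hypothetical names, the addresses come from the documentation ranges, and the model assumes that a filter drops a probe before any TTL handling, so every hop from the filter onwards shows an asterisk. Real firewalls and routers behave in more varied ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    address: str
    # Probe protocols that a firewall at this hop silently drops (assumption).
    filtered: frozenset = frozenset()


def simulate_trace(path, protocol):
    """Return what each hop would show for the given probe protocol."""
    output = []
    blocked = False
    for hop in path:
        if protocol in hop.filtered:
            # The probe dies here, so this hop and all later ones stay silent.
            blocked = True
        output.append("*" if blocked else hop.address)
    return output


# Hypothetical three-hop path with a UDP filter at the second hop.
PATH = [
    Hop("192.0.2.1"),
    Hop("198.51.100.1", frozenset({"udp"})),
    Hop("203.0.113.10"),
]

if __name__ == "__main__":
    print("icmp:", simulate_trace(PATH, "icmp"))
    print("udp: ", simulate_trace(PATH, "udp"))
```

In this model the same path looks complete to an ICMP-based trace and broken after the first hop to a UDP-based one, although nothing on the path is actually down.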
Summary
If you are a customer
Do not bother the technical support of providers, integrators, vendors, the corporate help desk, and so on with your traceroute findings unless you are absolutely sure you can answer the question “why do I interpret the trace exactly this way?” At best, you will simply be ignored or brushed off. At worst, you may convince inexperienced support staff that your wrong theory is right, and they will go digging for the problem in entirely the wrong place.
If you see no problems with the service (everything works) but dislike something in the trace output, think carefully before raising the alarm. It is highly likely that you are simply misreading the output. Only very rarely can a single trace prove that a problem exists. And if there really is a problem, it is usually easier to demonstrate it without a trace.
If you are on the support side
Never be taken in by someone else's interpretation of traceroute output. Think for yourself (yours as ever, Captain Obvious). In general, if a problem report opens with traceroute output, it is a sure sign that everything stated after it must be personally rechecked three times before you do anything.
Read Richard's presentation. Make caution your basic troubleshooting tool: it is very easy to misinterpret a trace, and there is often not enough information for definitive conclusions. Always compare what the trace tells you with other available data, and where possible treat it only as supplementary or preliminary information.