Traceroute
You can use the Linux traceroute command to spot the slow leg of a network packet’s journey and troubleshoot sluggish network connections. We’ll show you how!
How traceroute Works
Once you understand how traceroute works, its results are much easier to interpret. The more complicated the route a network packet has to take to reach its destination, the harder it is to pinpoint where any slowdowns might be occurring.
A small organization’s local area network (LAN) might be relatively simple. It’ll probably have at least one server and a router or two. The complexity increases on a wide area network (WAN) that communicates between different locations or via the internet. Your network packet then encounters (and is forwarded and routed by) a lot of hardware, like routers and gateways.
Each data packet carries a header of metadata that describes its length, where it came from, where it’s going, the protocol it’s using, and so on. The specification of the protocol defines the header. If you can identify the protocol, you can determine the start and end of each field in the header and read the metadata.
traceroute uses the TCP/IP suite of protocols and, by default, sends User Datagram Protocol (UDP) packets. The IP header of each packet contains the Time to Live (TTL) field, which holds an eight-bit integer value. Despite what the name suggests, it represents a hop count, not a duration.
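To make the idea of reading fields out of a header concrete, here is a minimal sketch that packs a sample 20-byte IPv4 header and reads the TTL back out. The field layout follows RFC 791. The helper names (`build_header`, `read_header`) and the addresses are made up for illustration, and the checksum is left at zero for simplicity.

```python
import socket
import struct

# IPv4 header without options (RFC 791), 20 bytes in network byte order:
# version/IHL, TOS, total length, identification, flags/fragment offset,
# TTL, protocol, header checksum, source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def build_header(ttl, src, dst, payload_len=40):
    """Pack an illustrative IPv4 header (checksum left as zero)."""
    version_ihl = (4 << 4) | 5  # IPv4, header length of five 32-bit words
    total_length = IPV4_HEADER.size + payload_len
    return IPV4_HEADER.pack(
        version_ihl, 0, total_length, 0, 0,
        ttl, socket.IPPROTO_UDP, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def read_header(raw):
    """Unpack the fixed part of an IPv4 header into a dictionary."""
    fields = IPV4_HEADER.unpack(raw[:IPV4_HEADER.size])
    return {
        "version": fields[0] >> 4,
        "total_length": fields[2],
        "ttl": fields[5],
        "protocol": fields[6],
        "src": socket.inet_ntoa(fields[8]),
        "dst": socket.inet_ntoa(fields[9]),
    }


# Hypothetical addresses, used only for this example.
header = build_header(ttl=1, src="192.168.1.10", dst="203.0.113.5")
print(read_header(header))
```

Because the protocol fixes the position of every field, the TTL is always the ninth byte of the header, which is why a single format string is enough to find it.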
A packet travels from its origin to its destination via a series of routers. Each router that receives the packet decrements the TTL counter. If the packet arrives with a TTL of one, the router decrements the value and notices it’s now zero. The packet is then discarded and not forwarded to the next hop of its journey because it has “timed out.”
The router sends an Internet Control Message Protocol (ICMP) Time Exceeded message back to the origin of the packet to let it know the packet timed out. The Time Exceeded message contains the original header and the first 64 bits of the original packet’s data. This is defined on page six of Request for Comments 792.
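The router behavior described above can be sketched as a small simulation. This is a simplified model rather than real router code: the `Packet` and `TimeExceeded` classes and the `route` function are hypothetical, and the TTL is kept as a separate attribute instead of being stored inside the header bytes. The ICMP type 11 and code 0 values come from RFC 792.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    header: bytes  # original IP header bytes (opaque in this model)
    ttl: int       # kept separately here for readability
    data: bytes


@dataclass
class TimeExceeded:
    icmp_type: int          # 11 = Time Exceeded (RFC 792)
    icmp_code: int          # 0 = time to live exceeded in transit
    router: str
    original_header: bytes
    original_data: bytes    # first 64 bits (8 bytes) of the original data


def route(packet, router_name):
    """Return (forwarded_packet, icmp_reply); exactly one of them is None."""
    ttl = packet.ttl - 1
    if ttl <= 0:
        # The packet "timed out": discard it and report back to the origin.
        reply = TimeExceeded(11, 0, router_name, packet.header, packet.data[:8])
        return None, reply
    return Packet(packet.header, ttl, packet.data), None


probe = Packet(header=b"\x45" + bytes(19), ttl=1, data=b"0123456789abcdef")
forwarded, reply = route(probe, "router-a")
print(forwarded, reply)
```

With a starting TTL of one, the first router returns a `TimeExceeded` reply and forwards nothing. With any larger value, it forwards a copy of the packet whose TTL is one lower.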
If traceroute sends a packet with the TTL value set to one, the packet will only get as far as the first router before it’s discarded. traceroute will receive an ICMP Time Exceeded message from that router, and it can record the time the round trip took.
It then repeats the exercise with the TTL set to two, which will fail after two hops.
traceroute increases the TTL to three and tries again. This process repeats until the destination is reached or the maximum number of hops (30, by default) is tested.
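The probing loop can be illustrated with a toy simulation that needs no network access. It is only a sketch of the idea: `simulate_traceroute` and the node names are invented for this example, every node is assumed to reply, and no timing is measured.

```python
def simulate_traceroute(path, max_hops=30):
    """Simulate TTL-limited probes along `path`.

    `path` is a non-empty list of node names: the routers in order,
    followed by the destination as the last element.
    Returns a list of (ttl, responding_node, reached_destination) tuples.
    """
    hops = []
    for ttl in range(1, max_hops + 1):
        # A probe sent with this TTL gets exactly `ttl` nodes into the path,
        # or stops at the destination if the path is shorter than that.
        node = path[min(ttl, len(path)) - 1]
        reached = ttl >= len(path)
        hops.append((ttl, node, reached))
        if reached:
            break  # the destination answered, so there is nothing left to probe
    return hops


# Hypothetical three-router path to a destination.
route_to_site = ["home-router", "isp-edge", "isp-core", "destination"]
for ttl, node, reached in simulate_traceroute(route_to_site):
    print(ttl, node, "(destination)" if reached else "")
```

Passing a small `max_hops` value shows the other exit condition: the loop stops after that many probes even though the destination was never reached.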
Some Routers Don’t Play Nicely
Some routers have bugs. They try to forward packets with a TTL of zero instead of discarding them and raising an ICMP time exceeded message.
According to Cisco, some Internet Service Providers (ISPs) rate-limit the number of ICMP messages their routers relay.
traceroute has a default timeout for replies of five seconds. If it doesn’t receive a response within those five seconds, the attempt is abandoned. This means responses from very slow routers are ignored.
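The timeout behavior can be sketched with an ordinary UDP socket on the loopback interface, which needs no special privileges. This is not how traceroute itself is implemented. The `wait_for_reply` helper is hypothetical, and the demonstration uses a much shorter timeout than five seconds so that it finishes quickly.

```python
import socket


def wait_for_reply(sock, timeout_seconds=5.0):
    """Return (data, sender) or None if nothing arrives before the timeout."""
    sock.settimeout(timeout_seconds)
    try:
        return sock.recvfrom(1024)
    except socket.timeout:
        # No answer in time: the attempt is abandoned.
        return None


# Demonstration on the loopback interface only.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port

# Nothing has been sent yet, so this wait times out.
print(wait_for_reply(sock, timeout_seconds=0.2))

# Send a datagram to ourselves; this time a "reply" is waiting.
sock.sendto(b"probe", sock.getsockname())
print(wait_for_reply(sock, timeout_seconds=1.0))
sock.close()
```

A reply that arrives after the timeout has expired is treated exactly like one that never arrives, which is why a slow but working router can look silent.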
As we covered above, traceroute's purpose is to elicit a response from the router at each hop from your computer to the destination. Some might be tight-lipped and give nothing away, while others will probably spill the beans with no qualms.
As an example, we’ll run a traceroute to the Blarney Castle website in Ireland, home of the famous Blarney Stone. Legend has it if you kiss the Blarney Stone you’ll be blessed with the “gift of the gab.” Let’s hope the routers we encounter along the way are suitably garrulous.
We type the following command:
The first line gives us the following info:
- The destination and its IP address.
- The number of hops traceroute will try before giving up.
- The size of the UDP packets we’re sending.
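If you want to pull those three pieces of information out of the first line in a script, a regular expression is enough. The sketch below assumes the header format printed by the common Linux traceroute ("traceroute to HOST (IP), N hops max, M byte packets"). The sample line uses a placeholder host and address, not the output from our run, and `parse_header` is a hypothetical helper.

```python
import re

# Assumed header format of the common Linux traceroute implementation.
HEADER = re.compile(
    r"traceroute to (\S+) \(([\d.]+)\), (\d+) hops max, (\d+) byte packets"
)


def parse_header(line):
    """Extract destination, IP address, hop limit, and packet size."""
    match = HEADER.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized header line: {line!r}")
    host, ip, max_hops, size = match.groups()
    return {
        "destination": host,
        "ip": ip,
        "max_hops": int(max_hops),
        "packet_bytes": int(size),
    }


# Placeholder values for illustration only.
sample = "traceroute to www.example.com (203.0.113.5), 30 hops max, 60 byte packets"
print(parse_header(sample))
```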
All of the other lines contain information about one of the hops. Before we dig into the details, though, we can see there are 11 hops between our computer and the Blarney Castle website. Hop 11 also tells us that we reached our destination.
The format of each hop line is as follows:
- The name of the device or, if the device doesn’t identify itself, the IP address.
- The IP address.
- The round-trip time for each of the three tests. If an asterisk is here, it means there wasn’t a response for that test. If the device doesn’t respond at all, you’ll see three asterisks, and no device name or IP address.
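The hop-line format described above is regular enough to parse in a few lines. The following is a rough sketch, assuming the usual Linux traceroute layout of a hop number, then `name (ip)` pairs, then times in milliseconds or asterisks. The sample lines are invented for illustration and are not taken from our run, and `parse_hop` is a hypothetical helper.

```python
import re

DEVICE = re.compile(r"(\S+) \(([\d.]+)\)")  # name followed by "(ip)"
PROBE = re.compile(r"([\d.]+) ms|\*")       # a time in ms, or "*" for no reply


def parse_hop(line):
    """Split one hop line into hop number, responding devices, and times."""
    hop_number, _, rest = line.strip().partition(" ")
    devices = DEVICE.findall(rest)
    # None marks a test that got no response (an asterisk in the output).
    times = [
        float(m.group(1)) if m.group(1) else None
        for m in PROBE.finditer(rest)
    ]
    return {"hop": int(hop_number), "devices": devices, "times_ms": times}


# Invented sample lines in the assumed format.
samples = [
    " 1  router.example (192.168.0.1)  0.407 ms  0.376 ms  0.357 ms",
    " 2  * * *",
    " 3  10.20.30.40 (10.20.30.40)  12.1 ms *  11.8 ms",
]
for sample in samples:
    print(parse_hop(sample))
```

A hop that lists several devices, one per test, simply produces more than one entry in `devices`.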
Let’s review what we’ve got below:
- Hop 1: The first port of call (no pun intended) is the DrayTek Vigor Router on the local network. This is how our UDP packets leave the local network and get on the internet.
- Hop 2: This device didn’t respond. Perhaps it was configured never to send ICMP packets. Or, perhaps it did respond but was too slow, so traceroute timed out.
- Hop 3: A device responded, but we didn’t get its name, only the IP address. Note there’s an asterisk in this line, which means we didn’t get a response to all three requests. This could indicate packet loss.
- Hops 4 and 5: More anonymous hops.
- Hop 6: There’s a lot of text here because a different remote device handled each of our three UDP requests. The (rather long) names and IP addresses for each device were printed. This can happen when you encounter a “richly populated” network on which there’s a lot of hardware to handle high volumes of traffic. This hop is within one of the largest ISPs in the U.K. So, it would be a minor miracle if the same piece of remote hardware handled our three connection requests.
- Hop 7: This is the hop our UDP packets made as they left the ISP’s network.
- Hop 8: Again, we get an IP address but not the device name. All three tests returned successfully.
- Hops 9 and 10: Two more anonymous hops.
- Hop 11: We’ve arrived at the Blarney Castle website. The castle is in Cork, Ireland, but, according to IP address geolocation, the website is in London.
So, it was a mixed bag. Some devices played ball, some responded but didn’t tell us their names, and others remained completely anonymous.