This is part 3 in a series of posts examining how broadband services actually work. Part 1 looked at ISP concentration ratios, and Part 2 examined the impact of averaging many subscribers. Future posts will consider how to size backhaul links and how to configure buffers.
Two major reasons people purchase broadband (or sign up for more capacity) are to improve interactivity and to get access to new applications, for example streaming movies. The appeal of new applications is clear, but it's worth looking at congestion and how it affects interactivity.
Congestion has two impacts. First, you get increased delays, as packets are buffered in queues waiting for free capacity on the next link. Usually this adds tens or hundreds of milliseconds to round trip times. Then, if the average traffic exceeds link capacity, you get packet loss. If you do have congestion, it's these packet losses that cause TCP traffic flows to back off, avoiding congestion collapse (and even lower throughput).
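To get a feel for the first effect, here is a minimal sketch (my illustration, not from the measurements below) using the textbook M/M/1 queueing model. Real broadband queues differ in detail, but the qualitative behavior holds: average queueing delay grows slowly at moderate load and explodes as the link approaches 100% utilization.

```python
# Illustrative only: mean queueing delay in an M/M/1 queue, the
# textbook single-server model with random (Poisson) arrivals.
# Real router queues behave differently in detail, but the shape
# is the same: delay blows up as utilization nears 100%.

def mm1_queue_delay_ms(link_mbps, avg_packet_bytes, utilization):
    """Average time a packet spends waiting in the queue, in ms."""
    service_time_s = (avg_packet_bytes * 8) / (link_mbps * 1e6)
    # M/M/1 mean waiting time: W_q = rho / (1 - rho) * service time
    return utilization / (1.0 - utilization) * service_time_s * 1e3

# Hypothetical 10 Mbit/s link carrying 1000-byte packets.
for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    print(f"{rho:.0%} utilized: {mm1_queue_delay_ms(10, 1000, rho):7.2f} ms average wait")
```

At 50% utilization the queue adds under a millisecond; at 99% it adds roughly 80 ms, and once arrivals exceed capacity the queue (and the delay) grows without bound until packets are dropped.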
Here is a graph of packet loss on a service that most of us would consider unacceptable. These measurements were taken by Tom Dunigan on an early cable modem service (February 2001) that was substantially upgraded about a year later.
The next graph (also from Tom Dunigan) shows how this congestion impacts interactivity. The graph shows the echo delay for typing into a remote program using telnet in character-at-a-time mode. One test (red line) was run in the early morning and one test (green line) was run during the evening, when packet loss was peaking. Each test measured the echo response time for 100 successive individual characters. Notice that the typical echo delay goes up from ~130 ms in the morning to roughly 200-300 ms in the evening, with occasional delays of more than one second.
A more common activity is viewing content on websites. Here, the dominant time in any interaction is the time spent waiting for web content to download to your browser. Since web content is downloaded using TCP (and more generally, TCP is the dominant protocol in use today), it's worth looking at the impact of packet loss on TCP throughput.
The TCP protocol includes a congestion avoidance algorithm which is triggered by packet loss. When a packet is lost, the TCP sender slows down. As a result, the data rate for a single TCP flow looks like this (thanks to Guido Appenzeller):
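The sawtooth shape in that graph comes from the additive-increase/multiplicative-decrease (AIMD) rule at the heart of standard TCP congestion avoidance: grow the window by one segment per round trip, halve it when a loss is detected. Here is a minimal sketch (a toy model, not a real TCP implementation, with an assumed 1% loss rate):

```python
# Toy model of TCP congestion avoidance (AIMD); illustration only.
# The congestion window grows by one segment per round trip and is
# halved whenever a loss occurs, producing the familiar sawtooth.

import random

random.seed(1)
loss_probability = 0.01   # assumed per-segment loss rate for the demo
cwnd = 1.0                # congestion window, in segments

for rtt in range(60):
    # Was any segment in this round trip's window lost?
    lost = any(random.random() < loss_probability for _ in range(int(cwnd)))
    if lost:
        cwnd = max(1.0, cwnd / 2)   # multiplicative decrease
    else:
        cwnd += 1.0                 # additive increase
    print(f"RTT {rtt:2d}: cwnd = {cwnd:5.1f} segments")
```

Since throughput is roughly cwnd / RTT, more frequent losses mean the window is halved more often and never climbs very high, which is exactly the effect quantified below.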
What happens in real networks is complex, but RFC 3155 gives an approximate formula for how a single flow is affected by packet loss in a real network. Even better, Bill Gibson at Niwot Networks has produced a nice graphic based on that formula for his paper on TCP limitations on file transfer performance. It looks like this:
This is dramatic! The different lines reflect different end-to-end round trip times (RTT) - times that will vary depending on the site you are connecting with. Also, they represent the maximum throughput you could achieve with long duration TCP flows and no other bottlenecks. What's notable is the logarithmic throughput scale on the left and the fact that, at 1% packet loss (0.01 on the bottom scale), potential throughput drops by a factor of 100!
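To put rough numbers on the curve: the approximation behind it is usually written as the Mathis et al. formula, throughput ≤ (MSS/RTT) × (C/√p), where MSS is the segment size, p the packet loss rate, and C a constant around 1.22. A quick sketch (my numbers, assuming a 1460-byte MSS and a 100 ms RTT):

```python
# Rough TCP throughput ceiling from the Mathis et al. approximation:
#   throughput <= (MSS / RTT) * (C / sqrt(p))
# Illustration only; real flows also hit window limits, timeouts,
# and short-transfer effects that this formula ignores.

from math import sqrt

MSS_BYTES = 1460   # typical Ethernet-sized segment (assumption)
C = 1.22           # constant for typical acking behavior
RTT_MS = 100       # assumed round trip time

def max_throughput_mbps(loss_rate):
    rtt_s = RTT_MS / 1e3
    return (MSS_BYTES * 8 / rtt_s) * (C / sqrt(loss_rate)) / 1e6

for p in (1e-6, 1e-4, 1e-2):
    print(f"loss rate {p:.0e}: at most {max_throughput_mbps(p):7.2f} Mbit/s")
```

Because throughput scales with 1/√p, raising the loss rate from 0.0001% to 1% (a factor of 10,000) cuts the achievable rate by a factor of 100, which is the drop visible in the graph.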
Again, there are many caveats. A real network has a mix of short- and long-lived flows. The local operating system may not be optimized to take full advantage of broadband speeds (although MS Windows actually got better at this with Vista). Nonetheless, even 1% or 2% packet loss is correlated with a poor user experience.
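The operating-system caveat is worth a number. Classic TCP without window scaling allows at most 64 KB of unacknowledged data in flight, and throughput can never exceed window / RTT, no matter how fast the link is. (Receive window auto-tuning, enabled by default starting with Vista, is the improvement alluded to above.) A quick check:

```python
# Throughput ceiling imposed by the TCP receive window:
#   max throughput = window size / round trip time
# Classic TCP without window scaling is limited to 64 KB in flight.

WINDOW_BYTES = 64 * 1024

for rtt_ms in (20, 50, 100):
    mbps = WINDOW_BYTES * 8 / (rtt_ms / 1e3) / 1e6
    print(f"RTT {rtt_ms:3d} ms: window-limited to {mbps:5.1f} Mbit/s")
```

So on a 100 ms path, an untuned stack tops out around 5 Mbit/s even with zero packet loss.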
In short, to actually obtain the instantaneous throughput you thought you were purchasing, you don't want packet loss in upstream portions of the network.
I'll discuss what this actually implies for routers and backhaul links in a subsequent post.