How Google Reinvented TCP for Faster Video Streaming

Discover how Google replaced a 30-year-old internet rule with a congestion control algorithm called BBR, fixing video buffering and changing streaming forever.

The Story: My Starting Point

The other day, I was completely absorbed in a live stream, watching a gaming tournament unfold. Suddenly, the crystal-clear 1080p feed I was enjoying turned into a blurry, stuttering mess. My internet connection felt fine, and I knew I had plenty of bandwidth. "What gives?" I wondered. Why does this happen so often with livestreams? My curiosity was piqued, and I felt compelled to dig into how these things actually work.

The Core Problem

For decades, the internet has relied on a simple rule for managing traffic, embedded in a protocol called TCP (Transmission Control Protocol). The core problem, as I discovered, is its fundamental assumption: packet loss always means the network is congested. When a server sends you data and a piece (a "packet") gets lost along the way, TCP assumes the pipes are full and immediately slows everything down to avoid making it worse.

This worked fine in the '80s when networks were simple and slow. But today? It's a disaster for streaming. Our modern networks have huge memory buffers in their routers, which can lead to a phenomenon called bufferbloat. Your data gets stuck in a massive queue, and even though there's no real congestion, the latency skyrockets. On top of that, a flaky mobile connection can drop packets for dozens of reasons—none of which are actual congestion. Yet, traditional TCP sees that loss, panics, and throttles your video stream from crystal-clear HD to a pixelated mess. This old rule is the culprit behind a lot of frustrating livestreaming experiences.

Exploring the Solutions

Google's engineers recognized this outdated assumption was killing YouTube's performance. So, they decided to challenge it and created a revolutionary new congestion control algorithm called BBR (Bottleneck Bandwidth and Round-trip propagation time). Instead of reacting blindly to packet loss, BBR proactively models the network to find the ideal sending rate for real-time video.

Step 1: Measure, Don't Guess

BBR operates on a beautifully simple principle: it constantly measures two key things to understand the network's true capacity, rather than just guessing based on packet loss.

  • BtlBw (Bottleneck Bandwidth): This is the maximum speed the connection can actually handle right now. Think of it as the true width of the pipe.
  • RTprop (Round-trip Propagation Time): This measures the absolute minimum time it takes for a packet to travel from the server to you and back. It's the inherent delay of the journey itself.

By knowing these two values, BBR can precisely calculate the "Goldilocks" amount of data to send—just enough to keep the pipe full without overfilling it and creating a queue (bufferbloat). In networking terms, this is the bandwidth-delay product (BDP): the amount of data the pipe itself can hold.

// A simplified look at BBR's core idea: the bandwidth-delay product.
// BBR knows the max speed (BtlBw) and the travel time (RTprop), and from
// them calculates exactly how much data should be "in flight" to fill
// the pipe. Anything more would just sit in a queue (bufferbloat) and
// increase latency.
function calculateOptimalDataInFlight(btlBwBytesPerSec, rtPropSeconds) {
  // bytes/second * seconds = bytes that fit in the pipe
  return btlBwBytesPerSec * rtPropSeconds;
}
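
For example, plugging in a hypothetical 100 Mbps link with a 40 ms round trip shows the pipe holds about half a megabyte:

// Hypothetical numbers, just for illustration:
// 100 Mbps = 12,500,000 bytes/sec; 40 ms = 0.04 s
const bdp = calculateOptimalDataInFlight(12500000, 0.04);
console.log(bdp); // 500000 bytes, i.e. ~500 KB of in-flight data fills this pipe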

This model-based approach means BBR doesn't have to wait for packet loss to happen. It knows the network's capacity and adapts in real-time, preventing those annoying quality drops before they even occur.
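
Under the hood, those two numbers come from continuous measurement. Here is a rough sketch, in the spirit of the windowed filters the BBR paper describes, of how a sender could track them: BtlBw as a running maximum of recent delivery-rate samples, and RTprop as a running minimum of recent round-trip samples. The class and method names here are my own invention, not from any real implementation.

// Hypothetical sketch of BBR-style windowed filters (names are mine).
// Queues can only make delivery-rate samples lower and RTT samples
// higher, so the windowed max and min are the best estimates of the
// pipe's true width and its true propagation delay.
class PathModel {
  constructor(windowMs = 10000) { // ~10 s window, as in the paper's RTprop filter
    this.windowMs = windowMs;
    this.samples = []; // { rateBps, rttMs, atMs }
  }
  onAck(rateBps, rttMs, atMs) {
    this.samples.push({ rateBps, rttMs, atMs });
    // Drop samples that have aged out of the window.
    this.samples = this.samples.filter(s => atMs - s.atMs < this.windowMs);
  }
  btlBw()  { return Math.max(...this.samples.map(s => s.rateBps)); } // max-filter
  rtProp() { return Math.min(...this.samples.map(s => s.rttMs)); }   // min-filter
}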

Step 2: Probing for More Speed

BBR doesn't just find a good rate and stick with it; it's always gently probing to see if it can go faster. It cycles through several states (Startup, Drain, ProbeBW, and ProbeRTT), and it spends most of its life in ProbeBW, where it systematically sends a little more data to see if the network's capacity has increased.
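
Concretely, BBRv1 implements this probing as a repeating cycle of pacing gains described in the BBR paper: one phase at 1.25x the estimated bandwidth to test for new capacity, one at 0.75x to drain whatever queue the probe created, then six phases at 1.0x to cruise. A minimal sketch of that cycle (the function itself is my own illustration):

// BBRv1's ProbeBW gain cycle: probe up, drain, then cruise.
const PROBE_BW_GAINS = [1.25, 0.75, 1, 1, 1, 1, 1, 1];

// Each phase lasts roughly one RTprop; after the last phase the cycle repeats.
function probeBwPacingRate(btlBwBps, phaseIndex) {
  return PROBE_BW_GAINS[phaseIndex % PROBE_BW_GAINS.length] * btlBwBps;
}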

This is fundamentally different from older algorithms like CUBIC, which aggressively fill the buffer until packets start dropping. BBR instead uses a technique called pacing to send data out in a smooth, consistent stream rather than clumpy bursts. The result? When Google deployed BBR, it reported throughput gains of up to 25x on some routes, and YouTube's round-trip latency dropped by 53% globally!
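
BBR has shipped with the mainline Linux kernel since version 4.9, so if you run your own servers you can experiment with it yourself. A typical setup (assuming a reasonably recent kernel) pairs BBR with the fq queueing discipline, which provides the packet pacing BBR relies on:

# See which congestion control algorithms your kernel offers:
sysctl net.ipv4.tcp_available_congestion_control

# Enable BBR with fq pacing (e.g. in /etc/sysctl.conf, then `sysctl -p`):
net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control=bbr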

Downsides and Evolution of BBR

While BBR offers significant performance improvements for streaming, it's not a silver bullet. I learned that early versions (BBRv1) had some drawbacks:

  • Higher Retransmissions: BBR can sometimes cause more packet retransmissions than traditional algorithms, especially in networks with shallow buffers. This happens because it maintains more data in the network to keep the pipe full.
  • Fairness Issues: In certain shared network environments, BBRv1 could be unfair to other TCP flows using older, loss-based congestion control algorithms, sometimes grabbing a disproportionate share of bandwidth.
  • CPU Usage: BBR's continuous measurement and probing phases can be more CPU-intensive than simpler algorithms, which is a consideration for high-speed server environments.

Google addressed many of these issues with BBRv2. This updated version improves fairness with other TCP flows, supports ECN (Explicit Congestion Notification) signals to react better to network congestion, and enhances stability on challenging networks like Wi-Fi.

Where BBR Shines: Use Cases

BBR truly shines in scenarios where traditional congestion control struggles, which often includes exactly what we care about for livestreams:

  • Video Streaming and Real-Time Media: It dramatically reduces latency and bufferbloat, leading to better video quality and fewer buffering events.
  • Mobile and Wireless Networks: BBR can better handle the variable packet loss and fluctuating conditions common in cellular and Wi-Fi networks.
  • Long-Haul and Satellite Links: In these scenarios, packet loss is often not due to congestion, so BBR's model-based approach avoids unnecessary throttling.

Final Thoughts: The Takeaway

The journey of BBR is a powerful lesson: always challenge the underlying assumptions of your technology stack. For 30 years, "packet loss equals congestion" was an unquestioned truth in networking. By creating a system based on live measurement rather than a fixed rule, Google's engineers solved a problem that no amount of front-end optimization could ever fix.

It reminds me that the principles of performance—whether it's in networking, databases, or even how our UI renders—are often universal. Building systems that measure and adapt to real-world conditions will almost always outperform systems that operate on outdated heuristics. This fundamental shift is now baked into next-generation protocols like QUIC, which will power the future of the web, making our livestreams smoother and more reliable.
