Camel: Frame-Level Bandwidth Estimation for Low-Latency Live Streaming under Video Bitrate Undershooting
Low-latency live streaming (LLS) has emerged as a popular web application, with many platforms adopting real-time protocols such as WebRTC to minimize end-to-end latency. However, we observe a counter-intuitive phenomenon: even when the actual encoded bitrate does not fully utilize the available bandwidth, stalling events remain frequent. This insufficient bandwidth utilization arises from the intrinsic temporal variations of real-time video encoding, which cause conventional packet-level congestion control algorithms to misestimate available bandwidth. When a high-bitrate frame is suddenly produced, sending at the wrong rate can either trigger packet loss or increase queueing delay, resulting in playback stalls. To address these issues, we present Camel, a novel frame-level congestion control algorithm (CCA) tailored for LLS. Our insight is to use frame-level network feedback to capture the true network capacity, immune to the irregular sending pattern caused by encoding. Camel comprises three key modules: the Bandwidth and Delay Estimator and the Congestion Detector, which jointly determine the average sending rate, and the Bursting Length Controller, which governs the emission pattern to prevent packet loss. We evaluate Camel on both large-scale real-world deployments and controlled simulations. In the real-world platform with 250M users and 2B sessions across 150+ countries, Camel achieves up to a 70.8% increase in 1080P resolution ratio, a 14.4% increase in media bitrate, and up to a 14.1% reduction in stalling ratio. In simulations under undershooting, shallow buffers, and network jitter, Camel outperforms existing congestion control algorithms, with up to 19.8% higher bitrate, 93.0% lower stalling ratio, and 23.9% improvement in bandwidth estimation accuracy.
💡 Research Summary
Low‑latency live streaming (LLS) has become a cornerstone of modern interactive media, with platforms adopting real‑time transport protocols such as WebRTC and SRT to keep end‑to‑end latency within a few seconds. Despite this progress, large‑scale operational data reveal a paradox: broadcasters often transmit at less than 80 % of their estimated uplink bandwidth, yet stall events remain frequent, especially when the encoded video bitrate is even lower than the estimated capacity. The authors term this phenomenon “bitrate undershooting” and demonstrate that it correlates with higher stall ratios and lower resolution selection by adaptive bitrate (ABR) algorithms.
The root cause is traced to the intrinsic bursty nature of real‑time video encoders. Unlike bulk data transfers that maintain a continuously backlogged packet queue, RTC encoders generate frames of highly variable size (I‑frames can be ten times larger than P/B‑frames). Consequently, the sender emits short packet bursts followed by idle gaps, often draining the sending buffer. This irregular transmission pattern contaminates traditional congestion‑control signals—packet loss rates and inter‑arrival intervals become functions of encoder dynamics rather than pure network conditions, leading to systematic under‑estimation of available bandwidth. When a sudden high‑bitrate frame arrives, an incorrectly estimated sending rate either causes packet loss (if too aggressive) or queue buildup (if too conservative), both of which trigger playback stalls.
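The mismatch between a stream's average bitrate and the instantaneous rate of an I-frame burst can be illustrated with a small back-of-the-envelope sketch. The frame sizes, GOP length, and frame rate below are illustrative assumptions, not figures from the paper:

```python
# Hypothetical sketch: why frame-level bursts mislead packet-level estimators.
# Assumes a 30 fps stream where I-frames are ~10x the size of P-frames,
# with one I-frame per 60-frame GOP (all numbers are assumptions).
FPS = 30
P_FRAME_BYTES = 5_000
I_FRAME_BYTES = 10 * P_FRAME_BYTES
GOP = 60

frames = [I_FRAME_BYTES if i % GOP == 0 else P_FRAME_BYTES for i in range(300)]

avg_bitrate = sum(frames) * 8 * FPS / len(frames)  # long-term average (bps)
peak_bitrate = max(frames) * 8 * FPS               # instantaneous I-frame rate (bps)

print(f"average bitrate: {avg_bitrate / 1e6:.2f} Mbps")   # 1.38 Mbps
print(f"peak (I-frame) bitrate: {peak_bitrate / 1e6:.2f} Mbps")  # 12.00 Mbps
```

Under these assumptions the instantaneous I-frame rate is almost 9x the average: a pacer sized to the average rate queues the I-frame (adding delay), while one sized to the peak risks overflowing shallow buffers (causing loss), which is exactly the dilemma the paragraph above describes.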
To address these challenges, the paper introduces Camel, a frame‑level congestion control algorithm (CCA) specifically designed for LLS upstream video. Camel’s central insight is to treat each video frame as a natural “packet train” and to collect network feedback at the frame granularity, thereby isolating the network’s true capacity from encoder‑induced variability. Camel comprises three tightly coupled modules:
- Bandwidth and Delay Estimator – After a frame finishes transmission, Camel measures the total bytes sent, the elapsed transmission time, and the minimum observed round-trip time (RTT). It combines the instantaneous frame throughput (bytes sent divided by transmission time) with a propagation-delay-based estimate of the bottleneck bandwidth to produce a robust bandwidth estimate (B_est).
- Congestion Detector – Traditional delay-gradient or minimum-RTT methods fail at the frame level because the delays of adjacent frames interfere with one another. Camel instead computes the derivative of observed delay with respect to the number of in-flight packets (∂D/∂inflight). A positive derivative indicates growing queue occupancy (congestion), while a negative derivative signals queue draining, allowing the controller to react promptly without relying on long RTT windows.
- Bursting Length Controller – The length of each frame-level burst must be carefully balanced: too long a burst can overflow shallow network buffers and cause loss, while too short a burst yields too few samples for stable estimation. Camel dynamically computes a target burst length L* = β·(buffer_capacity / B_est) and throttles the sending rate when the current burst exceeds L*, ensuring the sender never overwhelms the network while still gathering enough data for accurate estimation.
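The three modules above can be condensed into a minimal sketch. All class and function names are illustrative, and the smoothing factor α and burst coefficient β are assumed defaults (the paper tunes these hyper-parameters), so this is a sketch of the described logic rather than the production implementation:

```python
# Minimal sketch of Camel's three modules as summarized above.
# Names, units, and the alpha/beta constants are assumptions.

class FrameFeedback:
    """Per-frame feedback collected after a frame finishes transmission."""
    def __init__(self, bytes_sent, send_duration, min_rtt, delay, inflight):
        self.bytes_sent = bytes_sent        # total bytes of the frame
        self.send_duration = send_duration  # seconds spent transmitting the frame
        self.min_rtt = min_rtt              # minimum observed RTT (s)
        self.delay = delay                  # observed queueing delay (s)
        self.inflight = inflight            # packets in flight at completion

def estimate_bandwidth(fb, prev_est=None, alpha=0.9):
    """Bandwidth and Delay Estimator: blend instantaneous frame throughput
    (bytes / transmission time) with an EWMA of past estimates."""
    inst = fb.bytes_sent / fb.send_duration  # bytes per second for this frame
    if prev_est is None:
        return inst
    return alpha * prev_est + (1 - alpha) * inst

def congestion_sign(prev_fb, fb):
    """Congestion Detector: sign of the delay/inflight gradient between
    adjacent frames. +1 -> queue building; -1 -> queue draining; 0 -> flat."""
    d_inflight = fb.inflight - prev_fb.inflight
    if d_inflight == 0:
        return 0
    gradient = (fb.delay - prev_fb.delay) / d_inflight
    return 1 if gradient > 0 else (-1 if gradient < 0 else 0)

def target_burst_seconds(buffer_capacity_bytes, b_est, beta=0.5):
    """Bursting Length Controller: L* = beta * (buffer_capacity / B_est),
    i.e. the burst duration the bottleneck buffer can absorb."""
    return beta * buffer_capacity_bytes / b_est

# Usage: a 50 KB I-frame sent in 10 ms yields a 5 MB/s estimate; a later
# frame with higher delay at higher inflight signals queue buildup.
fb1 = FrameFeedback(50_000, 0.010, 0.020, 0.015, 40)
fb2 = FrameFeedback(5_000, 0.002, 0.020, 0.020, 50)
b_est = estimate_bandwidth(fb1)            # 5_000_000 bytes/s
sign = congestion_sign(fb1, fb2)           # 1 (delay rose as inflight rose)
burst = target_burst_seconds(25_000, b_est)  # 0.0025 s burst budget
```

The gradient test mirrors the intuition in the list above: it needs only two adjacent frames rather than a long RTT window, which is what lets the controller react within a frame interval.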
The authors implemented Camel in a production environment serving 250 million users and 2 billion sessions across more than 150 countries. In a four‑week A/B test, Camel increased the proportion of 1080p streams by up to 70.8 %, raised average media bitrate by 14.4 %, and reduced stall ratio by up to 14.1 % compared with the incumbent congestion controller.
Complementary controlled simulations examined three stress dimensions: varying degrees of undershooting (ETR 0.4–0.8), shallow receiver buffers (≤ 50 ms), and network jitter (σ ≈ 30 ms). Camel was benchmarked against state‑of‑the‑art CCAs including BBR, GCC, Copa, SQP, and Pudica. Results consistently showed Camel delivering up to 19.8 % higher video bitrate, a 93 % reduction in stall ratio, and a 23.9 % improvement in bandwidth‑estimation accuracy.
The paper also discusses limitations. Extremely small frames may provide insufficient samples for the estimator, and an overly conservative burst‑length policy could underutilize available capacity. Future work is suggested on multi‑stream interference, codec‑specific frame‑size distributions, and adaptive tuning of the hyper‑parameters (α, β) based on live network telemetry.
In summary, Camel demonstrates that leveraging frame‑level network feedback—rather than traditional packet‑level metrics—can fundamentally overcome the bandwidth‑undershooting problem inherent to real‑time video encoding. By jointly estimating bandwidth and delay, detecting congestion via inflight‑delay gradients, and controlling burst length, Camel achieves substantial quality‑of‑experience gains at massive scale, offering a practical pathway for next‑generation low‑latency streaming services.