Buffer sizing and Video QoE Measurements at Netflix
Bruce Spang, Brady Walsh, Te-Yuan Huang, Tom Rusnock, Joe Lawrence, Nick McKeown
February 10, 2020

What are we talking about?
(Diagram: Server 1, Server 2, … send traffic through a buffer to the ISP.)

How big should a buffer be?
Too big: packets wait for too long
Too small: too many packets are thrown away

“A buffer should be at least one BDP” [Villamizar, Song 1994]
BDP = Bandwidth × Delay: the number of packets in flight on a link at full utilization
(Plot: congestion window over time, with the levels BDP + B and ½(BDP + B) marked.)
Loss happens when the link and buffer are full, at a window of BDP + B
TCP stops sending until ½(BDP + B) packets are received
The buffer needs to hold this many packets
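The argument on this slide can be checked with a few lines of arithmetic. The sketch below assumes an idealized single TCP flow that halves its window on loss, with `bdp` and `buffer` measured in packets; the function names are illustrative and do not come from the talk.

```python
def window_after_loss(bdp, buffer):
    """Congestion window right after a loss: half of the peak window BDP + B."""
    return (bdp + buffer) / 2


def link_stays_full(bdp, buffer):
    """The link stays fully utilized only if the halved window still covers one BDP."""
    return window_after_loss(bdp, buffer) >= bdp


if __name__ == "__main__":
    for b in (50, 100, 200):
        print(f"BDP=100, B={b}: link stays full = {link_stays_full(100, b)}")
```

Under these assumptions the condition ½(BDP + B) ≥ BDP simplifies to B ≥ BDP, which is the rule quoted above.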

How big should a buffer be?
BDP: Villamizar and Song 1994
BDP/√n: Appenzeller, McKeown, Keslassy 2004
O(n): Dhamdhere, Jiang, Dovrolis 2005
O(1): Enachescu, Ganjali, Goel, McKeown, Roughgarden 2006
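To get a feel for how far apart the first two rules are, here is a minimal sketch that evaluates BDP and BDP/√n for a given link. The link speed, round-trip time, and flow count in the usage note are made-up illustration values, and the function names are hypothetical. The O(n) and O(1) results are asymptotic statements without constants on this slide, so they are not computed.

```python
import math


def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bandwidth-delay product in bytes (Villamizar and Song rule)."""
    return bandwidth_bps * rtt_seconds / 8


def small_buffer_bytes(bandwidth_bps, rtt_seconds, n_flows):
    """BDP / sqrt(n) rule (Appenzeller, McKeown, Keslassy) for n long-lived flows."""
    if n_flows < 1:
        raise ValueError("n_flows must be at least 1")
    return bdp_bytes(bandwidth_bps, rtt_seconds) / math.sqrt(n_flows)


if __name__ == "__main__":
    # Hypothetical link: 100 Gbps, 100 ms RTT, 10,000 flows.
    print(bdp_bytes(100e9, 0.1))
    print(small_buffer_bytes(100e9, 0.1, 10_000))
```

For this hypothetical link the two rules differ by a factor of 100: roughly 1.25 GB versus 12.5 MB.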

Which is correct?

It’s complicated

1. TCP New Reno (mostly) behaves as expected
2. Video performance varies
3. Real routers complicate this story

Our Experiment

Catalog servers: use spinning disks to cheaply store the entire catalog

Offload servers: use SSDs to serve the top ~30% of content faster

These three racks are called a stack

Make this one large buffer small… …and this

Large buffer has higher latency during congested hour

Sometimes the large buffer has much higher latency

Large buffer has lower loss during congested hour

Good buffer size:
+ Fewer rebuffers
+ Better video quality
+ Videos start faster
Bad buffer size:
- More rebuffers
- Worse video quality
- Videos start slower
This happens when the buffer is too large or too small.

Site #2: A smaller buffer is better
Reducing the buffer from 500MB to 25MB:
Sessions with a rebuffer: 15.6% decrease
Low quality video: 5.3% decrease
Play delay: 13.5% decrease

Site #3: A smaller buffer is better
Reducing the buffer from 500MB to 50MB:
Sessions with a rebuffer: 22.1% decrease
Low quality video: 7.0% decrease
Play delay: 14.8% decrease

Site #1: A smaller buffer is worse
Reducing the buffer from 500MB to 50MB:
Sessions with a rebuffer: 46.3% increase
Low quality video: 5.7% increase
Play delay: 5.9% decrease

Large buffer has higher latency during congested hour

Remember how the large buffer has much higher latency…

Servers have very different latency distributions
(Plot axis: Min RTT (ms).)

What I imagined: LIES!
(Diagram: Server 1, Server 2, … send traffic through a single buffer to the ISP.)

Line cards #1 through #4

Virtual output queues (VOQs) #1 through #8

Buffer architecture
(Diagram: Server #1 and Server #2 feed the “Offload” VOQ, which gets 2/3 of the 100 Gbps ISP link; Server #3 feeds the “Catalog” VOQ, which gets 1/3.)

Traffic is fairly split when load is equal
(Diagram: the “Offload” VOQ receives 40 Gbps + 40 Gbps and is served at 67 Gbps; the “Catalog” VOQ receives 40 Gbps and is served at 33 Gbps; the ISP link is 100 Gbps.)

When one VOQ offers less than its fair share, it sees no congestion
(Diagram: the “Offload” VOQ receives 50 Gbps + 50 Gbps and is served at 90 Gbps; the “Catalog” VOQ receives 10 Gbps and is served at 10 Gbps with no delay; the ISP link is 100 Gbps.)
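The numbers on the last two slides are consistent with a work-conserving weighted fair scheduler that gives the “Offload” VOQ a weight of 2/3 and the “Catalog” VOQ a weight of 1/3. The sketch below is one way to model that behavior (weighted max-min fairness); the slides do not name the router's actual algorithm, so the function and its parameters are assumptions for illustration only.

```python
def weighted_fair_share(capacity, demands, weights):
    """Weighted max-min allocation of `capacity` among queues.

    demands and weights are dicts keyed by queue name. A queue that asks for
    less than its weighted share gets its full demand; the leftover capacity
    is redistributed among the remaining queues in proportion to their weights.
    """
    alloc = {name: 0.0 for name in demands}
    active = {name for name in demands if demands[name] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_weight = sum(weights[name] for name in active)
        share = {name: remaining * weights[name] / total_weight for name in active}
        satisfied = {name for name in active if demands[name] <= share[name]}
        if not satisfied:
            # Every remaining queue is backlogged: split what is left by weight.
            for name in active:
                alloc[name] = share[name]
            break
        for name in satisfied:
            alloc[name] = float(demands[name])
            remaining -= demands[name]
        active -= satisfied
    return alloc


if __name__ == "__main__":
    weights = {"offload": 2 / 3, "catalog": 1 / 3}
    # Equal per-server load: 40 + 40 Gbps offload, 40 Gbps catalog.
    print(weighted_fair_share(100, {"offload": 80, "catalog": 40}, weights))
    # Catalog offers less than its share: 50 + 50 Gbps offload, 10 Gbps catalog.
    print(weighted_fair_share(100, {"offload": 100, "catalog": 10}, weights))
```

In the first case both queues are backlogged and the link splits roughly 67/33. In the second, the catalog queue's 10 Gbps fits within its share, so it is served in full and the offload queue absorbs the remaining 90 Gbps, matching the "no delay" observation for the lightly loaded VOQ.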

VOQs explain the RTT differences
(Plot of Min RTT (ms), annotated: this VOQ is served faster; this VOQ is served slower; this VOQ is all over the place.)

Switches prioritize long-tail content
Same latency during uncongested hours
Popular content is congested; long-tail content is not congested

New scheduling algorithm!
(Diagram: Server #1 and Server #2 feed the “Offload” VOQ and Server #3 feeds the “Catalog” VOQ, as before, but each VOQ's share of the 100 Gbps ISP link is now load-dependent.)

New scheduling algorithm is more consistent
(Plot labels: new scheduling algorithm; default scheduling algorithm.)

How big should a buffer be?

Thanks! For more details, please see: https://brucespang.com/papers/netflix-buffer-sizing.pdf
