ShadowStream: Performance Evaluation as a Capability in Production Internet Live Streaming Networks (Yale LANS, Aug. 16, 2012)


  1. ShadowStream: Performance Evaluation as a Capability in Production Internet Live Streaming Networks
     Chen Tian, Richard Alimi, Yang Richard Yang, David Zhang
     Aug. 16, 2012
     Yale LANS

  2. Live Streaming is a Major Internet App

  3. Poor Performance After Updates
     Lacking sufficient evaluation before release

  4. Don’t We Already Have …
     Testbeds
     • Emulab
     • PlanetLab
     • …
     Testing Channels
     • Gradually rolling out
     They are not enough!

  5. Live Streaming Background
     We focus on hybrid live streaming systems: CDN + P2P

  7. Testbed: Misleading Results at Small Scale
     Piece Missing Ratio      Small-Scale    Large-Scale
     Production Default       3.5%           0.7%
     With Connection Limit    3.7%           64.8%
     Live streaming performance can be highly non-linear.

  8. Testbed: Misleading Results due to Missing Features
                                     LAN Style (Same BW)    ADSL Style (Same BW)
     Piece Missing Ratio             1.5%                   7.3%
     # Timed-out Requests            2548.25                1404.25
     # Received Duplicate Packets    0                      633
     # Received Outdated Packets     5.65                   154.20
     Realistic features can have large performance impacts.

  9. Testing Channel: Lacking QoE Protection

  10. Testing Channel: Lacking Orchestration
      [Figure: two panels, "What we have is …" and "What we want is …", each plotting Number of Peers (0 to 6000) against Time (Seconds) (0 to 15000); legend entries: Expected, Provided]

  11. ShadowStream Design Goal
      Use production network for testing with
      • Protection of real user QoE
      • Transparent orchestration of testing conditions

  12. Roadmap
      • Motivation
      • Protection Design
      • Orchestration Design
      • Evaluations
      • Conclusions and Future Work

  13. Protection: Basic Scheme
      Note: R denotes Repair, E denotes Experiment
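
A minimal Python sketch of the Experiment (E) and Repair (R) roles named on this slide. It assumes, based on the "virtual playpoint" and "real playpoint" labels that appear later in the deck, that E is judged at a virtual playpoint preceding the real one, and that R fetches whatever E missed before the viewer's real playpoint. The class `ProtectedBuffer`, its methods, and the `repair_fetch` callback are hypothetical names for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Set, Tuple


@dataclass
class ProtectedBuffer:
    """Tracks pieces per role; all names here are illustrative assumptions."""
    experiment_pieces: Set[int] = field(default_factory=set)  # fetched by E
    repair_pieces: Set[int] = field(default_factory=set)      # fetched by R
    experiment_misses: List[int] = field(default_factory=list)

    def on_virtual_deadline(self, piece: int,
                            repair_fetch: Callable[[int], bool]) -> None:
        """At the (assumed earlier) virtual playpoint, judge E and trigger R."""
        if piece in self.experiment_pieces:
            return  # E succeeded, nothing to repair
        self.experiment_misses.append(piece)  # counted against E
        if repair_fetch(piece):               # R tries to fill the gap
            self.repair_pieces.add(piece)

    def playable_at_real_deadline(self, piece: int) -> bool:
        """The viewer sees a miss only if both E and R failed."""
        return piece in self.experiment_pieces or piece in self.repair_pieces


def missing_ratios(buf: ProtectedBuffer, pieces: Sequence[int],
                   repair_fetch: Callable[[int], bool]) -> Tuple[float, float]:
    """Return (virtual-playpoint ratio, real-playpoint ratio) over pieces."""
    for p in pieces:
        buf.on_virtual_deadline(p, repair_fetch)
    real_misses = [p for p in pieces if not buf.playable_at_real_deadline(p)]
    return (len(buf.experiment_misses) / len(pieces),
            len(real_misses) / len(pieces))
```

Under these assumptions, the virtual-playpoint ratio measures E alone, while the real-playpoint ratio is what viewers would experience after repair.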

  14. Example Illustration: E Success

  18. Example Illustration: E Fail

  22. How to Repair?
      • Choice 1: dedicated CDN resources (R=rCDN)
        – Benefit: simple
        – Limitations
          • requires resource reservation, e.g., 100,000 clients × 1 Mbps = 100 Gbps
          • may not work well when there is a network bottleneck
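
The reservation figure on this slide can be checked with a short calculation. The function name `reserved_capacity_gbps` is hypothetical, and the sketch assumes the worst case in which every client must be served by the dedicated CDN at the full stream rate.

```python
def reserved_capacity_gbps(num_clients: int, stream_rate_mbps: float) -> float:
    """Worst-case repair bandwidth if every client is served at the full rate."""
    return num_clients * stream_rate_mbps / 1000.0  # 1 Gbps = 1000 Mbps


print(reserved_capacity_gbps(100_000, 1.0))  # prints 100.0
```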

  23. How to Repair?
      • Choice 2: production machine (R=production)
        – Benefit 1: Larger resource pool
        – Benefit 2: Fine-tuned algorithms
        – Benefit 3: A unified approach to protection & orchestration (later)

  24. R=production: Resource Competition
      Repair and Experiment compete for client upload bandwidth
      Competition leads to underestimation of Experiment performance

  25. R=production: Misleading Result
      [Figure with axis labels "repair demand" and "missing ratio"; curves y = m(θ), x + y = θ_0, and x = θ; axis marks θ_L, θ*, θ_R, θ_0; annotations "accurate result" and "misleading result"]

  30. Implementing PCE
      Requirements
      • Testing state is transparent to each streaming machine
      • Streaming machines are isolated from each other

  34. Client Components

  35. Roadmap
      • Motivation
      • Protection Design
      • Orchestration Design
      • Evaluations
      • Conclusions and Future Work

  36. Orchestration Challenges
      [Diagram labels: Orchestrator; client; P, C, E; Streaming Hypervisor]
      • How to start an Experiment streaming machine
        – Transparent to real viewers
      • How to control the arrival/departure of each Experiment machine in a scalable way

  37. Transparent Orchestration Idea
      Viewer Enters Channel

  38. Transparent Orchestration Idea
      Experiment Enters Testing
      [Figure labels: real playpoint, virtual playpoint, R, E]

  39. Transparent Orchestration Idea
      Experiment Leaves Testing
      [Figure labels: real playpoint, virtual playpoint, R, E]

  40. Distributed Activation of Testing
      • Orchestrator distributes parameters to clients
      • Each client independently generates its arrival time according to the same distribution function F(t)
      • Together they achieve the global arrival pattern
        – Cox and Lewis Theorem
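
The per-client draw described on this slide can be sketched with inverse-transform sampling. This is an illustration under stated assumptions, not the authors' code: F(t) is assumed to be a piecewise-linear cumulative distribution supplied as breakpoints, and `make_inverse_cdf`, `draw_arrival_time`, and the example parameters are hypothetical.

```python
import bisect
import random
from typing import Callable, Sequence


def make_inverse_cdf(times: Sequence[float],
                     cdf: Sequence[float]) -> Callable[[float], float]:
    """Invert a piecewise-linear F(t) given as breakpoints.

    Assumes cdf is strictly increasing from 0.0 to 1.0.
    """
    def inverse(u: float) -> float:
        i = bisect.bisect_left(cdf, u)
        if i == 0:
            return times[0]
        t0, t1 = times[i - 1], times[i]
        f0, f1 = cdf[i - 1], cdf[i]
        return t0 + (u - f0) * (t1 - t0) / (f1 - f0)
    return inverse


def draw_arrival_time(inverse_cdf: Callable[[float], float],
                      rng: random.Random) -> float:
    """One client's local decision: sample u in [0, 1) and map it through F^-1."""
    return inverse_cdf(rng.random())


# Hypothetical parameters the Orchestrator might distribute:
# half of the clients join in the first 100 s, the rest by 300 s.
inverse = make_inverse_cdf([0.0, 100.0, 300.0], [0.0, 0.5, 1.0])
rng = random.Random(42)  # a single RNG stands in for many independent clients
arrivals = sorted(draw_arrival_time(inverse, rng) for _ in range(10_000))
joined_by_100s = bisect.bisect_right(arrivals, 100.0)
print(joined_by_100s / len(arrivals))  # close to F(100) = 0.5
```

No client needs to know about any other: with N clients drawing from the same F(t), the expected number that have joined by time t is N·F(t), so the aggregate approximates the target pattern without per-client commands from the Orchestrator.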

  41. Orchestrator Components

  42. Roadmap
      • Motivation
      • Protection Design
      • Orchestration Design
      • Evaluations
      • Conclusions and Future Work

  43. Software Implementation
      • Compositional Runtime
        – Modular design, including scheduler, dynamic loading of blocks, etc.
        – 3400 lines of code
      • Pre-packaged blocks
        – HTTP integration, UDP sockets and debugging
        – 500 lines of code
      • Live streaming machine
        – 4200 lines of code

  44. Experimental Opportunities

  45. Protection and Accuracy
      Piece Missing Ratio    Buggy    R=rCDN    R=rCDN w/ bottleneck
      Virtual Playpoint      8.73%    8.72%     8.81%
      Real Playpoint         N/A      0%        5.42%

  46. Protection and Accuracy
      Piece Missing Ratio    PCE bottleneck    PCE w/ higher bottleneck
      Virtual Playpoint      9.13%             8.85%
      Real Playpoint         0.15%             0%

  47. Orchestration: Distributed Activation

  48. Utility on Top: Deterministic Replay
      Control non-deterministic inputs
      • Event
      • Message
      • Random seeds
      Practical per-client log size
                                    Log Size
      100 clients; 650 seconds      223KB
      300 clients; 1,800 seconds    714KB
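
A minimal record-and-replay sketch of the idea on this slide: if every non-deterministic input (events, messages, random seeds) passes through a log, a later run fed from that log repeats the original behavior. `InputLog`, `capture`, and the toy `run_client` are hypothetical names chosen for illustration; the slides do not describe the actual log format.

```python
import json
import random
from typing import Any, Callable, List, Optional


class InputLog:
    """Records non-deterministic inputs, or replays them from a prior run."""

    def __init__(self, entries: Optional[List[dict]] = None):
        self.replaying = entries is not None
        self.entries: List[dict] = list(entries) if entries is not None else []
        self._cursor = 0

    def capture(self, kind: str, produce: Callable[[], Any]) -> Any:
        if self.replaying:
            # Replay mode: ignore the live source and return the logged value.
            entry = self.entries[self._cursor]
            self._cursor += 1
            if entry["kind"] != kind:
                raise ValueError(
                    f"replay diverged: expected {entry['kind']}, got {kind}")
            return entry["value"]
        # Record mode: take the live value and append it to the log.
        value = produce()
        self.entries.append({"kind": kind, "value": value})
        return value


def run_client(log: InputLog) -> List[int]:
    """Toy client whose behavior depends only on logged inputs."""
    entropy = random.SystemRandom()  # stands in for network and OS input
    seed = log.capture("seed", lambda: entropy.randrange(2**32))
    rng = random.Random(seed)
    schedule = []
    for _ in range(3):
        piece = log.capture("message", lambda: entropy.randrange(1000))
        schedule.append(piece + rng.randrange(10))
    return schedule


recorded = InputLog()
first = run_client(recorded)
saved = json.dumps(recorded.entries)  # per-client log, serialized
second = run_client(InputLog(json.loads(saved)))
print(first == second)  # prints True
```

Logging only the seed, rather than every random draw, is one way to keep a per-client log small.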

  49. Roadmap
      • Motivation
      • Protection Design
      • Orchestration Design
      • Evaluations
      • Conclusions and Future Work

  50. Contributions
      • Design and implementation of a novel live streaming network that introduces performance evaluation as an intrinsic capability in production networks
        – Scalable (PCE) protection of QoE despite large-scale Experiment failures
        – Transparent orchestration for flexible testing

  51. Future Work
      • Large-scale deployment and evaluation
      • Apply the Shadow (Experiment->Validation->Repair) scheme to other applications
      • Extend the Shadow (Experiment->Validation->Repair) scheme
        – E.g., Repair need not do the same job as Experiment, as long as it masks visible failures

  52. Adaptive Rate Streaming Repair
      Columns: Follow, Base, Adaptive
      Accuracy: 1.26x, 1.26x, 1.26x
      Protected QoE: 1.59x, 1.42x, 1.58x
      Protection Overhead: 1.49 Kbps, 3.69 Kbps, 1.39 Kbps

  53. Thank you!

  54. Questions?

  55. Backup
