Drongo: Speeding Up CDNs with Subnet Assimilation from the Client
CoNEXT ’17, Incheon, South Korea (CDN & Caching Session)
Marc Anthony Warrior, Uri Klarman, Marcel Flores, Aleksandar Kuzmanovic
Bird’s Eye View
● What is Drongo?
● Why we need Drongo
● Performance Analysis
● Thoughts & Conclusions
● Questions
What is Drongo?
It’s a bird!
What is Drongo?
It’s a system that allows end users to enhance the quality of service (QoS) they get from Content Distribution Networks (CDNs).
(In this talk, QoS = latency.)
Why Latency?
● Latency is time.
● Latency is money.
○ Google (Marissa Mayer), Amazon (Greg Linden): Web 2.0 Summit, glinden.blogspot.com
● Latency is the bottom line.
○ “What we have found running our applications at Google is that latency is as important, or more important, for our applications than relative bandwidth.” (Amin Vahdat, Google)
Drongo helps you (the end user) lower your own latency!
Drongo’s Effect on Latency
[Plot: latency improvements for Google, Amazon, Alibaba, CDNetworks, ChinaNetCenter, and CubeCDN]
ONLY client-side changes
Example Scenario
Provider wants to serve a client.
Client is far away.
CDN = more replica locations.
DNS redirection: which replica serves the client?
Choose the “closest” server. This choice is nontrivial!
Often suboptimal choices!
Maybe it’s just a far-away LDNS... [Chen, SIGCOMM ’15; Huang, SIGCOMM CCR ’12; Alzoubi, WWW ’13; Rula, SIGCOMM ’14; ...]
Ordinary DNS Query
[Diagram: the nameserver sees only the LDNS IP, “somewhere in California”]
EDNS0 Client-Subnet extension (ECS)
[Diagram: the query now carries the client subnet alongside the LDNS IP, revealing the client is actually somewhere in New York]
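As a concrete aside (not from the talk itself): the sketch below issues an ECS query from the client side using the dnspython library. The resolver address, subnet, and the helper name ecs_query are illustrative assumptions; in practice the query must reach a nameserver that actually honors ECS.

```python
# A minimal sketch of a client-side ECS query using dnspython.
# The resolver and subnet are placeholders, not values from the talk.
import dns.edns
import dns.message
import dns.query
import dns.rdatatype

def ecs_query(qname, subnet, prefix_len=24, resolver="8.8.8.8"):
    """Resolve qname while presenting `subnet` as the client's location."""
    ecs = dns.edns.ECSOption(subnet, prefix_len)
    query = dns.message.make_query(qname, dns.rdatatype.A,
                                   use_edns=0, options=[ecs])
    response = dns.query.udp(query, resolver, timeout=5)
    # Collect the A-record addresses the nameserver chose for this subnet.
    return [rr.address
            for rrset in response.answer
            for rr in rrset
            if rr.rdtype == dns.rdatatype.A]
```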
We used ECS: suboptimal redirection still happens... frequently.
Really? ... YES! We measured it!
How did we measure it?
● Find subnets directed to different replicas.
Subnet Assimilation
[Diagram: the DNS query carries some other subnet in the ECS field instead of the client’s own subnet]
How did we measure it?
● Find subnets directed to different replicas.
● Perform pings and downloads to each replica.
● Identify which subnet resulted in the “best” replica.
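A small helper the later sketches lean on: a hypothetical ping_rtt() (our name, not the talk's) that wraps the system ping tool and parses the Linux iputils summary line.

```python
# A hypothetical ping_rtt() helper (not from the talk), assuming the
# summary format of Linux iputils: "rtt min/avg/max/mdev = .../avg/... ms".
import re
import subprocess

def ping_rtt(host, count=3):
    """Return the average RTT to `host` in milliseconds."""
    out = subprocess.run(["ping", "-c", str(count), host],
                         capture_output=True, text=True).stdout
    match = re.search(r" = [\d.]+/([\d.]+)/", out)
    if match is None:
        raise RuntimeError(f"no RTT summary for {host}")
    return float(match.group(1))
```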
1. Get the “default” choice (use the client’s own subnet for ECS).
2. Traceroute to the default choice.
3. Get hop subnet choices (use the hops’ subnets for ECS).
4. Measure latencies.
Steps 1-4 form one “trial”.
Latency Ratio
Normalize each candidate replica’s RTT to the default choice’s RTT.
[Plot: latency ratio per hop subnet, y-axis from 0.6 to 1.4; a ratio of 1 means parity with the default]
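Putting steps 1-4 together, one trial might look like the sketch below. It reuses the ecs_query and ping_rtt helpers sketched earlier plus a hypothetical traceroute() wrapper; all of these names are assumptions, not Drongo's actual implementation.

```python
# A sketch of one Drongo trial (steps 1-4). ecs_query() and ping_rtt()
# are the helpers sketched earlier; traceroute() is a hypothetical
# wrapper returning the hop IPs toward a destination.
def run_trial(hostname, client_subnet):
    # 1. The CDN's default choice, using the client's own subnet for ECS.
    default_replica = ecs_query(hostname, client_subnet)[0]
    # 2. Traceroute toward the default choice to harvest hop IPs.
    hop_ips = traceroute(default_replica)
    # 3. Re-query, presenting each hop's subnet via ECS.
    candidates = {hop: ecs_query(hostname, hop)[0] for hop in hop_ips}
    # 4. Measure client-to-replica RTTs, normalized to the default.
    default_rtt = ping_rtt(default_replica)
    ratios = {hop: ping_rtt(replica) / default_rtt
              for hop, replica in candidates.items()}
    return default_rtt, ratios
```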
We’re looking for this: a dip below a ratio of 1.
Valley = a better replica choice obtained from a hop’s subnet
[Diagram: traceroute hops with each subnet’s replica choice; client-to-replica RTT shaded from 0 ms to 100 ms]
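In code, a valley is just a hop subnet whose replica beats the default by a sufficient margin. A minimal sketch, using the depth threshold V_thresh that the talk introduces later (0.95 is the value the evaluation settles on):

```python
# A minimal valley detector: keep hop subnets whose latency ratio is
# at or below the depth threshold V_thresh (0.95 per the evaluation).
def find_valleys(ratios, v_thresh=0.95):
    """ratios maps hop subnet -> (hop replica RTT) / (default RTT)."""
    return {hop: r for hop, r in ratios.items() if r <= v_thresh}
```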
PlanetLab Sees Valleys!
● Google: 20.24%
● Amazon: 14.02%
● Alibaba: 33.68%
● CDNetworks: 15.61%
● ChinaNetCenter: 27.42%
● CubeCDN: 38.58%
Room for improvement!
5. Use the best subnet for ECS, and get the best mapping!
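Step 5 in sketch form, again reusing the assumed ecs_query() helper: present the winning hop’s subnet in subsequent ECS queries (the “subnet assimilation” of the title), falling back to the client’s own subnet when nothing beat the default.

```python
# A sketch of step 5: assimilate the winning hop subnet, falling back
# to the client's own subnet when no hop beat the default mapping.
def best_subnet(ratios, client_subnet):
    if not ratios:
        return client_subnet
    hop, ratio = min(ratios.items(), key=lambda kv: kv[1])
    return hop if ratio < 1.0 else client_subnet

def resolve_assimilated(hostname, client_subnet, ratios):
    return ecs_query(hostname, best_subnet(ratios, client_subnet))
```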
Are Valleys Predictable?
● Trials are not “fast”.
● We want valleys “on the fly”.
● We need to find valley-prone subnets.
Testing Persistence
[Timeline of consecutive trials, 0 to 20; compare Trial A vs. Trial B]
Latency Ratio Difference Over Time
Latency ratio = (hop replica RTT) / (default replica RTT)
Testing Persistence
[Timeline: compare Window A vs. Window B, then Window A vs. Window C, taken 15 hours apart]
Latency Ratio Difference Over Time
[Plots: latency-ratio differences across trials and windows]
SURPRISE! The Internet is crazy!
Filter: at least one valley
Subnet A: {0,0,0,0,0,V,0,0,0,0,0,0,V}
Subnet B: {0,0,0,0,0,0,0,0,0,0,0,0,0}
Subnet C: {V,V,V,V,0,0,0,0,V,V,V,0,V}
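A sketch of this filter, assuming each subnet's trial history is recorded like the braces above, with a truthy value marking a valley:

```python
# Keep only subnets whose history contains at least one valley; a
# history is a per-trial list such as [0, 0, 1, 0] (1 marks a valley).
def at_least_one_valley(histories):
    """histories maps subnet -> list of per-trial valley indicators."""
    return {subnet: h for subnet, h in histories.items() if any(h)}
```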
Filter: at least one valley
Latency ratio = (hop replica RTT) / (default replica RTT)
[Plot: after filtering, the latency-ratio difference curve is very flat and close to zero]
Parameter Exploration
V_thresh = how deep are the valleys from useful subnets?
[Plot: latency ratios for replicas A (1), B (0.9), and C (0.6), with the V_thresh cutoff drawn near 1]
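A worked example with the ratios shown on the slide, using the V_thresh value of 0.95 that the evaluation later settles on:

```python
# With V_thresh = 0.95, replicas B (0.9) and C (0.6) count as valleys;
# A (1.0) does not, since it is no better than the default.
ratios = {"A": 1.0, "B": 0.9, "C": 0.6}
v_thresh = 0.95
valleys = {name: r for name, r in ratios.items() if r <= v_thresh}
assert set(valleys) == {"B", "C"}
```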
V_freq = how often do valleys occur in useful subnets?
[Diagram: a training window spanning several consecutive trials]
Example with V_freq = 2/5: a subnet with valleys in at least 2 of the 5 trials in the window is valley-prone; a subnet with fewer is NOT valley-prone.
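The check itself is one line. A sketch, over the same per-trial histories as the filter earlier:

```python
# A subnet is valley-prone when the fraction of trials in the training
# window containing a qualifying valley reaches V_freq (e.g. 2/5).
def is_valley_prone(history, v_freq):
    """history: one valley indicator per trial in the training window."""
    return sum(bool(v) for v in history) / len(history) >= v_freq
```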
Overview of Drongo:
1. Collect a training window.
2. Count the number of sufficiently deep valleys.
3. Apply subnet assimilation once:
a. the training window is complete, and
b. both parameters are met.
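Tying the sketches together under the same assumptions (the run_trial() helper and the two parameters): a plausible end-to-end decision loop, not Drongo's actual code.

```python
# An end-to-end sketch of Drongo's decision: gather a training window,
# flag sufficiently deep valleys per subnet, and assimilate only when a
# subnet's valley frequency meets V_freq. run_trial() is from earlier.
def drongo_choose_subnet(hostname, client_subnet, window_size=5,
                         v_thresh=0.95, v_freq=1.0):
    histories = {}   # hop subnet -> per-trial valley flags
    best_ratio = {}  # hop subnet -> lowest ratio observed
    for _ in range(window_size):
        _, ratios = run_trial(hostname, client_subnet)
        for hop, r in ratios.items():
            histories.setdefault(hop, []).append(r <= v_thresh)
            best_ratio[hop] = min(best_ratio.get(hop, r), r)
    # Frequency is measured against the whole window, so a hop that is
    # missing from some trials is penalized.
    prone = [s for s, h in histories.items()
             if sum(h) / window_size >= v_freq]
    if not prone:
        return client_subnet  # keep the default mapping
    return min(prone, key=lambda s: best_ratio[s])
```

With the parameters the evaluation settles on (V_freq = 1.0, V_thresh = 0.95), a hop subnet must show a sufficiently deep valley in every trial of the window before Drongo trusts it.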
System-Wide Performance
[Plots: CDFs of latency ratios under Drongo; lower is better]
Best parameters: V_freq = 1.0, V_thresh = 0.95
Switch Quality
[Plot: switch quality for Google, Amazon, Alibaba, CDNetworks, ChinaNetCenter, and CubeCDN, under global parameters vs. per-provider parameters]
Conclusion & Insights
● CDNs have a lot of room for improvement.
● Clients can help.
● Low requirements.
● Can provide 50% improvement.