Efficiently Delivering Online Services over Integrated Infrastructure
Hongqiang Harry Liu, Raajay Viswanathan, Matt Calder, Aditya Akella, Ratul Mahajan, Jitendra Padhye, Ming Zhang
Online Services
Online Service Delivery Infrastructure
• Proxies
• Wide area network
• Data centers
Online Service Delivery is Evolving
• Traditional infrastructure: multiple owners (parts operated by different ISPs, parts operated by content providers)
• Integrated infrastructure: owned by a single entity, including the WAN
Integrated Infrastructure Enables Joint Control of All Decisions
1. User-proxy mapping
2. Proxy-DC mapping
3. Paths in the wide area network
Advantages of Joint Control
[Figure: example topology with Proxy-1, Proxy-2, Proxy-3, the WAN, DC-1, and DC-2]
• Increase efficiency: total traffic without congestion
• Improve performance: aggregate end-to-end latency
Footprint: Jointly Controls the Integrated Infrastructure
[Figure: a controller over a topology with user group UG1, proxies P1, P2, P3, and data centers DC1 and DC2]
• Controller inputs: system topology and capacity, UG-proxy latency, user workload
• Users are grouped by location and service provider into user groups (UGs)
• Control decisions for a user group:
- UG-proxy mapping
- Proxy-DC mapping
- Network paths
• Goals:
- Maximize congestion-free traffic
- Minimize end-to-end latency
Outline
• Challenges in computing forwarding configuration
• Other challenges in realizing Footprint
• Evaluation
Computing Configuration: Basic Approach
• Estimated user demands for an epoch: $\{d_1, d_2, \ldots, d_n\}$
• Resource capacities: $\{C_1, C_2, \ldots, C_m\}$
• Estimated load from user $u$ on resource $r$, with weights $w_u^r$: $l_u^r = d_u \cdot w_u^r$
• Capacity constraint for resource $r$: $\sum_u l_u^r \le C_r$
• Linear program objective: maximize $\sum_r \sum_u l_u^r - \text{latencies}$
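The load and capacity definitions on this slide can be sketched in a few lines of Python. This is only an illustration, not Footprint's implementation: it assumes that `weights[u][r]` is the fraction of user group u's demand placed on resource r, and the function names and numbers are hypothetical. It evaluates a given assignment rather than solving the linear program.

```python
def resource_loads(demands, weights):
    """Return the total load on each resource: sum over u of d_u * w_u^r.

    demands: list of d_u, one entry per user group.
    weights: weights[u][r], assumed here to be the fraction of user
             group u's demand that is sent to resource r.
    """
    num_resources = len(weights[0])
    loads = [0.0] * num_resources
    for d_u, w_u in zip(demands, weights):
        for r, w_ur in enumerate(w_u):
            loads[r] += d_u * w_ur  # l_u^r = d_u * w_u^r
    return loads


def is_congestion_free(demands, weights, capacities):
    """Check the capacity constraint sum_u l_u^r <= C_r for every resource."""
    loads = resource_loads(demands, weights)
    return all(load <= cap for load, cap in zip(loads, capacities))


# Hypothetical example: two user groups, two resources.
demands = [100.0, 50.0]
weights = [[0.5, 0.5],   # user group 0 splits its demand evenly
           [0.0, 1.0]]   # user group 1 uses only resource 1
capacities = [60.0, 100.0]

print(resource_loads(demands, weights))
print(is_congestion_free(demands, weights, capacities))
```

In the linear program the weights would be chosen by a solver; here they are fixed by hand so the constraint check stays easy to follow.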
Does such a simple model suffice?
No, because of the nature of traffic from different online applications.
User Traffic Arrives over Sessions
• Multiple requests and responses over a single session (a TCP connection to the proxy)
• Sessions are long-lived and arrive all through the duration of an epoch
• The number of sessions on a resource varies over time
[Figure: #sessions on a resource vs. time (s)]
Session Stickiness
• Sessions stick to proxy and DC
- Long-lived TCP sessions
- No fresh DNS query in the middle of a session
• Old sessions are still forwarded to P1 after traffic is switched at t = T
[Figure: #sessions on P1 and on P2 vs. time (s) when UG1's traffic is switched at t = T, shown for non-sticky sessions and for sticky sessions]
Challenge: Temporal Variation of Load
• Load from a user group to a resource varies gradually because of:
1. Non-zero session lifetime
2. Session stickiness
• Resource capacity constraints should be satisfied during the entire epoch: $\sum_u n_u^r \le C_r$ becomes $\sum_u n_u^r(t) \le C_r$ for every $t$ in the epoch
• Computationally infeasible if $n_u^r(t)$ does not have a closed form
- Applications have arbitrary session life distributions
How to guarantee congestion-free delivery for traffic on sessions?
High Fidelity Modeling of Load
• $n^r(t) = n^r_{new}(t) + n^r_{old}(t)$: sessions from the current epoch plus sessions from the previous epoch
• $n^r_{new}(t)$ always holds the same pattern, proportional to the arrival rate of new sessions
• $n^r(t) = \lambda^r F(t) + n_0^r G(t)$
- $\lambda^r$: arrival rate of sessions (decision variable)
- $n_0^r$: old-session count
- $F$, $G$: pattern functions, derived from the session length distribution
[Figure: CDF of session lifetime (s), and $n_{new}$ and $n_{old}$ vs. time (s), with time axes from 0 to 300 s]
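The slide does not spell out how the pattern functions follow from the session length distribution. One plausible construction is sketched below under two assumptions that are not from the slides. First, new sessions arrive at a constant rate from t = 0, so $F(t) = \int_0^t S(s)\,ds$, where $S$ is the survival function (1 minus the CDF). Second, old sessions were in steady state at t = 0, so the fraction still alive is $G(t) = \int_t^{\infty} S(s)\,ds \,/\, \int_0^{\infty} S(s)\,ds$. All function names are hypothetical.

```python
def integrate(f, a, b, steps=1000):
    """Trapezoidal approximation of the integral of f over [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


def new_session_pattern(survival, t):
    """F(t): live new sessions at time t per unit arrival rate (assumed form)."""
    return integrate(survival, 0.0, t)


def old_session_pattern(survival, t, max_life):
    """G(t): fraction of old sessions still alive at time t (assumed form).

    max_life is a lifetime beyond which survival(s) is zero.
    """
    mean_life = integrate(survival, 0.0, max_life)
    return integrate(survival, min(t, max_life), max_life) / mean_life


def sessions_on_resource(lam, n_old, survival, t, max_life):
    """n(t) = lam * F(t) + n_old * G(t), as on the slide."""
    return (lam * new_session_pattern(survival, t)
            + n_old * old_session_pattern(survival, t, max_life))


# Hypothetical distribution: session lifetime uniform on [0, 100] seconds.
def uniform_survival(s):
    return max(0.0, 1.0 - s / 100.0)


print(new_session_pattern(uniform_survival, 50.0))
print(old_session_pattern(uniform_survival, 50.0, 100.0))
print(sessions_on_resource(2.0, 100.0, uniform_survival, 50.0, 100.0))
```

Because both functions are computed numerically from the survival function, the same sketch works for an empirical distribution with no closed form, which is the situation the previous slide describes.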
Discretizing the Temporal Model
• Approximate $F(t)$ by a tight piecewise linear upper bound $\bar{F}(t)$ with corners at $t_1, t_2, t_3, \ldots, T$
• $n^r(t) \le \bar{n}^r(t) = \lambda^r \bar{F}(t) + n_0^r G(t)$
• $\bar{n}^r(t)$ has its maximum at one of the corners
• Capacity constraints have to be checked only at a fixed set of points
• Optimal $\lambda^r$'s are obtained by solving a linear program
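For a single resource and a single user group, the linear program reduces to a closed form: the largest arrival rate is the tightest of the per-corner bounds. The sketch below shows that special case only. It assumes, as the slide states, that the bound peaks at a corner, and it takes the values of $\bar{F}$ and $G$ at the corners as inputs. The function names and numbers are hypothetical, and the full multi-resource problem would need an LP solver.

```python
def max_arrival_rate(capacity, n_old, f_bar_at_corners, g_at_corners):
    """Largest lam with lam * F_bar(t_k) + n_old * G(t_k) <= capacity
    at every corner t_k (single resource, single user group)."""
    best = float("inf")
    for f_k, g_k in zip(f_bar_at_corners, g_at_corners):
        headroom = capacity - n_old * g_k
        if headroom < 0:
            # Old sessions alone exceed capacity at this corner,
            # so no arrival rate is feasible; report 0.
            return 0.0
        if f_k > 0:
            best = min(best, headroom / f_k)
    return best


def peak_load(lam, n_old, f_bar_at_corners, g_at_corners):
    """Maximum of the upper bound, evaluated only at the corners."""
    return max(lam * f_k + n_old * g_k
               for f_k, g_k in zip(f_bar_at_corners, g_at_corners))


# Hypothetical corner values at t = 0, 100, 200, 300 seconds.
f_bar = [0.0, 80.0, 120.0, 130.0]
g = [1.0, 0.4, 0.1, 0.0]

lam = max_arrival_rate(1000.0, 500.0, f_bar, g)
print(lam)
print(peak_load(lam, 500.0, f_bar, g))
```

Checking only the corners turns a constraint over every instant of the epoch into a handful of linear inequalities in the arrival rate, which is what keeps the problem a linear program.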
Footprint: System Implementation
• Gathering inputs
• Computing optimal forwarding
• Implementing computed configuration
Footprint: Inputs to the Controller
• Input data collected every 5 minutes
• Inputs:
– User group-proxy latency measurements
  • Piggy-back on end-host applications
  • Instrumented JavaScript on bing.com webpage [Calder et al., IMC 2015]
– User workload
  • Estimated using observed workload in prior epochs
– System health status
  • From Microsoft internal system monitoring pipelines
• Deployed in production
Implementing Computed Configuration
• UG-proxy mapping: DNS (BIND)
• Proxy-DC mapping: custom software to change configuration
• WAN path selection: OpenFlow
• Prototyped on a modest-sized testbed
Evaluation
1. Joint decisions
2. Temporal modeling
Evaluation Setup
• Trace-driven simulations
• Data
- Taken from production deployment of Footprint
- One week's worth of data
- Multiple topologies (North America, Europe)
• Scale
- O(10k) user groups
- O(100) routers and links
- O(100) proxies
- O(10) data centers
• Metrics
- Efficiency: maximum traffic with no congestion
- Performance: aggregated end-to-end latency
Evaluation: Efficiency of Joint Control
Baseline: FastRoute [Flavel et al., NSDI 2015]
• UG-proxy: closest proxy, decided by anycast routing
• Proxy-DC: closest DC, based on active measurements
• WAN path selection: independent traffic engineering module
[Figure: bar chart of normalized traffic scale, FastRoute vs. Footprint]
• Footprint can carry 2x more load because user traffic is diverted to resources with unused capacity
Evaluation: Latency Improvement
• Compare end-to-end latency at 70% capacity of FastRoute
[Figure: bar chart of latency (ms) for FastRoute and Footprint, broken down into queuing delay, internal propagation delay, and external delay]
• Footprint decreases overall latency by ~60%
Evaluation: Efficiency of Temporal Modeling
• Compare with non-temporal models
– JointAverage: $n^r(t) = \lambda \times \text{average session length}$
– JointWorst: $n^r(t) = \max_t(\text{\#old sessions}) + \max_t(\text{\#new sessions})$
[Figure: bar chart of normalized traffic scale for JointAverage, JointWorst, and Footprint; the bar labels 2.3, 1.48, and 1.18 appear in the chart]
• More than 50% gains with respect to non-temporal models
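The difference between the three load estimates can be seen on a toy example. The sketch below is an illustration only. It assumes the pattern values $F(t_k)$ and $G(t_k)$ are already sampled at a few time points, and every number in it is hypothetical rather than taken from the evaluation.

```python
def joint_average_load(lam, mean_session_length):
    """JointAverage estimate: lam * average session length."""
    return lam * mean_session_length


def joint_worst_load(lam, n_old, f_values, g_values):
    """JointWorst estimate: peak old sessions plus peak new sessions,
    even though the two peaks occur at different times."""
    return n_old * max(g_values) + lam * max(f_values)


def temporal_peak_load(lam, n_old, f_values, g_values):
    """Temporal estimate: peak of the combined load over the sampled times."""
    return max(lam * f + n_old * g for f, g in zip(f_values, g_values))


# Hypothetical pattern values sampled at t = 0, 100, 200, 300 seconds.
f_values = [0.0, 80.0, 120.0, 130.0]
g_values = [1.0, 0.4, 0.1, 0.0]

print(joint_worst_load(5.0, 500.0, f_values, g_values))
print(temporal_peak_load(5.0, 500.0, f_values, g_values))
```

JointWorst adds two peaks that never coincide, so it reserves capacity that is never used. The temporal estimate tracks old sessions draining while new ones build up.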
Related Work
To coordinate or not to coordinate? [Narayana et al., SIGMETRICS 2012]
• Cooperative world vs. single-entity world
• Show importance of temporal load modeling
Summary
• Joint decision for proxy, DC, and WAN path selection
- 100% increase in supported users
- 60% reduction in end-to-end latency
• High fidelity temporal models are 50% more efficient than non-temporal models