CloudMirror: Tenant Network Abstraction that Reflects Applications’ Needs
Myungjin Lee, University of Edinburgh
In collaboration with: Jeongkeun “JK” Lee, Lucian Popa, Bryan Stephenson, Yoshio Turner, Sujata Banerjee, Puneet Sharma

Need Bandwidth Guarantees for Predictable Performance
• Big-data applications require high bandwidth
  • Hadoop Sort needs ~500 Mbps
• Web services have stringent latency requirements
  • Amazon – “Every 100ms latency costs 1% in sales”
• Insufficient bandwidth leads to sharp increase in response time
[Figure: Wikipedia benchmark, web response time (msec, 0 to 2500) vs. bandwidth provision (100%, 92%, 83%, 79%); annotations: 250ms bottleneck-free, 2 secs browser timeout]

No Bandwidth Guarantees
• Weak or no network SLAs in public clouds
  • HP Cloud, Amazon EC2, Rackspace, Azure…
[Figure: Amazon EC2 instance types and HP Cloud instance types; network: ?]

Goal: Network Abstraction for Expressing Bandwidth Demands
• Challenge: applications’ complex communication patterns
[Figure: MS Bing.com datacenter] Source: [Bodik, Sigcomm’12]

Solution: CloudMirror
1. New abstraction for BW guarantees, Tenant Application Graph (TAG)
2. VM placement algorithm that efficiently utilizes network & compute resources

                               Pipe        Virtual Cluster (VC)   VOC (2-level VC)   TAG
Ease of use                    ✗           ✓                      ✗                  ✓
Flexibility                    ✗           ✓                      ✓                  ✓
Efficiency                     ✗           ✗                      ✗                  ✓ (2X BW efficiency)
Algorithm run time for 1K VMs  > 10 mins   < 1 sec                < 1 sec            < 1 sec

Pipe Model
• Lacks statistical multiplexing
• Specifies every VM-to-VM communication
• Inflexible and inefficient
• O(n^2) pipes, n: # of VMs
• Slow: O(n^4) algorithm run time
[Figure: web and DB VMs connected by pipes of bandwidth B each; B + B = total 2 · B bandwidth reserved, while actual demand = B]

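To make the O(n^2) growth concrete, here is a minimal sketch that counts pipes for a few tenant sizes. It assumes one directed pipe per ordered VM pair; the slide does not say whether pipes are directed, and `pipe_count` is a hypothetical helper name.

```python
def pipe_count(n_vms: int) -> int:
    """Number of VM-to-VM pipes in a full mesh.

    Assumption: one directed pipe per ordered pair of distinct VMs.
    """
    return n_vms * (n_vms - 1)


# Pipe count grows quadratically with the number of VMs.
for n in (10, 100, 1000):
    print(n, pipe_count(n))
```

Under this assumption a 1K-VM tenant would need close to a million pipe specifications, which is the usability and run-time problem the slide points at.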
Virtual Cluster Model
• Hose Model [Duffield, SigComm’99]
• All VMs connected to a single virtual switch
• Pros
  • Per-VM bandwidth: statistical multiplexing
  • Easy to map on physical topology
• Cons
  • Doesn’t capture communication patterns accurately
  • Leads to inefficient bandwidth reservation
[Figure: VMs X, Y, Z of one tenant attached to a Virtual Switch with bandwidth guarantees B_X, B_Y, B_Z]

Virtual Cluster Example
[Figure panels: 3-tier web example (Web, App, DB, N VMs each); Virtual Cluster modeling (per-VM hoses B and 2B); physical deployment example with links L_1 and L_2]
B: per-VM per-edge bandwidth
N: number of VMs in each tier
Virtual Cluster reservation at L_2: 2B · N
App-DB demand = B · N
2X bandwidth usage by Virtual Cluster

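The 2X figure can be reproduced with a short calculation. The sketch below uses the standard hose-model bound: traffic across a link cannot exceed the total hose bandwidth on either side, so the reservation is the smaller of the two sums. The placement (DB tier alone below L_2) and the per-VM hose values (web = B, app = 2B, DB = 2B) are assumptions chosen to match the numbers on the slide, since the figure itself did not survive extraction; `hose_reservation` is a hypothetical helper.

```python
def hose_reservation(inside, outside):
    """Hose-model (Virtual Cluster) reservation on one link.

    inside / outside: lists of (num_vms, per_vm_hose_bw) for the tiers on
    each side of the link. The reservation is bounded by the smaller side.
    """
    def total(tiers):
        return sum(n * bw for n, bw in tiers)

    return min(total(inside), total(outside))


N, B = 10, 100  # hypothetical values: 10 VMs per tier, B = 100 Mbps

# Assumed placement: DB tier below link L_2, web and app tiers above it.
# Assumed hoses: web = B, app = 2B, DB = 2B.
vc_reserved = hose_reservation(inside=[(N, 2 * B)],
                               outside=[(N, B), (N, 2 * B)])

# Only app-DB traffic actually crosses L_2.
actual_demand = B * N

print(vc_reserved, actual_demand, vc_reserved / actual_demand)
```

With these assumed inputs the Virtual Cluster reserves 2B · N on the link while the application only needs B · N, which is the gap the slide highlights.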
Virtual Oversubscribed Cluster (VOC) [Ballani, Sigcomm’11]
• 2-level hierarchical virtual cluster
• Also inefficient, doesn’t accurately capture general application structure
[Figure: oversubscribed Root Virtual Switch connecting Virtual Clusters of N_X, N_Y, N_Z VMs with per-VM bandwidth B_X, B_Y, B_Z]

Intuition: Model the Application, Not the Network
[Figure: two layers, Application and Network. Our work, TAG = model applications; prior work = model virtual networks]

Tenant Application Graph (TAG)
• TAG is a directed graph
• Each vertex represents an application component
  • Component: a set of VMs (or JVMs) performing the same function
• Each directed edge represents per-VM sending and receiving bandwidth demands
  • Each web VM is guaranteed bandwidth B_1 for sending traffic to any VMs in the DB tier
[Figure: TAG with components web (N_1) and DB (N_2); bandwidth labels B_1, B_2, B_2^in]

Bandwidth Models in TAG
• Directional edge between two vertices → Virtual Trunk
• Self-edge → Virtual Cluster
[Figure: Web(N_1) and DB(N_2) joined by Virtual Trunk T_1→2 with labels B_1 and B_2; Virtual Switch with label B_2^in]
Total guarantee of T_1→2 = min(B_1 · N_1, B_2 · N_2)

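The trunk guarantee formula is easy to check numerically. The following is a minimal sketch; the `TagEdge` class and the example numbers are hypothetical and only illustrate min(B_1 · N_1, B_2 · N_2).

```python
from dataclasses import dataclass


@dataclass
class TagEdge:
    """Directed TAG edge from a sending tier to a receiving tier."""
    send_bw: float    # B_1: per-VM sending guarantee
    n_senders: int    # N_1: number of VMs in the sending tier
    recv_bw: float    # B_2: per-VM receiving guarantee
    n_receivers: int  # N_2: number of VMs in the receiving tier

    def trunk_guarantee(self) -> float:
        # The trunk is limited by whichever side can carry less in total.
        return min(self.send_bw * self.n_senders,
                   self.recv_bw * self.n_receivers)


# Hypothetical example: 4 web VMs sending 100 each, 6 DB VMs receiving 50 each.
web_to_db = TagEdge(send_bw=100, n_senders=4, recv_bw=50, n_receivers=6)
print(web_to_db.trunk_guarantee())
```

Here the senders could offer 400 in total but the receivers only accept 300, so the trunk guarantee is 300.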
TAG is Intuitive
• TAG is easy to use because it directly mirrors application structure
• Users don’t need to be concerned with the network topology
• VOC requires the user to specify oversubscription ratio
[Figure: 3-tier example (Web, App, DB, N VMs each, bandwidth B per edge) and its TAG modeling; for VOC, oversubscription ratio = ???]

TAG is Efficient
• Accurately captures communication patterns
• TAG requires no more BW than VOC
[Figure: 3-tier example (Web, App, DB, N VMs each, bandwidth B per edge), TAG modeling, and physical deployment (DB | Web + App); link labels B · N]

CloudMirror Operation
[Figure: inputs are the TAG (e.g., Web, DB), available VM slots (host1: 10, host2: 50, host3: 25), and network topology & BW reservation state; outputs are VM placement and BW reservation]

VM Placement
• Goal
  • Map graph-based TAG onto a tree-shaped topology
  • Deploy as many TAGs as possible while guaranteeing SLAs
• Principle: maximize consolidation
  1) Localize traffic and save core bandwidth
    • Place tenant under the smallest feasible subtree [Ballani, Sigcomm’11]
    • Pack tiers with high inter-tier BW: sized min-cut problem
  2) Fully utilize network & compute resources
    • Place high-BW, low-BW VMs together: knapsack problem
[Figure: example TAG with Web(1), App(1), DB(1), Cache(1) and edge labels 10, 200, 90; VMs labeled W, W, A, A, C, D on a topology with link labels 100]

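The "smallest feasible subtree" step can be sketched as a depth-first search over the topology tree. This is a simplified illustration, not the CloudMirror algorithm: it checks only free VM slots and ignores bandwidth, min-cut packing, and the knapsack step. The `Node` class, the rack grouping, and the function name are hypothetical; the host slot counts are borrowed from the CloudMirror Operation slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A host (leaf) or switch (inner node) in a tree-shaped topology."""
    name: str
    free_slots: int = 0  # only meaningful for hosts
    children: List["Node"] = field(default_factory=list)

    def total_free_slots(self) -> int:
        if not self.children:
            return self.free_slots
        return sum(c.total_free_slots() for c in self.children)


def smallest_feasible_subtree(root: Node, vms_needed: int) -> Optional[Node]:
    """Return the first lowest-level subtree with enough free slots, or None.

    Assumption: feasibility means slot count only (no bandwidth check).
    """
    if root.total_free_slots() < vms_needed:
        return None
    for child in root.children:
        found = smallest_feasible_subtree(child, vms_needed)
        if found is not None:
            return found
    return root  # no child fits on its own, so this subtree is the smallest


# Hypothetical two-rack topology using the slot counts from the slides.
rack1 = Node("rack1", children=[Node("host1", 10), Node("host2", 50)])
rack2 = Node("rack2", children=[Node("host3", 25)])
dc = Node("root", children=[rack1, rack2])

for need in (30, 55, 80, 90):
    hit = smallest_feasible_subtree(dc, need)
    print(need, hit.name if hit else None)
```

A 30-VM tenant fits on a single host, a 55-VM tenant needs a whole rack, and an 80-VM tenant spans the root, which is where core bandwidth starts being consumed.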
Evaluations
• Methodology
  • Simulating bandwidth reservations and VM placement given a stream of tenant arrivals
• Microsoft Bing.com data. Source: [P. Bodik et al., Sigcomm’12]
  • Various communication patterns
  • Component size: 1 ~ 300 VMs
  • Tenant: a set of connected components
• 3-level tree topology
  • Modeled after a real HP datacenter
  • 2048 hosts, 50 VM slots per host

Results
• Bandwidth usage
  • Assume no network bottleneck
  • Virtual Cluster consumes 76% more BW than TAG
• VM slot util. vs. net. capacity
  • Deploy tenants one by one till first tenant rejection
[Figure: result plots comparing Virtual Cluster and TAG]

Conclusion
• TAG models application structure, not physical topology
  • Graph-based
  • Easy to use, efficient and flexible
• Placement algorithm efficiently maps TAGs on tree-shaped topology
• Blurb: SICSA Software Defined Networking Workshop
  • Tentative date: mid/late Sept.
  • A half day event with invited talks, panel discussion, etc.
  • More details will be announced via NGN mailing list
E-mail: myungjin.lee@ed.ac.uk