
Survivable and Bandwidth-Guaranteed Embedding of Virtual Clusters - PowerPoint PPT Presentation



  1. Survivable and Bandwidth-Guaranteed Embedding of Virtual Clusters in Cloud Data Centers
     IEEE INFOCOM 2017, Datacenter Networks
     Ruozhou Yu, Guoliang Xue, and Xiang Zhang (Arizona State University); Dan Li (Tsinghua University)

  2. Outline
     - Introduction and Motivation
     - System Model and Algorithm Design
     - Performance Evaluation
     - Conclusions

  3. The Cloud Shift
     - Cloud computing: often seen as an omnipotent solution to all kinds of performance requirements
       (Figure: "The Mighty Cloud")
     - But is it as mighty as it seems?

  4. Inside the Cloud
     - An illusion of infinite computing resources, created by large clusters of interconnected machines in data centers
     - Performance bottleneck: the cloud network!

  5. VM & Bandwidth q Traditional approach: Network-agnostic VM allocation q Recent advance: Bandwidth-guaranteed VM allocation q Or Virtual Cluster Embedding (VCE) ! v Existing algorithms can allocate bandwidth-guaranteed VMs with minimum bandwidth, migration costs, etc. q But we know that Cloud machines do fail, quite often… 5/25

  6. Survivable VCE
     - Question: How can we ensure VM availability even when its host machine could fail?
     - Answer: We prepare extra VMs and bandwidth, just in case!
     - Question: And how much will that cost us?
     - Answer: No problem! We can minimize that!
     - Question: How are we going to achieve that?
     - Answer: Dynamic programming!

  7. Outline
     - Introduction and Motivation
     - System Model and Algorithm Design
     - Performance Evaluation
     - Conclusions

  8. Network Topology
     - Assumption: the DCN has a tree structure
       - Abstracts many common DCN topologies (FatTree, VL2, etc.)
     (Figure: an original FatTree with 1 Gbps links, abstracted into a tree with 4 Gbps core links, 2 Gbps aggregation links, and 1 Gbps edge links; a sketch of this abstraction follows.)
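     To make the abstraction concrete, here is a minimal Python sketch building the abstract tree from the figure's numbers: each abstract link carries the aggregate capacity of the physical links it replaces. The type and field names (TreeNode, uplink_mbps) are ours, purely illustrative, not from the paper.

        from dataclasses import dataclass, field

        @dataclass
        class TreeNode:
            name: str
            uplink_mbps: int          # capacity toward the parent (0 for the root)
            children: list = field(default_factory=list)

        def abstract_fattree():
            # 4 x 1 Gbps physical links toward the core collapse into one
            # 4 Gbps abstract link; each edge switch's 2 x 1 Gbps uplinks
            # collapse into one 2 Gbps link.
            root = TreeNode("c", 0)
            for p in range(1, 5):                      # 4 pods -> a1..a4
                agg = TreeNode(f"a{p}", 4000)
                root.children.append(agg)
                for e in range(1, 3):                  # 2 edge switches per pod
                    agg.children.append(TreeNode(f"e{p}{e}", 2000))
            return root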

  9. VM Survivability Model
     - Primary VMs: VMs that are active during normal operations
     - Backup VMs: VMs in standby mode, activated when a primary VM's PM fails
       - Each backup VM synchronizes the states of multiple primary VMs
     (Figure: primary VMs a, b and backup VM c; on a failure, the affected VM's work migrates to c.)
     - Question: Can we find a bandwidth-guaranteed allocation of both primary and backup VMs that covers an arbitrary single-PM failure, with the minimum number of backup VMs?

  10. Dynamic Programming for SVCE
     - Given: topology tree T, request J = <N, B>
     - Assumption: single PM failure
       - Interpretation: a failure can be either within a subtree or outside it, but not both
     - Key observation: each subtree's ability to provide VMs is independent of the rest of the tree, both during normal operation and during an arbitrary failure
     - Two layers of dynamic programming:
       - Outer DP: DP over entire subtrees
       - Inner DP: DP over the first k sub-subtrees of each subtree

  11. DP in Details
     - Outer DP: N_v[n0, n1] is the minimum number of total VMs needed in subtree T_v to ensure that:
       - T_v can provide at least n0 VMs when no failure is in T_v;
       - T_v can provide at least n1 VMs when any PM fails in T_v.
     - Inner DP: N_v'[n0, n1, k] is the minimum number of total VMs needed in the first k subtrees of v to ensure that:
       - The k subtrees can provide n0 VMs when no failure is in them;
       - The k subtrees can provide n1 VMs when any PM fails in them.
     - Alternately update the two tables:
       - N_v[n0, n1] depends on N_v'[n0', n1', d_v] (d_v is the number of subtrees under v);
       - N_v'[n0, n1, k] depends on N_v[n0'', n1''] of lower-layer nodes.
     A runnable sketch of this two-layer DP follows.
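     The sketch below is a minimal Python rendering of these table definitions, assuming a tree of PMs and switches under the hose model. All names (Node, hose_bw, svce_table) are ours, and bandwidth feasibility is checked only on the offered VM counts n0 and n1, a simplification; the paper's exact recurrences may differ in detail.

        import math
        from itertools import product

        INF = math.inf

        class Node:
            """A PM (leaf, with VM slots) or a switch (with children).
            `uplink` is the bandwidth of the link toward the parent."""
            def __init__(self, uplink, slots=0, children=None):
                self.uplink = uplink
                self.slots = slots
                self.children = children or []

        def hose_bw(n, N, B):
            # Hose model: n of N VMs in a subtree need min(n, N-n)*B on its uplink.
            return min(n, N - n) * B

        def svce_table(v, N, B):
            """Outer DP: T[n0][n1] = min #VMs in v's subtree so that it offers
            n0 VMs with no failure inside, and n1 VMs under any single PM
            failure inside. Counts are capped at N."""
            T = [[INF] * (N + 1) for _ in range(N + 1)]
            if not v.children:
                # PM level: a PM offers what it hosts, and 0 if it fails itself.
                for n0 in range(min(v.slots, N) + 1):
                    if hose_bw(n0, N, B) <= v.uplink:
                        T[n0][0] = n0
            else:
                # Inner DP over the first k children: A[n0][n1].
                A = [[INF] * (N + 1) for _ in range(N + 1)]
                A[0][0] = 0
                for child in v.children:
                    C = svce_table(child, N, B)
                    A2 = [[INF] * (N + 1) for _ in range(N + 1)]
                    for a0, a1, m0, m1 in product(range(N + 1), repeat=4):
                        if A[a0][a1] == INF or C[m0][m1] == INF:
                            continue
                        n0 = min(N, a0 + m0)
                        # Failure in an earlier child (a1 + m0) or in this one (a0 + m1).
                        n1 = min(N, a1 + m0, a0 + m1)
                        cost = A[a0][a1] + C[m0][m1]
                        if cost < A2[n0][n1]:
                            A2[n0][n1] = cost
                    A = A2
                # Switch level: keep only (n0, n1) feasible on v's uplink.
                for n0 in range(N + 1):
                    for n1 in range(N + 1):
                        if hose_bw(n0, N, B) <= v.uplink and hose_bw(n1, N, B) <= v.uplink:
                            T[n0][n1] = A[n0][n1]
            # Closure: offering more than asked also satisfies the ask.
            for n0 in range(N - 1, -1, -1):
                for n1 in range(N + 1):
                    T[n0][n1] = min(T[n0][n1], T[n0 + 1][n1])
            for n1 in range(N - 1, -1, -1):
                for n0 in range(N + 1):
                    T[n0][n1] = min(T[n0][n1], T[n0][n1 + 1])
            return T

        if __name__ == "__main__":
            # The two-PM example of the next slides: J = <2, 100 Mbps>.
            pm1 = Node(uplink=100, slots=2)
            pm2 = Node(uplink=100, slots=2)
            sw3 = Node(uplink=100, children=[pm1, pm2])
            T = svce_table(sw3, N=2, B=100)
            print(T[2][1], T[2][2])  # expect 2 and 4, matching the walk-through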

  12. Work-through Example
     - J = <2, 100 Mbps>: two PMs (PM1, PM2) behind switch SW3, every link 100 Mbps.
     - PM-level tables (N1[n0,n1] doubles as the inner table N3'[n0,n1,1]); a PM provides what it hosts, and nothing once it fails:

         N1[n0,n1] = N2[n0,n1]:    n0\n1 |  0   1   2
                                     0   |  0   ∞   ∞
                                     1   |  1   ∞   ∞
                                     2   |  2   ∞   ∞

     - N3'[n0,n1,2] / N3[n0,n1]: all entries still to be computed (x).

  13. Work-through Example
     - Filling N3'[n0,n1,2] / N3[n0,n1] at n0 = 2, n1 = 2: combine N1[2,0] = 2 with N2[2,0] = 2.
       With 2 VMs on each PM, either single-PM failure leaves 2 VMs on the surviving PM.
     - Result so far: N3[2,2] = 4.

  14. Work-through Example
     - Filling N3[2,1]: the first candidate again combines N1[2,0] = 2 with N2[2,0] = 2, which certainly covers n1 = 1.
     - Candidate: N3[2,1] = 4.

  15. Work-through Example
     - A cheaper combination for N3[2,1]: N1[1,0] = 1 with N2[2,0] = 2.
       No failure: 3 VMs available; if either PM fails, the other still provides at least 1.
     - Improved: N3[2,1] = 3.

  16. Work-through Example
     - The cheapest combination for N3[2,1]: N1[1,0] = 1 with N2[1,0] = 1.
       No failure: 2 VMs; either single-PM failure still leaves 1 VM.
     - Final: N3[2,1] = 2. The brute-force check below confirms these entries.
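     The hand-filled entries can be validated independently of the DP: for the assumed two-PM topology, simply enumerate how many VMs to place on each PM and test every failure scenario. This is our own sanity check, not the paper's algorithm.

        from itertools import product

        N, B, CAP, SLOTS = 2, 100, 100, 2  # request <2, 100 Mbps>; 100 Mbps links; 2 slots/PM

        def hose_ok(n):
            # n active VMs in a subtree need min(n, N - n) * B on its uplink.
            return min(n, N - n) * B <= CAP

        def can_offer(x, need):
            # Can a PM holding x VMs feasibly activate at least `need` of them?
            return any(hose_ok(a) for a in range(need, x + 1))

        def covers(x1, x2, n0, n1):
            # Normal operation: activate a1 + a2 >= n0 across the two PMs,
            # respecting each PM link and the SW3 uplink.
            normal = any(hose_ok(a1) and hose_ok(a2) and hose_ok(min(a1 + a2, N))
                         for a1 in range(x1 + 1) for a2 in range(x2 + 1)
                         if a1 + a2 >= n0)
            # Single-PM failure: the surviving PM must offer n1 on its own.
            return normal and can_offer(x1, n1) and can_offer(x2, n1)

        for n0, n1 in [(2, 2), (2, 1)]:
            best = min(x1 + x2 for x1, x2 in product(range(SLOTS + 1), repeat=2)
                       if covers(x1, x2, n0, n1))
            print(f"N3[{n0},{n1}] = {best}")  # prints N3[2,2] = 4 and N3[2,1] = 2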

  17. Heuristic SVCE
     - Optimal DP time complexity: O(|V| N^6)
       - where |V| is the number of tree nodes and N is the number of requested VMs
     - Question: Can we find a near-optimal solution in less time?
     - Observation: if we find a normal VCE with N + N' VMs such that each PM hosts at most N' VMs, then we can always recover from any single PM failure.
     - Algorithm: search from N' = 1 to N; each time, use an existing VCE algorithm to find a VCE with N' extra VMs in which each PM hosts at most N' VMs. (A sketch follows.)
     - Time complexity: O(N · |V| log |V|)
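     A sketch of this search loop, assuming some existing bandwidth-guaranteed VCE routine embed_vce(tree, n_vms, B, per_pm_cap) is available (a hypothetical name and signature; any VCE algorithm supporting a per-PM VM cap would do) that returns a placement or None:

        def heuristic_svce(tree, N, B, embed_vce):
            # Capping each PM at N' VMs means a single PM failure destroys
            # at most N' VMs, which the N' extra VMs can replace.
            for n_extra in range(1, N + 1):
                placement = embed_vce(tree, N + n_extra, B, per_pm_cap=n_extra)
                if placement is not None:
                    return placement
            return None  # no survivable embedding found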

  18. Outline
     - Introduction and Motivation
     - System Model and Algorithm Design
     - Performance Evaluation
     - Conclusions

  19. Simulation Setup
     - Tree-structured DCN
       - 4-layer 8-ary tree (512 PMs, 73 switches)
       - 5 VM slots per PM
       - ToR bandwidth: 1 Gbps; aggregation/core bandwidth: 10 Gbps
     - Tenant VCs
       - 1000 requests
       - 15 VMs and 300 Mbps per VM, on average
       - Poisson arrivals
     - Compared algorithms:
       - OPT: optimal DP SVCE algorithm
       - HEU: heuristic SVCE algorithm
       - SBS: shadow-based solution (dedicated VC backup)

  20. Simulation Results: Average VM Usage

  21. Simulation Results: Acceptance Ratio

  22. Simulation Results: Running Time

  23. Outline
     - Introduction and Motivation
     - System Model and Algorithm Design
     - Performance Evaluation
     - Conclusions

  24. Conclusions
     - A first study of survivable VCE
       - A two-layer optimal DP algorithm
       - A faster, near-optimal heuristic algorithm
     - Discussion
       - Extension to tree-like topologies (FatTree, VL2, etc.)
       - Extension to cover a constant number of simultaneous failures
     - Future work
       - SVCE on generic data center topologies (BCube, Jellyfish, etc.)
       - Covering link failures in addition to PM failures

  25. Q&A? THANK YOU VERY MUCH! 25/25

  26. Hose-Model Bandwidth Guarantee
     - Request J = <N, B> with N = 7, B = 100 Mbps
     (Figure: a request of 7 VMs spread over subtrees a, b, c; subtree T_c attaches via a 200 Mbps uplink.)
     - Under the hose model, a subtree T_c hosting n_c active VMs needs min(n_c, N − n_c) · B on its uplink.
     - With the 200 Mbps uplink: min(n_c, 7 − n_c) · 100 ≤ 200, so the number of VMs T_c can offer (bandwidth-constrained) is n_c ∈ [0, 2] ∪ [5, 7].
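     The slide's arithmetic can be reproduced in a few lines of Python; the variable names are ours:

        N, B, CAP = 7, 100, 200  # request size, per-VM bandwidth (Mbps), uplink (Mbps)
        # n_c is feasible iff the hose-model demand min(n_c, N - n_c) * B fits the uplink.
        feasible = [n for n in range(N + 1) if min(n, N - n) * B <= CAP]
        print(feasible)  # [0, 1, 2, 5, 6, 7], i.e. n_c in [0,2] or [5,7]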

  27. DP in Details (2)
     - Bandwidth-feasible VM counts: n VMs in a subtree are feasible on an uplink of capacity b iff min(n, N − n) · B ≤ b.
     - Outer DP update:
       - PM level: a PM u with c_u slots provides only what it hosts, and nothing if it fails itself:
         N_u[n0, 0] = n0 if n0 ≤ c_u and n0 is bandwidth-feasible; N_u[n0, n1] = ∞ for n1 ≥ 1.
       - Switch level: N_v[n0, n1] = N_v'[n0, n1, d_v], restricted to (n0, n1) bandwidth-feasible on v's uplink.
     - Inner DP update:
       - No subtree: N_v'[0, 0, 0] = 0, and ∞ elsewhere.
       - k-th subtree: N_v'[n0, n1, k] = min over (a0, a1, m0, m1) of N_v'[a0, a1, k−1] + N_{v_k}[m0, m1],
         subject to a0 + m0 ≥ n0, a1 + m0 ≥ n1 (failure among the first k−1 subtrees), and a0 + m1 ≥ n1 (failure in the k-th subtree).

