Cache Contention Aware Virtual Machine Placement and Migration in Cloud Datacenters
Authors: Liuhua Chen, Haiying Shen and Stephen Platt
Presenter: Haiying Shen
IEEE ICNP, November 8-11, 2016, Singapore
2 Objective
An effective VM allocation algorithm should allocate as many VMs as possible to a PM while i) meeting explicit resource requirements (CPU, memory) and ii) minimizing contention on the last-level cache (LLC). Many previous VM allocation or migration methods provide a metric to choose the destination PM and the VM to migrate for objective i), but neglect objective ii).
3 Objective
[Figure: two VMs on a PM sharing the last-level cache; performance degradation due to the shared cache]
• Reduce cache interference in VM consolidation
4 Overview
• A brief review of cache hierarchy
• VM cache performance degradation prediction
• VM placement and migration algorithm
• Experimental results
• Conclusion with future directions
5 A brief review of cache hierarchy
[Figure: cache hierarchy with per-core 64KB L1 and 256KB L2 caches and a shared 8MB L3 (LLC); the access sequence [A,B,A,B,C,D,D,B] is replayed against an LRU stack, and each hit at stack depth d increments the stack distance counter C_d, yielding the stack distance profile]
6 Stack distance profile
The stack distance profile of VM i is f_i = {C_1, C_2, ..., C_A, C_{>A}}, where C_d counts the number of hits to the line in the d-th LRU stack position and C_{>A} counts the number of cache misses (A is the cache associativity).
[Figure: histogram of the number of accesses over the stack distance counters C_1 ... C_A, C_{>A}; positions up to A are cache hits, C_{>A} are cache misses]
7 Stack distance profile (cont.)
[Figure: under contention the effective cache space shrinks, so accesses that were hits at large stack distances turn into extra cache misses on top of the original misses in C_{>A}]
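To make the stack distance profile concrete, here is a small Python sketch (not from the paper) that replays an access trace through an LRU stack and builds the counters C_1..C_A and C_{>A}, reproducing the [A,B,A,B,C,D,D,B] example from slide 5.

```python
# Illustrative sketch: build a stack distance profile from a memory access
# trace using an LRU stack, as on slides 5-7.
from collections import defaultdict

def stack_distance_profile(trace, associativity):
    """Return counters {1: C_1, ..., A: C_A, '>A': C_>A} for an access trace."""
    lru_stack = []                      # most recently used line at index 0
    counters = defaultdict(int)
    for line in trace:
        if line in lru_stack:
            depth = lru_stack.index(line) + 1      # 1-based stack distance
            key = depth if depth <= associativity else '>A'
            counters[key] += 1                     # hit at depth d (or capacity miss)
            lru_stack.remove(line)
        else:
            counters['>A'] += 1                    # first touch: cold miss
        lru_stack.insert(0, line)                  # line becomes most recently used
    return dict(counters)

# The example trace from slide 5:
print(stack_distance_profile(list("ABABCDDB"), associativity=4))
# -> C_1 = 1, C_2 = 2, C_3 = 1, and 4 cold misses in C_>A
```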
8 Cache Contention Prediction
When VM i and VM j compete for a cache line (at the d-th position in the LRU stack), the probability of VM i "winning" the competition is proportional to VM i's number of accesses to this cache line, and inversely proportional to the total number of accesses to this cache line by the two VMs, i.e. roughly P_i(d) = C_i(d) / (C_i(d) + C_j(d)). The model extends to multiple co-located VMs.
9 Cache Contention Prediction (cont.)
The new stack distance profile of VM i under contention can then be estimated from these per-position winning probabilities.
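As an illustration only, the following sketch encodes the stated proportionality as a per-position winning probability and shifts the hits a VM is expected to lose into its miss counter; the exact estimation formula in the paper may differ.

```python
# Illustrative model of slides 8-9 (assumed form, not necessarily the paper's
# exact equation): when co-located VMs compete for the line at LRU stack
# position d, VM i keeps the line with probability proportional to its own
# access count at d.

def winning_probability(counters, vm, depth):
    """P(vm keeps the line at stack position `depth`) among all competing VMs."""
    total = sum(c.get(depth, 0) for c in counters.values())
    return counters[vm].get(depth, 0) / total if total else 0.0

def predicted_profile(counters, vm, associativity):
    """Expected stack distance profile of `vm` under contention: hits that the
    VM is expected to lose are shifted into the miss counter C_>A."""
    new = {'>A': counters[vm].get('>A', 0)}
    for d in range(1, associativity + 1):
        hits = counters[vm].get(d, 0)
        kept = hits * winning_probability(counters, vm, d)
        new[d] = kept
        new['>A'] += hits - kept          # lost hits become extra misses
    return new

# Example: two VMs sharing a 4-way LLC set
profiles = {'vm1': {1: 50, 2: 30, '>A': 5}, 'vm2': {1: 80, 2: 10, '>A': 8}}
print(predicted_profile(profiles, 'vm1', associativity=4))
```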
10 Performance Degradation Prediction
• Sensitivity is a measure of how much a VM will suffer when cache space is taken away from it due to contention.
• Intensity is a measure of how much a VM will hurt others by taking away their space in a shared cache.
• The degradation of co-scheduling v_i and v_j together is the sum of the performance degradations of the two VMs (a sketch follows below).
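A minimal sketch of a pain metric consistent with these definitions; the product form Pain(i|j) = Sensitivity(i) x Intensity(j) is an assumption used for illustration, since the slide only fixes that the co-scheduling pain is the sum of both VMs' degradations.

```python
# Hedged sketch of a degradation/pain metric in the spirit of slide 10.

def pain(sensitivity, intensity, vm_i, vm_j):
    """Estimated degradation of vm_i when co-located with vm_j (assumed form)."""
    return sensitivity[vm_i] * intensity[vm_j]

def coschedule_pain(sensitivity, intensity, vm_i, vm_j):
    """Total pain of placing vm_i and vm_j behind the same last-level cache:
    the sum of both VMs' estimated degradations."""
    return (pain(sensitivity, intensity, vm_i, vm_j)
            + pain(sensitivity, intensity, vm_j, vm_i))

sensitivity = {'vm1': 0.30, 'vm2': 0.10}   # how much each VM suffers when it loses cache
intensity   = {'vm1': 0.50, 'vm2': 0.80}   # how much each VM takes from others
print(coschedule_pain(sensitivity, intensity, 'vm1', 'vm2'))  # 0.30*0.80 + 0.10*0.50
```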
11 VM placement and migration algorithm
• Objective: minimize the total pain of the co-location of the new VMs with the existing VMs
• Define it as an optimization problem
• Transform it into an integer linear program (ILP)
• Use the lpsolve 5.5 tool to find the optimal solution
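For illustration, one plausible ILP formulation is sketched below, with binary placement variables and a standard linearization of the pairwise pain terms. The paper solves its own ILP with lpsolve 5.5; PuLP/CBC, the variable names, and the capacity constraint shown here are assumptions made for the sketch.

```python
# Hedged sketch of a pain-minimizing placement ILP (my own linearization,
# not necessarily the paper's formulation).
import itertools
import pulp

def place_vms(vms, pms, pair_pain, capacity, demand):
    prob = pulp.LpProblem("cache_aware_placement", pulp.LpMinimize)
    # x[v][p] = 1 iff VM v is placed on PM p
    x = pulp.LpVariable.dicts("x", (vms, pms), cat="Binary")
    # y[(u, v, p)] linearizes x[u][p] * x[v][p] for each unordered VM pair
    pairs = list(itertools.combinations(vms, 2))
    y = pulp.LpVariable.dicts("y", [(u, v, p) for u, v in pairs for p in pms],
                              lowBound=0)
    # Objective: total co-location pain summed over all PMs
    # (pair_pain[u][v] is the symmetric pain of co-locating u and v)
    prob += pulp.lpSum(pair_pain[u][v] * y[(u, v, p)]
                       for u, v in pairs for p in pms)
    for v in vms:                                   # each VM is placed exactly once
        prob += pulp.lpSum(x[v][p] for p in pms) == 1
    for p in pms:                                   # explicit resource capacity (CPU/memory)
        prob += pulp.lpSum(demand[v] * x[v][p] for v in vms) <= capacity[p]
    for u, v in pairs:                              # y >= x_u + x_v - 1 makes y act as x_u AND x_v
        for p in pms:                               # (minimization drives y down to that bound)
            prob += y[(u, v, p)] >= x[u][p] + x[v][p] - 1
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return {v: next(p for p in pms if x[v][p].value() > 0.5) for v in vms}
```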
12 VM placement and migration algorithm (cont.)
The computational complexity of the above method is very high, especially for a relatively large number of VMs, so we propose a heuristic VM placement and migration algorithm (sketched below):
• VM placement: allocate each VM to the PM that leads to the minimum total performance degradation.
• VM migration: select the VM that generates the maximum pain with the other co-located VMs on the PM and migrate it out.
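A hedged sketch of such a greedy heuristic; `coschedule_pain` and `fits` are placeholder callables (a pairwise pain estimate like slide 10's and a CPU/memory feasibility check), and the paper's algorithm may include additional checks.

```python
# Greedy cache-contention-aware placement and migration, following slide 12.

def place_vm(new_vm, pms, placement, coschedule_pain, fits):
    """Place new_vm on the feasible PM that adds the least total pain."""
    def added_pain(pm):
        return sum(coschedule_pain(new_vm, other) for other in placement[pm])
    candidates = [pm for pm in pms if fits(new_vm, pm)]
    best = min(candidates, key=added_pain)          # PM with minimum extra degradation
    placement[best].append(new_vm)
    return best

def pick_migration_vm(pm, placement, coschedule_pain):
    """On a contended PM, pick the VM with the largest total pain against its
    co-located VMs to migrate out (it is then re-placed with place_vm)."""
    def total_pain(vm):
        return sum(coschedule_pain(vm, other)
                   for other in placement[pm] if other is not vm)
    return max(placement[pm], key=total_pain)
```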
13 Experimental results
Real testbed:
• High-performance computing (HPC) cluster
• Each VM runs a workload from the NAS Parallel Benchmark (NPB) suite
• 20 PMs, 120 VMs
Simulation:
• CloudSim (extended to model LLC contention)
• Uses traces to determine profiles
• 1000+ PMs, 4000 VMs
Our algorithm: CacheVM
Comparison algorithms: cache unaware (Random), classification based (Animal), miss rate based (MissRate)
14 Model validation
[Figure: CDF of the prediction error (%), where error = (cache misses predicted by the model - cache misses collected by the simulator) / cache misses collected by the simulator]
This result confirms that the proposed model achieves high accuracy in predicting cache behavior.
15 Comparison with the Optimal Algorithm
[Figure: total # of misses and total time (ns) vs. the number of VMs (20-40) on 20 PMs, for Random, Animal, MissRate, CacheVM and Optimal]
• Misses: Optimal < CacheVM < Animal < Random; MissRate increases faster
• Time: MissRate < CacheVM < Random < Animal << Optimal
16 Simulated performance with real trace
[Figure: total # of misses vs. the number of VMs (2000-4000) on 2000 PMs, and vs. different scales: Scale 1 (1000 VMs, 750 PMs), Scale 2 (2000 VMs, 1500 PMs), Scale 3 (4000 VMs, 3000 PMs)]
• Varying the number of VMs: CacheVM < Animal ≈ Random < MissRate
• Varying the scale: CacheVM < Animal < Random < MissRate
17 Performance on real testbed
[Figure: total execution time (s) and total throughput (Mop/s) vs. the number of VMs (20-120) allocated to 20 PMs]
• Execution time: CacheVM < MissRate < Animal < Random
• Throughput: Random < Animal < MissRate < CacheVM
18 Performance on real testbed (cont.)
[Figure: normalized execution time and normalized throughput vs. the number of VMs (20-120)]
• Normalized time: CacheVM < Animal < MissRate < Random
• Normalized throughput: Random < Animal < MissRate < CacheVM
19 Conclusion and future work
• Proposed a cache contention aware VM performance degradation prediction algorithm.
• Formulated a cache contention aware VM placement problem.
• Transformed this problem into an integer linear programming (ILP) model and solved it.
• Proposed a heuristic cache contention aware VM placement and migration algorithm.
• Conducted trace-driven simulations and real-testbed experiments to evaluate CacheVM.
Future work: develop a decentralized version of the proposed algorithm.
20 Thank you! Questions & Comments?
Haiying Shen, hs6ms@virginia.edu
Pervasive Communication Laboratory, University of Virginia