Looking for the perfect VM scheduler
Fabien Hermenier (@fhermeni), placing rectangles since 2006
fabien.hermenier@nutanix.com
https://fhermeni.github.io
2006 - 2010: PhD, Postdoc. "Gestion dynamique des tâches dans les grappes, une approche à base de machines virtuelles" (Dynamic task management in clusters, a virtual machine based approach)
2011: Postdoc. "How to design a better testbed: Lessons from a decade of network experiments"
2011 - 2016: Associate professor. VM scheduling, green computing
Enterprise cloud company: “Going beyond hyperconverged infrastructures”
VM scheduling, resource management, virtualization
Inside a private cloud
Clusters: from 2 to x physical servers
Isolated applications: virtual machines, containers
Storage layer: SAN based (converged infrastructure) or shared over the nodes (hyper-converged infrastructure)
[Diagram: the VM queue and the monitoring data feed a model; the scheduler uses the model to take decisions, which are applied by actuators]
VM scheduling
Find a server for every VM to run on
Such that: compatible hardware, enough pCPU, enough RAM, enough storage, enough of whatever else is required
While minimising or maximising something
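The definition above can be sketched as a feasibility test plus a greedy first-fit placement. This is only an illustration under simple assumptions (integer resource quantities, no objective to optimise); the names `fits` and `first_fit` and the dictionary layout are hypothetical and do not come from any scheduler discussed in this talk.

```python
from typing import Dict, Optional

Resources = Dict[str, int]  # e.g. {"cpu": 4, "ram": 8}


def fits(demand: Resources, free: Resources) -> bool:
    """True if every demanded resource is available in sufficient quantity."""
    return all(free.get(r, 0) >= q for r, q in demand.items())


def first_fit(vms: Dict[str, Resources],
              nodes: Dict[str, Resources]) -> Optional[Dict[str, str]]:
    """Map each VM to the first node with enough free resources.

    Returns None when some VM cannot be placed. First-fit is a heuristic,
    so None does not prove that no valid placement exists.
    """
    free = {n: dict(cap) for n, cap in nodes.items()}
    placement = {}
    for vm, demand in vms.items():
        host = next((n for n in free if fits(demand, free[n])), None)
        if host is None:
            return None
        for r, q in demand.items():
            free[host][r] = free[host].get(r, 0) - q
        placement[vm] = host
    return placement


# Hypothetical infrastructure: 2 identical nodes, 3 VMs.
nodes = {"N1": {"cpu": 4, "ram": 8}, "N2": {"cpu": 4, "ram": 8}}
vms = {
    "VM1": {"cpu": 2, "ram": 6},
    "VM2": {"cpu": 2, "ram": 4},
    "VM3": {"cpu": 1, "ram": 2},
}
placement = first_fit(vms, nodes)
print(placement)
```

VM2 does not fit next to VM1 on N1 (not enough RAM), so it goes to N2, while the smaller VM3 fills the gap left on N1. A real scheduler would add the "min or max something" part on top of this feasibility test.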
A good VM scheduler provides: bigger business value on the same infrastructure
A good VM scheduler provides: the same business value on a smaller infrastructure
KEEP CALM AND CONSOLIDATE AS HELL
1 node, VDI workload: 12+ vCPUs per pCPU, 100+ VMs per server
Static schedulers: consider the VM queue; deployed everywhere [1,2,3,4]; fragmentation issues
Dynamic schedulers: live-migrations [5] to address fragmentation; costly (storage, migration latency); thousands of articles [10-13]; over-hyped? [9]; but used in private clouds [6,7,8] (steady workloads?)
Placement constraints
Various concerns: performance, security, power efficiency, legal agreements, high-availability, fault-tolerance …
Dimension: spatial or temporal
Enforcement level: hard or soft
Manipulated concepts: state, placement, resource allocation, action schedule, counters, etc.
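Discrete constraints: spread(VM[1,2]), ban(VM1, N1), ban(VM2, N2)
They restrict only the resulting placement: a “simple” spatial problem
Continuous constraints [15]: spread(VM[1,2]), ban(VM1, N1), ban(VM2, N2)
They must also hold while the placement is being changed: a harder scheduling problem (think about the interleaving of actions)
[Figure: VM1 and VM2 hosted on nodes N1, N2, N3, in the discrete and the continuous case]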
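The difference between the two flavours of spread can be sketched as two checks over the same plan. This is a hedged illustration, not BtrPlace code: the tuple layout, the function names and the example plan are hypothetical, each VM is assumed to migrate at most once, and a migrating VM is assumed to use its destination from the start of the migration and its source until the end.

```python
from typing import Dict, List, Tuple

# A migration is (vm, source, destination, start, end); times are integers.
Migration = Tuple[str, str, str, int, int]
INF = float("inf")


def discrete_spread(final: Dict[str, str], vms: List[str]) -> bool:
    """Discrete check: the VMs are on distinct nodes once the plan is over."""
    hosts = [final[v] for v in vms]
    return len(hosts) == len(set(hosts))


def occupancy(initial: Dict[str, str], plan: List[Migration], vm: str):
    """Intervals (node, from, to) during which `vm` uses a node."""
    for (v, src, dst, start, end) in plan:
        if v == vm:
            # Source is used until the end, destination from the start.
            return [(src, 0, end), (dst, start, INF)]
    return [(initial[vm], 0, INF)]


def continuous_spread(initial: Dict[str, str], plan: List[Migration],
                      vms: List[str]) -> bool:
    """Continuous check: two VMs never use the same node at the same time."""
    for i, a in enumerate(vms):
        for b in vms[i + 1:]:
            for (na, sa, ea) in occupancy(initial, plan, a):
                for (nb, sb, eb) in occupancy(initial, plan, b):
                    if na == nb and sa < eb and sb < ea:
                        return False
    return True


# Hypothetical plan: VM2 leaves N2 for N3 and VM1 moves from N1 to N2.
initial = {"VM1": "N1", "VM2": "N2"}
final = {"VM1": "N2", "VM2": "N3"}
parallel = [("VM1", "N1", "N2", 0, 2), ("VM2", "N2", "N3", 0, 2)]
sequential = [("VM2", "N2", "N3", 0, 2), ("VM1", "N1", "N2", 2, 4)]

print(discrete_spread(final, ["VM1", "VM2"]))
print(continuous_spread(initial, parallel, ["VM1", "VM2"]))
print(continuous_spread(initial, sequential, ["VM1", "VM2"]))
```

Both plans reach the same final placement, so the discrete check accepts both. Only the sequential plan passes the continuous check, because in the parallel plan VM1 and VM2 overlap on N2 during the migrations.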
Hard constraints, e.g. spread(VM[1..50]): must be satisfied; all or nothing approach; not always meaningful
Soft constraints, e.g. mostlySpread(VM[1..50], 4, 6): satisfiable or not; internal or external penalty model [6]; harder to implement/scale; hard to standardise?
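A minimal sketch of the two enforcement levels, assuming a very simple penalty model: one penalty point per VM that shares a node with another VM of the group. The penalty function is hypothetical and is not the semantics of mostlySpread, whose parameters are not detailed here.

```python
from collections import Counter
from typing import Dict, List


def spread_hard(placement: Dict[str, str], vms: List[str]) -> bool:
    """Hard version: all or nothing, every VM must be on a distinct node."""
    hosts = [placement[v] for v in vms]
    return len(set(hosts)) == len(hosts)


def spread_penalty(placement: Dict[str, str], vms: List[str]) -> int:
    """Soft version: count the co-located VMs instead of rejecting."""
    per_node = Counter(placement[v] for v in vms)
    return sum(count - 1 for count in per_node.values())


# Hypothetical placement where VM1 and VM2 share N1.
placement = {"VM1": "N1", "VM2": "N1", "VM3": "N2"}
group = ["VM1", "VM2", "VM3"]
print(spread_hard(placement, group), spread_penalty(placement, group))
```

The hard version only reports a failure, while the soft version returns a quantity that a solver can minimise, at the price of having to agree on what one penalty point is worth.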
High-availability
x-FT: VMs must survive any crash of x nodes
[Figure: a 0-FT and a 1-FT placement]
Exact approach: solve n placement problems [17]
The VMware DRS way
Slot based: pick the x biggest nodes (assumed to crash), then check the remaining free slots
Simple, scalable
Wasteful with heterogeneous VMs
Cluster based
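The slot-based check can be sketched as follows. This is a simplified illustration of the idea, not VMware's implementation: the slot is assumed to be sized after the biggest demand on each dimension, demands are assumed strictly positive, and all names are hypothetical.

```python
from typing import Dict, List

Resources = Dict[str, int]


def slot_size(vms: List[Resources]) -> Resources:
    """One slot is big enough for the most demanding VM on every dimension."""
    return {r: max(vm[r] for vm in vms) for r in vms[0]}


def slots_on(node: Resources, slot: Resources) -> int:
    """Number of whole slots a node can hold."""
    return min(node[r] // slot[r] for r in slot)


def survives(nodes: List[Resources], vms: List[Resources], x: int) -> bool:
    """True if there is one slot per VM after losing the x biggest nodes."""
    slot = slot_size(vms)
    slots = sorted(slots_on(n, slot) for n in nodes)  # ascending order
    kept = slots[:max(0, len(slots) - x)]             # drop the x biggest
    return sum(kept) >= len(vms)


# Hypothetical cluster: 3 identical nodes.
nodes = [{"cpu": 8, "ram": 16}] * 3
small = {"cpu": 1, "ram": 2}
large = {"cpu": 4, "ram": 8}
homogeneous = [small] * 6
heterogeneous = [small] * 5 + [large]

print(survives(nodes, homogeneous, 1))
print(survives(nodes, heterogeneous, 1))
```

With a single large VM in the group, every slot takes the size of that VM, each node holds only 2 slots, and the check fails even though the two surviving nodes have enough raw CPU and RAM for the six VMs. This is the waste mentioned above.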
The constraint catalog evolves
[Timeline, 2009 to 2016]
2009?: Dynamic Power Management (DRS 3.1)
2010?: VM-VM affinity (DRS)
mar. 2011: Dedicated instances (EC2)
apr. 2011: VM-host affinity (DRS 4.1)
sep. 2012: MaxVMsPerServer (DRS 5.1)
2014: the constraint needed
The objective
Provider side: min(x) or max(x)
Atomic objectives: min(penalties), min(Total Cost of Ownership), min(unbalance), …
Composite objectives using weights: min(αx + βy)
How to estimate the coefficients? Useful to model something you don't understand?
min(α TCO + β VIOLATIONS)
€ as a common quantifier: max(REVENUES)
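The sensitivity to the coefficients can be shown with a tiny example. The two candidate solutions and their values are made up for the illustration; only the weighted sum min(α TCO + β VIOLATIONS) comes from the slide.

```python
from typing import Dict


def score(candidate: Dict[str, float], alpha: float, beta: float) -> float:
    """Weighted sum: alpha * TCO + beta * VIOLATIONS."""
    return alpha * candidate["tco"] + beta * candidate["violations"]


def best(candidates: Dict[str, Dict[str, float]],
         alpha: float, beta: float) -> str:
    """Name of the candidate with the lowest weighted score."""
    return min(candidates, key=lambda name: score(candidates[name], alpha, beta))


# Hypothetical solutions: A is cheaper, B violates fewer constraints.
candidates = {
    "A": {"tco": 100, "violations": 4},
    "B": {"tco": 120, "violations": 1},
}

print(best(candidates, alpha=1, beta=1))
print(best(candidates, alpha=1, beta=10))
```

With β = 1 the cheaper solution A wins (104 against 121); with β = 10 the winner flips to B (130 against 140). Nothing in the model says which β is the right one, which is why a common unit such as € is attractive.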
Optimize or satisfy?
Optimize, min(…) or max(…): easy to say; hardly provable; composable through weighting magic
Satisfy, threshold based: domain specific expertise; verifiable; composable
Acropolis Dynamic Scheduler [18]
Hot spot mitigation
Trigger: thresholds (85% CPU, storage-CPU)
Maintain: affinity constraints, resource demand (from machine learning)
Minimize: Σ mig. cost
Adapt the VM placement depending on pluggable expectations: network and memory-aware migration scheduler, VM-(VM|PM) affinities, resource matchmaking, node state manipulation, counter based restrictions, energy efficiency, discrete or continuous restrictions
Interaction through a DSL, an API or JSON messages
Input constraints: spread(VM[2..3]); preserve(VM1, 'cpu', 3); offline(@N4);
BtrPlace
The reconfiguration plan:
0'00 to 0'02: relocate(VM2,N2)
0'00 to 0'04: relocate(VM6,N2)
0'02 to 0'05: relocate(VM4,N1)
0'04 to 0'08: shutdown(N4)
0'05 to 0'06: allocate(VM1,'cpu',3)
An open-source Java library for constraint programming
Deterministic composition, high-level constraints
The right model for the right problem
BtrPlace core CSP
Models a reconfiguration plan
1 model of transition per element
Action durations as constants *

$$
\begin{aligned}
\mathrm{boot}(v \in V) \triangleq\ & D(v) \in \mathbb{N} \\
& st(v) = [0, H - D(v)] \\
& ed(v) = st(v) + D(v) \\
& d(v) = ed(v) - st(v) \\
& d(v) = D(v) \\
& ed(v) < H \\
& d(v) < H \\
& h(v) \in \{0, \ldots, |N| - 1\}
\end{aligned}
$$

relocatable(v ∈ V) ≜ …
shutdown(v ∈ V) ≜ …
suspend(v ∈ V) ≜ …
resume(v ∈ V) ≜ …
kill(v ∈ V) ≜ …
bootable(n ∈ N) ≜ …
haltable(n ∈ N) ≜ …
Views bring additional concerns: new variables and relations
ShareableResource(r) ::=
Network() ::= …
Power() ::= …
High-Availability() ::= …
Constraints state new relations …
Vector packing problem
Items with a finite volume to place inside finite bins
Generalisation of the bin packing problem
The basis to model the infrastructure: 1 dimension = 1 resource
NP-hard problem
[Figure: VM1 and VM3 packed on N1, VM2 and VM4 packed on N2; axes: cpu, mem]
How to support migrations: temporarily, resources are used on both the source and the destination nodes
Migrations are costly
[Figure: migration duration [min.] (0 to 3) against allocated bandwidth [Mbit/s] (1000 down to 200), for the series 1000*200K, 1000*100K and 1000*10K]
Dynamic schedulers using vector packing [10,12]
Objective: min(#onlineNodes) = 3
[Figure, animated: VM1 to VM6 hosted on nodes N1 to N4 (axes: cpu, mem) are repacked onto 3 nodes in two different ways]
sol #1: 1m, 1m, 2m
sol #2: 1m, 2m, 1m: lower MTTR (faster)
Dynamic scheduling using vector packing [10, 12]
offline(N2) + no CPU sharing
[Figure: VM1 to VM7 hosted on nodes N1 to N5; axes: cpu, mem]
Dependency management
[Figure, animated: VM1 to VM7 on nodes N1 to N5, before and after each step]
1) migrate VM2, migrate VM4, migrate VM5
2) shutdown(N2), migrate VM7
Coarse grain staging delays actions
[Timeline: stage 1: mig(VM2), mig(VM4), mig(VM5); stage 2: off(N2), mig(VM7)]
Resource-Constrained Project Scheduling Problem [14]
[Figure: Gantt chart over time 0 to 8 of the resource usage on N1 to N5 while VM2, VM4, VM5 and VM7 migrate and N2 goes off]
Resource-Constrained Project Scheduling Problem
1 resource per (node × dimension), with a bounded capacity
Tasks model the VM lifecycle: the height models a consumption, the width models a duration
At any moment, the cumulative task consumption on a resource cannot exceed its capacity
Comfortable to express continuous optimisation
NP-hard problem
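The capacity rule can be sketched as a check over a set of tasks. This is a plain verification routine, not a solver and not the BtrPlace model: tasks are hypothetical tuples (resource, start, end, height) over half-open intervals [start, end), and the numbers are made up.

```python
from typing import Dict, List, Tuple

Resource = Tuple[str, str]               # (node, dimension)
Task = Tuple[Resource, int, int, int]    # (resource, start, end, height)


def respects_capacity(tasks: List[Task], capacity: Dict[Resource, int]) -> bool:
    """True if the cumulative consumption never exceeds the capacity.

    The load on a resource can only increase when a task starts, so it is
    enough to check the load at every task start.
    """
    for (res, start, _end, _height) in tasks:
        load = sum(h for (r, s, e, h) in tasks
                   if r == res and s <= start < e)
        if load > capacity[res]:
            return False
    return True


mem_n1 = ("N1", "mem")
capacity = {mem_n1: 4}

# A resident VM stays on N1, one VM leaves at t=3, another arrives at t=3.
on_time = [(mem_n1, 0, 8, 2), (mem_n1, 0, 3, 2), (mem_n1, 3, 8, 2)]
# Same plan, but the incoming VM starts using N1 at t=2, before the
# leaving VM has released its memory.
too_early = [(mem_n1, 0, 8, 2), (mem_n1, 0, 3, 2), (mem_n1, 2, 8, 2)]

print(respects_capacity(on_time, capacity))
print(respects_capacity(too_early, capacity))
```

The second plan is rejected because, between t=2 and t=3, the three tasks stack up to 6 units on a resource of capacity 4. This is how the model accounts for a migration using resources on both the source and the destination.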
From a theoretical to a practical solution
Durations may be longer: convert to an event based schedule
Timed schedule:
0:3 - migrate VM4
0:3 - migrate VM5
0:4 - migrate VM2
3:8 - migrate VM7
4:8 - shutdown(N2)
Event based schedule (precondition: action):
- : migrate VM4
- : migrate VM5
- : migrate VM2
!migrate(VM2) & !migrate(VM4): shutdown(N2)
!migrate(VM5): migrate VM7
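The conversion above can be sketched with one simple dependency rule: an action waits for every action that frees the node it needs and that ends before it starts in the timed schedule. This rule, the `Action` fields and the node assignments are assumptions made for the illustration; the talk does not detail how the dependencies are computed.

```python
from typing import Dict, List, NamedTuple, Optional


class Action(NamedTuple):
    name: str
    start: int
    end: int
    frees: Optional[str]  # node whose resources are released at the end
    needs: Optional[str]  # node whose resources must be available first


def to_event_based(plan: List[Action]) -> Dict[str, List[str]]:
    """For each action, the actions that must terminate before it starts."""
    deps = {}
    for b in plan:
        deps[b.name] = sorted(
            a.name for a in plan
            if a is not b
            and a.frees is not None
            and a.frees == b.needs
            and a.end <= b.start
        )
    return deps


# Timed schedule from the slide; source and destination nodes are hypothetical.
plan = [
    Action("migrate VM4", 0, 3, frees="N2", needs="N5"),
    Action("migrate VM5", 0, 3, frees="N3", needs="N1"),
    Action("migrate VM2", 0, 4, frees="N2", needs="N4"),
    Action("migrate VM7", 3, 8, frees="N5", needs="N3"),
    Action("shutdown(N2)", 4, 8, frees=None, needs="N2"),
]

for name, waits_for in to_event_based(plan).items():
    print(name, "after", waits_for or "-")
```

Under these assumptions the three first migrations have no precondition, migrate VM7 waits for migrate VM5, and shutdown(N2) waits for migrate VM2 and migrate VM4. If a migration takes longer than estimated, the dependent actions are simply delayed instead of starting at a stale timestamp.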