POWER BUDGETING FOR VIRTUALIZED DATA CENTERS
Harold Lim (Duke University), Aman Kansal (Microsoft Research), Jie Liu (Microsoft Research)
Power Concerns in Data Centers
- Consumption costs
- Provisioning costs: cost of supply infrastructure, generators, backup UPSs
  - Can be higher than consumption cost in large data centers due to discounted/bulk pricing on consumption
  - Addressed through peak power management
[Chart: Provisioning Cost; data from James Hamilton]
Over-subscription Reduces Provisioning Cost
- Rated peak (never reached) > possible peak (sum of server peaks) > allocated capacity; the peak of the actual power consumption is usually lower than the allocated capacity, but can exceed it
- Lower allocated capacity => lower provisioning cost (slight perf hit)
- Possible because power can be capped if it exceeds the allocation [Lefurgy et al. 2003, Femal et al. 2005, Urgaonkar et al. 2009, Wang et al. 2010]
[Chart: Data Center Power over Time, showing savings from power capping]
Enter Virtualization
- Existing capping methods fall short
- Servers shared by VMs from different applications: cannot cap a server or blade cluster in hardware
[Diagram: rack of servers (Server-11, Server-12, ..., Server-1j), each hosting VMs from multiple applications]
Challenge 1: Disconnect Between Physical Layout and Logical Organization of Resources
- Existing hardware capping: unaware of applications
- Need: application-aware capping
[Diagram: VM1 and VM2 instances of two applications spread across two servers]
Challenge 2: Multi-dimensional Power Control
- Two knobs: DVFS and CPU time cap
- A performance gap exists at the same power, depending on which knob settings are used
[Chart: Performance (TPS) vs. Power (W); each DVFS level (70 to 100) plotted with multiple marks corresponding to different CPU time caps]
Challenge 3: Dynamic Power Proportions
- Applications' input workload volume changes over time
- Proportion among applications changes
- Proportion of power among app tiers changes
Example (CPU-bound front-end, disk-bound back-end):
  Low load:  Front-End 50 W (CPU idle),  Back-End 80 W (disk spinning, low IO)
  High load: Front-End 100 W (CPU busy), Back-End 90 W (disk spinning, high IO)
Virtualized Power Shifting (VPS): A Power Budgeting System for Virtualized Infrastructures
- Addresses the above three challenges
- Application-aware: e.g., interactive apps not affected during capping
- Shifts power dynamically as workloads change
- Distributes power among applications and application tiers for best performance
- Exploits performance information (if available) and multiple power knobs
- Selects the optimal operating point within the power budget
Application-aware Hierarchy
[Diagram: the Data Center Controller receives the total budget P_T(t) and allocates P_app-1(t) ... P_app-n(t) to Application Level Controllers 1..n; each allocates P_tier-1(t) ... P_tier-n(t) to its Tier Level Controllers, which control the individual VMs]
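A minimal structural sketch of this hierarchy, assuming each level simply forwards budgets downward every control interval; the class names and the even split are illustrative placeholders, not the paper's policies (the actual split policies appear on the following slides):

    # Illustrative wiring of the three-level control hierarchy.
    class TierController:
        def set_budget(self, watts):
            self.budget = watts  # then drive DVFS / VM CPU caps (see the design options below)

    class AppController:
        def __init__(self, tiers):
            self.tiers = tiers
        def set_budget(self, watts):
            # Placeholder even split among tiers; VPS uses feedback instead.
            per_tier = watts / len(self.tiers)
            for tier in self.tiers:
                tier.set_budget(per_tier)

    class DataCenterController:
        def __init__(self, apps):
            self.apps = apps
        def set_budget(self, total_watts):
            # Placeholder even split among applications; VPS uses WFS and feedback instead.
            per_app = total_watts / len(self.apps)
            for app in self.apps:
                app.set_budget(per_app)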
Top Level Controller: Issues
- Determines the amount of power for each application
- Static allocations do not work: workloads and power usage are dynamic, and unused power is wasted
- Must compensate for hidden power increases in shared infrastructure (e.g., HVAC/cooling load) that are hard to assign to each application
[Diagram: Data Center Controller splits P_T(t) among App 1 ... App n and shared infrastructure (HVAC, etc.)]
Top Level Controller: Solution
- Uses feedback (PID) to adapt to dynamic workload and power
- Measures total data center power P_M(t) and estimates uncontrollable power: P_U(t) = P_M(t) - Sum_i P_Ai(t)
- Outputs the application power to be allocated: P_app(t+1) = P_M(t) + D(t+1) - P_U(t), where D(t+1) is the feedback correction toward the total budget P_T(t)
[Diagram: Top Level Controller allocates P_A1(t) ... P_Am(t) to App 1 ... App m and measures Data Center Power P_M(t)]
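A minimal sketch of this top-level feedback step, assuming a simple proportional-integral form for the correction D(t+1); the gains and interface names are illustrative assumptions, not the paper's tuning:

    # Illustrative top-level feedback step.
    class TopLevelController:
        def __init__(self, total_budget, kp=0.5, ki=0.1):
            self.total_budget = total_budget   # P_T
            self.kp, self.ki = kp, ki
            self.integral = 0.0

        def step(self, measured_total, app_allocations):
            # Uncontrollable power: measured total minus what was allocated to apps.
            p_u = measured_total - sum(app_allocations)
            # Feedback correction D(t+1) toward the total budget.
            error = self.total_budget - measured_total
            self.integral += error
            d = self.kp * error + self.ki * self.integral
            # Power available to applications in the next interval.
            p_app = measured_total + d - p_u
            return max(p_app, 0.0)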
Top Level Controller: Power Split
- How is P_app distributed among apps? Using Weighted Fair Sharing (WFS)
- Each application has an initial budget, e.g., the 99th percentile of its max power
- In each priority class, allocate the power needed to each app, up to its initial budget
- If there is not enough power, allocate in proportion via WFS (see the sketch below)
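A minimal sketch of weighted fair sharing across priority classes, assuming each app's initial budget also serves as its weight; the function names and the per-class structure are illustrative assumptions:

    # Illustrative weighted fair sharing within priority classes.
    def wfs_split(available, demands, weights):
        """Give each app what it needs if the class fits within the available power;
        otherwise split the available power in proportion to the weights
        (a fuller implementation would redistribute any share exceeding a demand)."""
        if sum(demands) <= available:
            return list(demands)
        total_weight = sum(weights)
        return [available * w / total_weight for w in weights]

    def allocate(available, priority_classes):
        """priority_classes: list of (demands, weights) pairs, highest priority first."""
        allocations = []
        for demands, weights in priority_classes:
            share = wfs_split(available, demands, weights)
            allocations.append(share)
            available -= sum(share)
        return allocations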
Application Level Controller: Issues
- Determines how much budget to allocate to each tier
- Prior work: learn a model of power ratios among tiers a priori
- Problems: the model changes with workload, depends on the control knobs used, and application behavior may change over time
[Diagram: application composed of Tier 1 ... Tier N]
Application Level Controller: Solution
- VPS dynamically tunes power allocations without relying on learned models
- Observations: application tiers are arranged in a pipeline, so throttling one tier affects the other tiers
[Diagram: application composed of Tier 1 ... Tier N in a pipeline]
Application Level Controller (contd.)
- Uses PID control
- Measures total application power usage but controls only one tier
- Automatically achieves the right proportion
[Diagram: the controller compares application power P_a(t) with the budget; the error e(t) drives the control output u(t) applied to the controlled tier (Tier 1), and Tiers 2 ... n follow through the pipeline effect]
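A minimal sketch of this idea, assuming a discrete PID law on the whole-application power error that adjusts only the controlled tier's budget; gains and interfaces are illustrative assumptions:

    # Illustrative application-level PID: control one tier from the whole-app power error.
    class AppLevelController:
        def __init__(self, app_budget, kp=0.4, ki=0.2, kd=0.0):
            self.app_budget = app_budget
            self.kp, self.ki, self.kd = kp, ki, kd
            self.integral = 0.0
            self.prev_error = 0.0

        def step(self, measured_app_power, controlled_tier_budget):
            # Error between the application's budget and its total measured power.
            error = self.app_budget - measured_app_power
            self.integral += error
            derivative = error - self.prev_error
            self.prev_error = error
            u = self.kp * error + self.ki * self.integral + self.kd * derivative
            # Only the controlled tier's budget is adjusted; the pipeline effect
            # propagates the throttling to the other tiers.
            return max(controlled_tier_budget + u, 0.0)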
Tier Level Controller
- Tracks the tier power budget by controlling VM power usage
- Many power control knobs available; use DVFS and VM CPU time allocation as the knobs
- Multiple trade-offs exist w.r.t. accuracy, speed, models needed, and application visibility needed
- Study 3 design options
[Diagram: Tier Level Controller distributing the tier budget P_T(t) among VM 1 ... VM n]
Option 1: Open Loop Control
- Uses a power model to convert the power budget to a control knob setting, e.g., P_VM = c_freq * u_cpu
- Easy and instantaneous
- Does not require visibility into application performance
- But does not compensate for errors
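A minimal sketch of the open-loop conversion, assuming the linear model P_VM = c_freq * u_cpu with a per-DVFS-level coefficient; the coefficient values are illustrative assumptions, not measured figures:

    # Illustrative open-loop budget-to-knob conversion using a linear power model.
    # C_FREQ: assumed watts consumed per VM at a 100% CPU time cap for each DVFS level.
    C_FREQ = {100: 5.0, 94: 4.5, 88: 4.0, 82: 3.6, 76: 3.2, 70: 2.9}

    def open_loop_cap(vm_budget_watts, dvfs_level):
        """Convert a VM power budget into a CPU time cap in [0.0, 1.0]."""
        c = C_FREQ[dvfs_level]
        u_cpu = vm_budget_watts / c          # invert P_VM = c_freq * u_cpu
        return min(max(u_cpu, 0.0), 1.0)     # clamp to a valid CPU time cap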
Option 2: PID Control
- Real-time measurements to tune power settings: compensates for error
- Slower (needs time to converge)
- Single control knob (no notion of performance optimality)
[Diagram: the controller compares the tier budget with the sum of VM power consumptions; the error e(t) drives the VM CPU time setting u(t) for VM1 ... VMk]
Option 3: Model Predictive Control (MPC)
- Optimizes performance using multiple power control knobs (DVFS and VM CPU time)
- Uses a cost function that consists of error and performance terms
- Solves for the optimal outputs for the next N steps but applies only the setting for the next time step
- Requires application performance measurement
- Requires system models that relate the control knobs to system state
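A minimal sketch of the receding-horizon idea, assuming a one-step lookahead, discrete knob settings, and caller-supplied models for power and performance; the cost function, weights, and names here are illustrative assumptions rather than the paper's formulation:

    # Illustrative one-step MPC over discrete knob settings (horizon N = 1 for brevity).
    import itertools

    def mpc_step(tier_budget, dvfs_levels, cpu_caps, power_model, perf_model, alpha=1.0):
        """Pick the (dvfs, cpu_cap) pair minimizing
           cost = (predicted_power - budget)^2 - alpha * predicted_performance.
        power_model and perf_model are assumed system models mapping knob settings
        to predicted power (W) and performance (e.g., TPS)."""
        best_cost, best_setting = float("inf"), None
        for dvfs, cap in itertools.product(dvfs_levels, cpu_caps):
            p = power_model(dvfs, cap)
            perf = perf_model(dvfs, cap)
            cost = (p - tier_budget) ** 2 - alpha * perf
            if p <= tier_budget and cost < best_cost:   # respect the budget
                best_cost, best_setting = cost, (dvfs, cap)
        if best_setting is None:                        # nothing fits: fall back to lowest settings
            best_setting = (min(dvfs_levels), min(cpu_caps))
        return best_setting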
Summary of Design Options
  Option     | Pros                   | Cons
  Open Loop  | Fast                   | Needs power models; higher error
  PID        | Low error              | No performance optimization; slower
  MPC        | Optimizes performance  | Needs system models; needs performance measurement
Experiments
- VPS controllers run as network services in the root VM on each server
- Controllers tuned using known methods
- Testbed: 17 quad-core HP ProLiant servers (11 host the apps, 6 generate the workload)
- VMs mixed across the physical servers
- VM power measured using Joulemeter, hardware power using WattsUp PRO meters
[Diagram: Controller Service in the Root VM alongside guest VMs on each physical server]
Experiment Workloads
- Interactive: StockTrader, an open-source multi-tiered cluster web application benchmark; 3 instances, 2 of them high priority
- Background: SPEC CPU 2006 benchmark, low priority
- Microsoft data center traces used as input to simulate realistic workloads that vary over time
[Chart: workload intensity (%) over time (s)]
Metric: Total Budgeting Error
- Error = excess power consumed above the assigned budget, normalized by the power budget
[Chart: overshoot error (%) for Open Loop, PID, and MPC under the physical hierarchy and under VPS]
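A minimal sketch of how such an overshoot metric could be computed from a power trace; averaging over time is an assumption here, since the slide only defines the normalized excess:

    # Illustrative computation of normalized budgeting (overshoot) error from a power trace.
    def budgeting_error(power_trace, budget):
        """Average excess power above the budget, normalized by the budget."""
        excess = [max(p - budget, 0.0) for p in power_trace]
        return sum(excess) / (len(power_trace) * budget)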
Metric: Errors within App Hierarchy
- Application power enforcement errors for each app (Lo-1, Lo-2, Hi-1, Hi-2) under Open Loop, PID, and MPC
[Chart: per-application enforcement error (%) for Open Loop, PID, and MPC]
Metric: Power Differentiation
- VPS is designed to respect application priorities and QoS constraints in a shared infrastructure
- PID and MPC perform appropriate application differentiation
[Chart: power reduction (%) per application (Lo-1, Lo-2, Hi-1, Hi-2) under Open Loop, PID, MPC, and the physical hierarchy]
Metric: Application Performance
- Performance of the (low priority) app that was capped
[Charts: response time (s) and throughput (TPS) over time under PID and MPC]
Conclusions
- VPS: a power budgeting system for virtualized data centers
- Hierarchy of control follows the application layout
- Respects application priorities and application/VM boundaries
- Optimizes application performance, given a power budget
- Dynamically adjusts power proportions
- Exploits multiple knobs