

1. POWER BUDGETING FOR VIRTUALIZED DATA CENTERS
Harold Lim (Duke University), Aman Kansal (Microsoft Research), Jie Liu (Microsoft Research)

2. Power Concerns in Data Centers
- Consumption costs
- Provisioning costs
  - Cost of supply infrastructure, generators, backup UPSs
  - Can be higher than consumption cost in large data centers, due to discounted/bulk pricing on consumption
  - Addressed through peak power management
(Figure: provisioning cost; data from James Hamilton)

3. Savings from Power Capping
- Over-subscription reduces provisioning cost
- Lower allocated capacity => lower provisioning cost (slight performance hit)
- Possible because power can be capped if it exceeds the allocated capacity [Lefurgy et al. 2003, Femal et al. 2005, Urgaonkar et al. 2009, Wang et al. 2010]
(Figure: data center power over time, showing the rated peak (never reached), the possible peak (sum of server peaks), the allocated capacity, and the actual power consumption; the peak of the sum is usually lower than the allocated capacity, but can exceed it)

4. Enter Virtualization
- Existing capping methods fall short
- Servers are shared by VMs from different applications: cannot cap a server or blade cluster in hardware
(Figure: a rack of servers Server-11, Server-12, ..., Server-1j, each hosting VMs from multiple applications)

5. Challenge 1: Disconnect Between Physical Layout and Logical Organization of Resources
- Existing hardware capping: unaware of applications
- Need: application-aware capping
(Figure: two servers, each hosting one VM of application 1 (VM1) and one VM of application 2 (VM2), so neither application maps onto a single physical unit that hardware could cap)


6. Challenge 2: Multi-dimensional Power Control
- Two knobs: DVFS and CPU time cap
- A performance gap exists at the same power, depending on how the knobs are combined
(Figure: performance (TPS) vs. power (W); different marks are different DVFS levels, from 70 to 100, and multiple marks at one DVFS level correspond to different CPU time caps)

7. Challenge 3: Dynamic Power Proportions
- Applications' input workload volume changes over time
- Proportion among applications changes
- Proportion of power among app tiers changes
(Figure: at low load, the front-end CPU draws 50 W (CPU idle) and the back-end disk 80 W (disk spinning, low IO); at high load, the front-end draws 100 W (CPU busy) and the back-end 90 W (disk spinning, high IO))

8. Virtualized Power Shifting (VPS): A Power Budgeting System for Virtualized Infrastructures
Addresses the above three challenges:
- Application-aware
  - E.g., interactive apps are not affected during capping
- Shifts power dynamically as workloads change
  - Distributes power among applications and application tiers for best performance
- Exploits performance information (if available) and multiple power knobs
  - Selects the optimal operating point within the power budget

9. Application-aware Hierarchy
(Figure: control hierarchy. The data center controller splits the total budget P_T(t) into per-application budgets P_app-1(t) ... P_app-n(t) for application-level controllers 1 ... n; each application-level controller splits its budget into per-tier budgets P_tier-1(t) ... P_tier-n(t) for its tier-level controllers, which control the individual VMs)
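
To make the fan-out concrete, here is a minimal structural sketch in Python. The class names, the equal splits, and the set_power_cap stub are illustrative stand-ins, not the paper's implementation; the real policies (PID plus weighted fair sharing at the top, pipeline-aware control at the application level) appear on the next slides.

    # Illustrative skeleton of the three-level hierarchy. Equal splits
    # stand in for the real controllers described on later slides.

    class VM:
        def set_power_cap(self, watts):
            print(f"VM capped at {watts:.1f} W")  # stand-in for knob setting

    class TierController:
        def __init__(self, vms): self.vms = vms
        def apply_budget(self, p_tier):
            for vm in self.vms:
                vm.set_power_cap(p_tier / len(self.vms))

    class AppController:
        def __init__(self, tiers): self.tiers = tiers
        def apply_budget(self, p_app):
            for tier in self.tiers:
                tier.apply_budget(p_app / len(self.tiers))

    class DataCenterController:
        def __init__(self, apps): self.apps = apps
        def apply_budget(self, p_total):
            for app in self.apps:
                app.apply_budget(p_total / len(self.apps))

    # Example: 2 apps x 2 tiers x 2 VMs sharing a 1 kW budget.
    dc = DataCenterController([
        AppController([TierController([VM(), VM()]) for _ in range(2)])
        for _ in range(2)])
    dc.apply_budget(1000.0)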

10. Top Level Controller: Issues
- Determines the amount of power for each application
- Static allocation does not work
  - Dynamic workloads and power usage
  - Unused power is wasted
- Must compensate for hidden power increases in shared infrastructure (e.g., cooling load) that are hard to assign to any one application
(Figure: the data center controller receives the total budget P_T(t) and feeds App 1 ... App n, alongside uncontrolled consumers such as HVAC)

11. Top Level Controller: Solution
- Uses feedback (PID) to adapt to dynamic workload and power
- Estimates uncontrollable power:
  - P_U(t) = P_M(t) - Sum_i P_Ai(t)
- Outputs the application power to be allocated:
  - P_app(t+1) = P_M(t) + D(t+1) - P_U(t)
(Figure: the top-level controller reads the measured data center power P_M(t) and the per-application powers P_A1(t) ... P_Am(t), and produces the allocation against the total budget P_T(t))
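
A minimal sketch of one control step, assuming D(t+1) is a PID correction computed from the budget-tracking error; the gains and the plain-PID treatment of D are my assumptions, not the paper's tuning.

    # One step of the top-level loop: estimate uncontrollable power,
    # compute the PID correction D, and emit the next allocation
    # P_app(t+1) = P_M(t) + D(t+1) - P_U(t).

    class TopLevelController:
        def __init__(self, budget, kp=0.5, ki=0.1, kd=0.0):
            self.budget = budget          # P_T: total data center budget (W)
            self.kp, self.ki, self.kd = kp, ki, kd
            self.integral = 0.0
            self.prev_error = 0.0

        def step(self, p_measured, p_apps):
            """p_measured: P_M(t); p_apps: per-app powers P_Ai(t)."""
            # Uncontrollable power: measured total minus what the apps draw.
            p_uncontrollable = p_measured - sum(p_apps)
            # PID correction D(t+1) from the budget-tracking error.
            error = self.budget - p_measured
            self.integral += error
            derivative = error - self.prev_error
            self.prev_error = error
            d = self.kp * error + self.ki * self.integral + self.kd * derivative
            # Power available to applications in the next interval.
            return p_measured + d - p_uncontrollable

    # Example: a 500 W budget, 480 W measured, apps drawing 400 W in total.
    ctrl = TopLevelController(budget=500.0)
    print(ctrl.step(480.0, [250.0, 150.0]))  # 412.0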

12. Top Level Controller: Power Split
How is P_app distributed among apps?
- Using Weighted Fair Sharing (WFS)
- Each application has an initial budget
  - E.g., the 99th percentile of its maximum power
- In each priority class, allocate each app the power it needs, up to its initial budget
- If there is not enough power, allocate in proportion via WFS
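
A sketch of one plausible reading of this split, not the paper's exact algorithm; the tuple layout, the weights, and skipping the budget cap inside the proportional branch are my simplifications.

    # Serve priority classes in order; fully serve a class when the
    # remaining power covers it, otherwise split the remainder by weight.

    def split_power(p_app, classes):
        """p_app: total power to distribute.
        classes: list (highest priority first) of lists of
                 (app_name, demand, initial_budget, weight) tuples."""
        alloc = {}
        remaining = p_app
        for apps in classes:
            wanted = {a: min(demand, budget) for a, demand, budget, _ in apps}
            if sum(wanted.values()) <= remaining:
                alloc.update(wanted)               # fully serve this class
                remaining -= sum(wanted.values())
            else:
                total_w = sum(w for _, _, _, w in apps)
                for a, _, _, w in apps:            # weighted fair share
                    alloc[a] = remaining * w / total_w
                remaining = 0
        return alloc

    # Example: the high-priority class fits; the low class shares the rest.
    classes = [
        [("Hi-1", 120.0, 150.0, 1.0), ("Hi-2", 100.0, 150.0, 1.0)],
        [("Lo-1", 200.0, 250.0, 2.0), ("Lo-2", 200.0, 250.0, 1.0)],
    ]
    print(split_power(400.0, classes))
    # {'Hi-1': 120.0, 'Hi-2': 100.0, 'Lo-1': 120.0, 'Lo-2': 60.0}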

13. Application Level Controller: Issues
- Determines how much budget to allocate to each tier
- Prior work: learn a model of the power ratios among tiers a priori. Problems:
  - The model changes with workload
  - It depends on the control knobs used
  - Application behavior may change over time
(Figure: an application composed of tiers Tier 1 ... Tier N)

14. Application Level Controller: Solution
- VPS: dynamically tunes power allocations without relying on learned models
- Observations:
  - Application tiers are arranged in a pipeline
  - Throttling one tier affects the other tiers
(Figure: an application composed of tiers Tier 1 ... Tier N in a pipeline)

15. Application Level Controller (contd.)
- Uses PID control
- Measures total application power usage, but controls only one tier
- Automatically achieves the right proportion
(Figure: the controller computes the error e(t) between the budget and the measured application power P_a(t), and applies the output u(t) to the controlled tier (Tier 1); the remaining tiers follow through the pipeline effect)
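
A minimal sketch of that loop as a velocity-form PI controller; the gains, the state layout, and the use of a power cap as the tier knob are illustrative assumptions.

    # Measure the whole application, actuate only the controlled tier:
    # the pipeline effect propagates the throttle to the other tiers.

    def app_level_step(state, budget, p_app_measured, kp=0.2, ki=0.05):
        """Velocity-form PI: nudge the controlled tier's power cap by an
        increment derived from the application-level tracking error."""
        error = budget - p_app_measured                 # e(t)
        delta = kp * (error - state["prev_error"]) + ki * error
        state["prev_error"] = error
        state["cap"] = max(state["cap"] + delta, 0.0)   # caps cannot go negative
        return state["cap"]                             # new setting for Tier 1

    # Example: the app draws 210 W against a 180 W budget, so the cap drops.
    state = {"prev_error": 0.0, "cap": 200.0}
    print(app_level_step(state, budget=180.0, p_app_measured=210.0))  # 192.5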

16. Tier Level Controller
- Tracks the tier power budget by controlling VM power usage
- Many power control knobs are available
  - DVFS and VM CPU time allocation are used as the knobs
- Multiple trade-offs exist w.r.t. accuracy, speed, models needed, and application visibility needed
- Three design options are studied
(Figure: the tier-level controller receives the tier budget P_T(t) and controls VM 1 ... VM n)

17. Option 1: Open Loop Control
- Uses a power model to convert the power budget into a control knob setting
  - E.g., P_VM = c * freq * u_cpu
- Easy and instantaneous
- Does not require visibility into application performance
- But does not compensate for errors
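
Inverting that model gives the knob setting directly. A small sketch; the coefficient value and the clamping policy are illustrative, and c must come from offline calibration.

    # Open-loop mapping from a VM power budget to a CPU-time cap, using
    # the linear model P_VM = c * freq * u_cpu from the slide. There is
    # no feedback, so model error translates directly into budget error.

    def open_loop_cpu_cap(p_budget, freq, c):
        """p_budget: VM power budget (W); freq: current DVFS frequency;
        c: calibrated coefficient (W per frequency unit at 100% CPU)."""
        u_cpu = p_budget / (c * freq)
        return min(max(u_cpu, 0.0), 1.0)   # clamp to a valid CPU share

    # Example: a 30 W budget at 2.0 GHz with c = 20 W/GHz gives cap = 0.75.
    print(open_loop_cpu_cap(30.0, 2.0, 20.0))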

18. Option 2: PID Control
- Real-time measurements to tune power settings: compensates for error
- Slower (needs time to converge)
- Single control knob (no notion of performance optimality)
(Figure: the controller compares the sum of the VM power consumptions against the tier budget, and applies the output u(t), the VM CPU time, to VM1 ... VMk)

19. Option 3: Model Predictive Control (MPC)
- Optimizes performance using multiple power control knobs (DVFS and VM CPU time)
- Uses a cost function that consists of error and performance terms
- Solves for the optimal outputs for the next N steps, but applies only the setting for the next time step
- Requires application performance measurement
- Requires system models that relate the control knobs to the system state
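
A heavily simplified sketch of the receding-horizon idea, not the paper's formulation: the knob grids, the power and performance models, and the cost weights below are all invented for illustration.

    import itertools

    # Toy MPC step: brute-force search over knob settings, minimizing
    # (budget error)^2 - lambda * performance over a short horizon,
    # then applying only the first step (receding horizon).

    DVFS_LEVELS = [0.70, 0.76, 0.82, 0.88, 0.94, 1.00]
    CPU_CAPS = [0.25, 0.50, 0.75, 1.00]

    def power_model(dvfs, cap):
        return 100.0 * dvfs**2 * cap          # assumed system model, W

    def perf_model(dvfs, cap):
        return 600.0 * dvfs * cap             # assumed system model, TPS

    def mpc_step(budget, horizon=3, lam=0.01):
        best_cost, best_knobs = float("inf"), None
        # Hold one setting over the horizon (a common simplification;
        # the real optimizer may vary the settings per step).
        for dvfs, cap in itertools.product(DVFS_LEVELS, CPU_CAPS):
            err = budget - power_model(dvfs, cap)
            cost = horizon * (err**2 - lam * perf_model(dvfs, cap))
            if cost < best_cost:
                best_cost, best_knobs = cost, (dvfs, cap)
        return best_knobs                      # applied for the next step only

    print(mpc_step(budget=60.0))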

20. Summary of Design Options
              Pros                    Cons
  Open Loop   Fast                    Needs power models; higher error
  PID         Low error               No performance optimization; slower
  MPC         Optimizes performance   Needs system models; needs performance measurement

21. Experiments
- VPS controllers run as network services in the root VM on each server
- Controllers tuned using known methods
- Testbed: 17 quad-core HP ProLiant servers (11 host the apps, 6 generate the workload)
- VMs mixed across the physical servers
- VM power measured using Joulemeter; hardware power measured using WattsUp PRO meters
(Figure: each physical server runs a controller service in its root VM alongside the hosted VMs)

22. Experiment Workloads
- Applications
  - Interactive: StockTrader, an open-source multi-tiered cluster web application benchmark
    - 3 instances, 2 of them high priority
  - Background: SPEC CPU 2006 benchmark
    - Low priority
- Microsoft data center traces are used as input to simulate realistic workloads that vary over time
(Figure: workload intensity (%) over time, varying between 0 and 100)

23. Metric: Total Budgeting Error
- Error = excess power consumed above the assigned budget, normalized by the power budget
(Figure: overshoot error (%) for Open Loop, PID, and MPC, under both the physical hierarchy and the VPS hierarchy)
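
One plausible reading of that definition in code; whether the excess is averaged over time, as here, or aggregated some other way is my assumption.

    # Total budgeting error: power consumed above the budget, averaged
    # over the run and normalized by the budget (zero while under budget).

    def budgeting_error(power_trace, budget):
        overshoot = sum(max(p - budget, 0.0) for p in power_trace)
        return overshoot / (budget * len(power_trace))

    # Example: two of four samples exceed a 100 W budget by 5 W each.
    print(budgeting_error([95.0, 105.0, 100.0, 105.0], 100.0))  # 0.025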

24. Metric: Errors within App Hierarchy
- Application power enforcement errors
(Figure: per-application power enforcement error (%) for Open Loop, PID, and MPC across the apps Lo-1, Lo-2, Hi-1, and Hi-2)

25. Metric: Power Differentiation
- VPS is designed to respect application priorities and QoS constraints in a shared infrastructure
- PID and MPC perform appropriate application differentiation
(Figure: power reduction (%) for the apps Lo-1, Lo-2, Hi-1, and Hi-2 under Open Loop, PID, MPC, and the physical hierarchy)

26. Metric: Application Performance
- Performance of the (low-priority) app that was capped
(Figure: response time (s) and throughput (TPS) over 5000 s under PID and MPC for the capped application)

27. Conclusions
- VPS: a power budgeting system for virtualized data centers
- The hierarchy of control follows the application layout
  - Respects application priorities and application VM boundaries
- Optimizes application performance, given a power budget
  - Dynamically adjusts power proportions
  - Exploits multiple knobs
