A Market Approach for Handling Power Emergencies in Multi-Tenant Data Center

  1. A Market Approach for Handling Power Emergencies in Multi-Tenant Data Center. Mohammad A. Islam, Xiaoqi Ren, Shaolei Ren, Adam Wierman, and Xiaorui Wang

  2. What makes up the costs in data centers? Source: A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel. 2008. The cost of a cloud: research problems in data center networks. SIGCOMM Comput. Commun. Rev.

  3. What makes up the costs in data centers? Capital Expenditure (CapEx) and Operational Expenditure (OpEx). Source: A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel. 2008. The cost of a cloud: research problems in data center networks. SIGCOMM Comput. Commun. Rev.

  4. [Diagram: data center power infrastructure. Labels: Grid, Generator, ATS, AC/DC, DC/AC, UPS, Cooling, PDU, PDU.]

  5. Infrastructure is really expensive, especially for multi-tenant data centers. [Diagram: the power infrastructure from slide 4, annotated "Owned by operators" and "Owned by tenants", with four Tenant blocks.]

  6. Infrastructure is really expensive, especially for multi-tenant data centers. Percentage of total data center industry electricity usage: hyper-scale (e.g., Google) 7.8%, multi-tenant 37%, enterprise 53%. [Diagram as on slide 5, plus a pie chart of these shares.] Pie Chart from CoreSite’s “One Wilshire” (Photo: CoreSite)

  7. [Plot: power versus time, with a "Power budget" line.]

  8. We need to maximize the utilization! Unused capacity! [Plot: power versus time, with a "Power budget" line.]

  9. We need to maximize the utilization! Power oversubscription moves the line upward! [Plot: power versus time, with a "Power budget" line.]

  10. Benefits of power oversubscription. [Bar chart: extra revenue ($/kW/year) versus oversubscription (10%, 15%, 20%, 30%); data labels 180, 270, 360, 450.]

  11. Challenges for power oversubscription. Power emergency! 25% of tenants had at least one downtime (several hours) in 2014. [Plot: power versus time, with a "Power budget" line.]

  12. Challenges for power oversubscription. [Bar chart: UPS failure/overloading, cyber crime (DDoS), and accidental/human error for 2010, 2013, and 2016; y-axis from 0% to 35%.] Data from report “Cost of Data Center Outages” by Ponemon Institute, Jan 2016.

  13. How do data center operators currently handle emergencies? Before an outage occurs: [Diagram: Operator and Tenants.]

  14. How do data center operators currently handle emergencies? After an outage occurs: [Diagram: Operator and Tenants; label: small rebate (approx. $3/kW/h).]

  15. Consequences of power outage. [Bar chart: partial unplanned outage, total unplanned outage, and overall average cost for 2010, 2013, and 2016; y-axis from $0 to $1,000,000.] On average, each incident is a million-dollar loss. Data from report “Cost of Data Center Outages” by Ponemon Institute, Jan 2016.

  16. Consequences of power outage. [Bar chart: partial unplanned outage, total unplanned outage, and overall average cost for 2010, 2013, and 2016; y-axis from $0 to $1,000,000.] On average, each incident is nearly a million-dollar loss.

  17. We need to handle power emergencies better!

  18. Natural ideas
      • Lower the IT power usage
      • There are many power capping solutions
      • DVFS, admission control, load migration, etc. [X. Wang, 2009][H. Lim, 2011][X. Fu, 2011][A. Bhattacharya, 2012][D. Wang, 2013]
      • But the operator does NOT control tenants’ servers
      • Even assuming it does, which tenants should reduce power, and by how much?
      • Static power reduction contracts
      • Cannot predict power reduction from tenants during an emergency

  19. Natural ideas
      • Lower the IT power usage
      • There are many power capping solutions
      • DVFS, admission control, load migration, etc. [X. Wang, 2009][H. Lim, 2011][X. Fu, 2011][A. Bhattacharya, 2012][D. Wang, 2013]
      • Not applicable to multi-tenant data centers!
      • But the operator does NOT control tenants’ servers
      • Even assuming it does, which tenants should reduce power, and by how much?
      • Static power reduction contracts
      • Cannot predict power reduction from tenants during an emergency

  20. Goal: provide a runtime design to extract power reduction from tenants at minimum performance loss!

  21. COOP: CO-Ordinated Power management. [Diagram between Operator and Tenants; labels: "Cut power signal", "Cut power response", "Price (Reward)".]

  22. When a power emergency occurs…
      • Two-level capping: high-level UPS and low-level PDU
      • UPS capacity exceeded by $E_0$
      • PDU capacity exceeded by $E_j$
      • $O$ tenants: each cuts power $t_j$ and has a “performance cost” of $d_j(t_j)$
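One way to read this setup, sketched here as an assumption rather than as the authors' stated formulation, is as a cost-minimization problem over the tenants' power cuts. The index $k$ for capping levels and the sets $\mathcal{T}_k$ (tenants fed by PDU $k$) are hypothetical notation introduced only for this sketch.

```latex
\begin{aligned}
\min_{t_1, \ldots, t_O} \quad & \sum_{j=1}^{O} d_j(t_j) \\
\text{s.t.} \quad & \sum_{j=1}^{O} t_j \ge E_0 && \text{(UPS level)} \\
& \sum_{j \in \mathcal{T}_k} t_j \ge E_k && \text{for each overloaded PDU } k \\
& t_j \ge 0 && \text{for every tenant } j
\end{aligned}
```

Solving this directly would require the operator to know every $d_j$ and to control the tenants' servers.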

  23. How to solve it?
      • Centralized control doesn’t work…
      • Market approach

  24. Three market designs:
      • Pricing: Operator predicts tenants’ responses; tenants report nothing to the operator.
      • Supply function bidding: Tenants report some, but not all, information via supply functions.
      • Auction: Tenants report all information, i.e., “performance cost” $d_j(t_j)$; operator sets prices accordingly.

  25. Supply function $t(s)$
      • If you offer me $s$, I will reduce power $t(s)$…
      • Extensively studied in the context of electricity markets
      • We choose a parameterized supply function as follows (efficiency: [R. Johari, 2011][N. Chen, 2015])
      • $t_j(c_j, s) = \left[\varepsilon_j - \frac{c_j}{s}\right]^+$
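A minimal sketch of the parameterized supply function on this slide, reading the garbled formula as $t_j(c_j, s) = [\varepsilon_j - c_j/s]^+$. The function name `supply` and the argument names are hypothetical; the slides do not spell out what $\varepsilon_j$ represents, so the code treats it only as the value the reduction approaches as the price grows.

```python
def supply(eps_j: float, c_j: float, s: float) -> float:
    """Power reduction t_j(c_j, s) = [eps_j - c_j / s]^+ at price s > 0.

    eps_j: upper limit of the reduction implied by the formula (assumed meaning)
    c_j:   the tenant's bid
    s:     the price (reward) offered by the operator
    """
    if s <= 0:
        raise ValueError("price s must be positive")
    # The [.]^+ operator clips negative values to zero.
    return max(0.0, eps_j - c_j / s)
```

For a fixed nonnegative bid, a higher price never lowers the reduction: with `eps_j = 10` and `c_j = 4`, the reduction is 6 at price 1 and 8 at price 2, and it is clipped to 0 at price 0.25.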

  26. Parameterized supply function bidding. [Diagram between Operator and Tenants; labels: cut power $E_k$ for $k = 0, 1, \ldots, M$; supply bid $c_j$; price $s$; cut power by $t_j = \left[\varepsilon_j - \frac{c_j}{s}\right]^+$.]
      #1: Operator announces supply function $t_j(c_j, s) = \left[\varepsilon_j - \frac{c_j}{s}\right]^+$
      #2: Tenant $j$ submits bid $c_j$
      #3: Operator clears market price $s$ to satisfy multi-level power capping
      #4: Power reduction is exercised

  27. How to bid?
      • Bid based on tenant’s own performance cost, but no need to disclose it
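The slide does not say how a tenant turns its cost into a bid. As one illustration only, the sketch below assumes a price-taking tenant with a hypothetical quadratic performance cost $d_j(t) = a_j t^2$ and a price estimate `s_hat` (for example, from an earlier round). The names `price_taking_bid`, `a_j`, and `s_hat` are all assumptions, not part of the slides.

```python
def price_taking_bid(eps_j: float, a_j: float, s_hat: float) -> float:
    """Bid c_j for a tenant with hypothetical cost d_j(t) = a_j * t**2.

    The tenant picks the reduction t* in [0, eps_j] that maximizes
    s_hat * t - a_j * t**2, then chooses c_j so that the supply function
    [eps_j - c_j / s_hat]^+ evaluates to exactly t*.
    """
    if s_hat <= 0 or a_j <= 0:
        raise ValueError("s_hat and a_j must be positive")
    # Unconstrained maximizer is s_hat / (2 * a_j); cap it at eps_j.
    t_star = min(eps_j, s_hat / (2.0 * a_j))
    # Invert t* = eps_j - c_j / s_hat for c_j.
    return (eps_j - t_star) * s_hat
```

Only the scalar `c_j` reaches the operator; the cost parameter `a_j` stays private, which matches the slide's point that the tenant need not disclose its performance cost.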

  28. How to set price?
      • Tenants reduce more power when offered a higher price
      • The price is set just high enough to ensure that tenants reduce enough power
      • If no price is within the expected range (to ensure no profit loss for operator), then enter “failover” mode
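The price-setting rule above can be sketched as a search for the smallest price that meets every capping level. The slides do not state how the price is computed; bisection is used here as a stand-in, which works because each tenant's reduction is nondecreasing in the price when bids are nonnegative. The names `clear_price`, `levels`, and `s_max` (the upper end of the operator's acceptable price range) are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

def total_cut(eps: Sequence[float], bids: Sequence[float],
              s: float, members: Sequence[int]) -> float:
    """Sum of [eps_j - c_j / s]^+ over the tenants in `members`."""
    return sum(max(0.0, eps[j] - bids[j] / s) for j in members)

def clear_price(eps: Sequence[float], bids: Sequence[float],
                levels: List[Tuple[float, Sequence[int]]],
                s_max: float, tol: float = 1e-6) -> Optional[float]:
    """Smallest price (within tol) meeting every level, or None for failover.

    levels: (required cut E_k, indices of tenants under level k) pairs,
            e.g. one entry for the UPS and one per overloaded PDU.
    Assumes all bids are nonnegative so total_cut is nondecreasing in s.
    """
    if s_max <= 0:
        raise ValueError("s_max must be positive")

    def feasible(s: float) -> bool:
        return all(total_cut(eps, bids, s, m) >= e_k for e_k, m in levels)

    if not feasible(s_max):
        return None  # no acceptable price: caller enters failover mode
    lo, hi = 0.0, s_max  # invariant: hi is always feasible
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

With `eps = [10, 10]`, `bids = [4, 6]`, and a single UPS-level requirement of 12 over both tenants, the search settles near 1.25; adding a PDU-level requirement of 6 on the second tenant raises the price to about 1.5, and capping `s_max` at 1.0 yields `None`.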

  29. Implementation

  30. Evaluation Methodology
      • 5 tenants running different workloads housed on two clusters
      • DVFS for power reduction

  31. COOP is close to Optimal. [Plot: performance cost ($), 0 to 2, for OPT and COOP versus oversubscription rate (5%, 10%, 15%).]
      • COOP's performance cost is almost as low as OPT's
      • OPT is an idealized case where the operator dictates tenants’ power reduction as in an owner-operated data center
      • Settling time: under 1 second

  32. COOP is win-win. [Bar chart: saving/extra profit (0% to 40%) for tenants T#1 to T#5 and the operator versus oversubscription (5%, 10%, 15%).]
      • Tenants reduce power cost with minimum (temporary) performance impact
      • Operator increases profit by selling capacity to more tenants

  33. COOP: CO-Ordinated Power management. A market-based approach for handling power emergencies and helping the operator better oversubscribe data center capacity. Simple, Scalable & Efficient
