A Market Approach for Handling Power Emergencies in Multi-Tenant Data Center Mohammad A. Islam, Xiaoqi Ren, Shaolei Ren, Adam Wierman, and Xiaorui Wang
What makes up the costs in data centers?
Source: A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel. 2008. The cost of a cloud: research problems in data center networks. SIGCOMM Comput. Commun. Rev.
What makes up the costs in data centers?
[Figure: cost breakdown, grouped into Capital Expenditure (CapEx) and Operational Expenditure (OpEx)]
[Figure: data center power infrastructure. Labels: grid, generator, ATS, AC/DC, DC/AC, UPS, PDUs, cooling]
Infrastructure is really expensive, especially for multi-tenant data centers
[Figure: power infrastructure (grid, generator, ATS, AC/DC, DC/AC, UPS, PDUs, cooling) marked "Owned by operators"; four tenants marked "Owned by tenants"]
Infrastructure is really expensive, especially for multi-tenant data centers
Percentage of total data center industry electricity usage:
• Hyper-scale (e.g., Google): 7.8%
• Multi-tenant: 37%
• Enterprise: 53%
Pie Chart from CoreSite’s “One Wilshire” (Photo: CoreSite)
[Figure: power vs. time, with the power budget marked]
We need to maximize the utilization!
[Figure: power vs. time; the gap below the power budget is marked "Unused capacity!"]
We need to maximize the utilization!
Power oversubscription moves the line upward!
[Figure: power vs. time, with the power budget marked]
Benefits of power oversubscription
[Figure: bar chart of extra revenue ($/kW/year) vs. oversubscription (10%, 15%, 20%, 30%); bar labels: 180, 270, 360, 450]
Challenges for power oversubscription
• Power emergency!
• 25% of tenants had at least one downtime (several hours) in 2014.
[Figure: power vs. time, with the power budget marked]
Challenges for power oversubscription
[Figure: bar chart, 0% to 35%, for 2010, 2013, and 2016; categories: UPS failure/overloading, cyber crime (DDoS), accidental/human error]
Data from report “Cost of Data Center Outages” by Ponemon Institute, Jan 2016.
How do data center operators currently handle emergencies?
Before an outage occurs:
[Figure: operator and tenants]
How do data center operators currently handle emergencies?
After an outage occurs:
[Figure: operator and tenants; small rebate (approx. $3/kW/h)]
Consequences of power outage
[Figure: bar chart, $0 to $1,000,000, for 2010, 2013, and 2016; categories: partial unplanned outage, total unplanned outage, overall average cost]
On average, each incident is a million-dollar loss
Data from report “Cost of Data Center Outages” by Ponemon Institute, Jan 2016.
Consequences of power outage
On average, each incident is nearly a million-dollar loss
We need to handle power emergencies better!
Natural ideas
• Lower the IT power usage
  • There are many power capping solutions
  • DVFS, admission control, load migration, etc. [X. Wang, 2009][H. Lim 2011][X. Fu, 2011][A. Bhattacharya, 2012][D. Wang, 2013]
  • But the operator does NOT control tenants’ servers
  • Even assuming it does, which tenants should reduce power, and by how much?
• Static power reduction contracts
  • Cannot predict power reduction from tenants during an emergency
Natural ideas
• Existing power capping solutions (DVFS, admission control, load migration, etc.): not applicable to multi-tenant data centers!
Goal: provide a runtime design to extract power reduction from tenants at minimum performance loss!
COOP: CO-Ordinated Power management
[Figure: operator and tenants. Labels: cut power signal, response, cut power, price (reward)]
When a power emergency occurs…
• Two-level capping: high-level UPS and low-level PDU
• UPS capacity exceeded by $E_0$
• PDU capacity exceeded by $E_j$
• $O$ tenants: each cuts power by $t_j$ and has a “performance cost” of $d_j(t_j)$
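One way to write down the problem this slide sets up is sketched below. This is an assumption-laden reading, not the authors' stated formulation: the PDU index $k$ and the set $\mathcal{P}_k$ of tenants attached to PDU $k$ are hypothetical names introduced here so that tenant and PDU indices do not collide, and the objective (total performance cost) is inferred from the stated goal of minimum performance loss.

```latex
\begin{aligned}
\min_{t_1,\dots,t_O}\quad & \sum_{j=1}^{O} d_j(t_j) \\
\text{subject to}\quad & \sum_{j=1}^{O} t_j \ge E_0 && \text{(UPS-level overload removed)} \\
& \sum_{j \in \mathcal{P}_k} t_j \ge E_k && \text{for each PDU } k \text{ (PDU-level overload removed)} \\
& t_j \ge 0 && \text{for each tenant } j
\end{aligned}
```

Solving this directly would require the operator to know every $d_j$, which is exactly the information tenants do not share in a multi-tenant data center.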
How to solve it?
• Centralized control doesn’t work…
• Market approach
• Pricing: operator predicts tenants’ responses; tenants report nothing to the operator
• Supply function bidding: tenants report some, but not all, information via supply functions
• Auction: tenants report all information, i.e., “performance cost” $d_j(t_j)$; operator sets prices accordingly
Supply function $t(s)$
• If you offer me $s$, I will reduce power by $t(s)$…
• Extensively studied in the context of electricity markets
• We choose a parameterized supply function as follows (efficiency: [R. Johari, 2011][N. Chen, 2015])
• $t_j(c_j, s) = \left[\varepsilon_j - \frac{c_j}{s}\right]^+$
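The parameterized supply function can be sketched in a few lines of Python. The function name `supply` and the sample numbers are illustrative assumptions, not taken from the slides; the formula itself is the one on the slide, with $[x]^+ = \max(0, x)$.

```python
def supply(eps: float, c: float, s: float) -> float:
    """Power reduction t(c, s) = [eps - c / s]^+ offered at price s > 0.

    eps: the tenant's parameter epsilon_j from the slide
    c:   the tenant's bid c_j
    s:   the price (reward) offered by the operator
    """
    if s <= 0:
        raise ValueError("price s must be positive")
    return max(0.0, eps - c / s)


# Hypothetical tenant with eps = 10 and bid c = 8: a higher price draws a
# larger reduction, and a price that is too low draws none at all.
for price in (0.5, 1.0, 2.0, 4.0):
    print(price, supply(eps=10.0, c=8.0, s=price))
```

A larger bid `c` shifts the curve down, so a tenant that values its performance highly can ask for a higher price before cutting any power.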
Parameterized supply function bidding
[Figure: operator and tenants. Labels: cut power $E_j$ for $j = 0, 1, \dots, M$; supply bid $c_j$; price $s$; cut power by $t_j = \left[\varepsilon_j - \frac{c_j}{s}\right]^+$]
#1: Operator announces supply function $t_j(c_j, s) = \left[\varepsilon_j - \frac{c_j}{s}\right]^+$
#2: Tenant $j$ submits bid $c_j$
#3: Operator clears market price $s$ to satisfy multi-level power capping
#4: Power reduction is exercised
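For intuition about step #3, here is a worked special case. It assumes that only the UPS-level constraint $E_0$ binds and that every tenant ends up with a strictly positive reduction (so the $[\cdot]^+$ can be dropped); neither assumption comes from the slides, and the numbers are made up.

```latex
\sum_{j} \left( \varepsilon_j - \frac{c_j}{s} \right) = E_0
\quad\Longrightarrow\quad
s = \frac{\sum_{j} c_j}{\sum_{j} \varepsilon_j - E_0},
\qquad \text{valid when } \sum_{j} \varepsilon_j > E_0 .

% Example with two tenants: \varepsilon = (10, 10), c = (4, 2), E_0 = 12
s = \frac{4 + 2}{20 - 12} = 0.75, \qquad
t_1 = 10 - \frac{4}{0.75} \approx 4.67, \qquad
t_2 = 10 - \frac{2}{0.75} \approx 7.33, \qquad
t_1 + t_2 = 12 .
```

When PDU-level constraints also bind, or some tenants are priced out, no single closed form applies and the price has to be found numerically.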
How to bid?
• Bid based on the tenant’s own performance cost, with no need to disclose it
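As a rough illustration of how a bid can encode a private cost without revealing it, the sketch below makes several assumptions that are not in the slides: the tenant is a price taker expecting price `s_expected`, it is paid `s * t` for cutting `t`, and its performance cost is the hypothetical quadratic `d(t) = a * t**2`. The function name `best_bid` is likewise made up for this example.

```python
def best_bid(eps: float, a: float, s_expected: float) -> float:
    """Bid c for a price-taking tenant with assumed cost d(t) = a * t**2.

    The tenant picks the reduction t* that maximizes s*t - a*t**2 on
    [0, eps], then chooses c so that eps - c / s equals t* at that price.
    """
    if a <= 0 or s_expected <= 0:
        raise ValueError("a and s_expected must be positive")
    t_star = min(eps, s_expected / (2.0 * a))  # marginal cost 2*a*t equals the price
    return s_expected * (eps - t_star)         # invert t = eps - c / s for c


# Hypothetical tenant: eps = 10, a = 0.5, expected price 2 -> wants to cut 2 units.
bid = best_bid(eps=10.0, a=0.5, s_expected=2.0)
print(bid)  # 16.0, since 10 - 16 / 2 = 2
```

The operator only ever sees the scalar `bid`; the cost parameter `a` stays with the tenant.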
How to set price?
• Tenants reduce more power when offered a higher price
• Set the price just large enough to make sure that tenants reduce enough power
• If no price is within the expected range (to ensure no profit loss for the operator), then enter “failover” mode
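The price-setting rule above can be sketched as a bisection search. This is not the authors' implementation: the names `clear_price`, `groups`, `targets`, and `s_max` are hypothetical, `s_max` stands in for the upper end of the operator's "expected range", and returning `None` stands in for entering failover mode. Bisection works here because each tenant's reduction never decreases as the price rises.

```python
from typing import Optional, Sequence


def supply(eps: float, c: float, s: float) -> float:
    """Parameterized supply function t = [eps - c / s]^+ for a price s > 0."""
    return max(0.0, eps - c / s)


def clear_price(
    eps: Sequence[float],
    bids: Sequence[float],
    groups: Sequence[Sequence[int]],
    targets: Sequence[float],
    s_max: float,
    tol: float = 1e-6,
) -> Optional[float]:
    """Smallest price (within tol) meeting every capping constraint.

    groups[k] lists the tenant indices under constraint k (for example all
    tenants for the UPS, a subset for each PDU); targets[k] is the overload
    that group must remove. Returns None if even s_max is not enough.
    """
    def feasible(s: float) -> bool:
        cuts = [supply(e, c, s) for e, c in zip(eps, bids)]
        return all(
            sum(cuts[i] for i in group) >= target
            for group, target in zip(groups, targets)
        )

    if not feasible(s_max):
        return None  # no acceptable price: caller should enter failover mode
    lo, hi = 0.0, s_max  # invariant: hi is feasible, lo is not (or is 0)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical emergency: three tenants, a UPS constraint over all of them
# (cut 12) and one PDU constraint over tenants 0 and 1 (cut 6).
eps, bids = [10.0, 10.0, 10.0], [8.0, 4.0, 2.0]
groups, targets = [[0, 1, 2], [0, 1]], [12.0, 6.0]
print(clear_price(eps, bids, groups, targets, s_max=5.0))  # about 0.857 (= 6/7)
print(clear_price(eps, bids, groups, targets, s_max=0.5))  # None -> failover
```

In this example the PDU constraint, not the UPS constraint, determines the price, which is why a multi-level check is needed rather than a single aggregate one.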
Implementation
Evaluation Methodology
• 5 tenants running different workloads, housed on two clusters
• DVFS for power reduction
COOP is close to optimal
[Figure: performance cost ($) vs. oversubscription rate (5%, 10%, 15%) for OPT and COOP]
• COOP comes close to the minimum performance cost achieved by OPT
• OPT is an idealized case where the operator dictates tenants’ power reduction, as in an owner-operated data center
• Settling time: under 1 second
COOP is win-win
[Figure: saving/extra profit (0% to 40%) vs. oversubscription (5%, 10%, 15%) for tenants T#1 to T#5 and the operator]
• Tenants reduce power cost with minimal (temporary) performance impact
• Operator increases profit by selling capacity to more tenants
COOP: CO-Ordinated Power management
A market-based approach for handling power emergencies and helping the operator better oversubscribe data center capacity
Simple, Scalable & Efficient