Balancing Fairness and Efficiency in Tiered Storage Systems with Bottleneck-Aware Allocation
Hui Wang, Peter Varman
Rice University
FAST '14, February 2014
Tiered Storage
- Tiered storage: HDs and SSDs
  - Advantages: performance, cost
  - Challenges: fair resource allocation; high system efficiency
    - Variable system throughput
Tiered Storage Model
- Clients: issue requests to the SSD (hits) and the HD (misses) in a certain ratio
- Scheduler: aware of each request's target; dispatches requests to the storage devices
- Storage: SSD and HD are independent, with no frequent data migration
Fairness and Efficiency in Tiered Storage
- How do we define fairness?
  - How to define fairness over multiple resources?
  - Fair allocation may cause low efficiency
- How do we improve the efficiency of both devices?
  - Focusing only on efficiency may cause unfairness
Existing Solutions for QoS Scheduling
- Proportional sharing in storage / IO scheduling
  - Extended from network and CPU scheduling
  - Additional reservation and limit controls
  - All of them are designed for a single resource!
- Dominant Resource Fairness (DRF) model [NSDI '11]
  - Designed for allocating multiple resources
  - Does not explicitly address system utilization
Talk Outline
- Motivation
- Bottleneck-Aware Allocation (BAA)
- Evaluation
- Conclusions and future work
Example: Single Device Type
- Configuration:
  - Single HD with capacity 100 IOPS
  - Two clients with equal weights, fully backlogged, work-conserving
  - Proportional sharing
- Results:
  - Each client gets 50 IOPS
  - Utilization 100%
- The device can be fully utilized for any allocation ratio
What if there are multiple resources?
Example: Multiple Devices (Fairness)
- Natural policy: Weighted Fair Queuing (WFQ)
- Configuration:
  - HD capacity 100 IOPS, SSD capacity 500 IOPS
  - Two clients with hit ratios h1 = 0.9, h2 = 0.5
  - Conventional WFQ with 1:1 weights
- Results (worked through in the sketch below):
  - Each client gets 167 IOPS (client 1: 16.7 HD + 150 SSD; client 2: 83.3 HD + 83.3 SSD)
  - HD utilization is 100%, but the SSD is only 47% utilized and sits largely idle
- Simply carrying WFQ over to multiple resources creates an efficiency problem!
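A minimal sketch of the arithmetic behind this slide (my own illustration, not code from the talk): under 1:1 WFQ every backlogged client gets the same total rate, each client's IOs split across HD and SSD by its hit ratio, and the common rate is capped by the first device to saturate.

```python
# Sketch: equal-throughput WFQ over a two-tier (HD + SSD) system.
# h[i] is client i's SSD hit ratio; 1 - h[i] of its IOs go to the HD.

def wfq_equal_share(h, c_hd, c_ssd):
    """Per-client rate and device utilizations when all clients get the
    same total IOPS (1:1 weights) and are fully backlogged."""
    hd_load_per_unit = sum(1.0 - hi for hi in h)   # HD IOs per 1 IOPS of each client
    ssd_load_per_unit = sum(h)                     # SSD IOs per 1 IOPS of each client
    # The common rate x is capped by whichever device saturates first.
    x = min(c_hd / hd_load_per_unit, c_ssd / ssd_load_per_unit)
    util_hd = x * hd_load_per_unit / c_hd
    util_ssd = x * ssd_load_per_unit / c_ssd
    return x, util_hd, util_ssd

print(wfq_equal_share([0.9, 0.5], c_hd=100, c_ssd=500))
# -> about 166.7 IOPS each, HD 100%, SSD ~47%, matching the slide
```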
Example: Multiple Devices (Efficiency)
- Configuration:
  - HD capacity 100 IOPS, SSD capacity 500 IOPS
  - Two clients with hit ratios h1 = 0.9, h2 = 0.5
- Results:
  - Utilization 100% on both devices
  - Client 1 gets 500 IOPS (50 HD + 450 SSD); client 2 gets 100 IOPS (50 HD + 50 SSD)
- It is not possible to precisely set both the relative allocations (fairness) and the system utilization (efficiency); see the worked check below.
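A worked check of the slide's claim (my addition, not part of the deck): requiring both devices to be 100% utilized already determines the allocation ratio, so fairness and efficiency cannot be dialed independently.

```latex
% Full utilization of both devices with hit ratios h_1 = 0.9, h_2 = 0.5:
\begin{aligned}
0.1\,x_1 + 0.5\,x_2 &= 100 && \text{(HD, 100 IOPS)} \\
0.9\,x_1 + 0.5\,x_2 &= 500 && \text{(SSD, 500 IOPS)} \\
\Rightarrow\quad x_1 = 500,\; x_2 &= 100 && \text{(ratio forced to 5:1)}
\end{aligned}
```

Any other target ratio, such as WFQ's 1:1, therefore has to leave one device partly idle.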
DRF (Dominant Resource Fairness)
- Configuration:
  - HD 100 IOPS, SSD 500 IOPS
  - Two clients: h1 = 0.9 (dominant resource: SSD), h2 = 0.5 (dominant resource: HD)
- What will DRF do? Equalize the dominant shares
- Result: client 1 gets 36 HD + 324 SSD IOPS and client 2 gets 64 HD + 64 SSD IOPS (both dominant shares about 64%); the HD is 100% utilized, but the SSD only 77%, with the rest idle
DRF
- DRF does not address efficiency
  - Add a third client with h3 = 0.1
  - SSD utilization drops further, to 48% (HD still 100%); per the figure, client 1 gets 22 HD + 196 SSD, client 2 gets 39 HD + 39 SSD, client 3 gets 39 HD + 5 SSD, all at a dominant share of about 39%
  - It gets worse as more clients bottleneck on the HD
- A sketch of this dominant-share calculation follows below.
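A sketch of the DRF calculation used in these two examples (my reading of DRF for fully backlogged clients, not the paper's code): each client's dominant per-IOPS demand is the larger of its HD and SSD demands, and the common dominant share grows until one device saturates.

```python
# Sketch: DRF with equalized dominant shares on a two-tier (HD + SSD) system.
# h[i] is client i's SSD hit ratio.

def drf_allocation(h, c_hd, c_ssd):
    # Dominant (largest) per-IOPS demand on either device, per client.
    dom = [max((1 - hi) / c_hd, hi / c_ssd) for hi in h]
    # Raise the common dominant share s until one device runs out of capacity.
    s = min(c_hd / sum((1 - hi) / di for hi, di in zip(h, dom)),
            c_ssd / sum(hi / di for hi, di in zip(h, dom)))
    x = [s / di for di in dom]                       # per-client total IOPS
    util_hd = sum((1 - hi) * xi for hi, xi in zip(h, x)) / c_hd
    util_ssd = sum(hi * xi for hi, xi in zip(h, x)) / c_ssd
    return x, util_hd, util_ssd

print(drf_allocation([0.9, 0.5], 100, 500))       # ~[357, 129], HD 100%, SSD ~77%
print(drf_allocation([0.9, 0.5, 0.1], 100, 500))  # ~[217, 78, 43], HD 100%, SSD ~48%
```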
One More HD-bound Client
[Figure: side-by-side comparison of the two DRF allocations: two clients (HD 100%, SSD 77%) versus three clients (HD 100%, SSD 48%); the idle SSD fraction grows as another HD-bound client is added]
Talk Outline
- Motivation
- Bottleneck-Aware Allocation (BAA)
- Evaluation
- Conclusions and future work
Fair Shares
- Fair share of a client: the IOPS it would get if each resource were partitioned equally among the clients
- Example: two devices (HD 150 IOPS, SSD 300 IOPS) and three clients
  - Client 1: h1 = 4/9
  - Client 2: h2 = 4/9
  - Client 3: h3 = 5/6
[Figure: each device split into three 1/3 slices, with the resulting per-client IOPS left as "?"]
Fair Shares
- Clients: h1 = 4/9, h2 = 4/9, h3 = 5/6 (HD 150 IOPS, SSD 300 IOPS)
- Fair shares (f_i), computed in the sketch below:
  - Client 1: 90 IOPS (50 HD + 40 SSD)
  - Client 2: 90 IOPS (50 HD + 40 SSD)
  - Client 3: 120 IOPS (20 HD + 100 SSD)
- The fair share depends only on the client's hit ratio and the capacities of the devices
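A small sketch of this definition (the helper name and structure are mine): give every client a 1/n slice of each device; its fair share is the IOPS that slice sustains at its hit ratio, capped by whichever slice runs out first.

```python
# Sketch: fair share f_i = IOPS client i would get if each device were
# partitioned equally among the n clients.

def fair_shares(h, c_hd, c_ssd):
    n = len(h)
    shares = []
    for hi in h:
        hd_slice, ssd_slice = c_hd / n, c_ssd / n
        # Throughput is capped by whichever per-client slice saturates first.
        cap_hd = float('inf') if hi == 1.0 else hd_slice / (1 - hi)
        cap_ssd = float('inf') if hi == 0.0 else ssd_slice / hi
        shares.append(min(cap_hd, cap_ssd))
    return shares

print(fair_shares([4/9, 4/9, 5/6], c_hd=150, c_ssd=300))  # -> about [90, 90, 120]
```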
Fairness Policy
- Allocate in the ratio of fair shares?
  - Fair share reflects what a client would get if running alone
- Problem: throttling across devices, similar to the DRF example
- Solution: bottleneck-aware allocation
Bottleneck-Aware Allocation
- Bottleneck sets (classification sketched below):
  - Define the load-balancing point h_bal = C_s / (C_s + C_d)
  - If h_i ≤ h_bal: client i is in the HD-bottleneck set (D)
  - If h_i > h_bal: client i is in the SSD-bottleneck set (S)
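A sketch of this classification (the function name is mine): clients whose hit ratio is at or below the load-balancing point put relatively more load on the HD, the rest on the SSD.

```python
# Sketch: split clients into the HD-bottleneck set D and the SSD-bottleneck
# set S around the load-balancing point h_bal = C_s / (C_s + C_d).

def bottleneck_sets(h, c_hd, c_ssd):
    h_bal = c_ssd / (c_ssd + c_hd)   # hit ratio at which both devices fill evenly
    d_set = [i for i, hi in enumerate(h) if hi <= h_bal]   # HD-bottlenecked
    s_set = [i for i, hi in enumerate(h) if hi > h_bal]    # SSD-bottlenecked
    return d_set, s_set

# HD = 100 IOPS, SSD = 500 IOPS  =>  h_bal = 500/600 ~ 0.83
print(bottleneck_sets([0.9, 0.5, 0.1], 100, 500))   # -> D = [1, 2], S = [0]
```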
Fairness Requirements of BAA
- Sharing Incentive (SI): no client gets fewer IOPS than it would from equally partitioning each resource
- Envy-Freedom (EF): every client prefers its own allocation over the allocation of any other client
- Local Fair-Share Ratio: clients belonging to the same bottleneck set get IOPS in proportion to their fair shares
Bottleneck-Aware Allocation
- Maximize system throughput
- Satisfy the fairness requirements
Solution Space Satisfying All Properties
- BAA matches the SI and EF properties of DRF
- BAA achieves the same or better utilization than DRF
[Figure: Venn diagram of the Sharing Incentive, Envy-Freedom, and Local Fair-Share Ratio properties; DRF sits inside SI and EF, while BAA searches the region satisfying all three]
Fairness Constraints of BAA
- Fairness between clients in D
- Fairness between clients in S
- Fairness between a client in D and a client in S
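The constraint formulas did not survive the text extraction; the following is a hedged reconstruction from the fairness requirements listed earlier (local fair-share ratio within a set, sharing incentive and envy-freedom across sets), not a verbatim copy of the paper's inequalities.

```latex
% Within a bottleneck set: allocations in proportion to fair shares f_i.
\frac{x_i}{x_j} = \frac{f_i}{f_j} \quad \text{for } i, j \in D,
\qquad
\frac{x_i}{x_j} = \frac{f_i}{f_j} \quad \text{for } i, j \in S
% Across sets (i \in D, j \in S): sharing incentive, x_i \ge f_i and
% x_j \ge f_j, plus envy-freedom conditions that bound how far the two
% sets' allocations may diverge from the fair-share ratio.
```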
Optimization for Allocation (2-variable LP)
[Slide shows the LP: objective and constraints (1)-(4); a sketch of one possible formulation follows below]
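A sketch of how such a 2-variable LP can be set up (my reading of the slide, not the paper's exact formulation): because allocations inside each bottleneck set stay proportional to the fair shares, only two scale factors beta_D and beta_S are free; maximize total throughput subject to the device capacities and the sharing-incentive bound beta >= 1. The cross-set envy-freedom constraints are omitted here for brevity, and the helper names and use of scipy are my own choices.

```python
# Sketch: BAA-style 2-variable LP. Clients in set D get x_i = beta_D * f_i,
# clients in set S get x_i = beta_S * f_i; maximize total IOPS.
from scipy.optimize import linprog

def baa_lp(h, f, d_set, s_set, c_hd, c_ssd):
    # Aggregate fair-share demand of each set on each device.
    F_D = sum(f[i] for i in d_set)
    F_S = sum(f[i] for i in s_set)
    hd_D = sum((1 - h[i]) * f[i] for i in d_set)
    hd_S = sum((1 - h[i]) * f[i] for i in s_set)
    sd_D = sum(h[i] * f[i] for i in d_set)
    sd_S = sum(h[i] * f[i] for i in s_set)

    c = [-F_D, -F_S]                        # linprog minimizes, so negate
    A_ub = [[hd_D, hd_S], [sd_D, sd_S]]     # HD and SSD capacity constraints
    b_ub = [c_hd, c_ssd]
    # Sharing incentive: every client gets at least its fair share (beta >= 1).
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(1, None), (1, None)])
    beta_d, beta_s = res.x
    return [(beta_d if i in d_set else beta_s) * fi for i, fi in enumerate(f)]

# Fair-share example from the earlier slides: HD 150 IOPS, SSD 300 IOPS.
h = [4/9, 4/9, 5/6]
f = [90, 90, 120]                # fair shares; D = {0, 1}, S = {2}
print(baa_lp(h, f, [0, 1], [2], 150, 300))
# -> roughly [96, 96, 257]; both devices end up fully used in this sketch
```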
Talk Outline
- Motivation
- Bottleneck-Aware Allocation (BAA)
- Evaluation
- Conclusions and future work
Evaluation
- Simulation
  - Evaluate BAA's efficiency
  - Evaluate BAA's dynamic behavior when the workload changes
- Linux
  - Prototype built by interposing the BAA scheduler in the IO path
  - Evaluate BAA's efficiency and fairness (SI and EF)
Simulation (Efficiency - 2 clients)
- Two clients: h1 = 0.5, h2 = 0.95
- Two devices: HD = 100 IOPS, SSD = 5000 IOPS
- SSD utilization: FQ 7%, DRF 65%, BAA 100%
Simulation (Efficiency - 3 clients)
- A third client: h3 = 0.8
- SSD utilization: FQ 6%, DRF 45%, BAA 71% (bounded by fairness)
Simulation (Dynamic Behavior)
- Two clients: h1 = 0.45, dropping to 0.2 after 510 s; h2 = 0.95
- Two devices: HD = 200 IOPS, SSD = 3000 IOPS
- After the workload change, utilization is pulled back up to a high level within a short period
Linux (Efficiency - Throughput)
- Two clients: Financial workload (h1 = 0.3), Exchange workload (h2 = 0.95)
- Total throughput: BAA 1396 IOPS, DRF 810 IOPS, CFQ 1011 IOPS
Linux (Efficiency - Utilization)
- Average utilization:
  - BAA: HD 94%, SSD 92%
  - DRF: HD 99%, SSD 78%
  - CFQ: HD 99.8%, SSD 83%
Linux (Fairness – Sharing Incentive)
- Four financial clients: h1 = 0.2 (set D), h2 = 0.4 (set D), h3 = 0.98 (set S), h4 = 1.0 (set S)
- Every client receives at least its fair share, in proportion to its fair share
[Figure: log-scale bar chart comparing fair share and achieved throughput (IOPS) for clients 1-4]
Linux (Fairness – Envy Freedom)
- No client envies another's allocation
  - No client gets a higher allocation on all devices
  - Set D clients get the higher HD allocations; set S clients get the higher SSD allocations
[Figure: log-scale bar chart of HD and SSD allocations (IOPS) for clients 1-4]
Talk Outline
- Motivation
- Bottleneck-Aware Allocation (BAA)
- Evaluation
- Conclusions and future work
Conclusions and Future Work
- A new model (BAA) to balance fairness and efficiency
  - Fairness: sharing incentive, envy-freedom, local fair share
  - Efficiency: maximize utilization subject to the fairness constraints
Ongoing Work
- Apply BAA to broader multi-resource allocation: CPU, memory, networks
- Other fairness policies: cost, reservations
- Cache model: SSD as a cache of the HD, with data migration