The SCADS Director: Scaling a Distributed Storage System Under Stringent Performance Requirements
Beth Trushkowsky, Peter Bodík, Armando Fox, Michael J. Franklin, Michael I. Jordan, David A. Patterson
FAST 2011
elasticity for interactive web apps

Interactivity Service-Level Objective (SLO): over any 1-minute interval, 99% of … requests are satisfied in less than 100ms.

(Diagram: clients issue requests to web servers, which are backed by the storage tier.)

Targeted system features:
- horizontally scalable
- API for data movement
- backend for interactive apps
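A minimal sketch of evaluating an SLO of this form over one interval, assuming the per-request latencies for that interval are available; the function name and example numbers are illustrative, not from the paper.

```python
def slo_satisfied(latencies_ms, percentile=99.0, threshold_ms=100.0):
    """Interactivity SLO check for one interval: at least `percentile`%
    of the requests observed in the interval finished under `threshold_ms`."""
    if not latencies_ms:
        return True                      # no requests, nothing to violate
    under = sum(1 for l in latencies_ms if l < threshold_ms)
    return under / len(latencies_ms) >= percentile / 100.0

# Example: a 1-minute interval with 1000 requests, 12 of them slow.
interval = [20.0] * 988 + [250.0] * 12
print(slo_satisfied(interval))           # False: only 98.8% were under 100ms
```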
wikipedia workload trace

(Plot: Wikipedia request rate during June 2009, with a sharp spike annotated "Michael Jackson dies".)
overprovisioning

The storage system would have to overprovision by 300% to handle the spike (assuming data stored on ten servers).
contributions

Cloud computing is a mechanism for storage elasticity:
- scale up when needed
- scale down to save money

We address the scaling policy:
- challenges of latency-based scaling
- model-based approach for elasticity to deal with a stringent SLO
- fine-grained workload monitoring aids in scaling up and down
- show elasticity for both a hotspot and a diurnal workload pattern
SCADS key/value store

Features:
- partitioning (until some minimum data size)
- replication
- add/remove servers

Properties:
- range-based partitioning
- data maintained in memory for performance
- eventually consistent

(See "SCADS: Scale-independent storage for social computing applications", CIDR '09.)
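A minimal sketch of range-based partitioning as listed above, assuming each server owns a contiguous key range described by a sorted list of range-start keys; the class and method names are illustrative, not the SCADS API.

```python
import bisect

class RangePartitionedStore:
    """Toy range-partitioned key/value layout: each server owns a
    contiguous key range, defined by the sorted list of range-start keys."""

    def __init__(self, range_starts, servers):
        # range_starts[i] is the smallest key served by servers[i]
        assert len(range_starts) == len(servers)
        self.range_starts = range_starts
        self.servers = servers

    def server_for(self, key):
        # Find the rightmost range whose start key is <= the lookup key.
        idx = bisect.bisect_right(self.range_starts, key) - 1
        return self.servers[max(idx, 0)]

store = RangePartitionedStore(["a", "h", "p"], ["node1", "node2", "node3"])
print(store.server_for("monkey"))  # -> node2, which owns keys in ["h", "p")
```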
classical closed-loop control for elasticity?

(Diagram: a controller observes the sampled upper-percentile latency and the current configuration of the SCADS cluster, and issues actions to an executor, which applies them to the cluster.)
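For contrast, a minimal sketch of the kind of naive closed-loop policy this slide questions, reacting directly to the measured upper-percentile latency; the thresholds and the add/remove-one-server actions are illustrative assumptions.

```python
def naive_latency_controller(p99_latency_ms, num_servers,
                             slo_ms=100, low_watermark_ms=50,
                             min_servers=1, max_servers=20):
    """React directly to the observed 99th-percentile latency:
    add a server when above the SLO, remove one when well below it."""
    if p99_latency_ms > slo_ms and num_servers < max_servers:
        return num_servers + 1          # scale up
    if p99_latency_ms < low_watermark_ms and num_servers > min_servers:
        return num_servers - 1          # scale down
    return num_servers                  # stay put
```

Because the 99th percentile is a noisy signal, a rule like this flips between scaling up and scaling down, which the next slides illustrate.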
oscillations from a noisy signal

The 99th-percentile latency is a noisy signal... will smoothing help?

(Plot: raw 99th-percentile latency over time; the signal fluctuates heavily.)
too much smoothing masks the spike

(Plot: heavily smoothed 99th-percentile latency over time; the spike is masked.)
variation for smoothing intervals

(Plot: standard deviation [ms, log scale] versus smoothing interval [min] for SCADS running on Amazon EC2, comparing the raw and smoothed 99th-percentile latency against the raw and smoothed mean latency; even with long smoothing intervals, the 99th percentile remains far noisier than the mean.)
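A minimal sketch of the measurement behind a plot like this: compute the mean and the 99th percentile over windows of increasing length and compare how much each statistic varies from window to window. The synthetic lognormal trace and the window sizes are illustrative stand-ins for real per-request latencies.

```python
import numpy as np

def windowed_stat(latencies_ms, window, stat):
    """Split a latency trace into consecutive windows of `window` samples
    and apply `stat` (e.g. mean or 99th percentile) to each window."""
    n = len(latencies_ms) // window
    chunks = np.reshape(latencies_ms[: n * window], (n, window))
    return np.array([stat(c) for c in chunks])

# Synthetic heavy-tailed latency trace (illustrative only).
rng = np.random.default_rng(0)
trace = rng.lognormal(mean=3.0, sigma=0.8, size=200_000)

for window in (1_000, 10_000, 50_000):   # stand-ins for longer smoothing intervals
    p99 = windowed_stat(trace, window, lambda c: np.percentile(c, 99))
    mean = windowed_stat(trace, window, np.mean)
    print(f"window={window:6d}  std(p99)={p99.std():6.2f}  std(mean)={mean.std():6.2f}")
```

The window-to-window standard deviation of the 99th percentile stays much larger than that of the mean, mirroring the slide's point that smoothing alone does not make the upper percentile a reliable control signal.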
model-predictive control (MPC)

MPC instead of classical closed-loop control:
- the upper-percentile latency is a noisy signal
- use per-server workload as a predictor of upper-percentile latency
- therefore we need a model that predicts SLO violations from the observed workload (workload -> model -> SLO violation)

Reacting with MPC:
- use a model of the system to determine a sequence of actions that changes the state to meet the constraint
- execute the first steps, then re-evaluate
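A minimal sketch of one MPC-style step under the assumptions above: a performance model `predict_violation(config)` that flags configurations expected to violate the SLO, a small set of candidate actions, and a cost that simply counts servers. All of these names are placeholders, not the paper's interfaces.

```python
from itertools import product

def mpc_step(current_config, candidate_actions, predict_violation, horizon=3):
    """One model-predictive-control step (sketch): search short action
    sequences, keep only those the model predicts will satisfy the SLO,
    pick the cheapest one, and execute just its first action."""

    def cost(config):
        return config["num_servers"]          # e.g. number of servers rented

    best_plan, best_cost = None, float("inf")
    for plan in product(candidate_actions, repeat=horizon):
        config = dict(current_config)
        feasible = True
        for action in plan:
            config = action(config)           # simulate the action on the model state
            if predict_violation(config):     # model predicts an SLO violation
                feasible = False
                break
        if feasible and cost(config) < best_cost:
            best_plan, best_cost = plan, cost(config)

    return best_plan[0] if best_plan else None   # execute only the first action, then re-plan

# Illustrative candidate actions:
# add    = lambda c: {**c, "num_servers": c["num_servers"] + 1}
# remove = lambda c: {**c, "num_servers": max(1, c["num_servers"] - 1)}
# stay   = lambda c: c
# next_action = mpc_step({"num_servers": 5}, [add, remove, stay], my_model)
```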
model-predictive control loop

(Diagram: the controller consults performance models and a workload histogram; it receives the smoothed workload, the sampled upper-percentile latency, and the current configuration, and sends actions to the executor, which applies them to the SCADS cluster; the cluster reports sampled workload and latency back.)
building a performance model

Benchmark SCADS servers on Amazon's EC2:
- steady-state model of single-server capacity
- explore the space of possible workloads
- binary classifier: SLO violation or not

(Plot: put workload [req/sec] versus get workload [req/sec] for several get/put mixes (50/50, 80/20, 90/10, 95/5); each benchmarked point is labeled as a violation or no violation, and the classifier separates the two regions.)
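A minimal sketch of such a binary classifier, assuming benchmark points of per-server (get, put) request rates labeled by whether the SLO was violated; the logistic-regression choice and the numbers below are illustrative, not the paper's fitted model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Benchmark observations: per-server (get req/s, put req/s) and whether the
# run violated the SLO. These values are made up for illustration.
X = np.array([[1000,  200], [3000,  400], [6000,  300],
              [7000, 1500], [2000, 2200], [8000,  500]])
y = np.array([0, 0, 0, 1, 1, 1])         # 1 = SLO violation observed

model = LogisticRegression().fit(X, y)

def violates_slo(get_rate, put_rate):
    """Steady-state model: does this per-server workload risk an SLO violation?"""
    return bool(model.predict([[get_rate, put_rate]])[0])

print(violates_slo(4000, 300))   # near the benchmarked safe region, so likely False
```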
how much data to move?

(Plot: workload [requests/sec] over time; a rising workload raises the question of how much data to move off the overloaded server.)
finer-granularity workload monitoring

Need fine-grained workload monitoring:
- data movement especially impacts the tail of the latency distribution
- only move enough data to alleviate the performance issue
- move data quickly
- better for scaling down later

Monitor the workload on small units of data (bins) and move/copy bins between servers; see the sketch after this slide.
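A minimal sketch of per-bin workload monitoring, assuming bins are small contiguous key ranges; the bin boundaries and sampling interval are illustrative.

```python
import bisect
from collections import Counter

class BinWorkloadMonitor:
    """Track request rates at the granularity of small key-range bins."""

    def __init__(self, bin_starts, interval_s=20):
        self.bin_starts = sorted(bin_starts)   # start key of each bin
        self.interval_s = interval_s
        self.counts = Counter()

    def record(self, key):
        bin_idx = max(bisect.bisect_right(self.bin_starts, key) - 1, 0)
        self.counts[bin_idx] += 1

    def per_bin_rates(self):
        """Requests/sec observed for each bin during this interval."""
        return {b: c / self.interval_s for b, c in self.counts.items()}

    def hottest_bins(self, k=5):
        """The k hottest bins: candidates to replicate or move off a server."""
        return self.counts.most_common(k)

    def reset(self):
        self.counts.clear()
```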
summary of approach

Fine-grained monitoring and the performance model together let the controller:
- determine the amount of data to move from an overloaded server
- estimate how much "extra room" an underloaded server has
- know when it is safe to coalesce servers

Replication for predictability and robustness: see the paper and/or tonight's poster session.
controller stages

Stage 1: Replicate. Stage 2: Partition. Stage 3: Allocate servers.

(Diagram, built up over three animation steps: per-bin workload bars on storage nodes N1 through N7 with a workload threshold line; hot bins are replicated and partitioned, then moved to a destination node so that each node stays below the threshold.)
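A minimal sketch of the three stages under the assumptions above: a per-bin rate map, a server-to-bins placement, and a per-server workload threshold from the performance model. The greedy heuristics are illustrative, not the paper's exact policy.

```python
def plan_actions(bin_rates, placement, threshold):
    """Sketch of the controller stages: replicate, partition, allocate servers.
    `bin_rates` maps bin -> requests/sec, `placement` maps server -> list of
    bins, and `threshold` is the per-server rate the model deems safe."""
    actions = []
    load = {s: sum(bin_rates[b] for b in bins) for s, bins in placement.items()}

    # Stage 1: Replicate bins that are individually too hot for one server.
    for b, rate in bin_rates.items():
        if rate > threshold:
            actions.append(("replicate", b, int(rate // threshold)))  # extra replicas

    # Stage 2: Partition: on each overloaded server, peel off the hottest
    # bins until the remaining workload fits under the threshold.
    to_move = []
    for s, bins in placement.items():
        for b in sorted(bins, key=lambda bb: -bin_rates[bb]):
            if load[s] <= threshold:
                break
            to_move.append((b, s))
            load[s] -= bin_rates[b]

    # Stage 3: Allocate servers: place moved bins on servers with headroom,
    # adding new servers only when no existing one can absorb the bin.
    for b, src in to_move:
        dest = next((s for s in load
                     if s != src and load[s] + bin_rates[b] <= threshold), None)
        if dest is None:
            dest = f"new-server-{len(load)}"
            load[dest] = 0.0
        load[dest] += bin_rates[b]
        actions.append(("move", b, dest))

    return actions
```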
experimental results

Experiment setup:
- up to 20 SCADS servers run on m1.small instances on Amazon EC2
- server capacity: 800MB, due to the in-memory restriction
- 5-10 data bins per server
- 100ms SLO on read latency

Workload profiles:
- Hotspot: 100% workload increase in five minutes on a single data item, based on the spike experienced by CNN.com on 9/11
- Diurnal: workload increases during the day and decreases at night; trace replayed at 12x speedup
extra workload directed to a single data item

(Plot: aggregate request rate and per-bin request rate over time [min]; the hot bin's rate rises far above that of the other 199 bins.)
replicating hot data

(Plot: per-bin request rate, number of servers, and 99th-percentile latency [ms] over time [min], showing the system adding servers as the hot bin's rate rises.)
scaling up and down

(Plots: aggregate workload rate [req/s] and number of servers over simulated time [min] for the diurnal workload amplified by 10% and 30%; in both experiments the elastic server count stays close to the "ideal" allocation.)

Over-provisioning tradeoff, savings from elasticity:
- versus provisioning for the known peak: 16%
- versus provisioning with 30% headroom: 41%
cost-risk tradeoff

Over-provisioning allows more time before a violation occurs: a cost-risk tradeoff.

Comparing over-provisioning factors of 30% vs. 10% for the diurnal experiment (recall the SLO parameters: threshold, percentile, interval), the maximum percentile achieved was:

  Interval | 30%  | 10%
  5 min    | 99.5 | 99
  1 min    | 99   | 95
  20 sec   | 95   | 90
conclusion

- Elasticity for storage servers is possible by leveraging cloud computing
- The upper-percentile latency is too noisy; a model-based approach builds a control framework for elasticity subject to a stringent performance SLO
- Finer-grained workload monitoring minimizes the impact of data movement on performance and allows quick responses to workload fluctuations
- Evaluated on EC2 with hotspot and diurnal workloads
increasing replication

(Plot: CDF of 99th-percentile latency [ms] with varying replication: 5 nodes with 1 replica, 10 nodes with 2 replicas, 15 nodes with 3 replicas.)