

  1. The SCADS Director: Scaling a Distributed Storage System Under Stringent Performance Requirements Beth Trushkowsky, Peter Bodík, Armando Fox, Michael J. Franklin, Michael I. Jordan, David A. Patterson FAST 2011

  2. elasticity for interactive web apps
     - Interactivity Service-Level Objective: over any 1-minute interval, 99% of … requests are satisfied in less than 100ms
     - [Diagram: clients → web servers → storage]
     - Targeted system features: horizontally scalable, API for data movement, backend for interactive apps
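
The SLO above is a windowed percentile constraint. A minimal sketch of checking it against a request log, assuming an illustrative (timestamp, latency) log format and helper names not taken from the paper:

```python
# SLO parameters from the slide: over any 1-minute interval,
# 99% of requests must complete in under 100 ms.
SLO_WINDOW_SEC = 60
SLO_PERCENTILE = 99.0
SLO_THRESHOLD_MS = 100.0

def slo_satisfied(samples, now):
    """samples: iterable of (timestamp_sec, latency_ms) pairs.
    Returns True if, in the window ending at `now`, at least 99% of
    requests completed in under 100 ms."""
    window = [lat for ts, lat in samples if now - SLO_WINDOW_SEC <= ts <= now]
    if not window:
        return True  # no traffic in the window, nothing to violate
    fast = sum(1 for lat in window if lat < SLO_THRESHOLD_MS)
    return 100.0 * fast / len(window) >= SLO_PERCENTILE

# Example: 1000 fast requests plus 20 slow ones in the last minute.
samples = [(i * 0.05, 20.0) for i in range(1000)] + \
          [(50 + 0.4 * i, 250.0) for i in range(20)]
print(slo_satisfied(samples, now=60.0))  # False: only ~98% finished under 100 ms
```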

  3. wikipedia workload trace, June 2009
     [Figure: request rate over time; spike annotated "Michael Jackson dies"]

  4. overprovisioning
     - overprovision the storage system by 300% to handle the spike (assuming data stored on ten servers)

  5. contributions
     - Cloud computing is a mechanism for storage elasticity
     - Scale up when needed
     - Scale down to save money
     - We address the scaling policy
     - Challenges of latency-based scaling
     - Model-based approach for elasticity to deal with a stringent SLO
     - Fine-grained workload monitoring aids in scaling up and down
     - Show elasticity for both a hotspot and a diurnal workload pattern

  6. SCADS key/value store
     - Features
       - Partitioning (until some minimum data size)
       - Replication
       - Add/remove servers
     - Properties
       - Range-based partitioning
       - Data maintained in memory for performance
       - Eventually consistent (see "SCADS: Scale-independent storage for social computing applications", CIDR'09)
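
Since SCADS partitions the key space by range, a lookup maps a key to the server that owns its range. A minimal sketch of such a partition map, with made-up boundaries and server names rather than SCADS's actual API:

```python
import bisect

class RangePartitionMap:
    """Maps each key to the server responsible for its key range.
    `boundaries` are the sorted lower bounds of each partition."""
    def __init__(self, boundaries, servers):
        assert len(boundaries) == len(servers)
        self.boundaries = boundaries  # e.g. ["a", "h", "p"]
        self.servers = servers        # e.g. ["node1", "node2", "node3"]

    def lookup(self, key):
        # Rightmost partition whose lower bound is <= key; keys below the
        # first boundary fall into the first partition.
        idx = bisect.bisect_right(self.boundaries, key) - 1
        return self.servers[max(idx, 0)]

pm = RangePartitionMap(["a", "h", "p"], ["node1", "node2", "node3"])
print(pm.lookup("cat"))    # node1
print(pm.lookup("mouse"))  # node2
print(pm.lookup("zebra"))  # node3
```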

  7. classical closed-loop control for elasticity?
     [Diagram: a Controller observes the sampled upper %-tile latency and issues config changes; an Action Executor applies the actions to the SCADS cluster, whose latency is sampled and fed back]
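
For contrast with what follows, a classical closed-loop policy would act directly on the measured upper-percentile latency. A minimal sketch of such a threshold rule (the thresholds and server limits are illustrative assumptions):

```python
def classical_latency_controller(latency_p99_ms, num_servers,
                                 upper_ms=100.0, lower_ms=40.0):
    """Naive closed-loop policy: react directly to the sampled 99th
    percentile latency. Because the upper percentile is a noisy signal,
    this tends to oscillate (the problem the next slides discuss)."""
    if latency_p99_ms > upper_ms and num_servers < 20:
        return num_servers + 1   # scale up
    if latency_p99_ms < lower_ms and num_servers > 1:
        return num_servers - 1   # scale down
    return num_servers           # hold steady

# Noisy latency samples drive the cluster size up and down repeatedly.
servers = 5
for p99 in [120, 35, 130, 30, 125]:
    servers = classical_latency_controller(p99, servers)
    print(p99, "->", servers)
```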

  8. oscillations from a noisy signal
     - Noisy signal… will smoothing help?
     [Figure: 99th %-tile latency over time]

  9. too much smoothing masks spike
     [Figure: 99th %-tile latency over time]

  10. variation for smoothing intervals
     [Figure: standard deviation [ms] (log scale) of the raw 99th %-tile latency and the raw mean latency vs. smoothing interval [min]; SCADS running on Amazon EC2]
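
A rough sketch of the kind of measurement behind this figure: smooth a noisy latency signal with progressively longer windows and look at how much variation remains. The synthetic signals below are stand-ins; the slide's numbers come from benchmarking SCADS on EC2.

```python
import random
import statistics

def smooth(signal, window):
    """Trailing moving average with the given window length (in samples)."""
    return [statistics.mean(signal[max(0, i - window + 1):i + 1])
            for i in range(len(signal))]

random.seed(0)
# Synthetic stand-ins: a mean-latency signal with small noise and a
# 99th-percentile signal with occasional large excursions.
mean_lat = [20 + random.gauss(0, 2) for _ in range(900)]
p99_lat = [60 + random.gauss(0, 10) + (150 if random.random() < 0.02 else 0)
           for _ in range(900)]

# Columns: window [s], std of smoothed mean, std of smoothed 99th %-tile.
for window_sec in [1, 60, 300, 900]:
    print(window_sec,
          round(statistics.stdev(smooth(mean_lat, window_sec)), 2),
          round(statistics.stdev(smooth(p99_lat, window_sec)), 2))
# For this synthetic data the smoothed 99th-percentile signal stays much
# noisier than the smoothed mean; and, as the previous slides note, long
# windows also mask real spikes.
```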

  11. model-predictive control (MPC)
     - MPC instead of classical closed-loop
       - Upper %-tile latency is a noisy signal
       - Use per-server workload as predictor of upper %-tile latency
       - Therefore need a model that predicts SLO violations based on observed workload (workload → Model → SLO violation?)
     - Reacting with MPC
       - Use a model of the system to determine a sequence of actions to change state to meet the constraint
       - Execute the first steps, then re-evaluate
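
A minimal sketch of one MPC iteration, with an illustrative stand-in for the model and a placeholder action name (neither is the Director's actual implementation):

```python
class WorkloadModel:
    """Illustrative stand-in for the performance model: predicts an SLO
    violation when a server's request rate exceeds a benchmarked capacity."""
    def __init__(self, capacity_reqs_per_sec=6000):
        self.capacity = capacity_reqs_per_sec

    def predict_violation(self, server_reqs_per_sec):
        return server_reqs_per_sec > self.capacity

def mpc_step(per_server_workload, model):
    """One iteration of model-predictive control: plan actions that the
    model predicts will remove SLO violations, but return only the first
    planned action for execution; the controller re-observes the workload
    and re-plans on the next iteration."""
    actions = []
    for server, rate in enumerate(per_server_workload):
        if model.predict_violation(rate):
            # Plan: move some of this server's data onto a new server.
            actions.append(("add_server_and_move_data", server))
    return actions[:1]  # execute just the first step, then re-evaluate

model = WorkloadModel()
print(mpc_step([3000, 9000, 2500], model))  # [('add_server_and_move_data', 1)]
```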

  12. model-predictive control loop
     [Diagram: the Controller combines Performance Models with a smoothed Workload Histogram and the sampled upper %-tile latency to choose a config; the Action Executor applies the resulting actions to the SCADS cluster, which reports sampled workload and latency back into the loop]

  13. building a performance model
     - Benchmark SCADS servers on Amazon's EC2
     - Steady-state model: single server capacity
     - Explore space of possible workload
     - Binary classifier: SLO violation or not
     [Figure: put workload [req/sec] vs. get workload [req/sec] for several get/put mixes (50/50, 80/20, 90/10, 95/5), with each benchmarked point labeled "Violation" or "No violation"]
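
A sketch of the steady-state model as a binary classifier over the (get rate, put rate) plane. The linear capacity boundary and the benchmark numbers below are made-up placeholders, not the paper's fitted EC2 model:

```python
def fit_linear_boundary(samples):
    """Fit a crude linear boundary get/cap_get + put/cap_put > 1 by taking,
    for each axis, the largest rate observed without a violation.
    `samples` is a list of (get_rate, put_rate, violated) benchmark points."""
    cap_get = max((g for g, p, v in samples if not v and p == 0), default=1.0)
    cap_put = max((p for g, p, v in samples if not v and g == 0), default=1.0)
    return cap_get, cap_put

def predicts_violation(get_rate, put_rate, cap_get, cap_put):
    # Steady-state model: a server is predicted to violate the SLO when its
    # combined get/put load crosses the benchmarked capacity line.
    return get_rate / cap_get + put_rate / cap_put > 1.0

# Made-up benchmark points: (get req/sec, put req/sec, SLO violated?)
bench = [(6000, 0, False), (8000, 0, True), (0, 1500, False), (0, 2200, True),
         (4000, 800, False), (5000, 1200, True)]
cap_get, cap_put = fit_linear_boundary(bench)
print(predicts_violation(3000, 500, cap_get, cap_put))   # False: under capacity
print(predicts_violation(5500, 1000, cap_get, cap_put))  # True: over the line
```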

  14. how much data to move?
     [Figure: workload (requests/sec) over time]

  15. finer-granularity workload monitoring
     - Need fine-grained workload monitoring
     - Data movement especially impacts tail of latency distribution
     - Only move enough data to alleviate performance issues
     - Move data quickly
     - Better for scaling down later
     - Monitor workload on small units of data (bins)
     - Move/copy bins between servers
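
A minimal sketch of per-bin workload monitoring, assuming bins are small contiguous key ranges; the bin boundaries, class, and method names are illustrative:

```python
import bisect
from collections import Counter

class BinMonitor:
    """Counts requests per data bin so the controller can see which small
    key ranges are hot. Bin boundaries and keys here are illustrative."""
    def __init__(self, bin_boundaries):
        self.bin_boundaries = sorted(bin_boundaries)  # lower bound of each bin
        self.counts = Counter()

    def bin_for(self, key):
        # Rightmost bin whose lower bound is <= key (keys below the first
        # boundary fall into bin 0).
        return max(bisect.bisect_right(self.bin_boundaries, key) - 1, 0)

    def record(self, key):
        self.counts[self.bin_for(key)] += 1

    def rates(self, interval_sec):
        """Per-bin request rate over the last monitoring interval, then reset."""
        rates = {b: c / interval_sec for b, c in self.counts.items()}
        self.counts.clear()
        return rates

mon = BinMonitor(["a", "f", "m", "s"])
for key in ["apple", "melon", "melon", "mango", "tomato"]:
    mon.record(key)
print(mon.rates(interval_sec=1.0))  # {0: 1.0, 2: 3.0, 3: 1.0}
```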

  16. summary of approach
     - Fine-grained monitoring and performance model
       - Determine amount of data to move from an overloaded server
       - Estimate how much "extra room" an underloaded server has
       - Know when safe to coalesce servers
     - Replication for predictability and robustness
     - See paper and/or tonight's poster session

  17. controller stages
     Stage 1: Replicate → Stage 2: Partition → Stage 3: Allocate servers
     [Diagram: per-bin workload compared against a workload threshold, with bins assigned to storage nodes N1–N7]

  18. controller stages
     Stage 1: Replicate → Stage 2: Partition → Stage 3: Allocate servers
     [Diagram: as above, with a destination server marked for bins being moved]

  19. controller stages
     Stage 1: Replicate → Stage 2: Partition → Stage 3: Allocate servers
     [Diagram: as above, showing the resulting placement of bins across storage nodes N1–N7]
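
A sketch of the three stages above, operating on per-bin request rates. The threshold, the capacity number, the halving of a replicated bin's rate, and the greedy first-fit packing are illustrative simplifications, not the Director's actual policy:

```python
def controller_step(bin_rates, server_capacity, hot_threshold):
    """Illustrative three-stage controller:
    Stage 1 (Replicate): bins whose rate exceeds the workload threshold get a
        second replica (assumed to split that bin's read load evenly).
    Stage 2 (Partition): every bin or replica becomes a movable unit.
    Stage 3 (Allocate servers): greedily pack units onto servers so that no
        server exceeds its modeled capacity (a real policy would also keep
        replicas of the same bin on distinct servers)."""
    # Stages 1 + 2: build the list of movable units.
    units = []
    for name, rate in bin_rates.items():
        replicas = 2 if rate > hot_threshold else 1
        units.extend([(name, rate / replicas)] * replicas)

    # Stage 3: first-fit-decreasing packing onto as few servers as possible.
    servers = []  # each entry is [load, list_of_bin_names]
    for name, rate in sorted(units, key=lambda u: -u[1]):
        for srv in servers:
            if srv[0] + rate <= server_capacity:
                srv[0] += rate
                srv[1].append(name)
                break
        else:
            servers.append([rate, [name]])
    return servers

# Example: bin1 is hot and gets replicated; the other bins are packed around it.
bins = {"bin0": 900, "bin1": 5000, "bin2": 300, "bin3": 700}
for load, names in controller_step(bins, server_capacity=3000, hot_threshold=2000):
    print(int(load), names)
```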

  20. experimental results
     - Experiment setup
       - Up to 20 SCADS servers run on m1.small instances on Amazon EC2
       - Server capacity: 800MB, due to in-memory restriction
       - 5-10 data bins per server
       - 100ms SLO on read latency
     - Workload profiles
       - Hotspot: 100% workload increase in five minutes on a single data item, based on the spike experienced by CNN.com on 9/11
       - Diurnal: workload increases during the day, decreases at night; replayed trace at 12x speedup

  21. extra workload directed to single data item
     [Figure: per-bin request rate and aggregate request rate over time [min], showing the hot bin vs. the other 199 bins]

  22. replicating hot data
     [Figure: per-bin request rate, 99th %-tile latency [ms], and number of servers over time [min]]

  23. scaling up and down
     - Number of servers in the two experiments is close to "ideal"
     - Over-provisioning tradeoff: amplify workload by 10%, 30%
     - Savings: known peak: 16%; 30% headroom: 41%
     [Figure: aggregate workload rate [req/s] and number of servers over simulated time [min], comparing "elastic 30%", "elastic 10%", and "ideal"]

  24. cost-risk tradeoff
     - Over-provisioning allows more time before a violation occurs
     - Cost-risk tradeoff: comparing over-provisioning for the diurnal experiment
     - Recall SLO parameters: threshold, percentile, interval
     - Over-provisioning factor of 30% vs 10%:

       Max percentile achieved
       Interval   30%    10%
       5 min      99.5   99
       1 min      99     95
       20 sec     95     90
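
One way to read "max percentile achieved" is: for a given interval length, the worst per-interval fraction of requests that met the 100ms threshold. A minimal sketch over a toy request log (non-overlapping intervals for simplicity; a sliding window over "any" interval would be stricter):

```python
def max_percentile_achieved(requests, interval_sec, threshold_ms=100.0):
    """requests: list of (timestamp_sec, latency_ms).
    Returns the largest percentile p such that, in every interval of the
    given length, at least p% of requests finished under threshold_ms."""
    if not requests:
        return 100.0
    end = max(ts for ts, _ in requests)
    worst = 100.0
    t = 0.0
    while t <= end:
        window = [lat for ts, lat in requests if t <= ts < t + interval_sec]
        if window:
            frac = 100.0 * sum(lat < threshold_ms for lat in window) / len(window)
            worst = min(worst, frac)
        t += interval_sec
    return worst

# Toy log: steady fast requests with one slow burst around t=70s.
log = [(t, 20.0) for t in range(0, 300)] + [(70.5, 300.0), (70.6, 300.0)]
print(max_percentile_achieved(log, interval_sec=60))   # ~96.8 over the worst minute
print(max_percentile_achieved(log, interval_sec=300))  # ~99.3 over the whole run
```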

  25. conclusion
     - Elasticity for storage servers is possible by leveraging cloud computing
     - Upper-percentile latency is too noisy; model-based approach to build a control framework for elasticity subject to a stringent performance SLO
     - Finer-grained workload monitoring
       - Minimize the impact of data movement on performance
       - Respond quickly to workload fluctuations
     - Evaluated on EC2 with hotspot and diurnal workloads

  26. increasing replication
     [Figure: CDF of 99th percentile latency [ms] with varying replication: 5 nodes / 1 replica, 10 nodes / 2 replicas, 15 nodes / 3 replicas]
