
Stout: An Adaptive Interface to Scalable Cloud Storage
John Dunagan, John C. McCullough, Alec Wolman, Alex C. Snoeren
UC San Diego / Microsoft Research
June 23, 2010

Scalable Multi-tiered Services
[Figure: application tier ("app") of a scalable multi-tiered service]


Staying Safe: Consistency
◮ Don't reveal uncommitted state
◮ Potential async: inconsistency on failure
◮ Stout provides serialized update semantics
[Figure: app and store; a synchronous write of x=5 vs. a potentially asynchronous write that a failure can leave inconsistent. With Stout, updates such as x=5 are committed asynchronously once per interval.]
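
To make the interval mechanism concrete, here is a minimal sketch in Python; the `store.write_batch` call and all names are assumptions of this sketch, not Stout's actual API. Updates are buffered at the middle tier and committed to the store once per interval, in the order they were issued, so readers never observe uncommitted or out-of-order state.

    import threading
    import time

    class IntervalCommitter:
        """Buffer updates locally and commit them once per interval,
        preserving issue order (serialized update semantics)."""

        def __init__(self, store, interval_s=0.02):
            self.store = store            # hypothetical storage client
            self.interval_s = interval_s  # batching interval, e.g. 20 ms
            self.pending = []             # updates queued since the last commit
            self.lock = threading.Lock()

        def update(self, key, value):
            # Nothing is revealed to the store until the next commit.
            with self.lock:
                self.pending.append((key, value))

        def commit_loop(self):
            # Flush the whole batch once per interval, in order.
            while True:
                time.sleep(self.interval_s)
                with self.lock:
                    batch, self.pending = self.pending, []
                if batch:
                    self.store.write_batch(batch)  # assumed store API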

Benefit: Write Collapsing
◮ Batched commits enable further optimization
◮ Can write most recent version only
◮ Reduces load at the store
[Figure: updates x=5, x=6, x=7 arrive within one interval; only x=7 is written to the store]
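
A sketch of write collapsing under the same assumptions as the committer above: keying the pending batch by document id means successive updates to the same document (x=5, x=6, x=7) collapse into a single write of the latest version at commit time.

    class CollapsingCommitter:
        """Like the interval committer above, but pending writes are keyed
        by document id, so only the most recent version of each document
        is sent to the store."""

        def __init__(self, store):
            self.store = store   # hypothetical storage client
            self.pending = {}    # key -> latest value seen this interval

        def update(self, key, value):
            # x=5, x=6, x=7 within one interval collapse to a single x=7.
            self.pending[key] = value

        def commit(self):
            batch, self.pending = self.pending, {}
            if batch:
                self.store.write_batch(list(batch.items()))  # assumed API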

Outline
1. Introduction
2. Application Structure
3. Adaptive Batching
4. Evaluation

Adapting to Shared Storage
◮ The storage system is a shared medium
◮ Each Stout instance independently converges to an efficient fair share
◮ Delay serves as the congestion indicator, rather than modifying the store to provide explicit notification
[Figure: several Stout app instances sharing one store, with a queue forming at the store]

Delay-based Congestion Control
◮ The bottleneck capacity is unknown
◮ Traditional TCP is signaled by packet loss
◮ Delay-based congestion control is instead triggered by changes in latency
[Figure: a queue building at a router]

Applications to Storage

    Mechanism     Networking (change rate)   Storage (change batch size)
    ACCELERATE    Send faster                Batch less
    BACK-OFF      Send slower                Batch more

Algorithm

    if perf < recent_perf:
        BACK-OFF
    else:
        ACCELERATE

Algorithm: Estimating Storage Performance

    perf = batch_size / (latency + interval)

    if perf < recent_perf:
        BACK-OFF
    else:
        ACCELERATE
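
Read as code, this estimate is just the observed throughput for one batch (the function and argument names are mine, not from the talk):

    def perf_estimate(batch_size, latency_s, interval_s):
        """Items committed per second: the batch size divided by the time
        spent batching (the interval) plus the store's commit latency."""
        return batch_size / (latency_s + interval_s)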

Algorithm: Estimating Storage Capacity

    if backed_off:
        recent_perf = EWMA(batch_size_i) / (EWMA(lat_i) + EWMA(interval_i))
    else:  # accelerated
        recent_perf = MAX_i( batch_size_i / (lat_i + interval_i) )

    if perf < recent_perf:
        BACK-OFF
    else:
        ACCELERATE
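
A sketch of the two estimators of recent performance, assuming recent observations are kept as (batch_size, latency, interval) tuples; the EWMA weight is an arbitrary choice of this sketch.

    def ewma(prev, sample, weight=0.125):
        """Exponentially weighted moving average (weight is an assumed constant)."""
        return (1.0 - weight) * prev + weight * sample

    def recent_perf(samples, backed_off):
        """samples: recent (batch_size, latency_s, interval_s) tuples, oldest first.
        When backed off, smooth each quantity with an EWMA before dividing;
        when accelerating, take the best throughput seen in any single sample."""
        if backed_off:
            b, lat, iv = samples[0]
            for b_i, lat_i, iv_i in samples[1:]:
                b, lat, iv = ewma(b, b_i), ewma(lat, lat_i), ewma(iv, iv_i)
            return b / (lat + iv)
        return max(b_i / (lat_i + iv_i) for b_i, lat_i, iv_i in samples)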

Algorithm: Achieving Fair Share

    if perf < recent_perf:
        BACK-OFF:    interval_{i+1} = (1 + α) · interval_i
    else:
        ACCELERATE:  interval_{i+1} = (1 − β) · interval_i + β · interval_i

[Figure: batching interval vs. time (s)]
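
Putting the pieces together, a sketch of one adaptation step. The back-off rule follows the slide; for the accelerate step this sketch substitutes a simple multiplicative decrease with a floor, and the constants α, β, and the minimum interval are placeholder values, not Stout's.

    ALPHA = 0.1             # back-off: multiplicative increase of the interval (assumed value)
    BETA = 0.1              # accelerate: fractional decrease of the interval (assumed value)
    INTERVAL_MIN_S = 0.005  # assumed floor on the batching interval (5 ms)

    def adapt_interval(interval_s, perf, recent):
        """One AIMD-style step: batch more when performance degrades,
        batch less (send faster) when it does not."""
        if perf < recent:
            # BACK-OFF: send slower / batch more.
            return (1.0 + ALPHA) * interval_s
        # ACCELERATE: send faster / batch less (simplified multiplicative decrease).
        return max(INTERVAL_MIN_S, (1.0 - BETA) * interval_s)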

Outline
1. Introduction
2. Application Structure
3. Adaptive Batching
4. Evaluation

Evaluation
◮ Baseline storage system performance
  ◮ Benefits of batching
  ◮ Benefits of write collapsing
◮ Stout
  ◮ Versus fixed batching intervals
  ◮ Workload variation

Evaluation
◮ Sectioned document store
◮ Our workload:
  ◮ 256-byte documents: IOPS-dominated
  ◮ 50% read, 50% write

Evaluation: Configuration
◮ 50 machines:
  ◮ 1 experiment controller
  ◮ 1 lease manager
  ◮ 12 frontends
  ◮ 32 middle tiers
  ◮ 4 storage nodes (partitioned key-value store backed by MSSQL)
[Figure: 12 frontends (www) → 32 middle-tier app servers → 4 store nodes]

Baseline: Importance of Batching
[Figure: end-to-end latency (ms) vs. load (requests/s), 2k–18k requests/s, comparing no batching with 10 ms and 20 ms batching intervals]
◮ Batching improves performance

Baseline: Importance of Write-Collapsing
[Figure: end-to-end latency (ms) vs. load (requests/s), 4k–20k requests/s, for 10 ms and 20 ms intervals under low and high collapsing]
◮ Low collapsing: 10k documents; high collapsing: 100 documents
◮ Improvement dependent on workload

Evaluation: Stout vs. Fixed Intervals
[Figure: end-to-end latency (ms) vs. load (requests/s), 5k–45k requests/s, for fixed intervals of 20 ms, 40 ms, 80 ms, and 160 ms, and for Stout]
◮ Stout is better than any fixed interval across a wide range of workloads
