

  1. Quantifying Scalability with the USL Baron Schwartz • DataEngConf NYC 2018

  2. Introduction I’ve been focused on databases for about two decades, first as a developer, then a consultant, and now a startup founder. I’ve written High Performance MySQL and several other books, and created a lot of open source software, mostly focused around database monitoring, database operations, and database performance: innotop, Percona Toolkit, etc. I welcome you to get in touch at @xaprb or baron@vividcortex.com. @xaprb 2

  7. Agenda How can you quantify, forecast, and reason about scalability? 1. Queueing theory. In which we discover load 2. Amdahl’s Law. In which we define linearity 3. The Universal Scalability Law (USL). In which Frederick Brooks laughs last 4. Application. In which things are even worse than we thought 5. Profit??? In which we do the impossible @xaprb 3

  8. Queueing Theory In Which We Discover Load

  9. Queueing Theory There’s a branch of operations research called queueing theory. It analyzes the waiting that happens when systems get busy. @xaprb 5

  11. What Causes Queueing? Queueing happens even at low utilization: 1. Irregular arrival timings 2. Irregular job sizes 3. Lost time is lost forever. A queue fundamentally changes how a system works: it increases availability and utilization, increases average residence time, and increases cost/overhead. @xaprb 6

  13. Arrival Rate and Queue Delay Eben Freeman has a great visual that explains how arrival rate λ is related to queueing delay. A request arrives, and the server processes it until it’s finished. The height is the job size, and the width is the service time S. The upper edge of the triangle is the amount of outstanding work to do. @xaprb 7

  14. Another Request Arrives It has to wait in the queue, for wait time W, until the first is done. Then it has its own service time S. Its total residence time is R = W + S. @xaprb 8

  15. An Equation For Queue Wait Eben uses the area under the graph to relate the height of the top edge to the width of the red wait parallelograms. Solving this for W gives an equation for wait time: W = λS² / (2(1 − λS)). This creates the familiar hockey-stick curve of residence time versus utilization ρ. [Plot: Residence Time vs. Utilization] @xaprb 9
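To make that formula concrete, here is a minimal Python sketch of it; the function name and the example values of λ and S are illustrative, not from the talk.

    def queue_wait(lam, s):
        """Mean queue wait W = λS² / (2(1 − λS)), valid while utilization ρ = λS < 1."""
        rho = lam * s
        if rho >= 1:
            raise ValueError("utilization must stay below 1 for the queue to be stable")
        return lam * s ** 2 / (2 * (1 - rho))

    # Wait time grows slowly at first, then explodes as ρ = λS approaches 1.
    for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
        print(f"rho={rho:.2f}  W={queue_wait(rho, 1.0):.2f}")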

  16. Some Implications One of the nice things about this form is that it lets you reason about service time and arrival rate easily: W = λS² / (2(1 − λS)). What if you… double the arrival rate λ? Halve the service time S? @xaprb 10
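As a quick worked check of those two what-ifs, with made-up numbers (a baseline of λ = 0.4 and S = 1.0 is assumed purely for illustration):

    w = lambda lam, s: lam * s ** 2 / (2 * (1 - lam * s))   # same wait-time formula as above

    print(w(0.4, 1.0))   # baseline: ρ = 0.4, W ≈ 0.33
    print(w(0.8, 1.0))   # double λ: ρ = 0.8, W = 2.0, roughly 6x worse
    print(w(0.4, 0.5))   # halve S: ρ = 0.2, W ≈ 0.06, roughly 5x better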

  17. The Hockey Stick Curve The “hockey stick” queueing curve is hard to use in practice. And the sharpness of the “knee” is nonlinear and very hard for humans to intuit. [Plot: Residence Time vs. Utilization] @xaprb 11

  18. Great Truths From Queueing Theory 1. Requests into ~any system have to queue and wait for service. 2. As the system gets busier, queueing escalates suddenly. 3. Queueing is very sensitive to service time and variability. 4. Contention over serialized resources causes nonlinear scaling. The last point is quite a leap, but I’ll explain. @xaprb 12

  19. Amdahl’s Law In Which We Define Scalability

  21. What is Scalability? There’s a mathematical definition of scalability as a function of concurrency. I’ll illustrate it in terms of a parallel processing system that uses concurrency to achieve speedup. @xaprb 14

  22. Linear Scaling Suppose a clustered system can complete X tasks per second with no parallelism. With parallelism, it completes tasks faster, e.g. higher throughput. [Diagram: linear/serial vs. parallel] @xaprb 15

  24. Ideal Linear Scalability Ideally, throughput increases linearly with parallelism. For example, triple the parallelism means 3X as much work completes. [Plot: throughput vs. nodes] @xaprb 16

  25. The Linear Scalability Equation The equation of ideal linear scaling: X(N) = γN, where the slope is γ = X(1). [Plot: throughput vs. nodes] @xaprb 17

  27. But Our Cluster Isn’t Perfect Linear scaling comes from subdividing tasks perfectly. What if a portion isn’t subdividable? [Diagram: linear/serial vs. parallel] @xaprb 18

  29. Amdahl’s Law Describes Serialization Amdahl’s Law describes throughput when a fraction σ can’t be parallelized: X(N) = γN / (1 + σ(N − 1)). Serialization is queueing. [Plot: throughput vs. nodes, ideal linear vs. Amdahl] @xaprb 19

  31. Amdahl’s Law Has An Asymptote X(N) = γN / (1 + σ(N − 1)). Parallelism delivers speedup, but there’s a limit: lim N→∞ X(N) = 1/σ. E.g. a 5% serialized task can’t be sped up more than 20-fold. @xaprb 20
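A tiny Python sketch of that limit; the single-threaded throughput γ = 1000 and serialized fraction σ = 0.05 below are assumed values, not numbers from the talk.

    def amdahl(n, gamma, sigma):
        # Amdahl's Law: X(N) = γN / (1 + σ(N − 1))
        return gamma * n / (1 + sigma * (n - 1))

    gamma, sigma = 1000.0, 0.05          # hypothetical: 1000 tasks/s single-threaded, 5% serialized
    for n in (1, 10, 100, 1000, 10_000):
        print(n, round(amdahl(n, gamma, sigma)))
    # Throughput creeps toward the asymptote γ/σ = 20,000: at most a 20x speedup.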

  32. The Universal Scalability Law (USL) In Which Frederick Brooks Laughs Last

  34. What If Workers Coordinate? Suppose the parallel workers also ask each other for things? They’re making each other do extra work. As load increases, each task’s job gets harder. @xaprb 22

  35. How Bad Is Coordination? N workers = N(N − 1)/2 pairs of interactions, which grows fast: O(N²) in N. @xaprb 23
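A one-line illustration of that quadratic growth (the worker counts are arbitrary examples):

    for n in (2, 4, 8, 16, 32):
        print(n, n * (n - 1) // 2)   # workers → pairwise interactions: 1, 6, 28, 120, 496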

  36. The Universal Scalability Law X(N) = γN / (1 + σ(N − 1) + κN(N − 1)) The USL adds a term for crosstalk, multiplied by the coefficient κ. Crosstalk is also called coordination or coherence penalty. Now there’s a point of diminishing returns! [Plot: throughput vs. nodes] @xaprb 24
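Here is a minimal Python sketch of the USL curve; the coefficients γ, σ, and κ are invented for illustration and simply show the retrograde behavior once crosstalk dominates.

    def usl(n, gamma, sigma, kappa):
        # USL: X(N) = γN / (1 + σ(N − 1) + κN(N − 1))
        return gamma * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))

    gamma, sigma, kappa = 1000.0, 0.05, 0.001    # hypothetical coefficients
    peak_n = max(range(1, 101), key=lambda n: usl(n, gamma, sigma, kappa))
    print(peak_n, round(usl(peak_n, gamma, sigma, kappa)))
    # With κ > 0, throughput peaks (around N ≈ 31 here) and then declines.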

  37. The USL Describes Behavior Under Load The USL explains the highly nonlinear behavior we know systems exhibit near their saturation point. desmos.com/calculator/3cycsgdl0b @xaprb 25

  38. Application In Which Things Are Even Worse Than We Thought

  39. Applying the USL to the Real World Behold, I give you two metrics of concurrency and throughput. What do they mean? @xaprb 27

  40. Let’s Scatterplot Concurrency vs Throughput This is the USL’s input and output. Is it linear? @xaprb 28

  41. It Looks Highly Linear, Doesn’t It? R² = 0.9781 Don’t celebrate yet. @xaprb 29

  42. Fit the USL Equation with Regression [Plot: Modeled and Measured Throughput vs. Concurrency / Load] Now the picture looks totally different! @xaprb 30
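One way to do that fit is nonlinear least squares, sketched here with scipy's curve_fit; the (concurrency, throughput) measurements and the starting guesses are hypothetical, not the data shown on the slide.

    import numpy as np
    from scipy.optimize import curve_fit

    def usl(n, gamma, sigma, kappa):
        # X(N) = γN / (1 + σ(N − 1) + κN(N − 1))
        return gamma * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))

    # Hypothetical measurements: concurrency (load) vs. observed throughput.
    concurrency = np.array([1, 2, 4, 8, 12, 16, 20, 24, 28, 32], dtype=float)
    throughput = np.array([1800, 3500, 6700, 12100, 16200, 19400,
                           21800, 23300, 24100, 24300], dtype=float)

    # Fit γ, σ, κ; non-negative bounds keep the coefficients physically meaningful.
    (gamma, sigma, kappa), _ = curve_fit(
        usl, concurrency, throughput,
        p0=[2000, 0.05, 0.001], bounds=(0, np.inf))

    print(f"gamma={gamma:.1f}  sigma={sigma:.4f}  kappa={kappa:.6f}")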

  43. How Much Headroom Does This System Have? [Plot: Modeled and Measured Throughput vs. Concurrency / Load] There's not much headroom. Just by looking, you can tell this system has maybe 10-15% more to give. @xaprb 31

  44. Profit??? In Which We Do The Impossible

  45. What is the System’s Primary Bottleneck? The regression gives estimates of the USL parameters: X(N) = γN / (1 + σ(N − 1) + κN(N − 1)). The parameters have physical meaning. γ is the throughput of single-threadedness. σ is the fraction that’s serialized/queued. κ is the fraction that’s crosstalk/coherency. @xaprb 33
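Once the coefficients are estimated, the point of diminishing returns follows directly from them; a small sketch using Gunther's formula for the concurrency that maximizes X(N), with placeholder values standing in for the fitted σ and κ.

    import math

    sigma, kappa = 0.02, 0.0005        # placeholder fitted values, not from the talk
    n_peak = math.sqrt((1 - sigma) / kappa)
    print(round(n_peak))               # concurrency where modeled throughput peaks (≈ 44 here)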
