Quantifying Scalability with the USL Baron Schwartz • DataEngConf NYC 2018

Introduction
I’ve been focused on databases for about two decades, first as a developer, then a consultant, and now a startup founder. I’ve written High Performance MySQL and several other books, and created a lot of open source software, mostly focused around database monitoring, database operations, and database performance: innotop, Percona Toolkit, etc. I welcome you to get in touch at @xaprb or baron@vividcortex.com.

Agenda
How can you quantify, forecast, and reason about scalability?
1. Queueing theory. In which we discover load
2. Amdahl’s Law. In which we define linearity
3. The Universal Scalability Law (USL). In which Frederick Brooks laughs last
4. Application. In which things are even worse than we thought
5. Profit??? In which we do the impossible

Queueing Theory In Which We Discover Load

Queueing Theory
There’s a branch of operations research called queueing theory. It analyzes the waiting that happens when systems get busy.

What Causes Queueing?
Queueing happens even at low utilization:
1. Irregular arrival timings
2. Irregular job sizes
3. Lost time is lost forever
A queue fundamentally changes how a system works:
Increases availability and utilization
Increases average residence time
Increases cost/overhead

Arrival Rate and Queue Delay
Eben Freeman has a great visual that explains how arrival rate λ is related to queueing delay.
A request arrives, and the server processes it until it’s finished.
The height is the job size, and the width is the service time S.
The upper edge of the triangle is the amount of outstanding work to do.

Another Request Arrives
It has to wait in the queue for time W until the first is done.
Then it has service time S too.
Its total residence time is R = W + S.

An Equation For Queue Wait
Eben uses the area under the graph to relate the height of the top edge to the width of the red wait parallelograms. Solving this for W gives an equation for wait time:
$$W = \frac{\lambda S^2}{2(1 - \lambda S)}$$
This creates the familiar hockey stick curve, shown here in terms of utilization ρ.
[Figure: residence time versus utilization ρ, from 0.0 to 1.0]

Some Implications
One of the nice things about this form is that it lets you reason about service time and arrival rate easily:
$$W = \frac{\lambda S^2}{2(1 - \lambda S)}$$
What if you…
double the arrival rate λ
halve the service time S
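To make the two what-if questions concrete, here is a minimal sketch that evaluates the wait-time equation above. The function name `queue_wait` and the sample values (λ = 0.4 requests per second, S = 1 second) are assumptions chosen for illustration; they do not come from the talk.

```python
def queue_wait(arrival_rate, service_time):
    """Average queue wait W = lambda * S^2 / (2 * (1 - lambda * S))."""
    utilization = arrival_rate * service_time  # rho = lambda * S
    if utilization >= 1:
        raise ValueError("utilization must be below 1 for a stable queue")
    return arrival_rate * service_time ** 2 / (2 * (1 - utilization))


# Illustrative baseline: lambda = 0.4 requests/s, S = 1 s (assumed values).
base = queue_wait(0.4, 1.0)
double_arrivals = queue_wait(0.8, 1.0)  # double the arrival rate
half_service = queue_wait(0.4, 0.5)     # halve the service time
print(base, double_arrivals, half_service)
```

With these assumed numbers, doubling the arrival rate makes the wait roughly six times longer, while halving the service time cuts it by more than a factor of five, because S appears squared in the numerator and again in the utilization term.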

The Hockey Stick Curve
The “hockey stick” queueing curve is hard to use in practice. And the sharpness of the “knee” is nonlinear and very hard for humans to intuit.
[Figure: residence time versus utilization, from 0.0 to 1.0]

Great Truths From Queueing Theory
1. Requests into ~any system have to queue and wait for service.
2. As the system gets busier, queueing escalates suddenly.
3. Queueing is very sensitive to service time and variability.
4. Contention over serialized resources causes nonlinear scaling.
The last point is quite a leap, but I’ll explain.

Amdahl’s Law In Which We Define Scalability

What is Scalability?
There’s a mathematical definition of scalability as a function of concurrency.
I’ll illustrate it in terms of a parallel processing system that uses concurrency to achieve speedup.

Linear Scaling
Suppose a clustered system can complete X tasks per second with no parallelism. With parallelism, it completes tasks faster, i.e. it has higher throughput.
[Figure: serial versus parallel execution]

Ideal Linear Scalability
Ideally, throughput increases linearly with parallelism.
For example, tripling the parallelism means three times as much work completes: throughput goes from X to 3X.
[Figure: throughput versus nodes]

The Linear Scalability Equation
The equation of ideal linear scaling:
$$X(N) = \frac{\gamma N}{1}$$
where the slope is γ = X(1).
[Figure: throughput versus nodes]

But Our Cluster Isn’t Perfect
Linear scaling comes from subdividing tasks perfectly.
What if a portion isn’t subdividable?
[Figure: serial versus parallel execution]

Amdahl’s Law Describes Serialization
$$X(N) = \frac{\gamma N}{1 + \sigma (N - 1)}$$
Amdahl’s Law describes throughput when a fraction σ can’t be parallelized.
Serialization is queueing.
[Figure: throughput versus nodes]

Amdahl’s Law Has An Asymptote
$$X(N) = \frac{\gamma N}{1 + \sigma (N - 1)}$$
Parallelism delivers speedup, but there’s a limit:
$$\lim_{N \to \infty} X(N) = \frac{\gamma}{\sigma}$$
Relative to the single-node throughput X(1) = γ, the speedup can never exceed 1/σ, e.g. a 5% serialized task can’t be sped up more than 20-fold.
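The asymptote is easy to see numerically. The sketch below evaluates Amdahl’s Law as written above; the function name `amdahl_throughput` and the value γ = 1000 are assumptions for illustration, while σ = 0.05 matches the 5% example.

```python
def amdahl_throughput(n, gamma, sigma):
    """Throughput X(N) = gamma * N / (1 + sigma * (N - 1))."""
    return gamma * n / (1 + sigma * (n - 1))


# gamma is an assumed single-node throughput; sigma = 0.05 is the 5% example.
gamma, sigma = 1000.0, 0.05
for n in (1, 10, 100, 10_000):
    speedup = amdahl_throughput(n, gamma, sigma) / gamma
    print(n, round(speedup, 2))
```

However large N gets, the printed speedup approaches 20 without reaching it.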

The Universal Scalability Law (USL) In Which Frederick Brooks Laughs Last

What If Workers Coordinate?
Suppose the parallel workers also ask each other for things?
They’re making each other do extra work. As load increases, each task’s job gets harder.

How Bad Is Coordination?
N workers = N(N − 1) pairs of interactions, which grows fast: O(N²) in N.

The Universal Scalability Law
$$X(N) = \frac{\gamma N}{1 + \sigma (N - 1) + \kappa N (N - 1)}$$
The USL adds a term for crosstalk, multiplied by the κ coefficient.
Crosstalk is also called coordination or coherence penalty.
Now there’s a point of diminishing returns!
[Figure: throughput versus nodes]
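A short sketch can locate that point. The closed form used in `peak_concurrency` is not on the slide; it follows from setting the derivative of X(N) to zero, which leaves 1 − σ − κN² = 0. The function names and the parameter values are assumptions for illustration.

```python
import math


def usl_throughput(n, gamma, sigma, kappa):
    """X(N) = gamma * N / (1 + sigma * (N - 1) + kappa * N * (N - 1))."""
    return gamma * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def peak_concurrency(sigma, kappa):
    """N at which X(N) is largest, from solving dX/dN = 0."""
    return math.sqrt((1 - sigma) / kappa)


# Illustrative parameters only, not measurements from the talk.
gamma, sigma, kappa = 1000.0, 0.02, 0.0005
n_peak = peak_concurrency(sigma, kappa)
print(n_peak, usl_throughput(n_peak, gamma, sigma, kappa))
```

Beyond `n_peak`, adding workers lowers throughput instead of raising it, which Amdahl’s Law alone cannot express.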

The USL Describes Behavior Under Load
The USL explains the highly nonlinear behavior we know systems exhibit near their saturation point.
desmos.com/calculator/3cycsgdl0b

Application In Which Things Are Even Worse Than We Thought

Applying the USL to the Real World
Behold, I give you two metrics of concurrency and throughput. What do they mean?

Let’s Scatterplot Concurrency vs Throughput
This is the USL’s input and output. Is it linear?

It Looks Highly Linear, Doesn’t It?
R² = 0.9781
Don’t celebrate yet.
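A high R² does not rule out USL-shaped behavior. As a rough illustration, the sketch below generates synthetic throughput from the USL equation and fits a straight line to it. The data is invented (γ = 1000, σ = 0.02, κ = 0.0005 over loads 1 to 20), so the resulting R² is not the figure from the talk.

```python
def usl_throughput(n, gamma, sigma, kappa):
    """X(N) = gamma * N / (1 + sigma * (N - 1) + kappa * N * (N - 1))."""
    return gamma * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def linear_r_squared(xs, ys):
    """R^2 of an ordinary least-squares line fitted to (xs, ys)."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic, noise-free data from assumed USL parameters.
loads = list(range(1, 21))
throughputs = [usl_throughput(n, 1000.0, 0.02, 0.0005) for n in loads]
r2 = linear_r_squared(loads, throughputs)
print(r2)
```

The line fits well even though the underlying curve is bending toward a peak, so a good linear fit over the measured range says little about what happens at higher load.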

Fit the USL Equation with Regression
[Figure: measured and modeled throughput versus concurrency/load]
Now the picture looks totally different!
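The slides do not say which regression method was used. One possible approach, sketched here under that caveat, rewrites the USL as N/X = 1/γ + (σ/γ)(N − 1) + (κ/γ)N(N − 1), which is linear in its three coefficients, and solves it by ordinary least squares. The helper names `solve3` and `fit_usl` and the synthetic data points are assumptions for illustration.

```python
def usl_throughput(n, gamma, sigma, kappa):
    """X(N) = gamma * N / (1 + sigma * (N - 1) + kappa * N * (N - 1))."""
    return gamma * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in reversed(range(3)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x


def fit_usl(points):
    """Fit N/X = a0 + a1*(N-1) + a2*N*(N-1); return (gamma, sigma, kappa)."""
    rows = [(1.0, n - 1.0, n * (n - 1.0)) for n, _ in points]
    ys = [n / x for n, x in points]
    # Normal equations: (A^T A) coeffs = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    a0, a1, a2 = solve3(ata, aty)
    return 1.0 / a0, a1 / a0, a2 / a0


# Synthetic (concurrency, throughput) pairs from assumed parameters.
points = [(n, usl_throughput(n, 1000.0, 0.05, 0.001)) for n in (1, 2, 4, 8, 16, 32)]
gamma, sigma, kappa = fit_usl(points)
print(gamma, sigma, kappa)
```

On noise-free data this recovers the generating parameters. Real measurements are noisy, and the linearized form weights errors differently from a direct nonlinear fit of X(N), so treat it as a starting point and not as the method behind the chart.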

How Much Headroom Does This System Have?
[Figure: measured and modeled throughput versus concurrency/load]
There’s not much headroom. Just by looking, you can tell this system has maybe 10-15% more to give.

Profit??? In Which We Do The Impossible

What is the System’s Primary Bottleneck?
The regression gives estimates of the USL parameters.
$$X(N) = \frac{\gamma N}{1 + \sigma (N - 1) + \kappa N (N - 1)}$$
The parameters have physical meaning.
γ is the throughput of single-threadedness.
σ is the fraction that’s serialized/queued.
κ is the fraction that’s crosstalk/coherency.
