Hidden Scalability Gotchas in Memcached and Friends
Neil Gunther, Performance Dynamics
Shanti Subramanyam, Oracle Corporation
Stefan Parvu, Oracle Finland
Velocity 2010 Web Performance and Operations Conference
Scalability
Velocity 2010, June 24
Memcached scale out
• Tier of older servers
• Mostly single CPU
• Single threading ok
Scalability Strategies
• Qualitative scalability
– Scale up, e.g., big SMP servers
– Scale out, e.g., many cheap servers (Unis)
• Quantitative scalability
– What this talk is about
– Need controlled measurements
– Need numbers to see cost-benefit
Been Bad for Web 2.0 Lately
Capacity Planning
• You know you need it
– The planning bit, especially
– Data ain’t information
– Info is hidden in the data
• Just like finance, you need a model
Metrics + Models == Information
Controlled Measurements
Why Controlled Measurements?
Trying to predict scalability by looking at time series data is like trying to predict the stock market by watching the DJX ticker.
Bad Throughput Measurements
Need throughput measured in steady state (which this isn’t).
Need x-axis to be load (N) defined in terms of processes or users.
Average Throughput in Time
This is what steady state looks like as a function of time. It corresponds to ONE throughput load point (N).
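As a rough illustration of how one steady-state run collapses into a single load point, here is a minimal Python sketch. The function name, the warm-up cutoff, and the sample values are all hypothetical; the slides do not say how the warm-up period was trimmed.

```python
from statistics import mean


def steady_state_throughput(samples, warmup):
    """Average the throughput samples left after dropping the warm-up window.

    samples: throughput readings (e.g. ops/sec) taken at a fixed load N.
    warmup:  number of leading samples to discard (an assumed cutoff).
    """
    steady = samples[warmup:]
    if not steady:
        raise ValueError("no samples left after discarding warm-up")
    return mean(steady)


# Hypothetical readings from one run at a single load N.
samples = [1200.0, 5400.0, 9900.0, 10100.0, 10000.0]
point = steady_state_throughput(samples, warmup=2)
```

Repeating this for each load N gives the (N, throughput) pairs that a scalability model can be fitted to.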
Controlled MCD Tests
[Diagram: Load Drivers (2 Sun Fire X4170, 2 sockets, 64 GB), 10 GbE switch, Memcached SUT (Sun Fire X4170, 2 sockets, 64 GB)]
Memcached scaling is thread limited
Better on SPARC Multicore
Quantifying Scalability
Universal Scalability Law (USL)
1. Equal Bang for The Buck
Ideal parallelism
[Plot axes: Capacity, Load]
2. Cost of Sharing Resources
[Plot axes: Capacity, Load]
3. Resource Limitation
Amdahl’s law
[Plot axes: Capacity, Load]
4. Degradation Negative Return
[Plot axes: Capacity, Load]
Universal Scalability Law (USL)
$$C(N) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$$
Concurrency: N. Contention: α. Coherency: β.
Curves: α = 0, β = 0; α > 0, β = 0; α > 0, β > 0
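To make the formula concrete, the following Python sketch evaluates the USL, C(N) = N / (1 + α(N − 1) + βN(N − 1)), for the three parameter regimes named on the slide. The function name and the example parameter values are illustrative assumptions, not numbers from the talk.

```python
def usl_capacity(n, alpha, beta):
    """Relative capacity C(N) under the Universal Scalability Law.

    n:     load (threads, processes, or users)
    alpha: contention parameter
    beta:  coherency parameter
    """
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


# Illustrative parameter values only (not taken from the slides).
regimes = {
    "alpha = 0, beta = 0": (0.0, 0.0),   # ideal linear scaling
    "alpha > 0, beta = 0": (0.1, 0.0),   # contention only
    "alpha > 0, beta > 0": (0.1, 0.02),  # contention plus coherency
}
curves = {
    label: [usl_capacity(n, a, b) for n in range(1, 21)]
    for label, (a, b) in regimes.items()
}
```

With β > 0 the curve reaches a maximum and then falls, which is the negative-return case; with β = 0 it only flattens.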
USL regression in Excel
A miracle happens …
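One way to picture the regression step is the sketch below. It assumes the common linearization of the USL: with x = N − 1 and y = N/C(N) − 1, the model becomes y = βx² + (α + β)x, a quadratic through the origin that ordinary least squares can fit. The slides do not show the actual spreadsheet procedure, so treat this as one plausible approach; the function name and synthetic data are hypothetical.

```python
def fit_usl(loads, throughputs):
    """Estimate (alpha, beta) from measured throughput at each load.

    Assumes loads[0] == 1 so that throughputs[0] can normalize the data
    into relative capacity C(N) = X(N) / X(1).
    """
    if loads[0] != 1:
        raise ValueError("first load point must be N = 1")
    x1 = throughputs[0]
    xs = [n - 1 for n in loads]
    ys = [n / (x / x1) - 1 for n, x in zip(loads, throughputs)]

    # Least squares for y = a*x^2 + b*x (no intercept): solve the
    # 2x2 normal equations directly.
    s4 = sum(x ** 4 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s2 = sum(x ** 2 for x in xs)
    t2 = sum(x * x * y for x, y in zip(xs, ys))
    t1 = sum(x * y for x, y in zip(xs, ys))
    det = s4 * s2 - s3 * s3
    a = (t2 * s2 - t1 * s3) / det
    b = (s4 * t1 - s3 * t2) / det

    beta = a
    alpha = b - a
    return alpha, beta


# Synthetic data generated from known parameters, to check the fit.
loads = list(range(1, 9))
throughputs = [
    1000.0 * n / (1 + 0.03 * (n - 1) + 0.002 * n * (n - 1)) for n in loads
]
alpha, beta = fit_usl(loads, throughputs)
```

On noise-free synthetic data the fit returns the generating parameters; real measurements will scatter around the curve.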
Memcached Scalability
Quantitative USL Analysis
Scalability of mcd 1.2.8
USL curve (not Excel)
Nmax = 7
α = 0.0255, β = 0.0210
Scalability of mcd 1.4.1
USL curve (not Excel)
Nmax = 6
α = 0.0821, β = 0.0207
Scalability of mcd 1.4.5
USL curve (not Excel)
Nmax = 6
α = 0.0988, β = 0.0209
Scalability of SPARC version
USL curves (not Excel)
α = 0, β = 0.000434
α = 0.0041, β = 0.00197, Nmax = 22
USL projected scalability
USL curves (not Excel)
α = 0, β = 0.000434, Nmax = 48
α = 0.0041, β = 0.00197, Nmax = 22
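The projected peaks can be checked against the fitted parameters. The sketch below uses the standard USL result that capacity is maximal at N = sqrt((1 − α)/β), which is not stated on the slide itself; the function name is hypothetical, and rounding down to a whole thread count is an assumption.

```python
import math


def usl_peak_load(alpha, beta):
    """Load at which USL capacity peaks: sqrt((1 - alpha) / beta).

    Returns infinity when beta is zero, since the curve then never
    turns downward.
    """
    if beta <= 0:
        return math.inf
    return math.sqrt((1.0 - alpha) / beta)


# (alpha, beta) pairs quoted on the SPARC slides.
sparc_fits = [(0.0041, 0.00197), (0.0, 0.000434)]

# Round down to a whole number of threads (an assumed convention).
peaks = [math.floor(usl_peak_load(a, b)) for a, b in sparc_fits]
```

Under these assumptions the two parameter sets give peaks of 22 and 48, in line with the Nmax labels on the plots.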
Parameter interpretation
• Why α ~ 0?
– Cache further partitioned
– Single lock replaced by multiple locks
• Why β > 0?
– Is it in mcd code?
– Could it be in O/S, H/W, …?
Scaling Among Friends
Scalability as a function of virtual users (“friends”) not threads
JAppServer USL Analysis
USL curves (not Excel)
N = 700 users: α = 0.00001486, β = 6.7E-9
N = 1200 users: α = 0, β = 2.4E-7
Scalability on Amazon EC2
USL curve (not Excel)
Nmax = 22
α = 0.038988298, β = 0.001432176
Memcached Gotchas
Just throw more hardware at it!
Old scaling rules will be broken
• Current scale-out strategy relies on using older cheap hardware
• Older hardware is often single CPU
– Single-threadedness of mcd is ok
• Newer hardware will be multicore
– New hardware is faster with lots of cores
– But mcd won’t be able to utilize all cores
– Multiple mcd instances are a management headache
Single threading can wreck you
Summary
• Current mcd versions are thread limited
– OK for older uniprocessor servers
– Not OK for deployment on new multicores
– Reason: unused processor capacity costs money
• Controlled measurements
– Not time-series data from prod (but maybe can work)
– Steady state throughput (or pick small prod window)
• Quantify scalability
– Metrics + Models == Information
– Goal is to reduce contention (α) and coherency (β)
– Nmax in mcd: Increased from 6 to 48 threads
Resources
• Neil
– perfdynamics.blogspot.com
– twitter.com/DrQz
– www.perfdynamics.com/books.html
– www.perfdynamics.com/Manifesto/USLscalability.html
• Shanti
– perfwork.wordpress.com
– twitter.com/shantiS
• Stefan
– www.systemdatarecorder.org
– twitter.com/sperformance