Top Performance Myths and Folklore Martin Thompson - @mjpt777
Top 10 Performance Mistakes Martin Thompson - @mjpt777
10
Not Upgrading
9
Duplicated Work
Database Tuning?
Where is the real issue?
8
Data Dependent Loads
Aka “Pointer Chasing”
Are all memory operations equal?
Sequential Access - Average time in ns/op to sum all longs in a 1GB array?
Access Pattern Benchmark: testSequential, avgt, 0.832 ± 0.006 ns/op (~1 ns/op)
Really??? Less than 1ns per operation?
Random walk per OS Page - Average time in ns/op to sum all longs in a 1GB array?
Access Pattern Benchmark: testRandomPage, avgt, 2.703 ± 0.025 ns/op (~3 ns/op)
Data dependent walk per OS Page - Average time in ns/op to sum all longs in a 1GB array?
Access Pattern Benchmark: testDependentRandomPage, avgt, 7.102 ± 0.326 ns/op (~7 ns/op)
Random heap walk - Average time in ns/op to sum all longs in a 1GB array?
Access Pattern Benchmark: testRandomHeap, avgt, 19.896 ± 3.110 ns/op (~20 ns/op)
Data dependent heap walk - Average time in ns/op to sum all longs in a 1GB array?
Access Pattern Benchmark: testDependentRandomHeap, avgt, 89.516 ± 4.573 ns/op (~90 ns/op)
Need to ADD 40+ ns/op for NUMA access on a server!!!
Access Pattern Benchmark
Benchmark                 Mode   Score     Error   Units
testSequential            avgt    0.832 ±  0.006   ns/op
testRandomPage            avgt    2.703 ±  0.025   ns/op
testDependentRandomPage   avgt    7.102 ±  0.326   ns/op
testRandomHeap            avgt   19.896 ±  3.110   ns/op
testDependentRandomHeap   avgt   89.516 ±  4.573   ns/op
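The gap between the first and last rows comes down to whether the address of the next load is known before the previous load completes. A minimal sketch of the two extremes, not the benchmark from the talk: the class and method names are hypothetical, and a real measurement would need a harness such as JMH and an array far larger than the CPU caches.

```java
import java.util.Random;

// Hypothetical illustration; not the benchmark code behind the table.
final class AccessPatterns {
    // Sequential: every address is known in advance, so the hardware
    // prefetcher can stream cache lines ahead of the loop.
    static long sumSequential(long[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Data dependent ("pointer chasing"): the next index is itself loaded
    // from memory at the current index, so each load waits for the last.
    static long sumDependent(long[] data, int[] next) {
        long sum = 0;
        int index = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[index];
            index = next[index];
        }
        return sum;
    }

    // Builds a random single-cycle permutation (Sattolo's algorithm) so the
    // dependent walk visits every element exactly once.
    static int[] randomCycle(int length, long seed) {
        int[] next = new int[length];
        for (int i = 0; i < length; i++) {
            next[i] = i;
        }
        Random random = new Random(seed);
        for (int i = length - 1; i > 0; i--) {
            int j = random.nextInt(i); // 0 <= j < i
            int tmp = next[i];
            next[i] = next[j];
            next[j] = tmp;
        }
        return next;
    }
}
```

Both methods compute the same sum; only the order and predictability of the loads differ.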
What does this mean for data structures?
[Diagram: hash map whose Buckets array points to separately allocated, linked nodes, each holding Key, Value, Hash, Next. Entries: 1 = EUR/USD, 2 = GBP/EUR, 3 = GBP/USD]
[Diagram: array-backed hash map. Buckets hold indices into the entries (here 1 and 0); a Next of -1 ends a chain]
Key   Value     Hash   Next
1     EUR/USD   4       2
2     GBP/EUR   2      -1
3     GBP/USD   4      -1
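The array-backed layout can be sketched in Java with parallel primitive arrays, so that following a chain means reading an int index rather than dereferencing a node object. This is an illustrative sketch only, not the .NET implementation: the class name `IndexedMap`, the fixed capacity, the hash function, and the use of -1 for an empty bucket are all assumptions, and resizing and removal are left out.

```java
import java.util.Arrays;

// Hypothetical fixed-capacity sketch: no resizing, no removal.
final class IndexedMap {
    private final int[] buckets;   // index of first entry per bucket, -1 if empty
    private final int[] hashes;    // cached hash of each entry
    private final int[] nexts;     // index of next entry in the chain, -1 at the end
    private final int[] keys;
    private final String[] values; // still references; the strings live elsewhere
    private int size;

    IndexedMap(int capacity) {
        buckets = new int[capacity];
        Arrays.fill(buckets, -1);
        hashes = new int[capacity];
        nexts = new int[capacity];
        keys = new int[capacity];
        values = new String[capacity];
    }

    // Simple multiplicative mix, masked to stay non-negative.
    private static int mix(int key) {
        return (key * 0x9E3779B9) & 0x7FFFFFFF;
    }

    void put(int key, String value) {
        int h = mix(key);
        int bucket = h % buckets.length;
        for (int i = buckets[bucket]; i >= 0; i = nexts[i]) {
            if (hashes[i] == h && keys[i] == key) {
                values[i] = value; // overwrite existing key
                return;
            }
        }
        if (size == keys.length) {
            throw new IllegalStateException("map is full");
        }
        int index = size++;
        hashes[index] = h;
        keys[index] = key;
        values[index] = value;
        nexts[index] = buckets[bucket]; // new entry becomes head of the chain
        buckets[bucket] = index;
    }

    String get(int key) {
        int h = mix(key);
        for (int i = buckets[h % buckets.length]; i >= 0; i = nexts[i]) {
            if (hashes[i] == h && keys[i] == key) {
                return values[i];
            }
        }
        return null;
    }
}
```

Entries are appended in insertion order, so they sit next to each other in memory instead of being scattered across the heap as individual node objects.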
.NET Dictionary is >10x faster than Java's HashMap for 2+ GB of data
Understand object relationships and then choose appropriate data structures
Java desperately needs Value Types on the stack and Aggregates on the heap
Data Structures are becoming ever more important again!
7
Too Much Allocation
“Allocation is free…”
Reclamation is NOT free!
Remember Data Dependent Loads?
Too much allocation or copying will wash out your cache
6
Going Parallel
http://www.frankmcsherry.org/assets/COST.pdf
[Chart: Amdahl's Law. Speedup (0 to 20) against Processors (1 to 1024)]
[Chart: Universal Scalability Law. Speedup (0 to 20) against Processors (1 to 1024), comparing the Amdahl and USL curves]
Universal Scalability Law
C(N) = N / (1 + α(N - 1) + βN(N - 1))
C = capacity or throughput
N = number of processors
α = contention penalty
β = coherence penalty
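The formula is easy to evaluate directly. A small sketch, with hypothetical class and method names; Amdahl's Law is included for comparison, since it matches the USL when β = 0 and α is taken as the serial fraction.

```java
// Hypothetical helper; names are illustrative only.
final class Scalability {
    // Universal Scalability Law: C(N) = N / (1 + α(N - 1) + βN(N - 1))
    static double usl(double n, double alpha, double beta) {
        return n / (1.0 + alpha * (n - 1.0) + beta * n * (n - 1.0));
    }

    // Amdahl's Law: speedup = 1 / (s + (1 - s) / N), s = serial fraction
    static double amdahl(double n, double serialFraction) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / n);
    }
}
```

With any β greater than zero the coherence term grows with N squared, so throughput eventually falls as processors are added, whereas the Amdahl curve only flattens.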
Shared mutable state is Evil!
“You can have a second computer once you’ve shown you know how to use the first one” – Paul Barham
“You can have a second CPU once you’ve shown you know how to use the first one” – Martin Thompson
5
Not Understanding TCP
[Diagram: TCP - Sequenced Flow between Client and Server. Message sequence: SYN; SYN, ACK; ACK; Data == MSS; Delayed ACK; Data < MSS]
[Diagram: TCP - Sequenced Flow with TCP_NODELAY between Client and Server. Message sequence: SYN; SYN, ACK; ACK; Data == MSS; Data < MSS; ACK]
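TCP_NODELAY disables Nagle's algorithm, so a final segment smaller than the MSS is sent immediately rather than waiting for the peer's delayed ACK. In Java the option is set per socket; a minimal sketch, where the class and method names are hypothetical and the socket is left unconnected:

```java
import java.io.IOException;
import java.net.Socket;

// Hypothetical helper; a real client would go on to connect the socket.
final class NoDelay {
    // Returns an unconnected socket with Nagle's algorithm disabled.
    static Socket newNoDelaySocket() throws IOException {
        Socket socket = new Socket();
        socket.setTcpNoDelay(true); // sets the TCP_NODELAY option
        return socket;
    }
}
```

The trade-off is more, smaller packets on the wire, so it suits latency-sensitive request/response traffic better than bulk transfer.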
4
Synchronous Communications
[Diagram: Client and Server message exchange]
Asynchronous Communications
[Diagram: Client and Server message exchange]
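One way to picture the difference in code: a synchronous client waits for each response before sending the next request, while an asynchronous client sends every request first and collects responses as they arrive. The sketch below uses `CompletableFuture`; `requestAsync` is a hypothetical stand-in for a remote call and does no real network I/O.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration; names and the fake request are assumptions.
final class Pipelining {
    // Stand-in for a remote call; a real client would do network I/O here.
    static CompletableFuture<String> requestAsync(int id) {
        return CompletableFuture.supplyAsync(() -> "response-" + id);
    }

    // Synchronous style: each request waits for the previous response,
    // so total latency is roughly the sum of the round trips.
    static List<String> fetchSynchronously(int count) {
        List<String> responses = new ArrayList<>();
        for (int id = 0; id < count; id++) {
            responses.add(requestAsync(id).join());
        }
        return responses;
    }

    // Asynchronous style: send every request first, then collect the
    // responses, so the round trips overlap.
    static List<String> fetchAsynchronously(int count) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (int id = 0; id < count; id++) {
            pending.add(requestAsync(id));
        }
        List<String> responses = new ArrayList<>();
        for (CompletableFuture<String> future : pending) {
            responses.add(future.join());
        }
        return responses;
    }
}
```

Both methods return the same responses in the same order; they differ only in how much waiting is serialized.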
Synchronous Communications is the crystal meth of distributed computing
3
Text Encoding
“But it’s human readable...”
“Binary is hard to work with...”
// Writes the decimal digits of quotient into results, right to left
while (i >= 0) {
    int remainder = quotient % 10;
    quotient = quotient / 10;
    results[i--] = (byte)('0' + remainder);
}
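For comparison, a sketch that puts the digit-by-digit text encoding next to a fixed-width binary encoding of the same value. The class and method names are hypothetical, and the text encoder assumes a non-negative value for brevity.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of text versus binary encoding of a long.
final class Encodings {
    // Text: one byte per decimal digit, produced by repeated division.
    // Assumes value >= 0.
    static byte[] encodeAscii(long value) {
        int digits = 1;
        for (long v = value; v >= 10; v /= 10) {
            digits++;
        }
        byte[] result = new byte[digits];
        long quotient = value;
        for (int i = digits - 1; i >= 0; i--) {
            result[i] = (byte) ('0' + quotient % 10);
            quotient /= 10;
        }
        return result;
    }

    // Binary: a fixed 8 bytes (big-endian by default), with no division.
    static byte[] encodeBinary(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    static long decodeBinary(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }
}
```

The text form needs a division and a remainder per digit and varies in length with the value, while the binary form is a fixed-size copy.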
Communications Battery life and bandwidth?
2