Top Performance Myths and Folklore - Martin Thompson (@mjpt777)


  1. Top Performance Myths and Folklore Martin Thompson - @mjpt777

  3. Top 10 Performance Mistakes Martin Thompson - @mjpt777

  4. 10

  5. Not Upgrading

  6. 9

  7. Duplicated Work

  8. Database Tuning?

  9. Where is the real issue?

  10. 8

  11. Data Dependent Loads

  12. Aka “Pointer Chasing”

  13. Are all memory operations equal?

  14. Sequential Access - Average time in ns/op to sum all longs in a 1GB array?

  15. Access Pattern Benchmark: ~1 ns/op

      Benchmark                  Mode    Score     Error  Units
      testSequential             avgt    0.832 ±   0.006  ns/op

  16. Really??? Less than 1ns per operation?

  17. Random walk per OS Page - Average time in ns/op to sum all longs in a 1GB array?

  18. Access Pattern Benchmark: ~3 ns/op

      Benchmark                  Mode    Score     Error  Units
      testSequential             avgt    0.832 ±   0.006  ns/op
      testRandomPage             avgt    2.703 ±   0.025  ns/op

  19. Data-dependent walk per OS Page - Average time in ns/op to sum all longs in a 1GB array?

  20. Access Pattern Benchmark: ~7 ns/op

      Benchmark                  Mode    Score     Error  Units
      testSequential             avgt    0.832 ±   0.006  ns/op
      testRandomPage             avgt    2.703 ±   0.025  ns/op
      testDependentRandomPage    avgt    7.102 ±   0.326  ns/op

  21. Random heap walk - Average time in ns/op to sum all longs in a 1GB array?

  22. Access Pattern Benchmark: ~20 ns/op

      Benchmark                  Mode    Score     Error  Units
      testSequential             avgt    0.832 ±   0.006  ns/op
      testRandomPage             avgt    2.703 ±   0.025  ns/op
      testDependentRandomPage    avgt    7.102 ±   0.326  ns/op
      testRandomHeap             avgt   19.896 ±   3.110  ns/op

  23. Data-dependent heap walk - Average time in ns/op to sum all longs in a 1GB array?

  24. Access Pattern Benchmark: ~90 ns/op

      Benchmark                  Mode    Score     Error  Units
      testSequential             avgt    0.832 ±   0.006  ns/op
      testRandomPage             avgt    2.703 ±   0.025  ns/op
      testDependentRandomPage    avgt    7.102 ±   0.326  ns/op
      testRandomHeap             avgt   19.896 ±   3.110  ns/op
      testDependentRandomHeap    avgt   89.516 ±   4.573  ns/op

  25. Need to ADD 40+ ns/op for NUMA access on a server!!!

  26. Access Pattern Benchmark

      Benchmark                  Mode    Score     Error  Units
      testSequential             avgt    0.832 ±   0.006  ns/op
      testRandomPage             avgt    2.703 ±   0.025  ns/op
      testDependentRandomPage    avgt    7.102 ±   0.326  ns/op
      testRandomHeap             avgt   19.896 ±   3.110  ns/op
      testDependentRandomHeap    avgt   89.516 ±   4.573  ns/op
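
The gap between the sequential and the data-dependent rows can be illustrated with a minimal Java sketch. This is not the benchmark behind the slides: the class and method names (`AccessPatterns`, `randomCycle`, `sumDependent`) are hypothetical, the array is much smaller than 1 GB, and `System.nanoTime` timing is only a rough indication. In the dependent walk, the index of each load comes out of the previous load, so the CPU cannot start the next memory access early.

```java
import java.util.SplittableRandom;

final class AccessPatterns {
    /** Sequential pass: addresses are known in advance, so prefetching can hide memory latency. */
    static long sumSequential(long[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    /**
     * Data-dependent walk ("pointer chasing"): the next index is read from memory,
     * so each step has to wait for the previous load to complete.
     */
    static long sumDependent(long[] data, int[] next) {
        long sum = 0;
        int index = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[index];
            index = next[index];
        }
        return sum;
    }

    /** Sattolo's algorithm: a random permutation forming one cycle, so the walk visits every slot once. */
    static int[] randomCycle(int n, long seed) {
        int[] next = new int[n];
        for (int i = 0; i < n; i++) {
            next[i] = i;
        }
        SplittableRandom random = new SplittableRandom(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = random.nextInt(i); // 0 <= j < i, never i itself
            int tmp = next[i];
            next[i] = next[j];
            next[j] = tmp;
        }
        return next;
    }

    public static void main(String[] args) {
        int n = 1 << 22; // 32 MB of longs: an illustrative size, not the 1 GB from the slides
        long[] data = new long[n];
        for (int i = 0; i < n; i++) {
            data[i] = i;
        }
        int[] next = randomCycle(n, 42L);

        long t0 = System.nanoTime();
        long sequential = sumSequential(data);
        long t1 = System.nanoTime();
        long dependent = sumDependent(data, next);
        long t2 = System.nanoTime();

        System.out.printf("sequential %.2f ns/op, dependent %.2f ns/op, sums equal: %b%n",
            (t1 - t0) / (double) n, (t2 - t1) / (double) n, sequential == dependent);
    }
}
```

Both methods add up the same values, so any difference in time comes from the access pattern alone. For trustworthy numbers, a harness with warm-up and multiple iterations would be needed.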

  27. What does this mean for data structures?

  28. Buckets

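  29. Diagram: Buckets array referencing nodes with fields Key, Value, Hash, Next. Nodes: 1 (EUR/USD)

  30. Diagram: Buckets array referencing nodes with fields Key, Value, Hash, Next. Nodes: 1 (EUR/USD), 2 (GBP/EUR)

  31. Diagram: Buckets array referencing nodes with fields Key, Value, Hash, Next. Nodes: 1 (EUR/USD), 2 (GBP/EUR), 3 (GBP/USD)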

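  32. Diagram: Buckets array beside an entries table with columns Key, Value, Hash, Next

  33. Diagram: Buckets array and entries table (Key, Value, Hash, Next). Entries: (1, EUR/USD, 4, -1). Bucket values: 0

  34. Diagram: Buckets array and entries table (Key, Value, Hash, Next). Entries: (1, EUR/USD, 4, -1), (2, GBP/EUR, 2, -1). Bucket values: 1, 0

  35. Diagram: Buckets array and entries table (Key, Value, Hash, Next). Entries: (1, EUR/USD, 4, 2), (2, GBP/EUR, 2, -1), (3, GBP/USD, 4, -1). Bucket values: 1, 0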

  36. .NET Dictionary is >10X faster than HashMap for 2+ GB of data
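
The second layout in the diagrams, where buckets hold indices into an entries table and chains are followed through a Next index rather than an object reference, can be sketched in Java with parallel arrays. This is only an illustration of that layout, not the .NET implementation: `IndexChainedMap` is a hypothetical name, the capacity is fixed, resizing and removal are left out, and the values are still `String` references because Java has no value types to flatten them.

```java
import java.util.Arrays;

final class IndexChainedMap {
    private final int[] buckets;   // bucket -> index of the first entry in its chain, or -1
    private final int[] keys;      // entry columns stored as parallel arrays
    private final String[] values;
    private final int[] next;      // entry -> index of the next entry in the chain, or -1
    private int count;

    IndexChainedMap(int capacity) {
        buckets = new int[capacity];
        Arrays.fill(buckets, -1);
        keys = new int[capacity];
        values = new String[capacity];
        next = new int[capacity];
    }

    void put(int key, String value) {
        int bucket = bucketOf(key);
        for (int i = buckets[bucket]; i != -1; i = next[i]) {
            if (keys[i] == key) {
                values[i] = value; // overwrite an existing key
                return;
            }
        }
        if (count == keys.length) {
            throw new IllegalStateException("full: resizing is omitted in this sketch");
        }
        int slot = count++;
        keys[slot] = key;
        values[slot] = value;
        next[slot] = buckets[bucket]; // link the new entry in front of the old chain head
        buckets[bucket] = slot;
    }

    String get(int key) {
        for (int i = buckets[bucketOf(key)]; i != -1; i = next[i]) {
            if (keys[i] == key) {
                return values[i];
            }
        }
        return null;
    }

    private int bucketOf(int key) {
        return (key & 0x7fffffff) % buckets.length;
    }

    public static void main(String[] args) {
        IndexChainedMap map = new IndexChainedMap(4);
        map.put(1, "EUR/USD");
        map.put(2, "GBP/EUR");
        map.put(3, "GBP/USD");
        System.out.println(map.get(2));
    }
}
```

Following a chain here means indexing into a few contiguous arrays instead of dereferencing a separately allocated node per entry, which is the property the slides tie back to data-dependent loads.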

  37. Understand object relationships and then choose appropriate data structures

  38. Java desperately needs Value Types on the stack and Aggregates on the heap

  39. Data Structures are becoming ever more important again!

  40. 7

  41. Too Much Allocation

  42. “Allocation is free…”

  43. Reclamation is NOT free!

  44. Remember Data Dependent Loads?

  45. Too much allocation or copying will wash out your cache

  46. 6

  47. Going Parallel

  48. http://www.frankmcsherry.org/assets/COST.pdf

  49. Amdahl’s Law [chart: Speedup (0 to 20) against Processors (1 to 1024); series: Amdahl]

  50. Universal Scalability Law [chart: Speedup (0 to 20) against Processors (1 to 1024); series: Amdahl, USL]

  51. Universal Scalability Law: C(N) = N / (1 + α(N – 1) + βN(N – 1))

      C = capacity or throughput
      N = number of processors
      α = contention penalty
      β = coherence penalty
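
The two curves can be computed directly from their formulas. The sketch below is a plain transcription into Java; the class name `Scalability` and the α and β values used in `main` are illustrative assumptions, not figures from the talk. Amdahl's Law is written in its usual form, speedup = 1 / (s + (1 - s) / N) for a serial fraction s, which is what the USL reduces to when β = 0 and α = s.

```java
final class Scalability {
    /** Amdahl's Law: speedup on n processors when serialFraction of the work cannot be parallelised. */
    static double amdahl(int n, double serialFraction) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / n);
    }

    /** Universal Scalability Law: C(N) = N / (1 + alpha (N - 1) + beta N (N - 1)). */
    static double usl(int n, double alpha, double beta) {
        return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1));
    }

    public static void main(String[] args) {
        double alpha = 0.05;   // illustrative contention penalty
        double beta = 0.0001;  // illustrative coherence penalty
        for (int n = 1; n <= 1024; n *= 2) {
            System.out.printf("N=%4d  Amdahl=%6.2f  USL=%6.2f%n",
                n, amdahl(n, alpha), usl(n, alpha, beta));
        }
    }
}
```

With any β above zero the USL curve peaks and then falls as processors are added, whereas Amdahl's curve only flattens.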

  52. Shared mutable state is Evil!

  53. “You can have a second computer once you’ve shown you know how to use the first one” – Paul Barham

  54. “You can have a second CPU once you’ve shown you know how to use the first one” – Martin Thompson

  55. 5

  56. Not Understanding TCP

  57. TCP – Sequenced Flow 1 Client Server

  58. TCP – Sequenced Flow 1 Client Server SYN

  59. TCP – Sequenced Flow 1 Client Server SYN SYN, ACK

  60. TCP – Sequenced Flow 1 Client Server SYN SYN, ACK ACK

  61. TCP – Sequenced Flow 1 Client Server SYN SYN, ACK ACK Data == MSS

  62. TCP – Sequenced Flow 1 Client Server SYN SYN, ACK ACK Data == MSS

  63. TCP – Sequenced Flow 1 Client Server SYN SYN, ACK ACK Data == MSS Delayed ACK

  64. TCP – Sequenced Flow 1 Client Server SYN SYN, ACK ACK Data == MSS Delayed ACK Data < MSS

  65. TCP – Sequenced Flow – TCP_NODELAY Client Server SYN SYN, ACK ACK

  66. TCP – Sequenced Flow – TCP_NODELAY Client Server SYN SYN, ACK ACK Data == MSS

  67. TCP – Sequenced Flow – TCP_NODELAY Client Server SYN SYN, ACK ACK Data == MSS Data < MSS

  68. TCP – Sequenced Flow – TCP_NODELAY Client Server SYN SYN, ACK ACK Data == MSS Data < MSS ACK
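
The two sequences appear to contrast the default behaviour, where a segment smaller than the MSS is held back while earlier data is unacknowledged and the receiver delays its ACK, with a socket that has `TCP_NODELAY` set and sends the small segment straight away. In Java the option is set through `Socket.setTcpNoDelay`. The sketch below only configures an unconnected socket; the class name `LowLatencySocket` is hypothetical, and whether disabling Nagle's algorithm helps depends on the application's write pattern.

```java
import java.io.IOException;
import java.net.Socket;

final class LowLatencySocket {
    /** Creates an unconnected socket with Nagle's algorithm disabled (TCP_NODELAY on). */
    static Socket create() throws IOException {
        Socket socket = new Socket();
        socket.setTcpNoDelay(true);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        try (Socket socket = create()) {
            System.out.println("TCP_NODELAY enabled: " + socket.getTcpNoDelay());
            // socket.connect(...) would follow here with a real server address
        }
    }
}
```

With `TCP_NODELAY` on, each small write can become its own packet, so the application should batch its writes itself rather than issue many tiny ones.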

  69. 4

  70. Synchronous Communications

  71. Client Server


  78. Asynchronous Communications

  79. Client Server


  86. Synchronous Communications is the crystal meth of distributed computing
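
The difference between the two diagrams can be approximated in Java with `CompletableFuture`: the synchronous client waits for each response before sending the next request, while the asynchronous client puts all requests in flight and collects the responses afterwards. Everything here is a stand-in: `remoteSquare` is a hypothetical call that sleeps to simulate a round trip, and a thread pool takes the place of a real non-blocking transport.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class Pipelining {
    /** Hypothetical remote call: the sleep stands in for one network round trip. */
    static int remoteSquare(int x) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return x * x;
    }

    /** Synchronous: total time is roughly (number of requests) x (round trip). */
    static List<Integer> synchronous(List<Integer> requests) {
        List<Integer> responses = new ArrayList<>();
        for (int request : requests) {
            responses.add(remoteSquare(request));
        }
        return responses;
    }

    /** Asynchronous: issue every request first, then gather the responses in order. */
    static List<Integer> asynchronous(List<Integer> requests, ExecutorService pool) {
        List<CompletableFuture<Integer>> inFlight = new ArrayList<>();
        for (int request : requests) {
            inFlight.add(CompletableFuture.supplyAsync(() -> remoteSquare(request), pool));
        }
        List<Integer> responses = new ArrayList<>();
        for (CompletableFuture<Integer> future : inFlight) {
            responses.add(future.join());
        }
        return responses;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Integer> requests = List.of(1, 2, 3, 4);
            System.out.println(synchronous(requests));
            System.out.println(asynchronous(requests, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Both methods return the same responses. The asynchronous version overlaps the simulated round trips, limited here by the size of the pool.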

  87. 3

  88. Text Encoding

  89. “But it’s human readable...”

  90. “Binary is hard to work with...”

  91. while (i >= 0) {
          int remainder = quotient % 10;
          quotient = quotient / 10;
          results[i--] = (byte)('0' + remainder);
      }
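
The loop on the slide is a fragment. A minimal way to complete it, assuming it writes a non-negative `int` as ASCII decimal digits, is sketched below. The class name `AsciiEncoder`, the digit-counting step, and the rejection of negative values are additions made for this example; only the inner loop comes from the slide.

```java
import java.nio.charset.StandardCharsets;

final class AsciiEncoder {
    /** Encodes a non-negative int as ASCII decimal digits, most significant digit first. */
    static byte[] encode(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        // Count the digits first so the buffer can be filled from the right-hand end.
        int digits = 1;
        for (int v = value; v >= 10; v /= 10) {
            digits++;
        }
        byte[] results = new byte[digits];
        int quotient = value;
        int i = digits - 1;
        // Loop from the slide: one modulo and one division per output byte.
        while (i >= 0) {
            int remainder = quotient % 10;
            quotient = quotient / 10;
            results[i--] = (byte) ('0' + remainder);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(new String(encode(12345), StandardCharsets.US_ASCII));
    }
}
```

A binary encoding would write the same `int` as four bytes with no division at all, which seems to be the cost the slide is pointing at.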

  92. Communications: Battery life and bandwidth?

  93. 2
