
Cache Modeling and Optimization using Miniature Simulations



  1. Cache Modeling and Optimization using Miniature Simulations • Carl Waldspurger, Trausti Saemundsson, Irfan Ahmad (CachePhysics, Inc.) • Nohhyun Park (Datos IO, Inc.) • USENIX Annual Technical Conference (ATC ’17), July 13, 2017

  2. Motivation • Caching important, ubiquitous • Optimize valuable cache resources – Improve performance, QoS – Sizing, partitioning, tuning, cliff removal, … • Problem: need accurate, efficient models – Complex policies, non-linear, workload-dependent – No general, lightweight, online approach CachePhysics, Inc. USENIX ATC ’17 2

  3. Cache Modeling • Cache utility curves – Performance as f(size, …) – Miss ratio curve (MRC) – Latency curve • Observations – Non-linear, cliffs – Non-monotonic bumps

  4. MRC Construction Methods • Stack algorithms (LRU, LFU, …) – Exact: Mattson algorithm, all sizes at once – Approximate: Counter Stacks [OSDI ’14], SHARDS [FAST ’15], AET [ATC ’16] • Any algorithm (ARC, LIRS, 2Q, FIFO, …) – Exact: separate simulation for each cache size – Approximate: miniature simulation [ATC ’17]
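
To make the "all sizes at once" idea concrete, here is a minimal sketch of a Mattson-style stack-distance pass for LRU. It is an illustration only: the function name `lru_mrc_mattson` is hypothetical, and the list-based stack costs O(n) per reference, which a practical implementation would avoid.

```python
def lru_mrc_mattson(trace, sizes):
    """Exact LRU miss ratios for every size in `sizes` from one pass.

    Assumes `trace` is a non-empty sequence of hashable keys.
    """
    stack = []        # LRU stack, most recently used key at index 0
    dist_counts = {}  # stack distance -> number of references at that distance
    for key in trace:
        if key in stack:
            depth = stack.index(key) + 1  # 1-based stack distance
            dist_counts[depth] = dist_counts.get(depth, 0) + 1
            stack.remove(key)
        # First-time references have infinite distance and miss at every size.
        stack.insert(0, key)

    total = len(trace)
    mrc = {}
    for size in sizes:
        # A reference hits in a cache of `size` iff its stack distance <= size.
        hits = sum(n for d, n in dist_counts.items() if d <= size)
        mrc[size] = (total - hits) / total
    return mrc
```

A single pass yields the miss ratio for every requested size, which is what separates stack algorithms from policies that need one simulation per size.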

  5. Miniature Simulation • Simulate large cache using tiny one • Scale down reference stream, cache size – Random sampling based on hash(key) – Assumes statistical self-similarity • Run unmodified algorithm – LRU, LIRS, ARC, 2Q, FIFO, OPT, … – Track usual stats
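
The following sketch shows one way the scaling could look for LRU. Several details are assumptions rather than anything stated on the slide: the `hash mod P < R × P` filter, the BLAKE2 hash, the modulus, and all function and class names are illustrative choices.

```python
import hashlib
from collections import OrderedDict

MODULUS = 1 << 24  # size of the hash space (arbitrary choice for this sketch)

def sampled(key, rate):
    """Keep a key iff hash(key) mod P < rate * P (assumed sampling filter)."""
    digest = hashlib.blake2b(str(key).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % MODULUS < rate * MODULUS

class LRUCache:
    """Plain LRU cache that only tracks hit and miss counts."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # last entry is most recently used
        self.hits = 0
        self.misses = 0

    def access(self, key):
        if key in self.items:
            self.items.move_to_end(key)
            self.hits += 1
        else:
            self.misses += 1
            self.items[key] = True
            if len(self.items) > self.capacity:
                self.items.popitem(last=False)  # evict least recently used

def mini_sim_miss_ratio(trace, emulated_size, rate):
    """Estimate the miss ratio of an LRU cache of `emulated_size` entries."""
    cache = LRUCache(max(1, round(rate * emulated_size)))  # scaled-down cache
    for key in trace:
        if sampled(key, rate):  # scaled-down reference stream
            cache.access(key)
    refs = cache.hits + cache.misses
    return cache.misses / refs if refs else None
```

Because the filter depends only on the key, every reference to a sampled key reaches the mini cache, and the cache code itself is unchanged.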

  6. Scaling Down • Keys colored by hash; keep half the key space • Refs and cache scaled down ≈ 2× (half-size cache)

  7. Scaling Down • Refs and cache scaled down ≈ 8×

  8. Scaling Down • Refs and cache scaled down ≈ 32×

  9. Scaling Down • Refs and cache scaled down ≈ 128×

  10. Flexible Scaling • S_m = R × S_e (mini size S_m, sampling rate R, emulated size S_e) • Time/space tradeoff – Fixed sampling rate R – Fixed mini size S_m • Example: S_e = 1M – R = 0.005 ⇒ S_m = 5000 – S_m = 1000 ⇒ R = 0.001
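
The two modes follow directly from the relation between mini size, sampling rate, and emulated size. This small sketch reproduces the slide's example; the helper names are hypothetical.

```python
def mini_size(emulated_size, rate):
    """Fixed sampling rate: derive the mini cache size."""
    return round(rate * emulated_size)

def sampling_rate(emulated_size, mini_size_target):
    """Fixed mini size: derive the sampling rate."""
    return mini_size_target / emulated_size

# Slide example with an emulated size of 1M entries.
print(mini_size(1_000_000, 0.005))     # fixed rate R = 0.005
print(sampling_rate(1_000_000, 1000))  # fixed mini size of 1000 entries
```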

  11. Example Mini-Sim MRCs

  12. Mini-Sim Accuracy • 137 real-world traces – Storage block traces – CloudPhysics, MSR, FIU – 100 cache sizes per trace • Mean Absolute Error – |exact − approx| – Average over all sizes
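
As a sketch of the error metric, assuming each MRC is represented as a dict from cache size to miss ratio over the same set of sizes (a representation chosen here for illustration):

```python
def mean_absolute_error(exact, approx):
    """Average |exact - approx| miss-ratio difference over all cache sizes.

    Both arguments map cache size -> miss ratio and must share the same keys.
    """
    sizes = sorted(exact)
    return sum(abs(exact[s] - approx[s]) for s in sizes) / len(sizes)
```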

  13. Mini-Sim Efficiency • Variable costs – Both space and time scaled down by R – R = 0.001 ⇒ simulation 1000× smaller, 1000× faster • Fixed costs – Hashing overhead for sampling – Footprint for code, libraries, etc. • Net improvement – R = 0.001 ⇒ ~200× smaller, ~10× faster – Closer to 1000× when a key hash already exists or fixed costs are shared by multiple sims

  14. Mini-Sim Cache Tuning • Dynamic multi-model optimization – Simulate candidate configurations online – Periodically apply best to actual cache • Parameter adaptation experiments – LIRS S stack size, 5 mini-sims with f = 1.1 to 3 – 2Q A1out size, 8 mini-sims with Kout = 50% to 300% – R = 0.005, epoch = 1M refs
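
The epoch loop might be sketched as below. This is a simplified illustration, not the authors' tuner: `TinyCache` is a hypothetical stand-in whose single knob switches between LRU and FIFO behavior (not the LIRS or 2Q parameters above), and hash sampling of the reference stream is omitted for brevity.

```python
from collections import OrderedDict

class TinyCache:
    """Stand-in policy: LRU if promote_on_hit is True, otherwise FIFO."""
    def __init__(self, capacity, promote_on_hit):
        self.capacity = capacity
        self.promote_on_hit = promote_on_hit
        self.items = OrderedDict()  # first entry is next to be evicted

    def access(self, key):
        """Return True on a hit, False on a miss."""
        if key in self.items:
            if self.promote_on_hit:
                self.items.move_to_end(key)
            return True
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)
        return False

def best_config_per_epoch(trace, candidates, epoch_len):
    """Yield the name of the lowest-miss candidate after each full epoch.

    `candidates` maps a configuration name to a simulated cache object.
    """
    misses = {name: 0 for name in candidates}
    for i, key in enumerate(trace, start=1):
        for name, cache in candidates.items():
            if not cache.access(key):
                misses[name] += 1
        if i % epoch_len == 0:
            yield min(misses, key=misses.get)  # config to apply to the real cache
            misses = {name: 0 for name in candidates}
```

In a setting like the one on the slide, each candidate would be a mini-sim fed the sampled stream, and the winner's parameter would be applied to the actual cache at the epoch boundary.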

  15. LIRS Adaptation Examples

  16. 2Q Adaptation Examples

  17. Talus Cliff Removal • Talus [HPCA ’15] – Needs MRC as input – Interpolates convex hull • Shadow partitions 𝛽, 𝛾 – Steer different fractions of refs to each – Emulate cache sizes on convex hull via hashing
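
To illustrate the convex-hull step only (not Talus's partition sizing), here is a sketch that computes the lower convex hull of an MRC and interpolates along it. It assumes the MRC is a list of `(size, miss_ratio)` points with distinct sizes; the function names are hypothetical.

```python
def lower_convex_hull(points):
    """Lower convex hull of (size, miss_ratio) points, ordered by size."""
    hull = []
    for p in sorted(points):
        # Drop the last hull point while it lies on or above the segment
        # from the point before it to the new point p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def hull_miss_ratio(hull, size):
    """Linearly interpolate the hull at `size` (must lie within its range)."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= size <= x2:
            t = (size - x1) / (x2 - x1)
            return y1 + t * (y2 - y1)
    raise ValueError("size outside hull range")
```

For an MRC with a cliff, the hull skips the flat region before the drop, so the interpolated miss ratio at an intermediate size is lower than the original curve's value there.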

  18. Talus for Non-LRU Policies? • Need efficient online MRCs • Support dynamic changes? – Workload and MRC evolve over time – Resize partitions, lazy vs. eager? – Migrate cache entries in “wrong” partition? • Not clear how to merge/migrate state

  19. SLIDE: Transparent Cliff Removal • Sharded List with Internal Differential Eviction – Single unified cache, no hard partitions – Defer partitioning decisions until eviction – Avoids resizing, migration, complexity issues • New SLIDE list abstraction – No changes to ARC, LIRS, 2Q, LRU code – Replaces internal LRU/FIFO building blocks

  20. SLIDE List • Augment conventional list – Per-item hash value – Hash threshold determines current “partition” • Items totally ordered, no hard partitions • Evict from tail of over-quota partition
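
A toy sketch of the list abstraction described above, under several assumptions that are not from the slides: two logical partitions split by one threshold, a quota given only for the "low" partition, and a linear scan from the tail in place of whatever bookkeeping a real implementation would use. The class and parameter names are hypothetical.

```python
import hashlib
from collections import OrderedDict

HASH_SPACE = 1 << 16  # arbitrary hash range for this sketch

def key_hash(key):
    digest = hashlib.blake2b(str(key).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % HASH_SPACE

class SlideListSketch:
    """Single ordered list whose items carry a hash; partitions are logical."""
    def __init__(self, threshold, target_low, hash_fn=key_hash):
        self.threshold = threshold    # hash < threshold -> "low" partition
        self.target_low = target_low  # item quota for the low partition
        self.hash_fn = hash_fn
        self.items = OrderedDict()    # key -> hash; first entry is the tail

    def touch(self, key):
        """Insert key at the head, or move it there if already present."""
        if key in self.items:
            self.items.move_to_end(key)
        else:
            self.items[key] = self.hash_fn(key)

    def evict(self):
        """Remove and return the item nearest the tail in the over-quota partition."""
        if not self.items:
            return None
        # Partition membership is decided only now, from the current threshold.
        low_count = sum(1 for h in self.items.values() if h < self.threshold)
        evict_low = low_count > self.target_low
        victim = next(
            (k for k, h in self.items.items() if (h < self.threshold) == evict_low),
            None,
        )
        if victim is None:  # chosen partition is empty: fall back to the tail
            victim = next(iter(self.items))
        del self.items[victim]
        return victim
```

Because membership is computed from the stored hash at eviction time, changing `threshold` or `target_low` takes effect without moving any items, which mirrors the idea of deferring partitioning decisions until eviction.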

  21. SLIDE Experiments • Construct MRC online – 7 mini-sims {⅛, ¼, ½, 1, 2, 4, 8} × cache size – R = 0.005, smoothed miss ratios • Update SLIDE settings periodically – Discrete convex hull every epoch (1M refs) – Set new “partition” targets for SLIDE lists

  22. SLIDE: Cliff Reduction • Example plots: 69%, 48%, 38% of potential gain

  23. SLIDE: Little Impact without Cliffs

  24. Conclusions • Mini-sim extremely effective – Robust, general method (ARC, LIRS, 2Q, LRU, OPT, …) – Average error < 0.01 with 0.1% sampling • Can optimize workloads/policies automatically – Dynamic parameter tuning – SLIDE transparent cliff removal
