AdaptSize: Orchestrating the Hot Object Memory Cache in a CDN



  1. AdaptSize: Orchestrating the Hot Object Memory Cache in a CDN. Daniel S. Berger, Ramesh K. Sitaraman, Mor Harchol-Balter. USENIX NSDI, Boston, March 28, 2017.

  2. CDN Caching Architecture. [Figure: users send 100% of their requests to CDN servers (each with a HOC and a DC); only about 1% of requests reach the content providers.]

  3. Optimizing CDN Caches. Two caching levels: ❏ Disk Cache (DC) ❏ Hot Object Cache (HOC). HOC performance metric: object hit ratio, OHR = (# reqs served by HOC) / (total # reqs). Goal: maximize OHR.
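
Restating the hit-ratio metric from this slide as a display equation (LaTeX, for clarity only):

```latex
\mathrm{OHR} \;=\; \frac{\#\ \text{requests served by the HOC}}{\text{total}\ \#\ \text{requests}}
```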

  4. Prior Approaches to Cache Management. Frequent decisions are required: the HOC has only a few GBs of capacity and sees on the order of 500 GB of requests per hour. What to admit and what to evict: today in practice (e.g., Nginx, Varnish): admit everything, evict via LRU; 2000s academia (e.g., Modha, Zhang, Kumar): admit everything, evict via LRU/LFU mixtures; 2010s academia (e.g., Kaminsky, Lim, Andersen): admit everything, evict via concurrent LRU.

  5. We Are Missing a Key Issue. Not all objects are the same: object sizes span 9 orders of magnitude. ❏ Should we admit every object? (no, we should favor small objects) ❏ A few key companies know this (but don’t know how to do it well) ❏ Academia has not been helpful (almost all theoretical work assumes equal-sized objects)

  6. What’s Hard About Size-Aware Admission. Fixed size threshold: admit if size < c. How to pick c: pick the c that maximizes OHR. [Figure: OHR vs. threshold c at 8am, 2pm, and 9pm; the best threshold changes with the traffic mix.]
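
A minimal sketch of the fixed-threshold rule on this slide (Python; the function name is illustrative, not from the talk):

```python
def admit_fixed_threshold(object_size: int, c: int) -> bool:
    """Size-aware admission with a fixed threshold c (in bytes):
    only objects smaller than c enter the HOC. The best c depends
    on the traffic mix, so a static choice degrades over the day."""
    return object_size < c
```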

  7. Can we avoid picking a threshold c? Probabilistic admission, for example the exp(c) family: small objects get a high admission probability, large objects a low one. Unfortunately, there are many such curves, and which curve we pick makes a big difference. We still need to adapt c.
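
The slide does not spell out the exp(c) family; below is a sketch assuming the exponential-in-size form e^(-size/c) used in the AdaptSize paper (function names are illustrative):

```python
import math
import random

def admission_probability(object_size: float, c: float) -> float:
    """One curve from an exp(c)-style family: probability close to 1
    for small objects, dropping toward 0 as size grows; a larger c
    shifts the drop-off toward larger objects."""
    return math.exp(-object_size / c)

def admit_probabilistic(object_size: float, c: float) -> bool:
    """Flip a biased coin per request; no per-object state is kept."""
    return random.random() < admission_probability(object_size, c)
```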

  8. The AdaptSize Caching System. What to admit: adaptive size-aware admission (AdaptSize); what to evict: concurrent LRU. First system that continuously adapts the parameter of size-aware admission over time and with the traffic. Control loop: take traffic measurements → calculate the best c → enforce admission control. Incorporated into a high-throughput production caching system (Varnish).
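
A minimal sketch of the measure → tune → enforce loop, assuming a find_best_c routine (e.g., the Markov-model search on the following slides); the class and method names are illustrative, not AdaptSize's actual API:

```python
import time

class AdaptSizeController:
    """Periodic tuning loop: every delta seconds, take the traffic
    measurements collected so far, compute the best admission
    parameter c, and publish it for the request path to read."""

    def __init__(self, initial_c, delta_seconds, find_best_c):
        self.c = initial_c              # read by the admission hot path
        self.delta = delta_seconds      # the Δ interval
        self.find_best_c = find_best_c  # e.g., Markov-model global search
        self.window = []                # (object_id, object_size) records

    def record_request(self, object_id, object_size):
        self.window.append((object_id, object_size))

    def run_forever(self):
        while True:
            time.sleep(self.delta)
            window, self.window = self.window, []   # swap out the Δ window
            if window:
                self.c = self.find_best_c(window)   # adapt c to the traffic
```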

  9. How to Find the Best c Within Each Δ Interval. [Timeline: time divided into successive Δ intervals.] Traditional approach: hill climbing, which suffers from local optima on the OHR-vs-c curve. AdaptSize approach: a Markov model, which enables speedy global optimization.

  10. How AdaptSize Gets the OHR-vs-c Curve. Markov chain with IN/OUT states: track IN/OUT for each object; a request to an object that is IN is a hit, a request to an object that is OUT is a miss. Algorithm: for every Δ interval and for every value of c, ❏ use the Markov chain to solve for OHR(c), ❏ find the c that maximizes OHR. Why hasn’t this been done before? Too slow: exponential state space. New technique: approximation with a linear state space.
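
The Markov-chain OHR(c) model itself is too involved for a short sketch; the snippet below shows only the surrounding step, with ohr_model(window, c) standing in for the paper's linear-state-space approximation: score every candidate c and keep the global maximum (this is what replaces hill climbing).

```python
def find_best_c(window, ohr_model, candidates=None):
    """Global search over the OHR-vs-c curve: evaluate a model-predicted
    OHR for every candidate c and return the argmax. `ohr_model` is a
    placeholder for the Markov-chain approximation from the paper."""
    if candidates is None:
        # e.g., geometrically spaced thresholds from 1 KiB up to 1 GiB
        candidates = [2 ** k for k in range(10, 31)]
    return max(candidates, key=lambda c: ohr_model(window, c))
```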

  11. Implementing AdaptSize. Incorporated into Varnish, a highly concurrent HOC system (40+ Gbit/s). Control loop: take traffic measurements → calculate the best c → enforce admission control. Goal: low overhead on the request path.

  12. Implementing AdaptSize. Challenges: 1) concurrent write conflicts, 2) locks are too slow [NSDI’13 & ’14]. AdaptSize: producer/consumer design with a ring buffer, i.e., a lock-free implementation.
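
A rough sketch of the producer/consumer ring buffer idea in Python (illustrative only: the real implementation lives inside Varnish in C, handles many concurrent request threads, and relies on atomic operations and memory ordering that are omitted here):

```python
class StatsRingBuffer:
    """Single-producer/single-consumer ring buffer: the request path
    pushes (object_id, size) records and the tuning thread drains them.
    Because head is written only by the producer and tail only by the
    consumer, the request path never takes a lock; if the buffer is
    full, the sample is simply dropped."""

    def __init__(self, capacity=1 << 16):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = 0   # next free slot, advanced by the producer
        self.tail = 0   # next unread slot, advanced by the consumer

    def push(self, record):
        nxt = (self.head + 1) % self.capacity
        if nxt == self.tail:          # full: drop rather than block
            return False
        self.buf[self.head] = record
        self.head = nxt
        return True

    def drain(self):
        while self.tail != self.head:
            yield self.buf[self.tail]
            self.tail = (self.tail + 1) % self.capacity
```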

  13. Implementing AdaptSize. AdaptSize admission is really simple: ❏ given c and the object size, ❏ admit with probability P(c, size). This enables a lock-free, low-overhead implementation.
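
Tying the earlier sketches together, the per-miss hot path only reads the currently published c and draws one random number (again assuming the exponential form of P(c, size); names are illustrative):

```python
import math
import random

def admit_on_miss(object_size, controller):
    """Admission decision on the request path: one read of the published
    c, one exp(), one comparison. No locks and no per-object state."""
    c = controller.c                      # published by the tuning loop
    return random.random() < math.exp(-object_size / c)
```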

  14. AdaptSize Evaluation Testbed. Origin: emulates 100s of web servers (40 Gbit/s, 100 ms RTT); 55 million unique objects / 8.9 TB. DC: unmodified Varnish (4× 1 TB, 7200 RPM disks). HOC systems (1.2 GB, 16 threads): ❏ unmodified Varnish, ❏ NGINX cache, ❏ AdaptSize. Clients: replay an Akamai request trace (40 Gbit/s, 30 ms RTT); 440 million total requests / 152 TB.

  15. Comparison to Production Systems. What to admit / what to evict: Varnish: everything / concurrent LRU; Nginx: frequency filter / LRU; AdaptSize: adaptive size-aware / concurrent LRU. AdaptSize achieves +48% to +92% higher OHR than these production systems.

  16. Comparison to Research-Based Systems. The research systems combine recency and frequency but all rely on manually tuned parameters; AdaptSize achieves +67% higher OHR in this comparison.

  17. Robustness of AdaptSize. Compared tuning approaches: Size-Aware OPT (offline parameter tuning), AdaptSize (our Markovian tuning model), HillClimb (local search using shadow queues).

  18. Conclusion. Goal: maximize the OHR of the Hot Object Cache, where OHR = (# reqs served by HOC) / (total # reqs). Approach: size-based admission control.

  19. Conclusion. Goal: maximize the OHR of the Hot Object Cache, where OHR = (# reqs served by HOC) / (total # reqs). Approach: size-based admission control. Key insight: the parameter c needs to be adapted. AdaptSize: adapts c via a Markov chain. Result: 48-92% higher OHRs.

  20. Conclusion. Goal: maximize the OHR of the Hot Object Cache, where OHR = (# reqs served by HOC) / (total # reqs). Approach: size-based admission control. Key insight: the parameter c needs to be adapted. AdaptSize: adapts c via a Markov chain. Result: 48-92% higher OHRs. Also in our paper: ❏ throughput, ❏ disk utilization, ❏ byte hit ratio, ❏ request latency. Code: /dasebe/AdaptSize.
