Analysis and design of list-based cache replacement policies
Nicolas Gast (Inria), joint work with Benny Van Houdt (Univ. of Antwerp)
POLARIS / DataMove Seminar, Jan. 2016, Inria
1 Mainly based on "Transient and Steady-state Regime of a Family of List-based Cache Replacement Algorithms", by Gast and Van Houdt, ACM SIGMETRICS 2015.
Nicolas Gast – 1 / 31
Caches are everywhere. A cache sits between a user/application and a slow data source; there may be a single cache or a hierarchy of caches. Examples (from fast to slow): processor cache, database cache, CDN.
In this talk, I focus on a single cache. The question is: which item to replace? The application sends requests to the cache; on a hit the item is served from the cache, on a miss it is fetched from the data source and one cached item is replaced. Classical cache replacement policies: RAND, FIFO, LRU, CLIMB. Other approaches: time-to-live (TTL) caches.
The analysis of cache performance has attracted growing interest. Theoretical studies started with [King 1971, Gelenbe 1973]. Nowadays: new applications (CDN / CCN, replication 2) and new analysis techniques (Che approximation 3, 4).
2 [Borst et al. 2010] Distributed Caching Algorithms for Content Distribution Networks.
3 [Che et al. 2002] Hierarchical web caching systems: modeling, design and experimental results.
4 [Fricker et al. 2012] A versatile and accurate approximation for LRU cache performance.
Outline of the talk
1 What are the classical models?
2 We introduce a family of policies for which the cache is (virtually) divided into lists (a generalization of FIFO/RANDOM).
⋆ We can compute the steady-state distribution in polynomial time, which disproves old conjectures.
⋆ We develop a mean-field approximation and show that it is accurate: it gives a fast approximation of the steady-state distribution and characterizes the transient behavior.
[Figure: probability in cache vs. number of requests; ODE approximation vs. simulation, for 1 list of size 200 and 4 lists of sizes 50/50/50/50.]
3 We provide guidelines on how to tune the parameters, using IRM and trace-based simulations.
Outline
1 Performance models of caches
2 List-based cache replacement algorithms
⋆ Steady-state performance under the IRM model
⋆ Transient behavior via mean-field approximation
3 Parameter tuning and practical guidelines
4 Conclusion
Our performance metric will be the hit probability:
hit probability = (number of items served from the cache) / (total number of items served) = 1 − miss probability.
Goal: find a policy that maximizes the hit probability.
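As a minimal illustration of this metric, the sketch below replays a request trace through an LRU cache and measures the fraction of hits. The trace and cache size are made-up toy values; the talk's policies (FIFO, RAND, CLIMB, ...) would only change the eviction step.

```python
from collections import OrderedDict

def hit_probability(requests, cache_size):
    """Replay a request trace through an LRU cache; return the hit probability."""
    cache = OrderedDict()  # keys = cached items, ordered from LRU to MRU
    hits = 0
    for item in requests:
        if item in cache:
            hits += 1
            cache.move_to_end(item)        # refresh recency on a hit
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict the least recently used item
            cache[item] = None
    return hits / len(requests)

print(hit_probability(list("ABCABDABCE"), 3))  # 4 hits out of 10 requests -> 0.4
```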
The offline problem is easy. If you know the sequence of requests: MIN policy. At time t, if X_t is not in the cache, evict the item in the cache (of size m) whose next request occurs furthest in the future.
Theorem (Mattson et al. 1970): MIN is optimal.
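A naive sketch of the MIN (Belady) policy follows; it scans the remainder of the trace to find each cached item's next use, which is quadratic but fine for illustration. On the cyclic trace below, MIN gets hits where LRU would get none.

```python
def min_policy_hits(requests, cache_size):
    """Offline MIN (Belady): on a miss with a full cache, evict the cached item
    whose next request is furthest in the future (or never requested again)."""
    cache = set()
    hits = 0
    for t, item in enumerate(requests):
        if item in cache:
            hits += 1
            continue
        if len(cache) >= cache_size:
            def next_use(x):
                try:
                    return requests.index(x, t + 1)   # time of x's next request
                except ValueError:
                    return float('inf')               # never again: ideal victim
            cache.remove(max(cache, key=next_use))
        cache.add(item)
    return hits

print(min_policy_hits(list("ABCD" * 3), 3))  # 6 hits; LRU would get 0 here
```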
The offline problem is easy... but with an unbounded competitive ratio.
Theorem: no deterministic online algorithm for caching can achieve a competitive ratio better than m (the cache size). LRU has a competitive ratio of m.
To compare policies, we need more... We can use trace-based simulations. We can also model requests as stochastic processes (started with [King 1971, Gelenbe 1973]).
Independent reference model (IRM): at each time step, item i is requested with probability p_i. IRM is OK for web caching 5.
5 L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf-like distributions: Evidence and implications. INFOCOM'99, vol. 1, pp. 126-134. IEEE, 1999.
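Generating an IRM trace is straightforward: every request is drawn i.i.d. from a fixed popularity distribution. The sketch below uses a Zipf-like popularity p_i ∝ 1/i^α, in line with the web-trace evidence cited above (Breslau et al. report α roughly in the 0.6-0.9 range); the value α = 0.8 is just an illustrative choice.

```python
import random

def zipf_popularities(n, alpha=0.8):
    """Zipf-like popularity over n items: p_i proportional to 1 / i^alpha."""
    w = [1.0 / (i ** alpha) for i in range(1, n + 1)]
    s = sum(w)
    return [x / s for x in w]

def irm_trace(popularities, length, seed=0):
    """IRM trace: each request is drawn i.i.d. with the same probabilities."""
    rng = random.Random(seed)
    items = range(len(popularities))
    return rng.choices(items, weights=popularities, k=length)

p = zipf_popularities(100)
trace = irm_trace(p, 10_000)
```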
Example: analysis of LRU, from King [71] to Che [2002].
[King 71]: Under the IRM model, in steady state, the probability that the cache holds the sequence of distinct items i_1 ... i_m (in recency order) is
P(i_1 ... i_m) = p_{i_1} · p_{i_2} / (1 − p_{i_1}) · ... · p_{i_m} / (1 − p_{i_1} − ... − p_{i_{m−1}}).
The hit probability is Σ over distinct sequences i_1 ... i_m of (p_{i_1} + ... + p_{i_m}) P(i_1 ... i_m).
[Che approximation 2002]: an item spends approximately a time T in the cache:
P(item i in cache) ≈ 1 − e^{−p_i T}, where T is such that Σ_{i=1}^n (1 − e^{−p_i T}) = m.
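The Che approximation is easy to compute numerically: the fixed point for T can be found by bisection, since Σ_i (1 − e^{−p_i T}) is increasing in T. A minimal sketch (assuming m < n, otherwise the fixed point has no solution):

```python
import math

def che_hit_probability(p, m):
    """Che approximation for LRU: solve sum_i (1 - e^{-p_i T}) = m for T by
    bisection, then return the hit probability sum_i p_i (1 - e^{-p_i T})."""
    def filled(T):
        # Expected number of items in the cache if each stays a time T
        return sum(1 - math.exp(-pi * T) for pi in p)
    lo, hi = 0.0, 1.0
    while filled(hi) < m:          # grow the bracket until it contains T
        hi *= 2
    for _ in range(100):           # bisection: filled() is increasing in T
        mid = (lo + hi) / 2
        if filled(mid) < m:
            lo = mid
        else:
            hi = mid
    T = (lo + hi) / 2
    return sum(pi * (1 - math.exp(-pi * T)) for pi in p)
```

Sanity check: for n equally popular items, each item is in the cache with probability m/n, so the hit probability is m/n.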
Even when the popularity is constant, LFU is not optimal. LFU is optimal under IRM (it maximizes the steady-state hit probability), but it is not optimal under general request processes.
◮ E.g., the time between two requests of item 1 is 1 with probability 0.99 and 1000 with probability 0.01; the time between two requests of item 2 is always 5. Then LRU outperforms LFU.
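This example can be checked by simulation. The sketch below makes some assumptions the slide leaves implicit: a cache of size 1, and "LFU" meaning perfect LFU (always cache the item requested most often so far). Item 2 is requested about twice as often overall, so LFU caches it and misses item 1's bursts, whereas LRU exploits the bursts.

```python
import random

def two_item_trace(n, seed=1):
    """Inter-request time of item A: 1 w.p. 0.99, 1000 w.p. 0.01; item B: always 5."""
    rng = random.Random(seed)
    ta, tb = 0.0, 0.5      # small offset so the two streams never collide
    trace = []
    while len(trace) < n:
        if ta < tb:
            trace.append('A')
            ta += 1 if rng.random() < 0.99 else 1000
        else:
            trace.append('B')
            tb += 5
    return trace

def hit_rate_lru(trace):
    """Cache of size 1 under LRU: keep the last requested item."""
    cached, hits = None, 0
    for item in trace:
        hits += (item == cached)
        cached = item
    return hits / len(trace)

def hit_rate_lfu(trace):
    """Cache of size 1 under perfect LFU: keep the most frequent item so far."""
    counts = {'A': 0, 'B': 0}
    cached, hits = None, 0
    for item in trace:
        hits += (item == cached)
        counts[item] += 1
        cached = max(counts, key=counts.get)
    return hits / len(trace)

trace = two_item_trace(100_000)
```

During item 1's bursts LRU hits most of its requests and it still hits almost all of item 2's requests during the long gaps, so its hit rate exceeds LFU's, which is capped at item 2's share of the traffic.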
I consider a cache (virtually) divided into lists: list 1, ..., list j, list j+1, ..., list h.
IRM: at each time step, item i is requested with probability p_i (IRM assumption).
MISS: if item i is not in the cache, it is exchanged with an item from list 1 (chosen FIFO or RAND).
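A minimal sketch of this structure, under two assumptions: the RAND variant (the evicted slot of list 1 is chosen uniformly at random), and a hit behavior in which an item hit in list j < h is exchanged with a random item of list j+1 (the promotion rule of the Gast-Van Houdt family; the slides only describe the miss behavior at this point).

```python
import random

class ListCache:
    """List-based cache, RAND variant (assumed semantics: a miss replaces a
    random item of list 1; a hit in list j < h swaps with a random item of
    list j+1, so popular items climb toward list h)."""

    def __init__(self, list_sizes, seed=0):
        self.rng = random.Random(seed)
        self.lists = [[None] * m for m in list_sizes]  # None = empty slot

    def request(self, item):
        """Return True on a hit, False on a miss."""
        for j, lst in enumerate(self.lists):
            if item in lst:
                if j + 1 < len(self.lists):
                    # hit: swap the item with a random item of the next list
                    i = lst.index(item)
                    k = self.rng.randrange(len(self.lists[j + 1]))
                    lst[i], self.lists[j + 1][k] = self.lists[j + 1][k], lst[i]
                return True
        # miss: the requested item replaces a random item of list 1
        k = self.rng.randrange(len(self.lists[0]))
        self.lists[0][k] = item
        return False
```

With h = 1 this reduces to RAND on the whole cache, which is why the family generalizes FIFO/RANDOM.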