ACME: Adaptive Caching Using Multiple Experts


  1. ACME: Adaptive Caching Using Multiple Experts
     By Ismail Ari, Ahmed Amer, Robert Gramacy, Ethan L. Miller, Scott A. Brandt, Darrell D. E. Long
     University of California, Santa Cruz

  2. Caching
     • “gap between CPU speeds and the speed of the technologies providing the data is increasing”
       − Sound familiar? (Note that this is from 2002...)
       − “Good” caching techniques are important so that the requested data gets to the CPU / client quickly

  3. Caching of Network Traffic
     • Focus of the paper is efficient caching of remote/web data (as opposed to caching within a single PC)
     • Delay of remote data can be attributed to network latency and I/O latency at servers
     • Research has been done on workloads of remote/web data
       − Various static cache replacement policies are implemented in actual systems (these can be thought of as proxies/“cache nodes” within a network of computers)
       − These static policies struggle to adapt to changes in workload over time / between different locations

  4. Caching Complexities (Network Version)
     • Characteristics of workloads change over time
     • Workloads mix when a system processes multiple workloads
     • Workload differs based on the location of the cache node in the network topology
       − Characteristics of the workload in a network “hub” likely differ from the workload toward the “edge” of the network

  5. Existing Caching Techniques
     • Lots of research has been done on developing effective cache replacement algorithms
       − Cache replacement policies include Random, First-In-First-Out (FIFO), Last-In-First-Out (LIFO), Least Recently Used (LRU), Least Frequently Used (LFU), SIZE, GDSF, LFUDA, etc.

  6. Trends in “Cache” Research
     • Trend is toward finding a “magic” function that unites the desired criteria into a single key or value
       − But won't workload changes affect the effectiveness of a particular cache replacement scheme?
       − Might one function be better for the workload at time A, and another function be better for the workload at time B...?
       − Is there a better way?
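
As a rough illustration of the single-key idea on slides 5 and 6, the sketch below reduces each cached object to one number per policy and evicts the object with the smallest key. The `CachedObject` fields, the function names, and the sample objects are invented for illustration; the GDSF key uses the commonly cited form (clock + frequency × cost / size) with an assumed cost of 1, and none of this is taken from the paper itself.

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    name: str
    size: int           # bytes
    last_access: int    # logical time of the most recent request
    frequency: int      # number of requests seen so far
    cost: float = 1.0   # assumed retrieval cost (hypothetical default)


# Each policy maps an object to a single key; the smallest key is evicted first.
def lru_key(obj, clock=0.0):
    return obj.last_access          # oldest access goes first


def lfu_key(obj, clock=0.0):
    return obj.frequency            # least requested goes first


def size_key(obj, clock=0.0):
    return -obj.size                # SIZE evicts the largest object first


def gdsf_key(obj, clock=0.0):
    # Commonly cited GDSF priority: clock + frequency * cost / size.
    return clock + obj.frequency * obj.cost / obj.size


def choose_victim(objects, key, clock=0.0):
    """Return the object that the given policy would evict next."""
    return min(objects, key=lambda o: key(o, clock))


objects = [
    CachedObject("a", size=100, last_access=5, frequency=2),
    CachedObject("b", size=5000, last_access=9, frequency=3),
    CachedObject("c", size=200, last_access=1, frequency=40),
]
```

On these three made-up objects the policies disagree: `lru_key` picks `c`, `lfu_key` picks `a`, and `size_key` and `gdsf_key` both pick `b`. That disagreement is the reason the choice of a single fixed key matters.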

  7. Adaptive Caching
     • According to the paper, the answer is “yes, there is a better way”
       − Goes on to present adaptive caching...
     • Makes it possible to change the cache replacement policy on the fly (maybe from LRU to SIZE)
     • Allows different caching policies to be used at times A and B
     • No need to choose a single cache replacement policy and then be “stuck” with it...
     • Now, how is it implemented?

  8. Adaptive Caching Scheme Design
     • Goal: combine potentially “weak” predictions of different cache replacement policies into a single accurate prediction
     • Uses a pool of static cache replacement algorithms, each linked to a virtual cache
       − On each request, each virtual cache (attached to a particular algorithm) records whether there would have been a “hit” had it been the actual cache
       − Increase the influence of algorithms that would have gotten a “hit” on the given request, and decrease the influence of algorithms that would have previously evicted the requested object (note this uses machine learning/expert systems)
       − Virtual caches are larger than the physical cache

  9. Adaptive Caching Scheme Design
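
The design on slide 8 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: it uses only two experts (LRU and FIFO), counts capacity in objects rather than bytes, and applies a simple multiplicative penalty `beta` to any expert whose virtual cache missed. The class names, `update_weights`, `run`, and `beta` are all hypothetical. The physical cache and the actual eviction decision are left out; only the bookkeeping of expert weights is shown.

```python
from collections import OrderedDict


class VirtualLRU:
    """Virtual cache that stores only object ids and evicts the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity          # capacity in objects (assumption)
        self.items = OrderedDict()

    def request(self, key):
        hit = key in self.items
        if hit:
            self.items.move_to_end(key)   # refresh recency on a hit
        else:
            if len(self.items) >= self.capacity:
                self.items.popitem(last=False)   # drop least recently used id
            self.items[key] = True
        return hit


class VirtualFIFO:
    """Virtual cache that stores only object ids and evicts the oldest insertion."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def request(self, key):
        hit = key in self.items
        if not hit:
            if len(self.items) >= self.capacity:
                self.items.popitem(last=False)   # drop the oldest inserted id
            self.items[key] = True
        return hit


def update_weights(weights, hits, beta=0.5):
    """Assumed rule: scale the weight of every expert that missed by beta, then normalise."""
    scaled = {name: w * (1.0 if hits[name] else beta) for name, w in weights.items()}
    total = sum(scaled.values())
    return {name: w / total for name, w in scaled.items()}


def run(trace, capacity=2, beta=0.5):
    """Replay a request trace through all virtual caches and return the final weights."""
    experts = {"LRU": VirtualLRU(capacity), "FIFO": VirtualFIFO(capacity)}
    weights = {name: 1.0 / len(experts) for name in experts}
    for key in trace:
        hits = {name: cache.request(key) for name, cache in experts.items()}
        weights = update_weights(weights, hits, beta)
    return weights
```

For a trace such as `["a", "b", "a", "c", "a", "d", "a"]`, LRU keeps the repeatedly requested object while FIFO evicts it once, so `run` ends with more weight on LRU than on FIFO.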

  10. Results: Part 1
     • Workload: Web proxy trace gathered at Digital Equipment Corporation (DEC)
       − Taken on September 16, 1996
       − Proxy served 14,000 workstations
       − Contains 1,245,260 requests for 524,616 unique items (6 GB of unique data)
     • Tested: 12 cache replacement policies using a cache size of 64 MB
       − Each policy tested by itself, no “combined” policy incorporating weights of different policies...

  11. Results: Part 1
     • Different cache replacement policies “won” for different intervals of the workload
       − The best cache replacement algorithm at time A differs from the best at time B for many (time A, time B) pairs

  12. Results: Part 1
     • If a single static cache replacement policy “had” to be selected, LRU would be the best choice for this workload...
       − Still, the “cumulative average of the difference between the byte hit rate of the best policy (in each interval)” and “having” to use LRU in each interval is 3% (in absolute terms)
       − Since the average byte hit rate is under 20%, this indicates that always picking the best policy per interval would increase the byte hit rate by at least 15% in relative terms over sticking with the “best” static policy
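
The step from 3% absolute to “at least 15%” relative can be made explicit. Writing $h$ for the average byte hit rate of the static policy and $\Delta$ for the absolute gap (both symbols are introduced here only for illustration):

```latex
\[
\frac{\Delta}{h} \;>\; \frac{0.03}{0.20} \;=\; 0.15
\qquad \text{since } \Delta = 0.03 \text{ and } h < 0.20 .
\]
```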

  13. Results: Part 2
     • Workload: synthetic request stream that favors the LRU algorithm until 500 seconds and then favors the SIZE algorithm for cache replacement
       − Results show the difference between choosing the “best current” and the “best overall” policy
       − If an implementation is stuck on choosing the “best overall” policy, which looks at entire past performance, it is not likely to switch away from LRU when the workload changes
       − On the other hand, an adaptive algorithm will look at recent success and switch to the SIZE policy when appropriate

  14. Results: Part 2
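
The difference between “best overall” and “best current” can be sketched with a few lines of bookkeeping. This is a sketch only, assuming per-interval hit rates are already available for each policy; the function name `pick_policies`, the `window` parameter, and any hit rates fed to it are invented for illustration and are not the paper's data.

```python
from collections import deque


def pick_policies(hit_history, window):
    """For each interval, return (best_overall, best_current).

    hit_history: list of dicts mapping policy name -> hit rate in that interval.
    best_overall uses the sum over all intervals seen so far;
    best_current uses only the most recent `window` intervals.
    """
    totals = {}
    recent = deque(maxlen=window)     # sliding window of the latest intervals
    choices = []
    for interval in hit_history:
        recent.append(interval)
        for name, rate in interval.items():
            totals[name] = totals.get(name, 0.0) + rate
        best_overall = max(totals, key=totals.get)
        windowed = {name: sum(r[name] for r in recent) for name in interval}
        best_current = max(windowed, key=windowed.get)
        choices.append((best_overall, best_current))
    return choices
```

With five made-up intervals where LRU scores 0.3 and SIZE 0.1, followed by three where the rates swap, and `window=2`, the last interval yields `("LRU", "SIZE")`: the cumulative choice is still held back by LRU's early lead, while the windowed choice has already switched.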

  15. What if the workload is random?
     • Paper states that “an adaptive algorithm based on learning will have its limits when the workload is completely random, since learning works whenever there is at least some information in the form of repetitive patterns”
       − Still, it is possible that an adaptive algorithm will favor algorithms that cache small objects in this situation, since it can cache more of them, which may lead to more hits

  16. What does the paper show?
     • Only a “proof of concept”, no actual implementation
       − Interesting that the adaptive caching scheme design presented in the paper isn't actually used to generate the results
       − Why? It seems like it might make more sense to show the “results” and then describe the design as “future work”

  17. Future Work
     • Authors note that in an actual implementation, it would be necessary to minimize space/overheads, possibly necessitating performance trade-offs
     • Also necessary to determine the size of the “window” of data used to determine the “best current” policy
       − Not possible to instantaneously switch to the “best current” policy at the “perfect” time in a real-time implementation with unpredictable data...

  18. Quality of paper
     • If reviewing it, would I recommend it be accepted?
       − Pros: gives a good overview of current caching techniques; gives a convincing argument for a “problem” that needs to be solved for caching (particularly focused on caching of remote data); presents a novel method of caching and promising results on an actual workload; the method is general and could be used in any cache
       − Cons: the design presented is not actually used in the results; the results presented are all theoretical and adjustments would need to be made for a “real-life” implementation; should be more focused (the paper also presents some interesting two-level cache observations/experiments, but this is off the “adaptive caching” theme); the paper seems sloppy in a couple of places (see: results graph / description in the second set of results)

  19. “Legacy” of Paper
     • Has been cited 65 times according to Google Scholar
     • Possible reason: the paper presented a novel idea that has since been built on...

  20. “Legacy” of Paper: Papers that cited it

  21. “Legacy” of Paper
     • Work is continued in a later paper from the same lab: “Adaptive Caching by Refetching”
     • Maybe someone should read/present that paper in a future class
     • That paper states that “this paper presents specific algorithmic solutions that address the need (for adaptive caching) identified in that work” (“that work” being the work I'm presenting here...)

  22. Web Workloads
     • Are they “cacheable”?
       − In the paper, the authors state that some believe the answer is no
         • Large amount of “customized” content on the web; different people retrieve different data
       − Authors disagree
         • Point to a 1999 study of “Web proxy workload characterization”
           − Study spanned 5 months and 117 million requests
           − Results “reported that 92% of all the requests accounting for 96% of the data transferred was cacheable and high hit rates were achieved by proxies”
         • Also note that there are proposed methods for dealing with dynamic content

  23. Web Workloads
     • However, this is from 2002, and things have changed...
     • Social networking sites: everyone looks at different “stuff” there (unlike CNN.com...)
     • How Facebook deals with caching is an interesting topic in itself...

  24. Search results for “Facebook caching” on Google
