Adaptive Look-Ahead Window Assisted Chunk Caching


  1. ALACC: Accelerating Restore Performance of Data Deduplication Systems Using Adaptive Look-Ahead Window Assisted Chunk Caching. Zhichao Cao, Hao Wen, Fenggang Wu and David H.C. Du. University of Minnesota, Twin Cities. 02/15/2018

  2. Agenda • Deduplication Process • Restore Process with Different Caching Schemes – Container/chunk based caching – Forward Assembly • Objective and Challenges • Proposed Approach – Look-ahead window assisted chunk based caching (all fixed) – Adaptive Look-ahead Chunk-based Caching (ALACC) • Evaluations • Conclusions and Future Work Center for Research in Intelligent Storage

  4. Deduplication Process [1] • Recipe entry: 1. Chunk ID 2. Chunk size 3. Container address 4. Offset in the container 5. Other meta information [Diagram labels: Byte Stream, Sliding Window, Indexing Table (lookup result: Not Found), Container Buffer, Container Storage, Recipe] [1] Zhu B, Li K, Patterson R H. Avoiding the Disk Bottleneck in the Data Domain Deduplication File System[C]//FAST. 2008, 8: 1-14.

  5. Deduplication Process [1] [Diagram labels: Byte Stream (with an additional chunk 5), Sliding Window, Indexing Table (lookup result: Exists), Container Buffer, Container Storage, Recipe] [1] Zhu B, Li K, Patterson R H. Avoiding the Disk Bottleneck in the Data Domain Deduplication File System[C]//FAST. 2008, 8: 1-14.
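The write path on slides 4 and 5 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes fixed-size chunks instead of the sliding-window chunking shown on the slides, SHA-1 fingerprints as chunk IDs, and an in-memory dictionary as the indexing table. The names `deduplicate`, `CHUNK_SIZE` and `CONTAINER_CHUNKS` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4        # assumed fixed chunk size in bytes (illustration only)
CONTAINER_CHUNKS = 4  # assumed number of chunks per container


def deduplicate(stream: bytes):
    """Return (recipe, containers) for a byte stream."""
    index = {}        # chunk ID -> (container address, offset in the container)
    containers = []   # sealed containers, each a list of chunk payloads
    buffer = []       # the open container buffer
    recipe = []       # one entry per chunk, in stream order
    for pos in range(0, len(stream), CHUNK_SIZE):
        chunk = stream[pos:pos + CHUNK_SIZE]
        chunk_id = hashlib.sha1(chunk).hexdigest()
        if chunk_id not in index:
            # "Not Found": place the new chunk in the container buffer.
            index[chunk_id] = (len(containers), len(buffer))
            buffer.append(chunk)
            if len(buffer) == CONTAINER_CHUNKS:
                containers.append(buffer)  # seal the full container
                buffer = []
        # Found or not, the recipe records where the chunk lives.
        address, offset = index[chunk_id]
        recipe.append((chunk_id, len(chunk), address, offset))
    if buffer:
        containers.append(buffer)  # flush the partially filled container
    return recipe, containers
```

Each recipe tuple holds the first four recipe-entry fields listed on slide 4; the fifth ("other meta information") is left out of the sketch.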

  7. Why Is Improving Restore Performance Important? • Because of serious data fragmentation and the size mismatch between the requested data and the I/O unit, restore performance is much lower than that of directly reading data that is not deduplicated. • CPU and memory resources are limited. [Diagram labels: requested chunk sequence, Container Storage]

  8. Restore Process with Container-based Caching [Diagram labels: Recipe (read along the Restore Direction), Container Cache, Assembling Buffer, Restored Data, Container Storage]
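A minimal sketch of a restore loop with a container cache, counting container reads. The slide does not state the replacement policy; LRU is assumed here because slide 11 labels the scheme "Container_LRU". The function name, the `(container_id, offset)` recipe layout and the in-memory `containers` dictionary are hypothetical stand-ins for the real storage.

```python
from collections import OrderedDict


def restore_with_container_cache(recipe, containers, cache_size):
    """recipe: list of (container_id, offset); containers: id -> list of chunks.

    cache_size is the number of whole containers the cache can hold (>= 1).
    """
    cache = OrderedDict()  # container_id -> chunks, least recently used first
    restored, container_reads = [], 0
    for container_id, offset in recipe:
        if container_id in cache:
            cache.move_to_end(container_id)  # cache hit: mark as recently used
        else:
            container_reads += 1  # cache miss: simulated read from storage
            cache[container_id] = containers[container_id]
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used
        restored.append(cache[container_id][offset])
    return restored, container_reads
```

With three two-chunk containers and the recipe `[(0, 0), (1, 0), (0, 1), (2, 1), (1, 1)]`, a two-container cache needs four container reads while a three-container cache needs three, which is the cache-size sensitivity the comparison slides discuss.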

  9. Restore Process with Chunk-based Caching [Diagram labels: Recipe, Chunk Cache, Container Read Buffer, Assembling Buffer, Restored Data, Container Storage]

  10. Container-based Caching vs. Chunk-based Caching • Container-based Caching – Advantage: less operating and management overhead – Disadvantage: relatively higher cache miss ratio, especially when the caching space is limited • Chunk-based Caching – Advantages: 1. Higher cache hit ratio 2. Even higher if a look-ahead window is applied – Disadvantage: higher operating and management overhead

  11. Container-based Caching vs. Chunk-based Caching [Charts: (a) container-reads per 100MB restored vs. total cache size; (b) computing time (seconds/GB) vs. total cache size; series: Container_LRU, Chunk_LRU]

  12. Forward Assembly Scheme [1] [Diagram labels: Recipe, Look-Ahead Window, Container Read Buffer, Forward Assembling Area (FAA), Restored Data, Container Storage] [1] Lillibridge M, Eshghi K, Bhagwat D. Improving restore speed for backup systems that use inline chunk-based deduplication[C]//FAST. 2013: 183-198.
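The idea of forward assembly can be sketched as below, under simplifying assumptions: the recipe is cut into consecutive, non-overlapping windows of `faa_slots` chunks, and each container needed by a window is read once and used to fill every slot of the FAA that refers to it. The function name, the `(chunk_id, container_id)` recipe layout and the in-memory `containers` dictionary are hypothetical, not taken from the slides.

```python
def forward_assembly(recipe, containers, faa_slots):
    """recipe: list of (chunk_id, container_id);
    containers: container_id -> {chunk_id: data}, with data never None."""
    restored, container_reads = [], 0
    for start in range(0, len(recipe), faa_slots):
        window = recipe[start:start + faa_slots]  # recipe range covered by the FAA
        faa = [None] * len(window)                # the forward assembling area
        for i, (_, container_id) in enumerate(window):
            if faa[i] is not None:
                continue  # slot already filled by an earlier container read
            container = containers[container_id]  # simulated container read
            container_reads += 1
            # Fill every remaining slot that needs a chunk from this container.
            for j in range(i, len(window)):
                chunk_id, cid = window[j]
                if cid == container_id:
                    faa[j] = container[chunk_id]
        restored.extend(faa)  # flush the assembled range as restored data
    return restored, container_reads
```

In this sketch a chunk reused outside the current window forces another container read, which is the case where slide 13 notes that caching is more effective.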

  13. Chunk-based Caching vs. Forward Assembly • Forward Assembly – Advantages: 1. Highly efficient when chunks from the same container are mostly used within the FAA range 2. Low operating and management overhead – Disadvantage: workload sensitive; requires good workload locality • Chunk-based Caching – Advantage: 1. When chunks are re-used over a relatively long distance (larger than the FAA), caching is more effective – Disadvantage: higher operating and management overhead

  15. Objective and Challenges • Objective: – Combine forward assembly, chunk-based caching, and a look-ahead window (LAW) under limited memory space • Challenges: – When the total memory available for restore is limited and fixed, it is unclear how to use these schemes efficiently. – How to make better trade-offs that achieve fewer container-reads while limiting the computing overhead of the LAW, caching, and forward assembly. – Making the design adapt to a changing workload is very challenging.

  17. Look-ahead Window Assisted Chunk Cache: What's the caching policy? [Diagram labels: Look-Ahead Window (LAW) Covered Range, FAA Covered Range, Information for Chunk Cache, Restored, Unknown, FAA (FAB1, FAB2), Container Read Buffer, Chunk Cache, Restored Data, Container Storage]

  18. Chunks in the Read-in Container • P-chunk: Probably used chunk (10 in the diagram) • F-chunk: Future used chunk (12 and 13 in the diagram) • U-chunk: Unused chunk (17 in the diagram) [Diagram labels: Look-Ahead Window Covered Range, FAA Covered Range, Information for Chunk Cache, Restored, Unknown, FAA (FAB1, FAB2), Container Read Buffer, Chunk Cache (F-cache, P-cache), Restored Data, Container Storage]

  19. Caching Priority of F-cache • F-chunks used in the near future have higher priority than F-chunks used in the far future. [Diagram: the F-cache in the Chunk Cache, ordered from a high priority end to a low priority end]
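One way to read the F-cache priority rule is sketched below: rank each cached F-chunk by the position of its next use in the part of the look-ahead window beyond the FAA, and evict from the low priority end (the chunk whose next use is farthest away). The function names and the list-based representation are hypothetical, and the chunk IDs in the usage note are illustrative only.

```python
def f_cache_order(f_cache, law_beyond_faa):
    """Return cached F-chunks ordered from high priority (used soonest) to low.

    f_cache: iterable of chunk IDs; law_beyond_faa: chunk IDs in recipe order.
    """
    next_use = {}
    for pos, chunk_id in enumerate(law_beyond_faa):
        next_use.setdefault(chunk_id, pos)  # keep only the nearest use
    # Chunks that never appear in the window sort last (assumption).
    never = len(law_beyond_faa)
    return sorted(f_cache, key=lambda c: next_use.get(c, never))


def evict_from_f_cache(f_cache, law_beyond_faa):
    """Return (victim, remaining), taking the victim from the low priority end."""
    ordered = f_cache_order(f_cache, law_beyond_faa)
    if not ordered:
        raise ValueError("F-cache is empty")
    return ordered[-1], ordered[:-1]
```

For example, with the window `[14, 10, 15, 7, 22, 18, 23, 12, 13, 32, 23, 28, 6]` and cached chunks `[13, 22, 32, 12]`, the order is `[22, 12, 13, 32]` and chunk 32 is evicted first.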

  20. Caching Priority of P-cache • The P-cache uses an LRU-based caching policy. [Diagram: the Chunk Cache (F-cache and P-cache) with a high priority end and a low priority end]
