
Ziggurat: A Tiered File System for Non-Volatile Main Memories and Disks



  1. Ziggurat: A Tiered File System for Non-Volatile Main Memories and Disks
     Shengan Zheng†, Morteza Hoseinzadeh§, Steven Swanson§
     †Shanghai Jiao Tong University, §University of California, San Diego

  2. Background
     • Non-volatile main memory (NVMM)
       – Byte-addressability
       – Persistence
       – Direct access (DAX)
     • NVMM file systems
       – PMFS, SCMFS, NOVA
       – EXT4-DAX, XFS-DAX
       – Capacity?
     [Figure labels: 3D-XPoint NVDIMM; DRAM + Flash NVDIMM]

  3-4. Motivation
     [Figure: bandwidth (axis ticks 100MB/s, 1GB/s, 10GB/s) against cost in $/GB (axis ticks 0.01, 0.1, 1, 10) for DRAM, NVMM, Optane SSD, NVMe SSD, SATA SSD, and hard disk drive]

  5. Tiered Storage System
     • SSD for speed
     • HDD for capacity
     [Figure: two tiers, SSD over HDD]

  6. Tiered Storage System
     • NVMM for speed
     • Disks for capacity
     [Figure: NVMM tier over SSD and HDD]

  7. Ziggurat Overview
     • Intelligent data placement policy
       – Send writes to the most suitable tier
       – High NVMM space utilization
     • Accurate predictors
       – Predict the synchronicity of each file (synchronicity predictor)
       – Predict the size of future writes to each file (write size predictor)
     • Efficient migration mechanism
       – Only migrate cold data in cold files
       – Migrate file data in groups

  8. Outline
     • Motivation
     • Data placement policy
     • Migration mechanism
     • Evaluation
     • Conclusion

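  9. Data Placement Policy
     • Although NVMM is the fastest tier in Ziggurat, file writes should not always go to NVMM.
     • Data placement (decision diagram)
       – Synchronicity predictor: synchronously-updated files go to NVMM
       – Synchronicity predictor: asynchronously-updated files go on to the write size predictor
       – Write size predictor: large writes go to disk
       – Write size predictor: small writes go to NVMM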
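
The decision diagram on this slide can be read as a two-step rule. The sketch below is only an illustration of that reading, not Ziggurat's kernel code: `Tier`, `choose_tier`, and both boolean inputs are hypothetical names, and the two predictors are assumed to have already produced their verdicts.

```python
from enum import Enum


class Tier(Enum):
    """Storage tiers named on the slide (hypothetical enum for illustration)."""
    NVMM = "NVMM"
    DISK = "disk"


def choose_tier(file_is_synchronous: bool, write_is_large: bool) -> Tier:
    """Pick a tier for an incoming write, following the slide's diagram.

    Assumption: both inputs come from the synchronicity predictor and the
    write size predictor described on the later slides.
    """
    # Synchronously-updated files are always steered to NVMM.
    if file_is_synchronous:
        return Tier.NVMM
    # Asynchronously-updated files: large writes go to disk, small ones to NVMM.
    return Tier.DISK if write_is_large else Tier.NVMM


if __name__ == "__main__":
    for sync in (True, False):
        for large in (True, False):
            print(sync, large, choose_tier(sync, large).value)
```

Only the asynchronous, large-write combination ends up on disk; every other combination stays in NVMM.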

  10-20. Synchronicity Predictor
     • Predict whether the future accesses are likely to be synchronous
     • Write entry format: offset, length
     • Each file keeps a count of data blocks written since the last fsync(), shown as "Data blocks written: n / 4" (file data blocks 0 to 7)
     • Synchronous example: write(0,2); fsync(); write(2,2); fsync(); write(4,2); fsync();
       – Data blocks written after each write: 2 / 4; each fsync() resets the count to 0 / 4
       – File log: 0,2 2,2 4,2
       – File classified Synchronous
     • Asynchronous example: write(0,2); write(2,2); write(4,2); fsync();
       – Data blocks written after each write: 2 / 4, 4 / 4, 6 / 4; the final fsync() resets the count to 0 / 4
       – File log: 0,2 2,2 4,2
       – File classified Asynchronous
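
The counter animation on these slides can be sketched as a small per-file state machine. This is a minimal illustration under stated assumptions, not the actual implementation: the class and method names are hypothetical, the threshold of 4 blocks is taken from the "n / 4" label, the classification is assumed to be refreshed at each fsync(), and the behaviour at exactly 4 blocks is an assumption because the slides do not show that case.

```python
class SyncPredictor:
    """Hypothetical per-file synchronicity predictor (illustration only)."""

    def __init__(self, threshold_blocks: int = 4):
        self.threshold_blocks = threshold_blocks  # the "/ 4" on the slides
        self.blocks_since_fsync = 0               # "Data blocks written"
        self.synchronous = True                   # assumed initial state

    def on_write(self, offset: int, length: int) -> None:
        # Count data blocks written since the last fsync().
        self.blocks_since_fsync += length

    def on_fsync(self) -> None:
        # Assumption: few blocks between two fsync() calls means the file is
        # synchronously updated; "<" at the boundary is a guess.
        self.synchronous = self.blocks_since_fsync < self.threshold_blocks
        self.blocks_since_fsync = 0


def run_trace(trace):
    """Replay a list of ("write", offset, length) / ("fsync",) operations."""
    predictor = SyncPredictor()
    for op in trace:
        if op[0] == "write":
            predictor.on_write(op[1], op[2])
        else:
            predictor.on_fsync()
    return predictor


# The two traces from the slides.
sync_trace = [("write", 0, 2), ("fsync",), ("write", 2, 2), ("fsync",),
              ("write", 4, 2), ("fsync",)]
async_trace = [("write", 0, 2), ("write", 2, 2), ("write", 4, 2), ("fsync",)]

if __name__ == "__main__":
    print(run_trace(sync_trace).synchronous)
    print(run_trace(async_trace).synchronous)
```

In the first trace the count never rises above 2 before an fsync(), so the file stays synchronous; in the second it reaches 6 before the only fsync(), so the file is classified asynchronous.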

  21-29. Write Size Predictor
     • Predict whether the incoming writes are both large and stable
     • Write entry format: offset, length, counter
     • File log before the incoming writes: 0,4,3 4,4,1 5,1,0 6,1,0 (file data blocks 0 to 7)
     • Incoming writes: write(0,4); write(6,1); write(4,4);
       – write(0,4): length ≥ 4, so large; predecessor found, so stable; new entry 0,4,4
       – write(6,1): length < 4, so small; predecessor found, so stable; new entry 6,1,0
       – write(4,4): length ≥ 4, so large; predecessor not found, so unstable; new entry 4,4,?
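
The three incoming writes on these slides can be replayed with a small model of the file log. Everything below is a sketch under loudly stated assumptions rather than Ziggurat's code: `WriteSizeLog` and its methods are hypothetical, the "predecessor found" rule (all overwritten blocks are currently owned by a single entry with the same offset and length) is a guess chosen because it reproduces the three outcomes shown, and resetting the counter to 0 for the unstable write is also an assumption, since the slides leave that entry as 4,4,?.

```python
LARGE_BLOCKS = 4  # "Length ≥ 4" on the slides


class WriteSizeLog:
    """Hypothetical per-file log of (offset, length, counter) write entries."""

    def __init__(self, num_blocks: int = 8):
        self.entries = []
        self.owner = [None] * num_blocks  # block index -> latest covering entry

    def append(self, offset: int, length: int, counter: int):
        entry = (offset, length, counter)
        self.entries.append(entry)
        for block in range(offset, offset + length):
            self.owner[block] = entry
        return entry

    def find_predecessor(self, offset: int, length: int):
        # Assumed rule: one entry with the same offset and length must
        # currently own every block in the written range.
        owners = {self.owner[b] for b in range(offset, offset + length)}
        if len(owners) != 1:
            return None
        candidate = next(iter(owners))
        if candidate is not None and candidate[:2] == (offset, length):
            return candidate
        return None

    def write(self, offset: int, length: int):
        """Log a write; return (is_large, is_stable, new_counter)."""
        predecessor = self.find_predecessor(offset, length)
        is_large = length >= LARGE_BLOCKS
        is_stable = predecessor is not None
        # Large and stable: inherit the predecessor's counter and add one.
        # Otherwise the counter is 0 (an assumption for the unstable case).
        counter = predecessor[2] + 1 if (is_large and is_stable) else 0
        self.append(offset, length, counter)
        return is_large, is_stable, counter


def slide_log() -> WriteSizeLog:
    """Rebuild the file log shown on the slides: 0,4,3 4,4,1 5,1,0 6,1,0."""
    log = WriteSizeLog()
    for entry in [(0, 4, 3), (4, 4, 1), (5, 1, 0), (6, 1, 0)]:
        log.append(*entry)
    return log


if __name__ == "__main__":
    log = slide_log()
    for offset, length in [(0, 4), (6, 1), (4, 4)]:
        print((offset, length), log.write(offset, length))
```

Under this model, write(4,4) finds no predecessor because blocks 4 to 7 are owned by three different entries (4,4,1, 5,1,0, and 6,1,0), which is one plausible reading of why the slide labels it unstable.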
