File Systems Fated for Senescence? Nonsense, Says Science!

  1. File Systems Fated for Senescence? Nonsense, Says Science! Alex Conway 🃠, Ainesh Bakshi 🃠, Yizheng Jiao ♢, Yang Zhan ♢, Michael A. Bender ♠, William Jannen ♠, Rob Johnson ♠, Bradley C. Kuszmaul ♡, Donald E. Porter ♢, Jun Yuan ♣ and Martin Farach-Colton 🃠. Affiliations: 🃠 Rutgers University; ♢ The University of North Carolina at Chapel Hill; ♠ Stony Brook University; ♡ Oracle Corporation and Massachusetts Institute of Technology; ♣ Farmingdale State College of SUNY

  3. File Systems Fated for Senescence (old age)? Nonsense, Says Science; The Essence of Semperjuvenescense (being young forever) is Coalescence (merging together)!

  4. File System Aging. Aging is fragmentation over time. [Figure axis: Performance]

  5. In this talk: Do file systems age? What can we do about it?

  6. Is aging a problem?

  10. Is aging a problem? Chris Hoffman at howtogeek.com says: “Linux’s ext2, ext3, and ext4 file systems… [are] designed to avoid fragmentation in normal use.” “If you do have problems with fragmentation on Linux, you probably need a larger hard disk.” “Modern Linux filesystems keep fragmentation at a minimum…Therefore it is not necessary to worry about fragmentation in a Linux system.” Nope

  12. Is aging a problem? Aging happens in real filesystems • Smith and Seltzer (’97). Benchmarks should incorporate aging • Zhu, Chen and Chiueh (’05) • Agrawal, A. Arpaci-Dusseau and R. Arpaci-Dusseau (’09). Yep

  13. Is aging a problem? Nope Yep

  14. Let’s do some science!

  16. Inducing Aging. We use three different workloads • Developer workload • Server workload • Synthetic workloads. See the paper.

  28. Simulating a Developer: get coffee, git pull, make, get coffee, git pull, add awesome features, get coffee, git pull, fix bugs, ... We can simulate a developer by replaying Git histories.

  30. Simulating a Developer • Use the Linux kernel repo from github.com • Do 100 git pulls • Measure performance
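
The pull-then-measure loop on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' scripts (those are said to be available from betrfs.org). The names `list_commits`, `replay`, `pull`, and `measure` are hypothetical, and how each commit actually gets pulled onto the file system under test is left to the caller.

```python
import subprocess


def list_commits(repo_path):
    """Return the commit hashes reachable from HEAD, in the reverse of
    git's default newest-first order (so roughly oldest first)."""
    out = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--reverse", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def replay(commits, pull, measure, pulls_per_measurement=100):
    """Call pull(commit) for each commit in order and record measure()
    after every `pulls_per_measurement` pulls.

    `pull` and `measure` are caller-supplied callbacks (assumptions of
    this sketch). Returns a list of (pulls_performed, measurement) pairs.
    """
    if pulls_per_measurement <= 0:
        raise ValueError("pulls_per_measurement must be positive")
    results = []
    for count, commit in enumerate(commits, start=1):
        pull(commit)
        if count % pulls_per_measurement == 0:
            results.append((count, measure()))
    return results
```

Keeping the pull and measurement steps as callbacks means the same loop can drive any file system or benchmark without changing the replay logic.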

  41. Measuring Aging: time grep -r random_string /path/to/filesystem [Diagram: directory dir containing file1, file2, file3, file4, labeled with interfile fragmentation and intrafile fragmentation.] Then normalize per gigabyte read.
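
The measurement on this slide, timing a recursive grep and normalizing by the amount of data read, might look like the sketch below. The helper names (`tree_size_bytes`, `seconds_per_gib`, `timed_grep`) are illustrative assumptions. The sketch also does not clear the page cache, which a cold-cache measurement like the one in the talk would need to handle separately.

```python
import os
import subprocess
import time

GIB = 1024 ** 3


def tree_size_bytes(root):
    """Total size of the regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def seconds_per_gib(elapsed_seconds, bytes_read):
    """Normalize a scan time to seconds per GiB read."""
    if bytes_read <= 0:
        raise ValueError("bytes_read must be positive")
    return elapsed_seconds * GIB / bytes_read


def timed_grep(root, needle="random_string"):
    """Wall-clock time of `grep -r needle root`.

    The needle is not expected to match, so grep's exit status 1 is
    normal and is not treated as an error.
    """
    start = time.perf_counter()
    subprocess.run(
        ["grep", "-r", needle, root],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False,
    )
    return time.perf_counter() - start
```

For instance, 15 minutes to grep 1.2 GiB normalizes to 900 / 1.2 = 750 seconds per GiB, which is what `seconds_per_gib(900, 1.2 * GIB)` computes.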

  42. Do modern file systems age?

  43. Git Workload on ext4 on HDD. [Figure: time in seconds per GiB (y-axis, 0 to 800; lower is better) vs. Git pulls performed; annotation: 14.3x.] Our Setup: Cold Cache, 3.4 GHz Quad Core, 4 GiB RAM, 20 GiB HDD partition (SATA, 7200 RPM)

  45. Git Workload on ext4 on HDD. [Same figure and setup, annotated: 2x slowdown, 4x slowdown.]

  46. Git Workload on ext4 on HDD. [Same figure and setup, annotated: 15 minutes to grep 1.2 GiB.]

  48. How can we be sure this slowdown is due to aging? “I’m not old. My directory structure is different!”

  49. File System Rejuvenation. Idea: copy the same logical state to a new file system • After every 100 pulls • Compare grep cost
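
The rejuvenation comparison could be sketched as below: copy the aged tree to a fresh location and express the aged grep cost as a multiple of the unaged one. This assumes the destination already sits on a freshly created file system (creating and mounting it is outside the sketch). `rejuvenate` and `slowdown` are hypothetical names, and using `shutil.copytree` for the copy is a choice made here, not something stated in the talk.

```python
import shutil


def rejuvenate(aged_root, fresh_root):
    """Copy the aged tree's logical state (directories, files, symlinks)
    to fresh_root, which must not exist yet. Returns the destination."""
    return shutil.copytree(aged_root, fresh_root, symlinks=True)


def slowdown(aged_seconds_per_gib, unaged_seconds_per_gib):
    """How many times slower the aged file system is than the fresh copy."""
    if unaged_seconds_per_gib <= 0:
        raise ValueError("unaged cost must be positive")
    return aged_seconds_per_gib / unaged_seconds_per_gib
```

Because both copies hold the same files and directory structure, any difference in grep cost between them can be attributed to on-disk layout rather than to the data itself.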

  50. Aging ext4 with Git on HDD. [Figure: time in seconds per GiB (y-axis, 0 to 800; lower is better) vs. Git pulls performed, for the aged and unaged file systems; annotation: 8.8x.]

  51. Aging ext4 with Git on HDD. [Same figure, annotated: smaller average file size makes the unaged 60% slower.]

  52. Is this specific to ext4?

  53. Aging other file systems with Git on HDD. [Figure: four panels (Btrfs, F2FS, XFS, ZFS); lower is better; annotations: 20.6x, 22.4x, 2.2x, 11.8x; note: weird unaged behavior on XFS.]

  54. Will SSDs save us?

  55. Git Workload on XFS on SSD. [Figure: time in seconds per GiB (lower is better) vs. Git pulls performed, for the aged and unaged file systems; annotation: 1.9x.]

  56. Git Workload on SSD. [Figure: four panels (Btrfs, ext4, F2FS, ZFS); lower is better; annotations: 2.2x, 1.5x.]

  58. Git Workload on SSD. [Same figure, annotated: ZFS and ext4 slow down with smaller average file size. Told ya!]

  59. Aging is real. Btrfs, ext4, F2FS, XFS, ZFS all age • Up to 22x on HDD • Up to 2x on SSD. Git lets us replay a real development history • Induce aging by simulating years of use • Takes between 5 hours and 2 days • Download these scripts from betrfs.org

  60. How can we prevent aging?

  61. Design goals to address fragmentation. Intrafile Fragmentation: Avoid breaking large files into small fragments
