Operational Experiences with Disk Imaging in a Multi-Tenant Datacenter

Operational Experiences with Disk Imaging in a Multi-Tenant Datacenter, by Kevin Atkinson, Gary Wong, and Robert Ricci. Properties of disk images and their usage have consequences for storage, caching, pre-loading, and distribution.


  1. Operational Experiences with Disk Imaging in a Multi-Tenant Datacenter Kevin Atkinson, Gary Wong, and Robert Ricci

  9. Properties of disk images and their usage have consequences for: ❖ Storage ❖ Caching ❖ Pre-loading ❖ Distribution

  13. What does the working set look like? What do the images themselves look like? What are the key factors in pre-loading?

  14. The dataset ❖ Four years (2009-2013): 279,972 requests ❖ Users: 1,301 individuals, 368 organizations ❖ Unique images: 714 ❖ Emulab ❖ ~600 PCs ❖ Facility / user image model

  15. User Behavior

  18. “Emulab is a pretty odd beast and its users are even weirder.” –Reviewer D [Emulab user]

  21. Facility vs. user images: Facility 55.6%, User 44.4%. 1) Most users stick to facility or user images 2) Heaviest users use their own images

  28. Image popularity (plot annotations: Exponential, Heavy-Tailed) 1) Facility images have a smaller, lighter tail 2) Most popular image < 13% of requests

  32. Scaling: total images. As userbase grows, user images dominate the totals

  34. Daily working set. Small image set each day → good caching potential

  39. Scaling: working set. Facility will max out → In the limit, highly popular facility images account for most requests
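
The daily working set in the slides above can be read as the set of distinct images requested on a given day. A minimal sketch under that reading, assuming a request log of (day, image name) pairs; the function name `daily_working_set` and the sample log are hypothetical and are not taken from the dataset:

```python
from collections import defaultdict
from datetime import date


def daily_working_set(requests):
    """Group requests by day and collect the distinct images seen each day."""
    per_day = defaultdict(set)
    for day, image in requests:
        per_day[day].add(image)
    return dict(per_day)


# Made-up sample log for illustration only (not real Emulab data).
sample_log = [
    (date(2013, 1, 1), "image-a"),
    (date(2013, 1, 1), "image-a"),
    (date(2013, 1, 1), "image-b"),
    (date(2013, 1, 2), "image-a"),
]

working_sets = daily_working_set(sample_log)
for day in sorted(working_sets):
    # The size of each set is that day's working set size.
    print(day, len(working_sets[day]))
```

Repeated requests for the same image on one day count once, which is why a small daily set suggests caching potential even when the request count is high.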

  40. Image Contents

  44. Block-level similarity (Base vs. Derived): percentage of blocks that need to be written to transform the base image into the derived image

  47. Block-level similarity (Derived: user image; Base: most similar facility image) 1) De-duplicating storage is an attractive option 2) Differential loading has potential
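
The block-level similarity metric above (percentage of blocks that must be written to turn the base image into the derived one) could be computed roughly as follows. This is a sketch under stated assumptions, not the authors' tooling: images are compared position by position in fixed-size blocks, the 4096-byte block size is arbitrary, the denominator is the number of blocks in the derived image, and the name `percent_blocks_to_write` is hypothetical.

```python
def percent_blocks_to_write(base: bytes, derived: bytes, block_size: int = 4096) -> float:
    """Percentage of derived-image blocks that differ from the base at the same offset."""
    n_blocks = (len(derived) + block_size - 1) // block_size  # ceiling division
    if n_blocks == 0:
        return 0.0
    changed = 0
    for i in range(n_blocks):
        start = i * block_size
        end = start + block_size
        # A block past the end of the base compares unequal, so it counts as a write.
        if derived[start:end] != base[start:end]:
            changed += 1
    return 100.0 * changed / n_blocks


# Toy images: four blocks each, with one block modified in the derived image.
base_image = b"A" * 4096 * 4
derived_image = base_image[: 4096 * 2] + b"B" * 4096 + base_image[4096 * 3:]

print(percent_blocks_to_write(base_image, derived_image))
```

A low percentage means most of the derived image is already present in the base, which is the case where de-duplicated storage and differential loading would help most.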

  48. Pre-Loading

  53. Pre-loading: Size (plot regions: Spare Capacity, Mostly Full) 1) Key: Ratio of WSS to idle capacity 2) Effective when ratio is high. WSS for facility images maxes out on large facilities

  56. Pre-loading: Rate. Invest in fast, scalable imaging

  57. Conclusions

  58. General conclusions ❖ Deduplicating, two-tier storage attractive ❖ Caching can be effective ❖ Image lifespan, idle periods ❖ Treat facility and user images differently ❖ Facility better targets for pre-loading ❖ Differential loading requires new strategies ❖ Potential savings, outline of optimization problem ❖ Images per organization, WSS per week

  59. Explore the data, reproduce our results: http://aptlab.net/p/tbres/nsdi14

  61. No dominant images. No image dominates long-term, popular images change frequently

  65. Image lifespan (plot annotations: A few days, Four Years). Two-tiered storage system attractive

  66. Savings from deltas

  67. Images per organization

  68. Idle images

  69. WSS per week

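  70. Top images

      | Image | Requests | Share | Note |
      | --- | --- | --- | --- |
      | RHL90-STD [D] | 21,993 | 7.9% | |
      | FEDORA10-STD | 18,042 | 6.4% | |
      | UBUNTU10-STD | 14,402 | 5.1% | |
      | RHL90-STD | 13,182 | 4.7% | |
      | FC4-UPDATE | 12,097 | 4.3% | |
      | 715/10 | 11,156 | 4.0% | u |
      | FBSD410-STD | 8,916 | 3.2% | |
      | FEDORA8-STD | 8,153 | 2.9% | |
      | 237/69 | 7,512 | 2.7% | u |
      | 296/35 | 7,179 | 2.6% | u |
      | 787/24 | 6,243 | 2.2% | u |
      | UBUNTU70-STD | 6,021 | 2.2% | |
      | UBUNTU12-64-STD | 5,834 | 2.1% | |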
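
The percentages in the Top images slide appear consistent with each image's request count divided by the 279,972 total requests reported in the dataset slide. That reading is an assumption, since the slide does not state the denominator. A short check for the first three rows, with the hypothetical helper `share_percent`:

```python
TOTAL_REQUESTS = 279_972  # total requests reported in the dataset slide

# Request counts for the first three rows of the Top images slide.
top_counts = {
    "RHL90-STD [D]": 21_993,
    "FEDORA10-STD": 18_042,
    "UBUNTU10-STD": 14_402,
}


def share_percent(count, total=TOTAL_REQUESTS):
    """Share of all requests, as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


for name, count in top_counts.items():
    print(f"{name}: {share_percent(count)}%")
```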

  71. Size considerations ❖ Small facilities with few idle disks ❖ Pre-loading not valuable ❖ Large facilities - focus on: ❖ Scalable reloading mechanisms ❖ Prediction and optimization for user requests
