storage deduplication in cloud computing

Storage Deduplication in Cloud Computing, by João Paulo and José Pereira



  1. Storage Deduplication in Cloud Computing. João Paulo and José Pereira, University of Minho, July 2010.

  2. Cloud Computing Overview. Cloud Computing: Cloud services allow clients to shift their data and applications into the "cloud". These services run on a scalable and dependable infrastructure with a large server pool spread over several data centres. Virtualization: Virtualization is key to achieving the elasticity provided by cloud computing. Virtual machines (VMs) can be deployed or migrated in a few minutes. VM isolation allows better management of resources and failures.


  4. Deduplication. Cloud services store clients' data, applications and VM images. Deduplication makes it possible to: decrease the size of the storage; optimize the management of the stored data. However, deduplication introduces overhead to the service.
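To make the storage saving concrete, here is a minimal sketch that counts how many fixed-size blocks remain after identical blocks are stored only once. The 4096-byte block size, the use of SHA-256, and the function names are assumptions made for illustration; the slides do not specify any of them.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size in bytes; not stated in the slides


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a disk image into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def dedup_stats(images: list[bytes]) -> tuple[int, int]:
    """Return (total_blocks, unique_blocks) across all the given images."""
    total = 0
    unique: set[bytes] = set()
    for image in images:
        for block in split_blocks(image):
            total += 1
            # Blocks with the same digest are treated as identical content.
            unique.add(hashlib.sha256(block).digest())
    return total, len(unique)


# Two hypothetical VM images that share three blocks and differ in the last one.
image_a = b"A" * (3 * BLOCK_SIZE) + b"B" * BLOCK_SIZE
image_b = b"A" * (3 * BLOCK_SIZE) + b"C" * BLOCK_SIZE
total, unique = dedup_stats([image_a, image_b])
print(f"{total} blocks written, {unique} blocks stored")
```

In this toy input the three repeated "A" blocks collapse into one, so only three distinct blocks need to be stored out of eight.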

  5. Outline. (1) Shared Storage Deduplication; (2) Experimental Evaluation - Preliminary Results; (3) Conclusions; (4) Future Work and Challenges.

  6. Shared Storage Deduplication: Scenario. [Figure: several groups of VMs.] Groups of VMs run on different physical machines. Each VM has its own virtual disk. Virtual disks are kept in a shared storage.

  7. Shared Storage Deduplication: XEN Blktap mechanism. Blktap is implemented within Xen. It allows virtual block devices to be implemented for virtual machines. It provides a user-level disk I/O interface (Tapdisk). It allows independent per-disk handler processes. It makes Copy-on-Write easy to implement.
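The Copy-on-Write idea mentioned on this slide can be illustrated with a small sketch of a per-disk handler. This is not Blktap code: `CowDisk` and its methods are hypothetical names, and the shared base image is modelled as a plain Python list of blocks.

```python
class CowDisk:
    """Hypothetical per-disk handler with Copy-on-Write semantics.

    Reads fall through to a shared, read-only base image; writes go to a
    private overlay, so the base is never modified.
    """

    def __init__(self, base: list[bytes]):
        self.base = base                      # shared between disks, read-only
        self.overlay: dict[int, bytes] = {}   # block number -> private copy

    def read(self, block_no: int) -> bytes:
        # Prefer the private copy if this disk has written the block.
        if block_no in self.overlay:
            return self.overlay[block_no]
        return self.base[block_no]

    def write(self, block_no: int, data: bytes) -> None:
        # The write lands in the overlay; other disks keep seeing the base.
        self.overlay[block_no] = data


base_image = [b"boot", b"root"]
disk_vm1 = CowDisk(base_image)
disk_vm2 = CowDisk(base_image)
disk_vm1.write(0, b"vm1-boot")
```

After the write, `disk_vm1` sees its own copy of block 0 while `disk_vm2` and the base image still hold the original block.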

  8. Shared Storage Deduplication: XEN Blktap mechanism. [Figure: Physical Machine 1 and Physical Machine 2 host VM1, VM2 and VM3; each VM has its own Tap process, labelled aio, in front of its own disk (VM1 Disk, VM2 Disk, VM3 Disk).]

  9-11. Shared Storage Deduplication: XEN Blktap mechanism. [Figure, animation steps: the same diagram, with read/write requests added.]

  12. Shared Storage Deduplication: Deduplication Challenges. Deduplication is usually used in backup scenarios, where data is practically immutable. In a virtualized scenario, where stored data changes constantly, we must take into account: the overhead introduced by the deduplication algorithm; the best approach to finding duplicated data, which must be transparent to the VMs; the metadata needed to share identical data.

  13. Shared Storage Deduplication: Deduplication Algorithm. [Figure: Physical Machine 1 and Physical Machine 2 host VM1, VM2 and VM3; the diagram also shows a Tap disk on each physical machine, a DHT, an Extend Server and the Shared Storage.]

  14. Shared Storage Deduplication: Deduplication Algorithm. [Figure: the same diagram, with read/write requests added.]

  15-16. Shared Storage Deduplication: Deduplication Algorithm. [Figure: the same diagram, with V->P tables shown next to the read/write requests.]

  17. Shared Storage Deduplication: Deduplication Algorithm. [Figure: the same diagram, with a write request and "Dirty addresses" lists.]
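The figures above only survive as labels, so the following is one possible reading rather than the authors' implementation: each VM's requests are translated through a V->P (virtual to physical) table, and every write records the address as dirty so that it can be examined later. The class name `TranslatedDisk`, the dict used as shared storage, and the choice to record virtual rather than physical addresses are all assumptions.

```python
class TranslatedDisk:
    """Hypothetical per-VM handler with a V->P table and dirty tracking."""

    def __init__(self, storage: dict[int, bytes], v2p: dict[int, int]):
        self.storage = storage        # stand-in for the shared storage
        self.v2p = v2p                # virtual address -> physical address
        self.dirty: set[int] = set()  # virtual addresses written recently

    def read(self, vaddr: int) -> bytes:
        # Translate the virtual address, then fetch the physical block.
        return self.storage[self.v2p[vaddr]]

    def write(self, vaddr: int, data: bytes) -> None:
        # Write through the translation and remember the address as dirty
        # so a later deduplication pass knows which blocks changed.
        self.storage[self.v2p[vaddr]] = data
        self.dirty.add(vaddr)


shared = {10: b"old", 11: b"other"}
disk = TranslatedDisk(shared, {0: 10, 1: 11})
disk.write(0, b"new")
```

Keeping the dirty set outside the request path is what lets the search for duplicates run in the background, which matches the transparency requirement stated in the challenges slide.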

  18-19. Shared Storage Deduplication: Deduplication Algorithm. [Figure: the same diagram, with a V->P table, a COW label and a Share step on each Tap disk.]

  20. Shared Storage Deduplication: Deduplication Algorithm. [Figure: as in the previous step, with a Hash->(Padd,Cont) mapping added.]
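A Share step consistent with these labels might look like the sketch below. It is an interpretation, not the authors' algorithm: `Padd` is read as a physical address and `Cont` as a reference counter, a plain dict stands in for the DHT, SHA-256 is an arbitrary choice of hash, and hash collisions are ignored. All function and parameter names are hypothetical.

```python
import hashlib


def share(v2p, dirty, cow, storage, index, free_blocks):
    """Deduplicate the blocks behind one VM's dirty virtual addresses.

    v2p:         the VM's translation table (virtual -> physical address)
    dirty:       set of virtual addresses written since the last pass
    cow:         set of virtual addresses that must be copied on next write
    storage:     dict standing in for the shared storage (address -> bytes)
    index:       dict standing in for the DHT (hash -> [address, counter])
    free_blocks: list receiving physical addresses that became redundant
    """
    for vaddr in sorted(dirty):
        paddr = v2p[vaddr]
        digest = hashlib.sha256(storage[paddr]).hexdigest()
        entry = index.get(digest)
        if entry is None:
            # First block with this content: register it with one reference.
            index[digest] = [paddr, 1]
        elif entry[0] != paddr:
            # Duplicate content: point this VM at the registered block
            # and release the redundant copy.
            entry[1] += 1
            v2p[vaddr] = entry[0]
            free_blocks.append(paddr)
        # Registered blocks may be shared later, so writes must copy first.
        cow.add(vaddr)
    dirty.clear()


storage = {0: b"same", 1: b"same", 2: b"other"}
index, free_blocks = {}, []
v2p_vm1, cow_vm1 = {0: 0}, set()
v2p_vm2, cow_vm2 = {0: 1, 1: 2}, set()
share(v2p_vm1, {0}, cow_vm1, storage, index, free_blocks)
share(v2p_vm2, {0, 1}, cow_vm2, storage, index, free_blocks)
```

After both passes, the second VM's address 0 points at the block already registered by the first VM, and physical block 1 is available for reuse.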

  21. Shared Storage Deduplication: Deduplication Algorithm. [Figure: the same diagram, with a V->P table, the Share steps and the label "Free blocks update queue".]

  22. Shared Storage Deduplication: Deduplication Algorithm. [Figure: the same diagram, with a write request, COW, "Dirty addresses" and "COW Addresses" lists, and a free blocks buffer on each Tap disk.]

  23. Shared Storage Deduplication: Deduplication Algorithm. [Figure: the same diagram, with a GC step and a free blocks buffer on each Tap disk.]

  24. Shared Storage Deduplication: Deduplication Algorithm. [Figure: as in the previous step, with the Hash->(Padd,Cont) mapping added.]

  25. Shared Storage Deduplication: Deduplication Algorithm. [Figure: as in step 23, with a Free blocks queue added.]

  26. Shared Storage Deduplication: Deduplication Algorithm. [Figure: the same diagram, with the Free blocks queue, a combined GC/Share step on each Tap disk, and the free blocks buffers.]

  27-28. Shared Storage Deduplication: Deduplication Algorithm. [Figure: the same diagram, with the Free blocks queue, a single GC/Share step, and the free blocks buffers.]
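The GC step and the free blocks queue could be sketched as follows. As with the Share step, this is an assumption-laden reading of the labels: `Cont` is treated as a reference counter, a dict stands in for the DHT, a `deque` stands in for the free blocks queue, and every name is hypothetical. The sketch relies on the old block being unchanged after a Copy-on-Write, so its hash can still be recomputed.

```python
import hashlib
from collections import deque


def garbage_collect(cow_addresses, storage, index, free_queue):
    """Release physical blocks that a VM stopped using after a COW write.

    cow_addresses: list of physical addresses abandoned by COW writes
    storage:       dict standing in for the shared storage
    index:         dict standing in for the DHT (hash -> [address, counter])
    free_queue:    deque of physical addresses that can be reused
    """
    for paddr in cow_addresses:
        digest = hashlib.sha256(storage[paddr]).hexdigest()
        entry = index.get(digest)
        if entry is None or entry[0] != paddr:
            continue  # not tracked by the index; nothing to release here
        entry[1] -= 1
        if entry[1] == 0:
            # No VM references the block any more: forget it and free it.
            del index[digest]
            free_queue.append(paddr)
    cow_addresses.clear()


def refill(buffer, free_queue, granularity):
    """Move up to `granularity` free addresses into a VM's local buffer."""
    while len(buffer) < granularity and free_queue:
        buffer.append(free_queue.popleft())


storage = {0: b"same", 2: b"other"}
index = {
    hashlib.sha256(b"same").hexdigest(): [0, 2],
    hashlib.sha256(b"other").hexdigest(): [2, 1],
}
free_queue = deque()
garbage_collect([0, 2], storage, index, free_queue)
local_buffer = []
refill(local_buffer, free_queue, granularity=4)
```

In this run, block 0 keeps one remaining reference and stays allocated, while block 2 loses its last reference and ends up in the local free blocks buffer.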

  29. Experimental Evaluation - Preliminary Results. Outline: (1) Shared Storage Deduplication; (2) Experimental Evaluation - Preliminary Results; (3) Conclusions; (4) Future Work and Challenges.

  30. Experimental Evaluation - Preliminary Results: Evaluated Prototype. [Figure: Physical Machine 1 with VM1 and VM2, a Tap disk with a GC/Share step, a free blocks queue, a free blocks buffer and the Shared Storage.] The prototype was evaluated without the distributed and fault-tolerant design. Two optimizations: a set of mutexes for each VM's translation table; the refilling granularity of each VM's free blocks buffer.
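One way to read "a set of mutexes for each VM's translation table" is lock striping: instead of a single lock guarding the whole table, each address maps to one of several locks, so requests for different regions rarely contend. The sketch below assumes that reading; the class name, the table size and the number of locks are hypothetical.

```python
import threading


class StripedTranslationTable:
    """Hypothetical V->P table protected by a set of mutexes (lock striping)."""

    def __init__(self, n_blocks: int, n_locks: int = 16):
        self._table = [None] * n_blocks  # virtual address -> physical address
        self._locks = [threading.Lock() for _ in range(n_locks)]

    def _lock_for(self, vaddr: int) -> threading.Lock:
        # Addresses are spread over the locks by simple modulo.
        return self._locks[vaddr % len(self._locks)]

    def lookup(self, vaddr: int):
        with self._lock_for(vaddr):
            return self._table[vaddr]

    def update(self, vaddr: int, paddr: int) -> None:
        with self._lock_for(vaddr):
            self._table[vaddr] = paddr


table = StripedTranslationTable(n_blocks=64, n_locks=4)


def remap(start: int) -> None:
    # Each worker remaps its own range of 16 virtual addresses.
    for vaddr in range(start, start + 16):
        table.update(vaddr, vaddr + 1000)


workers = [threading.Thread(target=remap, args=(s,)) for s in range(0, 64, 16)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

With this layout the I/O path and a background GC/Share pass only block each other when they touch addresses that map to the same lock.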
