Accelerating the Data Deduplication Performance with GPU in Hybrid Storage Systems
Prince Hamandawana, Awais Khan, Changgyu Lee, Sungyong Park, Youngjae Kim
Department of Computer Science and Engineering, Sogang University, Seoul, Republic of Korea


  1. Accelerating the Data Deduplication Performance with GPU in Hybrid Storage Systems
  Prince Hamandawana, Awais Khan, Changgyu Lee, Sungyong Park, Youngjae Kim
  Department of Computer Science and Engineering, Sogang University, Seoul, Republic of Korea
  PDSW-DISCS '17 WIP session, November 13, 2017, Denver, USA
  Laboratory for Advanced System Software

  2. Inline Deduplication in Cloud Storage Systems
  • To achieve high space utilization in a tiered cloud storage system, the following techniques are discussed in the community:
  1. Compression
  2. Erasure Coding
    • Cannot remove replicated data across the cluster
    • Difficult to deploy in inline mode
  3. Inline Data Deduplication
    • Higher storage efficiency by removing replicated data across the cluster
    • Eliminates duplicated data in the cache tier
  • However, the overhead of inline deduplication directly affects performance.
  • In a hybrid storage system, cache-tier nodes are equipped with SSDs, and inline deduplication reduces the amount of data written to the SSDs → lower write amplification, longer lifetime

  3. Inline Deduplication Framework on Ceph
  [Diagram: a client object is mapped by the CRUSH algorithm to a cache node in the Cache Tier (SSD), which maintains the Fingerprint Index; behind it sits the Storage Tier with Storage Node #1, #2, and #3.]

  4. Inline Deduplication Framework on Ceph
  [Diagram, Chunking: the incoming object is split into chunks 1-4 at the cache node.]

  5. Inline Deduplication Framework on Ceph
  [Diagram, Fingerprinting: a fingerprint is computed for each of chunks 1-4 on the cache node.]

  6. Inline Deduplication Framework on Ceph
  [Diagram, Deduplication Check: the chunk fingerprints are looked up in the Fingerprint Index; chunk 1 is found not to be a duplicate.]

  7. Inline Deduplication Framework on Ceph
  [Diagram, Deduplication Check: the non-duplicate chunk 1 is written to the Storage Tier; the lookup for the next chunk finds a duplicate.]

  8. Inline Deduplication Framework on Ceph
  [Diagram, Deduplication Check: for the duplicate chunk, only the reference count in the Fingerprint Index is increased; chunks 3 and 4 remain to be checked.]
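The write path in slides 4-8 boils down to: chunk the object, fingerprint each chunk, query the fingerprint index, then either write the chunk to the storage tier (index miss) or only increase its reference count (index hit). Below is a minimal, self-contained sketch of that flow; the fixed-size chunking, the std::hash stand-in for a cryptographic fingerprint (a real system would use something like SHA-1), and the in-memory index are illustrative assumptions, not the Ceph implementation.

```cpp
// Minimal sketch of the inline deduplication write path (slides 4-8).
// Assumptions: fixed-size chunking, std::hash as a stand-in for a
// cryptographic fingerprint, and an in-memory fingerprint index.
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

using Chunk = std::vector<char>;
using Fingerprint = std::size_t;                        // real systems: e.g. 160-bit SHA-1
static std::unordered_map<Fingerprint, int> g_index;    // fingerprint -> reference count

// Step 1: split the incoming object into fixed-size chunks.
static std::vector<Chunk> chunking(const std::vector<char>& object, std::size_t chunk_size) {
    std::vector<Chunk> chunks;
    for (std::size_t off = 0; off < object.size(); off += chunk_size) {
        std::size_t len = std::min(chunk_size, object.size() - off);
        chunks.emplace_back(object.begin() + off, object.begin() + off + len);
    }
    return chunks;
}

// Step 2: compute a fingerprint for one chunk (stand-in hash).
static Fingerprint fingerprint(const Chunk& c) {
    return std::hash<std::string>{}(std::string(c.begin(), c.end()));
}

// Step 3: query the index; only unique chunks reach the storage tier.
static void dedup_write(const std::vector<char>& object, std::size_t chunk_size) {
    for (const Chunk& c : chunking(object, chunk_size)) {
        Fingerprint fp = fingerprint(c);
        auto it = g_index.find(fp);
        if (it == g_index.end()) {
            g_index.emplace(fp, 1);                     // not duplicate: write to storage tier
            std::printf("miss: write chunk to storage tier (fp=%zu)\n", fp);
        } else {
            ++it->second;                               // duplicate: increase reference count only
            std::printf("hit : reference count is now %d (fp=%zu)\n", it->second, fp);
        }
    }
}

int main() {
    std::vector<char> object(16 * 1024, 'A');           // toy object full of duplicate content
    dedup_write(object, 4 * 1024);                      // 4 KB chunks: 1 write, 3 ref-count hits
    return 0;
}
```

In the framework of slide 3, the lookup would of course go against the shared Fingerprint Index kept on the cache tier rather than a process-local hash map.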

  9. Fingerprint Overhead and GPU Acceleration
  • Deduplication overhead consists of:
    • Chunking
    • Calculating fingerprints
    • Fingerprint query
  • We observed that fingerprinting accounts for more than 70% of the total deduplication overhead.
  • To reduce the fingerprinting overhead, we propose GPU acceleration for fingerprint calculation.
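As a rough illustration of how such a per-stage breakdown can be obtained, the sketch below times chunking, fingerprinting, and index queries separately with std::chrono. The buffer sizes and the std::hash stand-in are assumptions; with a lightweight hash like this the proportions will not match the SHA-style fingerprinting measured on the slides, so the point is only the instrumentation pattern.

```cpp
// Sketch of measuring the per-stage deduplication overhead (chunking,
// fingerprinting, fingerprint query). Sizes and the std::hash stand-in
// fingerprint are illustrative assumptions only.
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

int main() {
    using clock = std::chrono::steady_clock;
    const std::size_t object_size = 64u << 20, chunk_size = 128u << 10;  // 64 MB object, 128 KB chunks
    std::vector<char> object(object_size, 'A');

    auto t0 = clock::now();
    std::vector<std::string> chunks;                       // stage 1: chunking
    for (std::size_t off = 0; off < object.size(); off += chunk_size)
        chunks.emplace_back(object.data() + off, chunk_size);

    auto t1 = clock::now();
    std::vector<std::size_t> fps;                          // stage 2: fingerprinting (the hot stage)
    for (const auto& c : chunks)
        fps.push_back(std::hash<std::string>{}(c));

    auto t2 = clock::now();
    std::unordered_map<std::size_t, int> index;            // stage 3: fingerprint query
    for (std::size_t fp : fps)
        ++index[fp];

    auto t3 = clock::now();
    auto ms = [](auto a, auto b) {
        return std::chrono::duration<double, std::milli>(b - a).count();
    };
    std::printf("chunking %.2f ms, fingerprint %.2f ms, query %.2f ms\n",
                ms(t0, t1), ms(t1, t2), ms(t2, t3));
    return 0;
}
```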

  10. Accelerating Fingerprint Calculation with GPU
  [Diagram: the cache node is equipped with a GPU; chunks 1-4 are offloaded to the GPU for fingerprinting.]

  11. Accelerating Fingerprint Calculation with GPU
  [Diagram: the GPU computes the fingerprints of chunks 1-4 in parallel.]

  12. Accelerating Fingerprint Calculation with GPU
  [Diagram: the computed fingerprints are returned to the cache node for the deduplication check against the Fingerprint Index.]
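A minimal sketch of the offload pattern in slides 10-12, under the assumption that one CUDA thread fingerprints one chunk so that an entire batch is hashed in parallel: the chunks are copied to the GPU, the kernel computes one digest per chunk, and the digests are copied back for the index query on the CPU. The 64-bit FNV-1a hash is only a stand-in for the cryptographic fingerprint (e.g., SHA-1 or MD5) a real deduplication engine would compute, and the batch and chunk sizes are illustrative.

```cpp
// Sketch of GPU-accelerated fingerprinting (slides 10-12): one CUDA thread
// computes the fingerprint of one chunk, so a whole batch is hashed in
// parallel. FNV-1a is a stand-in for a real cryptographic fingerprint.
#include <cstdint>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void fingerprint_kernel(const unsigned char* chunks, std::size_t chunk_size,
                                   int num_chunks, std::uint64_t* fingerprints) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= num_chunks) return;
    const unsigned char* chunk = chunks + static_cast<std::size_t>(i) * chunk_size;
    std::uint64_t h = 1469598103934665603ULL;           // FNV-1a offset basis
    for (std::size_t j = 0; j < chunk_size; ++j) {
        h ^= chunk[j];
        h *= 1099511628211ULL;                          // FNV-1a prime
    }
    fingerprints[i] = h;
}

int main() {
    const std::size_t chunk_size = 128 * 1024;          // 128 KB chunks
    const int num_chunks = 256;                         // one 32 MB batch
    std::vector<unsigned char> host_chunks(chunk_size * num_chunks, 'A');
    std::vector<std::uint64_t> host_fps(num_chunks);

    unsigned char* d_chunks = nullptr;
    std::uint64_t* d_fps = nullptr;
    cudaMalloc(&d_chunks, host_chunks.size());
    cudaMalloc(&d_fps, num_chunks * sizeof(std::uint64_t));

    // Copy the batch of chunks to the GPU, hash them in parallel, copy digests back.
    cudaMemcpy(d_chunks, host_chunks.data(), host_chunks.size(), cudaMemcpyHostToDevice);
    int threads = 128, blocks = (num_chunks + threads - 1) / threads;
    fingerprint_kernel<<<blocks, threads>>>(d_chunks, chunk_size, num_chunks, d_fps);
    cudaMemcpy(host_fps.data(), d_fps, num_chunks * sizeof(std::uint64_t), cudaMemcpyDeviceToHost);

    std::printf("chunk 0 fingerprint: %llu\n", (unsigned long long)host_fps[0]);
    cudaFree(d_chunks);
    cudaFree(d_fps);
    return 0;
}
```

In practice the offload pays off when many chunks are batched per transfer; CUDA streams can additionally overlap the host-device copies with kernel execution.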

  13. Experiment Setup
  • Ceph Jewel v10.2.5
  • CUDA Toolkit 8.0
  • 4 OSD servers
    • Intel Xeon E5-2640 v3 @ 2.60 GHz
    • 32 GB memory
    • 12 GB NVIDIA Tesla K80 GPU
    • 2 SSDs (Cache Tier), 4 HDDs (Storage Tier)
  • Ceph RBD client
    • Workload: 1 GB of random 4 MB writes in total, issued with the fio benchmark

  14. Preliminary Results
  • GPU fingerprinting reduced the fingerprint overhead by about 65%.
  • The total deduplication overhead is reduced to 52%.
  [Chart: total time (sec), broken into Chunking, Fingerprint, and Fingerprint Query, for CPU vs. GPU fingerprinting at chunk sizes of 128, 256, 512, and 1024 KB; the fingerprint portion is 65% lower with the GPU.]

  15. Q&A  Contact: Changgyu Lee (changgyu@sogang.ac.kr) Department of Computer Science and Engineering Sogang University, Seoul, Republic of Korea 15
