
Leveraging Value Locality in Optimizing NAND Flash-based SSDs

Aayush Gupta, Raghav Pisolkar, Bhuvan Urgaonkar and Anand Sivasubramaniam. Computer Systems Lab, The Pennsylvania State University.


  1. Leveraging Value Locality in Optimizing NAND Flash-based SSDs
     Aayush Gupta, Raghav Pisolkar, Bhuvan Urgaonkar and Anand Sivasubramaniam
     Computer Systems Lab, The Pennsylvania State University

  2. Agenda
     - Relook at Locality
     - Another dimension of Locality: Value Locality
       • Value Locality and SSDs
     - CA-SSD Design
       • Mapping Structures
       • Metadata Management
     - Evaluation
       • CA-SSD vs Traditional SSD

  3. Locality: The pillar of storage
     - Temporal Locality
       • If a logical address is accessed now, it is likely to be accessed again in the near future
     - Spatial Locality
       • If a logical address is accessed now, there is a high likelihood that its neighboring addresses will be accessed in the near future
     - Pervasive: L1/L2 cache, TLB, Buffer Cache, Virtual Memory, Disk Cache, Web Cache …

  4. Another Dimension of Locality
     - Value Locality
       • Certain content is accessed preferentially
     - Data deduplication using Content Addressable Storage (CAS)
     - Use cases of Value Locality (VL)
       • Network traffic reduction
       • Content-based caching
       • Efficient data storage (archival/backup)
       • E.g.: Venti, Foundation, EMC Centera, Data Domain Storage Systems
     - Can we use Value Locality to address the idiosyncrasies of SSDs?

  5. CAS suits SSD
     - CAS: provides deduplication
     - SSD: writes are a bottleneck
       • Read/write asymmetry
       • Block erases

  6. CAS and SSD: Made for each other?
     - CAS: out-of-place updates

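  7. Out of Place Updates in CAS
     [Figure: two successive writes to the same logical address, Write (123, ‘ABC’) followed by Write (123, ‘XYZ’). Logical addresses 120 to 124 pass through a translation layer to physical addresses 120 to 124; the storage row holds the values XYZ, ABC, DEF, PQR, TUV.]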

  8. CAS and SSD: Made for each other?
     - CAS: out-of-place updates
     - SSD: erase before write

  9. Problem with CAS
     - CAS: loss of sequentiality
     - SSD: fast random reads
     - Do real workloads exhibit Value Locality?

  10. Workloads [Koller10]

      Workload   Writes (%)   Total Requests (Millions)   Unique Write Requests (%)   Unique Read Requests (%)
      web        77.0         3.8                         42.35                       32.05
      mail       77.3         3.6                         7.83                        80.85
      homes      96.7         4.4                         66.37                       80.75

      Callouts on the slide: "Write dominant" and "Duplication".

      [Koller10] Koller, R., and Rangaswami, R. "I/O Deduplication: Utilizing Content Similarity to Improve I/O Performance." (FAST'10)

  11. Value Popularity
      - Value Popularity (VP) represents the number of occurrences of each unique value in a workload
        • Signifies the potential for deduplication in a workload
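
The definition of Value Popularity above can be illustrated with a minimal sketch. The trace format (a list of `(lpn, data)` write records), the use of SHA-1 to identify values, and the function names are assumptions made for illustration; the slides do not specify any of them.

```python
import hashlib
from collections import Counter

def value_popularity(writes):
    """Count how often each unique value (identified by content hash) is written.

    `writes` is an iterable of (lpn, data) pairs. This trace format is an
    assumption for illustration, not the format used in the talk.
    """
    counts = Counter()
    for _lpn, data in writes:
        counts[hashlib.sha1(data).hexdigest()] += 1
    return counts

def cumulative_write_fraction(counts):
    """Fraction of all writes covered by the k most popular values, for k = 1..n."""
    total = sum(counts.values())
    running, curve = 0, []
    for _value, n in counts.most_common():
        running += n
        curve.append(running / total)
    return curve

# Hypothetical toy trace: b"A" is written four times, b"B" twice, b"C" once.
trace = [(1, b"A"), (2, b"B"), (3, b"A"), (4, b"A"), (1, b"C"), (5, b"B"), (6, b"A")]
vp = value_popularity(trace)
curve = cumulative_write_fraction(vp)
print(sorted(vp.values(), reverse=True))  # [4, 2, 1]
print(curve)
```

A curve that rises steeply for small k means a few values account for most writes, which is the kind of skew a deduplicating device could exploit.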

  12. Some Values are Very Popular
      [Figure: cumulative fraction of write accesses (y-axis, 0 to 1) versus unique values (x-axis, x 10^5) for the mail workload. Annotations mark 0.5 on the y-axis, 24K unique values, and 8.8%.]

  13. Some Values are Very Popular
      [Figure: cumulative fraction of write accesses (y-axis, 0 to 1) versus unique values (x-axis, x 10^5) for the web, mail and homes workloads. Annotations mark 0.5 on the y-axis, 24K and 84K unique values, and 8.8% and 30%.]

  14. CA-SSD
      [Diagram: CAS and SSD combined into CA-SSD.]

  15. CA-SSD Design
      [Diagram components: BB-RAM, RAM, SSD Controller, Hash Co-processor.]

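  16. CA-SSD Design
      [Diagram of the write path; the ordering below is reconstructed from the slide's labels.]
      - Device Driver to SSD Controller (FTL): Write (LPN, Data)
      - SSD Controller to Hash Co-processor: Data; returns H(Data)
      - SSD Controller to BB-RAM (mapping structures): H(Data); returns PPN or NULL
      - SSD Controller to SSD: Write (PPN, Data)
      - Update Mapping Structures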
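
The write path in the diagram can be sketched as a toy model. This is only an illustration under stated assumptions: the class and attribute names are hypothetical, SHA-1 stands in for whatever hash the co-processor computes, a dictionary stands in for flash pages, and garbage collection and invalidation of old pages are left out entirely.

```python
import hashlib

class ToyCASSD:
    """Toy model of a content-addressed write path; all names are hypothetical."""

    def __init__(self):
        self.lpt = {}          # LPN -> PPN
        self.hpt = {}          # hash of data -> PPN
        self.flash = {}        # PPN -> data (stands in for flash pages)
        self.next_ppn = 0      # next free physical page (no GC in this sketch)
        self.flash_writes = 0  # number of pages actually programmed

    def write(self, lpn, data):
        h = hashlib.sha1(data).digest()  # role played by the hash co-processor
        ppn = self.hpt.get(h)            # lookup yields a PPN, or None for NULL
        if ppn is None:
            # New value: program a fresh page and remember its hash.
            ppn = self.next_ppn
            self.next_ppn += 1
            self.flash[ppn] = data
            self.flash_writes += 1
            self.hpt[h] = ppn
        # Duplicate or not, the logical page now points at the page holding the value.
        self.lpt[lpn] = ppn
        return ppn

    def read(self, lpn):
        return self.flash[self.lpt[lpn]]

ssd = ToyCASSD()
p1 = ssd.write(1, b"ABC")
p3 = ssd.write(3, b"ABC")   # same content: no second flash write
p1_new = ssd.write(1, b"XYZ")  # new content for LPN 1 goes to a new page
```

In this sketch the second write of `b"ABC"` only updates the LPN-to-PPN mapping, and overwriting LPN 1 leaves the page still referenced by LPN 3 untouched.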

  17. Mapping Structures: LPT & HPT
      - LPT (LPN → PPN): L1 → P1, L2 → P2, L3 → P1, L4 → P4
      - HPT (Hash → PPN): H1 → P1, H2 → P2, H3 → P3, H4 → P4

  18. Mapping Structures: iLPT
      - LPT (LPN → PPN): L1 → P1, L2 → P2, L3 → P1, L4 → P4
      - HPT (Hash → PPN): H1 → P1, H2 → P2, H3 → P3, H4 → P4
      - iLPT (PPN → LPN): P1 → L1, L3; P2 → L2; P3 → INV; P4 → L4

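  19. Mapping Structures: iLPT & iHPT
      - LPT (LPN → PPN): L1 → P1, L2 → P2, L3 → P1, L4 → P4
      - HPT (Hash → PPN): H1 → P1, H2 → P2, H3 → P3, H4 → P4
      - iLPT (PPN → LPN): P1 → L1, L3; P2 → L2; P3 → INV; P4 → L4
      - iHPT (PPN → Hash): P1 → H1, P2 → H2, P3 → H3, P4 → H4
      - The diagram also carries a "Remove" annotation.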
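
The table contents on this slide suggest that iLPT and iHPT are the inverses of LPT and HPT. A minimal sketch under that assumption follows, using the slide's example entries; the function names are hypothetical, and how the device actually maintains these tables is not stated in the slides.

```python
from collections import defaultdict

# Example entries as shown on the slide.
lpt = {"L1": "P1", "L2": "P2", "L3": "P1", "L4": "P4"}
hpt = {"H1": "P1", "H2": "P2", "H3": "P3", "H4": "P4"}

def invert_lpt(lpt):
    """PPN -> list of LPNs; several LPNs may share one deduplicated page."""
    ilpt = defaultdict(list)
    for lpn, ppn in lpt.items():
        ilpt[ppn].append(lpn)
    return dict(ilpt)

def invert_hpt(hpt):
    """PPN -> hash; each physical page holds exactly one value."""
    return {ppn: h for h, ppn in hpt.items()}

ilpt = invert_lpt(lpt)
ihpt = invert_hpt(hpt)

def is_valid(ppn):
    """A page with no LPN pointing at it corresponds to the INV entry."""
    return bool(ilpt.get(ppn))

print(ilpt["P1"])      # ['L1', 'L3']
print(is_valid("P3"))  # False
```

Here P3 has a hash entry but no logical page referring to it, which matches the INV marking in the iLPT on the slide.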

  20. Metadata: Traditional SSD
      - LPT (LPN → PPN), held in the SSD controller's RAM: L1 → P1, L2 → P2, L3 → P3, L4 → P4

  21. Metadata: CA-SSD
      - How do we fit the metadata in CA-SSD's RAM?
      - Option 1: Larger RAM (BB-RAM). Not scalable!
      - LPT (LPN → PPN): L1 → P1, L2 → P2, L3 → P1, L4 → P4
      - HPT (Hash → PPN): H1 → P1, H2 → P2, H3 → P3, H4 → P4
      - iLPT (PPN → LPN): P1 → L1, L3; P2 → L2; P3 → INV; P4 → L4
      - iHPT (PPN → Hash): P1 → H1, P2 → H2, P3 → H3, P4 → H4
      [Diagram components: BB-RAM, SSD Controller, Hash Co-processor.]

  22. Option 2: Shrink Metadata
      - LPT (LPN → PPN): L1 → P1, L2 → P2, L3 → P1, L4 → P4
      - HPT (Hash → PPN): H1 → P1, H2 → P2, H3 → P3, H4 → P4
      - iLPT (PPN → LPN): P1 → L1, L3; P2 → L2; P3 → INV; P4 → L4
      - iHPT (PPN → Hash): P1 → H1, P2 → H2, P3 → H3, P4 → H4
      [Diagram components: BB-RAM, SSD Controller, Hash Co-processor.]

  23. Temporal Value Locality
      - Temporal Value Locality (TVL) implies that if a certain value is accessed now, it is likely to be accessed again in the near future, though not necessarily from the same address

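  24. Temporal Value Locality: Writes
      [Figure: cumulative fraction of write requests (y-axis, 0 to 1) versus position in LRU queue (x-axis, x 10^5) for the web workload, with one curve for Value and one for LPN. Annotations mark 0.9 on the y-axis and the positions 1.3K and 141K.]
      - Higher TVL than traditional temporal locality (TL)
      - Shrink metadata using TVL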
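
The plot's x-axis, position in an LRU queue, can be computed for a trace with a short sketch. Everything here is an assumption for illustration: the function names, the toy trace, and the use of short strings in place of content hashes. The list-based queue is deliberately simple and costs O(n) per access, so it is not suitable for real traces.

```python
def lru_positions(keys):
    """Return the LRU-queue position of each access, or None on a first access.

    Position 0 is the most recently used entry.
    """
    queue = []       # index 0 = MRU, last index = LRU
    positions = []
    for key in keys:
        if key in queue:
            pos = queue.index(key)
            queue.pop(pos)
        else:
            pos = None
        positions.append(pos)
        queue.insert(0, key)  # the accessed key becomes the MRU entry
    return positions

def hit_fraction(positions, queue_size):
    """Fraction of accesses found within the first `queue_size` queue positions."""
    hits = sum(1 for p in positions if p is not None and p < queue_size)
    return hits / len(positions)

# Hypothetical toy write trace of (lpn, value) pairs.
trace = [(1, "A"), (2, "A"), (3, "B"), (4, "A"), (5, "B"), (1, "C")]
lpn_pos = lru_positions([lpn for lpn, _ in trace])
val_pos = lru_positions([value for _, value in trace])
print(lpn_pos)  # [None, None, None, None, None, 4]
print(val_pos)  # [None, 0, None, 1, 1, None]
print(hit_fraction(lpn_pos, 2), hit_fraction(val_pos, 2))  # 0.0 0.5
```

In this toy trace a two-entry queue keyed by value captures half of the writes while one keyed by LPN captures none, which is the shape of the comparison the slide draws between the Value and LPN curves.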

  25. Metadata Management: TVL
      - LPT (LPN → PPN): L1 → P1, L2 → P2, L3 → P1, L4 → P4
      - HPT (Hash → PPN): H1 → P1, H2 → P2, H3 → P3, H4 → P4
      - iLPT (PPN → LPN): P1 → L1, L3; P2 → L2; P3 → INV; P4 → L4
      - iHPT (PPN → Hash): P1 → H1, P2 → H2, P3 → H3, P4 → H4
      [Diagram components: BB-RAM, SSD Controller, Hash Co-processor.]

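  26. Metadata Management: TVL
      - LPT (LPN → PPN): L1 → P1, L2 → P2, L3 → P1, L4 → P4
      - HPT (Hash → PPN), ordered from MRU to LRU: H1 → P1 (MRU), H2 → P2, H3 → P3 (LRU)
      - iLPT (PPN → LPN): P1 → L1, L3; P2 → L2; P3 → INV; P4 → L4
      - iHPT (PPN → Hash): P1 → H1, P2 → H2, P3 → H3
      - The H4 → P4 entry no longer appears in the HPT or iHPT; the diagram carries a "Discard" annotation.
      - How does CA-SSD perform compared to Traditional SSDs?
      [Diagram components: BB-RAM, SSD Controller, Hash Co-processor.]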
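
The MRU/LRU ordering and the "Discard" annotation suggest a hash table bounded by recency. A minimal sketch of such a table follows, assuming a plain LRU policy; the class name, method names and capacity are hypothetical, and the slides do not say exactly how CA-SSD chooses what to discard.

```python
from collections import OrderedDict

class LRUHashTable:
    """Hash -> PPN table holding at most `capacity` entries (hypothetical sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # first item = LRU, last item = MRU

    def lookup(self, h):
        ppn = self.entries.get(h)
        if ppn is not None:
            self.entries.move_to_end(h)  # a hit makes the entry MRU
        return ppn

    def insert(self, h, ppn):
        """Insert or refresh an entry; return the discarded (hash, ppn) pair, if any."""
        self.entries[h] = ppn
        self.entries.move_to_end(h)
        evicted = None
        if len(self.entries) > self.capacity:
            evicted = self.entries.popitem(last=False)  # discard the LRU entry
        return evicted

hpt = LRUHashTable(capacity=3)
hpt.insert("H4", "P4")
hpt.insert("H3", "P3")
hpt.insert("H2", "P2")
evicted = hpt.insert("H1", "P1")  # table is full, so the LRU entry is discarded
print(evicted)                    # ('H4', 'P4')
print(list(hpt.entries))          # ['H3', 'H2', 'H1']
```

In this sketch a discarded hash only costs a missed deduplication opportunity: a later write of that value would find no entry and be stored as new content. A matching PPN-to-hash table would need its corresponding entry dropped at the same time.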

  27. Evaluation: Response Time
      [Figure: bar chart of response time (ms, y-axis 0 to 14) for the web, mail and home traces, comparing NON CAS against CAS configurations labeled 16K, 64K and 128K. Annotations: 7ms, 84%, 65%.]
      - Mail shows lower TVL

  28. Evaluation: Response Time
      [Figure: bar chart of response time (ms, y-axis 0 to 14) for the web, mail and home traces, comparing NON CAS against CAS; the 128K configuration is highlighted.]
      - Similar to infinite RAM

  29. Total Writes: web
      [Figure: stacked bar chart of total writes (millions, y-axis 0 to 8), split into workload writes and GC writes, for Non CAS and for CAS configurations labeled 16K, 64K and 128K. Annotation: 94% reduction.]
      - Dedup reduces valid content
      - 75% reduction in valid pages copied

  30. Total Erases: web
      [Figure: bar chart of block erases (thousands, y-axis 0 to 120) for NON CAS and for CAS configurations labeled 16K, 64K and 128K. Annotation: 77%.]
      - Fewer total writes
      - Reduced GC invocation

  31. Conclusions
      - Workloads exhibit significant value locality
        • Characterization of Value Popularity and Temporal Value Locality
      - CAS and SSDs complement each other
      - Certain implementation challenges need to be addressed
        • Mapping structures
        • Metadata management

  32. Thank You. Questions?
