
SSDAlloc: Hybrid SSD/RAM Memory Management Made Easy

Anirudh Badam, Vivek S. Pai
Princeton University
03/31/2011

Memory in Networked Systems
• As a cache to reduce pressure on the disk (Memcache-like)


  3. Transparent Tiering Today
     [Figure: write, free() and read operations on RAM pages; an indirection table, in the OS or in the FTL, maps the pages onto the SSD (log structured page store)]

  6. Non-Transparent Tiering
     • Redesign application to be flash aware
     • Custom object store with custom pointers
     • Reads, writes and garbage collection at an application object granularity
     • Avoid in-place writes (objects could be small)
     • Obtain the best performance and lifetime from flash memory device
     • Intrusive modifications needed
     • Expertise with flash memory needed

  7. Non-Transparent Tiering
     malloc + SSD-swap:
         MyObject* obj = malloc( sizeof( MyObject ) );
         obj->x = 0;
         obj->y = 1;
         obj->z = 2;
         free( obj );
     Application Rewrite:
         MyObjectID oid = createObject( sizeof( MyObject ) );
         MyObject* obj = malloc( sizeof( MyObject ) );
         readObject( oid, obj );
         obj->x = 0;
         obj->y = 1;
         obj->z = 2;
         writeObject( oid, obj );
         free( obj );
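
The application-rewrite pattern on this slide can be tried out against a toy store. The following is a minimal sketch, not the interface of any real flash object store: only the names createObject, readObject and writeObject come from the slide, while the in-memory slot array, MAX_OBJECTS, the return codes and the update_object helper are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a flash object store. Only the three
   function names come from the slide; the rest is assumed. */
typedef int MyObjectID;
typedef struct { int x, y, z; } MyObject;

#define MAX_OBJECTS 16
static struct { void *data; size_t size; } slots[MAX_OBJECTS];
static int next_id = 0;

/* Reserve a zero-filled object of `size` bytes; returns -1 on failure. */
MyObjectID createObject(size_t size) {
    if (next_id >= MAX_OBJECTS) return -1;
    void *p = calloc(1, size);
    if (p == NULL) return -1;
    slots[next_id].data = p;
    slots[next_id].size = size;
    return next_id++;
}

/* Copy the stored object into the caller's RAM buffer. */
int readObject(MyObjectID oid, void *buf) {
    if (oid < 0 || oid >= next_id) return -1;
    memcpy(buf, slots[oid].data, slots[oid].size);
    return 0;
}

/* Copy the caller's RAM buffer back into the store. */
int writeObject(MyObjectID oid, const void *buf) {
    if (oid < 0 || oid >= next_id) return -1;
    memcpy(slots[oid].data, buf, slots[oid].size);
    return 0;
}

/* The pattern from the slide: read into RAM, modify, write back. */
int update_object(MyObjectID oid) {
    MyObject *obj = malloc(sizeof(MyObject));
    if (obj == NULL) return -1;
    if (readObject(oid, obj) != 0) { free(obj); return -1; }
    obj->x = 0;
    obj->y = 1;
    obj->z = 2;
    int rc = writeObject(oid, obj);
    free(obj);
    return rc;
}
```

Every access now goes through explicit read and write calls, which is the intrusive modification the previous slides warn about.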

  12. Our Goal
     • Run mostly unmodified applications
     • Work via memory allocators in C-style programs
     • Use the DRAM effectively
     • Use it as an object cache (not as a page cache)
     • Use the SSD wisely
     • As a log-structured object store
     • Reorganize virtual memory allocation to discern object information

  18. SSDAlloc Overview
     [Figure: the memory manager creates 64 objects of 1KB size; the application's virtual memory holds one object per page (OPP), pages 1 to 64; physical memory is divided into a small page buffer and a RAM object cache; the SSD is a log structured object store]
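
The object-per-page layout in this figure could be reserved roughly as follows. This is a sketch under assumptions, not SSDAlloc's code: the function names opp_reserve, opp_object_addr and opp_release are hypothetical, and the use of an anonymous PROT_NONE mapping is one plausible way to hand out virtual pages without committing physical RAM.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve `count` virtual pages, one per object, with no access rights.
   No physical RAM is committed until a page is made accessible and touched. */
void *opp_reserve(size_t count) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *base = mmap(NULL, count * page, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Address of the i-th object: each object starts at its own page boundary,
   so a fault address identifies exactly one object. */
void *opp_object_addr(void *base, size_t index) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (char *)base + index * page;
}

/* Give the whole reservation back to the OS. */
int opp_release(void *base, size_t count) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return munmap(base, count * page);
}
```

Calling opp_reserve(64) mirrors the "64 objects of 1KB size" example: each 1KB object gets a full virtual page to itself, which is why the page buffer has to stay small.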

  25. SSDAlloc Options

     |                 | Object Per Page (OPP)                           | Memory Page (MP)          |
     |-----------------|-------------------------------------------------|---------------------------|
     | Data Entity     | Application Defined Objects                     | 4KB objects (like pages)  |
     | Memory Manager  | Pool Allocator                                  | Coalescing Allocator      |
     | Virtual Memory  | No. of objects * page_size                      | No. of pages * page_size  |
     | Physical Memory | Separate Page Buffer & RAM Object Cache         | No such separation        |
     | SSD Usage       | Log-structured Object Store                     | Log-structured Page Store |
     | Code Changes    | Minimal changes restricted to memory allocation | No changes needed         |
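
The Virtual Memory row can be made concrete with a small calculation. The sketch below assumes that MP packs objects back to back into pages and ignores allocator metadata; the function names are hypothetical.

```c
#include <stddef.h>

/* OPP: one full virtual page per object, whatever the object's size. */
size_t opp_virtual_bytes(size_t num_objects, size_t page_size) {
    return num_objects * page_size;
}

/* MP: objects are packed into pages (assumption: no per-object metadata). */
size_t mp_virtual_bytes(size_t num_objects, size_t object_size,
                        size_t page_size) {
    size_t total = num_objects * object_size;
    size_t pages = (total + page_size - 1) / page_size;  /* round up */
    return pages * page_size;
}
```

For the 64 objects of 1KB from the overview slide and a 4KB page, OPP reserves 64 * 4096 = 262144 bytes of virtual memory, while MP under these assumptions reserves 16 * 4096 = 65536 bytes.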

  34. SSDAlloc Overview
     [Figure: Application, Virtual Memory, Page Buffer, RAM Object Cache and SSD, with arrows for demand fetching and dirty objects]
     • A small set of pages in core
     • Pages materialized on demand from RAM object cache/SSD
     • Restricted in size to minimize RAM wastage (from OPP)
     • Implemented using mprotect
     • Page materialized in seg-fault handler
     • RAM Object Cache continuously flushes dirty objects to the SSD in LRU order
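
The mprotect and seg-fault handler mechanism named in these bullets can be sketched as below. This is an illustration under assumptions, not SSDAlloc's implementation: the names region, faults, on_segv and demand_region_init are hypothetical, the memset stands in for fetching the object from the RAM object cache or the SSD, and there is no eviction of pages back out of the page buffer. Calling mprotect from a signal handler is common practice on Linux but is not guaranteed async-signal-safe by POSIX.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char  *region;                  /* protected virtual range */
static size_t region_len;
static size_t page_size;
static volatile sig_atomic_t faults;   /* pages materialized so far */

/* On a fault inside the region, unprotect the page and fill it in. */
static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)sig;
    (void)ctx;
    char *addr = (char *)info->si_addr;
    if (addr < region || addr >= region + region_len) {
        signal(SIGSEGV, SIG_DFL);      /* not ours: let the retry crash */
        return;
    }
    char *page = (char *)((uintptr_t)addr & ~(uintptr_t)(page_size - 1));
    mprotect(page, page_size, PROT_READ | PROT_WRITE);
    memset(page, 0x2A, page_size);     /* stand-in for fetching the object */
    faults++;
}

/* Reserve `pages` inaccessible pages and install the fault handler. */
int demand_region_init(size_t pages) {
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region_len = pages * page_size;
    region = mmap(NULL, region_len, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;          /* needed to receive si_addr */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

After demand_region_init, the first read of any byte in region faults once, the handler fills the page, and the faulting instruction is retried transparently; later accesses to the same page run at full speed.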

  40. SSD Maintenance
     [Figure: Virtual Memory, Object Tables, RAM Object Cache and SSD, with dirty objects flowing from the cache to the SSD]
     • Copy-and-compact garbage-collector/log-writer
     • Seek optimizations not needed
     • Read at the head and write live and dirty objects
     • Use Object Tables to determine liveness
     • Garbage is disposed
     • Objects written elsewhere are garbage
     • OPP object which is "free" is garbage
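
The copy-and-compact idea in these bullets can be illustrated with a toy log held in memory. The sketch assumes a fixed-size circular array in place of the SSD and an object table that records the log position of each object's newest copy; all names (log_write, log_free, gc_step, log_read) are hypothetical. A record at the head is live only if the object table still points at it; anything else was either rewritten elsewhere or freed.

```c
#include <stddef.h>

#define LOG_CAP 16
#define NUM_OBJ 4
#define FREED   (-1)

typedef struct { int id; int value; } Record;

static Record log_store[LOG_CAP];   /* circular log standing in for the SSD */
static int head = 0, tail = 0;      /* absolute positions; slot = pos % LOG_CAP */
static int object_table[NUM_OBJ] = { FREED, FREED, FREED, FREED };

/* Append the newest copy of an object at the tail. Any older copy of the
   same object becomes garbage because the table no longer points at it. */
int log_write(int id, int value) {
    if (id < 0 || id >= NUM_OBJ) return -1;
    if (tail - head >= LOG_CAP) return -1;   /* full: run gc_step first */
    log_store[tail % LOG_CAP] = (Record){ id, value };
    object_table[id] = tail++;
    return 0;
}

/* Freeing an object turns its logged copy into garbage. */
void log_free(int id) {
    if (id >= 0 && id < NUM_OBJ) object_table[id] = FREED;
}

/* Read one record at the head. Live records are copied to the tail,
   garbage is dropped. Returns 1 if copied, 0 if reclaimed, -1 if empty. */
int gc_step(void) {
    if (head == tail) return -1;
    Record r = log_store[head % LOG_CAP];
    int live = (object_table[r.id] == head);
    head++;                                  /* the head slot is now free */
    if (live) {
        log_write(r.id, r.value);
        return 1;
    }
    return 0;
}

/* Look up the newest copy of an object through the object table. */
int log_read(int id, int *value) {
    if (id < 0 || id >= NUM_OBJ || object_table[id] == FREED) return -1;
    *value = log_store[object_table[id] % LOG_CAP].value;
    return 0;
}
```

Dirty objects coming from the RAM object cache would be appended with the same log_write call, so reads at the head and writes at the tail stay sequential.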

  42. Implementation
     • 11,000 lines of C++ code (runtime library)
     • Implemented using mprotect, mmap, and madvise
     • SSDAlloc-OPP pool and array allocator
     • SSDAlloc-MP coalescing allocator (array allocations)
     • SSDFree frees the allocated data
     • Can coexist with malloc pointers

  45. SSD Usage Techniques

     | Technique               | Write Logging | Access < 4KB | Finegrained GC | Avoid DRAM Pollution | High Performance | Programming Ease |
     |-------------------------|---------------|--------------|----------------|----------------------|------------------|------------------|
     | SSD Swap                |               |              |                |                      |                  | ✔                |
     | SSD Swap (Write Logged) | ✔             |              |                |                      |                  | ✔                |
     | Application Rewrite     | ✔             | ✔            | ✔              | ✔                    | ✔                |                  |
     | SSDAlloc                | ✔             | ✔            | ✔              | ✔                    | ✔                | ✔                |

  49. SSDAlloc Runtime Overhead
     • Overhead for SSDAlloc runtime intervention

     | Overhead Source       | Max Latency |
     |-----------------------|-------------|
     | TLB Miss (DRAM Read)  | 0.014 μs    |
     | Object Table Lookup   | 0.046 μs    |
     | Page Materialization  | 0.138 μs    |
     | Page Dematerialization| 0.172 μs    |
     | Signal Handling       | 0.666 μs    |
     | Combined Overhead     | 0.833 μs    |

     • NAND Flash latency ~ 30-50 μs
     • Can reach 1 Million IOPS
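
As a rough check on these figures, the combined overhead can be set against the flash latency range quoted on the slide. The second line assumes that the runtime intervention is the only per-operation cost on a single thread, which is a reading of the "1 Million IOPS" bullet and not something the slide states.

```latex
\begin{align*}
\frac{0.833\,\mu\text{s}}{50\,\mu\text{s}} \approx 1.7\%,
  \qquad
  \frac{0.833\,\mu\text{s}}{30\,\mu\text{s}} \approx 2.8\% \\
\frac{1}{0.833\,\mu\text{s}} \approx 1.2 \times 10^{6}\ \text{operations per second}
\end{align*}
```

Under that assumption the runtime adds under 3% to a flash access and leaves room for somewhat more than one million operations per second.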

  52. Experiments
     • Comparing three allocation methods
     • malloc replaced with SSDAlloc-OPP
     • malloc replaced with SSDAlloc-MP
     • Swap
     • 2.4GHz quad-core CPU with 16GB RAM
     • RiData, Kingston, Intel X25-E, Intel X25-V and Intel X25-M

  54. Results Overview

     | Application  | Original LOC | Modified LOC | SSDAlloc-OPP's gain vs Swap | SSDAlloc-OPP's gain vs SSDAlloc-MP |
     |--------------|--------------|--------------|-----------------------------|------------------------------------|
     | Memcached    | 11,193       | 21           | 5.5 - 17.4x                 | 1.4 - 3.5x                         |
     | B+Tree Index | 477          | 15           | 4.3 - 12.7x                 | 1.4 - 3.2x                         |
     | Packet Cache | 1,540        | 9            | 4.8 - 10.1x                 | 1.3 - 2.3x                         |
     | HashCache    | 20,096       | 36           | 5.3 - 17.1x                 | 1.3 - 3.3x                         |

     • SSDAlloc applications write up to 32 times less data to the SSD than traditional VM-style applications

  55. Microbenchmarks
