An Efficient Wear-level Architecture using Self-adaptive Wear Leveling
Jianming Huang, Yu Hua, Pengfei Zuo, Wen Zhou, Fangting Huang
Huazhong University of Science and Technology
ICPP 2020
Non-volatile Memory
NVM features:
- Non-volatility
- Byte-addressability
- Large capacity
- DRAM-scale latency
NVM drawbacks (e.g., Intel Optane DC Persistent Memory):
- Limited endurance
- High write energy consumption
Multi-level Cell NVM
The MLC technique has been used in different kinds of NVM, including PCM, RRAM, and STT-RAM.
[Figure: cell threshold-voltage (V_th) distributions — one bit per cell for SLC (states 1/0) vs. two bits per cell for MLC (states 01/11/10/00)]
Compared with SLC NVM, MLC NVM offers:
- Higher storage density
- Lower cost
- Comparable read latency
- Weaker endurance (about 10^5 writes for MLC PCM vs. 10^7 for SLC PCM)
Wear-leveling schemes are therefore necessary and important.
Wear-leveling Schemes
- Table-based wear-leveling scheme (TBWL)
- Algebraic wear-leveling scheme (AWL)
- Hybrid wear-leveling scheme (HWL)
Wear-leveling Schemes: Table-based wear-leveling scheme (TBWL)
TBWL keeps a table of (logical address LA, physical address PA, write count WC) entries. When a line's write count reaches the wear-leveling threshold, segment swapping is triggered: the hot line is swapped with a cold line, and their PA and WC fields are exchanged.
[Table: before swapping, the hot line has WC 519 and the cold line WC 115; after segment swapping, the hot logical address maps to the cold physical line, whose counter continues from 116, while the worn line's counter continues from 521]
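The bookkeeping above can be sketched in a few lines. This is a hedged illustration, not the exact design from any TBWL paper: the threshold value and the dict-based table layout are assumptions for demonstration.

```python
# Hypothetical sketch of table-based wear leveling (TBWL).
# Each logical line (LA) maps to an entry holding its physical
# address (pa) and the write count (wc) of that physical line.
THRESHOLD = 5  # illustrative; real thresholds are much larger

def tbwl_write(table, la):
    """Count a write to `la`; trigger segment swapping once its line is hot."""
    table[la]["wc"] += 1
    if table[la]["wc"] >= THRESHOLD:
        # Segment swapping: exchange with the least-written line.
        cold = min(table, key=lambda k: table[k]["wc"])
        if cold != la:
            # PA and WC travel together: the counter tracks the
            # physical line, not the logical address.
            table[la], table[cold] = table[cold], table[la]
```

After enough repeated writes to one logical address, its entry points at a different (cold) physical line, which is exactly how TBWL spreads wear across the region.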
Wear-leveling Schemes: Algebraic wear-leveling scheme (AWL)
AWL replaces the table with an algebraic mapping, e.g., Start-Gap, which keeps only a Start pointer and a Gap line. Each wear-leveling step copies the line adjacent to the gap into the gap slot, moving the gap by one; once the gap has traversed the whole region, every line has shifted by one slot and Start advances.
[Figure: five steps of Start-Gap moving the gap line through lines A-E, from the initial state to the final state]
Region-based Start-Gap (RBSG) applies this scheme per region.
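The Start-Gap movement above can be sketched as follows. It follows the classic Start-Gap formulation (n logical lines stored in n+1 physical slots), with the details simplified for illustration.

```python
# Sketch of the Start-Gap algebraic mapping used by AWL schemes such
# as region-based Start-Gap (RBSG). No per-line table is needed:
# two registers (Start, Gap) define the whole logical-to-physical map.
class StartGap:
    def __init__(self, n):
        self.n, self.start, self.gap = n, 0, n  # slot n starts as the gap

    def translate(self, la):
        """Algebraic logical-to-physical translation."""
        pa = (la + self.start) % self.n
        return pa + 1 if pa >= self.gap else pa  # skip over the gap slot

    def step(self, mem):
        """One wear-leveling step: copy the neighbor into the gap slot."""
        src = self.n if self.gap == 0 else self.gap - 1
        mem[self.gap] = mem[src]
        if self.gap == 0:
            # The gap wrapped around: every line shifted by one slot.
            self.gap, self.start = self.n, (self.start + 1) % self.n
        else:
            self.gap -= 1
```

Because the translation is pure arithmetic, an attacker cannot learn the mapping from a table, but the mapping is still predictable, which is what the BPA attack later exploits.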
Wear-leveling Schemes: Hybrid wear-leveling scheme (HWL)
HWL combines both approaches: a table in SRAM on the memory controller maps regions, while lines are shifted algebraically inside each region, e.g., PCM-S. One line is read out into the controller, the remaining lines shift by one slot, and the buffered line is written back.
[Figure: PCM-S rotating lines A-H across physical region numbers prn0 and prn2 in three steps: read out, line shift, write back]
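The read-out / line-shift / write-back rotation can be sketched as a one-slot rotation. This is a generic illustration of the step, not PCM-S's exact datapath.

```python
def shift_region(region):
    """One wear-leveling step inside an HWL region: read one line out
    into the controller's SRAM buffer, shift the remaining lines by one
    slot, and write the buffered line back at the other end."""
    buf = region[0]            # read out
    region[:-1] = region[1:]   # line shift within the NVM region
    region[-1] = buf           # write back
    return region
```

Repeating the step region-size times returns every line to its original slot, having spread one extra write across all slots.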
RAA Attack on TBWL
RAA attack:
- Repeated Address Attack (RAA)
- Repeatedly write data to the same address
The lifetime of TBWL under an RAA attack:
- One region contains a limited number of lines
- All lines in the region are repeatedly written
- The NVM is worn out at an early stage
- Lifetime ≈ (the number of lines within a region) × (the endurance of a line)
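Plugging illustrative numbers into the bound above (a hypothetical 64-line region of MLC PCM, with the 10^5 endurance cited earlier) shows how quickly a region dies:

```python
lines_per_region = 64        # illustrative region size, not from the paper
cell_endurance = 10 ** 5     # MLC PCM endurance cited earlier

# Under RAA, wear leveling only rotates the attacked writes inside
# one region, so the region fails once every line absorbs its endurance.
writes_until_failure = lines_per_region * cell_endurance
print(writes_until_failure)  # 6400000
```

A few million writes is something a DIMM-speed attacker can issue in seconds to minutes, which is why the slide calls this an early-stage wear-out.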
BPA Attack on AWL
BPA attack:
- Birthday Paradox Attack (BPA)
- Randomly select logical addresses and repeatedly write to each of them
The lifetime of AWL under a BPA attack is low.
BPA Attack on HWL
The lifetime of HWL under a BPA attack:
- A smaller wear-leveling granularity increases the NVM lifetime
Problems of Existing Wear-leveling Schemes
TBWL and AWL fail to defend against attacks:
- An RAA attack leads to a low lifetime in TBWL
- A BPA attack leads to a low lifetime in AWL
HWL obtains a high lifetime only with a small granularity:
- A large granularity leads to a low lifetime
- A small granularity leads to a high lifetime
However, the cache hit ratio of HWL is also affected by the granularity:
- The number of wear-leveling entries stored in the cache is limited
- Entries with a large granularity cover a large NVM range → high cache hit ratio
- Entries with a small granularity cover a small NVM range → low cache hit ratio
High performance and a long lifetime are thus in conflict. How to address the conflict between lifetime and performance is important.
SAWL
SAWL: a self-adaptive wear-leveling scheme that achieves both a high lifetime and high performance:
- High hit ratio & unbalanced write distribution → split regions to decrease the granularity
- Low hit ratio & uniform write distribution → merge regions to increase the granularity
Architecture of SAWL
[Figure: the memory controller holds the Global Translation Directory (GTD) and the Cached Mapping Table (CMT, in SRAM); the Integrated Mapping Table (IMT) resides in DRAM; the NVM is divided into translation lines and data lines, organized into regions that can be split or merged. CMT entries have the form (lrn, wlg, prn, key).]
- IMT (translation lines): records the locations where the user data are actually stored → wear-levels the data lines
- CMT: buffers the recently used IMT entries → reduces the translation latency
- GTD: records the relationships between the logical and physical addresses of the translation lines → wear-levels the translation lines
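Address translation through the CMT and IMT can be sketched as below. This is a hedged illustration: the in-region mapping here is a simple rotation by `key`, an assumption made for demonstration (the paper's algebraic mapping may differ), and `cmt`/`imt` are plain dicts standing in for the SRAM cache and the DRAM table.

```python
def translate(la, cmt, imt, region_size):
    """Map a logical address to a physical one via the cached entries.

    cmt/imt: dicts of logical region number -> (prn, key); the real
    tables also carry a wlg field, omitted here for brevity.
    """
    lrn, off = divmod(la, region_size)   # logical region number + offset
    entry = cmt.get(lrn)
    if entry is None:                    # CMT miss: walk to the IMT in DRAM
        entry = imt[lrn]
        cmt[lrn] = entry                 # cache the fetched entry
    prn, key = entry                     # physical region number + region key
    return prn * region_size + (off + key) % region_size
```

The performance argument on the earlier slides falls out of this code path: a CMT hit resolves in SRAM, while a miss pays an extra DRAM access, so the hit ratio directly sets the average translation latency.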
Operations in SAWL: Merging Regions
1. Choose two unmerged neighboring logical regions.
2. Physically exchange the data to satisfy the algebraic mapping between the logical and physical regions.
3. Update the relevant CMT entries in the SRAM and the IMT table in the NVM.
[Figure: two logical regions merged into one; the CMT entry (lrn, wlg, prn, key) and the corresponding IMT entries are updated]
Operations in SAWL: Splitting Regions
1. Logically split the region without moving data: the data already satisfy the algebraic mapping of the two sub-regions.
2. Update the CMT entries and the IMT table.
[Figure: one logical region split into two; only the CMT and IMT entries change]
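Why a split can be metadata-only is worth a small demonstration. Assume, hypothetically, an XOR-based in-region mapping pa_offset = la_offset ^ key over 2^m lines (the paper's actual mapping may differ): an XOR permutation decomposes exactly into two half-sized XOR permutations, so splitting only rewrites the table entries.

```python
def split_keys(key, m):
    """Split a 2^m-line region with XOR key `key` into two 2^(m-1)-line
    regions. For each logical half t, return the physical half the data
    already occupies and the key of the child mapping."""
    half = 1 << (m - 1)
    key_hi, key_lo = key >> (m - 1), key & (half - 1)
    # the top key bit decides which physical half each logical half uses
    return [(t ^ key_hi, key_lo) for t in (0, 1)]
```

Every address translates identically before and after the split, so no line has to move; this is the sense in which the data "already satisfy the algebraic mapping".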
When to Adjust the Region Size
We use the cache hit ratio as the trigger to merge/split regions:
- A hit ratio below 90% significantly decreases the performance
- A hit ratio above 95% only slightly impacts the performance
Policy: split the regions when the hit ratio >= 95%; merge the regions when the hit ratio <= 90%.
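The trigger policy amounts to a small check at the end of each observation window. The two thresholds come from the slide; the function shape around them is a sketch.

```python
SPLIT_THRESHOLD = 0.95   # from the slide: above this, performance barely suffers
MERGE_THRESHOLD = 0.90   # from the slide: below this, performance drops sharply

def adjust_granularity(hit_ratio):
    """Pick the region-size action for the next window."""
    if hit_ratio >= SPLIT_THRESHOLD:
        return "split"   # cache is comfortable: shrink regions for better leveling
    if hit_ratio <= MERGE_THRESHOLD:
        return "merge"   # cache is thrashing: grow regions to raise the hit ratio
    return "keep"        # in the 90-95% band, leave the granularity alone
```

The dead band between the two thresholds prevents the controller from oscillating between split and merge on small hit-ratio fluctuations.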
Parameters in SAWL
SOW: size of the observation window
- A small SOW → the cache hit ratio fluctuates frequently
- A large SOW → the system misses important trigger points
- We use 2^22 as the SOW value
Parameters in SAWL
SSW: size of the settling window
- A small SSW → the region size is adjusted too frequently
- A large SSW → the region size fails to be sufficiently adjusted
- We use 2^22 as the SSW value
Experimental Setup
The simulated system is configured via Gem5.
Comparisons:
- Baseline: an NVM system without a wear-leveling scheme
- NWL-4: a naive wear-leveling scheme with a region consisting of 4 memory lines
- NWL-64: a naive wear-leveling scheme with a region consisting of 64 memory lines
- AWL schemes: RBSG and TLSR
- HWL schemes: PCM-S and MWSR
Benchmarks: SPEC CPU2006