Approximate Storage in Solid-State Memories. Adrian Sampson, Jacob Nelson, Karin Strauss, Luis Ceze. University of Washington; Microsoft Research & UW. MICRO 2013
Compiler Runtime Vector GPU CPU Processor Accelerator
Network Disk Display Memory I/O Compute Storage
Disk Memory I/O Compute Storage
[Figure: output quality loss (0% to 100%) plotted against average write steps (3 down to 1.6) for benchmarks including jmeint. Caption: Main-memory applications using failed blocks.]
Themes in approximate computing. Interleaving: programs are both approximate & precise. Error mitigation: exploit the hardware to minimize error.
Phase-change memory (PCM) :) Non-volatile. Surpasses DRAM’s scaling limits. Faster than flash. “Almost” as fast as DRAM.
Phase-change memory (PCM) :( Write speed & energy. Cells wear out over time.
Phase-change memory (PCM) :( Write speed & energy: multi-level cells are denser but need more time and energy. Cells wear out over time and can no longer be used.
Phase-change memory (PCM) :( Multi-level cells are denser but need more time and energy to protect against errors. Cells wear out over time and can no longer be used for precise data storage.
Phase-change memory (PCM) :( Fast. Dense.
Phase-change memory (PCM) :( Fast. Dense. Accurate.
Approximate storage in PCM. 1. Trade off accuracy for performance in multi-level cell accesses. 2. Use worn-out memory for approximate data instead of throwing it away.
Approximate storage in PCM, part 1: Trade off accuracy for performance in multi-level cell accesses.
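Single-level cells [Figure: the analog value range, from low to high, maps to one digital bit: low = 0, high = 1.]
Multi-level cells [Figure: the analog value range, from low to high, is divided into four digital values: 00, 01, 10, 11.]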
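The mapping from analog value to digital value on the two slides above can be sketched in a few lines. This is an illustrative model only: it assumes the analog value is normalized to the range 0.0 to 1.0 and that the levels are evenly spaced, and the function names `read_level` and `level_center` are hypothetical.

```python
def read_level(analog, bits=2):
    """Quantize a normalized analog value (0.0 to 1.0) into one of 2**bits levels."""
    levels = 1 << bits
    # Clamp so that out-of-range values map to the lowest or highest level.
    return max(0, min(int(analog * levels), levels - 1))


def level_center(level, bits=2):
    """Analog value at the middle of a level's range (the ideal write target)."""
    levels = 1 << bits
    return (level + 0.5) / levels


# A single-level cell is the bits=1 case; a four-level cell stores two bits.
for analog in (0.10, 0.30, 0.60, 0.90):
    print(f"analog {analog:.2f} -> SLC {read_level(analog, bits=1)}, "
          f"MLC {read_level(analog, bits=2):02b}")
```

Storing more bits per cell narrows each level's analog range, which is why denser cells need more careful (slower) writes.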
Writing to multi-level cells [Figure: probability distribution of the written analog value around each of the levels 00, 01, 10, 11.]
Writing to multi-level cells, approximately [Figure: the same probability distributions for approximate writes.]
Speed Density Accuracy
Iterative writes [Figure: the cell's analog value over time, stepping toward the target range of one of the levels 00, 01, 10, 11.]
Iterative writes, approximately [Figure: the cell's analog value over time, stepping toward a target range among the levels 00, 01, 10, 11.]
Wider target range → fewer iterations to converge → faster writes (or better density at the same speed)
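The chain of reasoning above can be checked with a toy program-and-verify loop. This is a minimal sketch under assumed parameters, not the write model from the talk: each pulse is assumed to close the remaining gap to the target with a Gaussian relative error (`pulse_noise`), and all names are hypothetical.

```python
import random


def iterative_write(target, half_width, rng, pulse_noise=0.3, max_steps=100):
    """Pulse toward `target` until the value is within +/- half_width.

    Returns the number of pulses used.
    """
    value = 0.0
    for step in range(1, max_steps + 1):
        gap = target - value
        # Assumed model: each pulse overshoots or undershoots the gap
        # by a random relative amount.
        value += gap * rng.gauss(1.0, pulse_noise)
        # Verify step: stop as soon as the value is inside the target range.
        if abs(target - value) <= half_width:
            return step
    return max_steps


def average_steps(half_width, trials=2000, target=0.625):
    """Average pulse count; each trial uses its own seed so runs are comparable."""
    total = 0
    for trial in range(trials):
        total += iterative_write(target, half_width, random.Random(trial))
    return total / trials


narrow = average_steps(half_width=0.025)  # precise write
wide = average_steps(half_width=0.100)    # approximate write
print(f"narrow range: {narrow:.2f} pulses, wide range: {wide:.2f} pulses")
```

Because each trial replays the same pulse sequence for both widths, the wider range can never need more pulses than the narrow one; the interesting quantity is how many fewer it needs.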
Encoding to minimize error in approximate MLC. 1 cell, 4 bits: 0 0 0 0, ranging from reliable to unreliable.
Encoding to minimize error in approximate MLC. 4 cells, 16 bits: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0. Lots of errors.
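To see why the layout of a 16-bit value across four 4-bit cells matters, the sketch below compares two hypothetical layouts. It assumes that the first bit of each cell is the most reliable and the last the least, and it is not necessarily the encoding used in the talk. The "concatenated" layout stores consecutive value bits in one cell; the "striped" layout places the value's four highest-order bits in the most reliable position of each cell.

```python
CELLS, BITS_PER_CELL = 4, 4
WIDTH = CELLS * BITS_PER_CELL


def to_bits(value):
    """16-bit value as a list of bits, most significant first."""
    return [(value >> (WIDTH - 1 - i)) & 1 for i in range(WIDTH)]


def from_bits(bits):
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def concat_encode(bits):
    return [bits[c * BITS_PER_CELL:(c + 1) * BITS_PER_CELL] for c in range(CELLS)]


def concat_decode(cells):
    return [bit for cell in cells for bit in cell]


def stripe_encode(bits):
    # Cell c, position p holds value bit p * CELLS + c.
    return [[bits[p * CELLS + c] for p in range(BITS_PER_CELL)] for c in range(CELLS)]


def stripe_decode(cells):
    return [cells[c][p] for p in range(BITS_PER_CELL) for c in range(CELLS)]


def flip_least_reliable(cells):
    """Assumed fault: the last (least reliable) bit of every cell flips."""
    return [cell[:-1] + [cell[-1] ^ 1] for cell in cells]


def error_after_fault(encode, decode, value):
    damaged = flip_least_reliable(encode(to_bits(value)))
    return abs(from_bits(decode(damaged)) - value)


print("concatenated:", error_after_fault(concat_encode, concat_decode, 0))  # 4369
print("striped:     ", error_after_fault(stripe_encode, stripe_decode, 0))  # 15
```

With the same number of flipped bits, the striped layout confines the damage to the value's four lowest-order bits.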
Write speedup for approximate MLC [Bar chart: best write speedup (axis 0 to 2.5) per benchmark, grouped into main-memory benchmarks and persistent data: mc, smm, sor, fft, lu, zxing, jmeint, raytracer, pa, nn, ml, image, mean.] Writes are 1.7× faster on average with quality loss under 10%.
Approximate storage in PCM, part 2: Use worn-out memory for approximate data instead of throwing it away.
Failed cells are a fact of life [Figure: a block of bits, labeled "a good block".]
Failed cells are a fact of life [Figure: the same block with failed cells, labeled "a (tragically) failed block".]
Traditional error correction [Figure: a data block plus correction bits yields a corrected data block.]
Correction resources are exhaustible [Figure: a data block whose correction bits are used up becomes an uncorrectable (bad) block, labeled "approximate".]
Prioritized error correction [Figure: an uncorrectable (bad) block with correction bits, labeled "approximate".] Error is exposed where it does the least harm.
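One way to picture prioritized correction is as a ranking problem: with fewer correction entries than failed bits, spend them on the failures that would distort the stored values most. The sketch below is an illustration under stated assumptions, not the hardware mechanism from the talk: it assumes words are laid out most-significant-bit first within the block, and the names `prioritize` and `word_bits` are hypothetical.

```python
def prioritize(failed_bits, budget, word_bits=8):
    """Split failed bit positions into (corrected, exposed).

    failed_bits: bit indices within the block, 0 being the first bit.
    budget: number of failures the remaining correction resources can repair.
    """
    def significance(index):
        # Assumed layout: each word is stored MSB first, so the first bit
        # of a word has significance word_bits - 1.
        return word_bits - 1 - (index % word_bits)

    # Repair the failures in the highest-order bit positions first.
    ranked = sorted(failed_bits, key=significance, reverse=True)
    return ranked[:budget], ranked[budget:]


def worst_case_error(exposed, word_bits=8):
    """Largest numeric error a single uncorrected bit can cause in its word."""
    if not exposed:
        return 0
    return max(1 << (word_bits - 1 - (i % word_bits)) for i in exposed)


corrected, exposed = prioritize([3, 8, 15, 21], budget=2)
print("corrected:", corrected)  # [8, 3]
print("exposed:  ", exposed)    # [21, 15]
print("worst single-bit error:", worst_case_error(exposed))  # 4
```

Without prioritization, the failure at bit 8 (the top bit of the second word) could be left exposed, giving a worst-case error of 128 instead of 4.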
Lifetime extension with block recycling [Bar chart: normalized lifetime in writes (axis 0 to 2) per benchmark, grouped into main-memory benchmarks and persistent data: mc, smm, sor, fft, lu, zxing, jmeint, raytracer, pa, nn, ml, image, mean.] Lifetime is extended by 23% on average, or from about 5.2 to 6.5 years.
Network Disk Display Memory I/O Compute Storage