
Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors



  1. Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors (ISCA 2014)
     Yoongu Kim¹, Ross Daly¹, Jeremie Kim¹, Chris Fallin¹, Ji Hye Lee¹, Donghyuk Lee¹, Chris Wilkerson², Konrad Lai, Onur Mutlu¹
     ¹ Carnegie Mellon University, ² Intel Labs
     Presented by Sam Schiferl and Pedram Zamirai

  2. Outline
     1. Motivation
     2. DRAM Structure
     3. Disturbance Errors
     4. Test System Setup
     5. Results
     6. Proposed Solution
     7. Conclusion
     8. Discussion

  3. Motivation
     ● As DRAM process technology continues to scale down, memory reliability suffers because:
       ○ Smaller cells hold a limited amount of charge
       ○ Cells are closer together, which can lead to electromagnetic coupling
       ○ Process variation is higher
     ● These issues can lead to violations of memory isolation
       ○ An access to one memory address should not have unintended side effects on data stored in other addresses
     ● The authors investigate how vulnerable commodity DRAM from three major manufacturers is to targeted disturbance error attacks

  4. DRAM Structure
     ● Charge stored in a capacitor represents 0/1
     ● An access transistor is used to read/write data to a specific cell
     [Figures from the paper: a single memory cell; rows of cells]

  5. DRAM Access
     1. The row's wordline is raised to high
     2. The row buffer reads/writes the desired columns
     3. The row's wordline is lowered, closing the row
     [Figures from the paper: a single memory cell; rows of cells]

  10. DRAM Refresh
     ● The charge of a memory cell constantly leaks, eventually leading to a loss of data
     ● Data must be refreshed periodically by raising the wordline
     ● DRAM specifications guarantee a retention time before the cell loses data
       ○ 64 ms retention time for DDR3

  12. Disturbance Errors
     ● Unwanted interaction between two circuit components that should be isolated
     ● Repeatedly toggling the voltage of a wordline (the aggressor) can cause cells in nearby rows (the victims) to leak charge at a faster rate
       ○ Victims lose their charge before the next refresh
     ● Causes:
       ○ Noise injection
       ○ Bridges
       ○ Hot-carrier injection

  13. Disturbance Error Attack
     ● Repeatedly read data from the same row in DRAM and track bit flips in other DRAM rows
     ● Flush the line from the cache after each read
     Induces errors (X & Y map to the same bank, but different rows):
       code1a:
         mov (X), %eax
         mov (Y), %ebx
         clflush (X)
         clflush (Y)
         mfence
         jmp code1a
     Does not induce errors:
       code1b:
         mov (X), %eax
         clflush (X)
         mfence
         jmp code1b
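Why the loop that touches both X and Y induces errors while the X-only loop does not can be illustrated with a toy model of one DRAM bank. The sketch below is not from the paper: it assumes an open-row policy (a row stays in the row buffer until a different row in the same bank is accessed), assumes every read reaches DRAM because of the cache flush, and ignores refresh. The row numbers and the function name `count_activations` are hypothetical.

```python
def count_activations(accesses):
    """Count row activations in a single bank under an assumed open-row policy.

    accesses: sequence of row numbers, one per read that reaches DRAM.
    A new activation (wordline toggle) happens only when the requested row
    differs from the row currently held in the row buffer.
    """
    open_row = None
    activations = 0
    for row in accesses:
        if row != open_row:
            activations += 1  # close the old row, open the new one
            open_row = row
    return activations


X, Y = 0, 1000  # hypothetical rows in the same bank
n = 1000        # loop iterations

alternating = count_activations([X, Y] * n)  # models the X/Y loop
single = count_activations([X] * (2 * n))    # models the X-only loop
```

Under these assumptions the alternating pattern activates a row on every read, while the single-row pattern activates row X once and then keeps hitting the row buffer, so its wordline is never toggled repeatedly.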

  14. Experimental Methodology
     ● Testing platform
       ○ 8 Xilinx FPGA boards
       ○ DDR3-800 memory controller
       ○ Run at 50 °C
     ● DRAM modules
       ○ 129 DDR3 DRAM modules
       ○ 972 DRAM chips
     ● Test parameters
       ○ Activation Interval (AI)
       ○ Refresh Interval (RI)
       ○ Data Pattern (DP)

  15. Types of Tests
     1. Toggle every row in a module repeatedly and locate all disturbed cells
       ○ Quickly identifies all disturbed cells throughout an entire module
     2. Toggle a single row repeatedly and identify the specific cells it disturbs
       ○ Correlates victim cells with aggressor rows

  16. Manufacturing Date
     ● No errors in the 19 oldest modules
     ● Disturbance errors are a relatively recent phenomenon

  17. Effective Parameters
     ● Access patterns
       ○ Repeated toggling of the wordline
       ○ Opening & closing the row cause the problem

         Access Pattern           Disturbance Errors?
         (open-read-close)^N      Yes
         (open-write-close)^N     Yes
         open-read^N-close        No
         open-write^N-close       No

     ● Refresh interval (RI)
     ● Activation interval (AI)
     ● Data patterns

  18. Effective Parameters
     ● Access patterns
     ● Refresh interval (RI)
       ○ RI ↓ ⇒ Errors ↓
         ■ Less leakage between refreshes
         ■ Fewer row openings per refresh interval
     ● Activation interval (AI)
     ● Data patterns

  19. Effective Parameters
     ● Access patterns
     ● Refresh interval (RI)
     ● Activation interval (AI)
       ○ AI ↑ ⇒ Errors ↓
         ■ Fewer row openings in each RI
     ● Data patterns
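Both trends (shorter RI and longer AI reduce errors) come down to how many times an aggressor row can be opened before its victims are refreshed. A minimal arithmetic sketch follows, assuming back-to-back activations of one row for the whole refresh interval. The 64 ms value is the DDR3 figure from the refresh slide; the 55 ns activation interval is a hypothetical value chosen only for illustration, and `max_activations` is an illustrative name.

```python
def max_activations(refresh_interval_ns: int, activation_interval_ns: int) -> int:
    """Upper bound on row openings that fit in one refresh interval."""
    if refresh_interval_ns <= 0 or activation_interval_ns <= 0:
        raise ValueError("intervals must be positive")
    return refresh_interval_ns // activation_interval_ns


RI_NS = 64_000_000  # 64 ms refresh interval (DDR3, from the slides)
AI_NS = 55          # hypothetical activation interval, for illustration only

baseline = max_activations(RI_NS, AI_NS)
halved_ri = max_activations(RI_NS // 2, AI_NS)  # RI down => fewer openings
doubled_ai = max_activations(RI_NS, 2 * AI_NS)  # AI up   => fewer openings
```

Halving RI or doubling AI each cut the bound on openings per refresh interval roughly in half, which is the mechanism the two slides describe.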

  20. Effective Parameters
     ● Access patterns
     ● Refresh interval (RI)
     ● Activation interval (AI)
     ● Data patterns
       ○ Victim cells lose charge when they are disturbed
       ○ True-cell: high voltage = 1
       ○ Anti-cell: high voltage = 0
       ○ True-cells are dominant
       ○ Errors are mostly 1 → 0
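The link between cell type and flip direction can be made explicit with a small model. This is a simplified sketch that assumes a disturbed cell discharges completely; the function names are hypothetical.

```python
def value_after_charge_loss(cell_type: str, stored_bit: int) -> int:
    """Bit read back from a cell that has fully discharged.

    True-cell: charged = 1, so a discharged cell reads 0.
    Anti-cell: charged = 0, so a discharged cell reads 1.
    """
    if cell_type not in ("true", "anti"):
        raise ValueError("cell_type must be 'true' or 'anti'")
    if stored_bit not in (0, 1):
        raise ValueError("stored_bit must be 0 or 1")
    return 0 if cell_type == "true" else 1


def flips(cell_type: str, stored_bit: int) -> bool:
    """True if losing charge changes the stored value."""
    return value_after_charge_loss(cell_type, stored_bit) != stored_bit
```

In this model only a true-cell holding 1 or an anti-cell holding 0 can flip, so a module dominated by true-cells shows mostly 1 → 0 errors, and the stored data pattern determines which cells are exposed.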

  21. Address Correlation
     ● No errors in the aggressor row itself
     ● Strong peaks at ±1
       ○ Greatest effect on the two immediate neighbors
       ○ Logical and physical adjacency are highly correlated
     ● Errors in non-adjacent rows
       ○ Physically adjacent ⇎ logically adjacent

  22. Sensitivity Results
     ● Errors are mostly repeatable
       ○ Ten iterations of testing
       ○ Relatively constant average number of errors (±0.25%)
     ● Victim cells ≠ weak cells
       ○ Weak cells = cells with the shortest retention time
     ● Not strongly affected by temperature
       ○ ±20 °C from ambient temperature → no effect

  23. Probabilistic Adjacent Row Activation (PARA)
     ● Each time a row is closed, the memory controller refreshes one of its adjacent rows with probability p (a small constant)
       ○ Stateless solution
     ● The neighbor to refresh is picked at random
     ● Number of accesses ↑ ⇒ probability that the neighbors have been refreshed ↑
     ● Cannot prevent disturbance errors with absolute certainty
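The claim that more accesses make a refresh more likely, but never certain, can be checked with a short calculation. The sketch below assumes each row closing is an independent trial: with probability p the controller refreshes one of the two neighbors, chosen uniformly, so a given victim row is refreshed with probability p/2 per closing. The function name and the sample values are illustrative, not taken from the paper.

```python
def miss_probability(p: float, closes: int) -> float:
    """Probability that one given neighbor is never refreshed.

    p: per-closing probability that PARA refreshes a neighbor.
    closes: number of times the aggressor row has been closed.
    Assumes independent trials and a uniform choice between two neighbors.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if closes < 0:
        raise ValueError("closes must be non-negative")
    return (1.0 - p / 2.0) ** closes


few = miss_probability(0.001, 1_000)     # early in an attack
many = miss_probability(0.001, 100_000)  # after many aggressor closings
```

The miss probability shrinks geometrically with the number of closings, yet it stays above zero for any p below 1, which is why the scheme gives a probabilistic guarantee rather than an absolute one.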

  24. Conclusion
     ● Demonstrated, characterized, and analyzed disturbance errors
     ● Repeated accesses to the same row corrupt data in other rows
     ● Emerging problem (affects current and future computing systems)
     ● Proposed several solutions

  27. Discussion Points
     ● Does the type of processor (ARM vs x86) have an effect on the feasibility of the attack?
     ● How practical is the authors' PARA solution, which relies on probabilistically refreshing candidate victim rows?
     ● Should this attack be mitigated with a software or a hardware solution?

  28. Potential Solutions

     Solution                                                  | Probable Defect
     Make better chips                                         | Future smaller cells
     Correct errors                                            | High cost & unable to correct multi-bit errors
     Refresh all rows frequently                               | Degrades performance and energy efficiency
     Map faulty cells to spare cells (manufacturer)            | Not enough spare cells
     Retire cells (end-user):                                  |
       1. Disable/remap faulty addresses                       | 1: Every row in the module is a victim row
       2. Refresh faulty addresses more frequently             | 2: Refreshes victim rows more frequently even when there is no access to the module
     Identify "hot" rows and refresh neighbors                 | High hardware overhead to identify hot rows
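The last row of the table (identify "hot" rows and refresh their neighbors) can be sketched as a counter-based tracker. This is an illustration of the general idea, not a design from the paper: the class name, the threshold, and the assumption that rows `row - 1` and `row + 1` are the physical neighbors are all hypothetical (the address correlation slide notes that logical and physical adjacency do not always match).

```python
from collections import Counter


class HotRowTracker:
    """Counts activations per row and refreshes neighbors of hot rows."""

    def __init__(self, threshold: int, num_rows: int):
        if threshold <= 0 or num_rows <= 0:
            raise ValueError("threshold and num_rows must be positive")
        self.threshold = threshold
        self.num_rows = num_rows
        self.counts = Counter()  # one counter per activated row
        self.refreshed = []      # log of neighbor rows refreshed so far

    def activate(self, row: int) -> list:
        """Record one activation; return the neighbors refreshed, if any."""
        self.counts[row] += 1
        if self.counts[row] < self.threshold:
            return []
        # Row is hot: refresh its in-range neighbors and restart its count.
        victims = [r for r in (row - 1, row + 1) if 0 <= r < self.num_rows]
        self.refreshed.extend(victims)
        self.counts[row] = 0
        return victims

    def refresh_window_elapsed(self) -> None:
        """Regular refresh has restored every row, so all counts restart."""
        self.counts.clear()


tracker = HotRowTracker(threshold=3, num_rows=8)
results = [tracker.activate(5) for _ in range(3)]
```

The per-row counters are the state that the table's drawback refers to: tracking every row that might become hot costs storage and logic, whereas PARA needs no state at all.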
