Binary Access Memory: An Optimized Lookup Table for Successive Approximation Applications
Benjamin Hershberg*, Skyler Weaver*, Seiji Takeuchi†, Koichi Hamashita†, Un-Ku Moon*
*School of Electrical Engineering and Computer Science, Oregon State University
†Asahi Kasei EMD Corporation, Atsugi, Japan
Presentation Overview • Introduction & Motivation • Binary Access Memory (BAM) – Basic idea of BAM – Global pre-fetching – Local pre-charging – Asynchronous BAM • Conclusion
Introduction & Motivation
Typical SAR Error Correction
[Figure: SAR loop with input V_in, SAR logic, and an m-bit DAC]
• Popular SAR error correction methods
  – Radix calibration
  – Trimming
  – Lookup table (LUT)
    • Outside the loop
    • Inside the loop
Generalized SAR Error Correction
[Figure: SAR loop with input V_in; the m-bit SAR code passes through a lookup table (BAM) that produces the n-bit DAC code]
• Remaps each SAR code to some DAC code
• Payoff: enables new ways of implementing binary search
• Drawback: power, latency
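The remapping idea can be sketched in a few lines of Python. This is only an illustrative model, not the authors' circuit: `sar_convert`, the `dac` callable, and the identity table used in the usage lines are all hypothetical names introduced here.

```python
def sar_convert(vin, lut, m=7, dac=float):
    """Binary search in which every trial SAR code is remapped by a lookup table.

    vin : input level, in the same units that dac() returns (assumption)
    lut : sequence of 2**m entries mapping each m-bit SAR code to a DAC code
    dac : hypothetical ideal DAC model, DAC code -> analog level
    """
    code = 0
    for step in range(m):
        trial = code | (1 << (m - 1 - step))  # set the next trial bit
        if vin >= dac(lut[trial]):            # comparator decision
            code = trial                      # keep the bit
    return code


# With an identity table the loop reduces to a plain SAR conversion.
identity = list(range(128))
result = sar_convert(77.3, identity)
```

Every step reads exactly one table entry inside the loop, which is why the table's latency and power matter for the conversion as a whole.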
Lookup Table Implementation • RAM – Random Access Memory – But, binary search is not a random access pattern! • BAM – Binary Access Memory – Exploit probabilistic aspects of binary search to reduce the latency and power requirements of the lookup table...
BAM memory organization
SRAM
[Figure: SRAM organization for a 7-bit address ADDR[6:0]: a 4x4 array of 8-word blocks; ADDR[6:5] and ADDR[4:3] feed the column select and row select, and ADDR[2:0] selects the word within a block]
Useful Properties of Binary Search
Property 1
• A binary search is a one-way journey down the search tree
[Figure: search tree from "SA begin" through Steps 1 to 7, with the signal level marked]
How can we improve the organization of data words in the memory? • SRAM: organizes data according to similarity in address code • BAM: organizes data according to similarity in location within the search tree
Useful Properties of Binary Search
Property 2
• The probability that a node is visited during a binary search is not uniform across nodes: it halves with each step down the tree.
[Figure: search tree annotated with visit probabilities: 1.00 at Step 1, then 0.5, 0.25, and 0.125 per node at the following steps; shading runs from most likely to least likely]
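A short sketch of how these probabilities follow from a node's address. It assumes uniformly distributed comparator outcomes and the "walking 1" address convention used in the Basic Operation slides; `node_step` and `visit_probability` are hypothetical helper names.

```python
from fractions import Fraction


def node_step(addr, m):
    """Step at which the node with this m-bit address is visited.

    Assumes the lowest set bit of the address is the trial bit, so a node
    with t trailing zeros sits at step m - t.
    """
    if not 0 < addr < (1 << m):
        raise ValueError("address must be a nonzero m-bit value")
    trailing_zeros = (addr & -addr).bit_length() - 1
    return m - trailing_zeros


def visit_probability(addr, m):
    """Probability that the node is visited in one conversion: 2**-(step - 1)."""
    return Fraction(1, 2 ** (node_step(addr, m) - 1))


# The four probabilities shown on the slide, for a 7-bit tree.
examples = [visit_probability(a, 7)
            for a in (0b1000000, 0b1100000, 0b1010000, 0b1001000)]
```

The probabilities of all nodes at a given step sum to 1, since exactly one node is visited per step.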
How can we improve the organization of data words in the memory? • Make the nodes with the highest probability of being visited the “easiest” to access. • Minimize average energy/bit and average latency.
BAM memory organization
[Figure: search tree partitioned into Level 1, Level 2, and Level 3 blocks]
Basic Operation – Steps 1 to 4 and Final Result
[Figure: 7-bit search tree whose nodes are labeled with their addresses (1000000, 1100000, 0100000, ..., 1001101) and grouped into Level 1, Level 2, and Level 3 blocks; the slides fill in one table row per step as the search descends]

Step   Memory Address
1      1 0 0 0 0 0 0
2      1 1 0 0 0 0 0
3      1 0 1 0 0 0 0
4      1 0 0 1 0 0 0
5      1 0 0 1 1 0 0
6      1 0 0 1 1 1 0
7      1 0 0 1 1 0 1
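The address sequence in this example can be reproduced with a minimal model of the successive approximation register. The function name `sar_addresses` and the decision list are illustrative assumptions; a decision of 1 means the trial bit is kept.

```python
def sar_addresses(decisions, m=7):
    """Return the m memory addresses visited during one conversion.

    decisions[k] is the comparator result after step k + 1
    (1 = keep the trial bit, 0 = clear it). The last step needs no decision
    to form its address, so m - 1 decisions are enough.
    """
    code, addrs = 0, []
    for step in range(m):
        addr = code | (1 << (m - 1 - step))  # the 'walking 1' trial bit
        addrs.append(addr)
        if step < len(decisions) and decisions[step]:
            code = addr
    return addrs


# Decisions that reproduce the slide example: keep, clear, clear, keep, keep, clear.
for step, addr in enumerate(sar_addresses([1, 0, 0, 1, 1, 0]), start=1):
    print(step, format(addr, "07b"))
```

Each address differs from its parent only at and below the previous trial bit, which is the one-way descent that Property 1 describes.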
Basic Operation – Decode
• Simple decoding options
• Level Select
  – Determined by location of the ‘walking 1’
• Block Select
  – Use parent level’s address bits (either specific select lines from the parent level’s decoder or raw address bits will work)
• Block Decode
  – Use own level’s address bits

Step   Memory Address   Level
1      1 0 0 0 0 0 0    Level 1
2      1 1 0 0 0 0 0    Level 1
3      1 0 1 0 0 0 0    Level 1
4      1 0 0 1 0 0 0    Level 2
5      1 0 0 1 1 0 0    Level 2
6      1 0 0 1 1 1 0    Level 3
7      1 0 0 1 1 0 1    Level 3
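The three decode functions can be sketched in software as follows. This is a behavioral illustration only: it assumes the 7-bit 3x2x2 organization, and it uses all address bits above a level's own field as the block-select key, which is one of the "raw address bits" options rather than a confirmed detail of the design. `decode` and `LEVEL_BITS` are hypothetical names.

```python
LEVEL_BITS = (3, 2, 2)  # address bits owned by Level 1, Level 2, Level 3


def decode(addr, level_bits=LEVEL_BITS):
    """Split an address into (level, block_select, own_bits)."""
    m = sum(level_bits)
    if not 0 < addr < (1 << m):
        raise ValueError("address must be a nonzero m-bit value")
    # Level select: the position of the walking 1 gives the step number.
    step = m - ((addr & -addr).bit_length() - 1)
    consumed = 0
    for level, width in enumerate(level_bits, start=1):
        consumed += width
        if step <= consumed:
            low = m - consumed                      # lowest bit of this level's field
            own = (addr >> low) & ((1 << width) - 1)  # block decode
            block = addr >> (low + width)             # block select (parent bits)
            return level, block, own
    raise AssertionError("unreachable for a valid address")


print(decode(0b1000000))  # step 1 of the slide example
print(decode(0b1001100))  # step 5
print(decode(0b1001101))  # step 7
```

Addresses visited within the same level share a block-select value, so a block switch happens only when the walking 1 crosses into the next level's field.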
Reduced Number of Block Switches
• In BAM, the number of block switches per conversion always equals the number of levels
[Figure: 7-bit search tree grouped into Level 1, Level 2, and Level 3 blocks]

3-level Memory Depth and Organization   Average SRAM block switches   Average BAM block switches
7-bit: 3x2x2                            4.5                           3
9-bit: 3x3x3                            5.5                           3
12-bit: 4x4x4                           7.5                           3
14-bit: 4x4x6                           8.5                           3
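The 7-bit row can be checked by brute force. The sketch below makes several assumptions that the slides do not state explicitly: all comparator outcomes are equally likely, the first access of a conversion counts as a block switch, an SRAM block is identified by ADDR[6:3] (8 words per block, as in the SRAM slide), and a BAM block is identified by its level plus all address bits above that level's field. All function names are hypothetical.

```python
from itertools import product


def sar_addresses(decisions, m):
    """Addresses visited in one conversion (1 = keep the trial bit)."""
    code, addrs = 0, []
    for step in range(m):
        addr = code | (1 << (m - 1 - step))
        addrs.append(addr)
        if step < len(decisions) and decisions[step]:
            code = addr
    return addrs


def sram_block(addr, word_bits):
    """SRAM block id: every bit above the in-block word address."""
    return addr >> word_bits


def bam_block(addr, level_bits):
    """BAM block id: (level, address bits above that level's own field)."""
    m = sum(level_bits)
    step = m - ((addr & -addr).bit_length() - 1)
    consumed = 0
    for level, width in enumerate(level_bits, start=1):
        consumed += width
        if step <= consumed:
            return level, addr >> (m - consumed + width)
    raise ValueError("invalid address")


def average_switches(m, block_of):
    """Mean block switches per conversion over all decision paths."""
    paths = list(product((0, 1), repeat=m - 1))
    total = 0
    for decisions in paths:
        blocks = [block_of(a) for a in sar_addresses(decisions, m)]
        total += 1 + sum(b != a for a, b in zip(blocks, blocks[1:]))
    return total / len(paths)


sram_avg = average_switches(7, lambda a: sram_block(a, 3))
bam_avg = average_switches(7, lambda a: bam_block(a, (3, 2, 2)))
```

Under these assumptions the model gives 4.5 for SRAM and 3 for BAM in the 7-bit case. The SRAM block sizes behind the other rows are not given in the slides, so those rows are not reproduced here.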
Pre-fetching
Useful Properties of Binary Search
Property 3
• Only the two child nodes directly below the current node can be accessed on the next step.
• Reduce latency by pre-fetching both possible child nodes during the parent’s step
[Figure: search tree from "SA begin" through Steps 1 to 5; each of the two children of the current node is marked with probability 0.5]
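Pre-fetch Top Level Changes
[Figure: SAR loop with pre-fetch: the SAR drives the lookup table (BAM) with an (m-1)-bit address, the table returns 2n bits (both children), and n bits go to the DAC]
• Reduce effective access latency
• Store both children words at the parent’s address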
Sub-Block Re-Structuring for Pre-Fetch
Level 1 Sub-Block (No Prefetch)
[Figure: ADDR[2:0] (LSB[2:0]) with block select/enable drives a 3-to-8 decoder; word rows at (111), (110), (101), (000 or 100), (011), (010), (001); eight local bit lines (LBL)]
Sub-Block Re-Structuring for Pre-Fetch
Level 1 Sub-Block (Prefetch)
[Figure: ADDR[1:0] with enable drives a 2-to-4 decoder; double-word rows at (11), (10), (01), and a row at (00) holding one empty slot and one word; eight local bit lines (LBL)]
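One way to picture the restructured contents is to build the pre-fetch table from an ordinary lookup table. This is a sketch under assumptions: parents are addressed by their m-bit address with the always-zero LSB dropped (consistent with the (m-1)-bit address in the top-level figure), address 0 holds only the root word, and the ordering of the two halves of a double word is arbitrary here. `build_prefetch_table` is a hypothetical name.

```python
def build_prefetch_table(lut, m=7):
    """Map each (m-1)-bit parent address to a (low_child, high_child) word pair.

    lut is indexed by the m-bit node address of the non-prefetch memory.
    """
    root = 1 << (m - 1)
    table = {0: (None, lut[root])}         # one empty slot plus the root word
    for parent in range(2, 1 << m, 2):     # every node that has children
        bit = parent & -parent             # the parent's trial bit
        child_bit = bit >> 1
        low = (parent & ~bit) | child_bit  # child reached if the bit is cleared
        high = parent | child_bit          # child reached if the bit is kept
        table[parent >> 1] = (lut[low], lut[high])
    return table


# Identity contents make the stored words equal to the child addresses.
table = build_prefetch_table(list(range(128)))
first = table[0b100000]  # fetched during step 1: both children of the root
last = table[0b100111]   # fetched during step 6: the two step-7 candidates
```

Each access now returns both candidates for the next step, so the next comparator decision only has to pick one half of an already-fetched double word.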
Pre-charging
Pre-Charging
[Figure: search tree with pre-fetch addressing, nodes labeled with 6-bit addresses (000000, 100000, 110000, 010000, ..., 100111) and grouped into Level 1, Level 2, and Level 3 blocks]

Step   Memory Address
(7)    0 0 0 0 0 0
1      1 0 0 0 0 0
2      1 1 0 0 0 0
3      1 0 1 0 0 0
4      1 0 0 1 0 0
5      1 0 0 1 1 0
6      1 0 0 1 1 1
7      0 0 0 0 0 0

• With pre-fetching implemented, there is now a word that is guaranteed to be the first accessed after a sub-block switch.
Pre-Charging
[Figure: access timeline. First frames: Block Switch, Inner block decode, Acquire Data at local block output, Buffer data to system output. Final frame: Block Switch, Buffer data to system output]
• Worst-case latency occurs when switching to a new sub-block
  – In some designs, output glitching can also occur
  – Solution: pre-charging
• When not selected, a sub-block’s ‘off state’ is to pre-charge its local bit lines to the double-word that will always be requested first
Asynchronous BAM
Useful Properties of Binary Search
Property 4
• Each node can be visited at exactly one step number, and that step number is known in advance for every node.
  Example for a node located at Step 2:
  When STEP = 2, P(is current node) = 2^-(STEP-1)
  When STEP != 2, P(is current node) = 0
• Use this knowledge to generate an asynchronous DONE signal
[Figure: search tree from "SA begin" through Steps 1 to 7]
Asynchronous BAM
[Figure: search tree, Steps 1 to 7, highlighting first the steps where words store a ‘0’ DONE bit and then the steps where words store a ‘1’ DONE bit]

Step   DONE bit value
1      0
2      1
3      0
4      1
5      0
6      1
7      0
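Because a node's step is fixed by its address, the DONE bit stored with each word can be derived from the address alone. The sketch below assumes the "walking 1" address convention from the Basic Operation slides; `done_bit` is a hypothetical helper, and reading a toggle of the DONE output as "new word valid" is one plausible interpretation rather than a detail given in the slides.

```python
def done_bit(addr, m=7):
    """DONE bit stored with the word at this address: 0 on odd steps, 1 on even."""
    if not 0 < addr < (1 << m):
        raise ValueError("address must be a nonzero m-bit value")
    step = m - ((addr & -addr).bit_length() - 1)
    return (step - 1) % 2


# Addresses visited in the Basic Operation example, steps 1 to 7.
path = [0b1000000, 0b1100000, 0b1010000, 0b1001000,
        0b1001100, 0b1001110, 0b1001101]
bits = [done_bit(a) for a in path]

# The stored bit alternates on every step, so each newly read word flips
# the DONE output regardless of which branch the search takes.
toggles = all(a != b for a, b in zip(bits, bits[1:]))
```

The alternation holds on every path through the tree, since consecutive accesses always belong to consecutive steps.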