Towards a Reconfigurable Bit-Serial/Bit-Parallel Vector Accelerator Using In-Situ Processing-in-SRAM


  1. Towards a Reconfigurable Bit-Serial/Bit-Parallel Vector Accelerator Using In-Situ Processing-in-SRAM
     Khalid Al-Hawaj, Olalekan Afuye, Shady Agwa, Alyssa Apsel, Christopher Batten
     Cornell University, Electrical and Computer Engineering
     2020 IEEE International Symposium on Circuits and Systems, Virtual, October 10-21, 2020

  2. TOWARDS A RECONFIGURABLE BIT-SERIAL/BIT-PARALLEL VECTOR ACCELERATOR USING IN-SITU PROCESSING-IN-SRAM
     Khalid Al-Hawaj, Olalekan Afuye, Shady Agwa, Alyssa Apsel, Christopher Batten
     Cornell University, Electrical and Computer Engineering

  3. THE RETURN OF VECTOR ENGINES
     ▪ There is a resurgence of interest in the vector abstraction, as evidenced by recent ISA extensions (e.g., ARM SVE and RISC-V RVV).
     ▪ Vector machines leverage the vector abstraction to improve performance and efficiency on data-level parallel (DLP) workloads by exploiting their regularity.
     Motivation • Background • VRAM • Conclusion Page 1 of 18

  4. DISADVANTAGES OF VECTOR MACHINES
     ▪ Vector machines require expensive multi-ported state elements (i.e., register files) to feed the vector arithmetic and logic unit (ALU).
     ▪ Recent work on in-situ processing-in-SRAM shows promise in fusing the vector register file with the ALU, enabling efficient vector acceleration using a bit-serial execution paradigm.*
     * S. Jeloka et al., "A Configurable TCAM/BCAM/SRAM Using 28nm Push-Rule 6T Bit Cell," VLSIC '15.
     * J. Wang et al., "A Compute SRAM with Bit-Serial Integer/Floating-Point Operations for Programmable In-Memory Vector Acceleration," ISSCC '19.
     Asanovic et al., "The T0 Vector Microprocessor," HotChips '95.

  5. VECTOR RAM
     ▪ We propose vector RAM (VRAM), which leverages in-situ processing-in-SRAM to build a vector accelerator in two flavors: bit-serial vector RAM (BS-VRAM) and bit-parallel vector RAM (BP-VRAM).
     ▪ Main contributions:
       1. Detailed circuit-level design of both BS-VRAM and BP-VRAM
       2. Implementation of 17 different macro-operations for BS-VRAM and BP-VRAM using a micro-operation abstraction
       3. Detailed study of the trade-offs in area, cycle time, latency, throughput, and energy for BS-VRAM vs. BP-VRAM
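To make the bit-serial flavor more concrete, here is a minimal Python sketch of a bit-serial vector add. The slides do not spell out the data layout, so the transposed (bit-plane) layout, the helper names `to_bit_planes`, `from_bit_planes`, and `bit_serial_add`, and the one-step-per-bit cost model are all assumptions made for illustration, not the authors' design.

```python
def to_bit_planes(values, width):
    """Transpose integers into bit planes: planes[i][lane] is bit i of values[lane]."""
    return [[(v >> i) & 1 for v in values] for i in range(width)]


def from_bit_planes(planes):
    """Inverse of to_bit_planes: rebuild one integer per lane."""
    lanes = len(planes[0])
    return [
        sum(planes[i][lane] << i for i in range(len(planes)))
        for lane in range(lanes)
    ]


def bit_serial_add(a_planes, b_planes):
    """Add two vectors one bit position per step, across all lanes at once.

    Assumed model: each loop iteration stands for one step that touches a
    single bit position of every vector element in parallel.
    """
    lanes = len(a_planes[0])
    carry = [0] * lanes
    out = []
    steps = 0
    for a_bits, b_bits in zip(a_planes, b_planes):
        # Full adder per lane: sum and carry-out from a, b, and carry-in.
        sum_bits = [a ^ b ^ c for a, b, c in zip(a_bits, b_bits, carry)]
        carry = [(a & b) | (c & (a ^ b)) for a, b, c in zip(a_bits, b_bits, carry)]
        out.append(sum_bits)
        steps += 1
    return out, steps


width = 8
a = [3, 100, 255, 17]
b = [5, 27, 1, 200]
sum_planes, steps = bit_serial_add(to_bit_planes(a, width), to_bit_planes(b, width))
result = from_bit_planes(sum_planes)
print(result)  # sums wrap modulo 2**width
print(steps)   # one step per bit position
```

In this model the number of steps grows with the element width but not with the number of lanes, which is one way to read the latency versus throughput trade-off listed in the contributions. A bit-parallel organization would presumably keep all bits of an element together and resolve the carry chain within a single, longer step; that variant is not modeled here.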

  6. OUTLINE
     ▪ Motivation
     ▪ Background: Bit-line Compute
     ▪ Vector RAM
       • VRAM Circuits
       • VRAM Micro-Programming
       • VRAM Macro-Programming
       • Evaluation
     ▪ Conclusion

  8. BACKGROUND: BIT-LINE COMPUTE
     [Figure: 4x4 SRAM array with word lines WL0 to WL3 and bit-line pairs BL0 to BL3; cells are labeled W0B0 through W3B3.]

  10. BACKGROUND: BIT-LINE COMPUTE
      [Figure, built up over slides 10 to 22: two word lines are asserted at once (WL0 = 1, WL1 = 1) across bit-line pairs BL0 to BL3. The bit line BL then reads out the AND of the two rows, and the complementary bit line BL' reads out their NOR.]

      ROW 0   0 0 1 1
      ROW 1   0 1 0 1
      BL      0 0 0 1   (AND)
      BL'     1 0 0 0   (NOR)

      S. Jeloka et al., "A 28 nm Configurable Memory (TCAM/BCAM/SRAM) Using Push-Rule 6T Bit Cell Enabling Logic-in-Memory," IEEE Journal of Solid-State Circuits '16.
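The bit-line truth table can be reproduced with a small behavioral model. This is only a logic-level sketch in Python: the function name `bitline_compute` is hypothetical, analog sensing is abstracted away entirely, and the XOR derivation at the end is an added illustration of what peripheral logic could do with the two outputs, not something these slides show.

```python
def bitline_compute(row0, row1):
    """Model asserting two word lines at once, column by column.

    BL stays high only if both selected cells store 1 (AND).
    BL' (the complementary bit line) stays high only if both store 0 (NOR).
    """
    bl = [a & b for a, b in zip(row0, row1)]
    bl_bar = [(1 - a) & (1 - b) for a, b in zip(row0, row1)]
    return bl, bl_bar


row0 = [0, 0, 1, 1]
row1 = [0, 1, 0, 1]
bl, bl_bar = bitline_compute(row0, row1)
print(bl)      # [0, 0, 0, 1]  AND
print(bl_bar)  # [1, 0, 0, 0]  NOR

# Illustration only: XOR follows from the two sensed values,
# since a XOR b = NOR(a AND b, a NOR b).
xor = [1 - (x | y) for x, y in zip(bl, bl_bar)]
print(xor)     # [0, 1, 1, 0]
```

Both results come from a single access to the array, which is what lets the register file double as part of the ALU.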

  25. VRAM: CIRCUITS

  29. VRAM CIRCUITS: BIT-SERIAL COMPUTE LOGIC
