
A New Era of Silicon Prototyping in Computer Architecture Research

Christopher Torng, Computer Systems Laboratory, School of Electrical and Computer Engineering, Cornell University


  1. A New Era of Silicon Prototyping in Computer Architecture Research
     Christopher Torng
     Computer Systems Laboratory, School of Electrical and Computer Engineering, Cornell University

  2. Recent History of Prototypes at Cornell University
     [Timeline figure: 2014, 2016, 2017, 2018]
     Cornell University, Christopher Torng, 2 / 20

  3. Recent History of Prototypes at Cornell University
     [Timeline figure: 2014, 2016, 2017, 2018]
     ◮ DCS (2014): TSMC 65nm, 1mm x 2.2mm
     ◮ BRGTC1 (2016): IBM 130nm, 2mm x 2mm

  4. Recent History of Prototypes at Cornell University
     [Timeline figure: 2014, 2016, 2017, 2018]
     ◮ DCS (2014): TSMC 65nm, 1mm x 2.2mm
     ◮ BRGTC1 (2016): IBM 130nm, 2mm x 2mm
     ◮ Celerity (2017): TSMC 16nm FinFET, 5mm x 5mm

  5. Recent History of Prototypes at Cornell University
     [Timeline figure: 2014, 2016, 2017, 2018]
     ◮ DCS (2014): TSMC 65nm, 1mm x 2.2mm
     ◮ BRGTC1 (2016): IBM 130nm, 2mm x 2mm
     ◮ Celerity (2017): TSMC 16nm FinFET, 5mm x 5mm
     ◮ BRGTC2 (2018): TSMC 28nm, 1mm x 1.25mm

  6. Recent History of Prototypes at Cornell University
     [Timeline figure: 2014, 2016, 2017, 2018]
     ◮ DCS (2014): TSMC 65nm, 1mm x 2.2mm
     ◮ BRGTC1 (2016): IBM 130nm, 2mm x 2mm
     ◮ Celerity (2017): TSMC 16nm FinFET, 5mm x 5mm
     ◮ BRGTC2 (2018): TSMC 28nm, 1mm x 1.25mm
     ◮ PCOSYNC (2018): IBM 180nm, 2mm x 1mm

  7. Recent History of Prototypes at Cornell University
     [Timeline figure: 2014, 2016, 2017, 2018]
     ◮ DCS (2014): TSMC 65nm, 1mm x 2.2mm
     ◮ BRGTC1 (2016): IBM 130nm, 2mm x 2mm
     ◮ Celerity (2017): TSMC 16nm FinFET, 5mm x 5mm
     ◮ BRGTC2 (2018): TSMC 28nm, 1mm x 1.25mm
     ◮ PCOSYNC (2018): IBM 180nm, 2mm x 1mm
     Why Prototype? Research Ideas
     ◮ Smart Sharing Architectures
     ◮ Interconnection Networks for Manycores
     ◮ Python-Based Hardware Modeling
     ◮ High-Level Synthesis
     ◮ Synthesizable Analog IP
     ◮ Scalable Baseband Synchronization
     ◮ Integrated Voltage Regulation

  8. Recent History of Prototypes at Cornell University
     [Timeline figure: 2014, 2016, 2017, 2018]
     ◮ DCS (2014): TSMC 65nm, 1mm x 2.2mm
     ◮ BRGTC1 (2016): IBM 130nm, 2mm x 2mm
     ◮ Celerity (2017): TSMC 16nm FinFET, 5mm x 5mm
     ◮ BRGTC2 (2018): TSMC 28nm, 1mm x 1.25mm
     ◮ PCOSYNC (2018): IBM 180nm, 2mm x 1mm
     Why Prototype? Chip-Based Startups
     ◮ Graphcore
     ◮ Nervana
     ◮ Cerebras
     ◮ Wave Computing
     ◮ Horizon Robotics
     ◮ Cambricon
     ◮ DeePhi
     ◮ Esperanto
     ◮ SambaNova
     ◮ Eyeriss
     ◮ Tenstorrent
     ◮ Mythic
     ◮ ThinkForce
     ◮ Groq
     ◮ Lightmatter

  9. BRGTC2: Batten Research Group Test Chip 2
     Chip Overview
     ◮ TSMC 28 nm
     ◮ 1 mm × 1.25 mm
     ◮ 6.7M-transistor
     ◮ Quad-core in-order RISC-V RV32IMAF
     ◮ Shared L1 caches (32kB)
     ◮ Shared LLFUs
     ◮ Designed and tested in PyMTL (Python-based hardware modeling)
     ◮ Fully synthesizable PLL
     ◮ Smart sharing mechanisms
     ◮ Hardware Bloom filter accelerator (xcel)
     ◮ Runs work-stealing runtime
     [Die floorplan figure, 1.25 mm × 1.0 mm: four cores with L0 buffers, I$ and D$ tag and data arrays, Bloom filter accelerator, shared MDU, shared FPU, PLL]
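
The slide names a hardware Bloom filter accelerator without giving its parameters. As a reminder of the underlying data structure only, here is a minimal software sketch in plain Python. The bit-vector size, the number of hash functions, the SHA-256 based hashing, and the example keys are all assumptions made for illustration; none of them describe the BRGTC2 hardware.

```python
import hashlib


class BloomFilter:
    """Software sketch of a Bloom filter; sizes and hashing are illustrative assumptions."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit vector packed into a single Python int

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:4], "little") % self.num_bits

    def insert(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def may_contain(self, key):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))

    def clear(self):
        self.bits = 0


bf = BloomFilter()
inserted = (0x1000, 0x1040, 0x2000)  # hypothetical keys
for key in inserted:
    bf.insert(key)
```

A membership query on an inserted key always returns True, while a query on any other key returns False unless all of its bit positions happen to collide with earlier insertions.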

  10. Key Changes Driving A New Era
      ◮ Ecosystems for Open Builders
        ⊲ Problem: Closed tools & IP make development tough
        ⊲ Changes: Open-source ecosystem with RISC-V
      ◮ Productive Tools for Small Teams
        ⊲ Problem: Small teams with a limited workforce
        ⊲ Changes: Productive & open tool development
      ◮ Significantly Cheaper Costs
        ⊲ Problem: Building chips is expensive
        ⊲ Changes: Multi-project wafer (MPW) runs for tiny chips in advanced nodes

  11. Ecosystems for Open Builders
      Problem: A closed-source chip-building ecosystem (tools & IP) makes chip development tough
      Problems with Closed-Source Infrastructure
      ◮ Difficult to replicate results (including your own)
      ◮ Anything closed-source propagates up and down the stack
        ⊲ E.g., modified MIPS ISA
        ⊲ Spill-over to other stages of the design flow
      ◮ Heavy impact on things I care about
        ⊲ Sharing results and artifacts
        ⊲ Portability
        ⊲ Maintenance
      ◮ Reinventing the wheel
      [Figure, Ecosystem for Open Builders: Software and ISA, Cycle-Level Modeling, RTL Modeling, ASIC Flow]
      How important is a full ecosystem?

  12. Ecosystems for Open Builders
      Key Change: The open-source ecosystem revolving around RISC-V is growing
      The RISC-V Ecosystem
      ◮ Software toolchain and ISA
        ⊲ Linux, compiler toolchain, modular ISA
      ◮ Cycle-level modeling
        ⊲ gem5 system-level simulator supports RISC-V multicore
        ⊲ We can now model complex RISC-V systems
      ◮ RTL modeling
        ⊲ Open implementations and supporting infrastructure (e.g., Rocket, BOOM, PULP, Diplomacy, FIRRTL, FireSim)
      ◮ ASIC flows
        ⊲ Reference flows available from community for inspiration
      [Figure, Ecosystem for Open Builders, each layer marked RISC-V: Software and ISA, Cycle-Level Modeling, RTL Modeling, ASIC Flow]

  13. Ecosystems for Open Builders
      How has the RISC-V ecosystem helped in the design of BRGTC2?
      BRGTC2 in the RISC-V Ecosystem
      ◮ Software toolchain and ISA
        ⊲ Not booting Linux...
        ⊲ Upstream GCC support
        ⊲ Incremental design w/ RV32 modularity
      ◮ Cycle-level modeling
        ⊲ Multicore gem5 simulations of our system
        ⊲ Decisions: L0 buffers, how many resources to share, impact of resource latencies, programs fitting in the cache
      ◮ RTL modeling
        ⊲ This was our own...
      ◮ ASIC flows
        ⊲ Reference methodologies available from other projects (e.g., Celerity)
      [Block diagram figure: Host Interface, L1 Instruction $ (32KB), Instruction Memory Arbiter, Synthesizable PLL, Data Memory Arbiter, LLFU Arbiter, L1 Data $ (32KB), Int Mul/Div, FPU]


  15. Productive Tools for Small Teams
      Problem: Small teams have a limited workforce and yet must handle challenging projects
      An Enormous Challenge for Small Teams
      ◮ Small teams exist in both academia and industry
      ◮ Time to first tapeout can be anywhere up to a few years
      ◮ What do big companies do?
        ⊲ Throw money and engineers at the problem
      ◮ Generally stuck with tools that “work”
        ⊲ If you have enough engineers
        ⊲ E.g., SystemVerilog
      [Flow figure: Functional-Level Design & Simulation, Cycle-Level Design & Simulation, RTL Design & Simulation, Synthesis, Post-Synthesis Gate-Level Simulation, Floorplanning, Power Routing, Placement, Clock Tree Synthesis, Routing, Post-Place-and-Route Gate-Level Simulation, Power Analysis, DRC, LVS, RCX, Transistor-Level Sim, Tape Out]

  16. Productive Tools for Small Teams
      Key Change: Productive open-source tools progressing and maturing quickly
      Focusing on BRGTC2
      ◮ PyMTL Hardware Modeling Framework
        ⊲ Python-based hardware design and test
        ⊲ Beta version of PyMTL v2
        ⊲ https://github.com/cornell-brg/pymtl
      ◮ The Open Modular VLSI Build System
        ⊲ Two chips taped out (180nm/28nm)
        ⊲ Reference ASIC flow available
        ⊲ https://github.com/cornell-brg/alloy-asic
      ◮ Fully Synthesizable PLL
        ⊲ To be open-sourced soon
        ⊲ All-digital PLL used in BRGTC2/Celerity
        ⊲ Avoid mixed-signal design
      [Flow figure: the design flow stages from Functional-Level Design & Simulation to Tape Out, annotated with Open Python-Based HW Modeling, Open Modular VLSI Build System, and Synth PLL (to be opened)]
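
To give a flavor of what "Python-based hardware design and test" can mean, here is a minimal sketch in plain Python. It deliberately does not use the PyMTL API; the `Accumulator` class, its port names, and the `run` helper are hypothetical names invented for this illustration. The model advances one clock edge per `tick()` call, and the test harness is ordinary Python.

```python
class Accumulator:
    """Toy 8-bit accumulator: out <= out + in_ on each clock edge when en is set.

    Plain-Python illustration only; this is not the PyMTL API.
    """

    WIDTH = 8

    def __init__(self):
        self.in_ = 0  # input port value
        self.en = 0   # enable input
        self.out = 0  # registered output

    def reset(self):
        self.out = 0

    def tick(self):
        # Models one rising clock edge; the sum wraps at WIDTH bits.
        if self.en:
            self.out = (self.out + self.in_) & ((1 << self.WIDTH) - 1)


def run(model, stimulus):
    """Apply (en, in_) pairs, one per cycle, and return the output after each edge."""
    model.reset()
    trace = []
    for en, value in stimulus:
        model.en, model.in_ = en, value
        model.tick()
        trace.append(model.out)
    return trace


trace = run(Accumulator(), [(1, 200), (1, 100), (0, 7), (1, 1)])
print(trace)  # [200, 44, 44, 45]
```

The second cycle shows the 8-bit wraparound (200 + 100 = 300, which is 44 modulo 256), and the third cycle shows the enable holding the value.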

  17. PyMTL
      ◮ PyMTL: A Unified Framework for Vertically Integrated Computer Architecture Research
        Derek Lockhart, Gary Zibrat, Christopher Batten
        47th ACM/IEEE Int’l Symp. on Microarchitecture (MICRO), Cambridge, UK, Dec. 2014
      ◮ Mamba: Closing the Performance Gap in Productive Hardware Development Frameworks
        Shunning Jiang, Berkin Ilbeyi, Christopher Batten
        55th ACM/IEEE Design Automation Conf. (DAC), San Francisco, CA, June 2018

  18. Open Modular VLSI Build System: At A High Level
      https://github.com/cornell-brg/alloy-asic
      Problem: Rigid, static ASIC flows
      Typical ASIC Flows
      ◮ Flows are automated for exact sequences of steps
        ⊲ Want to add/remove a step? Modify the build system. Copies...
        ⊲ Once the flow is set up, you don’t want to touch it anymore
      ◮ Adding new steps between existing steps is troublesome
        ⊲ Steps downstream magically reach upstream (hardcoding)
        ⊲ In general, the overhead to add new steps is high
      ◮ Difficult to support different configurations of the flow
        ⊲ E.g., chip flow vs. block flow
        ⊲ How to add new steps before or after
        ⊲ Each new chip ends up with a dedicated non-reusable flow
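
One way to picture the alternative to a hardcoded step sequence is a flow described as a dependency graph, where inserting a step means rewiring one edge. The sketch below is a generic illustration under that assumption. The `Flow` class, its methods, and the step names are hypothetical and are not taken from the alloy-asic repository. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


class Flow:
    """Hypothetical step graph for a build flow; not the alloy-asic API."""

    def __init__(self):
        self.deps = {}  # step name -> set of steps it depends on

    def add_step(self, name, after=()):
        self.deps.setdefault(name, set()).update(after)
        for dep in after:
            self.deps.setdefault(dep, set())

    def insert_between(self, name, upstream, downstream):
        # Rewire a single edge: upstream -> name -> downstream.
        self.deps[downstream].discard(upstream)
        self.deps[downstream].add(name)
        self.add_step(name, after=[upstream])

    def order(self):
        # Any valid execution order is derived from the graph, never hardcoded.
        return list(TopologicalSorter(self.deps).static_order())


flow = Flow()
flow.add_step("synthesis")
flow.add_step("place-and-route", after=["synthesis"])
flow.add_step("signoff", after=["place-and-route"])

# Adding a step between two existing steps touches only one edge.
flow.insert_between("gate-level-sim", "synthesis", "place-and-route")
order = flow.order()
print(order)
```

Because downstream steps name only their direct dependencies, a block-level flow and a chip-level flow could share step definitions and differ only in the edges they declare.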
