A New Era of Silicon Prototyping in Computer Architecture Research


  1. A New Era of Silicon Prototyping in Computer Architecture Research
     Christopher Torng
     Computer Systems Laboratory, School of Electrical and Computer Engineering, Cornell University

  2. Recent History of Prototypes at Cornell University
     Chips:
     - DCS (2014): TSMC 65nm, 1mm x 2.2mm
     - BRGTC1 (2016): IBM 130nm, 2mm x 2mm
     - Celerity (2017): TSMC 16nm FinFET, 5mm x 5mm
     - BRGTC2 (2018): TSMC 28nm, 1mm x 1.25mm
     - PCOSYNC (2018): IBM 180nm, 2mm x 1mm
     Why prototype? Research ideas:
     - Smart sharing architectures
     - Interconnection networks for manycores
     - Python-based hardware modeling
     - High-level synthesis
     - Synthesizable analog IP
     - Scalable baseband synchronization
     - Integrated voltage regulation

  3. Recent History of Prototypes at Cornell University
     Why prototype? Chip-based startups:
     - Graphcore
     - Nervana
     - Cerebras
     - Wave Computing
     - Horizon Robotics
     - Cambricon
     - DeePhi
     - Esperanto
     - SambaNova
     - Eyeriss
     - Tenstorrent
     - Mythic
     - ThinkForce
     - Groq
     - Lightmatter

  4. BRGTC2: Batten Research Group Test Chip 2
     Chip overview:
     - TSMC 28 nm
     - 1 mm × 1.25 mm
     - 6.7M transistors
     - Quad-core in-order RISC-V RV32IMAF
     - Shared L1 caches (32 KB)
     - Shared LLFUs
     - Designed and tested in PyMTL (Python-based hardware modeling)
     - Fully synthesizable PLL
     - Smart sharing mechanisms
     - Hardware Bloom filter accelerator
     - Runs a work-stealing runtime
     [Figure: chip floorplan, 1.25 mm wide by 1.0 mm tall, showing the four cores, L0 buffers, I$ and D$ tag and data arrays, the Bloom filter accelerator, the shared MDU, the shared FPU, and the PLL]

  5. Key Changes Driving A New Era
     - Ecosystems for Open Builders
       - Problem: Closed tools and IP make development tough
       - Change: Open-source ecosystem with RISC-V
     - Productive Tools for Small Teams
       - Problem: Small teams have a limited workforce
       - Change: Productive and open tool development
     - Significantly Cheaper Costs
       - Problem: Building chips is expensive
       - Change: Tiny MPW chips in advanced nodes

  6. Ecosystems for Open Builders
     Problem: A closed-source chip-building ecosystem (tools and IP) makes chip development tough.
     Problems with closed-source infrastructure:
     - Difficult to replicate results (including your own)
     - Anything closed-source propagates up and down the stack
       - E.g., a modified MIPS ISA
       - Spill-over to other stages of the design flow
     - Heavy impact on things I care about
       - Sharing results and artifacts
       - Portability
       - Maintenance
     - Reinventing the wheel
     How important is a full ecosystem?
     [Figure: "Ecosystem for Open Builders" stack: Software and ISA, Cycle-Level Modeling, RTL Modeling, ASIC Flow]

  7. Ecosystems for Open Builders
     Key change: The open-source ecosystem revolving around RISC-V is growing.
     The RISC-V ecosystem:
     - Software toolchain and ISA
       - Linux, compiler toolchain, modular ISA
     - Cycle-level modeling
       - The gem5 system-level simulator supports RISC-V multicore
       - We can now model complex RISC-V systems
     - RTL modeling
       - Open implementations and supporting infrastructure (e.g., Rocket, BOOM, PULP, Diplomacy, FIRRTL, FireSim)
     - ASIC flows
       - Reference flows available from the community for inspiration
     [Figure: "Ecosystem for Open Builders" stack with RISC-V marked at every level: Software and ISA, Cycle-Level Modeling, RTL Modeling, ASIC Flow]

  8. Ecosystems for Open Builders
     How has the RISC-V ecosystem helped in the design of BRGTC2?
     BRGTC2 in the RISC-V ecosystem:
     - Software toolchain and ISA
       - Not booting Linux...
       - Upstream GCC support
       - Incremental design with RV32 modularity
     - Cycle-level modeling
       - Multicore gem5 simulations of our system
       - Decisions: L0 buffers, how many resources to share, impact of resource latencies, programs fitting in the cache
     - RTL modeling
       - This was our own...
     - ASIC flows
       - Reference methodologies available from other projects (e.g., Celerity)
     [Figure: BRGTC2 block diagram with the host interface, L1 instruction cache (32 KB), instruction memory arbiter, data memory arbiter, L1 data cache (32 KB), LLFU arbiter, integer mul/div unit, FPU, and synthesizable PLL]

  9. Key Changes Driving A New Era: Ecosystems for Open Builders; Productive Tools for Small Teams; Significantly Cheaper Costs

  10. Productive Tools for Small Teams
      Problem: Small teams have a limited workforce and yet must handle challenging projects.
      An enormous challenge for small teams:
      - Small teams exist in both academia and industry
      - Time to first tapeout can be anywhere up to a few years
      - What do big companies do?
        - Throw money and engineers at the problem
      - Generally stuck with tools that “work”
        - If you have enough engineers
        - E.g., SystemVerilog
      [Figure: design flow stages: Functional-Level Design & Simulation, Cycle-Level Design & Simulation, RTL Design & Simulation, Synthesis, Post-Synthesis Gate-Level Simulation, Floorplanning, Power Routing, Placement, Clock Tree Synthesis, Routing, Post-Place-and-Route Gate-Level Simulation, Power Analysis, DRC, LVS, RCX, Transistor-Level Sim, Tape Out]

  11. Productive Tools for Small Teams
      Key change: Productive open-source tools are progressing and maturing quickly.
      Focusing on BRGTC2:
      - PyMTL Hardware Modeling Framework
        - Python-based hardware design and test
        - Beta version of PyMTL v2
        - https://github.com/cornell-brg/pymtl
      - The Open Modular VLSI Build System
        - Two chips taped out (180nm/28nm)
        - Reference ASIC flow available
        - https://github.com/cornell-brg/alloy-asic
      - Fully Synthesizable PLL
        - To be open-sourced soon
        - All-digital PLL used in BRGTC2/Celerity
        - Avoids mixed-signal design
      [Figure: design flow stages annotated with "Open Python-Based HW Modeling", "Open Modular VLSI Build System", and "Synth PLL (to be opened)"]
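
To give a flavor of what "Python-based hardware design and test" means, here is a minimal sketch in plain Python. It deliberately does not use the PyMTL API: the `Counter` class, its `tick` method, and the test function are hypothetical names chosen for illustration. The point is only that the model and its test live in the same language and run with an ordinary interpreter.

```python
class Counter:
    """Cycle-level model of an n-bit up-counter (hypothetical, not PyMTL)."""

    def __init__(self, nbits):
        self.mask = (1 << nbits) - 1  # wrap-around mask for an n-bit register
        self.count = 0                # current register state

    def tick(self, en=True, reset=False):
        """Advance one clock cycle and return the new counter value."""
        if reset:
            self.count = 0            # synchronous reset has priority
        elif en:
            self.count = (self.count + 1) & self.mask
        return self.count


def test_counter_wraps():
    """Drive the model cycle by cycle and check its outputs."""
    dut = Counter(nbits=2)
    assert [dut.tick() for _ in range(5)] == [1, 2, 3, 0, 1]
    assert dut.tick(en=False) == 1    # holds its value when disabled
    assert dut.tick(reset=True) == 0  # reset clears the state


test_counter_wraps()
```

A framework such as PyMTL goes much further than this sketch (for example, describing designs at several abstraction levels), but the workflow of writing a model and a test side by side in Python is the same.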

  12. PyMTL
      - "PyMTL: A Unified Framework for Vertically Integrated Computer Architecture Research." Derek Lockhart, Gary Zibrat, Christopher Batten. 47th ACM/IEEE Int’l Symp. on Microarchitecture (MICRO), Cambridge, UK, Dec. 2014.
      - "Mamba: Closing the Performance Gap in Productive Hardware Development Frameworks." Shunning Jiang, Berkin Ilbeyi, Christopher Batten. 55th ACM/IEEE Design Automation Conf. (DAC), San Francisco, CA, June 2018.

  13. Open Modular VLSI Build System: At A High Level
      https://github.com/cornell-brg/alloy-asic
      Problem: Rigid, static ASIC flows.
      Typical ASIC flows:
      - Flows are automated for exact sequences of steps
        - Want to add/remove a step? Modify the build system. Copies...
        - Once the flow is set up, you don’t want to touch it anymore
      - Adding new steps between existing steps is troublesome
        - Steps downstream magically reach upstream (hardcoding)
        - In general, the overhead to add new steps is high
      - Difficult to support different configurations of the flow
        - E.g., chip flow vs. block flow
        - How to add new steps before or after
        - Each new chip ends up with a dedicated non-reusable flow

  14. Open Modular VLSI Build System: At A High Level
      https://github.com/cornell-brg/alloy-asic
      Better ASIC flows: modularize the ASIC flow!
      - Use the build system to mix, match, and assemble steps together
        - Create modular steps that know how to run/clean themselves
        - The build system can also check prerequisites and outputs before and after execution to make sure each step can run
      - Assemble the ASIC flow as a graph
        - Can target architecture papers by assembling a minimal graph
        - Can target VLSI papers by assembling a medium graph with more steps (e.g., need dedicated floorplan)
        - Can target a chip by assembling a full-featured tapeout graph
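
The idea of modular steps assembled into a graph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the alloy-asic implementation: the `Step` and `Flow` classes, the step names, and the artifact names (`rtl`, `netlist`, `floorplan`, `layout`) are all hypothetical, and each step's `action` is a stand-in for invoking a real tool.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Step:
    """A modular flow step that declares its inputs and outputs (hypothetical)."""

    def __init__(self, name, inputs=(), outputs=(), action=None):
        self.name = name
        self.inputs = set(inputs)
        self.outputs = set(outputs)
        # Stand-in for running a real tool: returns the artifacts it produced.
        self.action = action or (lambda: set(self.outputs))

    def run(self, available):
        # Check prerequisites before execution.
        missing = self.inputs - available
        if missing:
            raise RuntimeError(f"{self.name}: missing inputs {sorted(missing)}")
        produced = set(self.action())
        # Check declared outputs after execution.
        absent = self.outputs - produced
        if absent:
            raise RuntimeError(f"{self.name}: missing outputs {sorted(absent)}")
        return produced


class Flow:
    """A flow is a graph of steps; edges say which steps must run first."""

    def __init__(self):
        self.steps = {}
        self.deps = {}

    def add(self, step, after=()):
        self.steps[step.name] = step
        self.deps[step.name] = set(after)
        return self  # allow chaining

    def run(self, sources=()):
        available = set(sources)
        order = []
        # static_order() yields every step after all of its predecessors.
        for name in TopologicalSorter(self.deps).static_order():
            available |= self.steps[name].run(available)
            order.append(name)
        return order


synthesis = Step("synthesis", inputs={"rtl"}, outputs={"netlist"})
floorplan = Step("floorplan", inputs={"netlist"}, outputs={"floorplan"})
place_route = Step("place_route", inputs={"netlist", "floorplan"}, outputs={"layout"})

# Minimal graph, e.g. for a front-end-only experiment.
minimal = Flow().add(synthesis)

# Larger graph that reuses the same step definitions.
tapeout = (Flow().add(synthesis)
                 .add(floorplan, after=["synthesis"])
                 .add(place_route, after=["floorplan"]))

print(minimal.run(sources={"rtl"}))
print(tapeout.run(sources={"rtl"}))
```

The same step definitions serve both graphs; only the assembly differs. Inserting a new step between two existing ones means adding one node and adjusting one `after` edge, with no change to the steps themselves.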

  15. Simple Front-End-Only ASIC Flow

  16. BRGTC2 ASIC Flow

  17. Key Changes Driving A New Era: Ecosystems for Open Builders; Productive Tools for Small Teams; Significantly Cheaper Costs

  18. Significantly Cheaper Costs
      Problem: Building chips is expensive.
      Key change: Multi-project wafer (MPW) services offer advanced-node runs with small minimum sizes.
      [Figure: snapshot from Muse Semiconductor]
