A Network of Time-Division Multiplexed Wiring for FPGAs
Rosemary Francis
Motivation
• FPGAs are now home to complex Systems-on-Chip
• Such designs require a Network-on-Chip
• FPGA global wiring is simple compared with ASIC Networks-on-Chip
• Existing NoCs for FPGAs use lots of wires or lots of logic
• Hard blocks are limited by the soft IP blocks
Goals
• Improve wiring density through TDM
• Use TDM components for an effective soft NoC implementation
• Funnel data to high-speed hard blocks
  – Hard NoC
  – Multipliers
  – Block RAM
Hierarchy of interconnect
• Coarse-grain packet-switched network
• Time-division multiplexed wires in a fine-grain network
• Clusters of logic elements with local interconnect
Architecture: Stratix vs TDM
[Figure: side-by-side routing architectures. Stratix: global routing, switch box, local routing, LUTs in a cluster of logic elements. TDM: the same hierarchy, plus TDM SRAM configuration and clusters of logic elements with latched inputs]
Wire Sharing
• Many wires can be shared without a problem
• Other configurations require a more intelligent approach: two signals needing the same wire in the same time slot conflict
• Signals can be delayed to allow more efficient wire use without rerouting
[Figures: example signal-to-wire assignments, including a conflict case and its resolution by delaying a signal]
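The delay-instead-of-reroute idea on this slide can be sketched as a tiny greedy slot assigner. This is a hypothetical illustration, not the CAD tool used in this work; the `schedule` function and signal names are invented:

```python
# Greedy TDM slot assignment (illustrative sketch):
# each signal requests a wire and a preferred time slot; on a conflict
# the signal is delayed to the next free slot rather than rerouted.

def schedule(signals, num_slots):
    """Map each (name, wire, preferred_slot) request onto a free (wire, slot)."""
    occupied = set()          # (wire, slot) pairs already in use
    assignment = {}
    for name, wire, preferred in signals:
        delay = 0
        while (wire, (preferred + delay) % num_slots) in occupied:
            delay += 1        # delay the signal to the next time slot
            if delay >= num_slots:
                raise ValueError(f"wire {wire} is fully booked")
        slot = (preferred + delay) % num_slots
        occupied.add((wire, slot))
        assignment[name] = (wire, slot)
    return assignment

# Two signals contend for slot 0 on wire 0; the second is delayed, not rerouted.
print(schedule([("a", 0, 0), ("b", 0, 0)], num_slots=4))  # {'a': (0, 0), 'b': (0, 1)}
```

A real scheduler would also weigh the latency cost of each delay, which is exactly the trade-off explored in the slides that follow.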
Parameter selection
• Assume infinite time slots to reduce wiring
  – Determine the optimum number of TDM wires
Infinite resources
[Figure: total number of wires needed vs number of TDM wires, assuming unlimited time slots]
Parameter selection
• Assume infinite time slots to reduce wiring
  – Determine the optimum number of TDM wires
• Vary the number of time slots
  – Determine the optimum number of time slots
  – Investigate the effect this has on latency
Determine number of time slots
[Figure: wires per switch box vs number of time slots (= number of configuration bits per mux)]
Number of time slots vs latency
[Figure: normalised latency of the critical path vs number of time slots (= number of configuration bits per mux)]
Parameter selection
• Assume infinite time slots to reduce wiring
  – Determine the optimum number of TDM wires
• Vary the number of time slots
  – Determine the optimum number of time slots
  – Investigate the effect this has on latency
• Using the optimum number of time slots
  – Re-evaluate the optimum number of TDM wires
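The sweeps above each pick an operating point where adding more resources stops paying off. One plausible way to automate that choice is a simple diminishing-returns rule; the `knee` function and the toy curve below are invented for illustration, since the real data comes from routing benchmarks:

```python
# Knee-finding sketch for the parameter sweeps above.
# Each point is (parameter value, wires needed); the toy curve stands in
# for measurements taken by routing benchmarks on the architecture.

def knee(points):
    """Return the smallest parameter value after which the wire count
    stops improving by more than 5% -- a simple diminishing-returns rule."""
    best_x, best_y = points[0]
    for x, y in points[1:]:
        if y < best_y * 0.95:     # still a worthwhile improvement
            best_x, best_y = x, y
    return best_x

# Toy sweep: wires needed drop sharply, then flatten out.
sweep = [(6, 65), (8, 40), (10, 32), (12, 31), (14, 31)]
print(knee(sweep))  # 10
```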
Limited resources
[Figure: total number of wires needed vs number of TDM wires, with a limited number of time slots]
Architectural drawbacks
• Extra configuration SRAM
• High-speed interconnect clock
• Benchmarks run over three times slower
• New CAD tools needed
  – Routing in space as well as time
  – Optimise for TDM wiring at every stage
Conclusions
• Using TDM wiring we can reduce the number of wires whilst increasing the data rate within channels
  – 75% less wiring × 24 time slots ÷ 3× slower clock ≈ 2× channel data rate
• This will allow
  – the design of effective global interconnect
  – more efficient sharing of on-chip resources
  – simplification of multi-chip designs
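The arithmetic in the first conclusion can be checked directly. The numbers are taken from the slides; the 2× figure compares channels of the same physical width:

```python
# Checking the conclusion's data-rate arithmetic (numbers from the slides).
wires_fraction = 0.25   # 75% less wiring, so a quarter of the wires remain
slots = 24              # time slots carried per TDM wire
slowdown = 3            # benchmarks run about three times slower

data_rate = wires_fraction * slots / slowdown
print(data_rate)  # 2.0 -> twice the channel data rate
```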
Future Work
• The current scheduling algorithm gives
  – a large wire reduction
  – a large latency penalty
• Is there a better compromise?
  – Halve the wiring for small latency penalties
• How can we reduce latency in other ways?
  – Better scheduling algorithms
  – Circuit redesign
Thanks for listening... Rosemary.Francis@cl.cam.ac.uk