Hardwired networks on chip for FPGAs
Kees Goossens (TUD, NXP)
Muhammad Aqeel Wahlah (TUD)
Kees Goossens, 2009-06-02, Tubs.CITY

overview
applications
network on chip
FPGA
key ideas
– hardwired NOC
– unified interconnect
– data coercion / type casting
dynamic partial reconfiguration
– multiple applications
– multiplex sub-applications (“hardware tasks”)
example
conclusions

applications
task / function mapped on IP
– includes storage / buffering
application: set of communicating IPs / tasks / ...
– data, control, code
– communication via connections
use case: set of concurrent applications
[figure: example task graphs with nodes A1, BA, A2, BAC; C1, C2, C3; T1, T2, T3]

network on chip (NOC)
connects ports on hardware blocks (IP)
– data, control
connections: virtual wires
programmable at run-time
– set up & destroy connections by programming control registers in the NOC
styles of communication
– address-based / memory-mapped
– streaming
real-time / quality of service
[figure: NOC built from routers (R) and network interfaces (NI) connecting IPs; tasks A1, A2, BA, BAC, T1, T2, T3 mapped on the IPs]
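
The run-time programmability of connections can be illustrated with a minimal sketch. It is a toy model, not the NOC described in these slides: the class `NocModel`, its `write_register` method, and the one-register-per-source-port layout are assumptions made only for illustration.

```python
class NocModel:
    """Toy model: a connection is one entry in a table of control registers."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.registers = {}  # hypothetical layout: source port -> destination port

    def write_register(self, src, dst):
        """Set up a connection (dst given) or destroy it (dst is None)."""
        if src not in self.ports:
            raise ValueError(f"unknown source port {src!r}")
        if dst is None:
            self.registers.pop(src, None)  # destroy the virtual wire
            return
        if dst not in self.ports:
            raise ValueError(f"unknown destination port {dst!r}")
        self.registers[src] = dst  # set up the virtual wire

    def send(self, src, word):
        """Deliver a word over the connection programmed for src."""
        dst = self.registers.get(src)
        if dst is None:
            raise RuntimeError(f"no connection programmed for port {src!r}")
        return (dst, word)


noc = NocModel(["A1", "BA", "A2"])
noc.write_register("A1", "BA")   # set up connection A1 -> BA
delivered = noc.send("A1", 0x2A)
noc.write_register("A1", None)   # destroy it again
```

The point of the sketch is that changing the communication structure is a register write at run time, not a reconfiguration of the fabric.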

FPGA fabric
soft IP are configured in
– configurable elements (LUT)
– and switch boxes (not shown)
with a given configuration granularity (frame)
using the configuration interconnect (ICAP)
hard IP
– CPU
– on-chip memories (BRAM, ...)
– off-chip memory interfaces
– decryption IP
– etc.
[figure: FPGA with LUT regions, IO processor, CPU, de/encrypt accelerator, off-chip memory, on-chip memories, ICAP]

application on FPGA
map application (IPs + interconnect + storage) on soft + hard IP
traditionally data and control interconnects are separate
could also use NOC for both
[figure: A1, A2, BAC, BA mapped on LUT regions, connected by a soft control interconnect and a soft data interconnect]

multiple applications on FPGA
interconnects and IPs of different applications share reconfiguration regions (frames)
– dynamic reconfiguration is global, not partial
– applications interfere
[figure: A1, A2, BAC, BA and T1, T2, T3 mapped on the same LUT regions, sharing the soft control and soft data interconnects]

1. hardwired interconnect
replace soft interconnect(s) by hard interconnect(s)
interconnect regions of LUTs (CFR)
– ~35x smaller area
– ~5x higher speed
program, don’t configure
bit-level (CFR) vs. transaction-level (NOC) reconfigurability
– memory mapped
– streaming
[figure: CFRs holding A1, A2, BA, BAC, T1, T2, T3, connected to the hard IP by hard interconnect(s)]

1. hardwired interconnect
dynamic partial reconfiguration
– no constraints on soft IP placement
loss of flexibility
– fewer LUTs
[figure: CFRs holding C1, C2, C3, T1, T2, T3, BAC, connected by hard interconnect(s)]

2. unified interconnect
one interconnect (e.g. NOC) for
– data for functional mode
– control for programming
– bitstreams for configuration
dynamic partitioning of different interconnects
[figure: CFRs and hard IP connected by a single hard interconnect]

3. data coercion
data = control = bitstream = …
connect a data port to a configuration port
– decrypt bitstreams
[figure: single hard interconnect; labels: bitstream, data]

3. data coercion
data = control = bitstream = test
connect a data port to a configuration port
– decrypt bitstreams
– run-time compute / optimise bitstreams
  • JIT, peephole
connect a data port to a test port
– run-time structural test
[figure: single hard interconnect; labels: PH, TR, TV, DUT, bitstreams, data, test data]
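
Data coercion can be sketched as follows: the same words travel over one connection, and only the type of the destination port decides whether they are treated as data or as a bitstream. All names here (`DataPort`, `ConfigPort`, `decrypt`, `stream`) are hypothetical, and the XOR "decryption" is a placeholder for whatever the de/encrypt accelerator really does.

```python
from dataclasses import dataclass, field


@dataclass
class DataPort:
    """Functional-mode port: keeps the words as plain data."""
    received: list = field(default_factory=list)

    def accept(self, words):
        self.received.extend(words)


@dataclass
class ConfigPort:
    """Configuration port: interprets the same words as one bitstream frame."""
    frames: list = field(default_factory=list)

    def accept(self, words):
        self.frames.append(tuple(words))


def decrypt(words, key):
    """Placeholder cipher (XOR with a key), standing in for the accelerator."""
    return [w ^ key for w in words]


def stream(source_words, *stages, sink):
    """Send words through optional processing stages into any kind of port."""
    words = list(source_words)
    for stage in stages:
        words = stage(words)
    sink.accept(words)


KEY = 0x5A  # hypothetical key
encrypted = [w ^ KEY for w in (0x01, 0x02, 0x03)]

cfr_port = ConfigPort()
stream(encrypted, lambda ws: decrypt(ws, KEY), sink=cfr_port)  # data -> bitstream

plain_port = DataPort()
stream(encrypted, sink=plain_port)  # same words, treated as ordinary data
```

A test port could be modelled the same way, as a third sink type that interprets the words as test vectors.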

dynamic partial reconfiguration
“hardware operating system” implements run-time scheduling of
1. multiple concurrent applications
– independent applications on own virtual platform
  • no communication, no interference
– activation given by user, environment, etc.
[figure: timeline of applications T, A, C, D; task graphs A1, BA, A2, BAC; C1, C2, C3; T1, T2, T3]

dynamic partial reconfiguration
“hardware operating system” implements run-time scheduling of
1. multiple concurrent applications
2. parts of single applications (soft IP, “hardware tasks”)
– multiplex resources of a single application
– internal state
[figure: timeline of applications T, A, C, D; task graphs A1, BA, A2, BAC; C1, C2, C3; label: state]
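
Multiplexing hardware tasks on one region, including their internal state, can be sketched like this. The model is an assumption made for illustration: a single `Region`, a state that is just a counter, and a save-on-swap-out / restore-on-swap-in policy; the slides do not specify how state is actually handled.

```python
class Region:
    """One configurable region (CFR) that holds at most one hardware task."""

    def __init__(self):
        self.loaded = None        # name of the task currently configured
        self.state = None         # internal state of that task (a counter here)
        self.reconfigurations = 0


def run_schedule(schedule, region, saved_state):
    """Run tasks in order, swapping them in and out of the single region."""
    trace = []
    for task in schedule:
        if region.loaded != task:
            if region.loaded is not None:
                saved_state[region.loaded] = region.state  # save outgoing state
            region.loaded = task                           # load the new task
            region.state = saved_state.get(task, 0)        # restore, or start fresh
            region.reconfigurations += 1
        region.state += 1                                  # one unit of work
        trace.append((task, region.state))
    return trace


region = Region()
saved = {}
trace = run_schedule(["C1", "C2", "C1", "C1", "C3"], region, saved)
```

In this run C1 resumes from its saved state when it is swapped back in, and running it twice in a row needs no reconfiguration.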

dynamic partial reconfiguration
1. system manager
– resource management (CFR, NOC, …)
  • inter-application virtual platforms
  • intra-application phases
– NOC programming
– soft IP / (sub)-application configuration
[figure: timeline with system manager and application managers for T, A, C, BAC]

dynamic partial reconfiguration
1. system manager
2. application manager
– application programming
– intra-application persistent data management
[figure: timeline with system manager and application managers; task graphs A1, BA, A2, BAC; C1, C2, C3; label: state]

modelling
SystemC
– bit & cycle accurate NOC model
– behavioural CFR models
– accurate bitstream structure
– behavioural hard IP models
model
– starting / stopping of applications
  • dynamic, based on user input
– starting / stopping of sub-applications
  • dynamic, based on flow of data
– configuration: loading of bitstreams for soft IP; clock & reset
– programming: of NOC, system & sub-application managers
– management of persistent state

example
system manager
– program NOC for configuration
– configure: load bitstreams
  • including bitstream syntax, etc.
[figure: system manager on the CPU and application manager in a CFR, connected by the single hard interconnect; labels: programming, bitstream, data]
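
The system manager steps in this example can be sketched as an ordered sequence. The `Platform` class and `start_application` function are hypothetical, and the order of the last two steps (releasing reset before programming the functional-mode connection) is an assumption; the slides only state that the NOC is programmed for configuration before bitstreams are loaded.

```python
class Platform:
    """Toy platform that records what the system manager does, in order."""

    def __init__(self):
        self.connections = set()  # programmed NOC connections (src, dst)
        self.configured = {}      # CFR name -> loaded bitstream
        self.log = []

    def program_noc(self, src, dst):
        self.connections.add((src, dst))
        self.log.append(f"program NOC: {src} -> {dst}")

    def load_bitstream(self, src, cfr, bitstream):
        # A bitstream can only travel over a connection that already exists.
        if (src, cfr) not in self.connections:
            raise RuntimeError(f"no connection from {src} to {cfr}")
        self.configured[cfr] = list(bitstream)
        self.log.append(f"configure {cfr}: {len(bitstream)} words")

    def release_reset(self, cfr):
        if cfr not in self.configured:
            raise RuntimeError(f"{cfr} is not configured")
        self.log.append(f"release reset: {cfr}")


def start_application(platform, memory, cfr, bitstream, data_src):
    platform.program_noc(memory, cfr)                # 1. connection for configuration
    platform.load_bitstream(memory, cfr, bitstream)  # 2. configure the soft IP
    platform.release_reset(cfr)                      # 3. clock & reset (assumed order)
    platform.program_noc(data_src, cfr)              # 4. functional-mode connection


platform = Platform()
start_application(platform, "off-chip memory", "CFR0", [0x10, 0x20, 0x30], "CPU")
```

Skipping step 1 makes `load_bitstream` fail, which mirrors the dependency in the example: configuration uses the same interconnect as data, so the interconnect has to be programmed first.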