
Will FPGA reconfiguration change the synthesis problem?

Prof. Dirk Stroobandt, Ghent University, Belgium, Hardware and Embedded Systems group (Universiteit Gent, Faculteit Ingenieurswetenschappen, Vakgroep Elektronica en Informatiesystemen)


  1. Will FPGA reconfiguration change the synthesis problem?
     Prof. Dirk Stroobandt
     Ghent University, Belgium
     Hardware and Embedded Systems group
     Universiteit Gent – Faculteit Ingenieurswetenschappen – Vakgroep Elektronica en Informatiesystemen – 11 December 2015

  2. Outline
     • What is Parameterized Run-time Reconfiguration?
     • The importance of the parameter choice
     • Effects on logic synthesis

  3. Outline
     • What is Parameterized Run-time Reconfiguration?
     • The importance of the parameter choice
     • Effects on logic synthesis

  4. FPGA Run-Time Reconfiguration?
     • Today: configurability on a large time scale
       – Prototyping
       – System update
       – ...
     • We: configurability on a smaller time scale
       – Dynamic circuit specialization
         • Frequently changing (regular) inputs vs. infrequently changing parameters
         • Parameters trigger a reconfiguration (through configuration manager)
       – Goals:
         • Improve performance
         • Reduce area
         • Minimize design effort

  5. Conventional Dynamic Reconfiguration
     [Diagram labels: CPU running the Application Software; Reconfiguration Request; Configuration Manager; config. DB holding the F1 and F2 configurations; Configuration Interface; FPGA with a Static region and a Dynamic region that holds F1 or F2.]

  6. Conventional Tool Flow
     • Static Design HDL → Synthesis → Tech. Mapping → Place & Route → Static Config.
     • F1 HDL → Synthesis → Tech. Mapping → Place & Route → F1 Config.
     • F2 HDL → Synthesis → Tech. Mapping → Place & Route → F2 Config.
     • …

  7. Dynamic Circuit Specialization
     • Application where part of the input data changes infrequently
       – Conventional implementation (no reconfiguration): generic circuit, store data in memory, overwrite memory
       – Dynamic circuit specialization: reconfigure with a configuration specialized for the data
     • Example: adaptive FIR filter (16-tap, 8-bit coefficients)
       – 16 × 8 = 128 coefficient bits, so 2^128 possible configurations: not feasible!
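The size of the configuration space in the FIR example follows directly from the two numbers on the slide. A minimal sketch of that arithmetic (the variable names are illustrative, not from the talk):

```python
# Count the specialized configurations of the FIR example.
TAPS = 16        # from the slide: 16-tap filter
COEFF_BITS = 8   # from the slide: 8-bit coefficients

# Every coefficient bit is a parameter bit, and every combination of
# parameter bits would need its own pre-compiled configuration.
parameter_bits = TAPS * COEFF_BITS
configurations = 2 ** parameter_bits

print("parameter bits:", parameter_bits)
print(f"configurations: {configurations:.3e}")
```

Storing one configuration per parameter value is therefore out of the question, which is the motivation for generating configurations from a compact parameterized form instead.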

  8. Our solution: Parameterized Configuration
     Parameterized Configuration: { 0 1 0 A+B AB A 1 }
     Specialized Configurations:
       Parameters A B | Configuration
       0 0            | { 0 1 0 0 0 0 1 }
       0 1            | { 0 1 0 1 0 0 1 }
       1 0            | { 0 1 0 1 0 1 1 }
       1 1            | { 0 1 0 1 1 1 1 }
     * K. Bruneel and D. Stroobandt, “Automatic Generation of Run-time Parameterizable Configurations,” FPL 2008.
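The relation between the parameterized configuration and its four specialized configurations can be reproduced in a few lines. This is only a sketch of the idea: the list-of-bits representation and the name `specialize` are assumptions made for illustration, not the data structures of the actual tools.

```python
from itertools import product

# The parameterized configuration { 0 1 0 A+B AB A 1 } from the slide:
# constant bits are plain integers, tunable bits are Boolean functions
# of the parameters A and B.
PARAMETERIZED_CONFIG = [
    0, 1, 0,
    lambda a, b: a | b,   # A+B
    lambda a, b: a & b,   # AB
    lambda a, b: a,       # A
    1,
]

def specialize(config, a, b):
    """Evaluate every tunable bit for the given parameter values."""
    return [bit(a, b) if callable(bit) else bit for bit in config]

for a, b in product((0, 1), repeat=2):
    print(a, b, specialize(PARAMETERIZED_CONFIG, a, b))
```

Evaluating the three tunable bits for each value of (A, B) yields the four specialized configurations listed on the slide.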

  9. Dynamic Circuit Specialization (micro-reconfiguration)
     [Diagram labels: CPU running the Application Software; Reconfiguration Request; Configuration Manager; config. DB holding the FIR configuration; Configuration Interface; FPGA with a Static region and a Dynamic region that holds FIR, specialized as FIR(2, 8) or FIR(4, 9).]

  10. Two stage approach
      • Off-line stage:
        – In: Generic functionality
          • Specification of the generic functionality
          • Distinction between regular and parameter inputs
        – Out: Parameterizable Configuration
          • Software function
          • Outputs specialized configurations for given parameter values
      • On-line stage:
        – Evaluate parameterizable configuration
        – Out: Specialized Configuration
        – Repeat every time parameters change
      [Diagram: Generic Functionality → Off-line Stage → Parameterizable Configuration → On-line Stage → Specialized Configuration]
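The split between the two stages can be sketched with a toy example. Everything here is hypothetical: the 2:1 multiplexer, the names `offline_stage` and `ConfigurationManager`, and the truth-table encoding are chosen for illustration, and the real off-line stage derives the tuning functions with adapted CAD tools rather than by enumeration.

```python
def offline_stage():
    """Off-line: turn a generic function into a parameterizable configuration."""
    def generic(sel, in0, in1):
        # Generic functionality: a 2:1 multiplexer with sel as the parameter.
        return in1 if sel else in0

    def parameterizable_configuration(sel):
        # Truth table over the regular inputs only (index = 2*in1 + in0).
        return tuple(generic(sel, idx & 1, idx >> 1) for idx in range(4))

    return parameterizable_configuration


class ConfigurationManager:
    """On-line: re-evaluate only when the parameter values change."""

    def __init__(self, parameterizable_configuration):
        self._evaluate = parameterizable_configuration
        self._params = None
        self.config = None
        self.reconfigurations = 0

    def update(self, *params):
        if params != self._params:
            self._params = params
            self.config = self._evaluate(*params)
            self.reconfigurations += 1
        return self.config


manager = ConfigurationManager(offline_stage())
for sel in (0, 0, 1, 1, 0):
    print(sel, manager.update(sel))
print("reconfigurations:", manager.reconfigurations)
```

The on-line stage does nothing more than evaluate a function, so its cost depends only on how often the parameters change, not on how often the regular inputs change.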

  11. Param. Configuration Tool Flow
      • Tunable truth table bits
        – Adapted Tech. Mapper: TMAP
        – Map to Tunable LUTs (TLUTs)
        – [FPL2008], [ReConFig2008], [DATE2009]
      • Tunable routing bits
        – Adapted Tech. Mapper
        – Adapted Placer
        – Adapted Router
      [Diagram: Param. HDL → Synthesis* → Tech. Mapping* → Place* & Route* → Param. Config.]

  12. Outline
      • What is Parameterized Run-time Reconfiguration?
      • The importance of the parameter choice
      • Effects on logic synthesis

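  13. Parameterizable HDL design
      library ieee;
      use ieee.std_logic_1164.all;
      use ieee.std_logic_unsigned.all;  -- provides conv_integer

      -- din/dout correspond to in/out in the diagrams
      -- ("in" and "out" are reserved words in VHDL)
      entity multiplexer is
        port (
          --BEGIN PARAM
          sel  : in  std_logic_vector(2 downto 0);
          --END PARAM
          din  : in  std_logic_vector(7 downto 0);
          dout : out std_logic
        );
      end multiplexer;

      architecture behaviour of multiplexer is
      begin
        -- select one of the eight data bits
        dout <= din(conv_integer(sel));
      end behaviour;
      [Diagram: 8-to-1 multiplexer with data inputs in0 to in7, select inputs sel0 to sel2 and output out]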

  14. Synthesis*
      • Two types of inputs:
        – Regular inputs
        – Parameter inputs
      [Diagram: gate network (gates labeled A and O) of the multiplexer with inputs in0 to in7, sel0 to sel2 and output out]

  15. Conventional technology mapping
      • K-input LUT (K=3): can implement any Boolean function with up to K arguments.
      • Tech. Mapping: search for a covering of the input circuit with K-input subcircuits.
      [Diagram: gate network (gates labeled A and O) of the multiplexer with inputs in0 to in7, sel0 to sel2 and output out]

  16. TMAP: Tunable LUT mapping
      • Tunable LUT (TLUT): can implement any Boolean function with K regular inputs and any number of parameter inputs.
      • Search for a covering with subcircuits that have up to K regular inputs and any number of parameter inputs.
      [Diagram: gate network (gates labeled A and O) of the multiplexer with inputs in0 to in7, sel0 to sel2 and output out]
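A TLUT can be pictured as an ordinary K-input LUT whose truth-table bits are recomputed whenever the parameters change. The sketch below assumes that view; the comparator function and the helper name `specialize_tlut` are hypothetical examples, not taken from the talk or from TMAP.

```python
from itertools import product

K = 3  # LUT size used on the slides

def specialize_tlut(function, params):
    """Return the 2**K truth-table bits of `function` for fixed parameter values.

    `function(regular, params)` may depend on any number of parameter inputs,
    but only on K regular inputs, so the result always fits one K-input LUT.
    """
    return [function(regular, params) for regular in product((0, 1), repeat=K)]

def equals_constant(regular, params):
    """Hypothetical example: compare three regular inputs with three parameters."""
    return int(regular == params)

# Six inputs in total, yet one 3-input LUT is enough once the
# three parameter inputs are fixed.
print(specialize_tlut(equals_constant, (1, 0, 1)))
```

A conventional mapper would have to cover the same six-input function with several 3-input LUTs, which is where the area saving of TLUT mapping comes from.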

  17. LUT structure and functionality
      $L_0 = sel_0 \cdot in_3 + \overline{sel_0} \cdot in_2$
      $L_1 = sel_1 \cdot L_0 + \overline{sel_1} \cdot (sel_0 \cdot in_1 + \overline{sel_0} \cdot in_0)$
      …
      [Diagram: LUTs L0, L1, L3, L4 and L5 connecting inputs in0 to in7 and sel0 to sel2 to the output out]
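The LUT functions can be checked against the multiplexer by exhaustive simulation. In this sketch only L0 and L1 follow the slide; the functions for L3, L4 and L5 are elided there and are assumed here by symmetry, and the select bits are assumed to address the inputs as 4·sel2 + 2·sel1 + sel0.

```python
from itertools import product

def mux2(sel, hi, lo):
    """sel·hi + (not sel)·lo"""
    return hi if sel else lo

def tlut_network(inp, sel):
    sel0, sel1, sel2 = sel
    l0 = mux2(sel0, inp[3], inp[2])                  # L0, as on the slide
    l1 = mux2(sel1, l0, mux2(sel0, inp[1], inp[0]))  # L1, as on the slide
    l3 = mux2(sel0, inp[7], inp[6])                  # assumed by symmetry with L0
    l4 = mux2(sel1, l3, mux2(sel0, inp[5], inp[4]))  # assumed by symmetry with L1
    l5 = mux2(sel2, l4, l1)                          # assumed final stage
    return l5

def reference(inp, sel):
    """Behavioural multiplexer: select the input addressed by sel."""
    sel0, sel1, sel2 = sel
    return inp[4 * sel2 + 2 * sel1 + sel0]

# Compare both descriptions for all 2**11 input combinations.
mismatches = sum(
    tlut_network(inp, sel) != reference(inp, sel)
    for inp in product((0, 1), repeat=8)
    for sel in product((0, 1), repeat=3)
)
print("mismatches:", mismatches)
```

Each of the five functions has at most three regular inputs, so each fits a TLUT with K=3 once the sel bits are treated as parameters.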

  18. Place and Route
      [Diagram: placed and routed LUTs L0, L1, L3, L4 and L5 with inputs in0 to in7, sel0 to sel2 and output out]

  19. Experiment: 16-tap FIR, 8-bit coefficients
                          Generic   Parameterizable configuration   Specialized
      area (LUTs)         2999      1301 (-56%)                     1146
      clock freq. (MHz)   84        115 (+37%)                      119
      gen. time (ms)      0         0.166                           35634
      conf. memory (kB)   0         29                              2^128
      • Less area (-56%)
        – More functionality in one TLUT
        – Functionality is moved to the tuning functions
      • Higher clock frequency (+37%)
        – Fewer LUTs can be placed closer together
        – Less congestion because there are fewer nets
      • Reduced generation time (5 orders of magnitude)
        – No NP-hard problems (place and route) at run-time
        – Only evaluation of the tuning functions
      • Less memory (only 29 kB)
        – TMAP flow finds similarity between configurations
        – Compressed form of all configurations
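The relative figures quoted for the experiment can be recomputed from the raw numbers. A small sketch of that arithmetic, using only values from the slide (the dictionary layout is an illustrative choice):

```python
import math

# Raw numbers from the experiment slide.
generic = {"luts": 2999, "mhz": 84}
parameterizable = {"luts": 1301, "mhz": 115, "gen_ms": 0.166}
specialized_gen_ms = 35634

# Area and clock frequency relative to the generic implementation.
area_change = (parameterizable["luts"] - generic["luts"]) / generic["luts"]
clock_change = (parameterizable["mhz"] - generic["mhz"]) / generic["mhz"]

# Generation time relative to running the full flow per specialized configuration.
orders_of_magnitude = math.log10(specialized_gen_ms / parameterizable["gen_ms"])

print(f"area change:  {area_change:+.1%}")
print(f"clock change: {clock_change:+.1%}")
print(f"generation time reduced by about 10^{orders_of_magnitude:.1f}")
```

The results agree with the slide's rounded figures: a bit more than half the area saved, roughly a third more clock frequency, and a generation time that is shorter by somewhat more than five orders of magnitude.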

  20. When should we use parameterized reconfiguration?
      Use the Functional Density as a measure for implementation efficiency.
      $FD = \frac{N}{A \cdot T}$
      A: The area needed
      T: The total execution time
      N: The number of operations
      * A. M. Dehon, Reconfigurable architectures for general-purpose computing, Massachusetts Institute of Technology, 1996.

  21. Parameter Selection
      • Profiler to trade off gain versus overhead of reconfiguration
      [Plot: Functional Density (Ops/s/LUTs) versus Avg. Time between parameter changes (clock cycles)]
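The trade-off between gain and reconfiguration overhead can be explored with a simple model of functional density, N / (A · T). The model below is an assumption made for illustration: one operation per clock cycle, an overhead per parameter change equal to the 0.166 ms generation time of the FIR experiment (the time to write the configuration is not given and is ignored), and the area and clock figures of that experiment.

```python
# Figures from the FIR experiment; using them in this model is an assumption.
GENERIC_AREA, GENERIC_HZ = 2999, 84e6
PARAM_AREA, PARAM_HZ = 1301, 115e6
OVERHEAD_S = 0.166e-3  # assumed cost of one parameter change

def functional_density(ops, area, time_s):
    """FD = N / (A * T)."""
    return ops / (area * time_s)

def fd_generic(cycles):
    # No reconfiguration: the circuit simply runs for `cycles` clock cycles.
    return functional_density(cycles, GENERIC_AREA, cycles / GENERIC_HZ)

def fd_parameterized(cycles):
    # `cycles` operations between two parameter changes, plus the overhead.
    return functional_density(cycles, PARAM_AREA, cycles / PARAM_HZ + OVERHEAD_S)

def break_even_cycles():
    """Cycles between parameter changes at which both densities are equal."""
    g = fd_generic(1)  # constant in this model: GENERIC_HZ / GENERIC_AREA
    return g * PARAM_AREA * OVERHEAD_S / (1 - g * PARAM_AREA / PARAM_HZ)

n = break_even_cycles()
print(f"break-even at about {n:.0f} cycles between parameter changes")
for cycles in (n / 10, n, 10 * n):
    print(f"{cycles:12.0f} cycles: generic {fd_generic(cycles):.0f}, "
          f"parameterized {fd_parameterized(cycles):.0f} ops/s/LUT")
```

Below the break-even point the reconfiguration overhead dominates and the generic circuit is denser; above it the smaller, faster specialized circuit wins. A profiler would have to locate this kind of crossover for each candidate parameter.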

  22. Outline
      • What is Parameterized Run-time Reconfiguration?
      • The importance of the parameter choice
      • Effects on logic synthesis

  23. Original logic synthesis solution (3-input LUT)
      [Diagram: gate network (gates labeled A and O) of the multiplexer with inputs in0 to in7, sel0 to sel2 and output out]

  24. Making subtrees according to K regular inputs
      [Diagram: gate network (gates labeled A and O) with inputs in0 to in7, sel0 to sel2 and output out, regrouped into subtrees]

  25. Separate parameters from other inputs
      [Diagram: each input in0 to in7 is combined with sel in its own A gate; the eight results are merged by O gates into out]

  26. Changing the tree depth
      [Diagram: the A gates for in0 to in7 and sel, with the O gates rearranged to give a different tree depth before out]

  27. Conclusions
      • Parameterized reconfiguration opens up new optimization possibilities using run-time reconfiguration
      • Parameters are to be treated differently in Technology Mapping
      • Therefore parameters and regular inputs should be treated differently in logic synthesis
      • Cost of parameter calculations (Boolean functions) should also be taken into account
      • New challenge in synthesis

  28. Submit to IWLS
      • Paper abstract submission: March 11, 2016
      • www.iwls.org

  29. Last slide
      • Much of this work was done in the framework of the EU-FP7 project FASTER and is now continued in the EU-H2020 project (FETHPC) EXTRA
      • Tools at https://github.com/UGent-HES/tlut_flow
      • Questions?
      • More information: http://hes.elis.ugent.be/
