Will FPGA reconfiguration change the synthesis problem? Prof. Dirk Stroobandt Ghent University, Belgium Hardware and Embedded Systems group Universiteit Gent – Faculteit Ingenieurswetenschappen – Vakgroep Elektronica en Informatiesystemen – 11 December 2015
Outline • What is Parameterized Run-time Reconfiguration? • The importance of the parameter choice • Effects on logic synthesis 2
FPGA Run-Time Reconfiguration? • Today: configurability on a large time scale – Prototyping – System update – ... • We: configurability on a smaller time scale – Dynamic circuit specialization • Frequently changing (regular) inputs vs. infrequently changing parameters • Parameters trigger a reconfiguration (through configuration manager) – Goals: • Improve performance • Reduce area • Minimize design effort 4
Conventional Dynamic Reconfiguration [Figure: block diagram. Labels: CPU (Application Software, Reconfiguration Request, Configuration Manager, config. DB with F 1 and F 2), Configuration Interface, FPGA (Static part, Dynamic part holding F 1 or F 2)] 5
Conventional Tool Flow [Figure: one complete flow per design. Static Design HDL → Synthesis → Tech. Mapping → Place & Route → Static Config.; F 1 HDL → Synthesis → Tech. Mapping → Place & Route → F 1 Config.; F 2 HDL → Synthesis → Tech. Mapping → Place & Route → F 2 Config.; …] 6
Dynamic Circuit Specialization • Application where part of the input data changes infrequently – Conventional implementation (no reconfiguration): generic circuit, store data in memory, overwrite memory – Dynamic circuit specialization: reconfigure with a configuration specialized for the data • Example: Adaptive FIR filter (16-tap, 8-bit coefficients): 2^128 possible configurations, so generating and storing one configuration per setting is not feasible! 7
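The configuration count on this slide follows from the parameter width: 16 coefficients of 8 bits each give 128 parameter bits. A minimal sketch of that arithmetic (the helper name is hypothetical and not part of the talk):

```python
# Hypothetical helper: counts the distinct parameter settings, assuming one
# specialized configuration per setting of the coefficient bits.
def num_specialized_configs(taps: int, coeff_bits: int) -> int:
    parameter_bits = taps * coeff_bits   # 16 * 8 = 128 for the FIR example
    return 2 ** parameter_bits

count = num_specialized_configs(taps=16, coeff_bits=8)
print(count == 2 ** 128)   # True
```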
Our solution: Parameterized Configuration
Parameterized configuration (parameters A, B): { 0 1 0 A+B AB A 1 }
Specialized configurations:
A B   Configuration
0 0   { 0 1 0 0 0 0 1 }
0 1   { 0 1 0 1 0 0 1 }
1 0   { 0 1 0 1 0 1 1 }
1 1   { 0 1 0 1 1 1 1 }
* K. Bruneel and D. Stroobandt, “Automatic Generation of Run-time Parameterizable Configurations,” FPL 2008. 8
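A parameterized configuration can be read as a list of small Boolean functions of the parameters, one per configuration bit. The sketch below encodes the slide's example { 0 1 0 A+B AB A 1 } that way; the Python names are illustrative assumptions, not the tool flow's actual data structures.

```python
from typing import Callable, List

# One tuning function per configuration bit, in the order shown on the slide.
PARAMETERIZED_CONFIG: List[Callable[[int, int], int]] = [
    lambda a, b: 0,
    lambda a, b: 1,
    lambda a, b: 0,
    lambda a, b: a | b,   # A+B
    lambda a, b: a & b,   # AB
    lambda a, b: a,       # A
    lambda a, b: 1,
]

def specialize(a: int, b: int) -> List[int]:
    """Evaluate every tuning function for the given parameter values."""
    return [bit(a, b) for bit in PARAMETERIZED_CONFIG]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, specialize(a, b))
```

Only the three bits that depend on A or B differ between the four specialized configurations; the constant bits are shared.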
Dynamic Circuit Specialization (micro-reconfiguration) [Figure: block diagram. Labels: CPU (Application Software, Reconfiguration Request, Configuration Manager, config. DB with a single FIR entry), Configuration Interface, FPGA (Static part, Dynamic part holding FIR instances such as FIR(2, 8) and FIR(4,9))] 9
Two stage approach • Off-line stage: – In: Generic functionality • Specification of the generic functionality • Distinction between regular and parameter inputs – Out: Parameterizable Configuration • Software function • Outputs specialized configurations for given parameter values • On-line stage: – Evaluate parameterizable configuration – Out: Specialized Configuration – Repeat every time parameters change [Figure: Generic Functionality → Off-line Stage → Parameterizable Configuration → On-line Stage → Specialized Configuration] 10
Param. Configuration Tool Flow • Tunable truth table bits – Adapted Tech. Mapper: TMAP – Map to Tunable LUTs (TLUTs) – [FPL2008], [ReConFig2008], [DATE2009] • Tunable routing bits – Adapted Tech. Mapper – Adapted Placer – Adapted Router [Figure: Param. HDL → Synthesis* → Tech. Mapping* → Place* & Route* → Param. Config.] 11
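The on-line stage can be pictured as a small manager that re-evaluates the parameterizable configuration only when the parameter values actually change. This is a sketch under assumptions: the class, its methods and the toy configuration function are hypothetical, and the write to the FPGA configuration memory is only indicated by a comment.

```python
from typing import Callable, List, Optional, Tuple

Params = Tuple[int, ...]
Config = List[int]

class ConfigurationManager:
    """Hypothetical on-line stage: evaluates only when parameters change."""

    def __init__(self, parameterizable_config: Callable[[Params], Config]):
        self._evaluate = parameterizable_config
        self._current: Optional[Params] = None
        self.config: Config = []
        self.reconfigurations = 0

    def set_parameters(self, params: Params) -> Config:
        if params != self._current:
            # A real system would write this result to the configuration memory.
            self.config = self._evaluate(params)
            self._current = params
            self.reconfigurations += 1
        return self.config

# Stand-in for the off-line stage output, hand-written for illustration.
def toy_parameterizable_config(params: Params) -> Config:
    a, b = params
    return [a | b, a & b]

manager = ConfigurationManager(toy_parameterizable_config)
manager.set_parameters((1, 0))
manager.set_parameters((1, 0))   # unchanged parameters: nothing is re-evaluated
manager.set_parameters((1, 1))
print(manager.reconfigurations)  # 2
```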
Outline • What is Parameterized Run-time Reconfiguration? • The importance of the parameter choice • Effects on logic synthesis 12
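Parameterizable HDL design

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;  -- provides conv_integer

    -- The slide names the data ports "in" and "out"; both are reserved
    -- words in VHDL, so they are renamed to din and dout here.
    entity multiplexer is
      port (
        --BEGIN PARAM
        sel  : in  std_logic_vector(2 downto 0);
        --END PARAM
        din  : in  std_logic_vector(7 downto 0);
        dout : out std_logic
      );
    end multiplexer;

    architecture behaviour of multiplexer is
    begin
      dout <= din(conv_integer(sel));
    end behaviour;

[Figure: 8-to-1 multiplexer with data inputs in 0 to in 7, select inputs sel 0 to sel 2, and output out] 13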
Synthesis* [Figure: the multiplexer as a network of A (AND) and O (OR) gates over in 0 to in 7 and sel 0 to sel 2, with output out] Two types of inputs: • Regular inputs • Parameter inputs 14
Conventional technology mapping [Figure: the same gate network covered with 3-input subcircuits] K-input LUT (K=3): Can implement any Boolean function with up to K arguments. Tech. Mapping: Search for a covering of the input circuit with K-input subcircuits. 15
TMAP: Tunable LUT mapping [Figure: the same gate network covered with tunable subcircuits] Tunable LUT (TLUT): can implement any Boolean function with K regular inputs and any number of parameter inputs. Tech. Mapping: Search for a covering with subcircuits that have up to K regular inputs and any number of parameter inputs. 16
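The statement that a K-input LUT implements any Boolean function of up to K arguments can be illustrated by storing the function as a 2^K-entry truth table. The helper names below are assumptions made for this sketch only.

```python
from itertools import product
from typing import Callable, List

K = 3

def lut_truth_table(func: Callable[..., int], k: int = K) -> List[int]:
    """Enumerate all 2**k input combinations and record the function value."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=k)]

def lut_eval(table: List[int], *inputs: int) -> int:
    """Look up the output; the first input is the most significant index bit."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return table[index]

# A 2:1 multiplexer has three arguments, so it fits one 3-input LUT.
def mux2(sel: int, a: int, b: int) -> int:
    return a if sel else b

table = lut_truth_table(mux2)
print(table)   # [0, 1, 0, 1, 0, 0, 1, 1]
```

Technology mapping then amounts to choosing which subcircuits get collapsed into such tables; the covering search itself is not shown here.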
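LUT structure and functionality [Figure: LUTs L 0, L 1, L 3, L 4, L 5 with inputs in 0 to in 7, parameters sel 0 to sel 2, and output out] $L_0 = sel_0 \cdot in_3 + \overline{sel_0} \cdot in_2$; $L_1 = sel_1 \cdot L_0 + \overline{sel_1} \cdot (sel_0 \cdot in_1 + \overline{sel_0} \cdot in_0)$; … 17
Place and Route [Figure: the LUTs L 0, L 1, L 3, L 4, L 5 with inputs in 0 to in 7, parameters sel 0 to sel 2, and output out after placement and routing] 18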
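A TLUT's truth table bits are functions of the parameters, so specializing it means fixing the parameters and enumerating only the regular inputs. The sketch below assumes the slide's expressions read L0 = sel0·in3 + ¬sel0·in2 and L1 = sel1·L0 + ¬sel1·(sel0·in1 + ¬sel0·in0); the placement of the negations is inferred from multiplexer semantics, and all Python names are hypothetical.

```python
from itertools import product
from typing import List, Sequence

def tlut_table(func, params: Sequence[int], k: int) -> List[int]:
    """Specialize a TLUT: fix the parameters, enumerate the k regular inputs."""
    return [func(*regular, *params) & 1 for regular in product((0, 1), repeat=k)]

def lookup(table: List[int], *regular: int) -> int:
    index = 0
    for bit in regular:          # first regular input is the most significant bit
        index = (index << 1) | bit
    return table[index]

def l0(in3, in2, sel0):          # 2 regular inputs, 1 parameter
    return (sel0 & in3) | ((1 - sel0) & in2)

def l1(l0_out, in1, in0, sel1, sel0):   # 3 regular inputs (K = 3), 2 parameters
    return (sel1 & l0_out) | ((1 - sel1) & ((sel0 & in1) | ((1 - sel0) & in0)))

def mux4_via_tluts(data: Sequence[int], sel1: int, sel0: int) -> int:
    in0, in1, in2, in3 = data
    t0 = tlut_table(l0, (sel0,), k=2)
    t1 = tlut_table(l1, (sel1, sel0), k=3)
    return lookup(t1, lookup(t0, in3, in2), in1, in0)

print(mux4_via_tluts((0, 1, 0, 0), sel1=0, sel0=1))   # 1, selects in 1
```

Under this reading, two TLUTs cover the lower half of the 8-to-1 multiplexer, even though L1 depends on five signals in total.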
Experiment: 16-tap FIR, 8-bit coefficients

                      Generic   Parameterizable configuration   Specialized
area (LUTs)           2999      1301 (-56%)                     1146
clock freq. (MHz)     84        115 (+37%)                      119
gen. time (ms)        0         0.166                           35634
conf. memory (kB)     0         29                              2^128

• Less area (-56%) – More functionality in one TLUT – Functionality is moved to the tuning functions
• Higher clock frequency (+37%) – Fewer LUTs can be placed closer together – Less congestion because of fewer nets
• Reduced generation time (5 orders of magnitude) – No NP-hard problems (place and route) at run-time – Only evaluation of the tuning functions
• Less memory (only 29 kB) – TMAP flow finds similarity between configurations – Compressed form of all configurations
19
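The headline percentages can be rechecked from the table values. The sketch assumes the flattened table reads as generic 2999 LUTs at 84 MHz, parameterizable 1301 LUTs at 115 MHz, and generation times of 0.166 ms (parameterizable) versus 35634 ms (specialized); that column assignment is an interpretation, chosen because it reproduces the quoted -56% and +37%.

```python
import math

# Values as read from the experiment slide (column assignment is an assumption).
generic_luts, param_luts = 2999, 1301
generic_mhz, param_mhz = 84, 115
specialized_gen_ms, param_gen_ms = 35634, 0.166

area_change = (param_luts - generic_luts) / generic_luts      # about -0.566
freq_change = (param_mhz - generic_mhz) / generic_mhz         # about +0.369
speedup_orders = math.log10(specialized_gen_ms / param_gen_ms)

print(f"area {area_change:+.1%}, clock {freq_change:+.1%}, "
      f"generation time {speedup_orders:.1f} orders of magnitude faster")
```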
When should we use parameterized reconfiguration? Use the Functional Density as a measure for implementation efficiency. $FD = \frac{N}{A \cdot T}$ A: The area needed T: The total execution time N: The number of operations * A. M. DeHon, Reconfigurable Architectures for General-Purpose Computing, Massachusetts Institute of Technology, 1996. 20
Parameter Selection [Figure: Functional Density (Ops/s/LUTs) versus Avg. time between parameter changes (clock cycles)] Profiler to trade off gain versus overhead of reconfiguration 21
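The trade-off a profiler has to evaluate can be sketched with the functional density N / (A · T), where T for the specialized circuit includes the reconfiguration overhead paid at each parameter change. Everything beyond that formula is an assumption for illustration: one operation per clock cycle, the area and clock figures borrowed from the FIR experiment, and a made-up reconfiguration time of 1 ms.

```python
def functional_density(n_ops: float, area_luts: float, time_s: float) -> float:
    """Operations per second per LUT: N / (A * T)."""
    return n_ops / (area_luts * time_s)

def fd_generic(cycles: float, area: float = 2999, f_hz: float = 84e6) -> float:
    # Assumes one operation per clock cycle and no reconfiguration.
    return functional_density(cycles, area, cycles / f_hz)

def fd_specialized(cycles: float, area: float = 1301, f_hz: float = 115e6,
                   reconfig_s: float = 1e-3) -> float:
    # reconfig_s is a hypothetical overhead paid once per parameter change.
    return functional_density(cycles, area, cycles / f_hz + reconfig_s)

for cycles in (1e3, 1e5, 1e7):
    better = fd_specialized(cycles) > fd_generic(cycles)
    print(f"{cycles:>10.0f} cycles between changes: specialization pays off = {better}")
```

With these assumed numbers, specialization loses when parameters change every thousand cycles and wins when they change every ten million cycles; the actual crossover depends on the real reconfiguration cost.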
Outline • What is Parameterized Run-time Reconfiguration? • The importance of the parameter choice • Effects on logic synthesis 22
Original logic synthesis solution (3-input LUT) [Figure: the multiplexer as a network of A (AND) and O (OR) gates over in 0 to in 7 and sel 0 to sel 2, with output out] 23
Making subtrees according to K regular inputs [Figure: the gate network regrouped into subtrees, each with at most K regular inputs (in 0 to in 7) plus parameter inputs (sel 0 to sel 2), with output out] 24
Separate parameters from other inputs [Figure: each regular input in 0 to in 7 feeds one A (AND) gate together with sel; the eight results are combined by O (OR) gates into out] 25
Changing the tree depth [Figure: the same A (AND) and O (OR) gates over in 0 to in 7 and sel, rearranged into a tree of different depth, with output out] 26
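One way to read this slide is that every regular input is gated by a function of the parameters alone, and the gated signals are then ORed together. The sketch below checks that this form is equivalent to the 8-to-1 multiplexer; the function names are hypothetical and the decomposition is an interpretation of the figure.

```python
from itertools import product
from typing import Sequence

def minterm(sel_bits: Sequence[int], i: int) -> int:
    """Parameter-only function: 1 exactly when (sel2, sel1, sel0) encodes i."""
    sel2, sel1, sel0 = sel_bits
    return int((sel2 << 2 | sel1 << 1 | sel0) == i)

def mux8_separated(data: Sequence[int], sel_bits: Sequence[int]) -> int:
    out = 0
    for i, bit in enumerate(data):
        out |= bit & minterm(sel_bits, i)   # one AND per regular input
    return out

# Exhaustive check against direct indexing, data[0] being in 0.
ok = all(
    mux8_separated(data, sel) == data[sel[0] * 4 + sel[1] * 2 + sel[2]]
    for data in product((0, 1), repeat=8)
    for sel in product((0, 1), repeat=3)
)
print(ok)   # True
```

In this form the parameter-only functions can be computed when the parameters change, so only the AND and OR gates on the regular inputs remain in the mapped circuit.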
Conclusions • Parameterized reconfiguration opens up new optimization possibilities using run-time reconfiguration • Parameters are to be treated differently in Technology Mapping • Therefore parameters and regular inputs should be treated differently in logic synthesis • Cost of parameter calculations (Boolean functions) should also be taken into account • New challenge in synthesis 27
Submit to IWLS Paper abstract submission: March 11, 2016 www.iwls.org 28
Last slide • Much of this work was done in the framework of the EU-FP7 project FASTER and is now continued in the EU-H2020 project (FETHPC) EXTRA • Tools at https://github.com/UGent-HES/tlut_flow • Questions? • More information: http://hes.elis.ugent.be/ 29