Fast dynamic and partial reconfiguration Data Path with low Hardware overhead on Xilinx FPGAs
Michael Hübner 1, Diana Göhringer 2, Juanjo Noguera 3, Jürgen Becker 1
1 Karlsruhe Institute of Technology (KIT), Germany
2 Fraunhofer IOSB, Germany
3 Xilinx Inc., Dublin
Institut für Technik der Informationsverarbeitung (ITIV)
KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association
www.kit.edu
Outline
- Introduction and motivation
- Related work
- Concept of Fast Simplex Link (FSL) internal configuration access port (ICAP)
- Realization and results
- Conclusion and future work
4/18/2010, Fast dynamic and partial reconfiguration Data Path with low Hardware overhead on Xilinx FPGAs, Institut für Technik der Informationsverarbeitung (ITIV)
Introduction and motivation
- Dynamic and partial reconfiguration: “parts of a configuration can be substituted while other parts stay operative without any disturbance”
- Exploitation of spatial and temporal partitioning to increase performance and to reduce power consumption
- In a processor-based design (MicroBlaze), the configuration access port is one of the “devices” on the OPB or PLB bus
  → Why is it not a part of the processor’s microarchitecture?
[Figure: FPGA illustrating spatial parallelism and temporal parallelism]
Traditional usage of the ICAP: 1. Dynamic Reconfiguration
- The ICAP was traditionally used for run-time adaptive systems:
  → Loading partial bitstreams from external memory and transferring them to the ICAP
- ICAP data width: 32 bit for Virtex 5 and Virtex 6, 16 bit for Spartan 6
[Figure: FPGA block diagram. MicroBlaze/PowerPC, User-IP interface, UART to PC, flash controller with external flash memory, and HWIcap are attached to the OPB bus; the HWIcap drives the ICAP, which loads a partial module into the reconfigurable region holding Modules A to E]
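The port widths named on this slide determine how a partial bitstream read from external memory has to be cut into words before it is handed to the ICAP. A minimal sketch of that step follows. The function name `pack_icap_words`, the most-significant-byte-first order, and the sample bytes are assumptions for illustration only; a real driver may additionally need device-specific byte or bit reordering.

```python
from typing import List


def pack_icap_words(bitstream: bytes, width_bits: int) -> List[int]:
    """Split a byte stream into words matching the ICAP port width.

    Hypothetical helper: width_bits is 32 (Virtex 5 / Virtex 6) or
    16 (Spartan 6), as listed on the slide.
    """
    if width_bits not in (16, 32):
        raise ValueError("width_bits must be 16 or 32")
    step = width_bits // 8
    if len(bitstream) % step != 0:
        raise ValueError("bitstream length is not a multiple of the word size")
    # Assumption: most significant byte first within each word.
    return [
        int.from_bytes(bitstream[i:i + step], "big")
        for i in range(0, len(bitstream), step)
    ]


# Arbitrary sample bytes standing in for data fetched from external flash.
sample = bytes.fromhex("aa99556620000000")
words32 = pack_icap_words(sample, 32)  # two 32-bit words
words16 = pack_icap_words(sample, 16)  # four 16-bit words
```

The same byte stream yields half as many transfers on a 32-bit port as on a 16-bit port, which is one reason the port width matters for reconfiguration time.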
Traditional usage of the ICAP (cont.): Data transfer through readback and writeback
- The ICAP was used to transfer data from one BRAM to another
  → Reduction of signal line utilization, novel degree of freedom
- Test application: slide show on Virtex 2
  - The VGA core has no connection via signal line to the PPC
  - Example with 7 encapsulated modules
- This example: Sander et al.: “Data Reallocation by Exploiting FPGA Configuration Mechanisms”, RAW 2008, April
- Very nice extension: Shelburne et al.: “MetaWire: Using FPGA Configuration Circuitry to Emulate a Network-on-Chip”, FPL 2008, September
ICAP is more than a configuration port…
The ICAP can be used in different modes:
- Access port to the reconfigurable logic, consuming configuration data
- Access port to the reconfigurable logic, producing configuration data (e.g. readback of configuration data for safety reasons, bit flips etc.)
- Access port to processing elements already configured on the FPGA, consuming (write mode) data to be processed
- Access port to processing elements already configured on the FPGA, producing (read mode) data which were processed
In general, two modes of operation:
1. for hardware reconfiguration purposes
2. for data transfer purposes
[Figure: FPGA block diagram with MicroBlaze/PowerPC, User-IP interface, UART to PC, flash controller with external flash memory, and HWIcap on the OPB bus; the ICAP accesses Modules A to E]
Realization alternatives for processor – ICAP connection
- Several approaches exist where a processor triggers the reconfiguration of an accelerator
- Dual bus version (newest version): efficient usage for MicroBlaze
- Single bus version: enables DMA transfer for bitstreams to the ICAP
[Figure: two FPGA block diagrams. Single bus version (Xilinx Virtex 4): processor (PPC or MicroBlaze), memory controller with external memory (e.g. Flash), other peripheral devices, and Xilinx HW-ICAP on the PLB or OPB, with the HW-ICAP driving the ICAP. Dual bus version: processor (PPC405 or MicroBlaze), memory controller (MPMC) with external memory (e.g. Flash), PLB or XCL links, other peripheral devices, and Xilinx HW-ICAP driving the ICAP]
Numerous previous works, e.g.:
- Blodget et al.: “A Lightweight Approach for Embedded Reconfiguration of FPGAs”, DATE 2003
- Claus et al.: “A multi-platform controller allowing for maximum dynamic partial reconfiguration throughput”, FPL 2008
Novel exploitation possibilities of the ICAP in adaptive microprocessor architectures: The i-Core
Let's assume the ICAP is integrated into the processor pipeline. That would mean:
• Processor commands are reserved for the ICAP:
  Reconfiguration mode:
  • ICAP write config.
  • ICAP read config.
  Data transfer mode:
  • ICAP write process data
  • ICAP read process data
• The ICAP is included directly in the data path of the processor
  → lowest delay for data transfer
  → see the ICAP from “the software point of view” and write simple programs for accessing it
[Figure: processor pipeline with integrated ICAP. Extended version based on the picture used by Prof. Lizy Kurian John, Univ. Austin, Texas]
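The four reserved commands can be pictured with a small behavioral model. This is only a sketch under stated assumptions: `IcapCommand`, `IcapModel`, and the `process` callback are hypothetical names introduced here, and the model ignores timing, addressing, and the actual instruction encoding, none of which are given on the slide.

```python
from collections import deque
from enum import Enum
from typing import Callable, Optional


class IcapCommand(Enum):
    """The four commands listed on the slide (names are hypothetical)."""
    WRITE_CONFIG = 0  # reconfiguration mode: consume configuration data
    READ_CONFIG = 1   # reconfiguration mode: read configuration data back
    WRITE_DATA = 2    # data transfer mode: send data to a configured module
    READ_DATA = 3     # data transfer mode: fetch processed data


class IcapModel:
    """Toy model of an ICAP placed in the processor data path."""

    def __init__(self, process: Callable[[int], int] = lambda w: w) -> None:
        self.config_words = []    # configuration words written so far
        self.readback_index = 0   # next configuration word to read back
        self.results = deque()    # processed data waiting to be read
        self.process = process    # stands in for a configured module

    def execute(self, command: IcapCommand,
                word: Optional[int] = None) -> Optional[int]:
        writes = (IcapCommand.WRITE_CONFIG, IcapCommand.WRITE_DATA)
        if command in writes and word is None:
            raise ValueError("write commands need a data word")
        if command is IcapCommand.WRITE_CONFIG:
            self.config_words.append(word)
            return None
        if command is IcapCommand.READ_CONFIG:
            value = self.config_words[self.readback_index]
            self.readback_index += 1
            return value
        if command is IcapCommand.WRITE_DATA:
            self.results.append(self.process(word))
            return None
        return self.results.popleft()  # READ_DATA


# Usage: the "module" here simply increments each data word.
icap = IcapModel(process=lambda w: w + 1)
icap.execute(IcapCommand.WRITE_CONFIG, 0x1234)
readback = icap.execute(IcapCommand.READ_CONFIG)
icap.execute(IcapCommand.WRITE_DATA, 41)
result = icap.execute(IcapCommand.READ_DATA)
```

Seen from software, all four commands reduce to a word going into or coming out of one port, which is the sense in which the ICAP can be programmed like any other data-path resource.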
Exploitation of the novel concept
- The novel concept increases the flexibility of an FPGA-based processor tremendously
- The ICAP as data sink and source can be seen as a multipurpose ALU
- From the user (programmer) point of view, the hardware complexity is hidden through the provided libraries
- Accessible with standard C constructs
- Further hardware abstraction, which will definitely increase the acceptance of run-time adaptive hardware
[Figure: pipeline with configuration port (ICAP). Legend: IF: Instruction fetch, ID: Instruction decode, EX: Execute, MEM: Memory access, WB: Writeback]
Exploitation of the novel concept (cont.)
The novel concept enables the run-time adaptation of the processor's microarchitecture:
- Instructions realized within the ISA can be reconfigured at run time, which yields a dynamically reconfigurable instruction set processor
- In general, an adaptive microarchitecture is possible:
  - Power and energy reduction via pipeline balancing
  - Using IPC (instructions per cycle) variation to reduce power consumption
  - Dynamic instruction-level parallelism pipeline adaptation
  - Adaptive issue queue for reduced power at high performance
  (See the references in our paper; those works did not use this novel approach.)
- Decentralized processor approach: the ICAP connects cores at any position on the chip
→ Novel quality of processors: the i-Core provides the run-time adaptation of the microarchitecture
An example from a real experiment: adapting the pipeline from 5 to 3 stages reduced power consumption by 90 mW (publication under review: ReCoSoC 2010)
[Figure: pipeline with configuration port (ICAP). Legend: IF: Instruction fetch, ID: Instruction decode, EX: Execute, MEM: Memory access, WB: Writeback]