Reconfigurable Computing
Chapter 5: Partitioning
Prof. Dr.-Ing. Jürgen Teich
Lehrstuhl für Hardware-Software-Co-Design
Partitioning – Motivation
• A design is often too big to be implemented on a single FPGA. Possible solutions are:
  – Spatial partitioning: the design is partitioned across several FPGAs. Each partition block is implemented in a single FPGA, and all FPGAs are used simultaneously.
  – Temporal partitioning: the design is partitioned into blocks, each of which is executed on one FPGA at a given time.
• The first part of the chapter gives a short overview of spatial partitioning. The second part covers temporal partitioning algorithms in detail.
Partitioning – Definitions
• Dataflow graph: a dataflow graph (DFG), also called sequencing graph or task graph, G = (V, E) is a directed acyclic graph in which
  – each node v_i ∈ V represents a task with execution time d_i, and
  – each edge e = (u, v) ∈ E represents a data dependency between the nodes u and v.
• Scheduling and ordering relation: given a DFG G = (V, E) with a precedence relation among the nodes,
  – a schedule is a function s: V → N;
  – a schedule defines, for each node, the time at which the node is executed on the reconfigurable device;
  – a schedule is feasible iff ∀ (u, v) ∈ E: s(u) ≤ s(v).
• Any schedule s induces an ordering relation ≤ defined as follows: u ≤ v ↔ s(u) ≤ s(v)
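The feasibility condition can be checked mechanically. The following is a minimal sketch, assuming the DFG is given as a list of edges and the schedule as a dictionary; the function name `is_feasible` and the example graph are illustrative and not taken from the slides.

```python
def is_feasible(edges, schedule):
    """Return True iff s(u) <= s(v) holds for every edge (u, v) of the DFG."""
    return all(schedule[u] <= schedule[v] for u, v in edges)


# Hypothetical example DFG with nodes a..f (an assumption for illustration).
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "e"), ("d", "f"), ("e", "f")]

good = {"a": 0, "b": 1, "c": 1, "d": 2, "e": 2, "f": 3}
bad = dict(good, d=0)  # d would be executed before its predecessor b

print(is_feasible(edges, good))
print(is_feasible(edges, bad))
```

Nodes with equal schedule values are allowed by the condition; this matches the use of ≤ rather than < in the definition.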
Partitioning – Definitions
• The relation ≤ can be extended to sets as follows: (A ≤ B) ↔ ∀ a ∈ A, b ∈ B: either a is not in relation with b, or a ≤ b.
• Partition: given a DFG G = (V, E) and a set R = {R_1, R_2, …, R_k} of reconfigurable devices, a partition P of G toward R is a division of G into disjoint subsets P_1, P_2, …, P_r such that ∀ P_i ∃ R_j: S(P_i) ≤ S(R_j) ∧ T(P_i) ≤ T(R_j), where S(X) = size of X and T(X) = number of terminals of X.
• Let p_ij = 1 iff P_i is implemented in R_j.
  – A partition is spatial iff ∀ R_j ∈ R: |{P_i ∈ P: p_ij = 1}| = 1.
  – A partition is temporal iff ∃ R_j ∈ R: |{P_i ∈ P: p_ij = 1}| > 1.
• If all devices in R are of the same type, the partition is said to be uniform.
• If |R| = 1, we have a single-device partition.
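The definitions above can be turned into a small classifier. This is a sketch under stated assumptions: blocks and devices are described by hypothetical (size, terminals) pairs, and the mapping p_ij is given as a dictionary from block to device. The function name `classify` and the result "neither" (for the case where some device stays unused) are not from the slides.

```python
from collections import Counter


def classify(blocks, devices, assignment):
    """Classify an assignment of blocks to devices.

    blocks, devices: name -> (size, terminals)
    assignment:      block name -> device name (p_ij = 1)
    """
    # Capacity constraints: S(P_i) <= S(R_j) and T(P_i) <= T(R_j).
    for block, device in assignment.items():
        size, terminals = blocks[block]
        cap_size, cap_terminals = devices[device]
        if size > cap_size or terminals > cap_terminals:
            return "invalid"

    load = Counter(assignment.values())
    counts = [load[d] for d in devices]  # a Counter yields 0 for unused devices
    if any(c > 1 for c in counts):
        return "temporal"  # some device hosts more than one block
    if all(c == 1 for c in counts):
        return "spatial"   # every device hosts exactly one block
    return "neither"       # assumption: unused devices match neither definition


blocks = {"P1": (80, 30), "P2": (60, 20), "P3": (90, 35)}

two_devices = {"R1": (100, 40), "R2": (100, 40)}
print(classify(blocks, two_devices, {"P1": "R1", "P2": "R2"}))

one_device = {"R1": (100, 40)}
print(classify(blocks, one_device, {"P1": "R1", "P2": "R1", "P3": "R1"}))
```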
Spatial partitioning
Spatial partitioning – Problem
• Partitioning constraints: each FPGA is characterized by
  – its size, i.e., the number of LUTs and FFs available, and
  – its terminals, i.e., the number of I/O pins available on the device.
• A partition is valid iff every block B produced by the partition satisfies
  – S(B) ≤ S(device), where S(X) = size of X, and
  – T(B) ≤ T(device), where T(X) = number of terminals of X.
[Figure: example graph with nodes a to f]
Spatial partitioning – Problem
• Objectives: the following objectives are possible:
  – Minimize the number of cut nets
  – Minimize the number of produced blocks
  – Minimize the delay
• The problem is difficult because the constraints are not always compatible with one another.
• Solution approaches:
  – Heuristics for automatic partitioning
  – Manual intervention
[Figure: example graph with nodes a to f]
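To make the first objective concrete, the sketch below counts cut nets for a given assignment of nodes to blocks. The netlist format, the node names, and the model of T(B) as the number of nets crossing the boundary of a block (primary I/O is ignored) are assumptions for illustration; the slides do not prescribe them.

```python
def cut_nets(nets, block_of):
    """Return the names of all nets whose pins lie in more than one block."""
    return [name for name, pins in nets.items()
            if len({block_of[p] for p in pins}) > 1]


def terminals(nets, block_of, block):
    """Count nets crossing the boundary of `block` (assumed model of T(B))."""
    count = 0
    for pins in nets.values():
        owners = {block_of[p] for p in pins}
        if block in owners and len(owners) > 1:
            count += 1
    return count


# Hypothetical netlist over nodes a..f: net name -> connected nodes.
nets = {
    "n1": ["a", "b"], "n2": ["a", "c"], "n3": ["b", "d"],
    "n4": ["c", "e"], "n5": ["d", "f"], "n6": ["e", "f"],
}
# Candidate partition into two blocks, 0 and 1.
block_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

print(cut_nets(nets, block_of))
print(terminals(nets, block_of, 0))
```

A heuristic partitioner would evaluate such a cost for many candidate assignments and keep the best one that still satisfies the size and terminal constraints.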
Spatial partitioning – Approaches – Hierarchical
• Motivation:
  – Reduce the problem complexity
  – Keep the global view during partitioning
  – Improve the final result in terms of the number of devices
  – Respecting the design hierarchy facilitates debugging
  – Performance improvement
• Approach:
  – Apply a clustering algorithm to a flat netlist (creates the red envelopes in the figure)
  – Flatten the hierarchy except for the created (red) clusters
  – Partition this flat netlist (reduced problem size)
[Figure: Hierarchical spatial partitioning]
Spatial partitioning – Approaches – Hierarchical
• Removing all non-valid blocks may produce a large amount of glue logic in the final problem.
• Some non-valid blocks may be partitioned separately by applying a divide-and-conquer strategy.
• ST-quality is used to determine how good a partition block is. It is defined as the ratio size/terminals: ST = S/T (S = size, T = terminals).
• Poor ST-quality: blocks having many connections with other hierarchy blocks (small size, large number of I/O pins)
  – Removing the hierarchy is preferable
[Figures: "Flattening the hierarchy"; "Partitioning"]
Spatial partitioning – Approaches – Hierarchical
• Good ST-quality: blocks having few connections with other hierarchy blocks (big size, small number of I/O pins)
  – Splitting is preferable
• Average ST-quality: calculated recursively in a bottom-up fashion (for a global view).
• Device ST-quality: ST(D).
• Device filling is good when the ST-quality of the assigned block is greater than or equal to the device ST-quality.

Blocks                                          | ST < ST(D) | ST ≥ ST(D) and ST ≥ average ST | ST ≥ ST(D) and ST < average ST
Leaf block                                      | Remove     | Split                          | Split
Non-leaf block with big amount of glue logic    | Remove     | Split                          | Split
Non-leaf block with small amount of glue logic  | Remove     | Split                          | Remove

[Figure: Splitting]
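The decision table on this slide can be read as a small lookup function. The sketch below is one possible encoding; the function names, the boolean flags, and the numeric values in the example are assumptions, and the average ST-quality is taken as an input rather than computed bottom-up.

```python
def st_quality(size, terminals):
    """ST = S / T, the ratio of block size to number of terminals."""
    return size / terminals


def decide(st, st_device, st_average, is_leaf, big_glue_logic):
    """Return "remove" or "split" for a hierarchy block, following the table."""
    if st < st_device:
        return "remove"   # first column: all block types
    if st >= st_average:
        return "split"    # second column: all block types
    # Third column: ST >= ST(D) and ST < average ST.
    if not is_leaf and not big_glue_logic:
        return "remove"   # non-leaf block with small amount of glue logic
    return "split"        # leaf block, or non-leaf with big glue logic


# Hypothetical numbers: device with 1000 LUTs and 250 pins, so ST(D) = 4.0.
st_d = st_quality(1000, 250)
print(decide(st_quality(100, 50), st_d, 4.5, is_leaf=True, big_glue_logic=False))
print(decide(4.2, st_d, 4.5, is_leaf=False, big_glue_logic=False))
print(decide(4.2, st_d, 4.5, is_leaf=False, big_glue_logic=True))
```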
Spatial partitioning – User intervention
• Fully automatic partitioning never satisfies designers.
• User intervention may lead to more efficient results.
• A mixture of manual and automatic strategies is therefore common.
• User intervention:
  – Assignment of hierarchy blocks to devices
  – Hierarchy modification
  – Manual guidance of the automatic partitioning
  – Invoking automatic partitioning on selected blocks (splitting)
[Figures: hierarchy trees (Top; A, B, C; D to H) illustrating "Pre-assignment of blocks to FPGAs" (FPGA 1, 2, 3) and "Flattening"]
Spatial partitioning – User intervention
[Figures: hierarchy trees (Top; A, B, C; D to H) illustrating "Ungrouping" and "Splitting"]
Spatial partitioning – Timing – Block replication
[Figure: two panels, "Critical path optimization" and "Reducing the number of I/O pins". Each shows blocks B1, B2, and B3 with annotated delays (10 ns, 20 ns, 30 ns; 70 ns and 50 ns).]
Temporal partitioning
Temporal partitioning – Problem definition
• Temporal partitioning:
  – We consider a single-device temporal partitioning of a DFG G = (V, E) for a device R.
  – A temporal partition can also be defined as an ordered partition of G with the constraints imposed by R.
  – With the ordering relation imposed on the partition, we reduce the solution space to only those partitions that can be scheduled on the device for execution.
  – Therefore, cycles are not allowed in the dataflow graph. Otherwise, the resulting partition may not be schedulable on the device.
[Figure: example graph with nodes a to f; one case is marked "cycle"]
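Whether a candidate partition is ordered, and therefore schedulable on the device, can be tested by looking for cyclic dependencies between its blocks. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the function name `block_order` and the example data are assumptions for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def block_order(edges, block_of):
    """Return an execution order of the blocks, or None if the blocks
    depend on each other cyclically (partition not schedulable)."""
    deps = {b: set() for b in set(block_of.values())}
    for u, v in edges:
        bu, bv = block_of[u], block_of[v]
        if bu != bv:
            deps[bv].add(bu)  # block bv needs results produced in block bu
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


# Hypothetical DFG with nodes a..f.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "e"), ("d", "f"), ("e", "f")]

ordered = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
cyclic = {"a": 1, "b": 2, "c": 1, "d": 1, "e": 2, "f": 2}  # a -> b -> d: 1 -> 2 -> 1

print(block_order(edges, ordered))
print(block_order(edges, cyclic))
```

Note that the DFG itself is acyclic in both cases; the cycle in the second case arises only between the blocks.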
Temporal partitioning – Problem
• Goal:
  – Computation and scheduling of a configuration graph
• In a configuration graph,
  – nodes are partitions or bitstreams, and
  – edges reflect the precedence in a given DFG.
• The partition blocks communicate by means of inter-configuration registers, usually mapped into the processor address space.
• The configuration sequence is controlled by a host processor.
• On configuration, save the register values. This requires a given amount of memory.
• After reconfiguration, copy the values back.
[Figures: "A configuration graph" with partitions P1 to P5 and inter-configuration registers; "FPGA register mapping into the address spaces of the processor", showing a processor and an FPGA block with IO registers connected by a bus]
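A configuration graph can be derived from a DFG and a block assignment. The sketch below is a simplified model, not the method of the slides: it assumes one inter-configuration register per node whose result is consumed in a different partition, and the names `configuration_graph`, `P1`, `P2`, and `P3` are hypothetical.

```python
def configuration_graph(edges, block_of):
    """Return (config_edges, registers) for a partitioned DFG.

    config_edges: precedence edges between partitions
    registers:    nodes whose values must survive a reconfiguration
    """
    config_edges = set()
    registers = set()
    for u, v in edges:
        if block_of[u] != block_of[v]:
            config_edges.add((block_of[u], block_of[v]))
            registers.add(u)  # value of u is read after reconfiguration
    return config_edges, registers


# Hypothetical DFG with nodes a..f, split into three partitions.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "e"), ("d", "f"), ("e", "f")]
block_of = {"a": "P1", "b": "P1", "c": "P1", "d": "P2", "e": "P2", "f": "P3"}

config_edges, registers = configuration_graph(edges, block_of)
print(sorted(config_edges))
print(sorted(registers))
```

The size of `registers` gives a rough estimate of the memory the host processor needs for saving values between configurations.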