Uniform sub-array accesses (patterns)
  Tiler
    o: origin of the reference pattern
    F: fitting matrix, the shape of the tiles in the array
    P: paving matrix, the uniform spacing of the tiles
  Formal specification: $o + \begin{pmatrix} P & F \end{pmatrix} \cdot \begin{pmatrix} r \\ i \end{pmatrix} \bmod s_{array}$, with r the repetition index and i the index inside the pattern
PhD defense – Calin Glitia Modeling intensive signal processing applications
Common motif-based accesses
  Pattern examples
    [Figure: example patterns drawn on two-dimensional arrays]
  Paving example
    $F = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $s_{pattern} = \begin{pmatrix} 5 \\ 3 \end{pmatrix}$
    $o = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $s_{array} = \begin{pmatrix} 10 \\ 5 \end{pmatrix}$
    $P = \begin{pmatrix} 0 & 3 \\ 1 & 0 \end{pmatrix}$, $s_{repetition} = \begin{pmatrix} 3 \\ 3 \end{pmatrix}$
    [Figure: the tile of each repetition, from $r = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ to $r = \begin{pmatrix} 2 \\ 2 \end{pmatrix}$, drawn on the 10 × 5 array]
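The tiler formula can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the thesis: the function name `tile_elements` is hypothetical, and the numeric values are the ones read from the paving example (the matrices are assumed to be F = identity, P = ((0, 3), (1, 0))). For a repetition index r, every pattern index i is mapped to the array element (o + P·r + F·i) mod s_array.

```python
from itertools import product

def tile_elements(r, o, P, F, s_pattern, s_array):
    """Map each pattern index i to its array index for repetition r.

    Implements (o + P.r + F.i) mod s_array, one array dimension at a time.
    """
    tile = {}
    for i in product(*(range(n) for n in s_pattern)):
        element = []
        for k in range(len(s_array)):
            ref = o[k] + sum(P[k][j] * r[j] for j in range(len(r)))  # reference element
            off = sum(F[k][j] * i[j] for j in range(len(i)))         # offset inside the tile
            element.append((ref + off) % s_array[k])                 # wrap around the array
        tile[i] = tuple(element)
    return tile

# Values assumed from the paving example.
F = [[1, 0], [0, 1]]
P = [[0, 3], [1, 0]]
o = [0, 0]
s_pattern = (5, 3)
s_array = (10, 5)

tile = tile_elements((1, 2), o, P, F, s_pattern, s_array)
```

With these values the reference element of repetition r = (1, 2) is (6, 1), and the last column of the tile wraps around to column 0 of the array because of the modulo.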
Summary
  Array-OL specification
    Data-flow oriented visual formalism
    Express the regularity of computations and data accesses
    Exploit the parallelism
    Rules that allow static analysis
  Limitations
    Numerical values for the multidimensional spaces and accesses
    Cycles not allowed in the dependence graph
  Extension: inter-repetition dependences
Need for uniform dependences
  State construction
    Transfer data between different instances of the same repetition.
    Examples: Sum, Integrate
  Integrate (flattened)
    [Figure: the flattened repetition as a chain of additions +[0], +[1], +[2], ..., fed with the value 0]
  Uniform data dependences between instances of a repetition
Uniform dependences
  Integrate
    [Figure: repeated addition with ports p_in and p_out, arrays of shape (∞), and a default link carrying the scalar value 0]
  Inter-repetition dependence
    1. Data dependence: p_out → p_in
    2. Dependence vector inside the repetition space: $d = (1)$
    3. Initial values: default link for dependences that exit the repetition space
  Calin Glitia, Philippe Dumont, and Pierre Boulet. Array-OL with delays, a domain specific specification language for multidimensional intensive signal processing. Multidimensional Systems and Signal Processing, 2009.
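The three ingredients of the Integrate example (data dependence p_out → p_in, dependence vector d = (1), default value 0) can be simulated with a short Python sketch. It makes two assumptions that are not on the slide: the infinite repetition is cut to a finite input list, and instance r is taken to read the p_out of instance r - d. The helper name `run_with_dependence` is hypothetical.

```python
def run_with_dependence(task, inputs, d, default):
    """Run a 1-D repetition whose instance r reads p_in from p_out of instance r - d.

    When r - d falls outside the repetition space, the default value is used.
    """
    outputs = []
    for r, x in enumerate(inputs):
        src = r - d
        p_in = outputs[src] if src >= 0 else default  # default link for exiting dependences
        outputs.append(task(x, p_in))
    return outputs

# Integrate: each instance adds its input to the result of the previous instance.
result = run_with_dependence(lambda x, acc: x + acc, [3, 1, 4, 1, 5], d=1, default=0)
```

Here `result` is the running sum of the inputs, which is the state construction the slide calls Integrate.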
Initial values
  Initial values: default link
    [Figure: repetition space with dependence vector $d = (2, 0)$]
    Same initial value
    Different values: tiler
    Different default links: exclusive tilers
    [Figure: the default-link tilers, each given by its F, o and P]
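Which repetitions actually need an initial value follows directly from the dependence vector. The sketch below assumes that instance r reads from instance r - d, and uses a hypothetical 4 × 3 repetition space together with the vector d = (2, 0) shown above; the function name `needs_default` is also hypothetical.

```python
from itertools import product

def needs_default(shape, d):
    """Return the repetition indices r whose source r - d lies outside the repetition space."""
    border = []
    for r in product(*(range(n) for n in shape)):
        src = tuple(ri - di for ri, di in zip(r, d))
        if not all(0 <= s < n for s, n in zip(src, shape)):
            border.append(r)  # this instance takes its value from a default link
    return border

border = needs_default((4, 3), (2, 0))
```

With d = (2, 0), the instances whose first index is 0 or 1 are the ones fed by the default link; a tiler on the default array can then give each of them a different value.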
Complex dependences
  Dependence constructions:
    Multiple default links
    Multiple dependences on a repetition space
    Dependences connected through the hierarchy
    Dependences on the complete repetition space
Impact on the parallelism
  The repetition space is split into parallel hyperplanes
  Pipeline execution following the distance vector
  Scheduling uniform loops
    Alain Darte and Yves Robert. Constructive methods for scheduling uniform loop nests. IEEE Trans. Parallel Distributed Systems, 5(8):814–822, 1994.
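The hyperplane idea can be illustrated with a simple linear schedule. This is a minimal sketch and not the constructive method of the cited paper: the schedule vector `pi` is supplied by hand, the function name `hyperplane_schedule` is hypothetical, and instance r is assumed to depend on instance r - d. Any vector pi with pi·d ≥ 1 for every dependence d places the source of a dependence on an earlier hyperplane, so all the instances of one hyperplane can run in parallel.

```python
from collections import defaultdict
from itertools import product

def hyperplane_schedule(shape, deps, pi):
    """Group the repetition instances by time step t = pi . r.

    Raises ValueError if pi does not respect a dependence (pi . d must be >= 1).
    """
    for d in deps:
        if sum(p * x for p, x in zip(pi, d)) < 1:
            raise ValueError(f"schedule vector {pi} violates dependence {d}")
    steps = defaultdict(list)
    for r in product(*(range(n) for n in shape)):
        steps[sum(p * x for p, x in zip(pi, r))].append(r)
    return [steps[t] for t in sorted(steps)]

# Hypothetical 3 x 3 repetition space with two uniform dependences.
waves = hyperplane_schedule((3, 3), [(1, 0), (0, 1)], (1, 1))
```

Each entry of `waves` is one hyperplane; the hyperplanes execute one after the other, which gives the pipeline behaviour along the distance vector.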
Outline
  Modeling and Analysis of Real-Time Embedded Systems
  Array Oriented Language
  Application
  Architecture
  Inter-repetition dependence
  Mapping
  Code Generation
Modeling and Analysis of Real-Time Embedded Systems (MARTE)
  UML profile, OMG standard
  Model Driven Engineering
  Co-design: application, architecture, mapping
  Repetitive Structure Modeling
    All the Array-OL concepts are included
    Proposed by the DaRT team
Model repeated inter-connected architecture topologies
  Physical connections between architecture components
  Compact expression
  Cyclic uniform inter-connections
Summary: inter-repetition dependences
  1. Expression of state constructions
  2. Complex dependences through the hierarchy
  3. Parallelism: pipeline
  4. Repeated inter-connected architectures
Outline
  Modeling and Analysis of Real-Time Embedded Systems
  Array Oriented Language
  Application
  Architecture
  Inter-repetition dependence
  Mapping
  High-level refactoring: data-parallel transformations, strategies
  Code Generation
PhD defense – Calin Glitia From a high-level specification to the execution
Execution
  Logical space and time as mixed dimensions of multidimensional structures
  Specification: expresses the data dependences between all the data elements that transit the system, and a partial execution order between all the executions of the tasks of the system
  Efficient execution
    Optimized code generation
    Projection of the specification into physical space and time
  Adapt a specification to the execution
    High-level refactoring
    Execution that reflects the specification
Projection into space and time
  Multi-dimensional structures: repetition spaces, data structures
  Projected in space and in time; the two projections are linked (trade-off)
  Take into account the execution constraints
    Data dependences
    Available resources
Projection example
  [Figure: input array (1920, 1080, ∞); Horizontal Filter, repetition (240, 1080, ∞), patterns (13) in and (3) out; intermediate array (720, 1080, ∞); Vertical Filter, repetition (720, 120, ∞), patterns (14) in and (4) out; output array (720, 480, ∞)]
  Maximal parallelism
    Memory size
    Infinite data structures: blocking points
  Pipeline
    Execution order
  [Figure: after fusion, a common repetition (240, 120, ∞) with macro-patterns (14, 13), (3, 14) and (3, 4); inside it, Horizontal Filter with repetition (14) and Vertical Filter with repetition (3)]
  Fusion of successive repetitions
    Minimize the arrays: macro-patterns
    Distribution of the common repetition
    Each processor keeps its macro-patterns in memory
  Re-computations
    When intermediate values are consumed by multiple repetitions
    Trade-off
      Recompute values
      Keep in memory: increase of memory size
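The trade-off between memory size and re-computations can be checked with the shapes shown in the projection example. The arithmetic below is a sketch under one reading of the figure, which is an assumption: before fusion the Horizontal Filter runs 240 × 1080 times per frame and writes a 720 × 1080 intermediate array; after fusion each of the 240 × 120 common repetitions runs the Horizontal Filter 14 times and keeps only a (3, 14) macro-pattern. The infinite time dimension is left out, so all counts are per frame.

```python
from fractions import Fraction

# Per-frame repetition counts before fusion (assumed reading of the figure).
h_runs_before = 240 * 1080          # Horizontal Filter instances
v_runs_before = 720 * 120           # Vertical Filter instances
intermediate_before = 720 * 1080    # elements of the intermediate array

# After fusion: common repetition (240, 120), inner repetitions (14) and (3).
common = 240 * 120
h_runs_after = common * 14
v_runs_after = common * 3
intermediate_after = 3 * 14         # one macro-pattern held per common repetition

h_recompute = Fraction(h_runs_after, h_runs_before)   # extra Horizontal Filter work
v_recompute = Fraction(v_runs_after, v_runs_before)   # Vertical Filter work is unchanged
memory_reduction = Fraction(intermediate_before, intermediate_after)
```

Under these assumptions the Horizontal Filter is executed 14/9 times as often after the fusion, because consecutive vertical patterns of 14 lines overlap while the paving advances by 9 lines, and in exchange the intermediate storage shrinks from a full array to a 3 × 14 macro-pattern.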
High-level transformations
  Adapt a specification to the execution
    Change the granularity of the repetitions
    Reduce the array sizes
  "High-level" loop transformations
    Repetition = visual representation of a data-parallel loop nest
    Fusion, change paving, tiling, collapse, ...
  Calin Glitia and Pierre Boulet. High level loop transformations for multidimensional signal processing embedded applications. In International Symposium on Systems, Architectures, Modeling, and Simulation (SAMOS VIII), Samos, Greece, July 2008.
Optimization strategies: memory size reduction
  [Figure: repetitions r1 to r7; in the later builds, fused repetitions r12, r67 and r14]
  Maximal reduction of the intermediate arrays
  Fusion of multiple repetitions
    Minimizes only the last intermediate array
    Re-computations!
  Complete fusion?
    Too many re-computations
    Limited array reduction
  Strategy that limits the re-computations, using the results of the complete fusion and of the two-by-two fusions to see where re-computations are introduced and what the minimal achievable array reduction is
  [Table: repetitions before fusion, repetitions after fusion, re-computations (product), reduction factor of the output arrays; legible entries include a re-computation factor of 9.29 and reduction factors of 1228.8 and 12288]
  Calin Glitia, Pierre Boulet, Éric Lenormand, and Michel Barreteau. Repetitive model refactoring strategy for the design space exploration of intensive signal processing applications. Journal of Systems Architecture, Special Issue: Hardware/Software CoDesign.
And the inter-repetition dependences?
  Why? To allow the use of the refactoring tools on models with uniform dependences
  Typically: [Figure: a model with a uniform dependence ⇒ the refactored model]
  Algorithm
    The global accesses and dependences MUST remain unchanged
    Automatically compute new dependences after a transformation
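The invariant "the global dependences must remain unchanged" can be illustrated on a deliberately small case. The sketch below is not the algorithm of the thesis: it assumes a one-dimensional repetition of n instances with dependence vector d = (1), splits it by hand into n // b blocks of b instances, and checks that the dependences of the blocked version, mapped back to flat indices, are exactly the original ones. Both function names are hypothetical, and n is assumed to be a multiple of b.

```python
def flat_dependences(n):
    """(source, target) pairs of a 1-D repetition of size n with dependence vector (1)."""
    return {(r - 1, r) for r in range(1, n)}

def blocked_dependences(n, b):
    """Dependences of the same repetition split into n // b blocks of b instances.

    Inside a block, instance (q, k) reads from (q, k - 1): vector (0, 1).
    The first instance of a block reads from the last instance of the
    previous block: vector (1, -(b - 1)). Instance (0, 0) uses the default link.
    """
    pairs = set()
    for q in range(n // b):
        for k in range(b):
            if k >= 1:
                src = (q, k - 1)
            elif q >= 1:
                src = (q - 1, b - 1)
            else:
                continue  # served by the default link, no dependence pair
            pairs.add((src[0] * b + src[1], q * b + k))  # map back to flat indices
    return pairs

unchanged = flat_dependences(12) == blocked_dependences(12, 4)
```

The point of the check is that the transformation changes how the dependence is expressed (two vectors on a two-dimensional repetition space instead of one vector) while the set of instance-to-instance dependences stays the same.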