Optimizations for intensive signal processing applications on Systems-on-Chip. Calin Glitia, PhD defense, September 6, 2010.

Intensive signal processing: detection systems, multimedia.


  1. Uniform sub-array accesses (patterns). Tiler: o, origin of the reference pattern; F, fitting matrix (the shape of the tiles in the array). PhD defense – Calin Glitia, Modeling intensive signal processing applications, 11/34

  2. Uniform sub-array accesses (patterns). Tiler: o, origin of the reference pattern; F, fitting matrix (the shape of the tiles in the array); P, paving matrix (uniform spacing of the tiles). Formal specification: $o + \begin{pmatrix} P & F \end{pmatrix} \cdot \begin{pmatrix} r \\ i \end{pmatrix} \bmod s_{\text{array}}$, where r is the repetition index and i the index inside the pattern.
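
The formal specification above can be sketched in a few lines of Python. This is a minimal illustration, not code from the thesis: the function name `tile_elements`, the row-list encoding of P and F, and the numeric values in the example are all assumptions chosen for readability.

```python
from itertools import product


def tile_elements(o, P, F, s_array, s_pattern, r):
    """Array coordinates of the tile selected by repetition index r.

    Evaluates o + (P F) . (r, i) mod s_array for every pattern index i.
    P and F are given as lists of rows, one row per array dimension.
    """
    elements = {}
    for i in product(*(range(n) for n in s_pattern)):
        coord = tuple(
            (o[k]
             + sum(P[k][j] * r[j] for j in range(len(r)))    # paving part
             + sum(F[k][j] * i[j] for j in range(len(i))))   # fitting part
            % s_array[k]                                      # cyclic array
            for k in range(len(s_array))
        )
        elements[i] = coord
    return elements


# Hypothetical values (not the ones shown on the slides):
# 2 x 3 tiles paving a 6 x 6 array without overlap.
P = [[2, 0], [0, 3]]
F = [[1, 0], [0, 1]]
tile = tile_elements((0, 0), P, F, (6, 6), (2, 3), r=(1, 1))
```

With these values the reference element of the tile is `(2, 3)`, and the modulo only matters when a tile crosses the array border, for example with an origin of `(5, 0)`.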

  4. Common motif-based accesses. [Figure: pattern examples.]

  5. Common motif-based accesses. [Figure: pattern examples.] [Figure: paving example showing the tiles for r = (0, 0) through r = (2, 2), annotated with the tiler parameters F, s_pattern, o, s_array, P and s_repetition.]

  6. Summary. Array-OL specification: data-flow oriented visual formalism; express the regularity of computations and data accesses; exploit the parallelism.

  7. Summary. Array-OL specification: data-flow oriented visual formalism; express the regularity of computations and data accesses; exploit the parallelism; rules that allow static analysis.

  8. Summary. Array-OL specification: data-flow oriented visual formalism; express the regularity of computations and data accesses; exploit the parallelism; rules that allow static analysis. Limitations: numerical values for the multidimensional spaces and accesses.

  9. Summary. Array-OL specification: data-flow oriented visual formalism; express the regularity of computations and data accesses; exploit the parallelism; rules that allow static analysis. Limitations: numerical values for the multidimensional spaces and accesses; cycles not allowed in the dependence graph.

  10. Summary. Array-OL specification: data-flow oriented visual formalism; express the regularity of computations and data accesses; exploit the parallelism; rules that allow static analysis. Limitations: numerical values for the multidimensional spaces and accesses; cycles not allowed in the dependence graph. Extension: inter-repetition dependences.

  11. Need for uniform dependences. State construction: transfer data between different instances of the same repetition.

  12. Need for uniform dependences. State construction: transfer data between different instances of the same repetition. Examples: Sum, Integrate.

  13. Need for uniform dependences. State construction: transfer data between different instances of the same repetition. Examples: Sum, Integrate. [Figure: Integrate, flattened into a chain of additions ... +[0], +[1], +[2] ..., with a 0 input.]

  14. Need for uniform dependences. State construction: transfer data between different instances of the same repetition. Examples: Sum, Integrate. [Figure: Integrate, flattened into a chain of additions ... +[0], +[1], +[2] ..., with a 0 input.] Uniform data dependences between instances of a repetition.

  15. Uniform dependences. Integrate: inter-repetition dependence. [Figure: Integrate as a repetition (∞) of +, with ports p_in and p_out and a default link.]

  16. Uniform dependences. Integrate: inter-repetition dependence. (1) Data dependence: p_out → p_in. [Figure: Integrate as a repetition (∞) of +, with ports p_in and p_out and a default link.]

  17. Uniform dependences. Integrate: inter-repetition dependence. (1) Data dependence: p_out → p_in. (2) Dependence vector inside the repetition space: d = (1). [Figure: Integrate as a repetition (∞) of +, with ports p_in and p_out and a default link.]

  18. Uniform dependences. Integrate: inter-repetition dependence. (1) Data dependence: p_out → p_in. (2) Dependence vector inside the repetition space. (3) Initial values: default link for dependences that exit the repetition space. [Figure: Integrate as a repetition (∞) of +, with ports p_in and p_out and a default link.]

  19. Uniform dependences. Integrate: inter-repetition dependence. (1) Data dependence: p_out → p_in. (2) Dependence vector inside the repetition space: d = (1). (3) Initial values: default link for dependences that exit the repetition space. Reference: Calin Glitia, Philippe Dumont, and Pierre Boulet. Array-OL with delays, a domain specific specification language for multidimensional intensive signal processing. Multidimensional Systems and Signal Processing, 2009.
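
The three ingredients listed above (data dependence, dependence vector, default link) can be illustrated with a small one-dimensional sketch. It assumes that instance r reads the output produced by instance r - d, and that the default link supplies a constant; the function name `run_repetition` and its parameters are hypothetical and only serve the illustration.

```python
def run_repetition(task, inputs, d=1, default=0):
    """Run a 1-D repetition with an inter-repetition dependence of distance d.

    Instance r receives on p_in the p_out value of instance r - d. When
    r - d falls outside the repetition space, the default value is used.
    """
    if d < 1:
        raise ValueError("the dependence distance must be at least 1")
    outputs = []
    for r, x in enumerate(inputs):
        source = r - d
        p_in = outputs[source] if source >= 0 else default
        outputs.append(task(p_in, x))
    return outputs


def add(state, x):
    return state + x


# Integrate: a running sum, with d = (1) and 0 as the initial value.
running_sum = run_repetition(add, [1, 2, 3, 4])            # [1, 3, 6, 10]
# With d = (2), the even and odd instances form two independent chains.
interleaved = run_repetition(add, [1, 2, 3, 4, 5], d=2)    # [1, 2, 4, 6, 9]
```

The second call shows why the dependence vector matters for parallelism: with a distance of 2, two instances can always run at the same time.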

  20. Initial values. Initial values – default link. d = (2, 0).

  21. Initial values. Initial values – default link: same initial value.

  22. Initial values. Initial values – default link: same initial value; different values – tiler. [Figure: default link through a tiler with parameters F, o and P.]

  23. Initial values. Initial values – default link: same initial value; different values – tiler; different default links – exclusive tilers. [Figure: two default links through exclusive tilers, each with its own F, o and P.]
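
To make the role of the default link concrete, the sketch below lists which instances of a two-dimensional repetition need an initial value for the dependence vector d = (2, 0). The repetition shape (4, 3) and the helper name `dependence_sources` are assumptions made for this illustration, as is the convention that instance r depends on instance r - d.

```python
from itertools import product


def dependence_sources(shape, d):
    """Map each repetition index r to r - d, or to None when r - d
    leaves the repetition space (the default link is used instead)."""
    sources = {}
    for r in product(*(range(n) for n in shape)):
        src = tuple(rk - dk for rk, dk in zip(r, d))
        inside = all(0 <= s < n for s, n in zip(src, shape))
        sources[r] = src if inside else None
    return sources


# Hypothetical 4 x 3 repetition space with the dependence vector d = (2, 0).
sources = dependence_sources(shape=(4, 3), d=(2, 0))
needs_default = sorted(r for r, src in sources.items() if src is None)
```

Here the six instances whose first coordinate is 0 or 1 take their input from the default link. Whether they all receive the same value or different ones is what the tiler on the default link decides.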

  25. Complex dependences. Dependence constructions: multiple default links.

  26. Complex dependences. Dependence constructions: multiple default links; multiple dependences on a repetition space.

  27. Complex dependences. Dependence constructions: multiple default links; multiple dependences on a repetition space; dependences connected through the hierarchy; dependences on the complete repetition space.

  28. Impact on the parallelism.

  29. Impact on the parallelism. The repetition space is split into parallel hyperplanes.

  30. Impact on the parallelism. The repetition space is split into parallel hyperplanes. Pipeline execution following the distance vector.

  31. Impact on the parallelism. The repetition space is split into parallel hyperplanes. Pipeline execution following the distance vector. Scheduling uniform loops: Alain Darte and Yves Robert. Constructive methods for scheduling uniform loop nests. IEEE Trans. Parallel Distributed Systems, 5(8):814–822, 1994.
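
The split into parallel hyperplanes can be illustrated with a simple linear schedule: every instance r runs at step tau . r, and all instances sharing a step are independent. This is a hand-written sketch under stated assumptions, not the constructive method of Darte and Robert: the scheduling vector `tau` is chosen by hand, and the shape and dependence vectors are hypothetical.

```python
from collections import defaultdict
from itertools import product


def wavefronts(shape, deps, tau):
    """Group the instances of a repetition space by their time step tau . r.

    tau is valid when tau . d >= 1 for every dependence vector d, so that
    a producer always runs at an earlier step than its consumer.
    """
    for d in deps:
        if sum(t * x for t, x in zip(tau, d)) < 1:
            raise ValueError(f"tau does not respect dependence {d}")
    steps = defaultdict(list)
    for r in product(*(range(n) for n in shape)):
        steps[sum(t * x for t, x in zip(tau, r))].append(r)
    return [steps[t] for t in sorted(steps)]


# Hypothetical 3 x 3 repetition space with two uniform dependences.
fronts = wavefronts(shape=(3, 3), deps=[(1, 0), (0, 1)], tau=(1, 1))
sizes = [len(front) for front in fronts]   # [1, 2, 3, 2, 1]
```

The nine instances execute in five steps instead of nine, and the width of each front is the parallelism available at that step.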

  32. Outline. Array Oriented Language; Application; Architecture; Inter-repetition dependence; Mapping; Code generation.

  33. Outline. Modeling and Analysis of Real-Time Embedded Systems; Array Oriented Language; Application; Architecture; Inter-repetition dependence; Mapping; Code generation.

  34. Modeling and Analysis of Real-Time Embedded Systems. UML profile, OMG standard. Model Driven Engineering. Co-design: application, architecture, mapping. Repetitive Structure Modeling: all the Array-OL concepts are included; proposed by the DaRT team.

  35. Model repeated inter-connected architecture topologies. Physical connections between architecture components. Compact expression.

  36. Model repeated inter-connected architecture topologies. Physical connections between architecture components. Compact expression. Cyclic uniform inter-connections.
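
A compact description of a repeated topology can be expanded into explicit links, as in the sketch below. It assumes that a cyclic uniform inter-connection means that every component p is connected to p + d, wrapping modulo the repetition shape; the helper name `cyclic_links` and the example shapes are hypothetical.

```python
from itertools import product


def cyclic_links(shape, d):
    """Expand a uniform connection vector d into explicit directed links.

    Each component p is linked to (p + d) mod shape, so components on the
    border connect back to the opposite side.
    """
    links = []
    for p in product(*(range(n) for n in shape)):
        q = tuple((pk + dk) % n for pk, dk, n in zip(p, d, shape))
        links.append((p, q))
    return links


ring = cyclic_links(shape=(4,), d=(1,))          # ring of 4 processors
torus_rows = cyclic_links(shape=(2, 3), d=(0, 1))  # row links of a 2 x 3 torus
```

A single vector plus a shape is enough to describe the ring; a torus needs one such vector per direction.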

  37. Summary: inter-repetition dependences. (1) Expression of state constructions.

  38. Summary: inter-repetition dependences. (1) Expression of state constructions. (2) Complex dependences through the hierarchy.

  39. Summary: inter-repetition dependences. (1) Expression of state constructions. (2) Complex dependences through the hierarchy. (3) Parallelism – pipeline.

  40. Summary: inter-repetition dependences. (1) Expression of state constructions. (2) Complex dependences through the hierarchy. (3) Parallelism – pipeline. (4) Repeated inter-connected architectures.

  41. Outline. Array Oriented Language; Application; Architecture; Inter-repetition dependence; Mapping; Code generation. PhD defense – Calin Glitia, From a high-level specification to the execution, 23/34

  42. Outline. Modeling and Analysis of Real-Time Embedded Systems; Array Oriented Language; Application; Architecture; Inter-repetition dependence; Mapping; High-level refactoring (data-parallel transformations, strategies); Code generation.

  43. Execution. Logical space and time as mixed dimensions of the multidimensional structures. The specification expresses the data dependences between all the data elements that transit through the system, and a partial execution order between all the executions of the tasks of the system.

  44. Execution. Logical space and time as mixed dimensions of the multidimensional structures. The specification expresses the data dependences between all the data elements that transit through the system, and a partial execution order between all the executions of the tasks of the system. Efficient execution: optimized code generation; projection of the specification into physical space and time.

  45. Execution. Logical space and time as mixed dimensions of the multidimensional structures. The specification expresses the data dependences between all the data elements that transit through the system, and a partial execution order between all the executions of the tasks of the system. Efficient execution: optimized code generation; projection of the specification into physical space and time. Adapt a specification to the execution: high-level refactoring; execution that reflects the specification.

  46. Projection into space and time. Multi-dimensional structures: repetition spaces, data structures.

  47. Projection into space and time. Multi-dimensional structures: repetition spaces, data structures. Projected (⇓) in space and in time.

  48. Projection into space and time. Multi-dimensional structures: repetition spaces, data structures. Projected (⇓) in space and in time; the two projections are linked (trade-off).

  49. Projection into space and time. Multi-dimensional structures: repetition spaces, data structures. Projected (⇓) in space and in time; the two projections are linked (trade-off). Take into account the execution constraints: data dependences, available resources.

  50. Projection example. [Figure: Horizontal Filter, repetition space (240, 1080, ∞), patterns (13) → (3), followed by Vertical Filter, repetition space (720, 120, ∞), patterns (14) → (4); arrays (1920, 1080, ∞) → (720, 1080, ∞) → (720, 480, ∞).]

  52. Projection example. [Figure: Horizontal Filter, repetition space (240, 1080, ∞), patterns (13) → (3), followed by Vertical Filter, repetition space (720, 120, ∞), patterns (14) → (4); arrays (1920, 1080, ∞) → (720, 1080, ∞) → (720, 480, ∞).] Maximal parallelism. Memory size. Infinite data structures – blocking points.

  53. Projection example. [Figure: Horizontal Filter, repetition space (240, 1080, ∞), patterns (13) → (3), followed by Vertical Filter, repetition space (720, 120, ∞), patterns (14) → (4); arrays (1920, 1080, ∞) → (720, 1080, ∞) → (720, 480, ∞).] Pipeline. Execution order.

  54. Projection example. [Figure: common repetition (240, 120, ∞) containing Horizontal Filter, repeated (14), and Vertical Filter, repeated (3); arrays (1920, 1080, ∞) → (720, 480, ∞); macro-patterns (14, 13) → (3, 14) → (3, 4); elementary patterns (13) → (3) and (14) → (4).] Fusion of successive repetitions: minimize the arrays (macro-patterns). Distribution of the common repetition: each processor keeps its macro-patterns in memory.

  55. Projection example. [Figure: common repetition (240, 120, ∞) containing Horizontal Filter, repeated (14), and Vertical Filter, repeated (3); arrays (1920, 1080, ∞) → (720, 480, ∞); macro-patterns (14, 13) → (3, 14) → (3, 4); elementary patterns (13) → (3) and (14) → (4).] Re-computations: when intermediate values are consumed by multiple repetitions. Trade-off: recompute the values, or keep them in memory (increase of memory size).
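
The shapes in this example can be checked with a few lines of arithmetic. The sketch below works per frame (the infinite time dimension is left out) and only uses the numbers read from the slides; the re-computation factor is a derived quantity, not a value stated in the presentation.

```python
# Shapes read from the projection example, per frame.
frame_in = (1920, 1080)
h_rep, h_in, h_out = (240, 1080), 13, 3    # Horizontal Filter: (13) -> (3)
v_rep, v_in, v_out = (720, 120), 14, 4     # Vertical Filter: (14) -> (4)

# Before fusion: the intermediate array is stored in full.
frame_mid = (h_rep[0] * h_out, frame_in[1])        # (720, 1080)
frame_out = (frame_mid[0], v_rep[1] * v_out)       # (720, 480)
mid_elems_before = frame_mid[0] * frame_mid[1]

# After fusion: one common repetition (240, 120), and only a (3, 14)
# macro-pattern of the intermediate array is alive per instance.
fused_rep = (240, 120)
macro_mid = (h_out, v_in)
mid_elems_after = macro_mid[0] * macro_mid[1]

# The horizontal filter now runs 14 times per fused instance.
h_calls_before = h_rep[0] * h_rep[1]
h_calls_after = fused_rep[0] * fused_rep[1] * v_in
recompute_factor = h_calls_after / h_calls_before
```

Under this reading, the intermediate storage drops from 777,600 elements per frame to 42 per fused instance, while the horizontal filter is called 14/9 times as often, which is the trade-off named on the slide.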

  56. High-level transformations. Adapt a specification to the execution: change the granularity of the repetitions; reduce the array sizes.

  57. High-level transformations. Adapt a specification to the execution: change the granularity of the repetitions; reduce the array sizes. "High-level" loop transformations: a repetition is the visual representation of a data-parallel loop nest; fusion, change paving, tiling, collapse, ... Reference: Calin Glitia and Pierre Boulet. High level loop transformations for multidimensional signal processing embedded applications. In International Symposium on Systems, Architectures, Modeling, and Simulation (SAMOS VIII), Samos, Greece, July 2008.
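
Since a repetition stands for a data-parallel loop nest, fusion can be shown on ordinary loops. The sketch below is a toy illustration rather than the transformation implemented in the thesis: `producer`, the window of 3 and the step of 2 are hypothetical, chosen so that consecutive windows overlap and the fused version has to recompute some values.

```python
def producer(x):
    return 2 * x


def unfused(data, window=3, step=2):
    """Two successive loops: the whole intermediate array is stored."""
    calls = 0
    mid = []
    for x in data:
        mid.append(producer(x))
        calls += 1
    out = [sum(mid[s:s + window])
           for s in range(0, len(data) - window + 1, step)]
    return out, calls, len(mid)


def fused(data, window=3, step=2):
    """One loop: each iteration produces only the values it consumes."""
    calls = 0
    out = []
    for s in range(0, len(data) - window + 1, step):
        tile = []
        for x in data[s:s + window]:
            tile.append(producer(x))
            calls += 1
        out.append(sum(tile))
    return out, calls, window


data = list(range(9))
out_a, calls_a, buffer_a = unfused(data)   # 9 producer calls, buffer of 9
out_b, calls_b, buffer_b = fused(data)     # 12 producer calls, buffer of 3
```

Both versions return the same result; fusion shrinks the intermediate buffer from 9 elements to 3 at the cost of 3 extra producer calls.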

  58. Optimization strategies – memory size reduction. [Figure: repetitions r1 to r7.] Maximal reduction of the intermediate arrays.

  59. Optimization strategies – memory size reduction. [Figure: repetitions r1 to r7.] Maximal reduction of the intermediate arrays. Fusion of multiple repetitions: minimizes only the last intermediate array; re-computations!

  60. Optimization strategies – memory size reduction. [Figure: repetitions r1 to r7.] Maximal reduction of the intermediate arrays. Fusion of multiple repetitions: minimizes only the last intermediate array; re-computations! Complete fusion? Too many re-computations; limited array reduction.

  61. Optimization strategies – memory size reduction. [Figure: repetitions r1 to r7.] Maximal reduction of the intermediate arrays. Strategy that limits the re-computations, using the results of the complete fusion and of the two-by-two fusions: where re-computations are introduced, and the minimal achievable array reduction.

  62. Optimization strategies – memory size reduction. [Figure: repetitions r1 to r7, with r12, r14 and r67.] Maximal reduction of the intermediate arrays. Table columns: repetitions before fusion; repetitions after fusion; re-computations (product); reduction factor of the output arrays. Cell values as extracted: 8 × 128 × 96 10 × 8 9.29 1228.8 119 × 119 × 96 1 1 96 96 × 80 × 80 × 96 1 96 80 × 80 96 1 1 1 128 × 96 × 80 128 × 96 × 80 1 1 119 × 128 × 96 1 12288 119 128 × 96 × 128 × 96 1 1 1

  63. Optimization strategies – memory size reduction. [Figure: repetitions r1 to r7, with r12, r14 and r67.] Maximal reduction of the intermediate arrays. Reference: Calin Glitia, Pierre Boulet, Éric Lenormand, and Michel Barreteau. Repetitive model refactoring strategy for the design space exploration of intensive signal processing applications. Journal of Systems Architecture, Special Issue: Hardware/Software CoDesign.

  64. And the inter-repetition dependences? Why? To allow the use of the refactoring tools on models with uniform dependences.

  65. And the inter-repetition dependences? Why? To allow the use of the refactoring tools on models with uniform dependences. Typically: [Figure: a model before and after (⇒) a transformation.]

  66. And the inter-repetition dependences? Why? To allow the use of the refactoring tools on models with uniform dependences. Typically: [Figure: a model before and after (⇒) a transformation.] Algorithm: the global accesses and dependences must remain unchanged; automatically compute the new dependences after a transformation.
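
The invariant stated above (global dependences unchanged) can be illustrated on the simplest case: a one-dimensional repetition with distance d that is tiled into blocks. This is a sketch under assumptions and not the algorithm of the thesis: the tiling r = block * q + j, the helper name `tiled_dependences` and the sizes are hypothetical.

```python
def tiled_dependences(n, block, d=1):
    """Distance vectors in (outer, inner) space induced by the 1-D
    dependence r - d -> r after tiling r = block * q + j."""
    if n % block:
        raise ValueError("block must divide n")
    vectors = set()
    for r in range(d, n):
        q_src, j_src = divmod(r - d, block)
        q_dst, j_dst = divmod(r, block)
        vectors.add((q_dst - q_src, j_dst - j_src))
    return vectors


block = 4
vectors = tiled_dependences(n=12, block=block, d=1)
# Two dependences appear: (0, 1) inside a block and (1, -3) between the
# last instance of a block and the first instance of the next one.

# Mapped back to the original space, every vector is still the distance 1.
unchanged = all(block * dq + dj == 1 for dq, dj in vectors)
```

One original dependence becomes two after the transformation, yet both describe exactly the same producer and consumer pairs, which is the property a refactoring tool has to preserve.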
