Symbolic Computation of Latency for Dataflow Graphs
Adnan Bouakaz, Pascal Fradet, Alain Girault
SYNCHRON International Workshop, Bamberg, December 7th, 2016

Outline

1. Introduction
   - Application model
   - Scheduling policy
   - Symbolic analysis
2. Preliminary results
3. Graph $A \xrightarrow{p\;\;q} B$
4. Generalization to chains and acyclic graphs
5. Experiments
6. Conclusion

Data-flow models of computation

Stream-processing applications are found in many embedded systems (video codecs, software defined radio, ...):
- computationally intensive
- strict quality-of-service requirements
- low energy consumption
- increasingly run on many-core platforms

Data-flow models of computation are good at:
- expressing task-level parallelism
- achieving efficient implementation
- guaranteeing performance at compile time:
  - throughput: stream-oriented applications
  - latency: automatic-control-oriented applications
  - buffer sizes: all embedded applications

Acyclic Synchronous Data-Flow (SDF) graphs [Lee and Messerschmitt, Proc. 1987]

[Figure: SDF chain $A \xrightarrow{3\;\;2} B \xrightarrow{1\;\;3} C$ with execution times $t_A = 15$, $t_B = 8$, $t_C = 17$. Callouts point out an actor, an edge, a rate, and an execution time.]

Balance equations for the chain $A \xrightarrow{3\;\;2} B \xrightarrow{1\;\;3} C$:
  $z_A \cdot 3 = z_B \cdot 2$
  $z_B \cdot 1 = z_C \cdot 3$

Consistent SDF graph $G$: this system has a non-null solution.
Repetition vector of $G$: $z = [A^2, B^3, C^1]$
Iteration: a firing sequence that returns $G$ to its initial state. Writing the state as [tokens on $A \to B$, tokens on $B \to C$]:
  $[0, 0] \xrightarrow{A^2} [6, 0] \xrightarrow{B^3} [0, 3] \xrightarrow{C^1} [0, 0]$
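
The balance equations can be solved mechanically. The following is a minimal sketch, assuming a connected graph encoded as a list of (source, production rate, consumption rate, destination) edges; the function name `repetition_vector` and this encoding are illustrative choices, not something taken from the talk.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Return the smallest positive integer solution of the balance
    equations, or None if the graph is inconsistent.

    edges: list of (src, production_rate, consumption_rate, dst).
    Assumes a connected graph (hypothetical encoding, for illustration).
    """
    # Propagate relative firing ratios starting from the first actor.
    ratio = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, prod, cons, dst in edges:
            if src in ratio and dst not in ratio:
                ratio[dst] = ratio[src] * prod / cons
                changed = True
            elif dst in ratio and src not in ratio:
                ratio[src] = ratio[dst] * cons / prod
                changed = True
    if len(ratio) != len(actors):
        raise ValueError("graph is not connected")

    # Every balance equation z_src * prod = z_dst * cons must hold.
    if any(ratio[s] * p != ratio[d] * c for s, p, c, d in edges):
        return None

    # Scale the rational solution to the smallest integer one.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (f.denominator for f in ratio.values()), 1)
    ints = {a: int(f * scale) for a, f in ratio.items()}
    common = reduce(gcd, ints.values())
    return {a: v // common for a, v in ints.items()}


chain = [("A", 3, 2, "B"), ("B", 1, 3, "C")]
print(repetition_vector(["A", "B", "C"], chain))  # {'A': 2, 'B': 3, 'C': 1}
```

Exact rationals (`Fraction`) avoid rounding issues when the ratios are not integers; an inconsistent graph makes at least one balance equation fail and yields `None`.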

Scheduling policy

- As Soon As Possible (ASAP) [Sriram and Bhattacharyya 2000]
- No auto-concurrency

Modeling techniques: auto-concurrency, buffer size.

[Figure: graph $A \xrightarrow{3\;\;2} B$ with $z = [2, 3]$, $t_A = 15$, $t_B = 8$, next to its transformed version that models the absence of auto-concurrency and a buffer of size 8.]

ASAP schedule (a transient phase followed by a steady state), times marked on the timeline:
  A: 15 30 45 60 75 90
  B: 23 38 46 54 68 76 84 98 106
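
The timeline of this example can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: a single edge from $A$ to $B$, no auto-concurrency, and a channel that never blocks $A$ (the buffer limit is not modeled). The function name `asap_finish_times` is hypothetical.

```python
from math import gcd


def asap_finish_times(p, q, t_a, t_b, iterations):
    """Finish times of A and B under ASAP for the graph A -(p, q)-> B.

    Assumptions (not from the slides): the channel is never full, so A
    fires back to back; each actor runs one firing at a time.
    """
    g = gcd(p, q)
    z_a, z_b = q // g, p // g  # repetition vector

    # A is never blocked: its i-th firing finishes at i * t_a.
    finish_a = [(i + 1) * t_a for i in range(iterations * z_a)]

    finish_b = []
    previous = 0
    for j in range(1, iterations * z_b + 1):
        needed = -(-j * q // p)  # ceil(j*q/p) firings of A supply j*q tokens
        start = max(finish_a[needed - 1], previous)
        previous = start + t_b
        finish_b.append(previous)
    return finish_a, finish_b


a_times, b_times = asap_finish_times(3, 2, 15, 8, iterations=3)
print(a_times)  # [15, 30, 45, 60, 75, 90]
print(b_times)  # [23, 38, 46, 54, 68, 76, 84, 98, 106]
```

Under these assumptions the printed values coincide with the times marked on the timeline, which suggests that those marks are firing completion times.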

Definition: Multi-iteration latency of graph $G$: $L_G(n)$ = the finish time of the $n$-th iteration.

[Figure: ASAP timeline of actors $A$ and $B$ with $L_G(1)$ and $L_G(2)$ marked.]

Definition: Input-output latency of graph $G$: $\ell_G(n)$ = the time elapsed between the start and the end of the $n$-th iteration.

[Figure: ASAP timeline of actors $A$ and $B$ with $\ell_G(1)$ and $\ell_G(2)$ marked.]

Definition: Period of graph $G$: $P_G$ = the average length of an iteration = $\lim_{n \to \infty} \frac{L_G(n)}{n}$.

Definition: Throughput of graph $G$: $T_G = \frac{1}{P_G}$.

[Figure: ASAP timeline of actors $A$ and $B$ with the period $P_G$ marked.]
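
As a worked instance of these definitions, take the ASAP timeline of the example with $z = [2, 3]$, $t_A = 15$ and $t_B = 8$: an iteration ends with every third firing of $B$, at times 46, 76 and 106. The closed form below is an extrapolation from these three iterations, assuming the steady-state spacing of 30 persists.

```latex
\begin{align*}
L_G(1) &= 46 = 30 \cdot 1 + 16, \quad
L_G(2) = 76 = 30 \cdot 2 + 16, \quad
L_G(3) = 106 = 30 \cdot 3 + 16 \\
P_G &= \lim_{n \to \infty} \frac{L_G(n)}{n}
     = \lim_{n \to \infty} \left( 30 + \frac{16}{n} \right) = 30 \\
T_G &= \frac{1}{P_G} = \frac{1}{30}
\end{align*}
```

The value 30 equals $2 \cdot 15$, the time $A$ needs for its two firings per iteration, which is larger than the $3 \cdot 8 = 24$ needed by $B$.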

Symbolic analysis

[Figure: two analysis flows.
 Numerical flow: a parametric dataflow graph or a partially specified SDF graph is instantiated into a numerical SDF graph; numerical analysis (NP-complete for HSDF) then yields the results.
 Symbolic flow: symbolic analysis is applied directly to the parametric or partially specified graph and yields symbolic formulas; symbolic and numerical evaluation of these formulas then yield the results.]

Outline

1. Introduction
2. Preliminary results
   - Duality theorem
3. Graph $A \xrightarrow{p\;\;q} B$
4. Generalization to chains and acyclic graphs
5. Experiments
6. Conclusion

Duality theorem

Definition: The dual of an SDF graph $G$, written $G^{-1}$, is obtained by reversing all edges of $G$.

Duality theorem: Let $G$ be any live graph (cyclic or not) and $G^{-1}$ be its dual. Then $T_G = T_{G^{-1}}$ and $\forall i.\ L_G(i) = L_{G^{-1}}(i)$.

[Figure: example graph $G$ over actors $A$ and $B$ (edge labels 7, 2 and 3; $t_A = 10$, $t_B = 12$) and its dual $G^{-1}$, illustrating $L_G(n) = L_{G^{-1}}(n)$. Timeline marks for $G$: $A$ at 30 and 60, $B$ at 42 and 72. Timeline marks for $G^{-1}$: $B$ at 24 and 48, $A$ at 42 and 72.]

Proof: Using the SDF-to-HSDF transformation and unfolding.

[Figure: HSDF($G$) and HSDF($G^{-1}$), each drawn over the firings $A_1$, $B_1$, $A_2$, $B_2$, $A_3$.]

Outline

1. Introduction
2. Preliminary results
3. Graph $A \xrightarrow{p\;\;q} B$
   - Enabling patterns
   - Minimum latency
4. Generalization to chains and acyclic graphs
5. Experiments
6. Conclusion

Preliminaries about graph $A \xrightarrow{p\;\;q} B$

Four parameters: $p, q \in \mathbb{N}^+$ and $t_A, t_B \in \mathbb{R}^+$.

Repetition vector: $z_A = \frac{q}{\gcd(p, q)}$, $z_B = \frac{p}{\gcd(p, q)}$

ASAP period: $P_G = \max(z_A t_A, z_B t_B)$.

Problem statement: What is $\theta_{A,B}$, the minimum size of channel $A \to B$ such that the ASAP execution achieves the maximum throughput?

Solution: $p + q - \gcd(p, q) < \theta_{A,B} \le 2(p + q - \gcd(p, q))$

Proof: 18 cases in total ($p, q$: 6 cases; $t_A, t_B$: 3 cases).
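
The closed forms for the two-actor graph are straightforward to evaluate. The sketch below only transcribes the formulas stated here (repetition vector, ASAP period, and the two bounds on $\theta_{A,B}$); the function name and the dictionary keys are hypothetical.

```python
from math import gcd


def two_actor_summary(p, q, t_a, t_b):
    """Evaluate the closed forms for the graph A -(p, q)-> B."""
    g = gcd(p, q)
    z_a, z_b = q // g, p // g            # repetition vector
    period = max(z_a * t_a, z_b * t_b)   # ASAP period P_G
    bound = p + q - g
    return {
        "z_a": z_a,
        "z_b": z_b,
        "period": period,
        "theta_lower_exclusive": bound,      # theta > p + q - gcd(p, q)
        "theta_upper_inclusive": 2 * bound,  # theta <= 2 (p + q - gcd(p, q))
    }


# Rates 3 and 2 with t_A = 15 and t_B = 8.
print(two_actor_summary(3, 2, 15, 8))
```

For rates 3 and 2 the upper bound evaluates to 8, and for rates 8 and 5 it evaluates to 24.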

Enabling patterns

A time-independent, analytic, and parametric characterization of the data dependency $A \to B$ that covers one iteration.

Example: graph $A \xrightarrow{8\;\;5} B$ with $t_A = 20$ and $t_B = 7$.

[Figure: firings $A_1, \dots, A_5$ drawn above firings $B_1, \dots, B_8$, with the enabling points marked and the token counts 0, 8, 11, 9, 12, 10, 0 along the channel.]

$A^i \leadsto B^j$ means that $i$ firings of $A$ enable $j$ firings of $B$.

Unfolded pattern: $A \leadsto B;\ A \leadsto B^2;\ A \leadsto B;\ A \leadsto B^2;\ A \leadsto B^2$
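
The unfolded pattern depends only on the rates, not on the execution times, so it can be obtained by counting tokens. The following is a minimal sketch with a hypothetical function name; it lets $B$ consume as many tokens as possible after each firing of $A$.

```python
from math import gcd


def unfold_enabling(p, q):
    """For each firing of A in one iteration of A -(p, q)-> B, return
    (tokens in the channel right after the firing, firings of B enabled)."""
    z_a = q // gcd(p, q)
    tokens = 0
    steps = []
    for _ in range(z_a):
        tokens += p              # A produces p tokens
        enabled = tokens // q    # B can now fire this many times
        steps.append((tokens, enabled))
        tokens -= enabled * q    # B consumes q tokens per firing
    assert tokens == 0           # an iteration restores the initial state
    return steps


print(unfold_enabling(8, 5))
# [(8, 1), (11, 2), (9, 1), (12, 2), (10, 2)]
```

The second components, 1, 2, 1, 2, 2, spell out the unfolded pattern of the example, and the first components are the non-zero token counts shown in the figure.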

Unfolded pattern, grouped into two blocks:
  $\underbrace{A \leadsto B;\ A \leadsto B^2}_{\text{block}};\ \underbrace{A \leadsto B;\ A \leadsto B^2;\ A \leadsto B^2}_{\text{block}}$

Factorized pattern:
  $\big[A \leadsto B;\ [A \leadsto B^2]^{f_i}\big]_{i=1..2}$ with $f_1 = 1$, $f_2 = 2$

General case:
  $\big[A \leadsto B^k;\ [A \leadsto B^{k+1}]^{\alpha_j}\big]_{j=1..\frac{q-r}{\gcd(p,q)}}$
  with $p = kq + r$ and $\alpha_j = \left\lfloor \frac{jr}{q-r} \right\rfloor - \left\lfloor \frac{(j-1)r}{q-r} \right\rfloor$

Enabling patterns

Case A. $p \ge q$. Let $p = kq + r$ with $0 \le r < q$.
  Case A.1. $r = 0$:
    $A \leadsto B^k$
  Case A.2. $q \le 2r$:
    $\big[A \leadsto B^k;\ [A \leadsto B^{k+1}]^{\alpha_j}\big]_{j=1..\frac{q-r}{\gcd(p,q)}}$
  Case A.3. $q > 2r$:
    $\big[[A \leadsto B^k]^{\beta_j};\ A \leadsto B^{k+1}\big]_{j=1..\frac{r}{\gcd(p,q)}}$
  with
    $\alpha_j = \left\lfloor \frac{jr}{q-r} \right\rfloor - \left\lfloor \frac{(j-1)r}{q-r} \right\rfloor$
    $\beta_j = \left\lceil \frac{jq}{r} \right\rceil - \left\lceil \frac{(j-1)q}{r} \right\rceil - 1$

Case B. $p < q$. Let $q = kp + r$ with $0 \le r < p$.
  Case B.1. $r = 0$:
    $A^k \leadsto B$
  Case B.2. $p \ge 2r$:
    $\big[A^{k+1} \leadsto B;\ [A^k \leadsto B]^{\gamma_j}\big]_{j=1..\frac{r}{\gcd(p,q)}}$
  Case B.3. $p < 2r$:
    $\big[[A^{k+1} \leadsto B]^{\lambda_j};\ A^k \leadsto B\big]_{j=1..\frac{p-r}{\gcd(p,q)}}$
  with
    $\gamma_j = \left\lfloor \frac{jp}{r} \right\rfloor - \left\lfloor \frac{(j-1)p}{r} \right\rfloor - 1$
    $\lambda_j = \left\lceil \frac{jr}{p-r} \right\rceil - \left\lceil \frac{(j-1)r}{p-r} \right\rceil$
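
The six cases can be checked against plain token counting. The sketch below expands each factorized pattern into a list of (firings of $A$, firings of $B$) steps and compares it with a brute-force reference. The placement of floors and ceilings in the exponents is an assumption of this sketch: it was chosen so that the expansion agrees with token counting and with the example of rates 8 and 5. All function names are hypothetical.

```python
from math import gcd


def ceil_div(a, b):
    """Ceiling of a / b for positive b, in exact integer arithmetic."""
    return -(-a // b)


def brute_force_pattern(p, q):
    """Reference pattern as (firings of A, firings of B) steps."""
    g = gcd(p, q)
    if p >= q:  # each firing of A enables some firings of B
        return [(1, (i * p) // q - ((i - 1) * p) // q)
                for i in range(1, q // g + 1)]
    # each firing of B waits for some firings of A
    return [(ceil_div(j * q, p) - ceil_div((j - 1) * q, p), 1)
            for j in range(1, p // g + 1)]


def factorized_pattern(p, q):
    """Expand the factorized pattern of cases A.1 to B.3 into steps."""
    g = gcd(p, q)
    steps = []
    if p >= q:
        k, r = divmod(p, q)
        if r == 0:                                   # case A.1
            return [(1, k)]
        if q <= 2 * r:                               # case A.2
            s = q - r
            for j in range(1, s // g + 1):
                alpha = (j * r) // s - ((j - 1) * r) // s
                steps += [(1, k)] + [(1, k + 1)] * alpha
        else:                                        # case A.3
            for j in range(1, r // g + 1):
                beta = ceil_div(j * q, r) - ceil_div((j - 1) * q, r) - 1
                steps += [(1, k)] * beta + [(1, k + 1)]
    else:
        k, r = divmod(q, p)
        if r == 0:                                   # case B.1
            return [(k, 1)]
        if p >= 2 * r:                               # case B.2
            for j in range(1, r // g + 1):
                gamma = (j * p) // r - ((j - 1) * p) // r - 1
                steps += [(k + 1, 1)] + [(k, 1)] * gamma
        else:                                        # case B.3
            s = p - r
            for j in range(1, s // g + 1):
                lam = ceil_div(j * r, s) - ceil_div((j - 1) * r, s)
                steps += [(k + 1, 1)] * lam + [(k, 1)]
    return steps


print(factorized_pattern(8, 5))
# [(1, 1), (1, 2), (1, 1), (1, 2), (1, 2)]
```

Comparing `factorized_pattern` with `brute_force_pattern` over a range of rates is a quick way to catch a misplaced floor or ceiling. Reversing the edge (rates 5 and 8) yields the mirrored pattern, with the roles of $A$ and $B$ swapped.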

Multi-iteration latency: case $z_A t_A \ge z_B t_B$

- $A$ imposes a higher load than $B$.
- $A$ never gets idle $\implies P_G = z_A t_A$
- $L_G(n) = n P_G + \Delta_{A,B} \iff \frac{L_G(n)}{n} = \frac{n P_G + \Delta_{A,B}}{n} = P_G + \frac{\Delta_{A,B}}{n} \ge P_G$
- $\Delta_{A,B}$ is the remaining execution time for actor $B$ after actor $A$ has finished its firings of the $n$-th iteration.
- $\Delta_{A,B}$ is constant over all iterations, so $\lim_{n \to +\infty} \frac{\Delta_{A,B}}{n} = 0$.

[Figure: example schedule for graph $A \xrightarrow{5\;\;3} B$ with $t_A = 14$ and $t_B = 8$.]
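
The decomposition $L_G(n) = n P_G + \Delta_{A,B}$ can be observed numerically on the example with rates 5 and 3. This is a minimal sketch, assuming a channel that never blocks $A$ and no auto-concurrency; the function name is hypothetical and the simulation is a simplification, not the symbolic method of the talk.

```python
from math import gcd


def multi_iteration_latency(p, q, t_a, t_b, n):
    """Finish time of the n-th iteration of A -(p, q)-> B under ASAP.

    Assumptions (not from the slides): A is never blocked by a full
    channel, and each actor runs one firing at a time.
    """
    g = gcd(p, q)
    z_a, z_b = q // g, p // g
    finish_b = 0
    for j in range(1, n * z_b + 1):
        needed_a = -(-j * q // p)  # ceil(j*q/p) firings of A must be done
        finish_b = max(needed_a * t_a, finish_b) + t_b
    return max(n * z_a * t_a, finish_b)


p, q, t_a, t_b = 5, 3, 14, 8
g = gcd(p, q)
period = max((q // g) * t_a, (p // g) * t_b)  # z_A t_A = 42 >= z_B t_B = 40
for n in range(1, 4):
    latency = multi_iteration_latency(p, q, t_a, t_b, n)
    print(n, latency, latency - n * period)
# 1 60 18
# 2 102 18
# 3 144 18
```

In this run the difference $L_G(n) - n P_G$ is the same for every iteration, so $L_G(n) / n$ decreases toward $P_G = 42$ as $n$ grows.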