Optimal Software Pipelining of Loops with Control Flows




  1. Optimal Software Pipelining of Loops with Control Flows
     Han-Saem YUN, Jihong KIM, Soo-Mook MOON
     Computer Architecture and Embedded Systems Lab., Seoul National University, KOREA
     16th International Conference on Supercomputing (ICS ’02)
     CARES Lab. / Seoul National Univ., ICS ’02

  2. Software Pipelining (SP) Example
     Operations of the example loop:
       op1: r1 = r1 - r2
       op2: cc0 = (r1 <= 0)
       op3: if (!cc0)
       op4: r1 = r1 + r3
       op5: r5 = r1 << 4
       op6: r5 = load @r5
       op7: cc0 = (r5 <= 0)
       op8: if (!cc0)
     [Figure: control flow graph of op1 to op8 and its software-pipelined schedule]
     • Much more complex than SP of loops without control flows!
     • Even formulating the problem is very difficult!! (let alone discussing the optimality issue)

  3. Optimal SP
     • Optimally software-pipelined program
       – every execution path of the program runs in the shortest possible time, subject to
         • true dependences, and
         • resource constraints (well-defined??)
     • For any path, there is a dependence chain whose length is equal to the number of execution cycles of the path
     [Figure: loop with the operations "a = d + 1", "if d == 0", then either "b = g1(a), c = f1(a)" or "b = g2(a), c = f2(a)", then "d = b + 1", "if c == 0"; for an execution path that takes the g1/f1 branch and then the g2/f2 branch, length of dependence chain = 6 = # of execution cycles]
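
The criterion on this slide can be checked by hand for one path: find the longest chain of true dependences and compare it with the cycle count. The following is a minimal Python sketch under stated assumptions: every operation has unit latency, branches only read registers, and a path is encoded as hypothetical `(dest, sources)` pairs. The function name `longest_chain` and the encoding are illustrative and do not come from the talk.

```python
def longest_chain(path):
    """Length, in unit-latency operations, of the longest true-dependence chain.

    `path` is a list of (dest, sources) pairs in execution order.
    `dest` is None for a branch, which only reads registers.
    """
    depth_of = {}  # register -> chain depth of its most recent writer
    longest = 0
    for dest, sources in path:
        # An operation sits one step below its deepest producer.
        depth = 1 + max((depth_of.get(s, 0) for s in sources), default=0)
        if dest is not None:
            depth_of[dest] = depth
        longest = max(longest, depth)
    return longest


# One iteration of the slide's loop; g1/g2 and f1/f2 read the same register,
# so both branch alternatives have the same dependence shape.
iteration = [
    ("a", ["d"]),   # a = d + 1
    (None, ["d"]),  # if d == 0
    ("b", ["a"]),   # b = g(a)
    ("c", ["a"]),   # c = f(a)
    ("d", ["b"]),   # d = b + 1
    (None, ["c"]),  # if c == 0
]

print(longest_chain(iteration * 2))  # prints 6
```

For two iterations the chain a, b, d, a, b, d has length 6, which agrees with the figure's "length of dependence chain = 6". A schedule that runs this path in 6 cycles is therefore time-optimal for it in the sense used on the slide.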

  4. Previous Results on Optimal SP
     Without resource constraints (only true dependences):
     • Loops without control flows: poly-time optimal algorithm
       – Aiken, Nicolau, PLDI ’88
       – Unroll & modulo scheduling
     • Loops with control flows: generally, NO optimal solution exists for some loops
       – Schwiegelshohn, Gasperoni, Ebcioglu, MICRO ’89
       – illustrated 2 loops
     With resource constraints:
     • Loops without control flows: NP-hard; poly-time approximation algorithm
       – Gasperoni et al., PPL ’94
       – < about 2x optimal performance
     • Loops with control flows: NOT well-defined
     Practical settings:
     • Optimality formulation & heuristics
       – register pressure
       – D-cache conscious
       – clustered VLIW, …

  5. Schwiegelshohn et al.’s Result [MICRO ’89, JPDS ’91]
     [Figure label: The set of all loops with control flows]
     • Illustrates some loops that cannot have semantically equivalent, optimally software-pipelined programs
       – lacks a formalism required to develop generalized results
     • No further research result on optimal SP has been reported for more than a decade
       – Possibly having been discouraged by the pessimistic result

  6. Our Recent Result [CC ’01]
     [Figure label: Schwiegelshohn’s result]
     • Describes a strong necessary condition for a loop to have an equivalent optimally software-pipelined program
       – i.e., presents a nonexistence proof for loops that do not satisfy the condition
       – generalization of Schwiegelshohn’s result
     • As part of the formal treatment, proposes a formalization of software pipelining of loops with control flows

  7. Our Contribution 1: Optimality Condition
     [Figure labels: Schwiegelshohn’s result; Our previous result]
     • Necessary and sufficient condition for a loop to have a semantically equivalent optimally software-pipelined program
       – We call the condition the “optimality condition”
       – Exactly identifies what can and cannot be achieved by SP
     • Developed a decision procedure to compute the condition

  8. Our Contribution 1: Optimality Condition
     [Figure labels: Loops without optimal solutions; Loops with optimal solutions]
     • Necessary and sufficient condition for a loop to have a semantically equivalent optimally software-pipelined program
       – We call the condition the “optimality condition”
       – Exactly identifies what can and cannot be achieved by SP
     • Developed a decision procedure to compute the condition

  9. Our Contribution 2: Optimal Algorithm
     [Figure labels: Optimality condition; Loops without optimal solution; Loops with optimal solution]
     • Algorithm to compute an optimal solution for every loop satisfying the optimality condition (the necessary and sufficient condition)
       – Quite expensive, but covers the right region completely
       – Also serves as a proof for the sufficient part of the optimality condition

  10. Our Contribution 3: Conservative Optimal Algo.
     [Figure label: Optimality condition]
     • An efficient algorithm to compute an optimal solution for “almost every” loop satisfying the optimality condition
       – “almost” ≡ more than 90% (for the loops used in our experiment)
       – “efficient” ≡ in less than 30 sec.

  11. Our Contribution 4: Experiments
     [Figure: regions A, B, and C, divided by the optimality condition]
     • Measure the actual portion of each region
       – A: loops without optimal solutions
       – B: loops with optimal solutions, but the optimal solutions cannot be computed by the (efficient) conservative optimal algorithm
       – C: loops with optimal solutions, and the optimal solutions can be computed by the (efficient) conservative optimal algorithm

  12. Our Contribution 4: Experiments
     [Figure: regions A, B, and C, divided by the optimality condition]
     • Actual portion of each region
       – Quite optimistic! (A ≅ 10%, C ≅ 75%) (for the loops used in our experiment)
     • Resource requirement for optimal solution
       – Not so excessive

  13. Road Map
     1. Optimality Condition
     2. Optimal SP Algorithm
     3. Conservative Optimal SP Algorithm
     4. Experimental Results

  14. Optimality Condition: Intuitive Explanation
     • Operations in the sequential program are required to be moved by only a bounded range to yield time-optimal execution
       – for any execution path
     [Figure: a sequential path of k “a = a + 1” nodes followed by k “b = b + 1” nodes, next to its time-optimal parallel path of k cycles; the first “b = b + 1” is moved across k-1 “a = a + 1” nodes (what if k → ∞?)]
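
The "bounded range" intuition can be made concrete by measuring how far the earliest-possible schedule moves each operation. The sketch below is an illustration only, not the paper's decision procedure. It assumes unit-latency operations, unlimited resources, and a hypothetical `(dest, sources)` encoding of a path; `asap_cycles`, `max_crossed`, and `two_chain_path` are made-up helper names.

```python
def asap_cycles(path):
    """Earliest cycle (1-based) of each unit-latency op under true dependences."""
    ready = {}  # register -> cycle in which its latest value is produced
    cycles = []
    for dest, sources in path:
        cycle = 1 + max((ready.get(s, 0) for s in sources), default=0)
        ready[dest] = cycle
        cycles.append(cycle)
    return cycles


def max_crossed(path):
    """Largest number of earlier operations that any operation is hoisted above."""
    cycles = asap_cycles(path)
    return max(
        sum(1 for j in range(i) if cycles[j] > cycles[i])
        for i in range(len(path))
    )


def two_chain_path(k):
    """k copies of 'a = a + 1' followed by k copies of 'b = b + 1'."""
    return [("a", ["a"])] * k + [("b", ["b"])] * k


for k in (2, 4, 8):
    path = two_chain_path(k)
    print(k, max(asap_cycles(path)), max_crossed(path))
```

For this path the parallel schedule takes k cycles and the first "b = b + 1" is hoisted above k-1 earlier operations, so the required code motion grows with k. A loop that can produce such paths for arbitrarily large k is the kind of case the slide's "what if k → ∞?" points at.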

  15. Example Loops
     • Loops with optimal solution
       – Show the existence of the optimal solution by construction (i.e., by the optimal SP algorithm presented in the next chapter)
     • Loops without optimal solution
       – Present a nonexistence proof in the following slide

  16. Sketch of Nonexistence Proof
     • What if a loop
       – does NOT satisfy the optimality condition
       – but has an optimal solution?
     • Then we can construct problematic execution paths s.t.
       – code motions of unbounded range are needed for them to be optimally executed
     • Intuitively, code motion of unbounded range incurs
       – for a conditional branch: unbounded code expansion
       – for a non-branch: unbounded live range ⇒ unbounded registers
       – Neither of them is possible
     • But how to prove it mathematically?
       – Establish a formalization of software pipelining!
       – Schwiegelshohn’s proof is based on quite a specific property of SP transformation, and thus fails to lead to generalized results

  17. For Loops Violating Optimality Condition …
     • Can’t we guarantee anything for them?
       – in terms of optimal SP
     • From the nonexistence proof,
       – The problematic paths are exposed to too much parallelism
         • i.e., they require too large scheduling windows
       – But the same problem also arises in single-path loops. How has it been handled?
         • Add artifact dependences!!
     • e.g.: every operation in the i-th iteration is dependent on every operation in the (i-a)-th iteration
       – for some a > 0
     [Figure: paths of k “a = a + 1” nodes and k “b = b + 1” nodes; single-path loop bodies “a = f(); b = a + 1; c = b * 2” and “a = f’(a); b = a + 1; c = b * 2”]
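
The effect of such an artifact dependence on the scheduling window can be seen on a single-path loop. The sketch below rests on assumptions that are not in the slides: the body is one chain of unit-latency operations with no loop-carried true dependence, resources are unlimited, and the slide's distance a is passed as the hypothetical parameter `dist`. The names `schedule` and `max_overlap` are illustrative.

```python
def schedule(num_iters, body_len, dist=None):
    """Return a (first_cycle, last_cycle) span for each iteration.

    The body is a single chain of `body_len` unit-latency operations with no
    loop-carried true dependence. If `dist` is given, every operation of
    iteration i also waits for every operation of iteration i - dist.
    """
    spans = []
    for i in range(num_iters):
        start = 1
        if dist is not None and i >= dist:
            # Wait until the last operation of iteration i - dist has finished.
            start = spans[i - dist][1] + 1
        spans.append((start, start + body_len - 1))
    return spans


def max_overlap(spans):
    """Maximum number of iterations that are active in the same cycle."""
    last = max(end for _, end in spans)
    return max(
        sum(1 for start, end in spans if start <= cycle <= end)
        for cycle in range(1, last + 1)
    )


print(max_overlap(schedule(6, 3)))          # all iterations overlap
print(max_overlap(schedule(6, 3, dist=2)))  # at most `dist` iterations overlap
```

Without the artifact dependence every iteration can start in cycle 1, so the number of iterations in flight grows with the trip count. With distance `dist`, at most `dist` iterations overlap in this model, which bounds the scheduling window at the cost of a longer schedule.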

  18. Road Map
     1. Optimality Condition
     2. Optimal SP Algorithm
     3. Efficient Optimal SP Algorithm
     4. Experimental Results

  19. Optimal SP Algorithm
     • Computes an optimal solution for every loop satisfying the optimality condition
       – Mostly based on the latest version of Aiken & Nicolau’s Perfect Pipelining [TPDS ’95]
     • Modification to renaming framework
       – Aiken’s original algorithm doesn’t always handle false dependences appropriately
         • Uses Ebcioglu’s on-the-fly dynamic renaming restrictively
       – For optimal SP, false dependences should be overcome completely
         • so that the parallel schedule is constrained by true dependences only
       – We use SSA (Static Single Assignment) for the renaming framework
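
Why renaming removes false dependences can be shown on straight-line code. The sketch below is a simplification and not the talk's renaming framework: it has no control flow and therefore no phi nodes, and the `(dest, sources)` encoding, the `_n` version suffix, and the names `rename` and `false_dependences` are all hypothetical.

```python
def rename(ops):
    """Give every definition a fresh name (straight-line SSA, no phi nodes)."""
    version = {}  # original register -> number of definitions seen so far
    renamed = []
    for dest, sources in ops:
        # Uses refer to the latest version; live-in registers keep their name.
        new_sources = [f"{s}_{version[s]}" if s in version else s for s in sources]
        version[dest] = version.get(dest, 0) + 1
        renamed.append((f"{dest}_{version[dest]}", new_sources))
    return renamed


def false_dependences(ops):
    """Count pairs linked by an anti (write-after-read) or output (write-after-write) dependence."""
    count = 0
    for j, (dest_j, _) in enumerate(ops):
        for dest_i, sources_i in ops[:j]:
            if dest_j == dest_i or dest_j in sources_i:
                count += 1
    return count


ops = [
    ("t", ["x"]),  # t = f(x)
    ("y", ["t"]),  # y = g(t)
    ("t", ["z"]),  # t = f(z)   reuses t: output and anti dependence
    ("w", ["t"]),  # w = g(t)
]

print(false_dependences(ops), false_dependences(rename(ops)))
```

After renaming, the second definition of `t` becomes `t_2`, so it no longer has to wait for the first definition or for the read in `y`. Only the true dependences (`t_1` to `y_1`, `t_2` to `w_1`) remain to constrain the schedule.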
