Loop Transformations for Parallelism & Locality

Previously
– Data dependences and loops
– Loop transformations
  – Parallelization
  – Loop interchange

Today
– Loop interchange
– Loop transformations and transformation frameworks
  – Loop permutation
  – Loop reversal
  – Loop skewing
  – Loop fusion

CS553 Lecture Loop Transformations

Loop Permutation

Idea
– Swap the order of two loops to increase parallelism, to improve spatial locality, or to enable other transformations
– Also known as loop interchange

Example

    do i = 1,n              do j = 1,n
      do j = 1,n              do i = 1,n
        x = A(2,j)              x = A(2,j)
      enddo                   enddo
    enddo                   enddo

Left: this access strides through a row of A.
Right: this code is invariant with respect to the inner loop, yielding better locality.

Loop Interchange (cont)

Example

    do i = 1,n              do j = 1,n
      do j = 1,n              do i = 1,n
        x = A(i,j)              x = A(i,j)
      enddo                   enddo
    enddo                   enddo

Left: this array has stride n access.
Right: this array now has stride 1 access.
(Assuming column-major order for Fortran)

Legality of Loop Interchange

Case analysis of the direction vectors

(=,=)
  The dependence is loop independent, so it is unaffected by interchange.

(=,<)
  The dependence is carried by the j loop.
  After interchange the dependence will be (<,=), so the dependence will still be carried by the j loop, so the dependence relations do not change.

(<,=)
  The dependence is carried by the i loop.
  After interchange the dependence will be (=,<), so the dependence will still be carried by the i loop, so the dependence relations do not change.
Legality of Loop Interchange (cont)

Case analysis of the direction vectors (cont.)

(<,<)
  The dependence distance is positive in both dimensions.
  After interchange it will still be positive in both dimensions, so the dependence relations do not change.

(<,>)
  The dependence is carried by the outer loop.
  After interchange the dependence will be (>,<), which changes the dependences and results in an illegal direction vector, so interchange is illegal.

(>,*) (=,>)
  Such direction vectors are not possible for the original loop.

Loop Interchange Example

Consider the (<,>) case

    do i = 1,n                   do j = 1,n
      do j = 1,n                   do i = 1,n
        C(i,j) = C(i+1,j-1)          C(i,j) = C(i+1,j-1)
      enddo                        enddo
    enddo                        enddo

    Before                       After
    (1,1) C(1,1) = C(2,0)        (1,1) C(1,1) = C(2,0)
    (1,2) C(1,2) = C(2,1)        (2,1) C(2,1) = C(3,0)
    ...                          ...
    (2,1) C(2,1) = C(3,0)        (1,2) C(1,2) = C(2,1)

Iterations are labeled (i,j).
Before: C(2,1) is read at (1,2) and written later at (2,1), an anti dependence (δa).
After: C(2,1) is written at (2,1) and read later at (1,2), a flow dependence (δf).

Frameworks for Loop Transformations

Unimodular Loop Transformations [Banerjee 90],[Wolf & Lam 91]
– can represent loop permutation, loop reversal, and loop skewing
– unimodular linear mapping (determinant of matrix is +1 or -1)
– T i = i', where T is a matrix and i and i' are iteration vectors
– transformation is legal if the transformed dependence vectors remain lexicographically positive
– limitations
  – only perfectly nested loops
  – all statements are transformed the same

Legality of Loop Interchange, Reprise

Reduced case analysis of the direction vectors

(=,=)
  The dependence is loop independent, so it is unaffected by interchange.

(=,<)
  The dependence is carried by the j loop.
  After interchange the dependence will be (<,=), so the dependence will still be carried by the j loop, so the dependence relations do not change.

(<,>)
  The dependence is carried by the outer loop.
  After interchange the dependence will be (>,<), which changes the dependences and results in an illegal direction vector, so interchange is illegal.
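The legality rule of the unimodular framework can be tried out on the interchange cases above. The following is a minimal Python sketch, not the framework of [Banerjee 90] or [Wolf & Lam 91] itself: the helper names (det2, transform, lex_nonnegative, is_legal) are made up for illustration, only 2-deep nests are handled, and each direction-vector case is stood in for by a single representative distance vector, which is an assumption. An all-zero vector is accepted because it is a loop-independent dependence.

```python
def det2(T):
    """Determinant of a 2x2 integer matrix given as nested tuples."""
    return T[0][0] * T[1][1] - T[0][1] * T[1][0]


def transform(T, d):
    """Return the product T d for a 2x2 matrix T and a length-2 vector d."""
    return tuple(sum(t * x for t, x in zip(row, d)) for row in T)


def lex_nonnegative(d):
    """True if the first nonzero entry of d is positive, or d is all zeros.

    An all-zero vector is a loop-independent dependence, which the
    slides treat as unaffected by the transformation.
    """
    for x in d:
        if x != 0:
            return x > 0
    return True


def is_legal(T, distance_vectors):
    """Check every transformed distance vector against the legality rule."""
    if abs(det2(T)) != 1:
        raise ValueError("T is not unimodular")
    return all(lex_nonnegative(transform(T, d)) for d in distance_vectors)


INTERCHANGE = ((0, 1), (1, 0))  # swaps the two loops

# One representative distance vector per direction-vector case (assumption).
cases = {"(=,=)": (0, 0), "(=,<)": (0, 1), "(<,=)": (1, 0),
         "(<,<)": (1, 1), "(<,>)": (1, -1)}
for name, d in cases.items():
    print(name, transform(INTERCHANGE, d), is_legal(INTERCHANGE, [d]))
```

Only the (<,>) case is reported as illegal: its representative (1,-1) is mapped to (-1,1), which is not lexicographically positive, matching the case analysis.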
Loop Reversal

Idea
– Change the direction of loop iteration (i.e., from low-to-high indices to high-to-low indices or vice versa)

Benefits
– Improved cache performance
– Enables other transformations (coming soon)

Example

    do i = 1,6                   do i = 6,1,-1
      A(i) = B(i) + C(i)           A(i) = B(i) + C(i)
    enddo                        enddo

Loop Reversal and Distance Vectors

Impact
– Reversal of loop i negates the i-th entry of all distance vectors associated with the loop
– What about direction vectors?

When is reversal legal?
– When the loop being reversed does not carry a dependence (i.e., when the transformed distance vectors remain legal)

Example

    do i = 1,5
      do j = 1,6
        A(i,j) = A(i-1,j-1)+1
      enddo
    enddo

Dependence: Flow
Distance Vector: (1,1)
Transformed Distance Vector (after reversing the j loop): (1,-1), legal

Loop Reversal Example

Legality
– Loop reversal will change the direction of the dependence relation

Is the following legal?

    do i = 1,6                   do i = 6,1,-1
      A(i) = A(i-1)                A(i) = A(i-1)
    enddo                        enddo

Original loop: Dependence: Flow, Distance Vector: (1)
Reversed loop: Dependence: Anti (was Flow), Distance Vector: (1) becomes (-1)

The i loop carries the dependence, so this reversal is not legal.

Loop Skewing

Original code

    do i = 1,6
      do j = 1,5
        A(i,j) = A(i-1,j+1)+1
      enddo
    enddo

Distance vector: (1,-1)

Can we permute the original loop?

Skewing: i' = i, j' = i + j

[Figure: iteration space of the original loop (axes i, j) and of the skewed loop (axes i', j')]
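The effect of skewing on the nest above can be checked by simply running the loops. The sketch below is a Python transcription under stated assumptions: the array is zero-initialized (the slides do not say what A holds initially), the skew is j' = i + j as in the transformed code, and the bounds of the skewed-then-interchanged nest (j' from 2 to 11, i' from max(1, j'-5) to min(6, j'-1)) are derived here rather than taken from the slides. The function names are made up for illustration.

```python
def fresh():
    # A(0:6, 0:6), zero-initialized (assumption); row 0 and column 6 are only read.
    return [[0] * 7 for _ in range(7)]


def original():
    A = fresh()
    for i in range(1, 7):
        for j in range(1, 6):
            A[i][j] = A[i - 1][j + 1] + 1
    return A


def interchanged():
    # Plain interchange of the original nest: distance (1,-1) becomes (-1,1).
    A = fresh()
    for j in range(1, 6):
        for i in range(1, 7):
            A[i][j] = A[i - 1][j + 1] + 1
    return A


def skewed():
    # i' = i, j' = i + j, so j = j' - i' and j' runs from 1+i' to 5+i'.
    A = fresh()
    for ip in range(1, 7):
        for jp in range(1 + ip, 5 + ip + 1):
            A[ip][jp - ip] = A[ip - 1][jp - ip + 1] + 1
    return A


def skewed_then_interchanged():
    # After skewing the distance vector is (1,0); interchange gives (0,1).
    A = fresh()
    for jp in range(2, 12):
        for ip in range(max(1, jp - 5), min(6, jp - 1) + 1):
            A[ip][jp - ip] = A[ip - 1][jp - ip + 1] + 1
    return A


print(original() == skewed())                    # skewing alone keeps the order
print(original() == skewed_then_interchanged())  # permutation is legal after the skew
print(original() == interchanged())              # permutation of the original nest is not
```

With these assumptions the first two comparisons hold and the third does not: for example, A(2,1) is 2 in the original nest but 1 after plain interchange, because A(1,2) has not been written yet when it is read.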
Transforming the Dependences and Array Accesses

Original code

    do i = 1,6
      do j = 1,5
        A(i,j) = A(i-1,j+1)+1
      enddo
    enddo

Dependence vector: with i' = i and j' = i + j, the distance vector (1,-1) becomes (1,0)

New Array Accesses: A(i',j'-i') = A(i'-1,j'-i'+1)+1

[Figure: iteration space of the original loop (axes i, j) and of the skewed loop (axes i', j')]

Transforming the Loop Bounds

Original code

    do i = 1,6
      do j = 1,5
        A(i,j) = A(i-1,j+1)+1
      enddo
    enddo

Bounds: 1 <= i' <= 6, 1+i' <= j' <= 5+i'

Transformed code

    do i' = 1,6
      do j' = 1+i',5+i'
        A(i',j'-i') = A(i'-1,j'-i'+1)+1
      enddo
    enddo

[Figure: skewed iteration space (axes i', j')]

Loop Fusion

Idea
– Combine multiple loop nests into one

Example

    do i = 1,n                   do i = 1,n
      A(i) = A(i-1)                A(i) = A(i-1)
    enddo                          B(i) = A(i)/2
    do j = 1,n                   enddo
      B(j) = A(j)/2
    enddo

Pros
– May improve data locality
– Reduces loop overhead
– Enables array contraction (opposite of scalar expansion)
– May enable better instruction scheduling

Cons
– May hurt data locality
– May hurt icache performance

Legality of Loop Fusion

Basic Conditions
– Both loops must have same structure
  – Same loop depth
  – Same loop bounds
  – Same iteration directions
– Dependences must be preserved
  e.g., Flow dependences must not become anti dependences

Can we relax any of these restrictions?

    do i = 1,n                   do i = 1,n
      body1                        body1
    enddo                          body2
    do i = 1,n                   enddo
      body2
    enddo

Left: all cross-loop dependences flow from body1 to body2.
Right: ensure that fusion does not introduce dependences from body2 to body1.
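The condition that fusion must preserve dependences can be illustrated by running the unfused and fused versions and comparing results. This is a minimal Python sketch: with offset = 0 it transcribes the fusion example from the slides (B(i) = A(i)/2), while offset = 1 is a hypothetical variant, not from the slides, in which the second body reads A(i+1) so that fusion turns a flow dependence into an anti dependence. The initial array contents and the function names are assumptions made for illustration.

```python
def separate(n, offset):
    """Two loops: body1 writes A(i); body2 reads A(j + offset)."""
    A = [float(k) for k in range(n + 2)]  # A(0:n+1), arbitrary initial values (assumption)
    B = [0.0] * (n + 2)
    for i in range(1, n + 1):
        A[i] = A[i - 1]
    for j in range(1, n + 1):
        B[j] = A[j + offset] / 2
    return A, B


def fused(n, offset):
    """The same two bodies placed in one loop."""
    A = [float(k) for k in range(n + 2)]
    B = [0.0] * (n + 2)
    for i in range(1, n + 1):
        A[i] = A[i - 1]
        B[i] = A[i + offset] / 2
    return A, B


# offset = 0: body2 reads the value body1 wrote in the same iteration.
print(separate(5, 0) == fused(5, 0))

# offset = 1 (hypothetical): in the fused loop body2 reads A(i+1) before
# body1 has overwritten it, so the flow dependence becomes an anti dependence.
print(separate(5, 1) == fused(5, 1))
```

The first comparison holds and the second does not. Comparing results on one input is only a spot check; a compiler would decide legality from the dependences themselves.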
Loop Fusion Example

What are the dependences?

    do i = 1,n
s1    A(i) = B(i) + 1
    enddo
    do i = 1,n
s2    C(i) = A(i)/2
    enddo
    do i = 1,n
s3    D(i) = 1/C(i+1)
    enddo

Dependences: s1 δf s2, s2 δf s3

What are the dependences?

    do i = 1,n
s1    A(i) = B(i) + 1
s2    C(i) = A(i)/2
s3    D(i) = 1/C(i+1)
    enddo

Dependences: s1 δf s2, s3 δa s2

Fusion changes the dependence between s2 and s3, so fusion is illegal.

Is there some transformation that will enable fusion of these loops?

Loop Fusion Example (cont)

Loop reversal is legal for the original loops
– Does not change the direction of any dependence in the original code
– Will reverse the direction in the fused loop: s3 δa s2 will become s2 δf s3

    do i = n,1,-1
s1    A(i) = B(i) + 1
    enddo
    do i = n,1,-1
s2    C(i) = A(i)/2
    enddo
    do i = n,1,-1
s3    D(i) = 1/C(i+1)
    enddo

Dependences: s1 δf s2, s2 δf s3

    do i = n,1,-1
s1    A(i) = B(i) + 1
s2    C(i) = A(i)/2
s3    D(i) = 1/C(i+1)
    enddo

Dependences: s1 δf s2, s2 δf s3

After reversal and fusion all original dependences are preserved.

Concepts

Using direction and distance vectors

Transformation legality (from previous)
– must respect data dependences
– scalar expansion as a technique to remove anti and output dependences

Transformations:
– What is the benefit?
– What do they enable?
– When are they legal?

Unimodular transformation framework
– represents loop permutation, loop reversal, and loop skewing
– provides mathematical framework for ...
  – testing transformation legality,
  – transforming array accesses and loop bounds,
  – and combining transformations

Next Time

Lecture
– More loop transformations
– Another transformation framework