SSA Introduction
Sebastian Hack, hack@cs.uni-saarland.de
Compiler Construction 2013
Saarland University, Computer Science

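Another kind of CFGs

CFG with effects on edges (figure: edge from $p$ to $q$ labeled $x \leftarrow e$):
- Effects on edges.
- Nodes are called program points.
- One data flow fact per program point.
- Join of data flow facts done in fixpoint iteration (cf. data flow slides).

CFG with instructions in nodes (figure: node $\ell$ containing $x \leftarrow e$, with $D^\circ(\ell)$ at its entry and $D^\bullet(\ell)$ at its exit):
- Nodes are basic blocks of instructions. Closer to the hardware.
- Edges denote flow of control.
- Every node has incoming ($\circ$) and outgoing ($\bullet$) data flow information:
  $$D^\circ(\ell) := \bigsqcup_{p \in pred(\ell)} D^\bullet(p)$$
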
Problem and Motivation
- Consider Constant Propagation
- Lattice: $D := (Vars \to \mathbb{Z}^\top)_\bot$
- Per CFG node we have to keep a mapping from $V := |Vars|$ variables to abstract values
- Space requirement $N \times V$ for $N$ CFG nodes
- Thus runtime $O(N \times V)$ rounds in the fixpoint iteration
- and $O(N \times V^2)$ in analysis updates per variable

Example CFG (figure; node contents):
  A: x ← 1
  B: y ← 1
  C: (empty)
  D: true(y = 1)      E: false(y = 1)
  F: (empty)          G: x ← 2
  H: true(x = 1)      I: false(x = 1)
  J: (empty)          K: y ← 2
  L: true(?)          M: false(?)
  N: print(x)

Flow-Insensitive Constant Propagation
- Get around storing a map from variables to $\mathbb{Z}^\top$ at every program point
- Keep one element $d \in D$ per CFG, not per program point
- Solve the single inequality
  $$d \sqsupseteq \bigsqcup_i f_i(d)$$
- Loss of precision because the abstract values of all definitions of a variable are joined

(Figure: the same example CFG as on the "Problem and Motivation" slide.)

SSA
- Flow-insensitive analyses
- Each variable has a static single assignment, i.e. one program point where it occurs on the left-hand side of an assignment
- Identify program points and variable names
- φ-functions select proper definitions at control-flow joins

Example CFG in SSA form (figure; node contents):
  A: x1 ← 1
  B: y1 ← 1
  C: x2 ← φ(x1, x5); y2 ← φ(y1, y4)
  D: true(y2 = 1)                     E: false(y2 = 1)
  F: x4 ← φ(x2, x3); x5 ← 2 − x4      G: x3 ← 2
  H: true(x5 = 1)                     I: false(x5 = 1)
  J: y4 ← φ(y2, y3)                   K: y3 ← 2
  L: true(?)                          M: false(?)
  N: print(x5)

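(Un-Conditional) Constant Propagation in SSA
- Perform flow-insensitive analysis on the SSA program
- Domain: $D := (Vars \to \mathbb{Z}^\top_\bot)$
- Transfer functions:
  $$\llbracket ; \rrbracket^\sharp D := D$$
  $$\llbracket x \leftarrow e; \rrbracket^\sharp D := D[x \mapsto \llbracket e \rrbracket^\sharp D]$$
  $$\llbracket x \leftarrow M[e]; \rrbracket^\sharp D := D[x \mapsto \top]$$
  $$\llbracket M[e_1] \leftarrow e_2 \rrbracket^\sharp D := D$$
  $$\llbracket x_0 \leftarrow \phi(x_1, \dots, x_n) \rrbracket^\sharp D := D\Big[x_0 \mapsto \bigsqcup_{1 \le i \le n} D(x_i)\Big]$$
- φ-functions make the join over different reaching definitions explicit
- Solve the single inequality
  $$D \sqsupseteq \bigsqcup_i f_i\, D$$
  by fixpoint iteration
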
Example

(Figure: the SSA-form example CFG from the "SSA" slide.)

Values per round of the fixpoint iteration:

  var   0   1   2   3
  x1    ⊥   1   1   1
  y1    ⊥   1   1   1
  x2    ⊥   ⊥   1   ⊤
  y2    ⊥   ⊥   1   ⊤
  x3    ⊥   2   2   2
  x4    ⊥   ⊥   ⊤   ⊤
  x5    ⊥   ⊥   ⊤   ⊤
  y3    ⊥   2   2   2
  y4    ⊥   ⊥   ⊤   ⊤

Round-robin iteration. Initialization with ⊥. Fixed point reached after three rounds. Precision loss at the φs because we could not exclude unreachable code.

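The fixed point in the last column can be reproduced with a few lines of Python. This is only a sketch under assumptions: the names `BOT`, `TOP`, `join`, `sub`, `PROGRAM` and `solve` are made up for illustration, the SSA program is encoded by hand from the example, and values are updated in place, so intermediate rounds may differ from the table even though the final result is the same.

```python
BOT, TOP = "bot", "top"  # least and greatest element of the flat lattice


def join(a, b):
    """Least upper bound in the flat constant lattice."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP


def sub(a, b):
    """Abstract subtraction: bot if an operand is bot, else top if one is top."""
    if BOT in (a, b):
        return BOT
    if TOP in (a, b):
        return TOP
    return a - b


# One entry per SSA definition: (variable, abstract right-hand side).
# The true(...)/false(...) nodes define nothing and are ignored here.
PROGRAM = [
    ("x1", lambda d: 1),
    ("y1", lambda d: 1),
    ("x2", lambda d: join(d["x1"], d["x5"])),  # x2 <- phi(x1, x5)
    ("y2", lambda d: join(d["y1"], d["y4"])),  # y2 <- phi(y1, y4)
    ("x3", lambda d: 2),
    ("x4", lambda d: join(d["x2"], d["x3"])),  # x4 <- phi(x2, x3)
    ("x5", lambda d: sub(2, d["x4"])),         # x5 <- 2 - x4
    ("y3", lambda d: 2),
    ("y4", lambda d: join(d["y2"], d["y3"])),  # y4 <- phi(y2, y3)
]


def solve(program):
    """Round-robin iteration from bot until no value changes."""
    d = {x: BOT for x, _ in program}
    changed = True
    while changed:
        changed = False
        for x, rhs in program:
            new = join(d[x], rhs(d))
            if new != d[x]:
                d[x], changed = new, True
    return d


result = solve(PROGRAM)
```

Because only one map `d` is kept for the whole program, the loop-carried `x5` and the dead definition `x3` both flow into the φs, which is exactly where the precision is lost.
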
Conditional Constant Propagation on SSA
Called sparse conditional constant propagation (SCCP) [Wegman et al. 1991]
- Consider control flow as well. Perform two analyses in parallel
- Cooperation between two domains:
  $$D := Vars \to \mathbb{Z}^\top_\bot \qquad C := Blocks \to \{d, r\}$$
- $d$ = dead code, $r$ = reachable code
- Two transfer functions per program point $i$:
  $f_i : D \times C \to D$ for constant propagation
  $g_i : D \times C \to C$ for reachability
- Solve the system of inequalities
  $$x \sqsupseteq \bigsqcup_i f_i(x, y) \qquad y \sqsupseteq \bigsqcup_i g_i(x, y) \qquad x \in D,\ y \in C$$

Example

(Figure: the SSA-form example CFG from the "SSA" slide.)

Values per round of the fixpoint iteration:

  var   0   1   2
  x1    ⊥   1   1
  y1    ⊥   1   1
  x2    ⊥   1   1
  y2    ⊥   1   1
  x3    ⊥   2   2
  x4    ⊥   1   1
  x5    ⊥   1   1
  y3    ⊥   2   2

  node  0   1   2
  A     r   r   r
  B     d   r   r
  C     d   r   r
  D     d   r   r
  E     d   d   d
  F     d   r   r
  G     d   d   d
  H     d   r   r
  I     d   d   d
  J     d   r   r
  K     d   d   d
  L     d   r   r
  M     d   r   r
  N     d   r   r

Round-robin iteration. Each column shows the value of $x \in D$ (upper table) and $y \in C$ (lower table) in a single iteration of the fixpoint algorithm. Initial values are ⊥ and d. Root node A is initialized with r. Fixed point reached after one round. Code can be proven dead in cooperation with the constant propagation information.

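Transfer Functions
- For constant propagation (functions $f_i$):
  $$\llbracket \ell : x \leftarrow e; \rrbracket^\sharp D, C := D[x \mapsto \llbracket e \rrbracket^\sharp D]$$
  $$\llbracket \ell : x \leftarrow M[e]; \rrbracket^\sharp D, C := D[x \mapsto \top]$$
  $$\llbracket \ell : x_0 \leftarrow \phi(x_1, \dots, x_n) \rrbracket^\sharp D, C := D\big[x_0 \mapsto \bigsqcup X\big] \qquad X := \{ x_i \mid C(pred(\ell, i)) = r \}$$
  $$\llbracket \cdot \rrbracket^\sharp D, C := D$$
- For reachability (functions $g_i$):
  $$\llbracket \ell : true(e) \rrbracket^\sharp D, C := C\left[\ell \mapsto \begin{cases} d & \llbracket e \rrbracket^\sharp D \sqsubseteq 0 \\ r & \text{otherwise} \end{cases}\right]$$
  $$\llbracket \ell : false(e) \rrbracket^\sharp D, C := C\left[\ell \mapsto \begin{cases} r & 0 \sqsubseteq \llbracket e \rrbracket^\sharp D \\ d & \text{otherwise} \end{cases}\right]$$
  $$\llbracket \cdot \rrbracket^\sharp D, C := C$$
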
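To see the two analyses cooperate, here is a minimal Python sketch of the combined iteration on the running example. Several things in it are assumptions rather than part of the slides: the CFG edges (the `preds` lists) are guessed from the order of the φ arguments, the names `CFG`, `sccp`, `cond_value` and `guard_passes` are hypothetical, and a block is taken to be reachable if it is the entry or if some predecessor is reachable and its `true`/`false` guard lets control through.

```python
BOT, TOP = "bot", "top"


def join(a, b):
    """Least upper bound in the flat constant lattice."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP


# block: (preds, guard, statements); guard = (branch, var, const) encodes
# true(var = const) or false(var = const); var None stands for the unknown "?".
# ASSUMPTION: predecessor lists are inferred from the phi argument order.
CFG = {
    "A": ([], None, [("const", "x1", 1)]),
    "B": (["A"], None, [("const", "y1", 1)]),
    "C": (["B", "L"], None, [("phi", "x2", ["x1", "x5"]),
                             ("phi", "y2", ["y1", "y4"])]),
    "D": (["C"], (True, "y2", 1), []),
    "E": (["C"], (False, "y2", 1), []),
    "F": (["D", "G"], None, [("phi", "x4", ["x2", "x3"]),
                             ("rsub", "x5", 2, "x4")]),  # x5 <- 2 - x4
    "G": (["E"], None, [("const", "x3", 2)]),
    "H": (["F"], (True, "x5", 1), []),
    "I": (["F"], (False, "x5", 1), []),
    "J": (["H", "K"], None, [("phi", "y4", ["y2", "y3"])]),
    "K": (["I"], None, [("const", "y3", 2)]),
    "L": (["J"], (True, None, None), []),
    "M": (["J"], (False, None, None), []),
    "N": (["M"], None, []),
}


def cond_value(d, guard):
    """Abstract value of the guard expression 'var = const' (1, 0, bot or top)."""
    _, var, const = guard
    if var is None:
        return TOP
    v = d[var]
    if v in (BOT, TOP):
        return v
    return 1 if v == const else 0


def guard_passes(d, guard):
    if guard is None:
        return True
    e = cond_value(d, guard)
    if guard[0]:
        return e not in (BOT, 0)  # true(e) is dead iff e is below 0
    return e in (0, TOP)          # false(e) is reachable iff 0 is below e


def sccp(cfg, entry):
    d = {s[1]: BOT for _, _, stmts in cfg.values() for s in stmts}
    c = {b: b == entry for b in cfg}  # True = r (reachable), False = d (dead)
    changed = True
    while changed:
        changed = False
        for b, (preds, guard, stmts) in cfg.items():
            if not c[b] and any(c[p] for p in preds) and guard_passes(d, guard):
                c[b], changed = True, True
            for s in stmts:
                if s[0] == "const":
                    new = s[2]
                elif s[0] == "phi":
                    new = BOT  # join only arguments from reachable predecessors
                    for p, arg in zip(preds, s[2]):
                        if c[p]:
                            new = join(new, d[arg])
                else:  # ("rsub", x, k, y) means x <- k - y
                    y = d[s[3]]
                    new = y if y in (BOT, TOP) else s[2] - y
                new = join(d[s[1]], new)
                if new != d[s[1]]:
                    d[s[1]], changed = new, True
    return d, c


values, reachable = sccp(CFG, "A")
```

As in the transfer functions, ordinary assignments ignore reachability, so `x3` and `y3` still get the value 2; only the φs filter their arguments by the reachability of the corresponding predecessor.
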
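φ-functions have semantics

A block that starts with the φ-functions
  $$x_{01} \leftarrow \phi(x_{11}, \dots, x_{m1})$$
  $$\vdots$$
  $$x_{0n} \leftarrow \phi(x_{1n}, \dots, x_{mn})$$
is equivalent ($\equiv$) to
- on the $i$-th incoming edge: $X_1 \leftarrow x_{i1}, \dots, X_n \leftarrow x_{in}$
- at the beginning of the block: $x_{01} \leftarrow X_1, \dots, x_{0n} \leftarrow X_n$

Commonly stated as: all φ-functions are evaluated simultaneously at the beginning of the block.
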
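The difference between simultaneous and one-by-one evaluation shows up as soon as one φ reads the target of another φ in the same block. The following Python sketch illustrates this; the function names and the encoding of φs as `(target, args)` pairs are assumptions made for this example only.

```python
def eval_phis(env, phis, edge):
    """Evaluate all phis of a block for control arriving over edge number `edge`."""
    # On the edge: copy the selected argument of every phi into a temporary X_k.
    temps = [env[args[edge]] for _, args in phis]
    # At the beginning of the block: copy the temporaries into the phi targets.
    new_env = dict(env)
    for (target, _), value in zip(phis, temps):
        new_env[target] = value
    return new_env


def eval_phis_sequential(env, phis, edge):
    """Incorrect variant for contrast: evaluates the phis one after another."""
    new_env = dict(env)
    for target, args in phis:
        new_env[target] = new_env[args[edge]]
    return new_env


# Loop header that swaps a and b on the back edge (edge 1):
#   a <- phi(a0, b)
#   b <- phi(b0, a)
phis = [("a", ["a0", "b"]), ("b", ["b0", "a"])]
env = {"a0": 1, "b0": 2, "a": 1, "b": 2}

swapped = eval_phis(env, phis, 1)            # a and b exchange their values
broken = eval_phis_sequential(env, phis, 1)  # b reads the already overwritten a
```

Reading all arguments before writing any target is what the temporaries $X_1, \dots, X_n$ on the edge express.
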
Where to place φ-functions?
- φ-functions have to be placed such that
  1. the SSA program $P'$ has the same semantics as the original program $P$
  2. every variable has exactly one program point where it is defined

(Figure: two definitions $x_1 \leftarrow \dots$ and $x_2 \leftarrow \dots$, followed by the use $y \leftarrow x_? + 1$.)

- Observation:
  - The first point reached by two different definitions of a (non-SSA) variable has to contain a φ-function
  - In the SSA-form program, every use is reached by a single unique definition

Join Points

Definition: Two paths $p : X_0 \to^* X_j$ and $q : Y_0 \to^* Y_k$ converge at a program point $Z$ if
  1. $X_0 \neq Y_0$
  2. $Z = X_j = Y_k$
  3. $X_{j'} = Y_{k'} \Rightarrow j = j' \vee k = k'$

- A program point $Z$ needs a φ-function for variable $a$ if it is the convergence point of two paths that start at program points $X_0$ and $Y_0$, each of which is a definition of $a$
- Formally: $J(S) := \{ Z \mid X, Y \in S \text{ converge at } Z \}$
- $J(defs(a))$ is the set of program points where φ-functions have to be placed for $a$
- How to compute join points efficiently?

Dominance
- Every SSA variable has a unique program point where it is defined
- The definition of an SSA variable dominates all its (non-φ) uses

Definition (Dominance): A node $X$ in the CFG dominates a node $Y$ if every path from entry to $Y$ contains $X$. Write $X \geq Y$.

- Dominance is a partial order
- Dominance is a tree order: for every $X, Y, Z$ with $X \geq Z$ and $Y \geq Z$, $X \geq Y$ or $Y \geq X$ holds
- Strict dominance: $X > Y := X \geq Y \wedge X \neq Y$
- Immediate/direct dominator: $idom(Z) = X$ with $X > Z \wedge \nexists Y : X > Y > Z$

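The definition can be turned into a simple (not the fastest) fixpoint computation: the dominators of a node are the node itself plus whatever dominates all of its predecessors. The sketch below assumes that every node is reachable from the entry and appears as a key of the successor map; the function names and the small example CFG are made up for illustration.

```python
def dominators(succ, entry):
    """Map every node to the set of nodes that dominate it."""
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)
    # Start from "everything dominates everything" and shrink to the fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom


def immediate_dominators(dom, entry):
    """idom(n) is the strict dominator of n that is deepest in the tree."""
    # Strict dominators of n form a chain (tree order), so the deepest one
    # is the one with the largest dominator set of its own.
    return {n: max(dom[n] - {n}, key=lambda x: len(dom[x]))
            for n in dom if n != entry}


# Hypothetical CFG: a diamond (a -> b, c -> d) inside a loop (d -> a).
EXAMPLE = {"entry": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"],
           "d": ["a", "exit"], "exit": []}
DOM = dominators(EXAMPLE, "entry")
IDOM = immediate_dominators(DOM, "entry")
```

In the example, neither `b` nor `c` dominates `d`, because `d` can be reached through the other branch; its immediate dominator is `a`.
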
Dominance Frontiers
Efficiently computing SSA... [Cytron et al. 1991]

Definition (Dominance Frontier):
  $$DF(X) = \{ Y \mid X \not> Y \wedge (\exists Z \text{ predecessor of } Y : X \geq Z) \}$$

- $DF^+(X)$ is the least fixed point $S$ of $S = DF(X \cup S)$, with $DF$ extended to sets of nodes by union
- Theorem: $DF^+(X) = J(X)$
- Proof sketch:
  1. Show that for every path $p : X \to^* Z$ there is a node in $\{X\} \cup DF^+(X)$ on $p$ that dominates $Z$
  2. Show that the convergence point $Z$ of two paths $X \to^* Z$, $Y \to^* Z$ is contained in $DF^+(X) \cup DF^+(Y)$
  3. Using this, we can show that $J(S) \subseteq DF^+(S)$
  4. Show $DF(S) \subseteq J(S)$ for $entry \in S$
  5. Using induction on $DF^i$, show that $DF^+(S) \subseteq J(S)$

Dominance Frontiers

Definition (Dominance Frontier):
  $$DF(X) = \{ Y \mid X \not> Y \wedge (\exists Z \text{ predecessor of } Y : X \geq Z) \}$$

- Can be efficiently computed by a bottom-up traversal over the dominance tree:
  1. Each CF-successor $Z$ of $X$ is either strictly dominated by $X$ or not
  2. If not, it is in the dominance frontier of $X$
  3. If yes, look at the dominance frontier of $Z$: all $Y \in DF(Z)$ not strictly dominated by $X$ are also in $DF(X)$

  $$DF(X) = \{ Y \text{ successor of } X \mid X \not> Y \} \cup \bigcup_{X = idom(Z)} \{ Y \in DF(Z) \mid X \not> Y \}$$

The test has to be strict dominance, as in the definition: a loop header $X$ can be in its own dominance frontier, although $X \geq X$.

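A possible Python rendering of the bottom-up traversal is sketched below. It takes the successor map and the immediate dominators as given; the function name and the example CFG are assumptions for illustration. The check "X does not strictly dominate Y" is written as `idom(Y) != X`, which is the equivalent test used by Cytron et al. in these two situations.

```python
def dominance_frontiers(succ, idom, entry):
    """Compute DF(X) for all X by a post-order walk over the dominance tree."""
    children = {n: [] for n in succ}
    for node, parent in idom.items():
        children[parent].append(node)
    df = {n: set() for n in succ}

    def visit(x):
        for z in children[x]:  # bottom-up: children first
            visit(z)
        # Local part: CF-successors that x does not strictly dominate.
        for y in succ[x]:
            if idom.get(y) != x:
                df[x].add(y)
        # Inherited part: frontier nodes of dominance-tree children
        # that x does not strictly dominate either.
        for z in children[x]:
            for y in df[z]:
                if idom.get(y) != x:
                    df[x].add(y)

    visit(entry)
    return df


# Hypothetical CFG: a diamond (a -> b, c -> d) inside a loop (d -> a).
EXAMPLE = {"entry": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"],
           "d": ["a", "exit"], "exit": []}
EXAMPLE_IDOM = {"a": "entry", "b": "a", "c": "a", "d": "a", "exit": "d"}
DF = dominance_frontiers(EXAMPLE, EXAMPLE_IDOM, "entry")
```

In this example the loop header `a` ends up in its own dominance frontier because of the back edge from `d`, which is the case that requires the strict test.
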
SSA Construction
Cytron et al.
1. Compute the dominance tree
2. Compute the iterated dominance frontiers $DF^+(X)$ for all definitions of each variable and place a φ-function for the variable at each of these nodes
3. Rename variables
   - Every use takes the lowest definition above it in the dominance tree
   - Note that φ-function uses happen at the end of the predecessors
   - The first lemma of the proof sketch guarantees that this definition is available

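Step 2 is a small worklist computation once the dominance frontiers are known. The sketch below assumes that the frontiers are given as a map from nodes to sets; the function name, the frontier map and the choice of definition sites are hypothetical and only serve as an example. Renaming (step 3) is not shown.

```python
def iterated_dominance_frontier(df, defs):
    """Return DF+(defs): the nodes that need a phi for a variable defined in defs."""
    result = set()
    work = list(defs)
    while work:
        x = work.pop()
        for y in df[x]:
            if y not in result:
                # A phi placed at y is itself a new definition of the variable,
                # so the frontier of y has to be processed as well.
                result.add(y)
                work.append(y)
    return result


# Hypothetical dominance frontiers of a diamond (a -> b, c -> d) inside a loop (d -> a).
EXAMPLE_DF = {"entry": set(), "a": {"a"}, "b": {"d"}, "c": {"d"},
              "d": {"a"}, "exit": set()}

# A variable assigned in both branches b and c needs phis at d and at a.
phi_nodes = iterated_dominance_frontier(EXAMPLE_DF, {"b", "c"})
```

The second φ at `a` only appears because the φ at `d` counts as a definition, which is why the plain dominance frontier is not enough and the iterated one is needed.
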