Compiler Design Spring 2018 8.0 Data-Flow Analysis Thomas R. Gross


  1. Compiler Design, Spring 2018
     8.0 Data-Flow Analysis
     Thomas R. Gross
     Computer Science Department, ETH Zurich, Switzerland

  2. Properties of program analysis
     § Analysis must be correct
     § Depends on use
     § Do not mislead the compiler
     § Analysis should be accurate
     § As accurate as possible (given the information in the program)
     § As accurate as feasible (may not be able to keep all details)

  3. Outline
     § Introduction
     § Why do we need data-flow analysis
     § Examples
     § 8.1 Program representation
     § 8.2 Points
     § 8.3 Paths
     § 8.4 Transfer functions

  5. int foo(int max, int[] A) {
         int k = 1;
         int minVal = A[0];
         while (k < max) {
             if (A[k] < minVal) {
                 minVal = A[k];
             }
             k = k + 1;
         }
         return minVal;
     }

  6. int foo(int max, int[] A) {
         k = 1
         minVal = A[0]
     L:  TCond1 = k < max
         if (TCond1)
             TCond2 = A[k] < minVal
             if (TCond2)
                 minVal = A[k]
             k = k + 1
             Goto L
         return minVal
     }

  9. § All instructions in a box are executed together
     § We ignore for now that there could be exceptions
     if (ConditionVariable)
     § Test ConditionVariable and proceed accordingly
     § “TRUE” path
     § “FALSE” path
     § Only one path is taken
     § “Goto Label” has the obvious meaning

  10. Basic block
      § The maximal sequence of instructions that is executed together is known as a “basic block”
      § Together means: without a change of control flow
      § You can form basic blocks directly from the JavaLi IR
      § May be harder in other languages or IRs

  11. Basic block
      § Observations
      § An if-statement ends a basic block
      § We do not know which statement will be executed next
      § A goto-statement ends a basic block
      § A return statement ends a basic block
      § A label starts a basic block (and ends the previous block)
      § Method call does not end a basic block
      § For some compilers a method call ends a basic block
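
The observations above can be turned into a small block-forming pass. The following is a minimal sketch in Python, not the course's implementation. The `(label, text)` statement encoding, the function name `form_basic_blocks`, and the label `M` on `k = k + 1` are assumptions made for illustration: a flat statement list needs an explicit label at the join point after `if (TCond2)`, which the structured IR on the slides leaves implicit.

```python
from typing import List, Optional, Tuple

# A statement is (label or None, statement text). Hypothetical encoding.
Stmt = Tuple[Optional[str], str]

# Keywords that end a basic block, following the observations on the slide.
TERMINATORS = {"if", "Goto", "return"}


def form_basic_blocks(stmts: List[Stmt]) -> List[List[Stmt]]:
    """Split a flat statement list into basic blocks in a single pass."""
    blocks: List[List[Stmt]] = []
    current: List[Stmt] = []
    for label, text in stmts:
        # A label starts a new block (and ends the previous one).
        if label is not None and current:
            blocks.append(current)
            current = []
        current.append((label, text))
        # An if, Goto, or return statement ends the current block.
        if text.split(maxsplit=1)[0] in TERMINATORS:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


FOO_IR: List[Stmt] = [
    (None, "k = 1"),
    (None, "minVal = A[0]"),
    ("L", "TCond1 = k < max"),
    (None, "if (TCond1)"),
    (None, "TCond2 = A[k] < minVal"),
    (None, "if (TCond2)"),
    (None, "minVal = A[k]"),
    ("M", "k = k + 1"),  # "M" is an assumed label for the join point
    (None, "Goto L"),
    (None, "return minVal"),
]

blocks = form_basic_blocks(FOO_IR)
for i, block in enumerate(blocks):
    print(f"B{i}:", "; ".join(text for _, text in block))
```

Run on the `foo` example, the pass produces six blocks. A method call is treated like any other statement here, so it does not end a block.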

  12. Control-flow graph
      § We can build a graph that captures the control flow
      § CFG: control-flow graph
      § Nodes: basic blocks
      § Edges: There is an edge between block B1 and block B2 if B2 may be executed immediately after B1.
      § Two special nodes: ENTRY and EXIT
      § ENTRY has no in-edges
      § EXIT has no out-edges
      § All other nodes have at least one in-edge and one out-edge

  13. B0: k = 1
          minVal = A[0]
      B1: L: TCond1 = k < max
          if (TCond1)
      B2: TCond2 = A[k] < minVal
          if (TCond2)
      B3: minVal = A[k]
      B4: k = k + 1
          Goto L
      B5: return minVal

  14. ENTRY
      B0: k = 1
          minVal = A[0]
      B1: L: TCond1 = k < max
          if (TCond1)
      B2: TCond2 = A[k] < minVal
          if (TCond2)
      B3: minVal = A[k]
      B4: k = k + 1
          Goto L
      B5: return minVal
      EXIT
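
One way to hold this graph in memory is a successor map from block name to block names. The sketch below is an illustration only: the edge list is read off the `foo` IR (it is not spelled out in the slide text), and the names `SUCC`, `predecessors`, and `has_valid_shape` are hypothetical.

```python
from typing import Dict, List

# Successor map for the CFG of foo; edges derived from the IR (assumption).
SUCC: Dict[str, List[str]] = {
    "ENTRY": ["B0"],
    "B0": ["B1"],
    "B1": ["B2", "B5"],  # TRUE path enters the loop body, FALSE path leaves it
    "B2": ["B3", "B4"],  # TRUE path updates minVal, FALSE path skips it
    "B3": ["B4"],
    "B4": ["B1"],        # Goto L
    "B5": ["EXIT"],
    "EXIT": [],
}


def predecessors(succ: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the successor map to obtain the predecessor relation."""
    pred: Dict[str, List[str]] = {node: [] for node in succ}
    for node, targets in succ.items():
        for target in targets:
            pred[target].append(node)
    return pred


def has_valid_shape(succ: Dict[str, List[str]]) -> bool:
    """Check the three structural properties of a CFG listed on the slide."""
    pred = predecessors(succ)
    if pred["ENTRY"] or succ["EXIT"]:
        return False  # ENTRY must have no in-edges, EXIT no out-edges
    inner = [n for n in succ if n not in ("ENTRY", "EXIT")]
    return all(bool(pred[n]) and bool(succ[n]) for n in inner)


PRED = predecessors(SUCC)
print("predecessors of B1:", PRED["B1"])
print("valid shape:", has_valid_shape(SUCC))
```

B1 and B4 each have two predecessors, which makes them join nodes; B1 and B2 each have two successors, which makes them split nodes.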

  15. Construct the CFG for this program (5 min)

      x = 2;
      if (x > 0) {
          y = 0;
          while (x < Nmax) {
              y = y + 1;
              x = 2 * x;
              if (x == Nspecial) {
                  x = x * x;
              }
          }
      } else {
          y = 1;
      }

  16. § How many nodes (basic blocks) are in the CFG?
      § Did you include ENTRY and EXIT?

  18. Control-flow graph
      § Edge defines successor/predecessor relationship
      § A block can be its own predecessor
      § Edges capture possible successor (predecessor) relationships
      § Upon further inspection we may want to remove edges and/or blocks
      § Always built for one method/function
      § May want to include a header block to deal with parameters
      § Some compilers may include code in the header block to free registers, “stack banging”, etc., and may have an exit block for restore operations

  19. Control-flow graph
      § Exceptional control flow discussion postponed
      § “try-throw-catch”
      § Exception triggered by hardware

  20. CFG construction
      § Basic blocks are a convenient abstraction
      § Not necessary for compiler
      § Some algorithms easier to describe when using basic blocks, with control flow edges between blocks
      § But many algorithms easy to explain in flat IR (sequences of statements)
      § If you must/want to construct a CFG (and identify blocks):
      § Single pass over current IR
      § Form basic blocks (CFG nodes) on the fly

  21. Comments on intermediate representations
      § Previous example “close” to JavaLi source
      § For illustration
      § Statements: JavaLi statements (almost)
      § Basic block concept applies also to low-level (assembler) languages
      § Statement: asm instruction
      § Or to other kinds of internal representations

  22. Comments (cont’d)
      § Many algorithms easier to explain/discuss for “simple” statements
      § destination = source1 op source2
      § Instead of destination = source1 op source2 op source3 op source4 …
      § We assume that our programs have this form (2 operands)
      § Three-address code
      § It’s easy to transform a program
        a = b + c + d
      § turns into
        temp1 = c + d
        a = b + temp1
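
The transformation into three-address code can be sketched as follows. This is a hedged illustration, not the course's code: the function `to_three_address` and the `tempN` naming scheme are assumptions, and the right-to-left grouping is chosen only to reproduce the slide's result. A real compiler may regroup operands only when the language semantics allow it.

```python
from typing import List


def to_three_address(dest: str, operands: List[str], op: str = "+") -> List[str]:
    """Lower `dest = o1 op o2 op ... op on` into two-operand statements.

    Operands are combined from the right, matching the slide's example.
    """
    if len(operands) < 2:
        raise ValueError("need at least two operands")
    stmts: List[str] = []
    acc = operands[-1]  # running right-hand operand
    temp_count = 0
    # Fold the inner operands right to left into fresh temporaries.
    for operand in reversed(operands[1:-1]):
        temp_count += 1
        temp = f"temp{temp_count}"
        stmts.append(f"{temp} = {operand} {op} {acc}")
        acc = temp
    # The final statement assigns to the original destination.
    stmts.append(f"{dest} = {operands[0]} {op} {acc}")
    return stmts


for stmt in to_three_address("a", ["b", "c", "d"]):
    print(stmt)
```

For `a = b + c + d` this prints `temp1 = c + d` followed by `a = b + temp1`, the form shown on the slide.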

  24. § Once we identify c+d as a “common sub-expression” (i.e., an expression that’s evaluated more than once) it’s easy to change the program
      § We could also work on large trees but it’s painful
      § Of course, it’s not clear if the compiler should replace the second occurrence of c+d by temp1.

  25. Analysis inside a basic block
      § “Local” program analysis
      § As the compiler knows that all instructions are executed together, it’s “easy” to analyze a basic block
      § It’s still far from trivial to consider transformations
      § …or to identify operands

  26. Example
      a = b + c + d;
      x = b + d;
      § There is a common sub-expression but the compiler (most likely) won’t find it
      § Even if we deal with integer operands…
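
To see why a compiler is unlikely to find this common sub-expression, consider a purely syntactic detector working on three-address code. The sketch below is an assumption-laden illustration: the tuple encoding `(dest, left, op, right)`, the function `find_common_subexpressions`, and the second test block are all hypothetical.

```python
from typing import List, Set, Tuple

Instr = Tuple[str, str, str, str]  # (dest, left, op, right)
Expr = Tuple[str, str, str]        # (left, op, right)


def find_common_subexpressions(block: List[Instr]) -> List[Expr]:
    """Report expressions recomputed in a block while their operands are unchanged."""
    available: Set[Expr] = set()
    found: List[Expr] = []
    for dest, left, op, right in block:
        expr = (left, op, right)
        if expr in available:
            found.append(expr)
        available.add(expr)
        # Assigning to dest invalidates every expression that reads dest.
        available = {e for e in available if dest not in (e[0], e[2])}
    return found


# a = b + c + d; x = b + d; lowered as on the earlier slide.
SLIDE_BLOCK: List[Instr] = [
    ("temp1", "c", "+", "d"),
    ("a", "b", "+", "temp1"),
    ("x", "b", "+", "d"),
]

# Hypothetical block in which c + d is literally computed twice.
REPEATED_BLOCK: List[Instr] = [
    ("temp1", "c", "+", "d"),
    ("a", "b", "+", "temp1"),
    ("temp2", "c", "+", "d"),
    ("x", "e", "+", "temp2"),
]

print(find_common_subexpressions(SLIDE_BLOCK))
print(find_common_subexpressions(REPEATED_BLOCK))
```

After lowering, `b + d` never appears as an expression of the first statement, so the detector reports nothing for the slide's example. It would take regrouping the operands of `b + c + d` to expose it.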

  27. Another example
      a[k] = 1;
      a[m] = 2;
      b = a[k] + 1;
      § Can the compiler assume a[k] = 1?
      § k, m method parameters (int)
      § a some (large) array (int)
      § no multi-threading, no hidden changes to k, m
      § No, as k == m is possible.
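
The effect of `k == m` can be checked directly by running the three statements. This is a small sketch with a hypothetical wrapper function `run_block` and made-up array contents and index values.

```python
from typing import List


def run_block(a: List[int], k: int, m: int) -> int:
    """Execute the three statements from the slide and return b."""
    a[k] = 1
    a[m] = 2  # overwrites a[k] when k == m
    b = a[k] + 1
    return b


distinct = run_block([0, 0, 0, 0], 0, 1)  # k != m
aliased = run_block([0, 0, 0, 0], 2, 2)   # k == m
print(distinct, aliased)
```

With distinct indices `b` is 2, but with `k == m` it is 3. A compiler that replaced `a[k]` by the constant 1 would compute 2 in both cases and change the program's behavior.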

  28. Analysis inside a basic block
      § Analysis (and optimization) of basic blocks postponed
      § Not difficult
      § Chapter 8 of Aho et al. contains a discussion of the topic

  29. Terminology
      § Local {analysis | transformation}: inside a basic block
      § Global {analysis | transformation}: inside a method/function
      § Intra-procedural…
      § Inter-procedural {analysis | transformation}: across methods/functions

  30. Outline
      § Introduction
      § Why do we need data-flow analysis
      § Examples
      § 8.1 Program representation
      § 8.2 Points
      § 8.3 Paths
      § 8.4 Transfer functions

  32. Points (cont’d)
      § Points can be extended to basic blocks as well
      § Point (as before): a place in a program
      § Given a basic block B
      § P_before_B: point before basic block B is executed
      § P_after_B: point after basic block B is executed
      § Drop B if no risk of confusion

  33. B0: k = 1
          minVal = A[0]
      B1: L: TCond1 = k < max
          if (TCond1)
      B2: TCond2 = A[k] < minVal
          if (TCond2)
      B3: minVal = A[k]
      B4: k = k + 1
          Goto L
      B5: return minVal

  35. Comments
      § Points may have multiple predecessors
      § Join points
      § Join nodes (in the CFG)
      § Points may have multiple successors
      § Split points
      § Split nodes
      § When summarizing paths we may just list the basic blocks

  36. Paths (figure: CFG with blocks B0, B1, B2, B3)

  37. Paths (figure: CFG with blocks B0, B1, B2, B3; contains i = 0 and the test if (i>0))

  38. Paths (figure: CFG with blocks B0 to B6; contains the tests if (i>0) and if (i<0))

  39. Paths
      § We are interested in all paths in a CFG
      § Even if they can never be taken in an execution
      § Need summary information
      § There can be arbitrarily many paths

  40. Loops (figure: CFG with blocks B0, B1, B2, B3; contains the test if (COND))
      § B0 B1 B3
      § B0 B1 B2 B1 B3
      § B0 B1 B2 B1 B2 B1 B3
      § B0 B1 B2 B1 B2 B1 B2 B1 B2 B1 B2 B1 B2 B1 B2 B1 B2 B1 B2 …
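
The growth in the number of paths can be made concrete by enumerating paths up to a length bound. The sketch below assumes the loop CFG has the edges B0→B1, B1→B2, B1→B3, and B2→B1, which is inferred from the listed paths rather than stated in the slide text. The names `LOOP_CFG` and `paths_between` are hypothetical.

```python
from typing import Dict, List

# Assumed edges of the loop CFG, inferred from the paths listed on the slide.
LOOP_CFG: Dict[str, List[str]] = {
    "B0": ["B1"],
    "B1": ["B2", "B3"],
    "B2": ["B1"],
    "B3": [],
}


def paths_between(succ: Dict[str, List[str]], start: str, goal: str,
                  max_blocks: int) -> List[List[str]]:
    """Enumerate all paths from start to goal that visit at most max_blocks blocks."""
    result: List[List[str]] = []

    def extend(path: List[str]) -> None:
        node = path[-1]
        if node == goal:
            result.append(path)
            return
        if len(path) == max_blocks:
            return  # bound reached without arriving at goal
        for nxt in succ[node]:
            extend(path + [nxt])

    extend([start])
    return result


for bound in (3, 5, 7, 9):
    found = paths_between(LOOP_CFG, "B0", "B3", bound)
    print(bound, len(found))
```

Every extra loop iteration adds one more path, so no finite bound covers them all. This is why an analysis needs summary information rather than a list of paths.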

  41. Paths
      § We deal only with finite paths
      § Practical view: What has happened when program execution reaches B3
      § P_before_B3
      § Execution (for some input) may never reach B3…
      § Summary information is needed
