CSE 501: Implementation of Programming Languages
Course outline
Craig Chambers
Main focus: program analysis and transformation
• how to represent programs?
• how to analyze programs? what to analyze?
• how to transform programs? what transformations to apply?

Study imperative, functional, and object-oriented languages

Official prerequisites:
• CSE 401 or equivalent
• CSE 505 or equivalent

Reading:
• Appel’s “Modern Compiler Implementation”, plus ~20 papers from the literature
• “Compilers: Principles, Techniques, & Tools” (a.k.a. the Dragon Book) as a reference

Coursework:
• periodic homework assignments
• major course project
• midterm, final

Course outline
• models of compilation/analysis
• standard optimizing transformations
• basic representations and analyses
• fancier representations and analyses
• interprocedural representations, analyses, and transformations, for imperative, functional, and OO languages
• run-time system issues: garbage collection; compiling dynamic dispatch, first-class functions, ...
• dynamic (JIT) compilation
• other program analysis frameworks and tools: model checking, constraints, best-effort “bug finders”

Why study compilers?

Compilers are a meeting area of programming languages and architectures
• the capabilities of compilers greatly influence the design of both

Program representation, analysis, and transformation are widely useful beyond pure compilation
• software engineering tools
• DB query optimizers, programmable graphics renderers (domain-specific languages and optimizers)
• safety/security checking of code, e.g. in programmable/extensible systems, networks, databases

Cool theoretical aspects, too
• lattice domains, graph algorithms, computability/complexity

Goals for language implementation

Correctness

Efficiency
• of: time, data space, code space
• at: compile time, run time

Support expressive, safe language features
• first-class, higher-order functions
• method dispatching
• exceptions, continuations
• reflection, dynamic code loading
• bounds-checked arrays, ...
• garbage collection
• ...

Support desirable programming environment features
• fast turnaround
• separate compilation, shared libraries
• source-level debugging
• profiling
• ...

Standard compiler organization

Analysis of the input program (the front end), then synthesis of the output program (the back end):

    character stream
      → Lexical Analysis → token stream
      → Syntactic Analysis → abstract syntax tree
      → Semantic Analysis → annotated AST
      → Intermediate Code Generation → intermediate form
      → Optimization → intermediate form
      → Code Generation → target language

An interpreter can also run directly on the annotated AST or on the intermediate form.

Mixing front ends and back ends

Define an intermediate language (e.g. Java bytecode, MSIL, SUIF, WIL, C, C--, ...)
• compile multiple languages into it (each such compiler may be little more than a front end)
• compile to multiple targets from it (each such compiler may be little more than a back end)
• or, interpret/execute it directly
• or, perform other analyses of it

Advantages:
• reuse of front ends and back ends
• portable “compiled” code

BUT: designing a portable intermediate language is hard
• how universal? across input language models? across target machine models?
• high-level or low-level?

Key questions

• How are programs represented in the compiler?
• How are analyses organized/structured?
  • Over what region of the program are analyses performed?
  • What analysis algorithms are used?
• What kinds of optimizations can be performed?
  • Which are profitable in practice?
  • How should analyses/optimizations be sequenced/combined?
• How best to compile in the face of:
  • pointers, arrays
  • first-class functions
  • inheritance & message passing
  • parallel target machines
• Other issues:
  • speeding compilation
  • making compilers portable, table-driven
  • supporting tools like debuggers, profilers, garbage collectors

Overview of optimizations

First analyze the program to learn things about it, then transform the program based on that information. Repeat...

Requirement: don’t change the semantics!
• transform the input program into a semantically equivalent but better output program

Analysis determines when transformations are:
• legal
• profitable

Caveat: “optimize” is a misnomer
• the result is almost never optimal
• optimizations sometimes slow down some programs on some inputs (although the hope is to speed up most programs on most inputs)
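The analyze-then-transform loop above can be sketched as a toy pass over three-address code. This is my own minimal illustration, not code from the course; the tuple representation and names are assumptions:

```python
# Toy analyze-then-transform pass: constant folding over three-address code.
# Each instruction is (dest, op, arg1, arg2); args are ints (constants)
# or strings (variable names). Representation is mine, for illustration.
def fold_constants(code):
    """Replace ops whose arguments are all constant with their result."""
    out = []
    for dest, op, a, b in code:
        if isinstance(a, int) and isinstance(b, int):
            result = {"+": a + b, "-": a - b, "*": a * b}[op]
            out.append((dest, "const", result, None))   # e.g. x := 7
        else:
            out.append((dest, op, a, b))                # left unchanged
    return out

prog = [("x", "+", 3, 4),     # x := 3 + 4
        ("y", "+", "x", 2)]   # y := x + 2  (x is a variable, not folded)
print(fold_constants(prog))   # [('x', 'const', 7, None), ('y', '+', 'x', 2)]
```

Running constant propagation over the folded results and then folding again would turn `y := x + 2` into `y := 7`, illustrating the "Repeat..." step.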

Semantics

Exactly what are the semantics that are to be preserved?

Subtleties:
• evaluation order
• arithmetic properties like associativity, commutativity
• behavior in “error” cases

Some languages are very precise
• programmers always know what they’re getting

Others are weaker
• allow better performance (but how much?)

Semantics selected by compiler option?

Scope of analysis

Peephole: across a small number of “adjacent” instructions [adjacent in space or time]
• trivial analysis

Local: within a basic block
• simple, fast analysis

Intraprocedural (a.k.a. global): across basic blocks, within a procedure
• analysis more complex: branches, merges, loops

Interprocedural: across procedures, within a whole program
• analysis even more complex: calls, returns
• hard with separate compilation

Whole-program: analysis examines the whole program in order to prove safety

A tour of common optimizations/transformations

arithmetic simplifications:
• constant folding
      x := 3 + 4  ⇒  x := 7
• strength reduction
      x := y * 4  ⇒  x := y << 2

constant propagation:
      x := 5          x := 5          x := 5
      y := x + 2  ⇒   y := 5 + 2  ⇒  y := 7

integer range analysis
• fold comparisons based on range analysis
• eliminate unreachable code
      for (index = 0; index < 10; index++) {
          if index >= 10 goto _error
          a[index] := 0
      }
• more generally, symbolic assertion analysis

common subexpression elimination (CSE):
      x := a + b         x := a + b
      ...            ⇒   ...
      y := a + b         y := x
• can also eliminate redundant memory references, branch tests

partial redundancy elimination (PRE)
• like CSE, but with the earlier expression only available along a subset of the possible paths
      if ... then
          x := a + b
      end
      ...
      y := a + b
  ⇒
      if ... then
          t := a + b; x := t
      else
          t := a + b
      end
      ...
      y := t
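Local CSE, as described above, can be sketched as a single pass over one basic block that remembers which expressions are already "available". This is a hedged illustration of mine, not the course's code; the three-address tuple form and the `available` table are assumptions:

```python
# Sketch of local common subexpression elimination within one basic block.
# Instructions are (dest, op, arg1, arg2) tuples; assigning to a variable
# invalidates ("kills") any remembered expression that mentions it.
def local_cse(block):
    available = {}   # (op, arg1, arg2) -> variable currently holding that value
    out = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if key in available:
            out.append((dest, "copy", available[key], None))  # y := x
        else:
            out.append((dest, op, a, b))
        # dest is redefined: kill expressions held in dest or computed from it
        available = {k: v for k, v in available.items()
                     if v != dest and dest not in (k[1], k[2])}
        if dest not in (a, b):   # e.g. x := x + 1 yields no reusable value
            available[key] = dest
    return out

block = [("x", "+", "a", "b"),   # x := a + b
         ("y", "+", "a", "b")]   # y := a + b  ⇒  y := x
print(local_cse(block))
```

The kill step is what makes this a *local* analysis: a redefinition of `a` between the two computations of `a + b` forces the second one to be recomputed, exactly the complication that branches and merges magnify at the intraprocedural level.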

copy propagation:
      x := y           x := y
      w := w + x   ⇒   w := w + y

dead (unused) assignment elimination:
      x := 5           // dead: x is overwritten before any use
      ...              // no use of x
      x := 6
• a common clean-up after other optimizations

partial dead assignment elimination
• like DAE, except the assignment is only used on some later paths

dead (unreachable) code elimination:
      if false goto _else
      ...
      goto _done
      _else:
      ...
      _done:
• another common clean-up after other optimizations

pointer/alias analysis:
      p := &x          p := &x          p := &x
      *p := 5      ⇒   *p := 5      ⇒   *p := 5
      y := x + 1       y := 5 + 1       y := 6

      x := y ** z
      *p := 3
      ...              // no use of x — but might *p alias x?
• augments lots of other optimizations/analyses

loop-invariant code motion:
      for j := 1 to 10
          for i := 1 to 10
              a[i] := a[i] + b[j]
  ⇒
      for j := 1 to 10
          t := b[j]
          for i := 1 to 10
              a[i] := a[i] + t

induction variable elimination:
      for i := 1 to 10        ⇒   for p := &a[1] to &a[10]
          a[i] := a[i] + 1            *p := *p + 1
• a[i] is several instructions, *p is one

loop unrolling:
      for i := 1 to N         ⇒   for i := 1 to N by 4
          a[i] := a[i] + 1            a[i]   := a[i]   + 1
                                      a[i+1] := a[i+1] + 1
                                      a[i+2] := a[i+2] + 1
                                      a[i+3] := a[i+3] + 1

parallelization:
      for i := 1 to 1000      ⇒   forall i := 1 to 1000
          a[i] := a[i] + 1            a[i] := a[i] + 1

loop interchange, skewing, reversal, ...

blocking/tiling
• restructuring loops for better data cache locality
      for i := 1 to 1000
          for j := 1 to 1000
              for k := 1 to 1000
                  c[i,j] += a[i,k] * b[k,j]
  ⇒
      for i := 1 to 1000 by TILESIZE
          for j := 1 to 1000 by TILESIZE
              for k := 1 to 1000
                  for i’ := i to i+TILESIZE
                      for j’ := j to j+TILESIZE
                          c[i’,j’] += a[i’,k] * b[k,j’]
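The copy propagation and dead assignment elimination clean-ups above compose naturally: propagating copies makes the original copy dead, and a backward liveness pass then removes it. The following is my own sketch over straight-line three-address code (the tuple form, `copy` opcode, and `live_out` parameter are assumptions, not course code):

```python
# Sketch: copy propagation followed by dead (unused) assignment elimination
# over one straight-line block of (dest, op, arg1, arg2) tuples.
def copy_propagate(block):
    copies = {}   # variable -> variable it currently copies
    out = []
    for dest, op, a, b in block:
        a = copies.get(a, a)   # rewrite uses through known copies
        b = copies.get(b, b)
        out.append((dest, op, a, b))
        # dest changed: forget copies into or out of dest
        copies = {k: v for k, v in copies.items() if dest not in (k, v)}
        if op == "copy":
            copies[dest] = a
    return out

def eliminate_dead(block, live_out):
    """Backward pass: drop assignments whose destination is never used later."""
    live = set(live_out)
    out = []
    for dest, op, a, b in reversed(block):
        if dest in live:
            out.append((dest, op, a, b))
            live.discard(dest)
            live.update(x for x in (a, b) if isinstance(x, str))
        # else: dead store, silently dropped
    return out[::-1]

prog = [("x", "copy", "y", None),   # x := y
        ("w", "+", "w", "x")]       # w := w + x
after = eliminate_dead(copy_propagate(prog), live_out={"w"})
print(after)   # [('w', '+', 'w', 'y')]  -- the copy through x is gone
```

Note that `eliminate_dead` is only safe here because the block has no stores through pointers; with a `*p := ...` in the mix, the alias analysis sketched on the slide would be needed to decide whether an assignment is really dead.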
