Intermediate Representations
Concepts of Programming Languages (CoPL)
Malte Skambath, malte@skambath.de
November 16, 2015

Overview

We need Compilers!
Classical Compiler Process
Machine Models
    Stack Machines
    Register Machines
    Three-Address Code
    Static-Single-Assignment
Implementations
    LLVM
    CIL
Conclusion

Developing Software: We need compilers!

[Figure: a program has to be brought to several target architectures (x86, AMD64, ARM); the step in between is marked "?"]

Intermediate Representation: The solution!

[Figure: an intermediate representation sits in front of the target architectures x86, AMD64, ARM]

Intermediate Representation

Definition
An intermediate representation (IR) is a data structure that represents a program between a high-level programming language and machine code.

An intermediate language (IL) is a low-level assembly language that serves as the IR for a virtual machine.

Classical Compiler Process

    Lexical Analysis (Scanner)      [Frontend]
        ↓ Tokens
    Syntax Analysis (Parser)        [Frontend]
        ↓ ST/AST
    Semantic Analysis               [Frontend]
        ↓ CFG
    Optimization
        ↓ CFG
    Code Generation                 [Backend]

Abstract Syntax Tree

An abstract syntax tree (AST) ...
    ... describes the syntactical structure of a program
    ... depends on the programming language
    ... is generated by the parser

[Figure: example AST with the root "program"; its nodes include "block", "while" (with "condition" and "body"), "return", and an "assign" node over "variable sum" and "bin op: *" with the operands "variable sum" and "variable i"]

Control-Flow-Graph

Source program:
    int s = 1;
    for (int i=1; i<=10; i++)
        s += i;
    return (s);

Control-flow graph:
    i ← 1, s ← 0
        ↓
    i ≤ 10 ?
        no:  ret(s)
        yes: s ← s + i, then i ← i + 1, then back to the test i ≤ 10

Stack Machines

Definition
A general stack machine has
◮ a stack as storage
◮ a set of instructions / operations $op = F(a_1, a_2, \ldots, a_n)$, including push and pop

Executing an operation takes the arguments from the top of the stack, computes the result in the accumulator, and pushes the result back onto the stack.

Example
    instruction    stack afterwards (top first)
    push 1         1
    push 2         2 1
    push 3         3 2 1
    add            5 1
    pop            1

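A minimal sketch of such a machine, assuming instructions are encoded as tuples and only the three opcodes of the example exist. The function name `run_stack_program` and the encoding are hypothetical. The stack is a Python list whose last element is the top.

```python
def run_stack_program(program):
    """Execute (opcode, *args) tuples and return the stack after each step."""
    stack = []
    trace = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(instr[1])
        elif op == "pop":
            stack.pop()
        elif op == "add":
            # Take both arguments from the top of the stack, push the result back.
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unknown opcode: {op}")
        trace.append(list(stack))
    return trace


program = [("push", 1), ("push", 2), ("push", 3), ("add",), ("pop",)]
trace = run_stack_program(program)
```

Here `trace` lists the stack contents bottom first, so the entry after add is `[1, 5]`.
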
Stack Machines: Code Generation

We can generate the code by traversing the syntax tree in post-order. Assume we have to compute the expression $\sqrt{x^2 + y^2}$.

AST:
    sqrt
        add
            mul
                x
                x
            mul
                y
                y

Generated code:
    push x
    push x
    mul
    push y
    push y
    mul
    add
    sqrt

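The traversal can be sketched as follows, assuming the AST is given as nested tuples (operator first, then the children) and variables are plain strings. This encoding and the name `gen_stack_code` are hypothetical.

```python
def gen_stack_code(node):
    """Post-order traversal: emit code for all children, then the operator."""
    if isinstance(node, str):  # leaf: a variable
        return [f"push {node}"]
    op, *children = node
    code = []
    for child in children:
        code += gen_stack_code(child)
    code.append(op)
    return code


ast = ("sqrt", ("add", ("mul", "x", "x"), ("mul", "y", "y")))
code = gen_stack_code(ast)
```

Because every operator is emitted after its operands, its arguments are already on top of the stack when it executes.
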
Stack Machines: Summary

◮ Programs for stack machines are short: the byte code contains only the opcodes (or constants).
◮ In practical use, stack machines can be extended with
    1. an external memory to store and load values (computations are still limited to the stack)
    2. top-level registers
    3. meta-information (see CIL later)
◮ Problem: most processor architectures use registers.
  ⇒ Hybrid models, special information in the intermediate representation.

Register Machines

Definition
A register machine ...
◮ consists of an infinite number of memory cells named registers
◮ each register is directly accessible
◮ has a limited set of instructions / operations:
    1. Arithmetical operations: compute a function $F$ using selected registers $o_1, \ldots, o_n$ as operands and store the result in a target register $r$
    2. Jumps/branches

Three-Address Code (3AC/TAC)

◮ Each TAC program is a sequence of instructions $I_1, I_2, \ldots, I_n$ for a register machine.
◮ Instructions can be
    1. Assignments
           r1 := r0
    2. Unconditional jumps (instructions can be labeled)
           L0: goto L1
               ...
           L1: r0 := 1
    3. Conditional branches
           if a<b then goto L1
    4. Arithmetical operations
           r3 := add(r1,r2)
◮ Each instruction contains at most 3 registers.

Example ($\sqrt{x^2 + y^2}$)
    t1 := x * x
    t2 := y * y
    t3 := t1 + t2
    result := sqrt(t3)

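The four instruction kinds can be made concrete with a small interpreter. This is only a sketch: the tuple encoding ("assign", "goto", "if_lt", "op", "label"), the operation table `OPS`, and the name `run_tac` are assumptions, not a fixed TAC format.

```python
import math

# Hypothetical operation table for the "op" instruction.
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b, "sqrt": math.sqrt}


def run_tac(program, regs):
    """Run a list of TAC tuples; `regs` maps register names to initial values."""
    regs = dict(regs)
    # Resolve labels to instruction positions before running.
    labels = {ins[1]: pc for pc, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        ins = program[pc]
        pc += 1
        kind = ins[0]
        if kind == "assign":  # r1 := r0
            _, dst, src = ins
            regs[dst] = regs[src]
        elif kind == "goto":  # goto L1
            pc = labels[ins[1]]
        elif kind == "if_lt":  # if a<b then goto L1
            _, a, b, target = ins
            if regs[a] < regs[b]:
                pc = labels[target]
        elif kind == "op":  # r3 := add(r1,r2)
            _, dst, name, *args = ins
            regs[dst] = OPS[name](*(regs[a] for a in args))
        elif kind != "label":
            raise ValueError(f"unknown instruction: {ins}")
    return regs


program = [
    ("op", "t1", "mul", "x", "x"),
    ("op", "t2", "mul", "y", "y"),
    ("op", "t3", "add", "t1", "t2"),
    ("op", "result", "sqrt", "t3"),
]
result = run_tac(program, {"x": 3.0, "y": 4.0})["result"]
```

With x = 3 and y = 4 the program leaves 5.0 in the register result.
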
Three-Address Code (3AC/TAC): How to Design the Byte Code

For practical use we should store TAC in a byte-code format.
◮ Each operation has an opcode for the virtual machine.
◮ Each instruction can be represented by a tuple.

    Quadruples                          Triples
    result  opcode  op1  op2                 opcode  op1  op2
    t1      MUL     x    x              (1)  MUL     x    x
    t2      MUL     y    y              (2)  MUL     y    y
    t1      ADD     t1   t2             (3)  ADD     (1)  (2)
    res     SQRT    t1   -              (4)  SQRT    (3)  -

Note
With triples, registers are assigned implicitly: the result of an instruction is referenced by its position. Each register can then be assigned only once.

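The relation between the two formats can be sketched as a conversion from quadruples to triples. The helper `quads_to_triples` is hypothetical and assumes straight-line code without jumps. Every register read is replaced by the 1-based position of the instruction that last wrote it.

```python
def quads_to_triples(quads):
    """Convert (result, opcode, op1, op2) tuples into (opcode, op1, op2) tuples."""
    last_def = {}  # register name -> position of the instruction that last wrote it

    def ref(name):
        # Inputs such as x, y and the placeholder "-" were never written: keep them.
        return f"({last_def[name]})" if name in last_def else name

    triples = []
    for position, (result, opcode, op1, op2) in enumerate(quads, start=1):
        triples.append((opcode, ref(op1), ref(op2)))
        last_def[result] = position  # later reads refer to this instruction
    return triples


quads = [
    ("t1", "MUL", "x", "x"),
    ("t2", "MUL", "y", "y"),
    ("t1", "ADD", "t1", "t2"),
    ("res", "SQRT", "t1", "-"),
]
triples = quads_to_triples(quads)
```

The second write to t1 becomes position (3), which is why the reuse of t1 in the quadruples disappears in the triples.
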
Static-Single-Assignment

Definition (Static-Single Assignment)
A three-address code is in static-single-assignment form (SSA) if each register is assigned exactly once in the code.

Example ($\sqrt{x^2 + y^2}$)
    Not in SSA               SSA
    L1: x := x * x           L1: x0 := x * x
    L2: y := y * y           L2: y0 := y * y
    L3: x := x + y           L3: x1 := x0 + y0
    L4: z := sqrt(x)         L4: z := sqrt(x1)

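For straight-line code like the example, the renaming can be sketched by giving every assignment a fresh numbered name and rewriting later reads to the newest name. The function `to_ssa` and the tuple encoding are assumptions; code with branches is out of scope for this sketch.

```python
def to_ssa(code):
    """Rename (target, op, args) instructions so that every target is assigned once."""
    current = {}  # original name -> newest SSA name
    counter = {}  # original name -> number of versions created so far
    out = []
    for target, op, args in code:
        # Reads use the newest version; never-assigned inputs keep their name.
        new_args = tuple(current.get(a, a) for a in args)
        n = counter.get(target, 0)
        counter[target] = n + 1
        new_target = f"{target}{n}"
        current[target] = new_target
        out.append((new_target, op, new_args))
    return out


code = [
    ("x", "mul", ("x", "x")),
    ("y", "mul", ("y", "y")),
    ("x", "add", ("x", "y")),
    ("z", "sqrt", ("x",)),
]
ssa = to_ssa(code)
```

The result matches the SSA column of the example, except that this sketch also numbers z (as z0) although it is assigned only once.
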