Open64/ORC compilers
Sébastian Pop
Université Louis Pasteur Strasbourg, Project A3 INRIA, France

Short history
- 1994: Ragnarok compiler for MIPS R8000
    - designed for scientific applications
- August 1994: start of the Mongoose compiler
    - scientific and non-scientific applications
    - fast and stable for day-to-day development
- 1999: focus on IA-64: SGIpro 1.0 (alias osprey1.0)
- 2001: Intel and ICT (Chinese Academy of Sciences): ORC (Open Research Compiler)

Compiler’s Structure
1. FE (Front-ends)
2. WHIRL (Intermediate Representation)
3. IPA (Inter Procedural Analysis)
4. LNO (Loop Nest Optimizer)
5. WOPT (Global Optimizer)
6. CG (Code Generator)
7. ORC (Open Research Compiler)

Front Ends
- Use GCC’s C/C++ and Cray F90 front ends
- Each front end has its own specific trees
- Translation to WHIRL
- Question: Is this translation valid?
    - Test suites were not GPL-ed; could use GCC test suites (inappropriate)
    - Bug database wasn’t GPL-ed.

WHIRL
Winning Hierarchical Intermediate Representation Language
- 5 levels: VH, H, M, L, VL
- Lowering happens when needed
- Each optimization is performed at the right level
- whirl2c and whirl2f dump WHIRL into compilable files.
- whirl2a dumps WHIRL in ASCII.

Inter Procedural Analysis
Suppose that we want to build a project containing 3 files (file1.c, file2.cxx, file3.f) and use the IPA for optimizing it.
1. The first step invokes the right front-end for each file: C front-end, C++ front-end, F90 front-end.
2. The compiler transforms front-end specific trees into WHIRL trees.
3. The WHIRL dumper then dumps this representation into a .o file (file1.o, file2.o, file3.o). These .o files behave like normal relocatable code (i.e. they can be put in archives, etc.).
4. The linker is called as usual on the last step of compilation, on the .o files and the libraries (lib1.a, lib2.so). .a files can contain WHIRL trees, as well as normal .o files.
5. Some files contain WHIRL trees: the compilation is not complete, and the IPA is called.
6. The remaining phases run in order: Inter Procedural Analysis (IPA), Inter Procedural Optimizations (IPO), Loop Nest Optimizer (LNO), Main Optimizer (WOPT), Code Generator (CG). The result is the executable file.

Inter Procedural Analysis
Idea: gather information over a whole project
Solution:
- save WHIRL trees in .o files
- build a global tree at link time
- perform all optimizations
- generate code

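The link-time idea above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ObjectFile`, its `kind` tags and the `link` function are hypothetical names invented for the illustration, not part of Open64, and the "analysis" is reduced to call-graph reachability.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Toy stand-in for a .o file: it holds either WHIRL trees or native code."""
    name: str
    kind: str                                   # "whirl" or "native" (hypothetical tags)
    calls: dict = field(default_factory=dict)   # function name -> list of callees


def link(objects, entry="main"):
    """Toy link step: if any object still holds WHIRL, run a whole-program pass."""
    whirl = [o for o in objects if o.kind == "whirl"]
    if not whirl:
        return {"ipa_ran": False, "reachable": None}
    # Build one global call graph from every WHIRL object.
    graph = {}
    for obj in whirl:
        graph.update(obj.calls)
    # Whole-program fact: which functions are reachable from the entry point.
    reachable, stack = set(), [entry]
    while stack:
        fn = stack.pop()
        if fn in reachable:
            continue
        reachable.add(fn)
        stack.extend(graph.get(fn, []))
    return {"ipa_ran": True, "reachable": reachable}


objs = [
    ObjectFile("file1.o", "whirl", {"main": ["solve"], "unused": []}),
    ObjectFile("file2.o", "whirl", {"solve": ["norm"]}),
    ObjectFile("file3.o", "whirl", {"norm": []}),
]
result = link(objs)
```

Here `unused` is defined in file1.o but never reached from `main`, a fact that is only visible once the call graphs of all three files are merged.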
Loop Nest Optimizer
LNO works on High-level WHIRL. Lowering has removed unstructured control flow (gotos, switch, ...).
Analyses extract information from WHIRL and construct specific Intermediate Representations (IRs):
- Array Dependence Graph
- LEGO: for data distributions
- Array and vector accesses
- Vector space
- Systems of equations
- Polytope
Main optimizers in LNO:
- Loop unrolling
- Hoist conditionals
- Hoist varying lower bounds
- Dead store elimination for arrays
- Loop reversal / fission / fusion / tiling
- Array scalarization
- Prefetch
- Inter-iteration Common Subexpression Elimination

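As an illustration of one transformation from the list above, here is loop tiling applied by hand to a matrix transpose. It is a sketch of the transformation itself, not of how LNO implements it; the function names and the tile size are assumptions made for the example.

```python
def transpose_naive(a, n):
    """Reference loop nest: one pass over the full i/j iteration space."""
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[j][i] = a[i][j]
    return out


def transpose_tiled(a, n, tile=2):
    """Same computation, with the iteration space cut into tile x tile blocks."""
    out = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):            # loops over tiles
        for jj in range(0, n, tile):
            # loops inside one tile; min() handles a partial last tile
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    out[j][i] = a[i][j]
    return out


n = 5
a = [[i * n + j for j in range(n)] for i in range(n)]
same = transpose_naive(a, n) == transpose_tiled(a, n, tile=2)
```

Tiling only reorders the iterations; it is legal here because no iteration depends on another, which is the kind of fact the array dependence graph is there to establish.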
Global Optimizer
WOPT works on Medium-level WHIRL (arrays lowered into load/store + offset, ...).
Main intermediate representations:
- CFG (Control Flow Graph)
- SSA (Static Single Assignment)
Main optimizations:
- SSA-PRE (Partial Redundancy Elimination)
- DCE (Dead Code Elimination)
- IVR (Induction Variable Recognition)
- VNFRE (Value Numbering based Full Redundancy Elimination)
- Copy propagation

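To give a feel for value-numbering-based redundancy elimination, here is a minimal sketch restricted to a single basic block. It assumes every name is assigned exactly once (as SSA form guarantees); the tuple encoding of instructions and the function name are hypothetical, and the real VNFRE pass works globally rather than on one block.

```python
COMMUTATIVE = {"+", "*"}


def value_number(block):
    """Replace a recomputation of an available expression by a copy.

    block: list of (dest, op, a, b) with string operands, each dest assigned once.
    """
    table = {}   # expression key -> name that already holds its value
    vn = {}      # name -> canonical name carrying the same value
    out = []
    for dest, op, a, b in block:
        x, y = vn.get(a, a), vn.get(b, b)
        # Commutative operators get a canonical operand order in the key.
        key = (op, *sorted((x, y))) if op in COMMUTATIVE else (op, x, y)
        if key in table:
            vn[dest] = table[key]
            out.append((dest, "copy", table[key], None))
        else:
            table[key] = dest
            vn[dest] = dest
            out.append((dest, op, x, y))
    return out


block = [
    ("t1", "+", "a", "b"),
    ("t2", "+", "b", "a"),    # same value as t1
    ("t3", "*", "t1", "c"),
    ("t4", "*", "t2", "c"),   # same value as t3, once t2 is known to equal t1
]
optimized = value_number(block)
```

The copies left behind are what copy propagation and dead code elimination would then clean up.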
Code Generator
Code Generator works on CGIR:
- explicit CFG
- each BB contains a list of instructions
- each instruction has the form OP_result OP_code OP_opnd
This representation is close to assembler code.
Main optimizers in CG are:
- EBO: Extended Block Optimizer
- GRA: Global Register Allocation
- LRA: Local Register Allocation
- GCM: Global Code Motion
- SWP: Software Pipelining
- CIO: Cross Iteration loop Optimizations
- FREQ: execution frequencies of BBs and edges

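The shape of such a representation can be sketched with two small classes. The class and field names below only mirror the slide's OP_result / OP_code / OP_opnd vocabulary; they are hypothetical and do not reproduce the actual CGIR data structures, and the mnemonics and register names are purely illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class OP:
    """One instruction: a result, an opcode and its operands."""
    result: str
    code: str
    opnds: tuple

    def asm(self):
        # Render in an assembler-like "opcode result = operands" style.
        return f"{self.code} {self.result} = {', '.join(self.opnds)}"


@dataclass
class BB:
    """A basic block: an ordered list of OPs plus explicit CFG successors."""
    name: str
    ops: list = field(default_factory=list)
    succs: list = field(default_factory=list)


entry = BB("BB1", [OP("r3", "add", ("r1", "r2")),
                   OP("r4", "shl", ("r3", "2"))], succs=["BB2"])
exit_bb = BB("BB2", [OP("r5", "mov", ("r4",))])
cfg = {bb.name: bb for bb in (entry, exit_bb)}

listing = [op.asm() for bb in cfg.values() for op in bb.ops]
```

Because each OP already names its opcode and registers, printing the blocks in order gives something very close to an assembly listing.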
Open Research Compiler
ORC is an extension of the Code Generator. ORC added the following infrastructure:
- IPFEC Regions: structures the CFG into a tree
- If-conversion
- PRDB: Predicate Relation DataBase
- Microscheduler
- Local/Global instruction scheduling

Partial Redundancy Elimination
[Figure not recovered from the slide.]

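Since the slide's figure is not available, here is a small hand-worked illustration of what partial redundancy elimination achieves. It is an assumption-laden sketch, not the SSA-PRE algorithm: the two functions are written by hand as "before" and "after", and `CountingAdd` is a hypothetical helper that counts how often `a + b` is evaluated.

```python
class CountingAdd:
    """Adds two numbers and counts how often it was asked to."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x, y):
        self.calls += 1
        return x + y


def before_pre(a, b, cond, add):
    if cond:
        x = add(a, b)
    else:
        x = 0
    y = add(a, b)        # redundant only on the path where cond is true
    return x, y


def after_pre(a, b, cond, add):
    if cond:
        t = add(a, b)
        x = t
    else:
        x = 0
        t = add(a, b)    # computation inserted on the path that lacked it
    y = t                # now fully redundant, so it becomes a reuse of t
    return x, y


def evaluations(fn, cond):
    add = CountingAdd()
    fn(3, 4, cond, add)
    return add.calls
```

On the path where `cond` is true the original code evaluates `a + b` twice and the transformed code once; on the other path both evaluate it once, so no path gets slower.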
Predicated code
- Before: IF_COND branches to BB1 or BB2
- After if-conversion: <BB1, p1> <BB2, p2>
If-conversion:
- replaces "if" constructs with guarded statements
- converts control dependences to data dependences

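The effect of if-conversion can be mimicked in a few lines. This is a sketch under assumptions: the function names are invented, and the `if guard:` test in the second function only models a hardware predicate that squashes an instruction; it stands in for predicated execution rather than for a branch.

```python
def branchy(a, b):
    if a > b:            # IF_COND
        r = a - b        # BB1
    else:
        r = b - a        # BB2
    return r


def if_converted(a, b):
    # One compare sets two complementary predicates.
    p1 = a > b
    p2 = not p1
    regs = {"r": None}
    # Straight-line guarded statements: <BB1, p1> <BB2, p2>.
    # Both right-hand sides are computed; the guard decides which one commits.
    program = [(p1, "r", a - b), (p2, "r", b - a)]
    for guard, dest, value in program:
        if guard:        # models the predicate, not a control-flow branch
            regs[dest] = value
    return regs["r"]


pairs = [(5, 2), (2, 5), (3, 3), (-1, 4)]
agree = all(branchy(a, b) == if_converted(a, b) for a, b in pairs)
```

The choice between BB1 and BB2 no longer depends on which path is taken but on the values of `p1` and `p2`, which is what turning a control dependence into a data dependence means.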