
Design and Implementation of a TriCore Backend for the LLVM Compiler

Design and Implementation of a TriCore Backend for the LLVM Compiler Framework. Studienarbeit by Christoph Erhardt, Friedrich-Alexander-Universität Erlangen-Nürnberg, November 20, 2009.


  1. Design and Implementation of a TriCore Backend for the LLVM Compiler Framework
     - Studienarbeit
     - Christoph Erhardt, Friedrich-Alexander-Universität Erlangen-Nürnberg
     - November 20, 2009
     - Slide footer: A TriCore Backend for LLVM (November 20, 2009)

  2. Overview
     - The TriCore Processor Architecture
     - The LLVM Compiler Infrastructure
     - Design and Implementation of the Backend
     - Evaluation & Conclusion

  4. Motivation: What do we need it for?
     - TriCore chips are omnipresent around here:
       - Quadcopter
       - High striker
       - Carolo Cup
       - ...
     - The RTSC (Real-Time Systems Compiler) project:
       - Operating-system-aware compiler for real-time applications
       - Processes atomic basic blocks
       - Based on LLVM
     - RTSC should be able to generate TriCore machine code

  6. The TriCore Processor Architecture: Overview
     - Three-in-one architecture:
       - Real-time microcontroller unit
       - DSP
       - Superscalar RISC processor
     - Basic features:
       - Load/store architecture
       - 32-bit data, address, and instruction words
       - Some special 16-bit instruction words for higher code density
       - Little-endian byte order
       - 16 data + 16 address registers

  7. Peculiarities: Some things that TriCore handles in an unusual way
     - Strict distinction between data and address registers:
       - Also reflected in the calling conventions
       - Serious problem for the compiler!
     - Data registers are also used for floating-point operands
     - Special DSP-oriented instructions and addressing modes
     - Task/context model:
       - Automatic context save/restore upon call/return
       - Context save areas (linked lists managed by hardware)
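
The context save areas mentioned on this slide can be pictured as two linked lists: a free list, and a list of saved contexts. The following Python sketch is a toy model under that assumption only. The class name, method names, and register contents are hypothetical, and the real hardware mechanism (register set saved, link-word layout, trap behaviour) is not modelled.

```python
class ContextSaveAreas:
    """Toy model of hardware-managed context save areas (CSAs)."""

    def __init__(self, count):
        # Assumes count >= 1. link[i] is the index of the next CSA
        # in whichever list CSA i currently belongs to.
        self.link = [i + 1 for i in range(count)]
        self.link[-1] = None
        self.saved = [None] * count
        self.free_head = 0      # head of the free list
        self.prev_head = None   # head of the list of saved contexts

    def call(self, registers):
        """Save the caller's registers in a CSA taken from the free list."""
        if self.free_head is None:
            raise RuntimeError("free context list depleted")
        csa = self.free_head
        self.free_head = self.link[csa]
        self.saved[csa] = dict(registers)
        # Push the CSA onto the list of saved contexts.
        self.link[csa] = self.prev_head
        self.prev_head = csa

    def ret(self):
        """Restore the most recently saved context and free its CSA."""
        if self.prev_head is None:
            raise RuntimeError("no saved context to restore")
        csa = self.prev_head
        self.prev_head = self.link[csa]
        registers = self.saved[csa]
        self.saved[csa] = None
        # Return the CSA to the front of the free list.
        self.link[csa] = self.free_head
        self.free_head = csa
        return registers
```

For the compiler, the practical consequence is that prologue and epilogue code does not have to spill and reload whatever registers the hardware saves on its own.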

  8. The LLVM Compiler Infrastructure: Overview
     - Open-source compiler infrastructure project started in 2000
     - Main sponsor: Apple Inc.
     - Written in C++

  9. Basic Architecture: The classical three tiers of a compiler
     [Figure: source files (C, ..., Fortran) pass through frontends (Clang, ..., LLVM-GCC) to LLVM assembly/bitcode, then through the optimizer to LLVM assembly/bitcode again, then through code generators (x86, ..., SPARC) to target assembly.]
     - Language-specific frontends
     - Optimizer: generic IR, analysis/transformation passes
     - Several backends for machine code generation

  10. Unique Characteristics: What does LLVM have that others don’t?
      - Not merely a compiler, but a compiler infrastructure:
        - Static compilation
        - Just-in-time compilation
      - Strictly modular, library-based architecture:
        - Easily extendible
        - Possibility to incorporate parts of LLVM in other projects
        - BSD-style licence
      - Produces highly optimized machine code in an efficient way:
        - Memory-efficient
        - Time-efficient

  11. Design and Implementation of the Backend: Overview
      - Extensive generic code generation framework:
        - Makes work a lot easier
        - ... but also poses some problems in specific cases
      - Fixed class hierarchy
      - Many target-independent algorithms:
        - Instruction scheduling
        - Register colouring
        - ...
      - Code generation process executed by a series of passes

  12. Code Generation Process
      [Figure: the chain of code generation passes and the form of the code after each one.]
      - Input: LLVM code (SSA form)
      - DAG Lowering → DAGs (not legalized)
      - DAG Legalization → DAGs (legalized)
      - Instruction Selection → DAGs (native instructions)
      - Scheduling → list (SSA form)
      - SSA-Based Optimization → list (SSA form)
      - Register Allocation → list with physical registers
      - Post-Allocation Passes → list with physical registers
      - Pro-/Epilogue Insertion → list with resolved stack references
      - Peephole Optimization → list with resolved stack references
      - Assembly Printing → text assembly code
      - Backend classes named in the figure: TriCoreTargetLowering, TriCoreDAGToDAGISel, TriCoreInstrInfo, TriCoreRegisterInfo, TriCoreLoadStoreOpt, TriCoreVirtInstrResolver, TriCoreAsmPrinter

  14. TableGen: One tool to rule them all...
      - Problem:
        - Backend contains large portions of descriptive data
        - C++ obviously not suitable
      - TableGen:
        - Language for domain-specific modelling
        - Similar to object-oriented approach:
          - Classes, records (objects), attributes
          - Inheritance
        - Definition files (.td) preprocessed by tblgen tool → auto-generation of C++ code
        - Used for description of:
          - Subtargets, registers
          - Calling conventions
          - Instruction set
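
The idea of describing target data as records and generating C++ from them can be imitated in a few lines of Python. This is only an analogy, not TableGen syntax and not the output of tblgen. The class names, the `emit_cpp_enum` helper, the enum name `TriCoreReg`, and the encodings in the usage note are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Register:
    """Plays the role of a TableGen class: a template for records."""
    name: str
    encoding: int


@dataclass
class DataRegister(Register):
    """Inherits the attributes of Register and adds a register class."""
    reg_class: str = "DR"


@dataclass
class AddressRegister(Register):
    reg_class: str = "AR"


def emit_cpp_enum(records):
    """Turn the descriptive records into C++ source text."""
    lines = ["enum TriCoreReg {"]
    lines += [
        f"  {r.name} = {r.encoding},  // class {r.reg_class}"
        for r in records
    ]
    lines.append("};")
    return "\n".join(lines)
```

Calling `emit_cpp_enum([DataRegister("D2", 2), AddressRegister("A2", 18)])` yields a small C++ enum. The point is that the descriptive data lives in one place and the repetitive C++ is derived from it.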

  16. SelectionDAG Construction: Largely automated
      - Directed acyclic graph
      - Per basic block
      - Nodes: instructions
      - Edges:
        - Data dependencies
        - Control flow dependencies
      - Example:
        - %mul = mul i32 %a, %a
        - %mul4 = mul i32 %b, %b
        - %add = add nsw i32 %mul4, %mul
        - ret i32 %add
      [Figure: "isel input for euclidSquare:entry". EntryToken feeds two CopyFromReg nodes (Register %reg1024, Register %reg1025); each feeds a mul node; both mul results feed an add node; the add result goes through CopyToReg into Register %D2, followed by TriCoreISD::RET_FLAG and GraphRoot.]
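
How a basic block turns into a DAG can be sketched in Python using the example from the slide. This is a simplified illustration and not LLVM's SelectionDAG builder: it records data dependencies only, leaves out the chain and flag edges shown in the figure, and the tuple encoding of the instructions is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    opcode: str
    operands: list = field(default_factory=list)


def build_dag(block):
    """Build a DAG from (result, opcode, operand names) tuples.

    Values not defined inside the block are treated as live-in
    registers and get a single shared CopyFromReg node.
    """
    defs = {}
    root = None
    for result, opcode, operand_names in block:
        operands = []
        for name in operand_names:
            if name not in defs:
                defs[name] = Node("CopyFromReg " + name)
            operands.append(defs[name])
        node = Node(opcode, operands)
        if result is not None:
            defs[result] = node
        root = node  # the last instruction ends up as the root
    return root


# The basic block of euclidSquare from the slide.
EUCLID_SQUARE = [
    ("%mul", "mul", ["%a", "%a"]),
    ("%mul4", "mul", ["%b", "%b"]),
    ("%add", "add", ["%mul4", "%mul"]),
    (None, "ret", ["%add"]),
]
```

Because `%a` is looked up in `defs` both times it is used, the two operands of the first `mul` point at the same node. That sharing is what makes the result a DAG rather than a tree.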

  18. Troubles: The integer vs. pointer problem
      - Problem:
        - TriCore strictly distinguishes between addresses and data integers
        - They have to be put into separate register files → calling conventions!
        - LLVM’s backend framework treats pointers just like integers...
      - Solution:
        - Annotation of a “pointer / no pointer” flag in the value type class
        - Propagation of this flag throughout the DAG construction phase (required some hacks...)
        - Case differentiations in all relevant situations
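
The effect of the “pointer / no pointer” flag can be illustrated with a small Python sketch. It is not the backend's C++ code: `ValueType`, `register_class`, and `assign_argument_registers` are hypothetical names, and the register lists are passed in by the caller rather than taken from any calling-convention document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueType:
    bits: int
    is_pointer: bool = False  # the extra flag carried with the type


def register_class(vt):
    """Pointers go to address registers (AR), everything else to data registers (DR)."""
    return "AR" if vt.is_pointer else "DR"


def assign_argument_registers(arg_types, data_regs, addr_regs):
    """Hand out argument registers from two separate pools.

    Arguments that no longer fit into their pool are marked "stack".
    """
    data_iter, addr_iter = iter(data_regs), iter(addr_regs)
    assignment = []
    for vt in arg_types:
        pool = addr_iter if vt.is_pointer else data_iter
        assignment.append(next(pool, "stack"))
    return assignment
```

With `i32 = ValueType(32)` and `ptr = ValueType(32, is_pointer=True)`, both types are 32 bits wide, so without the flag a backend that looks only at the bit width cannot tell them apart. With the flag, `assign_argument_registers([i32, ptr, i32], ["D4", "D5"], ["A4", "A5"])` places the pointer in an address register; the register names in this call are illustrative assumptions.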

  20. Instruction Selection: Largely auto-generated
      - Pattern matching against the instruction definitions, for example:
        def MULrr2 : Rr2Instr<0x0a, (outs DR:$c), (ins DR:$a, DR:$b),
                              "mul\t$c, $a, $b",
                              [(set DR:$c, (mul DR:$a, DR:$b))]>;
      [Figure, left: "isel input for euclidSquare:entry" with the target-independent nodes mul, mul, add, CopyToReg into Register %D2, and TriCoreISD::RET_FLAG. Figure, right: "scheduler input for euclidSquare:entry" after pattern matching, with the nodes MULrr2, MADDrrr2, CopyToReg into Register %D2, and RETsys.]
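
The rewrite shown in the figure (two mul nodes and an add become one MULrr2 and one MADDrrr2) can be mimicked with a small greedy matcher in Python. This is a sketch of the general technique, not the matcher that tblgen generates. Expressions are encoded as nested tuples, the operand order of the result tuples is an assumption, and only the two patterns visible on the slide are handled.

```python
def select(node):
    """Map target-independent nodes to the instruction names from the slide.

    A node is either a register name (str) or a tuple (opcode, lhs, rhs).
    Larger patterns are tried before smaller ones.
    """
    if isinstance(node, str):
        return node  # a register needs no instruction
    opcode, lhs, rhs = node
    if opcode == "add":
        # add(x, mul(a, b)) or add(mul(a, b), x) -> multiply-add
        for acc, prod in ((lhs, rhs), (rhs, lhs)):
            if not isinstance(prod, str) and prod[0] == "mul":
                return ("MADDrrr2", select(acc), select(prod[1]), select(prod[2]))
    if opcode == "mul":
        return ("MULrr2", select(lhs), select(rhs))
    raise NotImplementedError("no pattern in this sketch for " + opcode)


# %add = add nsw i32 %mul4, %mul from the example block.
EUCLID_SQUARE_EXPR = ("add", ("mul", "%b", "%b"), ("mul", "%a", "%a"))
```

Applied to `EUCLID_SQUARE_EXPR`, one multiplication is folded into the multiply-add and the other survives as a plain MULrr2, which matches the node names in the scheduler input on the slide.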

  21. Scheduling & Register Allocation: Target-independent algorithms
      - Scheduling:
        - DAGs → list (SSA form)
        - Target-independent algorithm using data from the instruction description table
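
At its simplest, turning a DAG into a list means emitting every node after all of its operands. The Python sketch below does this with a depth-first traversal. It is a plain topological sort, not LLVM's scheduler, which additionally consults the instruction description table. Which virtual register feeds which instruction in `SELECTED_DAG` is an assumption.

```python
def schedule(root, operands):
    """Linearize a DAG so that every operand precedes its user.

    `operands` maps a node name to the list of nodes it depends on.
    """
    order, visited = [], set()

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for operand in operands.get(node, ()):
            visit(operand)
        order.append(node)  # emitted only after all operands

    visit(root)
    return order


# Data dependencies of the selected DAG for euclidSquare (simplified).
SELECTED_DAG = {
    "RETsys": ["MADDrrr2"],
    "MADDrrr2": ["MULrr2", "%reg1024"],
    "MULrr2": ["%reg1025"],
}
```

`schedule("RETsys", SELECTED_DAG)` places MULrr2 before MADDrrr2 and RETsys last. A real scheduler chooses among the many valid orders using further criteria.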
