code generation issues
play

Code Generation Issues Aslan Askarov aslan@cs.au.dk Based on slides - PowerPoint PPT Presentation

Compilation 2014 Code Generation Issues Aslan Askarov aslan@cs.au.dk Based on slides by E. Ernst Administrativia November 28 guest lecture by Filip Sieczkowski (AU) Memory Models December 12 guest lecture by Kevin


  1. 
 
 Compilation 2014 Code Generation Issues Aslan Askarov aslan@cs.au.dk 
 Based on slides by E. Ernst

  2. Administrativia • November 28 – guest lecture by Filip Sieczkowski (AU) • Memory Models • December 12 – guest lecture by Kevin Millikin (G) • Register allocation in JIT compilers • The very last hand-in “Putting it all together” is due on December 10 • No new functional requirements, just fix your bugs and resubmit

  3. Code generation issues • Until now we have discussed Tiger on an abstract platform • In reality, it needs to run on a concrete one • Need consider • Concrete assembly language • Calling conventions • Transformation from abstract assembly to concrete assembly • Note: no textbook reference here, but check out linked material on x86 assembly

  4. The x86 assembly language • Specifies machine instructions for a large family of CPUs – we choose the 32 bit subfamily • Only very little abstraction • Symbolic addresses (crucial!) • Choice among jumps (please ignore) • Generation of meta/debug information (please ignore) • Checks several things • “This is actually a MOVE” (bit-pattern could mean anything) • “MUL is actually capable of operating on that register” • Syntax not standardized: we use AT&T syntax that is compatible with gcc inline code

  5. The x86 assembly language • Example instruction: move $-42, 8(%ebp) • General format: <instr><S> <src>, <dst> • <S> ∈ {b,s,w,l,q,t} specifies size ( l : 32 bit) • $ indicates literal number • % indicates register • o(%r1, %r2, n) means o + %r1 + %r2 * n • Example label: mylabel • gives the “current address” the name ‘mylabel’ • Example directives: • .text .data .globl .ascii .asciz 
 .size .func .type .endfunc

  6. The x86 assembly language • Assembler produces object file that contains segments (address ranges with a purpose) • Segments containing runnable code: .text • Segments containing initialized data: .data • Remember to specify segment to fit purpose • Specifying data: • .ascii .asciz • Metadata: • .func .type .endfunc .size .globl • For examples: check provided *.s

  7. The x86 assembly language .text # PROCEDURE tigermain .globl tigermain .func tigermain .type tigermain, @function tigermain: # FRAME tigermain(1 formals, 10 locals) pushl %ebp movl %esp, %ebp subl $44, %esp # SP, FP, calleesaves, argregs have values L2_blocks: # x86gen:122 movl %ebp, -8(%ebp) # x86gen:246 x86frame:575 movl -8(%ebp), %ebx # x86gen:251 x86frame:367 addl $-4, %ebx # x86gen:251 x86frame:372 . . . movl -28(%ebp), %eax # x86gen:117 x86frame:582 jmp L1_block_done # x86gen:172 L1_block_done: # x86gen:122 # FP, SP, RV, calleesaves still live leave ret .data .size tigermain, .-tigermain L0_string: .endfunc .long 13 # END tigermain .asciz "DefaultString"

  8. The x86 assembly language • A reasonable selection of instructions addl $17, %esp jne Label addl %eax, %ebx leave call Label movl $17, %eax cltd movl $Label, %eax cmpl $17, %ecx movl %eax, %ebx cmpl %eax, %edx movl %eax, 17(%ebp) idivl %ebx movl %esp, %ebp imull %eax movl 17(%ebp), %eax je Label negl %eax jg Label pushl %eax jge Label pushl %ebp jl Label ret jle Label subl $17, %esp jmp Label subl %eax, %ebx

  9. Calling Convention (cdecl) • For practical reasons, gcc is used to do linking • includes startup symbols/code • easy integration with C (e.g., runtime.c, showstack.c) • convenient calling convention • Must use gcc -static .. to allow debugging • Avoid ‘smart’ options ( -fomit-frame-pointer ) • Conventions • all parameters passed on stack (‘push’) • last parameter pushed first, .., return address last • return value passed in %eax • caller cleans up stack

  10. Recall Tiger frame slots old FP (prev.frame) Memory[saved_FP], Memory[staticLink]? ... arg_k Memory[FP+4*(2+k)] ... arg_1 Memory[FP+4*(2+1)] staticLink Memory[FP+8] returnAddr Memory[FP+4] Top of frame: saved_FP FP Memory[FP] known early localVar_1 Memory[FP-4*1] ... localVar_m Memory[FP-4*m] temp_1 Memory[FP-4*(m+1)] ... Bottom: known temp_p Memory[FP-4*(m+p)] later ... SP (args for next) SP SP Memory[FP-who_cares] SP

  11. Calling Conventions at Work • Consider an invocation of the function g(_) from the function f(…) L33_f: . . . • Reverse-push parameters pushl %ebx pushl %ebp • Push static link call L3_g • Call (clean up after return) addl $8, %esp . . . • Callee sets up stack frame L3_g: pushl %ebp • ends by destructing it again movl %esp, %ebp subl $44, %esp . . . leave ret

  12. From Abstract to Concrete Assembly • Maximal Munch will generate instructions, but abstract • Abstract instructions: Like concrete ones, but using an unlimited supply of ‘temporaries’ • Register allocation: Reusing registers as much as possible, then spill • We will just spill!

  13. Generated Code • Note dependencies: Each instruction uses some temporaries and defines others (or the same) . . . | munchStm (T.MOVE (T.TEMP t, T.CALL (T.NAME l, args))) = ( emit (A.OPER { assem = "\tcall " ^ S.name l , src = munchArgs args , dst = F.calldefs , jump = NONE , doc = "x86gen:68"}) ; emit (freeArgs (length args)) ; emit (moveInstr F.EAX t "70")) • In 2-op instr. (e.g. addl ), first src is implicit dependency emit (A.OPER { assem = "\taddl `s1, `d0" , src = [r, munchExp e2] (* old-r used *) , dst = [r] , jump = NONE , doc = "x86gen:270"})

  14. From Abstract to Concrete Assembly • Actual source code examples: tigermain: tigermain: pushl %ebp pushl %ebp movl %esp, %ebp movl %esp, %ebp subl $44, %esp subl $44, %esp L2_blocks: L2_blocks: movl %ebp, -8(%ebp) movl %ebp, t111 movl -8(%ebp), %ebx addl $-4, t111 addl $-4, %ebx movl t111, t110 movl %ebx, -8(%ebp) movl $0, t112 movl -8(%ebp), %ebx pushl t112 movl %ebx, -20(%ebp) movl $10, t113 movl -12(%ebp), %ebx pushl t113 movl $0, %ebx call initArray movl %ebx, -12(%ebp) . . . movl -12(%ebp), %ebx pushl %ebx movl -16(%ebp), %ebx movl $10, %ebx movl %ebx, -16(%ebp) movl -16(%ebp), %ebx pushl %ebx call initArray . . .

  15. Spilling: Basic Idea • Starting point: instruction using input/output • Use instead: same instruction, in/out via registers • Add instructions to move data to/from those registers bloopl src dst move bloopl move src R1 R2 dst

  16. Spilling: Implementation • Starting point: instruction i using [s0] , [] • Use instead: A.OPER variant • Add movl instruction to move data into fun expand (A.OPER {src=s0::ss, jump = SOME (j::js), ...}) = raise Bug "Encountered OPER that uses temps and jumps" | expand (i as (A.OPER {src=[], dst=[], ...})) = [i] | expand (i as A.OPER {assem, src=[s0], dst=[], jump, doc}) = if isRegister s0 then [i] else (* s0 other temp *) [ A.OPER { assem = "\tmovl " ^ ofs s0 ^ "(%ebp), `d0" , src = [] , dst = [EBX] , jump = NONE , doc = doc ^ " x86frame:265"} , A.OPER { assem = assem , src = [EBX] , dst = [] , jump = jump , doc = doc ^ " x86frame:270"}] . . .

  17. Spilling: Implementation • Invariant maintained: Registers are initialized and used locally in snippet of code for one abstract instruction — so there are no conflicts • Note special casing with registers . . . | expand (i as A.OPER {assem, src=[s0], dst=[], jump, doc}) = if isRegister s0 then [i] else (* s0 other temp *) [ A.OPER { assem = "\tmovl " ^ ofs s0 ^ "(%ebp), `d0" , src = [] , dst = [EBX] , jump = NONE , doc = doc ^ " x86frame:265"} , A.OPER { assem = assem , src = [EBX] , dst = [] , jump = jump , doc = doc ^ " x86frame:270"}] . . .

  18. Summary • Abstract platform (Jouette) now dropped • Choice: x86, 32 bit • Assembly language: safer than machine code • Assembly instruction format, labels, directives • The notion of a segment • Actual assembly: check out test0?.s • An example required set of instructions • Calling conventions: cdecl • Transformation abstract/concrete assembly code

  19. References • Assembly directives: see ‘as’ manual • https://sourceware.org/binutils/docs/as/index.html • Sign extension (CTLD): • https://en.wikipedia.org/wiki/Sign_extension

Recommend


More recommend