Chenyang Lu CSE 467S

HW1
• Will be assigned tonight
• Due on 2/24 (Wed), 5pm
• Submit to hand-in bins outside Lopata 408.
• Homework will NOT be accepted after due date/time.
• Collaboration is NOT allowed.

Project
• Proposal due next Wed (2/16)
• Do you have a team?
• Do you have an idea?
• Have you started reading related materials?
• Welcome to discuss with me!

More project ideas
• Program applications on Agilla
  • Existing examples:
    • Fire tracking (see today’s demo)
    • Intruder tracking
    • Network exploration
• Implement power management modules on TinyDB

Now, we head to MobiLab!

Software Optimization and Analysis

Transformation of Software
[Diagram: HLL → compile → assembly → assemble → object → link → executable → load. Address labels: Relative Address, Absolute Address, Physical Address. Optimization and analysis are marked on the HLL and assembly stages.]
Program design and analysis
• Optimizing for execution time.
• Optimizing for energy/power.
• Optimizing for program size.
• They may conflict with each other!

Optimization for Performance

Why do we need to know?
• Understand various optimization levels (-O1, -O2, etc.)
• Look at mixed compiler/assembler output.
• Modifying compiler output requires care:
  • correctness;
  • loss of hand-tweaked code.

Expression simplification
• Constant folding:
  • 8+1 = 9
• Algebraic:
  • a*b + a*c = a*(b+c)
• Strength reduction:
  • a*2 = a<<1

Dead code elimination
• Dead code:
    #define DEBUG 0
    if (DEBUG) dbg(p1);
• Can be eliminated by analysis of control flow, constant folding.
[Figure: control-flow graph of the test on DEBUG (constant 0), with branches labeled 0 and 1; the 1 branch leads to dbg(p1).]
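The rules on the Expression simplification and Dead code elimination slides can be written out as a minimal C sketch. Each pair of functions shows the original form and the form an optimizing compiler might substitute. The function names, the dbg_calls counter, and the body of dbg() are hypothetical and exist only for illustration. Whether a given compiler actually performs each rewrite depends on the compiler and the optimization level.

```c
#include <assert.h>

#define DEBUG 0                      /* compile-time constant, as on the slide */

static int dbg_calls = 0;            /* hypothetical counter, for illustration only */
static void dbg(int p1) { (void)p1; dbg_calls++; }

/* Constant folding: 8 + 1 can be evaluated at compile time. */
static int folded(void) { return 8 + 1; }

/* Algebraic simplification: a*b + a*c needs two multiplies, a*(b+c) needs one. */
static int distributed(int a, int b, int c) { return a * b + a * c; }
static int factored(int a, int b, int c)    { return a * (b + c); }

/* Strength reduction: multiply by 2 versus shift left by 1.
   The shift form is used here only for non-negative values that do not overflow. */
static int times_two_mul(int a)   { return a * 2; }
static int times_two_shift(int a) { return a << 1; }

/* Dead code: with DEBUG == 0 the condition folds to false,
   so the call to dbg() can never execute and may be removed. */
static void maybe_debug(int p1) { if (DEBUG) dbg(p1); }

/* Returns 1 if every rewritten form agrees with its original. */
static int check_equivalences(void) {
    maybe_debug(42);
    return folded() == 9
        && distributed(3, 4, 5) == factored(3, 4, 5)
        && times_two_mul(21) == times_two_shift(21)
        && dbg_calls == 0;
}
```

Calling check_equivalences() from a test harness should return 1. Compiling the file with and without optimization and comparing the generated assembly is one way to see which of these rewrites a particular compiler applies.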
Function calls
• Understand the overhead of function calls
• Caller and callee must follow a consistent protocol (order) to access the stack
  • Different compilers and programmers may follow different orders
• Access the stack (ARM)
  • Push: STR r0, [r13, #4]!
  • Pop: SUB r13, #4

ARM
• Branch and link instruction:
    BL foo  ==  MOV r14, r15
                B foo
  • r15 contains the current PC
  • Copies current PC to r14.
• To return from subroutine:
    MOV r15,r14

Example Protocol for ARM
• R13 always points to the top of stack
• Caller: call a function
  • Push parameters to stack
  • BL (R15 (PC) → R14; jump)
• Callee: receive a call
  • Read parameters from stack
  • Overwrite top of stack with return address (R14)
• Callee: return
  • Load PC with return address (on top of stack)
• Caller: receive a return
  • Pop callee’s return address from stack
• SHARC: PC stack → no need to push/pop return addresses

Nested function calls
    int main() { f1(x); }
    void f1(int a) { f2(a); }

    ; f1 is called by main()
    f1  LDR r0, [r13]        ; load para. into r0 from stack
        STR r14, [r13]       ; store f1’s return addr.
    ; f1 calls f2()
        STR r0, [r13, #4]!   ; push para. for f2 to stack
        BL f2                ; branch and link to f2
    ; f1 receives return from f2()
        SUB r13, #4          ; pop f2’s para. off stack
    ; f1 returns to main()
        LDR r15, [r13]       ; restore register and return
Correct p. 82 of textbook!

Function inlining
• Eliminates procedure linkage overhead:
    int foo(int a, int b, int c) { return a + b - c; }
    z = foo(w,x,y);  →  z = w + x - y;
• (+) Improve performance
• (-) May increase code size
  • Not always…
  • Affect instruction cache behavior

Reading
• Sec 5.5, 5.6, 5.7
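As a sketch of the Function inlining slide, the C fragment below places the called version, a manually inlined version, and a version that uses the C99 inline keyword side by side. The wrapper names with_call, inlined, and with_hint are hypothetical. The inline keyword is only a hint, so whether the call is actually removed is left to the compiler.

```c
#include <assert.h>

/* Out-of-line version: each call pays the linkage cost
   (parameter passing, branch and link, return). */
static int foo(int a, int b, int c) { return a + b - c; }

static int with_call(int w, int x, int y) { return foo(w, x, y); }

/* Manually inlined version: the body of foo replaces the call,
   as in z = w + x - y on the slide. */
static int inlined(int w, int x, int y) { return w + x - y; }

/* Compiler-assisted version: static inline asks the compiler to
   consider substituting the body at each call site. */
static inline int foo_inline(int a, int b, int c) { return a + b - c; }

static int with_hint(int w, int x, int y) { return foo_inline(w, x, y); }
```

All three wrappers compute the same value, for example 10 for the arguments (7, 5, 2). The difference shows up only in the generated code: the inlined forms avoid the call overhead, but duplicating a larger body at many call sites can grow the program and change instruction cache behavior.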