PIN: Building Customized Program Analysis Tools with Dynamic Instrumentation
Luk et al., PLDI 2005
Presented by Godmar Back
CS 6304 Spring 2007, 1/23/2007

PIN
• Dynamic instrumentation framework
• Goals:
  – Easy-to-use
  – Portable
  – Transparent
  – Efficient
  – Robust

PIN Architecture
[figure]

Sample
• Note: no architecture-dependent code

How PIN works
• Reads binary:
  – Parses the ELF binary the same way the OS would; finds the entry point
• Parses machine instructions of binary (1)
• Parses machine instructions of (compiled) instrumentation code (2)
• Inserts (2) in (1) as directed by tool
• Translates mix to (same-architecture) machine instructions
  – No IR is used
• PIN translates one trace at a time
  – Translated instructions stored in a cache
• Executes translated instructions only

Traces
• PIN offers instrumentation at different levels.
• Key concept is a trace.
• Trace: straight-line sequence of instructions that ends with an unconditional transfer (or when it grows too large)
[figure: a trace with one Entry, basic blocks BB0, BB1, BB2, and multiple Exits]
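The translate-on-miss cycle from "How PIN works" can be illustrated with a toy model. This is only a sketch under stated assumptions: the instruction set (`add`, `dec`, `jnz`, `jmp`, `halt`), the `PROGRAM` table, and the helper names are all hypothetical, and the "code cache" holds decoded traces rather than real machine code.

```python
# Toy model of "translate one trace at a time, cache it, execute only the cache".
# Everything here (opcodes, PROGRAM, function names) is hypothetical.

# address -> (opcode, operand)
PROGRAM = {
    0: ("add", 1),   # acc += 1
    1: ("jmp", 2),   # unconditional transfer: ends the trace
    2: ("dec", 0),   # counter -= 1
    3: ("jnz", 0),   # side exit to address 0 while counter != 0
    4: ("halt", 0),  # ends the trace and the program
}


def build_trace(program, start, max_len=8):
    """Collect a straight-line run that ends at an unconditional
    transfer, or when the trace reaches max_len instructions."""
    trace = []
    pc = start
    while len(trace) < max_len:
        op, arg = program[pc]
        trace.append((pc, op, arg))
        if op in ("jmp", "halt"):
            break
        pc += 1
    return trace


def execute(trace, state):
    """Run one cached trace; return the next address, or None on halt."""
    for pc, op, arg in trace:
        if op == "add":
            state["acc"] += arg
        elif op == "dec":
            state["counter"] -= 1
        elif op == "jnz":
            if state["counter"] != 0:
                return arg          # conditional side exit
        elif op == "jmp":
            return arg
        elif op == "halt":
            return None
    return trace[-1][0] + 1         # trace was cut off by the length cap


def run(program, counter):
    code_cache = {}                 # trace start address -> trace
    stats = {"translations": 0, "dispatches": 0}
    state = {"acc": 0, "counter": counter}
    pc = 0
    while pc is not None:
        if pc not in code_cache:    # miss: "translate" the trace once
            code_cache[pc] = build_trace(program, pc)
            stats["translations"] += 1
        stats["dispatches"] += 1
        pc = execute(code_cache[pc], state)
    return state["acc"], stats


acc, stats = run(PROGRAM, counter=3)
```

In this toy run the loop body executes three times, yet each of the two traces is built only once; every later visit is served from the cache.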
Instrumentation Levels
• By default, instrumentation is done by trace
• Provides "by instruction" instrumentation, implemented as a convenience only:
    for (b : basicblocks) {
      for (i : instructions(b)) { …. }}
• Routine:
  – Add instrumentation once a routine is entered
• Image:
  – Perform some instrumentation when an image is loaded (executable, .so file, etc.)

Techniques (1): Trace Linking
• When a trace ends
  – Examples:
    • Virtual method dispatch: jmp *eax
    • Function call return
  – First time: return to VM, examine where the trace ended and where it goes, translate the subsequent trace.
  – Second time: would like to jump directly to the successor trace if at all possible
• Easy if the trace ends with a direct jump
• Need prediction if it ends with an indirect jump

Trace Linking (cont'd)
[figure]
• Q.: What is the overhead here, in number of instructions?

Cloning & Trace Linking
[figure]
• Q.: Why do we still need a misprediction check?

Register Reallocation
• Virtual vs. physical registers
  – "Virtual registers" are machine registers (%EAX, %EBX, etc.) as seen by the application program's compiler
  – "Physical registers" are the ones holding the actual values during the execution of translated code
• Must map virtual to physical
  – Must guarantee that live virtual registers are not destroyed; spill to memory if needed.
• Register allocation problem!

Register Allocation
• Traditional approach:
  – Build CFG. Do liveness analysis. Compute interference graph. Color it. Assign registers.
  – Won't work here because the entire CFG is not known; it is built incrementally.
• Alternative:
  – Linear-scan register assignment
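The linear-scan alternative named above can be sketched in a few lines. This follows the general scheme of Poletto and Sarkar's linear-scan allocation and is not PIN's actual allocator; the interval names, instruction indices, and register names (`r0`, `r1`) are hypothetical.

```python
def linear_scan(intervals, num_regs):
    """Greedy linear-scan register assignment.

    intervals: dict mapping a name to its live range (start, end),
               given as instruction indices.
    Returns (assignment, spilled).
    """
    free = [f"r{i}" for i in range(num_regs)]
    active = []        # (end, name) of intervals holding a register, sorted by end
    assignment = {}    # name -> register
    spilled = []       # names that live in memory

    by_start = sorted(intervals.items(), key=lambda kv: kv[1][0])
    for name, (start, end) in by_start:
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(assignment[old])

        if free:
            assignment[name] = free.pop(0)
            active.append((end, name))
            active.sort()
            continue

        # No free register: spill whichever interval ends farthest away.
        last_end, last_name = active[-1]
        if last_end > end:
            assignment[name] = assignment.pop(last_name)  # take over its register
            spilled.append(last_name)
            active[-1] = (end, name)
            active.sort()
        else:
            spilled.append(name)

    return assignment, spilled


# Hypothetical live ranges, two physical registers.
assignment, spilled = linear_scan(
    {"A": (0, 10), "B": (1, 4), "C": (2, 6), "D": (5, 8)}, num_regs=2
)
```

Here `A` has the farthest end point when `C` needs a register, so `A` is the one spilled; `D` later reuses the register freed when `B` expires. Only a single pass over the sorted intervals is needed, which is why the approach suits code that is discovered incrementally.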
Linear Scan Allocation [Poletto'99]
• Idea:
  – Determine live ranges
  – Range defined in terms of instruction index
  – Assign registers greedily
  – When spilling, spill the range with the farthest end point
• Q.: What is the heuristic?
[figure: live ranges at successive points, assuming 2 physical registers: A; A, B; B, D; D, E; D]

Register Reallocation (cont'd)
• When linking traces, would like to avoid rearranging registers: thus, on a code cache miss, JIT the target trace with the v-to-p mapping that the origin trace ended with.
• Second time around (if the target trace is reached from a different origin):
  – Need compensation code
• By comparison: Valgrind always spills all virtual registers to memory

Register Reconciliation (1)
[figure]

Register Reconciliation (2)
[figure]

Register Reconciliation (3)
[figure]
• EBX is thread-local location (optimized for single-threaded case)

Other Optimizations
• Inlining vs. bridging
• Big question: when to inline (will examine on Thursday)
• Inlining optimization:
  – Can avoid blindly saving caller-saved registers (including eflags). Why?
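The compensation code needed when a target trace is reached from a second origin (Register Reallocation, cont'd) can be sketched as follows. This is a deliberately simple illustration, not PIN's reconciliation algorithm: it assumes both mappings cover the same live virtual registers, routes every mismatch through the virtual register's spill slot instead of using register-to-register moves, and all names (`MEM`, `compensation_code`, `run_moves`) are hypothetical.

```python
MEM = "mem"  # marker: the virtual register currently lives in its spill slot


def compensation_code(src, dst):
    """Emit moves that turn virtual-to-physical mapping `src` into `dst`.

    src, dst: dict mapping a virtual register to a physical register
              name, or to MEM if it is spilled.
    All stores come first, so no value is overwritten before it is saved.
    """
    stores, loads = [], []
    for v in sorted(dst):
        s, d = src[v], dst[v]
        if s == d:
            continue                        # already where the target expects it
        if s != MEM:
            stores.append(("store", v, s))  # slot[v] = phys[s]
        if d != MEM:
            loads.append(("load", d, v))    # phys[d] = slot[v]
    return stores + loads


def run_moves(code, phys, slots):
    """Interpret the emitted moves on toy register and spill-slot state."""
    for op, a, b in code:
        if op == "store":
            slots[a] = phys[b]
        else:
            phys[a] = slots[b]


# Origin trace ends with this mapping ...
src = {"eax": "eax", "ebx": "edi", "ecx": "esi"}
# ... but the target trace was compiled expecting this one.
dst = {"eax": "eax", "ebx": "esi", "ecx": MEM}

code = compensation_code(src, dst)

phys = {"eax": 1, "edi": 2, "esi": 3}   # virtual eax=1, ebx=2, ecx=3
slots = {}
run_moves(code, phys, slots)
```

After the moves run, virtual `ebx` sits in physical `esi` and virtual `ecx` sits in its spill slot, which is what the target trace expects. A production system would prefer direct register moves where possible; going through memory simply keeps the sketch free of move-ordering hazards.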
Performance Analysis
• What counts is overhead/slowdown.
  – How much is acceptable? 120%? 200%? 2000%?
• Must know NULL-tool overhead/baseline slowdown
• Obviously, tool overhead depends on the tool
  – How much work is done in tool code (in the common path?)
  – How efficient is the bridging code, and how often could inlining be applied?

NULL-tool overhead
[figure]

NULL-tool: PIN vs Competition
[figure]

Count-BB Tool
[figure]

BB-tool: PIN vs Competition
[figure]

Applications
• Only few at the time PIN was published; many more now, see http://rogue.colorado.edu/pin
• Mainly used in architecture community so far
  – Cache simulation, program phase analysis, etc.
• Top
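One way to separate baseline slowdown from tool-specific cost, as the Performance Analysis slide asks, is sketched below. The convention (100% means native speed) and the timings are assumptions made for illustration only; they are not measurements from the paper.

```python
def slowdown_percent(native_s, instrumented_s):
    """Run time relative to native, in percent (100.0 means no slowdown)."""
    return 100.0 * instrumented_s / native_s


def split_overhead(native_s, null_tool_s, tool_s):
    """Separate the framework's baseline cost from the tool's own cost."""
    return {
        "null_tool_percent": slowdown_percent(native_s, null_tool_s),
        "tool_percent": slowdown_percent(native_s, tool_s),
        # Extra time attributable to the tool itself, relative to native.
        "tool_only_percent": 100.0 * (tool_s - null_tool_s) / native_s,
    }


# Hypothetical timings in seconds: native, NULL-tool, and a real tool.
report = split_overhead(native_s=10.0, null_tool_s=15.0, tool_s=25.0)
```

With these made-up numbers the framework alone accounts for a 150% relative run time, and the tool adds another 100 percentage points on top of it.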
PIN Goals Revisited
• Easy-to-use
  – Yes, but little support for accessing internals (e.g. liveness ranges); little support for accessing symbolic information
• Portable
  – Yes: four architectures, three OSes
• Transparent
  – Almost completely (minus address space effects)
• Efficient
  – According to their benchmarks, for simple codes
• Robust
  – In my experience, pretty robust

Discussion/Questions