CS502: Compiler Design Runtime Environments Manas Thakur Fall 2020
Going backstage
[Figure: phases of a compiler, all sharing a symbol table.
Front end: Character stream → Lexical Analyzer → Token stream → Syntax Analyzer → Syntax tree → Semantic Analyzer → Syntax tree → Intermediate Code Generator → Intermediate representation.
Back end: Intermediate representation → Machine-Independent Code Optimizer → Intermediate representation → Code Generator → Target machine code → Machine-Dependent Code Optimizer → Target machine code.]
What all from the runtime interests a compiler?
● Memory
  – holds data and code
  – Our interest: Storage layouts
● Processor(s)
  – perform(s) computations in registers
  – Our interest: Register allocation
● Instruction set
  – defines the primitives available for execution
  – Our interest: Code generation
● Ultimate aim: Performance
  – in terms of time and memory
  – Our interest: Optimization
Typical memory subdivision while executing a program
● Code: Instructions
● Static: Data across all procedures (globals, constants)
● Heap: Data that outlives procedures (malloc, new)
● Free memory
● Stack: Data local to procedures (variables, parameters, temporaries)
● Compiler’s responsibility: To reserve space for all these kinds of memory
Procedure abstraction
● A namespace for locals and parameters
● Also the return value
● Compiler passes (recall ICG) introduce temporaries
● We need to reserve space for all of them
● Operations:
  – Call another procedure
    ● caller vs callee
  – Return from the current procedure
How do we call and return from a procedure?
● In the caller:
  – Save state of current procedure
    ● Program counter (where to resume)
    ● Registers (holding current computations)
  – Store arguments in a callee-accessible location
  – Transfer control-flow
● In the callee:
  – Collect parameters
  – Declare variables
  – Perform computations (perhaps in temporaries)
    ● May involve accessing globals
  – Return to caller
    ● Store return value in a caller-accessible location
● Some of these tasks can be performed either by the caller or by the callee
  – e.g., caller-save vs callee-save registers
Supporting procedure calls
● Only one procedure runs at a time
  – Unless?
● If foo calls bar, bar returns before foo
  – bar comes last but goes first
  – Last In, First Out!
● Procedure calls are modelled using a stack
  – Called control-stack or “the stack”
● Each active procedure has an activation record or frame in the stack
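The Last In, First Out behaviour above can be sketched in a few lines of Python. This is only an illustration: `Frame`, `control_stack`, `trace`, and `call` are hypothetical names, and a real frame holds much more than a procedure name.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A stripped-down activation record (hypothetical structure)."""
    procedure: str
    local_data: dict = field(default_factory=dict)


control_stack = []  # the top of the stack is the end of the list
trace = []          # records the order of calls and returns


def call(name, body=None):
    control_stack.append(Frame(name))  # push the callee's activation record
    trace.append(f"enter {name}")
    if body is not None:
        body()                         # the callee may call other procedures
    control_stack.pop()                # pop the record when the callee returns
    trace.append(f"leave {name}")


def foo():
    call("bar")                        # foo calls bar


call("foo", foo)
print(trace)
```

bar is entered last and left first, and the stack is empty again once foo returns.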
Activation record (a general structure)
Previous frame
● Actual parameters
● Return value(s)
● Control/access link (point to callers or other frames)
● Saved machine status (e.g., register values while transferring control-flow)
● Local data
● Temporaries
Next frame
Stack pointer: used to access other items
Frame pointer: boundary of current frame
Addressing items in activation records
[Figure: layout of one activation record, listed in order of growing addresses]
● Actual parameters (SP - offset)
● Return value(s) (SP - offset)
● Control/access link (SP - offset)
● Saved machine status (SP - offset)
● SP
● Local data (SP + offset)
● Temporaries (SP + offset)
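A minimal sketch of SP-relative addressing, assuming a flat `memory` list and a made-up frame layout. The names in `OFFSETS` and the offset values are hypothetical; a compiler would compute them per procedure.

```python
memory = [0] * 32  # toy memory, one cell per item

# Hypothetical offsets relative to SP: items communicated with the caller sit
# at negative offsets, locals and temporaries at non-negative offsets.
OFFSETS = {
    "param_x": -4,
    "return_value": -3,
    "control_link": -2,
    "saved_status": -1,
    "local_a": 0,
    "temp_t1": 1,
}


def address(sp, name):
    """Address of an item = SP plus its compile-time offset."""
    return sp + OFFSETS[name]


def store(sp, name, value):
    memory[address(sp, name)] = value


def load(sp, name):
    return memory[address(sp, name)]


sp = 10
store(sp, "param_x", 7)                        # the caller stored the argument
store(sp, "local_a", load(sp, "param_x") * 2)  # the callee computes a local
```

Only SP changes from one activation to the next; the offsets are fixed at compile time.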
Activation records: Design decisions
● Items communicated between caller and callee placed near the caller
  – Parameters
  – Return value
  – Advantage?
● Fixed-length items placed together
  – Parameters
  – Return value
  – Control link
● Space requirement of locals/temporaries sometimes not known early
[Figure: below the caller’s AR, the callee’s record holds, in order: actual parameters, return value(s), control/access link, saved machine status, local data, temporaries.]
Complications
● Access to non-local data
  – Store globals at a “globally known” location (recall Static from Slide 4?)
● Nested procedures
  – Similar to yet different from nested blocks
  – Store nesting-depth with each variable
  – Use access links to point to the frames of enclosing procedures
● Passing procedures as arguments (some other time!)
  – Or as return values
  – Functional languages
● Challenge:
  – Doing all this efficiently
Referencing variables with access links
● An access link points to the most recent activation of the procedure that contains the current procedure
  – When can we have multiple activations of a procedure on the control-stack?
● Suppose
  – N_p is the nesting-depth of procedure p that refers to non-local variable a
  – N_a is the nesting-depth of the procedure, say q, that defines a
● N_p - N_a access links would have to be traversed when in procedure p to get to the activation record of q
● Can we make this more efficient?
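The traversal can be sketched as follows, assuming each frame stores its nesting-depth and an access link. `Frame`, `lookup`, and the example procedures are hypothetical names chosen for illustration.

```python
class Frame:
    """Activation record reduced to what the lookup needs (hypothetical)."""

    def __init__(self, name, depth, access_link=None):
        self.name = name
        self.depth = depth              # nesting-depth of the procedure
        self.access_link = access_link  # frame of the enclosing procedure
        self.vars = {}


def lookup(frame, var_depth, var_name):
    """Follow N_p - N_a access links, then read the variable."""
    hops = frame.depth - var_depth
    target = frame
    for _ in range(hops):
        target = target.access_link
    return target.vars[var_name], hops


# q (depth 1) defines a; r is nested in q (depth 2); p is nested in r (depth 3).
q_frame = Frame("q", 1)
q_frame.vars["a"] = 42
r_frame = Frame("r", 2, access_link=q_frame)
p_frame = Frame("p", 3, access_link=r_frame)

value, hops = lookup(p_frame, 1, "a")
print(value, hops)
```

Here N_p = 3 and N_a = 1, so two links are followed before a is read.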
Displays as an alternative to access links
● Traversing access links one-by-one may be costly in case of a high nesting-depth difference for the variable to be accessed
● Idea:
  – Use a global array with the pointer to the most recently active procedure with nesting-depth i at index i
  – The array is called a display (say d)
  – Advantage:
    ● If I am a procedure m with nesting-depth k, and I want to access a variable a with nesting-depth l ≤ k, I only have to follow a maximum of two pointers:
      – One to d[l], which gives the AR defining a
      – Another for the offset of a from the SP of the obtained AR
Next class: Heap management.
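A minimal sketch of a display, under the assumption that each frame saves the previous entry for its depth on entry and restores it on return (a common way to keep the array up to date; the slide does not specify the maintenance). `display`, `enter`, `leave`, and `lookup` are hypothetical names.

```python
display = [None] * 4  # display[i]: most recent frame with nesting-depth i


class Frame:
    def __init__(self, name, depth):
        self.name = name
        self.depth = depth
        self.vars = {}
        self.saved_display = None  # previous display entry for this depth


def enter(frame):
    # Assumption: save the old entry so it can be restored on return.
    frame.saved_display = display[frame.depth]
    display[frame.depth] = frame


def leave(frame):
    display[frame.depth] = frame.saved_display


def lookup(var_depth, var_name):
    """One step to display[var_depth], one more to the variable itself."""
    return display[var_depth].vars[var_name]


outer = Frame("outer", 1)
outer.vars["a"] = 42
enter(outer)
mid = Frame("mid", 2)
enter(mid)
m = Frame("m", 3)
enter(m)

value = lookup(1, "a")  # m reads a without walking through mid
leave(m)
```

The cost of the lookup no longer depends on the nesting-depth difference between m and the procedure defining a.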
CS502: Compiler Design Runtime Environments (Cont.) Manas Thakur Fall 2020
Heap
● A chunk of memory used usually for dynamically allocated data
  – using malloc, calloc, new, etc.
● Goal:
  – Have as much space as possible to serve allocation requests
● Challenge:
  – When to deallocate a previously allocated chunk
  – Why didn’t this challenge exist with a stack?
    ● Memory associated with a frame gets popped out automatically once the corresponding procedure finishes execution.
Memory allocation
● Simple task
● Keep a pointer to the first available memory location
● Allocate the requested block when a request comes
  – Well, there are again multiple ways to do this:
    ● First fit
    ● Best fit
    ● Read OS books for more!
● Move the pointer to the next free location
● Challenge:
  – Memory eventually fills up!
  – Need deallocations.
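First fit and best fit differ only in which free block they pick. The sketch below assumes the free memory is kept as a list of `(start, length)` blocks; `first_fit`, `best_fit`, and `allocate` are hypothetical helper names, and coalescing of freed blocks is left out.

```python
def first_fit(free_list, size):
    """Index of the first free block that is large enough, or None."""
    for i, (_start, length) in enumerate(free_list):
        if length >= size:
            return i
    return None


def best_fit(free_list, size):
    """Index of the smallest free block that is large enough, or None."""
    candidates = [(length, i)
                  for i, (_start, length) in enumerate(free_list)
                  if length >= size]
    return min(candidates)[1] if candidates else None


def allocate(free_list, size, choose):
    """Carve `size` cells out of the block picked by `choose`."""
    i = choose(free_list, size)
    if i is None:
        return None                    # no block can serve the request
    start, length = free_list[i]
    if length == size:
        del free_list[i]               # block used up entirely
    else:
        free_list[i] = (start + size, length - size)  # keep the remainder
    return start


blocks = [(0, 50), (100, 20), (200, 30)]
print(allocate(list(blocks), 20, first_fit))  # takes from the block at 0
print(allocate(list(blocks), 20, best_fit))   # takes the exact-size block at 100
```

First fit stops at the earliest block that works, while best fit scans every block to leave the smallest leftover.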
Explicit deallocation
● Programmer’s task to deallocate memory
● Most languages till 1990s had explicit deallocation
  – Exception: Lisp had garbage collection far back in 1958!
● Examples:
  – free in C
  – delete in C++
● Problem:
  – Often difficult to visualize when to free memory
  – Deleting conservatively as well as aggressively may lead to memory-related issues
Problems with bad explicit deallocation
● Too conservative:
  – Memory leaks
    ● Memory fills up while running applications
    ● Buy next smartphone with higher GBs of RAM!
  – What if it’s a high-end server at a government institute?
  – What if it’s an iPhone? :-)
● Too aggressive:
  – Dangling pointers
    ● A pointer to freed memory
    ● Using such pointers might lead to weird (and harmful) behaviour
Implicit deallocation of memory
● Also called garbage collection
● Motto:
  – Don’t trust the programmer
  – Instead: Trust the compiler writer!
● Idea:
  – Memory that is no longer in use should be reclaimed automatically
● Examples:
  – OO: Java, Smalltalk
  – Functional: Lisp, ML, Haskell
  – Logic: Prolog
  – Scripting: Awk, Perl
Garbage collection schemes
● One shot:
  – Pause the program
  – Give full control to a GC pass
  – Hope the situation improves once GC is over
● On-the-fly (aka incremental):
  – Perform some GC actions periodically
    ● say after each call to new, and/or every time a procedure returns
  – Sometimes a one-shot GC may be kept as backup
● Concurrent:
  – Separate thread for GC
  – Relatively complicated, but gaining popularity
Garbage collection algorithms
● Reference counting
● Mark and sweep
● Baker’s
● Lieberman’s
● Generational
● Region-based
● Parallel
● Many in the JVM itself:
  – G1, Parallel, Concurrent mark and sweep (CMS), Serial, Shenandoah, ZGC ... list keeps growing.
We will get a glimpse of the colored ones
Reference counting GC
● With each allocated chunk (from now on, object) obj:
  – maintain the count of references (or pointers) that point to obj
● Operations and actions:
  – Allocate obj (e.g., q = new T(), such that the allocated chunk is named obj):
    ● Initialize obj.rc to one
  – Copy obj (e.g., using p = q):
    ● ++obj.rc
  – Reference changes to obj’ (e.g., q = r):
    ● --obj.rc
  – obj.rc becomes zero:
    ● Reclaim obj
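The four rules can be simulated with explicit counters. This is a sketch under simplifying assumptions: references live only in a `variables` dictionary, objects hold no references to each other, and "reclaiming" just records the object's name. `Obj`, `assign`, `new`, `release`, and `reclaimed` are hypothetical names.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.rc = 0  # number of references pointing to this object


variables = {}  # variable name -> object it currently refers to
reclaimed = []  # names of reclaimed objects, in order


def release(obj):
    obj.rc -= 1                      # a reference to obj went away
    if obj.rc == 0:
        reclaimed.append(obj.name)   # count hit zero: reclaim obj


def assign(var, obj):
    """var = <reference to obj>: bump the new target, drop the old one."""
    old = variables.get(var)
    obj.rc += 1                      # increment first so var = var is safe
    if old is not None:
        release(old)
    variables[var] = obj


def new(var, name):
    """var = new T(): the fresh object ends up with rc == 1."""
    obj = Obj(name)
    assign(var, obj)
    return obj


new("q", "obj")                  # obj.rc == 1
new("r", "obj2")                 # obj2.rc == 1
assign("p", variables["q"])      # p = q: obj.rc == 2
assign("q", variables["r"])      # q = r: obj.rc == 1, obj2.rc == 2
assign("p", variables["r"])      # p = r: obj.rc == 0, so obj is reclaimed
print(reclaimed)
```

After the last assignment nothing refers to obj any more, so it is the only object reclaimed.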