


Linking
CSci 2021: Machine Architecture and Organization
May 1st, 2020
Your instructor: Stephen McCamant
Based on slides originally by: Randy Bryant, Dave O’Hallaron
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition

Today
- Linking
- Case study: Library interpositioning

Example C Program

main.c:

    int sum(int *a, int n);

    int array[2] = {1, 2};

    int main()
    {
        int val = sum(array, 2);
        return val;
    }

sum.c:

    int sum(int *a, int n)
    {
        int i, s = 0;

        for (i = 0; i < n; i++) {
            s += a[i];
        }
        return s;
    }

Static Linking
- Programs are translated and linked using a compiler driver:

    linux> gcc -Og -o prog main.c sum.c
    linux> ./prog

- Build flow (diagram):
    main.c, sum.c (source files)
      -> Translators (cpp, cc1, as), one run per source file
      -> main.o, sum.o (separately compiled relocatable object files)
      -> Linker (ld)
      -> prog (fully linked executable object file; contains code and data for all functions defined in main.c and sum.c)

Why Linkers?
- Reason 1: Modularity
  - Program can be written as a collection of smaller source files, rather than one monolithic mass.
  - Can build libraries of common functions (more on this later)
    - e.g., Math library, standard C library

Why Linkers? (cont)
- Reason 2: Efficiency
  - Time: Separate compilation
    - Change one source file, compile, and then relink.
    - No need to recompile other source files.
  - Space: Libraries
    - Common functions can be aggregated into a single file...
    - Yet executable files and running memory images contain only code for the functions they actually use.
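To make the split between main.c and sum.c concrete, here is a minimal single-file sketch of the same program. It is an illustration only: `compute_val` is a hypothetical stand-in for `main` (so the fragment can be called from any driver), and the comments mark which lines would sit in which file in the two-file version.

```c
/* Declaration only. In the two-file version this is all that main.c
 * knows about sum(); the linker later connects the call below to the
 * definition that lives in sum.c. */
int sum(int *a, int n);

/* Global data defined in main.c. */
int array[2] = {1, 2};

/* Hypothetical stand-in for main(): same body, different name. */
int compute_val(void)
{
    int val = sum(array, 2);   /* reference to a symbol defined elsewhere */
    return val;
}

/* Definition. In the two-file version this function is sum.c. */
int sum(int *a, int n)
{
    int i, s = 0;

    for (i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}
```

With separate compilation, editing only the body of `sum` would mean recompiling sum.c and relinking; main.o could be reused unchanged.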

What Do Linkers Do?
- Step 1: Symbol resolution
  - Programs define and reference symbols (global variables and functions):

      void swap() {…}    /* define symbol swap */
      swap();            /* reference symbol swap */
      int *xp = &x;      /* define symbol xp, reference x */

  - Symbol definitions are stored in the object file (by the assembler) in the symbol table.
    - The symbol table is an array of structs.
    - Each entry includes the name, size, and location of a symbol.
  - During the symbol resolution step, the linker associates each symbol reference with exactly one symbol definition.

What Do Linkers Do? (cont)
- Step 2: Relocation
  - Merges separate code and data sections into single sections
  - Relocates symbols from their relative locations in the .o files to their final absolute memory locations in the executable.
  - Updates all references to these symbols to reflect their new positions.

Let’s look at these two steps in more detail.

Three Kinds of Object Files (Modules)
- Relocatable object file (.o file)
  - Contains code and data in a form that can be combined with other relocatable object files to form an executable object file.
  - Each .o file is produced from exactly one source (.c) file
- Executable object file (a.out file)
  - Contains code and data in a form that can be copied directly into memory and then executed.
- Shared object file (.so file)
  - Special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or run-time.
  - Called Dynamic Link Libraries (DLLs) by Windows

Executable and Linkable Format (ELF)
- Standard binary format for object files
- One unified format for
  - Relocatable object files (.o),
  - Executable object files (a.out)
  - Shared object files (.so)
- Generic name: ELF binaries

ELF Object File Format
- File layout (diagram), starting at offset 0: ELF header, segment header table (required for executables), .text section, .rodata section, .data section, .bss section, .symtab section, .rel.text section, .rel.data section, .debug section, section header table
- ELF header
  - Word size, byte ordering, file type (.o, exec, .so), machine type, etc.
- Segment header table
  - Page size, virtual addresses of memory segments (sections), segment sizes.
- .text section
  - Code
- .rodata section
  - Read only data: jump tables, ...
- .data section
  - Initialized global variables
- .bss section
  - Uninitialized global variables
  - “Block Started by Symbol”
  - “Better Save Space”
  - Has section header but occupies no space

ELF Object File Format (cont.)
- .symtab section
  - Symbol table
  - Procedure and static variable names
  - Section names and locations
- .rel.text section
  - Relocation info for .text section
  - Addresses of instructions that will need to be modified in the executable
  - Instructions for modifying.
- .rel.data section
  - Relocation info for .data section
  - Addresses of pointer data that will need to be modified in the merged executable
- .debug section
  - Info for symbolic debugging (gcc -g)
- Section header table
  - Offsets and sizes of each section
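The ELF header fields listed above (word size, byte ordering, file type) sit at fixed byte offsets at the start of every ELF file, so they can be decoded by hand. The sketch below is a minimal illustration, not a full parser: the function names `elf_file_type` and `elf_word_size` and the `ELF_*` constants are made up for this example, while the offsets and values follow the ELF specification (`e_ident[4]` is the class, `e_ident[5]` is the data encoding, and the 16-bit `e_type` field starts at byte 16).

```c
#include <stddef.h>

/* Values of the e_type field: relocatable (.o), executable, shared (.so). */
enum { ELF_REL = 1, ELF_EXEC = 2, ELF_DYN = 3 };

/* Returns 1 if buf starts with the ELF magic number 0x7f 'E' 'L' 'F'. */
static int has_elf_magic(const unsigned char *buf, size_t len)
{
    return len >= 4 && buf[0] == 0x7f && buf[1] == 'E' &&
           buf[2] == 'L' && buf[3] == 'F';
}

/* Returns the e_type field of the ELF image in buf, or -1 if buf is too
 * short, lacks the magic number, or records an unknown byte order. */
int elf_file_type(const unsigned char *buf, size_t len)
{
    if (len < 18 || !has_elf_magic(buf, len))
        return -1;
    if (buf[5] == 1)                       /* little-endian file */
        return buf[16] | (buf[17] << 8);
    if (buf[5] == 2)                       /* big-endian file */
        return (buf[16] << 8) | buf[17];
    return -1;
}

/* Returns 32 or 64 according to e_ident[4], or 0 if unrecognized. */
int elf_word_size(const unsigned char *buf, size_t len)
{
    if (len < 5 || !has_elf_magic(buf, len))
        return 0;
    if (buf[4] == 1)
        return 32;
    if (buf[4] == 2)
        return 64;
    return 0;
}
```

A caller would read the first 18 or more bytes of a file into a buffer and pass it in; a main.o built as in the earlier slides would be expected to report `ELF_REL`. On Linux, the system header `<elf.h>` provides ready-made structs and constants for the same fields.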

Linker Symbols
- Global symbols
  - Symbols defined by module m that can be referenced by other modules.
  - E.g.: non-static C functions and non-static global variables.
- External symbols
  - Global symbols that are referenced by module m but defined by some other module.
- Local symbols
  - Symbols that are defined and referenced exclusively by module m.
  - E.g.: C functions and global variables defined with the static attribute.
  - Local linker symbols are not local program variables

Step 1: Symbol Resolution

main.c:

    int sum(int *a, int n);

    int array[2] = {1, 2};

    int main()
    {
        int val = sum(array, 2);
        return val;
    }

sum.c:

    int sum(int *a, int n)
    {
        int i, s = 0;

        for (i = 0; i < n; i++) {
            s += a[i];
        }
        return s;
    }

Annotations on the code:
- main.c defines the globals array and main.
- The call sum(array, 2) in main.c references a global (sum) that's defined in sum.c.
- The same call references a global (array) that's defined in main.c.
- Linker knows nothing of val (in main.c).
- Linker knows nothing of i or s (in sum.c).

Local Symbols
- Local non-static C variables vs. local static C variables
  - local non-static C variables: stored on the stack
  - local static C variables: stored in either .bss or .data

    int f()
    {
        static int x = 0;
        return x;
    }

    int g()
    {
        static int x = 1;
        return x;
    }

- Compiler allocates space in .data for each definition of x
- Creates local symbols in the symbol table with unique names, e.g., x.1 and x.2.

How Linker Resolves Duplicate Symbol Definitions
- Program symbols are either strong or weak
  - Strong: procedures and initialized globals
  - Weak: uninitialized globals

p1.c:

    int foo=5;    /* strong */
    p1() {        /* strong */
    }

p2.c:

    int foo;      /* weak */
    p2() {        /* strong */
    }

Linker’s Symbol Rules
- Rule 1: Multiple strong symbols are not allowed
  - Each item can be defined only once
  - Otherwise: Linker error
- Rule 2: Given a strong symbol and multiple weak symbols, choose the strong symbol
  - References to the weak symbol resolve to the strong symbol
- Rule 3: If there are multiple weak symbols, pick an arbitrary one
  - Can override this with gcc -fno-common

Linker Puzzles
Each case shows the contents of two source files, separated by |.
- int x; p1() {}  |  p1() {}
  Link time error: two strong symbols (p1)
- int x; p1() {}  |  int x; p2() {}
  References to x will refer to the same uninitialized int. Is this what you really want?
- int x; int y; p1() {}  |  double x; p2() {}
  Writes to x in p2 might overwrite y! Evil!
- int x=7; int y=5; p1() {}  |  double x; p2() {}
  Writes to x in p2 will overwrite y! Nasty!
- int x=7; p1() {}  |  int x; p2() {}
  References to x will refer to the same initialized variable.

Nightmare scenario: two identical weak structs, compiled by different compilers with different alignment rules.
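The three symbol rules can be tried out on a toy model. The sketch below is not how ld is implemented; it is a small illustration in which `struct sym_def`, `resolve_symbol`, and the two `RESOLVE_*` codes are hypothetical names. One assumption is labeled in the code: where Rule 3 says "pick an arbitrary one", this sketch picks the first weak definition it sees.

```c
#include <stddef.h>
#include <string.h>

/* One symbol definition as a toy linker might record it. */
struct sym_def {
    const char *name;    /* symbol name, e.g. "foo" */
    const char *module;  /* defining module, e.g. "p1.o" */
    int strong;          /* 1 = strong, 0 = weak */
};

#define RESOLVE_UNDEFINED       (-1)  /* no definition found */
#define RESOLVE_MULTIPLE_STRONG (-2)  /* Rule 1 violated */

/* Returns the index in defs of the definition that references to name
 * resolve to, or one of the negative RESOLVE_* codes. */
int resolve_symbol(const struct sym_def *defs, size_t n, const char *name)
{
    int chosen = RESOLVE_UNDEFINED;
    int have_strong = 0;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(defs[i].name, name) != 0)
            continue;
        if (defs[i].strong) {
            if (have_strong)
                return RESOLVE_MULTIPLE_STRONG;  /* Rule 1: linker error */
            have_strong = 1;
            chosen = (int)i;                     /* Rule 2: strong beats weak */
        } else if (chosen == RESOLVE_UNDEFINED) {
            chosen = (int)i;  /* Rule 3 (assumption: first weak one wins) */
        }
    }
    return chosen;
}
```

Feeding it the p1.c / p2.c example (a strong `foo` from p1.o and a weak `foo` from p2.o) selects the p1.o definition, whichever order the two entries appear in; two entries that both define `p1` as strong yield `RESOLVE_MULTIPLE_STRONG`, matching the first puzzle.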
