
Compilation/linking revisited. Memory and C/C++ modules (from Reading #6)



  1. Compilation/linking revisited

     Memory and C/C++ modules (From Reading #6)

     [Figure: source file 1 ... source file N are each compiled into object file 1 ... object file N. These, together with library object files 1 ... M, are linked (relocation + linking) into a load file.]

     - Usually performed by gcc/g++ in one uninterrupted sequence
     - Will return to OOP topics (templates and library tools) soon

     Layout of C/C++ programs

     Source code:
     - object 1 definition
     - object 2 definition
     - function 1
     - static object 5 definition
     - object 3 definition
     - function 2
     - object 4 definition
     - function 3
     - static object 5 definition

     ... becomes an object module:
     - Header section
     - Machine code section (a.k.a. text section)
     - Initialized data section
     - Symbol table section
     - Relocation information section

     A sample C program: demo.c

         #include <stdio.h>

         int a[10]={0,1,2,3,4,5,6,7,8,9};
         int b[10];

         void main(){
             int i;
             static int k = 3;

             for(i = 0; i < 10; i++) {
                 printf("%d\n",a[i]);
                 b[i] = k*a[i];
             }
         }

     - Has a text section, of course: the machine code
     - Has initialized global data: a
     - Uninitialized global data: b
     - Static data: k
     - Has a local variable: i

     A possible structure of demo.o

     Offset  Contents  Comment
     ------  --------  -------
     Header section
     0       124       number of bytes of Machine code section
     4       44        number of bytes of Initialized data section
     8       40        number of bytes of Uninitialized data section (array b[]) (not part of this object module)
     12      60        number of bytes of Symbol table section
     16      44        number of bytes of Relocation information section
     Machine code section (124 bytes)
     20      X         code for the top of the for loop (36 bytes)
     56      X         code for call to printf() (22 bytes)
     68      X         code for the assignment statement (10 bytes)
     88      X         code for the bottom of the for loop (4 bytes)
     92      X         code for exiting main() (52 bytes)
     Initialized data section (44 bytes)
     144     0         beginning of array a[]
     148     1
     :       :
     176     8
     180     9         end of array a[] (40 bytes)
     184     3         variable k (4 bytes)
     Symbol table section (60 bytes)
     188     X         array a[]: offset 0 in Initialized data section (12 bytes)
     200     X         variable k: offset 40 in Initialized data section (10 bytes)
     210     X         array b[]: offset 0 in Uninitialized data section (12 bytes)
     222     X         main: offset 0 in Machine code section (12 bytes)
     234     X         printf: external, used at offset 56 of Machine code section (14 bytes)
     Relocation information section (44 bytes)
     248     X         relocation information

     - The object module contains neither uninitialized data (b) nor any local variables (i)

     Linux object file format

     [Figure: layout of an ELF file: \177ELF, .text, .rodata, .data, .bss, .symtab, .rel.text, .rel.data, .debug, .line, section header table.]

     - "ELF" stands for Executable and Linking Format
       - A 4-byte magic number followed by a series of named sections
     - Addresses assume the object file is placed at memory address 0
       - When multiple object files are linked together, we must update the offsets (relocation)
     - Tools to read contents: objdump and readelf (not available on all systems)
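
The offsets in the demo.o table follow from the five section sizes stored in the header. The sketch below recomputes them as a check on the arithmetic. The struct and function names (`demo_header`, `demo_layout`, `layout_of`, `check_demo_layout`) are hypothetical and only model the simplified format from the slide, not a real object file format.

```c
#include <assert.h>

/* Hypothetical model of the 20-byte header of the demo.o format above. */
struct demo_header {
    unsigned text_size;    /* Machine code section */
    unsigned data_size;    /* Initialized data section */
    unsigned bss_size;     /* Uninitialized data: size recorded, no bytes stored */
    unsigned symtab_size;  /* Symbol table section */
    unsigned rel_size;     /* Relocation information section */
};

/* File offsets at which each stored section begins. */
struct demo_layout {
    unsigned text_start, data_start, symtab_start, rel_start, file_size;
};

struct demo_layout layout_of(struct demo_header h)
{
    struct demo_layout l;
    l.text_start   = 5 * 4;                          /* five 4-byte header fields */
    l.data_start   = l.text_start + h.text_size;
    l.symtab_start = l.data_start + h.data_size;     /* bss_size adds nothing here */
    l.rel_start    = l.symtab_start + h.symtab_size;
    l.file_size    = l.rel_start + h.rel_size;
    return l;
}

/* Usage: reproduces the section offsets listed in the table. */
void check_demo_layout(void)
{
    struct demo_header h = { 124, 44, 40, 60, 44 };
    struct demo_layout l = layout_of(h);
    assert(l.text_start == 20);
    assert(l.data_start == 144);
    assert(l.symtab_start == 188);
    assert(l.rel_start == 248);
    assert(l.data_start + 40 == 184);  /* variable k: offset 40 in initialized data */
}
```

Note that the 40 bytes of array b[] never enter the sum: the header records the size, but the object module stores no bytes for it.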

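As a small illustration of the 4-byte magic number, the following sketch tests whether a buffer starts with the bytes \177 'E' 'L' 'F'. The function names `has_elf_magic` and `check_elf_magic` are made up for this example, and the sample bytes after the magic number are arbitrary filler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf begins with the ELF magic number "\177ELF", else 0. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Usage: an ELF-like prefix is accepted, other data is rejected. */
void check_elf_magic(void)
{
    const unsigned char elf[]  = { 0177, 'E', 'L', 'F', 2, 1, 1, 0 };
    const unsigned char text[] = { '#', '!', '/', 'b' };
    assert(has_elf_magic(elf, sizeof elf));
    assert(!has_elf_magic(text, sizeof text));
    assert(!has_elf_magic(elf, 3));   /* too short to hold the magic number */
}
```

In practice the buffer would be the first bytes read from a file; readelf and objdump, where available, perform this check and then decode the named sections.
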
  2. ELF sections

     [Figure: layout of an ELF file: \177ELF, .text, .rodata, .data, .bss, .symtab, .rel.text, .rel.data, .debug, .line, section header table.]

     - .text = machine code (compiled program instructions)
     - .rodata = read-only data
     - .data = initialized global variables
     - .bss = "block storage start" for uninitialized global variables; actually just a placeholder that occupies no space in the object file
     - .symtab = symbol table with information about functions and global variables defined and referenced in the program

     ELF Sections (cont.)

     - .rel.text = list of locations in the .text section that need to be modified when linked with other object files
     - .rel.data = relocation information for global variables referenced but not defined
     - .debug = debugging symbol table; only created if compiled with the -g option
     - .line = mapping between line numbers in source and machine code in .text; used by debugger programs

     Creation of a load module

     [Figure: Object Module A and Object Module B, each with a Header Section, Machine Code Section, Initialized data Section and Symbol table Section, are combined into one Load Module with the same four sections.]

     - Interleaved from multiple object modules
       - Sections must be "relocated"
     - Addresses relative to the beginning of a module
       - Necessary to translate from beginnings of object modules
     - When loaded, the OS will translate again to absolute addresses

     Loading and memory mapping

     [Figure: loading turns the load module (Header Section, Machine Code Section, Initialized data Section, Symbol table Section) into a (logical) address space containing Code, Static data, Dynamic data, Unused logical address space and Stack. Memory mapping then places the (logical) address spaces of program 1, program 2 and program 3 into PHYSICAL MEMORY alongside the OPERATING SYSTEM.]

     - Includes memory for stack, dynamic data (i.e., free store), and uninitialized global data
     - Physical memory is shared by multiple programs

     From source program to "placement" in memory during execution

         int a[10]={0,1,2,3,4,5,6,7,8,9};
         int b[10];

         void main()
         {
             int i;
             static int k = 3;

             for(i = 0; i < 10; i++) {
                 printf("%d\n",a[i]);
                 b[i] = k*a[i];
             }/*endfor*/
         }/*end main*/

     [Figure: the source program mapped onto the (logical) address space of the program in PHYSICAL MEMORY, next to the OPERATING SYSTEM.]

     - Code: code for printf(), code for top of for loop, code for call to printf(), code for b[i] = k*a[i]
     - Static data, initialized: array a[], variable k
     - Static data, uninitialized: array b[]
     - Dynamic data
     - Unused logical address space
     - Stack

     Dynamic memory allocation

     [Figure: the (logical) address space of the program in PHYSICAL MEMORY, before and after dynamic memory allocation. Both panels show Code, Static data (initialized, uninitialized), Dynamic data, Unused logical address space and Stack, with the OPERATING SYSTEM alongside. After allocation, an increment of dynamic data has been taken from the unused logical address space.]
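
To make the relocation step concrete, here is a minimal sketch under a strong simplifying assumption: the linker places the machine code of each object module back to back, in order, so a symbol's new offset is the total code size of the earlier modules plus its old module-relative offset. The names `module`, `relocated_offset` and `check_relocation` are hypothetical, and module B's sizes are made-up numbers; only the 124-byte code size comes from the demo.o example.

```c
#include <assert.h>

/* Hypothetical, simplified view of one object module's code section. */
struct module {
    unsigned code_size;      /* bytes of machine code in this module */
    unsigned symbol_offset;  /* one symbol's offset from the start of this module's code */
};

/* Offset of module `which`'s symbol inside the merged code section,
   assuming code sections are laid out back to back in array order. */
unsigned relocated_offset(const struct module *mods, unsigned which)
{
    unsigned base = 0;
    unsigned i;
    for (i = 0; i < which; i++)
        base += mods[i].code_size;   /* skip the code of earlier modules */
    return base + mods[which].symbol_offset;
}

/* Usage: module A has 124 bytes of code, module B has 80 (illustrative). */
void check_relocation(void)
{
    struct module mods[2] = { { 124, 0 }, { 80, 16 } };
    assert(relocated_offset(mods, 0) == 0);    /* first module does not move */
    assert(relocated_offset(mods, 1) == 140);  /* 124 + 16 */
}
```

A real linker applies the same idea per section and then patches every location listed in .rel.text and .rel.data; the OS adds another translation when the load module is mapped into memory.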

  3. Sections of an executable file

     [Figure: segments of an executable file.]

     Variables and objects in memory

     [Figure: two containers: 'A' stored as 01000001, and 16916 stored as 0100001000010100.]

     - Variables and data objects are data containers with names
     - The value of the variable is the code stored in the container
     - To evaluate a variable is to fetch the code from the container and interpret it properly
     - To store a value in a variable is to code the value and store the code in the container
     - The size of a variable is the size of its container

     Overflow is when a data code is larger than the size of its container

     - e.g.,

         char i;              // just 1 byte
         int *p = (int*)&i;   // legal
         *p = 1673579060;     // result if "big endian" storage:

       [Figure: the code 01001001100101100000001011010100 written at variable i overflows into the adjacent space X.]
     - If the whole space (X) belongs to this program:
       - Seems OK if X does not contain important data for the rest of the program's execution
       - Bad results or crash if important data are overwritten
     - If all or part of X belongs to another process, the program is terminated by the OS for a memory access violation (i.e., segmentation fault)

     More about overflow

     - Previous slide showed an example of "right overflow": result truncated (also warning)

       [Figure: 01000001 010001…]
     - Compilers handle "left overflow" by truncating too (usually without any warning)
       - Easily happens:

           unsigned char i = 255;   // 11111111
           i++;                     // What is the result of this increment?
                                    // 1 00000000

     Placement & padding: word

     - Compiler places data at word boundaries
       - e.g., word = 4 bytes
     - Imagine:

         struct {
             char a;
             int b;
         } x;

     - Classes too
     - See/try ~mikec/cs32/demos/padding*.c*

     [Figure: "Not like this!": x.a and x.b packed back to back (0100100110010110000000101101010001101101), so that x.b straddles two machine words. "Compilers do it this way": x.a (01001001) followed by padding (data completely ignored, junk) fills one machine word, and x.b (10010110000000101101010001101101) occupies the next machine word.]

     Pointers are data containers too

     [Figure: an int* p "data container" holding the address 8090346 and pointing at the byte with address 8090346, where an integer starts. Below it, memory shown as ...0101 01000001 01000010 01000011 01000100 1100..., with the four full bytes at addresses 802340, 802341, 802342 and 802343.]

     - As its value is a memory address, we say it "points" to a place in memory
     - It points at just 1 byte, so it must "know" what data type starts at that address
       - How many bytes?
       - How to interpret the bits?
     - Question: What is stored in the 4 bytes at addresses 802340..802343 in the diagram at right?
       - Continued next slide
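
The "left overflow" and padding points can be checked directly. The sketch below assumes 8-bit bytes (true on mainstream platforms); the names `padded`, `increment_byte` and `check_overflow_and_padding` are illustrative only. It asserts only what the C language guarantees about struct layout, since the exact amount of padding is up to the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the struct on the slide, given a tag so it can be inspected. */
struct padded {
    char a;
    int  b;
};

/* Increments a 1-byte container; any carry out of the top bit is discarded. */
unsigned char increment_byte(unsigned char i)
{
    i++;
    return i;
}

void check_overflow_and_padding(void)
{
    /* 11111111 + 1 = 1 00000000, truncated to 00000000 (assumes 8-bit bytes) */
    assert(increment_byte(255) == 0);

    /* b starts after a, and the struct is large enough to hold both members. */
    assert(offsetof(struct padded, b) >= sizeof(char));
    assert(offsetof(struct padded, b) + sizeof(int) <= sizeof(struct padded));
}
```

On a typical machine with 4-byte int, `offsetof(struct padded, b)` is 4 and `sizeof(struct padded)` is 8, which means 3 bytes of padding follow `a`. Printing those two values is a quick way to see what a particular compiler does.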

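One way to explore the question about the four bytes at 802340..802343 is to read the same bit patterns under different interpretations. This is a sketch rather than the slide's own answer: the helper names `as_big_endian`, `as_little_endian` and `check_interpretations` are invented, and the character comparison assumes an ASCII character set.

```c
#include <assert.h>
#include <stdint.h>

/* Reads 4 bytes as an unsigned integer, most significant byte first. */
uint32_t as_big_endian(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Reads the same 4 bytes, least significant byte first. */
uint32_t as_little_endian(const unsigned char b[4])
{
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}

void check_interpretations(void)
{
    /* 01000001 01000010 01000011 01000100 */
    const unsigned char bytes[4] = { 0x41, 0x42, 0x43, 0x44 };

    /* Interpreted as four chars (ASCII assumed): 'A' 'B' 'C' 'D' */
    assert(bytes[0] == 'A' && bytes[1] == 'B' &&
           bytes[2] == 'C' && bytes[3] == 'D');

    /* Interpreted as one 4-byte integer, the value depends on byte order. */
    assert(as_big_endian(bytes) == 0x41424344u);
    assert(as_little_endian(bytes) == 0x44434241u);
}
```

The bits never change; only the pointer's type decides how many bytes are fetched and how they are decoded.
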