Reminder: compiling & linking
[Diagram: source files 1..N are compiled into object files 1..N; the object files and library object files 1..M are then linked (relocation + linking) into a single load file.]
• Usually performed by gcc/g++ in one uninterrupted sequence

Linux object file format
• "ELF" stands for Executable and Linkable Format
  – A 4-byte magic number (\177ELF) followed by a series of named sections
• Addresses assume the object file is placed at memory address 0
  – When multiple object files are linked together, we must update the offsets (relocation)
• Tools to read contents: objdump and readelf (not available on all systems)
[Diagram: layout of an ELF object file, top to bottom: \177ELF, .text, .rodata, .data, .bss, .symtab, .rel.text, .rel.data, .debug, .line, section header table.]

ELF sections
• .text = machine code (compiled program instructions)
• .rodata = read-only data
• .data = initialized global variables
• .bss = "block storage start" for uninitialized global variables; actually just a placeholder that occupies no space in the object file
• .symtab = symbol table with information about functions and global variables defined and referenced in the program

ELF sections (cont.)
• .rel.text = list of locations in the .text section that need to be modified when linked with other object files
• .rel.data = relocation information for global variables referenced but not defined
• .debug = debugging symbol table; only created if compiled with the -g option
• .line = mapping between line numbers in source and machine code in .text; used by debugger programs

Reminder again: linking
[Diagram: same compile-and-link pipeline as above, with the linking step highlighted.]

Creation of a load module
• Interleaved from multiple object modules
  – Sections must be "relocated"
• Addresses are relative to the beginning of a module
  – Necessary to translate from the beginnings of object modules
• When loaded, the OS will translate again to absolute addresses
[Diagram: Object Module A and Object Module B each consist of a header section, a machine code section, an initialized data section, and a symbol table section. The load module has one header section followed by the machine code sections, the initialized data sections, and the symbol table sections of both modules.]
Loading and memory mapping
From source program to "placement" in memory during execution:

    #include <stdio.h>

    int a[10]={0,1,2,3,4,5,6,7,8,9};
    int b[10];

    int main(void)
    {
      int i;
      static int k = 3;
      for(i = 0; i < 10; i++) {
        printf("%d\n",a[i]);
        b[i] = k*a[i];
      }/*endfor*/
      return 0;
    }/*end main*/

• The logical address space includes memory for stack, dynamic data (i.e., free store), and uninitialized global data
• Physical memory is shared by multiple programs
[Diagram: the source program becomes a load module (header section, machine code section, initialized data section, symbol table section). Loading places it in the program's logical address space: code (code for printf(), code for top of for loop, code for call to printf(), code for b[i] = k*a[i]), static data (initialized: array a[] and variable k; uninitialized: array b[]), dynamic data, unused logical address space, and stack. Memory mapping then places the logical address spaces of programs 1, 2, and 3 in physical memory alongside the operating system.]

Dynamic memory allocation
[Diagram: physical memory before and after dynamic memory allocation. In both, the program's logical address space holds code, static data (initialized and uninitialized), dynamic data, unused logical address space, and stack, next to the operating system. After allocation, the dynamic data region has grown by an increment taken from the unused logical address space.]

Sections of an executable file
Segments:
[Diagram not recoverable from the extracted text.]

Variables and objects in memory
[Diagram: a 1-byte container holding 'A' as 01000001 and a 2-byte container holding 16916 as 0100001000010100.]
• Variables and data objects are data containers with names
• The value of the variable is the code stored in the container
• To evaluate a variable is to fetch the code from the container and interpret it properly
• To store a value in a variable is to code the value and store the code in the container
• The size of a variable is the size of its container

Overflow is when a data code is larger than the size of its container
• e.g.,
    char i; // just 1 byte
    int *p = (int*)&i; // legal
    *p = 1673579060;
    // result if "big endian" storage:
  [Diagram: a 32-bit code, shown as 01001001100101100000001011010100, occupies variable i and the adjacent space X.]
• If the whole space (X) belongs to this program:
  – Seems OK if X does not contain important data for the rest of the program's execution
  – Bad results or crash if important data are overwritten
• If all or part of X belongs to another process, the program is terminated by the OS for a memory access violation (i.e., segmentation fault)
More about overflow
• Previous slide showed an example of "right overflow": result truncated (also warning)
  [Diagram: the code 0100100110010110000000101101010001101101 laid across two machine words; the data beyond the container are completely ignored (junk).]
• Compilers handle "left overflow" by truncating too (usually without any warning)
  – Easily happens:
      unsigned char i = 255;
      i++; // What is the result of this increment?
  [Diagram: 11111111 plus 1 gives 1 00000000; the leading 1 does not fit, leaving 00000000.]

Placement & padding – word
• Compiler places data at word boundaries
  – e.g., word = 4 bytes
• Imagine:
    struct {
      char a;
      int b;
    } x;
  – Classes too
[Diagram: "Not like this!" shows variable x with x.a immediately followed by x.b, so x.b straddles two machine words. "Compilers do it this way" shows x.a, then padding up to the end of the machine word, then x.b in the next machine word.]
See/try ~mikec/cs32/demos/padding.cpp

Pointers are data containers too
• As its value is a memory address, we say it "points" to a place in memory
• It points at just 1 byte, so it must "know" what data type starts at that address
  – How many bytes?
  – How to interpret the bits?
• Question: What is stored in the 4 bytes at addresses 802340..802343 in the diagram at right?
  – Continued next slide
[Diagram: memory ...0101 01000001010000100100001101000100 1100..., with the four middle bytes at addresses 802340, 802341, 802342, 802343. Each pointer is itself a data container (shown at address 8090346) holding the value 802340. char* b: the byte at address 802340 is the ASCII code for 'A'. short* s: binary code for short 16961 (on a little endian machine). int* p: binary code for int 1145258561 (on a little endian machine).]

What is ...0101 01000001010000100100001101000100 1100... ?
• Could be four chars: 'A', 'B', 'C', 'D'
• Or it could be two shorts: 16961, 17475
  – All numerical values shown here are for a "little endian" machine (more about endian next slide)
• Maybe it's a long or an int: 1145258561
• It could be a floating point number too: 781.035217
[Diagram: the same four bytes at addresses 802340..802343, read through int* p as int 1145258561 and through float* f as float 781.035217 (on a little endian machine).]

Beware: two different byte orders
• Matters to the actual value of anything but chars
• Say: short int x = 1;
• On a big endian machine it looks like this: 0000000000000001
  – Some Macs, JVM, TCP/IP "Network Byte Order"
• On a little endian machine it looks like this: 0000000100000000
  – Intel, most communication hardware
• Only important when dereferencing pointers
  – See/try ~mikec/cs32/demos/endian.c

Dynamic memory allocation
• OS memory manager (OSMM) allocates large blocks at a time to individual processes
• A process memory manager (PMM) then takes over
[Diagram: the operating system's OS memory manager hands large memory blocks to the process memory management of Process 1 and Process 2.]