What does the compiler actually do with my code? An introduction to the C++ ABI Filip Strömbäck
1 Introduction 2 What is an ABI? 3 Object layout 4 Function calls 5 Virtual functions 6 Exceptions
The topic for today
How are parts of C++ realized on x86 and AMD64?
• Object layout
• Function calls
• Virtual function calls
• Exceptions
Why?
If you know the implementation...
• ...you can reason about the efficiency of your solution
• ...you can see why some things are undefined
• (...you can abuse undefined behaviour and do really strange things)
Note: Everything discussed here is highly system specific, and most likely undefined behaviour according to the standard!
How?
• Read the assembler output from the compiler!
  • g++ -S -masm=intel <file> or cl /FAs <file>
  • objdump -d -M intel <program>
  • In a debugger
  • Compiler Explorer
• Figure out why it does certain things:
  • System V ABI (https://www.uclibc.org/docs/psABI-x86_64.pdf)
  • OSDev Wiki (https://wiki.osdev.org/)
  • x86 instruction reference (http://ref.x86asm.net/)
• Lots of tinkering and thinking!
2 What is an ABI?
What is an ABI (Application Binary Interface)?
Specifies how certain aspects of a language are realized on a particular CPU.
Language specification + ABI ⇒ compiler
Specifies:
• Size of built-in types
• Object layout
• Function calls (calling conventions)
• Exception handling
• Name mangling
• ...
Different systems use different ABIs
There are two major ABIs:
• System V ABI (Linux, macOS on AMD64)
• Microsoft ABI (Windows)
Variants for many systems:
• x86
• AMD64
• ARM
• ...
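One visible consequence of the ABIs differing is the size of built-in types: `long` is 8 bytes under the System V ABI for AMD64 but 4 bytes under the Microsoft ABI for the same CPU. The sketch below simply reports what the ABI in use has chosen. The names `builtin_sizes` and `sizes_for_this_abi` are made up for this example and do not come from the slides.

```cpp
#include <cstddef>

// Sizes of a few built-in types as chosen by the ABI this code is compiled for.
// (Hypothetical type, introduced only for this sketch.)
struct builtin_sizes {
    std::size_t short_size;
    std::size_t int_size;
    std::size_t long_size;
    std::size_t pointer_size;
};

// Collects the sizes at compile time; the values differ between ABIs.
constexpr builtin_sizes sizes_for_this_abi() {
    return builtin_sizes{sizeof(short), sizeof(int), sizeof(long), sizeof(void *)};
}
```

Compiling the same file with g++ on Linux and with cl on Windows and comparing `long_size` is a quick way to see that the language alone does not fix these numbers.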
3 Object layout
Integer types and endianness

    char  a{0x08};
    short b{0x1234};        // = 4660
    int   c{0x00010203};    // = 66051
    long  d{0x1101020304};  // = 73031353092

Bytes in memory, lowest address first:

    Big endian (e.g. ARM in big-endian mode)
    a: 08
    b: 12 34
    c: 00 01 02 03
    d: 00 00 00 11 01 02 03 04

    Little endian (x86)
    a: 08
    b: 34 12
    c: 03 02 01 00
    d: 04 03 02 01 11 00 00 00
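The byte orders above can be checked on whatever machine is at hand. This is a minimal sketch that copies the object representation of a value into a byte vector. The helper names `bytes_of` and `is_little_endian` are hypothetical and not part of the slides.

```cpp
#include <cstring>
#include <vector>

// Hypothetical helper: returns the bytes of `value` in increasing address order.
template <typename T>
std::vector<unsigned char> bytes_of(const T &value) {
    std::vector<unsigned char> out(sizeof(T));
    std::memcpy(out.data(), &value, sizeof(T));  // copy the object representation
    return out;
}

// True if the least significant byte is stored at the lowest address.
inline bool is_little_endian() {
    const unsigned int probe = 1;
    return bytes_of(probe)[0] == 1;
}
```

Calling `bytes_of(short{0x1234})` and printing the result in hexadecimal should reproduce the `b:` row for the machine the program runs on, assuming a 2-byte `short`.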
Other types
• Each type has a size and an alignment
• Members are placed sequentially, respecting the alignment

Example: alignment

    struct simple {
        int a{1};
        int b{2};
        int c{3};
        long d{100};
        int e{4};
    };

Layout in memory: a | b | c | padding | d | e | padding
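The layout of `simple` can be inspected with `sizeof`, `alignof` and `offsetof`. The sketch below repeats the struct from the slide and adds two helpers, `padding_bytes` and `members_aligned`, which are hypothetical names introduced here. On the System V ABI for AMD64 (4-byte `int`, 8-byte `long` aligned to 8) one would expect `d` at offset 16 and a total size of 32, so 8 bytes of padding. Other ABIs may give other numbers.

```cpp
#include <cstddef>

// The struct from the slide; default member initializers keep it standard-layout.
struct simple {
    int a{1};
    int b{2};
    int c{3};
    long d{100};
    int e{4};
};

// Hypothetical helper: padding bytes in `simple`, i.e. total size minus member sizes.
constexpr std::size_t padding_bytes() {
    return sizeof(simple) - (4 * sizeof(int) + sizeof(long));
}

// Hypothetical helper: true if the members after the first gap sit at a multiple
// of their alignment and the struct size is a multiple of the struct alignment.
constexpr bool members_aligned() {
    return offsetof(simple, d) % alignof(long) == 0 &&
           offsetof(simple, e) % alignof(int) == 0 &&
           sizeof(simple) % alignof(simple) == 0;
}
```

The trailing padding exists so that every element of an array of `simple` keeps `d` correctly aligned.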
The type system
The type system is not present in the binary!
It just helps us to keep track of how to interpret bytes in memory!

    struct foo { int a, b, c; };
    foo x{1, 2, 3};
    int y[3] = {1, 2, 3};
    short z[6] = {1, 0, 2, 0, 3, 0};

All look the same in memory!
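The claim that these three objects look the same in memory can be tested by comparing their raw bytes. A minimal sketch follows, assuming a 4-byte `int` and a 2-byte `short`. The helper `same_bytes` is a hypothetical name. Note that `z` only matches `x` on a little-endian machine such as x86, since the zero half of each `int` comes second there.

```cpp
#include <cstring>

struct foo { int a, b, c; };

// Hypothetical helper: compares the raw bytes of two objects of equal size.
template <typename A, typename B>
bool same_bytes(const A &lhs, const B &rhs) {
    static_assert(sizeof(A) == sizeof(B), "objects must have the same size");
    return std::memcmp(&lhs, &rhs, sizeof(A)) == 0;
}
```

With the declarations from the slide, `same_bytes(x, y)` holds on any system where `foo` has no padding, and `same_bytes(x, z)` holds in addition on little-endian systems.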
4 Function calls
Starting simple – x86
Registers: eax, ecx, edx, ebx, esp, ebp, esi, edi
Stack: drawn from high to low addresses, with the current stack frame marked
The default on x86 – cdecl

    int fn(int a, int b, int c);
    int main() {
        int r = fn(1, 2, 3);
    }

Generated code:

    push 3
    push 2
    push 1
    call fn
    add esp, 12
    mov "r", eax

Stack (high to low): main – locals | 3 | 2 | 1 | return address | fn – locals
The default on x86 – cdecl

    struct large { int a, b; };
    int fn(large a, int b);
    int main() {
        large z{ 1, 2 };
        int r = fn(z, 3);
    }

Generated code:

    push 3
    sub esp, 8
    ;; initialize z at esp
    call fn
    add esp, 12
    mov "r", eax

Stack (high to low): main – locals | 3 | z | return address | fn – locals
The default on x86 – cdecl

    struct large { int a, b; };
    int fn(large &a, int b);
    int main() {
        large z{ 1, 2 };
        int r = fn(z, 3);
    }

Generated code:

    push 3
    lea eax, "z"
    push eax
    call fn
    add esp, 8
    mov "r", eax

Stack (high to low): main – locals | 3 | &z | return address | fn – locals
The default on x86 – cdecl

    struct large { int a, b; };
    large fn(int a);
    int main() {
        large z = fn(10);
    }

Generated code:

    push 10
    lea eax, "z"
    push eax
    call fn
    add esp, 8

Stack (high to low): main – locals | 10 | result address | return address | fn – locals

The function is effectively compiled as if it were declared:

    large *fn(large *result, int a);
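The rewrite of a by-value return into a hidden result pointer can be written out by hand. This is only a sketch of the idea: the body of `fn` (returning `a` and `a + 1`) is invented for illustration, and `fn_lowered` is a hypothetical name for the form the slide describes.

```cpp
struct large { int a, b; };

// What the programmer writes: return by value.
large fn(int a) {
    return large{a, a + 1};
}

// Roughly what the slide describes for x86 cdecl: the caller passes the address
// of the result object, and the callee fills it in and returns that address.
large *fn_lowered(large *result, int a) {
    result->a = a;
    result->b = a + 1;
    return result;
}
```

A caller of the lowered form reserves space for `z` among its own locals and passes `&z`, which corresponds to the `lea eax, "z"` / `push eax` pair above.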
More advanced – AMD64
This is where the fun begins!
Registers: rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15
Stack: drawn from high to low addresses, with the current stack frame marked
Integer parameters 1 to 6 are passed in rdi, rsi, rdx, rcx, r8 and r9, in that order.
Rules (simplified)
1. If a parameter has a non-trivial copy constructor or destructor:
   • Pass by hidden reference
2. If a parameter is larger than 4*8 bytes:
   • Pass in memory
3. If a parameter uses more than 2 integer registers:
   • Pass in memory
4. Otherwise:
   • Pass in appropriate registers (integer/floating-point)
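The simplified rules can be approximated with type traits. The sketch below, assuming C++14 or later, is not the real classification algorithm from the System V ABI: it looks only at triviality and size, merges rules 2 and 3 into a single 16-byte limit (which is only right for aggregates of integers), and ignores floating-point registers. `PassKind`, `classify` and the example structs are hypothetical names.

```cpp
#include <cstddef>
#include <type_traits>

enum class PassKind { HiddenReference, Memory, Registers };

// Hypothetical helper following the simplified rules for integer-only types.
template <typename T>
constexpr PassKind classify() {
    // Rule 1: a non-trivial copy constructor or destructor forces a hidden reference.
    if (!std::is_trivially_copy_constructible<T>::value ||
        !std::is_trivially_destructible<T>::value)
        return PassKind::HiddenReference;
    // Rules 2 and 3, merged: more than two 8-byte integer registers means memory.
    if (sizeof(T) > 2 * 8)
        return PassKind::Memory;
    // Rule 4: small, trivial types travel in registers.
    return PassKind::Registers;
}

// Example types, named after their contents.
struct two_ints { int a, b; };
struct two_longs { long a, b; };
struct three_longs { long a, b, c; };
struct with_dtor { int a; ~with_dtor() {} };
```

With an 8-byte `long`, `two_ints` and `two_longs` classify as `Registers`, `three_longs` as `Memory`, and `with_dtor` as `HiddenReference`.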
AMD64

    int fn(int a, int b, int c);
    int main() {
        int r = fn(1, 2, 3);
    }

Generated code:

    mov edi, 1
    mov esi, 2
    mov edx, 3
    call fn
    mov "r", eax

Registers: rdi = 1, rsi = 2, rdx = 3; the result r is returned in rax (eax for an int)
AMD64

    struct large { int a, b; };
    int fn(large a, int b);
    int main() {
        large z{ 1, 2 };
        int r = fn(z, 3);
    }

Generated code:

    mov rdi, "z"
    mov rsi, 3
    call fn
    mov "r", eax

Registers: rdi = z (both members packed into one register), rsi = 3; the result r is returned in rax (eax for an int)
AMD64

    struct large { long a, b; };
    int fn(large a, long b);
    int main() {
        large z{ 1, 2 };
        int r = fn(z, 3);
    }

Generated code:

    mov rdi, "z.a"
    mov rsi, "z.b"
    mov rdx, 3
    call fn
    mov "r", eax

Registers: rdi = z.a, rsi = z.b, rdx = 3; the result r is returned in rax (eax for an int)
AMD64

    struct large { long a, b, c; };
    int fn(large a, long b);
    int main() {
        large z{ 1, 2, 3 };
        int r = fn(z, 3);
    }

Generated code:

    push "z.c"
    push "z.b"
    push "z.a"
    mov rdi, 3
    call fn
    mov "r", eax

Registers: rdi = 3; z is passed on the stack; the result r is returned in rax (eax for an int)
AMD64
Here large is not trivially copyable, has a destructor or a vtable.

    struct large { /*...*/ };
    int fn(large a, long b);
    int main() {
        large z{ 1, 2 };
        int r = fn(z, 3);
    }

Generated code:

    ;; Copy z into z'
    lea rdi, "z'"
    mov rsi, 3
    call fn
    mov "r", eax

Registers: rdi = &z', rsi = 3; the result r is returned in rax (eax for an int)
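The copy into z' is observable from C++ when the copy constructor has side effects. The sketch below uses a variant of `large` with a user-provided copy constructor and a static counter. The counter `copies` and the second parameter `original` are hypothetical additions for this example, so the signature of `fn` differs from the one on the slide.

```cpp
struct large {
    long a, b;
    static int copies;  // hypothetical counter: number of copy constructions
    large(long a_, long b_) : a(a_), b(b_) {}
    // User-provided copy constructor: makes the type non-trivially copyable.
    large(const large &other) : a(other.a), b(other.b) { ++copies; }
};
int large::copies = 0;

// The callee takes `a` by value and reports whether it received an object
// distinct from the caller's original.
bool fn(large a, const large *original) {
    return &a != original;
}
```

Calling `fn(z, &z)` should return true and raise `large::copies` by exactly one: the caller constructs the temporary copy, and the callee only ever sees that copy.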
AMD64

    struct large { int a, b; };
    int fn(large &a, int b);
    int main() {
        large z{ 1, 2 };
        int r = fn(z, 3);
    }

Generated code:

    lea rdi, "z"
    mov rsi, 3
    call fn
    mov "r", eax

Registers: rdi = &z, rsi = 3; the result r is returned in rax (eax for an int)
AMD64

    struct large { int a, b; };
    large fn(int a);
    int main() {
        large z = fn(10);
    }

Generated code:

    mov rdi, 10
    call fn
    mov "z", rax

Registers: rdi = 10; the result z is returned in rax
AMD64

    struct large { long a, b; };
    large fn(int a);
    int main() {
        large z = fn(10);
    }

Generated code:

    mov rdi, 10
    call fn
    mov "z", rax
    mov "z"+8, rdx

Registers: rdi = 10; the result is returned in two registers, rax = z.a and rdx = z.b
AMD64

    struct large { long a, b, c; };
    large fn(int a);
    int main() {
        large z = fn(10);
    }

Generated code:

    lea rdi, "z"
    mov rsi, 10
    call fn

Registers: rdi = &z (hidden result address), rsi = 10; fn returns &z in rax