C
C’s ancestors are the typeless programming languages BCPL (the Basic Combined Programming Language), developed by Martin Richards, and B, a descendant of BCPL, developed by Ken Thompson.
A new feature of C was its variety of data types: characters, numeric types, arrays, structures, and so on.
Brian Kernighan and Dennis Ritchie published an official description of the C programming language in 1978.
C has few hardware-dependent elements. For example, the C language proper has no statements for file access, input/output, or dynamic memory management. Instead, the extensive standard C library provides the functions for all of these purposes.
System, Social and Mobile Security
VIRTUES OF C
- Fast (it is a compiled language and so is close to the machine hardware)
- Portable (you can compile your program to run on just about any hardware platform out there)
- The language is small (unlike C++, for example)
- Mature (a long history and lots of resources and experience available)
- There are many tools for making programming easier (e.g. IDEs like Xcode)
- You have direct access to memory
- You have access to low-level system features if needed
CHALLENGES OF USING C
- The language is small (but there are many APIs)
- It is easy to get into trouble, e.g. with direct memory access and pointers
- You must manage memory yourself
- Sometimes code is more verbose than in high-level scripting languages like Python, R, etc.
STANDARDS
K&R C (Brian Kernighan and Dennis Ritchie)
- 1972: first created by Dennis Ritchie
- 1978: described in The C Programming Language
ANSI C
- 1989: ANSI X3.159-1989, aka C89, the first standardized version
ISO C
- 1990: ISO/IEC 9899:1990, aka C90, equivalent to C89
- 1995: Amendment 1, aka C95
- 1999: ISO/IEC 9899:1999, aka C99
- 2011: ISO/IEC 9899:2011, aka C11
- 2018: ISO/IEC 9899:2018, aka C18
To select a standard with GCC: gcc file.c -std=c11
DENNIS
HISTORY OF C++
In the early 1970s, Dennis Ritchie introduced “C” at Bell Labs.
- http://cm.bell-labs.co/who/dmr/chist.html
As a Bell Labs employee, Bjarne Stroustrup was exposed to and appreciated the strengths of C, but also appreciated the power and convenience of higher-level languages like Simula, which had language support for object-oriented programming (OOP).
- His language, originally called C with Classes, became C++ in 1983
In 1985, the first edition of The C++ Programming Language was released.
C++ became a standard in 1998 (ISO/IEC 14882:1998).
HISTORY
Adding support for OOP turned out to be the right feature at the right time for the ’90s. At a time when GUI programming was all the rage, OOP was the right paradigm, and C++ was the right implementation.
At over 700 pages, the C++ standard demonstrated something that some critics had been saying for a while: C++ is a complicated beast.
In the first decade of the 21st century, desktop PCs became powerful enough that it no longer seemed worthwhile to deal with all this complexity when there were alternatives that offered OOP with less complexity.
- Java
STROUSTRUP
CHARACTERISTICS
The most important feature of C++ is that it is both low- and high-level.
Programming in C++ requires a discipline and attention to detail that may not be required of kinder, gentler languages that are not as focused on performance.
- No garbage collector!
STANDARDS
- 1998: ISO/IEC 14882:1998, C++98
- 2003: ISO/IEC 14882:2003, C++03
- 2011: ISO/IEC 14882:2011, C++11
- 2014: ISO/IEC 14882:2014, C++14
- 2017: ISO/IEC 14882:2017, C++17
- 2020: ????, C++20
LET’S START!
CERT C CODING STANDARD
https://wiki.sei.cmu.edu/confluence/display/c/SEI+CERT+C+Coding+Standard
The SEI CERT C Coding Standard is a software coding standard for the C programming language, developed by the CERT Coordination Center to improve the safety, reliability, and security of software systems.
- CERT is a division of the Software Engineering Institute (SEI) at Carnegie Mellon
- The SEI is a non-profit United States federally funded research and development center
ARRAYS AND STRINGS
ARRAYS ARE COMPLICATED
When passed as a parameter, an array name decays to a pointer to its first element, so sizeof applied to the parameter yields the size of a pointer (for example, sizeof(int *) == 8 on a typical 64-bit platform), regardless of the length of the array.
The CERT C Secure Coding Standard includes “ARR01-C. Do not apply the sizeof operator to a pointer when taking the size of an array,” which warns against this problem.
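The difference between an array and the pointer it decays to can be sketched as follows. The names ARRAY_LEN, size_seen_by_callee, and sum are illustrative and do not come from the slides; the sketch only assumes a hosted C99 compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array; only valid where the array type is visible. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* The parameter is a pointer, which is what an array argument decays to.
   sizeof here measures the pointer, not the caller's array. */
size_t size_seen_by_callee(const int *values) {
    return sizeof(values);
}

/* The usual remedy: pass the element count explicitly. */
long sum(const int *values, size_t count) {
    long total = 0;
    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

In the caller, where an array such as int data[10] is declared, ARRAY_LEN(data) gives 10, while size_seen_by_callee(data) gives sizeof(int *) whatever the array length is.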
STRINGS
Strings are a fundamental concept in software engineering, but they are not a built-in type in C or C++.
The standard C library supports strings of type char and wide strings of type wchar_t.
IMPROPERLY BOUNDED COPIES
Reading from a source to a fixed-length array.
This program has undefined behaviour if more than eight characters (including the null terminator) are entered at the prompt.
The main problem with the gets() function is that it provides no way to specify a limit on the number of characters to read.
The gets() function was deprecated in C99 and eliminated from C11.
See The CERT C Secure Coding Standard, “MSC34-C. Do not use deprecated or obsolescent functions.”
READING FROM STDIN
Reading data from unbounded sources (such as stdin) creates an interesting problem for a programmer. Because it is not possible to know beforehand how many characters a user will supply, it is not possible to preallocate an array of sufficient length.
A common solution is to statically allocate an array that is thought to be much larger than needed. In this example, the programmer expects the user to enter only one character and consequently assumes that the eight-character array length will not be exceeded.
- With friendly users, this approach works well. But with malicious users, a fixed-length character array can be easily exceeded, resulting in undefined behaviour.
This approach is prohibited by The CERT C Secure Coding Standard, “STR35-C. Do not copy data from an unbounded source to a fixed-length array.”
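One common way to bound such a read is fgets(), which takes the buffer size as an argument. The following is a minimal sketch, not the code from the slides; the helper name read_line and its choice to discard excess input are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Reads at most size-1 characters of one line from in into buf.
   Returns 0 on success, -1 on end-of-file or read error.
   Input beyond the buffer capacity is discarded, never written. */
int read_line(char *buf, size_t size, FILE *in) {
    assert(buf != NULL && size > 0 && in != NULL);
    int limit = size > INT_MAX ? INT_MAX : (int)size;
    if (fgets(buf, limit, in) == NULL) {
        buf[0] = '\0';
        return -1;
    }
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';   /* the whole line fit: drop the newline */
    } else {
        int ch;                /* the line was too long: drain the rest */
        while ((ch = getc(in)) != EOF && ch != '\n') {
        }
    }
    return 0;
}
```

A caller would use it as read_line(response, sizeof response, stdin) and check the return value before using the buffer. Truncating silently is a design choice; a program that must not lose input could report the truncation instead.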
FROM PROGRAM PARAMETERS
Vulnerabilities can occur when inadequate space is allocated to copy a program input such as a command-line argument.
Although argv[0] contains the program name by convention, an attacker can control the contents of argv[0] to cause a vulnerability in the following program by providing a string with more than 128 bytes.
Similar problems exist in C++ as well.
HOW TO FIX IT
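The code of the original slide is not available here. One common remedy, shown as a hedged sketch, is to size the destination from the input instead of assuming a fixed length. The function name copy_argument is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of arg sized to fit it exactly,
   or NULL if the allocation fails.
   The caller owns the result and must free() it. */
char *copy_argument(const char *arg) {
    assert(arg != NULL);
    size_t size = strlen(arg) + 1;   /* +1 for the null terminator */
    char *copy = malloc(size);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, arg, size);         /* copies the terminator too */
    return copy;
}
```

In main(), the call would be char *name = copy_argument(argv[0]); followed by a NULL check, so that a long argv[0] results in a larger allocation rather than an overflow.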
NULL TERMINATING STRINGS
The result is that the strcpy() to c may write well beyond the bounds of the array because the string stored in a[] is not correctly null-terminated.
The CERT C Secure Coding Standard includes “STR32-C. Null-terminate byte strings as required.”
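The missing terminator typically comes from strncpy(), which does not append a null byte when the source fills the destination. A minimal sketch of a copy that always terminates follows; the helper name copy_truncated is an assumption, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies src into dst, truncating if needed, and always null-terminates.
   Returns 1 if src was truncated, 0 otherwise. */
int copy_truncated(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);  /* leave room for the terminator */
    dst[dst_size - 1] = '\0';         /* strncpy does not guarantee this */
    return strlen(src) >= dst_size;
}
```

With char a[16], the call copy_truncated(a, sizeof a, "0123456789abcdef") stores the first 15 characters plus a terminator, so a later strcpy() from a stays within a 16-byte destination.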
ERRORS
Most of the functions defined in the standard string-handling library <string.h> are susceptible to errors.
Visual Studio has deprecated most of them.
However, errors are still possible without them, since strings are arrays of chars.
STRING VULNERABILITIES AND EXPLOITS
TAINTED VALUES
Previous sections described common errors in manipulating strings in C or C++.
- These errors become dangerous when code operates on untrusted data from external sources such as command-line arguments, environment variables, console input, text files, and network connections.
It is safer to view all external data as untrusted.
In software security analysis, a value is said to be tainted if it comes from an untrusted source (outside of the program’s control) and has not been sanitized to ensure that it conforms to any constraints on its value that consumers of the value require, for example, that all strings are null-terminated.
PASSWORD EXAMPLE
SECURITY FLAWS
The security flaw in the IsPasswordOK program that allows an attacker to gain unauthorized access is caused by the call to gets().
The condition that allows an out-of-bounds write to occur is referred to in software security as a buffer overflow.
ONE MORE FLAW
The IsPasswordOK program has another problem: it does not check the return status of gets().
This is a violation of “FIO04-C. Detect and handle input and output errors.”
When gets() fails, the contents of the Password buffer are indeterminate, and the subsequent strcmp() call has undefined behaviour.
In a real program, the buffer might even contain the good password previously entered by another user.
BUFFER OVERFLOWS
Buffer overflows occur when data is written outside of the boundaries of the memory allocated to a particular data structure.
C and C++ are susceptible to buffer overflows because these languages:
- Define strings as null-terminated arrays of characters.
- Do not perform implicit bounds checking.
- Provide standard library calls for strings that do not enforce bounds checking.
Depending on the location of the memory and the size of the overflow, a buffer overflow may go undetected but can corrupt data, cause erratic behaviour, or terminate the program abnormally.
BUFFER OVERFLOWS
Not all buffer overflows lead to software vulnerabilities. However, a buffer overflow can lead to a vulnerability if an attacker can manipulate user-controlled inputs to exploit the security flaw.
There are, for example, well-known techniques for overwriting frames in the stack to execute arbitrary code.
Buffer overflows can also be exploited in heap or static memory areas by overwriting data structures in adjacent memory.
Static analysis tools (applied while the program is being written) and dynamic analysis tools (effective only if the right input data is passed) can help to detect them.
MEMORY
PROCESS MEMORY ORGANIZATION
A process is a program instance that is loaded into memory and managed by the operating system.
MEMORY
- The code or text segment includes instructions and read-only data. It can be marked read-only so that modifying memory in the code section results in faults.
- The data segment contains initialized data, uninitialized data, static variables, and global variables.
- The heap is used for dynamically allocating process memory.
- The stack is a last-in, first-out (LIFO) data structure used to support process execution.
The exact organization of process memory depends on the operating system, compiler, linker, and loader, in other words, on the implementation of the programming language.
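As a rough illustration of these regions, the sketch below prints one sample address from each. The names are illustrative, and the addresses (and even their relative order) depend on the platform, so no particular output should be expected.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;      /* data segment (initialized data) */
static int uninitialized_static;  /* data segment (uninitialized data) */

/* Prints one sample address per region to out.
   Returns 0 on success, -1 if the heap allocation fails. */
int print_layout(FILE *out) {
    assert(out != NULL);
    int local = 0;                           /* stack */
    int *dynamic = malloc(sizeof *dynamic);  /* heap */
    if (dynamic == NULL) {
        return -1;
    }
    /* Converting a function pointer to an integer is implementation-defined;
       it is done here only to display the address. */
    fprintf(out, "code  : %#jx\n", (uintmax_t)(uintptr_t)&print_layout);
    fprintf(out, "data  : %p\n", (void *)&initialized_global);
    fprintf(out, "bss   : %p\n", (void *)&uninitialized_static);
    fprintf(out, "heap  : %p\n", (void *)dynamic);
    fprintf(out, "stack : %p\n", (void *)&local);
    free(dynamic);
    return 0;
}
```

Calling print_layout(stdout) on successive runs usually shows different heap and stack addresses when the operating system randomizes the address space.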
STACK
    int fun(int p1, int p2, int p3) {
        int res = 0;
        res = p1 + p2 + p3;
        return res;
    }

    int main() {
        int a = 4, b = 5, c = 7;
        a = fun(a, b, c);
    }
Stack diagram, top to bottom: res, return address, p3, p2, p1 (frame of fun), then c, b, a (frame of main).
EXAMPLE
    #include <stdio.h>

    void f1();

    void f2() {
        int c;
        f1();
        puts("bye f2");
    }

    void f1() {
        int b = 0;
        f2();
        puts("bye f1");
    }

    int main() {
        int a = 0;
        f1();
        puts("bye main");
    }
Stack diagram, bottom to top: main, frame f1, frame f2, frame f1, frame f2, ...
f1() and f2() call each other without end, so new frames pile up until the stack is exhausted:
    MacBook-Francesco:ProgrammI francescosantini$ ./test
    Segmentation fault: 11
STACK
To return control to the proper location, the sequence of return addresses must be stored.
A stack is well suited for maintaining this information because it is a dynamic data structure that can support any level of nesting within memory constraints.
The address of the current frame is stored in the frame or base pointer register. On x86-32, the extended base pointer (ebp) register is used for this purpose.
The frame pointer is used as a fixed point of reference within the stack.
When a subroutine is called, the frame pointer for the calling routine is also pushed onto the stack so that it can be restored when the subroutine exits.
DISASSEMBLY IN INTEL
INSTRUCTION POINTER
The instruction pointer (eip) points to the next instruction to be executed.
When executing sequential instructions, it is automatically incremented by the size of each instruction, so that the CPU will then execute the next instruction in the sequence.
Normally, the eip cannot be modified directly; instead, it must be modified indirectly by instructions such as jump, call, and return.
The extended stack pointer (esp) is the current pointer to the stack. The stack pointer points to the top of the stack.
- For many popular architectures, including x86, SPARC, and MIPS processors, the stack grows toward lower memory.
DISASSEMBLY OF FOO (PROLOGUE)
STACK FRAME FOR FOO
DISASSEMBLING FOO (EPILOGUE)
RETURN VALUES
If there is a return value, it is stored in eax by the called function before returning.
The caller function knows it can be found in eax and can use it.
Called function:
    int MyFunction2(int a, int b) {
        return a + b;
    }

    :_MyFunction2
    push ebp
    mov ebp, esp
    mov eax, [ebp + 8]
    mov edx, [ebp + 12]
    add eax, edx
    pop ebp
    ret
Caller:
    x = MyFunction2(2, 3);

    push 3
    push 2
    call _MyFunction2
    ; use x, which is now in eax
STACK SMASHING
WHAT IS IT?
Stack smashing is when an attacker purposely overflows a buffer on the stack to get access to forbidden regions of computer memory.
Stack smashing occurs when a buffer overflow overwrites data in the memory allocated to the execution stack. It can have serious consequences for the reliability and security of a program.
Buffer overflows in the stack segment may allow an attacker to modify the values of automatic variables or execute arbitrary code.
WHAT CAN HAPPEN?
Overwriting automatic variables can result in a loss of data integrity or, in some cases, a security breach (for example, if a variable containing a user ID or password is overwritten).
More often, a buffer overflow in the stack segment can lead to an attacker executing arbitrary code by overwriting a pointer to an address to which control is (eventually) transferred.
A common example is overwriting the return address, which is located on the stack.
EXAMPLE
The IsPasswordOK program is vulnerable to a stack-smashing attack.
The IsPasswordOK program has a security flaw because the Password array is improperly bounded and can hold only an 11-character password plus a trailing null byte.
This flaw can easily be demonstrated by entering a 20-character password of “1234567890123456W▸*!” that causes the program to jump in an unexpected way.
BACK TO THE EXAMPLE
EXAMPLE
It crashes!
WHAT HAPPENS
Each of the last four characters of the input has a corresponding hexadecimal value: W = 0x57, ▸ = 0x10, * = 0x2A, and ! = 0x21.
In memory, this sequence of four characters corresponds to a 4-byte address that overwrites the return address on the stack.
- Instead of returning to the instruction immediately following the call in main(), the IsPasswordOK() function returns control to the “Access granted” branch, bypassing the password validation logic and allowing unauthorized access to the system.
GUESS THE RIGHT ADDRESS
    void *__builtin_return_address(unsigned int level)
A value of 0 yields the return address of the current function, a value of 1 yields the return address of the caller of the current function, and so forth.
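A minimal sketch of how this builtin can be used to inspect a return address, assuming GCC or Clang (it is a compiler extension, not standard C). The wrapper name current_return_address is illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* noinline keeps this a real call, so it has a return address of its own.
   Level 0 asks for the return address of this function, that is,
   the address of the instruction that follows the call in the caller. */
__attribute__((noinline))
void *current_return_address(void) {
    return __builtin_return_address(0);
}
```

Printing the result with printf("%p\n", current_return_address()) from different call sites shows different addresses, which is the kind of information an attacker needs when choosing the value that overwrites a saved return address.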
ARC INJECTION
The arc injection technique (sometimes called return-into-libc) involves transferring control to code that already exists in process memory.
These exploits are called arc injection because they insert a new arc (control-flow transfer) into the program’s control-flow graph as opposed to injecting new code.
More sophisticated attacks are possible using this technique, including installing the address of an existing function (such as system() or exec(), which can be used to execute commands and other programs already on the local system) on the stack along with the appropriate arguments.
ARC INJECTION
An attacker may prefer arc injection over code injection for several reasons.
Because arc injection uses code already in memory on the target system, the attacker merely needs to provide the addresses of the functions and arguments for a successful attack.
- The footprint for this type of attack can be significantly smaller and may be used to exploit vulnerabilities that cannot be exploited by the code injection technique.
Because the exploit consists entirely of existing code, it cannot be prevented by memory-based protection schemes such as making memory segments (such as the stack) non-executable.
ONE MORE EXAMPLE
CODE INJECTION (SHELL CODE)
INJECTION AND SHELLCODE
When the return address is overwritten because of a software flaw, it seldom points to valid instructions. Consequently, transferring control to this address typically causes a trap and results in a corrupted stack.
But it is possible for an attacker to create a specially crafted string that contains a pointer to some malicious code, which the attacker also provides.
- When the function invocation whose return address has been overwritten returns, control is transferred to this code.
The malicious code runs with the permissions that the vulnerable program has when the subroutine returns.
- This is why programs running with root or other elevated privileges are normally targeted.
The malicious code can perform any function that can otherwise be programmed but often simply opens a remote shell on the compromised machine. For this reason, the injected malicious code is referred to as shellcode.
HOW IT HAS TO BE
The pièce de résistance of any good exploit is the malicious argument.
A malicious argument must have several characteristics:
- It must be accepted by the vulnerable program as legitimate input.
- The argument, along with other controllable inputs, must result in execution of the vulnerable code path.
- The argument must not cause the program to terminate abnormally before control is passed to the shellcode.
BACK TO THE EXAMPLE
INJECTION
    % ./BufferOverflow < exploit.bin
(exploit.bin is the “payload”)
HOW IT WORKS
The lea instruction used in this example stands for “load effective address.”
The lea instruction computes the effective address of the second operand (the source operand) and stores it in the first operand (destination operand).
The source operand is a memory address (offset part) specified with one of the processor’s addressing modes; the destination operand is a general purpose register.
HOW IT WORKS
The exploit code works as follows:
1. The first mov instruction is used to assign 0xB to the %eax register. 0xB is the number of the execve() system call on 32-bit x86 Linux.
   • int execve(const char *filename, char *const argv[], char *const envp[]);
2. The three arguments for the execve() function call are set up in the subsequent three instructions (the two lea instructions and the mov instruction). The data for these arguments is located on the stack, just before the exploit code.
3. The int $0x80 instruction is used to invoke execve(), which results in the execution of the Linux calendar program.
RESULT
RETURN-ORIENTED PROGRAMMING
OVERCOMING DEFENCES
The attack uses code that already exists in the process image.
- The standard C library, libc, is loaded in nearly every Unix program and contains routines useful for an attacker.
- But in principle any available code, either from the program’s text segment or from a library it links to, could be used.
In contrast to attacks that call whole library functions, the building blocks of a return-oriented attack are short code sequences, each just two or three instructions long. Some are present in libc as a result of the code-generation choices of the compiler.
- These code sequences would be very difficult to eliminate without extensive modifications to the compiler and assembler.
HOW IT WORKS
The return-oriented programming exploit technique is similar to arc injection, but instead of returning to functions, the exploit code returns to sequences of instructions followed by a return (ret) instruction.
Any such useful sequence of instructions is called a gadget.
Each gadget specifies certain values to be placed on the stack that make use of one or more sequences of instructions in the code segment.
- Gadgets perform well-defined operations, such as a load, an add, or a jump.
Return-oriented programming allows an attacker to execute code in the presence of security defences such as executable space protection and code signing.
HOW IT WORKS
Return-oriented programming is an advanced version of a stack smashing attack.
In a standard buffer overrun attack, the attacker would simply write attack code (the "payload") onto the stack and then overwrite the return address with the location of these newly written instructions.
- Since the late '90s, operating systems and compilers have offered protections: data zones cannot be executed.
   • DEP (Data Execution Prevention), supported by a hardware bit
   • NX (no execute); on Intel, XD (execute disable)
https://cseweb.ucsd.edu/~hovav/dist/geometry.pdf
EXAMPLE
    pop %ebx; ret;
The left side shows the x86-32 assembly language instruction necessary to copy the constant value $0xdeadbeef into the ebx register, and the right side shows the equivalent gadget.
With the stack pointer referring to the gadget, the return instruction is executed by the CPU.
The resulting gadget pops the constant from the stack and returns execution to the next gadget on the stack.
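The way a chain of gadgets is driven by the stack can be modelled in plain C. The sketch below is a simulation only, not exploit code: gadgets are ordinary functions, the "stack" is an array supplied by the caller, and every name in it (struct machine, union slot, run_chain, the gadget functions) is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct machine;
typedef void (*gadget_fn)(struct machine *);

/* One stack slot holds either a gadget address or a data word. */
union slot {
    gadget_fn gadget;
    uint32_t value;
};

struct machine {
    uint32_t eax, ebx;
    const union slot *sp;    /* simulated stack pointer */
    const union slot *end;   /* one past the last slot of the chain */
};

/* Models "pop %ebx; ret": take the next stack slot as data. */
void gadget_pop_ebx(struct machine *m) {
    if (m->sp < m->end) {
        m->ebx = (m->sp++)->value;
    }
}

/* Models "add %ebx, %eax; ret". */
void gadget_add_ebx_to_eax(struct machine *m) {
    m->eax += m->ebx;
}

/* Models the repeated "ret": fetch the next slot and jump to it. */
void run_chain(struct machine *m, const union slot *chain, size_t count) {
    assert(m != NULL && chain != NULL);
    m->sp = chain;
    m->end = chain + count;
    while (m->sp < m->end) {
        gadget_fn next = (m->sp++)->gadget;
        next(m);
    }
}
```

A chain made of the slots gadget_pop_ebx, 0xdeadbeef, gadget_add_ebx_to_eax loads the constant into ebx and then adds it to eax. The attacker-controlled data (the array) decides which existing code runs and in what order, while no new code is introduced.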
EXAMPLE 2
    pop %esp; ret;
An unconditional branch can be used to branch to an earlier gadget on the stack, resulting in an infinite loop.
EXAMPLE OF ATTACK
The goal of the attack is to invoke the system call sys_write and output “xxxHACKxxx” to the screen.
    ssize_t sys_write(unsigned int fd, const char *buf, size_t count)
EXAMPLE OF ATTACK
http://www.cs.virginia.edu/~ww6r/CS4630/lectures/return_oriented_programming.pdf
    int main(int argc, char *argv[]) {
        char buf[4];
        gets(buf);   /* unbounded read into a 4-byte buffer: the vulnerability */
        return 0;
    }
GADGETS 5:
HOW TO DO IT
SOME THEORETICAL ISSUES
Can you always find the gadgets you need?
- Some small executable files may not contain all the gadgets you need
- If the executable file is larger than 3 MB, there is a good chance that you can find a set of gadgets for any exploit
Do you need “ret”?
- No, other jumps also work
ROP can also work without libc, using only the program's own code (in case mitigations targeting libc are in place).
SOME THEORETICAL ISSUES
Return-oriented programming provides a fully functional "language" (Turing complete) that an attacker can use to make a compromised machine perform any operation desired.
SUMMARY
Return-oriented programming (ROP) addresses the limitations of code injection and return-to-libc.
- Code injection: needs an executable stack
- Return-to-libc:
   • Highly depends on libc's implementation
   • Can be defended against with mapped memory randomization
Gadgets: small sequences of code ending with “ret” within a program's code section.
- No need to inject code, so no need for an executable stack
- No need for libc's full function implementations; the application's own code may be enough
ROP attacks chain several gadgets together to execute arbitrary code.
Enough ROP gadgets can be found in most executable files using scanning tools.
COMPLICATED
While return-oriented programming might seem very complex, this complexity can be abstracted behind a programming language, compiler, and support tools, making it a viable technique for writing exploits.
HOW TO EXPLOIT/PREVENT IT
A tool has been developed to automate the process of locating gadgets and constructing an attack against a binary.
This tool, known as ROPgadget, searches through a binary looking for potentially useful gadgets, and attempts to assemble them into an attack payload that spawns a shell to accept arbitrary commands from the attacker.
https://github.com/JonathanSalwan/ROPgadget
HOW TO BUILD THEM
pwntools is a CTF (Capture the Flag) framework and exploit development library.
Written in Python, it is designed for rapid prototyping and development, and intended to make exploit writing as simple as possible.
- http://docs.pwntools.com/en/stable/rop/rop.html
MITIGATION
STACK-SMASHING PROTECTOR (PROPOLICE)
In version 4.1, GCC introduced the Stack-Smashing Protector (SSP) feature, which implements canaries derived from StackGuard.
Also known as ProPolice, SSP is a GCC extension for protecting applications written in C from the most common forms of stack buffer overflow exploits and is implemented as an intermediate language translator of GCC.
Specifically, SSP reorders local variables to place buffers after pointers and copies pointers in function arguments to an area preceding local variable buffers to avoid the corruption of pointers that could be used to further corrupt arbitrary memory locations.
CANARIES
Canaries consist of a value that is difficult to insert or spoof and are written to an address before the section of the stack being protected.
A sequential write would consequently need to overwrite this value on the way to the protected region.
The canary is initialized immediately after the return address is saved and checked immediately before the return address is accessed.
A hard-to-spoof or random canary is a 32-bit secret random number that changes each time the program is executed.
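The canary check can be imitated in portable C with a struct that stands in for a stack frame. This is a simulation under stated assumptions, not how GCC implements SSP: the struct layout, the names guarded_frame and write_and_check, and the fixed secret passed by the caller are all illustrative, and a real protector aborts the program instead of returning a flag.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simulated frame: the canary sits between the buffer and the data it protects. */
struct guarded_frame {
    char buffer[8];
    uint32_t canary;
    uint32_t protected_value;   /* stands in for the saved return address */
};

/* Writes n bytes starting at the buffer without checking n against the
   buffer size (the deliberate flaw). n is clamped to the size of the whole
   struct so that the simulation itself stays inside one object.
   Returns 1 if the canary survived, 0 if the overflow was detected. */
int write_and_check(struct guarded_frame *frame, uint32_t secret,
                    const void *data, size_t n) {
    assert(frame != NULL && data != NULL);
    frame->canary = secret;        /* "prologue": install the canary */
    if (n > sizeof *frame) {
        n = sizeof *frame;
    }
    memcpy(frame, data, n);        /* sequential write starting at buffer */
    return frame->canary == secret; /* "epilogue": verify before returning */
}
```

A write of at most 8 bytes leaves the canary intact, while a longer sequential write has to pass over the canary before it can reach protected_value, which is how the overflow gets detected.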