


  1. Course Overview and Introduction
     CSci 2021: Machine Architecture and Organization
     Lecture #1, January 22nd, 2020
     Your instructor: Stephen McCamant
     Based on slides originally by: Randy Bryant, Dave O'Hallaron

     Overview
       - Course themes
       - Four realities
       - How the course fits into the CS curriculum
       - Logistics

     Course Theme: Abstraction Is Good But Don't Forget Reality
       - Most CS courses emphasize abstraction
           - Abstract data types
           - Asymptotic analysis
       - These abstractions have limits
           - Especially in the presence of bugs
           - Need to understand details of underlying implementations
       - Useful outcomes
           - Become more effective programmers
               - Able to find and eliminate bugs efficiently
               - Able to understand and tune for program performance
           - Prepare for later "systems" classes in CS & EE
               - Compilers, Operating Systems, Networks, Computer Architecture, Embedded Systems

     Great Reality #1: Ints are not Integers, Floats are not Reals
       - Example 1: Is x^2 ≥ 0?
           - Floats: Yes!
           - Ints:
               40000 * 40000 → 1600000000
               50000 * 50000 → ??
       - Example 2: Is (x + y) + z = x + (y + z)?
           - Unsigned & Signed Ints: Yes!
           - Floats:
               (1e20 + -1e20) + 3.14 --> 3.14
               1e20 + (-1e20 + 3.14) --> ??
       Cartoon source: xkcd.com/571

     Code Security Example

         /* Kernel memory region holding user-accessible data */
         #define KSIZE 1024
         char kbuf[KSIZE];

         /* Copy at most maxlen bytes from kernel region to user buffer */
         int copy_from_kernel(void *user_dest, int maxlen) {
             /* Byte count len is minimum of buffer size and maxlen */
             int len = KSIZE < maxlen ? KSIZE : maxlen;
             memcpy(user_dest, kbuf, len);
             return len;
         }

       - Similar to code found in FreeBSD's implementation of getpeername
       - There are legions of smart people trying to find vulnerabilities in programs

     Typical Usage
       (same kbuf and copy_from_kernel as on the previous slide, called as follows)

         #define MSIZE 528

         void getstuff() {
             char mybuf[MSIZE];
             copy_from_kernel(mybuf, MSIZE);
             printf("%s\n", mybuf);
         }
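The two "??" entries under Great Reality #1 can be checked directly. The sketch below is not from the slides: the helper names (as_signed32, wrapped_square, sum_left, sum_right, check_slide_examples) are hypothetical, and it assumes a 32-bit two's-complement int and IEEE 754 double arithmetic. Signed overflow is undefined behavior in C, so the wraparound is modeled with unsigned arithmetic instead of being triggered directly.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret a 32-bit pattern as a two's-complement value, using only
   well-defined arithmetic (assumption: the machine's int is 32 bits). */
int64_t as_signed32(uint32_t bits) {
    return bits >= 0x80000000u ? (int64_t)bits - 4294967296LL : (int64_t)bits;
}

/* Low 32 bits of x * x, read back as a signed value. This models what a
   wrapping 32-bit multiply leaves behind. */
int64_t wrapped_square(uint32_t x) {
    uint32_t low = (uint32_t)((uint64_t)x * x);
    return as_signed32(low);
}

/* The two groupings from Example 2, kept as separate functions. */
double sum_left(double x, double y, double z)  { return (x + y) + z; }
double sum_right(double x, double y, double z) { return x + (y + z); }

/* Self-check of the slide's examples under the stated assumptions. */
void check_slide_examples(void) {
    assert(wrapped_square(40000u) == 1600000000LL);   /* still fits */
    assert(wrapped_square(50000u) == -1794967296LL);  /* 2500000000 - 2^32 */
    assert(sum_left(1e20, -1e20, 3.14) == 3.14);
    assert(sum_right(1e20, -1e20, 3.14) == 0.0);      /* 3.14 is absorbed by -1e20 */
}
```

The square of 50000 is 2500000000, which exceeds the largest 32-bit signed value (2147483647), so the stored pattern reads back as a negative number. In the floating-point case, 3.14 is far smaller than the spacing between adjacent doubles near 1e20, so adding it to -1e20 first loses it entirely.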

  2. Malicious Usage

         /* Kernel memory region holding user-accessible data */
         #define KSIZE 1024
         char kbuf[KSIZE];

         /* Copy at most maxlen bytes from kernel region to user buffer */
         int copy_from_kernel(void *user_dest, int maxlen) {
             /* Byte count len is minimum of buffer size and maxlen */
             int len = KSIZE < maxlen ? KSIZE : maxlen;
             memcpy(user_dest, kbuf, len);
             return len;
         }

         #define MSIZE 528

         void getstuff() {
             char mybuf[MSIZE];
             copy_from_kernel(mybuf, -MSIZE);
             . . .
         }

     Computer Arithmetic
       - Does not generate random values
           - Arithmetic operations have important mathematical properties
       - Cannot assume all "usual" mathematical properties
           - Due to finiteness of representations
           - Integer operations satisfy "ring" properties
               - Commutativity, associativity, distributivity
           - Floating point operations satisfy "ordering" properties
               - Monotonicity, values of signs
       - Observation
           - Need to understand which abstractions apply in which contexts
           - Important issues for compiler writers and serious application programmers

     Great Reality #2: You've Got to Know Assembly
       - Chances are, you'll never write full programs in assembly
           - Compilers are much better & more patient than you are
       - But, assembly is key to the machine-level execution model
           - Behavior of programs in the presence of bugs
               - High-level language models break down
           - Tuning program performance
               - Understand optimizations done or not done by the compiler
               - Understanding sources of program inefficiency
           - Implementing system software
               - Compiler has machine code as target
               - Operating systems must manage process state
           - Creating / fighting malware
               - x86 assembly is the lingua franca

     Assembly Code Example
       - Time Stamp Counter
           - Special 64-bit register in Intel-compatible machines
           - Incremented every clock cycle
           - Read with rdtsc instruction
       - Application
           - Measure time (in clock cycles) required by procedure

         double t;
         start_counter();
         P();
         t = get_counter();
         printf("P required %f clock cycles\n", t);

     Code to Read Counter
       - Write small amount of assembly code using GCC's asm facility
       - Inserts assembly code into machine code generated by compiler

         /* Return the cycle count as a 64-bit integer */
         unsigned long access_counter(void)
         {
             unsigned long high, low;
             asm("rdtsc"
                 : "=d" (high), "=a" (low));
             return (high << 32) | low;
         }

     Great Reality #3: Memory Matters
     Random Access Memory Is an Unphysical Abstraction
       - Memory is not unbounded
           - It must be allocated and managed
           - Many applications are memory dominated
       - Memory referencing bugs are especially pernicious
           - Effects are distant in both time and space
       - Memory performance is not uniform
           - Cache and virtual memory effects can greatly affect program performance
           - Adapting program to characteristics of memory system can lead to major speed improvements
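Returning to the Malicious Usage slide: the danger in copy_from_kernel is that the signed minimum is taken first and only then converted to memcpy's unsigned size_t parameter. The sketch below illustrates that conversion and one possible guard. It is an assumption-laden illustration, not the FreeBSD fix: length_seen_by_memcpy and copy_from_kernel_checked are hypothetical names, and returning -1 for a negative request is a design choice made here for demonstration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KSIZE 1024
static char kbuf[KSIZE];

/* The byte count memcpy would receive in the slide's code: the signed
   minimum is computed first, then converted to size_t at the call. */
size_t length_seen_by_memcpy(int maxlen) {
    int len = KSIZE < maxlen ? KSIZE : maxlen;
    return (size_t)len;  /* a negative len becomes a very large count */
}

/* Hypothetical hardened variant: reject negative requests before any
   conversion to an unsigned type, then clamp to the buffer size. */
int copy_from_kernel_checked(void *user_dest, int maxlen) {
    assert(user_dest != NULL);
    if (maxlen < 0)
        return -1;  /* assumption: report an error rather than copy */
    size_t len = (size_t)maxlen < (size_t)KSIZE ? (size_t)maxlen : (size_t)KSIZE;
    memcpy(user_dest, kbuf, len);
    return (int)len;
}
```

With maxlen = -528, the comparison KSIZE < maxlen is false, so len is -528; converted to size_t it is close to the maximum representable size, and memcpy would run far past both buffers. The checked variant refuses the request instead.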

  3. Memory Referencing Bug Example

         typedef struct {
             int a[2];
             double d;
         } struct_t;

         double fun(int i) {
             volatile struct_t s;
             s.d = 3.14;
             s.a[i] = 1073741824; /* Possibly out of bounds */
             return s.d;
         }

         fun(0) → 3.14
         fun(1) → 3.14
         fun(2) → 3.1399998664856
         fun(3) → 2.00000061035156
         fun(4) → 3.14
         fun(6) → Segmentation fault

       - Result is system specific

     Memory Referencing Bug Example
       (same struct_t and results as on the previous slide)
       Explanation: each row is a 4-byte location; the number is the index i
       of the location accessed by fun(i).

         6   Critical State
         5   ?
         4   ?
         3   d7 ... d4   (struct_t)
         2   d3 ... d0   (struct_t)
         1   a[1]        (struct_t)
         0   a[0]        (struct_t)

     Memory Referencing Errors
       - C and C++ do not provide any memory protection
           - Out of bounds array references
           - Invalid pointer values
           - Abuses of malloc/free
       - Can lead to nasty bugs
           - Whether or not bug has any effect depends on system and compiler
           - Action at a distance
               - Corrupted object logically unrelated to one being accessed
               - Effect of bug may be first observed long after it is generated
       - How can I deal with this?
           - Program in Java, Python, Ruby, ML, etc.
           - Understand what possible interactions may occur
           - Use or develop tools to detect referencing errors (e.g. Valgrind)

     Memory System Performance Example

         void copyij(int src[2048][2048],
                     int dst[2048][2048])
         {
             int i,j;
             for (i = 0; i < 2048; i++)
                 for (j = 0; j < 2048; j++)
                     dst[i][j] = src[i][j];
         }

         void copyji(int src[2048][2048],
                     int dst[2048][2048])
         {
             int i,j;
             for (j = 0; j < 2048; j++)
                 for (i = 0; i < 2048; i++)
                     dst[i][j] = src[i][j];
         }

       - copyji is 21 times slower than copyij (Pentium 4)
       - Hierarchical memory organization
       - Performance depends on access patterns
           - Including how you step through a multi-dimensional array

     Great Reality #4: There's more to performance than asymptotic complexity
       - Constant factors matter too!
       - And even exact op count does not predict performance
           - Easily see 10:1 performance range depending on how code is written
           - Must optimize at multiple levels: algorithm, data representations, procedures, and loops
       - Must understand system to optimize performance
           - How programs are compiled and executed
           - How to measure program performance and identify bottlenecks
           - How to improve performance without destroying code modularity and generality

     Why The Performance Differs
       [Figure: 3-D plot of read throughput (MB/s, 0 to 16000) against stride (x8 bytes, s1 to s11) and size (bytes, 32k to 128m). copyij is marked near the high-throughput end of the surface and copyji near the low end.]
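The copyij/copyji comparison from the Memory System Performance Example can be tried with a small harness. This is a minimal sketch under stated assumptions: N is reduced to a hypothetical 256 (the slides use 2048) so the arrays stay small, and seconds_for and copies_agree are illustrative helpers that do not appear in the slides. clock() is coarse, so any measured ratio depends on the machine, the compiler flags, and the array size, and will not necessarily match the slide's factor of 21.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

#define N 256  /* assumption: smaller than the slides' 2048 */

/* Row order: the inner loop walks consecutive addresses. */
void copyij(int src[N][N], int dst[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            dst[i][j] = src[i][j];
}

/* Column order: the inner loop jumps N ints between accesses. */
void copyji(int src[N][N], int dst[N][N]) {
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            dst[i][j] = src[i][j];
}

typedef void (*copy_fn)(int src[N][N], int dst[N][N]);

/* CPU seconds spent on `reps` calls of `copy` (coarse, via clock()). */
double seconds_for(copy_fn copy, int src[N][N], int dst[N][N], int reps) {
    assert(reps > 0);
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        copy(src, dst);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Returns 1 when both traversal orders produce an exact copy of src. */
int copies_agree(void) {
    static int src[N][N], a[N][N], b[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            src[i][j] = i * N + j;
    copyij(src, a);
    copyji(src, b);
    return memcmp(a, src, sizeof src) == 0 && memcmp(b, src, sizeof src) == 0;
}
```

Both functions perform the same N*N assignments and produce the same result; only the order of memory accesses differs. Calling seconds_for(copyij, ...) and seconds_for(copyji, ...) with the same repetition count gives a rough comparison, best made with N raised toward 2048 so the arrays no longer fit in cache.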
