Course Overview and Introduction
CSci 2021: Machine Architecture and Organization
Lecture #1, January 22nd, 2020
Your instructor: Stephen McCamant
Based on slides originally by: Randy Bryant, Dave O’Hallaron

Overview
  - Course themes
  - Four realities
  - How the course fits into the CS curriculum
  - Logistics

Course Theme: Abstraction Is Good But Don’t Forget Reality
  - Most CS courses emphasize abstraction
      - Abstract data types
      - Asymptotic analysis
  - These abstractions have limits
      - Especially in the presence of bugs
      - Need to understand details of underlying implementations
  - Useful outcomes
      - Become more effective programmers
          - Able to find and eliminate bugs efficiently
          - Able to understand and tune for program performance
      - Prepare for later “systems” classes in CS & EE
          - Compilers, Operating Systems, Networks, Computer Architecture, Embedded Systems

Great Reality #1: Ints are not Integers, Floats are not Reals
  - Example 1: Is x^2 ≥ 0?
      - Floats: Yes!
      - Ints:
          - 40000 * 40000 → 1600000000
          - 50000 * 50000 → ??
  - Example 2: Is (x + y) + z = x + (y + z)?
      - Unsigned & Signed Ints: Yes!
      - Floats:
          - (1e20 + -1e20) + 3.14 --> 3.14
          - 1e20 + (-1e20 + 3.14) --> ??
  Cartoon source: xkcd.com/571

Code Security Example

    /* Kernel memory region holding user-accessible data */
    #define KSIZE 1024
    char kbuf[KSIZE];

    /* Copy at most maxlen bytes from kernel region to user buffer */
    int copy_from_kernel(void *user_dest, int maxlen) {
        /* Byte count len is minimum of buffer size and maxlen */
        int len = KSIZE < maxlen ? KSIZE : maxlen;
        memcpy(user_dest, kbuf, len);
        return len;
    }

  - Similar to code found in FreeBSD’s implementation of getpeername
  - There are legions of smart people trying to find vulnerabilities in programs

Typical Usage
  Using kbuf and copy_from_kernel as defined above:

    #define MSIZE 528

    void getstuff() {
        char mybuf[MSIZE];
        copy_from_kernel(mybuf, MSIZE);
        printf("%s\n", mybuf);
    }
Malicious Usage
  Using kbuf and copy_from_kernel as defined in the Code Security Example:

    #define MSIZE 528

    void getstuff() {
        char mybuf[MSIZE];
        copy_from_kernel(mybuf, -MSIZE);
        . . .
    }

Computer Arithmetic
  - Does not generate random values
      - Arithmetic operations have important mathematical properties
  - Cannot assume all “usual” mathematical properties
      - Due to finiteness of representations
      - Integer operations satisfy “ring” properties
          - Commutativity, associativity, distributivity
      - Floating point operations satisfy “ordering” properties
          - Monotonicity, values of signs
  - Observation
      - Need to understand which abstractions apply in which contexts
      - Important issues for compiler writers and serious application programmers

Great Reality #2: You’ve Got to Know Assembly
  - Chances are, you’ll never write full programs in assembly
      - Compilers are much better & more patient than you are
  - But, assembly is key to the machine-level execution model
      - Behavior of programs in the presence of bugs
          - High-level language models break down
      - Tuning program performance
          - Understand optimizations done or not done by the compiler
          - Understanding sources of program inefficiency
      - Implementing system software
          - Compiler has machine code as target
          - Operating systems must manage process state
      - Creating / fighting malware
          - x86 assembly is the lingua franca

Assembly Code Example
  - Time Stamp Counter
      - Special 64-bit register in Intel-compatible machines
      - Incremented every clock cycle
      - Read with rdtsc instruction
  - Application
      - Measure time (in clock cycles) required by procedure

    double t;
    start_counter();
    P();
    t = get_counter();
    printf("P required %f clock cycles\n", t);

Great Reality #3: Memory Matters
Random Access Memory Is an Unphysical Abstraction
  - Memory is not unbounded
      - It must be allocated and managed
      - Many applications are memory dominated
  - Memory referencing bugs are especially pernicious
      - Effects are distant in both time and space
  - Memory performance is not uniform
      - Cache and virtual memory effects can greatly affect program performance
      - Adapting program to characteristics of memory system can lead to major speed improvements

Code to Read Counter
  - Write small amount of assembly code using GCC’s asm facility
  - Inserts assembly code into machine code generated by compiler

    /* Return the cycle count as a 64-bit integer */
    unsigned long access_counter(void)
    {
        unsigned long high, low;
        asm("rdtsc"
            : "=d" (high), "=a" (low));
        return (high << 32) | low;
    }
Memory Referencing Bug Example

    typedef struct {
        int a[2];
        double d;
    } struct_t;

    double fun(int i) {
        volatile struct_t s;
        s.d = 3.14;
        s.a[i] = 1073741824; /* Possibly out of bounds */
        return s.d;
    }

    fun(0) → 3.14
    fun(1) → 3.14
    fun(2) → 3.1399998664856
    fun(3) → 2.00000061035156
    fun(4) → 3.14
    fun(6) → Segmentation fault

  - Result is system specific

  Explanation: location accessed by fun(i)

    Contents          i
    Critical State    6
    ?                 5
    ?                 4
    d7 ... d4         3
    d3 ... d0         2
    a[1]              1
    a[0]              0

  Locations 0 through 3 hold struct_t (a[0], a[1], and the two halves of d).

Memory Referencing Errors
  - C and C++ do not provide any memory protection
      - Out of bounds array references
      - Invalid pointer values
      - Abuses of malloc/free
  - Can lead to nasty bugs
      - Whether or not bug has any effect depends on system and compiler
      - Action at a distance
          - Corrupted object logically unrelated to one being accessed
          - Effect of bug may be first observed long after it is generated
  - How can I deal with this?
      - Program in Java, Python, Ruby, ML, etc.
      - Understand what possible interactions may occur
      - Use or develop tools to detect referencing errors (e.g. Valgrind)

Memory System Performance Example

    void copyij(int src[2048][2048],
                int dst[2048][2048])
    {
        int i,j;
        for (i = 0; i < 2048; i++)
            for (j = 0; j < 2048; j++)
                dst[i][j] = src[i][j];
    }

    void copyji(int src[2048][2048],
                int dst[2048][2048])
    {
        int i,j;
        for (j = 0; j < 2048; j++)
            for (i = 0; i < 2048; i++)
                dst[i][j] = src[i][j];
    }

  - copyji is 21 times slower (Pentium 4)
  - Hierarchical memory organization
  - Performance depends on access patterns
      - Including how you step through a multi-dimensional array

Great Reality #4: There’s more to performance than asymptotic complexity
  - Constant factors matter too!
  - And even exact op count does not predict performance
      - Easily see 10:1 performance range depending on how the code is written
      - Must optimize at multiple levels: algorithm, data representations, procedures, and loops
  - Must understand system to optimize performance
      - How programs are compiled and executed
      - How to measure program performance and identify bottlenecks
      - How to improve performance without destroying code modularity and generality

Why The Performance Differs
  [Figure: read throughput (MB/s, 0 to 16000) plotted against stride (x8 bytes, s1 to s11) and size (bytes, 32k to 128m), with the operating points of copyij and copyji labeled.]