ROSE-CIRM: Detecting C-Style Errors in UPC Code

Peter Pirkelbauer (1), Chunhua Liao (1), Thomas Panas (2), Daniel Quinlan (1)
(1) Lawrence Livermore National Laboratory
(2) Microsoft Parallel Data Warehouse

This work was funded by the Department of Defense and used elements at the Extreme Scale Systems Center, located at Oak Ridge. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-PRES-504931

Lawrence Livermore National Laboratory, P. O. Box 808, Livermore, CA 94551

Motivation

- Cost of software bugs is significant
  - in 2002, 0.6% of the GDP [NIST02]
- Error Detection Support
  - RTED Benchmark for Compilers and Runtime Systems [Lue09a] [Lue09b] [RTED]
- Bug Detection Tools
  - Static and Dynamic Analysis
  - Source Code and Binary Code

Outline

- Unified Parallel C and C-Style Errors
- Implementation
  - Code Instrumentation and Dynamic Analysis
- Evaluation
- Conclusion

Unified Parallel C (UPC)

- Extends C99 with:
  - Partitioned Global Address Space
  - Language constructs for parallelism
    - e.g., shared pointers, parallel for loop, memory consistency models

Error Categories

- C-Style Errors
  - out-of-bounds accesses, uninitialized variables, dangling pointers
- C-Style Errors in UPC's shared memory space
- UPC Library Functions
  - upc_memput with wrong length
- Parallelism-Related Errors
  - deadlock, livelock, race conditions

UPC – Bug Example 1

UPC Code:

    int upc_main() {
      shared [] int *ptr;
      if (MYTHREAD == 0) {
        ptr = upc_alloc(…);
      }
      upc_barrier;
      if (MYTHREAD == 1) {
        upc_free(ptr);
      }
    }

UPC – Bug Example 1 (cont'd)

Bug: uninitialized pointer access

UPC Code:

    int upc_main() {
      shared [] int *ptr;
      if (MYTHREAD == 0) {
        // Thread 0 allocates local shared memory.
        // ptr in Thread 1 remains uninitialized.
        ptr = upc_alloc(…);
      }
      upc_barrier;
      if (MYTHREAD == 1) {
        // Thread 1 accesses uninitialized ptr.
        upc_free(ptr);
      }
    }

UPC – Bug Example 2

UPC Code:

    int upc_main() {
      shared [] int *ptr;
      ptr = upc_all_alloc(…);
      upc_barrier;
      ptr[MYTHREAD] = …;
      if (MYTHREAD == 0) {
        upc_free(ptr);
      }
    }

UPC – Bug Example 2 (cont'd)

Bug: potential early memory release

UPC Code:

    int upc_main() {
      shared [] int *ptr;
      // Collective memory allocation
      ptr = upc_all_alloc(…);
      upc_barrier;
      ptr[MYTHREAD] = …;
      // Missing barrier: Thread 0 might free the memory early.
      if (MYTHREAD == 0) {
        upc_free(ptr);
      }
    }

Dynamic Analysis

Original Code:

    int upc_main() {
      shared [] int *ptr;
      if (MYTHREAD == 0) {
        // Thread 0 allocates local shared memory.
        // Leaves ptr in Thread 1 uninitialized.
        ptr = upc_alloc(…);
      }
      upc_barrier;
      if (MYTHREAD == 1) {
        // Thread 1 accesses uninitialized ptr.
        upc_free(ptr);
      }
    }

Instrumented Code:

    int upc_main() {
      shared [] int *ptr;
      if (MYTHREAD == 0) {
        ptr = upc_alloc(…);
        cirm_CreateHeapPtr(ptr, …);
        cirm_InitVariable(&ptr, …);
      }
      cirm_ExitWorkzone();
      upc_barrier;
      cirm_EnterWorkzone();
      if (MYTHREAD == 1) {
        cirm_FreeMem(&ptr);
        upc_free(ptr);
      }
    }

Dynamic Analysis (Scheme)

Original Code:

    int upc_main() {
      shared [] int *ptr;
      if (MYTHREAD == 0) {
        ptr = upc_alloc(…);
      }
      upc_barrier;
      if (MYTHREAD == 1) {
        upc_free(ptr);
      }
    }

Instrumented Code:

    int upc_main() {
      shared [] int *ptr;
      if (MYTHREAD == 0) {
        ptr = upc_alloc(…);
        // Updates shadow memory and notifies other UPC threads
        // about the heap allocation.
        cirm_CreateHeapPtr(ptr, …);
        // Marks the location of the ptr as initialized.
        // Note: ptr in Thread 0 != ptr in Thread 1.
        cirm_InitVariable(&ptr, …);
      }
      cirm_ExitWorkzone();
      upc_barrier;
      cirm_EnterWorkzone();
      if (MYTHREAD == 1) {
        // Thread 1 accesses uninitialized ptr.
        cirm_FreeMem(&ptr);
        upc_free(ptr);
      }
    }
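
The shadow-memory idea on this slide can be illustrated with a minimal plain-C sketch. It is not the ROSE-CIRM implementation: the `ShadowTable` type, the `shadow_mark_initialized` and `shadow_is_initialized` functions, and the fixed capacity are all hypothetical names chosen for illustration. The sketch only models the point that the monitor tracks initialization per address, so marking thread 0's copy of `ptr` says nothing about thread 1's copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_CAPACITY 64  /* hypothetical fixed size, enough for a sketch */

/* Shadow table: the set of addresses the monitor has seen initialized. */
typedef struct {
    uintptr_t addrs[SHADOW_CAPACITY];
    size_t    count;
} ShadowTable;

/* Hypothetical stand-in for a call like cirm_InitVariable(&ptr, ...):
   record that the variable stored at addr now holds a defined value. */
static void shadow_mark_initialized(ShadowTable *t, const void *addr) {
    assert(t->count < SHADOW_CAPACITY);
    t->addrs[t->count++] = (uintptr_t)addr;
}

/* Hypothetical check run before a read of the variable at addr,
   for example before upc_free(ptr) reads ptr. */
static bool shadow_is_initialized(const ShadowTable *t, const void *addr) {
    for (size_t i = 0; i < t->count; ++i) {
        if (t->addrs[i] == (uintptr_t)addr) {
            return true;
        }
    }
    return false;
}
```

If two distinct variables stand in for the per-thread copies of `ptr`, marking the first one leaves the check for the second one failing, which corresponds to the error reported for thread 1 in the example.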

The ROSE Compiler Infrastructure

ROSE-CIRM Toolchain

- ROSE - Code Instrumentation and Runtime Monitor

Runtime Architecture (1)

Runtime Architecture (2)

Instrumented Code:

    shared[] int *values = upc_all_alloc(…);
    cirm_CreateHeap(values, …);
    cirm_InitVariable(&values);
    if (MYTHREAD == 1) {
      values[1] = 7;
      cirmInitVar(&values[1], …);
    }

Runtime Monitor Coordination (1)

- Concurrent Access

Instrumented Code:

    // shared int val;
    if (MYTHREAD==0) {
      val = comp(…);
      // Sends update on initialization to other runtime managers.
      cirm_InitVariable(&val, …);
    }
    cirm_EnterBarrier();
    // Messages are processed after barrier.
    upc_barrier;
    cirm_ExitBarrier();
    // Test succeeds
    cirm_AccessVar(&val, …);
    printf("%d\n", val);

Runtime Monitor Coordination (2)

- Concurrent Access

If the input program contains race conditions, ROSE-CIRM may spuriously report an error.

Instrumented Code:

    // shared int val;
    if (MYTHREAD==0) {
      val = comp(…);
      // Sends update on initialization to other runtime managers.
      cirm_InitVariable(&val, …);
    }
    // Missing barrier.
    // upc_barrier;
    // Test fails if messages are not processed in time.
    cirm_AccessVar(&val, …);
    printf("%d\n", val);
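
The difference between the two coordination slides can be modeled in a few lines of plain C. This is a single-process sketch under stated assumptions, not the actual runtime: the `Monitor` type and the functions `monitor_send_init`, `monitor_process_pending`, and `monitor_check_access` are hypothetical. It assumes each runtime manager keeps an inbox of initialization updates that becomes visible to its checks only once the inbox is drained, which the slides place at the barrier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 16  /* hypothetical capacity for the sketch */

/* One runtime manager, as seen from the thread it monitors. */
typedef struct {
    uintptr_t pending[MAX_ENTRIES];  /* updates received but not yet processed */
    size_t    n_pending;
    uintptr_t known[MAX_ENTRIES];    /* addresses known to be initialized */
    size_t    n_known;
} Monitor;

/* Sender side: another thread initialized the variable at addr and
   queues an update in this manager's inbox. */
static void monitor_send_init(Monitor *remote, const void *addr) {
    assert(remote->n_pending < MAX_ENTRIES);
    remote->pending[remote->n_pending++] = (uintptr_t)addr;
}

/* Drain the inbox; in this model that happens when the thread passes a barrier. */
static void monitor_process_pending(Monitor *m) {
    for (size_t i = 0; i < m->n_pending; ++i) {
        assert(m->n_known < MAX_ENTRIES);
        m->known[m->n_known++] = m->pending[i];
    }
    m->n_pending = 0;
}

/* Access check: passes only if a processed update covers addr. */
static bool monitor_check_access(const Monitor *m, const void *addr) {
    for (size_t i = 0; i < m->n_known; ++i) {
        if (m->known[i] == (uintptr_t)addr) {
            return true;
        }
    }
    return false;
}
```

Calling `monitor_check_access` right after `monitor_send_init` fails even though the update has already arrived, which mirrors the spurious report for the racy program. Calling `monitor_process_pending` first, as a barrier would, makes the same check pass.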

Coordination – Early Release Problem (1)

Instrumented Code:

    shared[] int *values = upc_all_alloc(…);
    // Heap-memory access
    cirm_ArrayAccess(&values[0], &values[idx]);
    values[idx] = useful_computation(idx);
    cirm_InitVariable(&values[…], …);
    // Missing barrier
    // upc_barrier;
    if (MYTHREAD == 0) {
      // Thread 0 might free the memory early.
      cirm_ExitWorkzone();
      cirm_FreeMem(&values);
      upc_free(values);
      cirm_EnterWorkzone();
    }

Coordination – Early Release Problem (2)

- Isolate Destructive Updates

Instrumented Code:

    shared[] int *values = upc_all_alloc(…);
    cirm_ArrayAccess(&values[0], &values[idx]);
    values[idx] = useful_computation(idx);
    cirm_InitVariable(&values[…], …);
    // upc_barrier;
    if (MYTHREAD == 0) {
      cirm_ExitWorkzone();
      cirm_FreeMem(&values);
      upc_free(values);
      cirm_EnterWorkzone();
    }
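
The slides do not spell out how destructive updates are isolated, so the following plain-C sketch shows only one possible reading, and every name in it (`Coordinator`, `enter_workzone`, `exit_workzone`, `request_free`, `check_access`) is hypothetical. It assumes that a free is applied to the shadow state only when no thread is inside its workzone, so that checks made by threads still working see a consistent view.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-process model of the assumed scheme for one heap block. */
typedef struct {
    int  in_workzone;   /* number of threads currently inside their workzone */
    bool block_live;    /* shadow state: is the block still allocated? */
    bool free_pending;  /* a free was requested but not yet applied */
} Coordinator;

/* Apply a pending free only when no thread is inside its workzone. */
static void apply_pending(Coordinator *c) {
    if (c->free_pending && c->in_workzone == 0) {
        c->block_live = false;
        c->free_pending = false;
    }
}

static void enter_workzone(Coordinator *c) {
    c->in_workzone++;
}

static void exit_workzone(Coordinator *c) {
    assert(c->in_workzone > 0);
    c->in_workzone--;
    apply_pending(c);  /* the last thread to leave releases deferred updates */
}

/* Destructive update: recorded immediately, applied once it is isolated. */
static void request_free(Coordinator *c) {
    c->free_pending = true;
    apply_pending(c);
}

/* Check made by a thread that accesses the block. */
static bool check_access(const Coordinator *c) {
    return c->block_live;
}
```

With two threads inside their workzones, the first can exit and call `request_free` while `check_access` still reports the block as live for the second. The free takes effect in the shadow state only after the second thread also exits.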