ICTAC 2018: 15th International Colloquium on Theoretical Aspects of Computing, Stellenbosch, South Africa, Oct 19, 2018
Finding Rare Concurrent Programming Bugs: An Automatic, Symbolic, Randomized, and Parallelizable Approach
Gennaro Parlato, gennaro@ecs.soton.ac.uk
Concurrent programs
Concurrency is everywhere in computing
– embedded systems
– multi-core architectures
– worldwide networks
Large concurrent computing resources are available
– clusters
– cloud computing
There is a big demand for concurrent software
– enterprise customer services (e.g., telecom companies)
– government services (e.g., tax payment services)
– social networks, cloud services, …
Developing concurrent programs is difficult
[Figure: threads/processes T1, T2, …, TN interacting through a communication mechanism]
Programmers have to guarantee
– correctness of sequential execution of each individual thread
– under nondeterministic interferences from other threads (interleavings)
Developing concurrent programs is difficult
What happens here?

    int n = 0; // atomic shared variable

    int P(void) {
        int tmp, i = 1;
        while (i <= 10) {
            tmp = n;
            n = tmp + 1;
            i++;
        }
    }

    int main(void) {
        id1 = thread_create(P);
        id2 = thread_create(P);
        join(id1);
        join(id2);
        assert(n == 20);
    }

Can the assert fail?
Developing concurrent programs is difficult
What happens here?

    int n = 0; // atomic shared variable

    int P(void) {
        int tmp, i = 1;
        while (i <= 10) {
            tmp = n;
            n = tmp + 1;
            i++;
        }
    }

    int main(void) {
        id1 = thread_create(P);
        id2 = thread_create(P);
        join(id1);
        join(id2);
        assert(n > 2);
    }
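One way to see that both assertions can fail is to replay a single adversarial interleaving by hand. The sketch below is not a real multithreaded run: it models each thread of the example as a small state machine whose read (tmp = n) and write (n = tmp + 1) are separate atomic steps, and drives the two machines in a fixed order. The names Thread, step, run_steps, and simulate_rare_schedule are hypothetical helpers introduced only for this illustration.

```c
#include <assert.h>

static int n;                     /* shared variable of the example */

typedef struct {
    int tmp;                      /* thread-local copy of n */
    int i;                        /* loop counter, runs from 1 to 10 */
    int phase;                    /* 0: next step reads n, 1: next step writes n */
} Thread;

/* Execute one atomic step of P; a finished thread does nothing. */
static void step(Thread *t) {
    if (t->i > 10) return;
    if (t->phase == 0) {          /* tmp = n; */
        t->tmp = n;
        t->phase = 1;
    } else {                      /* n = tmp + 1; i++; */
        n = t->tmp + 1;
        t->i++;
        t->phase = 0;
    }
}

static void run_steps(Thread *t, int steps) {
    while (steps-- > 0) step(t);
}

/* Replay one specific interleaving and return the final value of n. */
static int simulate_rare_schedule(void) {
    Thread t1 = {0, 1, 0}, t2 = {0, 1, 0};
    n = 0;
    run_steps(&t1, 1);            /* T1 reads n == 0 */
    run_steps(&t2, 18);           /* T2 completes 9 iterations, n == 9 */
    run_steps(&t1, 1);            /* T1 writes n = 1 (overwrites T2's work) */
    run_steps(&t2, 1);            /* T2 reads n == 1 in its last iteration */
    run_steps(&t1, 18);           /* T1 completes its 9 remaining iterations, n == 10 */
    run_steps(&t2, 1);            /* T2 writes n = 2 */
    return n;
}
```

Calling simulate_rare_schedule() returns 2, so under this schedule both assert(n == 20) and assert(n > 2) are violated. Very few of the possible interleavings look like this one, which is why random testing is unlikely to hit it.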
Scale of the challenge: #interleavings
2 threads (T1, T2) with N LOC each
#interleavings: $\binom{2N}{N}$
Scenario 1:
– N=40
– If 1 billion interleavings are simulated per second: 3.4 million years
Scenario 2:
– N=150
– #interleavings > estimated #atoms in the known universe ($\geq 10^{80}$)
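The two scenarios can be checked with a few lines of arithmetic. The sketch below evaluates the binomial coefficient as a running product in double precision, which is accurate enough for an order-of-magnitude estimate; the function names interleavings and years_to_enumerate, and the 365.25-day year, are assumptions made for this illustration.

```c
#include <assert.h>

/* Number of interleavings of two threads with n atomic steps each:
   C(2n, n) = prod_{k=1..n} (n + k) / k, evaluated in floating point. */
static double interleavings(int n) {
    double c = 1.0;
    for (int k = 1; k <= n; k++)
        c = c * (n + k) / k;
    return c;
}

/* Years needed to enumerate all interleavings at a given rate per second. */
static double years_to_enumerate(int n, double per_second) {
    const double seconds_per_year = 365.25 * 24.0 * 3600.0;
    return interleavings(n) / per_second / seconds_per_year;
}
```

For N=40 the count is roughly 1.1e23, which at 1e9 interleavings per second gives about 3.4 million years; for N=150 the count is roughly 9e88, above the 1e80 figure quoted for atoms in the known universe.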
Bug-finding: finding needles in a haystack Set of interleavings Haystack Testing is easy when many interleavings are buggy
Bug-finding: finding A needle in a haystack Set of interleavings Haystack … but is hard when buggy interleavings are rare ⇒ … needs to be complemented by automated analyses that handle interleavings symbolically
Bounded Model Checking (BMC) of concurrent programs
Testing vs Bounded Model Checking
• Testing:
– checks some executions
– may miss errors
– fast
• Bounded Model Checking (BMC)
– exhaustively explores all executions within given bounds (on loop iterations, context switches, etc.)
– can be extremely resource-hungry
BMC for sequential C programs
[Figure: sequential program → (inlining, unrolling) → bounded program → (SSA form) → formula → SAT/SMT solver]
tools
– BLITZ [Cho, D'Silva, Song – ASE’13]
– CBMC [Clarke, Kroening, Lerda – TACAS’04]
– LLBMC [Falke, Merz, Sinz – ASE’13]
– ESBMC [Cordeiro, Fischer, Marques-Silva – ASE’09]
BMC for concurrent C programs
[Figure: conc program → bounded program → formula → SAT/SMT solver, with an added concurrency handling step]
SAT/SMT approach
• encode each thread as in the sequential case
• add a conjunct for shared memory operations
• all possible interleavings in the bounded program: φ_threads ∧ φ_concurrency
papers
• [Sinha, Wang – POPL’11]
• [Alglave, Kroening, Tautschnig – CAV’13] CBMC
Sequentialization targeting BMC
Sequentialization: motivations Building verification tools for full-fledged concurrent languages is difficult and expensive... … but scalable verification techniques exist for sequential languages – Abstraction – SAT/SMT techniques (i.e., bounded model checking) – … ⇒ Can we leverage these?
Sequentialization as a code-to-code translation
Code-to-code translation from multithreaded recursive programs to sequential programs that preserves reachability
[Figure: conc. program (threads T1, T2, …, TN over shared variables) → “equivalent” sequential program with nondeterminism]
Use existing automatic verification techniques designed for sequential programs to analyze concurrent programs
Lazy-CSeq: Schema Overview (a sequentialization for BMC)
[Inverso, Tomasco, Fischer, La Torre, Parlato – CAV’14]
Lazy-CSeq approach
[Figure: conc program → bounded program → SEQUENTIALIZATION (code-to-code translation) → seq program → sequential BMC tool]
We have designed new sequentializations targeting BMC
– scalable analyses
– surprisingly simple
Tool: Lazy-CSeq
Bounded Concurrent Programs
[Figure: main() and threads T0, T1, …, TN-1, TN]
• no loops
• no function calls
• control flow only forward
• one procedure for each thread
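As an illustration of these four properties, the sketch below shows the loop of the running example next to a version unrolled twice, using only forward jumps. It is a hand-written approximation of what an unwinding step produces, not the output of any particular tool. The parameter limit, the names P_loop and P_bounded, and the convention of returning 0 when the unwinding bound is too small are all assumptions made for this example.

```c
#include <assert.h>

static int n;                          /* shared variable of the running example */

/* Original thread body; the loop limit is a parameter for illustration. */
static void P_loop(int limit) {
    int tmp, i = 1;
    while (i <= limit) {
        tmp = n;
        n = tmp + 1;
        i++;
    }
}

/* Bounded version: the loop is unrolled twice and control flow only
   moves forward. Returns 1 if two unwindings covered the whole loop,
   0 if the loop would have continued beyond the bound. */
static int P_bounded(int limit) {
    int tmp, i = 1;
    if (!(i <= limit)) goto done;      /* first unwinding */
    tmp = n;
    n = tmp + 1;
    i++;
    if (!(i <= limit)) goto done;      /* second unwinding */
    tmp = n;
    n = tmp + 1;
    i++;
    if (i <= limit) return 0;          /* bound exceeded: later iterations are not explored */
done:
    return 1;
}
```

With limit = 2 the two versions leave n in the same state; with limit = 10 the bounded version only covers the first two iterations, which is the price of bounding.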
Round Robin Schedule
[Figure: threads main(), T0, T1, …, TN-1, TN scheduled in round 1, round 2, round 3, …, round k]
Lazy-CSeq sequentialization:
• captures all bounded Round-Robin computations for a given bound
• errors manifest themselves within very few rounds [Musuvathi, Qadeer – PLDI’07]
Schema Overview
[Figure: each thread of the bounded concurrent program (main(), T0, T1, …, TN) is translated by the sequentialization (code-to-code translation) into a function of the “equivalent” sequential program with nondeterminism, which consists of the sequentialized functions F0, F1, …, FN and a main driver main()]
Naïve Lazy Sequentialization
• Add a global pc for each thread
• thread locals → thread global

main driver:

    pc_0 = 0; ... pc_N = 0;
    local_0; ... local_k;

    main() {
      for (r = 0; r < K; r++)
        for (i = 0; i < N; i++)
          // simulate T_i
          if (active_i) F_i();
    }
Naïve Lazy Sequentialization

main driver:

    pc_0 = 0; ... pc_N = 0;
    local_0; ... local_k;

    main() {
      for (r = 0; r < K; r++)        // for each round
        for (i = 0; i < N; i++)      // for each thread T_i
          if (active_i) F_i();       // simulate T_i
    }
Naïve Lazy Sequentialization

F_i():

    switch (pc_i) {
      case 0: goto 0;
      case 1: goto 1;
      case 2: goto 2;
      ...
      case M: goto M;
    }
    0: CS(0); stmt0;
    1: CS(1); stmt1;
    2: CS(2); stmt2;
    ...                              // EXECUTE
    M: CS(M); stmtM;

main driver:

    pc_0 = 0; ... pc_N = 0;
    local_0; ... local_k;

    main() {
      for (r = 0; r < K; r++)
        for (i = 0; i < N; i++)
          // simulate T_i
          if (active_i) F_i();
    }
Naïve Lazy Sequentialization

F_i():

    switch (pc_i) {                  // resume mechanism
      case 0: goto 0;
      case 1: goto 1;
      case 2: goto 2;
      ...
      case M: goto M;
    }
    0: CS(0); stmt0;
    1: CS(1); stmt1;
    2: CS(2); stmt2;
    ...                              // EXECUTE
    M: CS(M); stmtM;

main driver:

    pc_0 = 0; ... pc_N = 0;
    local_0; ... local_k;

    main() {
      for (r = 0; r < K; r++)
        for (i = 0; i < N; i++)
          // simulate T_i
          if (active_i) F_i();
    }
Naïve Lazy Sequentialization

F_i():

    switch (pc_i) {
      case 0: goto 0;
      case 1: goto 1;
      case 2: goto 2;
      ...
      case M: goto M;
    }
    0: CS(0); stmt0;
    1: CS(1); stmt1;
    2: CS(2); stmt2;
    ...                              // EXECUTE
    M: CS(M); stmtM;

main driver:

    pc_0 = 0; ... pc_N = 0;
    local_0; ... local_k;

    main() {
      for (r = 0; r < K; r++)
        for (i = 0; i < N; i++)
          // simulate T_i
          if (active_i) F_i();
    }

Context-switch mechanism:

    #define CS(j) if (*) { pc_i = j; return; }
Naïve Lazy Sequentialization

Sequentialized thread:

    switch (pc_i) {
      case 0: goto 0;
      case 1: goto 1;
      case 2: goto 2;
      ...
      case M: goto M;
    }
    0: CS(0); stmt0;
    1: CS(1); stmt1;
    2: CS(2); stmt2;
    ...                              // EXECUTE
    M: CS(M); stmtM;

main driver:

    pc_0 = 0; pc_1 = 0; ... pc_N = 0;
    local_0; local_1; ... local_k;

    main() {
      for (r = 0; r < R; r++)
        for (k = 0; k < N; k++)
          // simulate T_k
          F_k();
    }

Context-switch mechanism:

    #define CS(j) if (*) { pc_i = j; return; }

Formula encoding:
• goto statement to formula: add a guard for each crossing control-flow edge
• = O(M^2) guards
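The naïve scheme can be tried out on a tiny instance. The sketch below sequentializes two threads that each execute tmp = n; n = tmp + 1; once, with two rounds. It is an illustration under stated assumptions, not the code the tool generates: the nondeterministic choice (*) is replaced by a hypothetical nondet() that reads bits from a fixed schedule word so that a run is reproducible, numeric labels become L0 and L1 because C labels must be identifiers, and the active flag is derived from the pc. The names run_naive, schedule, and DONE are invented for this example.

```c
#include <assert.h>

#define NTHREADS 2
#define ROUNDS   2
#define DONE     2                 /* pc value after the last statement */

static int n;                      /* shared variable */
static int tmp[NTHREADS];          /* thread locals turned into globals */
static int pc[NTHREADS];           /* one global pc per thread */

/* Stand-in for the nondeterministic choice (*): consume one bit of a
   fixed schedule word per call; once exhausted it always answers 0. */
static unsigned schedule;
static int nondet(void) {
    int bit = (int)(schedule & 1u);
    schedule >>= 1;
    return bit;
}

/* Context-switch mechanism: store the resume point and leave the thread. */
#define CS(j) do { if (nondet()) { pc[t] = (j); return; } } while (0)

/* Sequentialized thread body: tmp = n; n = tmp + 1; */
static void F(int t) {
    switch (pc[t]) {               /* resume mechanism */
    case 0: goto L0;
    case 1: goto L1;
    default: return;
    }
L0: CS(0); tmp[t] = n;
L1: CS(1); n = tmp[t] + 1;
    pc[t] = DONE;                  /* thread terminated */
}

/* Main driver: round-robin over the threads for a fixed number of rounds. */
static int run_naive(unsigned sched) {
    schedule = sched;
    n = 0;
    for (int t = 0; t < NTHREADS; t++) { pc[t] = 0; tmp[t] = 0; }
    for (int r = 0; r < ROUNDS; r++)
        for (int t = 0; t < NTHREADS; t++)
            if (pc[t] != DONE)     /* plays the role of active_t */
                F(t);
    return n;
}
```

With run_naive(0u) no context switch is taken and n ends at 2. With run_naive(2u) thread 0 is preempted between its read and its write, thread 1 runs to completion, and thread 0 then overwrites the update in round 2, so n ends at 1. A BMC back end would explore all values of the choice symbolically instead of one schedule word at a time.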
Lazy-CSeq sequentialization
Guess next context-switch point

main driver:

    pc_0 = 0; ... pc_N = 0;
    local_0; ... local_k;
    nextCS;

    main() {
      for (r = 0; r < K; r++)
        for (i = 0; i < N; i++)
          // simulate T_i
          if (active_i) {
            nextCS = nondet;
            assume(nextCS >= pc_i);
            F_i();
            pc_i = nextCS;
          }
    }
Lazy-CSeq sequentialization

F_i():

    0: J(0); stmt0;                  // skip
    1: J(1); stmt1;
    2: J(2); stmt2;
    ...                              // EXECUTE
    ...                              // skip
    M: J(M); stmtM;

    #define J(j) if (j<pc_i || j>=nextCS) goto j+1;

main driver:

    pc_0 = 0; ... pc_N = 0;
    local_0; ... local_k;
    nextCS;

    main() {
      for (r = 0; r < K; r++)
        for (i = 0; i < N; i++)
          // simulate T_i
          if (active_i) {
            nextCS = nondet;
            assume(nextCS >= pc_i);
            F_i();
            pc_i = nextCS;
          }
    }
Lazy-CSeq sequentialization
resuming + context-switch

F_i():

    0: J(0); stmt0;                  // skip (statements before pc_i)
    1: J(1); stmt1;
    2: J(2); stmt2;
    ...                              // EXECUTE (statements from pc_i up to nextCS)
    ...                              // skip (statements from nextCS on)
    M: J(M); stmtM;

    #define J(j) if (j<pc_i || j>=nextCS) goto j+1;

main driver:

    pc_0 = 0; ... pc_N = 0;
    local_0; ... local_k;
    nextCS;

    main() {
      for (r = 0; r < K; r++)
        for (i = 0; i < N; i++)
          // simulate T_i
          if (active_i) {
            nextCS = nondet;
            assume(nextCS >= pc_i);
            F_i();
            pc_i = nextCS;
          }
    }
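The same two-thread instance (each thread runs tmp = n; n = tmp + 1; once) can be used to try the guess-and-skip scheme. As before, this is a hedged sketch rather than the tool's output. Several details are assumptions for illustration: the nondeterministic value comes from a caller-supplied guesses array, assume() is modelled by abandoning the run with -1, the upper bound nextCS <= M is added so that a guess stays within the thread, and J takes the next label explicitly because C has no computed goto j+1. The names run_lazy and guesses are hypothetical.

```c
#include <assert.h>

#define NTHREADS 2
#define ROUNDS   2
#define M        2                 /* number of statements per thread */

static int n;                      /* shared variable */
static int tmp[NTHREADS];          /* thread locals turned into globals */
static int pc[NTHREADS];           /* one global pc per thread */
static int nextCS;                 /* guessed next context-switch point */

/* Skip statement j unless pc[t] <= j < nextCS. */
#define J(j, next) if ((j) < pc[t] || (j) >= nextCS) goto next

/* Sequentialized thread body: no switch, only forward jumps. */
static void F(int t) {
    J(0, L1); tmp[t] = n;
L1: J(1, L2); n = tmp[t] + 1;
L2: return;
}

/* Main driver. guesses[] stands in for nondet; when it runs out, the
   thread is run to its end. Returns the final n, or -1 if a guess
   violates the assumption and the run is discarded. */
static int run_lazy(const int *guesses, int len) {
    int g = 0;
    n = 0;
    for (int t = 0; t < NTHREADS; t++) { pc[t] = 0; tmp[t] = 0; }
    for (int r = 0; r < ROUNDS; r++)
        for (int t = 0; t < NTHREADS; t++)
            if (pc[t] < M) {                         /* active_t */
                nextCS = (g < len) ? guesses[g++] : M;
                if (nextCS < pc[t] || nextCS > M)    /* assume(...) */
                    return -1;
                F(t);
                pc[t] = nextCS;
            }
    return n;
}
```

With guesses {2, 2} both threads run to completion in round 1 and n ends at 2. With guesses {1, 2, 2} thread 0 stops after its read, thread 1 completes, and thread 0 finishes in round 2 with a stale tmp, so n ends at 1. Compared with the naïve version, each statement is guarded by a comparison against pc and nextCS instead of being a jump target of the switch, which is what keeps the control flow forward-only.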