Cloud9
Parallel Symbolic Execution for Automated Real-World Software Testing
Stefan Bucur, Vlad Ureche, Cristian Zamfir, George Candea
School of Computer and Communication Sciences
Automated Software Testing
[Figure: automated techniques (symbolic execution, static analysis, fuzzing, model checking) and industrial SW testing (manual testing), positioned by scalability, applicability, and usability]
Cloud9 - The Big Picture
• Parallel symbolic execution
• Linear scalability on commodity clusters
• Full symbolic POSIX support
• Applicable on real-world systems
• Platform for writing test cases
• Easy-to-use platform API
Automated Systems Testing
• Symbolic execution is promising for systems testing: KLEE [*]
• High-coverage test cases
• Found new bugs
• ... But applied only on small programs
[*] C. Cadar, D. Dunbar, D. Engler, "KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs", OSDI 2008
Memcached, Apache, GNU Coreutils
Symbolic Execution in a Nutshell
Concrete input [C9 A0 ... ]: execution evaluates each branch condition (pkt->magic != 0xC9, then pkt->cmd == GET) and follows a single path.

    void proc_pkt(packet_t* pkt) {
      if (pkt->magic != 0xC9) {
        err(pkt);
        return;
      }
      if (pkt->cmd == GET) {
        ...
      } else if ...
      ...
    }
Symbolic Execution in a Nutshell
Symbolic input λ: execution forks at each branch, and each path accumulates its own constraints.

    void proc_pkt(packet_t* pkt) {
      if (pkt->magic != 0xC9) {
        err(pkt);
        return;
      }
      if (pkt->cmd == GET) {
        ...
      } else if ...
      ...
    }

Path constraints:
• First branch: λ.magic != 0xC9 or λ.magic == 0xC9
• Second branch: λ.cmd == GET or λ.cmd != GET
Number of paths ∼ 2^(program size)
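To make the path constraints concrete, here is a minimal sketch that buckets every possible two-byte input by the path it takes through a simplified proc_pkt. It uses brute-force enumeration as a stand-in for a constraint solver; it is not how a symbolic execution engine works internally. The packet_t layout, the value of GET (0xA0), and the names proc_pkt_path and count_inputs_per_path are assumptions made for illustration and do not come from the slides.

```c
#include <stdint.h>

/* Hypothetical packet layout and command code (assumptions, not from the slides). */
typedef struct {
    uint8_t magic;
    uint8_t cmd;
} packet_t;

enum { GET = 0xA0 };

/* One identifier per path through the simplified proc_pkt. */
enum path { PATH_BAD_MAGIC = 0, PATH_GET = 1, PATH_OTHER = 2 };

/* Returns which path a concrete packet follows. */
enum path proc_pkt_path(const packet_t *pkt) {
    if (pkt->magic != 0xC9) {
        return PATH_BAD_MAGIC;      /* constraint: magic != 0xC9 */
    }
    if (pkt->cmd == GET) {
        return PATH_GET;            /* constraint: magic == 0xC9 and cmd == GET */
    }
    return PATH_OTHER;              /* constraint: magic == 0xC9 and cmd != GET */
}

/* Enumerates all 65536 (magic, cmd) pairs and counts the inputs on each path. */
void count_inputs_per_path(unsigned long counts[3]) {
    counts[0] = counts[1] = counts[2] = 0;
    for (unsigned m = 0; m < 256; m++) {
        for (unsigned c = 0; c < 256; c++) {
            packet_t pkt = { (uint8_t)m, (uint8_t)c };
            counts[proc_pkt_path(&pkt)]++;
        }
    }
}
```

Under these assumptions the three constraint sets partition the input space: 65280 inputs fail the magic check, exactly one reaches the GET branch, and 255 take the remaining path. A symbolic engine needs only one representative input per path, which is why the number of paths, not the number of inputs, drives the cost.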
Parallel Tree Exploration
CPU bottleneck, memory exhaustion
[Figure: symbolic execution tree split among workers W1, W2, and W3]
Key research problem: scalable parallel exploration
Linear Solution to Exponential Problem
[Figure: time to test vs. program size, with curves for 1, 2, 4, and 8 workers and a marked testing target]
Bring testing time down to practical values
Throw Hardware at the Problem
Scalability Challenges
• Tree structure not known a priori
• Static allocation
• Anticipate allocation
Outline
• Scalable Parallel Symbolic Execution
• POSIX Environment Model
• Evaluation
Cloud9 Architecture
[Figure: the global symbolic tree, partitioned into W1's, W2's, and W3's local trees]
Each worker runs a local sequential symbolic execution engine (KLEE)
Cloud9 Architecture
• Candidate nodes are selected for exploration
• Fence nodes bound the local tree
[Figure: local tree with fence nodes and candidate nodes marked]
Load Balancing
[Figure: workers W1, W2, and W3 connected to the load balancer (LB)]
Hybrid distributed system: centralized reports, P2P work transfer
Work Transfer
[Figure: a node is transferred from W1 to W2; figure labels: virtual, materialized; legend: candidate, fence]
Exploration disjointness + completeness
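The disjointness and completeness property can be sketched with a toy bookkeeping model. The sketch assumes that a transferred candidate node turns into a fence node on the sending worker and appears as a candidate on the receiving worker; the slides show candidate and fence nodes but do not spell out this rule, so treat it as an assumption. The types and functions (worker_t, worker_add, transfer, candidate_owners) and the use of plain integers as path identifiers are hypothetical.

```c
#include <stddef.h>

enum node_kind { CANDIDATE, FENCE };

#define MAX_NODES 16

/* Toy worker: a flat list of frontier nodes, identified by a hypothetical path id. */
typedef struct {
    unsigned path[MAX_NODES];
    enum node_kind kind[MAX_NODES];
    size_t count;
} worker_t;

/* Records a frontier node on a worker; returns its index, or -1 if the list is full. */
int worker_add(worker_t *w, unsigned path, enum node_kind kind) {
    if (w->count >= MAX_NODES) {
        return -1;
    }
    w->path[w->count] = path;
    w->kind[w->count] = kind;
    return (int)w->count++;
}

/* Moves candidate `idx` from src to dst. The source keeps the node as a fence,
 * so it will not explore below it; the destination now owns it as a candidate. */
int transfer(worker_t *src, size_t idx, worker_t *dst) {
    if (idx >= src->count || src->kind[idx] != CANDIDATE) {
        return -1;
    }
    if (worker_add(dst, src->path[idx], CANDIDATE) < 0) {
        return -1;
    }
    src->kind[idx] = FENCE;
    return 0;
}

/* Counts how many workers hold `path` as a candidate. */
size_t candidate_owners(const worker_t *ws, size_t n, unsigned path) {
    size_t owners = 0;
    for (size_t w = 0; w < n; w++) {
        for (size_t i = 0; i < ws[w].count; i++) {
            if (ws[w].path[i] == path && ws[w].kind[i] == CANDIDATE) {
                owners++;
            }
        }
    }
    return owners;
}
```

In this model, candidate_owners returning 1 for every frontier path after any sequence of transfers corresponds to both properties at once: no path is explored twice (disjointness) and no path is dropped (completeness).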
Path-based Encoding
• Nodes are encoded as paths in tree
• Compact binary representation
• Two paths can share common prefix
• Small encoding size
• For a tree of 2^100 leaves, a path fits in < 128 bits (16 bytes)
[Figure: binary tree with edges labeled 0 and 1]
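One way to picture the encoding is a fixed 16-byte bit vector holding one bit per branch decision. This is a minimal sketch under that assumption, not Cloud9's actual data structure; path_t, path_push, path_get, and path_common_prefix are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* A tree node encoded as the branch decisions (0 or 1) taken from the root. */
typedef struct {
    uint8_t bits[16];   /* up to 128 decisions, packed 8 per byte */
    uint8_t len;        /* number of decisions used, i.e. the node's depth */
} path_t;

/* Appends one branch decision; returns 0 on success, -1 if the path is full. */
int path_push(path_t *p, int bit) {
    if (p->len >= 128) {
        return -1;
    }
    uint8_t mask = (uint8_t)(1u << (p->len % 8));
    if (bit) {
        p->bits[p->len / 8] |= mask;
    } else {
        p->bits[p->len / 8] &= (uint8_t)~mask;
    }
    p->len++;
    return 0;
}

/* Returns the i-th branch decision (caller ensures i < p->len). */
int path_get(const path_t *p, size_t i) {
    return (p->bits[i / 8] >> (i % 8)) & 1;
}

/* Length of the prefix two paths share, i.e. the depth of their common ancestor. */
size_t path_common_prefix(const path_t *a, const path_t *b) {
    size_t n = a->len < b->len ? a->len : b->len;
    size_t i = 0;
    while (i < n && path_get(a, i) == path_get(b, i)) {
        i++;
    }
    return i;
}
```

A node at depth 100 uses 100 of the 128 available bits, which matches the size estimate on the slide. The shared-prefix length is what would let a batch of paths be sent without repeating their common part.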
Load Balancing in Practice
[Figure: work done (% of total instructions) vs. time (0 to 10 minutes), for continuous load balancing, LB stopped after 4 min, and LB stopped after 1 min]
Load balancing necessary to ensure scalability
Outline
• Scalable Parallel Symbolic Execution
• POSIX Environment Model
• Evaluation
Calls into the Environment

    if (fork() == 0) {
      ...
      if ((res = recv(sock, buff, size, 0)) > 0) {
        pthread_mutex_lock(&mutex);
        memcpy(gBuff, buff, res);
        pthread_mutex_unlock(&mutex);
      }
      ...
    } else {
      ...
      pid_t pid = wait(&stat);
      ...
    }
Environment Model
• The program under test calls into the environment (C library / OS), e.g. fork()
• The environment cannot be executed symbolically directly
• Model code: equivalent functionality, executable symbolically
[Figure: program under test and model code running on top of the symbolic execution engine]
Starting Point
[Figure: layered diagram with POSIX, Files, Network, and Stubs components on top of the Symbolic Execution Engine]
POSIX Environment Model
[Figure: POSIX model components on top of the Symbolic Execution Engine: Network (TCP/UDP/UNIX), Threads (pthread_*), Files, Pipes, Processes, Signals]
Key Changes in Symbolic Execution
Multithreading and Scheduling
• Deterministic or symbolic scheduling
• Non-preemptive execution model
Address Space Isolation
• Copy on Write (CoW) between processes
• CoW domains for memory sharing
Symbolic Engine System Calls
• Symbolic engine support needed for threads/processes
  1. Thread/process lifecycle
  2. Synchronization
  3. Shared memory
• System calls: thread_create, thread_terminate, process_fork, process_terminate, get_context, thread_preempt, thread_sleep, thread_notify, get_wait_list, make_shared
Outline
• Scalable Parallel Symbolic Execution
• POSIX Environment Model
• Evaluation
Testing Real-World Software
Memcached, Apache, GNU Coreutils
Time to Reach Target Coverage
[Figure: time to achieve target coverage (minutes) vs. number of workers (1, 4, 8, 24, 48) for printf, with curves for 60%, 70%, 80%, and 90% coverage]
Faster time-to-cover, higher coverage values
Increase in Code Coverage
[Figure: additional code covered (% of program LOC) per tested Coreutil, sorted by additional coverage; Coreutils suite, 12 workers, 10 min.]
Consistent code coverage increase
Exhaustive Exploration
[Figure: time to complete exhaustive test (hours) vs. number of workers (2, 4, 6, 12, 24, 48) for memcached (7.4×10^4 paths)]
Scalability of exhaustive path exploration
Instruction Throughput
[Figure: useful work done (# of instructions) vs. number of workers (1, 4, 6, 12, 24, 48) for memcached, measured after 4, 6, 8, and 10 minutes]
Linear scalability with number of workers
Experimental Setup
[Figure: one symbolic state containing a client process and a server (memcached / Apache / lighttpd) connected by a TCP stream; the client sends a symbolic command and receives the server response]
Execute the "whole world" symbolically
Symbolic Test Cases
• Easy-to-use API for developers to write symbolic test cases
• Basic symbolic memory support
• POSIX extensions for environment control
• Network conditions, fault injection, symbolic scheduler