
Faster, Stronger C++ Analysis with the Clang Static Analyzer



  1. Faster, Stronger C++ Analysis with the Clang Static Analyzer • George Karpenkov, Apple • Artem Dergachev, Apple

  2. Agenda • Introduction to Clang Static Analyzer • Using coverage-based iteration order • Improved C++ constructor and destructor support

  3. Agenda • Introduction to Clang Static Analyzer • Using coverage-based iteration order • Improved C++ constructor and destructor support

  4. Clang Static Analyzer Finds Bugs at Compile Time • Use-after-free bugs • Null pointer dereferences • Uses of uninitialized values • Memory leaks, etc…

  5. Analyzer Visualizes Paths • Inside IDE: Xcode, QtCreator, CodeCompass • From command line: generate HTML • $ scan-build make • http://clang-analyzer.llvm.org

  6. Analyzer Simulates Program Execution • Explores paths through the program • Uses symbols instead of concrete values • Generates reports on errors

  7. A Faster than Light Intro to the Analyzer
     Code:
       int foo(int a) {
         int x = 0;
         if (a != 0)
           x = 1;
         return 1 / x;
       }
     Control Flow Graph: x = 0, then a branch on a != 0 (TRUE: x = 1), then return 1/x
     Exploded Graph:
       a ≠ 0: x = 1, return 1
       a = 0: x = 0, return 1/0 💦 CRASH!
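The two exploded-graph paths through `foo` can be sketched as a toy model in which `a` stays symbolic and each path only records the constraint it assumed. This is a minimal illustration, not the analyzer's implementation; `PathState` and `explorePaths` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of one exploded-graph path through foo() from the slide.
// 'a' is never given a concrete value; the path stores its constraint.
struct PathState {
  std::string constraintOnA; // assumption made about the symbol 'a'
  int x;                     // concrete value of x on this path
  bool divisionByZero;       // true if 'return 1 / x' would crash
};

// Hypothetical helper: follow both outcomes of 'if (a != 0)'.
std::vector<PathState> explorePaths() {
  std::vector<PathState> paths;
  for (bool condition : {true, false}) {
    PathState state;
    state.x = 0;                                    // int x = 0;
    state.constraintOnA = condition ? "a != 0" : "a == 0";
    if (condition)
      state.x = 1;                                  // x = 1;
    state.divisionByZero = (state.x == 0);          // return 1 / x;
    paths.push_back(state);
  }
  return paths;
}
```

Only the path that assumes `a == 0` reaches the division with `x == 0`, which matches the crash shown in the exploded graph.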

  8. Agenda • Introduction to Clang Static Analyzer • Using coverage-based iteration order • Improved C++ constructor and destructor support

  9. Problem: Path is Too Long • XNU (Darwin Kernel): many paths over 400 steps • Bug can be found on the first iteration • Aim: provide shorter, more concise diagnostics

  10. Analyzer Uses Worklist to Generate Exploded Graph • Start: entry point • Successors: simulated execution of a statement • Allows different exploration strategies • Previously: DFS by default
      worklist = { start }
      while worklist:
        node = worklist.pop()
        successors = execute(node)
        for successor in successors:
          worklist.push(successor)
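The worklist loop on this slide can be sketched in C++ as follows. This is a minimal sketch under stated assumptions: nodes are plain integers, `execute` looks up successors in a fixed graph instead of simulating a statement, and the `seen` set is an addition made here so the toy loop terminates on shared nodes. `Node`, `Graph`, and `exploreDFS` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for an exploded-graph node: just an integer id.
using Node = int;
using Graph = std::map<Node, std::vector<Node>>;

// Stand-in for execute(node): successors come from a fixed graph here,
// whereas the analyzer would simulate the execution of one statement.
std::vector<Node> execute(const Graph &graph, Node node) {
  auto it = graph.find(node);
  return it == graph.end() ? std::vector<Node>{} : it->second;
}

// Worklist loop from the slide. Using the vector as a stack makes the
// exploration order depth-first. Returns nodes in processing order.
std::vector<Node> exploreDFS(const Graph &graph, Node start) {
  std::vector<Node> worklist{start};
  std::vector<Node> order;
  std::set<Node> seen{start}; // assumption: skip nodes already queued
  while (!worklist.empty()) {
    Node node = worklist.back();
    worklist.pop_back();
    order.push_back(node);
    for (Node successor : execute(graph, node))
      if (seen.insert(successor).second)
        worklist.push_back(successor);
  }
  return order;
}
```

Swapping the stack for another container changes the exploration strategy without touching the rest of the loop.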

  11. DFS Exploration Order Leads to Wasted Effort
      int main() {
        for (int i = 0; i < 2; ++i) {
          if (cond())
            continue;
          return 1/0; // 💦 crash
        }
      }
      Exploded graph:
        for
        cond() [i = 0]
          TRUE: for [i = 0]
            cond() [i = 1]
              TRUE: for [i = 1], EXIT
              FALSE: return 1/0 [i = 1]

  12. DFS Exploration Order Leads to Wasted Effort
      int main() {
        for (int i = 0; i < 2; ++i) {
          if (cond())
            continue;
          return 1/0; // 💦 crash
        }
      }
      Exploded graph:
        for
        cond() [i = 0]
          TRUE: for [i = 0]
            cond() [i = 1]
              TRUE: for [i = 1], EXIT
              FALSE: return 1/0 [i = 1]
          FALSE: return 1/0 [i = 0]

  13. Problem Often Mitigated by Analyzer Heuristics • Deduplication: if the same report is found multiple times, return the shortest path • Budget per source location: paths that visit a location more than 3 times get dropped • Budget per number of inlinings • … • In many unfortunate cases, the shortest path is not found at all
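The first two heuristics can be sketched as below. This is an illustrative sketch only: `Report`, `deduplicate`, and `exceedsLocationBudget` are hypothetical names, locations are plain strings, and the default limit of 3 is taken from the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical report: where the bug is and how long the path to it was.
struct Report {
  std::string location;
  std::size_t pathLength;
};

// Deduplication: among reports at the same location keep the shortest path.
std::map<std::string, Report> deduplicate(const std::vector<Report> &reports) {
  std::map<std::string, Report> shortest;
  for (const Report &report : reports) {
    auto it = shortest.find(report.location);
    if (it == shortest.end())
      shortest.emplace(report.location, report);
    else if (report.pathLength < it->second.pathLength)
      it->second = report;
  }
  return shortest;
}

// Budget per source location: a path is dropped once it visits any
// location more than maxVisits times (3 on the slide).
bool exceedsLocationBudget(const std::vector<std::string> &path,
                           int maxVisits = 3) {
  std::map<std::string, int> visits;
  for (const std::string &location : path)
    if (++visits[location] > maxVisits)
      return true;
  return false;
}
```

Deduplication can only pick the shortest path among those that were actually explored, which is why a budget that cuts off exploration early can leave only a long path to report.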

  14. Solution: Coverage-Based Iteration Order • Record the number of times the analyzer visits each location • Use a priority queue: • Prefers source locations the analyzer has visited fewer times so far • Finds bugs on first iteration when possible
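The effect of the priority queue can be sketched on the loop example from the DFS slides. This is a toy model, not the analyzer's code: the state space is hand-written, the priority is the visit count of a location at enqueue time with insertion order as a tie-breaker (an assumption made here), and the DFS variant pushes successors so that the TRUE branch is explored first, mirroring the slides. All names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <vector>

enum class Loc { For, Cond, Crash, Exit };

struct State {
  Loc loc;
  int i;
};

// Successors for: for (int i = 0; i < 2; ++i) { if (cond()) continue; return 1/0; }
std::vector<State> successors(const State &s) {
  switch (s.loc) {
  case Loc::For:
    if (s.i < 2)
      return {State{Loc::Cond, s.i}};
    return {State{Loc::Exit, s.i}};
  case Loc::Cond:
    // FALSE branch reaches the crash; TRUE branch continues with ++i.
    return {State{Loc::Crash, s.i}, State{Loc::For, s.i + 1}};
  default:
    return {};
  }
}

// Value of i at the first crash found, and the number of states processed.
struct Result {
  int crashAtI;
  int steps;
};

struct Entry {
  int visitsAtEnqueue; // how often the location had been visited so far
  int sequence;        // insertion order, used as a tie-breaker
  State state;
};

struct LessVisitedFirst {
  bool operator()(const Entry &a, const Entry &b) const {
    if (a.visitsAtEnqueue != b.visitsAtEnqueue)
      return a.visitsAtEnqueue > b.visitsAtEnqueue;
    return a.sequence > b.sequence;
  }
};

Result exploreCoverageBased() {
  std::map<Loc, int> visits;
  std::priority_queue<Entry, std::vector<Entry>, LessVisitedFirst> worklist;
  int sequence = 0;
  worklist.push({0, sequence++, State{Loc::For, 0}});
  int steps = 0;
  while (!worklist.empty()) {
    State state = worklist.top().state;
    worklist.pop();
    ++steps;
    ++visits[state.loc];
    if (state.loc == Loc::Crash)
      return {state.i, steps};
    for (const State &next : successors(state))
      worklist.push({visits[next.loc], sequence++, next});
  }
  return {-1, steps};
}

Result exploreDFS() {
  std::vector<State> worklist{State{Loc::For, 0}};
  int steps = 0;
  while (!worklist.empty()) {
    State state = worklist.back();
    worklist.pop_back();
    ++steps;
    if (state.loc == Loc::Crash)
      return {state.i, steps};
    for (const State &next : successors(state))
      worklist.push_back(next);
  }
  return {-1, steps};
}
```

In this model the coverage-based order reaches the crash with `i = 0` after three states, because the loop head has already been visited once when the crash location has not. The depth-first order first reports the crash with `i = 1`, after seven states.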

  15. Coverage-Based Iteration Order
      int main() {
        for (int i = 0; i < 2; ++i) {
          if (cond())
            continue;
          return 1/0; // 💦 crash
        }
      }
      Exploded graph:
        for
        cond() [i = 0]
          TRUE
          FALSE: return 1/0

  16. Coverage-Based Iteration Order
      int main() {
        for (int i = 0; i < 2; ++i) {
          if (cond())
            continue;
          return 1/0; // 💦 crash
        }
      }
      Exploded graph:
        for
        cond() [i = 0]
          TRUE
          FALSE: return 1/0

  17. Results: 95th Percentile of Path Length • Bar chart: 95th percentile of path length, before vs. after, for XNU, openSSL, postgres, Adium, sqlite3 (vertical axis 0 to 300)

  18. Results: Total Bug Reports • 16% increase in number of reports found • Bar chart: number of reports, before vs. after, for XNU, openSSL, postgres, Adium, sqlite3 (vertical axis 0 to 1200)

  19. Agenda • Introduction to Clang Static Analyzer • Using coverage-based iteration order • Improved C++ constructor and destructor support

  20. Incomplete C++ Support Caused False Positives • Analyzer lost information on object construction • Analyzer lost track of objects before they were destroyed • Temporaries are hard!

  21. Constructor Call = Initialization Bookkeeping + Method Call

  22. Initialization Bookkeeping In C Is Easy
      Code:
        typedef struct {...} Point;
        Point makePoint();
        Point P = makePoint();
      AST:
        DeclStmt
        `- VarDecl 'P' 'Point'
           `- CallExpr 'makePoint' 'Point'
      1. CallExpr: call 'makePoint()' to evaluate contents of the structure
      2. DeclStmt: put these contents into 'P'

  23. Initialization Bookkeeping In C++ Is More Complicated
      Code:
        struct Point {
          ...
          Point();
        };
        Point P;
      AST:
        DeclStmt
        `- VarDecl 'P' 'Point'
           `- CXXConstructExpr 'Point()'
      1. CXXConstructExpr: call constructor like a method on the object P
      2. DeclStmt: learn about the existence of variable P

  24. Initialization Bookkeeping In C++ Is More Complicated
      Code:
        struct Point {
          ...
          Point();
        };
        Point P;
      AST:
        DeclStmt
        `- VarDecl 'P' 'Point'
           `- CXXConstructExpr 'Point()'
      2. DeclStmt: learn about the existence of variable P
      1. CXXConstructExpr: call constructor like a method on the object P

  25. Initialization Bookkeeping In C++ Is More Complicated
      Code:
        struct Point {
          ...
          Point();
        };
        Point P;
      AST:
        DeclStmt
        `- VarDecl 'P' 'Point'
           `- CXXConstructExpr 'Point()'
      1. DeclStmt: learn about the existence of variable P
      2. CXXConstructExpr: call constructor like a method on the object P

  26. Initialization Bookkeeping In C++ Is More Complicated • The constructor needs to know what object is being constructed • CXXConstructExpr doesn't tell us everything in advance
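A small runnable illustration of why the constructor needs to know the object under construction: inside the constructor, `this` already points at the variable's storage, so a simulation of the constructor call has to know that address before the call starts. The `constructedAt` member and the helper function are hypothetical and exist only for this sketch.

```cpp
#include <cassert>

// Hypothetical Point that records the address it was constructed at.
struct Point {
  const Point *constructedAt;
  Point() : constructedAt(this) {}
};

// Returns true if the constructor observed the variable's own address.
bool constructorSawVariableAddress() {
  Point P; // the constructor runs with this == &P
  return P.constructedAt == &P;
}
```

The function returns `true`: the constructor call and the declared variable refer to the same storage even though the constructor expression itself does not name `P`.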

  27. Initialization Bookkeeping In C++ Takes Many Forms
      Variables:
        Point P(1, 2, 3);
        Point P = Point(1, 2, 3);
        Point P = Point(1); // cast from 1
        Point P = 1; // implicit cast from 1
      Heap allocation:
        Point *P = new Point(1, 2, 3);
        Point *P = new Point[N + 1];
      Argument values:
        draw(Point(1, 2, 3));
        Point(1, 2, 3) - Point(4, 5, 6);
        void draw(Point P = Point(1, 2, 3));
        draw(); // construct P
      Temporaries:
        Point(1, 2, 3);
        const Point &P = Point(1, 2, 3);
        const int &x = Point(1, 2, 3).x;
        // determine in run-time
        const Point &P =
            lunarPhase() ? Point(1, 2, 3)
                         : Point(3, 2, 1);
      Captured values:
        // copy to capture
        Point P; [P]{ return P; }();
      Constructor initializers:
        struct Vector {
          Point P;
          Vector() : P(1, 2, 3) {}
        };
        struct Vector {
          Point P = Point(1, 2, 3);
        };
      Return values:
        Point getPoint() {
          return Point(1, 2, 3); // RVO
        }
        Point getPoint() {
          Point P(1, 2, 3); // NRVO
          return P;
        }
      Aggregates and brace initializers:
        Point P{1, 2, 3};
        PointPair PP{Point(1, 2),
                     Point(3, 4)};
        PointPairPair PPP{{{1, 2}, {3, 4}},
                          {{5, 6}, {7, 8}}};
        std::vector<Point> V{{1, 2, 3}};
      IT IS ONLY GETTING WORSE
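Two of the temporary forms listed on this slide behave differently at run time, which hints at why each form needs its own bookkeeping. The sketch below counts live objects; the `alive` counter and the helper functions are hypothetical additions, and the three-argument `Point` is a stand-in for the one on the slide.

```cpp
#include <cassert>

// Hypothetical Point that counts how many instances are currently alive.
struct Point {
  static int alive;
  Point(int, int, int) { ++alive; }
  Point(const Point &) { ++alive; }
  ~Point() { --alive; }
};
int Point::alive = 0;

// Point(1, 2, 3);
int aliveAfterPlainTemporary() {
  Point(1, 2, 3); // destroyed at the end of this full-expression
  return Point::alive;
}

// const Point &P = Point(1, 2, 3);
int aliveWhileBoundToReference() {
  const Point &P = Point(1, 2, 3); // lifetime extended to P's scope
  (void)P;
  return Point::alive;
}
```

The plain temporary is gone by the next statement, while the one bound to a `const` reference is still alive when the function computes its return value and is destroyed only when `P` goes out of scope.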


  28. There is a common theme

  29. Need to track the constructed object’s address until the analyzer processes the statement that represents the object’s storage

  30. Solution: Construction Context • Augments CFG constructor call elements • Describes the construction site: • What object is constructed? • Who is responsible for destroying it? • Is it a temporary that requires materialization? • Is the constructor elidable?
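The four questions on this slide can be pictured as fields of a small record attached to each constructor call. The sketch below is purely illustrative and is not Clang's `ConstructionContext` class: `ConstructionSite`, its fields, and both helper functions are hypothetical names, and the string descriptions stand in for what would be AST references in a real implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical record answering the four questions from the slide.
struct ConstructionSite {
  std::string constructedObject;  // what object is constructed?
  std::string destroyedBy;        // who is responsible for destroying it?
  bool requiresMaterialization;   // is it a temporary to be materialized?
  bool elidable;                  // is the constructor elidable?
};

// Point P(1, 2, 3);
ConstructionSite describeLocalVariable(const std::string &name) {
  return {"variable " + name, "end of enclosing scope", false, false};
}

// const Point &P = Point(1, 2, 3);
ConstructionSite describeLifetimeExtendedTemporary(const std::string &ref) {
  return {"temporary bound to " + ref, "end of scope of " + ref, true, false};
}
```

With such a record available before the constructor call is simulated, the constructed object's address no longer has to be guessed from the statement that follows.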

  31. Solution: Construction Context • A construction syntax catalog • There are currently 15 classes • Easy to identify and to support

  32. Progress made…
      Slide annotations: individual forms below are tagged BEFORE (3 tags) or NOW (5 tags)
      Variables:
        Point P(1, 2, 3);
        Point P = Point(1, 2, 3);
        Point P = Point(1); // cast from 1
        Point P = 1; // implicit cast from 1
      Heap allocation:
        Point *P = new Point(1, 2, 3);
        Point *P = new Point[N + 1];
      Argument values:
        draw(Point(1, 2, 3));
        Point(1, 2, 3) - Point(4, 5, 6);
        void draw(Point P = Point(1, 2, 3));
        draw(); // construct P
      Temporaries:
        Point(1, 2, 3);
        const Point &P = Point(1, 2, 3);
        const int &x = Point(1, 2, 3).x;
        // determine in run-time
        const Point &P =
            lunarPhase() ? Point(1, 2, 3)
                         : Point(3, 2, 1);
      Captured values:
        // copy to capture
        Point P; [P]{ return P; }();
      Constructor initializers:
        struct Vector {
          Point P;
          Vector() : P(1, 2, 3) {}
        };
        struct Vector {
          Point P = Point(1, 2, 3);
        };
      Return values:
        Point getPoint() {
          return Point(1, 2, 3); // RVO
        }
        Point getPoint() {
          Point P(1, 2, 3); // NRVO
          return P;
        }
      Aggregates and brace initializers:
        Point P{1, 2, 3};
        PointPair PP{Point(1, 2),
                     Point(3, 4)};
        PointPairPair PPP{{{1, 2}, {3, 4}},
                          {{5, 6}, {7, 8}}};
        std::vector<Point> V{{1, 2, 3}};

