  1. Static (Software) Analysis
     Dagstuhl 16172: Machine Learning for Dynamic Software Analysis
     Reiner Hähnle
     Software Engineering Group, Department of Computer Science, Technische Universität Darmstadt
     http://www.se.tu-darmstadt.de/
     haehnle@cs.tu-darmstadt.de
     160425 | TUD CS SE | R. Hähnle | Static Analysis | Dagstuhl 16172 ML & SA | 0

  3. What Is Static Analysis (SA) of Software?
     Establish a property of a program at compile time, without executing it
     Some Facts
     ◮ Checking done by a tool, not a human
     ◮ Usually performed on source or assembler code
     ◮ Original motivation: compiler optimization
       ◮ Data flow analysis, e.g., used variables
       ◮ Control flow analysis, e.g., reachable code
     ◮ Current focus: software quality
       ◮ Security, e.g., confidentiality, vulnerability
       ◮ Compliance, e.g., MISRA-C, web service protocols
       ◮ Defects (bug finding), e.g., memory leaks, buffer overflows
       ◮ Code quality, e.g., metrics, “code smells”
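The data flow example on this slide (used variables) can be made concrete with a minimal sketch in Python, assuming the standard `ast` module as the parser. The function name `unused_variables` and the sample program are hypothetical. The check is deliberately flow-insensitive: it only compares the set of written names with the set of read names.

```python
import ast

def unused_variables(source):
    """Return names that are assigned somewhere in `source` but never read."""
    tree = ast.parse(source)  # parse only; the analyzed program is never executed
    assigned, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)   # name is written here
            elif isinstance(node.ctx, ast.Load):
                used.add(node.id)       # name is read here
    return assigned - used

# Hypothetical program under analysis: `y` is assigned but never read.
example = """
x = 1
y = 2
z = x + 1
print(z)
"""
print(unused_variables(example))
```

Because the order of statements is ignored, a variable that is read only before its first assignment would not be reported. A real data flow analysis would work on the control flow graph instead.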

  5. Static Analysis in the Narrow/Wide Sense
     Static Analysis in the Narrow Sense
     ◮ Check a fixed property
     ◮ Low polynomial decision complexity
     ◮ Value-insensitive abstraction, e.g., control flow graph
     Static Analysis in the Wide Sense (which is my sense)
     ◮ Complex properties, expressed in specification language
       ◮ security policy, interface protocol, functional property
     ◮ NP-hard or even undecidable
       ◮ heuristics optimize the “common case”, no guarantees
       ◮ interaction by human expert user
     ◮ Fully precise control flow and data model
     ◮ Often based on formal methods
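As an illustration of static analysis in the narrow sense, the following Python sketch checks one fixed property (which nodes are reachable from the entry) on a control flow graph in time linear in the graph size. The graph encoding as a dictionary, the node names, and the function name `reachable` are assumptions made for this example.

```python
from collections import deque

def reachable(cfg, entry):
    """Breadth-first search over a control flow graph given as {node: [successors]}."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        for succ in cfg.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen

# Hypothetical control flow graph; "dead" has no incoming edge from reachable code.
cfg = {
    "entry": ["cond"],
    "cond": ["then", "exit"],
    "then": ["exit"],
    "exit": [],
    "dead": ["exit"],
}
unreachable = set(cfg) - reachable(cfg, "entry")
print(unreachable)
```

The abstraction is value-insensitive: both successors of `cond` count as reachable even if the branch condition could never be true at run time. Deciding that would need the value-aware techniques of the wide sense.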

  7. Static Analysis Techniques
     Algorithms
     ◮ Graph-based program abstractions
       ◮ Control flow graph
       ◮ Program dependence graph
     Search
     ◮ Constraint Solving
       ◮ Recently popular: SAT/SMT solvers as backend
     ◮ Automata-based representations
     ◮ Model checking
     Inference
     ◮ Abstract Interpretation
     ◮ Symbolic Execution
     ◮ Deductive Verification
     ML has most potential in complex SA techniques: Search ⇒ Lookup
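Of the inference techniques listed here, abstract interpretation is the easiest to sketch in a few lines. The Python example below evaluates arithmetic expressions over a sign domain instead of concrete integers. The tuple encoding of expressions, the constant names, and the function `analyze` are assumptions chosen for this illustration, not part of any particular tool.

```python
# Abstract values of the sign domain; TOP means "sign unknown".
NEG, ZERO, POS, TOP = "-", "0", "+", "T"

def sign_of(n):
    """Abstract a concrete integer to its sign."""
    return NEG if n < 0 else ZERO if n == 0 else POS

def abs_add(a, b):
    """Abstract addition: sound, but loses precision for mixed signs."""
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    if a == b and a != TOP:
        return a
    return TOP

def abs_mul(a, b):
    """Abstract multiplication."""
    if ZERO in (a, b):
        return ZERO
    if TOP in (a, b):
        return TOP
    return POS if a == b else NEG

def analyze(expr, env):
    """expr is an int, a variable name, or a tuple (op, left, right) with op in {"+", "*"}."""
    if isinstance(expr, int):
        return sign_of(expr)
    if isinstance(expr, str):
        return env.get(expr, TOP)  # unknown variables may have any sign
    op, left, right = expr
    a, b = analyze(left, env), analyze(right, env)
    if op == "+":
        return abs_add(a, b)
    if op == "*":
        return abs_mul(a, b)
    raise ValueError(f"unsupported operator: {op}")

# x * x + 1
x_squared_plus_one = ("+", ("*", "x", "x"), 1)
print(analyze(x_squared_plus_one, {"x": NEG}))  # sign of x known to be negative
print(analyze(x_squared_plus_one, {}))          # sign of x unknown
```

With a negative `x` the analysis infers a positive result. With an unknown `x` it answers `TOP`, although `x * x + 1` is positive for every integer: a small instance of the incompleteness that leads to false positives.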

  8. Interlude
     A State-of-the-art Tool for Complex Static Analysis
     Image: By Source, Fair use, https://en.wikipedia.org/w/index.php?curid=20208543

  11. Challenges for SA
      Scaling
      ◮ Intra- vs. inter-procedural: compositionality difficult for complex SA
      ◮ Coverage/rapid evolution of industrial programming languages
      ◮ Hard to analyze language features: dynamic typing, reflection, HO closures
      Precision
      ◮ Incompleteness, false positives
      ◮ “Soundiness”, see B. Livshits et al., CACM 58(2) 44–46, Feb. 2015
      Modern computer architecture
      The deployment gap
      ◮ Multi-level caches, stale data
      ◮ Parallel computing: GPUs, weak memory models
      ◮ Cloud: provisioning bugs, resource-aware computing

  15. Current Trends
      More complex properties, often combine behavior and data
      ◮ Integration tasks (web interfaces, frameworks, APIs, ...)
      ◮ Security-related: information flow
      ◮ Evolution: regression, certification, ...
      Combine Static and Dynamic Analysis
      ◮ Concolic or dynamic symbolic execution
      ◮ Incomplete static analysis to speed up runtime monitoring
      Immersion in IDEs
      ◮ Use machine idle time while user deliberates
      ◮ Improved usability
      Resource Analysis
      Resource management and deployment become separate development phases

  16. Related Fields
      Static Analysis of Programs is not a Separate Discipline
      Cross-cutting with many other fields, distinction is blurry
      ◮ Type Theory
      ◮ Abstract Interpretation
      ◮ Model Checking
      ◮ Software Verification

  17. Program Analysis: The Two Worlds
      The Two Worlds Meet
      Glassbox
      ◮ Symbolic
      ◮ Heuristic
      ◮ Analysis
      ◮ Static Analyses
        ◮ Model checking
        ◮ Abstract interpretation
        ◮ Symbolic execution
        ◮ Deductive verification
      Blackbox
      ◮ Statistical
      ◮ Complete
      ◮ Synthesis
      ◮ Learning Techniques
        ◮ Extract Behavior from Traces
        ◮ Learning-based Software Testing
        ◮ Learning-based synthesis

  18. Program Analysis: The Two Worlds
      Advantages
      Glassbox
      ◮ Precise, rich modelling
      ◮ Executable/compilable target
      ◮ Can be scalable
      ◮ Certificates possible
      Blackbox
      ◮ Source code not needed
      ◮ Applicable to any system level
      ◮ Fully automatic
      ◮ Robust

  19. Program Analysis: The Two Worlds
      Disadvantages
      Glassbox
      ◮ Must have/generate source code
      ◮ Where do the specs come from?
      ◮ Some expert interaction necessary
      ◮ Evolution of target expensive
      ◮ Incompleteness, soundiness
      Blackbox
      ◮ Learned models very abstract
      ◮ How to map abstract to code level?
      ◮ No use of symbolic techniques
      ◮ Slow convergence, small coverage
      ◮ Doesn’t scale/not compositional

  22. Software Model ⇔ Executable Code
      Pivotal Issue: The Link between Models and Code
      Too tight: Need to hand-craft modelling abstractions
      Too loose: Unsatisfactory precision/coverage
      Increase elasticity of model-code link without sacrificing precision
      Potential Benefits
      ◮ Decrease dependency of glassbox on availability of specs, source code
      ◮ Integrate blackbox into precise/sound(y) reasoning framework
      ◮ Dramatically improve performance of both glass-/blackbox
