Smart programming languages, smart program analysis. Varmo Vene


  1. Smart programming languages, smart program analysis. Varmo Vene, Institute of Cybernetics at TUT & University of Tartu.

  2. Introduction: A quote from the classics. "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?" (Brian Kernighan, P.J. Plauger, "The Elements of Programming Style", 2nd ed., 1978.) 30 years later, ... we still often spend more time on debugging and testing than on actual programming; despite that, the software we use and/or develop has bugs (sometimes quite serious ones).

  3. Introduction: Possible reasons. Human imperfection: "To err is human, to forgive divine." (Alexander Pope, 1688–1744). Laws of nature: "Program testing can be used to show the presence of bugs, but never to show their absence!" (Edsger Dijkstra, 1970). Imperfection of tools: "The most effective debugging tool is still careful thought, coupled with judiciously placed print statements." (Brian Kernighan, 1979).

  4. Introduction: A goal of Semantics (among others). To develop programming tools that give strong guarantees about properties of programs, e.g. guarantee the absence of certain kinds of errors. Proactive tools, e.g. program extraction. Preventive tools, e.g. programming languages with powerful type systems. Retroactive tools, e.g. static program analyzers.

  5. Outline: Total Functional Programming (inductive types, comonadic recursion, recursive coalgebras, Mendler-style recursion); Goblint (path-sensitivity, concurrent analysis); Working Group and Plans.

  6. Total Functional Programming. In the total functional programming paradigm, all programs terminate. In particular, there is no general recursion. Instead, only restricted forms of recursion that are guaranteed to terminate are allowed. Usually, these are simple iteration or primitive recursion over inductive types. Sometimes corecursive definitions of coinductive types are also allowed. Although the paradigm is not Turing complete, most interesting programs are in principle expressible in it.

  7. Total Functional Programming: Inductive Types and Iteration. Categorically, inductive types (such as natural numbers, lists, trees, etc.) are initial algebras of endofunctors. The most basic form of recursion (known as iteration or fold) corresponds to the unique homomorphism property of initial algebras: given the initial algebra $\mathsf{in}_F : F\,\mu F \to \mu F$, for every algebra $\varphi : F A \to A$ there exists a unique $f = \mathsf{fold}(\varphi) : \mu F \to A$ such that $f \circ \mathsf{in}_F = \varphi \circ F f$. By duality, coinductive types (streams and various other infinite and potentially infinite structures) are terminal coalgebras, and the basic form of corecursion (known as coiteration) arises from the unique homomorphism property.
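
As a small worked instance of the fold equation (the choice of functor and algebra here is an illustrative assumption, not taken from the slides), take $F X = 1 + X$, whose initial algebra is the natural numbers. Writing an algebra as a pair of a constant $c$ and a step function $h$, the equation $f \circ \mathsf{in}_F = \varphi \circ F f$ unfolds to ordinary iteration:

```latex
\begin{align*}
F X &= 1 + X, \qquad \mu F = \mathbb{N}, \qquad
  \mathsf{in}_F = [\mathsf{zero}, \mathsf{succ}] : 1 + \mathbb{N} \to \mathbb{N} \\
\varphi &= [c, h] : 1 + A \to A \qquad (c \in A,\ h : A \to A) \\
\mathsf{fold}(\varphi)(\mathsf{zero}) &= c \\
\mathsf{fold}(\varphi)(\mathsf{succ}\,n) &= h(\mathsf{fold}(\varphi)(n))
\end{align*}
```

For example, choosing $A = \mathbb{N}$, $c = m$ and $h = \mathsf{succ}$ gives $\mathsf{fold}(\varphi)(n) = m + n$. Termination is immediate because the recursive call is always on the predecessor.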

  8. Total Functional Programming: Comonadic Recursion. In a series of papers (Uustalu & Vene, 1996–98) we introduced several new (co)recursion schemes capturing primitive corecursion, course-of-value (co)recursion, etc. All of them shared strong similarities but differed in concrete details. In (Uustalu & Vene & Pardo, 2000) we established a generic many-in-one recursion scheme parametrized by a recursive call pattern represented by a comonad with a distributive law. The new scheme covers most of the previously known recursion schemes as instances.

  9. Total Functional Programming: Recursive Coalgebras. The algebra structure $\mathsf{in}_F : F\,\mu F \to \mu F$ is an isomorphism. In fact, the essential properties of a recursion scheme depend more on its inverse, a coalgebra! In (Capretta & Uustalu & Vene, 2004) we defined the notion of recursive coalgebras: a coalgebra $\alpha : A \to F A$ is recursive if for every algebra $\varphi : F B \to B$ there exists a unique $f : A \to B$ such that $f = \varphi \circ F f \circ \alpha$. The notion generalizes well-founded recursion and has its origin in (Osius, 1970). We identified a number of ways of constructing recursive coalgebras and generalized comonadic recursion to this setting.
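
To make the defining equation $f = \varphi \circ F f \circ \alpha$ concrete, here is a hedged worked example (the functor, coalgebra and algebra are illustrative choices, not taken from the slides). The coalgebra $\alpha$ splits a number into itself and its predecessor, and the algebra $\varphi$ multiplies the pieces back together:

```latex
\begin{align*}
F X &= 1 + \mathbb{N} \times X \\
\alpha &: \mathbb{N} \to 1 + \mathbb{N} \times \mathbb{N}, &
  \alpha(0) &= \mathsf{inl}(\ast), &
  \alpha(n+1) &= \mathsf{inr}(n+1,\ n) \\
\varphi &: 1 + \mathbb{N} \times \mathbb{N} \to \mathbb{N}, &
  \varphi(\mathsf{inl}(\ast)) &= 1, &
  \varphi(\mathsf{inr}(k,\ r)) &= k \cdot r \\
f &= \varphi \circ F f \circ \alpha
  \quad\Longrightarrow\quad &
  f(0) &= 1, &
  f(n+1) &= (n+1) \cdot f(n)
\end{align*}
```

The unique solution is the factorial function. Intuitively, $\alpha$ is recursive because every unfolding step passes a strictly smaller argument to the recursive call, which is the well-founded-recursion reading of the definition.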

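  10. Total Functional Programming: Mendler-style recursion. Programming with recursors defined by properties such as initiality, comonadic recursion, etc. is cumbersome. E.g. functions defined by iteration must have the following form: for every $\varphi : F A \to A$ there exists a unique $f : \mu F \to A$ such that $f \circ \mathsf{in}_F = \varphi \circ F f$, where $\mathsf{in}_F : F\,\mu F \to \mu F$.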

  11. Total Functional Programming: Mendler-style recursion. Programming with recursors defined by properties such as initiality, comonadic recursion, etc. is cumbersome. In (Uustalu & Vene, 1996, 2000, 2002) we considered an alternative form: for every $\Phi$ there exists a unique $f : \mu F \to A$ such that $f \circ \mathsf{in}_F = \Phi(f)$, where $\Phi : \forall X.\ (X \to A) \to (F X \to A)$. The idea originates from (Mendler, 1987) and extends to other recursion schemes.
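
A hedged worked example of the Mendler-style equation $f \circ \mathsf{in}_F = \Phi(f)$, again for the illustrative functor $F X = 1 + X$ (an assumption made here, not an example from the slides). The argument $r$ of $\Phi$ stands for the recursive call:

```latex
\begin{align*}
\Phi &: \forall X.\ (X \to \mathbb{N}) \to (1 + X \to \mathbb{N}) \\
\Phi(r)(\mathsf{inl}(\ast)) &= m \\
\Phi(r)(\mathsf{inr}(x)) &= \mathsf{succ}(r(x)) \\[4pt]
f \circ \mathsf{in}_F = \Phi(f)
  \quad&\Longrightarrow\quad
  f(\mathsf{zero}) = m, \qquad f(\mathsf{succ}\,n) = \mathsf{succ}(f(n))
\end{align*}
```

So $f(n) = m + n$, written almost as one would write it with general recursion. Because $\Phi$ is polymorphic in $X$, its body has no way to build a value of type $X$ other than the $x$ it was handed, so $r$ can only be applied to the immediate subterm.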

  12. Total Functional Programming: Mendler-style recursion. The scheme looks quite similar to general recursion, hence is (hopefully) more intuitive. But termination is still guaranteed, i.e. we have termination checking by type-checking.

  13. Total Functional Programming: Mendler-style recursion. The scheme looks quite similar to general recursion, hence is (hopefully) more intuitive. But termination is still guaranteed, i.e. we have termination checking by type-checking. Ongoing and further work: corecursive algebras (with V. Capretta); Mendler-style vs. circular proofs (with R. Cockett); ... To make Total FP fly!

  14. Where are we? Total Functional Programming (inductive types, comonadic recursion, recursive coalgebras, Mendler-style recursion); Goblint (path-sensitivity, concurrent analysis); Working Group and Plans.

  15. Goblint: What is Goblint? Goblint is a static analyzer for POSIX-threaded C. It focuses on detecting multiple-access data races; integrates with the Eclipse C development environment; aims to be sound (i.e. must detect all errors, but may give false alarms); aims to be efficient enough to analyze medium-to-large scale programs (≥ 100 kLOC); aims to be precise enough to analyze medium-to-large scale programs (≥ 100 kLOC). (Vojdani & Vene, 2007)

  16. Goblint: Main conflicts. Soundness vs. C; Efficiency vs. Precision.

  17. Goblint: Main conflicts. Soundness vs. C; Efficiency vs. Precision. Soundness vs. C: restrict to the "safe" subset of C: no setjmp and longjmp; no dynamic data structures; no recursion; ...

  18. Goblint: Main conflicts. Soundness vs. C; Efficiency vs. Precision. Soundness vs. C: restrict to the "safe" subset of C: no setjmp and longjmp; no dynamic data structures; no recursion; ... Not as bad as it looks: we can still handle these constructs, but do not guarantee soundness.

  19. Goblint: Main conflicts. Soundness vs. C; Efficiency vs. Precision. Efficiency vs. Precision: we adopt normal data flow analysis techniques, but use a functional approach to distinguish calling contexts; use dynamically adjustable path-sensitive analysis; use global-invariant-based concurrent analysis.

  20. Goblint: Path-sensitivity. man gcc on "-Wuninitialized": These warnings are made optional because GCC is not smart enough to see all the reasons why the code might be correct despite appearing to have an error ... Here is another common case: int save_y; if (change_y) save_y = y, y = new_y; ... if (change_y) y = save_y; This has no bug because "save_y" is used only if it is set.

  21. Goblint: Path-sensitivity. Example: int save_y; if (change_y) save_y = y, y = new_y; ... if (change_y) y = save_y; What is the problem? There are 4 potential execution paths, but only 2 are logically possible. We need to distinguish execution paths. In general, there is an infinite number of paths!

  22. Goblint: Path-sensitivity. Example: int save_y; if (change_y) save_y = y, y = new_y; ... if (change_y) y = save_y; Our solution: we only track the paths that are relevant to the analysis result. In this example, paths are relevant when their sets of uninitialized variables differ. In general, relevance depends on the user analysis ...

  23. Goblint: Concurrent Analysis. State explosion: precise concurrent analysis leads to state explosion. E.g. if there are two threads with 10 instructions each, then there are 184756 possible interleavings! Global-invariant-based concurrent analysis: separate shared (i.e. global) and local variables. Compute a single invariant for the global state; essentially, join all possible values at all program points. Now all threads can be analyzed sequentially. Very imprecise for the base domain, but works well with user domains like lock-sets. (Seidl & Vene & Müller-Olm, 2003).

  24. Goblint: Ongoing and further work. Equality analysis of addresses (with H. Seidl); scalability improvements; adding new analyses (e.g. variable initialization, open-use-close analysis, etc.); better handling of external functions; ... Additional information: Goblint has an open-source license. You can download it from the web: http://goblin.at.mt.ut.ee/goblint/tracker/

  25. Working Group and Plans: Programming Languages and Systems at EXCS. Senior staff: Keiko Nakata (IOC), Jaan Penjam (IOC), Härmel Nestra (UT), Tarmo Uustalu (IOC), Hellis Tamm (IOC), Varmo Vene (IOC/UT). PhD students: Ando Saabas (IOC), Vesal Vojdani (UT), Jevgeni Kabanov (UT), Andres Toom (IOC), Aivar Annamaa (UT), Martin Pettai (UT). Best friend: Peeter Laud (CybAS).
