
Programming Language Elements for Correctness Proofs, Gergely Dévai



  1. Programming Language Elements for Correctness Proofs Gergely Dévai ELTE University, Budapest Department of Programming Languages and Compilers Supervisor: Dr. Zoltán Csörnyei

  2. Motivation ● A considerable part of software products' life cycle is testing and bug-fixing. ● Expectations concerning safe and secure operation of programs are increasing. ● Formal methods could help, but they are not yet efficient enough: their usage in industry is limited. ● Key problems: integration of formal methods and low efficiency of theorem provers.

  3. Possible solution ● A programming language where instead of instructions one writes formal specification and proof. ● The task of the compiler is to check the proof and – using its information – to generate code in a “traditional” target language. ● The generated program fulfils the requirements of the specification.

  4. External theorem provers
     [Diagram boxes: Program code in a “traditional” language; Specification; Representation in an external theorem prover; Discharging of proof obligations]
     ● Programming errors are discovered in the last phase of development. ● Changing the program code may invalidate parts of the proof.

  5. Annotating the source code
      public class QSort {
        /*@ requires A != null;
            ensures A.length == \old(A.length) &&
                    (\forall int k; k < A.length && k > 0; A[k] >= A[k-1]);
        @*/
        public void quickSort(/*@ non_null @*/ int[] A) {
          quicksort(A, 0, A.length - 1);
        }
        ...
     ● Few annotations only: checking cannot be (fully) automated, an external theorem prover is needed. ● More annotations in order to enable automated checking: redundant code.
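The `requires` and `ensures` clauses of the annotation can also be read as run-time checks. The following is a minimal Python sketch of that reading, not JML itself; the helper name `check_quicksort_contract` and the idea of passing the sorting routine as an argument are assumptions made for illustration.

```python
def check_quicksort_contract(sort_in_place, a):
    """Run sort_in_place(a) and check the slide's requires/ensures clauses.

    Hypothetical helper: it mirrors the annotation, it is not a JML tool.
    """
    # requires A != null
    assert a is not None, "precondition violated: A must not be null"
    old_length = len(a)  # plays the role of \old(A.length)

    sort_in_place(a)

    # ensures A.length == \old(A.length)
    assert len(a) == old_length, "postcondition violated: length changed"
    # ensures (\forall int k; k < A.length && k > 0; A[k] >= A[k-1])
    assert all(a[k] >= a[k - 1] for k in range(1, len(a))), \
        "postcondition violated: result is not ascending"
    return a
```

A check of this kind only reports violations on the inputs that are actually run. Note also that the clauses as written do not require the result to be a permutation of the input, so a routine that overwrites every element with the same value would pass.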

  6. Functional and logic programming ● Program code in these languages may be considered as “executable specification”.
      sum [] = 0
      sum (x : r) = x + sum r
     ● For some problems (e.g. sorting) either it does not reflect the “natural” specification, or it is extremely inefficient.
      insert_sort(List,Sorted) :- i_sort(List,[],Sorted).
      i_sort([],Acc,Acc).
      i_sort([H|T],Acc,Sorted) :- insert(H,Acc,NAcc), i_sort(T,NAcc,Sorted).
      insert(X,[Y|T],[Y|NT]) :- X>Y, insert(X,T,NT).
      insert(X,[Y|T],[X,Y|T]) :- X=<Y.
      insert(X,[],[X]).

      naive_sort(List,Sorted) :- perm(List,Sorted), sorted(Sorted).
      sorted([]).
      sorted([_]).
      sorted([X,Y|T]) :- X=<Y, sorted([Y|T]).
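The cost of running the “natural” specification directly can be seen in a small Python sketch of the `naive_sort` idea: enumerate permutations and return the first sorted one. The function names are chosen here for illustration; in the worst case the loop inspects n! candidates for a list of length n.

```python
from itertools import permutations


def is_sorted(xs):
    """True if every element is <= its successor."""
    return all(x <= y for x, y in zip(xs, xs[1:]))


def naive_sort(xs):
    """Executable reading of: perm(List,Sorted), sorted(Sorted)."""
    for candidate in permutations(xs):
        if is_sorted(candidate):
            return list(candidate)
```

The code states what a sorted result is and nothing about how to reach it efficiently, which is the trade-off the slide points at.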

  7. Correctness by construction ● The formal specification is refined towards an implementation. ● If the refinement steps are correct, the resulting program fulfils the requirements of the specification. ● Implementations (e.g. the B-method, SpecWare) also use external theorem provers.
      MACHINE First
      VARIABLES x
      INVARIANT x>0
      OPERATIONS ...
      END

      MACHINE Second
      REFINES First
      OPERATIONS ...
      END

  8. In the proposed solution... ● stepwise refinement is used to ensure early discovery of errors and to help in design decisions ● specification is abstract, implementation can be any (efficient) algorithm that solves the problem ● target-language code is generated automatically, the programmer writes the proof only ● construction of proofs (both using temporal and classical logic) is integrated, no external theorem provers are needed ● programming language elements (e.g. templates) are used to ease proof construction

  9. Current state of the project ● The compiler is implemented in C++ (>6000 lines of source code). ● There are already hundreds of test files. ● Simple but useful algorithms (sort, conditional maximum search) are implemented. ● A small “utility library” is constructed to ease reasoning about loops etc. ● Supported target languages: C++ (currently), NASM assembly (in a previous version)

  10. States of a program ● States of a program are described using first-order logic formulae using program variables and parameter variables. ● “The program starts and the outValue parameter denotes the value of the standard output stream.” ip = Start & out = outValue ● “The program terminates and the original value of the standard output stream is extended by the string « Hello! ». ” ip = Stop & out = outValue + "Hello!"
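One way to picture these formulae is as predicates over a program state. The Python sketch below assumes a state is a dictionary with the keys `ip` and `out`; that representation and the function names are illustrative assumptions, not part of the presented language.

```python
def start_state(out_value):
    """Predicate for: ip = Start & out = outValue."""
    return lambda state: state["ip"] == "Start" and state["out"] == out_value


def stop_state(out_value):
    """Predicate for: ip = Stop & out = outValue + "Hello!"."""
    return lambda state: (state["ip"] == "Stop"
                          and state["out"] == out_value + "Hello!")
```

Here `out_value` plays the role of the parameter variable `outValue`: it is fixed when the predicate is built, while `ip` and `out` are read from the state being examined.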

  11. “Hello World!” example
      ip = Start & out = outValue >> out = outValue + "Hello!" & ip = Stop;
     The left-hand side of >> is the precondition and the right-hand side is the postcondition. It is a “progress” property. ● There is no need to refine this specification. ● Tactics can “solve” it automatically.

  12. Tactics & templates ● Tactics are not built into the compiler, they can be implemented in the language using templates. ● Templates contain proof fragments (refinements) that can be reused and parametrised. ● Compile time conditions examine the actual parameters of a template call and make the templates more reusable. ● There are several types of templates: – to contain axioms of functions or instructions – to enable induction – to describe proof tactics

  13. Tactic – template example
      sequenceTactic( Boolean #pre, Boolean #post ) tactic
      {
        equals( #post, #a & #b ) :
        {
          #pre >> #a;
          #a >> #post;
        }
      }
     The template has 2 formal parameters of type Boolean. This template implements a tactic. Compile time condition: it is true iff the second argument can be matched with (#a & #b). Inside the block, parameters are changed according to the actual parameters and the result of the match.
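The behaviour of this template can be imitated in a few lines of Python. This is only a sketch under stated assumptions: formulae are plain strings, a conjunction is the tuple `("and", a, b)`, and the function returns `None` when the compile time condition fails. None of these representation choices come from the presented compiler.

```python
def sequence_tactic(pre, post):
    """Imitate sequenceTactic: split `pre >> post` into two steps.

    Returns a list of (precondition, postcondition) pairs, or None when
    `post` cannot be matched with (#a & #b).
    """
    # equals( #post, #a & #b ): try to match post against a conjunction
    if isinstance(post, tuple) and len(post) == 3 and post[0] == "and":
        _, a, _b = post
        # #pre >> #a;  #a >> #post;
        return [(pre, a), (a, post)]
    return None
```

Applied to the “Hello World!” specification, `#a` is bound to `out = outValue + "Hello!"`, so the first step establishes the output and the second step continues from there.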

  14. Call of a template ● The compiler calls the previous template with the pre- and postcondition of the specification as arguments.
      ip = Start & out = outValue >> out = outValue + "Hello!" & ip = Stop
      {
        sequenceTactic( ip = Start & out = outValue,
                        out = outValue + "Hello!" & ip = Stop );
      }
     The specification is now refined by the template call. This template was automatically called by the compiler as a tactic, but it is also possible to call a template explicitly.

  15. Refinement ● The template call is replaced by its definition (after evaluation of the compile-time conditions and change of the parameters).
      ip = Start & out = outValue >> out = outValue + "Hello!" & ip = Stop
      {
        ip = Start & out = outValue >> out = outValue + "Hello!";
        out = outValue + "Hello!" >> ip = Stop;
      }
     This is a “sequential” refinement consisting of two steps. The two steps are refined further automatically by tactics.

  16. Axioms ● The refinement steps form a proof-tree. Its root is the specification and the leaves are axioms. Axioms are placed in special templates.
      exit( Label #at ) atom
      {
        independent( $expr, ip ) : [ $expr ];
        ip = #at >> ip = Stop;
      }
     This template contains temporal axioms. The argument is the label where the instruction is to be placed by the code generator. Safety property: an expression is an invariant of the instruction if it is independent of the ip variable. Progress property: this instruction terminates the program.

  17. Code generation ● The refinement of our example specification is completed (automatically) by the calls of the following two “atoms”:
      write( "Hello!", Start, L0 );
      exit( L0 );
     ● The code generator uses this “intermediate code” to output the following C++ program:
      int main( int argc, char* argv[] )
      {
        Start: cout << "Hello!";
        L0: exit( 0 );
      }
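The step from atom calls to target code can be sketched as a small text generator. The Python below is an illustration only: it assumes each atom is a tuple, that the second argument of `write` is the label of the instruction and the third is the label that follows it (inferred from the example, not stated on the slide), and it does no escaping of the string literal.

```python
def generate_cpp(atoms):
    """Emit C++ text for a list of atom calls (hypothetical representation).

    Supported atoms:
      ("write", text, at, next_label)
      ("exit", at)
    """
    lines = ["int main( int argc, char* argv[] )", "{"]
    for atom in atoms:
        if atom[0] == "write":
            _, text, at, _next_label = atom
            lines.append(f'  {at}: cout << "{text}";')
        elif atom[0] == "exit":
            _, at = atom
            lines.append(f"  {at}: exit( 0 );")
        else:
            raise ValueError(f"unknown atom: {atom[0]}")
    lines.append("}")
    return "\n".join(lines)
```

Calling `generate_cpp([("write", "Hello!", "Start", "L0"), ("exit", "L0")])` reproduces the program text shown on the slide, with each label placed in front of its instruction.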

  18. Further language elements ● declarations (variables, parameters, types, operators, functions) ● “selectional” refinement (for case distinction, reasoning about “if”) ● templates that contain non-temporal logic axioms (like x=y & y=z => x=z) ● templates to generate code (for example code for expression evaluation) ● templates to implement induction (for loops and recursive procedures) ● templates that generate templates

  19. Future directions ● inclusion of more C++ instructions (e.g. methods of C++ STL) ● support for other target languages (Java, Ada, assembly...) ● further improvement of automatic proof generation ● parallel and concurrent programming ● specification statements concerning resources (memory, time...) ● “fuzzy” temporal statements – reasoning about randomised algorithms
