Introduction | Simulation and Reasoning Framework | Code Proofs | Conclusion and Future Work

Simulation and Formal Verification of x86 Machine-Code Programs that Make System Calls

Shilpi Goel, Warren A. Hunt, Jr., Matt Kaufmann, Soumava Ghosh
The University of Texas at Austin
22nd October, 2014

Outline

1. Introduction
2. Simulation and Reasoning Framework
   ◮ x86 ISA Model
   ◮ System Calls Model
3. Code Proofs
4. Conclusion and Future Work

Motivation

Bug-hunting tools, like static analyzers, have matured remarkably.
◮ Regularly used in the software development industry
◮ Strengths: easy to use; largely automatic
◮ Weaknesses: cannot prove complex invariants; cannot prove the absence of bugs

We want to formally verify properties of (x86 machine-code) programs that cannot be established in the foreseeable future by automatic tools.

Our Approach

Focus: Mechanical verification of user-level x86 machine-code programs that request services from an operating system via system calls
◮ Specify the x86 ISA and Linux/FreeBSD system calls in the ACL2 programming/proof environment
◮ Validate the above specification against real hardware and software
◮ Reason about x86 machine-code programs using this specification

What's Special about System Calls?

◮ From the point of view of a programmer, system calls are non-deterministic; different runs can yield different results on the same machine.
◮ This makes it non-trivial to reason about user-level programs that make system calls.

Proved functional correctness of a word count program

Correctness of the Word Count Program

Assembly Program Snippet

    ...
    push   %rbx
    lea    -0x9(%rbp),%rax
    mov    %rax,-0x20(%rbp)
    mov    $0x0,%rax
    xor    %rdi,%rdi
    mov    -0x20(%rbp),%rsi
    mov    $0x1,%rdx
    syscall
    mov    %eax,%ebx
    mov    %ebx,-0x10(%rbp)
    movzbl -0x9(%rbp),%eax
    movzbl %al,%eax
    ...

Pseudo-code: Specification Function

    ncSpec (offset, str, count):
      if (EOF-TERMINATED(str) && offset < len(str)) then
        c := str[offset]
        if (c == EOF) then
          return count
        else
          count := (count + 1) mod 2^32
          ncSpec (1 + offset, str, count)
        endif
      endif

Theorem

    preconditions(rip_i, x86_i) ∧ x86_f = x86-run(clk(x86_i), x86_i)
      ⇒ getNc(x86_f) = ncSpec (Offset(x86_i), Str(x86_i), 0)

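The specification function above can be tried out as a small executable sketch. The version below is in Python, not the ACL2 used in this work, and it makes a few assumptions that are not on the slide: the string is modeled as a list of integers, `EOF` is a hypothetical sentinel value, the recursion is written as a loop, and `None` is returned when the guard fails (the pseudo-code leaves that case unspecified).

```python
EOF = -1  # hypothetical sentinel standing in for the end-of-file marker


def eof_terminated(s):
    """True when the sequence is non-empty and ends with the EOF sentinel."""
    return len(s) > 0 and s[-1] == EOF


def nc_spec(offset, s, count):
    """Count characters from `offset` up to the first EOF, modulo 2^32.

    Returns None when the guard of the pseudo-code fails; that choice is
    an assumption of this sketch, not something stated on the slide.
    """
    # The slide's function is recursive; a loop computes the same value
    # without running into Python's recursion limit on long inputs.
    while eof_terminated(s) and offset < len(s):
        if s[offset] == EOF:
            return count
        count = (count + 1) % 2**32
        offset += 1
    return None


text = [ord(ch) for ch in "hello"] + [EOF]
print(nc_spec(0, text, 0))  # 5 characters before EOF
```

The modulus mirrors the 32-bit counter the machine code keeps, so the specification and the program agree even when the count wraps around.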
x86 ISA + System Calls Specification

◮ Formalization of the x86 ISA, with syscall extended by a specification of Linux and FreeBSD system calls
◮ Formal and executable specification
◮ Memory model: 64-bit linear address space

x86 ISA Model in ACL2

◮ Interpreter-style operational semantics
◮ Semantics of a program is given by the effect it has on the state of the machine.
◮ State-transition function is characterized by a recursively defined interpreter. We call this state transition function x86-run.

Formalization: x86 State

Component    Description
registers    general-purpose, segment, debug, control, floating point, MMX, model-specific
rip          instruction pointer
flg          flags register
env          environment field
mem          memory

Formalization: State Transition Function

◮ State transition function: fetch, decode & execute
◮ Each instruction has its own semantic function

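The interpreter style described above can be illustrated with a toy machine. The sketch below is in Python, not ACL2, and is not the x86 model itself: the three-instruction set, the `State` fields, and all function names are hypothetical. It only shows the shape of the idea: a step function that fetches, decodes, and dispatches to one semantic function per instruction, and a run function defined by recursion on a step count.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Toy machine state, far smaller than the x86 state in the table."""
    rip: int = 0                              # instruction pointer
    regs: dict = field(default_factory=dict)  # register name -> value
    mem: dict = field(default_factory=dict)   # address -> instruction tuple
    halted: bool = False


# One semantic function per instruction. For brevity they update the
# state in place and return it.
def exec_mov(st, dst, imm):
    st.regs[dst] = imm
    st.rip += 1
    return st


def exec_add(st, dst, src):
    st.regs[dst] = (st.regs.get(dst, 0) + st.regs.get(src, 0)) % 2**64
    st.rip += 1
    return st


def exec_hlt(st):
    st.halted = True
    return st


SEMANTICS = {"mov": exec_mov, "add": exec_add, "hlt": exec_hlt}


def step(st):
    """Fetch, decode, and execute a single instruction."""
    instr = st.mem.get(st.rip)               # fetch
    if instr is None:                        # nothing mapped at rip: stop
        st.halted = True
        return st
    opcode, *operands = instr                # decode
    return SEMANTICS[opcode](st, *operands)  # execute


def run(n, st):
    """Run for at most n steps, defined by recursion on the step count."""
    if n == 0 or st.halted:
        return st
    return run(n - 1, step(st))


program = {
    0: ("mov", "rax", 2),
    1: ("mov", "rbx", 40),
    2: ("add", "rax", "rbx"),
    3: ("hlt",),
}
final = run(10, State(mem=program))
print(final.regs["rax"])  # 42
```

Bounding the run by a step count keeps the interpreter total, which is the role the clock argument plays in a theorem of the form `x86-run(clk(x86_i), x86_i)`.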
Factsheet: x86 ISA Model

◮ 64-bit mode of Intel's IA-32e mode
◮ 221 general and 96 SSE/SSE2 opcodes
◮ Implementation of all addressing modes
◮ Lines of Code: ~40,000
◮ Execution speed: up to 3.3 million instructions/second
  Machine used: 3.50GHz Intel Xeon E31280 CPU

Assessing the Accuracy of the ISA Model

System Calls Model: Extending syscall

[Figure: System calls in the real world]
[Figure: System calls in our x86 model]

Benefits of the System Call Model

◮ Useful for verifying application programs while assuming that services like I/O operations are provided reliably by the OS

We check such assumptions during co-simulations.

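One way to picture a co-simulation check is to run a modeled system call and the real one side by side and compare their results. The Python sketch below does that for `read`. It is an illustration under stated assumptions, not the model from this work: `model_read`, the shape of the `env` dictionary (file descriptor mapped to contents and position), and `cosimulate_read` are all hypothetical. The only real API calls are `os.read` and the standard `tempfile` module.

```python
import os
import tempfile


def model_read(env, fd, nbytes):
    """Hypothetical model of read: a pure function of an explicit environment.

    env maps a file descriptor to (contents, position). Because the result
    depends only on env, repeated runs give the same answer.
    """
    contents, pos = env[fd]
    data = contents[pos:pos + nbytes]
    new_env = dict(env)
    new_env[fd] = (contents, pos + len(data))
    return data, new_env


def cosimulate_read(payload, nbytes):
    """Compare the model against the real OS, one read at a time."""
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        f.seek(0)
        fd = f.fileno()
        env = {fd: (payload, 0)}
        while True:
            real = os.read(fd, nbytes)                  # real system call
            modeled, env = model_read(env, fd, nbytes)  # modeled call
            if real != modeled:
                return False
            if not real:  # both sides report end of file
                return True


print(cosimulate_read(b"hello world\n", 1))
```

A mismatch would indicate that the assumption about the OS service, or the model of it, does not hold for that input.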