Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016
Automatically reason about programs
• Symbolic execution
• Program verification
• Program equivalence
• …
Automatically reasoning about programs requires …
        testq   %rdi, %rdi
        je      .L1
        xorq    %rax, %rax
.L0:    movq    %rdi, %rdx
        andq    $0x1, %rdx
        addq    %rdx, %rax
        shrq    $0x1, %rdi
        jne     .L0
        cltq
        retq
.L1:    xorq    %rax, %rax
        retq
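To make concrete what a listing like this computes, here is a minimal Python sketch that mirrors it instruction by instruction. The function name `count_set_bits` and the registers-as-variables encoding are illustrative assumptions, not part of the talk; only the control flow is taken from the assembly.

```python
MASK64 = (1 << 64) - 1

def count_set_bits(rdi: int) -> int:
    """Mirror the listing: count the 1 bits of a 64-bit value."""
    rdi &= MASK64
    if rdi == 0:                        # testq %rdi, %rdi ; je .L1
        return 0                        # .L1: xorq %rax, %rax ; retq
    rax = 0                             # xorq %rax, %rax
    while True:                         # .L0:
        rdx = rdi                       # movq %rdi, %rdx
        rdx &= 1                        # andq $0x1, %rdx
        rax = (rax + rdx) & MASK64      # addq %rdx, %rax
        rdi >>= 1                       # shrq $0x1, %rdi (sets ZF when the result is 0)
        if rdi == 0:                    # jne .L0 falls through once ZF is set
            break
    # cltq sign-extends eax into rax; the count is at most 64, so nothing changes here.
    return rax
```

Even for a loop this small, the hand translation has to get the width of every operation and the flag behaviour of `shrq` right, which is the kind of detail a formal instruction semantics has to pin down.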
addq $0x1, %rax    rax ← rax +₆₄ 1₆₄
  +₆₄: 64-bit bit-vector addition
  1₆₄: 64-bit constant
  rax (right-hand side): previous value of rax

addq $0x1, %rax    rax ← rax +₆₄ 1₆₄
addb $0x1, %al     al ← al +₈ 1₈

addq $0x1, %rax    rax ← rax +₆₄ 1₆₄
addb $0x1, %al     al ← al +₈ 1₈
Registers: rax (64 bits), eax (32 bits), ax (16 bits), ah (8 bits), al (8 bits)

addq $0x1, %rax    rax ← rax +₆₄ 1₆₄
addb $0x1, %al     al ← al +₈ 1₈
                   rax ← rax[63:8] ∘ (rax[7:0] +₈ 1₈)
Registers: rax (64 bits), eax (32 bits), ax (16 bits), ah (8 bits), al (8 bits)

addq $0x1, %rax    rax ← rax +₆₄ 1₆₄
addb $0x1, %al     rax ← rax[63:8] ∘ (rax[7:0] +₈ 1₈)
addw $0x1, %ax     rax ← rax[63:16] ∘ (rax[15:0] +₁₆ 1₁₆)
addl $0x1, %eax    rax ← rax[63:32] ∘ (rax[31:0] +₃₂ 1₃₂)

addq $0x1, %rax    rax ← rax +₆₄ 1₆₄
addb $0x1, %al     rax ← rax[63:8] ∘ (rax[7:0] +₈ 1₈)
addw $0x1, %ax     rax ← rax[63:16] ∘ (rax[15:0] +₁₆ 1₁₆)
addl $0x1, %eax    rax ← 0₃₂ ∘ (rax[31:0] +₃₂ 1₃₂)
addq $0x1, %rax    rax ← rax +₆₄ 1₆₄
addb $0x1, %al     rax ← rax[63:8] ∘ (rax[7:0] +₈ 1₈)
addw $0x1, %ax     rax ← rax[63:16] ∘ (rax[15:0] +₁₆ 1₁₆)
addl $0x1, %eax    rax ← 0₃₂ ∘ (rax[31:0] +₃₂ 1₃₂)
                   zf ← 0₃₂ = (eax +₃₂ 1₃₂)
                   cf ← ((0₁ ∘ eax) +₃₃ 1₃₃)[32:32]
                   sf ← (eax +₃₂ 1₃₂)[31:31]
                   of ← ¬eax[31:31] ∧ (eax +₃₂ 1₃₂)[31:31]
                   pf ← ¬((eax +₃₂ 1₃₂)[0:0] ⊕ (eax +₃₂ 1₃₂)[1:1] ⊕ (eax +₃₂ 1₃₂)[2:2] ⊕ (eax +₃₂ 1₃₂)[3:3] ⊕ (eax +₃₂ 1₃₂)[4:4] ⊕ (eax +₃₂ 1₃₂)[5:5] ⊕ (eax +₃₂ 1₃₂)[6:6] ⊕ (eax +₃₂ 1₃₂)[7:7])
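The sub-register rules are easy to get wrong, so a small executable model can help. The following is a sketch only, with registers modelled as Python integers; all function names are made up for illustration. The parity flag follows the Intel definition: set when the low byte of the result has an even number of 1 bits.

```python
MASK64 = (1 << 64) - 1

def addq_1(rax: int) -> int:
    return (rax + 1) & MASK64                     # full 64-bit addition

def addb_1_al(rax: int) -> int:
    low = ((rax & 0xFF) + 1) & 0xFF               # 8-bit addition on al
    return (rax & ~0xFF & MASK64) | low           # bits 63:8 are preserved

def addw_1_ax(rax: int) -> int:
    low = ((rax & 0xFFFF) + 1) & 0xFFFF           # 16-bit addition on ax
    return (rax & ~0xFFFF & MASK64) | low         # bits 63:16 are preserved

def addl_1_eax(rax: int) -> int:
    return ((rax & 0xFFFFFFFF) + 1) & 0xFFFFFFFF  # bits 63:32 are zeroed

def flags_addl_1(rax: int) -> dict:
    eax = rax & 0xFFFFFFFF
    wide = eax + 1                                # 33-bit sum, keeps the carry
    res = wide & 0xFFFFFFFF
    return {
        "zf": int(res == 0),
        "cf": (wide >> 32) & 1,
        "sf": (res >> 31) & 1,
        "of": int((eax >> 31) & 1 == 0 and (res >> 31) & 1 == 1),
        "pf": int(bin(res & 0xFF).count("1") % 2 == 0),
    }
```

For example, starting from `rax = 0x11223344FFFFFFFF`, the 8-bit and 16-bit variants leave the upper bits intact, while the 32-bit variant clears them.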
• Manual partial specifications
  – CompCert [CACM’09], BAP [CAV’11], BitBlaze [ICISS’08], CodeSurfer/x86 [ETAPS’05], McVeto [CAV’10], STOKE [ASPLOS’13], Jakstab [CAV’08], many others
• Taly/Godefroid [PLDI’12]
  – Automatically synthesize specification from templates
  – Only 534 instructions
Bit-vector formulas of input-output behavior
All instructions = base set (specify manually) + remaining instructions (learn specification automatically)
Instruction i → synthesize → Program p → combine base formulas → Formula φ
How do we synthesize programs?
Formal guarantee? i ≡ φ

Instruction i → synthesize → Program p → combine base formulas → Formula φ
How do we synthesize programs?
• Randomized search
• Guided by cost function
• Based on test-cases
• Using STOKE [ASPLOS’13]
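As a rough illustration of randomized search guided by a test-case cost function, here is a toy sketch. It is not STOKE: the five-operation 8-bit instruction set, the Hamming-distance cost, and the Metropolis-style acceptance rule are all assumptions chosen to keep the example small.

```python
import math
import random

# Hypothetical toy instruction set over 8-bit values.
OPS = {
    "inc": lambda v: (v + 1) & 0xFF,
    "dec": lambda v: (v - 1) & 0xFF,
    "shl": lambda v: (v << 1) & 0xFF,
    "not": lambda v: ~v & 0xFF,
    "nop": lambda v: v,
}

def run(program, x):
    for op in program:
        x = OPS[op](x)
    return x

def cost(program, target, tests):
    # Number of wrong output bits, summed over all test inputs.
    return sum(bin(run(program, x) ^ target(x)).count("1") for x in tests)

def search(target, tests, length=4, steps=2000, beta=1.0, seed=0):
    rng = random.Random(seed)
    names = sorted(OPS)
    current = ["nop"] * length
    cur_cost = cost(current, target, tests)
    best, best_cost = list(current), cur_cost
    for _ in range(steps):
        if best_cost == 0:
            break
        proposal = list(current)
        proposal[rng.randrange(length)] = rng.choice(names)   # mutate one slot
        new_cost = cost(proposal, target, tests)
        # Always accept improvements; accept regressions with small probability.
        if new_cost <= cur_cost or rng.random() < math.exp(-beta * (new_cost - cur_cost)):
            current, cur_cost = proposal, new_cost
            if cur_cost < best_cost:
                best, best_cost = list(current), cur_cost
    return best, best_cost

def negate(x):                       # the "instruction" to be explained
    return (-x) & 0xFF

TESTS = list(range(0, 256, 17))
best_program, best_cost = search(negate, TESTS)
```

A cost of zero only means the program agrees with the target on the test cases, which is why the resulting formula cannot yet be trusted without further checks.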
Instruction i → synthesize → Program p → combine base formulas → Formula φ
p ≡ φ
Formal guarantee? i ≡ φ

Instruction i → synthesize → Program p → combine base formulas → Formula φ
i ≡ p ≡ φ
Formal guarantee? i ≡ φ

Instruction i → synthesize → Program p → combine base formulas → Candidate formula φ
i ≡ p ≡ φ
Formal guarantee? i ≡ φ
Instruction i → synthesize → Program p → combine base formulas → Candidate formula φ
                             Program p′ → Candidate formula φ′
                             … → Candidate formula φ″
p′ ≡ p ?
  yes → increase confidence
  no → add counter-example, remove wrong program(s)
p′ ≡ p ?
• yes: increase confidence
• no: remove incorrect program(s)
• otherwise: no information about equivalence
p′ ≡ p ?
• yes: increase confidence
• no: remove incorrect program(s)
• otherwise: no information about equivalence → Equivalence class 1, Equivalence class 2
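The three outcomes can be sketched as a small bookkeeping routine. This is an illustrative model, not the tool's implementation: an exhaustive check over 8-bit inputs stands in for the solver-based equivalence check, programs are plain Python functions, and every name below is hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Program = Callable[[int], int]

def exhaustive_check(p: Program, q: Program) -> Tuple[str, Optional[int]]:
    """Stand-in for an equivalence checker on 8-bit inputs."""
    for x in range(256):
        if p(x) != q(x):
            return "differ", x
    return "equal", None

def add_candidate(instruction: Program, classes: List[List[Program]],
                  tests: List[int], candidate: Program,
                  check=exhaustive_check) -> str:
    # The instruction can only be executed, so candidates are first tested against it.
    if any(candidate(x) != instruction(x) for x in tests):
        return "rejected by tests"
    for cls in list(classes):
        if not cls:                       # emptied by an earlier counter-example
            continue
        verdict, cex = check(candidate, cls[0])
        if verdict == "equal":            # yes: increase confidence
            cls.append(candidate)
            return "equivalent"
        if verdict == "differ":           # no: learn a test, drop wrong programs
            tests.append(cex)
            for c in classes:
                c[:] = [p for p in c if p(cex) == instruction(cex)]
            classes[:] = [c for c in classes if c]
            if candidate(cex) != instruction(cex):
                return "counter-example"
        # any other verdict: no information, try the next class
    classes.append([candidate])           # could not be related to any class
    return "new class"
```

Passing a `check` that always answers "unknown" shows the third outcome: every surviving candidate ends up in its own equivalence class.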
Equivalence class 1, Equivalence class 2, Equivalence class 3
• Prefer programs whose formulas are
  – Precise (fewest uninterpreted functions)
  – Fast (fewest non-linear arithmetic operations)
  – Simple (fewest nodes)
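One way to read this preference is as a lexicographic ordering over three counts. The sketch below assumes exactly that ordering, in the order the criteria are listed; the `FormulaStats` record, its field names and the sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FormulaStats:
    name: str
    uninterpreted_functions: int   # precision
    nonlinear_ops: int             # solver speed
    nodes: int                     # size

def preference_key(stats: FormulaStats):
    # Assumed lexicographic order: precise first, then fast, then simple.
    return (stats.uninterpreted_functions, stats.nonlinear_ops, stats.nodes)

def pick_best(candidates):
    return min(candidates, key=preference_key)

candidates = [
    FormulaStats("a", uninterpreted_functions=1, nonlinear_ops=0, nodes=5),
    FormulaStats("b", uninterpreted_functions=0, nonlinear_ops=2, nodes=9),
    FormulaStats("c", uninterpreted_functions=0, nonlinear_ops=0, nodes=40),
]
best = pick_best(candidates)
```

Under this reading, a larger formula still wins over a smaller one if it avoids uninterpreted functions and non-linear arithmetic.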
Learn     addw %ax, %dx       dx ← dx +₁₆ ax
Rename    addw %cx, %bx       bx ← bx +₁₆ cx
          addw (%rsp), %dx    dx ← dx +₁₆ M[rsp]
          addw $0x5, %dx      dx ← dx +₁₆ 5₁₆

1. Learn formula for register-only instructions
2. Generalize formulas
   – To other types of operands
3. Check on test inputs
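The three steps can be illustrated with a formula that is parameterised over its operands. This is a simplified sketch under several assumptions: machine state is a dictionary, memory holds 16-bit words keyed by address, and `reference_addw` is a hand-written stand-in for executing the real instruction on test inputs.

```python
MASK16 = 0xFFFF

def learned_addw(dst: int, src: int) -> int:
    """Formula learned for the register-only form: dst ← dst +16 src."""
    return (dst + src) & MASK16

def read_operand(state, operand):
    kind, value = operand
    if kind == "reg":                      # register operand, e.g. ("reg", "ax")
        return state["regs"][value] & MASK16
    if kind == "imm":                      # immediate operand, e.g. ("imm", 5)
        return value & MASK16
    if kind == "mem":                      # memory operand addressed by a register
        return state["mem"][state["regs"][value]] & MASK16
    raise ValueError(f"unknown operand kind: {kind}")

def apply_formula(formula, state, src, dst_reg):
    regs = dict(state["regs"])
    regs[dst_reg] = formula(regs[dst_reg] & MASK16, read_operand(state, src))
    return {"regs": regs, "mem": dict(state["mem"])}

def reference_addw(state, src, dst_reg):
    # Stand-in for running the instruction itself.
    regs = dict(state["regs"])
    regs[dst_reg] = (regs[dst_reg] + read_operand(state, src)) % (1 << 16)
    return {"regs": regs, "mem": dict(state["mem"])}

def check_on_tests(formula, src, dst_reg, test_states):
    return all(apply_formula(formula, s, src, dst_reg) == reference_addw(s, src, dst_reg)
               for s in test_states)

state = {"regs": {"ax": 1, "bx": 3, "cx": 2, "dx": 0xFFFF, "rsp": 0x1000},
         "mem": {0x1000: 7}}
```

Renaming corresponds to calling `apply_formula` with different register names, and generalization to passing a `("mem", ...)` or `("imm", ...)` source instead of a register.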
shufps $0xb3, %xmm0, %xmm1
Problem: No corresponding register-only variant
Solution: Brute force a formula for every constant
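To see why one formula per constant is feasible, note that an 8-bit immediate has only 256 values. The sketch below builds one specialised lane-selection function per constant from the documented `shufps` behaviour; in the talk's setting each such formula would be learned rather than derived, so this table is only a stand-in. Registers are modelled as 4-element lists with lane 0 holding the least significant 32 bits.

```python
def specialise_shufps(imm8: int):
    """Return the lane shuffle that shufps performs for one fixed constant."""
    sel = [(imm8 >> (2 * k)) & 0b11 for k in range(4)]   # four 2-bit selectors

    def formula(src, dst):
        # The two low result lanes come from dst, the two high lanes from src.
        return [dst[sel[0]], dst[sel[1]], src[sel[2]], src[sel[3]]]

    return formula

# One entry per possible 8-bit constant.
FORMULAS = {imm8: specialise_shufps(imm8) for imm8 in range(256)}

xmm0 = ["s0", "s1", "s2", "s3"]   # source operand
xmm1 = ["d0", "d1", "d2", "d3"]   # destination operand
result = FORMULAS[0xB3](xmm0, xmm1)
```

For `0xb3` (binary `10 11 00 11`) the selectors are 3, 0, 3, 2 from the lowest lane up.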
• Base set (51 instructions)
  – Integer, bitwise and float operations
  – Data movement (including conditional move)
  – Conversion operations
• Pseudo instructions (11 templates)
  – Split and combine registers
  – Changing status flags

• Total instructions: 3,684
• Out-of-scope
  – System instructions (invpcid, jle): 302
  – Crypto instructions (aeskeygenassist): 35
  – Deprecated instructions (fadd): 332
  – String instructions (scasq): 97
• Goal instructions: 2,918

• Base set: 51
• Pseudo instructions: 11
• Register-only instructions learned: 692
• Generalized: 984
• 8-bit constant instructions learned: 119.42
• Total formulas learned: 1,795.42
Compare with handwritten formulas (from STOKE)
Available for comparison: 1,431.91
  Automatically proven equivalent: 1,377.91
  Equivalent with additional lemma: 4

Compare with handwritten formulas (from STOKE)
Available for comparison: 1,431.91
  Automatically proven equivalent: 1,377.91
  Equivalent with additional lemma: 4
    fadd(a, b) = fadd(b, a)

Compare with handwritten formulas (from STOKE)
Available for comparison: 1,431.91
  Automatically proven equivalent: 1,377.91
  Equivalent with additional lemma: 4
  Semantically different: 50
    Handwritten formula correct: 0
    Learned formula correct: 50
Stratum 0 (base set), Stratum 1, Stratum 2, Stratum 3
stratum(i) = 0                                    if i ∈ baseset
stratum(i) = 1 + max over i′ ∈ M(i) of stratum(i′)   otherwise

stratum(i) = 0                                    if i ∈ baseset
stratum(i) = 1 + max over i′ ∈ M(i) of stratum(i′)   otherwise
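The recursive definition translates directly into code. In this sketch, `uses` plays the role of the set the maximum ranges over, read here as the instructions appearing in the program learned for a given instruction; that reading, the helper name `make_stratum`, and the placeholder instruction names are assumptions.

```python
from functools import lru_cache

def make_stratum(base_set, uses):
    """base_set: instructions specified by hand.
    uses[i]: instructions occurring in the program learned for i."""

    @lru_cache(maxsize=None)
    def stratum(instr):
        if instr in base_set:
            return 0
        return 1 + max(stratum(dep) for dep in uses[instr])

    return stratum

# Hypothetical dependency structure, for illustration only.
base = frozenset({"base_add", "base_mov"})
uses = {
    "learned_a": ("base_add",),
    "learned_b": ("learned_a", "base_mov"),
    "learned_c": ("learned_b",),
}
stratum = make_stratum(base, uses)
```

An instruction's stratum is therefore one more than the deepest instruction its learned program depends on, so formulas learned early become building blocks for later ones.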
[Plot: number of formulas learned (0 to 800) against wall-clock time elapsed [hours] (0 to 250); series: stratification, without stratification]
[Plot comparing the number of nodes in the learned formula with the number of nodes in the handwritten formula. Fully inlined: 3526 instructions]
1. Automatically learned 1,795 formulas
2. Stratification key to scale program synthesis
3. Compare to hand-written specification
   – More correct, equally precise, same size
Source code, formulas, experimental results: https://github.com/StanfordPL/strata/

1. Missing base instructions
   Some integer and floating point operations are missing
2. Program synthesis limits
   Shortest known program is long and outside of reach, e.g., byte-vectorized operation
3. Cost function limitation
   For one bit of output, the cost function does not give enough signal
4. Crazy instructions
• Total decisions: 7,075
• Equivalent: 6,669 (94.26%)
• New equivalence class: 356 (5.03%)
• Counter-examples: 50 (0.71%)
• Timeouts (45 seconds): 3

• Intel Xeon E5-2697 (28 cores) at 2.6 GHz
  – 268.86 hours (register-only)
  – 159.12 hours (8-bit constants)
• Total of 11,983.37 core hours
• Random inputs (random machine state)
• “Interesting” bit-patterns: 0, 1, −1, 2ⁿ, NaN, Infinity
• Test cases learned from counter-examples
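A test-input generator along these lines might look as follows. This is a sketch restricted to single 64-bit values rather than full machine states; the function names are illustrative, and the floating-point specials are encoded as IEEE 754 double bit patterns, which is an assumption about how such values would be fed into a register.

```python
import random
import struct

MASK64 = (1 << 64) - 1

def double_bits(value: float) -> int:
    """Bit pattern of a Python float viewed as a 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]

def interesting_values():
    values = {0, 1, MASK64}                       # MASK64 is -1 in two's complement
    values.update(1 << n for n in range(64))      # powers of two
    values.add(double_bits(float("nan")))
    values.add(double_bits(float("inf")))
    return sorted(values)

def make_test_inputs(n_random, counterexamples=(), seed=0):
    rng = random.Random(seed)
    randoms = [rng.getrandbits(64) for _ in range(n_random)]
    return randoms + interesting_values() + list(counterexamples)

inputs = make_test_inputs(10, counterexamples=[42])
```

Counter-examples found during equivalence checking are appended at the end, so later candidates are tested on them as well.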
• Formulas are simplified
  – Constant propagation: 2₆₄ ⋅₆₄ 4₆₄ ≡ 8₆₄
  – Move bit-selection over concatenation: (0₆₄ ∘ rax)[63:0] ≡ rax
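Both rewrites can be demonstrated on a tiny expression tree. The tuple encoding below (`const`, `var`, `mul`, `concat`, `extract`) is a hypothetical representation made up for this sketch, not the tool's data structure.

```python
def width(e):
    tag = e[0]
    if tag == "const":                 # ("const", width, value)
        return e[1]
    if tag == "var":                   # ("var", name, width)
        return e[2]
    if tag == "mul":                   # ("mul", a, b)
        return width(e[1])
    if tag == "concat":                # ("concat", high_part, low_part)
        return width(e[1]) + width(e[2])
    if tag == "extract":               # ("extract", hi_bit, lo_bit, expr)
        return e[1] - e[2] + 1
    raise ValueError(tag)

def simplify(e):
    tag = e[0]
    if tag in ("const", "var"):
        return e
    if tag == "mul":
        a, b = simplify(e[1]), simplify(e[2])
        if a[0] == "const" and b[0] == "const":          # constant propagation
            w = a[1]
            return ("const", w, (a[2] * b[2]) & ((1 << w) - 1))
        return ("mul", a, b)
    if tag == "concat":
        return ("concat", simplify(e[1]), simplify(e[2]))
    if tag == "extract":
        hi, lo, inner = e[1], e[2], simplify(e[3])
        if inner[0] == "concat":                         # push selection inwards
            low_width = width(inner[2])
            if hi < low_width:                           # entirely in the low part
                return simplify(("extract", hi, lo, inner[2]))
            if lo >= low_width:                          # entirely in the high part
                return simplify(("extract", hi - low_width, lo - low_width, inner[1]))
        if lo == 0 and hi == width(inner) - 1:           # selecting every bit
            return inner
        return ("extract", hi, lo, inner)
    raise ValueError(tag)

RAX = ("var", "rax", 64)
```

Selections that straddle both halves of a concatenation are left untouched in this sketch.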
• Formula precision (number of uninterpreted functions)
  – Learned formulas equally precise in all but 4 cases
• Formula quality (number of non-linear operations)
  – Learned formulas contain same number of non-linear operations, except for 11 cases