probabilistic assertions

Probabilistic Assertions. Adrian Sampson, University of Washington.

Expressing and Verifying Probabilistic Assertions. Adrian Sampson (University of Washington), Pavel Panchekha (University of Washington), Todd Mytkowicz (Microsoft Research), Kathryn S. McKinley (Microsoft Research), Dan Grossman (University of Washington).


  1. Expressing and Verifying Probabilistic Assertions. Adrian Sampson (University of Washington), Pavel Panchekha (University of Washington), Todd Mytkowicz (Microsoft Research), Kathryn S. McKinley (Microsoft Research), Dan Grossman (University of Washington), Luis Ceze (University of Washington). PLDI 2014.

  2. Probabilistic assertions express correctness properties in modern software. Our verifier checks them efficiently and accurately.

  3. assert file != NULL. Uses of an assertion: test, check, verify.

  4. assert e (here, assert file != NULL): e must hold on every execution.

  5. ≈ Approximate computing: "this approximate image is close to its precise version"; "k-means clustering is likely to converge even on unreliable hardware". Mobile and sensing: "sensor error does not render the app’s conclusions useless". Obfuscation for data privacy: "obfuscated data is still useful in aggregate". Each is a candidate for assert e.

  6. Traditional assertions (assert e) are insufficient for programs with probabilistic behavior, such as the approximate computing, mobile and sensing, and data-privacy obfuscation examples on slide 5.

  7. Assertions are insufficient for private-data obfuscation:
       true_avg = average(salaries)
       private_avg = average(obfuscate(salaries))
       assert true_avg - private_avg <= 10,000

  8. Assertions are insufficient for private-data obfuscation (same code as slide 7, with obfuscate highlighted and the annotation "probability distribution").

  9. Assertion: assert e

  10. Probabilistic assertion: passert e, p, c

  11. Probabilistic assertion: passert e, p, c. The expression e must hold with probability p at confidence c.

  12. Probabilistic assertion: passert e, p, c. Test? Check? Verify?

  13. How to verify a probabilistic assertion? A probabilistic program:
        float obfuscated(float n) {
          return n + gaussian(0.0, 1000.0);
        }
        float average_salary(float* salaries) {
          total = 0.0;
          for (int i = 0; i < COUNT; ++i)
            total += obfuscated(salaries[i]);
          avg = total / len(salaries);
          p_avg = ...;
          passert e, p, c
        }

  14. How to verify a probabilistic assertion naively? (Same probabilistic program as slide 13: obfuscated and average_salary, ending in passert e, p, c.)
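
The naive approach on slide 14 can be sketched as a stress test: run the probabilistic program many times and count how often the asserted condition holds. This is an illustration under stated assumptions, not the authors' tool. The names `uniform01`, `average_error`, and `stress_test` are hypothetical, `gaussian` is approximated by a sum of 12 uniform draws, its second argument is treated as a standard deviation, and the condition from slide 7 is read as |true_avg - private_avg| <= bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reproducible uniform draw in [0, 1) from a 64-bit LCG (hypothetical helper). */
static uint64_t rng_state = 42;
double uniform01(void) {
    rng_state = rng_state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double)(rng_state >> 11) / 9007199254740992.0; /* divide by 2^53 */
}

/* Approximate Gaussian draw: 12 summed uniforms have mean 6 and variance 1.
   Assumption: the second argument is a standard deviation. */
double gaussian(double mean, double stddev) {
    double s = 0.0;
    for (int i = 0; i < 12; ++i) s += uniform01();
    return mean + stddev * (s - 6.0);
}

double obfuscated(double n) { return n + gaussian(0.0, 1000.0); }

/* One run of the program: returns true_avg - private_avg. */
double average_error(const double *salaries, size_t count) {
    double total = 0.0, private_total = 0.0;
    for (size_t i = 0; i < count; ++i) {
        total += salaries[i];
        private_total += obfuscated(salaries[i]);
    }
    return total / (double)count - private_total / (double)count;
}

/* Naive check: fraction of `trials` runs in which |error| <= bound. */
double stress_test(const double *salaries, size_t count, double bound, int trials) {
    assert(salaries != NULL && count > 0 && trials > 0);
    int ok = 0;
    for (int t = 0; t < trials; ++t) {
        double err = average_error(salaries, count);
        if (err <= bound && err >= -bound) ++ok;
    }
    return (double)ok / trials;
}
```

A caller would compare the returned fraction with p. The sketch does not account for the confidence c, and every trial re-executes the whole program.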

  15. How to verify a probabilistic assertion with statistical reasoning? Queries & inference for statistical models: Church, Infer.NET, [Sankaranarayanan+ PLDI 2013], [Hur+ PLDI 2014], ⋮ passert for probabilistic software: ?

  16. How to verify a probabilistic assertion efficiently and accurately. Pipeline, applied to the obfuscated / average_salary program from slide 13: distribution extraction via symbolic execution → Bayesian network IR → statistical optimizations → verification → ✓.

  17. How to verify a probabilistic assertion efficiently and accurately. Same pipeline as slide 16 (distribution extraction via symbolic execution → Bayesian network IR → statistical optimizations → verification → ✓), with an implementation for LLVM & Clang.


  19. Distribution extraction: random draws are symbolic. Symbolic heap before: a ↦ 4.2. Statement: b = a + gaussian(0.0, 1.0). Symbolic heap after: a ↦ 4.2; b ↦ 4.2 + G(0,1).

  20. Concrete vs. symbolic semantics. Concrete execution (nondeterministic): program + input → outputs.

  21. Concrete vs. symbolic semantics. Concrete execution (nondeterministic): program + input → outputs. Symbolic execution (deterministic) followed by sampling (nondeterministic): program + input → outputs.

  22. Program: input: a = 4.2; b = gaussian(0.0, 1.0). Network: a ↦ 4.2; b ↦ G(0,1).

  23. Program: input: a = 4.2; b = gaussian(0.0, 1.0); c = a + b. Network: a ↦ 4.2; b ↦ G(0,1); c ↦ a + b.

  24. Program: input: a = 4.2; b = gaussian(0.0, 1.0); c = a + b; d = c + b. Network: a ↦ 4.2; b ↦ G(0,1); c ↦ a + b; d ↦ c + b.

  25. Program: input: a = 4.2; b = gaussian(0.0, 1.0); c = a + b; d = c + b. Network (redrawn): a ↦ 4.2; b ↦ G(0,1); c ↦ a + b; d ↦ c + b.

  26. Program:
        input: a = 4.2
        b = gaussian(0.0, 1.0)
        c = a + b
        d = c + b
        if b > 0.5
          e = 2.0
        else
          e = 4.0
      Network: a ↦ 4.2; b ↦ G(0,1); c ↦ a + b; d ↦ c + b; e ↦ if (b > 0.5) then 2.0 else 4.0.

  27. Program:
        input: a = 4.2
        b = gaussian(0.0, 1.0)
        c = a + b
        d = c + b
        if b > 0.5
          e = 2.0
        else
          e = 4.0
        passert e <= 3.0, 0.9, 0.9
      Network: a ↦ 4.2; b ↦ G(0,1); c ↦ a + b; d ↦ c + b; e ↦ if (b > 0.5) then 2.0 else 4.0; root: e ≤ 3.0.

  28. Same program as slide 27 (ending in passert e <= 3.0, 0.9, 0.9). The network is shown without variable names; only expression nodes remain: 4.2, G(0,1), +, +, >, 0.5, if/then/else, 2.0, 4.0, ≤, 3.0.

  29. Same program and network as slide 28, with the concrete input (input: a = 4.2) shown next to an input distribution: input: a = unif(2.0, 9.0).
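
The example built up on slides 22 to 28 can be sketched as a small expression DAG that is sampled one pass at a time. This is only an illustration: the `Node` layout, the constructor helpers, and the memoized `eval` are hypothetical and not taken from the authors' implementation, and `gaussian` is approximated by a sum of 12 uniform draws with its second argument treated as a standard deviation. Each node is stamped with the current pass so that the shared draw b is taken once per pass, even though several nodes refer to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reproducible uniform draw in [0, 1) from a 64-bit LCG (hypothetical helper). */
static uint64_t rng_state = 1;
double uniform01(void) {
    rng_state = rng_state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double)(rng_state >> 11) / 9007199254740992.0; /* divide by 2^53 */
}

/* Approximate Gaussian draw: 12 summed uniforms have mean 6 and variance 1. */
double gaussian(double mean, double stddev) {
    double s = 0.0;
    for (int i = 0; i < 12; ++i) s += uniform01();
    return mean + stddev * (s - 6.0);
}

typedef enum { N_CONST, N_GAUSS, N_ADD, N_GT, N_LE, N_ITE } Kind;

typedef struct Node {
    Kind kind;
    double p0, p1;        /* CONST: p0 = value; GAUSS: p0 = mean, p1 = stddev */
    struct Node *in[3];   /* operands; for ITE: condition, then, else */
    unsigned long stamp;  /* pass in which `cached` was computed (0 = never) */
    double cached;
} Node;

Node constant(double v) { Node n = { .kind = N_CONST, .p0 = v }; return n; }
Node gauss(double mean, double stddev) {
    Node n = { .kind = N_GAUSS, .p0 = mean, .p1 = stddev };
    return n;
}
Node op(Kind kind, Node *x, Node *y, Node *z) {
    Node n = { .kind = kind, .in = { x, y, z } };
    return n;
}

/* Evaluate a node for one sampling pass; a shared node is drawn only once. */
double eval(Node *n, unsigned long pass) {
    assert(n != NULL && pass != 0);
    if (n->stamp == pass) return n->cached;
    double v = 0.0;
    switch (n->kind) {
    case N_CONST: v = n->p0; break;
    case N_GAUSS: v = gaussian(n->p0, n->p1); break;
    case N_ADD:   v = eval(n->in[0], pass) + eval(n->in[1], pass); break;
    case N_GT:    v = eval(n->in[0], pass) >  eval(n->in[1], pass); break;
    case N_LE:    v = eval(n->in[0], pass) <= eval(n->in[1], pass); break;
    case N_ITE:   v = eval(n->in[0], pass) != 0.0 ? eval(n->in[1], pass)
                                                  : eval(n->in[2], pass); break;
    }
    n->stamp = pass;
    n->cached = v;
    return v;
}

/* Fraction of passes in which the boolean root evaluates to true. */
static unsigned long pass_counter = 0;
double estimate(Node *root, int samples) {
    assert(samples > 0);
    int hits = 0;
    for (int i = 0; i < samples; ++i)
        if (eval(root, ++pass_counter) != 0.0) ++hits;
    return (double)hits / samples;
}

/* The program from slide 27 with a = 4.2; returns the estimated P(e <= 3.0). */
double example_probability(int samples) {
    Node a = constant(4.2), b = gauss(0.0, 1.0);
    Node c = op(N_ADD, &a, &b, NULL), d = op(N_ADD, &c, &b, NULL);
    Node half = constant(0.5), two = constant(2.0), four = constant(4.0);
    Node three = constant(3.0);
    Node cond = op(N_GT, &b, &half, NULL);
    Node e = op(N_ITE, &cond, &two, &four);
    Node ok = op(N_LE, &e, &three, NULL);
    (void)d; /* d is part of the program but does not feed the assertion */
    return estimate(&ok, samples);
}
```

Because e <= 3.0 holds exactly when b > 0.5, the estimate should land near 0.31 for a standard Gaussian b, well below the 0.9 that the passert on slide 27 asks for.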

  30. Concrete input (salary = $24,000) ≈ testing. Input distribution (salary = uniform(…)) ≈ static analysis.

  31. More in the paper: arrays & pointers; loops; external code; probabilistic path pruning.

  32. Distribution extraction produces an expression DAG (a Bayesian network). [Diagram: nodes 4.2, G(0,1), +, +, 0.5, >.]

  33. Distribution extraction produces an expression DAG (a Bayesian network). [Diagram redrawn: nodes 4.2, G(0,1), +, +, 0.5, >.]

  34. Distribution extraction produces an expression DAG (a Bayesian network). Nodes: random variables. Edges: dependence. Directed & acyclic. Random draws only at leaves. Sample in a single pass. [Diagram: nodes 4.2, G(0,1), +, +, 0.5, >.]

  35. Roadmap (repeated from slide 17): distribution extraction via symbolic execution → Bayesian network IR → statistical optimizations → verification → ✓; implementation for LLVM & Clang.

  36. Statistical property → passert verifier optimization

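  37. Bayesian-network IR enables new optimizations. Two Gaussian leaves G′ and G″ under a + node fold into one Gaussian G: $X \sim G(\mu_X, \sigma_X^2),\ Y \sim G(\mu_Y, \sigma_Y^2),\ Z = X + Y \ \Rightarrow\ Z \sim G(\mu_X + \mu_Y,\ \sigma_X^2 + \sigma_Y^2)$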
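
A minimal sketch of this rewrite, assuming a Gaussian is represented by its mean and variance. The `Gaussian` struct and `gaussian_sum` are hypothetical names, and the rule is valid only when X and Y are independent, which is assumed here.

```c
#include <assert.h>

/* A Gaussian G(mean, variance), matching the G(mu, sigma^2) notation. */
typedef struct { double mean, variance; } Gaussian;

/* Fold Z = X + Y for independent Gaussians X and Y into one Gaussian. */
Gaussian gaussian_sum(Gaussian x, Gaussian y) {
    assert(x.variance >= 0.0 && y.variance >= 0.0);
    return (Gaussian){ x.mean + y.mean, x.variance + y.variance };
}
```

Note that variances add, not standard deviations.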

  38. Bayesian-network IR enables new optimizations. A uniform leaf U multiplied (×) by a constant c folds into one uniform U′: $X \sim U(a, b),\ Y = cX \ \Rightarrow\ Y \sim U(ca, cb)$
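
A minimal sketch of the scaling rule. The `Uniform` struct and `uniform_scale` are hypothetical names. The slide's form U(ca, cb) fits c > 0; swapping the bounds for negative c is an extension added here, not something the slide states.

```c
#include <assert.h>

/* A uniform distribution U(lo, hi). */
typedef struct { double lo, hi; } Uniform;

/* Fold Y = c * X for X ~ U(lo, hi) into one uniform distribution. */
Uniform uniform_scale(Uniform x, double c) {
    assert(x.lo <= x.hi && c != 0.0);
    if (c > 0.0)
        return (Uniform){ c * x.lo, c * x.hi };
    /* Assumption beyond the slide: a negative factor flips the interval. */
    return (Uniform){ c * x.hi, c * x.lo };
}
```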

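  39. Bayesian-network IR enables new optimizations. A uniform leaf U compared (≤) with a constant c folds into a Bernoulli B: $X \sim U(a, b),\ Y = (X \le c),\ a \le c \le b \ \Rightarrow\ Y \sim B\!\left(\frac{c - a}{b - a}\right)$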
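
A minimal sketch of the comparison rule, returning the Bernoulli parameter as a plain probability. The names `Uniform` and `uniform_le_constant` are hypothetical. The slide covers a ≤ c ≤ b; clamping to 0 or 1 outside that range is an addition made here for completeness.

```c
#include <assert.h>

/* A uniform distribution U(lo, hi). */
typedef struct { double lo, hi; } Uniform;

/* Probability that X <= c for X ~ U(lo, hi): the parameter of B. */
double uniform_le_constant(Uniform x, double c) {
    assert(x.lo < x.hi);
    if (c <= x.lo) return 0.0;   /* assumption beyond the slide: c below range */
    if (c >= x.hi) return 1.0;   /* assumption beyond the slide: c above range */
    return (c - x.lo) / (x.hi - x.lo);
}
```

For example, with X ~ U(2.0, 9.0) and c = 5.5 the parameter is 3.5 / 7.0 = 0.5.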

  40. Central Limit Theorem collapses large sums. Many leaves D under a + node fold into one Gaussian G: $X_1, X_2, \ldots, X_n \sim D,\ Y = \sum_i X_i \ \Rightarrow\ Y \sim G(n\mu_D,\ n\sigma_D^2)$
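
A minimal sketch of this collapse, assuming the n draws are independent and identically distributed with finite variance, and that n is large enough for the Gaussian approximation to be acceptable. The `Gaussian` struct and `clt_sum` are hypothetical names.

```c
#include <assert.h>

/* A Gaussian G(mean, variance), matching the G(mu, sigma^2) notation. */
typedef struct { double mean, variance; } Gaussian;

/* Approximate the sum of n i.i.d. draws from a distribution D,
   given only D's mean and variance. */
Gaussian clt_sum(double mean_d, double variance_d, int n) {
    assert(n > 0 && variance_d >= 0.0);
    return (Gaussian){ n * mean_d, n * variance_d };
}
```

For instance, summing 12 draws from U(0, 1), which has mean 1/2 and variance 1/12, gives approximately G(6, 1).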

  41. Roadmap (repeated from slide 17): distribution extraction via symbolic execution → Bayesian network IR → statistical optimizations → verification → ✓; implementation for LLVM & Clang.

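  42. Verification via direct evaluation (✓). [Diagram: many D leaves are summed (+), the sum is compared (≤) with a constant c, and the result is a Bernoulli node B.]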

  43. Verification via hypothesis testing against p and c. [Diagram: a network with distribution leaves (D, G(0,1)), constants, and the operators +, ÷, >.]
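
One standard way to test a sampled boolean against a probability threshold is Wald's sequential probability ratio test, sketched below. The slide does not say which test the verifier uses, so this is an assumed stand-in rather than the authors' method. The names `sprt`, `Sampler`, and `Verdict` are hypothetical, as are the indifference width `delta` and the error rates `alpha` and `beta`; mapping the passert confidence c onto them (for example alpha = beta = 1 - c) is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { PASSERT_HOLDS, PASSERT_FAILS, PASSERT_UNDECIDED } Verdict;

/* Draws one sample of the asserted expression e (true or false). */
typedef bool (*Sampler)(void *ctx);

/* Sequential test of H0: P(e) >= p + delta against H1: P(e) <= p - delta.
   alpha bounds the chance of reporting FAILS when H0 is true;
   beta bounds the chance of reporting HOLDS when H1 is true. */
Verdict sprt(Sampler draw, void *ctx, double p, double delta,
             double alpha, double beta, int max_samples) {
    assert(draw != NULL && delta > 0.0);
    assert(p - delta > 0.0 && p + delta < 1.0);
    assert(alpha > 0.0 && alpha < 1.0 && beta > 0.0 && beta < 1.0);
    const double p_hold = p + delta;
    const double p_fail = p - delta;
    const double accept_fail = (1.0 - beta) / alpha;  /* upper threshold */
    const double accept_hold = beta / (1.0 - alpha);  /* lower threshold */
    double ratio = 1.0;  /* likelihood under H1 divided by likelihood under H0 */
    for (int i = 0; i < max_samples; ++i) {
        if (draw(ctx))
            ratio *= p_fail / p_hold;
        else
            ratio *= (1.0 - p_fail) / (1.0 - p_hold);
        if (ratio >= accept_fail) return PASSERT_FAILS;
        if (ratio <= accept_hold) return PASSERT_HOLDS;
    }
    return PASSERT_UNDECIDED;  /* sample budget exhausted */
}

/* Trivial samplers for demonstration. */
bool always_true(void *ctx)  { (void)ctx; return true; }
bool always_false(void *ctx) { (void)ctx; return false; }
```

With p = 0.9, delta = 0.05, and alpha = beta = 0.05, a sampler that is always false is rejected after 3 draws, while one that is always true is accepted after 27. The test stops as soon as the evidence is strong enough, so the number of draws is not fixed in advance.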

  44. Roadmap (repeated from slide 17): distribution extraction via symbolic execution → Bayesian network IR → statistical optimizations → verification → ✓; implementation for LLVM & Clang.

  45. Probabilistic assertions for C and C++. [Diagram: .c → LLVM IR → LLVM IR → native code. Also labeled: strawman stress-tester.]
