Tail Probabilities for Randomized Program Runtimes via Martingales for Higher Moments
Satoshi Kura (1,2), Natsuki Urabe (1), Ichiro Hasuo (1,2)
(1) National Institute of Informatics, Tokyo, Japan
(2) The Graduate University for Advanced Studies (SOKENDAI), Kanagawa, Japan
April 10, 2019
Our question
“What is an upper bound of the tail probability?”
[Figure: a Markov chain over states 0, 1, 2, 3, . . . whose transitions are labeled $p$ and $1 - p$]
• How likely is it to terminate within 100 steps? (e.g. at least 90%)
• How unlikely is it to not terminate within 100 steps? (e.g. at most 10%)
$\Pr(T \ge 100) \le\ ??$ (the tail probability)
[Figure: plot of probability against step, with the tail beyond step 100 marked]
Related work
Supermartingale-based approach
• Proving almost-sure termination [Chakarov & Sankaranarayanan, CAV’13]
• Overapproximating tail probabilities: $\Pr(T \ge d) \le\ ??$ [Chatterjee & Fu, arXiv preprint], [Chatterjee et al., TOPLAS’18]
  • Azuma’s, Hoeffding’s and Bernstein’s inequalities
  • Markov’s inequality (wider applicability): $\Pr(T \ge d) \le \mathbb{E}[T] / d$
Our approach
• Aim: overapproximating tail probabilities: $\Pr(T \ge d) \le\ ??$
• Corollary of Markov’s inequality: $\Pr(T \ge d) \le \mathbb{E}[T^k] / d^k$
• Extends ranking supermartingales to higher moments $\mathbb{E}[T^k]$ ($k = 1, 2, \dots$)
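The corollary can be applied to several moments at once by keeping the smallest of the resulting bounds. The following is a minimal sketch of that arithmetic only; the function name `tail_bound` and the numeric moment bounds are hypothetical and are not taken from the slides.

```python
def tail_bound(moment_bounds, d):
    """Best Markov-style bound on Pr(T >= d) from upper bounds on moments.

    moment_bounds[k - 1] is assumed to be an upper bound u_k on E[T^k].
    """
    if d <= 0:
        return 1.0
    candidates = [1.0]  # a probability never exceeds 1
    for k, u_k in enumerate(moment_bounds, start=1):
        candidates.append(u_k / d ** k)  # Pr(T >= d) <= E[T^k] / d^k <= u_k / d^k
    return min(candidates)


# Hypothetical bounds u_1 = 10 and u_2 = 150, chosen only for illustration.
print(tail_bound([10.0], d=100))         # first moment only
print(tail_bound([10.0, 150.0], d=100))  # the second moment gives a smaller bound here
```

With these made-up numbers the first moment alone gives 10/100, while the second moment gives 150/100², which illustrates why higher moments can tighten the bound for large deadlines.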
Our workflow
randomized program
→ (our supermartingales) → upper bounds of higher moments: $(\mathbb{E}[T], \dots, \mathbb{E}[T^K]) \le (u_1, \dots, u_K)$
→ (concentration inequality, given a deadline $d$) → upper bound of tail probability: $\Pr(T \ge d) \le\ ?$
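Randomized program
✓ sampling
✓ (demonic/termination avoiding) nondeterminism
Given as a pCFG (probabilistic control flow graph).

Example program:
    x := 5;
    while x > 0 do
      if prob(0.4) then
        x := x + 1
      else
        x := x - 1
      fi
    od

Its pCFG over locations $l_1, \dots, l_6$:
• $l_1 \to l_2$: x := 5
• $l_2 \to l_3$: guard x > 0
• $l_2 \to l_6$: guard ¬(x > 0)
• $l_3 \to l_4$: probability 0.4
• $l_3 \to l_5$: probability 0.6
• $l_4 \to l_2$: x := x + 1
• $l_5 \to l_2$: x := x − 1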
Semantics
• Configuration: $(l, \vec{x}) \in L \times \mathbb{R}^V$
  • $L$: finite set of locations
  • $V$: finite set of program variables
• Run: sequence of configurations
Example for the pCFG of the previous slide (each arrow is labeled with its transition probability):
$(l_1, [x \mapsto 0]) \xrightarrow{1} (l_2, [x \mapsto 5]) \xrightarrow{1} (l_3, [x \mapsto 5])$, and then
• with probability 0.4: $(l_4, [x \mapsto 5]) \to \cdots$
• with probability 0.6: $(l_5, [x \mapsto 5]) \to \cdots$
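A run in this sense can be sampled by repeatedly applying the transitions of the example pCFG. The sketch below is an illustration, not the authors' tool: the function names `step` and `sample_run` are assumptions, and the edges are those of the example program (x := 5, then a loop that increments x with probability 0.4 and decrements it otherwise).

```python
import random


def step(loc, x, rng):
    """One transition of the example pCFG from configuration (loc, x)."""
    if loc == "l1":
        return "l2", 5                      # x := 5
    if loc == "l2":
        return ("l3", x) if x > 0 else ("l6", x)   # loop guard
    if loc == "l3":
        return ("l4", x) if rng.random() < 0.4 else ("l5", x)  # probabilistic branch
    if loc == "l4":
        return "l2", x + 1
    if loc == "l5":
        return "l2", x - 1
    raise ValueError(f"no transition from {loc}")


def sample_run(rng, max_steps=10_000):
    """Return the configurations visited until l6 is reached (or max_steps)."""
    loc, x = "l1", 0
    run = [(loc, x)]
    while loc != "l6" and len(run) <= max_steps:
        loc, x = step(loc, x, rng)
        run.append((loc, x))
    return run


run = sample_run(random.Random(0))
print(run[:3])        # [('l1', 0), ('l2', 5), ('l3', 5)]
print(len(run) - 1)   # number of transitions taken in this sampled run
```

The first three configurations are the same in every run because the first two transitions have probability 1; the runtime T of a run is the number of transitions it takes to reach $l_6$.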
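Ranking function [Floyd, ’67]
$r : L \times \mathbb{R}^V \to \mathbb{N} \cup \{\infty\}$
For each transition, $r$ decreases by (at least) 1:
$(l, \vec{x}) \mapsto (l', \vec{x}') \implies r(l', \vec{x}') \le r(l, \vec{x}) - 1$
Theorem. If $r(l, \vec{x}) < \infty$, then the program terminates from $(l, \vec{x})$ within $r(l, \vec{x})$ steps.

Example:
    x := 5;
    while x > 0 do
      x := x - 1
    od
Control flow graph: $l_1 \to l_2$ (guard x > 0), $l_2 \to l_1$ (x := x − 1), $l_1 \to l_3$ (guard x ≤ 0), annotated with the ranking function values $2x + 1$ at $l_1$, $2x$ at $l_2$ and $0$ at $l_3$.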
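The ranking condition for this loop can be checked mechanically along an execution. The sketch below assumes the reading in which $l_1$ is the loop head, $l_2$ the loop body and $l_3$ the exit; clamping x at 0 inside `r` is an addition of this sketch so that the value stays a natural number for negative x.

```python
def r(loc, x):
    """Candidate ranking function for: while x > 0 do x := x - 1 od."""
    if loc == "l3":                  # terminal location
        return 0
    if loc == "l1":                  # loop head
        return 2 * max(x, 0) + 1
    return 2 * max(x, 0)             # l2: loop body, reached only when x > 0


def successor(loc, x):
    """The unique next configuration, or None at the terminal location."""
    if loc == "l1":
        return ("l2", x) if x > 0 else ("l3", x)
    if loc == "l2":
        return ("l1", x - 1)
    return None


def steps_to_terminate(loc, x):
    """Run the loop, checking that r decreases by at least 1 on every transition."""
    steps = 0
    while (nxt := successor(loc, x)) is not None:
        assert r(*nxt) <= r(loc, x) - 1
        loc, x = nxt
        steps += 1
    return steps


print(steps_to_terminate("l1", 5), r("l1", 5))  # both are 11
```

For x = 5 the loop takes two transitions per iteration plus one exit transition, so the bound given by the theorem is attained exactly in this example.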
Ranking supermartingale [Chakarov & Sankaranarayanan, CAV’13]
$\eta : L \times \mathbb{R}^V \to [0, \infty]$
For each transition, $\eta$ decreases by (at least) 1 “on average”:
$(X\eta)(l, \vec{x}) \le \eta(l, \vec{x}) - 1$ for each $(l, \vec{x})$
where $X$ is the next-time operator (the expected value after one transition):
$(X\eta)(l, \vec{x}) := \mathbb{E}[\eta(l', \vec{x}') \mid (l, \vec{x}) \mapsto (l', \vec{x}')]$.
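As an illustration, the condition $(X\eta) \le \eta - 1$ can be checked for the example program (x := 5, then increment with probability 0.4 or decrement with probability 0.6 while x > 0). The candidate `eta` below is an assumption of this sketch, not a function from the slides, and the check covers only integer values of x and only configurations that the program can reach.

```python
from fractions import Fraction

P_UP = Fraction(2, 5)  # probability 0.4 of taking the branch x := x + 1


def eta(loc, x):
    """Candidate ranking supermartingale (hypothetical, for illustration)."""
    if loc == "l6":
        return Fraction(0)                      # terminal location
    if loc == "l1":
        return Fraction(77)
    if loc == "l2":
        return Fraction(15 * max(x, 0) + 1)
    if loc == "l3":
        return Fraction(15 * x)
    if loc == "l4":
        return Fraction(15 * x + 17)
    return Fraction(15 * x - 13)                # l5


def next_eta(loc, x):
    """(X eta)(loc, x): the expected value of eta after one transition."""
    if loc == "l1":
        return eta("l2", 5)
    if loc == "l2":
        return eta("l3", x) if x > 0 else eta("l6", x)
    if loc == "l3":
        return P_UP * eta("l4", x) + (1 - P_UP) * eta("l5", x)
    if loc == "l4":
        return eta("l2", x + 1)
    return eta("l2", x - 1)                     # l5


def is_ranking_supermartingale(xs):
    """Check (X eta) <= eta - 1 on reachable configurations with x in xs."""
    for x in xs:
        locs = ["l1", "l2"] + (["l3", "l4", "l5"] if x > 0 else [])
        for loc in locs:
            if next_eta(loc, x) > eta(loc, x) - 1:
                return False
    return True


print(is_ranking_supermartingale(range(-5, 50)))  # True
print(eta("l1", 0))                               # 77
```

Under these assumptions the value 77 at the initial configuration is an upper bound on the expected number of transitions; exact rational arithmetic is used so that the comparison is not affected by floating-point rounding.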
Ranking supermartingale
Theorem. If $\eta(l, \vec{x}) < \infty$, then the program is (positively) almost surely terminating from $(l, \vec{x})$, and its expected runtime is at most $\eta(l, \vec{x})$ steps.
This can be explained lattice-theoretically:
• The expected runtime is a least fixed point (lfp)
• A ranking supermartingale is a prefixed point
Runtime before and after transition
Let $T(l, \vec{x})$ be a random variable representing the runtime from $(l, \vec{x})$.
[Figure: from $(l_0, \vec{x}_0)$ there is a transition to $l_1$ with probability $p$ and to $l_2$ with probability $1 - p$; the remaining runtimes are $T(l_1, \vec{x}_1)$ and $T(l_2, \vec{x}_2)$]
Runtime from $(l_0, \vec{x}_0)$:
• $T(l_1, \vec{x}_1) + 1$ with probability $p$
• $T(l_2, \vec{x}_2) + 1$ with probability $1 - p$
Expected runtime is a fixed point
With transitions from $(l_0, \vec{x}_0)$ to $l_1$ with probability $p$ and to $l_2$ with probability $1 - p$:
$\mathbb{E}[T](l_0, \vec{x}_0)$
$= p\,\mathbb{E}[T(l_1, \vec{x}_1) + 1] + (1 - p)\,\mathbb{E}[T(l_2, \vec{x}_2) + 1]$
$= p\,(\mathbb{E}[T(l_1, \vec{x}_1)] + 1) + (1 - p)\,(\mathbb{E}[T(l_2, \vec{x}_2)] + 1)$
$= \mathbb{E}\bigl[\mathbb{E}[T(l', \vec{x}')] + 1 \mid (l_0, \vec{x}_0) \mapsto (l', \vec{x}')\bigr]$
$= (X(\mathbb{E}[T] + 1))(l_0, \vec{x}_0)$
where $\mathbb{E}[T] := \lambda (l, \vec{x}).\ \mathbb{E}[T(l, \vec{x})]$.
Expected runtime is lfp
$\mathbb{E}[T] = X(\mathbb{E}[T] + 1)$
In fact, $\mathbb{E}[T]$ is the least fixed point of $F_1(\eta) := X(\eta + 1)$.
• $F_1$ is a monotone function on the complete lattice $[0, \infty]^{L \times \mathbb{R}^V}$
• $F_1$ adds 1 unit of time and then calculates the expected value after one transition
Ranking supermartingale is prefixed point
$\eta$ is a ranking supermartingale $\iff$ $\eta$ is a prefixed point of $F_1$, that is, $F_1 \eta = X(\eta + 1) \le \eta$.
Theorem (Knaster–Tarski). Let $L$ be a complete lattice and $F : L \to L$ be a monotone function. The least fixed point $\mu F$ is the least prefixed point. Therefore we have
$F\eta \le \eta \implies \mu F \le \eta$.
It follows that
$\eta$ is a ranking supermartingale $\implies \mathbb{E}[T] \le \eta$.
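The relationship between the least fixed point and prefixed points can be seen on a tiny example. The sketch below uses a hypothetical one-location program (stay with probability p, terminate with probability 1 − p) that does not appear on the slides, fixes the value at the terminal location to 0, and approximates the least fixed point of $F_1$ by iterating from the bottom element 0.

```python
from fractions import Fraction


def F1(eta_l0, p):
    """One application of F_1 at the single non-terminal location l0.

    With probability p the run stays at l0; with probability 1 - p it
    reaches the terminal location, whose value is fixed to 0 (an
    assumption of this sketch).
    """
    return p * (eta_l0 + 1) + (1 - p) * (0 + 1)


def iterates(p, n):
    """Return [eta_0, ..., eta_n] with eta_0 = 0 and eta_{i+1} = F_1(eta_i)."""
    etas = [Fraction(0)]
    for _ in range(n):
        etas.append(F1(etas[-1], p))
    return etas


p = Fraction(1, 2)
etas = iterates(p, 10)
print(etas[-1])                     # 1023/512, approaching 1 / (1 - p) = 2
print(F1(Fraction(2), p) == 2)      # True: 2 is a fixed point
print(F1(Fraction(3), p) <= 3)      # True: 3 is a prefixed point, and 2 <= 3
```

In this example the iterates increase toward the expected runtime 2, and any prefixed point such as 3 lies above it, which is the direction of the Knaster–Tarski argument.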
Our supermartingale
Comparison of [Chakarov & Sankaranarayanan, CAV’13] (first entry) with our supermartingale (second entry), row by row:
• Lattice: $L \times \mathbb{R}^V \to [0, \infty]$ / $L \times \mathbb{R}^V \to [0, \infty]^K$
• Monotone function $F$: $F_1$ / $F_K$
• lfp $\mu F$: $\mathbb{E}[T]$ / $(\mathbb{E}[T], \dots, \mathbb{E}[T^K])$ †
• Prefixed point ($F\eta \le \eta$): ranking supermartingale $\eta$ / ranking supermartingale for higher moments $\vec{\eta}$
• Knaster–Tarski ($\mu F \le \eta$): $\mathbb{E}[T] \le \eta$ / $(\mathbb{E}[T], \dots, \mathbb{E}[T^K]) \le \vec{\eta}$
† for a pCFG without nondeterminism
Characterizing $\mathbb{E}[T^2]$ as lfp?
With transitions from $(l_0, \vec{x}_0)$ to $l_1$ with probability $p$ and to $l_2$ with probability $1 - p$:
$\mathbb{E}[T^2](l_0, \vec{x}_0)$
$= p\,\mathbb{E}[(T(l_1, \vec{x}_1) + 1)^2] + (1 - p)\,\mathbb{E}[(T(l_2, \vec{x}_2) + 1)^2]$
$= (X(\mathbb{E}[T^2] + 2\,\mathbb{E}[T] + 1))(l_0, \vec{x}_0)$
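The last equality relies on expanding the square inside each expectation. A short derivation of that step, written as a sketch in which $T_i$ abbreviates $T(l_i, \vec{x}_i)$ (an abbreviation introduced here, not used on the slides):

```latex
\begin{align*}
\mathbb{E}[T^2](l_0, \vec{x}_0)
  &= p\,\mathbb{E}[(T_1 + 1)^2] + (1 - p)\,\mathbb{E}[(T_2 + 1)^2] \\
  &= p\,\bigl(\mathbb{E}[T_1^2] + 2\,\mathbb{E}[T_1] + 1\bigr)
     + (1 - p)\,\bigl(\mathbb{E}[T_2^2] + 2\,\mathbb{E}[T_2] + 1\bigr) \\
  &= \bigl(X(\mathbb{E}[T^2] + 2\,\mathbb{E}[T] + 1)\bigr)(l_0, \vec{x}_0)
\end{align*}
```

The right-hand side mentions $\mathbb{E}[T]$ as well as $\mathbb{E}[T^2]$, so the second moment does not seem to satisfy a fixed-point equation on its own; this is consistent with working in the vector-valued lattice $[0, \infty]^K$, where the first and second moments are handled together.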