Randomized Computation

Eugene Santos looked at computability for probabilistic Turing machines. John Gill studied complexity classes defined by probabilistic Turing machines.

1. Eugene Santos. Probabilistic Turing Machines and Computability. Proc. American Mathematical


Biased Random Source

Fact (von Neumann, 1951). A fair coin, i.e. one with Pr[Heads] = 1/2, can be simulated by a PTM with access to a ρ-biased coin in expected time O(1).

The machine tosses pairs of coins until it gets 'Head-Tail' or 'Tail-Head'. In the former case it outputs 'Head', and in the latter case it outputs 'Tail'. The probability of each of 'Head-Tail' and 'Tail-Head' is ρ(1 − ρ). The expected number of pair tosses is therefore 1/(2ρ(1 − ρ)).

Computational Complexity, by Fu Yuxi. Randomized Computation. 24 / 109
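The trick above can be sketched in Python (the bias ρ = 0.9 and all helper names are illustrative choices, not from the slides):

```python
import random

random.seed(0)  # for reproducibility of the demo

def fair_coin(biased_coin):
    """von Neumann's trick: toss the biased coin in pairs until the two
    outcomes differ; 'Head-Tail' and 'Tail-Head' are equally likely
    (each has probability rho * (1 - rho)), so the first toss of the
    unequal pair is an unbiased bit."""
    while True:
        a, b = biased_coin(), biased_coin()
        if a != b:
            return a  # 1 on 'Head-Tail', 0 on 'Tail-Head'

rho = 0.9  # illustrative bias
biased = lambda: 1 if random.random() < rho else 0
samples = [fair_coin(biased) for _ in range(10000)]
```

The expected number of pair tosses per output bit is 1/(2ρ(1 − ρ)), about 5.6 for ρ = 0.9.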

Finding the k-th Element

FindKthElement(k, {a_1, ..., a_n})
1. Pick a random i ∈ [n] and let x = a_i.
2. Count the number m of a_j's such that a_j ≤ x.
3. Split a_1, ..., a_n into two lists L ≤ x < H by the pivotal element x.
4. If m = k then output x.
5. If m > k then FindKthElement(k, L).
6. If m < k then FindKthElement(k − m, H).
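A direct Python transcription of FindKthElement (a sketch; the list values are assumed distinct to keep it simple):

```python
import random

def find_kth_element(k, a):
    """Randomized selection as on the slide (values assumed distinct);
    returns the k-th smallest element of the list a, 1-indexed."""
    x = random.choice(a)                   # step 1: random pivot a_i
    m = sum(1 for v in a if v <= x)        # step 2: m = #{ j : a_j <= x }
    low = [v for v in a if v <= x]         # step 3: L, the elements <= x
    high = [v for v in a if v > x]         #         H, the elements > x
    if m == k:                             # step 4: x is the k-th smallest
        return x
    if m > k:                              # step 5: recurse on L
        return find_kth_element(k, low)
    return find_kth_element(k - m, high)   # step 6: recurse on H

# Illustrative data: a shuffled permutation of 1..100.
data = list(range(1, 101))
random.shuffle(data)
```

The answer is correct for every sequence of pivot choices; only the running time is random.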

Finding the k-th Element

Let T(n) be the expected worst case running time of the algorithm. Suppose the running time of the nonrecursive part is cn. We prove by induction that T(n) ≤ 10cn.

T(n) ≤ cn + (1/n) ( Σ_{j>k} T(j) + Σ_{j<k} T(n − j) )
     ≤ cn + (10c/n) ( Σ_{j>k} j + Σ_{j<k} (n − j) )
     ≤ 10cn.

This is a ZPP algorithm.

Polynomial Identity Testing

An algebraic circuit has gates implementing the +, −, × operators. ZERO is the set of algebraic circuits computing the zero polynomial.
◮ Given polynomials p(x) and q(x), is p(x) = q(x)?

Polynomial Identity Testing

Let C be an algebraic circuit. The polynomial computed by C has degree at most 2^|C|. Our algorithm does the following:
1. Randomly choose x_1, ..., x_n from [10 · 2^|C|];
2. Accept if C(x_1, ..., x_n) = 0, and reject otherwise.

By the Schwartz-Zippel Lemma, the error probability is at most 1/10. However, the intermediate values could be as large as (10 · 2^|C|)^{2^|C|}.

Schwartz-Zippel Lemma. If a polynomial p(x_1, x_2, ..., x_n) over GF(q) is nonzero and has total degree at most d, then Pr_{a_1,...,a_n ∈_R GF(q)} [ p(a_1, ..., a_n) ≠ 0 ] ≥ 1 − d/q.
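A minimal Python sketch of the random-evaluation test (the black-box `circuit` interface and the example polynomials are hypothetical; sampling from a set of size 10d mirrors the slide's choice of [10 · 2^|C|]):

```python
import random

random.seed(0)  # for reproducibility of the demo

def probably_zero(circuit, nvars, degree_bound, trials=20):
    """Random-evaluation identity test.  `circuit` is any black box
    computing an nvars-variate polynomial; by the Schwartz-Zippel lemma
    a nonzero polynomial of total degree <= d vanishes at a random
    point of S^nvars with probability <= d/|S|, so sampling from a set
    of size 10*d gives one-sided error <= 1/10 per trial."""
    S = range(10 * degree_bound)
    for _ in range(trials):
        point = [random.choice(S) for _ in range(nvars)]
        if circuit(*point) != 0:
            return False   # a nonzero value certifies the polynomial is nonzero
    return True            # zero with high probability

# Hypothetical instance: (x + y)^2 and x^2 + 2xy + y^2 are the same polynomial.
p = lambda x, y: (x + y) ** 2
q = lambda x, y: x * x + 2 * x * y + y * y
diff = lambda x, y: p(x, y) - q(x, y)   # the zero polynomial
```

Testing p = q is exactly testing that their difference computes the zero polynomial.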

Polynomial Identity Testing

A solution is to use the so-called fingerprinting technique. Let m = |C|.
◮ Evaluation is carried out modulo a number k ∈_R [2^{2m}].
◮ With probability at least 1/(4m), k does not divide y if y ≠ 0.
◮ There are at least 2^{2m}/(2m) prime numbers in [2^{2m}].
◮ y can have at most log y = O(m·2^m) prime factors.
◮ When m is large enough, the number of primes in [2^{2m}] not dividing y is at least 2^{2m}/(4m).
◮ Repeat the above 4m times. Accept if all results are zero.

This is a coRP algorithm.
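A sketch of the fingerprinting idea in Python (the `eval_mod` interface and the example value y = 2^1000 − 1 are illustrative assumptions, not from the slides):

```python
import random

random.seed(1)  # for reproducibility of the demo

def fingerprint_is_zero(eval_mod, m, trials=None):
    """Fingerprinting sketch: decide whether a huge value y (with up to
    O(m * 2^m) bits) is zero without ever writing y down in full.
    eval_mod(k) must return y mod k.  A random k in [2^(2m)] witnesses
    y != 0 with probability at least 1/(4m), so 4m trials give constant
    one-sided success probability."""
    if trials is None:
        trials = 4 * m
    for _ in range(trials):
        k = random.randint(2, 2 ** (2 * m))
        if eval_mod(k) != 0:
            return False   # y is certainly nonzero
    return True            # all residues zero: y = 0 with good probability

# Hypothetical instance: y = 2^1000 - 1, accessed only through residues
# via three-argument pow, so y itself is never materialized.
y_mod = lambda k: (pow(2, 1000, k) - 1) % k
```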

Testing for Perfect Matching in Bipartite Graphs

Lovász (1979) reduced the matching problem to zero testing of the determinant of the following matrix.
◮ A bipartite graph of size 2n is represented as an n × n matrix whose entry at (i, j) is a variable x_{i,j} if there is an edge from i to j, and 0 otherwise.

Pick a random assignment from [2n] and calculate the determinant.
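A sketch of Lovász's test in Python, using exact rational arithmetic for the determinant (the helper names and both example graphs are mine, not the slides'):

```python
import random
from fractions import Fraction

def determinant(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    n = len(M)
    M = [row[:] for row in M]
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det
        det *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return det

def probably_has_perfect_matching(adj, n):
    """Substitute a random value from [2n] for each edge variable x_ij
    and evaluate the determinant.  The symbolic determinant has total
    degree <= n, so by Schwartz-Zippel a graph with a perfect matching
    is missed with probability at most n/(2n) = 1/2 per trial; a graph
    without one is never accepted (one-sided error)."""
    M = [[Fraction(random.randint(1, 2 * n)) if adj[i][j] else Fraction(0)
          for j in range(n)] for i in range(n)]
    return determinant(M) != 0

# Made-up instances: a perfect matching along the diagonal, and a graph
# whose third right-hand vertex is isolated (so no perfect matching).
matched = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
unmatched = [[1, 1, 0], [1, 1, 0], [1, 1, 0]]
```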

PP

If P-time probabilistic decidable problems are defined using the worst case complexity measure without any bound on error probability, we get a complexity class that appears much bigger than P.

Problem Decided by PTM

Suppose T : N → N and L ⊆ {0, 1}*. A PTM P decides L in time T(n) if, for every x ∈ {0, 1}*, Pr[P(x) = L(x)] > 1/2 and P halts in T(|x|) steps regardless of its random choices.

Probabilistic Polynomial Time Complexity Class

We write PP for the class of problems decided by P-time PTMs. Alternatively, L is in PP if there exist a polynomial p : N → N and a P-time TM M such that for every x ∈ {0, 1}*,

Pr_{r ∈_R {0,1}^{p(|x|)}} [ M(x, r) = L(x) ] > 1/2.

Another Characterization of PP

L is in PP if there exist a polynomial p : N → N and a P-time TM M such that for every x ∈ {0, 1}*,

Pr_{r ∈_R {0,1}^{p(|x|)}} [ M(x, r) = 1 ] ≥ 1/2, if x ∈ L,
Pr_{r ∈_R {0,1}^{p(|x|)}} [ M(x, r) = 0 ] > 1/2, if x ∉ L.

1. If a computation that uses some δ_1 transition ends up with a 'yes'/'no' answer, toss the coin three more times and produce seven 'yes's/'no's and one 'no'/'yes'.
2. If the computation using only δ_0 transitions ends up with a 'no' answer, toss the coin and announce the result.
3. If the computation using only δ_0 transitions ends up with a 'yes' answer, answer 'yes'.

Lemma (Gill, 1977). NP, coNP ⊆ PP ⊆ PSPACE.

Suppose L is accepted by some NDTM N running in P-time. Design a PTM P that upon receiving x executes the following:
1. Simulate N(x) probabilistically.
2. If a computation terminates with a 'yes' answer, then accept; otherwise toss a coin and decide accordingly.
3. If the computation using only δ_0 transitions ends up with a 'no' answer, then toss the coin two more times and produce three 'no's and one 'yes'.

Clearly P decides L.

PP-Completeness

Probabilistic versions of SAT:
1. ⟨ϕ, i⟩ ∈ ♮SAT if more than i assignments make ϕ true.
2. ϕ ∈ MajSAT if more than half of the assignments make ϕ true.

1. J. Simon. On Some Central Problems in Computational Complexity. Cornell University, 1975.
2. J. Gill. Computational Complexity of Probabilistic Turing Machines. SIAM Journal on Computing, 6(4):675-695, 1977.

PP-Completeness

Theorem (Simon, 1975). ♮SAT is PP-complete.
Theorem (Gill, 1977). MajSAT ≤_K ♮SAT ≤_K MajSAT.

1. Probabilistically produce an assignment, then evaluate the formula under the assignment. This shows that MajSAT ∈ PP. Completeness is by the Cook-Levin reduction.
2. The reduction MajSAT ≤_K ♮SAT is clear. Conversely, given ⟨ϕ, i⟩, where ϕ contains n variables, construct a formula ψ with 2^n − 2^{i_j} − ... − 2^{i_1} true assignments, where i = Σ_{h=1}^{j} 2^{i_h}.
◮ For example (x_{k+1} ∨ ... ∨ x_n) has 2^n − 2^k true assignments.
Let x be a fresh variable. Then ⟨ϕ, i⟩ ∈ ♮SAT if and only if (x ∧ ϕ) ∨ (x̄ ∧ ψ) ∈ MajSAT.

Closure Property of PP

Theorem. PP is closed under union and intersection.

1. R. Beigel, N. Reingold and D. Spielman. PP is Closed under Intersection. STOC, 1-9, 1991.

BPP

If P-time probabilistic decidable problems are defined using the worst case complexity measure with a bound on error probability, we get a complexity class that is believed to be very close to P.

Problem Decided by PTM with Bounded Error

Suppose T : N → N and L ⊆ {0, 1}*. A PTM P with bounded error decides L in time T(n) if for every x ∈ {0, 1}*, P halts in T(|x|) steps, and Pr[P(x) = L(x)] ≥ 2/3.

L ∈ BPTIME(T(n)) if there is some c such that L is decided by a PTM in cT(n) time.

Bounded-Error Probabilistic Polynomial Class

We write BPP for ∪_c BPTIME(n^c). Alternatively, L ∈ BPP if there exist a polynomial p : N → N and a P-time TM M such that for every x ∈ {0, 1}*,

Pr_{r ∈_R {0,1}^{p(|x|)}} [ M(x, r) = L(x) ] ≥ 2/3.

1. P ⊆ BPP ⊆ PP.
2. BPP = coBPP.

How robust is our definition of BPP?

Average Case

Fact. In the definition of BPP, we could use the expected running time instead of the worst case running time.

Let L be decided by a bounded error PTM P in average time T(n). Design a PTM that simulates P for 9T(n) steps and outputs 'yes' if P does not stop within them. By Markov's inequality, the probability that P does not stop in 9T(n) steps is at most 1/9.

Error Reduction Theorem

Let BPP(ρ) denote BPP defined with error probability ρ.

Theorem. BPP(1/2 − 1/n^c) = BPP(2^{−n^d}) for all c, d > 1.

Error Reduction Theorem

Let L be decided by a bounded error PTM P in BPP(1/2 − 1/n^c). Design a PTM P′ as follows:
1. P′ simulates P on x for k = 12|x|^{2c+d} + 1 times, obtaining k results y_1, ..., y_k ∈ {0, 1}.
2. If the majority of y_1, ..., y_k are 1, P′ accepts x; otherwise P′ rejects x.

For each i ∈ [k] let X_i be the random variable that equals 1 if y_i = 1 and 0 if y_i = 0. Let X = Σ_{i=1}^{k} X_i. Let δ = |x|^{−c}, p₊ = 1/2 + δ and p₋ = 1/2 − δ.
◮ By linearity, E[X] ≥ kp₊ if x ∈ L, and E[X] ≤ kp₋ if x ∉ L.
◮ If x ∈ L then Pr[X < k/2] ≤ Pr[X < (1 − δ)kp₊] ≤ Pr[X < (1 − δ)E[X]] <* exp(−(δ²/2)kp₊) < 2^{−|x|^d}.
◮ If x ∉ L then Pr[X > k/2] ≤ Pr[X > (1 + δ)kp₋] ≤ Pr[X > (1 + δ)E[X]] <* exp(−(δ²/3)kp₋) < 2^{−|x|^d}.

The inequalities marked <* are due to the Chernoff Bound. Conclude that the error probability of P′ is at most 2^{−n^d}.
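The majority-vote construction of P′ can be sketched as follows (the noisy decider for the even numbers is a made-up stand-in for P, with success probability 1/2 + 0.05):

```python
import random

random.seed(0)  # for reproducibility of the demo

def amplify(ptm, x, k):
    """The construction of P': run the bounded-error decider k times
    independently and take the majority vote of the k answers."""
    votes = sum(ptm(x) for _ in range(k))
    return 1 if 2 * votes > k else 0

def noisy_even(x):
    """Made-up stand-in for P: decides whether x is even, but answers
    correctly only with probability 0.55 = 1/2 + 0.05."""
    truth = 1 if x % 2 == 0 else 0
    return truth if random.random() < 0.55 else 1 - truth
```

With k = 5001 runs, the Chernoff-Hoeffding bound puts the majority's error below exp(−2 · 5001 · 0.05²) ≈ exp(−25), so the amplified decider is essentially always right.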

Conclusion: In the definition of BPP,
◮ we can replace 2/3 by a constant arbitrarily close to 1/2;
◮ we can even replace 2/3 by 1/2 + 1/n^c for any constant c.

The Error Reduction Theorem offers a powerful tool to study BPP.

"Nonuniformity is more powerful than randomness." (Adleman)

Theorem. BPP ⊆ P/poly.

1. Leonard Adleman. Two Theorems on Random Polynomial Time. FOCS, 1978.

Proof of Adleman's Theorem

Suppose L ∈ BPP. There exist a polynomial p(x) and a P-time TM M such that Pr_{r ∈_R {0,1}^{p(n)}} [ M(x, r) ≠ L(x) ] ≤ 1/2^{n+1} for every x ∈ {0, 1}^n.

Say r ∈ {0,1}^{p(n)} is bad for x ∈ {0,1}^n if M(x, r) ≠ L(x); otherwise r is good for x.
◮ For each x of size n, the number of r's bad for x is at most 2^{p(n)}/2^{n+1}.
◮ The number of r's bad for some x of size n is at most 2^n · 2^{p(n)}/2^{n+1} = 2^{p(n)}/2.
◮ There must be some r_n that is good for every x of size n.

We may construct a P-time TM M with advice {r_n}_{n ∈ N}.

Theorem. BPP ⊆ Σ^p_2 ∩ Π^p_2.

Sipser proved BPP ⊆ Σ^p_4 ∩ Π^p_4. Gács pointed out that BPP ⊆ Σ^p_2 ∩ Π^p_2; this is reported in Sipser's paper. Lautemann provided a simplified proof using the probabilistic method.

1. M. Sipser. A Complexity Theoretic Approach to Randomness. STOC, 1983.
2. C. Lautemann. BPP and the Polynomial Hierarchy. IPL, 1983.

Lautemann's Proof

Suppose L ∈ BPP. There are a polynomial p and a P-time TM M such that for all x ∈ {0, 1}^n,

Pr_{r ∈_R {0,1}^{p(n)}} [ M(x, r) = 1 ] ≥ 1 − 2^{−n}, if x ∈ L,
Pr_{r ∈_R {0,1}^{p(n)}} [ M(x, r) = 1 ] ≤ 2^{−n}, if x ∉ L.

Let S_x be the set of r's such that M(x, r) = 1. Then

|S_x| ≥ (1 − 2^{−n}) 2^{p(n)}, if x ∈ L,
|S_x| ≤ 2^{−n} 2^{p(n)}, if x ∉ L.

For a set S ⊆ {0,1}^{p(n)} and a string u ∈ {0,1}^{p(n)}, let S + u be {r + u | r ∈ S}, where + is the bitwise exclusive or.

Lautemann's Proof

Let k = ⌈p(n)/n⌉ + 1.

Claim 1. For every set S ⊆ {0,1}^{p(n)} such that |S| ≤ 2^{−n} 2^{p(n)} and every k vectors u_1, ..., u_k, one has ∪_{i=1}^{k} (S + u_i) ≠ {0,1}^{p(n)}.

Claim 2. For every set S ⊆ {0,1}^{p(n)} such that |S| ≥ (1 − 2^{−n}) 2^{p(n)} there exist u_1, ..., u_k rendering ∪_{i=1}^{k} (S + u_i) = {0,1}^{p(n)}.

Proof. Fix r ∈ {0,1}^{p(n)}. Now Pr_{u_i ∈_R {0,1}^{p(n)}} [ u_i ∈ S + r ] ≥ 1 − 2^{−n}. So Pr_{u_1,...,u_k ∈_R {0,1}^{p(n)}} [ ∧_{i=1}^{k} u_i ∉ S + r ] ≤ 2^{−kn} < 2^{−p(n)}. Noticing that u_i ∉ S + r if and only if r ∉ S + u_i, we get by the union bound that Pr_{u_1,...,u_k ∈_R {0,1}^{p(n)}} [ ∃ r ∈ {0,1}^{p(n)}. r ∉ ∪_{i=1}^{k} (S + u_i) ] < 1.

Lautemann's Proof

Now Claim 1 and Claim 2 imply that x ∈ L if and only if

∃ u_1, ..., u_k ∈ {0,1}^{p(n)}. ∀ r ∈ {0,1}^{p(n)}. r ∈ ∪_{i=1}^{k} (S_x + u_i),

or equivalently

∃ u_1, ..., u_k ∈ {0,1}^{p(n)}. ∀ r ∈ {0,1}^{p(n)}. ∨_{i=1}^{k} M(x, r + u_i) = 1.

Since k is polynomial in n, we may conclude that L ∈ Σ^p_2.

BPP is Low for Itself

Lemma. BPP^BPP = BPP.

Complete Problem for BPP?

PP is a syntactical class in the sense that every P-time PTM decides a language in PP. BPP is a semantic class. It is undecidable to check whether a PTM accepts or rejects each input with probability at least 2/3.

1. We are unable to prove that PTMSAT is BPP-complete.
2. We are unable to construct universal machines. Consequently we are unable to prove any hierarchy theorem.

But if BPP = P, there would be complete problems for BPP.

ZPP

If P-time probabilistic decidable problems are defined using the average complexity measure with a bound on error probability, we get a complexity class that is even closer to P.

PTM with Zero-Sided Error

Suppose T : N → N and L ⊆ {0, 1}*. A PTM P with zero-sided error decides L in time T(n) if for every x ∈ {0, 1}*, the expected running time of P(x) is at most T(|x|), and it outputs L(x) whenever P(x) halts.

L ∈ ZTIME(T(n)) if there is some c such that L is decided by some zero-sided error PTM in cT(n) average time.

ZPP = ∪_{c ∈ N} ZTIME(n^c).

Lemma. L ∈ ZPP if and only if there exists a P-time PTM P with outputs in {0, 1, ?} such that, for every x ∈ {0, 1}* and for all choices, P(x) outputs either L(x) or ?, and Pr[P(x) = ?] ≤ 1/3.

If a PTM P answers 'don't know' in O(n^c) time with probability at most 1/3, then we can design a zero-sided error PTM that simply runs P repeatedly until it gets a proper answer. The expected running time of the new PTM is also O(n^c).

Given a zero-sided error PTM P with expected running time T(n), we can design a PTM that answers '?' if a sequence of 3T(n) choices has not led to a proper answer. By Markov's inequality, this machine answers '?' with probability no more than 1/3.

PTM with One-Sided Error

Suppose T : N → N and L ⊆ {0, 1}*. A PTM P with one-sided error decides L in time T(n) if for every x ∈ {0, 1}*, P halts in T(|x|) steps, and

Pr[P(x) = 1] ≥ 2/3, if x ∈ L,
Pr[P(x) = 1] = 0, if x ∉ L.

L ∈ RTIME(T(n)) if there is some c such that L is decided in cT(n) time by some PTM with one-sided error.

RP = ∪_{c ∈ N} RTIME(n^c).

Theorem. ZPP = RP ∩ coRP.

A '?' answer can be replaced by a yes/no answer consistently.

Error Reduction for ZPP

Theorem. ZPP(1 − 1/n^c) = ZPP(2^{−n^d}) for all c, d > 1.

Suppose L ∈ ZPP(1 − 1/n^c) is decided by a PTM P with 'don't know' probability 1 − 1/n^c in expected running time T(n). Let P′ be the PTM that, on input x of size n, repeats P a total of ln(2)·n^{c+d} times. The 'don't know' probability of P′ is (1 − 1/n^c)^{ln(2)·n^{c+d}} < e^{−ln(2)·n^d} = 2^{−n^d}. The running time of P′ on x is bounded by ln(2)·n^{c+d}·T(n).

Error Reduction for RP

Theorem. RP(1 − 1/n^c) = RP(2^{−n^d}) for all c, d > 1.

Random Walk and RL

Randomized Logspace Complexity

L ∈ BPL if there is a logspace PTM P such that Pr[P(x) = L(x)] ≥ 2/3.

Fact. BPL ⊆ P.

Proof. Upon receiving an input, the algorithm produces the adjacency matrix A of the configuration graph, in which a_{ij} ∈ {0, 1/2, 1} indicates the probability that C_i reaches C_j in at most one step. It then computes A^{n−1}.
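The matrix-powering argument can be sketched in Python; the 3-configuration chain below (start, accept, reject) is a toy example, and exact fractions stand in for the polynomial-time arithmetic:

```python
from fractions import Fraction

def acceptance_probability(A, start, accept, steps):
    """BPL ⊆ P sketch: A[i][j] = Pr[configuration C_i moves to C_j in
    at most one step] (entries 0, 1/2 or 1).  The probability of being
    in the accepting configuration after `steps` steps is an entry of
    A^steps, computed here by repeated squaring."""
    n = len(A)

    def matmul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # identity
    B = [row[:] for row in A]
    e = steps
    while e:
        if e & 1:
            P = matmul(P, B)
        B = matmul(B, B)
        e >>= 1
    return P[start][accept]

# Toy configuration graph: C_0 flips a coin and moves to the accepting
# configuration C_1 or the rejecting configuration C_2; both absorb.
half = Fraction(1, 2)
A = [[0, half, half],
     [0, 1, 0],
     [0, 0, 1]]
```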

Randomized Logspace Complexity

L ∈ RL if x ∈ L implies Pr[P(x) = 1] ≥ 2/3 and x ∉ L implies Pr[P(x) = 1] = 0 for some logspace PTM P.

Fact. RL ⊆ NL.

Undirected Path Problem

Let UPATH be the reachability problem for undirected graphs.

Is UPATH in L?

Theorem. UPATH ∈ RL.

To prove the theorem we need preliminary properties of Markov chains.

1. R. Aleliunas, R. Karp, R. Lipton, L. Lovász and C. Rackoff. Random Walks, Universal Traversal Sequences, and the Complexity of Maze Problems. FOCS, 1979.

Markov chains were introduced by Andrei Andreevich Markov (1856-1922).

Stochastic Process

A stochastic process X = {X_t | t ∈ T} is a set of random variables taking values in a single state space Ω.
◮ If T is countably infinite, X is a discrete time process.
◮ If Ω is countably infinite, X is a discrete space process.
◮ If Ω is finite, X is a finite process.

A discrete space is often identified with {0, 1, 2, ...} and a finite space with {0, 1, 2, ..., n}.

In the discrete time case a stochastic process starts with a state distribution X_0. It becomes another distribution X_1 on the states in the next step, and so on. In the t-th step X_t may depend on the whole history X_0, ..., X_{t−1}.

Markov Chain

A discrete time, discrete space stochastic process X_0, X_1, X_2, ... is a Markov chain if

Pr[X_t = a_t | X_{t−1} = a_{t−1}] = Pr[X_t = a_t | X_{t−1} = a_{t−1}, ..., X_0 = a_0].

The dependency on the past is captured by the value of X_{t−1}. This is the Markov property.

A Markov chain is time homogeneous if for all t ≥ 1,

Pr[X_{t+1} = j | X_t = i] = Pr[X_t = j | X_{t−1} = i].

These are the Markov chains we are interested in. We write M_{j,i} for Pr[X_{t+1} = j | X_t = i].

Transition Matrix

The transition matrix M is (M_{j,i})_{j,i} such that Σ_j M_{j,i} = 1 for all i. For example

M = (  0    1/2  1/2   0   ...
      1/4    0   1/3  1/2  ...
       0    1/3  1/9  1/4  ...
      1/2   1/6   0   1/8  ...
      ...   ...  ...  ...      )

Transition Graph

[Figure: a transition graph with edges labelled by probabilities 5/6, 1/2, 1/6, 1, 1/3, 2/3, ...]

Finite Step Transition

Let m_t denote the probability distribution on the state space at time t. Then m_{t+1} = M · m_t. The t-step transition matrix is clearly given by M^t.

Irreducibility

A state j is accessible from state i if (M^n)_{j,i} > 0 for some n ≥ 0. If i and j are accessible from each other, they communicate. A Markov chain is irreducible if all states belong to one communication class.

Aperiodicity

The period of a state i is the greatest common divisor of T_i = {t ≥ 1 | (M^t)_{i,i} > 0}. A state i is aperiodic if gcd T_i = 1.

Lemma. If M is irreducible, then gcd T_i = gcd T_j for all states i, j.

Proof. By irreducibility (M^s)_{j,i} > 0 and (M^t)_{i,j} > 0 for some s, t > 0. Clearly T_i + (s + t) ⊆ T_j. It follows that gcd T_i ≥ gcd T_j. Symmetrically gcd T_j ≥ gcd T_i.

The period of an irreducible Markov chain is the period of its states.

Classification of State

Let r^t_{j,i} denote the probability that, starting at i, the first transition to j occurs at time t; that is,

r^t_{j,i} = Pr[X_t = j ∧ ∀ s ∈ [t−1]. X_s ≠ j | X_0 = i].

A state i is recurrent if Σ_{t≥1} r^t_{i,i} = 1.
A state i is transient if Σ_{t≥1} r^t_{i,i} < 1.
A recurrent state i is absorbing if M_{i,i} = 1.

[Figure: a transition graph with edge probabilities 5/6, 1/2, 1/6, 1, 1/3, 2/3, ...]

If one state in an irreducible Markov chain is recurrent, respectively transient, then all states in the chain are recurrent, respectively transient.

Ergodic State

The expected hitting time of j from i is h_{j,i} = Σ_{t≥1} t · r^t_{j,i}.

A recurrent state i is positive recurrent if the expected first return time h_{i,i} < ∞. A recurrent state i is null recurrent if h_{i,i} = ∞. An aperiodic, positive recurrent state is ergodic.

[Figure: an infinite-state Markov chain with transition probabilities 1/6, 1/5, 1/4, 1/3, 1/2, ... and 1/2, 2/3, 3/4, 4/5, 5/6, ...]

For a null recurrent state to exist, the number of states must be infinite.

A Markov chain M is recurrent if every state in M is recurrent.
A Markov chain M is aperiodic if the period of M is 1.
A Markov chain M is ergodic if all states in M are ergodic.
A Markov chain M is regular if ∃ r > 0. ∀ i, j. (M^r)_{j,i} > 0.
A Markov chain M is absorbing if there is at least one absorbing state and from every state it is possible to reach an absorbing state.

The Gambler's Ruin

A fair gambling game between Player I and Player II.
◮ In each round a player wins/loses one dollar with probability 1/2.
◮ The state at time t is the number of dollars won by Player I. Initially the state is 0.
◮ Player I can afford to lose ℓ_1 dollars, Player II ℓ_2 dollars.
◮ The states −ℓ_1 and ℓ_2 are absorbing. The state i is transient if −ℓ_1 < i < ℓ_2.
◮ Let M^t_i be the probability that the chain is in state i after t steps. Clearly lim_{t→∞} M^t_i = 0 if −ℓ_1 < i < ℓ_2.
◮ Let q be the probability that the game ends in state ℓ_2. By definition lim_{t→∞} M^t_{ℓ_2} = q.
◮ Let W_t be the gain of Player I at step t. Then E[W_t] = 0 since the game is fair. Now E[W_t] = Σ_{i=−ℓ_1}^{ℓ_2} i·M^t_i = 0 and lim_{t→∞} E[W_t] = ℓ_2·q − ℓ_1·(1 − q) = 0.

Conclude that q = ℓ_1 / (ℓ_1 + ℓ_2).
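The value q = ℓ_1/(ℓ_1 + ℓ_2) can be checked numerically by iterating the harmonic equations for the absorption probabilities (a sketch; the sweep count is an ad hoc choice, not part of the argument):

```python
def ruin_probability(l1, l2, sweeps=20000):
    """Gambler's ruin sketch: probability q that the fair game ends in
    state l2 (Player I wins l2 dollars) starting from 0, obtained by
    iterating the harmonic equations q_i = (q_{i-1} + q_{i+1}) / 2
    with boundary conditions q_{-l1} = 0 and q_{l2} = 1."""
    q = {i: 0.0 for i in range(-l1, l2 + 1)}
    q[l2] = 1.0
    for _ in range(sweeps):
        for i in range(-l1 + 1, l2):
            q[i] = 0.5 * (q[i - 1] + q[i + 1])
    return q[0]
```

For ℓ_1 = 3 and ℓ_2 = 5 the iteration converges to 3/8, matching q = ℓ_1/(ℓ_1 + ℓ_2).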

In the rest of the lecture we confine our attention to finite Markov chains.

Lemma. In a finite Markov chain, at least one state is recurrent, and all recurrent states are positive recurrent.

In a finite Markov chain M there must be a communication class without any outgoing edges. Starting from any state k in the class, the probability that the chain returns to k within d steps is at least p for some p > 0, where d is the diameter of the class. The probability that the chain never returns to k is at most lim_{t→∞} (1 − p)^t = 0. Hence Σ_{t≥1} r^t_{k,k} = 1.

Starting from a recurrent state i, the probability that the chain has not returned to i within dt steps is at most q^t for some q ∈ (0, 1). Thus Σ_{t≥1} t·r^t_{i,i} is bounded by Σ_{t≥1} dt·q^{t−1} < ∞.

Corollary. In a finite irreducible Markov chain, all states are positive recurrent.

Proposition. Suppose M is a finite irreducible Markov chain. The following are equivalent: (i) M is aperiodic; (ii) M is ergodic; (iii) M is regular.

(i ⇔ ii) This is a consequence of the previous corollary.
(i ⇒ iii) Assume gcd T_i = 1 for all i. Since T_i is closed under addition, the Fact below implies that some t_i exists such that t ∈ T_i whenever t ≥ t_i. By irreducibility, for every pair i, j, (M^{t_{j,i}})_{j,i} > 0 for some t_{j,i}. Set t = Σ_i t_i + Σ_{i≠j} t_{j,i}. For any i, j one has t − t_{j,i} ≥ t_i, hence t − t_{j,i} ∈ T_i and (M^t)_{j,i} > 0.
(iii ⇒ i) If M has period t > 1, then for any k > 1 some entries in the diagonal of M^{kt−1} are 0.

Fact. If a set of natural numbers is closed under addition and has greatest common divisor 1, then it contains all but finitely many natural numbers.

The graph of a finite Markov chain contains two types of maximal strongly connected components (MSCCs).
◮ Recurrent MSCCs that have no outgoing edges. There is at least one such MSCC.
◮ Transient MSCCs that have at least one outgoing edge.

If we think of an MSCC as a big node, the graph is a dag.

How fast does the chain leave the transient states? What is the limit behaviour of the chain on the recurrent states?

Canonical Form of Finite Markov Chain

Let Q be the matrix for the transient states and E the matrix for the recurrent states, assuming that the graph has only one recurrent MSCC. We shall assume that E is ergodic.

M = ( Q  0
      L  E )

It is clear that

( Q  0 )^n   ( Q^n  0
  L  E     =   L′   E^n )

Limit Theorem for Transient Chain. lim_{n→∞} Q^n = 0.

Fundamental Matrix of Transient States

Theorem. N = Σ_{n≥0} Q^n is the inverse of I − Q. The entry N_{j,i} is the expected number of visits to j starting from i.

I − Q is nonsingular because x(I − Q) = 0 implies x = 0. Then N(I − Q^{n+1}) = Σ_{i=0}^{n} Q^i follows from N(I − Q) = I. Thus N = Σ_{n=0}^{∞} Q^n.

Let X_k be the Poisson trial with Pr[X_k = 1] = (Q^k)_{j,i}, the probability that, starting from i, the chain visits j at the k-th step. Let X = Σ_{k=0}^{∞} X_k. Clearly E[X] = N_{j,i}. Notice that N_{i,i} counts the visit at the 0-th step.
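For a 2 × 2 transient block the fundamental matrix can be computed exactly (the particular Q below is a made-up example; indices follow the slides' column convention M_{j,i}):

```python
from fractions import Fraction as F

def fundamental_matrix(Q):
    """N = (I - Q)^(-1) for a 2x2 transient block Q, computed exactly
    via the closed-form 2x2 inverse.  N[j][i] is the expected number of
    visits to transient state j starting from transient state i."""
    a, b = F(1) - Q[0][0], -Q[0][1]
    c, d = -Q[1][0], F(1) - Q[1][1]
    det = a * d - b * c          # nonzero since I - Q is nonsingular
    return [[d / det, -b / det],
            [-c / det, a / det]]

# Made-up 2-state transient block; each column sums to less than 1,
# the missing mass being the probability of escaping to the recurrent part.
Q = [[F(1, 2), F(1, 4)],
     [F(1, 4), F(1, 4)]]
N = fundamental_matrix(Q)
```

Here N = [[12/5, 4/5], [4/5, 8/5]], and the column sum N_{0,0} + N_{1,0} = 16/5 is the expected number of steps spent among transient states when starting from state 0.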

Fundamental Matrix of Transient States

Theorem. Σ_j N_{j,i} is the expected number of steps the chain stays in transient states after starting from i.

Σ_j N_{j,i} is the expected number of visits to any transient state after starting from i. This is precisely the expected number of steps.

Stationary Distribution

A stationary distribution of a Markov chain M is a distribution π such that π = Mπ.

If the Markov chain is finite, then π = (π_0, π_1, ..., π_n)^T satisfies Σ_{j=0}^{n} M_{i,j} π_j = π_i = Σ_{j=0}^{n} M_{j,i} π_i.

[probability entering i = probability leaving i]
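A stationary distribution can be approximated by power iteration (the 2-state chain is a made-up example, with the column-stochastic convention M[j][i] = Pr[i → j]):

```python
def stationary_distribution(M, iters=10000):
    """Power-iteration sketch: repeatedly apply the column-stochastic
    transition matrix M to a start distribution; for a regular chain
    M^n v converges to the stationary distribution pi = M pi."""
    n = len(M)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(M[j][i] * v[i] for i in range(n)) for j in range(n)]
    return v

# Made-up 2-state chain: from state 0 stay with 0.9, from state 1 stay with 0.8.
M = [[0.9, 0.2],
     [0.1, 0.8]]
pi = stationary_distribution(M)
```

Balancing flow, 0.1·π_0 = 0.2·π_1 together with π_0 + π_1 = 1, gives π = (2/3, 1/3), which the iteration reproduces.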

Limit Theorem for Ergodic Chains

Theorem. The power E^n approaches a limit as n → ∞. Suppose W = lim_{n→∞} E^n. Then W = (π, π, ..., π) for some positive π. Moreover π is a stationary distribution of E.

We may assume that E > 0. Let r be a row of E, and let ∆(r) = max r − min r.
◮ It is easily seen that ∆(rE) < (1 − 2p)∆(r), where p is the minimal entry in E.
◮ It follows that lim_{n→∞} E^n = W = (π, π, ..., π) for some distribution π.
◮ π is positive since rE is already positive.

Moreover W = lim_{n→∞} E^{n+1} = E · lim_{n→∞} E^n = EW. That is, π = Eπ.

Limit Theorem for Ergodic Chains

Lemma. E has a unique stationary distribution. [π can be calculated by solving linear equations.]

Suppose π and π′ are stationary distributions. Let i be such that π_i/π′_i = min_{0≤k≤n} {π_k/π′_k}. It follows from the regularity property that π_i/π′_i = π_j/π′_j for all j ∈ {0, ..., n}.

Limit Theorem for Ergodic Chains

Theorem. π = lim_{n→∞} E^n v for every distribution v.

Suppose E = (m_0, ..., m_k). Then E^{n+1} = (E^n m_0, ..., E^n m_k). It follows from

lim_{n→∞} E^{n+1} = (lim_{n→∞} E^n m_0, ..., lim_{n→∞} E^n m_k) = (π, ..., π)

that lim_{n→∞} E^n m_0 = ... = lim_{n→∞} E^n m_k = π. Now

lim_{n→∞} E^n v = lim_{n→∞} E^n (v_0 m_0 + ... + v_k m_k) = v_0 π + ... + v_k π = π.

Limit Theorem for Ergodic Chains

H is the hitting time matrix whose entry at (j, i) is h_{j,i}. D is the diagonal matrix whose entry at (i, i) is h_{i,i}. J is the matrix whose entries are all 1.

Lemma. H = J + (H − D)E.

Proof. For i ≠ j, the hitting time satisfies h_{j,i} = E_{j,i} + Σ_{k≠j} E_{k,i}(h_{j,k} + 1) = 1 + Σ_{k≠j} E_{k,i} h_{j,k}, and the first recurrence time satisfies h_{i,i} = E_{i,i} + Σ_{k≠i} E_{k,i}(h_{i,k} + 1) = 1 + Σ_{k≠i} E_{k,i} h_{i,k}.

Theorem. h_{i,i} = 1/π_i for all i. [This equality can be used to calculate h_{i,i}.]

Proof. 1 = Jπ = Hπ − (H − D)Eπ = Hπ − (H − D)π = Dπ, where 1 is the all-one vector.
