18.175 Lecture 31: More Markov Chains (Scott Sheffield, MIT)

  1. 18.175: Lecture 31. More Markov chains. Scott Sheffield, MIT.

  2. Outline: Recollections; General setup and basic properties; Recurrence and transience.

  3. Markov chains. Consider a sequence of random variables X_0, X_1, X_2, ... each taking values in the same state space, which for now we take to be a finite set that we label by {0, 1, ..., M}. Interpret X_n as the state of the system at time n. The sequence is called a Markov chain if we have a fixed collection of numbers P_{ij} (one for each pair i, j ∈ {0, 1, ..., M}) such that whenever the system is in state i, there is probability P_{ij} that the system will next be in state j. Precisely,

  P(X_{n+1} = j | X_n = i, X_{n-1} = i_{n-1}, ..., X_1 = i_1, X_0 = i_0) = P_{ij}.

  Kind of an "almost memoryless" property: the probability distribution for the next state depends only on the current state (and not on the rest of the state history).
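To make the definition concrete, here is a minimal simulation sketch in Python (not from the slides; the 3-state matrix P and the seed are made up for illustration):

```python
import numpy as np

# A made-up 3-state transition matrix; each row sums to one.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

rng = np.random.default_rng(0)

def simulate(P, x0, n_steps):
    """Run the chain for n_steps starting from state x0."""
    states = [x0]
    for _ in range(n_steps):
        # The next state depends only on the current state: the Markov property.
        states.append(int(rng.choice(len(P), p=P[states[-1]])))
    return states

print(simulate(P, x0=0, n_steps=10))
```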

  4. Matrix representation. To describe a Markov chain, we need to define P_{ij} for any i, j ∈ {0, 1, ..., M}. It is convenient to represent the collection of transition probabilities P_{ij} as a matrix:

  A = \begin{pmatrix} P_{00} & P_{01} & \cdots & P_{0M} \\ P_{10} & P_{11} & \cdots & P_{1M} \\ \vdots & \vdots & \ddots & \vdots \\ P_{M0} & P_{M1} & \cdots & P_{MM} \end{pmatrix}

  For this to make sense, we require P_{ij} ≥ 0 for all i, j and \sum_{j=0}^{M} P_{ij} = 1 for each i. That is, the rows sum to one.
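These two conditions are easy to verify in code (a sketch reusing the illustrative matrix P from above):

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# A valid transition matrix is entrywise non-negative with rows summing to one.
assert (P >= 0).all()
assert np.allclose(P.sum(axis=1), 1.0)
```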

  5. Powers of transition matrix. We write P_{ij}^{(n)} for the probability of going from state i to state j over n steps. From the matrix point of view,

  \begin{pmatrix} P_{00} & P_{01} & \cdots & P_{0M} \\ P_{10} & P_{11} & \cdots & P_{1M} \\ \vdots & \vdots & \ddots & \vdots \\ P_{M0} & P_{M1} & \cdots & P_{MM} \end{pmatrix}^{n} = \begin{pmatrix} P_{00}^{(n)} & P_{01}^{(n)} & \cdots & P_{0M}^{(n)} \\ P_{10}^{(n)} & P_{11}^{(n)} & \cdots & P_{1M}^{(n)} \\ \vdots & \vdots & \ddots & \vdots \\ P_{M0}^{(n)} & P_{M1}^{(n)} & \cdots & P_{MM}^{(n)} \end{pmatrix}

  If A is the one-step transition matrix, then A^n is the n-step transition matrix.
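In code, the n-step matrix is just a matrix power (illustrative sketch with the same made-up P):

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# Entry (i, j) of P^n is the n-step transition probability P_ij^(n).
P5 = np.linalg.matrix_power(P, 5)
print(P5[0, 2])  # probability of going from state 0 to state 2 in 5 steps
```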

  6. Ergodic Markov chains. Say a Markov chain is ergodic if some power of the transition matrix has all non-zero entries. It turns out that if the chain has this property, then π_j := lim_{n→∞} P_{ij}^{(n)} exists (independently of the starting state i), and the π_j are the unique non-negative solutions of π_j = \sum_{k=0}^{M} π_k P_{kj} that sum to one. This means that the row vector π = (π_0, π_1, ..., π_M) is a left eigenvector of A with eigenvalue 1, i.e., πA = π. We call π the stationary distribution of the Markov chain. One can solve the system of linear equations π_j = \sum_{k=0}^{M} π_k P_{kj} to compute the values π_j. This is equivalent to considering A fixed and solving πA = π, i.e., π(A − I) = 0. This determines π up to a multiplicative constant, and the fact that \sum_j π_j = 1 determines the constant.
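Numerically, π can be read off as the eigenvector of the transpose for eigenvalue 1 (a sketch, again with the illustrative matrix P standing in for A):

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# A left eigenvector of P is a right eigenvector of P.T.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.isclose(vals, 1.0))])
pi /= pi.sum()  # normalize so the entries sum to one

print(pi)
print(np.linalg.matrix_power(P, 50)[0])  # each row of P^n approaches pi
```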

  7. Examples.
     - Random walks on R^d.
     - Branching processes: p(i, j) = P(\sum_{m=1}^{i} ξ_m = j), where the ξ_m are i.i.d. non-negative integer-valued random variables.
     - Renewal chain (deterministic unit decreases, random jump when zero is hit).
     - Card shuffling.
     - Ehrenfest chain (n balls in two chambers, randomly pick a ball to swap); see the sketch after this list.
     - Birth and death chains (changes by ±1). Stationary distribution?
     - M/G/1 queues.
     - Random walk on a graph. Stationary distribution?
     - Random walk on a directed graph (e.g., single directed chain).
     - Snakes and ladders.
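For instance, the Ehrenfest chain can be simulated in a few lines (an illustrative sketch with made-up parameters; the Binomial(n, 1/2) limit is the chain's known stationary distribution):

```python
import numpy as np

rng = np.random.default_rng(1)

def ehrenfest_step(k, n):
    """Ehrenfest chain on {0, ..., n}: k of the n balls are in the left
    chamber; pick a ball uniformly at random and move it to the other side."""
    return k - 1 if rng.random() < k / n else k + 1

n, k = 20, 0
counts = np.zeros(n + 1)
for _ in range(100_000):
    k = ehrenfest_step(k, n)
    counts[k] += 1

# The empirical occupation frequencies approach the Binomial(n, 1/2)
# stationary distribution.
print(counts / counts.sum())
```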

  8. Outline: Recollections; General setup and basic properties; Recurrence and transience.

  9. Markov chains: general definition. Consider a measurable space (S, \mathcal{S}). A function p : S × \mathcal{S} → R is a transition probability if
     - for each x ∈ S, A ↦ p(x, A) is a probability measure on (S, \mathcal{S});
     - for each A ∈ \mathcal{S}, the map x ↦ p(x, A) is a measurable function.
  Say that X_n is a Markov chain w.r.t. \mathcal{F}_n with transition probability p if P(X_{n+1} ∈ B | \mathcal{F}_n) = p(X_n, B). How do we construct an infinite Markov chain? Choose p and an initial distribution μ on (S, \mathcal{S}). For each n < ∞ write

  P(X_j ∈ B_j, 0 ≤ j ≤ n) = \int_{B_0} μ(dx_0) \int_{B_1} p(x_0, dx_1) \cdots \int_{B_n} p(x_{n-1}, dx_n).

  Extend to n = ∞ by Kolmogorov's extension theorem.
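In this general setting the transition probability is a kernel rather than a matrix. A minimal sketch, assuming the illustrative kernel p(x, ·) = Normal(x, 1) on S = R (a Gaussian random walk) and a made-up initial distribution μ:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative kernel on S = R: p(x, .) = Normal(x, 1), a Gaussian random walk.
def sample_from_kernel(x):
    return rng.normal(loc=x, scale=1.0)

x = rng.uniform(-1.0, 1.0)  # X_0 drawn from a made-up initial distribution mu
path = [x]
for _ in range(10):
    x = sample_from_kernel(x)  # X_{n+1} ~ p(X_n, .)
    path.append(x)
print(path)
```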

  10. Markov chains. Definition, again: Say X_n is a Markov chain w.r.t. \mathcal{F}_n with transition probability p if P(X_{n+1} ∈ B | \mathcal{F}_n) = p(X_n, B). Construction, again: Fix an initial distribution μ on (S, \mathcal{S}). For each n < ∞ write

  P(X_j ∈ B_j, 0 ≤ j ≤ n) = \int_{B_0} μ(dx_0) \int_{B_1} p(x_0, dx_1) \cdots \int_{B_n} p(x_{n-1}, dx_n).

  Extend to n = ∞ by Kolmogorov's extension theorem. Notation: The extension produces a probability measure P_μ on the sequence space (S^{\{0,1,...\}}, \mathcal{S}^{\{0,1,...\}}). Theorem: (X_0, X_1, ...) chosen from P_μ is a Markov chain. Theorem: If X_n is any Markov chain with initial distribution μ and transition p, then its finite-dimensional probabilities are as above.

  11. Markov properties. Markov property: Take (Ω_0, \mathcal{F}) = (S^{\{0,1,...\}}, \mathcal{S}^{\{0,1,...\}}), let P_μ be the Markov chain measure and θ_n the shift operator on Ω_0 (shifts the sequence n units to the left, discarding elements shifted off the edge). If Y : Ω_0 → R is bounded and measurable, then E_μ(Y ∘ θ_n | \mathcal{F}_n) = E_{X_n} Y. Strong Markov property: Can replace n with an a.s. finite stopping time N, and the function Y can vary with time. Suppose that for each n, Y_n : Ω_n → R is measurable and |Y_n| ≤ M for all n. Then E_μ(Y_N ∘ θ_N | \mathcal{F}_N) = E_{X_N} Y_N, where the RHS means E_x Y_n evaluated at x = X_N, n = N.

  12. Properties. Property of infinite opportunities: Suppose X_n is a Markov chain and P(∪_{m=n+1}^{∞} {X_m ∈ B_m} | X_n) ≥ δ > 0 on {X_n ∈ A_n}. Then P({X_n ∈ A_n i.o.} − {X_n ∈ B_n i.o.}) = 0. Reflection principle: For symmetric random walks on R we have P(sup_{m ≤ n} S_m > a) ≤ 2 P(S_n > a). Proof idea: reflection picture.
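A quick Monte Carlo sanity check of the reflection inequality (an illustrative sketch, not from the slides; the walk length, level a, and trial count are made up):

```python
import numpy as np

rng = np.random.default_rng(3)
n, a, trials = 100, 10, 20_000

steps = rng.choice([-1, 1], size=(trials, n))  # symmetric +-1 walk
paths = steps.cumsum(axis=1)

lhs = (paths.max(axis=1) > a).mean()  # P(max_{m <= n} S_m > a)
rhs = 2 * (paths[:, -1] > a).mean()   # 2 P(S_n > a)
print(lhs, rhs)  # empirically lhs <= rhs, up to sampling noise
```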

  13. Outline: Recollections; General setup and basic properties; Recurrence and transience.

  14. Query. Interesting question: If A is an infinite probability transition matrix on a countable state space, what does the (infinite) matrix I + A + A^2 + A^3 + ... = (I − A)^{-1} represent (if the sum converges)? Question: Does its (x, y) entry describe the expected number of hits of y when starting at x? Is there a similar interpretation for other power series? How about e^A or e^{λA}? Related to the distribution after a Poisson random number of steps?
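For a substochastic matrix the sum does converge, and the visit-count interpretation can be checked directly (an illustrative sketch: simple random walk on {0, 1, 2, 3, 4} absorbed at 0 and 4, with A restricted to the transient states {1, 2, 3}):

```python
import numpy as np

# One-step transitions among the transient states {1, 2, 3} of a simple
# random walk on {0, ..., 4} with absorbing barriers at 0 and 4.
A = np.array([[0.0, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.0]])

# I + A + A^2 + ... = (I - A)^{-1}: entry (x, y) is the expected number
# of visits to y starting from x (the fundamental matrix).
N = np.linalg.inv(np.eye(3) - A)
print(N)
```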

  15. Recurrence. Consider the probability that a walk started from y ever returns to y. If this probability is 1, the walk returns to y infinitely often (each return gives a fresh start, by the strong Markov property); if it is less than 1, the number of returns is geometric, hence a.s. finite. Call y a recurrent state if we return to y infinitely often.
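A simulation sketch of this dichotomy (illustrative and made up: a ±1 walk on Z, with the return probability estimated only within a finite horizon):

```python
import numpy as np

rng = np.random.default_rng(4)

def returns_to_start(p_up, n_steps=2_000):
    """Does a +-1 walk on Z started at 0 revisit 0 within n_steps?"""
    s = 0
    for _ in range(n_steps):
        s += 1 if rng.random() < p_up else -1
        if s == 0:
            return True
    return False

trials = 1_000
for p in (0.5, 0.7):
    est = np.mean([returns_to_start(p) for _ in range(trials)])
    # Close to 1 for the symmetric (recurrent) walk; visibly below 1
    # for the biased (transient) walk.
    print(p, est)
```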

  16. MIT OpenCourseWare, http://ocw.mit.edu. 18.175 Theory of Probability, Spring 2014. For information about citing these materials or our Terms of Use, visit http://ocw.mit.edu/terms.
