Employment
State 1 (student) is transient:
P(X(j) ≠ 1 for all j > i | X(i) = 1) ≥ P(X(i + 1) = 3 | X(i) = 1) = 0.1 > 0
State 3 (employed) is recurrent:
P(X(j) ≠ 3 for all j > i | X(i) = 3) = P(X(j) = 4 for all j > i | X(i) = 3)
= lim_{k→∞} P(X(i + 1) = 4 | X(i) = 3) ∏_{j=1}^{k} P(X(i + j + 1) = 4 | X(i + j) = 4)
= lim_{k→∞} 0.1 · 0.6^k = 0
Irreducible Markov chain
A Markov chain is irreducible if for any states x and y ≠ x there exists m ≥ 0 such that
P(X(i + m) = y | X(i) = x) > 0
All states in an irreducible Markov chain are recurrent
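Not part of the original slides: a minimal NumPy sketch of how irreducibility can be checked numerically for a finite chain, using the fact that every state must be reachable from every other state in at most s − 1 steps. The helper name is_irreducible is made up, and T is assumed to be column-stochastic, i.e. T[k, j] = P(X(i + 1) = k | X(i) = j).

```python
import numpy as np

def is_irreducible(T, tol=1e-12):
    """Check whether the column-stochastic transition matrix T is irreducible.

    The chain is irreducible iff sum_{m=0}^{s-1} T^m has strictly positive
    entries, i.e. every state is reachable from every other state.
    """
    s = T.shape[0]
    reach = np.eye(s)
    power = np.eye(s)
    for _ in range(s - 1):
        power = T @ power
        reach += power
    return bool(np.all(reach > tol))

# A chain with an absorbing state is reducible; a fully connected one is not.
T_reducible = np.array([[1.0, 0.5],
                        [0.0, 0.5]])
T_irreducible = np.array([[0.9, 0.2],
                          [0.1, 0.8]])
print(is_irreducible(T_reducible))    # False
print(is_irreducible(T_irreducible))  # True
```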
Periodicity
Period of a state
The period m of a state x of a Markov chain X is the largest integer such that the chain always takes km steps (for some positive integer k) to return to x
Period of a state
[State diagram with states A, B, and C and transition probabilities 1, 0.9, 0.1, and 1]
Aperiodic chain
A Markov chain X is aperiodic if all states have period equal to one
Convergence
Convergence in distribution
A Markov chain converges in distribution if the state vector converges to a constant vector:
p_∞ := lim_{i→∞} p_X(i) = lim_{i→∞} T_X^i p_X(0)
Mobile phones
◮ Company releases a new mobile-phone model
◮ At the moment 90% of the phones are in stock, 10% have been sold locally and none have been exported
◮ Each day a phone that is in stock is sold with probability 0.2 and exported with probability 0.1
◮ Initial state vector and transition matrix (states ordered: in stock, sold, exported):
a := (0.9, 0.1, 0)^T
T_X :=
[ 0.7  0  0 ]
[ 0.2  1  0 ]
[ 0.1  0  1 ]
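Not from the slides: a short NumPy sketch of the day-by-day evolution of the state vector, assuming the column-stochastic convention p_X(i) = T_X p_X(i − 1) and the state order (in stock, sold, exported).

```python
import numpy as np

# States ordered: in stock, sold, exported
a = np.array([0.9, 0.1, 0.0])            # initial state vector
T = np.array([[0.7, 0.0, 0.0],           # column = current state,
              [0.2, 1.0, 0.0],           # row = next state
              [0.1, 0.0, 1.0]])

p = a
for day in range(1, 21):
    p = T @ p                            # p_X(i) = T_X p_X(i - 1)
    if day in (1, 5, 20):
        print(day, np.round(p, 3))       # the in-stock mass shrinks by a factor 0.7 per day
```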
Mobile phones
[State diagram: In stock stays with probability 0.7, moves to Sold with probability 0.2 and to Exported with probability 0.1; Sold and Exported are absorbing (probability 1)]
Mobile phones
[Simulated evolution of the chain over 20 days (states: In stock, Sold, Exported)]
Mobile phones
The company wants to know how many phones are eventually sold locally and how many exported:
lim_{i→∞} p_X(i) = lim_{i→∞} T_X^i p_X(0) = lim_{i→∞} T_X^i a
Mobile phones
The transition matrix T_X has three eigenvectors
q_1 := (0, 0, 1)^T,  q_2 := (0, 1, 0)^T,  q_3 := (0.80, −0.53, −0.27)^T
The corresponding eigenvalues are λ_1 := 1, λ_2 := 1 and λ_3 := 0.7
Eigendecomposition of T_X:
T_X := Q Λ Q^{-1},  Q := [ q_1  q_2  q_3 ],  Λ := diag(λ_1, λ_2, λ_3)
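A short sketch, not from the slides, of how this eigendecomposition can be reproduced with NumPy; note that np.linalg.eig may return the eigenvectors in a different order, sign and scaling than the ones quoted above.

```python
import numpy as np

T = np.array([[0.7, 0.0, 0.0],
              [0.2, 1.0, 0.0],
              [0.1, 0.0, 1.0]])

eigvals, Q = np.linalg.eig(T)            # eigenvectors are the columns of Q
print(np.round(eigvals, 3))              # 1, 1 and 0.7 (possibly reordered)

# Verify the decomposition T = Q Lambda Q^{-1}
Lam = np.diag(eigvals)
print(np.allclose(T, Q @ Lam @ np.linalg.inv(Q)))   # True
```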
Mobile phones
We express the initial state vector a in terms of the eigenvectors:
Q^{-1} p_X(0) = Q^{-1} a = (0.3, 0.7, 1.122)^T
so that a = 0.3 q_1 + 0.7 q_2 + 1.122 q_3
Mobile phones
lim_{i→∞} T_X^i a = lim_{i→∞} T_X^i (0.3 q_1 + 0.7 q_2 + 1.122 q_3)
= lim_{i→∞} (0.3 T_X^i q_1 + 0.7 T_X^i q_2 + 1.122 T_X^i q_3)
= lim_{i→∞} (0.3 λ_1^i q_1 + 0.7 λ_2^i q_2 + 1.122 λ_3^i q_3)
= lim_{i→∞} (0.3 q_1 + 0.7 q_2 + 1.122 · 0.7^i q_3)
= 0.3 q_1 + 0.7 q_2
= (0, 0.7, 0.3)^T
Mobile phones
[Plot of the state-vector entries over 20 days, starting from a and converging to In stock 0, Sold 0.7, Exported 0.3]
Mobile phones
In general,
lim_{i→∞} T_X^i p_X(0) = (0, (Q^{-1} p_X(0))_2, (Q^{-1} p_X(0))_1)^T
For example,
b := (0.6, 0, 0.4)^T,  Q^{-1} b = (0.6, 0.4, 0.75)^T
c := (0.4, 0.5, 0.1)^T,  Q^{-1} c = (0.23, 0.77, 0.50)^T
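Not in the original: a quick numerical check of these limits obtained by iterating the transition matrix, which avoids committing to any particular scaling of the eigenvectors. The helper name limit_distribution is made up.

```python
import numpy as np

T = np.array([[0.7, 0.0, 0.0],
              [0.2, 1.0, 0.0],
              [0.1, 0.0, 1.0]])

def limit_distribution(p0, steps=200):
    """Approximate lim_i T^i p0 by repeated multiplication."""
    p = np.asarray(p0, dtype=float)
    for _ in range(steps):
        p = T @ p
    return p

b = np.array([0.6, 0.0, 0.4])
c = np.array([0.4, 0.5, 0.1])
print(np.round(limit_distribution(b), 3))   # [0, 0.4, 0.6]
print(np.round(limit_distribution(c), 3))   # approx [0, 0.767, 0.233]
```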
Initial state vectors b and c
[Plots of the state-vector entries over 20 days for the initial state vectors b and c]
Stationary distribution
p_stat is a stationary distribution of X if
T_X p_stat = p_stat
p_stat is an eigenvector of T_X with eigenvalue equal to one
If p_stat is the initial state vector, then
lim_{i→∞} p_X(i) = p_stat
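Not part of the slides: a minimal sketch of extracting a stationary distribution as the eigenvector with eigenvalue one, rescaled so that its entries sum to one. The helper name stationary_distribution is an assumption, and T is taken to be column-stochastic.

```python
import numpy as np

def stationary_distribution(T):
    """Return a stationary distribution of a column-stochastic matrix T."""
    eigvals, eigvecs = np.linalg.eig(T)
    idx = np.argmin(np.abs(eigvals - 1.0))   # eigenvalue closest to one
    v = np.real(eigvecs[:, idx])
    return v / v.sum()                       # rescale to a valid pmf

T = np.array([[0.9, 0.2],
              [0.1, 0.8]])
p_stat = stationary_distribution(T)
print(np.round(p_stat, 3))                   # [0.667, 0.333]
print(np.allclose(T @ p_stat, p_stat))       # True
```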
Reversibility
Let X(i) be distributed according to a state vector p ∈ R^s (s = number of states)
X is reversible with respect to p if
P(X(i) = x_j, X(i + 1) = x_k) = P(X(i) = x_k, X(i + 1) = x_j)   for all 1 ≤ j, k ≤ s
This is equivalent to the detailed-balance condition
(T_X)_{kj} p_j = (T_X)_{jk} p_k   for all 1 ≤ j, k ≤ s
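A small numerical illustration, not from the slides, of the detailed-balance condition; the helper name is_reversible is made up, and the two-state chain below is chosen only as an example.

```python
import numpy as np

def is_reversible(T, p, tol=1e-10):
    """Check detailed balance: T[k, j] p[j] == T[j, k] p[k] for all j, k."""
    flux = T * p                      # flux[k, j] = T[k, j] * p[j]
    return bool(np.allclose(flux, flux.T, atol=tol))

T = np.array([[0.9, 0.2],
              [0.1, 0.8]])
p = np.array([2 / 3, 1 / 3])          # stationary distribution of T
print(is_reversible(T, p))            # True: a two-state chain is reversible w.r.t. its stationary distribution
```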
Reversibility implies stationarity
The detailed-balance condition provides a sufficient condition for stationarity:
if X is reversible with respect to p, then p is a stationary distribution of X, since
(T_X p)_j = Σ_{k=1}^{s} (T_X)_{jk} p_k
= Σ_{k=1}^{s} (T_X)_{kj} p_j
= p_j Σ_{k=1}^{s} (T_X)_{kj}
= p_j
where the last step uses the fact that the columns of T_X sum to one
Irreducible chains
Irreducible Markov chains have a single stationary distribution
Follows from the Perron-Frobenius theorem:
◮ The transition matrix of an irreducible Markov chain has a single eigenvector with eigenvalue equal to one
◮ The eigenvector has nonnegative entries
Irreducible chains
If X is irreducible and aperiodic, its state vector converges to the stationary distribution p_stat for any initial state vector p_X(0):
X converges in distribution to a random variable with pmf given by p_stat
Car rental
Aim: Model location of cars
3 states: San Francisco, Los Angeles, San Jose
New cars are uniformly distributed among the 3 states
After that the transition probabilities are (columns: current location, rows: next location)

                 From SF  From LA  From SJ
San Francisco      0.6      0.1      0.3
Los Angeles        0.2      0.8      0.3
San Jose           0.2      0.1      0.4
Car rental What is the proportion of cars in each city eventually? Does this depend on the initial allocation?
Car rental
Markov chain with
p_X(0) := (1/3, 1/3, 1/3)^T
T_X :=
[ 0.6  0.1  0.3 ]
[ 0.2  0.8  0.3 ]
[ 0.2  0.1  0.4 ]
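Not part of the slides: a quick NumPy check that the long-run allocation is the same for several different initial allocations (column-stochastic convention, states ordered SF, LA, SJ).

```python
import numpy as np

# Columns: current city, rows: next city; order (SF, LA, SJ)
T = np.array([[0.6, 0.1, 0.3],
              [0.2, 0.8, 0.3],
              [0.2, 0.1, 0.4]])

for p0 in (np.array([1 / 3, 1 / 3, 1 / 3]),   # uniform initial allocation
           np.array([1.0, 0.0, 0.0]),         # all cars start in SF
           np.array([0.0, 0.0, 1.0])):        # all cars start in SJ
    p = p0
    for _ in range(100):
        p = T @ p
    print(np.round(p, 3))                     # approx [0.273, 0.545, 0.182] every time
```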
Car rental
[State diagram with states SF, LA and SJ and the transition probabilities above]
Car rental
The transition matrix has the following eigenvectors
q_1 := (0.273, 0.545, 0.182)^T,  q_2 := (−0.577, 0.789, −0.211)^T,  q_3 := (−0.577, −0.211, 0.789)^T
The eigenvalues are λ_1 := 1, λ_2 := 0.573 and λ_3 := 0.227
No matter how the cars are allocated, 27.3% end up in San Francisco, 54.5% in LA and 18.2% in San Jose
Car rental
[Plots of the state vector (SF, LA, SJ) over 20 customers]
Markov-chain Monte Carlo
Irreducible aperiodic Markov chains converge to a unique stationary distribution
Basic idea: simulate a Markov chain that converges to the target distribution
Very useful in Bayesian statistics
Main challenge: designing the Markov chain so that its stationary distribution is the one we want
Metropolis-Hastings algorithm
Aim: construct a Markov chain such that its stationary distribution is p ∈ R^s, where
p_j := p_X(x_j),  1 ≤ j ≤ s
Idea: sample from an irreducible Markov chain with transition matrix T on the same state space { x_1, ..., x_s }, forcing it to converge to p
Metropolis-Hastings algorithm
Initialize X(0) to an arbitrary value, then for i = 1, 2, 3, ...
1. Generate C from X(i − 1) by using T, i.e.
P(C = k | X(i − 1) = j) = T_{kj},  1 ≤ j, k ≤ s
2. Set
X(i) := C with probability p_acc(X(i − 1), C)
X(i) := X(i − 1) otherwise
where the acceptance probability is defined as
p_acc(j, k) := min( T_{jk} p_k / (T_{kj} p_j), 1 ),  1 ≤ j, k ≤ s
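A minimal sketch of the algorithm above on a finite state space; this is not code from the slides, and the uniform proposal matrix and the target pmf below are arbitrary choices for illustration. Because this proposal is symmetric, the acceptance probability reduces to min(p_k / p_j, 1), but the code keeps the general ratio.

```python
import numpy as np

def metropolis_hastings(p_target, T_prop, n_steps, rng=None):
    """Simulate a Metropolis-Hastings chain on states {0, ..., s-1}.

    p_target : (possibly unnormalized) target probabilities, length s
    T_prop   : column-stochastic proposal matrix, T_prop[k, j] = P(C = k | X = j)
    """
    rng = np.random.default_rng(rng)
    s = len(p_target)
    x = rng.integers(s)                      # arbitrary initialization
    samples = np.empty(n_steps, dtype=int)
    for i in range(n_steps):
        c = rng.choice(s, p=T_prop[:, x])    # step 1: propose C given the current state
        # step 2: accept with probability min(T_jk p_k / (T_kj p_j), 1)
        p_acc = min(T_prop[x, c] * p_target[c] / (T_prop[c, x] * p_target[x]), 1.0)
        if rng.random() < p_acc:
            x = c
        samples[i] = x
    return samples

# Target p = (0.2, 0.5, 0.3) with a uniform proposal over the 3 states
p = np.array([0.2, 0.5, 0.3])
T_prop = np.full((3, 3), 1 / 3)
samples = metropolis_hastings(p, T_prop, n_steps=100_000, rng=0)
print(np.round(np.bincount(samples, minlength=3) / len(samples), 2))  # approx [0.2, 0.5, 0.3]
```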
Reversibility implies stationarity
Let X(i) be distributed according to a state vector p ∈ R^s
X is reversible with respect to p if for all 1 ≤ j, k ≤ s
P(X(i) = x_j, X(i + 1) = x_k) = P(X(i) = x_k, X(i + 1) = x_j)
Equivalent to the detailed-balance condition
(T_X)_{kj} p_j = (T_X)_{jk} p_k,  for all 1 ≤ j, k ≤ s
If X is reversible with respect to p, then p is a stationary distribution of X
Reversibility of the Metropolis-Hastings chain
Detailed balance holds trivially if j = k. Assume j ≠ k:
(T_X)_{kj} := P(X(i) = k | X(i − 1) = j)
= P(X(i) = C, C = k | X(i − 1) = j)
= P(X(i) = C | C = k, X(i − 1) = j) P(C = k | X(i − 1) = j)
= p_acc(j, k) T_{kj}
Similarly, (T_X)_{jk} = p_acc(k, j) T_{jk}
Reversibility of the Metropolis-Hastings chain
(T_X)_{kj} p_j = p_acc(j, k) T_{kj} p_j
= T_{kj} p_j min( T_{jk} p_k / (T_{kj} p_j), 1 )
= min( T_{jk} p_k, T_{kj} p_j )
= T_{jk} p_k min( 1, T_{kj} p_j / (T_{jk} p_k) )
= p_acc(k, j) T_{jk} p_k
= (T_X)_{jk} p_k
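The detailed-balance argument can also be checked numerically (not in the slides): build the effective transition matrix of the Metropolis-Hastings chain from the proposal T and the acceptance probabilities, and verify that (T_X)_{kj} p_j is symmetric in j and k. The helper name mh_transition_matrix and the example proposal are assumptions.

```python
import numpy as np

def mh_transition_matrix(p, T):
    """Effective (column-stochastic) transition matrix of the MH chain.

    p : target pmf (length s), T : column-stochastic proposal matrix.
    """
    s = len(p)
    T_mh = np.zeros((s, s))
    for j in range(s):
        for k in range(s):
            if k != j:
                p_acc = min(T[j, k] * p[k] / (T[k, j] * p[j]), 1.0)
                T_mh[k, j] = p_acc * T[k, j]
        T_mh[j, j] = 1.0 - T_mh[:, j].sum()   # rejected proposals stay at j
    return T_mh

p = np.array([0.2, 0.5, 0.3])
T = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
T_mh = mh_transition_matrix(p, T)
flux = T_mh * p                               # flux[k, j] = (T_X)_{kj} p_j
print(np.allclose(flux, flux.T))              # True: detailed balance holds
print(np.allclose(T_mh @ p, p))               # True: p is stationary
```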
Generating a Poisson random variable
Aim: generate a Poisson random variable X
We don't need to know the normalizing constant, just that
p_X(x) ∝ λ^x / x!
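A hedged sketch, not from the slides, of a Metropolis sampler for the Poisson target: the symmetric ±1 random-walk proposal is an assumption made for illustration. Only the ratio p_X(x + 1) / p_X(x) = λ / (x + 1) is needed, so neither the normalizing constant nor any factorial ever has to be computed.

```python
import numpy as np

def poisson_mh(lam, n_steps, rng=None):
    """Metropolis sampler for a Poisson(lam) target, using only p(x) ∝ lam^x / x!.

    Proposal: symmetric +/-1 random walk; moves below zero are always rejected
    because the target puts no mass there.
    """
    rng = np.random.default_rng(rng)
    x = 0                                   # arbitrary starting point
    samples = np.empty(n_steps, dtype=int)
    for i in range(n_steps):
        c = x + rng.choice((-1, 1))
        if c < 0:
            ratio = 0.0                     # p(c) = 0 outside {0, 1, 2, ...}
        elif c == x + 1:
            ratio = lam / (x + 1)           # p(x + 1) / p(x) = lam / (x + 1)
        else:
            ratio = x / lam                 # p(x - 1) / p(x) = x / lam
        if rng.random() < min(ratio, 1.0):
            x = c
        samples[i] = x
    return samples

samples = poisson_mh(lam=4.0, n_steps=100_000, rng=0)
print(np.round(samples.mean(), 1))          # approx 4.0, the mean of Poisson(4)
```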