DPM in Applications
Yong Song
University of Melbourne, Department of Economics
BAM

Outline: Dirichlet Process, DPM, MCMC Application, DPM Application, Research Potential

Today

Extract from the forthcoming 3-day BAM short course in Nov 2016.

Definition of DP (Ferguson 1973)

Definition (Dirichlet Process). The Dirichlet process over a set $\Omega$ is a stochastic process whose sample path is a probability distribution over $\Omega$. For a random distribution $F$ distributed according to a Dirichlet process $\mathrm{DP}(\alpha, G_0)$, given any finite measurable partition $A_1, A_2, \cdots, A_K$ of the sample space $\Omega$, the random vector $(F(A_1), \cdots, F(A_K))$ is distributed as a Dirichlet distribution with parameters $(\alpha G_0(A_1), \cdots, \alpha G_0(A_K))$.

Partition Example

[Figure: a partition of the sample space into sets $A_1, \ldots, A_7$]

Interpretation

A random draw from $\mathrm{DP}(\alpha, G_0)$ is a distribution $F$ over $\Omega$ (the real line in the above figure).

For the above partition, the probability measure is random:
\[
(F(A_1), F(A_2), \ldots, F(A_7)) \sim \mathrm{Dir}\bigl(\alpha G_0(A_1), \alpha G_0(A_2), \ldots, \alpha G_0(A_7)\bigr)
\]

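As a rough numerical illustration of the definition, the sketch below draws the vector $(F(A_1), \ldots, F(A_7))$ for one partition of the real line into seven intervals. The standard normal base distribution $G_0$, the value $\alpha = 5$, the cut points, and all function names are assumptions made for this example; none of them come from the slides.

```python
import math
import numpy as np

def normal_cdf(x):
    """CDF of the standard normal, used here as an assumed base distribution G0."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def partition_base_probs(cuts):
    """G0(A_k) for the intervals (-inf, c1], (c1, c2], ..., (cm, inf)."""
    cdf = [0.0] + [normal_cdf(c) for c in cuts] + [1.0]
    return np.diff(cdf)

def draw_partition_probs(alpha, cuts, rng):
    """One draw of (F(A_1), ..., F(A_K)) ~ Dir(alpha*G0(A_1), ..., alpha*G0(A_K))."""
    return rng.dirichlet(alpha * partition_base_probs(cuts))

rng = np.random.default_rng(0)
cuts = [-2.0, -1.0, -0.3, 0.3, 1.0, 2.0]   # six cut points give seven sets
g0_probs = partition_base_probs(cuts)
f_probs = draw_partition_probs(5.0, cuts, rng)
print(g0_probs.round(3))   # base probabilities G0(A_k)
print(f_probs.round(3))    # one random realisation of F(A_k)
```

Repeating the draw gives a different vector each time; its mean across draws is $G_0(A_k)$, and a larger $\alpha$ concentrates the draws more tightly around that mean.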
Conjugacy

Suppose that we observe a random draw $\theta$ from distribution $F$, where $F \sim \mathrm{DP}(\alpha, G_0)$. What is the posterior distribution of $F$?

For any partition $A_1, \ldots, A_K$, what is the posterior of $[F(A_1), \ldots, F(A_K)]$ conditional on $\theta$?

Conjugacy

Prior:
\[
[F(A_1), \ldots, F(A_K)] \sim \mathrm{Dir}\bigl(\alpha G_0(A_1), \alpha G_0(A_2), \ldots, \alpha G_0(A_K)\bigr)
\]

Likelihood:
\[
p(\theta \mid [F(A_1), \ldots, F(A_K)]) = F(A_1)^{\delta_\theta(A_1)} F(A_2)^{\delta_\theta(A_2)} \cdots F(A_K)^{\delta_\theta(A_K)},
\]
where $\delta_\theta(A) = 1$ if $\theta \in A$ and $0$ otherwise.

Conjugacy

Posterior kernel:
\[
\begin{aligned}
p([F(A_1), \ldots, F(A_K)] \mid \theta)
&\propto p(\theta \mid [F(A_1), \ldots, F(A_K)]) \, p([F(A_1), \ldots, F(A_K)]) \\
&\propto F(A_1)^{\delta_\theta(A_1)} F(A_2)^{\delta_\theta(A_2)} \cdots F(A_K)^{\delta_\theta(A_K)} \\
&\quad \times F(A_1)^{\alpha G_0(A_1) - 1} F(A_2)^{\alpha G_0(A_2) - 1} \cdots F(A_K)^{\alpha G_0(A_K) - 1} \\
&\propto F(A_1)^{\alpha G_0(A_1) + \delta_\theta(A_1) - 1} F(A_2)^{\alpha G_0(A_2) + \delta_\theta(A_2) - 1} \cdots F(A_K)^{\alpha G_0(A_K) + \delta_\theta(A_K) - 1}
\end{aligned}
\]

Conjugacy

\[
[F(A_1), \ldots, F(A_K)] \mid \theta \sim \mathrm{Dir}\bigl(\alpha G_0(A_1) + \delta_\theta(A_1), \alpha G_0(A_2) + \delta_\theta(A_2), \ldots, \alpha G_0(A_K) + \delta_\theta(A_K)\bigr)
\]

This formula applies to ANY finite partition.

We can write a new concentration parameter $\tilde\alpha = \alpha + 1$ and a new shape parameter
\[
\tilde G_0 = \frac{\alpha}{\alpha + 1} G_0 + \frac{1}{\alpha + 1} \delta_\theta .
\]

Conjugacy

\[
F \mid \theta \sim \mathrm{DP}(\tilde\alpha, \tilde G_0)
\]
with $\tilde\alpha = \alpha + 1$ and $\tilde G_0 = \frac{\alpha}{\alpha + 1} G_0 + \frac{1}{\alpha + 1} \delta_\theta$.

Conjugacy

Suppose that you observe $n$ observations, denoted by $\Theta = (\theta_1, \ldots, \theta_n)$, drawn from $F$. For any finite partition $A_1, \ldots, A_K$, what is the posterior of $[F(A_1), \ldots, F(A_K)]$?

Prior (same):
\[
[F(A_1), \ldots, F(A_K)] \sim \mathrm{Dir}\bigl(\alpha G_0(A_1), \ldots, \alpha G_0(A_K)\bigr)
\]

Likelihood:
\[
p(\Theta \mid [F(A_1), \ldots, F(A_K)]) = F(A_1)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_1)} F(A_2)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_2)} \cdots F(A_K)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_K)},
\]
where $\sum_{i=1}^{n} \delta_{\theta_i}(A_j)$ counts the number of $\theta_i$'s that fall in the set $A_j$.

Conjugacy

Posterior kernel:
\[
\begin{aligned}
p([F(A_1), \ldots, F(A_K)] \mid \Theta)
&\propto p(\Theta \mid [F(A_1), \ldots, F(A_K)]) \, p([F(A_1), \ldots, F(A_K)]) \\
&\propto F(A_1)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_1)} F(A_2)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_2)} \cdots F(A_K)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_K)} \\
&\quad \times F(A_1)^{\alpha G_0(A_1) - 1} F(A_2)^{\alpha G_0(A_2) - 1} \cdots F(A_K)^{\alpha G_0(A_K) - 1} \\
&\propto F(A_1)^{\alpha G_0(A_1) + \sum_{i=1}^{n} \delta_{\theta_i}(A_1) - 1} F(A_2)^{\alpha G_0(A_2) + \sum_{i=1}^{n} \delta_{\theta_i}(A_2) - 1} \cdots F(A_K)^{\alpha G_0(A_K) + \sum_{i=1}^{n} \delta_{\theta_i}(A_K) - 1}
\end{aligned}
\]

Conjugacy

\[
[F(A_1), \ldots, F(A_K)] \mid \Theta \sim \mathrm{Dir}\Bigl(\alpha G_0(A_1) + \sum_{i=1}^{n} \delta_{\theta_i}(A_1), \ldots, \alpha G_0(A_K) + \sum_{i=1}^{n} \delta_{\theta_i}(A_K)\Bigr)
\]

This formula applies to ANY finite partition.

We can write a new concentration parameter $\tilde\alpha = \alpha + n$ and a new shape parameter
\[
\tilde G_0 = \frac{\alpha}{\alpha + n} G_0 + \frac{1}{\alpha + n} \sum_{i=1}^{n} \delta_{\theta_i} .
\]

Conjugacy

\[
F \mid \Theta \sim \mathrm{DP}(\tilde\alpha, \tilde G_0)
\]
with $\tilde\alpha = \alpha + n$ and $\tilde G_0 = \frac{\alpha}{\alpha + n} G_0 + \frac{1}{\alpha + n} \sum_{i=1}^{n} \delta_{\theta_i}$.

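The conjugate update on a fixed partition amounts to adding the count of observations in each set to the prior Dirichlet parameters. The sketch below works this through for a small case, assuming $G_0$ is Uniform(0, 1), $\alpha = 2$, four equal-width sets, and five made-up observations; the function name and the data are hypothetical.

```python
import numpy as np

def posterior_dirichlet_params(alpha, g0_probs, theta, cuts):
    """Posterior parameters alpha * G0(A_k) + #{i : theta_i in A_k} for each set A_k."""
    # searchsorted gives 0 for theta <= cuts[0], and k for cuts[k-1] < theta <= cuts[k]
    set_index = np.searchsorted(cuts, theta, side="left")
    counts = np.bincount(set_index, minlength=len(g0_probs))
    return alpha * np.asarray(g0_probs, dtype=float) + counts

alpha = 2.0
cuts = [0.25, 0.5, 0.75]               # four sets partitioning [0, 1]
g0_probs = [0.25, 0.25, 0.25, 0.25]    # G0 = Uniform(0, 1), an assumed base distribution
theta = [0.1, 0.2, 0.6, 0.9, 0.95]     # n = 5 hypothetical observations

post_params = posterior_dirichlet_params(alpha, g0_probs, theta, cuts)
post_alpha = post_params.sum()         # new concentration parameter, alpha + n
post_base = post_params / post_alpha   # new shape parameter evaluated on each A_k
print(post_params, post_alpha, post_base)
```

The parameters sum to $\alpha + n = 7$, and dividing by that sum gives the updated shape parameter on each set, which is a mix of the prior probabilities $G_0(A_k)$ and the observed proportions.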
Conjugacy

Shape parameter:
\[
\begin{aligned}
\tilde G_0 &= \frac{\alpha}{\alpha + n} G_0 + \frac{1}{\alpha + n} \sum_{i=1}^{n} \delta_{\theta_i} \\
&= \frac{\alpha}{\alpha + n} G_0 + \frac{n}{\alpha + n} \times \frac{1}{n} \sum_{i=1}^{n} \delta_{\theta_i}
\end{aligned}
\]

Notice that $\frac{1}{n} \sum_{i=1}^{n} \delta_{\theta_i}$ is the empirical distribution of the observations.

What if $n \to \infty$?

Conjugacy

\[
F \mid \Theta \sim \mathrm{DP}(\tilde\alpha, \tilde G_0)
\]
with $\tilde\alpha = \alpha + n$ and $\tilde G_0 = \frac{\alpha}{\alpha + n} G_0 + \frac{n}{\alpha + n} \cdot \frac{1}{n} \sum_{i=1}^{n} \delta_{\theta_i}$.

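One way to see what happens as $n \to \infty$ is to compare the posterior shape parameter, written $\tilde G_0$ here, with the empirical distribution on an arbitrary measurable set $A$. The symbol $\hat F_n$ for the empirical distribution is introduced only for this sketch.

```latex
\[
\hat F_n = \frac{1}{n} \sum_{i=1}^{n} \delta_{\theta_i},
\qquad
\tilde G_0 = \frac{\alpha}{\alpha + n} G_0 + \frac{n}{\alpha + n} \hat F_n
\]
\[
\bigl| \tilde G_0(A) - \hat F_n(A) \bigr|
= \frac{\alpha}{\alpha + n} \bigl| G_0(A) - \hat F_n(A) \bigr|
\le \frac{\alpha}{\alpha + n} \longrightarrow 0
\quad \text{as } n \to \infty
\]
\[
E[F(A) \mid \Theta]
= \frac{\alpha G_0(A) + \sum_{i=1}^{n} \delta_{\theta_i}(A)}{\alpha + n}
= \tilde G_0(A)
\]
```

The last line is the mean of the posterior Dirichlet distribution applied to the partition $\{A, A^c\}$. So for fixed $\alpha$ the weight on the prior guess $G_0$ vanishes at rate $\alpha / (\alpha + n)$, and the posterior mean of $F$ approaches the empirical distribution.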
Conjugacy

1. DP is conjugate to discrete distributions (finite or infinite).
2. Each draw $F$ from DP is a distribution.
3. Each draw $F$ from DP is a discrete distribution.

These properties make DP suitable as a prior for the parameters in an infinite mixture model.

Distinct values

We can use $\theta^*_j$ to represent the distinct values of $\theta_i$ for $i = 1, \ldots, n$. For instance, if there are $M$ distinct values of $\theta_i$, we have $\Theta^* = (\theta^*_1, \ldots, \theta^*_M)$.

To link $\Theta$ to $\Theta^*$, we follow West et al. [1994] and Escobar and West [1995] and use a link function $c_i$ ($c$ means choice). If $c_i = j$, then $\theta_i = \theta^*_j$. More concisely, $\theta_i = \theta^*_{c_i}$.

Distinct values

We can write the conditional posterior of $F \mid \Theta$ in terms of distinct values as
\[
F \mid \Theta \sim \mathrm{DP}(\tilde\alpha, \tilde G_0)
\]
with $\tilde\alpha = \alpha + n$ and
\[
\tilde G_0 = \frac{\alpha}{\alpha + n} G_0 + \frac{n}{\alpha + n} \sum_{j=1}^{M} \frac{n_j}{n} \delta_{\theta^*_j} .
\]
$n_j$ is the number of $\theta_i$'s that take the value $\theta^*_j$.

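The bookkeeping behind the distinct values, the link function, and the counts $n_j$ can be sketched with `numpy.unique`. The data and $\alpha = 1$ below are made up, the function names are hypothetical, and the labels are 0-based and ordered by sorted value, which is one arbitrary labelling among many.

```python
import numpy as np

def distinct_values(theta):
    """Return (theta_star, c, n_j) such that theta[i] == theta_star[c[i]]."""
    theta = np.asarray(theta)
    theta_star, c, n_j = np.unique(theta, return_inverse=True, return_counts=True)
    return theta_star, c, n_j

def posterior_weights(alpha, n_j):
    """Weight on G0 and on each atom at theta*_j in the posterior shape parameter."""
    n = n_j.sum()
    return alpha / (alpha + n), n_j / (alpha + n)

theta = [1.5, -0.2, 1.5, 3.0, -0.2, 1.5]   # n = 6 draws with M = 3 distinct values
theta_star, c, n_j = distinct_values(theta)
w_g0, w_atoms = posterior_weights(1.0, n_j)

print(theta_star)      # distinct values theta*_j
print(c)               # link function: theta_i = theta*_{c_i}
print(n_j)             # number of theta_i equal to each theta*_j
print(w_g0, w_atoms)   # alpha/(alpha+n) and n_j/(alpha+n)
```

The atom weights $n_j / (\alpha + n)$ are the product $\frac{n}{\alpha + n} \cdot \frac{n_j}{n}$ from the formula, so together with the weight on $G_0$ they sum to one.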
Stick Breaking Representation (Sethuraman 1994)

$F \equiv (w_k, \theta_k)_{k=1}^{\infty} \sim \mathrm{DP}(\alpha, G_0)$, that is, $F = \sum_{k=1}^{\infty} w_k \delta_{\theta_k}$, can be generated by
\[
\begin{aligned}
\theta_k &\sim G_0 && \text{for } k = 1, 2, \cdots \\
V_k &\sim \mathrm{Beta}(1, \alpha) && \text{for } k = 1, 2, \cdots \\
w_k &= V_k \prod_{j=1}^{k-1} (1 - V_j) && \text{for } k = 1, 2, \cdots
\end{aligned}
\]

The generation of $\{w_k\}$ is called the stick-breaking process, written simply as $w \sim \mathrm{SBP}(\alpha)$.

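A minimal sketch of the stick-breaking construction follows. Because the sequence is infinite, the code keeps only the first `num_sticks` terms, which is an approximation introduced here rather than part of the representation; the standard normal $G_0$, $\alpha = 2$, and the function names are likewise assumptions.

```python
import numpy as np

def stick_breaking_weights(alpha, num_sticks, rng):
    """First num_sticks weights of w ~ SBP(alpha): w_k = V_k * prod_{j<k} (1 - V_j)."""
    v = rng.beta(1.0, alpha, size=num_sticks)
    # remaining[k] is the stick length left before the k-th break (1 for the first)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

def draw_truncated_dp(alpha, num_sticks, rng):
    """Weights and atoms of a truncated draw F = sum_k w_k * delta_{theta_k}."""
    w = stick_breaking_weights(alpha, num_sticks, rng)
    theta = rng.standard_normal(num_sticks)   # theta_k ~ G0, assumed N(0, 1)
    return w, theta

rng = np.random.default_rng(1)
w, theta = draw_truncated_dp(alpha=2.0, num_sticks=50, rng=rng)
print(w[:5].round(3))   # the first few pieces of the stick
print(1.0 - w.sum())    # mass left in the unbroken remainder
```

A smaller $\alpha$ makes each $V_k$ larger on average, so most of the mass sits on the first few atoms; a larger $\alpha$ spreads it over many atoms and needs a longer truncation.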
Stick Breaking Process

[Figure: a stick broken into successive pieces labelled $\pi_1, \pi_2, \pi_3, \pi_4, \ldots$]

Dirichlet Process Mixture (DPM)

West et al. [1994] and Escobar and West [1995] proposed the Dirichlet process mixture model.

DPM:
\[
\begin{aligned}
w &\sim \mathrm{SBP}(\alpha) \\
\theta_k &\sim G_0 \quad \text{for } k = 1, 2, \cdots \\
y &\sim \sum_{k=1}^{\infty} w_k f(y \mid \theta_k)
\end{aligned}
\]

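To make the generative model concrete, the sketch below simulates data from a truncated version of the DPM. Everything specific is an assumption chosen for illustration: the kernel $f(y \mid \theta_k)$ is taken to be $N(\theta_k, \sigma^2)$ with known $\sigma$, $G_0$ is $N(0, 3^2)$, the infinite sum is cut at `num_sticks` components with the weights renormalised, and `sample_dpm` is a hypothetical name.

```python
import numpy as np

def sample_dpm(n, alpha, num_sticks, sigma, rng):
    """Simulate n observations from a truncated DPM with a normal kernel."""
    # Stick-breaking weights, w_k = V_k * prod_{j<k} (1 - V_j)
    v = rng.beta(1.0, alpha, size=num_sticks)
    w = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    w = w / w.sum()   # renormalise so the truncated weights sum to one
    # Component parameters theta_k ~ G0, assumed N(0, 3^2)
    theta = rng.normal(0.0, 3.0, size=num_sticks)
    # Pick a component for each observation, then draw y from its kernel
    k = rng.choice(num_sticks, size=n, p=w)
    y = rng.normal(theta[k], sigma)
    return y, k, theta, w

rng = np.random.default_rng(2)
y, k, theta, w = sample_dpm(n=500, alpha=1.0, num_sticks=100, sigma=0.5, rng=rng)
print(len(np.unique(k)))   # number of components actually used by the 500 draws
```

Although the model has infinitely many components, a finite sample typically occupies only a handful of them, and that number tends to grow with $\alpha$ and with $n$.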