CSci 8980: Advanced Topics in Graphical Models, Expectation Propagation


  1. CSci 8980: Advanced Topics in Graphical Models
     Expectation Propagation
     Instructor: Arindam Banerjee
     October 26, 2007
     Outline: Posterior Estimation, Assumed Density Filtering, Expectation Propagation, Experiments

  10. Posterior Estimation
      Consider a Bayesian model:
      - Latent variable $u$ with prior $P^{(0)}(u)$
      - Observable $D$, such as $\{x_1, \ldots, x_m\}$
      Quantities of interest:
      - Posterior over the latent variable, $P^{(0)}(u \mid D)$
      - Likelihood of the observation, $P(D)$
      For conjugate priors, the posterior is in the same family as the prior.
      In general, the posterior can be intractable.
      What is the best approximation in the (prior) family?

  14. Posterior Estimation (Contd.)
      The likelihood function often factorizes:
      $$P(D \mid u) = \prod_{i=1}^{n} t_i(u)$$
      The true posterior may be intractable:
      $$P(u \mid D) \propto P^{(0)}(u) \prod_{i=1}^{n} t_i(u)$$
      The normalizer $Z$ is the same as the data likelihood, i.e.,
      $$Z = \int_u P^{(0)}(u) \prod_{i=1}^{n} t_i(u) \, du = \int_u P(D \mid u) \, P^{(0)}(u) \, du = P(D)$$
      The two problems are closely related.
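The identity $Z = P(D)$ can be checked numerically in a case where both sides are available. The sketch below assumes a Beta prior on $u \in (0, 1)$ and Bernoulli factors $t_i(u) = u^{x_i}(1-u)^{1-x_i}$; this model, the data, and the function names are illustrative choices, not taken from the slides. It integrates the prior times the factors on a grid and compares the result with the closed-form marginal likelihood.

```python
import math
import numpy as np

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def normalizer_on_grid(x, a, b, num_points=20000):
    """Z = integral of P0(u) * prod_i t_i(u) du, via the midpoint rule on (0, 1)."""
    u = (np.arange(num_points) + 0.5) / num_points  # cell midpoints, never 0 or 1
    prior = np.exp((a - 1) * np.log(u) + (b - 1) * np.log1p(-u) - log_beta(a, b))
    factors = np.ones_like(u)
    for x_i in x:
        # Assumed Bernoulli factor: t_i(u) = u if x_i == 1, else 1 - u
        factors *= u if x_i == 1 else (1.0 - u)
    return float(np.mean(prior * factors))  # mean over cells = integral on a unit interval

def marginal_likelihood_closed_form(x, a, b):
    """P(D) for the Beta-Bernoulli model: B(a + k, b + n - k) / B(a, b)."""
    k, n = sum(x), len(x)
    return math.exp(log_beta(a + k, b + n - k) - log_beta(a, b))

x = [1, 0, 1, 1, 0, 1]  # made-up observations
z_grid = normalizer_on_grid(x, a=2.0, b=2.0)
p_d = marginal_likelihood_closed_form(x, a=2.0, b=2.0)
print(z_grid, p_d)
```

The conjugate model is used here only because it gives a closed form to compare against; with non-conjugate factors the grid integral is still defined, but no closed form is available to check it.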

  19. Approximating the Posterior
      Assume the prior $P^{(0)}(u)$ belongs to an exponential family $\mathcal{F}$:
      $$P^{(0)}(u) = \exp\left(\langle \theta_0, s(u) \rangle - \psi(\theta_0)\right)$$
      Let $Q(u) \in \mathcal{F}$ be the best approximation to $P(u \mid D)$.
      Goal: tractably compute $Q(u)$ when $P(u \mid D)$ is hard to compute.
      - Approach 1: Assumed density filtering (online Bayesian learning)
      - Approach 2: Expectation propagation
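As a concrete instance of this exponential-family form, assume $\mathcal{F}$ is the family of one-dimensional Gaussians with mean $m$ and variance $v$. This choice is an illustration, not something the slides specify. The density then has sufficient statistics $s(u) = (u, u^2)$:

```latex
\begin{align*}
\mathcal{N}(u; m, v)
  &= \exp\!\left( \frac{m}{v}\, u - \frac{1}{2v}\, u^2
       - \frac{m^2}{2v} - \frac{1}{2}\log(2\pi v) \right), \\
s(u) &= (u,\; u^2), \qquad
\theta = \left( \frac{m}{v},\; -\frac{1}{2v} \right), \\
\psi(\theta) &= \frac{m^2}{2v} + \frac{1}{2}\log(2\pi v)
  = -\frac{\theta_1^2}{4\theta_2} + \frac{1}{2}\log\!\left(\frac{\pi}{-\theta_2}\right), \\
\mu &= E[s(u)] = \left( m,\; m^2 + v \right).
\end{align*}
```

Here $\theta$ are the natural parameters and $\mu$ the mean parameters, so picking a member of this family by its mean parameters amounts to fixing its mean and variance.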

  25. Assumed Density Filtering
      Start with an initial guess $Q(u) = P^{(0)}(u)$.
      Recall that
      $$P(u \mid D) \propto P^{(0)}(u) \prod_{i=1}^{n} t_i(u)$$
      At each step, update $Q$ to incorporate one $t_i(u)$:
      - Compute the true Bayesian update
        $$\hat{P}(u) = \frac{t_i(u) \, Q(u)}{\int_z t_i(z) \, Q(z) \, dz}$$
      - Find $Q^{new} \in \mathcal{F}$ such that
        $$Q^{new}(u) = \operatorname*{argmin}_{\tilde{Q} \in \mathcal{F}} KL\left(\hat{P}(u) \,\|\, \tilde{Q}(u)\right)$$
      - This is the maximum likelihood estimate with $\hat{P}$ as the true distribution.
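A short derivation shows why this KL minimization behaves like maximum likelihood and what its solution looks like. It assumes $\tilde{Q}(u) = \exp(\langle \theta, s(u) \rangle - \psi(\theta))$, the same exponential-family form as the prior, and that differentiation under the integral is allowed:

```latex
\begin{align*}
KL(\hat{P} \,\|\, \tilde{Q})
  &= \int \hat{P}(u) \log \hat{P}(u)\, du - \int \hat{P}(u) \log \tilde{Q}(u)\, du \\
  &= -H(\hat{P}) - \langle \theta,\, E_{\hat{P}}[s(u)] \rangle + \psi(\theta), \\
\nabla_\theta KL(\hat{P} \,\|\, \tilde{Q})
  &= -E_{\hat{P}}[s(u)] + \nabla \psi(\theta)
   = -E_{\hat{P}}[s(u)] + E_{\tilde{Q}}[s(u)].
\end{align*}
```

The entropy term $H(\hat{P})$ does not depend on $\theta$, so minimizing the KL divergence is the same as maximizing the expected log-likelihood $E_{\hat{P}}[\log \tilde{Q}(u)]$. Since $\psi$ is convex, setting the gradient to zero gives the minimizer, which satisfies $E_{Q^{new}}[s(u)] = E_{\hat{P}}[s(u)]$.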

  29. Assumed Density Filtering (Contd.)
      To obtain $Q^{new}$ it is sufficient to do moment matching:
      $$\mu^{new} = E_{\hat{P}}[s(u)]$$
      For each factor $t_i(u)$:
      - Compute the means (moments) of $\hat{P}(u) \propto t_i(u) \, Q(u)$
      - Pick $Q^{new} \in \mathcal{F}$ with these mean parameters
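The per-factor loop above can be sketched in a few lines. The example assumes $\mathcal{F}$ is the one-dimensional Gaussian family, so moment matching means matching the mean and variance of $\hat{P}$. The factor `clutter_factor` is a hypothetical mixture likelihood (an inlier component $\mathcal{N}(x_i; u, 1)$ and a broad outlier component $\mathcal{N}(x_i; 0, 10)$), and the moments of $\hat{P}$ are computed on a grid rather than analytically. The data, the function names, and the grid settings are illustrative assumptions, not the slides' own setup.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Density of N(x; mean, var); broadcasts over arrays."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def clutter_factor(u, x_i, w):
    """Hypothetical factor t_i(u): inlier N(x_i; u, 1) w.p. 1 - w, outlier N(x_i; 0, 10) w.p. w."""
    return (1.0 - w) * gaussian_pdf(x_i, u, 1.0) + w * gaussian_pdf(x_i, 0.0, 10.0)

def adf_step(mean, var, x_i, w, num_points=20001, width=8.0):
    """Absorb one factor: form P_hat on a grid, then match its mean and variance."""
    sd = np.sqrt(var)
    u = np.linspace(mean - width * sd, mean + width * sd, num_points)
    tilted = clutter_factor(u, x_i, w) * gaussian_pdf(u, mean, var)  # unnormalized P_hat
    weights = tilted / tilted.sum()  # normalize on the grid
    new_mean = float(np.sum(weights * u))
    new_var = float(np.sum(weights * (u - new_mean) ** 2))
    return new_mean, new_var  # parameters of Q_new

def adf(xs, prior_mean, prior_var, w):
    """Start from Q = prior and process the factors one at a time, in order."""
    mean, var = prior_mean, prior_var
    for x_i in xs:
        mean, var = adf_step(mean, var, x_i, w)
    return mean, var

xs = [0.9, 1.4, -3.0, 1.1]  # made-up observations; -3.0 plays the outlier
adf_mean, adf_var = adf(xs, prior_mean=0.0, prior_var=100.0, w=0.25)
# With w = 0 every factor is Gaussian in u, so each update is exact (conjugate case).
exact_mean, exact_var = adf(xs, prior_mean=0.0, prior_var=100.0, w=0.0)
print(adf_mean, adf_var)
print(exact_mean, exact_var)
```

With `w=0.0` the result should agree with the standard conjugate Gaussian posterior, which makes a convenient sanity check. With `w > 0` the result generally depends on the order in which the factors are processed, since each step projects back into the Gaussian family before the next factor is seen.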

  31. ADF: An Alternative Viewpoint
      For a single factor $t_i(u)$:
      - The true posterior is $\hat{P}(u) \propto t_i(u) \, Q(u)$
