Posteriors, conjugacy, and exponential families for completely random measures



  1. Posteriors, conjugacy, and exponential families for completely random measures
     Tamara Broderick (MIT), Ashia C. Wilson (Berkeley), Michael I. Jordan (Berkeley)

  2. Models
     • Beta process, Bernoulli process (IBP)
     • Gamma process, Poisson likelihood process (DP, CRP)
     • Beta process, negative binomial process

     Background
     • Parametric exponential family conjugacy [Diaconis & Ylvisaker 1979]
       Likelihood: $p(x \mid \theta) = \theta^{x} (1 + \theta)^{-1}$, with $x \in \{0, 1\}$ and $\theta > 0$
       Prior: $p(\theta) \propto \theta^{\alpha} (1 + \theta)^{-\alpha - \beta} = \mathrm{BetaPrime}(\theta \mid \alpha, \beta)$, with $\alpha > 0$, $\beta > 0$
       Posterior: $p(\theta \mid x) \propto \theta^{\alpha + x} (1 + \theta)^{-(\alpha + x) - (\beta - x + 1)}$
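The conjugate update on this slide can be checked numerically. The following is a minimal sketch, not the authors' code: it uses the unnormalized densities exactly as written on the slide, and the function names (`likelihood`, `unnormalized_density`, `posterior_hyperparameters`, `check_conjugacy`) are hypothetical. It confirms that likelihood times prior has the same functional form in $\theta$ as the prior with hyperparameters $(\alpha + x, \beta - x + 1)$, so the posterior is obtained by addition rather than integration.

```python
import math


def likelihood(x, theta):
    """Slide's likelihood: p(x | theta) = theta^x (1 + theta)^(-1), x in {0, 1}."""
    return theta ** x / (1.0 + theta)


def unnormalized_density(theta, alpha, beta):
    """Slide's unnormalized prior form: theta^alpha (1 + theta)^(-alpha - beta)."""
    return theta ** alpha * (1.0 + theta) ** (-alpha - beta)


def posterior_hyperparameters(alpha, beta, x):
    """Conjugate update read off the slide: (alpha, beta) -> (alpha + x, beta - x + 1)."""
    return alpha + x, beta - x + 1


def check_conjugacy(alpha, beta, x, thetas=(0.1, 0.5, 1.0, 3.0, 10.0)):
    """Return True if likelihood * prior is a constant multiple of the updated form."""
    a_post, b_post = posterior_hyperparameters(alpha, beta, x)
    ratios = [
        likelihood(x, t) * unnormalized_density(t, alpha, beta)
        / unnormalized_density(t, a_post, b_post)
        for t in thetas
    ]
    # A ratio that does not depend on theta means the two forms are proportional.
    return all(math.isclose(r, ratios[0], rel_tol=1e-9) for r in ratios)
```

For example, `check_conjugacy(2.0, 3.0, 1)` compares the two forms on a small grid of $\theta$ values; the grid itself is an arbitrary choice for illustration.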

  12. Models
      • Beta process, Bernoulli process (IBP)
      • Gamma process, Poisson likelihood process (DP, CRP)
      • Beta process, negative binomial process

      Background
      • Parametric exponential family conjugacy [Diaconis & Ylvisaker 1979]
        • Likelihood ➞ conjugate prior, straightforward inference
        • Integration ➞ addition

  14. Models
      • Beta process, Bernoulli process (IBP)
      • Gamma process, Poisson likelihood process (DP, CRP)
      • Beta process, negative binomial process

      Want: One framework
      • For Bayesian nonparametric models:
        • Likelihood ➞ conjugate prior, straightforward inference

  18. Clustering
      [Figure: Documents 1 to 7 and the topics Technology, Sports, Health, Econ, Arts]

  19. Feature allocation
      [Figure: Documents 1 to 7 and the topics Technology, Sports, Health, Econ, Arts]
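The difference between the two slides above can be made concrete with a small sketch. The document-to-topic assignments below are hypothetical (the figures' actual assignments are not recoverable from the transcript), and the helper names are illustrative only. The sketch assumes the usual definitions: a clustering places every document in exactly one group, while a feature allocation lets a document belong to several groups or to none.

```python
from collections import Counter


def membership_counts(blocks):
    """Count how many blocks each item appears in."""
    return Counter(item for members in blocks.values() for item in members)


def is_clustering(blocks, items):
    """True if every item belongs to exactly one block (a partition)."""
    counts = membership_counts(blocks)
    return set(counts) == set(items) and all(c == 1 for c in counts.values())


def is_feature_allocation(blocks, items):
    """True if blocks only contain known items; overlap and omission are allowed."""
    return set(membership_counts(blocks)) <= set(items)


documents = range(1, 8)  # Documents 1 to 7, as in the figures

# Hypothetical assignments, invented for illustration.
clustering = {
    "Technology": {1, 4}, "Sports": {2, 6}, "Health": {3}, "Econ": {5}, "Arts": {7},
}
feature_allocation = {
    "Technology": {1, 4, 5}, "Sports": {2, 6}, "Health": {2, 3},
    "Econ": {1, 5}, "Arts": {4},
}
```

In this hypothetical data, document 7 has no features and documents 1, 2, 4, and 5 each have two, which a clustering would not permit.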

  20. Indian buffet process (IBP) [Griffiths & Ghahramani 2006]
      [Figure: binary matrix with rows n = 1, 2, ..., N (data points) and columns k = 1, 2, ... (features)]

      For n = 1, 2, ..., N:
      1. Data point n has an existing feature k that has occurred $S_{n-1,k}$ times with probability $\frac{S_{n-1,k}}{\beta + n - 1}$
      2. Number of new features for data point n: $K_n^{+} = \mathrm{Poisson}\left(\gamma \frac{\beta}{\beta + n - 1}\right)$
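The two steps of the IBP on this slide can be sketched as a short sampler. This is an illustrative implementation under stated assumptions, not code from the talk: the parameter names `gamma` and `beta` follow the slide's symbols, while `sample_ibp`, `sample_poisson`, and the `seed` argument are hypothetical. Poisson draws use Knuth's multiplication method so the example needs only the standard library.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's method; adequate for small rates."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_points, gamma, beta, seed=0):
    """Sample feature allocations for data points n = 1, ..., num_points.

    Returns (allocations, feature_counts), where allocations[n - 1] lists the
    feature indices of data point n and feature_counts[k] is the number of
    data points that have feature k.
    """
    rng = random.Random(seed)
    feature_counts = []  # feature_counts[k] plays the role of S_{n-1,k}
    allocations = []
    for n in range(1, num_points + 1):
        denom = beta + n - 1
        # Step 1: take existing feature k with probability S_{n-1,k} / (beta + n - 1).
        chosen = [k for k, s in enumerate(feature_counts) if rng.random() < s / denom]
        for k in chosen:
            feature_counts[k] += 1
        # Step 2: draw K_n^+ new features from Poisson(gamma * beta / (beta + n - 1)).
        num_new = sample_poisson(gamma * beta / denom, rng)
        first_new = len(feature_counts)
        chosen.extend(range(first_new, first_new + num_new))
        feature_counts.extend([1] * num_new)
        allocations.append(chosen)
    return allocations, feature_counts
```

Calling `sample_ibp(20, gamma=2.0, beta=1.0)` returns one list of feature indices per data point; a fixed seed makes the draw reproducible, which is a convenience choice for this sketch.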
