Imprecise probability models for inference in exponential families
  1. Imprecise probability models for inference in exponential families SYSTeMS-dialogue of 14 July 2005 Erik Quaeghebeur SYSTeMS research group General idea - Specific idea - A result - Updating - History - Classification – p.1/14

  2. Overview
   1. The general idea
   2. Specifying the details
   3. A useful result
   4. Updating
   5. History: how this research got started
   6. An application: classification
   7. Conclusions

  3.–8. The general idea
   - Sampling model $f(x|\psi)$: likelihood function $L_x(\psi)$, sufficient statistic.
   - Choose some prior $C(\psi)$: obtain a posterior after observing samples.
   - Obtain the corresponding predictive distribution through $P(x) = \int_\Psi C\,L_x$.
   - Obtain the corresponding linear previsions $P_C$ and $P_P$.
   - Imprecision: take a set of priors, use the lower envelope theorem to obtain coherent lower previsions $\underline{P}_C$ and $\underline{P}_P$.

  9.–12. Specifying the details: sampling model
   Exponential family sampling model $Ef(x|\psi)$: likelihood $L_x(\psi)$, sufficient statistic $\tau(x)$ of fixed dimension.
   $$Ef(x|\psi) = a(x)\,\exp(\langle\psi,\tau(x)\rangle - b(\psi)).$$
   Multinomial sampling. The likelihood function is a multivariate Bernoulli $Br(x|\theta)$:
   $x \in \{0,1\}^d$ with $\sum_i x_i \le 1$; $\tau(x) = x$;
   $\theta \in (0,1)^d$ with $\sum_i \theta_i < 1$, $\theta_0 = 1 - \sum_i \theta_i$;
   $\psi(\theta) = \big(\ln(\theta_i/\theta_0)\big)_{i=1}^d$; $a = 1$; $b(\psi(\theta)) = -\ln(\theta_0)$.
   Normal sampling. The likelihood is a Normal $N(x|\mu,\lambda)$:
   $x \in \mathbb{R}$; $\tau(x) = (x, x^2)$;
   $\mu \in \mathbb{R}$, $\lambda \in \mathbb{R}^+$, $\sigma^2 = 1/\lambda$;
   $\psi(\mu,\lambda) = (\lambda\mu, -\tfrac{\lambda}{2})$; $a = \tfrac{1}{\sqrt{2\pi}}$; $b(\psi(\mu,\lambda)) = \tfrac{\lambda\mu^2 - \ln\lambda}{2}$.
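As a quick numerical check of the exponential-family form above, the sketch below (plain Python; the helper name `ef_density` is mine, not from the slides) evaluates $a(x)\exp(\langle\psi,\tau(x)\rangle - b(\psi))$ for the multivariate Bernoulli case; note that $b(\psi) = -\ln(\theta_0)$ can be expressed in the natural parameters as $\ln(1 + \sum_i e^{\psi_i})$.

```python
import math

def ef_density(x, psi, a, tau, b):
    """Generic exponential-family density a(x) * exp(<psi, tau(x)> - b(psi))."""
    inner = sum(p * t for p, t in zip(psi, tau(x)))
    return a(x) * math.exp(inner - b(psi))

# Multivariate Bernoulli Br(x | theta), d = 2, in natural parameters:
# psi_i = ln(theta_i / theta_0), b(psi) = -ln(theta_0) = ln(1 + sum_i e^{psi_i}).
theta = (0.2, 0.3)                      # so theta_0 = 0.5
theta0 = 1.0 - sum(theta)
psi = tuple(math.log(t / theta0) for t in theta)

density = ef_density(
    x=(1, 0),
    psi=psi,
    a=lambda x: 1.0,
    tau=lambda x: x,
    b=lambda p: math.log(1.0 + sum(math.exp(pi) for pi in p)),
)
# For x = (1, 0) this recovers theta_1 = 0.2.
```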

  13.–16. Specifying the details: conjugate
   Choose a conjugate prior $CEf(\psi|n_0,y_0)$: easily obtain a posterior $CEf(\psi|n_k,y_k)$ after observing $k$ samples.
   $$CEf(\psi|n,y) = c(n,y)\,\exp(n[\langle\psi,y\rangle - b(\psi)]).$$
   Multinomial sampling. The conjugate distribution is a Dirichlet distribution $Di(\theta|ny, ny_0)$:
   $y \in (0,1)^d$ with $\sum_i y_i < 1$, $y_0 = 1 - \sum_i y_i$;
   $c(n,y) = \dfrac{\Gamma(n)}{\Gamma(ny_0)\prod_i \Gamma(ny_i)}$.
   Normal sampling. The conjugate distribution is a Normal-gamma distribution
   $N(\mu\,|\,y_1, n\lambda)\;Ga\big(\lambda\,\big|\,\tfrac{n+3}{2}, \tfrac{n[y_2 - y_1^2]}{2}\big)$:
   $y \in \mathbb{R}\times\mathbb{R}^+$ with $y_2 - y_1^2 > 0$;
   $c(n,y) = \dfrac{\sqrt{n}\,\big(\tfrac{n[y_2 - y_1^2]}{2}\big)^{\tfrac{n+3}{2}}}{\Gamma(\tfrac{n+3}{2})\sqrt{2\pi}}$.
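The slides state that a posterior $CEf(\psi|n_k,y_k)$ is easily obtained; the standard conjugate rule, consistent with the $(n+1, \tfrac{ny+\tau(x)}{n+1})$ argument appearing in the predictive density, is $n_k = n_0 + k$ and $y_k = (n_0 y_0 + \sum_i \tau(x_i))/(n_0 + k)$. A minimal sketch (helper name `conjugate_update` is mine; exact rationals are used to keep the arithmetic transparent):

```python
from fractions import Fraction

def conjugate_update(n, y, taus):
    """Update the hyperparameters (n, y) of a conjugate prior CEf(psi | n, y)
    after observing samples with sufficient statistics taus = [tau(x_1), ...]:
    n_k = n + k,  y_k = (n*y + sum tau)/(n + k)."""
    k = len(taus)
    d = len(y)
    total = [sum(t[i] for t in taus) for i in range(d)]
    n_k = n + k
    y_k = tuple((n * y[i] + total[i]) / n_k for i in range(d))
    return n_k, y_k

# Multinomial case, d = 2: start from (n0, y0) and observe three samples.
n0, y0 = 2, (Fraction(1, 4), Fraction(1, 4))
samples = [(1, 0), (0, 1), (0, 0)]      # tau(x) = x
n3, y3 = conjugate_update(n0, y0, samples)
# n3 = 5, y3 = ((2/4 + 1)/5, (2/4 + 1)/5) = (3/10, 3/10)
```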

  17.–19. Specifying the details: predictive
   Obtain the corresponding predictive distribution through
   $$PEf(x|n,y) = \int_\Psi CEf(\cdot|n,y)\,L_x = \frac{c(n,y)\,a(x)}{c\big(n+1, \tfrac{ny + \tau(x)}{n+1}\big)}.$$
   Multinomial sampling. The predictive distribution is a Dirichlet-multinomial distribution $DiMn(x|ny, ny_0)$.
   Normal sampling. The predictive distribution is a Student distribution $St\big(x\,\big|\,y_1, \tfrac{n+3}{n+1}\tfrac{1}{y_2 - y_1^2}, n+3\big)$.
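The ratio-of-normalisers formula for the predictive can be evaluated directly. The sketch below (helper names `log_c` and `predictive` are mine) does so for the multinomial case, where $a = 1$ and $\tau(x) = x$, working with log-gamma values for numerical stability; for a single categorical observation the predictive mass of $x = e_i$ should reduce to $y_i$, and that of $x = 0$ to $y_0$:

```python
import math

def log_c(n, y):
    """Log of the Dirichlet normaliser c(n, y) = Gamma(n) / (Gamma(n*y0) * prod_i Gamma(n*y_i))."""
    y0 = 1.0 - sum(y)
    return math.lgamma(n) - math.lgamma(n * y0) - sum(math.lgamma(n * yi) for yi in y)

def predictive(x, n, y):
    """PEf(x | n, y) = c(n, y) * a(x) / c(n+1, (n*y + tau(x))/(n+1)); here a = 1, tau(x) = x."""
    y_new = tuple((n * yi + xi) / (n + 1) for yi, xi in zip(y, x))
    return math.exp(log_c(n, y) - log_c(n + 1, y_new))

n, y = 4.0, (0.2, 0.5)                  # y0 = 0.3
p = [predictive(x, n, y) for x in [(1, 0), (0, 1), (0, 0)]]
# The three predictive masses come out as (0.2, 0.5, 0.3) = (y_1, y_2, y_0).
```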

  20.–21. Specifying the details: linear previsions
   Obtain the corresponding linear previsions
   $$P_C(f|n_k,y) = \int_\Psi CEf(\cdot|n_k,y)\,f, \qquad f \in \mathcal{L}(\Psi) \approx [\Psi\to\mathbb{R}],$$
   and
   $$P_P(f|n_k,y) = \int_X PEf(\cdot|n_k,y)\,f, \qquad f \in \mathcal{L}(X) \approx [X\to\mathbb{R}].$$

  22.–23. Specifying the details: lower previsions
   Imprecision: take a set of priors, one for every $y \in \mathcal{Y}_0$, and use the lower envelope theorem to obtain coherent lower previsions
   $$\underline{P}_C(\cdot|n_k,\mathcal{Y}_k) = \inf_{y\in\mathcal{Y}_k} P_C(\cdot|n_k,y)$$
   and
   $$\underline{P}_P(\cdot|n_k,\mathcal{Y}_k) = \inf_{y\in\mathcal{Y}_k} P_P(\cdot|n_k,y).$$
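A direct way to see the lower envelope at work is to compute the linear prevision of a gamble under each prior in the set and take the infimum and supremum. The sketch below (helper names are mine) uses the simplest case, $d = 1$ Bernoulli sampling, where the predictive probability of $x = 1$ under the prior indexed by $y$ is $y$ itself, and approximates an assumed set $\mathcal{Y}_0 = [0.1, 0.4]$ by a finite grid:

```python
def linear_prevision(f, probs):
    """P(f) = sum_x probs[x] * f(x): expectation of a gamble f under one predictive."""
    return sum(p * f(x) for x, p in probs.items())

def lower_upper_prevision(f, prob_family):
    """Lower envelope: inf/sup of the linear previsions over the family of predictives."""
    values = [linear_prevision(f, probs) for probs in prob_family]
    return min(values), max(values)

# One predictive {P(x=1) = y, P(x=0) = 1 - y} per y in a grid over [0.1, 0.4].
grid = [0.1 + 0.01 * i for i in range(31)]
family = [{1: y, 0: 1.0 - y} for y in grid]
gamble = lambda x: float(x)             # f(x) = x, so P(f | n, y) = y
lower, upper = lower_upper_prevision(gamble, family)
# lower ≈ 0.1 and upper ≈ 0.4: the endpoints of the (discretised) set of priors.
```

The grid is of course only a finite stand-in for $\mathcal{Y}_0$; `min`/`max` approximate the inf/sup in the lower envelope theorem.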

  24.–29. A useful result
   The expectation of the sufficient statistic under the sampling model is
   $$P(\tau|\psi) = \int_X Ef(\cdot|\psi)\,\tau.$$
   Multinomial sampling: $P(\tau|\psi) = \theta(\psi)$. Normal sampling: $P(\tau|\psi) = (\mu(\psi), m_2(\psi))$.
   Considered as a function $P(\tau|\Psi)\colon \psi \mapsto P(\tau|\psi)$, its previsions take a particularly simple form:
   $$P_C(P(\tau|\Psi)\,|\,n_k, y_k) = y_k,$$
   and, for a set of priors,
   $$\underline{P}_C(P(\tau|\Psi)\,|\,n_k,\mathcal{Y}_k) = \inf \mathcal{Y}_k, \qquad \overline{P}_C(P(\tau|\Psi)\,|\,n_k,\mathcal{Y}_k) = \sup \mathcal{Y}_k.$$
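The multinomial instance of this result says that under the prior $Di(\theta|ny, ny_0)$ the prevision of $P(\tau|\Psi) = \theta(\cdot)$ is $y$ itself (and that of $\theta_0$ is $y_0$), i.e. the Dirichlet mean is $ny/n = y$. A Monte Carlo sketch of that check (helper name `sample_dirichlet` is mine; Dirichlet draws via normalised gamma variates):

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) as normalised gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [gi / s for gi in g]

n, y = 6.0, (0.2, 0.3)
y0 = 1.0 - sum(y)
alpha = [n * yi for yi in y] + [n * y0]   # Dirichlet parameters (n*y, n*y0)

rng = random.Random(42)
draws = [sample_dirichlet(alpha, rng) for _ in range(100_000)]
est = [sum(d[i] for d in draws) / len(draws) for i in range(3)]
# est should be close to (0.2, 0.3, 0.5) = (y_1, y_2, y_0).
```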

  30.–31. Updating
   Initial choice: $n_0 \in \mathbb{R}^+$ and $\mathcal{Y}_0 \subset \mathcal{Y}$ (bounded).
   Take $k$ samples, keep the sufficient statistic $\tau_k$.
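One plausible reading of updating the whole set of priors, applying the single-prior conjugate rule elementwise, is $\mathcal{Y}_k = \{(n_0 y + \tau_k)/(n_0 + k) : y \in \mathcal{Y}_0\}$ with $n_k = n_0 + k$; this is an assumption on my part, not stated explicitly on the slides. A sketch (helper name `update_set` is mine):

```python
from fractions import Fraction

def update_set(n0, Y0, tau_k, k):
    """Assumed elementwise update: the set {CEf(. | n0, y) : y in Y0} becomes
    {CEf(. | n0 + k, (n0*y + tau_k)/(n0 + k)) : y in Y0}."""
    n_k = n0 + k
    return n_k, [tuple((n0 * yi + ti) / n_k for yi, ti in zip(y, tau_k)) for y in Y0]

# d = 1 (Bernoulli), Y0 = [1/10, 9/10] represented by its two extreme points;
# after k = 8 samples whose sufficient statistics sum to tau_k = 5:
n0 = 2
Y0 = [(Fraction(1, 10),), (Fraction(9, 10),)]
n_k, Y_k = update_set(n0, Y0, tau_k=(5,), k=8)
# n_k = 10, Y_k = [(13/25,), (17/25,)] = [(0.52,), (0.68,)]:
# the interval of y-values shrinks as data accumulate.
```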
