Biostatistics 602 - Statistical Inference
Lecture 24: E-M Algorithm & Practice
Hyun Min Kang, April 16th, 2013


  2. Interval Estimator (Recap)

     A point estimator is usually written $\hat{\theta}(\mathbf{X})$. An interval estimator is instead a pair $[L(\mathbf{X}), U(\mathbf{X})]$, where $L(\mathbf{X})$ and $U(\mathbf{X})$ are functions of the sample $\mathbf{X}$ and $L(\mathbf{X}) \le U(\mathbf{X})$. Based on the observed sample $\mathbf{x}$, we can make the inference $\theta \in [L(\mathbf{X}), U(\mathbf{X})]$. We then call $[L(\mathbf{X}), U(\mathbf{X})]$ an interval estimator of $\theta$.

     Three types of intervals:
     • Two-sided interval $[L(\mathbf{X}), U(\mathbf{X})]$
     • One-sided (with lower-bound) interval $[L(\mathbf{X}), \infty)$
     • One-sided (with upper-bound) interval $(-\infty, U(\mathbf{X})]$

  7. Definitions

     Definition: Coverage Probability
     Given an interval estimator $[L(\mathbf{X}), U(\mathbf{X})]$ of $\theta$, its coverage probability is defined as
     $$\Pr(\theta \in [L(\mathbf{X}), U(\mathbf{X})]).$$
     In other words, it is the probability that the random interval $[L(\mathbf{X}), U(\mathbf{X})]$ covers the parameter $\theta$.

     Definition: Confidence Coefficient
     The confidence coefficient is defined as
     $$\inf_{\theta \in \Omega} \Pr(\theta \in [L(\mathbf{X}), U(\mathbf{X})]).$$
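
The two definitions can be checked by simulation. The sketch below is a minimal illustration, assuming a normal sample with known variance and the usual z-interval; the function names, sample size, and the three scanned values of $\theta$ are choices made here, not part of the lecture. It estimates the coverage probability at each $\theta$ and takes the minimum as a rough stand-in for the infimum.

```python
import math
import random

def z_interval(sample, sigma, z=1.959964):
    """Two-sided interval [L(X), U(X)] for a normal mean with known sigma."""
    n = len(sample)
    xbar = sum(sample) / n
    half = z * sigma / math.sqrt(n)
    return xbar - half, xbar + half

def coverage(theta, sigma=1.0, n=20, reps=5000, seed=0):
    """Monte Carlo estimate of Pr(theta in [L(X), U(X)])."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        lo, hi = z_interval(sample, sigma)
        hits += lo <= theta <= hi
    return hits / reps

# The confidence coefficient is an infimum over all theta in Omega;
# scanning three values only approximates it.
estimates = {theta: coverage(theta) for theta in (-2.0, 0.0, 3.0)}
approx_conf_coef = min(estimates.values())
```

For this interval the coverage probability does not depend on $\theta$, so every estimate should sit near 0.95 up to Monte Carlo error.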

  11. Definitions

      Definition: Confidence Interval
      Given an interval estimator $[L(\mathbf{X}), U(\mathbf{X})]$ of $\theta$, if its confidence coefficient is $1 - \alpha$, we call it a $(1 - \alpha)$ confidence interval.

      Definition: Expected Length
      Given an interval estimator $[L(\mathbf{X}), U(\mathbf{X})]$ of $\theta$, its expected length is defined as
      $$E[U(\mathbf{X}) - L(\mathbf{X})],$$
      where $\mathbf{X}$ is a random sample from $f_{\mathbf{X}}(\mathbf{x} \mid \theta)$. In other words, it is the average length of the interval estimator.
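
As an illustration of expected length, the sketch below uses an example that is not in the lecture: for a sample from Uniform$(0, \theta)$, the interval $[X_{(n)}, X_{(n)} \alpha^{-1/n}]$ has confidence coefficient $1 - \alpha$, and its length is random. The code compares a Monte Carlo average of $U(\mathbf{X}) - L(\mathbf{X})$ with the closed form that follows from $E[X_{(n)}] = n\theta/(n+1)$. All names and parameter values are assumptions made for the example.

```python
import random

def uniform_interval(sample, alpha=0.05):
    """A (1 - alpha) interval for theta in Uniform(0, theta): [max, max / alpha**(1/n)]."""
    n = len(sample)
    top = max(sample)
    return top, top / alpha ** (1.0 / n)

def expected_length_mc(theta, n, alpha=0.05, reps=20000, seed=1):
    """Average observed length U(X) - L(X) over simulated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.uniform(0.0, theta) for _ in range(n)]
        lo, hi = uniform_interval(sample, alpha)
        total += hi - lo
    return total / reps

def expected_length_exact(theta, n, alpha=0.05):
    """E[U - L] = E[max] * (alpha**(-1/n) - 1), with E[max] = n * theta / (n + 1)."""
    return n * theta / (n + 1) * (alpha ** (-1.0 / n) - 1.0)

mc = expected_length_mc(theta=2.0, n=10)
exact = expected_length_exact(theta=2.0, n=10)
```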

  15. Confidence set and confidence interval

      There is no guarantee that the confidence set obtained from Theorem 9.2.2 is an interval, but quite often:

      1. To obtain a $(1 - \alpha)$ two-sided CI $[L(\mathbf{X}), U(\mathbf{X})]$, we invert the acceptance region of a level $\alpha$ test for $H_0: \theta = \theta_0$ vs. $H_1: \theta \ne \theta_0$.
      2. To obtain a lower-bounded CI $[L(\mathbf{X}), \infty)$, we invert the acceptance region of a test for $H_0: \theta = \theta_0$ vs. $H_1: \theta > \theta_0$, where $\Omega = \{\theta : \theta \ge \theta_0\}$.
      3. To obtain an upper-bounded CI $(-\infty, U(\mathbf{X})]$, we invert the acceptance region of a test for $H_0: \theta = \theta_0$ vs. $H_1: \theta < \theta_0$, where $\Omega = \{\theta : \theta \le \theta_0\}$.
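
Test inversion can be made concrete with a brute-force sketch. The example assumes a normal mean with known $\sigma$ and a level-0.05 two-sided z-test; the observed summary values and the grid of candidate $\theta_0$ are invented for illustration. Every $\theta_0$ whose test accepts the data is kept, and the resulting set is compared with the closed-form interval $\bar{x} \pm z\sigma/\sqrt{n}$.

```python
import math

Z_975 = 1.959964  # upper 2.5% point of the standard normal

def accepts(theta0, xbar, sigma, n, z=Z_975):
    """Two-sided z-test of H0: theta = theta0; True means H0 is not rejected."""
    return abs(xbar - theta0) <= z * sigma / math.sqrt(n)

def invert_test(xbar, sigma, n, grid):
    """Confidence set: every theta0 on the grid whose test accepts the data."""
    return [t for t in grid if accepts(t, xbar, sigma, n)]

xbar, sigma, n = 1.3, 2.0, 25                    # hypothetical observed summary
grid = [i / 1000.0 for i in range(-2000, 5001)]  # theta0 from -2 to 5, step 0.001
conf_set = invert_test(xbar, sigma, n, grid)
lower, upper = conf_set[0], conf_set[-1]
half = Z_975 * sigma / math.sqrt(n)              # closed-form half-width
```

Here the accepted values form one contiguous run of grid points, so the confidence set is an interval; in general that has to be checked rather than assumed.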

  19. Typical strategies for finding MLEs

      1. Write the joint (log-)likelihood function, $L(\theta \mid \mathbf{x}) = f_{\mathbf{X}}(\mathbf{x} \mid \theta)$.
      2. Find candidates that make the first-order derivative zero.
      3. Check the second-order derivative to confirm a local maximum.
         • For a one-dimensional parameter, a negative second-order derivative implies a local maximum.
      4. Check boundary points to see whether the boundary gives the global maximum.
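
The four steps can be traced on a small case. The sketch below assumes an Exponential(rate $\lambda$) sample, which is an example chosen here rather than one from the lecture, and the data values are made up. The log-likelihood is $n \log\lambda - \lambda \sum x_i$, the score vanishes at $\hat{\lambda} = n / \sum x_i$, the second derivative $-n/\lambda^2$ is negative, and the log-likelihood falls toward $-\infty$ at both ends of $(0, \infty)$.

```python
import math

def loglik(lam, xs):
    """Step 1: log-likelihood of an Exponential(rate=lam) sample."""
    return len(xs) * math.log(lam) - lam * sum(xs)

def score(lam, xs):
    """First derivative of the log-likelihood with respect to lam."""
    return len(xs) / lam - sum(xs)

def second_derivative(lam, xs):
    """Second derivative of the log-likelihood with respect to lam."""
    return -len(xs) / lam ** 2

xs = [0.8, 1.9, 0.4, 2.7, 1.2]   # made-up data
lam_hat = len(xs) / sum(xs)      # Step 2: root of the score
is_local_max = second_derivative(lam_hat, xs) < 0           # Step 3
beats_boundary = (loglik(lam_hat, xs) > loglik(1e-6, xs)    # Step 4: near lam -> 0
                  and loglik(lam_hat, xs) > loglik(1e3, xs))  # and lam -> infinity
```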

  20. Example: A mixture distribution

  27. A general mixture distribution

      $$f(x \mid \pi, \phi, \eta) = \sum_{i=1}^{k} \pi_i f(x \mid \phi_i, \eta)$$

      • $x$: observed data
      • $\pi$: mixture proportion of each component
      • $\phi$: parameters specific to each component
      • $\eta$: parameters shared among components
      • $k$: number of mixture components
      • $f$: the probability density function
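
A short sketch of the general mixture density follows. It assumes, purely for illustration, normal components in which the component-specific parameter $\phi_i$ is the mean and the shared parameter $\eta$ is a common variance; the lecture does not fix this choice, and the numeric values are invented.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def mixture_pdf(x, pis, phis, eta, component_pdf):
    """f(x | pi, phi, eta) = sum_i pi_i * f(x | phi_i, eta)."""
    return sum(p * component_pdf(x, phi, eta) for p, phi in zip(pis, phis))

pis = [0.3, 0.7]     # mixture proportions, summing to 1
phis = [-1.0, 2.0]   # assumed: component-specific means
eta = 1.5            # assumed: variance shared by all components
density_at_zero = mixture_pdf(0.0, pis, phis, eta, normal_pdf)
```

Because the proportions sum to one and each component integrates to one, the mixture is itself a density.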

  31. MLE Problem for mixture of normals

      Problem
      $$f(x \mid \theta = (\pi, \mu, \sigma^2)) = \sum_{i=1}^{k} \pi_i f_i(x \mid \mu_i, \sigma_i^2)$$
      $$f_i(x \mid \mu_i, \sigma_i^2) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[-\frac{(x - \mu_i)^2}{2\sigma_i^2}\right]$$
      $$\sum_{i=1}^{k} \pi_i = 1$$
      Find MLEs for $\theta = (\pi, \mu, \sigma^2)$.

  33. Solution when k = 1

      $$f(x \mid \theta) = \sum_{i=1}^{k} \pi_i f_i(x \mid \mu_i, \sigma_i^2)$$

      • $\pi = \pi_1 = 1$
      • $\mu = \mu_1 = \bar{x}$
      • $\sigma^2 = \sigma_1^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 / n$

  35. Incomplete data problem when k > 1

      $$f(\mathbf{x} \mid \theta) = \prod_{i=1}^{n} \left[ \sum_{j=1}^{k} \pi_j f_j(x_i \mid \mu_j, \sigma_j^2) \right]$$

      The MLE solution is not analytically tractable, because it involves multiple sums of exponential functions.
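
The difficulty is visible once the log-likelihood is written out: each observation contributes the log of a sum over components, so the parameters do not separate the way they do for a single normal. The sketch below evaluates that log-likelihood for a normal mixture; the function names and example values are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def mixture_loglik(xs, pis, mus, sigma2s):
    """log f(x | theta) = sum_i log( sum_j pi_j * f_j(x_i | mu_j, sigma2_j) )."""
    total = 0.0
    for x in xs:
        inner = sum(p * normal_pdf(x, m, s2) for p, m, s2 in zip(pis, mus, sigma2s))
        total += math.log(inner)  # log of a sum: no closed-form root for the score
    return total

xs = [0.5, 1.5, -0.2, 4.1]  # made-up data
value = mixture_loglik(xs, [0.4, 0.6], [0.0, 4.0], [1.0, 2.0])
```

The function can still be evaluated and maximized numerically, which is the route an iterative method takes.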

  42. Converting to a complete data problem

      Let $z_i \in \{1, \cdots, k\}$ denote the source distribution from which each $x_i$ was sampled.

      $$f(\mathbf{x} \mid \mathbf{z}, \theta) = \prod_{i=1}^{n} \left[ \sum_{j=1}^{k} I(z_i = j) f_j(x_i \mid \mu_j, \sigma_j^2) \right] = \prod_{i=1}^{n} f_{z_i}(x_i \mid \mu_{z_i}, \sigma_{z_i}^2)$$

      For each component $j = 1, \cdots, k$:

      $$\hat{\pi}_j = \frac{\sum_{i=1}^{n} I(z_i = j)}{n}$$
      $$\hat{\mu}_j = \frac{\sum_{i=1}^{n} I(z_i = j)\, x_i}{\sum_{i=1}^{n} I(z_i = j)}$$
      $$\hat{\sigma}_j^2 = \frac{\sum_{i=1}^{n} I(z_i = j)(x_i - \hat{\mu}_j)^2}{\sum_{i=1}^{n} I(z_i = j)}$$

      The MLE solution is analytically tractable if $\mathbf{z}$ is known.
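
When the labels are known, the three estimators reduce to per-group counts, means, and variances. The sketch below applies them to a toy data set; the data, the labels, and the function name are invented for illustration, and the code assumes every component has at least one observation.

```python
def complete_data_mle(xs, zs, k):
    """Per-component MLEs (pi_j, mu_j, sigma2_j) when the labels z_i in {1..k} are known."""
    n = len(xs)
    estimates = []
    for j in range(1, k + 1):
        members = [x for x, z in zip(xs, zs) if z == j]
        n_j = len(members)                 # sum_i I(z_i = j); assumed > 0
        pi_j = n_j / n
        mu_j = sum(members) / n_j
        sigma2_j = sum((x - mu_j) ** 2 for x in members) / n_j  # divides by n_j, not n_j - 1
        estimates.append((pi_j, mu_j, sigma2_j))
    return estimates

xs = [1.0, 2.0, 3.0, 10.0, 12.0]   # made-up observations
zs = [1, 1, 1, 2, 2]               # made-up known labels
est = complete_data_mle(xs, zs, k=2)
```

For these values the first component gets proportion 3/5 and mean 2, and the second gets proportion 2/5 and mean 11.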

  47. E-M Algorithm

      The E-M (Expectation-Maximization) algorithm is
      • A procedure typically used for solving for the MLE.
      • Guaranteed never to decrease the likelihood from one iteration to the next; under regularity conditions it converges to a stationary point of the likelihood, typically a local maximum rather than necessarily the global MLE (!)
      • Particularly suited to "missing data" problems where an analytic solution for the MLE is not tractable.

      The algorithm was derived and used in various special cases by a number of authors, but it was not identified as a general algorithm until the seminal paper by Dempster, Laird, and Rubin in the Journal of the Royal Statistical Society, Series B (1977).
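
The E-M iteration itself is not spelled out at this point in the notes. The sketch below assumes the standard form for a normal mixture: the E-step computes, for each observation, the posterior probability of each component under the current parameters, and the M-step applies the known-label MLE formulas with the indicator replaced by those probabilities. The simulated data, starting values, and iteration count are arbitrary choices. The recorded trace lets one check that the observed-data log-likelihood does not decrease.

```python
import math
import random

def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def loglik(xs, pis, mus, s2s):
    """Observed-data log-likelihood of the normal mixture."""
    return sum(math.log(sum(p * normal_pdf(x, m, s) for p, m, s in zip(pis, mus, s2s)))
               for x in xs)

def em_step(xs, pis, mus, s2s):
    """One E-M iteration for a k-component normal mixture (assumed standard form)."""
    k, n = len(pis), len(xs)
    # E-step: resp[i][j] = Pr(z_i = j | x_i, current parameters)
    resp = []
    for x in xs:
        w = [pis[j] * normal_pdf(x, mus[j], s2s[j]) for j in range(k)]
        tot = sum(w)
        resp.append([wj / tot for wj in w])
    # M-step: known-label MLE formulas with I(z_i = j) replaced by resp[i][j]
    new_pis, new_mus, new_s2s = [], [], []
    for j in range(k):
        n_j = sum(r[j] for r in resp)
        mu = sum(r[j] * x for r, x in zip(resp, xs)) / n_j
        s2 = sum(r[j] * (x - mu) ** 2 for r, x in zip(resp, xs)) / n_j
        new_pis.append(n_j / n)
        new_mus.append(mu)
        new_s2s.append(s2)
    return new_pis, new_mus, new_s2s

rng = random.Random(602)
xs = [rng.gauss(0.0, 1.0) for _ in range(150)] + [rng.gauss(5.0, 1.0) for _ in range(100)]
params = ([0.5, 0.5], [min(xs), max(xs)], [1.0, 1.0])  # arbitrary starting values
trace = [loglik(xs, *params)]
for _ in range(50):
    params = em_step(xs, *params)
    trace.append(loglik(xs, *params))
```

Different starting values can lead to different limits, so in practice the iteration is often run from several starts and the highest final log-likelihood is kept.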

  48. Overview of E-M Algorithm

Basic Structure
• $y$ is observed (or incomplete) data
• $z$ is missing (or augmented) data
• $x = (y, z)$ is complete data

Complete and incomplete data likelihood
• Complete data likelihood: $f(x \mid \theta) = f(y, z \mid \theta)$
• Incomplete data likelihood: $g(y \mid \theta) = \int f(y, z \mid \theta)\, dz$

We are interested in the MLE for $L(\theta \mid y) = g(y \mid \theta)$.
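To make the two likelihoods concrete, here is a minimal Python sketch. It assumes a two-component normal mixture (the case the lecture works through in later slides), where the missing $z$ is a discrete component label, so the integral over $z$ becomes a sum. The function names, the parameter tuple `theta = (pi, mu, sigma2)`, and the data values are all hypothetical choices made for illustration.

```python
import math

def normal_pdf(y, mu, sigma2):
    """Density of N(mu, sigma2) evaluated at y."""
    return math.exp(-(y - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def complete_density(y, z, theta):
    """f(y, z | theta): joint density of an observation y and its label z."""
    pi, mu, sigma2 = theta
    return pi[z] * normal_pdf(y, mu[z], sigma2[z])

def incomplete_density(y, theta):
    """g(y | theta): f(y, z | theta) summed over the discrete missing label z."""
    n_components = len(theta[0])
    return sum(complete_density(y, z, theta) for z in range(n_components))

def log_likelihood(ys, theta):
    """log L(theta | y) = sum_i log g(y_i | theta) for independent observations."""
    return sum(math.log(incomplete_density(y, theta)) for y in ys)

# Hypothetical parameter values (pi, mu, sigma^2) and made-up observations.
theta = ([0.5, 0.5], [0.0, 5.0], [1.0, 1.0])
ys = [-0.3, 0.8, 4.6, 5.2]
print(log_likelihood(ys, theta))
```

The observed-data log-likelihood involves a log of a sum, which is what makes direct maximization awkward and motivates working with the complete-data likelihood instead.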

  52. Maximizing incomplete data likelihood

$$
\begin{aligned}
L(\theta \mid y, z) &= f(y, z \mid \theta) \\
L(\theta \mid y) &= g(y \mid \theta) \\
k(z \mid \theta, y) &= \frac{f(y, z \mid \theta)}{g(y \mid \theta)} \\
\log L(\theta \mid y) &= \log L(\theta \mid y, z) - \log k(z \mid \theta, y)
\end{aligned}
$$

Because $z$ is missing data, we replace the right side with its expectation under $k(z \mid \theta', y)$, creating the new identity

$$
\log L(\theta \mid y) = E\left[\log L(\theta \mid y, Z) \mid \theta', y\right] - E\left[\log k(Z \mid \theta, y) \mid \theta', y\right]
$$

Iteratively maximizing the first term in the right-hand side results in the E-M algorithm.

  59. Overview of E-M Algorithm (cont'd)

Objective
• Maximize $L(\theta \mid y)$ or $l(\theta \mid y)$.
• Let $f(y, z \mid \theta)$ denote the pdf of the complete data. In the E-M algorithm, rather than working with $l(\theta \mid y)$ directly, we work with the surrogate function

$$
Q(\theta \mid \theta^{(r)}) = E\left[\log f(y, Z \mid \theta) \mid y, \theta^{(r)}\right]
$$

where $\theta^{(r)}$ is the estimate of $\theta$ at the $r$-th iteration.
• $Q(\theta \mid \theta^{(r)})$ is the expected log-likelihood of the complete data, conditioning on the observed data and $\theta^{(r)}$.

  64. Key Steps of E-M algorithm

Expectation Step
• Compute $Q(\theta \mid \theta^{(r)})$.
• This typically involves estimating the conditional distribution $Z \mid Y$, assuming $\theta = \theta^{(r)}$.
• After computing $Q(\theta \mid \theta^{(r)})$, move to the M-step.

Maximization Step
• Maximize $Q(\theta \mid \theta^{(r)})$ with respect to $\theta$.
• The $\arg\max_{\theta} Q(\theta \mid \theta^{(r)})$ will be the $(r+1)$-th $\theta$ to be fed into the E-step.
• Repeat the E-step and M-step until convergence.

  66. E-M algorithm for mixture of normals

E-step

$$
\begin{aligned}
Q(\theta \mid \theta^{(r)}) &= E\left[\log f(y, Z \mid \theta) \mid y, \theta^{(r)}\right] \\
&= \sum_{z} k(z \mid \theta^{(r)}, y) \log f(y, z \mid \theta) \\
&= \sum_{i=1}^{n} \sum_{z_i=1}^{k} k(z_i \mid \theta^{(r)}, y_i) \log f(y_i, z_i \mid \theta) \\
&= \sum_{i=1}^{n} \sum_{z_i=1}^{k} \frac{f(y_i, z_i \mid \theta^{(r)})}{g(y_i \mid \theta^{(r)})} \log f(y_i, z_i \mid \theta)
\end{aligned}
$$

In the upper limit of the inner sum, $k$ is the number of mixture components, not the conditional density $k(\cdot)$. For the normal mixture, $y_i \mid z_i = j \sim N(\mu_j, \sigma_j^2)$ with $\Pr(z_i = j) = \pi_j$, so

$$
\begin{aligned}
f(y_i, z_i = j \mid \theta) &= \pi_j \, \mathcal{N}(y_i; \mu_j, \sigma_j^2) \\
g(y_i \mid \theta) &= \sum_{j=1}^{k} f(y_i, z_i = j \mid \theta) = \sum_{j=1}^{k} \pi_j \, \mathcal{N}(y_i; \mu_j, \sigma_j^2)
\end{aligned}
$$

where $\mathcal{N}(\cdot\,; \mu, \sigma^2)$ denotes the $N(\mu, \sigma^2)$ density.

  72. E-M algorithm for mixture of normals (cont'd)

M-step

$$
Q(\theta \mid \theta^{(r)}) = \sum_{i=1}^{n} \sum_{z_i=1}^{k} \frac{f(y_i, z_i \mid \theta^{(r)})}{g(y_i \mid \theta^{(r)})} \log f(y_i, z_i \mid \theta)
$$

Maximizing with respect to $\theta$ gives, for each component $j$ (writing the observed data as $y_i$ throughout),

$$
\begin{aligned}
\pi_j^{(r+1)} &= \frac{1}{n} \sum_{i=1}^{n} k(z_i = j \mid y_i, \theta^{(r)}) = \frac{1}{n} \sum_{i=1}^{n} \frac{f(y_i, z_i = j \mid \theta^{(r)})}{g(y_i \mid \theta^{(r)})} \\
\mu_j^{(r+1)} &= \frac{\sum_{i=1}^{n} y_i \, k(z_i = j \mid y_i, \theta^{(r)})}{\sum_{i=1}^{n} k(z_i = j \mid y_i, \theta^{(r)})} = \frac{\sum_{i=1}^{n} y_i \, k(z_i = j \mid y_i, \theta^{(r)})}{n \pi_j^{(r+1)}} \\
\sigma_j^{2,(r+1)} &= \frac{\sum_{i=1}^{n} (y_i - \mu_j^{(r+1)})^2 \, k(z_i = j \mid y_i, \theta^{(r)})}{\sum_{i=1}^{n} k(z_i = j \mid y_i, \theta^{(r)})} = \frac{\sum_{i=1}^{n} (y_i - \mu_j^{(r+1)})^2 \, k(z_i = j \mid y_i, \theta^{(r)})}{n \pi_j^{(r+1)}}
\end{aligned}
$$
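The E-step and M-step updates for the normal mixture can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the lecture's implementation: the data are simulated, the seed, starting values, tolerance, and function names (`em_step`, `fit_mixture`) are hypothetical, and the observed data are written `ys`. The responsibilities `resp[i][j]` play the role of $k(z_i = j \mid y_i, \theta^{(r)})$.

```python
import math
import random

def normal_pdf(y, mu, sigma2):
    """Density of N(mu, sigma2) evaluated at y."""
    return math.exp(-(y - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def log_likelihood(ys, pi, mu, sigma2):
    """Observed-data log-likelihood: sum_i log g(y_i | theta)."""
    return sum(
        math.log(sum(p * normal_pdf(y, m, s) for p, m, s in zip(pi, mu, sigma2)))
        for y in ys
    )

def em_step(ys, pi, mu, sigma2):
    """One E-step followed by one M-step; returns the updated parameters."""
    n, k = len(ys), len(pi)
    # E-step: resp[i][j] = f(y_i, z_i = j | theta) / g(y_i | theta)
    resp = []
    for y in ys:
        joint = [pi[j] * normal_pdf(y, mu[j], sigma2[j]) for j in range(k)]
        g = sum(joint)
        resp.append([f / g for f in joint])
    # M-step: responsibility-weighted proportions, means, and variances
    weight = [sum(r[j] for r in resp) for j in range(k)]  # equals n * pi_j^(r+1)
    new_pi = [w / n for w in weight]
    new_mu = [sum(r[j] * y for r, y in zip(resp, ys)) / weight[j] for j in range(k)]
    new_sigma2 = [
        sum(r[j] * (y - new_mu[j]) ** 2 for r, y in zip(resp, ys)) / weight[j]
        for j in range(k)
    ]
    return new_pi, new_mu, new_sigma2

def fit_mixture(ys, pi, mu, sigma2, tol=1e-8, max_iter=500):
    """Iterate EM until the log-likelihood gain falls below tol."""
    ll = log_likelihood(ys, pi, mu, sigma2)
    history = [ll]
    for _ in range(max_iter):
        pi, mu, sigma2 = em_step(ys, pi, mu, sigma2)
        new_ll = log_likelihood(ys, pi, mu, sigma2)
        history.append(new_ll)
        if new_ll - ll < tol:
            break
        ll = new_ll
    return pi, mu, sigma2, history

# Simulated data (an assumption for illustration): 400 draws from N(0, 1)
# and 200 draws from N(5, 1), with an arbitrary fixed seed.
rng = random.Random(602)
ys = [rng.gauss(0.0, 1.0) for _ in range(400)] + [rng.gauss(5.0, 1.0) for _ in range(200)]
pi, mu, sigma2, history = fit_mixture(ys, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
print(pi, mu, sigma2, history[-1])
```

The `history` list records the observed-data log-likelihood after each iteration, which makes it easy to check that the sequence does not decrease. The sketch does not guard against degenerate cases such as a component whose variance collapses toward zero.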

  77. Does E-M iteration converge to MLE?

Theorem 7.2.20 - Monotonic EM sequence

The sequence $\{\hat{\theta}^{(r)}\}$ defined by the E-M procedure satisfies

$$
L\left(\hat{\theta}^{(r+1)} \mid y\right) \ge L\left(\hat{\theta}^{(r)} \mid y\right)
$$

with equality holding if and only if successive iterations yield the same value of the maximized expected complete-data log likelihood, that is

$$
E\left[\log L\left(\hat{\theta}^{(r+1)} \mid y, Z\right) \mid \hat{\theta}^{(r)}, y\right] = E\left[\log L\left(\hat{\theta}^{(r)} \mid y, Z\right) \mid \hat{\theta}^{(r)}, y\right]
$$

Theorem 7.5.2 further guarantees that $L(\hat{\theta}^{(r)} \mid y)$ converges monotonically to $L(\hat{\theta} \mid y)$ for some stationary point $\hat{\theta}$.
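A short sketch of why the sequence is monotone, offered as the standard Jensen's-inequality argument and not necessarily the proof presented in the lecture. It starts from the identity $\log L(\theta \mid y) = E[\log L(\theta \mid y, Z) \mid \theta', y] - E[\log k(Z \mid \theta, y) \mid \theta', y]$ with $\theta' = \hat{\theta}^{(r)}$, and writes $Q(\theta \mid \theta')$ for the first term.

```latex
\begin{align*}
\log L(\hat{\theta}^{(r+1)} \mid y) - \log L(\hat{\theta}^{(r)} \mid y)
  &= \underbrace{Q(\hat{\theta}^{(r+1)} \mid \hat{\theta}^{(r)})
     - Q(\hat{\theta}^{(r)} \mid \hat{\theta}^{(r)})}_{\ge\, 0 \text{ by the M-step}} \\
  &\quad + \underbrace{E\!\left[\log k(Z \mid \hat{\theta}^{(r)}, y) \mid \hat{\theta}^{(r)}, y\right]
     - E\!\left[\log k(Z \mid \hat{\theta}^{(r+1)}, y) \mid \hat{\theta}^{(r)}, y\right]}_{\ge\, 0 \text{ by Jensen's inequality}} .
\end{align*}
% The second bracket is nonnegative because
\begin{align*}
E\!\left[\log \frac{k(Z \mid \hat{\theta}^{(r+1)}, y)}{k(Z \mid \hat{\theta}^{(r)}, y)}
   \,\middle|\, \hat{\theta}^{(r)}, y\right]
  \le \log E\!\left[\frac{k(Z \mid \hat{\theta}^{(r+1)}, y)}{k(Z \mid \hat{\theta}^{(r)}, y)}
   \,\middle|\, \hat{\theta}^{(r)}, y\right]
  \le \log 1 = 0 .
\end{align*}
```

Both brackets are nonnegative, so the observed-data likelihood cannot decrease from one iteration to the next.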

  82. A working example (from BIOSTAT615/815 Fall 2012)

Example Data (n=1,500)

Running example of implemented software

user@host~/> ./mixEM ./mix.dat
Maximum log-likelihood = 3043.46, at pi = (0.667842,0.332158) between N(-0.0299457,1.00791) and N(5.0128,0.913825)

  84. Practice Problem 1

Problem

Let $X_1, \cdots, X_n$ be a random sample from a population with pdf

$$
f(x \mid \theta) = \frac{1}{2\theta}, \quad -\theta < x < \theta, \quad \theta > 0
$$

Find, if one exists, a best unbiased estimator of $\theta$.

Strategy to solve the problem
• Can we use the Cramer-Rao bound? No, because the interchangeability condition does not hold.
• Then, can we use complete sufficient statistics?
  1. Find a complete sufficient statistic $T$.
  2. For a trivial unbiased estimator $W$ for $\theta$, compute $\phi(T) = E[W \mid T]$,
  3. or make a function $\phi(T)$ such that $E[\phi(T)] = \theta$.
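One candidate that follows this strategy: since $|X_i| \sim \text{Uniform}(0, \theta)$, the statistic $T = \max_i |X_i|$ is complete and sufficient for this family, and $E[T] = \frac{n}{n+1}\theta$, which suggests $\phi(T) = \frac{n+1}{n} T$. The Python sketch below is only a Monte Carlo sanity check of unbiasedness, not a proof. The values of `theta`, `n`, `reps`, the seed, and the function names are assumptions chosen for illustration.

```python
import random

def best_unbiased_candidate(xs):
    """Candidate estimator (n + 1) / n * max|x_i| for Uniform(-theta, theta) data."""
    n = len(xs)
    t = max(abs(x) for x in xs)  # T = max_i |X_i|
    return (n + 1) / n * t

def simulate_mean(theta, n, reps, seed):
    """Average the candidate estimator over `reps` simulated samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [rng.uniform(-theta, theta) for _ in range(n)]
        total += best_unbiased_candidate(xs)
    return total / reps

# Hypothetical settings: true theta = 2, sample size 10, 20,000 replicates.
theta, n = 2.0, 10
print(simulate_mean(theta, n, reps=20000, seed=602))
```

If the candidate is unbiased, the printed average should land close to the assumed `theta`, up to Monte Carlo error.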
