likelihood inference in complex settings

Likelihood inference in complex settings Nancy Reid with Uyen - PowerPoint PPT Presentation



  1. Likelihood inference for simple problems Higher order approximation Harder problems Approximations to likelihoods Likelihood inference in complex settings Nancy Reid with Uyen Hoang, Wei Lin, Ximing Xu 1 / 30

  2. Outline
     • Likelihood inference for simple problems
     • Higher order approximation
     • Harder problems
     • Approximations to likelihoods

  3. Why likelihood?
     • likelihood function depends on data only through sufficient statistics
     • “likelihood map is sufficient” (Fraser & Naderi, 2006)
     • provides summary statistics with known limiting distribution
     • leading to approximate pivotal functions, based on normal distribution
     • in some models the likelihood function gives exact inference
     • “likelihood function as pivotal” (Hinkley, 1980)
     • likelihood function + sample space derivative gives better approximate inference

  4. Summary statistics and approximate pivotals
     • model: $f(y; \theta)$, $y \in \mathbb{R}^n$, $\theta \in \mathbb{R}^d$
     • log-likelihood function: $\ell(\theta; y) = \log f(y; \theta) + a(y)$
     • score function: $u(\theta) = \partial \ell(\theta; y) / \partial \theta$
     • maximum likelihood estimate: $\hat\theta = \arg\sup_\theta \ell(\theta; y)$
     • log-likelihood ratio: $w(\theta) = 2\{\ell(\hat\theta; y) - \ell(\theta; y)\}$

  5. Approximate pivotals
     • $\sqrt{n}\,(\hat\theta - \theta) \mathrel{\dot\sim} N_d\{0, j^{-1}(\hat\theta)\}$
     • $w(\theta) = 2\{\ell(\hat\theta) - \ell(\theta)\} \mathrel{\dot\sim} \chi^2_d$
     • $\frac{1}{\sqrt{n}}\, U(\theta) \mathrel{\dot\sim} N_d\{0, j(\hat\theta)\}$
     • $\frac{1}{\sqrt{n}}\, U(\theta) \xrightarrow{L} N_d\{0, I(\theta)\}$
     • $j(\hat\theta) = -\ell''(\hat\theta)/n$, $I(\theta) = E\{j(\theta)\}$
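The three pivots on this slide can be tried on a small scalar example. The sketch below is an illustration only: it assumes an exponential model with rate $\theta$ (not a model from the talk), for which $\ell(\theta) = n \log\theta - \theta \sum y_i$, $\hat\theta = n / \sum y_i$ and $j(\hat\theta) = 1/\hat\theta^2$. The function names `loglik`, `pivots`, `bisect` and `lr_interval` are made up for this example. The likelihood-ratio interval uses a drop of 1.92 in log-likelihood, which is half the 0.95 quantile (about 3.84) of $\chi^2_1$.

```python
import math

def loglik(theta, n, total):
    """Exponential-rate log-likelihood: n*log(theta) - theta*sum(y)."""
    return n * math.log(theta) - theta * total

def pivots(theta, n, total):
    """Return the Wald pivot, the score pivot and w(theta) for a scalar theta."""
    theta_hat = n / total
    j_hat = 1.0 / theta_hat ** 2            # -l''(theta_hat) / n for this model
    scale = math.sqrt(n * j_hat)
    wald = scale * (theta_hat - theta)      # sqrt(n)(theta_hat - theta), standardized
    score = (n / theta - total) / scale     # U(theta)/sqrt(n), standardized
    w = 2.0 * (loglik(theta_hat, n, total) - loglik(theta, n, total))
    return wald, score, w

def bisect(g, lo, hi, tol=1e-10):
    """Root of g on [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    g_lo_positive = g(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0) == g_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lr_interval(n, total, drop=1.92):
    """Set of theta whose log-likelihood lies within `drop` of the maximum."""
    theta_hat = n / total
    def g(t):
        return loglik(t, n, total) - loglik(theta_hat, n, total) + drop
    lower = bisect(g, 1e-6 * theta_hat, theta_hat)
    upper = bisect(g, theta_hat, 10.0 * theta_hat)
    return lower, upper

# Hypothetical data summary: n = 20 observations with sum 1.0, so theta_hat = 20.
lower, upper = lr_interval(20, 1.0)
```

The interval is not symmetric about $\hat\theta$, unlike the interval obtained from the Wald pivot; both rest on the same first-order limiting results.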

  6. ...approximate pivotals [Figure: log-likelihood function; horizontal axis $\theta$ from 16 to 23, vertical axis log-likelihood from −4 to 0]

  7. ...approximate pivotals [Figure: the same log-likelihood plot, with $\hat\theta$ marked]

  8. ...approximate pivotals [Figure: the same log-likelihood plot, with $\hat\theta$ and the distance $\hat\theta - \theta$ marked]

  9. ...approximate pivotals [Figure: the same log-likelihood plot, with $\hat\theta$ and the distance $\hat\theta - \theta$ marked]

  10. ...approximate pivotals [Figure: the same log-likelihood plot, with $\hat\theta$, the distance $\hat\theta - \theta$, the drop $w/2$ and the value 1.92 marked]

  11. ...approximate pivotals
     • $w(\theta) = 2\{\ell(\hat\theta) - \ell(\theta)\} \mathrel{\dot\sim} \chi^2_d$
     [Figure, panel (a): labelled curves in two dimensions, both axes running from −4 to 2]

  12. Likelihood as pivotal
     • Example: location model $f(y; \theta) = \prod_{i=1}^{n} f_0(y_i - \theta)$, $\theta \in \mathbb{R}$
     • Fisher (1934): $f(\hat\theta \mid a; \theta) = \exp\{\ell(\theta; y)\} \big/ \int \exp\{\ell(\theta; y)\}\, d\theta$
     • $(y_1, \ldots, y_n) \leftrightarrow (\hat\theta, a_1, \ldots, a_n)$, $a_i = y_i - \hat\theta$
     • exact (conditional) distribution of maximum likelihood estimator given by renormalized likelihood function
     • $p^*$ approximation: $p^*(\hat\theta \mid a; \theta) = c(\theta, a)\, |j(\hat\theta)|^{1/2} \exp\{\ell(\theta; \hat\theta, a) - \ell(\hat\theta; \hat\theta, a)\}$
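Fisher's renormalized likelihood is easy to evaluate numerically in a scalar location model. The sketch below assumes a Cauchy error density for $f_0$ and a made-up sample of five values; the grid limits, the trapezoid rule and the function names are choices for this illustration only. In the location model the log-likelihood depends on $\theta$ only through $\hat\theta - \theta$ once $a$ is fixed, so normalizing over $\theta$ gives the same curve as normalizing over $\hat\theta$.

```python
import math

def cauchy_loglik(theta, y):
    """Log-likelihood of a Cauchy location model with unit scale."""
    return -sum(math.log(math.pi * (1.0 + (yi - theta) ** 2)) for yi in y)

def renormalized_likelihood(y, lo, hi, steps=4000):
    """Evaluate exp{l(theta; y)} on a grid and rescale it to integrate to one.

    Uses the trapezoid rule on [lo, hi]; the interval must be wide enough
    that the likelihood is negligible outside it.
    """
    h = (hi - lo) / steps
    grid = [lo + k * h for k in range(steps + 1)]
    ll = [cauchy_loglik(t, y) for t in grid]
    top = max(ll)                                  # subtract the maximum for stability
    lik = [math.exp(v - top) for v in ll]
    area = h * (sum(lik) - 0.5 * (lik[0] + lik[-1]))
    density = [v / area for v in lik]
    return grid, density, h

# Hypothetical sample, for illustration only.
y = [-1.2, 0.3, 0.8, 1.5, 4.0]
grid, density, h = renormalized_likelihood(y, -40.0, 40.0)
mode = grid[density.index(max(density))]
```

Tail areas of this curve, computed by summing `density` over part of the grid, give the exact conditional p-values that the $p^*$ formula approximates in more general models.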

  13. A simpler approach
     • avoid $(y_1, \ldots, y_n) \leftrightarrow (\hat\theta, a)$
     • define a derivative $\varphi(\theta) \equiv \ell_{;V}(\theta; y^0) = \frac{\partial}{\partial V(y)} \ell(\theta; y) \Big|_{y = y^0}$
     • a directional derivative on the sample space
     • along with $\ell(\theta; y^0)$, the observed log-likelihood function
     • can be extended to derivative of mean likelihood; usable in wider context (Fraser/R Bka 2009)

  14. Tangent exponential model
     • A continuous model $f(y; \theta)$ on $\mathbb{R}^n$ can be approximated by an exponential family model on $\mathbb{R}^d$:
       $f_{TEM}(s; \theta)\, ds = \exp\{\varphi(\theta)' s + \ell^0(\theta)\}\, h(s)\, ds$   (1)
     • $s$ is a score variable on $\mathbb{R}^d$: $s(y) = -\ell_\varphi(\hat\theta^0; y)$
     • $\ell^0(\theta) = \ell(\theta; y^0)$ is the observed log-likelihood function
     • $\varphi(\theta) = \varphi(\theta; y^0)$ is the directional derivative $\ell_{;V}(\theta; y^0)$
     • (1) approximates original model to $O(n^{-1})$
     • gives approximation to the p-value for testing $\theta$
     • p-value is accurate to $O(n^{-3/2})$

  15. Cauchy density and TEM approximation [Figure: density (0.05 to 0.30) plotted against $y$ (−4 to 4)]

  16. Example: microscopic fluorescence
     • “tracking of microscopic fluorescent particles attached to biological specimens” (Hughes et al., AOAS, 2010)
     • “CCD (charge-coupled device) camera attached to a microscope used to observe the specimens repeatedly”
     • “we introduce an improved technique for analyzing such images over time”
     • Model for counts: $Z_i \sim N(f_i, f_i + \psi)$, $f_i \simeq B + \sum_j A_j \exp\{-[(x_i - x_j)^2 + (y_i - y_j)^2] / S_j^2\}$
     • $f_i$ developed from a model for photon emission; Normal approximation to Poisson; $\psi$ catches the instrument error
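A minimal sketch of the count model on this slide follows, assuming the mean is read as $f_i \simeq B + \sum_j A_j \exp\{-[(x_i - x_j)^2 + (y_i - y_j)^2]/S_j^2\}$ with $Z_i \sim N(f_i, f_i + \psi)$ independently across pixels. The function names, argument layout and numerical values are hypothetical and are not taken from Hughes et al.

```python
import math

def mean_intensity(px, py, background, sources):
    """Mean count f_i at pixel (px, py).

    `sources` is a list of (amplitude A_j, x_j, y_j, spread S_j) tuples,
    one per fluorophore; `background` is B.
    """
    total = background
    for amp, sx, sy, spread in sources:
        d2 = (px - sx) ** 2 + (py - sy) ** 2
        total += amp * math.exp(-d2 / spread ** 2)
    return total

def loglik(counts, pixels, background, sources, psi):
    """Log-likelihood for independent Z_i ~ N(f_i, f_i + psi)."""
    ll = 0.0
    for z, (px, py) in zip(counts, pixels):
        f = mean_intensity(px, py, background, sources)
        var = f + psi                       # Poisson-like variance plus instrument error
        ll += -0.5 * (math.log(2.0 * math.pi * var) + (z - f) ** 2 / var)
    return ll

# Hypothetical 3-pixel image with a single fluorophore at the origin.
pixels = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
counts = [61.0, 48.5, 9.0]
value = loglik(counts, pixels, background=10.0, sources=[(50.0, 0.0, 0.0, 2.0)], psi=4.0)
```

Because the variance depends on the mean, the parameters enter the likelihood through both the location and the scale of each pixel, which is one reason the maximization has to be done numerically.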

  17. ... microscopic fluorescence
     • “Our method, which applies maximum likelihood principles, improves the fit to the data, derives accurate standard errors from the data with minimal computation, and uses model-selection criteria to “count” the fluorophores in an image”
     • “likelihood ratio tests are used to select the final model”
     • potential for improved inference using likelihood methods?

  18. ... a simpler model
     • $Y_i \sim N(\mu_i, \mu_i + \psi)$, $\mu_i = \exp(\beta_0 + \beta_1 x_i)$
     • approximate pivot $r^*$ constructed from $\ell(\theta; y^0)$, $\varphi(\theta; y^0)$ should follow a $N(0, 1)$ distribution; checked by simulations
     [Figure: Normal Q-Q plot; vertical axis Sample Quantiles, both axes from −3 to 3]

  19. More realistic models
     • for example for analytic inferences for survey data
     • stochastic processes in space or space-time
     • extremes in several dimensions
     • frailty models in survival data
     • longitudinal data
     • family-based genetic data and other forms of clustering
     • estimation of recombination rates from SNP data
     • ...

  20. Example: Gaussian random field
     • scalar output $y$ at $p$-dimensional input $x = (x_1, \ldots, x_p)$
     • $y(x) = \phi(x)^T \beta + Z(x)$, $Z(x)$ Gaussian process on $\mathbb{R}^p$
     • $\mathrm{Cov}\{Z(x_1), Z(x_2)\} = \sigma^2 \prod_{i=1}^{p} R(|x_{1i} - x_{2i}|; \theta)$
     • $R(|x_{1i} - x_{2i}|) = \exp\{-\gamma_i |x_{1i} - x_{2i}|^{\alpha}\}$
     • anisotropic covariance matrix for inputs on different scales
     • application to computer experiments (Ximing Xu, U Toronto; Derek Bingham, SFU)

  21. ... Gaussian random field
     • $y_n = (y_1, \ldots, y_n) = \{y(x_1), \ldots, y(x_n)\}$, at $n$ locations $x_i$ in $\mathbb{R}^p$
     • $\ell(\beta, \sigma, \theta) = -\frac{1}{2}\{n \log \sigma^2 + \log |R(\theta)| + \frac{1}{\sigma^2}(y_n - \Phi\beta)^T R^{-1}(\theta)(y_n - \Phi\beta)\}$
     • computation of $R^{-1}$ is $O(n^3)$, $n$ typically 100s or 1000s
     • solution: make the correlation matrix sparse
     • solution: simplify the likelihood function
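The cost of the full log-likelihood can be seen in a direct implementation. The sketch below is plain Python written for this illustration. It assumes the product power-exponential correlation $\exp\{-\sum_i \gamma_i |x_{1i} - x_{2i}|^\alpha\}$ and evaluates $\ell(\beta, \sigma, \theta)$ through a Cholesky factor of $R(\theta)$, which is the $O(n^3)$ step. The helper names are hypothetical, and a practical implementation would use a linear algebra library.

```python
import math

def corr_matrix(X, gamma, alpha):
    """R[a][b] = exp{-sum_i gamma_i |X[a][i] - X[b][i]|^alpha}."""
    n = len(X)
    R = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            s = sum(g * abs(u - v) ** alpha for g, u, v in zip(gamma, X[a], X[b]))
            R[a][b] = math.exp(-s)
    return R

def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def forward_solve(L, b):
    """Solve L z = b for lower-triangular L."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def gp_loglik(y, Phi, beta, sigma2, X, gamma, alpha):
    """-(1/2){n log sigma^2 + log|R| + (y - Phi beta)^T R^{-1} (y - Phi beta) / sigma^2}."""
    n = len(y)
    resid = [yi - sum(p * b for p, b in zip(row, beta)) for yi, row in zip(y, Phi)]
    L = cholesky(corr_matrix(X, gamma, alpha))
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    z = forward_solve(L, resid)             # z^T z equals the quadratic form in R^{-1}
    quad = sum(v * v for v in z)
    return -0.5 * (n * math.log(sigma2) + logdet + quad / sigma2)

# Hypothetical design: 3 runs of a computer experiment with 2 inputs, constant mean.
X = [[0.0, 0.0], [1.0, 2.0], [0.5, 1.0]]
value = gp_loglik([1.2, 0.7, 1.0], [[1.0], [1.0], [1.0]], [1.0], 0.5, X, [0.5, 0.25], 1.5)
```

Every evaluation of the likelihood at a new $\theta$ repeats the factorization, which is why the slide points to sparse correlation matrices or simplified likelihoods when $n$ is in the hundreds or thousands.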
