Generalized Linear Factor Models: a local EM estimation

Xavier Bry (a), Christian Lavergne (a, b) and Mohamed Saidane (c)
E-mails: [bry, lavergne]@math.univ-montp2.fr; Mohamed.Saidane@isg.rnu.tn
(a) I3M, Université Montpellier II, France
(b) Université Montpellier III, France
(c) Université du 7 Novembre à Carthage, Tunisie

X. Bry, C. Lavergne, M. Saidane: Generalized Linear Factor Models: a local EM estimation. COMPSTAT, August 2010, Paris
Motivations

Fields: Social Sciences, Biology, Environment, ...
→ Miscellaneous types of variables:
  Quantitative measures
  Qualitative characteristics
  Counts
  Life times
Distributions involved: $\mathcal{B}(1, p)$; $\mathcal{B}(n, p)$; $\mathcal{M}(1; p_1, \dots, p_k)$; $\mathcal{M}(n; p_1, \dots, p_k)$; $\mathcal{P}(\cdot)$; $(k, \cdot)$; etc.

Variables are abundant & related ⇒ dimension reduction ⇒ correlation modeling
Model & notations

Data: observed on $n$ observation units $\{1, \dots, t, \dots, n\}$: $p$ variables $\{y_1, \dots, y_p\}$.
For unit $t$: $y_t = (y_{it})_{i=1,p}$, of size $(p,1)$.
Underlying the variables → $q$ latent factors $\{f_1, \dots, f_q\}$, with $q < p$.
For unit $t$: $f_t = (f_{jt})_{j=1,q}$, of size $(q,1)$.
Observation units are independent.

Factor Model: $\forall t,\ f_t \sim N(0; I_q)$
Factors $f_j$ generate linear predictors of variables $y_i$ → linear predictor of $y_{it} \mid f_t$:
$$\eta_{it} = \theta_i + a_i' f_t \qquad\Longleftrightarrow\qquad \eta_t = \theta + A f_t$$
with sizes $(p,1) = (p,1) + (p,q)(q,1)$.

Notations: $A = (a_1, \dots, a_p)'$; $\theta = (\theta_i)_i$; $F = (f_1, \dots, f_t, \dots, f_n)$; $\eta_t = (\eta_{it})_i$;
$$\eta = (\eta_{it})_{i,t} = (\eta_1, \dots, \eta_t, \dots, \eta_n) = \theta \mathbf{1}_n' + A F$$
with sizes $(p,n) = (p,1)(1,n) + (p,q)(q,n)$.
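The matrix form $\eta = \theta \mathbf{1}_n' + A F$ can be checked on simulated factors. The following is a minimal sketch, not the authors' code: the function name `simulate_linear_predictors` and the numerical values of `theta` and `A` are illustrative assumptions.

```python
import numpy as np

def simulate_linear_predictors(theta, A, n, seed=0):
    """Draw F with columns f_t ~ N(0, I_q) and return (F, eta),
    where eta = theta 1_n' + A F has shape (p, n)."""
    rng = np.random.default_rng(seed)
    p, q = A.shape
    F = rng.standard_normal((q, n))   # (q, n): one column per observation unit t
    eta = theta[:, None] + A @ F      # theta is broadcast over the n columns
    return F, eta

# Hypothetical parameter values: p = 3 variables, q = 2 factors
theta = np.array([0.5, -1.0, 2.0])
A = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, -1.0]])
F, eta = simulate_linear_predictors(theta, A, n=5)
print(eta.shape)
```

Column $t$ of `eta` is exactly $\theta + A f_t$, which is the per-unit form of the linear predictor.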
Model & notations

Model of $y$ conditional on $F$: $\forall t,\ y_t \mid f_t \sim \wp_t \in$ Exponential family (Nelder & Wedderburn):
$$l_i(y_{it} \mid \delta_{it}, \phi) = \exp\left\{ \frac{y_{it}\,\delta_{it} - b_i(\delta_{it})}{a_{it}(\phi)} + c_i(y_{it}, \phi) \right\}$$
$\forall t$, the $(y_{it})_i \mid f_t$ are independent.

Mean and variance:
$$\mu_{it} = E(y_{it}) = b_i'(\delta_{it})$$
$$\mathrm{Var}(y_{it}) = a_{it}(\phi)\, b_i''(\delta_{it}) = a_{it}(\phi)\, b_i''\big(b_i'^{-1}(\mu_{it})\big) = a_{it}(\phi)\, v_i(\mu_{it}), \qquad v_i(\mu_{it}) = b_i''\big(b_i'^{-1}(\mu_{it})\big)$$
Conditional variance matrix:
$$\mathrm{Var}(y_t) = \mathrm{diag}\{ a_{it}(\phi)\, v_i(\mu_{it}) \}_{i=1,p}$$

Link with linear predictor: $\forall i, t:\ \eta_{it} = g_i(\mu_{it})$, where $g_i$ is the link function.
$\delta_{it}$ = canonical parameter; $g_i = b_i'^{-1} \Rightarrow \eta_{it} = \delta_{it}$ (canonical link).

The classical Gaussian Linear Factor Model: $y_{it} \mid f_t \sim N(\mu_{it}; \sigma^2)$ with $\mu_{it} = \eta_{it}$.
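The relations $\mu = b'(\delta)$ and $v(\mu) = b''(b'^{-1}(\mu))$ can be verified numerically for two familiar members of the exponential family. This is a sketch under assumptions: the Poisson and Bernoulli cumulant functions are standard facts, while the helper names (`b_poisson`, `b_bernoulli`, `first_deriv`, `second_deriv`) and the finite-difference step sizes are illustrative choices.

```python
import numpy as np

def b_poisson(delta):
    # Poisson cumulant function: b(delta) = exp(delta)
    return np.exp(delta)

def b_bernoulli(delta):
    # Bernoulli cumulant function: b(delta) = log(1 + exp(delta))
    return np.log1p(np.exp(delta))

def first_deriv(f, x, h=1e-5):
    # Central finite difference for b'(delta)
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_deriv(f, x, h=1e-4):
    # Central finite difference for b''(delta)
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

delta = 0.3
mu_pois = first_deriv(b_poisson, delta)      # expected: exp(delta)
v_pois = second_deriv(b_poisson, delta)      # expected: v(mu) = mu
mu_bern = first_deriv(b_bernoulli, delta)    # expected: logistic(delta)
v_bern = second_deriv(b_bernoulli, delta)    # expected: v(mu) = mu (1 - mu)

print(mu_pois, v_pois)
print(mu_bern, v_bern)
```

With the canonical link, $\eta = \delta$, so the log link for the Poisson case and the logit link for the Bernoulli case fall out of $g = b'^{-1}$.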
Factor Models: available estimation techniques

The classical Gaussian Linear Factor Model:
EM algorithm estimation: $A$ is estimated by maximizing the expectation, conditional on the observations, of the completed log-likelihood, integrated with respect to the factors.
⇒ Taking the conditional expectation of the first-order conditions:
$$\sum_t E\big[ \nabla \log l(y_t, f_t) \mid y_t \big] = 0$$
The left-hand side is the expectation, conditional on the observations, of the derivative of the completed log-likelihood (EDLCO). This is possible because the EDLCO is analytically determined.

Generalized Linear Factor Models:
→ The EDLCO is not analytically determined ⇒ direct EM is impossible.
→ [Moustaki, I., & Knott, M. (2000)]: maximization of the expected completed log-likelihood, single factor; Gauss-Hermite quadrature is used to approximate the integral. Computationally intensive.
→ [Wedel, M., & Kamakura, W.A. (2001)]: Monte Carlo approach. Computationally intensive.
→ [Moustaki, I., & Victoria-Feser, M.P. (2006)]: iterative estimation method inspired by the indirect inference technique [Gourieroux (1993)].
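To illustrate the kind of integral that Gauss-Hermite quadrature approximates in the single-factor case, here is a minimal sketch. It is not the procedure of Moustaki & Knott; it only computes $E[h(f)]$ for $f \sim N(0, 1)$. The function name `gh_expectation` and the values of `theta_i` and `a_i` are assumptions for illustration. The test integrand, the mean of a Poisson variable with log link, has the closed form $\exp(\theta_i + a_i^2/2)$, which allows a check.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def gh_expectation(h, n_nodes=20):
    """Approximate E[h(f)] for f ~ N(0, 1) by Gauss-Hermite quadrature.

    hermgauss returns nodes x_k and weights w_k for the weight exp(-x^2),
    so the change of variable f = sqrt(2) x gives
    E[h(f)] ~ (1 / sqrt(pi)) * sum_k w_k h(sqrt(2) x_k).
    """
    x, w = hermgauss(n_nodes)
    return np.sum(w * h(np.sqrt(2.0) * x)) / np.sqrt(np.pi)

# Hypothetical intercept and loading for one variable i
theta_i, a_i = 0.2, 0.8

# Marginal mean of y_it when mu_it = exp(theta_i + a_i f_t)
approx = gh_expectation(lambda f: np.exp(theta_i + a_i * f))
exact = np.exp(theta_i + 0.5 * a_i**2)
print(approx, exact)
```

The number of nodes grows exponentially with the number of factors if a tensor grid is used, which is one reason such approaches are computationally intensive beyond a single factor.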
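Looking back at GLM's

GLM of one variable $y$, depending on predictors $X = (x_j)_{j=1,q}$, with $\mu = E(y)$.
Linear predictor: $\eta = X\beta$; $\forall t,\ \eta_t = g(\mu_t) \Rightarrow x_t'\beta = g(b'(\delta_t))$

Log-likelihood:
$$L(\beta; y) = \sum_{t=1}^{n} L_t(\beta; y_t) = \sum_{t=1}^{n} \left[ \frac{y_t\,\delta_t - b(\delta_t)}{a_t(\phi)} + c(y_t, \phi) \right]$$

Derivative with respect to $\beta$:
$$\frac{\partial L_t}{\partial \beta_j} = \frac{\partial L_t}{\partial \delta_t}\, \frac{\partial \delta_t}{\partial \mu_t}\, \frac{\partial \mu_t}{\partial \eta_t}\, \frac{\partial \eta_t}{\partial \beta_j} = \frac{y_t - \mu_t}{a_t(\phi)}\; \frac{1}{b''(\delta_t)}\; \frac{1}{g'(\mu_t)}\; x_{tj}$$
with $V(y_t) = a_t(\phi)\, v(\mu_t)$.

Let:
$$W_\beta = \mathrm{diag}\big( g'(\mu_t)^2\, V(y_t) \big)_{t=1,n} = \mathrm{diag}\big( g'(\mu_t)^2\, a_t(\phi)\, v(\mu_t) \big)_{t=1,n}; \qquad \frac{\partial \eta}{\partial \mu} = \mathrm{diag}\left( \frac{\partial \eta_t}{\partial \mu_t} \right)_{t=1,n} = \mathrm{diag}\big( g'(\mu_t) \big)_{t=1,n}$$
Then:
$$\nabla_\beta L = 0 \iff X' W_\beta^{-1}\, \frac{\partial \eta}{\partial \mu}\, (y - \mu) = 0$$
These equations are non-linear in $\beta$, but interpretable as the normal equations of a linear model.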
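As a sanity check of the score expression $X' W_\beta^{-1} \frac{\partial \eta}{\partial \mu} (y - \mu)$, the sketch below evaluates it for a Poisson GLM with log link, where $a_t(\phi) = 1$, $v(\mu) = \mu$ and $g'(\mu) = 1/\mu$, so the expression should reduce to $X'(y - \mu)$. The simulated data, the seed, and the function name `score_general` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.standard_normal(n)])  # intercept + one predictor
beta = np.array([0.2, -0.4])                               # hypothetical coefficients
y = rng.poisson(np.exp(X @ beta)).astype(float)

def score_general(X, y, beta):
    """Score X' W^{-1} (d eta / d mu) (y - mu) for a Poisson GLM with log link."""
    mu = np.exp(X @ beta)            # mu = g^{-1}(eta) for the log link
    g_prime = 1.0 / mu               # g'(mu) for g = log
    var_y = mu                       # V(y_t) = a_t(phi) v(mu_t) = mu_t
    W = g_prime**2 * var_y           # diagonal of W_beta
    return X.T @ ((g_prime / W) * (y - mu))

score_direct = X.T @ (y - np.exp(X @ beta))   # known Poisson/log-link score
print(score_general(X, y, beta), score_direct)
```

For a canonical link the factor $W_\beta^{-1} \frac{\partial \eta}{\partial \mu}$ collapses to $1/a_t(\phi)$, which is why the two vectors coincide here.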
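Looking back at GLM's

Fisher's scores algorithm ($[k]$ = iteration nr.):
$$\beta^{[k+1]} = \beta^{[k]} - \left[ E\left( \frac{\partial^2 L}{\partial \beta\, \partial \beta'} \right) \right]_{[k]}^{-1} \left( \frac{\partial L}{\partial \beta} \right)_{[k]} = \beta^{[k]} + \left( X' W_{[k]}^{-1} X \right)^{-1} X' W_{[k]}^{-1} \left( \frac{\partial \eta}{\partial \mu} \right)_{[k]} \big( y - \mu^{[k]} \big)$$
$$\iff X' W_{[k]}^{-1} X\, \beta^{[k+1]} = X' W_{[k]}^{-1} \left[ X \beta^{[k]} + \left( \frac{\partial \eta}{\partial \mu} \right)_{[k]} \big( y - \mu^{[k]} \big) \right] = X' W_{[k]}^{-1} z^{[k]}$$
where $z^{[k]} = X \beta^{[k]} + \left( \frac{\partial \eta}{\partial \mu} \right)_{[k]} (y - \mu^{[k]})$ is the working variable.

$$\nabla_\beta L = 0 \iff X' W_\beta^{-1}\, \frac{\partial \eta}{\partial \mu}\, (y - \mu) = 0 \iff X' W_\beta^{-1} (z_\beta - X\beta) = 0$$
These are the normal equations of the linear model $M_\beta$: $z_\beta = X\beta + \zeta$; $E(\zeta) = 0$; $V(\zeta) = W_\beta$, with $V(\zeta_t) = V(z_{\beta,t}) = g'(\mu_t)^2\, V(y_t)$.
Current linearized model $M^{[k]}$: $z^{[k]} = X\beta + \zeta^{[k]}$; $E(\zeta^{[k]}) = 0$, $V(\zeta^{[k]}) = W^{[k]}$.

Iterative GLS estimation:
$$g(y) \approx g(\mu) + g'(\mu)(y - \mu) = X\beta + \frac{\partial \eta}{\partial \mu}(y - \mu) = z_\beta$$
0) Initialize $M^{[0]}$ with OLS of $g(y)$ on $X$ → $\beta^{[0]}$
i) $\beta^{[k]}$ → $W_{\beta^{[k]}}$; $z_{\beta^{[k]}}$
ii) GLS on $M^{[k]}$ → $\beta^{[k+1]}$
Repeat i) and ii) until convergence.

= Quasi-Likelihood Estimation (QLE): it mimics MLE at each step, under an assumption of normality and independence of the $z_{\beta,t}$'s, with a fixed covariance structure.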
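Steps 0), i) and ii) of the iterative GLS scheme can be sketched for a Poisson GLM with log link as follows. This is an illustrative implementation under assumptions, not the authors' code: the function name `irls_poisson`, the simulated data, the tolerance, and the use of $\log(y + 0.5)$ in the OLS initialization (to avoid $\log 0$ on zero counts) are all choices made here.

```python
import numpy as np

def irls_poisson(X, y, max_iter=50, tol=1e-10):
    """Iterative GLS (Fisher scoring) for a Poisson GLM with log link."""
    # Step 0: OLS of g(y) on X; log(y + 0.5) is an assumed guard against log(0)
    beta, *_ = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)
    for _ in range(max_iter):
        # Step i: from beta[k], build W and the working variable z
        eta = X @ beta
        mu = np.exp(eta)
        g_prime = 1.0 / mu                 # g'(mu) for the log link
        W = g_prime**2 * mu                # diagonal of W = g'(mu)^2 V(y), V(y) = mu
        z = eta + g_prime * (y - mu)       # z = X beta + (d eta / d mu)(y - mu)
        # Step ii: GLS on the linearized model z = X beta + zeta, V(zeta) = W
        XtWinv = X.T / W                   # X' W^{-1}, using the diagonal of W
        beta_new = np.linalg.solve(XtWinv @ X, XtWinv @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Simulated data with hypothetical true coefficients
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ beta_true)).astype(float)

beta_hat = irls_poisson(X, y)
score_at_hat = X.T @ (y - np.exp(X @ beta_hat))   # should be close to zero
print(beta_hat, score_at_hat)
```

At convergence the GLS normal equations $X' W^{-1}(z - X\beta) = 0$ coincide with the score equations, so the printed score vector is a direct convergence diagnostic.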