  1. lcda: Local Classification of Discrete Data by Latent Class Models. Michael Bücker, buecker@statistik.tu-dortmund.de, July 9, 2009

  2. lcda: Local Classification of Discrete Data by Latent Class Models (M. Bücker, useR! 2009). Introduction
     ◮ common global classification methods may be inefficient when groups are heterogeneous ⇒ need for more flexible, local models
     ◮ continuous models that allow for subclasses:
       ⊲ Mixture Discriminant Analysis (MDA): assumes class-conditional mixtures of (multivariate) normals
       ⊲ the Common Components model (Titsias and Likas 2001) implies a mixture of normals with common components
     ◮ in this talk: discrete counterparts based on Latent Class Models (see Lazarsfeld and Henry 1968), implemented in the R package lcda
     ◮ application to SNP data

  3. Local structures (figure)

  6. Mixture Discriminant Analysis and Common Components
     ◮ class conditional density (MDA)
       $f(x \mid Z = k) = f_k(x) = \sum_{m=1}^{M_k} w_{mk}\,\phi(x; \mu_{mk}, \Sigma)$
     ◮ class conditional density of the Common Components model (Titsias and Likas 2001)
       $f(x \mid Z = k) = f_k(x) = \sum_{m=1}^{M} w_{mk}\,\phi(x; \mu_{m}, \Sigma)$
     ◮ posterior based on Bayes' rule
       $P(Z = k \mid X = x) = \dfrac{\pi_k f_k(x)}{\sum_{l=1}^{K} \pi_l f_l(x)}$
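The Bayes posterior above is straightforward to evaluate numerically. A minimal Python sketch (not the lcda package's R code; the density functions and all numbers are made up for illustration):

```python
import numpy as np

# Bayes' rule over class-conditional mixture densities:
# P(Z = k | X = x) = pi_k f_k(x) / sum_l pi_l f_l(x)
def posterior(x, priors, class_densities):
    fx = np.array([f(x) for f in class_densities])   # f_k(x) for each class
    unnorm = np.asarray(priors) * fx                 # pi_k * f_k(x)
    return unnorm / unnorm.sum()

# Toy class-conditional densities: class 0 is a two-component mixture.
# Unnormalized unit-variance Gaussians suffice here, since the shared
# normalizing constant cancels in the ratio.
f0 = lambda x: 0.5 * np.exp(-x**2 / 2) + 0.5 * np.exp(-(x - 1)**2 / 2)
f1 = lambda x: np.exp(-(x - 3)**2 / 2)
p = posterior(0.0, [0.6, 0.4], [f0, f1])             # strongly favors class 0 near x = 0
```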

  10. Latent Class Model
     ◮ latent (unobservable) variable $Y$ with categorical outcomes in $\{1, \ldots, M\}$ with probabilities $P(Y = m) = w_m$
     ◮ manifest (observable) variables $X_1, \ldots, X_D$, where $X_d$ has outcomes in $\{1, \ldots, R_d\}$ with probabilities $P(X_d = r \mid Y = m) = \theta_{mdr}$
     ◮ define $x_{dr} = 1$ if $X_d = r$ and $x_{dr} = 0$ otherwise, and assume stochastic independence of the manifest variables conditional on $Y$; then the conditional probability mass function is
       $f(x \mid m) = \prod_{d=1}^{D} \prod_{r=1}^{R_d} \theta_{mdr}^{x_{dr}}$
     ◮ the unconditional probability mass function of the manifest variables is
       $f(x) = \sum_{m=1}^{M} w_m \prod_{d=1}^{D} \prod_{r=1}^{R_d} \theta_{mdr}^{x_{dr}}$
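The unconditional pmf above is easy to evaluate directly. A small Python sketch (the array layout and all numbers are my own illustration, not the lcda package's):

```python
import numpy as np

# LCM pmf: f(x) = sum_m w_m * prod_d theta[m, d, x_d]
def lcm_pmf(x, w, theta):
    """x: length-D list of 0-based category indices;
    w: (M,) latent class weights; theta: (M, D, R) response probabilities."""
    M, D, R = theta.shape
    probs = np.array([np.prod([theta[m, d, x[d]] for d in range(D)])
                      for m in range(M)])            # f(x | m) per class
    return float(w @ probs)

# M = 2 latent classes, D = 2 binary items (made-up parameters)
w = np.array([0.3, 0.7])
theta = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.4, 0.6], [0.5, 0.5]]])
```

Summing `lcm_pmf` over all four possible outcomes returns 1, which is a quick sanity check that the parameters define a valid pmf.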

  12. Identifiability
     Proposition 1. The LCM $f(x) = \sum_{m=1}^{M} w_m \prod_{d=1}^{D} \prod_{r=1}^{R_d} \theta_{mdr}^{x_{dr}}$ is not identifiable.
     Proof.
     ◮ the LCM is a finite mixture of products of multinomial distributions
     ◮ each mixture component $f(x \mid m)$ is a product of $\mathcal{M}(1, \theta_{md1}, \ldots, \theta_{mdR_d})$-distributed random variables
     ◮ mixtures of $M$ multinomials $\mathcal{M}(N, \theta_1, \ldots, \theta_p)$ are identifiable iff $N \geq 2M - 1$ (Elmore and Wang 2003)
     ◮ mixtures of products of marginal distributions are identifiable if mixtures of the marginal distributions are identifiable (Teicher 1967)
     ◮ here $N = 1 < 2M - 1$ for $M \geq 2$, so the marginal mixtures are not identifiable
     ⇒ the LCM is not identifiable. ✷

  15. Estimation of the LCM
     ◮ estimation by the EM algorithm:
     ◮ E step: determination of the conditional expectation of $Y$ given $X = x_n$,
       $\tau_{mn} = \dfrac{w_m f(x_n \mid m)}{f(x_n)}$
     ◮ M step: maximization of the log-likelihood, i.e. estimation of
       $w_m = \frac{1}{N} \sum_{n=1}^{N} \tau_{mn}$
       and
       $\theta_{mdr} = \frac{1}{N w_m} \sum_{n=1}^{N} \tau_{mn} x_{ndr}$
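These E and M steps translate almost line by line into code. Below is a minimal Python EM sketch under simplifying assumptions of my own (equal number of categories R for all items, a random Dirichlet start, and a fixed iteration count instead of a convergence test); lcda itself is an R package and may differ in all of these details:

```python
import numpy as np

def lcm_em(X, M, R, n_iter=200, seed=0):
    """X: (N, D) integer matrix of 0-based categories; M: number of latent
    classes; R: categories per item. Returns (w, theta)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    w = np.full(M, 1.0 / M)
    theta = rng.dirichlet(np.ones(R), size=(M, D))      # shape (M, D, R)
    for _ in range(n_iter):
        # E step: tau[m, n] = w_m f(x_n | m) / f(x_n)
        log_f = np.zeros((M, N))
        for d in range(D):
            log_f += np.log(theta[:, d, X[:, d]])       # sum_d log theta_{m d x_nd}
        tau = w[:, None] * np.exp(log_f)
        tau /= tau.sum(axis=0, keepdims=True)
        # M step: w_m = (1/N) sum_n tau_mn; theta_mdr from weighted counts
        w = tau.mean(axis=1)
        for d in range(D):
            for r in range(R):
                theta[:, d, r] = tau[:, X[:, d] == r].sum(axis=1) / (N * w)
    return w, theta
```

Because the M step divides each weighted count by $N w_m = \sum_n \tau_{mn}$, every row `theta[m, d, :]` sums to one by construction.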

  16. Model selection criteria
     ◮ information criteria
       ⊲ AIC: $-2 \log L(w, \theta \mid x) + 2\eta$
       ⊲ BIC: $-2 \log L(w, \theta \mid x) + \eta \log N$
       where $\eta = M\left(\sum_{d=1}^{D} R_d - D + 1\right) - 1$ (number of parameters)
     ◮ goodness-of-fit test statistics (predicted vs. observed frequencies)
       ⊲ Pearson's $\chi^2$
       ⊲ likelihood ratio $\chi^2$
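As a quick check of the parameter count: for M = 3 classes and D = 4 items with R_d = 3 categories each, eta = 3*(12 - 4 + 1) - 1 = 26, i.e. (M - 1) mixture weights plus M*sum_d(R_d - 1) response probabilities. In Python (the log-likelihood value below is a placeholder, not a fitted value):

```python
import math

def n_params(M, R):
    """eta = M * (sum_d R_d - D + 1) - 1, with R a list of category counts R_d."""
    D = len(R)
    return M * (sum(R) - D + 1) - 1

def aic(loglik, M, R):
    return -2 * loglik + 2 * n_params(M, R)

def bic(loglik, M, R, N):
    return -2 * loglik + n_params(M, R) * math.log(N)

eta = n_params(3, [3, 3, 3, 3])   # 3 * (12 - 4 + 1) - 1 = 26
```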

  19. Local Classification of Discrete Data
     ◮ two ways to use the LCM for local classification:
       ⊲ class conditional mixtures (as in MDA)
       ⊲ common components
     ◮ class conditional mixtures:
       $P(X = x \mid Z = k) = f_k(x) = \sum_{m=1}^{M_k} w_{mk} \prod_{d=1}^{D} \prod_{r=1}^{R_d} \theta_{mkdr}^{x_{dr}}$
     ◮ common components:
       $P(X = x \mid Z = k) = f_k(x) = \sum_{m=1}^{M} w_{mk} \prod_{d=1}^{D} \prod_{r=1}^{R_d} \theta_{mdr}^{x_{dr}}$

  21. Estimation of a common components model (option 1)
     ◮ let $\pi_k$ be the class prior; then
       $P(X = x) = \sum_{k=1}^{K} \pi_k \sum_{m=1}^{M} w_{mk} \prod_{d=1}^{D} \prod_{r=1}^{R_d} \theta_{mdr}^{x_{dr}} = \sum_{m=1}^{M} w_m \prod_{d=1}^{D} \prod_{r=1}^{R_d} \theta_{mdr}^{x_{dr}}$
       since $w_m := P(m) = \sum_{k=1}^{K} P(k) P(m \mid k) = \sum_{k=1}^{K} \pi_k w_{mk}$
     ◮ this is an ordinary (global) Latent Class Model
     ◮ hence, estimate a global Latent Class Model and determine the parameters $w_{mk}$ of the common components model by
       $\hat{w}_{mk} = \frac{1}{N_k} \sum_{i=1}^{N_k} \hat{P}(Y = m \mid Z = k, X = x_i)$
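The second stage of option 1 is just an average of latent-class posteriors within each class. A Python sketch with hypothetical inputs (`post` and `z` are illustrative arrays, not lcda objects, and the numbers are made up):

```python
import numpy as np

def class_weights(post, z, K):
    """w_mk = (1/N_k) * sum over class-k observations of P(Y = m | x_i).
    post: (N, M) latent-class posteriors; z: (N,) class labels in 0..K-1."""
    return np.vstack([post[z == k].mean(axis=0) for k in range(K)])  # (K, M)

post = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.4, 0.6]])
z = np.array([0, 0, 1, 1])
w_mk = class_weights(post, z, K=2)
```

Each row of the result sums to one, since it averages probability vectors; row k gives the subclass weights $w_{mk}$ of class k.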

  23. Estimation of a common components model (option 2)
     ◮ E step: determination of the conditional expectation
       $\tau_{mkn} = \dfrac{w_{mk} f(x_n \mid m)}{f(x_n)}$
     ◮ M step: maximization of the log-likelihood, i.e. estimation of
       $w_{mk} = \frac{1}{N_k} \sum_{n=1}^{N_k} \tau_{mkn}$
       and
       $\theta_{mdr} = \frac{1}{\sum_{k=1}^{K} N_k w_{mk}} \sum_{k=1}^{K} \sum_{n=1}^{N_k} \tau_{mkn} x_{ndr}$

  24. Classification capability in Common Components Models
     ◮ a measure of the ability to separate the classes adequately
     ◮ impurity measures that treat the subgroups like nodes in decision trees
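As one concrete instance of such an impurity measure (the slide does not fix a particular one), the Gini impurity of the class proportions inside a subgroup can serve as the node-style score:

```python
# Gini impurity of the class proportions within one latent subgroup,
# analogous to scoring a decision-tree node: 0 means the subgroup contains
# a single class (perfect separation), larger values mean more mixing.
def gini(p):
    """p: class proportions within one subgroup, summing to 1."""
    return 1.0 - sum(q * q for q in p)

pure, mixed = gini([1.0, 0.0]), gini([0.5, 0.5])
```

A common components model whose subgroups all have low impurity separates the classes well, even though the subgroups themselves are shared across classes.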
