On Mixtures of Factor Mixture Analyzers

Cinzia Viroli (cinzia.viroli@unibo.it)
Department of Statistics, University of Bologna, Italy

Compstat 2010, Paris, August 22-27
State of the art (1)

■ In model-based clustering the data are assumed to come from a finite mixture model (McLachlan and Peel, 2000), with each component corresponding to a cluster.

■ For quantitative data each mixture component is usually modelled as a multivariate Gaussian distribution (Fraley and Raftery, 2002):

  f(y; θ) = ∑_{i=1}^{k} w_i φ^(p)(y; µ_i, Σ_i)

■ However, when the number of observed variables is large, it is well known that Gaussian mixture models represent an over-parameterized solution.
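To make the over-parameterization claim concrete, a minimal sketch (the helper name and the parameter accounting are ours, not from the slides) counts the free parameters of a k-component Gaussian mixture with unrestricted covariances:

```python
def gmm_param_count(k: int, p: int) -> int:
    """Free parameters of a k-component Gaussian mixture in p dimensions
    with full covariance matrices: (k-1) mixing weights, k*p means,
    and k*p*(p+1)/2 covariance entries."""
    return (k - 1) + k * p + k * p * (p + 1) // 2

# The covariance term grows quadratically in p,
# e.g. k=3 clusters in p=50 dimensions already need 3977 parameters:
for p in (5, 50, 500):
    print(p, gmm_param_count(3, p))
```

The quadratic growth of the covariance term is what motivates the dimensionally reduced models on the next slides.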
State of the art (2)

Some solutions (among others):

Model-based clustering:
■ Banfield and Raftery (1993): proposed a parameterization of the generic component-covariance matrix based on its spectral decomposition: Σ_i = λ_i A_i^⊤ D_i A_i
■ Bouveyron et al. (2007): proposed a different parameterization of the generic component-covariance matrix.

Dimensionally reduced model-based clustering:
■ Ghahramani and Hinton (1997) and McLachlan et al. (2003): Mixtures of Factor Analyzers (MFA)
■ Yoshida et al. (2004), Baek and McLachlan (2008), Montanari and Viroli (2010): Factor Mixture Analysis (FMA)
Mixture of factor analyzers (MFA)

■ Dimensionality reduction is performed through k factor models with Gaussian factors.

■ The distribution of each observation is modelled, with probability π_j (j = 1, …, k), according to an ordinary factor analysis model y = η_j + Λ_j z + e_j, with e_j ∼ φ^(p)(0, Ψ_j), where Ψ_j is a diagonal matrix and z ∼ φ^(q)(0, I_q).

■ In the observed space we obtain a finite mixture of multivariate Gaussians with heteroscedastic components:

  f(y) = ∑_{j=1}^{k} π_j φ^(p)(y; η_j, Λ_j Λ_j^⊤ + Ψ_j)
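The MFA marginal density can be evaluated directly from the component parameters; a minimal sketch (the function name and array shapes are our assumptions) using SciPy:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mfa_density(y, pi, eta, Lam, Psi):
    """Marginal MFA density f(y) = sum_j pi_j N(y; eta_j, Lam_j Lam_j^T + Psi_j).
    pi: (k,) weights; eta: (k, p) means; Lam: (k, p, q) loadings;
    Psi: (k, p) diagonal noise variances."""
    return sum(
        pi[j] * multivariate_normal.pdf(
            y, mean=eta[j], cov=Lam[j] @ Lam[j].T + np.diag(Psi[j]))
        for j in range(len(pi)))
```

As a sanity check, with one component, zero loadings, zero mean and unit noise variances the density at the origin reduces to the standard p-variate normal value (2π)^(-p/2).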
Factor Mixture Analysis (FMA)

■ Dimensionality reduction is performed through a single factor model with factors modelled by a multivariate Gaussian mixture.

■ The observed centred data are described as y = Λz + e, with e ∼ φ^(p)(0, Ψ), where Ψ is diagonal.

■ The q factors are assumed to be standardized and are modelled as a finite mixture of multivariate Gaussians:

  f(z) = ∑_{i=1}^{k} γ_i φ^(q)(z; µ_i, Σ_i).

■ In the observed space we obtain a finite mixture of multivariate Gaussians with heteroscedastic components:

  f(y) = ∑_{i=1}^{k} γ_i φ^(p)(y; Λµ_i, ΛΣ_iΛ^⊤ + Ψ).
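The FMA marginal density follows the same pattern, but with a single loading matrix Λ and noise matrix Ψ shared across components; a sketch (function name and shapes are our assumptions):

```python
import numpy as np
from scipy.stats import multivariate_normal

def fma_density(y, gamma, Lam, Psi, mu, Sigma):
    """Marginal FMA density of centred y:
    f(y) = sum_i gamma_i N(y; Lam mu_i, Lam Sigma_i Lam^T + diag(Psi)).
    gamma: (k,) weights; Lam: (p, q) loadings; Psi: (p,) diagonal variances;
    mu: (k, q) factor means; Sigma: (k, q, q) factor covariances."""
    return sum(
        gamma[i] * multivariate_normal.pdf(
            y, mean=Lam @ mu[i], cov=Lam @ Sigma[i] @ Lam.T + np.diag(Psi))
        for i in range(len(gamma)))
```

Unlike MFA, only the low-dimensional means µ_i and covariances Σ_i vary across components, which is where the parameter savings come from.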
MFA vs FMA

MFA:
■ k factor models with q Gaussian factors;
■ the number of clusters corresponds to the number of factor models ⇒ 'local' dimension reduction within each group;
■ a flexible solution with fewer parameters than model-based clustering.

FMA:
■ one factor model with q non-Gaussian factors (distributed as a multivariate mixture of Gaussians);
■ the number of clusters is defined by the number of components of the Gaussian mixture ⇒ 'global' dimension reduction, and clustering is performed in the latent space;
■ a flexible solution with fewer parameters than model-based clustering.
Mixtures of Factor Mixture Analyzers
The model

We assume the data can be described by k_1 factor models with probability π_j (j = 1, …, k_1):

  y = η_j + Λ_j z + e_j.  (1)

Within all the factor models, the factors are assumed to be distributed according to a finite mixture of k_2 Gaussians:

  f(z) = ∑_{i=1}^{k_2} γ_i φ^(q)(µ_i, Σ_i),  (2)

with mixture parameters supposed to be equal across the factor models j = 1, …, k_1.
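Marginalizing z out of (1) under (2) gives, for each pair (j, i), a Gaussian component, so the observed density is a mixture of k_1 · k_2 Gaussians. A sketch of this implied density (the function name, shapes, and the explicit double-sum form are our derivation from (1) and (2), not copied from the slides):

```python
import numpy as np
from scipy.stats import multivariate_normal

def mfma_density(y, pi, gamma, eta, Lam, Psi, mu, Sigma):
    """Marginal MFMA density implied by (1) and (2):
    f(y) = sum_j sum_i pi_j gamma_i
           N(y; eta_j + Lam_j mu_i, Lam_j Sigma_i Lam_j^T + diag(Psi_j)).
    pi: (k1,), gamma: (k2,), eta: (k1, p), Lam: (k1, p, q),
    Psi: (k1, p), mu: (k2, q), Sigma: (k2, q, q)."""
    k1, k2 = len(pi), len(gamma)
    return sum(
        pi[j] * gamma[i] * multivariate_normal.pdf(
            y,
            mean=eta[j] + Lam[j] @ mu[i],
            cov=Lam[j] @ Sigma[i] @ Lam[j].T + np.diag(Psi[j]))
        for j in range(k1) for i in range(k2))
```

Note how MFA (k_2 = 1) and FMA (k_1 = 1) fall out as special cases of this double mixture.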