Wavelet-based clustering for mixed-effects functional models in high dimension.
Franck Picard (LBBE - Lyon), Madison Giacofci (LJK - Grenoble), Sophie Lambert-Lacroix (TIMC - Grenoble), Guillemette Marot (Univ. Lille 2)
F. Picard (LBBE) SSB - February 2012 1 / 34
Outline
1. Introduction, Presentation
2. Functional Clustering with mixed effects
3. Estimation and model selection
4. Simulations
Functional data analysis
More and more fields collect curve-like data (growth curves, mass spectrometry, ...).
Functional data are observations that are curves sampled on a fine grid.
The usual statistical framework used to analyze such data is nonparametric regression: for $m = 1, \ldots, M$,
\[ Y(t_m) = \mu(t_m) + E(t_m), \qquad E(t) \sim \mathcal{N}(0, \sigma^2). \]
Goal: recover the function $\mu(t)$ from noisy observations.
Choosing wavelets when dealing with high-dimensional data
The traditional approach to functional data is to use a functional basis (polynomials, splines, wavelets).
Splines have long been studied, for instance in longitudinal data analysis.
Wavelets offer 3 main advantages:
→ fine modeling of curves with irregularities
→ sparse representations
→ computational efficiency (the DWT is in $O(M)$)
Definition of wavelets and wavelet coefficients
Wavelets provide an orthonormal basis of $L^2(\mathbb{R})$ with a scaling function $\phi$ and a mother wavelet $\psi$ such that:
\[ \left\{ \phi_{j_0 k}(t),\ k = 0, \ldots, 2^{j_0} - 1;\ \ \psi_{jk}(t),\ j \ge j_0,\ k = 0, \ldots, 2^{j} - 1 \right\} \]
Any function $Y \in L^2(\mathbb{R})$ is then expressed in the form:
\[ Y(t) = \sum_{k=0}^{2^{j_0}-1} c^*_{j_0 k}\, \phi_{j_0 k}(t) + \sum_{j \ge j_0} \sum_{k=0}^{2^{j}-1} d^*_{jk}\, \psi_{jk}(t) \]
where $c^*_{j_0 k} = \langle Y, \phi_{j_0 k} \rangle$ and $d^*_{jk} = \langle Y, \psi_{jk} \rangle$ are the theoretical scaling and wavelet coefficients.
DWT and empirical wavelet coefficients
We observe the function $Y(t)$ at discrete sample points $(t_m)$: $Y(t) = [Y(t_1), \ldots, Y(t_M)]$.
The Discrete Wavelet Transform is given by
\[ \underset{[M \times M]}{W}\ \underset{[M \times 1]}{Y} = \begin{pmatrix} c \\ d \end{pmatrix} \]
$W$ is an orthogonal matrix of filters (wavelet-specific), and $(c, d)$ are the empirical wavelet coefficients, such that:
\[ \sqrt{M} \times c^* \simeq c, \qquad \sqrt{M} \times d^* \simeq d \]
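The matrix form of the DWT can be illustrated with a minimal sketch. It assumes the Haar wavelet and a signal length that is a power of two; the slides do not say which wavelet family is used, and the names `haar_dwt` and `haar_matrix` are hypothetical helpers introduced only for this illustration.

```python
import math

def haar_dwt(y):
    """Orthonormal Haar DWT of a list whose length is a power of two.

    Returns the scaling coefficient followed by the wavelet coefficients,
    ordered from the coarsest to the finest scale.
    """
    n = len(y)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in y]
    levels = []  # detail coefficients, finest scale first
    while len(approx) > 1:
        evens, odds = approx[0::2], approx[1::2]
        levels.append([(a - b) / math.sqrt(2.0) for a, b in zip(evens, odds)])
        approx = [(a + b) / math.sqrt(2.0) for a, b in zip(evens, odds)]
    coeffs = approx
    for detail in reversed(levels):  # append coarse scales before fine ones
        coeffs.extend(detail)
    return coeffs

def haar_matrix(m):
    """Explicit M x M matrix W, built by transforming the canonical basis."""
    # cols[k] is W applied to the k-th basis vector, i.e. column k of W.
    cols = [haar_dwt([1.0 if i == k else 0.0 for i in range(m)]) for k in range(m)]
    return [[cols[k][r] for k in range(m)] for r in range(m)]

W = haar_matrix(8)
coeffs = haar_dwt([1.0] * 8)  # constant curve: only the scaling coefficient is non-zero
```

In this sketch the transform of a constant curve equal to 1 has scaling coefficient $\sqrt{M}$, which matches the relation $c \simeq \sqrt{M} \times c^*$ with $c^* = 1$. In practice the pyramid algorithm (`haar_dwt`) is used and the matrix $W$ is never formed.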
From nonparametric to parametric linear models
Once the data have been projected onto the wavelet basis, we retrieve a linear model:
\[ W Y(t) = W \mu(t) + W E(t) \]
\[ \begin{pmatrix} c \\ d \end{pmatrix} = \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \varepsilon \]
The next step is often to threshold the wavelet coefficients for reconstruction purposes.
Many strategies have been proposed, among which the standard hard thresholding rule [3], which sets to zero the coefficients $d_{jk}$ whose absolute value is lower than $\hat{\sigma} \sqrt{2 \log(M)}$.
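The hard thresholding rule can be sketched in a few lines. This is an illustration only: the noise level `sigma` is assumed to be known or estimated elsewhere, and the function names are hypothetical.

```python
import math

def universal_threshold(sigma, m):
    """Universal threshold sigma * sqrt(2 log M) for M sample points."""
    return sigma * math.sqrt(2.0 * math.log(m))

def hard_threshold(d, thresh):
    """Set to zero every coefficient whose absolute value is below thresh."""
    return [x if abs(x) >= thresh else 0.0 for x in d]

# Toy example with sigma = 1 and M = 512 (values chosen for illustration).
thresh = universal_threshold(1.0, 512)
kept = hard_threshold([0.5, -4.0, 3.0, 3.6], thresh)
```

Coefficients that survive the rule are left unchanged, which is what distinguishes hard thresholding from soft thresholding, where surviving coefficients are also shrunk towards zero.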
Functional ANOVA
Experiments are now designed to collect sets of curves on different individuals.
We now observe many realizations of the same function, which can be modeled by functional models: for $i = 1, \ldots, N$ and $m = 1, \ldots, M$,
\[ Y_i(t_m) = X_i \mu(t_m) + E_i(t_m), \qquad E_i(t) \sim \mathcal{N}(0, \sigma^2). \]
$\mu(t)$ becomes a fixed functional effect, and $X$ is its design matrix.
Standard statistical questions can be addressed in the functional setting: test of a functional effect, comparison of treatments, ...
Functional Clustering Model (FCM)
Among "classical" questions, clustering has received much attention.
The idea is to cluster individuals based on functional observations.
We suppose that the cluster structure concerns the fixed effects of the model.
When using a mixture model, we introduce the label variable $\zeta_{i\ell} \sim \mathcal{M}(1, \pi = (\pi_1, \ldots, \pi_L))$ such that, given $\{\zeta_{i\ell} = 1\}$:
\[ Y_i(t_m) = X_i \mu_\ell(t_m) + E_i(t_m) \]
In the coefficient domain, a standard EM algorithm can be used to estimate the parameters (case $X = I$) [2]:
\[ \begin{pmatrix} c_i \\ d_i \end{pmatrix} = \begin{pmatrix} \alpha_\ell \\ \beta_\ell \end{pmatrix} + \varepsilon_i . \]
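A minimal sketch of such an EM algorithm in the coefficient domain is given below. It assumes a noise variance $\sigma^2$ shared by all clusters and all coefficients, and user-supplied initial cluster means. The function name and the toy data are hypothetical, and this is not the implementation of [2].

```python
import math

def em_spherical_mixture(x, init_means, n_iter=50):
    """EM for a Gaussian mixture with cluster means and one common variance.

    x: list of N coefficient vectors (c_i, d_i), each of length M.
    Returns (pi, means, sigma2, tau), tau[i][l] being the posterior
    probability that individual i belongs to cluster l.
    """
    n, dim, n_clusters = len(x), len(x[0]), len(init_means)
    means = [list(m) for m in init_means]
    pi = [1.0 / n_clusters] * n_clusters
    grand = [sum(row[k] for row in x) / n for k in range(dim)]
    sigma2 = sum((row[k] - grand[k]) ** 2 for row in x for k in range(dim)) / (n * dim)
    sigma2 = max(sigma2, 1e-12)  # guard against a degenerate variance
    tau = []
    for _ in range(n_iter):
        # E-step: posterior membership probabilities (log-sum-exp for stability).
        tau = []
        for row in x:
            logw = []
            for l in range(n_clusters):
                sq = sum((row[k] - means[l][k]) ** 2 for k in range(dim))
                logw.append(math.log(max(pi[l], 1e-300)) - sq / (2.0 * sigma2))
            top = max(logw)
            w = [math.exp(v - top) for v in logw]
            total = sum(w)
            tau.append([v / total for v in w])
        # M-step: proportions, cluster means, then the common variance.
        for l in range(n_clusters):
            weight = max(sum(t[l] for t in tau), 1e-300)
            pi[l] = weight / n
            means[l] = [sum(t[l] * row[k] for t, row in zip(tau, x)) / weight
                        for k in range(dim)]
        sigma2 = sum(t[l] * (row[k] - means[l][k]) ** 2
                     for t, row in zip(tau, x)
                     for l in range(n_clusters) for k in range(dim)) / (n * dim)
        sigma2 = max(sigma2, 1e-12)
    return pi, means, sigma2, tau

# Toy data: two well-separated groups of three coefficient vectors.
data = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
pi, means, sigma2, tau = em_spherical_mixture(data, [data[0], data[3]])
```

Because the variance is common to all clusters, the Gaussian normalizing constants cancel in the E-step, so only the squared distances to the cluster means are needed.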
Application to mass spectrometry data
Each spectrum contains 15154 ionised peptides defined by an m/z ratio.
Available at http://home.ccr.cancer.gov/ncifdaproteomics/ppatterns.asp
Samples from 253 women: 91 controls, 162 cases (ovarian cancer) [6]
Figure: MALDI-TOF spectra (window of 512); panels: control, cancer.
Application to array CGH data
Each profile is a CGH profile from a breast cancer patient.
Samples: 55 profiles with clinical information [5]
Subgroup discovery (1q16q)
Very high inter-individual variability
Figure: Array CGH profiles from [5]; panels: 1q16q, other.
Using a functional model on these data provides an EER ∼ 50%!
Functional Clustering with mixed effects
Functional Clustering Mixed Models
Mixed models are used to account for some structure in the variability of the observations.
Functional mixed models introduce inter-individual functional variability such that, given $\{\zeta_{i\ell} = 1\}$:
\[ Y_i(t_m) = X_i \mu_\ell(t_m) + Z_i U_i(t_m) + E_i(t_m) \]
$U_i(t) \sim \mathcal{N}(0, K_\ell(s, t))$ are random functions independent of $E_i(t)$.
In the wavelet domain, the model reduces to (case $X = Z = I$):
\[ \begin{pmatrix} c_i \\ d_i \end{pmatrix} = \begin{pmatrix} \alpha_\ell \\ \beta_\ell \end{pmatrix} + \begin{pmatrix} \nu_i \\ \theta_i \end{pmatrix} + \varepsilon_i, \qquad \begin{pmatrix} \nu_i \\ \theta_i \end{pmatrix} \sim \mathcal{N}\left( 0, \begin{pmatrix} G_\nu & 0 \\ 0 & G_\theta \end{pmatrix} \right). \]
Specification of the covariance of random effects
We suppose that $G$ is diagonal [4].
The fixed and random effects should then lie in the same Besov space.
Introduce a parameter $\eta$ related to the regularity of the process $U_i(t)$.
Theorem (Abramovich et al. [1]). Suppose $\mu(t) \in B^s_{p,q}$ and $V(\theta_{i,jk}) = 2^{-j\eta} \gamma^2_\theta$; then
\[ U_i(t) \in B^s_{p,q}[0,1] \ \text{a.s.} \iff \begin{cases} s + 1/2 - \eta/2 = 0, & \text{if } 1 \le p < \infty \text{ and } q = \infty, \\ s + 1/2 - \eta/2 < 0, & \text{otherwise.} \end{cases} \]
The structure of the random effect can also vary with respect to position and scale ($\gamma^2_{\theta, jk}$), and/or group membership ($\gamma^2_{\theta, jk\ell}$).
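The scale-dependent variance $2^{-j\eta} \gamma^2_\theta$ can be made concrete with a small sketch. It assumes $2^j$ wavelet coefficients at scale $j$ and a diagonal $G_\theta$. The function names, the seed and the parameter values are hypothetical choices for illustration.

```python
import math
import random

def random_effect_variances(gamma2, eta, j0, n_scales):
    """Diagonal of G_theta: variance gamma2 * 2**(-j * eta) for each of the
    2**j coefficients at scale j, for j = j0, ..., j0 + n_scales - 1."""
    variances = []
    for j in range(j0, j0 + n_scales):
        variances.extend([gamma2 * 2.0 ** (-j * eta)] * (2 ** j))
    return variances

def simulate_theta(variances, rng):
    """Draw independent centered Gaussian random effects with these variances."""
    return [rng.gauss(0.0, math.sqrt(v)) for v in variances]

variances = random_effect_variances(gamma2=1.0, eta=2.0, j0=0, n_scales=3)
theta = simulate_theta(variances, random.Random(0))
```

With $\eta = 2$ the variance is divided by 4 from one scale to the next, so the random effect contributes mostly to the coarse scales.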
Dimensionality reduction step
Inspired by a two-step strategy proposed in Antoniadis et al. [2].
Step 1: individual hard thresholding with the universal threshold $\hat{\sigma}_\varepsilon \sqrt{2 \log M}$.
Use the average of the MAD estimators computed on each individual.
This strategy seems reasonable since, at the finest scale $J$:
\[ V(d_{i,Jk}) = \sigma^2_\varepsilon + 2^{-J\eta} \gamma^2_\theta \simeq \sigma^2_\varepsilon \]
Step 2: take the union of the selected coefficients.
This removes positions that are non-informative with respect to the clustering goal (i.e. positions that are zero for all individuals).
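The two steps can be sketched as follows. This is an illustration under stated assumptions: the MAD estimator is taken as the median absolute value of the finest-scale coefficients divided by 0.6745 (the usual Gaussian consistency constant), the finest scale is stored at the end of each coefficient vector, and all positions are thresholded alike. The function names and the toy data are hypothetical.

```python
import math
from statistics import median

def mad_sigma(finest):
    """MAD-based estimate of the noise standard deviation."""
    return median([abs(v) for v in finest]) / 0.6745

def select_positions(coeffs, n_finest):
    """Return the union of positions kept by individual hard thresholding.

    coeffs: one coefficient vector per individual, all of length M, with the
    n_finest finest-scale coefficients stored last.
    """
    m = len(coeffs[0])
    # Average the individual MAD estimates, then apply the universal threshold.
    sigma = sum(mad_sigma(d[-n_finest:]) for d in coeffs) / len(coeffs)
    thresh = sigma * math.sqrt(2.0 * math.log(m))
    kept = set()
    for d in coeffs:
        kept.update(k for k, v in enumerate(d) if abs(v) >= thresh)
    return sorted(kept), thresh

# Toy example: two individuals, each with one large coefficient.
coeffs = [
    [10.0, 0.0, 0.1, -0.2, 0.1, -0.1, 0.2, -0.1],
    [0.1, 8.0, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2],
]
positions, thresh = select_positions(coeffs, n_finest=4)
```

Each individual keeps a different position here, and the union retains both, so the clustering step then works on the same reduced set of coefficients for everyone.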
Estimation and model selection
Using the EM algorithm
In the wavelet domain, the model is a Gaussian mixture model with a structured variance.
Both the label variables $\zeta$ and the random effects $(\nu, \theta)$ are unobserved.
The complete-data log-likelihood can be written:
\[ \log L\left(c, d, \nu, \theta, \zeta; \pi, \alpha, \beta, G, \sigma^2_\varepsilon\right) = \log L\left(c, d \mid \nu, \theta, \zeta; \pi, \alpha, \beta, \sigma^2_\varepsilon\right) + \log L(\nu, \theta \mid \zeta; G) + \log L(\zeta; \pi). \]
This likelihood can be easily computed thanks to the properties of mixed linear models:
\[ \begin{pmatrix} c_i \\ d_i \end{pmatrix} \,\Big|\, \begin{pmatrix} \nu_i \\ \theta_i \end{pmatrix}, \{\zeta_{i\ell} = 1\} \sim \mathcal{N}\left( \begin{pmatrix} \alpha_\ell + \nu_i \\ \beta_\ell + \theta_i \end{pmatrix}, \sigma^2_\varepsilon I \right). \]
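A short worked step may help connect this conditional distribution to the mixture density used in the E-step. Assuming that the random effects $(\nu_i, \theta_i) \sim \mathcal{N}(0, G)$ are independent of $\varepsilon_i$ and that $G$ does not depend on the cluster, integrating the random effects out adds their covariance to the noise covariance:

```latex
\[
\begin{pmatrix} c_i \\ d_i \end{pmatrix} \,\Big|\, \{\zeta_{i\ell} = 1\}
\sim \mathcal{N}\left( \begin{pmatrix} \alpha_\ell \\ \beta_\ell \end{pmatrix},\ G + \sigma^2_\varepsilon I \right),
\qquad
L(c_i, d_i) = \sum_{\ell = 1}^{L} \pi_\ell \, f\left(c_i, d_i;\ \alpha_\ell, \beta_\ell,\ G + \sigma^2_\varepsilon I\right),
\]
```

where $f$ denotes the Gaussian density. With $G$ diagonal, this marginal covariance is diagonal as well, so the density factorizes over the coefficients.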
Predictions of hidden variables
The EM algorithm provides the posterior probability of membership to cluster $\ell$:
\[ \tau_{i\ell}^{[h+1]} = \frac{\pi_\ell^{[h]} \, f\left(c_i, d_i;\ \alpha_\ell^{[h]}, \beta_\ell^{[h]},\ G^{[h]} + \sigma_\varepsilon^{2[h]} I\right)}{\sum_p \pi_p^{[h]} \, f\left(c_i, d_i;\ \alpha_p^{[h]}, \beta_p^{[h]},\ G^{[h]} + \sigma_\varepsilon^{2[h]} I\right)} . \]
The E-step also provides the BLUP of the random effects:
\[ \hat{\nu}_{i\ell}^{[h+1]} = \left(c_i - \alpha_\ell^{[h]}\right) / \left(1 + \lambda_\nu^{[h]}\right), \qquad \lambda_\nu = \sigma^2_\varepsilon / \gamma^2_\nu, \]
\[ \hat{\theta}_{i\ell}^{[h+1]} = \left(d_i - \beta_\ell^{[h]}\right) / \left(1 + 2^{j\eta} \lambda_\theta^{[h]}\right), \qquad \lambda_\theta = \sigma^2_\varepsilon / \gamma^2_\theta . \]
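Both E-step quantities can be sketched for a diagonal $G$. The code below assumes that `g` holds the diagonal of $G$ (strictly positive random-effect variances, one per coefficient) and that `sigma2` is the noise variance. The function names are hypothetical and this is a sketch rather than the authors' implementation.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (a - b) ** 2 / v
                      for a, b, v in zip(x, mean, var))

def posterior_probs(x, pi, means, g, sigma2):
    """Posterior membership probabilities, with marginal covariance G + sigma2 * I."""
    var = [gk + sigma2 for gk in g]
    logw = [math.log(p) + log_gauss_diag(x, m, var) for p, m in zip(pi, means)]
    top = max(logw)  # log-sum-exp trick for numerical stability
    w = [math.exp(v - top) for v in logw]
    total = sum(w)
    return [v / total for v in w]

def blup(x, mean, g, sigma2):
    """BLUP of the random effects given the cluster: (x - mean) / (1 + sigma2 / g)."""
    return [(a - b) / (1.0 + sigma2 / gk) for a, b, gk in zip(x, mean, g)]

# Toy example: one coefficient vector, two clusters, unit variances.
tau = posterior_probs([0.0, 0.0], [0.5, 0.5], [[0.0, 0.0], [3.0, 3.0]], [1.0, 1.0], 1.0)
shrunk = blup([2.0], [0.0], [0.25], 1.0)
```

With $g = 2^{-j\eta} \gamma^2_\theta$, the ratio `sigma2 / g` equals $2^{j\eta} \lambda_\theta$, so the shrinkage is stronger at fine scales, where the random effect has little variance.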