Mixed effect model for the spatiotemporal analysis of longitudinal manifold valued data
Stéphanie Allassonnière, with J.B. Schiratti, J. Chevallier, I. Koval, V. Debavalaere and S. Durrleman
Université Paris Descartes & Ecole Polytechnique
Computational Anatomy
• Represent and analyse geometrical elements upon which deformations can act
• Describe the observed objects as geometrical variations of one or several representative elements
• Quantify this variability inside a population
Deformable template model from Grenander
• How does the deformation act?
• What is a representative element?
• How to quantify the geometrical variability?
Computational Anatomy
One solution:
• Quantify the distance between observations using deformations
• Provide a statistical model to approximate the generation of the observed population from the atlas
• Propose a statistical learning algorithm
• Optimise the numerical estimation
Bayesian Mixed Effect model
• First model:
– One observation per subject
– Image or shape (viewed as currents)
– Deformations either linearized or diffeomorphic
– Homogeneous or heterogeneous populations (mixture models)
Bayesian Mixed Effect model
• Limitations of the first model:
– One observation per subject
– Corresponding acquisition time across subjects
Longitudinal Data Analysis
• Longitudinal model:
– Several observations per subject
– Image, shape, etc.
– Atlas = representative trajectory and population variability
Longitudinal Data Analysis
[Figure: measurements of subject #1, subject #2 and subject #3 plotted against time (age)]
How to learn representative trajectories of data changes from longitudinal data?
• Linear mixed-effects models [Laird & Ware '82, Diggle et al., Fitzmaurice et al.]
– Temporal marker of progression (e.g. time since drug injection, seeding, birth, etc.)
– Regression
– Vector-valued data (e.g. compare measurements at the same time-point)
• Learning the spatiotemporal distribution of trajectories
– No temporal marker of progression (e.g. in aging, neurodegenerative diseases, etc.)
– Manifold-valued data (normalized data, positive matrices, shapes, etc.)
– Needs to disentangle differences in measurements and in the dynamics of measurement changes
– Find temporal correspondences
– Compare data at corresponding stages of progression
Spatiotemporal Statistical Model
• Statistical model including:
– a representative trajectory of data changes
– spatiotemporal variations in measurement values and in the pace of measurement changes
$$T_0(t) = \mathrm{Exp}_{p_0, t_0}(v_0)(t)$$
$$T_i(t) = \mathrm{Exp}_{T_0(t)}\big(P^{T_0}_{t_0, t}(v_i)\big)$$
$$y_{ij} = T_i(\psi_i(t_{ij})) + \varepsilon_{ij}$$
$$\psi_i(t) = t_0 + \alpha_i (t - t_0 - \tau_i)$$
• Orthogonality condition ($A_k \perp v_0$) ensures identifiability (unique space/time decomposition)
• Time is not a covariate but a random variable
Random effects:
– Acceleration factor: $\alpha_i \sim \log\mathcal{N}(0, \sigma_\alpha^2)$
– Time-shift: $\tau_i \sim \mathcal{N}(0, \sigma_\tau^2)$
– Space-shift: $v_i = (A_1 | \dots | A_K)\, s_i$, with $A_k \perp v_0$
Fixed effects: $(p_0, t_0, v_0)$ and $(\sigma_\alpha^2, \sigma_\tau^2, A_1, \dots, A_K)$
[Schiratti et al. IPMI'15, NIPS'15]
Spatiotemporal Statistical Model
Submanifold-valued observations:
$$y_{ij} = T_i(\psi_i(t_{ij})) + \varepsilon_{ij}$$
– Parallel curve: $T_i(t) = \mathrm{Exp}_{T_0(t)}\big(P^{T_0}_{t_0, t}(v_i)\big)$
– Representative trajectory: $T_0(t) = \mathrm{Exp}_{p_0, t_0}(v_0)(t)$
– Linear time reparametrization: $\psi_i(t) = t_0 + \alpha_i (t - t_0 - \tau_i)$
Hidden random variables:
– Acceleration factor: $\alpha_i \sim \log\mathcal{N}(0, \sigma_\alpha^2)$
– Time shift: $\tau_i \sim \mathcal{N}(0, \sigma_\tau^2)$
– Space shift: $v_i = (A_1 | \dots | A_K)\, s_i$, with $A_k \perp v_0$
Parameters:
– Mean trajectory parametrization: $(p_0, t_0, v_0)$
– Prior parameters: $(\sigma_\alpha^2, \sigma_\tau^2, A_1, \dots, A_K)$
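The time reparametrization and the two time-related random effects can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `time_warp` and `sample_time_effects`, the seed and all numeric values are hypothetical.

```python
import math
import random


def time_warp(t, t0, alpha, tau):
    """Subject-specific reparametrization psi_i(t) = t0 + alpha_i * (t - t0 - tau_i)."""
    return t0 + alpha * (t - t0 - tau)


def sample_time_effects(sigma_alpha, sigma_tau, rng):
    """Draw (alpha_i, tau_i): alpha_i is log-normal, tau_i is Gaussian."""
    alpha = math.exp(rng.gauss(0.0, sigma_alpha))  # log(alpha_i) ~ N(0, sigma_alpha^2)
    tau = rng.gauss(0.0, sigma_tau)                # tau_i ~ N(0, sigma_tau^2)
    return alpha, tau


rng = random.Random(0)  # fixed seed, illustrative only
alpha_i, tau_i = sample_time_effects(sigma_alpha=0.3, sigma_tau=2.0, rng=rng)
t0 = 70.0  # hypothetical reference time (for instance an age)
# The subject is at the reference stage t0 when its own clock reads t0 + tau_i.
print(time_warp(t0 + tau_i, t0, alpha_i, tau_i))
```

With `alpha = 1` and `tau = 0` the warp is the identity, so a subject with no time effects follows the representative trajectory at the population pace.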
Spatiotemporal Statistical Model
Comparison with previous work:
[Figure: schematic comparison of the two constructions; labels $p_0$, $v_0$, $v_i$, $p_{ij}$, $v_{ij}$, $\Sigma$, $P_{0,t}$]
Interest: parallel transport keeps the structure of the distribution invariant while updating it over time.
Spatiotemporal Statistical Model
• The straight line model: $M = \mathbb{R}$
– Schiratti et al. (2015):
$$y_{ij} = (a + a_i)(t_{ij} - t_0 - \tau_i) + b + \varepsilon_{ij}$$
where $t_0 + \tau_i$ is the time at which the measurement of the $i$-th subject reaches $b$
– Laird & Ware (1982):
$$y_{ij} = (a + a_i)(t_{ij} - t_0) + b + b_i + \varepsilon_{ij}$$
where $b + b_i$ is the measurement of the $i$-th subject at time $t_0$
[Figure: two panels of subject lines plotted against time, with $b$ and $t_0$ marked]
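The two straight-line parametrizations can be compared numerically. The sketch below drops the noise term and uses hypothetical function names and values. Expanding the first equation shows that a given subject's line is the same in both models when $b_i = -(a + a_i)\tau_i$, so the difference lies in which quantity is treated as the random effect (a time shift $\tau_i$ or an intercept offset $b_i$).

```python
def progression_line(t, a, b, t0, a_i, tau_i):
    """Time-shift form: y = (a + a_i) * (t - t0 - tau_i) + b (noise omitted)."""
    return (a + a_i) * (t - t0 - tau_i) + b


def random_intercept_line(t, a, b, t0, a_i, b_i):
    """Random slope and intercept form: y = (a + a_i) * (t - t0) + b + b_i (noise omitted)."""
    return (a + a_i) * (t - t0) + b + b_i


def equivalent_intercept(a, a_i, tau_i):
    """Intercept offset b_i that reproduces the line obtained with time shift tau_i."""
    return -(a + a_i) * tau_i


# Hypothetical values, for illustration only.
a, b, t0, a_i, tau_i = 0.5, 10.0, 70.0, 0.1, 3.0
b_i = equivalent_intercept(a, a_i, tau_i)
for t in (60.0, 70.0, 73.0, 80.0):
    print(t, progression_line(t, a, b, t0, a_i, tau_i),
          random_intercept_line(t, a, b, t0, a_i, b_i))
```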
Spatiotemporal Statistical Model
• The logistic curve model: $M = \,]0, 1[$, with metric
$$g(p)(u, v) = \frac{uv}{p^2 (1 - p)^2}$$
• Geodesics are logistic curves:
$$\gamma_0(t) = \left(1 + \frac{1 - p_0}{p_0} \exp\left(-\frac{v_0}{p_0 (1 - p_0)} (t - t_0)\right)\right)^{-1}$$
$$y_{ij} = \gamma_0\big(t_0 + \alpha_i (t_{ij} - t_0 - \tau_i)\big) + \varepsilon_{ij}$$
• It is not equivalent to a linear model on the logit of the observations (i.e. the Riemannian log at $p_0 = 0.5$), since $p_0$ is estimated
• If we fix $p_0 = 0.5$ in our model → we end up with our previous linear case (different from Laird & Ware)
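A short numerical check of the logistic geodesic may help: the curve should pass through $p_0$ at $t_0$ with derivative $v_0$. The sketch assumes the closed form given on the slide; the function names and parameter values are hypothetical.

```python
import math


def logistic_geodesic(t, p0, t0, v0):
    """gamma_0(t): logistic curve with gamma_0(t0) = p0 and derivative v0 at t0."""
    rate = v0 / (p0 * (1.0 - p0))
    return 1.0 / (1.0 + (1.0 - p0) / p0 * math.exp(-rate * (t - t0)))


def subject_curve(t, p0, t0, v0, alpha, tau):
    """Noise-free subject trajectory gamma_0(t0 + alpha * (t - t0 - tau))."""
    return logistic_geodesic(t0 + alpha * (t - t0 - tau), p0, t0, v0)


# Hypothetical parameters, for illustration only.
p0, t0, v0 = 0.3, 70.0, 0.05
h = 1e-5
slope = (logistic_geodesic(t0 + h, p0, t0, v0) - logistic_geodesic(t0 - h, p0, t0, v0)) / (2 * h)
print(logistic_geodesic(t0, p0, t0, v0), slope)
```

A subject with time shift `tau` reaches the value `p0` at time `t0 + tau`, whatever its acceleration factor `alpha`.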
Spatiotemporal Statistical Model
• The propagation model: $M = \,]0, 1[^N$, with metric
$$g(p)(u, v) = \sum_{k=1}^{N} \frac{u_k v_k}{p_k^2 (1 - p_k)^2}$$
• Geodesics are logistic curves in each coordinate
• Parametric family of geodesics seen as a model of propagation of an effect:
$$\gamma_\delta(t) = \big(\gamma_0(t), \gamma_0(t - \delta_1), \dots, \gamma_0(t - \delta_{N-1})\big)$$
• The parallel curve in the direction of the space-shift $v_i$ writes
$$\left(\gamma_0\left(t + \frac{v_{i,1}}{v_0}\right), \gamma_0\left(t - \delta_1 + \frac{v_{i,2}}{v_0}\right), \dots, \gamma_0\left(t - \delta_{N-1} + \frac{v_{i,N}}{v_0}\right)\right)$$
→ The parallel curve changes the relative timing of the effect onset across coordinates
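The effect of a space-shift on the timing of each coordinate can be illustrated as follows. The sketch reads the slide's parallel-curve formula as a time offset $v_{i,k}/v_0$ applied to coordinate $k$; that reading, the function names and the numeric values are all assumptions.

```python
import math


def logistic_geodesic(t, p0, t0, v0):
    """Scalar logistic geodesic gamma_0(t) with gamma_0(t0) = p0."""
    rate = v0 / (p0 * (1.0 - p0))
    return 1.0 / (1.0 + (1.0 - p0) / p0 * math.exp(-rate * (t - t0)))


def propagation_geodesic(t, p0, t0, v0, deltas):
    """gamma_delta(t) = (gamma_0(t), gamma_0(t - delta_1), ..., gamma_0(t - delta_{N-1}))."""
    delays = [0.0] + list(deltas)
    return [logistic_geodesic(t - d, p0, t0, v0) for d in delays]


def parallel_curve(t, p0, t0, v0, deltas, space_shift):
    """Coordinate k is evaluated at t - delta_{k-1} + v_{i,k} / v0 (assumed reading)."""
    delays = [0.0] + list(deltas)
    if len(space_shift) != len(delays):
        raise ValueError("space_shift needs one entry per coordinate")
    return [logistic_geodesic(t - d + w / v0, p0, t0, v0)
            for d, w in zip(delays, space_shift)]


# Hypothetical values: three coordinates with onset delays of 2 and 5 time units.
p0, t0, v0, deltas = 0.5, 70.0, 0.05, [2.0, 5.0]
print(propagation_geodesic(72.0, p0, t0, v0, deltas))
print(parallel_curve(72.0, p0, t0, v0, deltas, [0.0, 0.1, -0.05]))
```

In this reading, a positive component of the space-shift advances the onset in that coordinate and a negative one delays it, while the shape of each logistic curve is unchanged.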
Parameter Estimation
$$y = (y_1, \dots, y_N), \quad z = (z_1, \dots, z_N), \quad \theta = (\sigma_z^2, \sigma_\varepsilon^2, A_1, \dots, A_K, p_0, t_0, v_0)$$
• Maximum Likelihood:
$$\max_\theta \; p(y \mid \theta) = \int p(y, z \mid \theta)\, dz$$
• EM:
$$\theta_{k+1} = \operatorname*{argmax}_\theta \sum_{i=1}^{N} \int \log p(y_i, z_i \mid \theta)\; p(z_i \mid y_i, \theta_k)\, dz_i, \qquad p(y_i, z_i \mid \theta) = p(y_i \mid z_i, \theta)\, p(z_i \mid \theta)$$
• Distribution from the curved exponential family:
$$\log p(y_i, z_i \mid \theta) = \phi(\theta)^T S(y_i, z_i) - \log C(\theta)$$
$$\theta_{k+1} = \operatorname*{argmax}_\theta \left\{ \phi(\theta)^T \sum_{i=1}^{N} \int S(y_i, z_i)\, p(z_i \mid y_i, \theta_k)\, dz_i - N \log C(\theta) \right\}$$
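To make the EM update concrete, here is a sketch on a deliberately simple toy model that is not the spatiotemporal model of the talk: $z_i \sim \mathcal{N}(\mu, 1)$ and $y_i \mid z_i \sim \mathcal{N}(z_i, 1)$, with $\theta = \mu$. In this toy case the complete log-likelihood has the exponential form with $\phi(\mu) = \mu$, $S(y_i, z_i) = z_i$ and $\log C(\mu) = \mu^2/2$, the conditional expectation is $E[z_i \mid y_i, \mu_k] = (y_i + \mu_k)/2$, and the maximizer is the averaged sufficient statistic. The function name and the data are hypothetical.

```python
def em_toy(y, mu_init=0.0, n_iter=60):
    """EM for the toy model z_i ~ N(mu, 1), y_i | z_i ~ N(z_i, 1)."""
    mu = mu_init
    for _ in range(n_iter):
        # E-step: expected sufficient statistic, E[z_i | y_i, mu] = (y_i + mu) / 2.
        expected_s = sum((y_i + mu) / 2.0 for y_i in y) / len(y)
        # M-step: argmax over mu of { mu * S - mu^2 / 2 } is mu = S.
        mu = expected_s
    return mu


y = [1.0, 2.0, 4.5]  # illustrative data
print(em_toy(y), sum(y) / len(y))
```

Each iteration halves the distance to the sample mean, which is the maximum likelihood estimate of $\mu$ in this toy model because marginally $y_i \sim \mathcal{N}(\mu, 2)$.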
Parameter Estimation: stochastic algorithm
• SA-EM: replaces the integration by one simulation of the hidden variable: sample $z_{i,k+1}$ from $p(z_i \mid y_i, \theta_k)$, and compute a stochastic approximation of the sufficient statistics
$$S_{k+1} = (1 - \Delta_k)\, S_k + \Delta_k \left( \frac{1}{N} \sum_{i=1}^{N} S(y_i, z_{i,k+1}) \right)$$
Maximization step (unchanged):
$$\theta_{k+1} = \operatorname*{argmax}_\theta \left\{ \phi(\theta)^T S_{k+1} - \log C(\theta) \right\}$$
• MCMC-SAEM: replaces the sampling by a single Markov chain step
• For each subject, sample the random effect w.r.t. a transition kernel of a geometrically ergodic Markov chain targeting the conditional distribution $q(z_i \mid y_i, \theta_k)$
[Delyon, Lavielle, Moulines '99] [Allassonnière et al. '10]
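The three steps (one Markov chain move per subject, stochastic approximation of the sufficient statistic, maximization) can be sketched on a toy model rather than the spatiotemporal model itself: $z_i \sim \mathcal{N}(\mu, 1)$, $y_i \mid z_i \sim \mathcal{N}(z_i, 1)$, for which $S(y_i, z_i) = z_i$ and the maximization step is $\mu = S$. The random-walk Metropolis-Hastings kernel, the step-size schedule, the burn-in length and all names are assumptions made for the example.

```python
import math
import random


def log_target(z, y_i, mu):
    """log p(z | y_i, mu) up to a constant, for z ~ N(mu, 1) and y_i | z ~ N(z, 1)."""
    return -0.5 * (y_i - z) ** 2 - 0.5 * (z - mu) ** 2


def mh_step(z, y_i, mu, rng, step=1.0):
    """One random-walk Metropolis-Hastings step targeting p(z | y_i, mu)."""
    proposal = z + rng.gauss(0.0, step)
    log_ratio = log_target(proposal, y_i, mu) - log_target(z, y_i, mu)
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    if math.log(1.0 - rng.random()) < log_ratio:
        return proposal
    return z


def mcmc_saem(y, n_iter=600, burn_in=100, seed=0):
    """MCMC-SAEM estimate of mu for the toy model."""
    rng = random.Random(seed)
    mu = 0.0
    z = list(y)  # initial hidden variables
    s = 0.0      # stochastic approximation of the sufficient statistic
    for k in range(n_iter):
        # Simulation: a single Markov chain step per subject.
        z = [mh_step(z_i, y_i, mu, rng) for z_i, y_i in zip(z, y)]
        # Stochastic approximation: Delta_k = 1 during burn-in, then decreasing.
        delta = 1.0 if k < burn_in else 1.0 / (k - burn_in + 1)
        s = (1.0 - delta) * s + delta * sum(z) / len(z)
        # Maximization: argmax over mu of { mu * s - mu^2 / 2 } is mu = s.
        mu = s
    return mu


data_rng = random.Random(1)
y = [data_rng.gauss(2.0, math.sqrt(2.0)) for _ in range(400)]  # synthetic data
print(mcmc_saem(y), sum(y) / len(y))
```

In this toy case the estimate can be compared with the sample mean, which is the exact maximum likelihood estimate; for the spatiotemporal model no such closed form is available, which is why the stochastic algorithm is used.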