covariance matrix estimation for the cryo em
play

Covariance Matrix Estimation for the Cryo-EM Heterogeneity Problem - PowerPoint PPT Presentation

Covariance Matrix Estimation for the Cryo-EM Heterogeneity Problem Amit Singer Princeton University Department of Mathematics and Program in Applied and Computational Mathematics July 25, 2014 Joint work with Gene Katsevich (Princeton) and


  1. Covariance Matrix Estimation for the Cryo-EM Heterogeneity Problem Amit Singer Princeton University Department of Mathematics and Program in Applied and Computational Mathematics July 25, 2014 Joint work with Gene Katsevich (Princeton) and Alexander Katsevich (UCF) Amit Singer (Princeton University) July 2014 1 / 41

  2. Single Particle Cryo-Electron Microscopy Drawing of the imaging process: Amit Singer (Princeton University) July 2014 2 / 41

  3. The X-ray Transform � φ ( R T ( P s φ )( x , y ) = s r ) dz , R where r = ( x , y , z ) T , R s ∈ SO (3), and φ ∈ L 2 ( B ) is the electric potential of the molecule. Projection I s   | | | Molecule φ  ∈ SO(3) R 1 R 2 R 3 R s =  s s s | | | Electron source Amit Singer (Princeton University) July 2014 3 / 41

  4. The cryo-EM problem If S : L 2 ( D ) → R q is discretization onto N × N grid ( q ≈ π 4 N 2 ) then the microscope produces I s = SP s φ + ǫ s , s = 1 , . . . , n , where ǫ s is noise. Nyquist ⇒ can reconstruct only up to a bandlimit W . Natural search space for φ : space B of band limited functions. Suppose dim( B ) = p . Cryo-EM problem: Given I s , estimate R s and find approximation of φ in B . Amit Singer (Princeton University) July 2014 4 / 41

  5. The heterogeneity problem What if the molecule has more than one possible structure? (Image source: H. Liao and J. Frank, Classification by bootstrapping in single particle methods, Proceedings of the 2010 IEEE international conference on biomedical imaging , 2010.) Amit Singer (Princeton University) July 2014 5 / 41

  6. The heterogeneity problem (discrete conformational space) Problem (1) φ : Ω × R 3 → R , with P [ φ = φ c ] = p c , c = 1 , . . . , C. φ 1 , . . . , φ n are i.i.d. samples from φ Let I s = SP s φ s + ǫ s , where P 1 , . . . , P n are obtained from i.i.d. rotations R s (independent of the φ s ) drawn from a distribution over SO (3) , and ǫ s ∼ N (0 , σ 2 I p ) , independent of φ s , P s . Given R s , I s , find the number of classes C, approximations of the molecular structures φ c in B , and their probabilities p c . Amit Singer (Princeton University) July 2014 6 / 41

  7. Previous work Common lines (e.g., Shatsky et al, 2010) Maximum likelihood (e.g., Wang et al, 2013) Bootstrapping and PCA (e.g., Penczek et al, 2011) Amit Singer (Princeton University) July 2014 7 / 41

  8. The heterogeneity problem: finite-dimensional formulation For purposes of this talk, suppose that φ c ∈ B . Note that P s := SP s | B is a linear operator P s : R p → R q . If X s is basis representation of φ s , then I s = P s X s + ǫ s . X 1 , . . . , X n are i.i.d samples of a random vector X in R p . When X has C classes, Σ = Var[ X ] has rank C − 1. Penczek (2011) proposed a (suboptimal) method to use the top C − 1 eigenvectors of Σ to solve the heterogeneity problem. How to estimate Σ? Amit Singer (Princeton University) July 2014 8 / 41

  9. Covariance Estimation from Projected Data Consider the following general statistical estimation problem: Problem (2) X is a random vector on C p , with E [ X ] = µ and Var ( X ) = Σ . X 1 , . . . , X n are i.i.d. samples from X Let I s = P s X s + ǫ s , s = 1 , . . . , n , P 1 , . . . , P n : C p → C q are i.i.d. samples (independent of the X s ) from a distribution over C q × p ( q ≤ p ) ǫ s are i.i.d. noises such that E [ ǫ s ] = 0 , Var [ ǫ s ] = σ 2 I q , independent of X s , P s Given I s , P s , and σ 2 , estimate µ and Σ . Amit Singer (Princeton University) July 2014 9 / 41

  10. PCA from projected data: Well-known special cases I s = P s X s + ǫ s , s = 1 , . . . , n . If p = q and P s = I p , we have the usual covariance estimation problem. When P s are coordinate-selection operators , we have a missing data statistics problem (Little and Rubin, 1987). A special case of the matrix sensing problem (Recht, Fazel, Parrilo, 2010) Here we do not attempt to impute the missing entries / estimate X s assuming Σ has low rank. Instead we shall try to estimate Σ and its rank directly. Amit Singer (Princeton University) July 2014 10 / 41

  11. Obtaining an estimator I s = P s X s + ǫ s . Note that Var[ I s ] = P s Σ P ∗ s + σ 2 I . E [ I s ] = P s µ ; Can construct estimators µ n and Σ n as solutions of optimization problem: n � � I s − P s µ � 2 µ n = argmin µ s =1 n � ( I s − P s µ n )( I s − P s µ n ) ∗ − ( P s Σ P ∗ � 2 � s + σ 2 I ) � � Σ n = argmin F . Σ s =1 Σ n is not constraint to be positive semidefinite: we are typically interested only in the few leading eigenvectors of Σ n . Amit Singer (Princeton University) July 2014 11 / 41

  12. Resulting linear equations These lead to linear equations for µ n and Σ n : � n � n 1 µ n = 1 � P ∗ � P ∗ A n µ n := s P s s I s n n s =1 s =1 n 1 � P ∗ s P s Σ n P ∗ L n Σ n := s P s n s =1 n n s ( I s − P s µ n )( I s − P s µ n ) ∗ P s − σ 2 1 � � P ∗ P ∗ = s P s . n n s =1 s =1 Amit Singer (Princeton University) July 2014 12 / 41

  13. Connection to missing-data statistics We recover available-case estimators of mean and covariance Known to be consistent under the missing completely at random (MCAR) assumption Amit Singer (Princeton University) July 2014 13 / 41

  14. PCA in high dimensions For the sample covariance matrix n Σ n = 1 � x i x T X ∼ N (0 , I p ) i , n i =1 the asymptotic regime p , n → ∞ with p / n → γ (“high dimensional statistics”) is by now well understood (Johnstone, ICM 2006) Marcenko-Pastur distribution, Tracy-Widom distribution, spiked model (Σ = I p + σ 2 vv T ), phase transition. Amit Singer (Princeton University) July 2014 14 / 41

  15. Connection to high-dimensional PCA n n n s ( I s − P s µ n )( I s − P s µ n ) ∗ P s − σ 2 1 s P s = 1 � P ∗ s P s Σ n P ∗ � P ∗ � P ∗ s P s . n n n s =1 s =1 s =1 When P s = I , we recover the sample covariance matrix How does the spectrum of Σ n behave as a function of n , p , q and the geometry of P s ? Empirical spectral density? Distribution of largest eigenvalue? Spiked model? Amit Singer (Princeton University) July 2014 15 / 41

  16. Theoretical results about Σ n : Preliminary Study Proposition [Asymptotic Consistency, fixed p ] Suppose A n → A and L n → L , where A , L are invertible. Suppose � P s � is uniformly bounded and that X has bounded fourth moments. Then, � 1 � � 1 � � E [Σ n ] − Σ � = O ; Var[Σ n ] = O , n n Proposition [Finite Sample Bounds] Suppose � X � ≤ C 1 , � ǫ s � ≤ C 2 , � P s � ≤ C 3 a . s . Assume that X is centered. If λ = λ min ( L n ) and B = C 2 3 C 1 + C 3 C 2 , then � − 3 n λ 2 t 2 / 2 � P {� Σ − Σ n � ≥ t } ≤ p exp , t ≥ 0 . B 4 + 2 B 2 λ t Amit Singer (Princeton University) July 2014 16 / 41

  17. Regularization Extreme example: q = 2, P s are random coordinate selection operators. Coupon collector: n � p 2 log p for L n to be invertible. Regularization? n � ( I s − P s µ n )( I s − P s µ n ) ∗ − ( P s Σ P ∗ � 2 � � s + σ 2 I ) � F + λ � Σ � 2 min F . Σ s =1 This would set unattainable entries of Σ n to 0. While other regularization strategies are possible, e.g., � W Σ W ∗ � 1 to promote sparsity in some basis W , they come at a price of increased computational complexity. Amit Singer (Princeton University) July 2014 17 / 41

  18. Back to Heterogeneity: Fourier Slice Theorem If ˆ P s is the Fourier-domain version of P s , then ˆ P s simply restricts Fourier volumes to central planes perpendicular to the viewing direction. Amit Singer (Princeton University) July 2014 18 / 41

  19. The slice theorem and covariance matrix estimation The Fourier slice theorem: Better work in Fourier domain φ s = F φ s , ˆ ˆ I s = F I s µ , ˆ Estimate ˆ Σ and then inverse transform to get µ , Σ. ˆ P s is a restriction to a central plane corresponding to R s . In 3D, for any pair of frequencies we can find a central plane that passes through them — coordinate selection operators in the continuous limit. The covariance of 2D objects from their 1D line projections cannot be estimated. Amit Singer (Princeton University) July 2014 19 / 41

  20. The ramp filter �� n � ˆ µ n = 1 s =1 ˆ s ˆ P ∗ A n ˆ P s ˆ µ n n In the continuous limit: replace P s with P s and average with integral over SO (3) with respect to the Haar measure. Fourier slice theorem: ( ˆ s ˆ P s ˆ φ )( ρ ) = ˆ φ ( ρ ) δ ( ρ · R 3 P ∗ s ). µ ( ρ ) 1 � S 2 δ ( ρ · θ ) d θ = ˆ µ ( ρ ) 1 � S 2 δ ( ρ | ρ | · θ ) d θ = ˆ µ ( ρ ) ˆ A ˆ µ ( ρ ) = ˆ 2 | ρ | . | ρ | 4 π 4 π Amit Singer (Princeton University) July 2014 20 / 41

  21. The triangular area filter L n ˆ ˆ Σ n := 1 � n s =1 ˆ s ˆ P s ˆ Σ n ˆ s ˆ P ∗ P ∗ P s n Fourier slice theorem: ( ˆ s ˆ P s ˆ φ )( ρ ) = ˆ P ∗ φ ( ρ ) δ ( ρ · R 3 s ). ( ˆ P ∗ s ˆ P s ˆ Σ ˆ P ∗ s ˆ P s )( ρ 1 , ρ 2 ) = ˆ Σ( ρ 1 , ρ 2 ) δ ( ρ 1 · R 3 s ) δ ( ρ 2 · R 3 s ) Integrate over R 3 s : ( ˆ L ˆ Σ)( ρ 1 , ρ 2 ) = ˆ Σ( ρ 1 , ρ 2 ) K ( ρ 1 , ρ 2 ) . ˆ L is a diagonal operator, with K ( ρ 1 , ρ 2 ) = 1 � S 2 δ ( ρ 1 · θ ) δ ( ρ 2 · θ ) d θ = 1 2 | ρ 1 × ρ 2 | . 4 π 4 π K is the density of planes that go through frequencies ρ 1 , ρ 2 The filter 1 / K is proportional to the area of a triangle with nodes at ρ 1 , ρ 2 , and the origin. Amit Singer (Princeton University) July 2014 21 / 41

Recommend


More recommend