
3D Structure Determination using Cryo-Electron Microscopy

3D Structure Determination using Cryo-Electron Microscopy: Computational Challenges. Amit Singer, Princeton University, Department of Mathematics and Program in Applied and Computational Mathematics. July 28, 2015, Mathematics in Data Science.


  1. 3D Structure Determination using Cryo-Electron Microscopy: Computational Challenges. Amit Singer, Princeton University, Department of Mathematics and Program in Applied and Computational Mathematics. July 28, 2015, Mathematics in Data Science.

  2. Single Particle Reconstruction using cryo-EM. [Figures: schematic drawing of the imaging process; illustration of the cryo-EM problem.]

  3. New detector technology: Exciting times for cryo-EM.
     [Figure: first page of Werner Kühlbrandt, "The Resolution Revolution," Science, vol. 343, 28 March 2014, p. 1443 (Biochemistry).]
     Subtitle: Advances in detector technology and image processing are yielding high-resolution electron cryo-microscopy structures of biomolecules.
     Article text: Precise knowledge of the structure of macromolecules in the cell is essential for understanding how they function. Structures of large macromolecules can now be obtained at near-atomic resolution by averaging thousands of electron microscope images recorded before radiation damage accumulates. This is what Amunts et al. have done in their research article on page 1485 of this issue (1), reporting the structure of the large subunit of the mitochondrial ribosome at 3.2 Å resolution by electron cryo-microscopy (cryo-EM). Together with other recent high-resolution cryo-EM structures (2–4) (see the figure), this achievement heralds the beginning of a new era in molecular biology, where structures at near-atomic resolution are no longer the prerogative of x-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. Ribosomes are ancient, massive protein-RNA complexes that translate the linear genetic code into three-dimensional proteins. Mitochondria, semi-autonomous organelles [...]
     Figure caption: Near-atomic resolution with cryo-EM. (A) The large subunit of the yeast mitochondrial ribosome at 3.2 Å reported by Amunts et al. In the detailed view below, the base pairs of an RNA double helix and a magnesium ion (blue) are clearly resolved. (B) TRPV1 ion channel at 3.4 Å (2), with a detailed view of residues lining the ion pore on the four-fold axis of the tetrameric channel. (C) F420-reducing [NiFe] hydrogenase at 3.36 Å (3). The detail shows an α helix in the FrhA subunit with resolved side chains. The maps are not drawn to scale.

  4. Big “Movie” Data, Publicly Available: http://www.ebi.ac.uk/pdbe/emdb/empiar/

  5. Image Formation Model and Inverse Problem.
     [Figure: an electron source illuminates the molecule $\phi$, held in orientation $R_i$, and produces the projection $I_i$.]
     $$R_i = \begin{pmatrix} - & {R_i^1}^T & - \\ - & {R_i^2}^T & - \\ - & {R_i^3}^T & - \end{pmatrix} \in SO(3)$$
     Projection images:
     $$I_i(x, y) = \int_{-\infty}^{\infty} \phi\left(x R_i^1 + y R_i^2 + z R_i^3\right) dz + \text{“noise”}.$$
     $\phi : \mathbb{R}^3 \to \mathbb{R}$ is the electric potential of the molecule.
     Cryo-EM problem: Find $\phi$ and $R_1, \dots, R_n$ given $I_1, \dots, I_n$.
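The image formation model on this slide can be illustrated numerically. The sketch below is not from the talk: it assumes a toy potential made of isotropic Gaussians (the helper `gaussian_blobs` is hypothetical), takes $R^1, R^2, R^3$ to be the rows of the rotation matrix as the slide's matrix suggests, and approximates the line integral with a Riemann sum on a regular grid.

```python
import numpy as np

def random_rotation(rng):
    """Draw a rotation matrix in SO(3) from the QR factorization of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))          # fix the sign ambiguity of QR
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]               # force determinant +1
    return q

def gaussian_blobs(centers, width):
    """Toy potential phi: R^3 -> R, a sum of isotropic Gaussians (an assumption)."""
    centers = np.asarray(centers, dtype=float)
    def phi(points):                     # points has shape (..., 3)
        d2 = ((points[..., None, :] - centers) ** 2).sum(-1)
        return np.exp(-d2 / (2 * width ** 2)).sum(-1)
    return phi

def project(phi, R, grid, sigma=0.0, rng=None):
    """I(x, y) ~ sum_z phi(x R^1 + y R^2 + z R^3) dz + noise, with R^k the rows of R."""
    x, y, z = np.meshgrid(grid, grid, grid, indexing="ij")
    r1, r2, r3 = R[0], R[1], R[2]
    points = x[..., None] * r1 + y[..., None] * r2 + z[..., None] * r3
    image = phi(points).sum(axis=2) * (grid[1] - grid[0])
    if sigma > 0:
        rng = np.random.default_rng() if rng is None else rng
        image = image + sigma * rng.standard_normal(image.shape)
    return image

# Usage: one noisy projection of a three-blob "molecule".
rng = np.random.default_rng(0)
grid = np.linspace(-4, 4, 33)
molecule = gaussian_blobs([[1, 0, 0], [0, 1.5, 0], [0, 0, -1]], width=0.5)
noisy_image = project(molecule, random_rotation(rng), grid, sigma=0.5, rng=rng)
```

The inverse problem stated on the slide is the reverse direction: recovering `molecule` and the rotations from many images like `noisy_image`.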

  6. Toy Example

  7. E. coli 50S ribosomal subunit. 27,000 particle images provided by Dr. Fred Sigworth, Yale Medical School. 3D reconstruction by S, Lanhui Wang, and Jane Zhao.

  8. Main Algorithmic Challenges:
     1. Orientation assignment
     2. Heterogeneity (resolving structural variability)
     3. 2D class averaging (de-noising)
     4. Particle picking
     5. Symmetry detection
     6. Motion correction

  9. The heterogeneity problem. What if the molecule has more than one possible structure? (Image source: H. Liao and J. Frank, Classification by bootstrapping in single particle methods, ISBI, 2010.) Katsevich, Katsevich, S (SIAM Journal on Imaging Sciences, 2015): covariance matrix estimation of the 3-D structures from their 2-D projections (high-dimensional statistics, random matrices, low-rank matrix completion).

  10. Experimental Data: 70S Ribosome. 10,000-image dataset (130-by-130), courtesy of Joachim Frank (Columbia University). [Figure: Class 1 and Class 2.] Morphing video by S, Joakim Andén, and Eugene Katsevich.

  11. Class averaging for image denoising.
     Rotation-invariant representation (steerable PCA, bispectrum).
     Vector diffusion maps (S, Wu 2012), a generalization of Laplacian Eigenmaps (Belkin, Niyogi 2003) and Diffusion Maps (Coifman, Lafon 2006).
     Graph Connection Laplacian.
     [Figure: experimental images (70S), courtesy of Dr. Joachim Frank (Columbia), next to class averages by vector diffusion maps, averaging with 20 nearest neighbors (Zhao, S 2014).]

  12. Orientation Estimation.
     The standard procedure is iterative refinement: alternating minimization or expectation-maximization, starting from an initial guess $\phi_0$ for the 3-D structure.
     $$I_i = P(R_i \cdot \phi) + \epsilon_i, \quad i = 1, \dots, n.$$
     $R_i \cdot \phi(r) = \phi(R_i^{-1} r)$ is the left group action.
     $P$ is integration in the $z$-direction and grid sampling.
     Converges to a local optimum, not necessarily the global one.
     Model bias is a well-known pitfall.
     Is “reference free” orientation assignment and reconstruction possible?

  13. 3D Puzzle.
     Optimization problem:
     $$\min_{g_1, g_2, \dots, g_n \in G} \sum_{i,j=1}^{n} f_{ij}(g_i g_j^{-1})$$
     $G = SO(3)$ is the group of rotations in space.
     The parameter space $G \times G \times \cdots \times G$ is exponentially large.

  14. Non-Unique Games over Compact Groups.
     $$\min_{g_1, g_2, \dots, g_n \in G} \sum_{i,j=1}^{n} f_{ij}(g_i g_j^{-1})$$
     [Figure: a graph on vertices 1 to 5 with edge weights $w_{12}, w_{14}, w_{24}, w_{25}, w_{34}, w_{35}, w_{45}$.]
     For $G = \mathbb{Z}_2$ this encodes Max-Cut.
     Max-2-Lin($\mathbb{Z}_L$) formulation of Unique Games (Khot et al 2005): find $x_1, \dots, x_n \in \mathbb{Z}_L$ that satisfy as many difference equations as possible,
     $$x_i - x_j = b_{ij} \bmod L, \quad (i, j) \in E.$$
     This corresponds to $G = \mathbb{Z}_L$ and $f_{ij}(x_i - x_j) = -\mathbf{1}_{\{x_i - x_j = b_{ij}\}}$.
     Our games are non-unique in general, and the group is not necessarily finite.
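To make the Max-2-Lin($\mathbb{Z}_L$) special case concrete, here is a minimal sketch that evaluates the cost $\sum f_{ij}(x_i - x_j)$ with $f_{ij} = -\mathbf{1}_{\{x_i - x_j = b_{ij}\}}$ and minimizes it by exhaustive search. The function names and the small planted instance are illustrative assumptions, not part of the talk, and brute force is only feasible for tiny $n$, which is exactly the difficulty the slide points to.

```python
from itertools import product

def nug_cost(x, edges, L):
    """Sum over edges of f_ij(x_i - x_j) = -1{x_i - x_j = b_ij mod L}."""
    return -sum((x[i] - x[j]) % L == b % L for i, j, b in edges)

def solve_brute_force(n, edges, L):
    """Exhaustive search over Z_L^n (L**n candidates), for illustration only."""
    return min(product(range(L), repeat=n), key=lambda x: nug_cost(x, edges, L))

# Hypothetical planted instance: consistent difference equations on a connected graph.
L = 3
planted = (0, 1, 2, 0)
pairs = [(0, 1), (0, 3), (1, 2), (2, 3)]
edges = [(i, j, (planted[i] - planted[j]) % L) for i, j in pairs]
best = solve_brute_force(len(planted), edges, L)
```

Because the cost depends only on differences, any global shift of `best` modulo `L` is equally optimal.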

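  15. Orientation Estimation: Fourier projection-slice theorem.
     [Figure: projections $I_i$ and $I_j$, with their 2D Fourier transforms $\hat{I}_i$ and $\hat{I}_j$ embedded as central slices in 3D Fourier space. The two slices intersect along a common line, which passes through $(x_{ij}, y_{ij})$ in $\hat{I}_i$ and through $(x_{ji}, y_{ji})$ in $\hat{I}_j$.]
     $$c_{ij} = (x_{ij}, y_{ij}, 0)^T, \qquad R_i c_{ij} = R_j c_{ji}$$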
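The projection-slice theorem behind this slide can be checked in its simplest discrete form. The sketch below is an illustration under simplifying assumptions (a random array standing in for the volume, projection along a grid axis, discrete Fourier transforms); it does not handle general rotations, which would require interpolation.

```python
import numpy as np

rng = np.random.default_rng(0)
volume = rng.standard_normal((16, 16, 16))   # stand-in for a sampled potential

# Projection: integrate (here, sum) along the z axis.
projection = volume.sum(axis=2)

# 2D Fourier transform of the projection ...
projection_hat = np.fft.fft2(projection)

# ... equals the central slice k_z = 0 of the 3D Fourier transform.
central_slice = np.fft.fftn(volume)[:, :, 0]

max_error = np.abs(projection_hat - central_slice).max()
```

Two projections taken along different directions therefore share the line where their central slices intersect, which is the common line $R_i c_{ij} = R_j c_{ji}$.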

  16. Angular Reconstitution (Vainshtein and Goncharov 1986, Van Heel 1987)

  17. Least Squares Approach.
     [Figure: as on slide 15, the central slices $\hat{I}_i$ and $\hat{I}_j$ in 3D Fourier space and their common line, with $c_{ij} = (x_{ij}, y_{ij}, 0)^T$ and $R_i c_{ij} = R_j c_{ji}$.]
     $$\min_{R_1, R_2, \dots, R_n \in SO(3)} \sum_{i \neq j} \left\| R_i c_{ij} - R_j c_{ji} \right\|^2$$
     The search space is exponentially large and non-convex.
     Spectral and semidefinite programming relaxations (non-commutative Grothendieck, Max-Cut).
     S, Shkolnisky (SIAM Journal on Imaging Sciences, 2011).

  18. MLE.
     The images contain more information than that expressed by optimal pairwise matching of common lines.
     Algorithms based on pairwise matching can succeed only at “high” SNR.
     (Quasi) Maximum Likelihood: we would like to try all possible rotations $R_1, \dots, R_n$ and choose the combination for which the agreement on the common lines (implied by the rotations), as observed in the images, is maximal.
     Computationally intractable: exponentially large search space, complicated cost function.

  19. Quasi MLE.
     Common line equation, with $e_3 = (0, 0, 1)^T$:
     $$R_i c_{ij} = R_j c_{ji} = \frac{R_i e_3 \times R_j e_3}{\| R_i e_3 \times R_j e_3 \|}$$
     $$c_{ij} = R_i^{-1} \frac{R_i e_3 \times R_j e_3}{\| R_i e_3 \times R_j e_3 \|} = \frac{e_3 \times R_i^{-1} R_j e_3}{\| e_3 \times R_i^{-1} R_j e_3 \|}$$
     $$c_{ji} = R_j^{-1} \frac{R_i e_3 \times R_j e_3}{\| R_i e_3 \times R_j e_3 \|} = \frac{R_j^{-1} R_i e_3 \times e_3}{\| R_j^{-1} R_i e_3 \times e_3 \|}$$
     Quasi MLE:
     $$\min_{R_1, \dots, R_n \in SO(3)} \sum_{i,j=1}^{n} \left\| \hat{I}_i(\cdot, c_{ij}) - \hat{I}_j(\cdot, c_{ji}) \right\|^2 \iff \min_{g_1, \dots, g_n \in G} \sum_{i,j=1}^{n} f_{ij}(g_i g_j^{-1})$$
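The closed-form common-line expressions above can be verified numerically. The following sketch is an illustration, not the authors' code: it draws random rotations, computes $c_{ij}$ and $c_{ji}$ from the cross-product formulas, and evaluates the least squares common-line cost $\sum_{i \neq j} \| R_i c_{ij} - R_j c_{ji} \|^2$ from slide 17. All function names are hypothetical. Each unordered pair is computed once, because swapping $i$ and $j$ in the cross product flips the sign of the common-line direction.

```python
import numpy as np

E3 = np.array([0.0, 0.0, 1.0])

def random_rotation(rng):
    """Draw a rotation matrix in SO(3) from the QR factorization of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

def common_lines(Ri, Rj):
    """Return unit vectors (c_ij, c_ji); for rotations, R^{-1} = R^T."""
    cij = np.cross(E3, Ri.T @ Rj @ E3)       # e3 x R_i^{-1} R_j e3
    cji = np.cross(Rj.T @ Ri @ E3, E3)       # R_j^{-1} R_i e3 x e3
    return cij / np.linalg.norm(cij), cji / np.linalg.norm(cji)

def common_line_table(rotations):
    """Common lines for every pair, computed once per unordered pair."""
    table = {}
    for i in range(len(rotations)):
        for j in range(i + 1, len(rotations)):
            table[i, j], table[j, i] = common_lines(rotations[i], rotations[j])
    return table

def least_squares_cost(rotations, table):
    """Sum over i != j of ||R_i c_ij - R_j c_ji||^2."""
    return sum(np.sum((rotations[i] @ table[i, j] - rotations[j] @ table[j, i]) ** 2)
               for (i, j) in table)

rng = np.random.default_rng(0)
true_rotations = [random_rotation(rng) for _ in range(4)]
table = common_line_table(true_rotations)
wrong_rotations = [random_rotation(rng) for _ in range(4)]
cost_true = least_squares_cost(true_rotations, table)
cost_wrong = least_squares_cost(wrong_rotations, table)
```

With noiseless common lines the cost vanishes at the true rotations (and at any global rotation of them), while unrelated rotations generically give a positive cost.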

  20. Fourier transform over $G$.
     Recall for $G = SO(2)$:
     $$f(\alpha) = \sum_{k=-\infty}^{\infty} \hat{f}(k) e^{\imath k \alpha}, \qquad \hat{f}(k) = \frac{1}{2\pi} \int_0^{2\pi} f(\alpha) e^{-\imath k \alpha} \, d\alpha$$
     In general, for a compact group $G$:
     $$f(g) = \sum_{k=0}^{\infty} d_k \operatorname{Tr}\left[ \hat{f}(k) \rho_k(g) \right], \qquad \hat{f}(k) = \int_G f(g) \rho_k(g)^* \, dg$$
     Here $\rho_k$ are the unitary irreducible representations of $G$; $d_k$ is the dimension of the representation $\rho_k$ (e.g., $d_k = 1$ for $SO(2)$, $d_k = 2k + 1$ for $SO(3)$); $dg$ is the Haar measure on $G$.

  21. Linearization of the cost function.
     Introduce matrix variables
     $$X_{ij}^{(k)} = \rho_k(g_i g_j^{-1})$$
     Fourier expansion of $f_{ij}$:
     $$f_{ij}(g) = \sum_{k=0}^{\infty} d_k \operatorname{Tr}\left[ \hat{f}_{ij}(k) \rho_k(g) \right]$$
     Linear cost function:
     $$f(g_1, \dots, g_n) = \sum_{i,j=1}^{n} f_{ij}(g_i g_j^{-1}) = \sum_{i,j=1}^{n} \sum_{k=0}^{\infty} d_k \operatorname{Tr}\left[ \hat{f}_{ij}(k) X_{ij}^{(k)} \right]$$
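The linearization can be checked on the finite cyclic group $G = \mathbb{Z}_L$ from slide 14, where the irreducible representations are the characters $\rho_k(g) = e^{2\pi \imath k g / L}$ with $d_k = 1$ and the sum over $k$ is finite. The sketch below is a sanity check under these assumptions, with arbitrary random cost tables standing in for the $f_{ij}$; it is not the $SO(3)$ computation discussed in the talk.

```python
import numpy as np

L, n = 5, 4
rng = np.random.default_rng(1)
f = rng.standard_normal((n, n, L))        # f[i, j, g] = f_ij(g), arbitrary cost tables
x = rng.integers(0, L, size=n)            # group elements g_1, ..., g_n in Z_L

# Direct evaluation of sum_{i,j} f_ij(g_i g_j^{-1}) (additive notation: x_i - x_j).
diff = (x[:, None] - x[None, :]) % L
direct_cost = sum(f[i, j, diff[i, j]] for i in range(n) for j in range(n))

# Fourier coefficients with the normalized Haar (counting) measure on Z_L:
# f_hat[i, j, k] = (1/L) * sum_g f_ij(g) * exp(-2 pi i k g / L).
f_hat = np.fft.fft(f, axis=2) / L

# Matrix variables X[i, j, k] = rho_k(g_i g_j^{-1}).
k = np.arange(L)
X = np.exp(2j * np.pi * k[None, None, :] * diff[:, :, None] / L)

# Linear cost: sum_{i,j} sum_k d_k * f_hat_ij(k) * X_ij^(k), with d_k = 1.
linear_cost = (f_hat * X).sum().real
```

The cost is linear in `X`; the difficulty is moved into the constraint that `X` must come from actual group elements.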
