Orientation Assignment in Cryo-EM
Amit Singer, Princeton University (PowerPoint presentation transcript)


1. Orientation Assignment in Cryo-EM. Amit Singer, Princeton University, Department of Mathematics and Program in Applied and Computational Mathematics. July 24, 2014.

2. Single Particle Reconstruction using cryo-EM. [Figure: schematic drawing of the imaging process, and the cryo-EM problem.]

3. Orientation Estimation: the Fourier projection-slice theorem. [Figure: electron source, molecule $\phi$, projections $I_i$, $I_j$, and their Fourier transforms $\hat{I}_i$, $\hat{I}_j$ as planar slices of 3D Fourier space that intersect along a common line.]

The common line between slices $i$ and $j$ satisfies

$$R_i c_{ij} = R_j c_{ji}, \qquad c_{ij} = (x_{ij}, y_{ij}, 0)^T, \qquad R_i \in SO(3).$$

Cryo-EM inverse problem: find $\phi$ (and $R_1, \ldots, R_n$) given $I_1, \ldots, I_n$.
n = 3: Vainshtein and Goncharov 1986, van Heel 1987
n > 3: S, Shkolnisky, SIAM Imaging 2011
n > 3: Bandeira, Charikar, Chen, S, in preparation
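As a sanity check on this relation, here is a minimal numerical sketch, assuming numpy and scipy are available: the intersection line of the two Fourier slices is the cross product of the viewing directions $R_i^3$ and $R_j^3$ (made explicit on slide 15), and its local coordinates in each image satisfy the common line equation.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Two random rotations standing in for the unknown viewing directions.
Ri = Rotation.random(random_state=0).as_matrix()
Rj = Rotation.random(random_state=1).as_matrix()

# The Fourier slices of images i and j intersect along the line spanned by
# the cross product of the viewing directions (third columns of R_i, R_j).
ell = np.cross(Ri[:, 2], Rj[:, 2])
ell /= np.linalg.norm(ell)

c_ij = Ri.T @ ell  # local coordinates (x_ij, y_ij, 0) in image i
c_ji = Rj.T @ ell  # local coordinates (x_ji, y_ji, 0) in image j

assert np.allclose(c_ij[2], 0.0) and np.allclose(c_ji[2], 0.0)
assert np.allclose(Ri @ c_ij, Rj @ c_ji)  # the common line equation holds
```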

4. Assumptions for today's talk: trivial point-group symmetry; homogeneity.

5. Experiments with simulated noisy projections. Each projection is 129 x 129 pixels, and $\mathrm{SNR} = \operatorname{Var}(\text{Signal}) / \operatorname{Var}(\text{Noise})$. [Figure: (a) a clean projection and noisy versions at (b) SNR $= 2^{0}$ down through (j) SNR $= 2^{-8}$.]
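A minimal sketch of how such noisy projections can be simulated, assuming numpy; the function name add_noise and the random stand-in image are illustrative, not from the talk.

```python
import numpy as np

def add_noise(image, snr, rng=None):
    """Add white Gaussian noise so that Var(signal) / Var(noise) = snr."""
    rng = rng or np.random.default_rng()
    sigma = np.sqrt(image.var() / snr)
    return image + rng.normal(scale=sigma, size=image.shape)

clean = np.random.default_rng(0).random((129, 129))  # stand-in for a clean projection
noisy = add_noise(clean, snr=2.0 ** -4)              # e.g. panel (f), SNR = 2^-4
```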

6. Fraction of correctly identified common lines and the SNR. Define a common line as correctly identified if both radial lines deviate by no more than 10° from the true directions. The fraction p of correctly identified common lines is increased by PCA:

log2(SNR)      p
    20       0.997
     0       0.980
    -1       0.956
    -2       0.890
    -3       0.764
    -4       0.575
    -5       0.345
    -6       0.157
    -7       0.064
    -8       0.028
    -9       0.019
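The 10° criterion can be written down directly; this sketch (names hypothetical) measures the angle between the estimated and true radial directions in each of the two images.

```python
import numpy as np

def angle_deg(u, v):
    """Angle in degrees between two 2D direction vectors."""
    c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

def correctly_identified(est_i, est_j, true_i, true_j, tol=10.0):
    """Correct iff both detected radial lines are within tol degrees of truth."""
    return angle_deg(est_i, true_i) <= tol and angle_deg(est_j, true_j) <= tol
```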

7. Least Squares Approach. [Figure: the common-line geometry of slide 3, with $c_{ij} = (x_{ij}, y_{ij}, 0)^T$ and $R_i c_{ij} = R_j c_{ji}$.] Least squares:

$$\min_{R_1, R_2, \ldots, R_n \in SO(3)} \sum_{i \neq j} \| R_i c_{ij} - R_j c_{ji} \|^2$$

The search space is exponentially large and non-convex.
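For concreteness, a direct brute-force evaluation of this objective, assuming numpy; it only scores a candidate set of rotations, it does not perform the exponential search.

```python
import numpy as np

def ls_objective(rotations, c):
    """Sum over i != j of ||R_i c_ij - R_j c_ji||^2.

    rotations: (n, 3, 3) array of candidate rotations.
    c: nested container with c[i][j] = (x_ij, y_ij, 0), the common line of
       images i and j expressed in image i's local coordinates.
    """
    n = len(rotations)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                r = rotations[i] @ np.asarray(c[i][j]) - rotations[j] @ np.asarray(c[j][i])
                total += np.dot(r, r)
    return total
```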

8. Quadratic Optimization Under Orthogonality Constraints. Quadratic cost: $\sum_{i \neq j} \| R_i c_{ij} - R_j c_{ji} \|^2$. Quadratic constraints: $R_i^T R_i = I_{3 \times 3}$ (the $\det(R_i) = +1$ constraint is ignored). We approximate the solution using SDP and rounding (S, Shkolnisky 2011). Related to:
MAX-CUT (Goemans and Williamson 1995)
Generalized Orthogonal Procrustes Problem (Nemirovski 2007)
PhaseLift (Candes, Strohmer, Voroninski 2013)
Non-commutative Grothendieck Problem (Naor, Regev, Vidick 2013)
"Robust" version, Least Unsquared Deviations (Wang, S, Wen 2013):

$$\min_{R_1, R_2, \ldots, R_n \in SO(3)} \sum_{i \neq j} \| R_i c_{ij} - R_j c_{ji} \|$$

9. Detour: MAX-CUT. Given a weighted undirected graph with n vertices and non-negative weights $w_{ij} \geq 0$, split the vertices into two sets such that the total weight of the edges across the cut is maximized. [Figure: a 5-vertex example with edges $w_{12}, w_{14}, w_{24}, w_{25}, w_{34}, w_{35}, w_{45}$.]

$$\mathrm{CUT}(\{1, 2, 3\}, \{4, 5\}) = w_{14} + w_{24} + w_{25} + w_{34} + w_{35}$$
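The cut value in the example is easy to reproduce; here is a sketch with hypothetical unit weights on the drawn edges (the actual weights are not specified on the slide).

```python
import numpy as np

def cut_value(W, side):
    """Total weight of edges crossing the cut; side[i] is +1 or -1."""
    n = len(side)
    return sum(W[i, j] for i in range(n) for j in range(i + 1, n)
               if side[i] != side[j])

# Edges of the 5-vertex example (0-indexed), all given unit weight here.
W = np.zeros((5, 5))
for i, j in [(0, 1), (0, 3), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]:
    W[i, j] = W[j, i] = 1.0

# CUT({1,2,3}, {4,5}) = w14 + w24 + w25 + w34 + w35 = 5 with unit weights.
print(cut_value(W, [+1, +1, +1, -1, -1]))
```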

10. The MAX-CUT Approximation of Goemans and Williamson.

$$\mathrm{OPT} = \max_{y_1, y_2, \ldots, y_n \in \{-1, +1\}} \frac{1}{4} \sum_{i \neq j} w_{ij} (1 - y_i y_j)$$

This combinatorial problem is NP-hard. SDP relaxation of Goemans-Williamson (1995): the $n \times n$ Gram matrix $G = y y^T$ (so $G_{ij} = y_i y_j$) satisfies $G \succeq 0$, $G_{ii} = 1$, $\operatorname{rank}(G) = 1$; dropping the rank constraint gives

$$\max_{G \in \mathbb{R}^{n \times n}} \frac{1}{4} \sum_{i, j} w_{ij} - \frac{1}{4} \operatorname{Tr}(GW) \quad \text{s.t.} \quad G \succeq 0, \; G_{ii} = 1, \; i = 1, 2, \ldots, n.$$

Rounding: draw a random vector $z$ with i.i.d. standard Gaussian entries $z_i \sim N(0, 1)$, compute $Yz$, where $G = YY^T$ is the Cholesky decomposition of $G$, and round $y_i = \operatorname{sign}[(Yz)_i]$.

Theorem (G-W): $E[\mathrm{CUT}_{GW}] > 0.87 \cdot \mathrm{OPT}$.
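A sketch of the randomized rounding step, assuming the SDP has already been solved (e.g. with an off-the-shelf solver) and its optimal G is in hand; an eigendecomposition is used in place of Cholesky so that a rank-deficient G is also handled.

```python
import numpy as np

def gw_round(G, rng=None):
    """Goemans-Williamson hyperplane rounding of a PSD matrix with unit diagonal."""
    rng = rng or np.random.default_rng()
    w, V = np.linalg.eigh(G)
    Y = V * np.sqrt(np.clip(w, 0.0, None))  # factor G = Y Y^T
    z = rng.standard_normal(G.shape[0])     # random hyperplane normal
    return np.sign(Y @ z)                   # y_i = sign[(Yz)_i]
```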

11. SDP Relaxation for the Common-Lines Problem. Least squares is equivalent to maximizing the sum of inner products:

$$\min_{R_1, \ldots, R_n \in SO(3)} \sum_{i \neq j} \| R_i c_{ij} - R_j c_{ji} \|^2 \;\Longleftrightarrow\; \max_{R_1, \ldots, R_n \in SO(3)} \sum_{i \neq j} \langle R_i c_{ij}, R_j c_{ji} \rangle$$

$$\Longleftrightarrow\; \max_{R_1, \ldots, R_n \in SO(3)} \sum_{i \neq j} \operatorname{Tr}(c_{ji} c_{ij}^T R_i^T R_j) \;\Longleftrightarrow\; \max_{R_1, \ldots, R_n \in SO(3)} \operatorname{Tr}(CG)$$

$C$ is the $2n \times 2n$ "common lines matrix" built from $2 \times 2$ blocks, with $\tilde{c}_{ij} = (x_{ij}, y_{ij})^T$:

$$C_{ij} = \tilde{c}_{ji} \tilde{c}_{ij}^T = \begin{pmatrix} x_{ji} \\ y_{ji} \end{pmatrix} \begin{pmatrix} x_{ij} & y_{ij} \end{pmatrix} = \begin{pmatrix} x_{ji} x_{ij} & x_{ji} y_{ij} \\ y_{ji} x_{ij} & y_{ji} y_{ij} \end{pmatrix}, \qquad C_{ii} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

$G$ is the $2n \times 2n$ Gram matrix $G = \tilde{R}^T \tilde{R}$ with $G_{ij} = \tilde{R}_i^T \tilde{R}_j$, where $\tilde{R}_i$ denotes the first two columns of $R_i$ and $\tilde{R} = (\tilde{R}_1 \; \tilde{R}_2 \; \cdots \; \tilde{R}_n)$:

$$G = \begin{pmatrix} I_{2 \times 2} & \tilde{R}_1^T \tilde{R}_2 & \cdots & \tilde{R}_1^T \tilde{R}_n \\ \tilde{R}_2^T \tilde{R}_1 & I_{2 \times 2} & \cdots & \tilde{R}_2^T \tilde{R}_n \\ \vdots & \vdots & \ddots & \vdots \\ \tilde{R}_n^T \tilde{R}_1 & \tilde{R}_n^T \tilde{R}_2 & \cdots & I_{2 \times 2} \end{pmatrix}$$
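Assembling C from detected common lines is mechanical; a sketch assuming numpy, with c[i][j] holding the unit 2-vector $\tilde{c}_{ij} = (x_{ij}, y_{ij})^T$.

```python
import numpy as np

def common_lines_matrix(c):
    """Build the 2n x 2n matrix C with blocks C_ij = c~_ji c~_ij^T, C_ii = 0."""
    n = len(c)
    C = np.zeros((2 * n, 2 * n))
    for i in range(n):
        for j in range(n):
            if i != j:
                C[2 * i:2 * i + 2, 2 * j:2 * j + 2] = np.outer(c[j][i], c[i][j])
    return C
```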

12. SDP Relaxation and Rounding. The problem $\max_{R_1, R_2, \ldots, R_n \in SO(3)} \operatorname{Tr}(CG)$ is relaxed to the SDP

$$\max_{G \in \mathbb{R}^{2n \times 2n}} \operatorname{Tr}(CG) \quad \text{s.t.} \quad G \succeq 0, \; G_{ii} = I_{2 \times 2}, \; i = 1, 2, \ldots, n.$$

Missing is the non-convex constraint $\operatorname{rank}(G) = 3$. Rounding: draw a $2n \times 3$ matrix $Q$ with orthonormal columns using a (careful) QR factorization of a $2n \times 3$ matrix with i.i.d. standard Gaussian entries; compute the Cholesky decomposition $G = YY^T$; round using the SVD, $(YQ)_i = U_i \Sigma_i V_i^T \Rightarrow \tilde{R}_i^T = U_i V_i^T$; and use the cross product to complete $\tilde{R}_i$ to a full rotation $R_i$. Loss of handedness.
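A sketch of this rounding, again assuming the SDP solution G is already available; the eigendecomposition replaces Cholesky to tolerate rank deficiency, and the cross product fixes the third column (only up to a global handedness, as noted).

```python
import numpy as np

def round_rotations(G, rng=None):
    """Round a 2n x 2n SDP solution G to rotations R_1, ..., R_n."""
    rng = rng or np.random.default_rng()
    n = G.shape[0] // 2
    # Random 2n x 3 matrix with orthonormal columns via QR of a Gaussian.
    Q, _ = np.linalg.qr(rng.standard_normal((2 * n, 3)))
    w, V = np.linalg.eigh(G)
    Y = V * np.sqrt(np.clip(w, 0.0, None))   # factor G = Y Y^T
    YQ = Y @ Q
    rotations = np.zeros((n, 3, 3))
    for i in range(n):
        # Nearest 2x3 matrix with orthonormal rows: R~_i^T = U_i V_i^T.
        U, _, Vt = np.linalg.svd(YQ[2 * i:2 * i + 2], full_matrices=False)
        R12 = (U @ Vt).T                     # first two columns of R_i
        r3 = np.cross(R12[:, 0], R12[:, 1])  # third column via cross product
        rotations[i] = np.column_stack([R12, r3])
    return rotations
```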

13. Spectral Relaxation for Uniformly Distributed Rotations. Write the first two columns of each rotation as

$$\begin{pmatrix} x_i^1 & x_i^2 \\ y_i^1 & y_i^2 \\ z_i^1 & z_i^2 \end{pmatrix} = \begin{pmatrix} | & | \\ R_i^1 & R_i^2 \\ | & | \end{pmatrix}, \qquad i = 1, \ldots, n.$$

Define 3 vectors of length 2n:

$$x = (x_1^1 \; x_1^2 \; x_2^1 \; x_2^2 \; \cdots \; x_n^1 \; x_n^2)^T$$
$$y = (y_1^1 \; y_1^2 \; y_2^1 \; y_2^2 \; \cdots \; y_n^1 \; y_n^2)^T$$
$$z = (z_1^1 \; z_1^2 \; z_2^1 \; z_2^2 \; \cdots \; z_n^1 \; z_n^2)^T$$

Rewrite the least squares objective function as

$$\max_{R_1, \ldots, R_n \in SO(3)} \sum_{i \neq j} \langle R_i c_{ij}, R_j c_{ji} \rangle = \max_{R_1, \ldots, R_n \in SO(3)} x^T C x + y^T C y + z^T C z.$$

By symmetry, if the rotations are uniformly distributed over SO(3), then the top eigenvalue of C has multiplicity 3 and the corresponding eigenvectors are x, y, z, from which we recover $R_1, R_2, \ldots, R_n$!
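A sketch of the spectral method under the uniformity assumption: take the three leading eigenvectors of C and read off the rotations blockwise. The eigenvectors are only determined up to an orthogonal change of basis of the eigenspace, so the recovery is up to a global rotation/reflection.

```python
import numpy as np

def spectral_rotations(C):
    """Recover rotations from the top-3 eigenspace of the common-lines matrix."""
    n = C.shape[0] // 2
    w, V = np.linalg.eigh(C)
    top3 = V[:, np.argsort(w)[-3:]].T        # 3 x 2n; rows span {x, y, z}
    rotations = np.zeros((n, 3, 3))
    for i in range(n):
        B = top3[:, 2 * i:2 * i + 2]         # 3 x 2 block for image i
        U, _, Vt = np.linalg.svd(B, full_matrices=False)
        R12 = U @ Vt                         # nearest 3x2 with orthonormal columns
        rotations[i] = np.column_stack([R12, np.cross(R12[:, 0], R12[:, 1])])
    return rotations
```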

14. Spectrum of C. Numerical simulation with n = 1000 rotations sampled from the Haar measure; no noise. [Figure: bar plots of the positive (left) and negative (right) eigenvalues of C.]

Eigenvalues: $\lambda_l \approx n \, \frac{(-1)^{l+1}}{l(l+1)}$, $l = 1, 2, 3, \ldots$, i.e. $n \cdot (\frac{1}{2}, -\frac{1}{6}, \frac{1}{12}, \ldots)$. Multiplicities: $2l + 1$.

Two basic questions:
1. Why this spectrum? Answer: representation theory of SO(3) (Hadani, S, 2011).
2. Is it stable to noise? Answer: yes, due to random matrix theory.
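The predicted eigenvalues are easy to check numerically; this sketch builds a clean C from Haar-random rotations (a smaller n than on the slide, to keep the double loop quick) and prints the leading positive and negative eigenvalues divided by n, which should approach 1/2 and -1/6.

```python
import numpy as np
from scipy.spatial.transform import Rotation

n = 500
rots = Rotation.random(n, random_state=0).as_matrix()
C = np.zeros((2 * n, 2 * n))
for i in range(n):
    for j in range(n):
        if i != j:
            ell = np.cross(rots[i][:, 2], rots[j][:, 2])
            ell /= np.linalg.norm(ell)
            c_ij = (rots[i].T @ ell)[:2]     # common line in image i
            c_ji = (rots[j].T @ ell)[:2]     # common line in image j
            C[2 * i:2 * i + 2, 2 * j:2 * j + 2] = np.outer(c_ji, c_ij)

w = np.sort(np.linalg.eigvalsh(C))
print(w[-3:] / n)  # three eigenvalues near +1/2  (l = 1, multiplicity 3)
print(w[:5] / n)   # five eigenvalues near -1/6   (l = 2, multiplicity 5)
```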

15. Common Lines Kernel. The common line equation

$$R_i c_{ij} = R_j c_{ji} = \pm \frac{R_i^3 \times R_j^3}{\| R_i^3 \times R_j^3 \|}$$

gives

$$c_{ij} = \pm R_i^{-1} \frac{R_i^3 \times R_j^3}{\| R_i^3 \times R_j^3 \|}, \qquad c_{ji} = \pm R_j^{-1} \frac{R_i^3 \times R_j^3}{\| R_i^3 \times R_j^3 \|}.$$

By basic properties of the cross product, $C_{ij}$ is only a function of the ratio $U_{ij} = R_i^{-1} R_j$:

$$C_{ij} = \begin{pmatrix} x_{ij} x_{ji} & x_{ij} y_{ji} \\ y_{ij} x_{ji} & y_{ij} y_{ji} \end{pmatrix} = (P c_{ij})(P c_{ji})^T = \frac{1}{\| I^3 \times U_{ij}^3 \|^2} \, P \, [I^3 \times U_{ij}^3] [(U_{ij}^{-1})^3 \times I^3]^T \, P^T,$$

where $P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$ and $I^3$ denotes the third column of the identity. Common lines kernel: $C_{ij} = k(R_i, R_j) = \tilde{k}(R_i^{-1} R_j) : \mathbb{R}^2 \mapsto \mathbb{R}^2$.
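The closed form can be verified numerically against the direct construction; in this sketch $I^3$ is written as the basis vector e3.

```python
import numpy as np
from scipy.spatial.transform import Rotation

Ri = Rotation.random(random_state=0).as_matrix()
Rj = Rotation.random(random_state=1).as_matrix()
e3 = np.array([0.0, 0.0, 1.0])
P = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

# Direct construction of C_ij = (P c_ij)(P c_ji)^T from the common line.
ell = np.cross(Ri[:, 2], Rj[:, 2])
ell /= np.linalg.norm(ell)
C_direct = np.outer(P @ (Ri.T @ ell), P @ (Rj.T @ ell))

# Closed form depending on R_i, R_j only through the ratio U = R_i^{-1} R_j.
U = Ri.T @ Rj
a = np.cross(e3, U[:, 2])      # e3 x U^3
b = np.cross(U.T[:, 2], e3)    # (U^{-1})^3 x e3
C_kernel = (P @ np.outer(a, b) @ P.T) / np.dot(a, a)

assert np.allclose(C_direct, C_kernel)
```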

16. Common Lines Operator. Common lines kernel: $C_{ij} = k(R_i, R_j) = \tilde{k}(R_i^{-1} R_j) : \mathbb{R}^2 \mapsto \mathbb{R}^2$. Suppose $f : SO(3) \mapsto \mathbb{R}^2$; then

$$(Cf)(R_i) = \frac{1}{n} \sum_{j=1}^{n} C_{ij} f(R_j) \approx \int_{SO(3)} k(R_i, R_j) f(R_j) \, d\mu(R_j) = (\mathcal{C} f)(R_i).$$

The common lines integral operator $\mathcal{C}$ is a convolution over SO(3), so its eigenfunctions satisfy

$$(\mathcal{C} f)(R_i) = \int_{SO(3)} \tilde{k}(R_i^{-1} R_j) f(R_j) \, d\mu(R_j) = \lambda f(R_i).$$

Representation theory of SO(3) explains the eigenvalues and eigenfunctions. The eigenfunctions are tangent vector fields to the 2-sphere (sections of the tangent bundle); they are the projected gradients of the scalar spherical harmonics.

17. Probabilistic Model and Wigner's Semi-Circle Law. Simplistic model: every common line is detected correctly with probability p, independently of all other common lines; with probability 1 - p the common line is falsely detected, uniformly distributed over the unit circle. Let $C^{\text{clean}}$ be the matrix C when all common lines are detected correctly (p = 1). The expected value of the noisy matrix C is $E[C] = p \, C^{\text{clean}}$, as the contribution of the falsely detected common lines to the expected value vanishes. Decompose C as

$$C = p \, C^{\text{clean}} + W,$$

where W is a $2n \times 2n$ zero-mean random matrix. The eigenvalues of W are distributed according to Wigner's semi-circle law, whose support, up to $O(p)$ and finite-sample fluctuations, is $[-\sqrt{2n}, \sqrt{2n}]$.
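This model is straightforward to simulate; the sketch below builds a clean C and a noisy C with misdetection probability 1 - p, then compares the largest eigenvalue of W = C - p C_clean with sqrt(2n) (agreement is only up to the O(p) and finite-n effects mentioned above).

```python
import numpy as np
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(0)
n, p = 400, 0.3
rots = Rotation.random(n, random_state=0).as_matrix()

C_clean = np.zeros((2 * n, 2 * n))
C = np.zeros_like(C_clean)
for i in range(n):
    for j in range(i + 1, n):
        ell = np.cross(rots[i][:, 2], rots[j][:, 2])
        ell /= np.linalg.norm(ell)
        block = np.outer((rots[j].T @ ell)[:2], (rots[i].T @ ell)[:2])
        C_clean[2 * i:2 * i + 2, 2 * j:2 * j + 2] = block
        C_clean[2 * j:2 * j + 2, 2 * i:2 * i + 2] = block.T
        if rng.random() > p:  # misdetection: directions uniform on the circle
            ti, tj = rng.uniform(0.0, 2.0 * np.pi, size=2)
            block = np.outer([np.cos(tj), np.sin(tj)], [np.cos(ti), np.sin(ti)])
        C[2 * i:2 * i + 2, 2 * j:2 * j + 2] = block
        C[2 * j:2 * j + 2, 2 * i:2 * i + 2] = block.T

W = C - p * C_clean
print(np.abs(np.linalg.eigvalsh(W)).max(), np.sqrt(2 * n))  # comparable sizes
```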
