Ab initio cryo-EM structure determination as a validation problem

Ab initio cryo-EM structure determination as a validation problem
Pawel A. Penczek
The University of Texas – Houston Medical School, Department of Biochemistry
Thursday, November 13, 14

ACKNOWLEDGMENTS
Francisco J. Asturias, La Jolla, CA
Christian M.T. Spahn, Charité, Berlin
NIH

CONCLUSIONS
1. Validation should be an integral part of the structure determination process.
2. Any method should be permitted to fail under controlled circumstances, as the failure can be as informative as success.
3. EM projection images are of very poor quality. Therefore, they should not be evaluated individually but as members of statistical assemblies.
4. Implementation in SPARX (http://sparx-em.org/sparxwiki/), with new additions of tools for the analysis of local variability (please see the poster).

Statistical cross-validation for detecting and preventing overfitting
Problem of model selection

EM DATA AND PARAMETER ERROR ESTIMATION
• A typical EM experiment generates a single dataset, and it is not possible to derive an analytical expression to determine (alignment) parameter errors.
• The challenge is then to estimate parameter errors in the absence of independent sample sets.
• Statistical resampling offers the best option for accurate estimation of parameter errors independent of assumptions about their statistical properties.
If we treat the observed sample (EM dataset) as though it exactly represented the entire population, evaluating artificial variability generated through resampling allows us to accurately estimate the variability of a sample statistic.
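The resampling idea on this slide can be sketched in a few lines. This is a minimal illustration, not code from SPARX: the function name `bootstrap_std_error`, the toy data, and the choice of the mean as the sample statistic are all assumptions made for the example.

```python
import numpy as np

def bootstrap_std_error(sample, statistic, n_resamples=1000, seed=0):
    """Estimate a statistic and its standard error by bootstrap resampling.

    The observed sample is treated as if it were the whole population:
    we draw n items from it with replacement, recompute the statistic,
    and repeat. The spread of the recomputed values estimates the error.
    """
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = len(sample)
    estimates = np.empty(n_resamples)
    for b in range(n_resamples):
        idx = rng.integers(0, n, size=n)      # indices drawn with replacement
        estimates[b] = statistic(sample[idx])
    return estimates.mean(), estimates.std(ddof=1)

# Toy data standing in for a single experimental dataset (hypothetical).
toy_rng = np.random.default_rng(1)
data = toy_rng.normal(loc=2.0, scale=0.5, size=200)
mean_est, std_err = bootstrap_std_error(data, np.mean)
print(f"mean = {mean_est:.3f} +/- {std_err:.3f}")
```

For the mean, the bootstrap error should land near the textbook value of the sample standard deviation divided by the square root of n. The advantage of resampling is that the same function works for statistics that have no such closed-form expression.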

CTF parameter estimation and error assessment through bootstrap resampling (CTER)
BOOTSTRAP RESAMPLING OF TILED POWER SPECTRA
[Diagram: numbered power-spectrum tiles and resampled selections of them]
• Average of selected power spectra; average power spectrum and its variance
• Determine: 1. defocus 2. astigmatism amplitude 3. astigmatism angle
• Repeat B times
RESULT: Based on B estimates, compute average value and error (std. dev.) of <defocus>, <astigmatism amplitude>, <astigmatism angle>
Penczek, P. A., Fang, J., Li, X., Cheng, Y., Loerke, J., Spahn, Ch.M.T.: CTER - Rapid estimation of CTF parameters with error assessment. Ultramicroscopy, 140:9-19, 2014.
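The loop on this slide (select tiles, average them, estimate parameters, repeat B times, report mean and standard deviation) might look roughly as follows. This is a hedged sketch, not the CTER implementation: `bootstrap_tiles` is a hypothetical name, the tiles are synthetic 1D profiles, and `centroid_frequency` is a deliberately simple stand-in for a real CTF fit of defocus and astigmatism.

```python
import numpy as np

def bootstrap_tiles(tile_spectra, estimate, n_boot=200, seed=0):
    """Bootstrap over tiles: resample, average, estimate, repeat n_boot times.

    tile_spectra: array (n_tiles, ...) with one power spectrum per tile.
    estimate:     callable mapping an averaged spectrum to parameter value(s).
    Returns the mean and standard deviation of the n_boot estimates.
    """
    rng = np.random.default_rng(seed)
    tile_spectra = np.asarray(tile_spectra, dtype=float)
    n_tiles = tile_spectra.shape[0]
    results = []
    for _ in range(n_boot):
        picks = rng.integers(0, n_tiles, size=n_tiles)  # tiles drawn with replacement
        averaged = tile_spectra[picks].mean(axis=0)     # average of selected spectra
        results.append(estimate(averaged))
    results = np.asarray(results, dtype=float)
    return results.mean(axis=0), results.std(axis=0, ddof=1)

def centroid_frequency(spectrum):
    """Stand-in estimator (assumption): intensity-weighted mean frequency index."""
    freqs = np.arange(spectrum.size)
    return np.sum(freqs * spectrum) / np.sum(spectrum)

# Synthetic tiles (hypothetical): one clean profile plus positive noise per tile.
toy_rng = np.random.default_rng(42)
freqs = np.arange(64)
clean = np.exp(-0.5 * ((freqs - 20.0) / 5.0) ** 2)
tiles = clean + 0.05 * toy_rng.random((30, 64))
value, error = bootstrap_tiles(tiles, centroid_frequency)
print(f"estimate = {value:.2f} +/- {error:.3f}")
```

Passing the estimator as a callable keeps the resampling logic separate from the fitting step, so an estimator that returns three values (defocus, astigmatism amplitude, astigmatism angle) would yield three means and three errors without changes to the loop.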

ISAC: VALIDATION OF 2D MULTI-REFERENCE ALIGNMENT THROUGH STABILITY TESTING
1. If a set of images is homogeneous, the result from reference-free alignment is stable even for very low SNR data.
2. The converse is true, i.e., if a set of images is stable, it must be homogeneous.
2D alignment is stable if perturbation of initial alignment parameters does not produce dramatically different results.
Assuming 1 and 2 are correct: if we can find homogeneous subsets of images, we can solve the multi-reference alignment problem.
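A stability check of this kind can be sketched by comparing the alignment parameters returned by several runs that started from perturbed initial parameters. The sketch below is a simplification under stated assumptions: it handles translations only (no rotation or mirroring), the names `pixel_errors` and `is_stable` and the 1-pixel threshold are hypothetical, and it is not the pixel-error definition used in ISAC.

```python
import numpy as np

def pixel_errors(runs):
    """Per-image disagreement between independent alignment runs.

    runs: array (n_runs, n_images, 2) of x/y shifts found by each run.
    Reference-free alignment is only defined up to a common shift, so each
    run's mean shift is removed before the runs are compared.
    """
    runs = np.asarray(runs, dtype=float)
    centered = runs - runs.mean(axis=1, keepdims=True)   # drop each run's global offset
    consensus = centered.mean(axis=0)                    # average shift per image
    deviation = np.linalg.norm(centered - consensus, axis=2)
    return np.sqrt((deviation ** 2).mean(axis=0))        # RMS over runs, per image

def is_stable(runs, threshold=1.0):
    """Call the set stable if every image's error is within the threshold (pixels)."""
    return bool(np.all(pixel_errors(runs) <= threshold))

toy_rng = np.random.default_rng(0)
true_shifts = toy_rng.uniform(-5, 5, size=(50, 2))

# Stable case: runs agree up to a global offset plus small jitter.
stable_runs = np.stack([
    true_shifts + toy_rng.uniform(-3, 3, size=2) + 0.1 * toy_rng.normal(size=(50, 2))
    for _ in range(4)
])
# Unstable case: every run returns unrelated shifts.
unstable_runs = toy_rng.uniform(-5, 5, size=(4, 50, 2))

print(is_stable(stable_runs), is_stable(unstable_runs))
```

The key design point is the removal of the global offset: two runs that differ only by a common transformation describe the same alignment and should not be counted as disagreeing.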

STABLE VS. UNSTABLE CLASSES: A TEST CASE
Two groups were mixed 50-50; their respective averages are: [figure: the two group averages]
Sum of these two averages: [figure: summed average]

STABLE VS. UNSTABLE CLASSES: TEST RESULTS
[Figure panels: Unstable; Stable (remaining are mirror-unstable); pixel error; FRC]

2D MULTI-REFERENCE ALIGNMENT (MRA)
[Diagram: n images assigned to K averages (clusters)]
MRA is equivalent to K-means clustering, with the distance between images defined through the maximum similarity over the permissible range of image rotations and translations.
K-means results depend on the solution to another nontrivial problem: the alignment of a set of 2D images. Because neither of these two problems can be easily solved, the difficulty is compounded.
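The similarity that MRA maximizes over rotations and translations can be illustrated with a coarse search. This sketch makes strong simplifying assumptions that are not in the slides: rotations are restricted to multiples of 90 degrees (`np.rot90`), translations are small integer cyclic shifts (`np.roll`), and similarity is a normalized correlation. The name `best_similarity` is hypothetical; real MRA uses fine angular sampling with interpolation.

```python
import itertools
import numpy as np

def best_similarity(image, reference, max_shift=2):
    """Maximum normalized correlation over 90-degree rotations and small shifts.

    Returns the best score and the (rotation count, dx, dy) that achieved it.
    """
    ref = (reference - reference.mean()) / reference.std()
    best_score, best_params = -np.inf, None
    shifts = range(-max_shift, max_shift + 1)
    for k, dx, dy in itertools.product(range(4), shifts, shifts):
        candidate = np.roll(np.rot90(image, k), shift=(dy, dx), axis=(0, 1))
        candidate = (candidate - candidate.mean()) / candidate.std()
        score = float((candidate * ref).mean())
        if score > best_score:
            best_score, best_params = score, (k, dx, dy)
    return best_score, best_params

# A reference and a rotated, shifted copy of it (synthetic data).
toy_rng = np.random.default_rng(3)
reference = toy_rng.normal(size=(16, 16))
image = np.rot90(np.roll(reference, shift=(1, -2), axis=(0, 1)), 1)

score, params = best_similarity(image, reference)
print(score, params)
```

A clustering distance can then be taken as, for example, one minus this score, which is why every distance evaluation in MRA hides an alignment search of its own.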

K-MEANS CLUSTERING
KNOWN PROPERTIES:
• Very fast; convergence guaranteed in a finite number of steps
• Converges only to a local minimum
• Unclear how to determine the appropriate number of classes (K)
• All images must be assigned to an average
• The solution (final averages) depends on the initial set of averages, and will change if clustering is repeated using different initial averages
• In EM, when alignment is added, classes tend to collapse
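Two of these properties, convergence to a local minimum only and dependence on the initial averages, show up even in a four-point example. The code below is a plain Lloyd-style K-means written for illustration (no alignment step, function name `kmeans` chosen here), applied to the corners of a 4 x 2 rectangle with two different initializations.

```python
import numpy as np

def kmeans(points, init_centers, max_iter=100):
    """Plain K-means; returns labels, centers, and the objective after each assignment."""
    points = np.asarray(points, dtype=float)
    centers = np.array(init_centers, dtype=float)
    labels, history = None, []
    for _ in range(max_iter):
        # Squared distance from every point to every center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        history.append(float(d2[np.arange(len(points)), new_labels].sum()))
        if labels is not None and np.array_equal(new_labels, labels):
            break                              # assignments unchanged: converged
        labels = new_labels
        for k in range(len(centers)):
            members = points[labels == k]
            if len(members):                   # an empty class keeps its old center
                centers[k] = members.mean(axis=0)
    return labels, centers, history

corners = [(0, 0), (0, 2), (4, 0), (4, 2)]

# Initialization A splits the rectangle left/right; B splits it bottom/top.
labels_a, _, history_a = kmeans(corners, [(0, 1), (4, 1)])
labels_b, _, history_b = kmeans(corners, [(2, 0), (2, 2)])
print(history_a[-1], history_b[-1])   # 4.0 16.0
```

Both runs converge immediately, but to different partitions with objectives 4 and 16. Nothing inside the algorithm signals that the second answer is the worse one, which is the motivation for validating results by repeating the clustering from different starting points.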

EQK (EQUAL GROUP SIZE)-MEANS CLUSTERING
Assign n images to K classes such that each class contains n/K images.
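The slide states the constraint but not how the assignment is carried out. One simple way to satisfy it, shown here purely as an assumption, is a greedy pass over all image-class pairs from closest to farthest, skipping images that are already assigned and classes that are already full. The function name `equal_size_assign` is hypothetical.

```python
import numpy as np

def equal_size_assign(dist):
    """Greedy equal-size assignment (illustrative heuristic).

    dist: (n, K) array of image-to-class distances; n must be divisible by K.
    Returns one class label per image, with exactly n/K images in each class.
    """
    dist = np.asarray(dist, dtype=float)
    n, K = dist.shape
    if n % K:
        raise ValueError("n must be divisible by K")
    capacity = np.full(K, n // K)
    labels = np.full(n, -1)
    # Visit (image, class) pairs in order of increasing distance.
    for flat in np.argsort(dist, axis=None):
        i, k = divmod(int(flat), K)
        if labels[i] == -1 and capacity[k] > 0:
            labels[i] = k
            capacity[k] -= 1
    return labels

toy_rng = np.random.default_rng(7)
distances = toy_rng.random((12, 3))
distances[:, 0] *= 0.1              # every image prefers class 0
labels = equal_size_assign(distances)
print(np.bincount(labels, minlength=3))
```

Even though every image in this toy example is closest to class 0, each class ends up with four members. Fixing the class size in this way rules out the class collapse that ordinary K-means with alignment tends to produce.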
