Studies of flexible systems
D. Svergun, EMBL-Hamburg
DOGMA: No function without structure!
Not true! Flexibility enables more function
• Hub proteins in networks
• Interaction specialists
P. Tompa et al. "Intrinsically disordered proteins: emerging interaction specialists." Current Opinion in Structural Biology 35 (2015): 49-59.
Scattering from monodisperse systems and mixtures
Monodisperse systems:
$$I(s) = 4\pi \int_0^{D} p(r)\,\frac{\sin(sr)}{sr}\,dr$$
The scattering is proportional to that of a single particle averaged over all orientations, which allows one to determine the size, shape and internal structure of the particle at low (1-10 nm) resolution.
Mixtures:
$$I(s) = \sum_k v_k I_k(s)$$
For equilibrium and non-equilibrium mixtures, solution scattering makes it possible to determine the number of components and, given their scattering intensities I_k(s), also the volume fractions v_k.
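As a rough illustration of the monodisperse formula, the sketch below integrates a distance distribution p(r) numerically to obtain I(s). The test particle (a homogeneous sphere with D = 10 nm), the grid size and the function names `intensity_from_pr` and `sphere_pr` are assumptions made for this example only; they do not come from the slides.

```python
import math

def intensity_from_pr(s, r, p):
    """Trapezoidal estimate of I(s) = 4*pi * int_0^D p(r) sin(sr)/(sr) dr."""
    def integrand(ri, p_i):
        x = s * ri
        sinc = 1.0 if abs(x) < 1e-12 else math.sin(x) / x
        return p_i * sinc

    total = 0.0
    for i in range(len(r) - 1):
        left = integrand(r[i], p[i])
        right = integrand(r[i + 1], p[i + 1])
        total += 0.5 * (left + right) * (r[i + 1] - r[i])
    return 4.0 * math.pi * total

def sphere_pr(r, d):
    """Unnormalised p(r) of a homogeneous sphere of diameter d (assumed test case)."""
    x = r / d
    if x < 0.0 or x > 1.0:
        return 0.0
    return r * r * (1.0 - 1.5 * x + 0.5 * x ** 3)

# Hypothetical example: sphere with D = 10 nm sampled on 2001 points.
D = 10.0
r_grid = [D * i / 2000 for i in range(2001)]
p_grid = [sphere_pr(r, D) for r in r_grid]

i0 = intensity_from_pr(0.0, r_grid, p_grid)
ratio = intensity_from_pr(0.5, r_grid, p_grid) / i0  # I(s)/I(0) at s = 0.5 nm^-1
```

For a sphere the normalised result can be checked against the analytical form factor, which is a convenient sanity test for the integration.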
Scattering from flexible systems
• Conformational polydispersity (e.g. IDPs)
• (Almost) infinite range of conformations
• Cannot really identify all possible v_k and I_k(s)
• Requires a more indirect approach
Flexible systems Dealing with flexibility is not easy but possible
Classics: Kratky plot
The Kratky plot of s^2 I(s) vs s provides a sensitive means of monitoring the degree of compactness of a protein as a function of a given parameter.
• Globular particle: bell-shaped curve
• Gaussian chain: plateau at large s-values
But beware: a plateau does not imply a Gaussian chain.
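The two limiting shapes can be reproduced with textbook model curves. The sketch below is only an illustration: it assumes the Debye formula for a Gaussian chain and the form factor of a homogeneous sphere as stand-ins for an unfolded and a globular particle, and the R_g value and s range are arbitrary choices, not data from the slides.

```python
import math

def debye_chain(s, rg):
    """Normalised Gaussian-chain scattering (Debye formula), I(0) = 1."""
    x = (s * rg) ** 2
    if x < 1e-4:
        return 1.0 - x / 3.0  # series expansion avoids cancellation near s = 0
    return 2.0 * (math.exp(-x) + x - 1.0) / (x * x)

def sphere(s, radius):
    """Normalised scattering of a homogeneous sphere, I(0) = 1."""
    x = s * radius
    if x < 1e-3:
        return 1.0 - x * x / 5.0
    return (3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3) ** 2

rg = 3.0                                  # nm, arbitrary illustrative value
radius = rg * math.sqrt(5.0 / 3.0)        # sphere with the same R_g
s_values = [0.01 * i for i in range(1, 501)]  # 0.01 .. 5 nm^-1

kratky_chain = [s * s * debye_chain(s, rg) for s in s_values]
kratky_sphere = [s * s * sphere(s, radius) for s in s_values]

# Gaussian chain: s^2 I(s) levels off near 2 / R_g^2 at large s (plateau).
plateau = kratky_chain[-1]
# Sphere: s^2 I(s) rises to a maximum and then falls back (bell shape).
peak = max(kratky_sphere)
```

Plotting `kratky_chain` and `kratky_sphere` against `s_values` gives the plateau and the bell shape described above.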
SAXS from folded vs disordered protein
• Folded: relatively small R_g and D_max, bell-shaped Kratky plot (e.g. folded α-amylase, 448 AAs, R_g = 2.4 nm)
• Disordered: large R_g and D_max, increasing Kratky plot (e.g. IUP tau, 441 AAs, R_g = 6.5 nm)
Dimensionless Kratky plot
The bell shape vanishes as folded domains disappear and flexibility increases.
Receveur-Brechot V. & Durand D. (2012), Curr. Protein Pept. Sci., 13, 55-75
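A minimal sketch of the rescaling behind the dimensionless plot: the axes become s·R_g and (s·R_g)^2 I(s)/I(0), so curves from particles of different size can be overlaid. The helper name `dimensionless_kratky` and the test curve are assumptions; the compact particle is represented here only by the Guinier approximation, for which the bell maximum lies at s·R_g = sqrt(3) with height 3/e.

```python
import math

def dimensionless_kratky(s_values, intensities, rg, i0):
    """Return (s*Rg, (s*Rg)^2 * I(s)/I(0)) pairs for plotting."""
    return [(s * rg, (s * rg) ** 2 * i / i0)
            for s, i in zip(s_values, intensities)]

# Hypothetical compact particle described by the Guinier approximation.
rg = 2.4   # nm
i0 = 5.0   # arbitrary forward scattering
s_values = [0.001 * k for k in range(1, 2001)]
intensities = [i0 * math.exp(-((s * rg) ** 2) / 3.0) for s in s_values]

points = dimensionless_kratky(s_values, intensities, rg, i0)
peak_x, peak_y = max(points, key=lambda pt: pt[1])
# For this Guinier-type curve the maximum sits close to sqrt(3) ~ 1.73
# with a height close to 3/e ~ 1.10.
```

A peak shifted to larger s·R_g, or the absence of a peak, is then read as an indication of extended or flexible conformations.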
Flexibility analysis as mixture of different conformations
For monodisperse systems the scattering is proportional to that of a single particle averaged over all orientations:
$$I(s) = 4\pi \int_0^{D} p(r)\,\frac{\sin(sr)}{sr}\,dr$$
For a mixture of conformations:
$$I(s) = \sum_k v_k I_k(s)$$
where v_k is the volume fraction and I_k(s) the scattering intensity of the k-th component.
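When the component curves I_k(s) are known, the volume fractions follow from a linear fit of the mixture equation. The sketch below handles only two components by unweighted least squares, using Guinier-type curves as hypothetical stand-ins for real component intensities. The function name `fit_two_fractions` and all numbers are illustrative assumptions, and a practical fit would also weight by experimental errors and keep the fractions non-negative.

```python
import math

def fit_two_fractions(i_obs, i_1, i_2):
    """Least-squares v1, v2 for I_obs(s) = v1*I_1(s) + v2*I_2(s)."""
    a11 = sum(a * a for a in i_1)
    a12 = sum(a * b for a, b in zip(i_1, i_2))
    a22 = sum(b * b for b in i_2)
    b1 = sum(a * y for a, y in zip(i_1, i_obs))
    b2 = sum(b * y for b, y in zip(i_2, i_obs))
    det = a11 * a22 - a12 * a12
    if det == 0.0:
        raise ValueError("component curves are linearly dependent")
    v1 = (b1 * a22 - b2 * a12) / det
    v2 = (a11 * b2 - a12 * b1) / det
    return v1, v2

def guinier(s, rg):
    """Toy component curve (Guinier form), used only as a stand-in."""
    return math.exp(-((s * rg) ** 2) / 3.0)

s_values = [0.01 * i for i in range(1, 201)]
compact = [guinier(s, 2.0) for s in s_values]    # hypothetical compact state
extended = [guinier(s, 5.0) for s in s_values]   # hypothetical extended state
mixture = [0.3 * a + 0.7 * b for a, b in zip(compact, extended)]

v1, v2 = fit_two_fractions(mixture, compact, extended)
```

For a flexible system the number of components is effectively unbounded, which is why this direct decomposition gives way to the ensemble approach discussed in the following slides.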
Detection of Flexibility: A Crucial Issue
Comparing SAXS curves for a rigid scenario and a flexible scenario:
• Analysis of the overall size descriptors (R_g, p(r), Kratky)
• Modelling: ab initio (DAMMIN/DAMMIF) and rigid body (BUNCH/CORAL)
• Analysis of the differences
Go for flexibility!
Detection of Flexibility: A Crucial Issue
PolyUbiquitin molecules (flexible vs rigid scenario): 2, 3, 4 and 5 ubiquitin (72 AA) domains connected by 20 AA linkers (RanCH).
Flexible multidomain proteins present fewer features in the SAXS curve than their rigid counterparts.
Bernadó Eur. Biophys. J. 2009, 39, 769
Detection of Flexibility: A Crucial Issue
• Flexible proteins have large D_max values and a smooth decay of p(r) to p(D_max) = 0
• No more than two peaks, indicating distal correlation between folded domains, appear in the p(r) in the flexible scenario
• Additional peaks are only present in the rigid scenario
Bernadó Eur. Biophys. J. 2009, 39, 769
Modelling the Flexible Scenario with Single Conformation Strategies
• Good fits are obtained in both ab initio and rigid body modelling
• No structural variation is observed between solutions
• Homogeneous densities are observed in DAMMIN solutions; there is a systematic decrease of resolution
• Domains appear isolated; no interdomain contacts are observed in BUNCH solutions
Bernadó Eur. Biophys. J. 2009, 39, 769
Indications (not Proofs!) of Flexibility
► Smooth scattering profiles and featureless Kratky plots
► Large R_g and D_max values
► Absence of correlation peaks in the p(r) function
► Low correlation densities in ab initio reconstructions
► Isolated domains in rigid body modelling
► Prediction of disorder using bioinformatics tools
Bioinformatics tools for disordered proteins
http://www.idpbynmr.eu/home/science/research-tools.html
A general approach for characterization of flexible systems: Ensemble Optimization Method (EOM)
Problem of standard methods: notoriously flexible systems, e.g. intrinsically disordered proteins or multidomain proteins with long flexible linkers, often cannot be interpreted in terms of a single model (either no fit to the data or irreproducible reconstructions).
Solution: take flexibility into account by allowing for the coexistence of multiple conformations, which are selected from a large initial random pool.
Bernadó, P., Mylonas, E., Petoukhov, M.V., Blackledge, M., & Svergun, D. I. (2007) J. Am. Chem. Soc. 129, 5656-5664.
Overview of EOM
• Generate a large representative pool (~10^4-10^5) of random models and compute their scattering patterns
• Using a genetic algorithm, select sub-ensemble(s) (~10^0-10^1 models) such that their mixture fits the available experimental data (also from deletion mutants, if available)
• Analyze structural properties of the selected ensembles to characterize the flexibility of the macromolecule
Bernadó, P., Mylonas, E., Petoukhov, M.V., Blackledge, M., & Svergun, D. I. (2007) J. Am. Chem. Soc. 129, 5656-5664.
Idea of the ensemble approach
Pool generation → Crysol → Genetic Algorithm
$$I(s) = \frac{1}{N} \sum_{n=1}^{N} I_n(s)$$
[Figure: pool conformers with their radii of gyration (R_g 1 ... R_g 5) and the R_g distribution]
Genetic Algorithm (optimized ensemble size)
$$I(s) = \frac{1}{N} \sum_{n=1}^{N} I_n(s)$$
[Figure: chromosomes of Generation 1 evolve into Generation 2 through mutation, crossing and elitism]
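The selection step can be pictured with a toy genetic algorithm. This is a simplified sketch and not the EOM implementation: it assumes Debye Gaussian-chain curves in place of scattering patterns computed from atomic models, a fixed ensemble size, an unweighted constant error `sigma`, and hypothetical names and parameters (`select_ensemble`, pool of 500 chains, 50 generations) throughout.

```python
import math
import random

def debye_curve(rg, s_values):
    """Stand-in for a computed scattering pattern (Debye chain, I(0) = 1)."""
    out = []
    for s in s_values:
        x = (s * rg) ** 2
        out.append(1.0 - x / 3.0 if x < 1e-4
                   else 2.0 * (math.exp(-x) + x - 1.0) / (x * x))
    return out

def ensemble_curve(indices, curves):
    """I(s) = (1/N) * sum_n I_n(s) over the selected conformers."""
    n = len(indices)
    return [sum(curves[i][j] for i in indices) / n
            for j in range(len(curves[0]))]

def chi2(indices, curves, target, sigma):
    model = ensemble_curve(indices, curves)
    return sum(((m - t) / sigma) ** 2 for m, t in zip(model, target)) / len(target)

def select_ensemble(curves, target, sigma=0.01, size=10,
                    population=20, generations=50, seed=0):
    rng = random.Random(seed)
    n_pool = len(curves)

    def score(chromosome):
        return chi2(chromosome, curves, target, sigma)

    # Each chromosome is a list of pool indices (one candidate ensemble).
    chromosomes = [[rng.randrange(n_pool) for _ in range(size)]
                   for _ in range(population)]
    history = [min(score(c) for c in chromosomes)]
    for _ in range(generations):
        children = []
        for c in chromosomes:
            mutant = list(c)                       # mutation: swap in a new conformer
            mutant[rng.randrange(size)] = rng.randrange(n_pool)
            children.append(mutant)
            mate = rng.choice(chromosomes)         # crossing: exchange chromosome tails
            cut = rng.randrange(1, size)
            children.append(c[:cut] + mate[cut:])
        # elitism: parents compete with children, the best survive
        chromosomes = sorted(chromosomes + children, key=score)[:population]
        history.append(score(chromosomes[0]))
    return chromosomes[0], history

# Hypothetical demo: pool centred at R_g = 30 Å, "data" from more extended chains.
s_grid = [0.01 * i for i in range(1, 31)]
pool_rng = random.Random(1)
pool_rg = [max(10.0, pool_rng.gauss(30.0, 8.0)) for _ in range(500)]
pool = [debye_curve(rg, s_grid) for rg in pool_rg]
target = ensemble_curve([0, 1, 2],
                        [debye_curve(rg, s_grid) for rg in (36.0, 40.0, 44.0)])
best, history = select_ensemble(pool, target)
selected_rg = [pool_rg[i] for i in best]
```

Comparing the distribution of `selected_rg` with that of `pool_rg` mirrors the pool-versus-ensemble histograms used to present EOM results.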
Cα Modelling: Native vs. Random
Quasi Cα-Cα Ramachandran plot: bond angles vs. dihedral angles
G. Kleywegt, Validation of protein models from Cα coordinates alone, JMB, 1997, 273, 371-376
Theoretical distribution of the bond and dihedral angles for random chains
$$R_g = R_0 \cdot N^{\nu}$$
R_0: persistence length; ν: solvent 'quality'
Several experimental and theoretical studies establish ν = 0.598 as an indication of the 'random coil' in chemically denatured (urea or GuHCl) proteins. Kohn et al. PNAS, 2004, 101, 12491
• R_0 = 1.927 ± 0.270, ν = 0.598 ± 0.028: 'fully disordered'
• IDPs: R_0 = 2.54 ± 0.01, ν = 0.522 ± 0.010: 'less disordered'
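The scaling law is easy to evaluate for a given chain length. The sketch below assumes R_0 is expressed in Å (so R_g comes out in Å) and uses the two parameter sets quoted above; the function name `flory_rg` and the dictionary layout are illustrative choices.

```python
def flory_rg(n_residues, r0, nu):
    """R_g = R_0 * N**nu; R_0 is assumed to be in Å."""
    return r0 * n_residues ** nu

# Parameter sets quoted on the slide (uncertainties omitted).
PARAMETERS = {
    "chemically denatured": (1.927, 0.598),
    "IDP": (2.54, 0.522),
}

# Example: a 441-residue chain, the length quoted earlier for tau.
n = 441
expected_rg = {label: flory_rg(n, r0, nu)
               for label, (r0, nu) in PARAMETERS.items()}
```

With the IDP parameters a 441-residue chain gives an R_g of roughly 6 nm, the same order as the 6.5 nm quoted for tau earlier in the slides.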
Unfolded protein …
• TAU protein isoform (124 AA)
Inputs for using EOM:
• curve.dat
• sequence.seq:
>
CYLSRKLMLDARENLKLLDRMNRLSPHSCL
QDRKDFGLPQEMVEGDQLQKDQAFPVLYE
MLQQSFNLFYTEHSSAAWDTTLLEQLCTGL
QQQLDHLDTCRGQVMGEEDSELGNMDPIV
TVKKYF
… unfolded protein: results
[Figure: R_g distributions (0-80 Å) for the pool and the selected ensembles; labelled values R_g = 45.05 and R_g = 32.96]
[Figure: D_max distributions (0-250 Å) for the pool and the selected ensembles; labelled values D_max = 140.38 and D_max = 101.02]
Missing loops (i.e. flat electron density map) …
Inputs: Nter.pdb, Cter.pdb, curve.dat, seq.seq
seq.seq: MRIGMV……..GGVQSHVLQ…..VLRDAGHEVS…….PHVKLPDYVS (missing loop: 30 AA)
[Figure: Kratky plot vs. apoferritin; pool]
... Missing loops: results
[Figure: R_g distributions (21-27 Å) for the pool and the selected ensembles; labelled values R_g = 24.46 and R_g = 24.28]
EOM use of symmetry
$$I(s) = \sum_k v_k I_k(s)$$
• Symmetry
❏ Symmetric core ❏ Symmetric linkers/termini
❏ Symmetric core ❏ Asymmetric linkers/termini
Flexible pentamer in solution …
(full-length protein measured in two buffers, with low and high ionic strength respectively)
• N-terminal tail (31 AA): missing
• N-terminal pentamer domain: high resolution (MX)
• Inter-domain linker (122 AA): missing
• C-terminal monomer domain: high resolution (MX)
[Figure: pool]
… Flexible pentamer in solution: results
(full-length protein measured in two buffers, with low and high ionic strength respectively)
Multi-curve fitting
[Figure: R_g distributions (20-100 Å) and D_max distributions (80-380 Å) for the pool, high ionic strength and low ionic strength]
Case extra: dodecamer (P62, 2 domains) + tRNA
• N-terminal tail: 158 AA
• N-terminal monomer domain (141 AA): high resolution (MX)
• Inter-domain linker: 9 AA
• C-terminal monomer domain (270 AA): high resolution (MX)
• 30 N single strand tRNA
• Subunit contact residues range; max distance in Å
EOM Tests: Size of Pool
[Figure: R_g density distributions (R_g axis 10-60) for pools of 10, 100, 1 000, 5 000, 10 000 and 64 790 chains]
Resolution of Subpopulations by EOM
Generate a pool, select two subpopulations from it and calculate the scattering curve for their mixture.
[Figure: R_g distributions (R_g, Å) for wide subpopulations and narrow subpopulations]