Using selective pressure to improve protein tridimensional structure prediction
Aude GRELAUD (1, 2), Jean-Michel MARIN (3), Christian P. ROBERT (1), François RODOLPHE (2)
1 Cérémade, Université Paris Dauphine, and Laboratoire de statistique, CREST-INSEE
2 Unité Mathématique, Informatique et Génome, INRA
3 INRIA Saclay
MIEP, Hameau de l'Etoile, June 2008
1 Context
2 Markov random fields
3 Parameter posterior distribution
4 Model choice
5 Simulations
6 Conclusion
Aim
• Predict the tridimensional structure of the protein, knowing its amino acid sequence (… Met Thr Gln Cys)
Existing methods
• Experimental methods:
  • X-ray crystallography
  • Nuclear magnetic resonance spectroscopy
  • Cryomicroscopy
  → Expensive and slow, but provide the exact 3D structure
• Computational methods:
  • Based on homologies with proteins of known structure: methods based on sequence similarity, protein threading
  • De novo prediction
  → Give several possible 3D structures, with no criterion for choosing among them
Purpose: build a ranking method based on a phylogenetic stability criterion
• In a 3D structure, amino acids in contact frequently have similar modification tolerances
• Criterion: selective pressure sequence
Data
• First step: estimate the selective pressure sequence $\omega_1, \dots, \omega_n$ on a multiple alignment of homologs
    Seq: caa agg tgc tta
    H1:  cat agg tgc gta
    H2:  cat tgg tgc cta
    H3:  aat tgg tgc ctg
              ↓
         $\omega_1$ $\omega_2$ $\omega_3$ $\omega_4$
• m folding candidates
Statistical tools
• Markov random fields
• ABC (Approximate Bayesian Computation)
• Bayesian model choice
Definition
• Markov chain
• Markov random field: generalisation of a Markov chain
Definition (2)
• The state at a point $i$ depends only on the states of its neighbours $n(i)$:
$$\pi(x_i = j \mid x_{-i}) = \pi(x_i = j \mid x_{n(i)})$$
• Hammersley-Clifford theorem:
$$P(X = x) = \frac{1}{Z} \exp(-U(x))$$
with
  • $U(x)$: potential, a sum over the cliques $c \in C$:
$$U(x) = \sum_{c \in C} V_c(x)$$
    Here:
$$U(x) = -\theta \sum_{(i,j):\, i \sim j} \mathbb{1}\{x_i = x_j\}$$
  • $Z$: normalizing constant
$$Z = \sum_{x} \exp(-U(x))$$
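To make the Gibbs form above concrete, here is a minimal Python sketch that evaluates $U(x)$, $Z$ and $P(X = x)$ by brute-force enumeration. The 4-site cycle, the two states and the value $\theta = 0.5$ are illustrative assumptions, not taken from the slides; the function names are hypothetical.

```python
from itertools import product
from math import exp

def potential(x, edges, theta):
    # U(x) = -theta * (number of neighbouring pairs sharing the same state)
    return -theta * sum(x[i] == x[j] for i, j in edges)

def gibbs_distribution(n_sites, n_states, edges, theta):
    # Enumerate every configuration: the cost is n_states ** n_sites,
    # so this is only feasible for toy graphs.
    configs = list(product(range(n_states), repeat=n_sites))
    weights = [exp(-potential(x, edges, theta)) for x in configs]
    Z = sum(weights)  # normalizing constant
    return Z, {x: w / Z for x, w in zip(configs, weights)}

# Assumed toy neighbourhood: 4 sites arranged in a cycle, 2 states per site.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
Z, probs = gibbs_distribution(4, 2, edges, theta=0.5)
```

With $\theta > 0$, configurations in which neighbours agree receive more mass. The sum defining $Z$ has `n_states ** n_sites` terms, so exact evaluation quickly becomes infeasible as the number of sites grows.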
Bayesian modelling
• Prior distribution:
  • $\theta \sim \pi(\theta)$
• Likelihood: $(X \mid \theta) \sim \mathrm{MRF}(\theta)$
$$f(x \mid \theta) = \frac{1}{Z_\theta} \exp\Big(\theta \sum_{(i,j):\, i \sim j} \mathbb{1}\{x_i = x_j\}\Big)$$
→ Target: posterior distribution of $\theta$
Parameter posterior distribution
MCMC methods: Hastings-Metropolis algorithm
• Proposal: $\theta' \sim p(\theta' \mid \theta^{(t)})$
• $\theta^{(t+1)} = \theta'$ with probability
$$\min\left\{1,\ \frac{\frac{1}{Z_{\theta'}}\, q_{\theta'}(X)\, \pi(\theta')\, p(\theta^{(t)} \mid \theta')}{\frac{1}{Z_{\theta^{(t)}}}\, q_{\theta^{(t)}}(X)\, \pi(\theta^{(t)})\, p(\theta' \mid \theta^{(t)})}\right\}$$
  and $\theta^{(t+1)} = \theta^{(t)}$ otherwise
→ The ratio involves the intractable normalizing constants $Z_{\theta'}$ and $Z_{\theta^{(t)}}$
ABC: Approximate Bayesian Computation
• Bayesian inference without using the likelihood
• Idea: data sets that are sufficiently close yield similar parameter posterior distributions
• What we need:
  • Simulate data given parameter values
  • Summary statistics (sufficient)
  • A measure of closeness between our data ($X^0$) and simulated data ($X^{i*}$): a distance between summary statistics
ABC algorithm
• Sufficient statistic: $S(X) = \sum_{(i,j):\, i \sim j} \mathbb{1}\{x_i = x_j\}$
• Distance: $d(S(X^0), S(X^{i*})) = (S(X^0) - S(X^{i*}))^2$
• Algorithm:
  • Generate $\theta^{i*} \sim \pi(\theta^{i*})$
  • Generate $(X^{i*} \mid \theta^{i*}) \sim \mathrm{MRF}(\theta^{i*})$
  • Calculate $d_i = d(S(X^0), S(X^{i*}))$
  • Accept $\theta^{i*}$ if $d_i < \varepsilon$
• Result: sample of independent draws from $f(\theta \mid d < \varepsilon)$
→ Good approximation of $f(\theta \mid X^0)$
• In practice, $\varepsilon$ is the 1% quantile of the $d_i$
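The rejection steps listed above can be sketched in Python as follows. This is only an illustration under several assumptions that are not in the slides: two states per site, a uniform prior on $[0, \theta_{\max}]$, and a Gibbs sampler run for a fixed number of sweeps as an approximate way to draw from the Markov random field. All function names are hypothetical. Because $S$ is discrete, the sketch accepts when $d_i \le \varepsilon$ so that ties at the quantile are not all rejected.

```python
import numpy as np

def build_neighbours(n_sites, edges):
    # Adjacency lists from an array of (i, j) pairs.
    nbrs = [[] for _ in range(n_sites)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    return [np.array(nb, dtype=int) for nb in nbrs]

def sufficient_stat(x, edges):
    # S(x) = number of neighbouring pairs in the same state.
    return int(np.sum(x[edges[:, 0]] == x[edges[:, 1]]))

def simulate_mrf(theta, nbrs, n_sweeps, rng):
    # Approximate draw via Gibbs sampling (assumption: n_sweeps is long enough).
    n = len(nbrs)
    x = rng.integers(0, 2, size=n)
    for _ in range(n_sweeps):
        for i in range(n):
            n1 = int(x[nbrs[i]].sum())   # neighbours in state 1
            n0 = len(nbrs[i]) - n1       # neighbours in state 0
            # P(x_i = 1 | rest) = e^{theta n1} / (e^{theta n1} + e^{theta n0})
            p1 = 1.0 / (1.0 + np.exp(theta * (n0 - n1)))
            x[i] = int(rng.random() < p1)
    return x

def abc_rejection(x_obs, edges, n_sim=500, quantile=0.01,
                  theta_max=2.0, n_sweeps=20, seed=0):
    rng = np.random.default_rng(seed)
    nbrs = build_neighbours(len(x_obs), edges)
    s_obs = sufficient_stat(x_obs, edges)
    thetas = rng.uniform(0.0, theta_max, size=n_sim)  # assumed uniform prior
    dists = np.empty(n_sim)
    for k, theta in enumerate(thetas):
        x_sim = simulate_mrf(theta, nbrs, n_sweeps, rng)
        dists[k] = (s_obs - sufficient_stat(x_sim, edges)) ** 2
    eps = np.quantile(dists, quantile)  # tolerance set as a quantile of d
    return thetas[dists <= eps], eps

# Toy usage: a chain of 20 sites with a fully homogeneous observation.
edges = np.array([(i, i + 1) for i in range(19)])
x_obs = np.zeros(20, dtype=int)
accepted, eps = abc_rejection(x_obs, edges, n_sim=200, n_sweeps=10)
```

The accepted values in `accepted` play the role of the approximate posterior sample; the quality of the approximation depends on the tolerance and on how well the Gibbs sampler has mixed.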
Bayesian hierarchical modelling
1 model ↔ 1 neighbourhood / 3D structure
• Prior distributions:
  • $s \sim \pi(s)$
  • $(\theta_s \mid s) \sim \pi_s(\theta_s)$
• Likelihood: $(X \mid \theta_s, s) \sim \mathrm{MRF}(\theta_s, s)$
$$f_s(x \mid \theta_s) = \frac{1}{Z_{\theta_s, s}} \exp\Big(\theta_s \sum_{(i,j):\, i \overset{s}{\sim} j} \mathbb{1}\{x_i = x_j\}\Big)$$
Bayes factor definition
• Posterior odds:
$$\frac{P(s = 0 \mid X)}{P(s = 1 \mid X)} = \frac{\int f_0(X \mid \theta_0)\, \pi_0(\theta_0)\, d\theta_0}{\int f_1(X \mid \theta_1)\, \pi_1(\theta_1)\, d\theta_1} \times \frac{P(s = 0)}{P(s = 1)} = BF_{0/1} \times \frac{P(s = 0)}{P(s = 1)}$$
• Interpretation:
  • BF > 1: favours Model 0
  • BF < 1: favours Model 1
• Jeffreys scale:
    Evidence       BF favouring Model 1        BF favouring Model 0
    decisive       < 10^-2                     > 10^2
    very strong    [10^-2, 10^-3/2]            [10^3/2, 10^2]
    strong         [10^-3/2, 10^-1]            [10^1, 10^3/2]
    substantial    [10^-1, 10^-1/2]            [10^1/2, 10^1]
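As a small illustration of how a Bayes factor is read against the Jeffreys scale and turned into a posterior probability, here is a hedged Python sketch. The function names are hypothetical. The lowest category (between $10^{-1/2}$ and $10^{1/2}$) is not listed on the slide; the label used for it below comes from the usual statement of Jeffreys' scale. The treatment of values falling exactly on an interval boundary is an arbitrary choice.

```python
import math

def jeffreys_category(bf):
    """Return (favoured model, strength label) for a Bayes factor BF_{0/1}."""
    if bf <= 0:
        raise ValueError("A Bayes factor must be positive")
    log_bf = math.log10(bf)
    # BF > 1 favours Model 0, BF < 1 favours Model 1, BF = 1 favours neither.
    favoured = None if log_bf == 0 else (0 if log_bf > 0 else 1)
    strength = abs(log_bf)
    if strength > 2:
        label = "decisive"
    elif strength > 1.5:
        label = "very strong"
    elif strength > 1:
        label = "strong"
    elif strength > 0.5:
        label = "substantial"
    else:
        # Not on the slide: lowest category of Jeffreys' scale.
        label = "not worth more than a bare mention"
    return favoured, label

def posterior_prob_model0(bf, prior0=0.5):
    # Posterior odds = BF_{0/1} * prior odds, then convert odds to a probability.
    odds = bf * prior0 / (1.0 - prior0)
    return odds / (1.0 + odds)
```

For example, `jeffreys_category(0.05)` places the evidence in the "strong" band in favour of Model 1, since $\log_{10} 0.05 \approx -1.3$.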
Another way to write the Bayes factor
• Distribution of the sufficient statistic under model $i$:
$$P(S_i(X) = s \mid \theta_i) = \sum_{X:\, S_i(X) = s} f_i(X \mid \theta_i) = \frac{1}{Z_{\theta_i, i}} \exp(\theta_i s)\, \mathrm{card}\{X : S_i(X) = s\}$$
• Bayes factor:
$$BF_{0/1} = \frac{\int P(S_0(X) = s_0 \mid \theta_0)\, \pi_0(\theta_0)\, d\theta_0}{\int P(S_1(X) = s_1 \mid \theta_1)\, \pi_1(\theta_1)\, d\theta_1} \times \frac{\mathrm{card}\{X : S_1(X) = s_1\}}{\mathrm{card}\{X : S_0(X) = s_0\}}$$
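The identity relating the likelihood to the distribution of the sufficient statistic can be checked numerically on a toy example. The Python sketch below enumerates all configurations of a small two-state field and verifies that $f(X \mid \theta) = P(S(X) = s \mid \theta) / \mathrm{card}\{X : S(X) = s\}$. The 4-site chain and the value of $\theta$ are illustrative assumptions, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import exp, isclose

def stat(x, edges):
    # S(x) = number of neighbouring pairs in the same state.
    return sum(x[i] == x[j] for i, j in edges)

def stat_distribution(n_sites, edges, theta, n_states=2):
    configs = list(product(range(n_states), repeat=n_sites))
    # card[s] = number of configurations with S(x) = s
    card = Counter(stat(x, edges) for x in configs)
    # Z = sum_x exp(theta S(x)) = sum_s card[s] exp(theta s)
    Z = sum(exp(theta * s) * c for s, c in card.items())
    p_stat = {s: exp(theta * s) * c / Z for s, c in card.items()}
    return configs, card, Z, p_stat

def check_identity(n_sites, edges, theta):
    configs, card, Z, p_stat = stat_distribution(n_sites, edges, theta)
    for x in configs:
        s = stat(x, edges)
        likelihood = exp(theta * s) / Z  # f(x | theta)
        if not isclose(likelihood, p_stat[s] / card[s]):
            return False
    return True

# Assumed toy neighbourhood: a chain of 4 sites.
edges = [(0, 1), (1, 2), (2, 3)]
configs, card, Z, p_stat = stat_distribution(4, edges, theta=0.7)
ok = check_identity(4, edges, theta=0.7)
```

Since the likelihood depends on $X$ only through $S(X)$, the cardinality terms are the only place where the two neighbourhood structures enter the Bayes factor beyond the distribution of the statistic itself.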