Searching for Transition Paths (in Protein Folding)
Henri Orland
IPhT, CEA-Saclay, France
Tuesday 13 May 2014

Outline
• The Folding Path problem
• Langevin dynamics and Path integral representation
• Dominant paths
• Hamilton-Jacobi representation
• Langevin Bridges
• short time approximation
• exact numerical solution

1. What is a Protein
Biological Polymers (biopolymers): Proteins, Nucleic Acids (DNA and RNA), Polysaccharides
• catalytic activity: enzymes
• transport of ions: hemoglobin (O$_2$), ion channels
• motor proteins
• shell of viruses (influenza, HIV, etc.)
• prions
• food, etc.

Proteins exist under 2 forms
• Folded or Native: globular, unique conformation, biologically active
• Unfolded: random coil, biologically inactive
• Note that a globular polymer has an extensive entropy: $\mathcal{N} = \mu^N$ conformations

HIV protease (199 residues)

The Protein Folding problem
• A sequence of amino acids is given by the biologists.
• What is the 3D shape of the corresponding protein?
• To study this problem, try Molecular Dynamics: Karplus, Levitt and Warshel, Nobel Prize in Chemistry 2013

Parametrization (CHARMM, AMBER, OPLS, …)
$$E = \sum_{\text{bonds}} k_b (b - b_0)^2 + \sum_{\text{valence angles}} k_\theta (\theta - \theta_0)^2 + \sum_{\text{dihedrals}} k_\phi \left( 1 + \cos(n\phi - \delta) \right) + \sum_{\text{impropers}} k_\omega (\omega - \omega_0)^2 + \sum_{i<j} 4 \epsilon_{ij} \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right] + \sum_{i<j} \frac{332\, q_i q_j}{\epsilon\, r_{ij}}$$
in kcal/mol
Use Newton or Langevin dynamics
$$m_i \ddot{r}_i + \gamma_i \dot{r}_i + \frac{\partial E}{\partial r_i} = \zeta_i(t)$$
where $\zeta_i(t)$ is a Gaussian noise satisfying the fluctuation-dissipation theorem:
$$\langle \zeta_i(t)\, \zeta_j(t') \rangle = 2 \gamma_i k_B T\, \delta_{ij}\, \delta(t - t')$$

Why it does not work (yet?)?
• To discretize the equations, one must use time steps of the order of $10^{-15}$ s
• Large number of degrees of freedom (a few thousand) plus a few thousand water molecules
• Force fields not necessarily adapted to folding
• Longest runs: around 1 µs << folding time (1 ms to 1 s)
• Recently, runs of 1 ms on short proteins
• Many metastable states and high barriers

The problem of protein structure prediction is too complicated.
Simpler problem: How do proteins fold? How do they go from the Unfolded to the Native State?

• In given denaturant conditions, a protein spends a fraction of its time in the native state and a fraction of its time in the denatured state.
[Figure (b): E versus time, with the folded (F) and unfolded (U) levels connected by a transition path (TP)]

Denaturation curves
[Figure: native fraction (0 to 1) versus denaturant concentration (0 to 10)]
In given denaturant conditions, a fraction of the proteins are native, and the rest are denatured.

The Folding Pathway Problem
• The problem: assume a protein can go from state A to state B. Which pathway (or family of pathways) does the protein take? What do the trajectories from A to B look like?

Motivation from single molecule experiments
• Examples:
  • from denatured to native in native conditions
  • allosteric transition between A and B
Difficulty: looking for rare events
Can one describe these reactions in terms of a small set of dominant trajectories with fluctuations around them?

Langevin dynamics
• The case of one particle in a potential $U(x)$ at temperature $T$
• Use Langevin dynamics
$$m \frac{d^2 x}{dt^2} + \gamma \frac{dx}{dt} + \frac{\partial U}{\partial x} = \zeta(t)$$
• where $\gamma$ is the friction and $\zeta(t)$ is a random noise
$$\langle \zeta(t)\, \zeta(t') \rangle = 2 k_B T \gamma\, \delta(t - t')$$

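Overdamped Langevin dynamics
• At large enough time scale, the mass term is negligible
$$m \omega^2 \approx \gamma \omega, \qquad \tau \approx \frac{2 \pi m}{\gamma}, \qquad \gamma = \frac{k_B T}{D}$$
$$m \approx 5 \cdot 10^{-26}\ \text{kg}, \qquad D = 10^{-5}\ \text{cm}^2/\text{s}, \qquad \tau \approx 10^{-13}\ \text{s}$$
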
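As a quick sanity check of the orders of magnitude on this slide, the sketch below evaluates $\gamma = k_B T / D$ and $\tau \approx 2\pi m / \gamma$ in SI units. The temperature of 300 K is an assumption; the slide does not state it.

```python
import math

k_B = 1.380649e-23      # Boltzmann constant, J/K
T = 300.0               # ASSUMED temperature in K (not given on the slide)
m = 5e-26               # mass from the slide, kg
D = 1e-5 * 1e-4         # 1e-5 cm^2/s converted to m^2/s

gamma = k_B * T / D             # friction coefficient, kg/s
tau = 2 * math.pi * m / gamma   # time scale below which inertia matters, s

print(f"gamma = {gamma:.2e} kg/s, tau = {tau:.2e} s")
```

With these inputs the result is of the order of $10^{-13}$ s, in line with the value quoted on the slide.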
• Take overdamped Langevin (Brownian) dynamics
$$\frac{dx}{dt} = -\frac{1}{\gamma} \frac{\partial U}{\partial x} + \eta(t)$$
• with Gaussian noise:
$$\langle \eta(t)\, \eta(t') \rangle = \frac{2 k_B T}{\gamma}\, \delta(t - t')$$
• $\gamma$ is the friction coefficient; $D = \frac{k_B T}{\gamma}$ is the diffusion coefficient

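The overdamped equation can be integrated numerically with a simple Euler-Maruyama scheme, in which each time step adds a Gaussian increment of variance $2 D\, dt$ with $D = k_B T / \gamma$. The following is a minimal sketch, not the method used in the talk; the function name `simulate_overdamped`, the harmonic test potential and all parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_overdamped(grad_U, x0, gamma, kT, dt, n_steps, rng):
    """Euler-Maruyama integration of dx/dt = -U'(x)/gamma + eta(t) in 1D."""
    D = kT / gamma
    x = np.empty(n_steps + 1)
    x[0] = x0
    # Integrated noise over one step: Gaussian with variance 2*D*dt
    noise = rng.normal(0.0, np.sqrt(2.0 * D * dt), size=n_steps)
    for n in range(n_steps):
        x[n + 1] = x[n] - grad_U(x[n]) / gamma * dt + noise[n]
    return x

# Assumed test case: harmonic well U(x) = k x^2 / 2, whose Boltzmann
# distribution has variance kT / k.
k, gamma, kT = 1.0, 1.0, 0.5
rng = np.random.default_rng(0)
traj = simulate_overdamped(lambda x: k * x, 0.0, gamma, kT, 1e-2, 200_000, rng)
var = traj[20_000:].var()   # discard the first 10% as equilibration
print(f"sampled variance = {var:.3f}, Boltzmann value = {kT / k:.3f}")
```

The sampled variance should approach $k_B T / k$ up to statistical and time-step errors, which is one way to check that the noise amplitude is consistent with the fluctuation-dissipation relation.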
• Equation of motion is a stochastic equation
• The probability to find the particle at point $x$ at time $t$ is given by a Fokker-Planck equation
$$\frac{\partial}{\partial t} P(x,t) = D \frac{\partial}{\partial x} \left( \frac{1}{k_B T} \frac{\partial U}{\partial x} P(x,t) + \frac{\partial P(x,t)}{\partial x} \right)$$
with $P(x, 0) = \delta(x - x_i)$

• The Fokker-Planck equation looks very much like a Schrödinger equation, except for the 1st order derivative. Define (with $\beta = 1/k_B T$)
$$P(x,t) = e^{-\beta U(x)/2}\, Q(x,t)$$
• The function $Q(x,t)$ satisfies an imaginary time Schrödinger equation with a Hamiltonian $H$
$$-\frac{\partial Q}{\partial t} = H Q$$

• where $H$ is a “quantum” Hamiltonian given by
$$H = \frac{k_B T}{\gamma} \left[ -\nabla^2 + \frac{1}{4} \left( \frac{\nabla U}{k_B T} \right)^2 - \frac{\nabla^2 U}{2 k_B T} \right]$$
$$P(x_f, t_f | x_i, t_i) = e^{-\frac{U(x_f) - U(x_i)}{2 k_B T}}\, \langle x_f | e^{-(t_f - t_i) H} | x_i \rangle$$
• Spectral decomposition
$$\langle x_f | e^{-(t_f - t_i) H} | x_i \rangle = \sum_\alpha e^{-(t_f - t_i) E_\alpha}\, \Psi_\alpha(x_f)\, \Psi_\alpha(x_i)$$
$$H \Psi_\alpha(x) = E_\alpha \Psi_\alpha(x)$$

• At large time, the matrix element is dominated by the ground state
$$\Psi_0(x) = \frac{e^{-\beta U(x)/2}}{\sqrt{Z}}, \qquad Z = \int dx\, e^{-\beta U(x)}$$
with $H \Psi_0 = 0$, so that
$$P(x_f, t_f | x_i, t_i) \approx \frac{e^{-\beta U(x_f)}}{Z} + e^{-\beta \frac{U(x_f) - U(x_i)}{2}}\, e^{-(t_f - t_i) E_1}\, \Psi_1(x_f)\, \Psi_1(x_i)$$
$\tau = E_1^{-1}$ is the reaction time

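The statements $H \Psi_0 = 0$ and $\tau = E_1^{-1}$ can be checked numerically by discretizing $H = -D\, \partial_x^2 + \frac{1}{4 \gamma k_B T} \left( U'^2 - 2 k_B T\, U'' \right)$ on a grid and diagonalizing it. This is a rough sketch under assumptions: the finite-difference scheme, the function name `fp_hamiltonian_spectrum`, the grid and the harmonic test potential $U = k x^2 / 2$ (for which the exact levels are $E_n = n k / \gamma$) are illustrative choices, not part of the talk.

```python
import numpy as np

def fp_hamiltonian_spectrum(dU, d2U, kT, gamma, x, n_levels=3):
    """Lowest eigenvalues of H = -D d^2/dx^2 + V_eff(x)/kT on a uniform grid x."""
    D = kT / gamma
    h = x[1] - x[0]
    v_eff = (dU(x) ** 2 - 2.0 * kT * d2U(x)) / (4.0 * gamma)
    # Second-order finite differences with Dirichlet boundaries
    off_diagonal = np.full(x.size - 1, -D / h**2)
    H = (np.diag(2.0 * D / h**2 + v_eff / kT)
         + np.diag(off_diagonal, 1)
         + np.diag(off_diagonal, -1))
    return np.linalg.eigvalsh(H)[:n_levels]   # eigenvalues in ascending order

# Assumed test case: harmonic potential U(x) = k x^2 / 2
k, kT, gamma = 1.0, 0.5, 1.0
x = np.linspace(-8.0, 8.0, 801)
E = fp_hamiltonian_spectrum(lambda x: k * x, lambda x: k * np.ones_like(x),
                            kT, gamma, x)
tau = 1.0 / E[1]   # relaxation (reaction) time from the first excited level
print("lowest levels:", E, "tau =", tau)
```

For a double-well potential the same routine would give a small $E_1$, reflecting the slow barrier crossing, but the grid and box size would need to be adapted to the potential.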
• Stationary distribution: the Boltzmann distribution
$$\lim_{t \to +\infty} P(x,t) = P(x) \sim \exp\left( -U(x) / k_B T \right)$$
• General form: Path Integral
$$P(x_f, t_f | x_i, t_i) = e^{-\frac{U(x_f) - U(x_i)}{2 k_B T}} \int_{x_i}^{x_f} \mathcal{D}x(\tau)\, e^{-S_{\text{eff}}[x] / k_B T}$$
• Boundary conditions: $x(t_i) = x_i$, $x(t_f) = x_f$

Path Integral representation
$$P(x_f, t_f | x_i, t_i) = e^{-\frac{U(x_f) - U(x_i)}{2 k_B T}} \int_{x_i}^{x_f} \mathcal{D}x(\tau)\, e^{-S_{\text{eff}}[x] / k_B T}$$
• The effective action is given by
$$S_{\text{eff}}[x] = \int_{t_i}^{t_f} dt \left( \frac{\gamma}{4} \dot{x}^2 + V_{\text{eff}}[x(t)] \right)$$
• and the effective potential is given by
$$V_{\text{eff}}[x] = \frac{1}{4 \gamma} \left( (\nabla U)^2 - 2 k_B T\, \nabla^2 U \right)$$
• One could do MC sampling, but it is difficult and not very efficient

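To make the effective action concrete, the sketch below evaluates a time-discretized version of $S_{\text{eff}}$ for a 1D path sampled at equal time steps. The function name `effective_action`, the midpoint evaluation of $V_{\text{eff}}$ and the free-particle test case are assumptions made for illustration.

```python
import numpy as np

def effective_action(path, dt, dU, d2U, kT, gamma):
    """Discretized S_eff = sum_n dt * (gamma/4 * xdot_n^2 + V_eff(x_n)) in 1D."""
    path = np.asarray(path, dtype=float)
    xdot = np.diff(path) / dt                 # velocity on each time slice
    x_mid = 0.5 * (path[1:] + path[:-1])      # V_eff evaluated at slice midpoints
    v_eff = (dU(x_mid) ** 2 - 2.0 * kT * d2U(x_mid)) / (4.0 * gamma)
    return float(np.sum(dt * (0.25 * gamma * xdot**2 + v_eff)))

# Assumed test case: free particle (U = 0), so only the kinetic term contributes
zero = lambda x: np.zeros_like(x)
straight = np.linspace(0.0, 1.0, 101)                    # x from 0 to 1 in unit time
wiggly = straight + 0.05 * np.sin(2 * np.pi * straight)  # same end points
S_straight = effective_action(straight, 0.01, zero, zero, kT=0.5, gamma=1.0)
S_wiggly = effective_action(wiggly, 0.01, zero, zero, kT=0.5, gamma=1.0)
print(S_straight, S_wiggly)
```

For the free particle the straight path has the smaller action, so it carries the larger weight $e^{-S_{\text{eff}} / k_B T}$; this is the idea behind looking for dominant paths.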
Saddle-Point method: WKB approximation
Work in collaboration with P. Faccioli, F. Pederiva, M. Sega (University of Trento)
To compute the path integral, look for paths which have the largest weight: semi-classical approximation.

• Dominant trajectories: classical trajectories in the inverted potential
$$\frac{\gamma}{2} \frac{d^2 x}{dt^2} = -\frac{\partial \left( -V_{\text{eff}}[x] \right)}{\partial x}$$
• with correct boundary conditions: $x(t_i) = x_i$, $x(t_f) = x_f$
• Problem: one does not know the transition time. The inverse folding rate is equal to the mean first passage time (the first passage time itself is distributed).

[Figure: the potential $U(x) = x^2 \left( 5 (x - 1)^2 - 0.5 \right)$ plotted for $x$ from $-0.5$ to $2$]
[Figure: two panels showing $V_{\text{eff}}(x) = U'(x)^2 / 2 - T\, U''(x)$ at $T = 0$ and at $T = 0.5$, with points marked N and ∗]

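The curves on this slide can be reproduced from the two formulas it gives. The sketch below builds $U(x)$ as a polynomial, differentiates it, and evaluates $V_{\text{eff}}(x) = U'(x)^2 / 2 - T\, U''(x)$ in the units used on the slide. The helper name `v_eff` and the way the minima are located are illustrative choices.

```python
import numpy as np
from numpy.polynomial import Polynomial

# U(x) = x^2 (5 (x - 1)^2 - 0.5) expanded: 4.5 x^2 - 10 x^3 + 5 x^4
U = Polynomial([0.0, 0.0, 4.5, -10.0, 5.0])
dU, d2U = U.deriv(1), U.deriv(2)

def v_eff(x, T):
    """Effective potential in the slide's units: U'(x)^2 / 2 - T U''(x)."""
    return 0.5 * dU(x) ** 2 - T * d2U(x)

# Minima of U: real roots of U' where U'' is positive
minima = sorted(r.real for r in dU.roots()
                if abs(r.imag) < 1e-9 and d2U(r.real) > 0)
print("minima of U:", minima)
print("V_eff at the minima, T = 0.5:", [v_eff(xm, 0.5) for xm in minima])
```

At $T = 0$ the effective potential is non-negative and vanishes at every stationary point of $U$; at finite $T$ the term $-T\, U''$ makes it negative at the minima of $U$.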
[Figure: $V_{\text{eff}}(x)$ at $T = 0.02$ for $x$ from $-0.2$ to $1.2$, with the Denatured and Native states labeled]

[Figure: two panels at $T = 0.5$ and $T = 0.02$, with points marked N and ∗]
Conserved energy
$$E = \frac{\gamma}{4} \dot{x}^2 - V_{\text{eff}}(x)$$

• Solution: go from the time-dependent Newtonian dynamics to the energy-dependent Hamilton-Jacobi description
$$S_{\text{eff}}[x] = \int_{t_i}^{t_f} dt \left( \frac{\gamma}{4} \dot{x}^2 + V_{\text{eff}}[x(t)] \right)$$
• For classical trajectories
$$E_{\text{eff}} = \frac{\gamma}{4} \dot{x}^2 - V_{\text{eff}}[x]$$
$$S_{\text{eff}}[x] = -E_{\text{eff}} (t_f - t_i) + \int_{x_i}^{x_f} \sqrt{\gamma \left( E_{\text{eff}} + V_{\text{eff}}[x] \right)}\, dx$$

• The method: minimize the Hamilton-Jacobi action
$$S_{HJ} = \int_{x_i}^{x_f} dl\, \sqrt{2 \left( E_{\text{eff}} + V_{\text{eff}}[x(l)] \right)}$$
• over all paths joining $x_i$ to $x_f$, where $dl$ is an infinitesimal displacement along the path. $E_{\text{eff}}$ is a free parameter which determines the total time elapsed during the transition.
• The total time is determined by
$$t_f - t_i = \int_{x_i}^{x_f} dl\, \frac{1}{\sqrt{2 \left( E_{\text{eff}} + V_{\text{eff}}[x(l)] \right)}}$$
which determines the folding time

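Both integrals on this slide can be discretized over a path given as a list of points. The sketch below assumes the convention implied by the square roots above, namely a conserved quantity $E_{\text{eff}} = \dot{x}^2 / 2 - V_{\text{eff}}$, so that the speed along the path is $\sqrt{2 (E_{\text{eff}} + V_{\text{eff}})}$. The function name `hj_action_and_time` and the constant-potential test case are hypothetical; a real application would minimize the returned action over the intermediate points of the path.

```python
import numpy as np

def hj_action_and_time(path, v_eff, E_eff):
    """Discretized S_HJ and elapsed time for a path given as an (n, d) array.

    v_eff maps an (m, d) array of points to an array of m potential values.
    E_eff + v_eff must stay positive along the path.
    """
    path = np.asarray(path, dtype=float)
    dl = np.linalg.norm(np.diff(path, axis=0), axis=1)   # segment lengths
    mid = 0.5 * (path[1:] + path[:-1])                   # segment midpoints
    speed = np.sqrt(2.0 * (E_eff + v_eff(mid)))          # |xdot| on each segment
    return float(np.sum(dl * speed)), float(np.sum(dl / speed))

# Assumed test case: V_eff = 0 and E_eff = 2, so the speed is 2 everywhere
flat = lambda pts: np.zeros(len(pts))
path = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])    # total length 3
S_hj, elapsed = hj_action_and_time(path, flat, E_eff=2.0)
_, elapsed_fast = hj_action_and_time(path, flat, E_eff=8.0)
print(S_hj, elapsed, elapsed_fast)
```

Increasing $E_{\text{eff}}$ shortens the elapsed time for the same geometric path, which illustrates how this single parameter fixes the transition time.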
• The total time is determined by the trajectory and by the energy $E_{\text{eff}}$
• $E_{\text{eff}}$ is not the true energy of the system
$$E = \frac{\gamma}{4} \dot{x}^2 - V_{\text{eff}}(x)$$
• If the final state is an (almost) equilibrium state, then the system should spend a maximum time there: $\dot{x}_f = 0$, hence
$$E_{\text{eff}} = -V_{\text{eff}}(x_f),$$
which at a minimum of $U$ is proportional to $U''(x_f)$

• The HJ method is much more efficient than Newtonian mechanics because proteins spend most of their time trying to overcome energy barriers.
• No waiting times in HJ: work with fixed interval length $dl$