3rd CP2K tutorial: Enabling the Power of Imagination in MD Simulations June 17-21 2013, Zürich Sampling Free Energy Surfaces by MD Marcella Iannuzzi � Department of Chemistry, University of Zurich http://www.cp2k.org
OUTLINE Free energy in statistical mechanics � Free energy difference by improving the sampling along the evolution order parameters � Enhanced exploration of the configuration space and disclosure of mechanisms of transformation 2
Complex Processes by MD Choose a suitable model of the system Determine the thermodynamic conditions ⇒ Ensemble averages Equilibrium sampling of physical quantities O-O RDF Cl-H 1.5 2 2.5 3 3.5 4 4.5 5 Predictive power frustrated by sampling fast degrees of freedom with time-steps from < 0.1 fs (CPMD) up to 1 fs (MM) ~ few ns 3
Rare Events Phase Transitions, Conformational Rearrangements, Chemical Reactions, Nucleation, Diffusion, Growth, etc. Minimum energy pathways Activation Energies Δ Exploration of configurational space Complex and high dimensional Intrinsically multidimensional configurational space order parameter Multitude of unknown Entropic bottlenecks intermediates and products � Unforeseen processes, many Diffusive trajectories irrelevant transition states 4
Hamiltonian MD A system of N particles in a volume V is completely determined through its Hamiltonian P 2 � I H ( { R I } , { P I } ) = + U ( R I } ) 2 M I I NVE-P total energy and linear momentum are constant of motion Choice of ensemble: portion of phase space, microstates Difference in averages and fluctuations Counters, blue on one side and green on the other, 6 × 6 checkerboard Each pattern is a microstate Subset belonging to macrostate “15 green” 5
Canonical Partition Function The Laplace transform of the density of state Z Q ( N, V, T ) = exp( − β E ) Ω ( N, V, E ) dE Probability of the macrostate at a given T 1 Z 1 d r N d p N = − β H ( r N , p N ) ⇥ ⇤ Q ( N, V, T ) = exp Λ ( β ) 3 N N ! Z ( N, V, T ) N ! h 3 N one dimensional integral over E replaced by configurational integral analytic kinetic part is integrated out configurational partition � e − β U ( r ) d r Z ( N, V, T ) = function 6
Free Energy Helmholtz free energy or thermodynamic potential A = − 1 β ln Q ( N, V, T ) Thermodynamics Statistical Mechanics � Z 1 ⇥ ∆ A = − 1 entropic and enthalpic β ln contributions Z 0 � Macroscopic state 0 corresponds to a portion of e − β H ( r , p ) d r d p Q 0 ∝ the phase space : Γ 0 Γ 0 � Macroscopic state 0 corresponds to H 0 e − β H 0 ( r , p ) d r d p Q 0 ∝ Γ � Macroscopic state 0 corresponds to a value of a e − β 0 H ( r , p ) d r d p Q 0 ∝ macroscopic parameter, e.g T Γ 7
Perturbation formalism Reference (0) and target system (1) H 1 ( r , p ) = H 0 ( r , p ) + ∆ H ( r , p ) e − β H 0 ( r , p ) Probability of finding (0) in configuration ( r , p ) P 0 ( r , p ) = e H ( r , p ) d r d p � Free energy difference e − β H 1 d r N d p N e − β H 0 e − β ∆ H d r N d p N R R ∆ A = − 1 e − β H 0 d r N d p N = − 1 β ln β ln R R e − β H 0 d r N d p N ∆ A = − 1 − β ∆ H ( r N , p N ) ⌦ ⇥ ⇤↵ β ln exp 0 Integrating out the analytic kinetic part ⇥ F ( r , p ) ⇤ 1 = ⇥ F e − β ∆ U ⇤ 0 ∆ A 0 , 1 = � 1 β ln ⇥ e − β ∆ U ⇤ 0 ⇥ e − β ∆ U ⇤ 0 8
Limitations ∆ A = − 1 Z β ln exp [ − β ∆ U ] P 0 ( ∆ U ) d ∆ U 2.4 Shifted function exp( − bD U ) 2.0 Low- 𝚬 U tail is poorly sampled P 0 ( D U ) 1.6 low statistical accuracy P ( D U ) 1.2 but important contribution to 𝚬 A 0.8 P 0 ( D U )x exp( − b D U ) } 0.4 0.0 − 1 − 0.8 − 0.6 − 0.4 − 0.2 0 0.2 0.4 0.6 D U Accuracy ⇒ target and reference systems are similar ⇒ overlapping regions 1 1 1 0 0 0 0 0 0 Γ Γ Γ insufficient statistics or incomplete overlap ⇒ enhanced sampling 9
Alchemical Transformations Protein ligand binding, host-guest chemistry, solvation properties, ... D A 1 Transformation (0) → (1) as series of L 1 L 1 + R non-physical, intermediate states along a pathway characterized by the D A 3 D A 4 “coupling parameter” λ L 2 L 2 + R D A 2 Free energy as continuous function of λ through H( λ ) H ( λ ) = H env + λ H 0 + (1 − λ ) H 1 Point mutation of alanine into serine: coexistence without seeing each other � Interaction with side chain tuned through λ A. E. Mark, Encyclopaedia of computational chemistry , 2, 1070 (1998) 10
Order Parameters Variables chosen to describe changes in the system Reaction coordinate : the order parameter corresponds to the pathway along which the transformation occur in nature � Collective variable : fully represented as function of coordinates � Indicating intermediate stages of the transformation: mutation point torsion angle annihilation non-bonded Different possible definitions of OP Set up of system with desired � values of OP � Effects on accuracy and efficiency of Δ A calculations Smoothness of the simulated path 11
Extended Ensemble ˆ ξ i ( r N ) Select parameters, continuous functions of coordinates Density of States Z ⇣ ⌘ Π i δ [ˆ δ [ U ( r N ) − E ] ξ i ( r N ) − ξ i ] d r N Ω ξ ( N, V, E, ξ ) = ξ = { ξ i } Canonical Partition Function Z e − β U ( r N ) ⇣ ⌘ Π i δ [ˆ ξ i ( r N ) − ξ i ] d r N Q ξ ( N, V, T, ξ ) = A ξ = − 1 Free Energy β ln Q ξ ˆ ξ i ( r N ) must distinguish among metastable states � select specific configurations in the partition function 12
Stratification Scheme Free energy butane isomerization Probability distribution of the order parameter 7 Not uniform sampling 6 ∆ A ( ξ ) = A ( ξ 1 ) − A ( ξ 0 ) = − β − 1 ln P ( ξ 1 ) 5 P ( ξ 0 ) A (kcal/mol) 4 3 Histogram of M bins δξ = ( ξ 1 − ξ 0 ) /M 2 1 f i 0 P ( ξ 0 + ( i − 0 . 5) δξ ) = - 60 0 60 120 180 240 P j f j C-C-C-C Θ (degrees) Restrain the system within a window by harmonic potential � Overlapping windows � τ = L τ w ∝ ( ξ 1 − ξ 0 ) 2 Efficient sampling LD ξ � Reconstruct the full probability by matching 13
Importance Sampling Non-Boltzmann sampling to enhance the probability of important regions positive bias w (ˆ ξ ( r N )) exp[ − β U ( r N )] δ [ ξ − ˆ ξ ( r N )] d r N R P ( w ) ( ξ , T ) = w( ξ ) w (ˆ R ξ ( r N )) exp[ − β U ( r N )] d r N Free energy differences ∆ A ( w ) ( ξ ) = − β − 1 ln w ( ξ 0 ) P ( w ) ( ξ , T ) ln P ( w ) ( ξ , T ) � w ( ξ ) P ( w ) ( ξ 0 , T ) = − β − 1 P ( w 0 ) ( ξ , T ) + ln w ( ξ 0 ) − ln w ( ξ ) Biasing potential U ( b ) ( r N ) = U ( r N ) + V ( ξ ( r N )) w ( ξ ) = exp[ − β V ( ξ )] Umbrella Potential connecting different regions of the phase space 14
Umbrella Sampling Modify the underlying potential to obtain a uniform sampling : V b (s) = - Δ A(s) ∆ A ( s ) good V b bad V b Free Energy Free Energy − V b ( s ) U ( p ) ( x ) = U ( x ) + V b ( s ( x )) First guess ∆ A ( b ) ( s ) & iterative improvement as flat as possible 1 Z P ( b ) ( s ) = dx exp ( − β ( U ( x ) + V b ( s ( x )))) δ ( s − s ( x )) Z ( b ) Z ( b ) exp ( − β V b ( s )) 1 Z Z = dx exp ( − β U ( x )) δ ( s − s ( x )) Z Z Probability in the = Z ( b ) exp ( − β ( V b ( s ) + A ( s ))) presence of the bias & ✓ Z un-biasing A ( s ) = − 1 β ln P ( b ) ( s ) − V b ( s ) − 1 ◆ β ln Z ( b ) 15
Good Coordinates for Pathways Capture the essential physics include all relevant DoF and properly describes the dynamics q distinguishes between A and B but might fail in describing essential aspects of the transition Discriminate configurations of Reversibility � reactants and products and intermediates Fast equilibration of orthogonal � DoF (no relevant bottlenecks) Characterisation of the mechanisms of transition 16
Hypothetical 2D Free Energy Landscape (a) x ′ B Not including A important DoF leads to wrong x interpretation A ( x ) (b) x * x 17
Some simple collective variables Derivable function of the degrees of freedom to which a given value can be assigned Distance | R I − R J | Angle θ ( R I , R J , R k ) Dihedral Θ ( R I , R J , R k , R L ) Difference of distances | R I − R J | − | R J − R K | ⇧ ⌃ � Coordination number Generalised coordination number � ⇥ n � ⇤ ⌅ r ij N L 1 N L 2 1 − 1 ⌥ � r 0 C L 1 L 2 = � ⇥ m � N L 1 r ij 1 − ⇧ ⌃ j =1 i =1 r 0 � Generalised displacement 1 1 D [ klm ] � � L 1 L 2 = d i · ˆ v [ klm ] − d j · ˆ v [ klm ] N L 1 N L 2 i ∈ L 1 j ∈ L 2 18
Path Collective Variables A Knowing end-points or a full approximate path (NEB) RMSD � ⇥� N dist ( d j ( R ) − d (k) ⇥� j ( R )) 2 i ( R i − R (k) ) 2 j R k ( R ) = i R k ( R ) = � N dist N B � � [ R k ( R )] n k c k s ( R ) = � k � Position along the path � � P k=1 k e − λ || S ( R ) − S (k) || 2 � 1 s ( R ) = � P P − 1 k=1 e − λ || S ( R ) − S (k) || 2 � Distance from the path � P ⇥ z ( R ) = 1 e − λ || S ( R ) − S (k) || 2 ⇤ λ ln k=1 19
Dialanine in vacuum competitive paths linear interpolation for the initial path � Ramachandran plot s and z in terms of coordinates � multidimensional path 20 D. Branduardi, F.L. Gervasio, M. Parrinello, J. Chem. Phys. 126 (2007) 054103
Recommend
More recommend