Optimization for Electronic Structure Calculation
Zaiwen Wen
Beijing International Center for Mathematical Research, Peking University
Thanks: Yaxiang Yuan, Xin Liu, Xiao Wang, Xin Zhang, Jinwei Zhu, Aihui Zhou, Michael Ulbrich, Chao Yang
BICMR - PKU, 2014
Electronic Structure Calculation
N-particle Schrödinger equation: physics of material systems (atomic and molecular properties); the almost correct (nonrelativistic) physics is quantum mechanics.
Figure (a): thanks to Hege et al., ZIB Berlin. Figure (b): thanks to Reinhold Schneider.
Numerical simulation of materials on the atomic and molecular scale.
Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014
Electronic Structure Calculation
Main goal: given atomic positions $\{R_\alpha\}_{\alpha=1}^{M}$, compute the ground state electron energy $E_e(\{R_\alpha\})$.
The ground state electron wavefunction $\Psi_e(r_1, \ldots, r_N; \{R_\alpha\})$ satisfies
$$\left( -\frac{1}{2}\sum_{i=1}^{N}\Delta_i - \sum_{\alpha=1}^{M}\sum_{j=1}^{N}\frac{Z_\alpha}{|r_j - R_\alpha|} + \frac{1}{2}\sum_{i,j=1,\, i\neq j}^{N}\frac{1}{|r_i - r_j|} \right)\Psi_e = E_e(\{R_\alpha\})\,\Psi_e$$
Curse of dimensionality: computational work goes as $10^{3N}$, where $N$ is the number of electrons.
Density Functional Theory (DFT)
The unknown is simple: the electron density $\rho$.
Hohenberg-Kohn Theory (1964)
  There is a unique mapping between the ground state energy from the Schrödinger equation and the electron density.
  The exact form of the functional is unknown.
Independent particle model
  Electrons move independently in an average effective potential field.
  Add a correction for correlation.
Best compromise between efficiency and accuracy. Most widely used electronic structure theory for condensed matter systems.
Kohn-Sham Formulation
Replace the many-particle wavefunctions, $\Psi_i$, with single-particle wavefunctions, $\psi_i$.
Write the Kohn-Sham total energy as
$$E_{KS}(\{\psi_i\}) = \frac{1}{2}\sum_{i=1}^{n_e}\int_\Omega |\nabla\psi_i|^2 + \int_\Omega V_{ion}(\rho) + \frac{1}{2}\int_\Omega\int_\Omega \frac{\rho(r)\rho(r')}{|r - r'|}\,dr\,dr' + E_{xc}(\rho),$$
$$\rho(r) = \sum_{i=1}^{n_e}|\psi_i(r)|^2, \qquad \int_\Omega \psi_i\psi_j = \delta_{i,j}.$$
The exchange-correlation term, $E_{xc}$, contains the quantum mechanical contribution plus the part of the kinetic energy not covered by the first term when using single-particle wavefunctions.
Towards Large-scale Simulation
Figure: thanks to Taisuke Ozaki.
Discretized Kohn-Sham Formulation
Goal: find the ground state energy/density by minimizing $E_{KS}$.
Finite dimensional problem:
$$\min_{X^*X = I} \; E_{KS}(X) := E_{kinetic}(X) + E_{ion}(X) + E_{Hartree}(X) + E_{xc}(X),$$
where $X \in \mathbb{C}^{K\times N}$ and
$$E_{kinetic}(X) = \frac{1}{2}\mathrm{tr}(X^* L X)$$
$$E_{ion}(X) = \mathrm{tr}(X^* V_{ion} X) + \sum_i\sum_l |x_i^* w_l|^2$$
$$E_{Hartree}(X) = \frac{1}{2}\rho(X)^\top L^\dagger \rho(X)$$
$$E_{xc}(X) = e^\top \epsilon_{xc}(\rho(X)), \quad e = (1, \ldots, 1)^\top$$
$$\rho(X) = \mathrm{diag}(XX^*) = \Big(\sum_{j=1}^{N}|x_{ij}|^2\Big)_{1\le i\le K}$$
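The discretized energy terms can be sketched on a small real-valued example. Everything concrete here is an assumption for illustration and not taken from the slides: the 1D Dirichlet Laplacian for $L$ (so $L^\dagger = L^{-1}$), the harmonic local potential for $V_{ion}$, the omission of the nonlocal $w_l$ terms, the toy choice $\epsilon_{xc}(\rho) = -c_x\rho^{4/3}$ with coefficient `C_X`, and the helper names `build_toy_model`, `density`, `energy`, `hamiltonian`.

```python
import numpy as np

C_X = 0.1  # toy exchange coefficient (assumption, not from the slides)

def build_toy_model(K=20):
    """Hypothetical 1D model on (0, 1): Dirichlet Laplacian and harmonic well."""
    h = 1.0 / (K + 1)
    x = h * np.arange(1, K + 1)
    L = (2.0 * np.eye(K) - np.eye(K, k=1) - np.eye(K, k=-1)) / h**2
    L_pinv = np.linalg.inv(L)      # L is nonsingular here, so L^dagger = L^{-1}
    v_ion = 50.0 * (x - 0.5) ** 2  # diagonal of V_ion (local part only)
    return L, L_pinv, v_ion

def density(X):
    """rho(X) = diag(X X^T) for real X."""
    return np.sum(X * X, axis=1)

def energy(X, L, L_pinv, v_ion):
    """E_KS(X) = E_kinetic + E_ion + E_Hartree + E_xc (nonlocal w_l omitted)."""
    rho = density(X)
    e_kin = 0.5 * np.trace(X.T @ L @ X)
    e_ion = v_ion @ rho                       # tr(X^T V_ion X) for diagonal V_ion
    e_har = 0.5 * rho @ (L_pinv @ rho)
    e_xc = -C_X * np.sum(rho ** (4.0 / 3.0))  # e^T eps_xc(rho)
    return e_kin + e_ion + e_har + e_xc

def hamiltonian(X, L, L_pinv, v_ion):
    """H(X) = L/2 + V_ion + Diag(L^dagger rho + d eps_xc / d rho)."""
    rho = density(X)
    mu_xc = -(4.0 / 3.0) * C_X * np.cbrt(rho)
    return 0.5 * L + np.diag(v_ion + L_pinv @ rho + mu_xc)
```

In this real-valued setting the Euclidean gradient of `energy` is `2 * hamiltonian(X, ...) @ X`, which can be checked against a finite difference of `energy` along a random direction.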
KKT Conditions
Lagrange function:
$$\mathcal{L}(X, \Lambda) = E_{KS}(X) - \frac{1}{2}\mathrm{tr}\big(\Lambda(X^*X - I)\big)$$
First-order optimality conditions:
$$\nabla_X \mathcal{L}(X, \Lambda) = 0, \; X^*X = I \quad\Longrightarrow\quad H(X)X = X\Lambda, \; X^*X = I.$$
$\Lambda = X^* H(X) X$ is not necessarily a diagonal matrix.
Kohn-Sham Hamiltonian:
$$H(X) := \frac{1}{2}L + V_{ion} + \sum_l w_l w_l^* + \mathrm{diag}\big(\Re(L^\dagger)\rho(X) + \partial_\rho\epsilon_{xc}(\rho(X))^\top e\big).$$
Orbital Free DFT (OFDFT)
Expresses the system using only the charge density.
Avoids computing $N$ eigenpairs.
Pros: applicable to main group elements and nearly-free-electron-like metals.
Cons: not suitable for covalently bonded and ionic systems.
Orbital-free total energy:
$$E_{OF}(\rho) = T_{OF}(\rho) + E_{ext}(\rho) + E_H(\rho) + E_{xc}(\rho) + E_{II}$$
$T_{OF}(\rho)$: kinetic energy density functional (KEDF)
$$T_{TFW}(\rho) = C_{TF}\,T_{TF}(\rho) + \mu\,T_{vW}(\rho),$$
$$T_{LR}(\rho) = T_{TF}(\rho) + \mu\,T_{vW}(\rho) + C_{TF}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3} K(r - r')\,\rho^\alpha(r)\,\rho^\beta(r')\,dr\,dr'$$
Other terms are the same as in KSDFT.
Orbital Free DFT (OFDFT)
Variational problem:
$$\inf\; E_{OF}(\rho) \quad \text{s.t.} \quad \rho \in L^1(\mathbb{R}^3),\; \rho^{1/2} \in H^1(\mathbb{R}^3),\; \rho \ge 0,\; \int_{\mathbb{R}^3}\rho(r)\,dr = N.$$
KKT conditions:
$$H_{OF}\,\varphi := \left(-\frac{\mu}{2}\Delta + \frac{\delta\big(T_{OF}(\rho) - \mu T_{vW}(\rho)\big)}{\delta\rho} + V_{eff}(\rho)\right)\varphi = \lambda\varphi, \qquad \int_{\mathbb{R}^3}\varphi^2 = N.$$
Discretized form:
$$\min_{c\in\mathbb{R}^n}\; E_{OF}(\rho(c)) \quad \text{s.t.} \quad c^\top B c = 1.$$
Self Consistent Field Iteration (SCF)
SCF algorithm:
1. Find the $p$ smallest eigenpairs $(X, \Lambda)$: $H(\rho_k)X = X\Lambda$, $X^*X = I$.
2. Calculate $\rho_{out}(X) = \mathrm{diag}(XX^*)$.
3. Set $\rho_{k+1} = (1 - \alpha)\rho_k + \alpha\rho_{out}$.
4. Increment $k$ and go to step 1 until $\rho_{k+1} - \rho_k$ is small enough.
Our motivation:
  Computation for the linear eigenvalue problem can be expensive.
  Convergence of SCF is not clear.
  Optimization algorithms for solving DFT directly?
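The four SCF steps can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the solver used in the talk: the 1D toy model (Dirichlet Laplacian, harmonic `v_ion`, toy exchange coefficient `c_x`), the dense `numpy.linalg.eigh` call in place of a large-scale eigensolver, and the function name `scf_simple_mixing` are all hypothetical.

```python
import numpy as np

def scf_simple_mixing(N=2, K=20, alpha=0.5, tol=1e-10, max_iter=500):
    """SCF with simple mixing on a hypothetical 1D toy Kohn-Sham model."""
    h = 1.0 / (K + 1)
    x = h * np.arange(1, K + 1)
    L = (2.0 * np.eye(K) - np.eye(K, k=1) - np.eye(K, k=-1)) / h**2
    L_pinv = np.linalg.inv(L)      # L is nonsingular, so L^dagger = L^{-1}
    v_ion = 50.0 * (x - 0.5) ** 2  # toy local ionic potential (assumption)
    c_x = 0.1                      # toy exchange coefficient (assumption)

    def hamiltonian(rho):
        mu_xc = -(4.0 / 3.0) * c_x * np.cbrt(rho)
        return 0.5 * L + np.diag(v_ion + L_pinv @ rho + mu_xc)

    rho = np.full(K, N / K)        # uniform initial density with sum N
    X = None
    for k in range(1, max_iter + 1):
        # Step 1: N smallest eigenpairs of H(rho_k); eigh sorts ascending.
        _, vecs = np.linalg.eigh(hamiltonian(rho))
        X = vecs[:, :N]
        # Step 2: output density rho_out = diag(X X^T).
        rho_out = np.sum(X * X, axis=1)
        # Step 3: simple mixing.
        rho_next = (1.0 - alpha) * rho + alpha * rho_out
        # Step 4: stop once the density update is small enough.
        if np.linalg.norm(rho_next - rho) < tol:
            return X, rho_next, k, True
        rho = rho_next
    return X, rho, max_iter, False
```

A call such as `X, rho, iters, converged = scf_simple_mixing()` returns the orbitals, the mixed density, the iteration count, and a convergence flag. The eigenvalue solve in step 1 dominates the cost, which is the expense the slide refers to.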
Gradient-Type Approach: Wen and Yin '12
Consider $\min E(X)$ subject to $X^\top X = I$.
At iteration $i$, two possible updates:
  $X^{(i+1)} \leftarrow \mathrm{Orthogonalize}\big(X^{(i)} - \sigma W^{(i)} X^{(i)}\big)$
  $X^{(i+1)} \leftarrow$ solution $Y(\sigma)$ of $Y = X^{(i)} - \frac{\sigma}{2} W^{(i)}\big(X^{(i)} + Y\big)$
$W^{(i)}$ is a skew-symmetric matrix defined by
$$W^{(i)} = \nabla E(X^{(i)})\big(X^{(i)}\big)^* - X^{(i)}\big(\nabla E(X^{(i)})\big)^*$$
$[Y^{(i)}]'(0) = -W^{(i)}X^{(i)}$ = tangential part of $-\nabla E(X^{(i)})$
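The implicit update can be solved in closed form as a Cayley transform, sketched below for real matrices. The sign convention is chosen so that $Y'(0) = -WX$, matching the last line of the slide; the function name `cayley_step` and the dense linear solve are illustrative assumptions.

```python
import numpy as np

def cayley_step(X, G, sigma):
    """One feasible update from X with Euclidean gradient G and step size sigma.

    Solves Y = X - (sigma / 2) * W @ (X + Y) with W = G X^T - X G^T, i.e.
    (I + sigma/2 W) Y = (I - sigma/2 W) X. W is skew-symmetric, so the
    Cayley transform is orthogonal and Y^T Y = X^T X.
    """
    n = X.shape[0]
    W = G @ X.T - X @ G.T
    I = np.eye(n)
    Y = np.linalg.solve(I + 0.5 * sigma * W, (I - 0.5 * sigma * W) @ X)
    return Y, W
```

For example, with the quadratic test function $E(X) = \frac{1}{2}\mathrm{tr}(X^\top A X)$ and gradient $G = AX$, a small step along this curve keeps $X^\top X = I$ to rounding error and decreases $E$ whenever $WX \neq 0$. The dense $n \times n$ solve is for illustration only.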
Understanding SCF: the Hessian of $E_{KS}$
Convenient scaling: $E_s(X) := \frac{1}{2}E_{KS}(X)$.
Gradient: $\nabla E_s(X) := H(X)X$.
Exact Hessian: suppose that $\epsilon_{xc}(\rho)$ is twice differentiable with respect to $\rho$. Given a direction $S \in \mathbb{C}^{K\times N}$, the Hessian-direction product for $E_s(X)$ is
$$\nabla^2 E_s(X)[S] = H(X)S + \mathrm{diag}\Big(J\big(\bar{X}\odot S + X\odot\bar{S}\big)e\Big)X,$$
where $J = \Re L^\dagger + \partial^2_\rho(\epsilon_{xc}^\top e)$.
Note: the second part corresponds to $(H'(X)[S])X$.
Good news: the Hessian-direction product is not too expensive.
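The Hessian-direction product can be sketched for real $X$ and $S$, where $\bar{X}\odot S + X\odot\bar{S} = 2\,X\odot S$. The toy model below is an assumption for illustration (1D Dirichlet Laplacian, local `v_ion`, no nonlocal terms, $\epsilon_{xc}(\rho) = -c_x\rho^{4/3}$); the names `make_operators`, `gradient`, and `hessian_product` are hypothetical.

```python
import numpy as np

C_X = 0.1  # toy exchange coefficient (assumption)

def make_operators(K):
    """Hypothetical 1D Dirichlet Laplacian L and its inverse (used as L^dagger)."""
    h = 1.0 / (K + 1)
    L = (2.0 * np.eye(K) - np.eye(K, k=1) - np.eye(K, k=-1)) / h**2
    return L, np.linalg.inv(L)

def effective_potential(X, L_pinv, v_ion):
    rho = np.sum(X * X, axis=1)
    return rho, v_ion + L_pinv @ rho - (4.0 / 3.0) * C_X * np.cbrt(rho)

def gradient(X, L, L_pinv, v_ion):
    """Gradient of E_s: H(X) X, with H(X) = L/2 + Diag(v_eff)."""
    _, v_eff = effective_potential(X, L_pinv, v_ion)
    return 0.5 * (L @ X) + v_eff[:, None] * X

def hessian_product(X, S, L, L_pinv, v_ion):
    """Hessian of E_s applied to S: H(X) S + diag(J (2 X * S) e) X."""
    rho, v_eff = effective_potential(X, L_pinv, v_ion)
    HS = 0.5 * (L @ S) + v_eff[:, None] * S
    d_rho = 2.0 * np.sum(X * S, axis=1)              # directional change of rho
    d2_eps = -(4.0 / 9.0) * C_X * rho ** (-2.0 / 3.0)  # second derivative of eps_xc
    d_v = L_pinv @ d_rho + d2_eps * d_rho            # J applied to d_rho
    return HS + d_v[:, None] * X
```

Beyond applying $H(X)$ to $S$, the only extra work is one application of $L^\dagger$ and a few elementwise operations, which is why the product is inexpensive. A central finite difference of `gradient` along `S` gives a quick consistency check.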
SCF from the Viewpoint of Optimization
See also: Yang, Meza, Wang '07.
The linear eigenvalue problem in each SCF iteration is equivalent to
$$\min\; q(X) := \frac{1}{2}\langle H(X_k)X, X\rangle \quad \text{s.t.} \quad X^*X = I.$$
On the other hand, a direct calculation reveals
$$\frac{1}{2}\langle H(X_k)X, X\rangle = \Re\langle H(X_k)X_k, X - X_k\rangle + \frac{1}{2}\Re\langle H(X_k)(X - X_k), X - X_k\rangle + \text{const}.$$
The second part of the Hessian, $H'(X_k)[X - X_k]X$, is omitted in SCF, similar to Gauss-Newton methods.
Our goals: provable global convergence + fast local convergence.
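The "direct calculation" can be written out as follows. This is a sketch that assumes the inner product $\langle A, B\rangle = \mathrm{tr}(A^*B)$ and uses only that $H_k := H(X_k)$ is Hermitian; the shorthand $S := X - X_k$ is introduced here for convenience.

```latex
\begin{aligned}
\tfrac{1}{2}\langle H_k X, X\rangle
  &= \tfrac{1}{2}\langle H_k (X_k + S),\, X_k + S\rangle \\
  &= \tfrac{1}{2}\langle H_k X_k, X_k\rangle
     + \tfrac{1}{2}\big(\langle H_k X_k, S\rangle + \langle H_k S, X_k\rangle\big)
     + \tfrac{1}{2}\langle H_k S, S\rangle \\
  &= \underbrace{\tfrac{1}{2}\langle H_k X_k, X_k\rangle}_{\text{const}}
     + \Re\langle H_k X_k, S\rangle
     + \tfrac{1}{2}\Re\langle H_k S, S\rangle .
\end{aligned}
```

The last step uses $\langle H_k S, X_k\rangle = \overline{\langle H_k X_k, S\rangle}$ and the fact that $\langle H_k S, S\rangle$ is real for Hermitian $H_k$. Compared with a second-order model of $E_s$ at $X_k$, the quadratic term contains only $H(X_k)S$ and not $(H'(X_k)[S])X_k$.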
Levenberg-Marquardt Type Regularization
The SCF iteration is similar to the Gauss-Newton (GN) method.
Regularization of SCF by a Levenberg-Marquardt type approach:
$$\min\; m_k^L(X) := \frac{1}{2}\langle H(X_k)X, X\rangle + \frac{\tau_k}{2}\|X - X_k\|_F^2 \quad \text{s.t.} \quad X^*X = I,$$
with regularization parameter $\tau_k > 0$.
First-order optimality conditions: the solution $X = X_{k+1}$ satisfies
$$(H(X_k) + \tau_k I)X_{k+1} = X_{k+1}\Lambda_{k+1} + \tau_k X_k \quad \text{and} \quad X_{k+1}^*X_{k+1} = I,$$
where $\Lambda_{k+1} = \Lambda_{k+1}^* \in \mathbb{C}^{N\times N}$ is a Lagrange multiplier.
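The optimality condition can be recovered in two lines. This sketch assumes the same Lagrangian scaling as on the KKT slide, a multiplier term $-\frac{1}{2}\mathrm{tr}(\Lambda(X^*X - I))$, and the gradient convention under which $\frac{1}{2}\langle H X, X\rangle$ has gradient $HX$.

```latex
\begin{aligned}
\nabla_X \Big[ \tfrac{1}{2}\langle H(X_k)X, X\rangle
      + \tfrac{\tau_k}{2}\|X - X_k\|_F^2
      - \tfrac{1}{2}\mathrm{tr}\big(\Lambda(X^*X - I)\big) \Big]
  &= H(X_k)X + \tau_k (X - X_k) - X\Lambda = 0 \\
\Longleftrightarrow \quad (H(X_k) + \tau_k I)\,X &= X\Lambda + \tau_k X_k .
\end{aligned}
```

On the feasible set, $\|X - X_k\|_F^2 = 2N - 2\,\Re\,\mathrm{tr}(X_k^*X)$, so the regularization adds only the linear term $-\tau_k\,\Re\langle X_k, X\rangle$ and a constant to the SCF model. This is where the extra right-hand side $\tau_k X_k$ comes from; the subproblem is no longer a pure eigenvalue problem.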
Exact Hessian + Adaptive Regularization
Using the exact Hessian:
$$m_k^N(X_k + S) := \Re\langle H(X_k)X_k, S\rangle + \frac{1}{2}\Re\langle H(X_k)S, S\rangle + \frac{1}{2}\Re\Big\langle S,\; \mathrm{diag}\Big(J\big(\bar{X}_k\odot S + X_k\odot\bar{S}\big)e\Big)X_k\Big\rangle + \frac{\tau_k}{\nu}\|S\|_F^\nu$$
$\frac{\tau_k}{\nu}\|S\|_F^\nu$: trust-region-like strategy for ensuring global convergence.
Compute the regularized Newton step:
$$\min\; m_k^N(X) \quad \text{s.t.} \quad X^*X = I.$$
See Cartis, Gould, Toint '10, '11, '12 on cubic regularization.
Convergence Results
Assumption: the gradient $\nabla E_s(X) = H(X)X$ is Lipschitz on the convex hull of the Stiefel manifold $\{X : X^*X = I\}$.
Let $G_k = \nabla E_s(X_k) = H(X_k)X_k$ and define
$$W_k = G_k X_k^* - X_k G_k^*.$$
Global convergence result: $W_l = 0$ for some $l \ge 0$, or
$$\lim_{k\to\infty}\|W_k\|_F = 0.$$
Note: $W_k X_k$ is the tangential part of $G_k$ in the canonical inner product.
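The quantity $\|W_k\|_F$ is straightforward to monitor as a stationarity measure. The sketch below is illustrative only: the name `stationarity_residual` is hypothetical and a fixed Hermitian matrix `H` stands in for $H(X_k)$.

```python
import numpy as np

def stationarity_residual(X, G):
    """Frobenius norm of W = G X^* - X G^* for an iterate X with gradient G."""
    W = G @ X.conj().T - X @ G.conj().T
    return np.linalg.norm(W, "fro")
```

If the columns of $X$ span an invariant subspace of a fixed Hermitian $H$, then $G = HX = X\Lambda$ with $\Lambda = \Lambda^*$ and the residual vanishes up to rounding; for a generic orthonormal $X$ it is nonzero.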
Formulating the KS Equation as a Fixed Point Map
Nonlinear equations with respect to $\rho$:
$$\rho = \mathrm{diag}\big(X(\rho)X(\rho)^T\big).$$
$X$ is determined by the eigenvalue problem
$$\hat{H}(\rho)X = X\Lambda, \qquad X^TX = I,$$
with the Hamiltonian matrix
$$\hat{H}(\rho) := \frac{1}{2}L + V_{ion} + \mathrm{Diag}(L^\dagger\rho) + \mathrm{Diag}\big(\mu_{xc}(\rho)^T e\big).$$
Formulating the KS Equation as a Fixed Point Map
The Hamiltonian matrix:
$$H(V) := \frac{1}{2}L + V_{ion} + \mathrm{Diag}(V)$$
The potential:
$$V := V(\rho) = L^\dagger\rho + \mu_{xc}(\rho)^T e$$
Nonlinear equations with respect to $V$:
$$V = V(F_\phi(V)), \qquad F_\phi(V) = \mathrm{diag}\big(X(V)X(V)^T\big).$$