scalable algorithms for electronic structure calculations


Scalable Algorithms for Electronic Structure Calculations on Petascale Computers François Gygi University of California, Davis fgygi@ucdavis.edu http://eslab.ucdavis.edu Supported by NSF ITR-HECURA-0749217 and DOE-SciDAC RANMEP2008


  1. Scalable Algorithms for Electronic Structure Calculations on Petascale Computers
     François Gygi, University of California, Davis
     fgygi@ucdavis.edu, http://eslab.ucdavis.edu
     Supported by NSF ITR-HECURA-0749217 and DOE-SciDAC
     RANMEP2008 Workshop, NCTS, Taiwan, Jan 6, 2008

  2. Outline
     • First-Principles simulations
     • Eigenvalue problems in electronic structure calculations
     • Localized representations of solutions and the simultaneous diagonalization problem
     • Data compression through simultaneous diagonalization

  3. First-Principles Simulations
     • Goal: simulate molecules, solids, and liquids from first principles, without input from experiments
     • The approach: molecular dynamics, an atomic-scale simulation method
       – Compute the trajectories of all atoms
       – Extract statistical information from the trajectories
     • Atoms move according to Newton's law: $m_i \ddot{R}_i = F_i$
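The slide does not name a time integrator. As a minimal sketch, assuming the commonly used velocity Verlet scheme and a hypothetical harmonic force (`harmonic_force`) standing in for first-principles forces, Newton's law $m_i \ddot{R}_i = F_i$ could be integrated like this:

```python
import numpy as np

def velocity_verlet(r, v, masses, force, dt, n_steps):
    """Integrate m_i * d2R_i/dt2 = F_i with the velocity Verlet scheme."""
    f = force(r)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / masses[:, None]   # half-step velocity update
        r = r + dt * v_half                           # move atoms
        f = force(r)                                  # forces at the new positions
        v = v_half + 0.5 * dt * f / masses[:, None]   # complete the velocity update
    return r, v

# Hypothetical stand-in for first-principles forces: springs tied to the origin.
def harmonic_force(r, k=1.0):
    return -k * r

def total_energy(r, v, masses, k=1.0):
    return 0.5 * np.sum(masses[:, None] * v**2) + 0.5 * k * np.sum(r**2)

masses = np.ones(2)
r0 = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
v0 = np.zeros_like(r0)
r1, v1 = velocity_verlet(r0, v0, masses, harmonic_force, dt=0.01, n_steps=1000)
```

In a first-principles run, the call to `force` is where the electronic structure calculation would take place.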

  4. First-Principles Simulations
     • Why “First-Principles”?
       – Avoid empirical models and adjustable parameters
         • Goal: applications to extreme conditions (high pressure, etc.) where no experimental data are available
       – Use fundamental principles: quantum mechanics
       – Must describe ions and electrons consistently and simultaneously
     • At each time step:
       1) Compute the electronic structure
       2) Derive interatomic forces
       3) Move atoms

  5. First-Principles Simulations
     • Applications
       – Chemistry
       – Nanotechnology
       – Semiconductors
       – Biochemistry
       – High-pressure physics
     [Figures: growth of a carbon nanotube on an iron catalyst; biotin on silicon carbide; ice-water interface; silicon quantum dot]

  6. First-Principles Simulations
     • The computation of the electronic structure is the most expensive part of the simulation: more than 99% of CPU time
     • At each time step:
       1) Compute the electronic structure
       2) Derive interatomic forces
       3) Move atoms

  7. First-principles simulations require large computing resources
     • Cost of one time step scales as $O(n^3)$
       – $n$: number of electrons
     • Many time steps required / long simulations
     • Requires use of large-scale parallel platforms
       – target: $O(10^4)$ to $O(10^5)$ CPUs
     • Focus on scalable algorithms
       – communication cost is the primary concern

  8. Using large computers: BlueGene/L
     • 65,536 nodes, 128k CPUs
     • 3D torus network
     • 512 MB/node
     • 367 TFlop peak
     Packaging hierarchy (from the system diagram):
       Level          Contents                              Peak             Memory
       Chip           2 processors                          2.8/5.6 GF/s     4 MB
       Compute Card   2 chips, 2x1x1                        5.6/11.2 GF/s    0.5 GB DDR
       Node Board     32 chips, 4x4x2 (16 compute cards)    90/180 GF/s      8 GB DDR
       Cabinet        32 node boards, 8x8x16                2.9/5.7 TF/s     256 GB DDR
       System         64 cabinets, 64x32x32                 180/360 TF/s     16 TB DDR

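  9. Computing the electronic structure
     • Kohn-Sham equations
       – solutions $\varphi_i \in L^2(\mathbb{R}^3)$ represent electronic wavefunctions (one per electron)
     $$
     \begin{cases}
     H\varphi_i = -\Delta\varphi_i + V(\rho,\mathbf{r})\,\varphi_i = \varepsilon_i\varphi_i, \quad i = 1,\dots,n \\
     V(\rho,\mathbf{r}) = V_{ion}(\mathbf{r}) + \int \frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}' + V_{XC}(\rho(\mathbf{r}), \nabla\rho(\mathbf{r})) \\
     \rho(\mathbf{r}) = \sum_{i=1}^{n} |\varphi_i(\mathbf{r})|^2 \\
     \int \varphi_i^*(\mathbf{r})\,\varphi_j(\mathbf{r})\,d\mathbf{r} = \delta_{ij}
     \end{cases}
     $$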

  10. Computing the electronic structure
      • Solutions are represented as Fourier series
        $$\varphi_j(\mathbf{r}) = \sum_{|\mathbf{q}|^2 < E_{cut}} c_{\mathbf{q},j}\, e^{i\mathbf{q}\cdot\mathbf{r}}$$
      • A set of solutions is represented by an (orthogonal) $m \times n$ matrix of complex Fourier coefficients
        $$Y_{ij} = c_{\mathbf{q}_i, j}$$
      • Dimensions of $Y$: $10^6 \times 10^4$
      • Note: typically $m/n \sim 100$
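To make the plane-wave representation concrete, here is a small sketch under stated assumptions: a cubic periodic box of side `box_length` (so that $\mathbf{q} = (2\pi/L)\times$ integer triples), and the function names `plane_wave_basis` and `evaluate`, which are illustrative and not taken from the talk.

```python
import numpy as np

def plane_wave_basis(box_length, e_cut):
    """Return all wave vectors q = (2*pi/L) * (integer triple) with |q|^2 < e_cut."""
    n_max = int(np.ceil(np.sqrt(e_cut) * box_length / (2 * np.pi)))
    ints = np.arange(-n_max, n_max + 1)
    grid = np.array(np.meshgrid(ints, ints, ints, indexing="ij")).reshape(3, -1).T
    q = (2 * np.pi / box_length) * grid
    return q[np.sum(q**2, axis=1) < e_cut]

def evaluate(coeffs, q, r):
    """phi_j(r) = sum_q c_{q,j} exp(i q.r) for every column j of coeffs."""
    phases = np.exp(1j * (r @ q.T))      # shape (n_points, m)
    return phases @ coeffs               # shape (n_points, n)

q = plane_wave_basis(box_length=10.0, e_cut=4.0)
m = len(q)                               # number of plane waves (rows of Y)
n = 4                                    # number of wavefunctions (columns of Y)
rng = np.random.default_rng(0)
# Random m x n coefficient matrix with orthonormal columns, obtained by QR.
Y, _ = np.linalg.qr(rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n)))
```

The row count `m` grows with the cutoff and the box volume, while `n` is set by the number of electrons, which is why $Y$ is tall and skinny.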

  11. Computing the electronic structure
      • The energy is invariant under unitary transformations of $Y$
        $$E(Y) = \mathrm{tr}\left(Y^T H Y\right) + F[\rho], \qquad \rho(\mathbf{r}) = \sum_j |\varphi_j(\mathbf{r})|^2$$
        $$E(Y) = E(YQ), \qquad Q \text{ unitary}$$
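The invariance can be checked numerically. The sketch below assumes a real symmetric test matrix in place of the Hamiltonian and a real-space representation in which the density is the diagonal of $YY^T$; both are simplifications made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 40, 5
H = rng.standard_normal((m, m))
H = 0.5 * (H + H.T)                                   # symmetric test matrix (assumed)
Y, _ = np.linalg.qr(rng.standard_normal((m, n)))      # orthonormal columns
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))      # random orthogonal n x n matrix

def band_energy(Y):
    """tr(Y^T H Y), the part of the energy that depends explicitly on Y."""
    return np.trace(Y.T @ H @ Y)

def density(Y):
    """rho_i = sum_j |Y_ij|^2, i.e. the diagonal of Y Y^T."""
    return np.sum(Y**2, axis=1)

energy_before, energy_after = band_energy(Y), band_energy(Y @ Q)
rho_before, rho_after = density(Y), density(Y @ Q)
```

Both the trace term and the density are unchanged by $Y \to YQ$, so any functional $F[\rho]$ is unchanged as well.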

  12. Electronic structure calculation (with fixed potential)
      • Invariant subspace computation: find $Y$ such that
        $$HY = Y\Lambda, \qquad H \in \mathbb{C}^{m\times m},\; Y \in \mathbb{C}^{m\times n},\; \Lambda \in \mathbb{C}^{n\times n}$$
        – $H$ is sparse
        – Cost of computing $Hx$: $O(m \log m)$ (involves Fast Fourier Transforms)
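The $O(m \log m)$ cost comes from applying the kinetic term in Fourier space and the potential in real space, with FFTs in between. A minimal one-dimensional sketch, assuming a periodic box and a local potential sampled on the grid (the function name `apply_h` is illustrative):

```python
import numpy as np

def apply_h(c, v_real, box_length):
    """Apply H = -Laplacian + V to Fourier coefficients c on a 1D periodic grid."""
    m = len(c)
    q = 2 * np.pi * np.fft.fftfreq(m, d=box_length / m)   # wave numbers
    kinetic = q**2 * c                       # -Laplacian is diagonal in Fourier space
    psi = np.fft.ifft(c)                     # transform to real space, O(m log m)
    potential = np.fft.fft(v_real * psi)     # V is diagonal in real space
    return kinetic + potential

m = 64
box = 2 * np.pi                              # with this box, wave numbers are integers
x = np.arange(m) * box / m
v_const = 0.5 * np.ones(m)
c = np.zeros(m, dtype=complex)
c[3] = 1.0                                   # single plane wave with q = 3
Hc = apply_h(c, v_const, box)                # equals (q^2 + 0.5) * c
```

No $m \times m$ matrix is ever formed; $H$ is only available through its action on a vector, which is what the iterative eigensolvers on the following slides require.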

  13. Electronic structure calculation (with fixed potential)
      • Iterative methods for invariant subspace computations
        – Variants of Jacobi-Davidson
        – DIIS (a.k.a. Anderson acceleration)
      • Simple, diagonal preconditioning works well
      • Robustness of eigensolvers is key

  14. Preconditioned steepest descent
      1) correction: $Y := Y + \beta K \left(I - YY^T\right) HY$
      2) orthogonalization
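The two steps can be sketched as follows. Several choices here are assumptions rather than details from the slide: the preconditioner $K$ is taken as the inverse diagonal of $H$, the step $\beta$ is negative so that the update lowers $\mathrm{tr}(Y^T H Y)$, orthogonalization is done by QR, and $H$ is a small dense test matrix.

```python
import numpy as np

def psd_step(Y, H, K_diag, beta):
    """One step: Y := Y + beta * K (I - Y Y^T) H Y, then re-orthogonalize."""
    HY = H @ Y
    residual = HY - Y @ (Y.T @ HY)                 # (I - Y Y^T) H Y
    Y_new = Y + beta * (K_diag[:, None] * residual)
    Q, _ = np.linalg.qr(Y_new)                     # orthogonalization (QR is one choice)
    return Q

rng = np.random.default_rng(0)
m, n = 30, 3
P = rng.standard_normal((m, m))
H = np.diag(np.arange(1.0, m + 1)) + 0.01 * (P + P.T)   # diagonally dominant test matrix
K_diag = 1.0 / np.diag(H)                               # assumed diagonal preconditioner
Y, _ = np.linalg.qr(rng.standard_normal((m, n)))
for _ in range(500):
    Y = psd_step(Y, H, K_diag, beta=-0.5)               # beta < 0: descent (assumption)

energy = np.trace(Y.T @ H @ Y)
exact = np.sum(np.linalg.eigvalsh(H)[:n])               # sum of the n lowest eigenvalues
```

On this test problem the iteration converges to the invariant subspace of the `n` lowest eigenvalues; a production code would apply $H$ matrix-free instead of storing it.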

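  15. Preconditioned DIIS
      1) descent direction:
         $$\Delta_k = K \left(I - Y_k Y_k^T\right) H Y_k$$
      2) update:
         $$\theta = -\frac{\mathrm{tr}\left(\Delta_k^T (\Delta_k - \Delta_{k-1})\right)}{\|\Delta_k - \Delta_{k-1}\|_F^2}$$
         $$\bar{Y}_k = Y_k + \theta\,(Y_k - Y_{k-1})$$
         $$\bar{\Delta}_k = \Delta_k + \theta\,(\Delta_k - \Delta_{k-1})$$
         $$Y_{k+1} = \bar{Y}_k + \beta\,\bar{\Delta}_k$$
      3) orthogonalization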
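A sketch of one two-point DIIS step is given below. It assumes that $\theta$ is chosen to minimize $\|\Delta_k + \theta(\Delta_k - \Delta_{k-1})\|_F$, which fixes its sign, that $K$ is diagonal, and that orthogonalization uses a sign-normalized QR so that successive iterates stay comparable. The helper names are illustrative.

```python
import numpy as np

def residual(Y, H, K_diag):
    """Delta = K (I - Y Y^T) H Y with a diagonal preconditioner K."""
    HY = H @ Y
    return K_diag[:, None] * (HY - Y @ (Y.T @ HY))

def orthonormalize(Z):
    """QR with positive diagonal of R, so the result varies continuously with Z."""
    Q, R = np.linalg.qr(Z)
    return Q * np.sign(np.diag(R))

def diis_theta(D, D_prev):
    """theta minimizing || D + theta (D - D_prev) ||_F."""
    dD = D - D_prev
    return -np.trace(D.T @ dD) / np.linalg.norm(dD, "fro") ** 2

def diis_step(Y, Y_prev, D_prev, H, K_diag, beta):
    D = residual(Y, H, K_diag)
    theta = diis_theta(D, D_prev)
    Y_bar = Y + theta * (Y - Y_prev)             # extrapolated iterate
    D_bar = D + theta * (D - D_prev)             # extrapolated residual
    return orthonormalize(Y_bar + beta * D_bar), D

rng = np.random.default_rng(0)
m, n = 30, 3
P = rng.standard_normal((m, m))
H = np.diag(np.arange(1.0, m + 1)) + 0.01 * (P + P.T)   # small dense test matrix
K_diag = 1.0 / np.diag(H)
Y_prev = orthonormalize(rng.standard_normal((m, n)))
D_prev = residual(Y_prev, H, K_diag)
Y = orthonormalize(Y_prev - 0.5 * D_prev)               # one plain descent step first
Y_next, D = diis_step(Y, Y_prev, D_prev, H, K_diag, beta=-0.5)
theta = diis_theta(D, D_prev)
```

The first iterate has no history, so the example takes one plain descent step before the DIIS update.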

  16. Self-consistent electronic structure computation
      • $H$ depends non-linearly on the solution $Y$ (through $\rho$)
      • Fixed point iteration:
          repeat {
            1) compute charge density  $\rho_i = (YY^T)_{ii}$
            2) solve  $H(\rho)\, Y = Y \Lambda$
          } until converged (i.e. $\rho$ does not change)
      • Convergence can be accelerated using various charge-mixing schemes (e.g. Broyden)
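The fixed-point loop can be illustrated on a toy model. Everything specific below is an assumption made for the sketch: the model Hamiltonian $H(\rho) = H_0 + \alpha\,\mathrm{diag}(\rho)$, the dense eigensolver used for step 2, and simple linear mixing in place of a Broyden scheme.

```python
import numpy as np

def lowest_subspace(H, n):
    """Orthonormal basis of the invariant subspace of the n lowest eigenvalues."""
    _, vecs = np.linalg.eigh(H)
    return vecs[:, :n]

def scf(H0, n, alpha, mix=0.5, tol=1e-10, max_iter=200):
    """Fixed-point iteration on rho for the toy model H(rho) = H0 + alpha*diag(rho)."""
    m = H0.shape[0]
    rho = np.full(m, n / m)                       # uniform starting density
    for it in range(max_iter):
        Y = lowest_subspace(H0 + alpha * np.diag(rho), n)   # 2) solve H(rho) Y = Y Lambda
        rho_new = np.sum(Y**2, axis=1)                      # 1) rho_i = (Y Y^T)_ii
        if np.linalg.norm(rho_new - rho) < tol:
            return rho_new, Y, it
        rho = (1 - mix) * rho + mix * rho_new               # linear charge mixing
    raise RuntimeError("SCF did not converge")

m, n, alpha = 10, 2, 0.3
H0 = np.diag(np.arange(m, dtype=float)) - 0.1 * (np.eye(m, k=1) + np.eye(m, k=-1))
rho, Y, n_iter = scf(H0, n, alpha)
```

At convergence the density that defines the Hamiltonian equals the density of its own lowest `n` eigenvectors, which is the self-consistency condition stated on the slide.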

  17. Molecular Dynamics: solve the SCF problem at each time step
      • $H$ is time-dependent (depends on positions of atoms)
          for each time step t {
            repeat {
              1) compute charge density  $\rho_i = (YY^T)_{ii}$
              2) solve  $H(\rho, t)\, Y(t) = Y(t)\, \Lambda(t)$
            } until converged
            compute forces, move atoms
          }

  18. Molecular Dynamics: using previous solutions optimally
      • Computing $Y(t)$
        – The previous solution $Y(t-dt)$ is “close” to $Y(t)$ and can be used as the initial guess for the iterative calculation of $Y(t)$
      [Figure: successive subspaces $Y_{k-1}$, $Y_k$, $Y_{k+1}$]

  19. Molecular Dynamics: using previous solutions optimally
      • Computing $Y(t)$
        – The previous solution $Y(t-dt)$ is “close” to $Y(t)$ and can be used as the initial guess for the iterative calculation of $Y(t)$
        – The extrapolated subspace $\tilde{Y} = 2Y_k - Y_{k-1}$ is a better initial guess
      [Figure: subspaces $Y_{k-1}$, $Y_k$, $Y_{k+1}$ and the extrapolated guess $\tilde{Y} = 2Y_k - Y_{k-1}$]

  20. Molecular Dynamics: using previous solutions optimally
      • Subspace alignment
        – The eigensolver introduces arbitrary rotations in $Y(t)$
        – Extrapolation must be preceded by subspace alignment
        – Orthogonal Procrustes problem:
          $$\min_{Q} \|Y_k - Y_{k-1} Q\|_F \quad \text{subject to} \quad Q^T Q = I$$
      [Figure: subspaces $Y_{k-1}$, $Y_k$, $Y_{k+1}$ and the aligned extrapolation $\tilde{Y} = 2Y_k - Y_{k-1} Q$]

  21. Subspace alignment
      Orthogonal Procrustes problem: minimize $\|Y_k - Y_{k-1} Q\|_F$ over orthogonal $Q$
      1) Compute the polar decomposition
         $$A \equiv Y_{k-1}^T Y_k = UH$$
         where $U$ is unitary and $H$ is Hermitian.
      2) Rotation of $Y_{k-1}$: $Y_{k-1} := Y_{k-1} U$
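The alignment and the extrapolation that follows it can be sketched as below. Computing the unitary polar factor from an SVD is an assumption made for brevity (the talk discusses an iterative alternative on the next slide), and real-valued matrices are used so that $Y^T$ can be taken literally.

```python
import numpy as np

def polar_unitary_factor(A):
    """Unitary factor U of the polar decomposition A = U H, computed via the SVD."""
    W, _, Vh = np.linalg.svd(A)
    return W @ Vh

def align(Y_prev, Y):
    """Rotate Y_prev by the orthogonal Q that minimizes ||Y - Y_prev Q||_F."""
    U = polar_unitary_factor(Y_prev.T @ Y)
    return Y_prev @ U

def extrapolate(Y_prev, Y):
    """Initial guess 2 Y_k - Y_{k-1} Q for the next time step."""
    return 2.0 * Y - align(Y_prev, Y)

rng = np.random.default_rng(4)
m, n = 50, 4
Y, _ = np.linalg.qr(rng.standard_normal((m, n)))
R, _ = np.linalg.qr(rng.standard_normal((n, n)))   # arbitrary rotation inside the subspace
Y_prev = Y @ R.T                                   # same subspace as Y, rotated basis
Y_aligned = align(Y_prev, Y)
```

Here `Y_prev` spans the same subspace as `Y` in a rotated basis, so alignment recovers `Y` exactly; without it, `2 * Y - Y_prev` would be a poor guess even though the subspace has not changed.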

  22. Polar decomposition
      Polar decomposition $A = UH$ (Higham ‘86):
      $$X_0 = A, \qquad X_{k+1} = \tfrac{1}{2}\left(X_k + X_k^{-*}\right)$$
      converges quadratically to the unitary polar factor $U$.
      Need better, inverse-free, scalable algorithm
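A direct transcription of this Newton iteration is short. The stopping rule, the tolerance, and the test matrix below are assumptions of the sketch; the explicit inverse in each step is what the slide identifies as the obstacle to scalability.

```python
import numpy as np

def polar_newton(A, tol=1e-12, max_iter=100):
    """Newton iteration X_{k+1} = (X_k + X_k^{-*}) / 2 starting from X_0 = A."""
    X = np.asarray(A)
    for _ in range(max_iter):
        X_next = 0.5 * (X + np.linalg.inv(X).conj().T)
        if np.linalg.norm(X_next - X) <= tol * np.linalg.norm(X_next):
            return X_next
        X = X_next
    raise RuntimeError("Newton iteration did not converge")

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 5)) + 3.0 * np.eye(5)   # nonsingular test matrix (assumed)
U = polar_newton(A)
H_factor = U.conj().T @ A                            # Hermitian factor of A = U H

# Reference value of the unitary factor from the SVD A = W S V^*.
W, _, Vh = np.linalg.svd(A)
U_ref = W @ Vh
```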

  23. Outline
      • First-Principles simulations
      • Eigenvalue problems in electronic structure calculations
      • Localized representations of solutions and the simultaneous diagonalization problem
      • Data compression through simultaneous diagonalization

  24. Localized representations of the invariant subspace
      • Linear combinations of electronic wavefunctions that minimize the spatial spread are called “Maximally Localized Wannier Functions” (MLWF)
        $$\sigma_{\hat{X}}^2 = \langle x^2 \rangle - \langle x \rangle^2$$
      • MLWFs are used to compute the electronic polarization in crystals
      • Computing MLWFs during a molecular dynamics simulation yields the infrared absorption spectrum
      N. Marzari and D. Vanderbilt, Phys. Rev. B 56, 12847 (1997)
      R. Resta, Phys. Rev. Lett. 80, 1800 (1998)

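  25. Spread Functionals
      • Spread of a wavefunction associated with an operator $\hat{A}$
        $$\sigma_{\hat{A}}^2(\phi) = \langle\phi|\left(\hat{A} - \langle\phi|\hat{A}|\phi\rangle\right)^2|\phi\rangle = \langle\phi|\hat{A}^2|\phi\rangle - \langle\phi|\hat{A}|\phi\rangle^2$$
      • Spread of a set of wavefunctions associated with an operator $\hat{A}$
        $$\sigma_{\hat{A}}^2(\{\phi_i\}) = \sum_i \sigma_{\hat{A}}^2(\phi_i)$$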
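As a small check of these definitions, the sketch below evaluates the spread for the position operator on a one-dimensional grid. The grid, the Gaussian test function, and the function names are assumptions chosen so that the result is known in advance: a density $|\phi|^2$ with standard deviation $s$ has spread $s^2$.

```python
import numpy as np

def spread(phi, A):
    """sigma_A^2(phi) = <phi|A^2|phi> - <phi|A|phi>^2 for a normalized vector phi."""
    A_phi = A @ phi
    mean = np.vdot(phi, A_phi).real
    mean_sq = np.vdot(A_phi, A_phi).real      # equals <phi|A^2|phi> for Hermitian A
    return mean_sq - mean**2

def total_spread(Phi, A):
    """Spread of a set of wavefunctions: sum of the spreads of the columns of Phi."""
    return sum(spread(Phi[:, i], A) for i in range(Phi.shape[1]))

x = np.linspace(-20.0, 20.0, 801)
X_op = np.diag(x)                             # position operator on the grid
s, x0 = 1.5, 2.0
phi = np.exp(-((x - x0) ** 2) / (4 * s**2))   # |phi|^2 is a Gaussian of variance s^2
phi /= np.linalg.norm(phi)
sigma2 = spread(phi, X_op)                    # close to s**2 = 2.25
```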

  26. Spread Functionals
      • The spread is not invariant under orthogonal transformations
        $$\psi_i = \sum_j x_{ij}\,\phi_j, \qquad X \in \mathbb{R}^{n\times n} \text{ orthogonal}$$
        $$\sigma_{\hat{A}}^2(\{\psi_i\}) \neq \sigma_{\hat{A}}^2(\{\phi_i\})$$
      • There exists a matrix $X$ that minimizes the spread

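  27. Spread Functionals
      • Let $A, B \in \mathbb{R}^{n\times n}$ with $a_{ij} = \langle i|\hat{A}|j\rangle$ and $b_{ij} = \langle i|\hat{A}^2|j\rangle$. Then
        $$\sigma_{\hat{A}}^2(\{\psi_i\}) = \mathrm{tr}\left(X^T B X\right) - \sum_{i=1}^{n} \left(X^T A X\right)_{ii}^2$$
      • Minimize the spread = maximize $\sum_{i=1}^{n} \left(X^T A X\right)_{ii}^2$ = diagonalize $A$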
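The matrix formula and the claim that diagonalizing $A$ minimizes the spread can both be verified numerically. The sketch assumes a position operator on a small grid, a random orthonormal set `Phi`, and the convention that the rotated functions are the columns of `Phi @ X`; the trace term does not depend on $X$, so only the sum of squared diagonal entries matters.

```python
import numpy as np

rng = np.random.default_rng(2)
M, n = 50, 4
x = np.linspace(0.0, 1.0, M)
X_op = np.diag(x)                                   # operator A-hat on the grid (assumed)
Phi, _ = np.linalg.qr(rng.standard_normal((M, n)))  # orthonormal wavefunctions (columns)
A = Phi.T @ X_op @ Phi                              # a_ij = <i|A|j>
B = Phi.T @ X_op @ X_op @ Phi                       # b_ij = <i|A^2|j>

def spread_from_matrices(X, A, B):
    """tr(X^T B X) - sum_i (X^T A X)_ii^2."""
    XtAX = X.T @ A @ X
    return np.trace(X.T @ B @ X) - np.sum(np.diag(XtAX) ** 2)

def spread_direct(Psi, op):
    """Sum over columns psi of <psi|op^2|psi> - <psi|op|psi>^2."""
    means = np.einsum("ji,jk,ki->i", Psi, op, Psi)
    means_sq = np.einsum("ji,jk,ki->i", Psi, op @ op, Psi)
    return np.sum(means_sq - means**2)

X_rand, _ = np.linalg.qr(rng.standard_normal((n, n)))   # arbitrary orthogonal X
_, X_opt = np.linalg.eigh(A)                            # X that diagonalizes A
s_rand = spread_from_matrices(X_rand, A, B)
s_opt = spread_from_matrices(X_opt, A, B)
```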

  28. Spread Functionals
      • Case of multiple operators
        – operators $\hat{A}^{(k)}$, $k = 1,\dots,m$
        – matrices $A^{(k)}$, $k = 1,\dots,m$
        $$\sigma_{\hat{A}}^2(\{\psi_i\}) = \sum_i \sum_k \sigma_{\hat{A}^{(k)}}^2(\psi_i)$$
      • Minimize the spread = maximize $\sum_k \sum_{i=1}^{n} \left(X^T A^{(k)} X\right)_{ii}^2$ = joint approximate diagonalization of the matrices $A^{(k)}$
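The slide does not say which algorithm performs the joint approximate diagonalization. As one possible sketch, the code below uses a Jacobi-type sweep of plane rotations for real symmetric matrices, in the spirit of the Cardoso-Souloumiac method; the function name `joint_diag` and the stopping rule are assumptions. Each rotation angle is the exact minimizer of the summed squared $(p,q)$ off-diagonal entries.

```python
import numpy as np

def joint_diag(mats, tol=1e-12, max_sweeps=100):
    """Jacobi sweeps that increase sum_k sum_i (X^T A_k X)_ii^2 (real symmetric A_k)."""
    mats = [np.array(A, dtype=float) for A in mats]
    n = mats[0].shape[0]
    X = np.eye(n)
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # One row (a_pp - a_qq, 2 a_pq) per matrix.
                h = np.array([[A[p, p] - A[q, q], 2.0 * A[p, q]] for A in mats])
                G = h.T @ h
                theta = 0.25 * np.arctan2(2.0 * G[0, 1], G[0, 0] - G[1, 1])
                c, s = np.cos(theta), np.sin(theta)
                if abs(s) < tol:
                    continue
                rotated = True
                R = np.eye(n)                      # plane rotation in the (p, q) plane
                R[p, p] = R[q, q] = c
                R[p, q], R[q, p] = -s, s
                mats = [R.T @ A @ R for A in mats]
                X = X @ R
        if not rotated:
            break
    return X, mats

rng = np.random.default_rng(6)
n = 4
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
originals = [Q @ np.diag(rng.standard_normal(n)) @ Q.T for _ in range(3)]  # commuting set
X, rotated_mats = joint_diag(originals)
```

For this commuting test set an exact joint diagonalizer exists and the sweeps find it; for matrices that do not commute the same sweeps return an approximate diagonalizer.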

  29. Spread Functionals
      • Example of multiple operators
        $$\hat{A}^{(1)} = \hat{X}, \qquad \hat{X}\varphi(x,y,z) \equiv x\,\varphi(x,y,z)$$
        $$\hat{A}^{(2)} = \hat{Y}, \qquad \hat{Y}\varphi(x,y,z) \equiv y\,\varphi(x,y,z)$$
        $$\hat{A}^{(3)} = \hat{Z}, \qquad \hat{Z}\varphi(x,y,z) \equiv z\,\varphi(x,y,z)$$
      • The matrices $A^{(k)}$ do not necessarily commute, even if the operators $\hat{A}^{(k)}$ do commute
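The last point can be seen in a small numerical experiment. The sketch assumes a two-dimensional grid and a random three-dimensional subspace `Phi`; the position operators are diagonal on the grid and commute, but their projections onto the subspace generally do not.

```python
import numpy as np

rng = np.random.default_rng(3)
side = 6
coords = np.arange(side, dtype=float)
gx, gy = np.meshgrid(coords, coords, indexing="ij")
X_op = np.diag(gx.ravel())                    # x operator: diagonal on the grid
Y_op = np.diag(gy.ravel())                    # y operator: diagonal on the grid

Phi, _ = np.linalg.qr(rng.standard_normal((side * side, 3)))   # random subspace
A1 = Phi.T @ X_op @ Phi                       # projected matrices A^(1), A^(2)
A2 = Phi.T @ Y_op @ Phi

def commutator(P, Q):
    return P @ Q - Q @ P

full_norm = np.linalg.norm(commutator(X_op, Y_op))       # zero: the operators commute
projected_norm = np.linalg.norm(commutator(A1, A2))      # nonzero in general
```

This is why the diagonalization on the previous slide can only be approximate: no single $X$ diagonalizes matrices that fail to commute.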
