Scalable Algorithms for Electronic Structure Calculations on Petascale Computers
François Gygi
University of California, Davis
fgygi@ucdavis.edu
http://eslab.ucdavis.edu
Supported by NSF ITR-HECURA-0749217 and DOE-SciDAC
RANMEP2008 Workshop, NCTS, Taiwan, Jan 6, 2008

Outline
• First-Principles simulations
• Eigenvalue problems in electronic structure calculations
• Localized representations of solutions and simultaneous diagonalization problem
• Data compression through simultaneous diagonalization

First-Principles Simulations
• Goal: simulate molecules, solids, and liquids from first principles, without input from experiments
• The approach: molecular dynamics, an atomic-scale simulation method
– Compute the trajectories of all atoms
– Extract statistical information from the trajectories
• Atoms move according to Newton's law: $m_i \ddot{R}_i = F_i$
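Newton's law above is typically integrated with a symplectic scheme. The following is a minimal sketch using velocity Verlet; the integrator choice, the function names, and the harmonic force are assumptions made for illustration (in a first-principles simulation the forces would come from the electronic structure).

```python
import numpy as np

def velocity_verlet(R, V, masses, force, dt, n_steps):
    """Integrate m_i * d2R_i/dt2 = F_i with the velocity Verlet scheme."""
    R = R.copy()
    V = V.copy()
    F = force(R)
    trajectory = [R.copy()]
    for _ in range(n_steps):
        V += 0.5 * dt * F / masses[:, None]   # half-step velocity update
        R += dt * V                           # full-step position update
        F = force(R)                          # forces at the new positions
        V += 0.5 * dt * F / masses[:, None]   # second half-step velocity update
        trajectory.append(R.copy())
    return np.array(trajectory), V

def harmonic_force(R, k=1.0):
    # Hypothetical stand-in for first-principles forces: F = -k R
    return -k * R

masses = np.ones(2)
R0 = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
V0 = np.zeros_like(R0)
traj, V_final = velocity_verlet(R0, V0, masses, harmonic_force, dt=0.01, n_steps=1000)

# Total energy of the harmonic toy model (kinetic + potential)
energy = 0.5 * np.sum(masses[:, None] * V_final**2) + 0.5 * np.sum(traj[-1]**2)
```

The total energy of the toy system stays close to its initial value of 2.5, which is the kind of statistical quantity extracted from the trajectories.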

First-Principles Simulations
• Why "First-Principles"?
– Avoid empirical models and adjustable parameters
• Goal: applications to extreme conditions (high pressure, etc.) where no experimental data are available
– Use fundamental principles: quantum mechanics
– Must describe ions and electrons consistently and simultaneously
• At each time step:
1) Compute the electronic structure
2) Derive interatomic forces
3) Move atoms

First-Principles Simulations
• Applications
– Chemistry
– Nanotechnology
– Semiconductors
– Biochemistry
– High-pressure physics
[Figures: growth of a carbon nanotube on an iron catalyst; biotin on silicon carbide; ice-water interface; silicon quantum dot]

First-Principles Simulations
• The computation of the electronic structure is the most expensive part of the simulation: >99% of CPU time
• At each time step:
1) Compute the electronic structure
2) Derive interatomic forces
3) Move atoms

First-principles simulations require large computing resources
• Cost of one time step scales as $O(n^3)$
– $n$: number of electrons
• Many time steps required / long simulations
• Requires use of large-scale parallel platforms
– target: $O(10^4)$ to $O(10^5)$ CPUs
• Focus on scalable algorithms
– communication cost is primary concern

Using large computers: BlueGene/L
• 65,536 nodes, 128k CPUs
• 3D torus network
• 512 MB/node
• 367 TFlop peak
[Figure: BlueGene/L packaging hierarchy]
– Chip (2 processors): 2.8/5.6 GF/s, 4 MB
– Compute Card (2 chips, 2x1x1): 5.6/11.2 GF/s, 0.5 GB DDR
– Node Board (32 chips, 4x4x2; 16 Compute Cards): 90/180 GF/s, 8 GB DDR
– Cabinet (32 Node boards, 8x8x16): 2.9/5.7 TF/s, 256 GB DDR
– System (64 cabinets, 64x32x32): 180/360 TF/s, 16 TB DDR

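Computing the electronic structure
• Kohn-Sham equations
– solutions $\varphi_i \in L^2(\mathbb{R}^3)$ represent electronic wavefunctions (one per electron)
$$H\varphi_i = -\Delta\varphi_i + V(\rho,\mathbf{r})\,\varphi_i = \varepsilon_i\varphi_i, \qquad i = 1 \ldots n$$
$$V(\rho,\mathbf{r}) = V_{\rm ion}(\mathbf{r}) + \int \frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}' + V_{\rm XC}(\rho(\mathbf{r}), \nabla\rho(\mathbf{r}))$$
$$\rho(\mathbf{r}) = \sum_{i=1}^{n} |\varphi_i(\mathbf{r})|^2$$
$$\int \varphi_i^*(\mathbf{r})\,\varphi_j(\mathbf{r})\,d\mathbf{r} = \delta_{ij}$$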

Computing the electronic structure
• Solutions are represented as Fourier series
$$\varphi_j(\mathbf{r}) = \sum_{|\mathbf{q}|^2 < E_{\rm cut}} c_{\mathbf{q},j}\, e^{i\mathbf{q}\cdot\mathbf{r}}$$
• A set of solutions is represented by an (orthogonal) ($m \times n$) matrix of complex Fourier coefficients
$$Y_{ij} = c_{\mathbf{q}_i,j}$$
• Dimensions of $Y$: $10^6 \times 10^4$
• Note: typically $m/n \sim 100$
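To put the quoted dimensions of Y in perspective, here is a small back-of-the-envelope calculation. It assumes double-precision complex coefficients (16 bytes each) and an even distribution over the 65,536 nodes of BlueGene/L; both are assumptions, not statements from the slides.

```python
m, n = 10**6, 10**4            # rows (Fourier coefficients) and columns (wavefunctions)
bytes_per_coeff = 16           # assumption: complex double precision
total_bytes = m * n * bytes_per_coeff

nodes = 65536                  # BlueGene/L node count quoted in the slides
per_node_mb = total_bytes / nodes / 1e6

print(total_bytes / 1e9, "GB")  # 160.0 GB for one copy of Y
print(round(per_node_mb, 2), "MB per node")
```

One copy of Y is far too large for a single node but occupies only a few megabytes per node when distributed, which is one reason communication rather than storage is the primary concern.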

Computing the electronic structure
• The energy is invariant under unitary transformations of $Y$
$$E(Y) = \mathrm{tr}\left(Y^T H Y\right) + F[\rho]$$
$$\rho(\mathbf{r}) = \sum_j |\varphi_j(\mathbf{r})|^2$$
$$E(YQ) = E(Y), \qquad Q \text{ unitary}$$
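The invariance can be checked numerically. The sketch below uses a random real symmetric matrix as a stand-in for H and real orthogonal Q (assumptions for illustration); it verifies that both the trace term and the density, taken as the diagonal of $YY^T$, are unchanged when Y is replaced by YQ.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 5

H = rng.standard_normal((m, m))
H = 0.5 * (H + H.T)                                  # symmetric stand-in for the Hamiltonian
Y, _ = np.linalg.qr(rng.standard_normal((m, n)))     # m x n matrix with orthonormal columns
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))     # random orthogonal n x n matrix

def band_energy(Y):
    # tr(Y^T H Y)
    return np.trace(Y.T @ H @ Y)

def density(Y):
    # rho_i = (Y Y^T)_ii, computed without forming the m x m matrix
    return np.sum(Y**2, axis=1)

e_original = band_energy(Y)
e_rotated = band_energy(Y @ Q)
rho_original = density(Y)
rho_rotated = density(Y @ Q)
```

Since only the subspace spanned by the columns of Y matters, an eigensolver is free to return any rotated basis, which is why subspace alignment becomes necessary later in the talk.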

Electronic structure calculation (with fixed potential)
• Invariant subspace computation
Find $Y$ such that:
$$HY = Y\Lambda, \qquad H \in \mathbb{C}^{m \times m},\; Y \in \mathbb{C}^{m \times n},\; \Lambda \in \mathbb{C}^{n \times n}$$
– $H$ is sparse
– Cost of computing $Hx$: $O(m \log m)$ (involves Fast Fourier Transforms)
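The $O(m \log m)$ cost comes from applying the kinetic term in Fourier space and the potential in real space, with FFTs in between. The following is a minimal one-dimensional sketch of that idea; the periodic 1D grid, the function name `apply_hamiltonian`, and the constant test potential are assumptions for illustration, not the production 3D implementation.

```python
import numpy as np

def apply_hamiltonian(c, V_real, L):
    """Apply H = -d^2/dx^2 + V(x) to Fourier coefficients c on a periodic 1D cell of length L."""
    m = c.shape[0]
    q = 2.0 * np.pi * np.fft.fftfreq(m, d=L / m)   # wave numbers of the plane waves
    kinetic = (q**2) * c                            # -Laplacian is diagonal in Fourier space
    psi = np.fft.ifft(c)                            # transform to the real-space grid
    potential = np.fft.fft(V_real * psi)            # V is diagonal in real space; transform back
    return kinetic + potential

# Usage: a single plane wave with q = 3 in a constant potential V = 0.5
m, L = 16, 2.0 * np.pi
c = np.zeros(m, dtype=complex)
c[3] = 1.0
V = np.full(m, 0.5)
Hc = apply_hamiltonian(c, V, L)    # equals (3**2 + 0.5) * c for this input
```

No m x m matrix is ever formed: each application costs two FFTs plus two element-wise products.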

Electronic structure calculation (with fixed potential)
• Iterative methods for invariant subspace computations
– Variants of Jacobi-Davidson
– DIIS (a.k.a. Anderson acceleration)
• Simple, diagonal preconditioning works well
• Robustness of eigensolvers is key

Preconditioned steepest descent
1) correction: $Y := Y + \beta K \left(I - YY^T\right) HY$
2) orthogonalization
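The two steps can be sketched as follows. This is an illustration under stated assumptions: the slide does not fix the sign or size of β, so β < 0 is assumed here so that the step lowers $\mathrm{tr}(Y^T H Y)$; K is stored as a diagonal and set to the identity as a placeholder; QR is used as one possible orthogonalization; and the small test matrix H is made up.

```python
import numpy as np

def psd_step(Y, H, K_diag, beta):
    """One iteration: Y := Y + beta * K (I - Y Y^T) H Y, followed by orthogonalization."""
    HY = H @ Y
    residual = HY - Y @ (Y.T @ HY)                 # (I - Y Y^T) H Y
    Y = Y + beta * (K_diag[:, None] * residual)    # diagonal preconditioner applied row-wise
    Q, _ = np.linalg.qr(Y)                         # orthogonalization (QR is one choice)
    return Q

rng = np.random.default_rng(1)
m, n = 20, 3
P = rng.standard_normal((m, m))
H = np.diag(np.arange(m, dtype=float)) + 0.01 * (P + P.T)   # made-up symmetric test matrix

K_diag = np.ones(m)      # placeholder: identity preconditioner
beta = -0.05             # assumed step, roughly -1 / (largest eigenvalue of H)

Y, _ = np.linalg.qr(rng.standard_normal((m, n)))
for _ in range(2000):
    Y = psd_step(Y, H, K_diag, beta)

ritz = np.linalg.eigvalsh(Y.T @ H @ Y)     # Ritz values of the converged subspace
exact = np.linalg.eigvalsh(H)[:n]          # reference: n lowest eigenvalues
```

With the identity preconditioner the iteration needs many steps; a good diagonal K is what makes the method practical.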

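Preconditioned DIIS
1) descent direction:
$$\Delta_k = K \left(I - Y_k Y_k^T\right) H Y_k$$
2) update:
$$\theta = \frac{\mathrm{tr}\left(\Delta_k^T (\Delta_k - \Delta_{k-1})\right)}{\|\Delta_k - \Delta_{k-1}\|_F^2}$$
$$\bar{Y}_k = Y_k + \theta\,(Y_{k-1} - Y_k)$$
$$\bar{\Delta}_k = \Delta_k + \theta\,(\Delta_{k-1} - \Delta_k)$$
$$Y_{k+1} = \bar{Y}_k + \beta\,\bar{\Delta}_k$$
3) orthogonalization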

Self-consistent electronic structure computation
• $H$ depends non-linearly on the solution $Y$ (through $\rho$)
• Fixed point iteration:
repeat {
1) compute charge density $\rho_i = (YY^T)_{ii}$
2) solve $H(\rho)\,Y = Y\Lambda$
} until converged (i.e. $\rho$ does not change)
• Convergence can be accelerated using various charge-mixing schemes (e.g. Broyden)
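The fixed-point loop can be illustrated on a toy model. Everything specific below is an assumption for the sake of a runnable sketch: the model Hamiltonian $H(\rho) = H_0 + \alpha\,\mathrm{diag}(\rho)$, the dense eigensolver, and simple linear mixing (used here in place of a Broyden scheme).

```python
import numpy as np

def scf(H0, n, alpha=0.2, mix=0.5, tol=1e-10, max_iter=200):
    """Toy self-consistent loop for H(rho) = H0 + alpha * diag(rho)."""
    m = H0.shape[0]
    rho = np.full(m, n / m)                      # uniform initial density
    for it in range(max_iter):
        H = H0 + alpha * np.diag(rho)            # Hamiltonian for the current density
        _, vecs = np.linalg.eigh(H)              # solve H(rho) Y = Y Lambda
        Y = vecs[:, :n]                          # keep the n lowest states
        rho_new = np.sum(Y**2, axis=1)           # rho_i = (Y Y^T)_ii
        if np.linalg.norm(rho_new - rho) < tol:  # converged: rho does not change
            return Y, rho_new, it
        rho = (1.0 - mix) * rho + mix * rho_new  # simple linear charge mixing
    raise RuntimeError("SCF did not converge")

# Made-up H0: a chain with increasing on-site energies and nearest-neighbour coupling
m, n = 10, 3
H0 = np.diag(np.arange(m, dtype=float)) - 0.5 * (np.eye(m, k=1) + np.eye(m, k=-1))
Y, rho, iterations = scf(H0, n)
```

In a production code step 2 would be an iterative subspace solver such as the ones on the preceding slides, warm-started from the previous Y.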

Molecular Dynamics: solve the SCF problem at each time step
• $H$ is time-dependent (depends on positions of atoms)
for each time step $t$ {
repeat {
1) compute charge density $\rho_i = (YY^T)_{ii}$
2) solve $H(\rho, t)\,Y(t) = Y(t)\,\Lambda(t)$
} until converged
compute forces, move atoms
}

Molecular Dynamics: using previous solutions optimally
• Computing $Y(t)$
– The previous solution $Y(t-dt)$ is "close" to $Y(t)$ and can be used as initial guess for the iterative calculation of $Y(t)$
[Figure: successive subspaces $Y_{k-1}$, $Y_k$, $Y_{k+1}$]

Molecular Dynamics: using previous solutions optimally
• Computing $Y(t)$
– The previous solution $Y(t-dt)$ is "close" to $Y(t)$ and can be used as initial guess for the iterative calculation of $Y(t)$
– The extrapolated subspace $\tilde{Y} = 2Y_k - Y_{k-1}$ is a better initial guess
[Figure: successive subspaces $Y_{k-1}$, $Y_k$, $Y_{k+1}$ and the extrapolated guess $\tilde{Y} = 2Y_k - Y_{k-1}$]

Molecular Dynamics: using previous solutions optimally
• Subspace alignment
– The eigensolver introduces arbitrary rotations in $Y(t)$
– Extrapolation must be preceded by subspace alignment
– Orthogonal Procrustes problem:
$$\min_{Q} \|Y_k - Y_{k-1} Q\|_F, \qquad Q^T Q = I$$
$$\tilde{Y} = 2Y_k - Y_{k-1} Q$$
[Figure: successive subspaces $Y_{k-1}$, $Y_k$, $Y_{k+1}$]

Subspace alignment
• Orthogonal Procrustes problem: minimize $\|Y_k - Y_{k-1} Q\|_F$
1) Compute the polar decomposition $A \equiv Y_{k-1}^T Y_k = UH$, where $U$ is unitary, $H$ hermitian.
2) Rotation of $Y_{k-1}$: $Y_{k-1} := Y_{k-1} U$
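The alignment and the subsequent extrapolation can be sketched as below for real matrices. The polar factor is obtained here from an SVD, which is a convenient substitute for illustration and not the scalable algorithm the talk is after; the function names and the final re-orthogonalization of the guess are assumptions.

```python
import numpy as np

def align(Y_prev, Y_curr):
    """Rotate Y_prev by the orthogonal polar factor U of A = Y_prev^T Y_curr."""
    A = Y_prev.T @ Y_curr
    W, _, Vt = np.linalg.svd(A)
    U = W @ Vt                        # A = (W Vt)(V S Vt) = U H, so U = W Vt
    return Y_prev @ U

def extrapolate(Y_prev, Y_curr):
    """Initial guess 2 Y_k - Y_{k-1} Q, with Q from the Procrustes alignment."""
    Y_guess = 2.0 * Y_curr - align(Y_prev, Y_curr)
    Q, _ = np.linalg.qr(Y_guess)      # re-orthogonalize the guess (assumption)
    return Q

# Usage: Y_prev spans the same subspace as Y_curr but in an arbitrarily rotated basis
rng = np.random.default_rng(2)
m, n = 40, 4
Y_curr, _ = np.linalg.qr(rng.standard_normal((m, n)))
rotation, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y_prev = Y_curr @ rotation

Y_aligned = align(Y_prev, Y_curr)     # alignment undoes the arbitrary rotation
Y_next_guess = extrapolate(Y_prev, Y_curr)
```

Without the alignment step, the difference $2Y_k - Y_{k-1}$ would be dominated by the arbitrary rotation rather than by the physical change of the subspace.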

Polar decomposition
• Polar decomposition $A = UH$ (Higham '86)
$$X_0 = A$$
$$X_{k+1} = \frac{1}{2}\left(X_k + X_k^{-*}\right)$$
• The iteration converges quadratically to the unitary polar factor $U$
• Need better, inverse-free, scalable algorithm
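A direct transcription of this Newton iteration is sketched below, assuming a square, nonsingular A. The stopping criterion, the function name, and the test matrix are assumptions; the explicit inverse in every step is exactly the operation that the slide identifies as an obstacle to scalability.

```python
import numpy as np

def polar_newton(A, tol=1e-12, max_iter=50):
    """Newton iteration X_{k+1} = (X_k + X_k^{-*}) / 2 for the unitary polar factor of A."""
    X = A.astype(complex)
    for _ in range(max_iter):
        X_next = 0.5 * (X + np.linalg.inv(X.conj().T))   # X^{-*} is the inverse of X^*
        if np.linalg.norm(X_next - X) <= tol * np.linalg.norm(X_next):
            return X_next
        X = X_next
    return X

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)) + 3.0 * np.eye(4)   # made-up, well-conditioned test matrix
U = polar_newton(A)
H = U.conj().T @ A                                  # Hermitian factor, A = U H

# Reference polar factor from the SVD, for comparison
W, _, Vt = np.linalg.svd(A)
U_ref = W @ Vt
```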

Outline
• First-Principles simulations
• Eigenvalue problems in electronic structure calculations
• Localized representations of solutions and simultaneous diagonalization problem
• Data compression through simultaneous diagonalization

Localized representations of the invariant subspace
• Linear combinations of electronic wavefunctions that minimize the spatial spread are called "Maximally Localized Wannier Functions" (MLWF)
$$\sigma_{\hat{X}}^2 = \langle x^2 \rangle - \langle x \rangle^2$$
• MLWFs are used to compute the electronic polarization in crystals
• Computing MLWFs during a molecular dynamics simulation yields the infrared absorption spectrum
N. Marzari and D. Vanderbilt, Phys. Rev. B 56, 12847 (1997)
R. Resta, Phys. Rev. Lett. 80, 1800 (1998)

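Spread Functionals
• Spread of a wavefunction associated with an operator $\hat{A}$
$$\sigma_{\hat{A}}^2(\phi) = \langle \phi | \left(\hat{A} - \langle \phi | \hat{A} | \phi \rangle\right)^2 | \phi \rangle = \langle \phi | \hat{A}^2 | \phi \rangle - \langle \phi | \hat{A} | \phi \rangle^2$$
• Spread of a set of wavefunctions associated with an operator $\hat{A}$
$$\sigma_{\hat{A}}^2(\{\phi_i\}) = \sum_i \sigma_{\hat{A}}^2(\phi_i)$$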

Spread Functionals
• The spread is not invariant under orthogonal transformations
$$\psi_i = \sum_j x_{ij}\,\phi_j, \qquad X \in \mathbb{R}^{n \times n} \text{ orthogonal}$$
$$\sigma_{\hat{A}}^2(\{\psi_i\}) \neq \sigma_{\hat{A}}^2(\{\phi_i\})$$
• There exists a matrix $X$ that minimizes the spread

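Spread Functionals
• Let $A, B \in \mathbb{R}^{n \times n}$, with $a_{ij} = \langle i | \hat{A} | j \rangle$ and $b_{ij} = \langle i | \hat{A}^2 | j \rangle$
$$\sigma_{\hat{A}}^2(\{\psi_i\}) = \mathrm{tr}\left(X^T B X\right) - \sum_{i=1}^{n} \left(X^T A X\right)_{ii}^2$$
• Minimize the spread = maximize $\sum_{i=1}^{n} \left(X^T A X\right)_{ii}^2$ = diagonalize $A$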
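The equivalence follows because $\mathrm{tr}(X^T B X) = \mathrm{tr}(B)$ for any orthogonal X, while $\sum_i (X^T A X)_{ii}^2 \le \|A\|_F^2$ with equality exactly when $X^T A X$ is diagonal. A small numerical check, using made-up symmetric matrices A and B as stand-ins for the operator matrices:

```python
import numpy as np

def spread(X, A, B):
    """sigma^2 = tr(X^T B X) - sum_i (X^T A X)_ii^2"""
    return np.trace(X.T @ B @ X) - np.sum(np.diag(X.T @ A @ X) ** 2)

rng = np.random.default_rng(5)
n = 6
A = rng.standard_normal((n, n))
A = 0.5 * (A + A.T)                 # made-up symmetric matrix of the operator
B = A @ A + np.eye(n)               # made-up symmetric stand-in for the matrix of A^2

_, X_opt = np.linalg.eigh(A)        # eigenvectors of A diagonalize X^T A X
X_rand, _ = np.linalg.qr(rng.standard_normal((n, n)))

s_opt = spread(X_opt, A, B)         # spread in the eigenbasis of A
s_rand = spread(X_rand, A, B)       # spread for a random orthogonal X
s_id = spread(np.eye(n), A, B)      # spread of the original functions
lower_bound = np.trace(B) - np.sum(A ** 2)
```

The eigenbasis attains the lower bound `tr(B) - ||A||_F^2`, and any other orthogonal X gives a spread at least as large.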

Spread Functionals
• Case of multiple operators
– operators $\hat{A}^{(k)}$, $k = 1, \ldots, m$
– matrices $A^{(k)}$, $k = 1, \ldots, m$
$$\sigma_{\hat{A}}^2(\{\psi_i\}) = \sum_i \sum_k \sigma_{\hat{A}^{(k)}}^2(\psi_i)$$
• Minimize the spread = maximize $\sum_k \sum_{i=1}^{n} \left(X^T A^{(k)} X\right)_{ii}^2$ = joint approximate diagonalization of the matrices $A^{(k)}$
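One common way to attack this maximization is a Jacobi-type sweep of plane rotations, where each rotation angle is chosen to maximize the sum of squared diagonal entries over all matrices at once. The sketch below follows that idea for real symmetric matrices; it is an illustrative serial implementation under these assumptions, not necessarily the algorithm used in the talk, and the function name and test matrices are made up.

```python
import numpy as np

def joint_diag(mats, tol=1e-12, max_sweeps=100):
    """Jacobi-style joint approximate diagonalization of real symmetric matrices."""
    A = np.array(mats, dtype=float)            # stacked copy, shape (K, n, n)
    n = A.shape[1]
    X = np.eye(n)
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # a'_pp - a'_qq = cos(2t) * (a_pp - a_qq) + sin(2t) * 2 a_pq for each matrix
                h = np.stack([A[:, p, p] - A[:, q, q], 2.0 * A[:, p, q]], axis=1)
                G = h.T @ h                    # 2 x 2 matrix summed over all matrices
                if G[1, 1] == 0.0:
                    continue                   # pair (p, q) is already diagonal everywhere
                _, vecs = np.linalg.eigh(G)
                x, y = vecs[:, -1]             # (cos 2t, sin 2t): top eigenvector of G
                if x < 0.0:
                    x, y = -x, -y              # keep the angle in [-pi/4, pi/4]
                c = np.sqrt(0.5 * (1.0 + x))
                s = y / (2.0 * c)
                if abs(s) < tol:
                    continue
                R = np.eye(n)
                R[p, p] = R[q, q] = c
                R[p, q], R[q, p] = -s, s
                A = R.T @ A @ R                # rotate every matrix in the stack
                X = X @ R
                rotated = True
        if not rotated:
            break
    return X, A

# Usage: three matrices that share an exact common eigenbasis
rng = np.random.default_rng(3)
n = 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
mats = [Q @ np.diag(rng.standard_normal(n)) @ Q.T for _ in range(3)]
X, A_rot = joint_diag(mats)
off = sum(np.sum((M - np.diag(np.diag(M))) ** 2) for M in A_rot)
```

When the matrices do not commute, no X diagonalizes all of them and the sweeps stop at a maximizer of the diagonal weight, which is the "approximate" in joint approximate diagonalization.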

Spread Functionals
• Example of multiple operators
$$\hat{A}^{(1)} = \hat{X}, \qquad (\hat{X}\varphi)(x,y,z) \equiv x\,\varphi(x,y,z)$$
$$\hat{A}^{(2)} = \hat{Y}, \qquad (\hat{Y}\varphi)(x,y,z) \equiv y\,\varphi(x,y,z)$$
$$\hat{A}^{(3)} = \hat{Z}, \qquad (\hat{Z}\varphi)(x,y,z) \equiv z\,\varphi(x,y,z)$$
• The matrices $A^{(k)}$ do not necessarily commute, even if the operators $\hat{A}^{(k)}$ do commute
