multi length scale matrix computations and applications
play

Multi-Length Scale Matrix Computations and Applications in Quantum - PowerPoint PPT Presentation

Multi-Length Scale Matrix Computations and Applications in Quantum Mechanical Simulations Zhaojun Bai http://www.cs.ucdavis.edu/ bai joint work with Wenbin Chen, Roger Lee, Richard Scalettar, Ichitaro Yamazaki Workshop on Future Directions


  1. Multi-Length Scale Matrix Computations and Applications in Quantum Mechanical Simulations Zhaojun Bai http://www.cs.ucdavis.edu/ ∼ bai joint work with Wenbin Chen, Roger Lee, Richard Scalettar, Ichitaro Yamazaki Workshop on Future Directions in Tensor-Based Computation and Modeling NSF, Arlington, Feb. 20-21, 2009

  2. Computational Material Science Simulation and understanding properties of solid-state materials: magnetism, metal-insulator transition, high temperature superconductivity, ... Conductance map of an electronic crystal state ... at the atomic scale. [T. Hanaguri et al , Nature 430, 1001 (2004)]

  3. Outline of this talk 1. Hubbard model and quantum Monte Carlo simulation 2. QUEST: QUantum Electron Simulation Toolbox 3. Multi-length scale matrix computations – “tensor-based?” • Multi-length scale matrix analysis • Communication-avoiding stable matrix inversion • Self-adapting direct linear solvers • Robust preconditioned iterative solvers 4. Concluding remarks Supported by NSF and DOE SciDAC

  4. Many body simulation on multi-layer lattice Hubbard model and quantum Monte Carlo simulation

  5. Hubbard Hamiltonian – 4 N H = H K + H µ + H V � � � ( n i ↑ − 1 2)( n i ↓ − 1 ( c † iσ c jσ + c † = − t jσ c iσ ) + µ ( n i ↑ + n i ↓ ) + U 2) i i � i,j � ,σ • � i, j � : a pair of nearest-neighbor spatial sites • σ = ↑ , ↓ : spin direction of electrons • c † iσ ( c iσ ) creates (destroys) an electron of spin σ on site i • t : hopping parameter ⇐ kinetic energy • U : local repulsion between electrons ⇐ potential energy • µ : controls the electron density ⇐ chemical potential energy Related quantum many body models: • Ising model for phase transition • Anderson model of localization for electron transport Hubbard model and quantum Monte Carlo simulation

  6. Physical observable E The expected value of a physical observable E , such as density-density correlation, spin-spin correlation, the magnetic susceptibility , is given by �E� = Tr ( EP ) , where P is the probability (Boltzmann) distribution P = 1 Z e − β H and β ∝ 1 1 T = Temperature Z = Tr ( e − β H ) = the partition function Hubbard model and quantum Monte Carlo simulation

  7. Computational approximations of Boltzmann distribution P ⎧ 1 ⎪ P ( h ) = Z h det [ M + ( h )] det [ M − ( h )] ⎨ P = 1 “path integral” Z e − β H − → ⎪ ⎩ 1 Z H e − H ( x,p, Φ σ ) P ( x, p, Φ σ ) = [Feynman’65, ....] • Determinant QMC h ∼ P ( h ) • Hybird QMC (=molecular dynamics + mc) ( x ; p, Φ σ ) ∼ P ( x ; p, Φ σ ) [Blankenbecler/Scalapino/Sugar’81, Hirsch’85, ....] [Scalettar/Scalapino/Sugar/Toussaint’87, ....] Hubbard model and quantum Monte Carlo simulation

  8. QUEST: QUantum Electron Simulation Toolbox • Fortran 90 package for determinant (and hybrid) Monte Carlo simulations • Integration and revision of several “legacy” codes developed in the past two decades • Modulized structures and configurations variations of Hamiltonian different lattice geometry and multi-layer many physical measurements • Stable and efficient multi-length scale matrix computation kernels • Partially parallelized (MPI, OpenMP) QUEST

  9. QUEST ⇒ “Right” Physics and “Record-breaking” lattice sizes 0.1 0.1 0.05 0.05 0 0 c(lx,ly) c(lx,ly) -0.05 -0.05 (10,10) -0.1 (L/2, L/2) 24 x 24 -0.1 20 x 20 -0.15 16 x 16 β = 32 12 x 12 β = 20 (0,0) (10,0) 8 x 8 β = 12 (0,0) (L/2,0) -0.15 -0.2 (0,0) (10,0) (10,10) (0,0) (0,0) (L/2,0) (L/2, L/2) (0,0) magnetism forms as T is lowered large lattice sizes lead to converge

  10. Matrix kernel • Multi-length scale matrices DQMC : M σ ( h ) = I + BD L BD L − 1 · · · BD 1 HQMC : M σ ( x ) = I NL − ( I N ⊗ B ) D [ L ] ( P ⊗ I N ) • B = e t ∆ τK D ℓ = e σV ℓ ( x ℓ ) • D [ L ] = diag ( D 1 , D 2 , . . . , D L ) V ℓ ( x ℓ ) = diag ( x 1 , x 2 , . . . , x L ) • K is defined based on lattice structure: K = K x ⊕ K y for 2-D rectangle Multi-length scale matrices

  11. Multi-length scaling • Length-scales: N , L – N : spatial lattice size, – L : the number of imaginary-time slices, • Energy scales: t , U , β – t : hopping of electrons between atoms and layers (kinetic energy), – U : strength of the interactions between the electrons (potential energy), – β : inverse temperature, • Length and energy scale connection: ∆ τ = β L In more complex situations other energy scales also enter, such as the frequency of ionic vibrations (phonons) and the strength of the coupling of electrons to those vibrations Multi-length scale matrices

  12. QMC simulation kernels Matrix computation problems det [ M σ ( h ′ )] ( M T σ ( x ) M σ ( x )) − 1 Φ σ , G σ ( h ) = M − 1 det [ M σ ( h )] , σ ( h ) . . . A typical MD trajectory: 1. High order, say 4.5 4 N x × N y × L = 64 × 64 × 160 MD step: 3.5 = 655 , 360 0.01 3 ’ h 11 2. Wide range of eigenvalue dis- ’ : 2.5 11 → h h X 11 tributions λ ( M ) and condition 2 −1 400*M numbers κ ( M ) 1.5 h 1 3. Metropolis Monte-Carlo, 11 O (10 4 ) solutions required 0.5 0 −1.5 −1 −0.5 0 0.5 1 1.5 P Multi-length scale matrix computations

  13. Hubbard matrix eigenvalue distribution λ ( M ) L e i (2 ℓ +1) π 1 λ ( M ) = 1 − λ ( B L · · · B 2 B 1 ) , 0 ≤ ℓ ≤ L − 1 L [Frobenius ’12, Romanovsky ’43, Varga ’62] 2 1.5 1 0.5 0 −0.5 −1 −1.5 −2 −1 −0.5 0 0.5 1 1.5 2 2.5 3 Multi-length scale matrix analysis

  14. Hubbard matrix eigenvalue distribution λ ( M ) L e i (2 ℓ +1) π 1 λ ( M ) = 1 − λ ( B L · · · B 2 B 1 ) , 0 ≤ ℓ ≤ L − 1 L [Frobenius ’12, Romanovsky ’43, Varga ’62] 2 N=16 N=256 1.5 1 0.5 0 −0.5 −1 −1.5 −2 −1 −0.5 0 0.5 1 1.5 2 2.5 3 Multi-length scale matrix analysis

  15. Hubbard matrix eigenvalue distribution λ ( M ) L e i (2 ℓ +1) π 1 λ ( M ) = 1 − λ ( B L · · · B 2 B 1 ) , 0 ≤ ℓ ≤ L − 1 L [Frobenius ’12, Romanovsky ’43, Varga ’62] 2 p=8 p=64 1.5 1 0.5 0 −0.5 −1 −1.5 −2 −1 −0.5 0 0.5 1 1.5 2 2.5 3 Multi-length scale matrix analysis

  16. Hubbard matrix eigenvalue distribution λ ( M ) L e i (2 ℓ +1) π 1 λ ( M ) = 1 − λ ( B L · · · B 2 B 1 ) , 0 ≤ ℓ ≤ L − 1 L [Frobenius ’12, Romanovsky ’43, Varga ’62] 2.5 U=0 U=6 2 1.5 1 0.5 0 −0.5 −1 −1.5 −2 −2.5 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5 3 3.5 Multi-length scale matrix analysis

  17. Hubbard matrix eigenvalue distribution λ ( M ) L e i (2 ℓ +1) π 1 λ ( M ) = 1 − λ ( B L · · · B 2 B 1 ) , 0 ≤ ℓ ≤ L − 1 L [Frobenius ’12, Romanovsky ’43, Varga ’62] 2 p=8,t=1 p=64,t=8 1.5 1 0.5 0 −0.5 −1 −1.5 −2 −1 −0.5 0 0.5 1 1.5 2 2.5 3 Multi-length scale matrix analysis

  18. Communication-avoiding stable matrix inversion • Green’s function G = M − 1 = ( I + B L B L − 1 · · · B 1 ) − 1 • Singular values of B L B L − 1 · · · B 1 could spread out from 10 30 to 10 − 30 • A number of methods have been studied to calculate the matrix product in our community, Heath/Laub/Paige/Ward, Bojanczyk/Nagy/Plemmnons, van Dooren, Kagstrom, ... • Graded (stratified) decomposition ⎡ ⎤ x ⎢ ⎥ x ⎢ ⎥ B L B L − 1 · · · B 1 = UDT = U ⎦ T ⎣ x x using QR with pivoting, and proper ordering of multiplications [Loh et al ’92], or Jacobi rotations [Stewart’94]. • G = M − 1 = ( I + UDT ) − 1 = T − 1 ( U T T − 1 + D ) − 1 U T Communication-avoiding stable matrix inversion

  19. Communication-avoiding stable matrix inversion • QR with pivoting is not friendly to multicore computing BLAS/LAPACK in MKL on Intel Quad 2.4GHz Gflops dgemm dgeqrf dgeqp3 1-core 7.79 6.47 1.47 2-core 15.68 13.10 2.47 2-way Quad 26.39 21.68 3.22 • We developed an alternative based on a block structure orthogonal factorization (BSOF) without pivoting UDT BSOF 1-core 4.02 7.68 2-core 5.06 13.28 2-way Quad 6.22 18.85 Communication-avoiding stable matrix inversion

  20. Sel-adapting direct linear solvers • Block cyclic reduction ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ I B 1 x 1 b 1 ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ − B 2 I x 2 b 1 ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ − B 3 I x 3 = b 2 , ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎣ ⎦ ⎣ ⎦ ⎣ ⎦ − B 4 I x 4 b 3 − B 5 I x 5 b 4 ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ b 1 I B 1 x 1 ⎢ ⎥ b (2) ⎦ = ⎣ ⎦ ⎣ ⇒ − B 3 B 2 I x 3 ⎦ , ⎣ 3 b (2) − B 5 B 4 I x 5 5 • Factor of 2 k –reduction, however, limited k due to numerical instability [Buzbee/Golub/Nielson’70, ... ] Self-adapting direct linear solvers

  21. Sel-adapting direct linear solvers • Block structured orthogonal factorization (BSOF) M = ( Q 1 Q 2 · · · Q L ) R, i.e. ⎡ ⎤ ⎡ ⎤ R 1 X X I B 1 ⎢ ⎥ R 2 X X ⎢ ⎥ ⎢ ⎥ − B 2 I Q k ... ... ⎢ ⎥ ⎢ ⎥ = ⇒ X . ... ... ⎣ ⎦ ⎢ ⎥ ⎣ ⎦ R L − 1 X − B L I R L • Rich substructure of Q 1 Q 2 · · · Q L exploited for the Green’s function calculations • Parallelizable [Wright’92,...] • Stable method, but, high memory cost O ( N 2 L ) . Self-adapting direct linear solvers

Recommend


More recommend