Low-rank tensor methods for high-dimensional eigenvalue problems

Daniel Kressner
CADMOS Chair of Numerical Algorithms and HPC
MATHICSE / SMA / SB, EPF Lausanne
http://anchp.epfl.ch

Based on joint work with Ch. Tobler (MathWorks), M. Steinlechner (EPFL), A. Uschmajew (EPFL)
Low-rank tensor techniques

◮ Emerged during the last five years in numerical analysis.
◮ Successfully applied to:
  ◮ parameter-dependent / multi-dimensional integrals;
  ◮ electronic structure calculations: Hartree-Fock / DFT;
  ◮ stochastic and parametric PDEs;
  ◮ high-dimensional Boltzmann / chemical master / Fokker-Planck / Schrödinger equations;
  ◮ micromagnetism;
  ◮ rational approximation problems;
  ◮ computational homogenization;
  ◮ computational finance;
  ◮ stochastic automata networks;
  ◮ multivariate regression and machine learning;
  ◮ ...
◮ For references on these applications, see
  ◮ L. Grasedyck, DK, Ch. Tobler (2013). A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen, 36(1).
  ◮ W. Hackbusch (2012). Tensor Spaces and Numerical Tensor Calculus, Springer.
High dimensionality

Continuous problem on d-dimensional domain with d ≫ 1
  ⇓
Straightforward discretization
  ⇓
Discretized problem of order O(n^d)

Other causes of high dimensionality:
◮ parameter dependencies
◮ parametrized coefficients
◮ parametrized topology
◮ stochastic coefficients
◮ systems describing joint probability distributions
◮ ...
Dealing with high dimensionality

Established techniques:
◮ Sparse grid collocation/Galerkin methods.
◮ Adaptive (wavelet) methods.
◮ Monte Carlo method.
◮ Reduced basis method.
◮ ...

[Figure: sparse grid of multi-index blocks in the (l1, l2) plane]

Common trait: Smart discretization ⇝ system hopefully of order ≪ O(n^d) ...
... but not in this talk!

This talk: Straightforward discretization ⇝ system of order O(n^d).
Example: PDE-eigenvalue problem

Goal: Compute smallest eigenvalue for

  −∆u(ξ) + V(ξ)u(ξ) = λ u(ξ)  in Ω = [0,1]^d,
  u(ξ) = 0  on ∂Ω.

Assumption: Potential represented as

  V(ξ) = Σ_{j=1}^s V_j^(1)(ξ_1) V_j^(2)(ξ_2) ··· V_j^(d)(ξ_d).

⇝ finite difference discretization

  A u = (A_L + A_V) u = λ u,

with

  A_L = Σ_{j=1}^d I ⊗ ··· ⊗ I ⊗ A_L ⊗ I ⊗ ··· ⊗ I
        (d − j identity factors on the left, j − 1 on the right),

  A_V = Σ_{j=1}^s A_{V,j}^(d) ⊗ ··· ⊗ A_{V,j}^(2) ⊗ A_{V,j}^(1).
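A minimal MATLAB sketch of this Kronecker structure for small d and n (my own illustration, not from the talk; A1 stands for the 1D finite-difference Laplacian with Dirichlet boundary conditions):

    n = 10; d = 3;                                    % small enough to form A_L explicitly
    e  = ones(n,1);
    A1 = (n+1)^2 * spdiags([-e 2*e -e], -1:1, n, n);  % 1D Laplacian, Dirichlet BC
    AL = sparse(n^d, n^d);
    for j = 1:d
        % I ⊗ ... ⊗ I (d-j factors) ⊗ A1 ⊗ I ⊗ ... ⊗ I (j-1 factors)
        AL = AL + kron(speye(n^(d-j)), kron(A1, speye(n^(j-1))));
    end

For the large-d computations below, A_L is of course never assembled; only its Kronecker factors are stored and applied.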
Example: Hénon-Heiles potential

Consider Ω = [−10, 2]^d and potential ([Meyer et al. 1990; Raab et al. 2000; Faou et al. 2009])

  V(ξ) = 1/2 Σ_{j=1}^d σ_j ξ_j² + Σ_{j=1}^{d−1} ( σ_* ( ξ_j ξ_{j+1}² − 1/3 ξ_j³ ) + σ_*²/16 ( ξ_j² + ξ_{j+1}² )² )

with σ_j ≡ 1, σ_* = 0.2.

Discretization with n = 128 dof/dimension for d = 20 dimensions.
◮ Eigenvector has n^d ≈ 10^42 entries.
◮ Explicit storage of eigenvector would require 10^25 exabyte!¹

Solved with accuracy 10^−12 in less than 1 hour on laptop.

¹ Global data storage in 2011 calculated at 295 exabyte, see http://www.bbc.co.uk/news/technology-12419672.
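A quick sanity check of these storage numbers (my own arithmetic, assuming 8 bytes per double-precision entry):

    n = 128; d = 20;
    log10_entries  = d*log10(n)                       % ≈ 42.1, so n^d ≈ 10^42 entries
    log10_exabytes = log10_entries + log10(8) - 18    % ≈ 25, so ~10^25 exabyte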
Contents

1. Low-rank matrices
2. Low-rank tensor formats
3. Low-rank methods for 2D
4. Low-rank methods for arbitrary dimensions
5. Outlook
Low-rank matrices
Discretization of bivariate function

◮ Bivariate function: f(x,y): [x_min, x_max] × [y_min, y_max] → R.
◮ Function values on tensor grid [x_1, ..., x_m] × [y_1, ..., y_n].
◮ Normally collected in looong vector:

      [ f(x_1,y_1) ]
      [ f(x_2,y_1) ]
      [     ...    ]
      [ f(x_m,y_1) ]
  f = [ f(x_1,y_2) ]
      [ f(x_2,y_2) ]
      [     ...    ]
      [ f(x_m,y_2) ]
      [     ...    ]
      [ f(x_m,y_n) ]
Discretization of bivariate function

◮ Bivariate function: f(x,y): [x_min, x_max] × [y_min, y_max] → R.
◮ Function values on tensor grid [x_1, ..., x_m] × [y_1, ..., y_n].
◮ Collected in a matrix:

      [ f(x_1,y_1)  f(x_1,y_2)  ···  f(x_1,y_n) ]
  F = [ f(x_2,y_1)  f(x_2,y_2)  ···  f(x_2,y_n) ]
      [     ...         ...              ...    ]
      [ f(x_m,y_1)  f(x_m,y_2)  ···  f(x_m,y_n) ]
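Both layouts are reshapes of the same data; a small MATLAB sketch with an assumed example function:

    f = @(x,y) cos(x + y.^2);            % assumption: any bivariate example
    m = 100; n = 80;
    x = linspace(0,1,m); y = linspace(0,1,n);
    [X,Y] = ndgrid(x,y);
    F = f(X,Y);                          % matrix layout, F(i,j) = f(x_i, y_j)
    fvec = F(:);                         % the "looong" vector of the previous slide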
Low-rank approximation

Setting: Matrix X ∈ R^{n×m}, m and n too large to compute/store X explicitly.

Idea: Replace X by R S^T with R ∈ R^{n×r}, S ∈ R^{m×r} and r ≪ m, n.

  Memory:  X: nm,   R S^T: nr + rm.

  min { ‖X − R S^T‖_2 : R ∈ R^{n×r}, S ∈ R^{m×r} } = σ_{r+1},

with singular values σ_1 ≥ σ_2 ≥ ··· ≥ σ_{min{m,n}} of X.
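The minimizer is obtained from the truncated SVD (Schmidt / Eckart-Young); a minimal sketch, assuming X and r are already in the workspace:

    [U,Sig,V] = svd(X, 'econ');
    R = U(:,1:r) * Sig(1:r,1:r);         % R ∈ R^{n×r}
    S = V(:,1:r);                        % S ∈ R^{m×r}
    % optimality: norm(X - R*S') equals Sig(r+1,r+1) = sigma_{r+1}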
Singular values of random matrices

A = rand(200); semilogy(svd(A))

[Figure: the matrix A and its singular values, semilog scale]

No reasonable low-rank approximation possible.
Singular values of ground state

◮ Computed ground state for Hénon-Heiles potential for d = 2.
◮ Reshaped ground state into matrix.

[Figure: the ground state and its singular values on a semilog scale, decaying rapidly]

Excellent rank-10 approximation possible.
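A sketch of this experiment, assuming the computed ground state u (a vector of length n^2) is available:

    U = reshape(u, n, n);                % reinterpret the vector as an n×n matrix
    semilogy(svd(U))                     % rapid decay => low-rank approximable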
When to expect good low-rank approximations

Rule of thumb: Smoothness helps, but is not always needed.
Discretization of bivariate function

◮ Bivariate function: f(x,y): [x_min, x_max] × [y_min, y_max] → R.
◮ Function values on tensor grid [x_1, ..., x_m] × [y_1, ..., y_n]:

      [ f(x_1,y_1)  f(x_1,y_2)  ···  f(x_1,y_n) ]
  F = [     ...         ...              ...    ]
      [ f(x_m,y_1)  f(x_m,y_2)  ···  f(x_m,y_n) ]

Basic but crucial observation: f(x,y) = g(x)h(y) ⇒

      [ g(x_1)h(y_1)  ···  g(x_1)h(y_n) ]   [ g(x_1) ]
  F = [      ...              ...       ] = [   ...  ] [ h(y_1) ··· h(y_n) ]
      [ g(x_m)h(y_1)  ···  g(x_m)h(y_n) ]   [ g(x_m) ]

Separability implies rank 1.
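A one-line numerical confirmation (my own example factors g and h):

    g = @(x) exp(x);  h = @(y) sin(pi*y);
    x = linspace(0,1,100)';  y = linspace(0,1,80);
    F = g(x) * h(y);                     % outer product: column vector times row vector
    rank(F)                              % returns 1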
Separability and low rank

Approximation by sum of separable functions:

  f(x,y) = f_r(x,y) + error,  where  f_r(x,y) := g_1(x)h_1(y) + ··· + g_r(x)h_r(y).

Define F_r = [f_r(x_i, y_j)]_{i,j} ∈ R^{m×n}. Then F_r has rank ≤ r and

  ‖F − F_r‖_F ≤ √(mn) × error,

hence (since rank(F_r) ≤ r)

  σ_{r+1}(F) ≤ ‖F − F_r‖_2 ≤ ‖F − F_r‖_F ≤ √(mn) × error.

Semi-separable approximation implies low-rank approximation.
Semi-separable approximation by polynomials

Solution of approximation problem

  f(x,y) = g_1(x)h_1(y) + ··· + g_r(x)h_r(y) + error

not trivial; g_j, h_j can be chosen arbitrarily!

General construction by polynomial interpolation:

1. Lagrange interpolation of f(x,y) in y-coordinate:

     I_y[f](x,y) = Σ_{j=1}^r f(x, θ_j) L_j(y)

   with Lagrange polynomials L_j of degree r − 1 on [y_min, y_max].

2. Interpolation of I_y[f] in x-coordinate:

     I_x[I_y[f]](x,y) = Σ_{i,j=1}^r f(ξ_i, θ_j) L_i(x) L_j(y) = Σ_{i=1}^r L̂_{i,x}(x) L̂_{i,y}(y),

   where the node matrix F̂ = [f(ξ_i, θ_j)]_{i,j} is "diagonalized" by SVD.
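A rough MATLAB sketch of this construction (my own example function and node choice, not from the talk):

    f  = @(x,y) 1./(1 + x.^2 + y.^2);    % assumed smooth example on [-1,1]^2
    r  = 10;
    xi = cos(pi*((1:r)' - 0.5)/r);       % Chebyshev nodes for the x-interpolation
    th = cos(pi*((1:r)' - 0.5)/r);       % Chebyshev nodes for the y-interpolation
    [XI,TH] = ndgrid(xi, th);
    Fhat = f(XI, TH);                    % node matrix [f(xi_i, theta_j)]
    [U,Sig,V] = svd(Fhat);               % "diagonalizes" the node matrix
    % U*Sig and V recombine the Lagrange polynomials L_i, L_j into the
    % separable factors Lhat_{i,x}(x) * Lhat_{i,y}(y) of the slide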