Using a Hilbert-Schmidt SVD for Stable Kernel Computations

Greg Fasshauer, Mike McCourt, Roberto Cavoretto. Department of Applied Mathematics, Illinois Institute of Technology. Partially supported by NSF Grant DMS-1115392. MAIA 2013.


  1. Using a Hilbert-Schmidt SVD for Stable Kernel Computations. Greg Fasshauer*, Mike McCourt, Roberto Cavoretto. Department of Applied Mathematics, Illinois Institute of Technology. Partially supported by NSF Grant DMS-1115392. MAIA 2013: Multivariate Approximation and Interpolation with Applications, Erice, Sicily, Sept. 25, 2013.

  2. Outline
     1. Fundamental Problem
     2. Hilbert-Schmidt SVD and General RBF-QR Algorithm
     3. Implementation for Compact Matérn Kernels
     4. Application 1: Basic Function Approximation
     5. Application 2: Optimal Shape Parameters via MLE
     6. Summary

  4. Fundamental Problem: Kernel-based Interpolation. Given data $(\boldsymbol{x}_i, y_i)_{i=1}^{N}$, use a data-dependent linear function space
     $$s(\boldsymbol{x}) = \sum_{j=1}^{N} c_j K(\boldsymbol{x}, \boldsymbol{x}_j), \qquad \boldsymbol{x} \in \Omega \subseteq \mathbb{R}^d,$$
     with $K : \Omega \times \Omega \to \mathbb{R}$ a positive definite reproducing kernel. To find the $c_j$, solve the interpolation equations
     $$s(\boldsymbol{x}_i) = y_i, \qquad i = 1, \ldots, N,$$
     which leads to a linear system $\mathsf{K}\boldsymbol{c} = \boldsymbol{y}$ with symmetric positive definite (and often ill-conditioned) system matrix
     $$\mathsf{K}_{ij} = K(\boldsymbol{x}_i, \boldsymbol{x}_j), \qquad i, j = 1, \ldots, N.$$
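The direct approach described on this slide can be sketched in a few lines of NumPy. This is only an illustration under assumed choices that are not from the talk: a 1D Gaussian kernel, ten equispaced points on [-1, 1], test data y = cos(2x), and the hypothetical helper names `gaussian_kernel` and `fit_interpolant`. It forms the system matrix explicitly, which is exactly the step the rest of the talk tries to avoid.

```python
import numpy as np

def gaussian_kernel(x, z, eps):
    """Matrix of K(x_i, z_j) = exp(-eps^2 (x_i - z_j)^2) for 1D point sets."""
    return np.exp(-(eps * (x[:, None] - z[None, :])) ** 2)

def fit_interpolant(x, y, eps):
    """Form the system matrix K and solve K c = y directly."""
    K = gaussian_kernel(x, x, eps)
    return np.linalg.solve(K, y), K

# Assumed test setup (not from the slides).
x = np.linspace(-1.0, 1.0, 10)
y = np.cos(2.0 * x)

c, K = fit_interpolant(x, y, eps=3.0)

# Evaluate s(x) = sum_j c_j K(x, x_j) on a finer grid.
x_eval = np.linspace(-1.0, 1.0, 101)
s = gaussian_kernel(x_eval, x, eps=3.0) @ c

# Residual of the interpolation equations s(x_i) = y_i.
residual = np.max(np.abs(K @ c - y))

# Condition numbers for a moderate and a small shape parameter.
cond_K = np.linalg.cond(K)
cond_flat = np.linalg.cond(gaussian_kernel(x, x, 0.5))
```

Comparing `cond_K` with `cond_flat` shows how quickly the conditioning of the system matrix degrades as the shape parameter shrinks, even for only ten points.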

  6. Fundamental Problem: Common Complaints About Kernels. Kernel methods suffer from numerical instability, the presence of free parameter(s), and high computational cost. In this talk we will address the first two issues. We obtain stable methods by working with a “better” basis, which leads to a Hilbert-Schmidt SVD of the matrix $\mathsf{K}$. Free parameters can be “optimally” chosen by using statistical methods such as MLE, which are significantly enhanced by using the HS-SVD.

  7. Hilbert-Schmidt SVD and General RBF-QR Algorithm: Hilbert-Schmidt Theory. We assume that we know a Hilbert-Schmidt expansion (or Mercer series expansion) of our kernel $K$:
     $$K(\boldsymbol{x}, \boldsymbol{z}) = \sum_{n=1}^{\infty} \lambda_n \varphi_n(\boldsymbol{x}) \varphi_n(\boldsymbol{z}),$$
     where $(\lambda_n, \varphi_n)$ are orthonormal eigenpairs of a Hilbert-Schmidt integral operator $T_K : L_2(\Omega, \rho) \to L_2(\Omega, \rho)$ defined as
     $$(T_K f)(\boldsymbol{x}) = \int_{\Omega} K(\boldsymbol{x}, \boldsymbol{z}) f(\boldsymbol{z}) \rho(\boldsymbol{z}) \, d\boldsymbol{z},$$
     where $\Omega \subset \mathbb{R}^d$ and $\|K\|_{L_2(\Omega \times \Omega, \rho \times \rho)} < \infty$.

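  9. Hilbert-Schmidt SVD and General RBF-QR Algorithm: Gaussian Eigenfunctions [Rasmussen/Williams (2006), F./McCourt (2012)].
     $$e^{-\varepsilon^2 (x - z)^2} = \sum_{n=0}^{\infty} \lambda_n \varphi_n(x) \varphi_n(z),$$
     where
     $$\lambda_n = \sqrt{\frac{\alpha^2}{\alpha^2 + \delta^2 + \varepsilon^2}} \left( \frac{\varepsilon^2}{\alpha^2 + \delta^2 + \varepsilon^2} \right)^{n}, \qquad \varphi_n(x) = \gamma_n e^{-\delta^2 x^2} H_n(\alpha \beta x),$$
     with $H_n$ Hermite polynomials,
     $$\beta = \left( 1 + \left( \frac{2\varepsilon}{\alpha} \right)^2 \right)^{1/4}, \qquad \gamma_n = \sqrt{\frac{\beta}{2^n \Gamma(n+1)}}, \qquad \delta^2 = \frac{\alpha^2}{2} \left( \beta^2 - 1 \right),$$
     and $\{\varphi_n\}_{n=0}^{\infty}$ ($\rho$-weighted) $L_2$-orthonormal, i.e.,
     $$\int_{-\infty}^{\infty} \varphi_m(x) \varphi_n(x) \rho(x) \, dx = \delta_{mn}, \qquad \rho(x) = \frac{\alpha}{\sqrt{\pi}} e^{-\alpha^2 x^2}.$$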
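As a sanity check on these formulas, the sketch below evaluates the eigenpairs and compares a truncated Mercer sum against the Gaussian itself. It assumes the physicists' Hermite convention (the one consistent with the $2^n \Gamma(n+1)$ normalization) and folds $\gamma_n$ into a three-term recurrence for the scaled polynomials $H_n / \sqrt{2^n n!}$ to avoid overflow. The function names and the values of eps, alpha and M are illustrative choices, not taken from the talk.

```python
import numpy as np

def gauss_eigvals(M, eps, alpha):
    """lambda_n for n = 0..M-1, following the slide's formula."""
    beta = (1.0 + (2.0 * eps / alpha) ** 2) ** 0.25
    delta2 = 0.5 * alpha**2 * (beta**2 - 1.0)
    denom = alpha**2 + delta2 + eps**2
    return np.sqrt(alpha**2 / denom) * (eps**2 / denom) ** np.arange(M)

def gauss_eigfuns(x, M, eps, alpha):
    """Matrix with entries phi_n(x_i), n = 0..M-1 (requires M >= 2)."""
    x = np.asarray(x, dtype=float)
    beta = (1.0 + (2.0 * eps / alpha) ** 2) ** 0.25
    delta2 = 0.5 * alpha**2 * (beta**2 - 1.0)
    t = alpha * beta * x
    # psi_n = H_n / sqrt(2^n n!) via the scaled Hermite recurrence.
    psi = np.empty((x.size, M))
    psi[:, 0] = 1.0
    psi[:, 1] = np.sqrt(2.0) * t
    for n in range(1, M - 1):
        psi[:, n + 1] = (np.sqrt(2.0 / (n + 1)) * t * psi[:, n]
                         - np.sqrt(n / (n + 1)) * psi[:, n - 1])
    # gamma_n H_n(t) = sqrt(beta) * psi_n(t).
    return np.sqrt(beta) * np.exp(-delta2 * x**2)[:, None] * psi

# Assumed parameters for the check.
eps, alpha, M = 1.0, 1.0, 40
x = np.linspace(-1.0, 1.0, 7)

lam = gauss_eigvals(M, eps, alpha)
Phi = gauss_eigfuns(x, M, eps, alpha)

# Truncated Mercer sum versus the closed-form Gaussian kernel.
K_series = (Phi * lam) @ Phi.T
K_exact = np.exp(-(eps * (x[:, None] - x[None, :])) ** 2)
series_error = np.max(np.abs(K_series - K_exact))
```

With these parameters the eigenvalues decay geometrically, so a few dozen terms should already reproduce the kernel to roughly machine precision on this interval.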

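  12. Hilbert-Schmidt SVD and General RBF-QR Algorithm: Multivariate Eigenfunction Expansion. Use the tensor product form of the Gaussian kernel. For $\boldsymbol{x} = (x_1, \ldots, x_d) \in \mathbb{R}^d$,
     $$K(\boldsymbol{x}, \boldsymbol{z}) = e^{-\sum_{\ell=1}^{d} \varepsilon_\ell^2 (x_\ell - z_\ell)^2} = \prod_{\ell=1}^{d} e^{-\varepsilon_\ell^2 (x_\ell - z_\ell)^2} = \sum_{\boldsymbol{n} \in \mathbb{N}^d} \lambda_{\boldsymbol{n}} \varphi_{\boldsymbol{n}}(\boldsymbol{x}) \varphi_{\boldsymbol{n}}(\boldsymbol{z}),$$
     where
     $$\lambda_{\boldsymbol{n}} = \prod_{\ell=1}^{d} \lambda_{n_\ell}, \qquad \varphi_{\boldsymbol{n}}(\boldsymbol{x}) = \prod_{\ell=1}^{d} \varphi_{n_\ell}(x_\ell).$$
     Different shape parameters $\varepsilon_\ell$ for different space dimensions are allowed (i.e., $K$ may be anisotropic). With $\varepsilon_\ell = \varepsilon$ for all $\ell$ this is the isotropic Gaussian $e^{-\varepsilon^2 \|\boldsymbol{x} - \boldsymbol{z}\|_2^2}$.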
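A small numerical illustration of the tensor product structure, under assumed values (d = 2, shape parameters 1 and 2, alpha = 1, 80 terms per dimension) and with hypothetical helper names: the anisotropic kernel factors into 1D Gaussians, and the multi-index eigenvalues are the outer product of the 1D eigenvalue sequences from the Gaussian eigenfunction slide.

```python
import numpy as np

def gauss_eigvals_1d(M, eps, alpha):
    """1D Gaussian eigenvalues lambda_n, n = 0..M-1."""
    beta = (1.0 + (2.0 * eps / alpha) ** 2) ** 0.25
    delta2 = 0.5 * alpha**2 * (beta**2 - 1.0)
    denom = alpha**2 + delta2 + eps**2
    return np.sqrt(alpha**2 / denom) * (eps**2 / denom) ** np.arange(M)

def aniso_gaussian(x, z, eps):
    """K(x, z) = exp(-sum_l eps_l^2 (x_l - z_l)^2) for single points x, z."""
    x, z, eps = (np.asarray(v, dtype=float) for v in (x, z, eps))
    return np.exp(-np.sum(eps**2 * (x - z) ** 2))

# Assumed example values.
eps = np.array([1.0, 2.0])   # one shape parameter per dimension
alpha = 1.0
x = np.array([0.3, -0.5])
z = np.array([-0.2, 0.4])

# The kernel equals the product of 1D Gaussians.
k_full = aniso_gaussian(x, z, eps)
k_prod = np.prod(np.exp(-(eps**2) * (x - z) ** 2))

# Multi-index eigenvalues: lam[n1, n2] = lam_1[n1] * lam_2[n2].
lam_1 = gauss_eigvals_1d(80, eps[0], alpha)
lam_2 = gauss_eigvals_1d(80, eps[1], alpha)
lam = np.multiply.outer(lam_1, lam_2)
```

Because $K(x, x) = 1$ and $\rho$ integrates to one, each 1D eigenvalue sequence sums to one, and so does the tensor `lam` up to truncation. That gives a quick consistency check on the product formula.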

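  13. Hilbert-Schmidt SVD and General RBF-QR Algorithm. Fundamental idea: use the eigen-expansion of the kernel $K$ to rewrite the matrix $\mathsf{K}$ from the interpolation problem as
     $$\mathsf{K} = \begin{pmatrix} K(\boldsymbol{x}_1, \boldsymbol{x}_1) & \cdots & K(\boldsymbol{x}_1, \boldsymbol{x}_N) \\ \vdots & & \vdots \\ K(\boldsymbol{x}_N, \boldsymbol{x}_1) & \cdots & K(\boldsymbol{x}_N, \boldsymbol{x}_N) \end{pmatrix} = \begin{pmatrix} \varphi_1(\boldsymbol{x}_1) & \cdots & \varphi_M(\boldsymbol{x}_1) & \cdots \\ \vdots & & \vdots & \\ \varphi_1(\boldsymbol{x}_N) & \cdots & \varphi_M(\boldsymbol{x}_N) & \cdots \end{pmatrix} \begin{pmatrix} \lambda_1 & & & \\ & \ddots & & \\ & & \lambda_M & \\ & & & \ddots \end{pmatrix} \begin{pmatrix} \varphi_1(\boldsymbol{x}_1) & \cdots & \varphi_1(\boldsymbol{x}_N) \\ \vdots & & \vdots \\ \varphi_M(\boldsymbol{x}_1) & \cdots & \varphi_M(\boldsymbol{x}_N) \\ \vdots & & \vdots \end{pmatrix}.$$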

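  15. Hilbert-Schmidt SVD and General RBF-QR Algorithm. But we can’t compute with infinite matrices, so we choose a truncation value $M$ (supported by $\lambda_n \to 0$ as $n \to \infty$, more later) and rewrite
     $$\mathsf{K} = \begin{pmatrix} K(\boldsymbol{x}_1, \boldsymbol{x}_1) & \cdots & K(\boldsymbol{x}_1, \boldsymbol{x}_N) \\ \vdots & & \vdots \\ K(\boldsymbol{x}_N, \boldsymbol{x}_1) & \cdots & K(\boldsymbol{x}_N, \boldsymbol{x}_N) \end{pmatrix} = \underbrace{\begin{pmatrix} \varphi_1(\boldsymbol{x}_1) & \cdots & \varphi_M(\boldsymbol{x}_1) \\ \vdots & & \vdots \\ \varphi_1(\boldsymbol{x}_N) & \cdots & \varphi_M(\boldsymbol{x}_N) \end{pmatrix}}_{=\Phi} \underbrace{\begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_M \end{pmatrix}}_{=\Lambda} \underbrace{\begin{pmatrix} \varphi_1(\boldsymbol{x}_1) & \cdots & \varphi_1(\boldsymbol{x}_N) \\ \vdots & & \vdots \\ \varphi_M(\boldsymbol{x}_1) & \cdots & \varphi_M(\boldsymbol{x}_N) \end{pmatrix}}_{=\Phi^T}.$$
     Since
     $$K(\boldsymbol{x}_i, \boldsymbol{x}_j) = \sum_{n=1}^{\infty} \lambda_n \varphi_n(\boldsymbol{x}_i) \varphi_n(\boldsymbol{x}_j) \approx \sum_{n=1}^{M} \lambda_n \varphi_n(\boldsymbol{x}_i) \varphi_n(\boldsymbol{x}_j),$$
     accurate reconstruction of all entries of $\mathsf{K}$ will likely require $M > N$.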

  19. Hilbert-Schmidt SVD and General RBF-QR Algorithm. The matrix $\mathsf{K}$ is often ill-conditioned, so forming $\mathsf{K}$ and computing with it is not a good idea. The eigen-decomposition $\mathsf{K} = \Phi \Lambda \Phi^T$ provides an accurate (elementwise) approximation of $\mathsf{K}$ without ever forming it. However, it is not recommended to use this decomposition directly either, since all of the ill-conditioning associated with $\mathsf{K}$ is still present, sitting in the matrix $\Lambda$. We now use mostly standard numerical linear algebra to isolate some of this ill-conditioning and develop the Hilbert-Schmidt SVD and a general RBF-QR algorithm.
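The two claims on this slide (elementwise accuracy of the truncated factorization once M exceeds N, and ill-conditioning concentrated in the diagonal factor) can be observed numerically with the 1D Gaussian eigenpairs. The sketch below is an illustration only: the helper `gauss_mercer`, the parameter values, the point set, and the zero-based indexing of the Gaussian eigenpairs are assumptions made for this example. The matrix K is formed here purely for comparison.

```python
import numpy as np

def gauss_mercer(x, M, eps, alpha):
    """Return (Phi, lam) with Phi[i, n] = phi_n(x_i) and lam[n] = lambda_n,
    n = 0..M-1, for the 1D Gaussian kernel (requires M >= 2)."""
    beta = (1.0 + (2.0 * eps / alpha) ** 2) ** 0.25
    delta2 = 0.5 * alpha**2 * (beta**2 - 1.0)
    denom = alpha**2 + delta2 + eps**2
    lam = np.sqrt(alpha**2 / denom) * (eps**2 / denom) ** np.arange(M)
    t = alpha * beta * x
    # Scaled Hermite polynomials H_n / sqrt(2^n n!) by recurrence.
    psi = np.empty((x.size, M))
    psi[:, 0] = 1.0
    psi[:, 1] = np.sqrt(2.0) * t
    for n in range(1, M - 1):
        psi[:, n + 1] = (np.sqrt(2.0 / (n + 1)) * t * psi[:, n]
                         - np.sqrt(n / (n + 1)) * psi[:, n - 1])
    Phi = np.sqrt(beta) * np.exp(-delta2 * x**2)[:, None] * psi
    return Phi, lam

# Assumed setup.
eps, alpha, N = 1.0, 1.0, 10
x = np.linspace(-1.0, 1.0, N)
K = np.exp(-(eps * (x[:, None] - x[None, :])) ** 2)

def reconstruction_error(M):
    """Largest entrywise difference between Phi Lambda Phi^T and K."""
    Phi, lam = gauss_mercer(x, M, eps, alpha)
    return np.max(np.abs((Phi * lam) @ Phi.T - K))

err_M_equals_N = reconstruction_error(N)    # truncation at M = N
err_M_large = reconstruction_error(40)      # truncation at M = 40 > N

# Where the ill-conditioning lives: spread of the diagonal of Lambda.
Phi, lam = gauss_mercer(x, 40, eps, alpha)
lam_ratio = lam[0] / lam[-1]
cond_K = np.linalg.cond(K)
```

In this setup the M = N truncation leaves a visible entrywise error while M = 40 does not, and `lam_ratio` spans many orders of magnitude, which is the motivation for not computing with $\Lambda$ directly.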

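  20. Hilbert-Schmidt SVD and General RBF-QR Algorithm: Details of the Hilbert-Schmidt SVD. Assume $M > N$, so that $\Phi$ is “short and fat”, and partition $\Phi$:
     $$\Phi = \begin{pmatrix} \varphi_1(\boldsymbol{x}_1) & \cdots & \varphi_N(\boldsymbol{x}_1) & \varphi_{N+1}(\boldsymbol{x}_1) & \cdots & \varphi_M(\boldsymbol{x}_1) \\ \vdots & & \vdots & \vdots & & \vdots \\ \varphi_1(\boldsymbol{x}_N) & \cdots & \varphi_N(\boldsymbol{x}_N) & \varphi_{N+1}(\boldsymbol{x}_N) & \cdots & \varphi_M(\boldsymbol{x}_N) \end{pmatrix} = \begin{pmatrix} \underbrace{\Phi_1}_{N \times N} & \underbrace{\Phi_2}_{N \times (M-N)} \end{pmatrix}.$$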
