
Compression, inversion and sparse approximate PCA of dense kernel matrices in near linear computational complexity. Florian Schäfer, ICERM, June 8th 2017. F. Schäfer, T.J. Sullivan, H. Owhadi: Sparse factorisation of dense kernel matrices.


  1. A numerical experiment. We define $J^{(k)} := I^{(k)} \setminus I^{(k-1)}$ and define the sparsity pattern
$$S_2 := \big\{ (i,j) \in I \times I \;:\; i \in J^{(k)},\ j \in J^{(l)},\ \operatorname{dist}(x_i, x_j) \le 2 \cdot 2^{-\min(k,l)} \big\}.$$
We order the elements of $I$ from coarse to fine, that is, from $J^{(1)}$ to $J^{(q)}$.
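
As an illustration, a minimal Python sketch of how such a pattern could be assembled from points and their levels; the function and variable names are illustrative, and the decaying radius $2 \cdot 2^{-\min(k,l)}$ follows the reconstruction above:

```python
import numpy as np

def sparsity_pattern(x, level):
    """Assemble S_2 = {(i, j) : dist(x_i, x_j) <= 2 * 2^(-min(k, l))},
    where level[i] = k means index i belongs to J(k)."""
    N = len(x)
    S = set()
    for i in range(N):
        for j in range(N):
            radius = 2.0 * 2.0 ** (-min(level[i], level[j]))
            if abs(x[i] - x[j]) <= radius:
                S.add((i, j))
    return S

# Five points on [0, 1], labelled by the level on which they first appear:
x = np.array([0.0, 1.0, 0.5, 0.25, 0.75])
level = np.array([1, 1, 2, 3, 3])   # J(1) = {0, 1}, J(2) = {2}, J(3) = {3, 4}
S2 = sparsity_pattern(x, level)
```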

  2. A numerical experiment. $L$ provides a good approximation of $\Gamma$ at only 2 percent of the storage cost. It:
  - can be computed in near-linear complexity in time and space;
  - allows for approximate evaluation of $\Gamma$, $\Gamma^{-1}$, and $\det(\Gamma)$ in near-linear time;
  - allows for sampling of $X \sim \mathcal{N}(0, \Gamma)$ in near-linear time.
A sketch of these operations in use follows.
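
Once such a sparse triangular factor $L$ with $LL^T \approx \Gamma$ is in hand, the claimed operations reduce to triangular solves and matrix-vector products. A minimal sketch, assuming $L$ is a SciPy CSR lower-triangular matrix (the helper names are illustrative):

```python
import numpy as np
from scipy.sparse.linalg import spsolve_triangular

def apply_inverse(L, b):
    """Approximate Gamma^{-1} b via two sparse triangular solves."""
    y = spsolve_triangular(L, b, lower=True)
    return spsolve_triangular(L.T.tocsr(), y, lower=False)

def logdet(L):
    """Approximate log det(Gamma) from the pivots of L."""
    return 2.0 * np.sum(np.log(L.diagonal()))

def sample(L, rng):
    """X = L z with z ~ N(0, Id) has covariance L L^T ~ Gamma."""
    return L @ rng.standard_normal(L.shape[0])
```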

  3. A numerical experiment. In this work, we
  - prove that this phenomenon holds whenever the covariance function $K$ is the Green's function of an elliptic boundary value problem;
  - prove that it leads to an algorithm with computational complexity $O\big(N \log^2(N)\,(\log(1/\epsilon) + \log^2(N))^{4d+1}\big)$ in time and $O\big(N \log(N) \log^d(N/\epsilon)\big)$ in space for an approximation error of $\epsilon$;
  - show that even though the Matérn family is not covered rigorously by our theoretical results, we get good approximation results, in particular in the interior of the domain;
  - show that, as a byproduct of our algorithm, we obtain a sparse approximate PCA with near-optimal approximation properties.

  4. Disintegration of Gaussian Measures and the Screening Effect. Let $X$ be a centered Gaussian vector with covariance $\Theta$. Assume we want to compute $\mathbb{E}[f(X)]$ for some function $f$. We could use Monte Carlo, but for $\Theta$ large, each sample is expensive. Idea: use disintegration of measure,
$$\mathbb{E}[f(X)] = \mathbb{E}\big[\, \mathbb{E}[f(X) \mid Y](Y) \,\big],$$
and choose $Y$ such that $Y$ and $\mathbb{E}[f(X) \mid Y]$ can be sampled cheaply.
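
A small Python sketch of this identity for a Gaussian vector, conditioning on a subvector $Y$; the exponential kernel and the test function $f$ are placeholders chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy covariance: exponential kernel on 9 points in [0, 1].
x = np.linspace(0.0, 1.0, 9)
Theta = np.exp(-np.abs(x[:, None] - x[None, :]))

obs = [4]                                   # Y = middle coordinate
rest = [i for i in range(9) if i not in obs]
A = Theta[np.ix_(rest, obs)] @ np.linalg.inv(Theta[np.ix_(obs, obs)])
C = Theta[np.ix_(rest, rest)] - A @ Theta[np.ix_(obs, rest)]
L_obs = np.linalg.cholesky(Theta[np.ix_(obs, obs)])
L_cond = np.linalg.cholesky(C)

f = np.max                                  # placeholder test function

# E[f(X)] = E[E[f(X) | Y](Y)]: sample Y first, then X | Y, and average.
n, est = 2000, 0.0
for _ in range(n):
    y = L_obs @ rng.standard_normal(len(obs))
    xr = A @ y + L_cond @ rng.standard_normal(len(rest))
    full = np.empty(9)
    full[obs], full[rest] = y, xr
    est += f(full) / n
print(est)
```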

  5. Disintegration of Gaussian Measures and the Screening Effect. Consider $X \in \mathbb{R}^N$, $\{x_i\}_{1 \le i \le N} \subset [0,1]$, and $\Theta_{i,j} := \exp(-|x_i - x_j|)$. This corresponds to a prior on the space $H^1(0,1)$ of mean-square differentiable functions. Assume the $x_i$ are ordered in increasing order and $x_{\lfloor N/2 \rfloor} \approx 1/2$. We then have, for $i < \lfloor N/2 \rfloor < j$:
$$\operatorname{Cov}\big[ X_i, X_j \mid X_{\lfloor N/2 \rfloor} \big] \approx 0.$$
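
For this particular (Markov) kernel the effect is in fact exact, which a few lines of Python confirm; a sketch, with indices chosen arbitrarily:

```python
import numpy as np

N = 101
x = np.linspace(0.0, 1.0, N)
Theta = np.exp(-np.abs(x[:, None] - x[None, :]))

m = N // 2                         # x_m ~ 1/2
i, j = 10, 90                      # i < m < j
# Conditional covariance of X_i, X_j given X_m (Schur complement):
cond = Theta[i, j] - Theta[i, m] * Theta[j, m] / Theta[m, m]
print(Theta[i, j], cond)           # cond vanishes: exp(-|x_i - x_j|) factorises
```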

  6. Disintegration of Gaussian Measures and the Screening Effect. For two observation sites $x_i$, $x_j$, the covariance conditional on the observation sites in between is small. This is known as the screening effect in the spatial statistics community; it was analysed by Stein (2002) and used, among others, by Banerjee et al. (2008) and Katzfuss (2015) for efficient approximation of Gaussian processes. Let us take $Y = X_{\lfloor N/2 \rfloor}$. Then $Y$ is cheap to sample, and the covariance matrix of $X \mid Y$ has only $2(N/2)^2$ non-negligible entries. When using Cholesky decomposition, this yields a factor-4 improvement in computational speed.

  7. Sparse Factorisation of Dense Matrices: fade-out instead of fill-in. Look at a single step of block Cholesky decomposition. This corresponds to:
$$\begin{pmatrix} \Theta_{11} & \Theta_{12} \\ \Theta_{21} & \Theta_{22} \end{pmatrix}
= \begin{pmatrix} \mathrm{Id} & 0 \\ \Theta_{21}\Theta_{11}^{-1} & \mathrm{Id} \end{pmatrix}
\begin{pmatrix} \Theta_{11} & 0 \\ 0 & \Theta_{22} - \Theta_{21}\Theta_{11}^{-1}\Theta_{12} \end{pmatrix}
\begin{pmatrix} \mathrm{Id} & \Theta_{11}^{-1}\Theta_{12} \\ 0 & \mathrm{Id} \end{pmatrix}$$
Note that $\Theta_{21}\Theta_{11}^{-1} b = \mathbb{E}[X_2 \mid X_1 = b]$ and $\Theta_{22} - \Theta_{21}\Theta_{11}^{-1}\Theta_{12} = \operatorname{Cov}[X_2 \mid X_1]$. Hence (block-)Cholesky decomposition is computationally equivalent to the disintegration of Gaussian measures. This follows immediately from well-known formulas, but is rarely used in the literature; one example is Bickson (2008).
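
The correspondence is easy to check numerically; a sketch that rebuilds $\Theta$ from exactly the conditional mean map and conditional covariance:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(size=8))
Theta = np.exp(-np.abs(x[:, None] - x[None, :]))

n1 = 4
T11, T12 = Theta[:n1, :n1], Theta[:n1, n1:]
T21, T22 = Theta[n1:, :n1], Theta[n1:, n1:]

A = T21 @ np.linalg.inv(T11)       # b -> E[X_2 | X_1 = b]
S = T22 - A @ T12                  # Cov[X_2 | X_1] (Schur complement)

# One step of block Cholesky reassembles Theta from these two objects:
top = np.hstack([T11, T11 @ A.T])
bot = np.hstack([A @ T11, A @ T11 @ A.T + S])
assert np.allclose(np.vstack([top, bot]), Theta)
```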

  8. Sparse Factorisation of Dense Matrices: fade-out instead of fill-in. This suggests choosing a bisective elimination ordering. Let's start computing the Cholesky decomposition: we observe a fade-out of entries!
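
A sketch of the fade-out in 1d: order the points coarse-to-fine by repeated bisection and count how many Cholesky entries survive a threshold (the ordering routine is an illustrative implementation):

```python
import numpy as np

def bisective_order(N):
    """Coarse-to-fine ordering of {0, ..., N-1} by repeated bisection."""
    order, queue = [], [(0, N - 1)]
    while queue:
        lo, hi = queue.pop(0)      # breadth-first: coarse levels first
        if lo > hi:
            continue
        mid = (lo + hi) // 2
        order.append(mid)
        queue.append((lo, mid - 1))
        queue.append((mid + 1, hi))
    return np.array(order)

N = 127
xp = np.linspace(0.0, 1.0, N)[bisective_order(N)]
Theta = np.exp(-np.abs(xp[:, None] - xp[None, :]))
L = np.linalg.cholesky(Theta)
print(np.mean(np.abs(L) > 1e-10))  # only a small fraction of entries survive
```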

  9. Sparse Factorisation of Dense Matrices: fade-out instead of fill-in. What about higher-dimensional examples? In 2d, use quadsection.

  10. Sparse Factorisation of Dense Matrices: fade-out instead of fill-in. We know that the result of the factorisation is sparse, but can we compute it efficiently? Key observation: the entry $(i,j)$ is used for the first time with the $\min(i,j)$-th pivot. If we know that entries will be negligible until we use them, we don't have to update them, nor know them in the first place.

  11. Sparse Factorisation of Dense Matrices: fade-out instead of fill-in. The bisective/quadsective ordering is the reverse of nested dissection. Indeed, for $P$ the order-reversing permutation matrix, we have
$$\Theta^{-1} = \big(LL^T\big)^{-1} = L^{-T}L^{-1} \;\Rightarrow\; P\,\Theta^{-1}P = PL^{-T}P\,PL^{-1}P = \big(PL^{-T}P\big)\big(PL^{-T}P\big)^T,$$
where $PL^{-T}P$ is lower triangular. But we have $L^{-1} = L^T\,\Theta^{-1}$. Hence, for a sparse elimination ordering of $\Theta$, the reverse ordering leads to a sparse factorisation of $\Theta^{-1}$.
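
A numeric check of this identity, as a sketch on a generic SPD matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6))
Theta = M @ M.T + 6.0 * np.eye(6)           # generic SPD matrix

P = np.eye(6)[::-1]                          # order-reversing permutation
L = np.linalg.cholesky(Theta)
Lrev = P @ np.linalg.inv(L).T @ P            # P L^{-T} P

assert np.allclose(np.tril(Lrev), Lrev)      # it is lower triangular
assert np.allclose(Lrev @ Lrev.T, P @ np.linalg.inv(Theta) @ P)
```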

  12. Sparse Factorisation of Dense Matrices: fade-out instead of fill-in. We obtain a very simple algorithm. Given a positive definite matrix $\Theta$ and a graph $G$ such that $\Theta^{-1}$ is sparse according to $G$:
  - obtain the reverse nested dissection ordering for $G$;
  - set entries $(i,j)$ that are separated after pivot number $\min(i,j)$ to zero;
  - compute the incomplete Cholesky factorisation.
A sketch of this procedure follows.
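
A minimal dense sketch of the resulting procedure, assuming the sparsity set S (lower-triangular index pairs, including the diagonal) has already been derived from the ordering:

```python
import numpy as np

def incomplete_cholesky(Theta, S):
    """Right-looking Cholesky of Theta that stores and updates only
    entries (i, j), i >= j, contained in the prescribed pattern S."""
    N = Theta.shape[0]
    L = np.zeros_like(Theta)
    A = Theta.copy()
    for k in range(N):
        L[k, k] = np.sqrt(A[k, k])
        for i in range(k + 1, N):
            if (i, k) in S:
                L[i, k] = A[i, k] / L[k, k]
        # Schur-complement update, restricted to the pattern:
        for i in range(k + 1, N):
            for j in range(k + 1, i + 1):
                if (i, j) in S:
                    A[i, j] -= L[i, k] * L[j, k]
    return L            # L @ L.T approximates Theta on the pattern
```

With the reverse nested dissection ordering and the screening-induced pattern, this pattern-restricted factorisation realises the three steps above: entries outside S are simply never formed or updated.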

  13. Sparse Factorisation of Dense Matrices: fade-out instead of fill-in. Remaining problems with our approach:
  - nested dissection does not lead to near-linear complexity algorithms;
  - the precision matrix will not be exactly sparse. How is it localised?
The answer can be found in the recent literature on numerical homogenisation.

  14. Sparse factorisation of dense matrices using gamblets. "Gamblet" bases have been introduced as part of the game-theoretical approach to numerical PDEs (Owhadi (2017), Owhadi and Scovel (2017)). Assume our covariance matrix is
$$\Theta_{i,j} = \int_{[0,1]^2} \phi_i^{(q)}(x)\, G(x,y)\, \phi_j^{(q)}(y)\, \mathrm{d}x\, \mathrm{d}y$$
for $\phi_i^{(q)} := \mathbb{1}_{[(i-1)h_q,\, i h_q]}$ and $G$ the Green's function of a second-order elliptic PDE. This corresponds to $X_i(\omega) = \int_0^1 \phi_i^{(q)}(x)\, u(x,\omega)\, \mathrm{d}x$, with $u(\omega)$ the solution to an elliptic SPDE with Gaussian forcing.
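
For concreteness, a sketch assembling such a $\Theta$ in 1d; it assumes the Dirichlet Laplacian on $(0,1)$, whose Green's function is $G(x,y) = \min(x,y) - xy$, together with a simple midpoint quadrature:

```python
import numpy as np

def G(x, y):
    """Green's function of -u'' = f on (0, 1) with u(0) = u(1) = 0."""
    return np.minimum(x, y) - x * y

def theta_matrix(q, n_quad=64):
    """Theta_{ij} = double integral of 1_{[(i-1)h, ih]} G 1_{[(j-1)h, jh]},
    with h = 2^{-q}, approximated by a tensor midpoint rule."""
    h = 2.0 ** (-q)
    n = int(round(1.0 / h))
    t = (np.arange(n_quad) + 0.5) * (h / n_quad)   # offsets within a cell
    w = h / n_quad                                  # quadrature weight
    Theta = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            Theta[i, j] = w * w * np.sum(G(i * h + t[:, None], j * h + t[None, :]))
    return Theta

Theta = theta_matrix(q=4)      # 16 x 16 covariance of cell averages
```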

  15. Sparse factorisation of dense matrices using gamblets. This is similar to our case, only with $\mathbb{1}_{[(i-1)h_q,\, i h_q]}$ instead of Dirac measures. For $\phi_i^{(k)} := \mathbb{1}_{[(i-1)h_k,\, i h_k]}$, Owhadi and Scovel (2017) show that
$$\psi_i^{(k)} := \mathbb{E}\Big[\, u \,\Big|\, \int_0^1 u(x)\, \phi_j^{(k)}(x)\, \mathrm{d}x = \delta_{i,j} \,\Big]$$
is exponentially localised on a scale of $h_k$. Main idea: an estimate on the exponential decay of a conditional expectation implies exponential decay of the Cholesky factors.

  16. Sparse factorisation of dense matrices using gamblets. Transform to a multiresolution basis to obtain the block matrix
$$\Gamma^{k,l}_{i,j} = \int_{[0,1]^2} \phi_i^{(k),\chi}(x)\, G(x,y)\, \phi_j^{(l),\chi}(y)\, \mathrm{d}x\, \mathrm{d}y,$$
where the $\big\{\phi_j^{(k),\chi}\big\}_{j \in J^{(k)}}$ are chosen as Haar basis functions.
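
A sketch of the change of basis with an unnormalised Haar hierarchy; the exact normalisation is a choice made here for illustration:

```python
import numpy as np

def haar_matrix(q):
    """Rows: one average over all 2^q cells, then difference functions
    on scales k = 1, ..., q, ordered coarse to fine."""
    n = 2 ** q
    rows = [np.ones(n) / n]
    for k in range(q):
        m, width = 2 ** k, n // 2 ** (k + 1)
        for i in range(m):
            r, s = np.zeros(n), 2 * i * width
            r[s:s + width] = 0.5 / width
            r[s + width:s + 2 * width] = -0.5 / width
            rows.append(r)
    return np.vstack(rows)     # invertible n x n matrix

W = haar_matrix(4)
# Gamma = W @ Theta @ W.T then carries the block structure Gamma^{k,l}.
```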

  17. Sparse factorisation of dense matrices using gamblets. Then the results of Owhadi (2017) and Owhadi and Scovel (2017) imply the following.
  - $\chi_i^{(k)} := \mathbb{E}\big[\, u \,\big|\, \int_0^1 u(x)\, \phi_j^{(l),\chi}(x)\, \mathrm{d}x = \delta_{i,j}\delta_{k,l}\ \forall l \le k \,\big]$ is exponentially localised, on a scale of $h_k$:
$$\big| \chi_i^{(k)}(x) \big| \le C \exp\Big( -\gamma\, \frac{\big| x - x_i^{(k)} \big|}{h_k} \Big).$$
  - Furthermore, the stiffness matrices decay exponentially on each level:
$$B^{(k)}_{i,j} := \int_0^1 \chi_i^{(k)}(x)\, \big(G^{-1}\chi_j^{(k)}\big)(x)\, \mathrm{d}x, \qquad \big| B^{(k)}_{i,j} \big| \le \exp\big( -\gamma\, |x_i - x_j| \big).$$
  - Finally, we have, for a constant $\kappa$: $\operatorname{cond}\big(B^{(k)}\big) \le \kappa$ for all $k$.

  18. Sparse factorisation of dense matrices using gamblets. The above properties allow us to show localisation of the (block-)Cholesky factors. Consider the two-scale case:
$$\begin{pmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{pmatrix}
= \begin{pmatrix} \mathrm{Id} & 0 \\ \Gamma_{21}\Gamma_{11}^{-1} & \mathrm{Id} \end{pmatrix}
\begin{pmatrix} \Gamma_{11} & 0 \\ 0 & \Gamma_{22} - \Gamma_{21}\Gamma_{11}^{-1}\Gamma_{12} \end{pmatrix}
\begin{pmatrix} \mathrm{Id} & \Gamma_{11}^{-1}\Gamma_{12} \\ 0 & \mathrm{Id} \end{pmatrix}$$
Here
$$\big( \Gamma_{21}\Gamma_{11}^{-1} \big)_{i,j} = \mathbb{E}\Big[ \int u\, \phi_i^{(2),\chi}\, \mathrm{d}x \,\Big|\, \int u\, \phi_m^{(1),\chi}\, \mathrm{d}x = \delta_{j,m} \Big] = \int \phi_i^{(2),\chi}\, \chi_j^{(1)}\, \mathrm{d}x$$
and
$$\Gamma_{22} - \Gamma_{21}\Gamma_{11}^{-1}\Gamma_{12} = \operatorname{Cov}\Big[ \int u\, \phi^{(2),\chi}\, \mathrm{d}x \,\Big|\, \int u\, \phi^{(1),\chi}\, \mathrm{d}x \Big] = \big( B^{(2)} \big)^{-1}.$$

  19. Sparse factorisation of dense matrices using gamblets. From the localisation of the $\chi_j^{(1)}$ we obtain
$$\Big| \big( \Gamma_{21}\Gamma_{11}^{-1} \big)_{i,j} \Big| = \Big| \int \phi_i^{(2),\chi}\, \chi_j^{(1)}\, \mathrm{d}x \Big| \le C \exp\Big( -\gamma\, \frac{\big| x_i^{(2)} - x_j^{(1)} \big|}{h} \Big).$$
Fact: inverses (Demko (1984), Jaffard (1990)) and Cholesky factors (Benzi and Tůma (2000), Krishtal et al. (2015)) of well-conditioned and banded/exponentially localised matrices are exponentially localised. Therefore:
$$\Big| \big( B^{(2)} \big)^{-1}_{i,j} \Big| \le C \exp\Big( -\gamma\, \frac{\big| x_i^{(2)} - x_j^{(2)} \big|}{h_2} \Big).$$
The argument can be extended to multiple scales and results in exponentially decaying (block-)Cholesky factors. These factors can be approximated by (block-)Cholesky decomposition with computational complexity $O\big(N \log^2(N)\,(\log(1/\epsilon) + \log^2(N))^{4d+1}\big)$ in time and $O\big(N \log(N) \log^d(N/\epsilon)\big)$ in space for an approximation error of $\epsilon$. A small numeric illustration of the quoted localisation fact follows.
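
A tiny numeric illustration of the quoted fact, using a banded, uniformly well-conditioned matrix (a sketch, not the matrices from the talk):

```python
import numpy as np

N = 60
A = 2.5 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)  # banded, cond(A) <= 9
Ainv = np.linalg.inv(A)
# Entries of the inverse decay exponentially away from the diagonal:
print([f"{abs(Ainv[0, d]):.1e}" for d in (0, 5, 10, 20, 40)])
```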

  20. Sparse factorisation of dense matrices using gamblets. How about $\phi_i^{(q)} = \delta_{x_i^{(q)}}$, i.e. pointwise sampling?
