  1. Greedy Sparse Linear Approximations of Functionals from Nodal Data
     R. Schaback
     Göttingen Academy of Sciences and Humanities
     Georg–August–Universität Göttingen, Institut für Numerische und Angewandte Mathematik
     http://www.num.math.uni-goettingen.de/schaback/research/group.html
     MAIA 2013, Erice, Italy, Sept. 2013

  2. Overview
     ◮ Linear Recovery
     ◮ Optimal Recovery
     ◮ Greedy Recovery
     ◮ Examples

  3. Linear Recovery

  4. Standard Problem of Numerical Analysis
     ◮ You have data, but you want different data.
     ◮ You have a few data of an otherwise unknown function, but you want different data of that function.
     ◮ Examples:
       ◮ Numerical integration
       ◮ Numerical differentiation
       ◮ PDE solving
     ◮ Given $\lambda_1(u), \dots, \lambda_N(u)$, find $\mu(u)$ for continuous linear functionals $\mu, \lambda_1, \dots, \lambda_N \in U^*$.
     ◮ Assumption: linear recovery
       $\mu(u) \approx \sum_{j=1}^{N} a_j \lambda_j(u)$ for all $u \in U$.
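
For instance, with the numerical integration example used later in the talk (5 points in $[-1,1]$), one can take $\mu(u) = \int_{-1}^{1} u(t)\,dt$ and nodal data $\lambda_j(u) = u(x_j)$; the linear recovery is then simply a quadrature rule $\int_{-1}^{1} u(t)\,dt \approx \sum_{j=1}^{N} a_j\, u(x_j)$.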

  5. Linear Recovery
     ◮ Linear recovery: $\mu(u) \approx \sum_{j=1}^{N} a_j \lambda_j(u)$ for all $u \in U$.
     ◮ Error functional: $\epsilon_a(u) := \mu(u) - \sum_{j=1}^{N} a_j \lambda_j(u)$ for all $u \in U$, $a \in \mathbb{R}^N$.
     ◮ Sharp error bound: $\bigl|\mu(u) - \sum_{j=1}^{N} a_j \lambda_j(u)\bigr| \le \|\epsilon_a\|_{U^*} \cdot \|u\|_U$.
     ◮ If $\|\epsilon_a\|_{U^*}$ were known, the error would be known as a percentage of $\|u\|_U$.
     ◮ Problem: calculate $\|\epsilon_a\|_{U^*}$ and optimize it over $a$.

  6. Reproducing Kernel Hilbert Spaces
     ◮ Assumption: $U$ is a Hilbert space of functions on $\Omega$ with continuous and linearly independent point evaluations.
     ◮ Theorem: Then $U$ is an RKHS with some positive definite kernel $K : \Omega \times \Omega \to \mathbb{R}$:
       $u(x) = (u, K(x,\cdot))_U$ for all $x \in \Omega$, $u \in U$,
       $\mu(u) = (u, \mu^x K(x,\cdot))_U$ for all $\mu \in U^*$, $u \in U$,
       $(\lambda, \mu)_{U^*} = \lambda^x \mu^y K(x,y)$ for all $\lambda, \mu \in U^*$.
     ◮ Consequence:
       $\|\epsilon_a\|_{U^*}^2 = \epsilon_a^x \epsilon_a^y K(x,y) = \mu^x \mu^y K(x,y) - 2 \sum_{j=1}^{N} a_j\, \mu^x \lambda_j^y K(x,y) + \sum_{j=1}^{N} \sum_{k=1}^{N} a_j a_k\, \lambda_j^x \lambda_k^y K(x,y)$.
     ◮ This quadratic form in $a$ can be explicitly calculated.
     ◮ Different approximations can be compared.
     ◮ This quadratic form can be optimized.
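
A minimal numerical sketch of this quadratic form, under simplifying assumptions not made in the talk: all functionals are point evaluations ($\lambda_j = \delta_{x_j}$, $\mu = \delta_z$), so that every term is a plain kernel value, and a Gaussian kernel stands in for a generic positive definite kernel.

```python
import numpy as np

def gaussian_kernel(x, y):
    """Placeholder positive definite kernel K(x, y); any RKHS kernel could be used here."""
    return np.exp(-0.5 * (x - y) ** 2)

def error_norm_squared(z, nodes, weights, kernel=gaussian_kernel):
    """‖ε_a‖²_{U*} for µ = δ_z and λ_j = δ_{x_j}:
    K(z,z) - 2 Σ_j a_j K(z, x_j) + Σ_{j,k} a_j a_k K(x_j, x_k)."""
    k_z = kernel(z, nodes)                       # vector of µ^x λ_j^y K(x, y)
    G = kernel(nodes[:, None], nodes[None, :])   # matrix of λ_j^x λ_k^y K(x, y)
    return kernel(z, z) - 2.0 * weights @ k_z + weights @ G @ weights

# Compare two weight vectors for recovering u(0.3) from 5 equidistant points in [-1, 1]
nodes = np.linspace(-1.0, 1.0, 5)
print(error_norm_squared(0.3, nodes, np.full(5, 0.2)))
print(error_norm_squared(0.3, nodes, np.array([0.0, 0.0, 0.5, 0.5, 0.0])))
```

This is exactly what makes the comparison of different approximations possible: the error norm of any weight vector can be evaluated explicitly from kernel values.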

  7. Sobolev Spaces
     ◮ Assumption: $U$ is a Hilbert space of functions on $\Omega$ with continuous and linearly independent point evaluations.
     ◮ Take $U := W_2^m(\mathbb{R}^d)$ with $m > d/2$.
     ◮ Matérn–Sobolev kernel:
       $K(x,y) = \|x-y\|_2^{\,m-d/2}\, K_{m-d/2}(\|x-y\|_2)$
       with the modified Bessel function of the second kind $K_{m-d/2}$.
     ◮ For “nice” domains $\Omega \subset \mathbb{R}^d$: $W_2^m(\Omega)$ is norm-equivalent to $W_2^m(\mathbb{R}^d)$.
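
A small Python sketch of this kernel, up to the constant normalization factor that the slide omits; `scipy.special.kv` is the modified Bessel function of the second kind, and the limit $r^{\nu} K_{\nu}(r) \to 2^{\nu-1}\Gamma(\nu)$ as $r \to 0$ (for $\nu > 0$) is used to handle coincident points.

```python
import numpy as np
from scipy.special import kv, gamma

def matern_sobolev_kernel(x, y, m, d):
    """K(x, y) = r^(m - d/2) * K_{m - d/2}(r) with r = ||x - y||_2,
    up to a constant factor; requires m > d/2."""
    nu = m - d / 2.0
    r = np.linalg.norm(np.atleast_1d(x) - np.atleast_1d(y))
    if r < 1e-12:
        # limit of r^nu * K_nu(r) as r -> 0, so that K(x, x) is finite
        return 2.0 ** (nu - 1.0) * gamma(nu)
    return r ** nu * kv(nu, r)

# Example: kernel of W^2_2(R), i.e. m = 2, d = 1
print(matern_sobolev_kernel(0.0, 0.5, m=2, d=1))
print(matern_sobolev_kernel(0.3, 0.3, m=2, d=1))
```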

  8. Example: Numerical Integration in Sobolev Spaces
     ◮ 5 points in $[-1, 1]$
     ◮ Standard weights

       Points        ‖ε‖², W^1_2(R)*    ‖ε‖², W^2_2(R)*    ‖ε‖², W^3_2(R)*
       equidistant   0.103077           0.001034817        0.00016066
       Gauß          0.090650164        0.000409353        0.00000298

  9. Optimal Recovery

  10. Optimal Recovery in RKHS
     ◮ Minimize the quadratic form
       $\|\epsilon_a\|_{U^*}^2 = \epsilon_a^x \epsilon_a^y K(x,y) = \mu^x \mu^y K(x,y) - 2 \sum_{j=1}^{N} a_j\, \mu^x \lambda_j^y K(x,y) + \sum_{j=1}^{N} \sum_{k=1}^{N} a_j a_k\, \lambda_j^x \lambda_k^y K(x,y)$.
     ◮ The solution $a^*$ satisfies the linear system
       $\sum_{j=1}^{N} a_j^*\, \lambda_j^x \lambda_k^y K(x,y) = \mu^x \lambda_k^y K(x,y)$, $\quad 1 \le k \le N$.
     ◮ Then
       $\|\epsilon_{a^*}\|_{U^*}^2 = \mu^x \mu^y K(x,y) - \sum_{j=1}^{N} a_j^*\, \mu^x \lambda_j^y K(x,y)$.
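
Continuing the point-evaluation sketch from slide 6 (again an illustration with a placeholder kernel, not the talk's code): the optimal weights come from one symmetric linear solve with the kernel Gram matrix.

```python
import numpy as np

def gaussian_kernel(x, y):
    return np.exp(-0.5 * (x - y) ** 2)           # placeholder kernel, as in the slide-6 sketch

def optimal_recovery(z, nodes, kernel=gaussian_kernel):
    """Optimal weights a* and ‖ε_{a*}‖²_{U*} for µ = δ_z and λ_j = δ_{x_j}."""
    G = kernel(nodes[:, None], nodes[None, :])   # Gram matrix (λ_j^x λ_k^y K)
    rhs = kernel(z, nodes)                       # right-hand side (µ^x λ_k^y K)
    a_star = np.linalg.solve(G, rhs)             # linear system of the minimization
    err2 = kernel(z, z) - a_star @ rhs           # ‖ε_{a*}‖² = µ^x µ^y K - Σ_j a*_j µ^x λ_j^y K
    return a_star, err2

nodes = np.linspace(-1.0, 1.0, 5)
a_star, err2 = optimal_recovery(0.3, nodes)
print(a_star, err2)
```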

  11. Connection to Interpolation
     ◮ Theorem: The optimal recovery equals the $\mu$-value of the interpolant of the data $\lambda_j(u)$, $1 \le j \le N$, from the span of the functions $\lambda_j^x K(x,\cdot)$, $1 \le j \le N$.
     ◮ Proof steps:
       ◮ There is a Lagrange basis $u_1, \dots, u_N$ of that span with $\lambda_j(u_k) = \delta_{jk}$, $1 \le j, k \le N$.
       ◮ The interpolant of the data $f_j = \lambda_j(u)$ is
         $\tilde{u}(x) = \sum_{j=1}^{N} u_j(x)\, f_j = \sum_{j=1}^{N} u_j(x)\, \lambda_j(u)$,
         and hence
         $\mu(u) \approx \mu(\tilde{u}) = \sum_{j=1}^{N} \mu(u_j)\, f_j = \sum_{j=1}^{N} \mu(u_j)\, \lambda_j(u)$.
       ◮ Thus, as a recovery formula, $a_j = \mu(u_j)$, $1 \le j \le N$.

  12. Connection to Interpolation II
     ◮ Optimality?
     ◮ We need
       $\sum_{j=1}^{N} \mu(u_j)\, \lambda_j^x \lambda_k^y K(x,y) = \mu^x \lambda_k^y K(x,y)$, $\quad 1 \le k \le N$.
       The Lagrange property implies
       $\sum_{j=1}^{N} u_j(x)\, \lambda_j^z \lambda_k^y K(z,y) = \lambda_k^y K(x,y)$, $\quad 1 \le k \le N$,
       and applying $\mu$ with respect to $x$ proves optimality.
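
A quick numerical check of the theorem from slide 11, again only for point evaluations and a placeholder Gaussian kernel: the optimal recovery of $\mu(u) = u(z)$ from nodal data coincides with the kernel interpolant of that data evaluated at $z$.

```python
import numpy as np

def gaussian_kernel(x, y):
    return np.exp(-0.5 * (x - y) ** 2)            # placeholder kernel

nodes = np.linspace(-1.0, 1.0, 5)
z = 0.3
u_vals = np.sin(2.0 * nodes)                      # nodal data of some test function u

G = gaussian_kernel(nodes[:, None], nodes[None, :])       # Gram matrix
# (a) optimal recovery weights from slide 10, applied to the data
a_star = np.linalg.solve(G, gaussian_kernel(z, nodes))
recovery = a_star @ u_vals
# (b) kernel interpolant s_u = Σ_j c_j K(x_j, ·) of the same data, evaluated at z
c = np.linalg.solve(G, u_vals)
interp_at_z = c @ gaussian_kernel(nodes, z)
print(recovery, interp_at_z)                      # agree up to rounding
```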

  13. Example: Optimal Integration in Sobolev Spaces
     ◮ 5 points in $[-1, 1]$

       Points    Method     ‖ε‖², W^1_2(R)*   ‖ε‖², W^2_2(R)*   ‖ε‖², W^3_2(R)*
       equi      standard   0.103077          0.00103481        0.00016066
       equi      optimal    0.101896          0.00066343        0.00001082
       Gauß      standard   0.090650          0.00040935        0.00000298
       Gauß      optimal    0.085169          0.00040768        0.00000296
       optimal   optimal    0.063935          0.00019882        0.00000106

     ◮ Optimal points are chosen to minimize the error norm of the optimal recovery in Sobolev space.

  14. Summary of Recovery
     ◮ Consider linear recoveries of unknown data from known data.
     ◮ Do this in an RKHS.
     ◮ Evaluate the norm of the error functional.
     ◮ This allows fair comparisons between recoveries.
     ◮ Optimize the recovery weights.
     ◮ Goal: optimize the choice of data points.
     ◮ Goal: do this efficiently.

  15. Greedy Recovery

  16. Greedy Recovery
     Specialize to recovery from nodal data:
       $\mu(u) \approx \sum_{j=1}^{N} a_j^*\, u(x_j)$ for all $u \in U$,
     with optimal weights $a_j^*$, but vary the points $x_j$.
     Optimize the error norm in some RKHS, e.g. the Sobolev space $W_2^m(\mathbb{R}^d)$, $m > d/2$.
     Note: $a_j^* = a_j^*(x_1, \dots, x_N)$.
     Goal: a greedy recursive method, $N \mapsto N+1$.
     Goal: sparsity via greedy selection.

  17. Interpolation
     For a Lagrange basis $u_1^N, \dots, u_N^N$, the interpolant to $u$ is
       $s_u(x) = \sum_{j=1}^{N} u_j^N(x)\, u(x_j)$ for all $u \in U$.
     The optimal recovery is
       $\mu(u) \approx \mu(s_u) = \sum_{j=1}^{N} \mu(u_j^N)\, u(x_j)$ for all $u \in U$, with $a_j^* := \mu(u_j^N)$.
     Drawback: the Lagrange basis is not efficiently recursive.
     Goal: use the Newton basis recursively.

  18. Recursive Newton Basis
     Points $x_1, x_2, \dots, x_k, x_{k+1}, \dots$
     Recursive kernels (RS, 1997):
       $K_0(x,y) := K(x,y)$,
       $K_{k+1}(x,y) := K_k(x,y) - \frac{K_k(x_{k+1},x)\, K_k(x_{k+1},y)}{K_k(x_{k+1},x_{k+1})}$.
     Newton basis functions (St. Müller / RS, 2009):
       $v_{k+1}(x) := \frac{K_k(x_{k+1},x)}{\sqrt{K_k(x_{k+1},x_{k+1})}}$,
       $K_{k+1}(x,y) = K(x,y) - \sum_{j=1}^{k+1} v_j(x)\, v_j(y) = K_k(x,y) - v_{k+1}(x)\, v_{k+1}(y)$.
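
A minimal sketch of this recursion on a finite candidate set, storing the current kernel matrix $K_k$ on all candidate points and updating it in place; the Gaussian kernel and the chosen indices are placeholders (the greedy choice of indices comes on the following slides).

```python
import numpy as np

def gaussian_kernel(x, y):
    """Placeholder kernel, evaluated elementwise with broadcasting."""
    return np.exp(-0.5 * (x - y) ** 2)

def newton_basis(pts, selected, kernel=gaussian_kernel):
    """Newton basis functions v_1, ..., v_n of the points pts[selected],
    evaluated on all candidate points, via the recursive kernels K_k."""
    Kk = kernel(pts[:, None], pts[None, :])     # K_0 = K on the candidate set
    V = []
    for idx in selected:
        v = Kk[idx, :] / np.sqrt(Kk[idx, idx])  # v_{k+1}(x) = K_k(x_{k+1}, x) / sqrt(K_k(x_{k+1}, x_{k+1}))
        Kk = Kk - np.outer(v, v)                # K_{k+1}(x, y) = K_k(x, y) - v_{k+1}(x) v_{k+1}(y)
        V.append(v)
    return np.array(V)                          # shape (n, N); row k holds v_{k+1} on the candidates

pts = np.linspace(-1.0, 1.0, 101)
V = newton_basis(pts, selected=[0, 50, 100])
print(V[1][[0, 50, 100]])   # first entry ≈ 0: v_2 vanishes at the earlier point x_1 (cf. next slide)
```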

  19. Properties of the Newton Basis
     Orthonormality: $(v_j, v_k)_U = \delta_{jk}$.
     Zeros: $v_{k+1}(x_j) = 0$, $1 \le j \le k$.
     Power function for interpolation (M. Mouattamid / RS, 2009):
       $P_{k+1}^2(\delta_x) := \|\epsilon(\delta_x;\, \alpha_1^*(\delta_x), \dots, \alpha_{k+1}^*(\delta_x))\|_{U^*}^2
        = K_{k+1}(x,x) = K_k(x,x) - v_{k+1}^2(x) = P_k^2(\delta_x) - v_{k+1}^2(x)$.

  20. Recursive Recovery
     Functional $\mu$, points $x_1, x_2, \dots, x_k, x_{k+1}, \dots$
     Recursive equations I:
       $\mu^x K_0(x,y) := \mu^x K(x,y)$,
       $\mu^x K_{k+1}(x,y) = \mu^x K_k(x,y) - \frac{\mu^x K_k(x_{k+1},x)\, K_k(x_{k+1},y)}{K_k(x_{k+1},x_{k+1})}$,
       $\mu^y \mu^x K_{k+1}(x,y) = \mu^y \mu^x K_k(x,y) - \frac{\mu^x K_k(x_{k+1},x)\, \mu^y K_k(x_{k+1},y)}{K_k(x_{k+1},x_{k+1})} = \mu^y \mu^x K_k(x,y) - \mu(v_{k+1})^2$,
       $P_{k+1}^2(\mu) := \|\epsilon(\mu;\, \alpha_1^*(\mu), \dots, \alpha_{k+1}^*(\mu))\|_{U^*}^2 = \mu^x \mu^y K_{k+1}(x,y) = P_k^2(\mu) - \mu(v_{k+1})^2 = \mu^x \mu^y K(x,y) - \sum_{j=1}^{k+1} \mu(v_j)^2$.
     Goal: choose $x_{k+1}$ to maximize $\mu(v_{k+1})^2$.

  21. Recursive Recovery II
     Functional $\mu$, points $x_1, x_2, \dots, x_k, x_{k+1}, \dots$
     Goal: choose $x_{k+1}$ to maximize $\mu(v_{k+1})^2$.
     Recursive equations:
       $v_{k+1}(x) := \frac{K_k(x_{k+1},x)}{\sqrt{K_k(x_{k+1},x_{k+1})}}$,
       $\mu(v_{k+1}) = \frac{\mu^x K_k(x_{k+1},x)}{\sqrt{K_k(x_{k+1},x_{k+1})}}$.
     Maximize, as a function of $z$,
       $R_k(z) := \frac{(\mu^x K_k(z,x))^2}{K_k(z,z)}$.
     Stop if $R_k$ is small on the available points.
     Problems: efficiency? stability?

  22. Implementation: Overview
     Total given points: $x_1, \dots, x_N$.
     Goal: select a small subset of $n$ points greedily.
     Computational complexity: $O(nN)$ for the update $n-1 \to n$, $O(n^2 N)$ in total.
     Storage: $O(nN)$ for the $n$ Newton basis functions.
     The same costs apply to interpolation of $n$ data values with evaluation on $N$ points via the Newton basis.

  23. Implementation I
     Total given points: $x_1, \dots, x_N$.
     Goal: select a small subset greedily.
     Index set $I := \emptyset$ to collect the selected point indices.
     Various $N$-vectors for steps $k = 0, 1, \dots, N$ (the operations `.^2` and `./` are componentwise):
       $K_k := (K_k(x_j, x_j))_{1 \le j \le N}$,
       $M_k := (\mu^x K_k(x, x_j))_{1 \le j \le N}$,
       $R_k := (R_k(x_1), \dots, R_k(x_N))^T$ = M_k.^2 ./ K_k.
     Easy initialization for $k = 0$.
     Find the maximum of $R_0$ and insert its point index into $I$.
     Further $N$-vectors:
       $k_1 := (K_0(x_{I(1)}, x_j))_{1 \le j \le N}$,
       $V_1 := k_1 \,/\, \sqrt{K_0(x_{I(1)}, x_{I(1)})}$, i.e. $V_1(j) = v_1(x_j)$.

  24. Implementation II
     Recursion $n \Rightarrow n+1$:
     Given: $N$-vectors $K_n$, $M_n$, $V_1, \dots, V_n$, and the index set $I$ with $n$ point indices.
     Maximize $R_n$ = M_n.^2 ./ K_n; if the maximum is not too small, add the new point index to $I$.
     Now $I = \{y_1, \dots, y_{n+1}\}$. Use
       $K_n(y_{n+1}, x_j) = K(y_{n+1}, x_j) - \sum_{k=1}^{n} v_k(y_{n+1})\, v_k(x_j)$,
       i.e. $k_{n+1} = k_{n+1,0} - \sum_{k=1}^{n} \bigl(e_{I(n+1)}^T V_k\bigr)\, V_k$ with $k_{n+1,0} := (K(y_{n+1}, x_j))_{1 \le j \le N}$,
       $V_{n+1} := k_{n+1} \,/\, \sqrt{K_n(y_{n+1}, y_{n+1})}$ (the entry of the vector $K_n$ at the new index),
       $K_{n+1}$ = K_n − V_{n+1}.^2.
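
The following Python sketch strings slides 20–24 together into one greedy selection loop over a finite candidate set. It is only an illustration under simplifying assumptions: a Gaussian placeholder kernel instead of the Matérn–Sobolev kernel, the functional $\mu$ supplied through its precomputed vector $(\mu^x K(x, x_j))_j$ (here $\mu = \delta_{0.3}$), and the update of $M_n$ taken from the recursion of slide 20, which the implementation slides leave implicit. Only $N$-vectors are stored, matching the $O(nN)$ storage claim of slide 22.

```python
import numpy as np

def gaussian_kernel(x, y):
    """Placeholder kernel, evaluated elementwise with broadcasting;
    the talk uses the Matérn-Sobolev kernel instead."""
    return np.exp(-0.5 * (x - y) ** 2)

def greedy_recovery(points, mu_K, n_max, tol=1e-12, kernel=gaussian_kernel):
    """Greedy point selection following slides 20-24.
    points : 1-d array of the N candidate points x_1, ..., x_N
    mu_K   : N-vector (µ^x K(x, x_j))_j of the target functional µ
    Returns the selected indices I and the Newton basis values V (n x N)."""
    Kn = kernel(points, points)              # N-vector K_n(x_j, x_j), starting with n = 0
    Mn = np.array(mu_K, dtype=float)         # N-vector µ^x K_n(x, x_j)
    V, I = [], []
    for n in range(n_max):
        # R_n(x_j) = (µ^x K_n(x_j, x))^2 / K_n(x_j, x_j); already selected points have K_n ≈ 0
        Rn = np.where(Kn > tol, Mn ** 2 / np.maximum(Kn, tol), 0.0)
        if I:
            Rn[I] = 0.0                      # never re-select a point
        idx = int(np.argmax(Rn))
        if Rn[idx] <= tol:                   # stop if R_n is small on all available points
            break
        # k_{n+1}(j) = K_n(y_{n+1}, x_j) = K(y_{n+1}, x_j) - Σ_k v_k(y_{n+1}) v_k(x_j)
        k_new = kernel(points[idx], points)
        for v in V:
            k_new = k_new - v[idx] * v
        v_new = k_new / np.sqrt(Kn[idx])     # V_{n+1}
        mu_v = Mn[idx] / np.sqrt(Kn[idx])    # µ(v_{n+1})
        Kn = Kn - v_new ** 2                 # K_{n+1}(x_j, x_j)
        Mn = Mn - mu_v * v_new               # µ^x K_{n+1}(x, x_j), from the slide-20 recursion
        V.append(v_new)
        I.append(idx)
    return I, np.array(V)

# Example: µ = δ_z with z = 0.3; any µ fits by supplying its vector (µ^x K(x, x_j))_j.
pts = np.linspace(-1.0, 1.0, 201)
I, V = greedy_recovery(pts, gaussian_kernel(0.3, pts), n_max=8)
print("selected points:", pts[I])
```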
