Greedy Sparse Linear Approximations of Functionals from Nodal Data
R. Schaback
Göttingen Academy of Sciences and Humanities
Georg–August–Universität Göttingen, Institut für Numerische und Angewandte Mathematik
http://www.num.math.uni-goettingen.de/schaback/research/group.html
MAIA 2013, Erice, Italy, Sept. 2013
Overview
◮ Linear Recovery
◮ Optimal Recovery
◮ Greedy Recovery
◮ Examples
Linear Recovery
Standard Problem of Numerical Analysis
◮ You have data, but you want different data
◮ You have a few data of an otherwise unknown function, but you want different data of that function
◮ Examples:
  ◮ Numerical integration
  ◮ Numerical differentiation
  ◮ PDE solving
◮ Given $\lambda_1(u), \ldots, \lambda_N(u)$, find $\mu(u)$ for continuous linear functionals $\mu, \lambda_1, \ldots, \lambda_N \in U^*$
◮ Assumption: Linear recovery
\[ \mu(u) \approx \sum_{j=1}^{N} a_j \lambda_j(u) \quad \text{for all } u \in U \]
Linear Recovery
◮ Linear recovery
\[ \mu(u) \approx \sum_{j=1}^{N} a_j \lambda_j(u) \quad \text{for all } u \in U \]
◮ Error functional
\[ \epsilon_a(u) := \mu(u) - \sum_{j=1}^{N} a_j \lambda_j(u) \quad \text{for all } u \in U,\ a \in \mathbb{R}^N \]
◮ Sharp error bound
\[ \Bigl| \mu(u) - \sum_{j=1}^{N} a_j \lambda_j(u) \Bigr| \;\le\; \underbrace{\|\epsilon_a\|_{U^*}}_{????} \cdot \|u\|_U \]
◮ If $\|\epsilon_a\|_{U^*}$ were known: error known in % of $\|u\|_U$
◮ Problem: Calculate $\|\epsilon_a\|_{U^*}$ and optimize over $a$
Reproducing Kernel Hilbert Spaces
◮ Assumption: $U$ Hilbert space of functions on $\Omega$ with continuous and linearly independent point evaluations
◮ Theorem: Then $U$ is a RKHS with some positive definite kernel $K : \Omega \times \Omega \to \mathbb{R}$:
\[ \begin{aligned} u(x) &= (u, K(x,\cdot))_U && \text{for all } x \in \Omega,\ u \in U \\ \mu(u) &= (u, \mu^x K(x,\cdot))_U && \text{for all } \mu \in U^*,\ u \in U \\ (\lambda,\mu)_{U^*} &= \lambda^x \mu^y K(x,y) && \text{for all } \lambda, \mu \in U^* \end{aligned} \]
◮ Consequence:
\[ \|\epsilon_a\|_{U^*}^2 = \epsilon_a^x \epsilon_a^y K(x,y) = \mu^x \mu^y K(x,y) - 2 \sum_{j=1}^{N} a_j\, \mu^x \lambda_j^y K(x,y) + \sum_{j=1}^{N} \sum_{k=1}^{N} a_j a_k\, \lambda_j^x \lambda_k^y K(x,y) \]
◮ This quadratic form in $a$ can be explicitly calculated
◮ Different approximations can be compared
◮ This quadratic form can be optimized
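As an illustration of how the quadratic form can be evaluated in practice, here is a minimal numerical sketch (not from the talk) for nodal functionals $\lambda_j = \delta_{x_j}$ and the integration functional $\mu(u) = \int_{-1}^{1} u$ on $W_2^1(\mathbb{R})$. The kernel $0.5\,e^{-|x-y|}$ is one common normalization of its reproducing kernel, the weights `a` are an arbitrary trial choice, and the $\mu$-moments are approximated by a dense trapezoidal rule rather than computed in closed form.

```python
import numpy as np

# One common normalization of the reproducing kernel of W^1_2(R)
kernel = lambda x, y: 0.5 * np.exp(-np.abs(x - y))

x = np.linspace(-1.0, 1.0, 5)        # data points for lambda_j = delta_{x_j}
a = np.full(5, 2.0 / 5)              # trial weights to be assessed (hypothetical choice)

# Approximate the mu-moments of mu(u) = int_{-1}^{1} u by a dense trapezoidal rule
z = np.linspace(-1.0, 1.0, 2001)
w = np.full(z.size, z[1] - z[0]); w[[0, -1]] *= 0.5
mu_mu_K   = w @ kernel(z[:, None], z[None, :]) @ w    # mu^x mu^y K(x, y)
mu_lam_K  = w @ kernel(z[:, None], x[None, :])        # mu^x lambda_j^y K(x, y)
lam_lam_K = kernel(x[:, None], x[None, :])            # lambda_j^x lambda_k^y K(x, y)

# ||eps_a||^2 = mu mu K - 2 sum_j a_j mu lambda_j K + sum_{j,k} a_j a_k lambda_j lambda_k K
err2 = mu_mu_K - 2.0 * a @ mu_lam_K + a @ lam_lam_K @ a
print(err2)
```

Because the error norm is explicit, different weight choices (standard quadrature weights, optimized weights) can be compared on an equal footing, as in the tables below.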
Sobolev Spaces
◮ Assumption: $U$ Hilbert space of functions on $\Omega$ with continuous and linearly independent point evaluations
◮ Take $U := W_2^m(\mathbb{R}^d)$ with $m > d/2$
◮ Matérn–Sobolev kernel:
\[ K(x,y) = \|x-y\|_2^{\,m-d/2}\, K_{m-d/2}(\|x-y\|_2) \]
with the modified Bessel function of the second kind $K_{m-d/2}$.
◮ For “nice” domains $\Omega \subset \mathbb{R}^d$: $W_2^m(\Omega)$ is norm-equivalent to $W_2^m(\mathbb{R}^d)$
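A possible implementation of the radial part of this kernel, assuming scipy is available; the function name and the handling of the $r \to 0$ limit (the limit of $r^\nu K_\nu(r)$ is $2^{\nu-1}\Gamma(\nu)$) are my own, and the kernel reproduces $W_2^m(\mathbb{R}^d)$ only up to a positive constant depending on the chosen normalization of the Sobolev inner product.

```python
import numpy as np
from scipy.special import gamma, kv  # kv = modified Bessel function of the second kind

def matern_sobolev(r, m, d):
    """Radial kernel r^nu * K_nu(r) with nu = m - d/2 > 0 (Matern-Sobolev kernel)."""
    nu = m - d / 2.0
    r = np.asarray(r, dtype=float)
    safe = np.where(r > 0.0, r, 1.0)                  # avoid 0 * inf at r = 0
    vals = safe**nu * kv(nu, safe)
    return np.where(r > 0.0, vals, 2.0**(nu - 1.0) * gamma(nu))  # limit as r -> 0

# usage in W^2_2(R), i.e. m = 2, d = 1:
K = lambda x, y: matern_sobolev(np.abs(x - y), m=2, d=1)
```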
Example: Numerical Integration in Sobolev Spaces
◮ 5 points in $[-1, 1]$
◮ Standard weights; squared error norms $\|\epsilon\|^2$ in $W_2^m(\mathbb{R})^*$ for $m = 1, 2, 3$:

Points        m = 1          m = 2          m = 3
equidistant   0.103077       0.001034817    0.00016066
Gauß          0.090650164    0.000409353    0.00000298
Optimal Recovery
Optimal Recovery in RKHS
◮ Minimize the quadratic form
\[ \|\epsilon_a\|_{U^*}^2 = \epsilon_a^x \epsilon_a^y K(x,y) = \mu^x \mu^y K(x,y) - 2 \sum_{j=1}^{N} a_j\, \mu^x \lambda_j^y K(x,y) + \sum_{j=1}^{N} \sum_{k=1}^{N} a_j a_k\, \lambda_j^x \lambda_k^y K(x,y) \]
◮ The solution $a^*$ satisfies the linear system
\[ \sum_{j=1}^{N} a_j^*\, \lambda_j^x \lambda_k^y K(x,y) = \mu^x \lambda_k^y K(x,y), \quad 1 \le k \le N \]
◮ Then
\[ \|\epsilon_{a^*}\|_{U^*}^2 = \mu^x \mu^y K(x,y) - \sum_{j=1}^{N} a_j^*\, \mu^x \lambda_j^y K(x,y) \]
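For nodal data $\lambda_j = \delta_{x_j}$ the linear system reduces to the kernel matrix, and the optimal weights and error norm can be sketched as follows. The function name is mine; `mu_K` and `mu_mu_K` stand for the $\mu$-moments $\mu^x K(x, x_j)$ and $\mu^x \mu^y K(x,y)$, which the caller must supply (analytically or, as in the earlier sketch, by quadrature).

```python
import numpy as np

def optimal_nodal_weights(kernel, x, mu_K, mu_mu_K):
    """Optimal recovery weights a* and squared error norm for nodal data.

    kernel(x_j, x_k) must broadcast over numpy arrays; mu_K[k] ~ mu^x K(x, x_k),
    mu_mu_K ~ mu^x mu^y K(x, y) are supplied by the caller."""
    A = kernel(x[:, None], x[None, :])    # Gram matrix lambda_j^x lambda_k^y K(x, y)
    a_star = np.linalg.solve(A, mu_K)     # linear system for the optimal weights
    err2 = mu_mu_K - mu_K @ a_star        # ||eps_{a*}||^2_{U*}
    return a_star, err2
```

Combined with the Matérn–Sobolev kernel above and quadrature-based moments, this yields the kind of comparison shown in the table a few slides below.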
Connection to Interpolation
◮ Theorem: The optimal recovery is equal to the $\mu$-value of the interpolant of the data $\lambda_j(u)$, $1 \le j \le N$, on the span of the functions $\lambda_j^x K(x,\cdot)$, $1 \le j \le N$.
◮ Proof steps:
  ◮ There is a Lagrange basis $u_1, \ldots, u_N$ of that span with $\lambda_j(u_k) = \delta_{jk}$, $1 \le j, k \le N$
  ◮ The interpolant of the data $f_j = \lambda_j(u)$ is
\[ \tilde u(x) = \sum_{j=1}^{N} u_j(x)\, f_j = \sum_{j=1}^{N} u_j(x)\, \lambda_j(u), \qquad
\mu(u) \approx \mu(\tilde u) = \sum_{j=1}^{N} \mu(u_j)\, f_j = \sum_{j=1}^{N} \mu(u_j)\, \lambda_j(u) \]
  ◮ Then, as a recovery formula, $a_j = \mu(u_j)$, $1 \le j \le N$
Connection to Interpolation II
◮ Optimality?
◮ We need
\[ \sum_{j=1}^{N} \mu(u_j)\, \lambda_j^x \lambda_k^y K(x,y) = \mu^x \lambda_k^y K(x,y), \quad 1 \le k \le N. \]
The Lagrange property implies
\[ \sum_{j=1}^{N} u_j(x)\, \lambda_j^x \lambda_k^y K(x,y) = \lambda_k^y K(x,y), \quad 1 \le k \le N, \]
and application of $\mu$ proves optimality.
Example: Optimal Integration in Sobolev Spaces
◮ 5 points in $[-1, 1]$
◮ Squared error norms $\|\epsilon\|^2$ in $W_2^m(\mathbb{R})^*$:

Points    Method     m = 1       m = 2         m = 3
equi      standard   0.103077    0.00103481    0.00016066
equi      optimal    0.101896    0.00066343    0.00001082
Gauß      standard   0.090650    0.00040935    0.00000298
Gauß      optimal    0.085169    0.00040768    0.00000296
optimal   optimal    0.063935    0.00019882    0.00000106

◮ Optimal points are chosen to minimize the error norm of the optimal recovery in Sobolev space
Summary of Recovery
◮ Consider linear recoveries of unknown data from known data
◮ Do this on a RKHS
◮ Evaluate the norm of the error functional
◮ This allows fair comparisons between recoveries
◮ Optimize the recovery weights
◮ Goal: Optimize the choice of data points
◮ Goal: Do this efficiently
Greedy Recovery
Greedy Recovery
Specialize to recovery from nodal data:
\[ \mu(u) \approx \sum_{j=1}^{N} a_j^*\, u(x_j) \quad \text{for all } u \in U \]
with optimal weights $a_j^*$, but vary the points $x_j$
Optimization of the error norm in some RKHS, e.g. the Sobolev space $W_2^m(\mathbb{R}^d)$, $m > d/2$
Note: $a_j^* = a_j^*(x_1, \ldots, x_N)$
Goal: Greedy recursive method, $N \mapsto N+1$
Goal: Sparsity via greedy selection
Interpolation
For a Lagrange basis $u_1^N, \ldots, u_N^N$: the interpolant to $u$ is
\[ s_u(x) = \sum_{j=1}^{N} u_j^N(x)\, u(x_j) \quad \text{for all } u \in U \]
The optimal recovery is
\[ \mu(u) \approx \mu(s_u) = \sum_{j=1}^{N} \underbrace{\mu(u_j^N)}_{=:\,a_j^*}\, u(x_j) \quad \text{for all } u \in U \]
Drawback: the Lagrange basis is not efficiently recursive
Goal: Use the Newton basis recursively
Recursive Newton Basis
Points $x_1, x_2, \ldots, x_k, x_{k+1}, \ldots$
Recursive kernels (RS, 1997):
\[ K_0(x,y) := K(x,y), \qquad K_{k+1}(x,y) := K_k(x,y) - \frac{K_k(x_{k+1},x)\, K_k(x_{k+1},y)}{K_k(x_{k+1},x_{k+1})} \]
Newton basis functions (St. Müller / RS, 2009):
\[ v_{k+1}(x) := \frac{K_k(x_{k+1},x)}{\sqrt{K_k(x_{k+1},x_{k+1})}} \]
\[ K_{k+1}(x,y) = K(x,y) - \sum_{j=1}^{k+1} v_j(x)\, v_j(y) = K_k(x,y) - v_{k+1}(x)\, v_{k+1}(y) \]
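A compact way to realize this recursion on a finite point set (indices `idx_sel` into an array `x_all` of candidates) uses the representation $K_k(x,y) = K(x,y) - \sum_{j \le k} v_j(x) v_j(y)$; this sketch and its names are my own, not the talk's code, and `kernel` is assumed to broadcast over numpy arrays.

```python
import numpy as np

def newton_basis(kernel, x_all, idx_sel):
    """Rows V[k, :] = v_{k+1} evaluated at all points x_all, for the selected
    points x_all[idx_sel], via K_k(x, y) = K(x, y) - sum_{j<=k} v_j(x) v_j(y)."""
    V = np.zeros((len(idx_sel), len(x_all)))
    for k, i in enumerate(idx_sel):
        col = kernel(x_all[i], x_all) - V[:k].T @ V[:k, i]  # K_k(x_{k+1}, x_j) for all j
        V[k] = col / np.sqrt(col[i])                        # col[i] = K_k(x_{k+1}, x_{k+1})
    return V
```

Each new basis function costs $O(kN)$ operations, which is what makes the greedy method of the following slides affordable.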
Properties of the Newton Basis
Orthonormality: $(v_j, v_k)_U = \delta_{jk}$
Zeros: $v_{k+1}(x_j) = 0$, $1 \le j \le k$
Power function for interpolation (M. Mouattamid / RS, 2009):
\[ P_{k+1}^2(\delta_x) := \|\epsilon(\delta_x;\, \alpha_1^*(\delta_x), \ldots, \alpha_{k+1}^*(\delta_x))\|_{H^*}^2 = K_{k+1}(x,x) = K_k(x,x) - v_{k+1}^2(x) = P_k^2(\delta_x) - v_{k+1}^2(x) \]
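Assuming the `newton_basis` sketch from the previous slide is in scope, the power function on a grid can be accumulated directly from the basis rows as $P_n^2(\delta_x) = K(x,x) - \sum_{j \le n} v_j^2(x)$; the kernel, grid, and selected indices below are placeholders.

```python
import numpy as np

kernel = lambda x, y: 0.5 * np.exp(-np.abs(x - y))   # W^1_2(R) kernel as before
x_all = np.linspace(-1.0, 1.0, 201)                  # candidate/evaluation grid
idx_sel = [0, 100, 200, 50, 150]                     # some selected indices (hypothetical)

V = newton_basis(kernel, x_all, idx_sel)             # rows v_1, ..., v_n on the grid
P2 = kernel(x_all, x_all) - np.sum(V**2, axis=0)     # P_n^2; vanishes at selected points
```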
Recursive Recovery
Functional $\mu$, points $x_1, x_2, \ldots, x_k, x_{k+1}, \ldots$
Recursive equations I:
\[ \mu^x K_0(x,y) = \mu^x K(x,y), \qquad \mu^x K_{k+1}(x,y) = \mu^x K_k(x,y) - \frac{\mu^x K_k(x_{k+1},x)\, K_k(x_{k+1},y)}{K_k(x_{k+1},x_{k+1})} \]
\[ \mu^y \mu^x K_{k+1}(x,y) = \mu^y \mu^x K_k(x,y) - \frac{\mu^x K_k(x_{k+1},x)\, \mu^y K_k(x_{k+1},y)}{K_k(x_{k+1},x_{k+1})} = \mu^y \mu^x K_k(x,y) - \mu(v_{k+1})^2 \]
\[ P_{k+1}^2(\mu) := \|\epsilon(\mu;\, \alpha_1^*(\mu), \ldots, \alpha_{k+1}^*(\mu))\|_{H^*}^2 = \mu^x \mu^y K_{k+1}(x,y) = P_k^2(\mu) - \mu(v_{k+1})^2 = \mu^x \mu^y K(x,y) - \sum_{j=1}^{k+1} \mu(v_j)^2. \]
Goal: Choose $x_{k+1}$ to maximize $\mu(v_{k+1})^2$
Recursive Recovery II
Functional $\mu$, points $x_1, x_2, \ldots, x_k, x_{k+1}, \ldots$
Goal: Choose $x_{k+1}$ to maximize $\mu(v_{k+1})^2$
Recursive equations:
\[ v_{k+1}(x) := \frac{K_k(x_{k+1},x)}{\sqrt{K_k(x_{k+1},x_{k+1})}}, \qquad \mu(v_{k+1}) = \frac{\mu^x K_k(x_{k+1},x)}{\sqrt{K_k(x_{k+1},x_{k+1})}} \]
Maximize, as a function of $z$,
\[ R_k(z) := \frac{(\mu^x K_k(z,x))^2}{K_k(z,z)} \]
Stop if $R_k$ is small on the available points
Problems: Efficiency? Stability?
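On a finite candidate set the maximization of $R_k$ becomes an argmax over $N$-vectors. Here is a minimal sketch of the selection rule, assuming vectors `M` $\approx \mu^x K_k(x, x_j)$ and `Kdiag` $\approx K_k(x_j, x_j)$ are maintained as in the implementation slides below, and with `tol` as a hypothetical stopping threshold.

```python
import numpy as np

def select_next(M, Kdiag, tol=1e-12):
    """Return the index maximizing R_k(x_j) = M[j]^2 / Kdiag[j], or None to stop."""
    R = M**2 / np.maximum(Kdiag, tol)    # guard against already-selected points
    j = int(np.argmax(R))
    return (j if R[j] > tol else None), R[j]
```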
Implementation: Overview
Total given points: $x_1, \ldots, x_N$
Goal: Select a small subset of $n$ points greedily
Computational complexity: $O(nN)$ for the update $n-1 \to n$
Computational complexity: $O(n^2 N)$ in total
Storage: $O(nN)$ for $n$ Newton basis functions
Similarly for interpolation of $n$ data with evaluation on $N$ points via the Newton basis
Implementation I
Total given points: $x_1, \ldots, x_N$
Goal: Select a small subset greedily
Index set $I := \emptyset$ to collect the selected point indices
Various $N$-vectors for the steps $k = 0, 1, \ldots, N$:
\[ K_k := (K_k(x_j, x_j))_{1 \le j \le N}, \qquad M_k := (\mu^x K_k(x, x_j))_{1 \le j \le N}, \]
$R_k := (R_k(x_1), \ldots, R_k(x_N))^T$, computed componentwise as M_k.^2 ./ K_k (MATLAB notation)
Easy initialization for $k = 0$
Find the maximum of $R_0$, insert the point index into $I$
Further $N$-vectors:
\[ k_1 := (K_0(x_{I(1)}, x_j))_{1 \le j \le N}, \qquad V_1 := k_1 \big/ \sqrt{K_0(x_{I(1)}, x_{I(1)})} \]
Implementation II
Recursion $n \Rightarrow n+1$:
Given: $N$-vectors $K_n, M_n, V_1, \ldots, V_n$, index set $I$ with $n$ point indices
Maximize $R_n$ = M_n.^2 ./ K_n (componentwise); if the maximum is not too small, add the new point index to $I$
Now $I = \{y_1, \ldots, y_{n+1}\}$. Use
\[ K_n(y_{n+1}, x_j) = K(y_{n+1}, x_j) - \sum_{k=1}^{n} v_k(y_{n+1})\, v_k(x_j), \qquad\text{i.e.}\qquad k_{n+1} = k_{n+1,0} - \sum_{k=1}^{n} \bigl(e_{I(n+1)}^T V_k\bigr)\, V_k, \]
\[ V_{n+1} := k_{n+1} \big/ \sqrt{K_n(y_{n+1}, y_{n+1})}, \qquad K_{n+1} = K_n - V_{n+1}^{\,2} \ \text{(componentwise square)} \]
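Putting the pieces together, here is a hedged end-to-end sketch of the greedy loop in the spirit of these slides, with the $O(N)$ updates of $K_n$, $M_n$ and the Newton basis rows. The function name, the stopping tolerance, and the update $M_{n+1} = M_n - \mu(v_{n+1})\, V_{n+1}$ (which follows from the recursion for $\mu^x K_{k+1}$ on the Recursive Recovery slide) are my own choices, and the $\mu$-moments `mu_K`, `mu_mu_K` must again be supplied by the caller.

```python
import numpy as np

def greedy_nodal_recovery(kernel, x_all, mu_K, mu_mu_K, n_max, tol=1e-12):
    """Greedily select up to n_max of the N candidate points x_all for recovering mu
    from nodal data. Returns the selected indices I, the Newton basis values V,
    the coefficients mu(v_k), and the final squared error norm P^2_n(mu)."""
    N = len(x_all)
    Kdiag = kernel(x_all, x_all)                  # K_n(x_j, x_j), n = 0
    M = np.array(mu_K, dtype=float)               # mu^x K_n(x, x_j), n = 0
    P2 = float(mu_mu_K)                           # P_n^2(mu), n = 0
    V, mu_v, I = np.zeros((n_max, N)), np.zeros(n_max), []
    for n in range(n_max):
        R = M**2 / np.maximum(Kdiag, tol)         # R_n on all candidates
        if I:
            R[I] = 0.0                            # never re-select a point
        i = int(np.argmax(R))
        if R[i] <= tol:
            break                                 # stop: R_n is small everywhere
        I.append(i)
        col = kernel(x_all[i], x_all) - V[:n].T @ V[:n, i]  # K_n(x_{n+1}, x_j)
        V[n] = col / np.sqrt(col[i])              # new Newton basis function v_{n+1}
        mu_v[n] = M[i] / np.sqrt(col[i])          # mu(v_{n+1})
        Kdiag = Kdiag - V[n]**2                   # K_{n+1} = K_n - V_{n+1}.^2
        M = M - mu_v[n] * V[n]                    # mu^x K_{n+1}(x, x_j)
        P2 = P2 - mu_v[n]**2                      # P_{n+1}^2(mu)
    n = len(I)
    return I, V[:n], mu_v[:n], P2
```

Each pass costs $O(nN)$, dominated by the product with the stored rows of $V$, so selecting $n$ points costs $O(n^2 N)$ with $O(nN)$ storage, matching the overview slide. If the nodal weights $a_j^*$ are needed explicitly, they can be recovered afterwards from the values $\mu(v_k)$ and the matrix of Newton basis values at the selected points, which is triangular thanks to the zero property of the Newton basis.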