Greedy Sparse Linear Approximations of Functionals from Nodal Data
R. Schaback
Göttingen Academy of Sciences and Humanities
Georg-August-Universität Göttingen, Institut für Numerische und Angewandte Mathematik
http://www.num.math.uni-goettingen.de/schaback/research/group.html
MAIA 2013, Erice, Italy, Sept. 2013
Overview
◮ Linear Recovery
◮ Optimal Recovery
◮ Greedy Recovery
◮ Examples
Linear Recovery
Standard Problem of Numerical Analysis
◮ You have data, but you want different data
◮ You have a few data of an otherwise unknown function, but you want different data of that function
◮ Examples:
  ◮ Numerical integration
  ◮ Numerical differentiation
  ◮ PDE solving
◮ Given $\lambda_1(u), \dots, \lambda_N(u)$, find $\mu(u)$ for continuous linear functionals $\mu, \lambda_1, \dots, \lambda_N \in U^*$
◮ Assumption: Linear recovery
  $$\mu(u) \approx \sum_{j=1}^N a_j \lambda_j(u) \quad \text{for all } u \in U$$
Linear Recovery
◮ Linear recovery
  $$\mu(u) \approx \sum_{j=1}^N a_j \lambda_j(u) \quad \text{for all } u \in U$$
◮ Error functional
  $$\epsilon_a(u) := \mu(u) - \sum_{j=1}^N a_j \lambda_j(u) \quad \text{for all } u \in U,\ a \in \mathbb{R}^N$$
◮ Sharp error bound
  $$\Bigl| \mu(u) - \sum_{j=1}^N a_j \lambda_j(u) \Bigr| \le \|\epsilon_a\|_{U^*} \cdot \|u\|_U$$
◮ If $\|\epsilon_a\|_{U^*}$ were known: the error is known as a percentage of $\|u\|_U$
◮ Problem: calculate $\|\epsilon_a\|_{U^*}$ and optimize over $a$
Reproducing Kernel Hilbert Spaces
◮ Assumption: $U$ Hilbert space of functions on $\Omega$ with continuous and linearly independent point evaluations
◮ Theorem: Then $U$ is a RKHS with some positive definite kernel $K : \Omega \times \Omega \to \mathbb{R}$:
  $$u(x) = (u, K(x, \cdot))_U \quad \text{for all } x \in \Omega,\ u \in U$$
  $$\mu(u) = (u, \mu^x K(x, \cdot))_U \quad \text{for all } \mu \in U^*,\ u \in U$$
  $$(\lambda, \mu)_{U^*} = \lambda^x \mu^y K(x, y) \quad \text{for all } \lambda, \mu \in U^*$$
◮ Consequence:
  $$\|\epsilon_a\|_{U^*}^2 = \epsilon_a^x \epsilon_a^y K(x,y) = \mu^x \mu^y K(x,y) - 2 \sum_{j=1}^N a_j \, \mu^x \lambda_j^y K(x,y) + \sum_{j=1}^N \sum_{k=1}^N a_j a_k \, \lambda_j^x \lambda_k^y K(x,y)$$
◮ This quadratic form in $a$ can be explicitly calculated
◮ Different approximations can be compared
◮ This quadratic form can be optimized
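The quadratic form is easy to evaluate once the three kernel "moments" $\mu^x \mu^y K$, $\mu^x \lambda_j^y K$ and $\lambda_j^x \lambda_k^y K$ are available in closed form. A minimal sketch, assuming the kernel $K(x,y) = e^{-|x-y|}$ (a scaled Sobolev-Matérn kernel for $W_2^1$; the scaling constants are dropped here, so the numbers differ from the talk's tables) and the quadrature functional $\mu(u) = \int_{-1}^1 u$:

```python
import numpy as np

# Error norm ||eps_a||^2 of a quadrature rule in an RKHS, sketched for
# mu(u) = int_{-1}^{1} u(x) dx and the assumed kernel K(x,y) = exp(-|x-y|).
# All three kernel moments have closed forms for this choice.

def mumuK():                      # mu^x mu^y K = double integral of K
    return 2.0 * (1.0 + np.exp(-2.0))

def mulamK(xj):                   # mu^x lambda_j^y K = int_{-1}^1 K(x, x_j) dx
    return 2.0 - np.exp(-(1.0 + xj)) - np.exp(-(1.0 - xj))

def eps_norm2(points, weights):   # the quadratic form in a
    x = np.asarray(points, float)
    a = np.asarray(weights, float)
    G = np.exp(-np.abs(x[:, None] - x[None, :]))   # lambda_j^x lambda_k^y K
    b = mulamK(x)                                  # mu^x lambda_j^y K
    return mumuK() - 2.0 * a @ b + a @ G @ a

# Compare rules on 5 equidistant points in [-1, 1]:
x = np.linspace(-1.0, 1.0, 5)
trapez = np.array([0.25, 0.5, 0.5, 0.5, 0.25])    # composite trapezoidal
print(eps_norm2(x, trapez))
```

Since $\|\epsilon_a\|_{U^*}^2$ is a squared norm, the result is nonnegative for any weight vector, which makes a useful sanity check.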
Sobolev Spaces
◮ Assumption: $U$ Hilbert space of functions on $\Omega$ with continuous and linearly independent point evaluations
◮ Take $U := W_2^m(\mathbb{R}^d)$ with $m > d/2$
◮ Matérn-Sobolev kernel:
  $$K(x, y) = \|x - y\|_2^{\,m - d/2} \, K_{m - d/2}(\|x - y\|_2)$$
  with the modified Bessel function $K_{m-d/2}$ of the second kind
◮ For “nice” domains $\Omega \subset \mathbb{R}^d$: $W_2^m(\Omega)$ is norm-equivalent to $W_2^m(\mathbb{R}^d)$
Example: Numerical Integration in Sobolev Spaces
◮ 5 points in $[-1, 1]$
◮ Standard weights

  Points        ‖ε‖² in W¹₂(ℝ)*   W²₂(ℝ)*       W³₂(ℝ)*
  equidistant   0.103077          0.001034817   0.00016066
  Gauß          0.090650164       0.000409353   0.00000298
Optimal Recovery
Optimal Recovery in RKHS
◮ Minimize the quadratic form
  $$\|\epsilon_a\|_{U^*}^2 = \epsilon_a^x \epsilon_a^y K(x,y) = \mu^x \mu^y K(x,y) - 2 \sum_{j=1}^N a_j \, \mu^x \lambda_j^y K(x,y) + \sum_{j=1}^N \sum_{k=1}^N a_j a_k \, \lambda_j^x \lambda_k^y K(x,y)$$
◮ The solution $a^*$ satisfies the linear system
  $$\sum_{j=1}^N a_j^* \, \lambda_j^x \lambda_k^y K(x,y) = \mu^x \lambda_k^y K(x,y), \quad 1 \le k \le N$$
◮ Then
  $$\|\epsilon_{a^*}\|_{U^*}^2 = \mu^x \mu^y K(x,y) - \sum_{j=1}^N a_j^* \, \mu^x \lambda_j^y K(x,y)$$
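With nodal data $\lambda_j = \delta_{x_j}$ the linear system above has Gram matrix $G_{jk} = K(x_j, x_k)$ and right-hand side $b_k = \mu^x K(x, x_k)$, so the optimal weights are one solve away. A sketch under the same assumptions as before (exponential kernel $K(x,y) = e^{-|x-y|}$, $\mu(u) = \int_{-1}^1 u$; not the talk's exact scaling):

```python
import numpy as np

# Optimal recovery weights a* for mu(u) = int_{-1}^{1} u(x) dx under the
# assumed kernel K(x,y) = exp(-|x-y|): solve G a* = b, then
# ||eps_{a*}||^2 = mu^x mu^y K - a* . b  (the formula from the slide).

def optimal_weights(points):
    x = np.asarray(points, float)
    G = np.exp(-np.abs(x[:, None] - x[None, :]))          # Gram matrix
    b = 2.0 - np.exp(-(1.0 + x)) - np.exp(-(1.0 - x))     # mu^x K(x, x_k)
    a = np.linalg.solve(G, b)
    mumuK = 2.0 * (1.0 + np.exp(-2.0))                    # mu^x mu^y K
    eps2 = mumuK - a @ b                                  # ||eps_{a*}||^2
    return a, eps2

a, eps2 = optimal_weights(np.linspace(-1.0, 1.0, 5))
print(a.round(4), eps2)
```

For symmetric points the optimal weights come out symmetric, and the minimal error norm is strictly positive since $\mu$ is not in the span of the point evaluations.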
Connection to Interpolation
◮ Theorem: The optimal recovery equals the $\mu$-value of the interpolant of the data $\lambda_j(u)$, $1 \le j \le N$, from the span of the functions $\lambda_j^x K(x, \cdot)$, $1 \le j \le N$
◮ Proof steps:
  ◮ There is a Lagrange basis $u_1, \dots, u_N$ of that span with $\lambda_j(u_k) = \delta_{jk}$, $1 \le j, k \le N$
  ◮ The interpolant of the data $f_j = \lambda_j(u)$ is
    $$\tilde{u}(x) = \sum_{j=1}^N u_j(x) f_j = \sum_{j=1}^N u_j(x) \lambda_j(u)$$
  ◮ Hence
    $$\mu(u) \approx \mu(\tilde{u}) = \sum_{j=1}^N \mu(u_j) f_j = \sum_{j=1}^N \mu(u_j) \lambda_j(u)$$
  ◮ Then, as a recovery formula, $a_j = \mu(u_j)$, $1 \le j \le N$
Connection to Interpolation II
◮ Optimality? We need
  $$\sum_{j=1}^N \mu(u_j) \, \lambda_j^x \lambda_k^y K(x,y) = \mu^x \lambda_k^y K(x,y), \quad 1 \le k \le N$$
◮ The Lagrange property implies that the functions $\lambda_k^y K(\cdot, y)$ are interpolated exactly:
  $$\sum_{j=1}^N u_j(x) \, \lambda_j^s \lambda_k^y K(s,y) = \lambda_k^y K(x,y), \quad 1 \le k \le N,$$
  and applying $\mu$ in the variable $x$ proves optimality.
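The theorem can be checked numerically: applying the optimal weights to nodal data should match integrating the kernel interpolant itself. A sketch, again under the assumed exponential kernel $K(x,y) = e^{-|x-y|}$ and $\mu(u) = \int_{-1}^1 u$, with `np.cos` as an arbitrary test function:

```python
import numpy as np

# Optimal recovery = mu-value of the kernel interpolant (slide claim),
# checked numerically for mu(u) = int_{-1}^1 u and K(x,y) = exp(-|x-y|).

x = np.linspace(-1.0, 1.0, 5)
G = np.exp(-np.abs(x[:, None] - x[None, :]))
b = 2.0 - np.exp(-(1.0 + x)) - np.exp(-(1.0 - x))   # mu^x K(x, x_k)
a_star = np.linalg.solve(G, b)                      # optimal weights

u = np.cos                                          # any test function
c = np.linalg.solve(G, u(x))                        # interpolant coefficients
z = np.linspace(-1.0, 1.0, 20001)                   # fine grid for mu(s_u)
s_u = np.exp(-np.abs(z[:, None] - x[None, :])) @ c  # interpolant s_u on grid
h = z[1] - z[0]
mu_s = h * (s_u.sum() - 0.5 * (s_u[0] + s_u[-1]))   # trapezoidal integral

print(a_star @ u(x), mu_s)   # both approximate mu(u); they agree
```

Algebraically the two numbers are identical ($a^* \cdot u(x) = (G^{-1}b)^T u(x) = b^T G^{-1} u(x) = b \cdot c = \mu(s_u)$); the only discrepancy here is the fine-grid quadrature error.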
Example: Optimal Integration in Sobolev Spaces
◮ 5 points in $[-1, 1]$

  Points    Method     ‖ε‖² in W¹₂(ℝ)*   W²₂(ℝ)*      W³₂(ℝ)*
  equi      standard   0.103077          0.00103481   0.00016066
  equi      optimal    0.101896          0.00066343   0.00001082
  Gauß      standard   0.090650          0.00040935   0.00000298
  Gauß      optimal    0.085169          0.00040768   0.00000296
  optimal   optimal    0.063935          0.00019882   0.00000106

◮ Optimal points are chosen to minimize the error norm of the optimal recovery in Sobolev space
Summary of Recovery ◮ Consider linear recoveries of unknown data from known data ◮ Do this on a RKHS ◮ Evaluate the norm of the error functional ◮ This allows fair comparisons between recoveries ◮ Optimize the recovery weights ◮ Goal: Optimize the choice of data points ◮ Goal: Do this efficiently
Greedy Recovery
Greedy Recovery
Specialize to recovery from nodal data:
$$\mu(u) \approx \sum_{j=1}^N a_j^* \, u(x_j) \quad \text{for all } u \in U$$
with optimal weights $a_j^*$, but vary the points $x_j$.
Optimize the error norm in some RKHS, e.g. the Sobolev space $W_2^m(\mathbb{R}^d)$, $m > d/2$.
Note: $a_j^* = a_j^*(x_1, \dots, x_N)$
Goal: greedy recursive method, $N \mapsto N+1$
Goal: sparsity via greedy selection
Interpolation
For a Lagrange basis $u_1^N, \dots, u_N^N$: the interpolant to $u$ is
$$s_u(x) = \sum_{j=1}^N u_j^N(x) \, u(x_j) \quad \text{for all } u \in U$$
The optimal recovery is
$$\mu(u) \approx \mu(s_u) = \sum_{j=1}^N \underbrace{\mu(u_j^N)}_{=:\, a_j^*} u(x_j) \quad \text{for all } u \in U$$
Drawback: the Lagrange basis is not efficiently recursive
Goal: use the Newton basis recursively
Recursive Newton Basis
Points $x_1, x_2, \dots, x_k, x_{k+1}, \dots$
Recursive kernels (RS, 1997):
$$K_0(x,y) := K(x,y)$$
$$K_{k+1}(x,y) := K_k(x,y) - \frac{K_k(x_{k+1}, x) \, K_k(x_{k+1}, y)}{K_k(x_{k+1}, x_{k+1})}$$
Newton basis functions (St. Müller/RS, 2009):
$$v_{k+1}(x) := \frac{K_k(x_{k+1}, x)}{\sqrt{K_k(x_{k+1}, x_{k+1})}}$$
$$K_{k+1}(x,y) = K(x,y) - \sum_{j=1}^{k+1} v_j(x) v_j(y) = K_k(x,y) - v_{k+1}(x) v_{k+1}(y)$$
Properties of the Newton Basis
Orthonormality: $(v_j, v_k)_U = \delta_{jk}$
Zeros: $v_{k+1}(x_j) = 0$, $1 \le j \le k$
Power function for interpolation:
$$P_{k+1}^2(\delta_x) := \|\epsilon(\delta_x;\, \alpha_1^*(\delta_x), \dots, \alpha_{k+1}^*(\delta_x))\|_{U^*}^2 = K_{k+1}(x, x) \quad \text{(M. Mouattamid/RS, 2009)}$$
$$= K_k(x, x) - v_{k+1}^2(x) = P_k^2(\delta_x) - v_{k+1}^2(x)$$
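The kernel recursion restricted to a discrete candidate set is just repeated rank-one downdating of the kernel matrix. A sketch with an assumed kernel $K(x,y) = e^{-|x-y|}$ (so $K(x,x) = 1$); the zeros property and the nonnegativity of the power function $P_{k+1}^2(\delta_x) = K(x,x) - \sum_j v_j^2(x)$ can then be checked directly:

```python
import numpy as np

# Newton basis on a discrete point set via the kernel recursion
#   v_{k+1}(x)   = K_k(x_{k+1}, x) / sqrt(K_k(x_{k+1}, x_{k+1})),
#   K_{k+1}(x,y) = K_k(x,y) - v_{k+1}(x) v_{k+1}(y).
# Assumed kernel: K(x,y) = exp(-|x-y|).

def newton_basis(points, order):
    """Return V with V[k, j] = v_{k+1}(x_j); the centers are taken
    from `points` in the given index order."""
    x = np.asarray(points, float)
    Kk = np.exp(-np.abs(x[:, None] - x[None, :]))   # K_0 on the grid
    V = np.zeros((len(order), len(x)))
    for k, i in enumerate(order):
        v = Kk[i, :] / np.sqrt(Kk[i, i])            # v_{k+1} on the grid
        Kk = Kk - np.outer(v, v)                    # K_{k+1} on the grid
        V[k, :] = v
    return V

x = np.linspace(-1.0, 1.0, 9)
V = newton_basis(x, order=[0, 8, 4])
# Zeros property: v_{k+1} vanishes at all previously selected points
print(V[1, 0], V[2, 0], V[2, 8])
```

The residual diagonal $1 - \sum_k v_k^2(x_j)$ is exactly the power function $P^2(\delta_{x_j})$ on the grid and stays nonnegative, which is a good numerical stability check for the recursion.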
Recursive Recovery
Functional $\mu$, points $x_1, x_2, \dots, x_k, x_{k+1}, \dots$
Recursive equations I:
$$\mu^x K_0(x,y) = \mu^x K(x,y)$$
$$\mu^x K_{k+1}(x,y) = \mu^x K_k(x,y) - \frac{\mu^x K_k(x_{k+1}, x) \, K_k(x_{k+1}, y)}{K_k(x_{k+1}, x_{k+1})}$$
$$\mu^y \mu^x K_{k+1}(x,y) = \mu^y \mu^x K_k(x,y) - \frac{\mu^x K_k(x_{k+1}, x) \, \mu^y K_k(x_{k+1}, y)}{K_k(x_{k+1}, x_{k+1})} = \mu^y \mu^x K_k(x,y) - \mu(v_{k+1})^2$$
$$P_{k+1}^2(\mu) := \|\epsilon(\mu;\, \alpha_1^*(\mu), \dots, \alpha_{k+1}^*(\mu))\|_{U^*}^2 = \mu^x \mu^y K_{k+1}(x,y) = P_k^2(\mu) - \mu(v_{k+1})^2 = \mu^x \mu^y K(x,y) - \sum_{j=1}^{k+1} \mu(v_j)^2.$$
Goal: choose $x_{k+1}$ to maximize $\mu(v_{k+1})^2$
Recursive Recovery II
Functional $\mu$, points $x_1, x_2, \dots, x_k, x_{k+1}, \dots$
Goal: choose $x_{k+1}$ to maximize $\mu(v_{k+1})^2$
Recursive equations:
$$v_{k+1}(x) := \frac{K_k(x_{k+1}, x)}{\sqrt{K_k(x_{k+1}, x_{k+1})}}, \qquad \mu(v_{k+1}) = \frac{\mu^x K_k(x_{k+1}, x)}{\sqrt{K_k(x_{k+1}, x_{k+1})}}$$
Maximize as a function of $z$:
$$R_k(z) := \frac{\bigl(\mu^x K_k(z, x)\bigr)^2}{K_k(z, z)}$$
Stop if $R_k$ is small on the available points
Problems: efficiency? stability?
Implementation: Overview Total given points: x 1 , . . . , x N Goal: Select a small subset of n points greedily Computational Complexity: O ( nN ) for update n − 1 → n Computational Complexity: O ( n 2 N ) in total Storage: O ( nN ) for n Newton basis functions Similarly for interpolation of n data with evaluation on N points via Newton basis
Implementation I
Total given points: $x_1, \dots, x_N$
Goal: select a small subset greedily
Index set $I := \emptyset$ to collect the selected point indices
Various $N$-vectors for steps $k = 0, 1, \dots, N$:
$$K_k := (K_k(x_j, x_j))_{1 \le j \le N}, \qquad M_k := (\mu^x K_k(x, x_j))_{1 \le j \le N}$$
$$R_k := (R_k(x_1), \dots, R_k(x_N))^T, \quad \text{computed componentwise as } M_k^2 / K_k \ \ (\text{MATLAB: } M_k.\!\hat{\ }2 \ ./ K_k)$$
Easy initialization for $k = 0$: find the maximum of $R_0$, insert its point index into $I$
Further $N$-vectors:
$$k_1 := (K_0(x_{I(1)}, x_j))_{1 \le j \le N}, \qquad V_1 := k_1 / \sqrt{K_0(I(1))}$$
Implementation II
Recursion $n \Rightarrow n+1$:
Given: $N$-vectors $K_n$, $M_n$, $V_1, \dots, V_n$, and index set $I$ with $n$ point indices
Maximize $R_n$ (componentwise $M_n^2 / K_n$); if the maximum is not too small, add the new point index to $I$
Now $I = \{y_1, \dots, y_{n+1}\}$. Use
$$K_n(y_{n+1}, x_j) = K(y_{n+1}, x_j) - \sum_{k=1}^{n} v_k(y_{n+1}) v_k(x_j), \quad \text{i.e.} \quad k_{n+1} = k_{n+1,0} - \sum_{k=1}^{n} \bigl(e_{I(n+1)}^T V_k\bigr) V_k$$
with $k_{n+1,0} := (K(y_{n+1}, x_j))_{1 \le j \le N}$. Then
$$V_{n+1} := k_{n+1} / \sqrt{K_n(I(n+1))}, \qquad K_{n+1} = K_n - V_{n+1}^2 \ \ (\text{componentwise})$$
and, by the recursion for $\mu^x K_{k+1}$, also $M_{n+1} = M_n - \mu(v_{n+1}) V_{n+1}$ with $\mu(v_{n+1}) = M_n(I(n+1)) / \sqrt{K_n(I(n+1))}$.
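The whole greedy loop then touches only these $N$-vectors, giving the stated $O(nN)$ cost per step. A runnable sketch, again under the assumed exponential kernel $K(x,y) = e^{-|x-y|}$ and $\mu(u) = \int_{-1}^1 u$ (so the moments $\mu^x K(x, x_j)$ have the closed forms below); the function names are illustrative, not from the talk:

```python
import numpy as np

# Greedy point selection with O(nN) cost per step, following the
# N-vector recursion of the implementation slides, for the assumed
# kernel K(x,y) = exp(-|x-y|) and mu(u) = int_{-1}^1 u(x) dx.

def kernel(x, y):
    return np.exp(-np.abs(x[:, None] - y[None, :]))

def mu_K(x):                          # mu^x K(x, x_j), closed form
    return 2.0 - np.exp(-(1.0 + x)) - np.exp(-(1.0 - x))

def greedy_points(x, n_max, tol=1e-10):
    N = len(x)
    Kdiag = np.ones(N)                # K_k(x_j, x_j); here K(x,x) = 1
    M = mu_K(x)                       # mu^x K_k(x, x_j)
    P2 = 2.0 * (1.0 + np.exp(-2.0))   # P_0^2 = mu^x mu^y K
    I, V = [], []
    for _ in range(n_max):
        R = M**2 / np.maximum(Kdiag, 1e-15)   # R_k on all candidates
        i = int(np.argmax(R))
        if R[i] < tol:                # stop if R_k is small everywhere
            break
        v = kernel(x[i:i + 1], x)[0] - sum(Vk[i] * Vk for Vk in V)
        v = v / np.sqrt(v[i])         # v_{k+1} on the grid
        mu_v = M[i] / np.sqrt(Kdiag[i])
        Kdiag = Kdiag - v**2          # K_{k+1}(x_j, x_j)
        M = M - mu_v * v              # mu^x K_{k+1}(x, x_j)
        P2 = P2 - mu_v**2             # P_{k+1}^2(mu)
        I.append(i); V.append(v)
    return I, P2

x = np.linspace(-1.0, 1.0, 101)
I, P2 = greedy_points(x, 10)
print(len(I), P2)
```

Since $P_k^2(\mu)$ is a squared dual norm, it must stay nonnegative while decreasing monotonically; watching it is the natural stopping and stability diagnostic.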