Singular value analysis of Joint Inversion

Jodi Mead, Department of Mathematics
James Ford, Clearwater Analytics

Thanks to: Diego Domenzain, John Bradford
NSF DMS-1418714
Outline

• Subsurface earth imaging with electromagnetic waves
• Inverse methods and regularization
• Discrete joint inversion and the SVD
• Green's function solutions of differential equations
• Continuous inversion and the SVE
• Joint singular values and Galerkin approximation
• Example of joint inversion
Near subsurface imaging

• Landfill investigation
• Mapping and monitoring of groundwater pollution
• Determination of depth to bedrock
• Locating sinkholes, cave systems, faults and mine shafts
• Landslide assessments
• Buried foundation mapping
Boise Hydrogeophysical Research Site

Field laboratory about 10 miles from downtown Boise
Near subsurface imaging with electromagnetic waves

• Electrical Resistivity Tomography
• Ground Penetrating Radar
Ground Penetrating Radar (GPR)

Damped wave: $\epsilon \frac{\partial^2 E}{\partial t^2} + \sigma \frac{\partial E}{\partial t} = \frac{1}{\mu} \nabla^2 E + f$

• Radar signals f are transmitted into the ground, and the energy that is reflected back to the surface is recorded.
• If there is a contrast between the properties of adjacent materials (permittivity ε, permeability µ and conductivity σ), a proportion of the electromagnetic pulse is reflected back.
• Subsurface structures are imaged by measuring the amplitude and travel time of this reflected energy.
Electrical Resistivity Tomography (ERT)

Diffusion: $-\nabla \cdot (\sigma \nabla \phi) = \nabla \cdot J_s$

• Current is passed through the ground via outer electrodes ($J_s$) and the potential difference φ is measured between an inner pair of electrodes.
• The method responds only to variability in the electrical resistivity ρ = 1/σ exhibited by earth materials.
• ERT data must be inverted to produce detailed electrical structure of the cross-sections below the survey lines.
Forward vs Inverse Modeling

Damped wave: $\epsilon \frac{\partial^2 E}{\partial t^2} + \sigma \frac{\partial E}{\partial t} = \frac{1}{\mu} \nabla^2 E + f$
  Forward: given ε, µ, σ, solve for E
  Inverse: given E, solve for ε, µ, σ

Diffusion: $-\nabla \cdot \left( \frac{1}{\rho} \nabla \phi \right) = \nabla \cdot J_s$
  Forward: given ρ, solve for φ
  Inverse: given φ, solve for ρ
Nonlinear regression

Newton's method for $F(m) = d$:

$$m_{k+1} = m_k + \Delta m, \qquad J(m_k)\, \Delta m = -\left( F(m_k) - d \right),$$

where $J_{ij} = \frac{\partial F_i}{\partial m_j}$. We focus on the linear problem $Gm = d$.
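To make the iteration concrete, here is a minimal Python sketch of Newton's method for $F(m) = d$; the toy forward map F and its Jacobian J are invented for illustration and are not from the talk.

```python
import numpy as np

def newton_inversion(F, J, d, m0, n_iter=20, tol=1e-10):
    """Newton's method for F(m) = d: each step solves the
    linear problem J(m_k) dm = -(F(m_k) - d)."""
    m = m0.copy()
    for _ in range(n_iter):
        dm = np.linalg.solve(J(m), -(F(m) - d))
        m = m + dm
        if np.linalg.norm(dm) < tol:
            break
    return m

# Hypothetical toy problem: F(m) = m^3 componentwise, so J is diagonal.
F = lambda m: m**3
J = lambda m: np.diag(3.0 * m**2)
d = np.array([8.0, 27.0])
print(newton_inversion(F, J, d, m0=np.array([1.0, 1.0])))  # -> [2. 3.]
```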
Full Disclosure....

This work is motivated by inverting both the damped wave and the diffusion equation simultaneously (joint inversion). However, we have only obtained results for the simplified equations

$$u'' = f \qquad \text{and} \qquad u'' + b^2 u = f.$$
Inverse Problems

Consider solving problems of the form $Gm = d$, where

• $G \in \mathbb{R}^{m \times n}$ - mathematical model
• $d \in \mathbb{R}^m$ - observed data
• $m \in \mathbb{R}^n$ - unknown model parameters

The parameters m cannot be resolved by the data d when G is ill-conditioned: $\kappa(G^T G)$ is "large", and computing the least squares solution $m = (G^T G)^{-1} G^T d$ is not possible.
Regularization

$$m_{L_p} = \operatorname*{argmin}_m \left\{ \|Gm - d\|_2^2 + \lambda \|L_p (m - m_0)\|_2^2 \right\}$$

• $m_0$ - initial estimate of m
• $L_p$ - typically represents the first (p = 1) or second (p = 2) derivative
• λ - regularization parameter

This gives estimates

$$m_{L_p} = m_0 + (G^T G + \lambda L_p^T L_p)^{-1} G^T (d - G m_0).$$
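A minimal numpy sketch of the regularized estimate above; the function name is mine, and in practice one would use a factorization rather than forming $G^T G$ explicitly.

```python
import numpy as np

def tikhonov_estimate(G, d, L, lam, m0):
    """m_Lp = m0 + (G^T G + lam L^T L)^{-1} G^T (d - G m0)."""
    lhs = G.T @ G + lam * L.T @ L
    return m0 + np.linalg.solve(lhs, G.T @ (d - G @ m0))
```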
Choice of λ

Methods: L-curve, Generalized Cross Validation (GCV), Morozov's Discrepancy Principle, UPRE, the χ² method¹, ...

$$m_{L_p} = \operatorname*{argmin}_m \left\{ \|Gm - d\|_2^2 + \lambda \|L_p (m - m_0)\|_2^2 \right\}, \qquad m_{L_p} = m_0 + (G^T G + \lambda L_p^T L_p)^{-1} G^T (d - G m_0)$$

• λ large → constraint $\|L_p (m - m_0)\|_2^2 \approx 0$ dominates
• λ small → the problem may stay ill-conditioned

¹ Mead et al., 2008, 2009, 2010, 2016
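As one concrete instance of these methods, here is a sketch of Morozov's discrepancy principle over a grid of candidate λ values; `delta` is an assumed noise level, and the grid search and names are illustrative, not the talk's implementation.

```python
import numpy as np

def discrepancy_lambda(G, d, L, m0, delta, lams):
    """Morozov: take the largest lambda whose residual ||G m - d|| <= delta."""
    for lam in sorted(lams, reverse=True):
        m = m0 + np.linalg.solve(G.T @ G + lam * L.T @ L,
                                 G.T @ (d - G @ m0))
        if np.linalg.norm(G @ m - d) <= delta:
            return lam, m
    raise ValueError("no lambda on the grid meets the discrepancy level")
```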
Choice of $L_p$

$$m_{L_p} = \operatorname*{argmin}_m \left\{ \|Gm - d\|_2^2 + \lambda \|L_p (m - m_0)\|_2^2 \right\}$$

• $L_0 = I$ - requires a good initial estimate $m_0$; otherwise it may not regularize the problem.
• $L_1$ - requires a first derivative estimate, which could be less information than $m_0$ (just changes in $m_0$).
• $L_2$ - requires a second derivative estimate; it leaves more degrees of freedom than the first derivative, so the data has more opportunity to inform changes in parameter estimates. (A sketch of these operators as matrices follows below.)
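As referenced in the list above, a small sketch of the discrete operators: identity, first difference, and second difference matrices built with numpy.

```python
import numpy as np

n = 6
L0 = np.eye(n)                         # L_0 = I: penalizes departure from m0
L1 = np.diff(np.eye(n), n=1, axis=0)   # (n-1) x n first-difference matrix
L2 = np.diff(np.eye(n), n=2, axis=0)   # (n-2) x n second-difference matrix
# L1 leaves constants unpenalized; L2 also leaves linear trends unpenalized,
# so L2 gives the data more freedom to inform the parameter estimates.
```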
Joint Inversion as Regularization

$$m_{12} = \operatorname*{argmin}_m \left\{ \|G_1 m - d_1\|_2^2 + \|G_2 m - d_2\|_2^2 \right\}$$

The objective function can be written

$$\left\| \begin{bmatrix} G_1 \\ G_2 \end{bmatrix} m - \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} \right\|_2^2 \equiv \|G_{12} m - d_{12}\|_2^2$$

Goal: improve the condition number, $\kappa(G_{12}) < \kappa(G_1), \kappa(G_2)$, where $\kappa(G) = \frac{\sigma_{\max}(G)}{\sigma_{\min}(G)}$.
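A toy numpy illustration (my own construction, not from the talk) of how stacking two individually ill-conditioned operators can yield a well-conditioned joint operator:

```python
import numpy as np

G1 = np.array([[1.0, 0.0], [0.0, 1e-6]])   # barely senses the 2nd parameter
G2 = np.array([[1e-6, 0.0], [0.0, 1.0]])   # barely senses the 1st parameter
G12 = np.vstack([G1, G2])                  # stacked joint operator

# kappa = sigma_max / sigma_min for each operator
print(np.linalg.cond(G1), np.linalg.cond(G2), np.linalg.cond(G12))
# kappa(G1) = kappa(G2) = 1e6, while kappa(G12) is essentially 1
```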
Singular Value Decomposition (SVD)

$$G_{12} = U \Sigma V^T \quad \Rightarrow \quad m_{12} = \sum_{i=1}^{n} \frac{U_{\cdot,i}^T d_{12}}{\sigma_i} V_{\cdot,i}$$

Truncated SVD (with the decomposition taken of the appropriate matrix):

$$m_1 = \sum_{i=1}^{k_1} \frac{U_{\cdot,i}^T d_1}{\sigma_i} V_{\cdot,i}, \qquad m_2 = \sum_{i=1}^{k_2} \frac{U_{\cdot,i}^T d_2}{\sigma_i} V_{\cdot,i}, \qquad m_{12} = \sum_{i=1}^{l} \frac{U_{\cdot,i}^T d_{12}}{\sigma_i} V_{\cdot,i}$$

Goal: keep as many singular values as possible, $l \gg k_1, k_2$.
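A minimal sketch of the truncated expansions above; `k` plays the role of $k_1$, $k_2$, or $l$, and the SVD is taken of whichever matrix ($G_1$, $G_2$, or the stacked $G_{12}$) is being inverted.

```python
import numpy as np

def tsvd_solve(G, d, k):
    """Truncated SVD solution: sum over i <= k of (U[:,i]^T d / sigma_i) V[:,i]."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ d) / s[:k])

# e.g. m12 = tsvd_solve(np.vstack([G1, G2]), np.concatenate([d1, d2]), l)
```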
Green's Functions

Let $L_A = L_A(t)$ be a linear differential operator. Then the corresponding Green's function $K_A(t, s)$ satisfies

$$L_A K_A(t, s) = \delta(t - s), \qquad (1)$$

where δ denotes the delta function, a generalized function satisfying $\int_\Omega \delta(t - s)\, h(s)\, ds = h(t)$.
Green's Function solutions of differential equations

Given the Green's function, we can find the solution to the inhomogeneous equation $L_A u(t) = f(t)$:

$$u(t) = \int_\Omega K_A(t, s) f(s)\, ds.$$

Forward problem: given f, find u. Inverse problem: given u, find f.

Conditioning of the inverse problem depends on the forward operator $A : H \to H_A$,

$$Ah(t) \equiv \int_\Omega K_A(t, s) h(s)\, ds.$$
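Numerically, this forward operator is just a quadrature discretization of the kernel. A small sketch with a midpoint rule on [0, 1] and an example kernel of my own choosing (the Heaviside kernel of the integration operator that appears later in the talk):

```python
import numpy as np

def greens_matrix(K, n):
    """Midpoint-rule discretization of (Ah)(t) = int_0^1 K(t, s) h(s) ds."""
    s = (np.arange(n) + 0.5) / n              # midpoint nodes
    return K(s[:, None], s[None, :]) / n      # G[i, j] = K(t_i, s_j) * ds

# Heaviside kernel K(t, s) = 1 for s <= t, i.e. (Ah)(t) = int_0^t h(s) ds
G = greens_matrix(lambda t, s: (s <= t).astype(float), 200)
print(np.linalg.cond(G))   # grows with n: the inverse problem is ill-conditioned
```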
Singular Value Expansion (SVE)

If A is compact,

$$A = \sum_{k=1}^{\infty} \sigma_k \, \psi_k \otimes \phi_k$$

where $\{\phi_k\} \subset H$ and $\{\psi_k\} \subset H_A$ are orthonormal and $\{\sigma_k\}$ are the singular values of A. Thus

$$Ah = \sum_{k=1}^{\infty} \sigma_k \langle \phi_k, h \rangle_H \, \psi_k$$

or $A \phi_k = \sigma_k \psi_k$ for all k.
Singular Values

The adjoint $A^* : H_A \to H$ is defined by $\langle Ah, h_A \rangle_{H_A} = \langle h, A^* h_A \rangle_H$, and

$$A^* = \sum_{k=1}^{\infty} \sigma_k \, \phi_k \otimes \psi_k.$$

The singular values of A are the square roots of the eigenvalues of $A^* A : H \to H$:

$$A^* A \phi = \sigma^2 \phi \quad \text{yields} \quad \{(\sigma_k, \phi_k)\}_{k=1}^{\infty}.$$
Least Squares

$$\min_h \|Ah - f\|_{H_A}^2 \quad \text{has solution} \quad h = A^\dagger f = \sum_{k=1}^{\infty} \frac{\langle \psi_k, f \rangle_{H_A}}{\sigma_k} \phi_k$$

Condition number: $\sigma_1 / \sigma_r$; but as $r \to \infty$, $\sigma_r \to 0$.

Decay rate q: $\sigma_k(A)$ decays like $k^{-q}$.

The decay rate of the singular values allows us to classify model conditioning.
Decay rate example

Consider $A : H \to H_A$ with $H = H_A = L^2(0, 1)$ and

$$Ah(t) = \int_0^t h(s)\, ds.$$

Solve $A^* A \phi = \sigma^2 \phi$ for σ and φ. With $\lambda = \sigma^2$, this problem is equivalent to

$$\lambda \phi'' + \phi = 0, \qquad \phi(1) = \phi'(0) = 0.$$

The singular functions and values are

$$\phi_k(t) = c_1 \cos\left( \frac{2k - 1}{2} \pi t \right) \quad \text{and} \quad \sigma_k = \frac{2}{(2k - 1)\pi}, \qquad k \in \mathbb{N}.$$
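A quick numerical check of these formulas; this is a sketch using a crude right-endpoint discretization of the integration operator, with n and the printed range chosen arbitrarily.

```python
import numpy as np

n = 500
A = np.tril(np.ones((n, n))) / n        # discretized (Ah)(t) = int_0^t h(s) ds
s_num = np.linalg.svd(A, compute_uv=False)

k = np.arange(1, 6)
s_exact = 2.0 / ((2 * k - 1) * np.pi)   # sigma_k = 2 / ((2k - 1) pi)
print(s_num[:5])
print(s_exact)                          # decay like k^{-1}: decay rate q = 1
```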
Decay rate example: singular value decay plotted on semi-log and log-log axes. [figure]
Tikhonov Operator

$T_\lambda : H \to H_A \times H$ is defined by

$$T_\lambda h = (Ah, \sqrt{\lambda}\, h)$$

where $H_A \times H = \{(h_A, h) : h_A \in H_A, h \in H\}$ with inner product

$$\langle (h_{A,1}, h_1), (h_{A,2}, h_2) \rangle_{H_A \times H} = \langle h_{A,1}, h_{A,2} \rangle_{H_A} + \langle h_1, h_2 \rangle_H.$$
Tikhonov regularization

We minimize

$$\|T_\lambda h - (f, 0)\|_{H_A \times H}^2 = \|Ah - f\|_{H_A}^2 + \lambda \|h\|_H^2.$$

Normal equations²:

$$T^* T h = T^* (f, 0)$$
$$T^* (Ah, \sqrt{\lambda}\, h) = T^* (f, 0)$$
$$A^* A h + \lambda h = A^* f + \sqrt{\lambda} \cdot 0$$
$$(A^* A + \lambda I) h = A^* f.$$

² Gockenbach, 2015
Pseudoinverse

The generalized inverse operator for the modified least squares problem is

$$A_\lambda^\dagger = (A^* A + \lambda I)^{-1} A^* = \sum_{k=1}^{\infty} \frac{\sigma_k}{\sigma_k^2 + \lambda} \phi_k \otimes \psi_k.$$

Notice that

$$\frac{\sigma_k}{\sigma_k^2 + \lambda} \to 0 \quad \text{as } k \to \infty,$$

and λ restricts the solution space.
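The discrete analog of $A_\lambda^\dagger$ can be applied through the SVD; a minimal sketch showing the damped coefficients $\sigma_k / (\sigma_k^2 + \lambda)$ (function name is mine):

```python
import numpy as np

def tikhonov_pseudoinverse_apply(G, d, lam):
    """(G^T G + lam I)^{-1} G^T d computed through the SVD of G."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    coeff = s / (s**2 + lam)   # -> 0 as sigma_k -> 0: lam restricts the solution
    return Vt.T @ (coeff * (U.T @ d))
```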
Joint Inversion

$$\|Ah - f\|_{H_A}^2 + \|Bh - g\|_{H_B}^2 \quad \text{with } A : H \to H_A \text{ and } B : H \to H_B.$$

Joint operator $C : H \to H_A \oplus H_B = \{(h_A, h_B) : h_A \in H_A, h_B \in H_B\}$ so that

$$Ch = (Ah, Bh), \qquad h \in H.$$

This is the continuous analog of stacking matrices.
Joint Operator Example

Define $H = L^2(0, 2\pi)$ and $H_A = H_B = \mathbb{R}$ with

$$Ah = \int_0^{2\pi} h(y) \delta(y - 5)\, dy \quad \text{and} \quad Bh = \int_0^{2\pi} h(y) \delta(y - 7)\, dy.$$

Then $C : H \to H_A \oplus H_B$ is given by

$$Ch = (Ah, Bh) = \left( \int_0^{2\pi} h(y) \delta(y - 5)\, dy, \int_0^{2\pi} h(y) \delta(y - 7)\, dy \right),$$

which traces a parametric curve in the space $H_A \oplus H_B = \mathbb{R}^2$ as h varies.
Joint Operator Example, con't

Consider $S = \{\cos kx : k \in \mathbb{R}, x \in [0, 2\pi]\} \subset H$. Then

$$C(\cos kx) = \left( \int_0^{2\pi} \cos(ky) \delta(y - 5)\, dy, \int_0^{2\pi} \cos(ky) \delta(y - 7)\, dy \right) = (\cos 5k, \cos 7k),$$

with image contained in $[-1, 1] \times [-1, 1]$.
Joint Operator Example: parametric curve defined by $C(\cos kx)$ for $k \in [-5, 5]$. [figure]
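The figure's curve can be reproduced directly from the formula on the previous slide; a short sketch (plotting left to the reader):

```python
import numpy as np

k = np.linspace(-5, 5, 2001)
curve = np.column_stack([np.cos(5 * k), np.cos(7 * k)])  # C(cos kx) = (cos 5k, cos 7k)
# A Lissajous-type curve whose image lies inside [-1, 1] x [-1, 1].
```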
Joint Singular Values

The adjoint $C^* : H_A \oplus H_B \to H$ is

$$C^* (h_A, h_B) = A^* h_A + B^* h_B,$$

so that

$$\sigma^2 \phi = C^* C \phi = C^* (A\phi, B\phi) = A^* A \phi + B^* B \phi. \qquad (2)$$
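In the discrete setting, equation (2) says the squared joint singular values are the eigenvalues of $G_1^T G_1 + G_2^T G_2$. A quick numpy check with random matrices of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
G1 = rng.standard_normal((4, 3))
G2 = rng.standard_normal((5, 3))

evals = np.linalg.eigvalsh(G1.T @ G1 + G2.T @ G2)     # sigma^2 from (2)
sv = np.linalg.svd(np.vstack([G1, G2]), compute_uv=False)

print(np.sqrt(evals[::-1]))   # square roots of eigenvalues, descending
print(sv)                     # singular values of the stacked matrix: identical
```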
Galerkin method for SVE

$A^{(n)}$ approximates A with orthonormal bases $\{q_i(s)\}_{i=1}^n$ and $\{p_j(t)\}_{j=1}^n$:

$$a_{ij}^{(n)} = \langle q_i, A p_j \rangle = \langle q_i, \langle K_A, p_j \rangle \rangle = \int_{\Omega_s} \int_{\Omega_t} q_i(s) K_A(s, t) p_j(t)\, dt\, ds, \qquad (3)$$

so that

$$A^{(n)} = U^{(n)} \Sigma^{(n)} \left( V^{(n)} \right)^T \quad \text{with} \quad \Sigma^{(n)} = \mathrm{diag}\left( \sigma_1^{(n)}, \sigma_2^{(n)}, \ldots, \sigma_n^{(n)} \right).$$
Convergence of Galerkin Method

Define

$$\left( \Delta_A^{(n)} \right)^2 = \|K_A\|^2 - \|A^{(n)}\|_F^2 = \sum_{i=1}^{\infty} \sigma_i^2 - \sum_{i=1}^{n} \left( \sigma_i^{(n)} \right)^2.$$

Then the following hold for all i and n, independent of the convergence of $\Delta_A^{(n)}$ to 0:³

1. $\sigma_i^{(n)} \leq \sigma_i^{(n+1)} \leq \sigma_i$
2. $0 \leq \sigma_i - \sigma_i^{(n)} \leq \Delta_A^{(n)}$

³ Hansen 1988, Renaut et al 2016
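These bounds can be observed numerically. Below is a sketch for the integration operator from the decay rate example, using orthonormal box functions on nested grids; evaluating (3) for that kernel gives $a_{ij}^{(n)} = 1/n$ below the diagonal and $1/(2n)$ on it (my derivation, under those basis assumptions).

```python
import numpy as np

def galerkin_matrix(n):
    """Galerkin matrix for (Ah)(t) = int_0^t h(s) ds with n orthonormal
    box functions on [0, 1]."""
    A = np.tril(np.full((n, n), 1.0 / n), k=-1)   # 1/n strictly below diagonal
    np.fill_diagonal(A, 1.0 / (2 * n))            # 1/(2n) on the diagonal
    return A

k = np.arange(1, 5)
print("exact:", 2.0 / ((2 * k - 1) * np.pi))      # sigma_i of the SVE
for n in (8, 16, 32):   # nested spaces: each grid refines the previous one
    s = np.linalg.svd(galerkin_matrix(n), compute_uv=False)
    print(n, s[:4])     # sigma_i^(n) increases with n, bounded above by sigma_i
```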