Convergence results for the Bayesian inversion theory

Hanna Katriina Pikkarainen
Johann Radon Institute for Computational and Applied Mathematics (RICAM), Austrian Academy of Sciences

Workshop on Inverse and Partial Information Problems, Linz, Austria, October 27-31, 2008

In collaboration with Prof. Andreas Neubauer (Johannes Kepler University Linz)
Overview

1. Introduction
2. Convergence rates for the finite-dimensional problem with normality assumption
3. Convergence issues in the infinite-dimensional setting
4. References
Bayes formula

Assumptions:
- a probability space (Ω, F, P)
- X and Y random variables with values in R^n and R^m, respectively

Given:
- the prior probability density π_pr(x) of X
- the likelihood function π(y | x)

Posterior (conditional) probability density of X:

    π_post(x) = π(x | y) = π_pr(x) π(y | x) / π(y)    if π(y) ≠ 0.
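The formula can be illustrated numerically by discretizing x on a grid; the following sketch uses an arbitrary Gaussian prior and Gaussian likelihood (all parameter values are illustrative, not from the talk):

```python
import numpy as np

# Discrete toy version of the Bayes formula on a grid of x-values.
x = np.linspace(-3, 3, 601)
dx = x[1] - x[0]

prior = np.exp(-0.5 * x**2)                 # unnormalized prior pi_pr(x), here N(0, 1)
prior /= prior.sum() * dx                   # normalize to a density on the grid

y_data = 1.0
likelihood = np.exp(-0.5 * (y_data - x)**2 / 0.5**2)   # pi(y | x), noise std 0.5

evidence = (prior * likelihood).sum() * dx             # pi(y), must be nonzero
posterior = prior * likelihood / evidence              # pi(x | y)

print((posterior * x).sum() * dx)   # posterior mean, ≈ 0.8 for this prior/likelihood pair
```

For conjugate Gaussian prior and likelihood the posterior mean has the closed form (σ_pr^{-2}·0 + σ_noise^{-2}·y) / (σ_pr^{-2} + σ_noise^{-2}) = 0.8 here, which the grid computation reproduces.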
Bayesian inversion theory

Inverse problem: y = F(x)
- x ∈ X unknown
- y ∈ Y exact measurement
- F : X → Y operator with discontinuous inverse

Inverse problem in the Bayesian framework: Given a noisy measurement Y = y_data, find the posterior distribution of X.

The model of an inverse problem, a noise model, a noise distribution, and a prior distribution are combined via the Bayes formula into the posterior distribution.
Linear inverse problem with additive noise

Linear model for indirect measurements: Y = AX + E

Assumptions:
- X normal random variable with mean x_0 and covariance matrix Γ
- E normal random variable with mean 0 and covariance matrix Σ
- X and E mutually independent

    π_post(x) ∝ π_pr(x) π_noise(y_data − Ax) ∝ exp(−(1/2) (x − x_post)^T Γ_post^{−1} (x − x_post))

where

    x_post = (Γ^{−1} + A^T Σ^{−1} A)^{−1} (A^T Σ^{−1} y_data + Γ^{−1} x_0)

and

    Γ_post = (Γ^{−1} + A^T Σ^{−1} A)^{−1}.
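The posterior mean and covariance above translate directly into NumPy; a minimal sketch with an arbitrary 2×2 example (all matrix and data values chosen for illustration only):

```python
import numpy as np

# Illustrative linear-Gaussian model Y = AX + E (values chosen arbitrarily).
A = np.array([[1.0, 0.5],
              [0.2, 1.0]])
Gamma = np.eye(2)            # prior covariance of X
Sigma = 0.1 * np.eye(2)      # noise covariance of E
x0 = np.zeros(2)             # prior mean
y_data = np.array([1.0, 2.0])

# Posterior covariance and mean from the formulas on the slide.
Gamma_inv = np.linalg.inv(Gamma)
Sigma_inv = np.linalg.inv(Sigma)
Gamma_post = np.linalg.inv(Gamma_inv + A.T @ Sigma_inv @ A)
x_post = Gamma_post @ (A.T @ Sigma_inv @ y_data + Gamma_inv @ x0)

print(x_post)
print(Gamma_post)
```

As a sanity check, x_post solves the normal equations (Γ^{−1} + A^T Σ^{−1} A) x = A^T Σ^{−1} y_data + Γ^{−1} x_0, and Γ_post is symmetric positive definite.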
Posterior distribution as a random variable

The data y_data is a realization of the random variable y + E.

⇒ The posterior mean is a realization of the random variable

    X_post(ω) = (Γ^{−1} + A^T Σ^{−1} A)^{−1} (A^T Σ^{−1} (y + E(ω)) + Γ^{−1} x_0).    (1)

⇒ The posterior distribution µ_post is a realization of the random variable

    M_post : (Ω, F, P) → (M(R^n), ρ_P),  ω ↦ N(X_post(ω), Γ_post).    (2)

M(R^n) is the set of all Borel measures and ρ_P is the Prokhorov metric.
Prokhorov and Ky Fan metrics

Definition. Let µ_1 and µ_2 be Borel measures in a metric space (X, ρ_X). The distance between µ_1 and µ_2 in the Prokhorov metric is defined by

    ρ_P(µ_1, µ_2) := inf{ ε > 0 : µ_1(B) ≤ µ_2(B^ε) + ε  ∀ B ∈ B(X) }

where B(X) is the Borel σ-algebra in X. The set B^ε is the ε-neighbourhood of B, i.e., B^ε := { x ∈ X : inf_{z ∈ B} ρ_X(x, z) < ε }.

Definition. Let ξ_1 and ξ_2 be random variables in a probability space (Ω, F, P) with values in a metric space (X, ρ_X). The distance between ξ_1 and ξ_2 in the Ky Fan metric is defined by

    ρ_K(ξ_1, ξ_2) := inf{ ε > 0 : P(ρ_X(ξ_1(ω), ξ_2(ω)) > ε) < ε }.
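The Ky Fan distance can be estimated from samples of the pairwise distances ρ_X(ξ_1(ω_i), ξ_2(ω_i)): it is approximately the smallest ε for which the empirical probability of a distance exceeding ε drops below ε. A sketch (the estimator and the test distribution are illustrative, not from the talk):

```python
import numpy as np

def ky_fan_estimate(d):
    """Estimate the Ky Fan distance from samples d_i = rho_X(xi1(omega_i), xi2(omega_i)):
    the smallest eps among the sample values with empirical P(d > eps) < eps
    (ignoring ties; a rough Monte Carlo approximation of the infimum)."""
    d = np.sort(np.asarray(d, dtype=float))
    n = len(d)
    for i, eps in enumerate(d):
        # empirical P(d > eps) with eps = d[i] is (n - i - 1) / n
        if (n - i - 1) / n < eps:
            return eps
    return d[-1]

# Example: xi1 - xi2 ~ N(0, sigma^2); a small sigma gives a small Ky Fan distance.
rng = np.random.default_rng(0)
sigma = 0.01
samples = np.abs(rng.normal(0.0, sigma, size=100_000))
print(ky_fan_estimate(samples))
```

For |N(0, σ²)| distances, the estimate solves P(|d| > ε) ≈ ε, which for σ = 0.01 lands a bit above 2σ; identical random variables give distance 0.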
Lifting theorem

Theorem. Let ξ_1, ξ_2 and η_1, η_2 be random variables on metric spaces (X, ρ_X) and (Y, ρ_Y), respectively. Let

    ρ_X(ξ_1(ω), ξ_2(ω)) ≤ Φ(ρ_Y(η_1(ω), η_2(ω)))  for almost all ω ∈ Ω

where Φ is a monotonically increasing right-continuous function. Then

    ρ_K(ξ_1, ξ_2) ≤ max{ ρ_K(η_1, η_2), Φ(ρ_K(η_1, η_2)) }.
x_0-minimum norm least squares solution

- Σ = σ² Σ̂ with ‖Σ̂‖ = 1
- ‖x‖_Q = (x^T Q^{−1} x)^{1/2} for a positive definite symmetric matrix Q
- x† is the x_0-minimum norm least squares solution: x† minimizes the residual ‖Ax − y‖_Σ̂ and, among all minimizers, it minimizes ‖x − x_0‖_Γ.
- (λ_i², v_i) is an orthonormal eigensystem of Γ^{1/2} A^T Σ̂^{−1} A Γ^{1/2} with λ_1 ≥ … ≥ λ_p > λ_{p+1} = … = λ_n = 0
- V_1 = (v_1, …, v_p) and V_2 = (v_{p+1}, …, v_n)
- µ_{x†} denotes the normal distribution N(x†, Γ^{1/2} V_2^T V_2 Γ^{1/2}). If the null space of A is trivial, µ_{x†} := δ_{x†}.
Convergence rates

Theorem. Let X_post and M_post be defined by (1) and (2), respectively. Then

    ρ_K(E, 0) = O(σ √(1 + |log σ|)).

Furthermore,

    ρ_K(X_post, x†) = O(σ √(1 + |log σ|))

and

    ρ_K(M_post, µ_{x†}) = O(σ √(1 + |log σ|)).

⇒ (order optimal) convergence rates with the same order as the noise
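The σ √(1 + |log σ|) factor comes from the random noise term; the deterministic part of the posterior mean already converges to x† as σ → 0. This can be checked with exact data y = A x for a rank-deficient A, where the posterior mean with Σ = σ² I and Γ = I, x_0 = 0 tends to the minimum norm least squares solution (the matrices and data below are arbitrary illustrations):

```python
import numpy as np

# Illustration: posterior mean -> x0-minimum norm least squares solution as sigma -> 0.
# A is rank-deficient (nontrivial null space); Gamma = I, x0 = 0, exact data.
A = np.array([[1.0, 1.0]])
y = np.array([2.0])
x_dagger = np.linalg.pinv(A) @ y   # minimum norm least squares solution, here (1, 1)

for sigma in [1.0, 1e-2, 1e-4]:
    # x_post = (Gamma^{-1} + sigma^{-2} A^T A)^{-1} sigma^{-2} A^T y with Gamma = I, x0 = 0
    G = np.linalg.inv(np.eye(2) + A.T @ A / sigma**2)
    x_post = G @ (A.T @ y / sigma**2)
    print(sigma, np.linalg.norm(x_post - x_dagger))
```

With exact data the error of the deterministic part decays like σ² here; the stated Ky Fan rate is slower because it must also absorb the Gaussian noise E.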
Regularization with projection

    Tx = y  ⇒  T_n x = Q_n y,  T_n = Q_n T

- x ∈ X, a real separable Hilbert space
- y ∈ Y, a real separable Hilbert space
- T ∈ L(X, Y)
- {Y_n} finite-dimensional subspaces of R(T)
- Q_n : Y → Y_n orthogonal projector
- lim_{n→∞} ‖(I − Q_n) y‖ = 0 for all y ∈ R(T)

Prior information: X ∼ N(x_0, Γ)
Wanted: stable solutions in the space X_n = Γ T* Y_n
Bayesian approach

Additive noise: E_n ∼ N(0, σ² Q_n)

Posterior mean:

    X_post,n(ω) = (σ² Γ^{−1} + T* Q_n T)^{−1} (T* Q_n (y + E_n(ω)) + σ² Γ^{−1} P_n x_0)

- T* adjoint of T : X → Y
- X_Γ = R(Γ^{1/2}) and ⟨x_1, x_2⟩_Γ = ⟨Γ^{−1/2} x_1, Γ^{−1/2} x_2⟩
- P_n : X_Γ → X_n orthogonal projector

Posterior distribution:

    M_post,n : (Ω, F, P) → (M(X), ρ_P),  ω ↦ N(X_post,n(ω), Γ_post,n)

where Γ_post,n = σ² (σ² Γ^{−1} + T* Q_n T)^{−1}.

Assumption: σ n^c ≥ 1 for some c > 1  ⇒  ρ_K(E_n, 0) = O(σ n^{1/2})
Weighted Bayesian approach

Weighted posterior mean:

    X^α_post,n(ω) = (α I + T# Q_n T)^{−1} (T# Q_n (y + E_n(ω)) + α P_n x_0)    (3)

- X_Γ = R(Γ^{1/2}) and ⟨x_1, x_2⟩_Γ = ⟨Γ^{−1/2} x_1, Γ^{−1/2} x_2⟩
- T# = Γ T*, the adjoint of T : X_Γ → Y
- P_n : X_Γ → X_n orthogonal projector
- X_n,Γ: the space X_n equipped with the X_Γ-norm

Weighted posterior distribution:

    M^α_post,n : (Ω, F, P) → (M(X_n,Γ), ρ_P),  ω ↦ N(X^α_post,n(ω), Γ^α_post,n)    (4)

where Γ^α_post,n = σ² (α I + T# Q_n T)^{−1}.
Convergence result

Theorem. Let X^α_post,n and M^α_post,n be defined by (3) and (4), respectively, and suppose that T is compact. If σ n^{1/2} → 0 and σ n^c ≥ 1 for some c > 1, and if α → 0 and σ² n / α → 0 as σ → 0 and n → ∞, then

    ρ_K(X^α_post,n, x†) = O(‖(I − P_n) x†‖_Γ) + o(1)

where x† = T† y. Moreover,

    ρ_K(M^α_post,n, δ_{P_n x†}) = o(1)

where δ_{P_n x†} is the point measure at P_n x†.
Convergence rates

Theorem. Let X^α_post,n and M^α_post,n be defined by (3) and (4), respectively, and suppose that T is compact. Moreover, assume that x_0 ∈ X_Γ fulfills the source condition

    (I − P) x_0 − x† = (T# T)^µ v,  v ∈ N(T)^⊥,  µ ∈ (0, 1]

where P is the orthogonal projector from X_Γ onto N(T) ⊂ X_Γ and x† = T† y. If σ n^{1/2} → 0 and σ n^c ≥ 1 for some c > 1, and if α ∼ (σ n^{1/2})^{2/(2µ+1)} as σ → 0 and n → ∞, then

    ρ_K(X^α_post,n, x†) = O(‖(I − P_n) x†‖_Γ) + O(γ_n^{2µ} + (σ n^{1/2})^{2µ/(2µ+1)})

where γ_n = ‖(I − Q_n) T‖_{X_Γ → Y}. Moreover,

    ρ_K(M^α_post,n, δ_{P_n x†}) = O(γ_n^{2µ} + (σ n^{1/2})^{2µ/(2µ+1)})

where δ_{P_n x†} is the point measure at P_n x†.