Statistical Properties of the Regularized Least Squares Functional and a Hybrid LSQR-Newton Method for Finding the Regularization Parameter: Application in Image Deblurring and Signal Restoration

Rosemary Renaut
Midwest Conference on Mathematical Methods for Images and Surfaces, April 18, 2009
National Science Foundation: Division of Computational Mathematics
Outline
1. Motivation
2. Least Squares Problems
3. Statistical Results for Least Squares
4. Implications of Statistical Results for Regularized Least Squares
5. Newton algorithm
6. Algorithm with LSQR (Paige and Saunders)
7. Results
8. Conclusions and Future Work
Signal/Image Restoration: Integral Model of Signal Degradation

$$ b(t) = \int K(t,s)\, x(s)\, ds $$

- $K(t,s)$ describes the blur of the signal.
- Convolutional model: the invariant kernel $K(t,s) = K(t-s)$ is the Point Spread Function (PSF).
- Typically sampling includes noise $e(t)$, so the model is
$$ b(t) = \int K(t-s)\, x(s)\, ds + e(t) $$
- Discrete model: given discrete samples $b$, find samples $x$ of the continuous signal $x(s)$. Let $A$ discretize $K$, assumed known; the model is
$$ b = A x + e. $$
- Naively invert the system to find $x$! (See the sketch below.)
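To make the discrete model concrete, here is a minimal sketch with all names and values assumed for illustration (grid size, Gaussian PSF width, and noise level are hypothetical, not the talk's test problem): it builds a blurring matrix A from a Gaussian PSF, forms b = A x + e, and then inverts naively.

import numpy as np

# Hypothetical 1-D test signal on a uniform grid (not the talk's example)
n = 64
t = np.linspace(0, 1, n)
x_true = (np.abs(t - 0.3) < 0.1).astype(float) + np.exp(-200 * (t - 0.7) ** 2)

# Discretize the convolution with a Gaussian PSF: A[i, j] = h * K(t_i - t_j)
sigma_psf = 0.03                       # assumed PSF width
h = t[1] - t[0]
K = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * sigma_psf ** 2))
A = h * K / (np.sqrt(2 * np.pi) * sigma_psf)

rng = np.random.default_rng(0)
e = 1e-3 * rng.standard_normal(n)      # small white noise
b = A @ x_true + e                     # discrete model b = A x + e

x_naive = np.linalg.solve(A, b)        # naive inversion amplifies the noise
print(np.linalg.cond(A))
print(np.linalg.norm(x_naive - x_true) / np.linalg.norm(x_true))

Even this small, mildly blurred problem produces a naive reconstruction that is dominated by amplified noise, which is exactly what the next two figures illustrate.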
Example: 1-D Original and Blurred Noisy Signal

[Figure: the original signal $x$ and the blurred, noisy signal $b$, obtained with a Gaussian PSF.]
The Solution: Regularization is Needed

[Figure: the naive solution versus a regularized solution.]
Least Squares for $Ax = b$: A Quick Review

Consider discrete systems $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $x \in \mathbb{R}^n$:
$$ A x = b + e. $$

Classical Approach: Linear Least Squares
$$ x_{LS} = \arg\min_x \| A x - b \|_2^2 $$

Difficulty: $x_{LS}$ is sensitive to changes in the right-hand side $b$ when $A$ is ill-conditioned. For convolutional models the system is numerically ill-posed.
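A short sketch of this sensitivity, again with assumed values (the same hypothetical Gaussian blur matrix as above): solving the least squares problem for b and for a slightly perturbed copy of b gives markedly different solutions because A is ill-conditioned.

import numpy as np

rng = np.random.default_rng(5)
n = 64
t = np.linspace(0, 1, n)
sigma_psf = 0.03                       # assumed PSF width, as in the earlier sketch
h = t[1] - t[0]
A = h * np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * sigma_psf ** 2)) \
    / (np.sqrt(2 * np.pi) * sigma_psf)
x_true = np.exp(-200 * (t - 0.5) ** 2)
b = A @ x_true

x1, *_ = np.linalg.lstsq(A, b, rcond=None)
x2, *_ = np.linalg.lstsq(A, b + 1e-6 * rng.standard_normal(n), rcond=None)
print(np.linalg.cond(A))                               # very large
print(np.linalg.norm(x2 - x1) / np.linalg.norm(x1))    # large relative change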
Introduce Regularization to Pick a Solution

Weighted fidelity with regularization:
$$ x_{RLS} = \arg\min_x \{ \| b - A x \|^2_{W_b} + \lambda^2 R(x) \} $$

- $W_b$ is a weighting matrix.
- $R(x)$ is a regularization term.
- $\lambda$ is a regularization parameter, which is unknown.

The solution $x_{RLS}(\lambda)$ depends on $\lambda$, on the regularization operator $R$, and on the weighting matrix $W_b$.
Generalized Tikhonov Regularization With Weighting

$$ \hat{x} = \arg\min_x J(x) = \arg\min_x \{ \| A x - b \|^2_{W_b} + \lambda^2 \| D (x - x_0) \|^2 \}. \quad (1) $$

- $D$ is a suitable operator, often a derivative approximation. Assume $\mathcal{N}(A) \cap \mathcal{N}(D) = \{0\}$.
- $x_0$ is a reference solution, often $x_0 = 0$.

Given multiple measurements of the data:
- Usually there is error in $b$: $e$ is an $m$-vector of random measurement errors with mean $0$ and positive definite covariance matrix $C_b = E(e e^T)$.
- For uncorrelated measurements $C_b$ is a diagonal matrix containing the variances of the errors (colored noise).
- For white noise $C_b = \sigma^2 I$.
- Weighting the data-fit term by $W_b = C_b^{-1}$ makes the weighted errors $\tilde{e}$, theoretically, uncorrelated.

Question: Given $D$ and $W_b$, how do we find $\lambda$? (A solver sketch for fixed $\lambda$ follows below.)
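One common way to compute the solution of (1) for a fixed lambda, shown here as a sketch under assumed inputs (the stacked formulation is standard, not necessarily the implementation used in the talk): solve the equivalent stacked least squares problem [W_b^{1/2} A; lambda D] x ~ [W_b^{1/2} b; lambda D x_0].

import numpy as np

def tikhonov_solve(A, b, D, lam, Wb_half, x0=None):
    """Solve (1) for fixed lambda via the stacked least squares system
       [Wb_half @ A; lam * D] x ~ [Wb_half @ b; lam * D @ x0]."""
    m, n = A.shape
    if x0 is None:
        x0 = np.zeros(n)
    K = np.vstack([Wb_half @ A, lam * D])
    rhs = np.concatenate([Wb_half @ b, lam * (D @ x0)])
    x_hat, *_ = np.linalg.lstsq(K, rhs, rcond=None)
    return x_hat

Solving the stacked system avoids forming the normal equations explicitly, which matters when A is as ill-conditioned as in the deblurring examples.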
Example: Solutions for Increasing $\lambda$, $D = I$

[Figure sequence: regularized solutions as $\lambda$ increases, from under-smoothed to over-smoothed.]
Choice of $\lambda$ is Crucial

Different algorithms yield different solutions. Examples:
- Discrepancy Principle (a sketch follows below)
- Generalized Cross Validation (GCV)
- L-Curve
- Unbiased Predictive Risk (UPRE)

General difficulties:
- Expensive (GCV, L-curve, UPRE)
- Not necessarily a unique solution (GCV)
- Oversmoothing (Discrepancy)
- No kink in the L-curve

A new statistical approach: the $\chi^2$ result.
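For concreteness, a sketch of one of the listed methods, the discrepancy principle, under assumed inputs (white noise of known variance sigma^2, D = I, x_0 = 0, and a bracketed lambda interval); geometric bisection on lambda is just one simple way to locate the matching residual and is not the talk's method.

import numpy as np

def discrepancy_lambda(A, b, sigma, lam_lo=1e-6, lam_hi=1e2, iters=50):
    """Find lambda so that ||A x(lambda) - b||^2 is about m * sigma^2.
       Assumes the target residual is bracketed by [lam_lo, lam_hi]."""
    m, n = A.shape
    target = m * sigma**2

    def residual_sq(lam):
        # Tikhonov solution with D = I, x0 = 0
        x = np.linalg.solve(A.T @ A + lam**2 * np.eye(n), A.T @ b)
        return np.linalg.norm(A @ x - b) ** 2

    for _ in range(iters):
        lam = np.sqrt(lam_lo * lam_hi)          # geometric bisection
        if residual_sq(lam) < target:           # residual too small: under-smoothed
            lam_lo = lam
        else:
            lam_hi = lam
    return np.sqrt(lam_lo * lam_hi)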
Background: Statistics of the Least Squares Problem

Theorem (Rao73: First Fundamental Theorem). Let $r$ be the rank of $A$ and let $b \sim N(Ax, \sigma_b^2 I)$ (the errors in the measurements are normally distributed with mean $0$ and covariance $\sigma_b^2 I$). Then
$$ J = \min_x \| A x - b \|^2 \sim \sigma_b^2 \chi^2(m - r). $$
$J$ follows a $\chi^2$ distribution with $m - r$ degrees of freedom: basically the Discrepancy Principle.

Corollary (Weighted Least Squares). For $b \sim N(Ax, C_b)$ and $W_b = C_b^{-1}$,
$$ J = \min_x \| A x - b \|^2_{W_b} \sim \chi^2(m - r). $$
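A quick Monte Carlo check of the corollary, with an assumed random test matrix and noise level: for white noise of variance sigma_b^2 the scaled minimum residual should have mean m - r and variance 2(m - r).

import numpy as np

rng = np.random.default_rng(1)
m, n = 50, 10
A = rng.standard_normal((m, n))              # full column rank, so r = n
x_true = rng.standard_normal(n)
sigma_b = 0.1

samples = []
for _ in range(5000):
    b = A @ x_true + sigma_b * rng.standard_normal(m)
    x_ls, res, *_ = np.linalg.lstsq(A, b, rcond=None)
    samples.append(res[0] / sigma_b**2)      # weighted residual, W_b = I / sigma_b^2

samples = np.asarray(samples)
print(samples.mean(), m - n)                 # mean of chi^2(m - r) is m - r
print(samples.var(), 2 * (m - n))            # variance is 2(m - r)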
Extension: Statistics of the Regularized Least Squares Problem

Theorem: $\chi^2$ distribution of the regularized functional (Renaut/Mead 2008)
$$ \hat{x} = \arg\min_x J_D(x) = \arg\min_x \{ \| A x - b \|^2_{W_b} + \| x - x_0 \|^2_{W_D} \}, \quad W_D = D^T W_x D. \quad (2) $$

Assume:
- $W_b$ and $W_x$ are symmetric positive definite.
- The problem is uniquely solvable: $\mathcal{N}(A) \cap \mathcal{N}(D) = \{0\}$.
- The Moore-Penrose generalized inverse of $W_D$ is $C_D$.
- Statistics: $(b - A x) = e \sim N(0, C_b)$, $(x - x_0) = f \sim N(0, C_D)$, where $x_0$ is the mean vector of the model parameters.

Then $J_D \sim \chi^2(m + p - n)$, where $p$ is the number of rows of $D$.
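A Monte Carlo sketch of the theorem in its simplest setting, with assumed sizes and noise levels: D = I (so p = n and the degrees of freedom reduce to m), white noise in b, and a white prior on x - x_0. The sample mean of J_D should be close to m.

import numpy as np

rng = np.random.default_rng(2)
m, n = 40, 15
A = rng.standard_normal((m, n))
x0 = np.zeros(n)
sigma_b, sigma_x = 0.1, 0.5
Wb = np.eye(m) / sigma_b**2                      # W_b = C_b^{-1}
WD = np.eye(n) / sigma_x**2                      # W_D = D^T W_x D with D = I

samples = []
for _ in range(5000):
    x = x0 + sigma_x * rng.standard_normal(n)    # (x - x0) ~ N(0, C_D)
    b = A @ x + sigma_b * rng.standard_normal(m) # e ~ N(0, C_b)
    xhat = np.linalg.solve(A.T @ Wb @ A + WD, A.T @ Wb @ b + WD @ x0)
    JD = (A @ xhat - b) @ Wb @ (A @ xhat - b) + (xhat - x0) @ WD @ (xhat - x0)
    samples.append(JD)

print(np.mean(samples), m)                       # mean of chi^2(m + p - n) = m here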
Key Aspects of the Proof I: The Functional $J$

Algebraic simplifications: rewrite the functional as a quadratic form. The regularized solution is given in terms of the resolution matrix $R(W_D)$:
$$ \hat{x} = x_0 + (A^T W_b A + D^T W_x D)^{-1} A^T W_b r, \quad r = b - A x_0 \quad (3) $$
$$ \;\; = x_0 + R(W_D) W_b^{1/2} r = x_0 + y(W_D), \quad (4) $$
$$ R(W_D) = (A^T W_b A + D^T W_x D)^{-1} A^T W_b^{1/2}. \quad (5) $$

The functional is given in terms of the influence matrix $\mathcal{A}(W_D)$:
$$ \mathcal{A}(W_D) = W_b^{1/2} A R(W_D), \quad (6) $$
$$ J_D(\hat{x}) = r^T W_b^{1/2} (I_m - \mathcal{A}(W_D)) W_b^{1/2} r, \quad \text{let } \tilde{r} = W_b^{1/2} r, \quad (7) $$
$$ \;\; = \tilde{r}^T (I_m - \mathcal{A}(W_D))\, \tilde{r}. \quad \text{A quadratic form.} \quad (8) $$
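A small numerical check of the algebra in (3)-(8), with assumed sizes and weights (D = I, multiples of the identity for W_b and W_x): form the resolution and influence matrices and confirm that the quadratic form (8) reproduces the directly evaluated functional at the regularized solution.

import numpy as np

rng = np.random.default_rng(3)
m, n = 30, 12
A = rng.standard_normal((m, n))
D = np.eye(n)                          # p = n in this small example
sigma_b, sigma_x = 0.1, 0.5
Wb = np.eye(m) / sigma_b**2            # W_b = C_b^{-1}
Wb_half = np.eye(m) / sigma_b          # W_b^{1/2} (W_b is a multiple of I here)
Wx = np.eye(n) / sigma_x**2
x0 = np.zeros(n)
b = A @ rng.standard_normal(n) + sigma_b * rng.standard_normal(m)

WD = D.T @ Wx @ D
M = A.T @ Wb @ A + WD

R = np.linalg.solve(M, A.T @ Wb_half)  # resolution matrix, equation (5)
Ainf = Wb_half @ A @ R                 # influence matrix, equation (6)

r = b - A @ x0
r_tilde = Wb_half @ r
J_quadform = r_tilde @ (np.eye(m) - Ainf) @ r_tilde            # equation (8)

xhat = x0 + np.linalg.solve(M, A.T @ Wb @ r)                   # equation (3)
J_direct = (A @ xhat - b) @ Wb @ (A @ xhat - b) + (xhat - x0) @ WD @ (xhat - x0)
print(J_quadform, J_direct)            # agree to rounding error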
Key Aspects of the Proof II: Properties of a Quadratic Form

$\chi^2$ distribution of quadratic forms $x^T P x$ for normal variables (Fisher-Cochran Theorem):
- The components $x_i$ are independent normal variables, $x_i \sim N(0, 1)$, $i = 1 : n$.
- A necessary and sufficient condition for $x^T P x$ to have a central $\chi^2$ distribution is that $P$ is idempotent, $P^2 = P$. In that case the number of degrees of freedom of the $\chi^2$ distribution is $\mathrm{rank}(P) = \mathrm{trace}(P)$.
- When the means of the $x_i$ are $\mu_i \neq 0$, $x^T P x$ has a non-central $\chi^2$ distribution with non-centrality parameter $c = \mu^T P \mu$.
- A $\chi^2$ random variable with $n$ degrees of freedom and non-centrality parameter $c$ has mean $n + c$ and variance $2(n + 2c)$.
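A numerical illustration of these facts, with an assumed projector: P is the orthogonal projector onto a random k-dimensional subspace, hence idempotent with trace(P) = k, and sample means and variances of x^T P x should match the central and non-central chi^2 formulas.

import numpy as np

rng = np.random.default_rng(4)
n, k = 20, 7
Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
P = Q @ Q.T                                   # idempotent: P @ P = P
mu = 0.3 * np.ones(n)
c = mu @ P @ mu                               # non-centrality parameter

central, noncentral = [], []
for _ in range(20000):
    x = rng.standard_normal(n)
    central.append(x @ P @ x)
    noncentral.append((x + mu) @ P @ (x + mu))

print(np.trace(P), np.mean(central))          # both approximately k
print(k + c, np.mean(noncentral))             # mean of the non-central chi^2
print(2 * (k + 2 * c), np.var(noncentral))    # variance 2(k + 2c)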