Two-Point Gradient Methods for Nonlinear Ill-Posed Problems

Simon Hubmer and Ronny Ramlau
Johann Radon Institute for Computational and Applied Mathematics (RICAM)
Austrian Academy of Sciences (ÖAW), Linz, Austria

Joint Fudan - RICAM Seminar, July 8, 2020
www.ricam.oeaw.ac.at
Outline

1. Introduction
2. TPG Methods
3. Convergence Analysis
4. Numerical Examples
5. Recent Developments
The Problem

Hilbert spaces $X$ and $Y$, with norms $\|\cdot\|$.
Operator $F : X \to Y$, continuously Fréchet differentiable.
Noisy data $y^\delta \in Y$ and noise level $\delta \in \mathbb{R}^+$.

Problem: $F(x) = y^{(\delta)}$, where the noisy data $y^\delta$ satisfies $\|y - y^\delta\| \le \delta$.
Tikhonov Regularization

Required: initial guess $x_0$ and regularization parameter $\alpha$.

The method:
$$\min_x \left\{ \tfrac{1}{2} \|F(x) - y^\delta\|^2 + \tfrac{\alpha}{2} \|x - x_0\|^2 \right\}$$

Properties:
+ Weak conditions necessary for analysis.
+ Very versatile (different norms, regularization functionals).
− Computation of the minimizer ↔ HOW??
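As a minimal sketch of what that minimization involves, here are the discretized Tikhonov functional and its gradient in Python; F (forward operator) and jac_F (its Jacobian) are placeholder names for user-supplied routines, and any iterative minimizer answering the "HOW??" above needs at least these two ingredients.

import numpy as np

def tikhonov_functional(x, y_delta, x0, alpha, F):
    # 1/2 ||F(x) - y^delta||^2 + alpha/2 ||x - x0||^2
    r = F(x) - y_delta
    return 0.5 * (r @ r) + 0.5 * alpha * ((x - x0) @ (x - x0))

def tikhonov_gradient(x, y_delta, x0, alpha, F, jac_F):
    # gradient: F'(x)^* (F(x) - y^delta) + alpha (x - x0)
    return jac_F(x).T @ (F(x) - y_delta) + alpha * (x - x0)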
Landweber Iteration

Required: initial guess $x_0$ and stopping criterion.

The method:
$$x_{k+1}^\delta = x_k^\delta + F'(x_k^\delta)^* \left( y^\delta - F(x_k^\delta) \right)$$

Properties:
+ Easy to implement.
− Strong conditions necessary for analysis.
− Slow convergence, i.e., many iterations required.
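A minimal sketch of the Landweber iteration for a discretized problem, with placeholder names F and jac_F for the forward operator and its Jacobian; the discrepancy principle with parameter tau > 1 is used here as one common stopping criterion, not the only possibility.

import numpy as np

def landweber(F, jac_F, y_delta, x0, delta, tau=2.0, max_iter=10000):
    # x_{k+1} = x_k + F'(x_k)^* (y^delta - F(x_k)),
    # stopped once ||y^delta - F(x_k)|| <= tau * delta;
    # assumes the operator is scaled so that ||F'(x)|| <= 1
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        residual = y_delta - F(x)
        if np.linalg.norm(residual) <= tau * delta:
            break
        x = x + jac_F(x).T @ residual
    return x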
Second Order Methods

Levenberg-Marquardt method:
$$x_{k+1}^\delta = x_k^\delta + \left( F'(x_k^\delta)^* F'(x_k^\delta) + \alpha_k I \right)^{-1} F'(x_k^\delta)^* \left( y^\delta - F(x_k^\delta) \right)$$

Iteratively regularized Gauss-Newton method:
$$x_{k+1}^\delta = x_k^\delta + \left( F'(x_k^\delta)^* F'(x_k^\delta) + \alpha_k I \right)^{-1} \left( F'(x_k^\delta)^* \left( y^\delta - F(x_k^\delta) \right) + \alpha_k (x_0 - x_k^\delta) \right)$$

Properties:
+ Require far fewer iterations.
− Very strong conditions necessary for analysis.
− Require inversion of $(F'(x)^* F'(x) + \alpha I)$ in every iteration step → difficult and time-consuming.
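A sketch of a single step of each method for a discretized problem; the explicit linear solve makes the per-step cost mentioned above visible. F and jac_F are again placeholder names for the forward operator and its Jacobian.

import numpy as np

def levenberg_marquardt_step(x, y_delta, alpha, F, jac_F):
    # x_{k+1} = x_k + (F'(x_k)^* F'(x_k) + alpha_k I)^{-1} F'(x_k)^* (y^delta - F(x_k))
    K = jac_F(x)
    rhs = K.T @ (y_delta - F(x))
    return x + np.linalg.solve(K.T @ K + alpha * np.eye(x.size), rhs)

def irgn_step(x, x0, y_delta, alpha, F, jac_F):
    # iteratively regularized Gauss-Newton: same solve, with the
    # additional term alpha_k (x_0 - x_k) in the right-hand side
    K = jac_F(x)
    rhs = K.T @ (y_delta - F(x)) + alpha * (x0 - x)
    return x + np.linalg.solve(K.T @ K + alpha * np.eye(x.size), rhs)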
Acceleration Techniques

Landweber iteration with operator approximation:
$$x_{k+1}^\delta = x_k^\delta + \tilde{F}'(x_k^\delta)^* \left( y^\delta - \tilde{F}(x_k^\delta) \right)$$

Landweber iteration in Hilbert scales:
$$x_{k+1}^\delta = x_k^\delta + L^{-2s} F'(x_k^\delta)^* \left( y^\delta - F(x_k^\delta) \right)$$

Landweber iteration with intelligent stepsizes:
$$x_{k+1}^\delta = x_k^\delta + \alpha_k^\delta F'(x_k^\delta)^* \left( y^\delta - F(x_k^\delta) \right)$$

Examples: Steepest Descent, Barzilai-Borwein, Neubauer.
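Two of the listed stepsize rules, sketched in Python for the descent direction $s_k = F'(x_k^\delta)^* (y^\delta - F(x_k^\delta))$; the function names and call signatures are illustrative only.

import numpy as np

def steepest_descent_stepsize(K, s):
    # alpha_k = ||s_k||^2 / ||F'(x_k) s_k||^2, with K = F'(x_k);
    # minimizes the linearized residual along the direction s_k
    Ks = K @ s
    return (s @ s) / (Ks @ Ks)

def barzilai_borwein_stepsize(x, x_prev, g, g_prev):
    # classical Barzilai-Borwein choice alpha_k = <dx, dx> / <dx, dg>,
    # where g, g_prev are gradients of the residual functional at the
    # current and previous iterate
    dx, dg = x - x_prev, g - g_prev
    return (dx @ dx) / (dx @ dg)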
Connection: Residual Functional

$$\Phi(x) = \tfrac{1}{2} \|F(x) - y^\delta\|^2$$

Tikhonov = Minimize { $\Phi(x)$ + Regularization$(x)$ }.
Landweber = Gradient descent for $\Phi(x)$.
Levenberg-Marquardt = 2nd order descent for $\Phi(x)$.
Iteratively regularized Gauss-Newton = 2nd order descent for $\Phi(x)$ + Tikhonov-type stabilization.
Nesterov Acceleration

General minimization problem:
$$\min_x \{ \Phi(x) \}$$

Yurii Nesterov: instead of using gradient descent,
$$x_{k+1} = x_k - \omega \nabla \Phi(x_k),$$
use the following iteration:
$$z_k = x_k + \frac{k-1}{k+\alpha-1} (x_k - x_{k-1}), \qquad x_{k+1} = z_k - \omega \nabla \Phi(z_k).$$
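A minimal sketch of Nesterov's scheme in Python for a generic smooth objective; grad_Phi, the stepsize omega, and the extrapolation parameter alpha (a common default is alpha = 3) are placeholder names for user-supplied quantities.

import numpy as np

def nesterov(grad_Phi, x0, omega, alpha=3.0, n_iter=100):
    # z_k     = x_k + (k-1)/(k+alpha-1) * (x_k - x_{k-1})
    # x_{k+1} = z_k - omega * grad_Phi(z_k)
    x_prev = x = np.asarray(x0, dtype=float)
    for k in range(1, n_iter + 1):
        z = x + (k - 1) / (k + alpha - 1) * (x - x_prev)
        x_prev, x = x, z - omega * grad_Phi(z)
    return x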
Motivating Picture

[Figure, built up over several slides: starting from the points $x_{k-1}$ and $x_k$, the plain gradient step $\tilde{x}_{k+1} = x_k - \omega \nabla \Phi(x_k)$ is compared with the extrapolated point $z_k = x_k + \frac{k-1}{k+\alpha-1}(x_k - x_{k-1})$ and the accelerated update $x_{k+1} = z_k - \omega \nabla \Phi(z_k)$.]
What's so good about that?

Assume: $\Phi$ is convex.

Gradient descent:
$$\left| \Phi(x_k) - \Phi(x^\dagger) \right| = O(k^{-1})$$

Nesterov acceleration:
$$\left| \Phi(x_k) - \Phi(x^\dagger) \right| = O(k^{-2})$$

H. Attouch, J. Peypouquet, The rate of convergence of Nesterov's accelerated forward-backward method is actually $o(k^{-2})$, SIAM Journal on Optimization.
Y. Nesterov, A method of solving a convex programming problem with convergence rate $O(1/k^2)$, Soviet Mathematics Doklady, 27, 2, 1983.
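A small numerical illustration of the two rates on a made-up convex quadratic (a toy problem, not taken from the talk); after the same number of steps the accelerated iteration typically reaches a much smaller objective value.

import numpy as np

A = np.diag([1.0, 0.1, 0.001])        # ill-conditioned toy quadratic
Phi = lambda x: 0.5 * x @ (A @ x)     # minimum value 0 at x^dagger = 0
grad = lambda x: A @ x
omega = 1.0                           # stepsize 1/L, L = largest eigenvalue of A

x = np.ones(3)                        # plain gradient descent, O(1/k)
for k in range(200):
    x = x - omega * grad(x)

xp = xn = np.ones(3)                  # Nesterov with alpha = 3, O(1/k^2)
for k in range(1, 201):
    z = xn + (k - 1) / (k + 2) * (xn - xp)
    xp, xn = xn, z - omega * grad(z)

print(Phi(x), Phi(xn))                # the second value is typically much smaller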
Application to Nonlinear Ill-Posed Problems

For our problem, the method reads
$$z_k^\delta = x_k^\delta + \frac{k-1}{k+\alpha-1} \left( x_k^\delta - x_{k-1}^\delta \right), \qquad x_{k+1}^\delta = z_k^\delta + \alpha_k^\delta F'(z_k^\delta)^* \left( y^\delta - F(z_k^\delta) \right).$$

There is a generalization to deal with $\min \{ \Phi(x) + \Psi(x) \}$, which reads
$$z_k = x_k + \frac{k-1}{k+\alpha-1} (x_k - x_{k-1}), \qquad x_{k+1} = \mathrm{prox}_\Psi \left( z_k - \omega \nabla \Phi(z_k) \right).$$

⟹ Sparsity constraints, projections, etc.
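A sketch of the generalized (proximal) variant applied to the nonlinear problem; F, jac_F, prox_Psi and stepsize are placeholder names for user-supplied routines, and the extrapolation parameter is called nes_alpha here to avoid clashing with the stepsize $\alpha_k^\delta$ from the slide.

import numpy as np

def nesterov_prox_tpg(F, jac_F, y_delta, x0, prox_Psi, stepsize, nes_alpha=3.0, n_iter=100):
    # z_k     = x_k + (k-1)/(k+alpha-1) * (x_k - x_{k-1})
    # x_{k+1} = prox_Psi( z_k + alpha_k^delta * F'(z_k)^* (y^delta - F(z_k)) )
    x_prev = x = np.asarray(x0, dtype=float)
    for k in range(1, n_iter + 1):
        z = x + (k - 1) / (k + nes_alpha - 1) * (x - x_prev)
        residual = y_delta - F(z)
        x_prev, x = x, prox_Psi(z + stepsize(z, residual) * (jac_F(z).T @ residual))
    return x

For instance, prox_Psi = lambda v: np.maximum(v, 0.0) realizes a projection onto the nonnegative cone, while prox_Psi = lambda v: np.sign(v) * np.maximum(np.abs(v) - beta, 0.0) is the soft-thresholding operator arising from a sparsity constraint with weight beta.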
Neubauer's Linear Results

Assumptions: linear operator $F(x) = Tx$, source condition $x^\dagger \in \mathcal{R}\left( (T^*T)^\mu \right)$, a priori stopping rule.

If $0 \le \mu \le \tfrac{1}{2}$, then
$$k(\delta) = O\left( \delta^{-\frac{1}{2\mu+1}} \right), \qquad \left\| x_{k(\delta)}^\delta - x^\dagger \right\| = o\left( \delta^{\frac{2\mu}{2\mu+1}} \right).$$

If $\mu > \tfrac{1}{2}$, then
$$k(\delta) = O\left( \delta^{-\frac{2}{2\mu+3}} \right), \qquad \left\| x_{k(\delta)}^\delta - x^\dagger \right\| = o\left( \delta^{\frac{2\mu+1}{2\mu+3}} \right).$$

Similar results also hold when using the discrepancy principle.

A. Neubauer, On Nesterov acceleration for Landweber iteration of linear ill-posed problems, J. Inv. Ill-Posed Problems, Vol. 25, No. 3, 2017.
Two-Point Gradient (TPG) Methods

How about general methods of the form
$$z_k^\delta = x_k^\delta + \lambda_k^\delta \left( x_k^\delta - x_{k-1}^\delta \right), \qquad x_{k+1}^\delta = z_k^\delta + \alpha_k^\delta F'(z_k^\delta)^* \left( y^\delta - F(z_k^\delta) \right)?$$

Question: Do they converge under standard assumptions?

Yes, for linear problems and $\lambda_k^\delta = \frac{k-1}{k+\alpha-1}$ ← Neubauer
Yes, for $\lambda_k^\delta \to 0$ fast enough
Yes, for some explicit choices of $\lambda_k^\delta$
Yes, for $\lambda_k^\delta$ defined via a backtracking search
Yes, for $\lambda_k^\delta = \frac{k-1}{k+\alpha-1}$ and a locally convex residual functional
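A sketch of the general TPG iteration in Python, with lam and stepsize left as user-supplied callables (placeholder names) and stopping by the discrepancy principle; with lam = lambda k, x, xp: (k - 1) / (k + 2) and a constant stepsize this reduces to the Nesterov-type choice from the previous slides.

import numpy as np

def tpg(F, jac_F, y_delta, x0, delta, lam, stepsize, tau=2.0, max_iter=1000):
    # z_k     = x_k + lambda_k^delta * (x_k - x_{k-1})
    # x_{k+1} = z_k + alpha_k^delta * F'(z_k)^* (y^delta - F(z_k))
    # stopped once ||y^delta - F(x_k)|| <= tau * delta
    x_prev = x = np.asarray(x0, dtype=float)
    for k in range(1, max_iter + 1):
        if np.linalg.norm(y_delta - F(x)) <= tau * delta:
            break
        z = x + lam(k, x, x_prev) * (x - x_prev)
        residual = y_delta - F(z)
        x_prev, x = x, z + stepsize(z, residual) * (jac_F(z).T @ residual)
    return x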