iterative regularization via dual diagonal descent


Iterative regularization via dual diagonal descent
Silvia Villa, Department of Mathematics, University of Genoa
IHP, Paris, April 1st, 2019

Outline: Introduction and motivation; Quadratic data fit; General data fit.


Introduction and motivation / Iterative regularization at work

Recall that
$$\|\hat x_t - x^\dagger\| \;\le\; \underbrace{\|\hat x_t - x_t\|}_{\text{stability}} \;+\; \underbrace{\|x_t - x^\dagger\|}_{\text{optimization}}$$
[Figure: the original image, the iterate $\hat x_t$, and the noisy datum $\hat y$.]



Quadratic data fit / Derivation of the algorithm and convergence results / Dual problem

$$\min_{Ax = y} R(x) \;\longleftrightarrow\; \min_{x \in \mathcal H}\, R(x) + \iota_{\{y\}}(Ax),$$
where $\iota_{\{y\}}(x) = 0$ if $x = y$ and $\iota_{\{y\}}(x) = +\infty$ otherwise.

The dual problem is
$$\min_{u \in \mathcal G}\, d(u), \qquad d(u) = R^*(-A^*u) + \langle y, u\rangle.$$
$R$ strongly convex $\Rightarrow$ the dual is smooth.
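For completeness, the short derivation of this dual (standard Lagrangian duality, not spelled out on the slide): with $L(x, u) = R(x) + \langle u, Ax - y\rangle$,
$$\inf_x L(x, u) = \inf_x\big(R(x) + \langle A^*u, x\rangle\big) - \langle u, y\rangle = -\sup_x\big(\langle -A^*u, x\rangle - R(x)\big) - \langle u, y\rangle = -R^*(-A^*u) - \langle u, y\rangle,$$
so maximizing the Lagrange dual over $u$ is exactly minimizing $d(u) = R^*(-A^*u) + \langle y, u\rangle$.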


Quadratic data fit / Derivation of the algorithm and convergence results / Dual gradient descent

We can apply gradient descent to the dual problem:
$$x_t = \nabla R^*(-A^* u_t), \qquad u_{t+1} = u_t + \gamma\,(A x_t - y).$$
A.k.a. the linearized Bregman iteration [Yin-Osher-Burger, several papers; Bachmayr-Burger, 2005].

When $R = \|\cdot\|^2/2$, it becomes the Landweber algorithm
$$x_{t+1} = (I - \gamma A^*A)\, x_t + \gamma A^* y,$$
i.e., the gradient method applied to $(1/2)\|Ax - y\|^2$.
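A minimal NumPy sketch of this iteration, assuming the illustrative choice $R(x) = \mu\|x\|_1 + \tfrac12\|x\|^2$ (for which $\nabla R^*$ is soft-thresholding); the function and variable names here are ours, not from the talk:

```python
import numpy as np

def soft(v, thr):
    """Soft-thresholding: the proximal map of thr * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

def dual_gradient_descent(A, y, n_iter, gamma, mu=0.0):
    """Dual gradient descent for min R(x) s.t. Ax = y, with
    R(x) = mu*||x||_1 + 0.5*||x||^2, hence grad R*(v) = soft(v, mu).
    mu = 0 recovers the Landweber iteration."""
    u = np.zeros(A.shape[0])           # dual variable
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft(-A.T @ u, mu)         # x_t = grad R*(-A* u_t)
        u = u + gamma * (A @ x - y)    # gradient step on the dual
    return x
```

A safe step size is $\gamma = \sigma_R / \|A\|^2$ (here $\sigma_R = 1$), e.g. `gamma = 1.0 / np.linalg.norm(A, 2)**2`.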


Quadratic data fit / Derivation of the algorithm and convergence results / Accelerated dual gradient descent

We can apply an accelerated gradient descent to the dual problem:
$$x_t = \nabla R^*(-A^* u_t), \quad z_t = \nabla R^*(-A^* w_t), \quad w_t = u_t + \alpha_t\,(u_t - u_{t-1}), \quad u_{t+1} = w_t + \gamma\,(A z_t - y),$$
with $\alpha_t = \dfrac{t-1}{t+\alpha}$, $\alpha \ge 2$.

When $R = \|\cdot\|^2/2$, it becomes an accelerated Landweber algorithm:
$$z_t = x_t + \alpha_t\,(x_t - x_{t-1}), \qquad x_{t+1} = z_t - \gamma A^*(A z_t - y),$$
i.e., accelerated gradient applied to $(1/2)\|Ax - y\|^2$.
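The same sketch with inertia, under the same illustrative $R$ (reusing `np` and `soft` from the previous block; names are ours):

```python
def accelerated_dual_gradient_descent(A, y, n_iter, gamma, mu=0.0, alpha=3.0):
    """Accelerated dual gradient descent with R(x) = mu*||x||_1 + 0.5*||x||^2;
    mu = 0 gives an accelerated Landweber method."""
    u_prev = u = np.zeros(A.shape[0])
    for t in range(1, n_iter + 1):
        a_t = (t - 1.0) / (t + alpha)        # inertial parameter alpha_t
        w = u + a_t * (u - u_prev)           # extrapolated dual point
        z = soft(-A.T @ w, mu)               # z_t = grad R*(-A* w_t)
        u_prev, u = u, w + gamma * (A @ z - y)
    return soft(-A.T @ u, mu)                # primal iterate x_t = grad R*(-A* u_t)
```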


Quadratic data fit / Derivation of the algorithm and convergence results / A technical condition

1. Existence of a solution of the dual (for the exact $y$): needed for convergence rates.
2. From convergence on the dual to convergence on the primal.

Qualification (source) condition (only for the exact datum): there exists $q \in \mathcal G$ such that $A^*q \in \partial R(x^\dagger)$.

The same condition is needed to establish rates for Tikhonov regularization.

Quadratic data fit / Derivation of the algorithm and convergence results / Dual gradient descent is a regularization method

Theorem (Matet-Rosasco-V.-Vu, 2017). Assume there exists $q \in \mathcal G$ such that $A^*q \in \partial R(x^\dagger)$, and let $u^\dagger$ be a solution of the dual problem. For every $\delta > 0$ there exists $t_\delta \sim \delta^{-1}$ such that
$$\|x_{t_\delta} - x^\dagger\| \lesssim \delta^{1/2}.$$


Quadratic data fit / Derivation of the algorithm and convergence results / Accelerated dual gradient descent is a regularization method

Theorem (Matet-Rosasco-V.-Vu, 2017). Assume there exists $q \in \mathcal G$ such that $A^*q \in \partial R(x^\dagger)$, and let $u^\dagger$ be a solution of the dual problem. For every $\delta > 0$ there exists $t_\delta \sim \delta^{-1/2}$ such that
$$\|x_{t_\delta} - x^\dagger\| \lesssim \delta^{1/2}.$$
Based on the results of [Aujol-Dossal, 2016]; for $R = \|\cdot\|^2/2$ see also [A. Neubauer, 2016].

What is the difference? Gradient descent: $t_\delta \sim \delta^{-1}$. Accelerated gradient descent: $t_\delta \sim \delta^{-1/2}$. The accuracy is the same; acceleration reaches it in far fewer iterations.
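A purely indicative reading of the two rates, with constants suppressed (our arithmetic, not a slide): for a noise level $\delta = 10^{-4}$,
$$t_\delta \sim \delta^{-1} = 10^{4} \ \text{(gradient)} \qquad \text{vs.} \qquad t_\delta \sim \delta^{-1/2} = 10^{2} \ \text{(accelerated)},$$
both stopped iterates achieving $\|x_{t_\delta} - x^\dagger\| \lesssim \delta^{1/2} = 10^{-2}$.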


General data fit / Dual descent algorithm / General data fit

If $D(Ax, y) \neq \|Ax - y\|^2$, the previous approach does not work.

Tikhonov regularization: the original hierarchical problem is replaced by
$$\text{minimize}\ \ \frac{1}{\lambda}\, D(Ax, y) + R(x),$$
for a suitable $\lambda > 0$, and an algorithm is chosen to compute $x_{t+1} = \mathrm{Algo}(x_t, \lambda)$.

A diagonal approach [Lemaire, 1980s-90s]: $x_{t+1} = \mathrm{Algo}(x_t, \lambda_t)$, with $\lambda_t \to 0$.
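A schematic sketch of the two strategies; `algo_step(x, lam)` stands for one iteration of any solver for the penalized problem, and this interface is our assumption:

```python
import numpy as np

def diagonal(x0, lambdas, algo_step):
    """Diagonal strategy: one algorithm step per penalty value lambda_t -> 0."""
    x = x0
    for lam in lambdas:
        x = algo_step(x, lam)
    return x

def warm_restart(x0, lambdas, algo_step, tol, max_inner=10_000):
    """Warm-restart (Tikhonov) strategy: solve each penalized problem to
    accuracy tol, starting from the previous solution."""
    x = x0
    for lam in lambdas:
        for _ in range(max_inner):
            x_new = algo_step(x, lam)
            done = np.linalg.norm(x_new - x) <= tol
            x = x_new
            if done:
                break
    return x
```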

General data fit / Dual descent algorithm / A picture

[Figure omitted.] The previous approach allows one to describe both a diagonal strategy and a warm restart strategy.


General data fit / Dual descent algorithm / A dual approach

Diagonal forward-backward: [Attouch, Cabot, Czarnecki, Peypouquet, ...]
- Not well suited if $D$ is not smooth.
- Requires knowing the conditioning of $D(A\,\cdot\,; y)$ (which might not exist).

$$\begin{array}{ccc}
\min R(x)\ \text{s.t.}\ D(Ax, y) = 0 & \longrightarrow & \min\ \tfrac{1}{\lambda}\, D(Ax, y) + R(x) \\[2pt]
\uparrow & & \downarrow \\[2pt]
\min_{u \in \mathcal G}\ \underbrace{\langle u, y\rangle + R^*(-A^*u)}_{=\, d(u)} & \longleftarrow & \underbrace{\tfrac{1}{\lambda}\, D^*(\lambda u, y) + R^*(-A^*u)}_{=\, d_\lambda(u)}
\end{array}$$


General data fit / Dual descent algorithm / Dual diagonal descent algorithm (3D)

If $R = F + (\sigma_R/2)\|\cdot\|^2$ is strongly convex:
$$d_\lambda(u) = \underbrace{R^*(-A^*u)}_{\text{smooth}} + \underbrace{\tfrac{1}{\lambda}\, D^*(\lambda u, y)}_{\text{nonsmooth}}$$
We can use the forward-backward splitting algorithm on the dual. With $u_0 \in \mathcal G$, $\lambda_t \to 0$, $\tau = \sigma_R/\|A\|^2$:
$$x_t = \nabla R^*(-A^* u_t) = \mathrm{prox}_{\sigma_R^{-1} F}\big(-A^* u_t / \sigma_R\big)$$
$$z_{t+1} = u_t + \tau A x_t$$
$$u_{t+1} = z_{t+1} - \tau\, \mathrm{prox}_{(\tau\lambda_t)^{-1} D(\cdot,\, y)}\big(\tau^{-1} z_{t+1}\big)$$
(the last step can equivalently be written as $u_{t+1} = \mathrm{prox}_{\tau \lambda_t^{-1} D^*(\lambda_t\,\cdot\,,\, y)}(z_{t+1})$, via the Moreau decomposition).
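A minimal sketch of the (3D) loop, assuming the illustrative choices $R(x) = \mu\|x\|_1 + (\sigma/2)\|x\|^2$ (so $\nabla R^*(v) = \mathrm{soft}(v, \mu)/\sigma$) and $D(u, y) = \|u - y\|_1$, whose prox is explicit; `soft` and `np` are the helpers defined earlier:

```python
def three_d(A, y, lambdas, mu=1.0, sigma=1.0):
    """Dual diagonal descent (3D) sketch with R = mu*||.||_1 + (sigma/2)*||.||^2
    and D(u, y) = ||u - y||_1; lambdas is the vanishing sequence lambda_t."""
    u = np.zeros(A.shape[0])
    x = np.zeros(A.shape[1])
    tau = sigma / np.linalg.norm(A, 2) ** 2
    for lam in lambdas:
        x = soft(-A.T @ u, mu) / sigma                      # x_t = grad R*(-A* u_t)
        z = u + tau * (A @ x)                               # forward step (smooth part)
        prox_D = y + soft(z / tau - y, 1.0 / (tau * lam))   # prox of (tau*lam)^{-1} D(., y)
        u = z - tau * prox_D                                # backward step (Moreau identity)
    return x
```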


General data fit / Dual descent algorithm / Convergence of the dual diagonal descent algorithm

AD1) $D : \mathcal G \times \mathcal G \to [0, +\infty]$ and $D(u, y) = 0 \iff u = y$.
AD2) For some $p \in [1, +\infty]$, $D(\cdot, y)$ is $p$-well conditioned.
AR) There exists a solution $\bar x$ such that $A\bar x = y$ and $\bar x \in \mathrm{dom}\, R$.

Theorem [Garrigos-Rosasco-V., 2017]. Suppose $(\lambda_t^{1/(p-1)}) \in \ell^1(\mathbb N)$. Let $x^\dagger$ be the solution of (P), and assume there exists $q \in \mathcal G$ such that $A^*q \in \partial R(x^\dagger)$. Then
$$\|x_t - x^\dagger\| = o(t^{-1/2}).$$


General data fit / Dual descent algorithm / Stability

$$\|\hat x_t - x^\dagger\| \;\le\; \underbrace{\|\hat x_t - x_t\|}_{\text{stability}} \;+\; \underbrace{\|x_t - x^\dagger\|}_{\text{optimization}}$$

Stability Theorem [Garrigos-Rosasco-V., 2017]. Assume the source/qualification condition holds. Let $\hat y \in Y$ with $\|\hat y - y\| \le \delta$, and let $(\hat x_t, \hat u_t)$ be the sequence generated by the (3D) algorithm with $y = \hat y$ and $\hat u_0 = u_0$. Suppose $(\lambda_t^{1/(p-1)}) \in \ell^1(\mathbb N)$. Then
$$\|x_t - \hat x_t\| \le C\,\delta\, t.$$
For simplicity, here $D(u, y) = L(u - y)$; this is not needed.

General data fit / Dual descent algorithm / Stability with respect to errors = iterative regularization results

Theorem (Early stopping) [Garrigos-Rosasco-V., 2017]. Assume the source/qualification condition holds. Let $\hat y \in Y$ with $\|\hat y - y\| \le \delta$, and let $(\hat x_t, \hat u_t)$ be the sequence generated by the (3D) algorithm with $y = \hat y$ and $\hat u_0 = u_0$. Suppose $(\lambda_t^{1/(p-1)}) \in \ell^1(\mathbb N)$. Then there exists an early stopping rule $t(\delta) \sim \delta^{-2/3}$ which verifies
$$\|\hat x_{t(\delta)} - x^\dagger\| = O(\delta^{1/3}) \quad \text{as } \delta \to 0.$$

General data fit / Accelerated dual descent algorithm / Accelerated dual diagonal descent algorithm (A3D)

If $R = F + (\sigma_R/2)\|\cdot\|^2$ is strongly convex:
$$d_\lambda(u) = \underbrace{R^*(-A^*u)}_{\text{smooth}} + \underbrace{\tfrac{1}{\lambda}\, D^*(\lambda u, y)}_{\text{nonsmooth}}$$
We can use the accelerated forward-backward splitting algorithm on the dual. With $u_0 \in \mathcal G$, $\lambda_t \to 0$, $\tau = \sigma_R/\|A\|^2$:
$$x_t = \nabla R^*(-A^* u_t), \qquad w_t = u_t + \alpha_t\,(u_t - u_{t-1}), \qquad s_t = \nabla R^*(-A^* w_t),$$
$$z_{t+1} = w_t + \tau A s_t, \qquad u_{t+1} = z_{t+1} - \tau\, \mathrm{prox}_{(\tau\lambda_t)^{-1} D(\cdot,\, y)}\big(\tau^{-1} z_{t+1}\big),$$
where both $\nabla R^*$ evaluations can be computed as $\mathrm{prox}_{\sigma_R^{-1} F}$, as in (3D).
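The corresponding accelerated sketch, under the same illustrative $R$ and $D$ as in the (3D) sketch above:

```python
def a3d(A, y, lambdas, mu=1.0, sigma=1.0, alpha=3.0):
    """Accelerated dual diagonal descent (A3D) sketch: accelerated
    forward-backward on the dual, inertial parameter (t-1)/(t+alpha)."""
    u_prev = u = np.zeros(A.shape[0])
    tau = sigma / np.linalg.norm(A, 2) ** 2
    for t, lam in enumerate(lambdas, start=1):
        a_t = (t - 1.0) / (t + alpha)                       # alpha_t
        w = u + a_t * (u - u_prev)                          # extrapolated dual point
        s = soft(-A.T @ w, mu) / sigma                      # s_t = grad R*(-A* w_t)
        z = w + tau * (A @ s)                               # forward step
        prox_D = y + soft(z / tau - y, 1.0 / (tau * lam))   # prox of (tau*lam)^{-1} D(., y)
        u_prev, u = u, z - tau * prox_D                     # backward step + inertia update
    return soft(-A.T @ u, mu) / sigma                       # primal iterate x_t
```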


General data fit / Accelerated dual descent algorithm / (A3D) is a regularization method

Theorem (Early stopping) [Calatroni-Garrigos-Rosasco-V., 2019]. Assume the source/qualification condition holds. Let $\hat y \in Y$ with $\|\hat y - y\| \le \delta$, and let $(\hat x_t, \hat u_t)$ be the sequence generated by the (A3D) algorithm with $y = \hat y$ and $\hat u_0 = u_0$. Suppose $(t\,\lambda_t^{1/(p-1)}) \in \ell^1(\mathbb N)$. Then there exists an early stopping rule $t(\delta) \sim \delta^{-1/2}$ which verifies
$$\|\hat x_{t(\delta)} - x^\dagger\| = O(\delta^{1/2}) \quad \text{as } \delta \to 0.$$
For simplicity, here $D(u, y) = L(u - y)$; this is not needed.
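Placing the two early stopping results side by side (same assumptions, constants suppressed):
$$\text{(3D)}:\ t(\delta) \sim \delta^{-2/3},\ \ \|\hat x_{t(\delta)} - x^\dagger\| = O(\delta^{1/3}); \qquad \text{(A3D)}:\ t(\delta) \sim \delta^{-1/2},\ \ \|\hat x_{t(\delta)} - x^\dagger\| = O(\delta^{1/2}).$$
Acceleration both stops earlier and attains a better accuracy.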

General data fit / Experimental results / Setting

- Deblurring and denoising (salt-and-pepper, Gaussian, Gaussian + salt-and-pepper, and Poisson noise) of 512 x 512 images.
- Comparison between the two versions: diagonal and warm restart.
- Diagonal: one parameter, the sequence $(\lambda_t)$, whose length equals the number of iterations. Warm restart: two parameters, the sequence $(\lambda_t)$ and the accuracy.

General data fit / Experimental results / Diagonal works as well as warm restart (i.e., Tikhonov)

[Figure: Euclidean distance from the true image. Dotted lines: diagonal with $10^3$ and $10^4$ iterations. Dashed lines: warm restart with 30 values of $\lambda$ and accuracies $10^{-3}$, $10^{-4}$, $10^{-5}$.]

General data fit / Experimental results / Diagonal works better than(?) warm restart (i.e., Tikhonov)

[Figure: total number of iterations as a function of $(\lambda_t)$. Dotted lines: diagonal. Dashed lines: warm restart with 30 values of $\lambda$ and accuracies $10^{-3}$, $10^{-4}$, $10^{-5}$.]
