Optimization for data processing at a large scale
Sparsity4PSL Summer School

Emilie Chouzenoux
Center for Visual Computing, CentraleSupélec, INRIA Saclay
24 June 2019
Inverse problems and large scale optimization

[Figure: Microscopy, ISBI Challenge 2013, F. Soulez — original image $x \in \mathbb{R}^N$ and degraded image $z = D(Hx) \in \mathbb{R}^M$]

◮ $H \in \mathbb{R}^{M \times N}$: matrix associated with the degradation operator.
◮ $D : \mathbb{R}^M \to \mathbb{R}^M$: noise degradation.

Inverse problem: Find a good estimate of $x$ from the observations $z$, using some a priori knowledge on $x$ and on the noise characteristics.
Inverse problems and large scale optimization

Inverse problem: Find an estimate $\hat{x}$ close to $x$ from the observations $z = D(Hx)$.

◮ Inverse filtering (if $M = N$ and $H$ is invertible):
$$\hat{x} = H^{-1} z = H^{-1}(Hx + b) = x + H^{-1} b, \quad \text{if } b \in \mathbb{R}^M \text{ is an additive noise.}$$
→ Closed-form expression, but amplification of the noise if $H$ is ill-conditioned (ill-posed problem).
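The noise amplification can be observed numerically. The sketch below (an illustration, with an arbitrarily constructed operator, not taken from the slides) builds an ill-conditioned $H$ with singular values decaying from $1$ to $10^{-6}$, then applies inverse filtering to noisy observations:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50

# Ill-conditioned degradation operator: random orthogonal factors with
# fast-decaying singular values (condition number ~ 1e6).
U, _ = np.linalg.qr(rng.standard_normal((N, N)))
V, _ = np.linalg.qr(rng.standard_normal((N, N)))
s = np.logspace(0, -6, N)
H = U @ np.diag(s) @ V.T

x = rng.standard_normal(N)          # original signal
b = 1e-4 * rng.standard_normal(N)   # small additive noise
z = H @ x + b                       # observations

x_hat = np.linalg.solve(H, z)       # inverse filtering: x + H^{-1} b

print(np.linalg.norm(b))            # small noise level
print(np.linalg.norm(x_hat - x))    # reconstruction error, amplified by H^{-1}
```

Even though the noise is tiny, the reconstruction error is several orders of magnitude larger, because the noise components aligned with the small singular values of $H$ are divided by them.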
Inverse problems and large scale optimization

Inverse problem: Find an estimate $\hat{x}$ close to $x$ from the observations $z = D(Hx)$.

◮ Inverse filtering (ruled out)
◮ Variational approach:
$$\hat{x} \in \underset{x \in \mathbb{R}^N}{\operatorname{Argmin}} \; \underbrace{f_1(x)}_{\text{Data fidelity term}} + \underbrace{f_2(x)}_{\text{Regularization term}}$$

Examples of data fidelity term
◮ Gaussian noise: $(\forall x \in \mathbb{R}^N)\quad f_1(x) = \dfrac{1}{\sigma^2}\,\|Hx - z\|^2$
◮ Poisson noise: $(\forall x \in \mathbb{R}^N)\quad f_1(x) = \displaystyle\sum_{m=1}^{M} \left([Hx]^{(m)} - z^{(m)} \log([Hx]^{(m)})\right)$
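Both data fidelity terms are straightforward to evaluate. A minimal sketch (function names are illustrative, not from the slides):

```python
import numpy as np

def gaussian_fidelity(x, H, z, sigma):
    """Least-squares data fidelity (1/sigma^2) * ||Hx - z||^2 for Gaussian noise."""
    r = H @ x - z
    return (r @ r) / sigma**2

def poisson_fidelity(x, H, z):
    """Data fidelity sum_m [Hx]_m - z_m * log([Hx]_m) for Poisson noise.
    Requires Hx > 0 componentwise for the logarithm to be defined."""
    Hx = H @ x
    return np.sum(Hx - z * np.log(Hx))

# Tiny sanity check: with H = I, x = z, the Gaussian term vanishes.
H = np.eye(2)
x = np.ones(2)
z = np.ones(2)
print(gaussian_fidelity(x, H, z, 1.0))   # 0.0
print(poisson_fidelity(x, H, z))         # sum of (1 - 1*log 1) = 2.0
```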
Examples of regularization terms (1)

◮ Admissibility constraints: find $x \in C = \bigcap_{m=1}^{M} C_m$, where $(\forall m \in \{1, \dots, M\})\; C_m \subset \mathbb{R}^N$.
◮ Variational formulation:
$$(\forall x \in \mathbb{R}^N)\quad f_2(x) = \sum_{m=1}^{M} \iota_{C_m}(x)$$
where, for all $m \in \{1, \dots, M\}$, $\iota_{C_m}$ is the indicator function of $C_m$:
$$(\forall x \in \mathbb{R}^N)\quad \iota_{C_m}(x) = \begin{cases} 0 & \text{if } x \in C_m \\ +\infty & \text{otherwise.} \end{cases}$$
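The sum of indicators is exactly the constraint "$x$ belongs to every $C_m$": it is $0$ on the intersection and $+\infty$ as soon as one constraint is violated. A small sketch, with illustrative constraint sets of my own choosing (nonnegativity and a box):

```python
import numpy as np

def indicator(x, in_set):
    """Indicator function of a set C described by a membership test `in_set`:
    0 if x is in C, +inf otherwise."""
    return 0.0 if in_set(x) else np.inf

# Example constraint sets (illustrative choices, not from the slides).
nonneg = lambda x: np.all(x >= 0)
box01 = lambda x: np.all((x >= 0) & (x <= 1))

x = np.array([0.2, 0.7])
f2 = indicator(x, nonneg) + indicator(x, box01)   # sum over the constraint sets
print(f2)   # 0.0: x satisfies both constraints
```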
Examples of regularization terms (2)

◮ $\ell_1$ norm (analysis approach):
$$(\forall x \in \mathbb{R}^N)\quad f_2(x) = \sum_{k=1}^{K} \big|[Fx]^{(k)}\big| = \|Fx\|_1$$
where $F \in \mathbb{R}^{K \times N}$ ($K \ge N$) is a frame decomposition operator mapping the signal $x$ to its frame coefficients.

◮ Total variation:
$$(\forall x = (x^{(i_1, i_2)})_{1 \le i_1 \le N_1,\, 1 \le i_2 \le N_2} \in \mathbb{R}^{N_1 \times N_2})\quad f_2(x) = \operatorname{tv}(x) = \sum_{i_1=1}^{N_1} \sum_{i_2=1}^{N_2} \|\nabla x^{(i_1, i_2)}\|_2$$
where $\nabla x^{(i_1, i_2)}$ is the discrete gradient at pixel $(i_1, i_2)$.
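The total variation is easy to compute once a discrete gradient is fixed. The sketch below uses forward differences with zero padding at the boundary, one common convention among several (the slides do not specify which one is used):

```python
import numpy as np

def total_variation(x):
    """Isotropic total variation of a 2-D image, with forward differences
    and zero boundary conditions (an assumption, one convention among several)."""
    dx = np.zeros_like(x)
    dy = np.zeros_like(x)
    dx[:-1, :] = x[1:, :] - x[:-1, :]   # vertical differences
    dy[:, :-1] = x[:, 1:] - x[:, :-1]   # horizontal differences
    return np.sum(np.sqrt(dx**2 + dy**2))

print(total_variation(np.ones((4, 4))))                  # 0.0: constant image
print(total_variation(np.array([[0., 1.], [0., 1.]])))   # 2.0: one unit jump per row
```

TV penalizes the total length of intensity jumps, which is why it favors piecewise-constant (cartoon-like) reconstructions.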
Inverse problems and large scale optimization

Inverse problem: Find an estimate $\hat{x}$ close to $x$ from the observations $z = D(Hx)$.

◮ Inverse filtering (ruled out)
◮ Variational approach (more general context):
$$\hat{x} \in \underset{x \in \mathbb{R}^N}{\operatorname{Argmin}} \; \sum_{i=1}^{m} f_i(x)$$
where each $f_i$ may denote a data fidelity term, a (hybrid) regularization term, or a constraint.

→ Often no closed-form expression, or a solution that is expensive to compute (especially in a large scale context).
◮ Need for an efficient iterative minimization strategy!
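As a first, simple instance of an iterative minimization strategy, here is plain gradient descent on a differentiable least-squares objective $f(x) = \|Hx - z\|^2$ (an illustrative example, not a method singled out by the slides; more sophisticated schemes come later in the lecture):

```python
import numpy as np

def gradient_descent(grad_f, x0, step, n_iter=500):
    """Iterative minimization by gradient descent, assuming the objective
    is differentiable with gradient `grad_f` and the step is small enough."""
    x = x0.copy()
    for _ in range(n_iter):
        x = x - step * grad_f(x)
    return x

rng = np.random.default_rng(1)
H = rng.standard_normal((30, 10))
z = rng.standard_normal(30)

grad = lambda x: 2 * H.T @ (H @ x - z)          # gradient of ||Hx - z||^2
step = 1.0 / (2 * np.linalg.norm(H, 2) ** 2)    # 1/L, L = Lipschitz constant of grad
x_hat = gradient_descent(grad, np.zeros(10), step, n_iter=2000)
```

With the step chosen as the inverse of the gradient's Lipschitz constant, the iterates converge to the least-squares solution without ever inverting a matrix, which is precisely what makes such first-order schemes attractive at a large scale.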
Main challenges

◮ How to exploit the mathematical properties of each term involved in $f$? How to handle constraints efficiently? How to deal with nondifferentiable terms in $f$? Which convergence results can be expected if $f$ is nonconvex?
◮ How to reduce the memory requirements of an optimization algorithm? How to avoid large-size matrix inversion?
◮ What are the benefits of block alternating strategies? What are their convergence guarantees?
◮ How to accelerate the convergence speed of a first-order (gradient-like) optimization method?
Outline

1. Introduction to optimization
◮ Notation/definitions
◮ Existence and uniqueness of minimizers
◮ Differential/subdifferential
◮ Optimality conditions

2. Majorization-Minimization approaches
◮ Majorization-Minimization principle
◮ Majorization techniques
◮ MM quadratic methods
◮ Forward-backward algorithm
◮ Block-coordinate MM algorithms
Introduction to optimization
Domain of a function

Let $f : \mathbb{R}^N \to \mathbb{R} \cup \{+\infty\}$.
◮ The domain of $f$ is $\operatorname{dom} f = \{x \in \mathbb{R}^N \mid f(x) < +\infty\}$.
◮ The function $f$ is proper if $\operatorname{dom} f \neq \varnothing$.
Indicator function

Let $C \subset \mathbb{R}^N$. The indicator function of $C$ is
$$(\forall x \in \mathbb{R}^N)\quad \iota_C(x) = \begin{cases} 0 & \text{if } x \in C \\ +\infty & \text{otherwise.} \end{cases}$$
Epigraph

Let $f : \mathbb{R}^N \to \mathbb{R} \cup \{+\infty\}$. The epigraph of $f$ is
$$\operatorname{epi} f = \big\{(x, \zeta) \in \operatorname{dom} f \times \mathbb{R} \;\big|\; f(x) \le \zeta\big\}.$$
Lower semi-continuous function

Let $f : \mathbb{R}^N \to \mathbb{R} \cup \{+\infty\}$. The function $f$ is lower semi-continuous on $\mathbb{R}^N$ if and only if its epigraph $\operatorname{epi} f$ is closed.
Convex set

$C \subset \mathbb{R}^N$ is a convex set if
$$(\forall (x, y) \in C^2)(\forall \alpha \in \,]0, 1[\,)\quad \alpha x + (1 - \alpha) y \in C.$$
Coercive function

Let $f : \mathbb{R}^N \to \mathbb{R} \cup \{+\infty\}$. The function $f$ is coercive if $\lim_{\|x\| \to +\infty} f(x) = +\infty$.
Convex function

$f : \mathbb{R}^N \to \mathbb{R} \cup \{+\infty\}$ is a convex function if
$$\big(\forall (x, y) \in (\mathbb{R}^N)^2\big)(\forall \alpha \in \,]0, 1[\,)\quad f(\alpha x + (1 - \alpha) y) \le \alpha f(x) + (1 - \alpha) f(y).$$
◮ $f$ is convex ⇔ its epigraph is convex.
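The defining inequality can be probed numerically along a segment. The sketch below is a sanity check on sample points only (it can refute convexity but never prove it):

```python
import numpy as np

def violates_convexity(f, x, y, alphas=np.linspace(0.01, 0.99, 99)):
    """Test f(a*x + (1-a)*y) <= a*f(x) + (1-a)*f(y) along the segment [x, y].
    Returns True if some sampled a violates the convexity inequality.
    This is a numerical check on samples, not a proof of convexity."""
    fx, fy = f(x), f(y)
    return any(f(a * x + (1 - a) * y) > a * fx + (1 - a) * fy + 1e-12
               for a in alphas)

f_convex = lambda v: np.sum(v**2)        # squared norm: convex
f_nonconvex = lambda v: -np.sum(v**2)    # concave, hence not convex

x = np.array([1.0, 0.0])
y = np.array([-1.0, 2.0])
print(violates_convexity(f_convex, x, y))      # False
print(violates_convexity(f_nonconvex, x, y))   # True
```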
Strictly convex function

$f : \mathbb{R}^N \to \mathbb{R} \cup \{+\infty\}$ is strictly convex if
$$(\forall x \in \operatorname{dom} f)(\forall y \in \operatorname{dom} f)(\forall \alpha \in \,]0, 1[\,)\quad x \neq y \;\Rightarrow\; f(\alpha x + (1 - \alpha) y) < \alpha f(x) + (1 - \alpha) f(y).$$