Low Complexity Regularization of Inverse Problems
Joint work with: Samuel Vaiter, Jalal Fadili, Gabriel Peyré
www.numerical-tours.com
Inverse Problems
Recovering x_0 ∈ R^N from noisy observations y = Φ x_0 + w ∈ R^P.
Examples: inpainting, super-resolution, ...
Inverse Problems in Medical Imaging
Tomography: Φ x = (p_{θ_k})_{1 ≤ k ≤ K} (projections along K angles θ_k).
Magnetic resonance imaging (MRI): Φ x = (f̂(ω))_{ω ∈ Ω} (sub-sampled Fourier coefficients of the image x).
Other examples: MEG, EEG, ...
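To make the MRI model concrete, here is a minimal numpy sketch of a sub-sampled Fourier operator Φ x = (f̂(ω))_{ω ∈ Ω} and its adjoint; the signal size and the sampling set Ω are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def fourier_sampling_operator(n, omega):
    """Phi x = (x_hat(w))_{w in Omega}: orthonormal 1-D FFT restricted to frequencies Omega."""
    def Phi(x):
        return np.fft.fft(x, norm="ortho")[omega]
    def Phi_adj(y):
        z = np.zeros(n, dtype=complex)
        z[omega] = y                        # zero-fill the unobserved frequencies
        return np.fft.ifft(z, norm="ortho")
    return Phi, Phi_adj

n = 256
omega = np.sort(np.random.default_rng(0).choice(n, size=64, replace=False))  # assumed sampling set
Phi, Phi_adj = fourier_sampling_operator(n, omega)
y = Phi(np.random.default_rng(1).standard_normal(n))   # P = 64 Fourier measurements
```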
Compressed Sensing [Rice Univ. single-pixel camera]
Measurements: y[i] = ⟨x_0, φ_i⟩, with P measures ≪ N micro-mirrors.
(Figure: reconstructions x̃_0 for undersampling ratios P/N = 1, P/N = 0.16, P/N = 0.02.)
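A small sketch of the acquisition model y[i] = ⟨x_0, φ_i⟩ with P ≪ N, in the spirit of the single-pixel-camera setup; the binary patterns, the sizes and the sparse scene are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1024
P = int(0.16 * N)                              # undersampling ratio P/N = 0.16, as on the slide
Phi = rng.choice([0.0, 1.0], size=(P, N))      # assumed random binary micro-mirror patterns phi_i
x0 = np.zeros(N)
x0[rng.choice(N, 20, replace=False)] = rng.standard_normal(20)   # assumed sparse scene
y = Phi @ x0                                   # y[i] = <x0, phi_i>: one scalar per pattern
```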
Inverse Problem Regularization
Observations: y = Φ x_0 + w ∈ R^P.
Estimator: x(y) depends only on the observations y and a parameter λ.
Example: variational methods
    x(y) ∈ argmin_{x ∈ R^N}  (1/2) ||y − Φ x||² + λ J(x)
    (data fidelity + regularity)
Choice of λ: tradeoff between the noise level ||w|| and the regularity J(x_0) of x_0.
No noise: λ → 0⁺, and one minimizes  x(y) ∈ argmin_{Φ x = y} J(x).
Performance analysis (model stability): criteria on (x_0, ||w||, λ) ensuring ||x(y) − x_0|| = O(||w||).
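As an illustration of the variational estimator, here is a minimal sketch of one standard solver, proximal gradient descent (ISTA), for the particular case J = ||·||_1; this is not claimed to be the authors' implementation, and the step size and iteration count are assumptions.

```python
import numpy as np

def ista_l1(y, Phi, lam, n_iter=500):
    """Minimize 0.5 * ||y - Phi @ x||^2 + lam * ||x||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(Phi, 2) ** 2                 # Lipschitz constant of the smooth part
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x - Phi.T @ (Phi @ x - y) / L           # gradient step on the data fidelity
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding (prox of the l1 term)
    return x

# usage sketch: x_lam = ista_l1(y, Phi, lam=0.1); lam is tuned against the noise level ||w||.
```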
Overview
• Low-complexity Convex Regularization
• Performance Guarantees: L2 Error
• Performance Guarantees: Model Consistency
Union of Models for Data Processing
Union of models: M ⊂ R^N, subspaces or manifolds.
Synthesis sparsity: sparse coefficients x, image Ψ x.
Structured sparsity: sparsity over blocks of coefficients.
Analysis sparsity: image x with sparse gradient D* x.
Low-rank, e.g. multi-spectral imaging: x_{i,·} = Σ_{j=1}^r A_{i,j} S_{j,·} (each band is a mixture of r source spectra S_{1,·}, S_{2,·}, S_{3,·}, ...).
A few numerical sketches of these models are given below.
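Tiny numerical sketches of the models listed above (sparse synthesis coefficients, analysis sparsity through a finite-difference operator D*, and low-rank multi-spectral data x_{i,·} = Σ_j A_{i,j} S_{j,·}); all dimensions and factors are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthesis sparsity: a few nonzero coefficients x; the image is Psi @ x for some dictionary Psi.
x = np.zeros(100)
x[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)

# Analysis sparsity: D* x is sparse; here D* = finite differences, so x is piecewise constant.
x_pc = np.repeat([0.0, 2.0, -1.0, 3.0], 25)     # only 3 jumps
grad = np.diff(x_pc)                            # D* x_pc: sparse "gradient"

# Low-rank (multi-spectral imaging): every band is a mixture of r source spectra S_{j,.}.
r, n_bands, n_pix = 3, 8, 100
A = rng.standard_normal((n_bands, r))
S = rng.standard_normal((r, n_pix))
X = A @ S                                       # rank(X) <= r
```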
Partly Smooth Functions
J : R^N → R is partly smooth at x relative to a manifold M_x [Lewis 2003] if:
(i) J is C² along M_x around x;
(ii) ∀ h ∈ T_x(M_x)^⊥, t ↦ J(x + th) is non-smooth at t = 0;
(iii) ∂J is continuous on M_x around x.
Example: J(x) = max(0, ||x|| − 1). (Figure: the manifold M_x, its tangent space T_x(M_x), and the point x.)
Examples of Partly-smooth Regularizers
ℓ¹ sparsity: J(x) = ||x||_1, M_x = {z ; supp(z) ⊂ supp(x)}.
Structured sparsity: J(x) = Σ_b ||x_b||, same M_x (e.g. J(x) = |x_1| + ||x_{2,3}||).
Nuclear norm: J(x) = ||x||_*, M_x = {z ; rank(z) = rank(x)}.
Anti-sparsity: J(x) = ||x||_∞, M_x = {z ; z_I ∝ x_I}, where I = {i ; |x_i| = ||x||_∞}.
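A sketch of how the model M_x attached to x can be read off numerically for each regularizer above (support, active blocks, rank, saturation set); the tolerances are assumptions needed for floating-point comparisons.

```python
import numpy as np

def model_l1(x, tol=1e-10):
    """M_x for ||.||_1: the support of x."""
    return np.flatnonzero(np.abs(x) > tol)

def model_group(x, blocks, tol=1e-10):
    """M_x for sum_b ||x_b||: the set of active blocks."""
    return [b for b in blocks if np.linalg.norm(x[b]) > tol]

def model_nuclear(X, tol=1e-10):
    """M_X for the nuclear norm ||.||_*: the rank of X."""
    return np.linalg.matrix_rank(X, tol)

def model_linf(x, tol=1e-10):
    """M_x for ||.||_inf: the saturation set I = {i : |x_i| = ||x||_inf}."""
    return np.flatnonzero(np.abs(x) > np.max(np.abs(x)) - tol)
```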
Overview
• Low-complexity Convex Regularization
• Performance Guarantees: L2 Error
• Performance Guarantees: Model Consistency
Dual Certificates
Noiseless recovery: (P_0)   min_{Φ x = Φ x_0} J(x).
Proposition: x_0 is a solution of (P_0)  ⟺  ∃ η ∈ D(x_0).
Dual certificates: D(x_0) = Im(Φ*) ∩ ∂J(x_0), where ∂J(x) = {η ; ∀ y, J(y) ≥ J(x) + ⟨η, y − x⟩}.
Example: J(x) = ||x||_1:  D(x_0) = {η ∈ Im(Φ*) ; η_i = sign(x_{0,i}) for i ∈ supp(x_0), ||η||_∞ ≤ 1}.
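For J = ||·||_1, a standard candidate certificate is the minimal-norm (least-squares) pre-certificate with η_I = sign(x_{0,I}) on the support I; a minimal sketch, assuming the columns Φ_I on the support are linearly independent (this is one common candidate, not the only element of D(x_0)).

```python
import numpy as np

def l1_precertificate(Phi, x0, tol=1e-10):
    """Least-squares candidate eta in Im(Phi^T) with eta_I = sign(x0_I) on the support I."""
    I = np.flatnonzero(np.abs(x0) > tol)
    s = np.sign(x0[I])
    # minimal-norm q with Phi[:, I].T @ q = s, i.e. q = Phi_I (Phi_I^T Phi_I)^{-1} s
    q = Phi[:, I] @ np.linalg.solve(Phi[:, I].T @ Phi[:, I], s)
    eta = Phi.T @ q                               # eta lies in Im(Phi^T) by construction
    off = np.setdiff1d(np.arange(Phi.shape[1]), I)
    nondegenerate = off.size == 0 or np.max(np.abs(eta[off])) < 1.0   # strict bound off the support
    return eta, nondegenerate
```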
Dual Certificates and L2 Stability
Non-degenerate dual certificate: D̄(x_0) = Im(Φ*) ∩ ri(∂J(x_0)),
where ri(E) = relative interior of E = interior for the topology of aff(E).
Theorem [Fadili et al. 2013]: if ∃ η ∈ D̄(x_0), then for λ ∼ ||w|| one has ||x_λ − x_0|| = O(||w||).
Related results: [Grasmair, Haltmeier, Scherzer 2010] for J = ||·||_1; [Grasmair 2012]: J(x_λ − x_0) = O(||w||).
→ The constants depend on N ...
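An illustrative experiment (not from the slides) probing the theorem: with λ chosen proportional to ||w||, the reconstruction error should scale like O(||w||). It reuses the ista_l1 sketch above, and all problem sizes and constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
P, N, s = 60, 200, 5
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
x0 = np.zeros(N)
x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)

for sigma in (1e-3, 1e-2, 1e-1):
    w = sigma * rng.standard_normal(P)
    y = Phi @ x0 + w
    x_lam = ista_l1(y, Phi, lam=2.0 * np.linalg.norm(w) / np.sqrt(P), n_iter=3000)  # lambda ~ ||w||
    print(f"||w|| = {np.linalg.norm(w):.4f}   ||x_lam - x0|| = {np.linalg.norm(x_lam - x0):.4f}")
```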
Compressed Sensing Setting
Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0, 1), i.i.d.
Sparse vectors: J = ||·||_1. Theorem [Rudelson, Vershynin 2006; Chandrasekaran et al. 2011]: let s = ||x_0||_0; if P ≥ 2 s log(N/s), then ∃ η ∈ D̄(x_0) with high probability on Φ.
Low-rank matrices: J = ||·||_*. Theorem [Chandrasekaran et al. 2011]: let r = rank(x_0) with x_0 ∈ R^{N_1×N_2}; if P ≥ 3 r (N_1 + N_2 − r), then ∃ η ∈ D̄(x_0) with high probability on Φ.
→ Similar results for ||·||_{1,2}, ||·||_∞.
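A rough sanity check of the P ≥ 2 s log(N/s) regime for ℓ1, using noiseless basis pursuit recast as a linear program via scipy.optimize.linprog; the sizes are assumptions and this is a numerical illustration, not a substitute for the probabilistic argument.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """Solve min ||x||_1 s.t. Phi @ x = y via the LP split x = u - v with u, v >= 0."""
    P, N = Phi.shape
    res = linprog(c=np.ones(2 * N),
                  A_eq=np.hstack([Phi, -Phi]), b_eq=y,
                  bounds=[(0, None)] * (2 * N), method="highs")
    return res.x[:N] - res.x[N:]

rng = np.random.default_rng(3)
N, s = 200, 5
P = int(np.ceil(2 * s * np.log(N / s))) + 10        # slightly above the 2 s log(N/s) threshold
Phi = rng.standard_normal((P, N))
x0 = np.zeros(N)
x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
x_rec = basis_pursuit(Phi, Phi @ x0)
print("exact recovery:", np.linalg.norm(x_rec - x0) < 1e-5)
```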