
SLIDE 1

Low Complexity Regularization of Inverse Problems

Joint work with: Samuel Vaiter, Jalal Fadili

Gabriel Peyré
www.numerical-tours.com

SLIDE 2

Inverse Problems

Recovering x0 ∈ R^N from noisy observations y = Φ x0 + w ∈ R^P.

SLIDE 3

Inverse Problems

Recovering x0 ∈ R^N from noisy observations y = Φ x0 + w ∈ R^P.

Examples: inpainting, super-resolution, . . .

SLIDE 4

Inverse Problems in Medical Imaging

Tomography: Φ x = (p_{θ_k})_{1 ≤ k ≤ K} (projections along angles θ_k).

SLIDE 5

Inverse Problems in Medical Imaging

Tomography: Φ x = (p_{θ_k})_{1 ≤ k ≤ K}.
Magnetic resonance imaging (MRI): Φ x = (f̂(ω))_{ω ∈ Ω}.

SLIDE 6

Inverse Problems in Medical Imaging

Tomography: Φ x = (p_{θ_k})_{1 ≤ k ≤ K}.
Magnetic resonance imaging (MRI): Φ x = (f̂(ω))_{ω ∈ Ω}.

Other examples: MEG, EEG, . . .

SLIDE 7

Compressed Sensing  [Rice Univ.]

Analog scene x̃0 to acquire.

SLIDE 8

Compressed Sensing  [Rice Univ.]

Analog scene x̃0; N micro-mirrors, P measures: y[i] = ⟨x0, ϕ_i⟩.

SLIDE 9

Compressed Sensing  [Rice Univ.]

Analog scene x̃0; N micro-mirrors, P measures: y[i] = ⟨x0, ϕ_i⟩.

P/N = 1,  P/N = 0.16,  P/N = 0.02.

SLIDE 10

Inverse Problem Regularization

Observations: y = Φ x0 + w ∈ R^P.

Estimator: x(y) depends only on
  • observations y
  • parameter λ

SLIDE 11

Inverse Problem Regularization

Observations: y = Φ x0 + w ∈ R^P.

Estimator: x(y) depends only on
  • observations y
  • parameter λ

Example: variational methods

  x(y) ∈ argmin_{x ∈ R^N} (1/2) ||y − Φ x||² + λ J(x)

(data fidelity + regularity)

SLIDE 12

Inverse Problem Regularization

Observations: y = Φ x0 + w ∈ R^P.

Estimator: x(y) depends only on
  • observations y
  • parameter λ

Example: variational methods

  x(y) ∈ argmin_{x ∈ R^N} (1/2) ||y − Φ x||² + λ J(x)

(data fidelity + regularity)

Choice of λ: tradeoff between the noise level ||w|| and the regularity J(x0) of x0.

SLIDE 13

Inverse Problem Regularization

Observations: y = Φ x0 + w ∈ R^P.

Estimator: x(y) depends only on
  • observations y
  • parameter λ

Example: variational methods

  x(y) ∈ argmin_{x ∈ R^N} (1/2) ||y − Φ x||² + λ J(x)

(data fidelity + regularity)

Choice of λ: tradeoff between the noise level ||w|| and the regularity J(x0) of x0.

No noise: λ → 0⁺, minimize

  x(y) ∈ argmin_{Φ x = y} J(x)

SLIDE 14

Inverse Problem Regularization

Observations: y = Φ x0 + w ∈ R^P.

Estimator: x(y) depends only on
  • observations y
  • parameter λ

Example: variational methods

  x(y) ∈ argmin_{x ∈ R^N} (1/2) ||y − Φ x||² + λ J(x)

(data fidelity + regularity)

Choice of λ: tradeoff between the noise level ||w|| and the regularity J(x0) of x0.

No noise: λ → 0⁺, minimize

  x(y) ∈ argmin_{Φ x = y} J(x)

Performance analysis:
  • → criteria on (x0, ||w||, λ) to ensure ||x(y) − x0|| = O(||w||) and model stability.
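A minimal numerical sketch of this variational estimator for J = ||·||_1, solved with ISTA (proximal gradient descent). The solver choice, problem sizes and value of λ below are illustrative assumptions of mine, not taken from the talk:

```python
# Sketch: x(y) ∈ argmin_x 1/2 ||y - Phi x||^2 + lam ||x||_1 via ISTA.
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(Phi, y, lam, n_iter=500):
    """Proximal gradient descent on the variational problem."""
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)         # gradient of the data-fidelity term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Toy inverse problem: sparse x0, Gaussian Phi, small noise w.
rng = np.random.default_rng(0)
P, N, s = 40, 100, 5
Phi = rng.standard_normal((P, N))
x0 = np.zeros(N); x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
w = 0.01 * rng.standard_normal(P)
y = Phi @ x0 + w
x_rec = ista(Phi, y, lam=0.02)
print("recovery error:", np.linalg.norm(x_rec - x0))
```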

SLIDE 15

Overview

  • Low-complexity Convex Regularization
  • Performance Guarantees: L2 Error
  • Performance Guarantees: Model Consistency
SLIDE 16

Union of Models for Data Processing

Union of models: M ⊂ R^N, subspaces or manifolds.

Synthesis sparsity: sparse coefficients x, image Ψ x.

SLIDE 17

Union of Models for Data Processing

Union of models: M ⊂ R^N, subspaces or manifolds.

Synthesis sparsity: sparse coefficients x, image Ψ x.
Structured sparsity.

SLIDE 18

Union of Models for Data Processing

Union of models: M ⊂ R^N, subspaces or manifolds.

Synthesis sparsity: sparse coefficients x, image Ψ x.
Structured sparsity.
Analysis sparsity: image x, sparse gradient D∗ x.

SLIDE 19

Union of Models for Data Processing

Union of models: M ⊂ R^N, subspaces or manifolds.

Synthesis sparsity: sparse coefficients x, image Ψ x.
Structured sparsity.
Analysis sparsity: image x, sparse gradient D∗ x.
Low-rank: multi-spectral imaging, x_{i,·} = Σ_{j=1}^r A_{i,j} S_{j,·}.
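A small numpy illustration of these unions of models; the dictionary, the finite-difference gradient and the factor sizes below are toy choices of mine, not from the talk:

```python
# Toy instances of the three model classes on this slide.
import numpy as np

rng = np.random.default_rng(1)

# Synthesis sparsity: image = Psi x with sparse coefficients x.
Psi = rng.standard_normal((64, 128))          # toy synthesis dictionary
x = np.zeros(128); x[[3, 40, 77]] = 1.0       # sparse coefficient vector
image = Psi @ x

# Analysis sparsity: D* image is sparse (piecewise-constant signal).
signal = np.repeat([0.0, 1.0, -0.5], 20)
grad = np.diff(signal)                        # D* signal: sparse gradient
print("nonzeros in gradient:", np.count_nonzero(grad))

# Low-rank: multi-spectral image x_{i,.} = sum_{j=1}^r A_{i,j} S_{j,.}
r, n_bands, n_pixels = 3, 8, 200
A = rng.standard_normal((n_bands, r))         # mixing matrix
S = rng.standard_normal((r, n_pixels))        # r source maps
X = A @ S
print("rank of X:", np.linalg.matrix_rank(X)) # equals r
```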

SLIDE 20

Partly Smooth Functions  [Lewis 2003]

J : R^N → R is partly smooth at x for a manifold M_x if:
  (i) J is C² along M_x around x;
  (ii) ∀ h ∈ T_x(M_x)^⊥, t ↦ J(x + t h) is non-smooth at t = 0;
  (iii) ∂J is continuous on M_x around x.

Example: J(x) = max(0, ||x|| − 1).
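For concreteness, here is the standard verification of (i)-(iii) for the ℓ1 norm, anticipating the examples on the next slides (this derivation is textbook material, not on the slide itself):

```latex
% Partial smoothness of J(x) = \|x\|_1 at x, with I = \mathrm{supp}(x) and
% M_x = \{ z \;;\; \mathrm{supp}(z) \subset \mathrm{supp}(x) \}:
\begin{itemize}
  \item[(i)] For $z \in M_x$ near $x$ one has $\mathrm{sign}(z_I) = \mathrm{sign}(x_I)$,
    so $J(z) = \langle \mathrm{sign}(x_I), z_I \rangle$ is linear, hence $C^2$ along $M_x$.
  \item[(ii)] $T_x(M_x)^\perp = \{ h \;;\; h_I = 0 \}$, and for such $h \neq 0$,
    $t \mapsto J(x + t h) = \|x\|_1 + |t|\,\|h\|_1$ is non-smooth at $t = 0$.
  \item[(iii)] $\partial J(z) = \{ \eta \;;\; \eta_I = \mathrm{sign}(z_I),\
    \|\eta_{I^c}\|_\infty \le 1 \}$ is constant along $M_x$ near $x$, hence continuous.
\end{itemize}
```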

SLIDE 21

Examples of Partly-smooth Regularizers

ℓ1 sparsity: J(x) = ||x||_1,  M_x = {z ; supp(z) ⊂ supp(x)}.

SLIDE 22

Examples of Partly-smooth Regularizers

ℓ1 sparsity: J(x) = ||x||_1,  M_x = {z ; supp(z) ⊂ supp(x)}.
Structured sparsity: J(x) = Σ_b ||x_b||, e.g. J(x) = |x_1| + ||x_{2,3}||  (same M_x).

SLIDE 23

Examples of Partly-smooth Regularizers

ℓ1 sparsity: J(x) = ||x||_1,  M_x = {z ; supp(z) ⊂ supp(x)}.
Structured sparsity: J(x) = Σ_b ||x_b||, e.g. J(x) = |x_1| + ||x_{2,3}||  (same M_x).
Nuclear norm: J(x) = ||x||_∗,  M_x = {z ; rank(z) = rank(x)}.

SLIDE 24

Examples of Partly-smooth Regularizers

ℓ1 sparsity: J(x) = ||x||_1,  M_x = {z ; supp(z) ⊂ supp(x)}.
Structured sparsity: J(x) = Σ_b ||x_b||, e.g. J(x) = |x_1| + ||x_{2,3}||  (same M_x).
Nuclear norm: J(x) = ||x||_∗,  M_x = {z ; rank(z) = rank(x)}.
Anti-sparsity: J(x) = ||x||_∞,  M_x = {z ; z_I ∝ x_I},  I = {i ; |x_i| = ||x||_∞}.
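The four regularizers and their model descriptors are easy to evaluate numerically; a toy sketch (the vectors, blocks and matrix below are my own examples):

```python
# Each regularizer J and the descriptor of the model manifold M_x it promotes.
import numpy as np

x = np.array([0.0, 1.5, 0.0, -0.3, 0.0])

# l1 sparsity: J(x) = ||x||_1,  M_x described by the support.
J_l1 = np.sum(np.abs(x))
support = np.flatnonzero(x)

# Structured sparsity: J(x) = sum_b ||x_b||, blocks {0,1} and {2,3,4}.
blocks = [np.array([0, 1]), np.array([2, 3, 4])]
J_block = sum(np.linalg.norm(x[b]) for b in blocks)

# Nuclear norm: J(X) = ||X||_* (sum of singular values), M_X given by the rank.
X = np.outer([1.0, 2.0], [3.0, 0.0, -1.0])     # rank-1 matrix
J_nuc = np.sum(np.linalg.svd(X, compute_uv=False))
rank = np.linalg.matrix_rank(X)

# Anti-sparsity: J(x) = ||x||_inf, M_x given by the saturated entries I.
J_inf = np.max(np.abs(x))
I_sat = np.flatnonzero(np.abs(x) == J_inf)

print(J_l1, support, J_block, J_nuc, rank, J_inf, I_sat)
```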

SLIDE 25

Overview

  • Low-complexity Convex Regularization
  • Performance Guarantees: L2 Error
  • Performance Guarantees: Model Consistency
SLIDE 26

Dual Certificates

Noiseless recovery:

  min_{Φ x = Φ x0} J(x)   (P0)

SLIDE 27

Dual Certificates

Noiseless recovery:

  min_{Φ x = Φ x0} J(x)   (P0)

Subdifferential: ∂J(x) = {η ; ∀ y, J(y) ≥ J(x) + ⟨η, y − x⟩}.

Dual certificates: D(x0) = Im(Φ∗) ∩ ∂J(x0).

Proposition: x0 is a solution of (P0) ⟺ ∃ η ∈ D(x0).

SLIDE 28

Dual Certificates

Noiseless recovery:

  min_{Φ x = Φ x0} J(x)   (P0)

Subdifferential: ∂J(x) = {η ; ∀ y, J(y) ≥ J(x) + ⟨η, y − x⟩}.

Dual certificates: D(x0) = Im(Φ∗) ∩ ∂J(x0).

Proposition: x0 is a solution of (P0) ⟺ ∃ η ∈ D(x0).

Example: J(x) = ||x||_1, Φ x = x ⋆ ϕ (convolution):
  D(x0) = {η = Φ∗ q ; η_i = sign(x0,i) for i ∈ supp(x0), ||η||_∞ ≤ 1}.
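A small numeric illustration of this characterization for J = ||·||_1: the check below tests whether η = Φ^⊤q lies in D(x0). The candidate q, taken as the least-norm solution of the sign constraints on the support, is my own construction (it anticipates the minimal-norm certificate of the last part):

```python
# Verify the certificate condition eta = Phi^T q ∈ D(x0) for J = ||.||_1.
import numpy as np

def is_certificate(Phi, x0, q, tol=1e-10):
    eta = Phi.T @ q                            # eta ∈ Im(Phi^*)
    I = np.flatnonzero(x0)
    on_support = np.allclose(eta[I], np.sign(x0[I]), atol=tol)
    bounded = np.max(np.abs(eta)) <= 1.0 + tol
    return on_support and bounded              # then eta ∈ ∂J(x0) as well

rng = np.random.default_rng(2)
Phi = rng.standard_normal((30, 60))
x0 = np.zeros(60); x0[[5, 17]] = [1.0, -2.0]
I = np.flatnonzero(x0)
q = np.linalg.pinv(Phi[:, I].T) @ np.sign(x0[I])   # least-norm candidate
print("q certifies x0:", is_certificate(Phi, x0, q))
```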

SLIDE 29

Dual Certificates and L2 Stability

Non-degenerate dual certificate:

  D̄(x0) = Im(Φ∗) ∩ ri(∂J(x0))

ri(E) = relative interior of E (= interior for the topology of aff(E)).

SLIDE 30

Dual Certificates and L2 Stability

Non-degenerate dual certificate:

  D̄(x0) = Im(Φ∗) ∩ ri(∂J(x0))

ri(E) = relative interior of E (= interior for the topology of aff(E)).

Theorem [Fadili et al. 2013]: If ∃ η ∈ D̄(x0), then for λ ∼ ||w|| one has
||x(y) − x0|| = O(||w||).

SLIDE 31

Dual Certificates and L2 Stability

Non-degenerate dual certificate:

  D̄(x0) = Im(Φ∗) ∩ ri(∂J(x0))

ri(E) = relative interior of E (= interior for the topology of aff(E)).

Theorem [Fadili et al. 2013]: If ∃ η ∈ D̄(x0), then for λ ∼ ||w|| one has
||x(y) − x0|| = O(||w||).

[Grasmair, Haltmeier, Scherzer 2010]: J = ||·||_1.
[Grasmair 2012]: J(x(y) − x0) = O(||w||).

SLIDE 32

Dual Certificates and L2 Stability

Non-degenerate dual certificate:

  D̄(x0) = Im(Φ∗) ∩ ri(∂J(x0))

ri(E) = relative interior of E (= interior for the topology of aff(E)).

Theorem [Fadili et al. 2013]: If ∃ η ∈ D̄(x0), then for λ ∼ ||w|| one has
||x(y) − x0|| = O(||w||).

[Grasmair, Haltmeier, Scherzer 2010]: J = ||·||_1.
[Grasmair 2012]: J(x(y) − x0) = O(||w||).

  • → The constants depend on N . . .

SLIDE 33

Compressed Sensing Setting

Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0, 1), i.i.d.

SLIDE 34

Compressed Sensing Setting

Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0, 1), i.i.d.

Theorem [Rudelson, Vershynin 2006] [Chandrasekaran et al. 2011]:
Sparse vectors, J = ||·||_1. Let s = ||x0||_0. If

  P ≥ 2 s log(N/s),

then ∃ η ∈ D̄(x0) with high probability on Φ.

SLIDE 35

Compressed Sensing Setting

Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0, 1), i.i.d.

Theorem [Rudelson, Vershynin 2006] [Chandrasekaran et al. 2011]:
Sparse vectors, J = ||·||_1. Let s = ||x0||_0. If

  P ≥ 2 s log(N/s),

then ∃ η ∈ D̄(x0) with high probability on Φ.

Theorem [Chandrasekaran et al. 2011]:
Low-rank matrices, J = ||·||_∗. Let r = rank(x0), x0 ∈ R^{N1×N2}. If

  P ≥ 3 r (N1 + N2 − r),

then ∃ η ∈ D̄(x0) with high probability on Φ.

SLIDE 36

Compressed Sensing Setting

Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0, 1), i.i.d.

Theorem [Rudelson, Vershynin 2006] [Chandrasekaran et al. 2011]:
Sparse vectors, J = ||·||_1. Let s = ||x0||_0. If

  P ≥ 2 s log(N/s),

then ∃ η ∈ D̄(x0) with high probability on Φ.

Theorem [Chandrasekaran et al. 2011]:
Low-rank matrices, J = ||·||_∗. Let r = rank(x0), x0 ∈ R^{N1×N2}. If

  P ≥ 3 r (N1 + N2 − r),

then ∃ η ∈ D̄(x0) with high probability on Φ.

  • → Similar results for ||·||_{1,2}, ||·||_∞.
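Plugging numbers into the two bounds (my arithmetic, with illustrative sizes):

```python
# Back-of-the-envelope evaluation of the two sample-complexity bounds above.
import numpy as np

N, s = 10_000, 50
P_l1 = 2 * s * np.log(N / s)              # sparse vectors, J = ||.||_1
print(f"l1:      P >= {P_l1:.0f} out of N = {N}")

N1, N2, r = 100, 100, 5
P_nuc = 3 * r * (N1 + N2 - r)             # low-rank, J = ||.||_*
print(f"nuclear: P >= {P_nuc} out of N1*N2 = {N1 * N2}")
```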

SLIDE 37

Phase Transitions

[Figure from [Amelunxen et al. 2013]: empirical recovery probability as a function
of P/N, plotted against s/N for J = ||·||_1 and against r/√N for J = ||·||_∗.]
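The ℓ1 transition can be reproduced empirically. The sketch below is my own small-scale experiment, not the one behind these plots: it solves the noiseless program min ||x||_1 s.t. Φx = Φx0 as a linear program and records the success rate as P grows:

```python
# Monte-Carlo sketch of the l1 phase transition via linear programming.
import numpy as np
from scipy.optimize import linprog

def l1_recover(Phi, y):
    """min ||x||_1 s.t. Phi x = y, as an LP over z = [x; t] with -t <= x <= t."""
    P, N = Phi.shape
    c = np.r_[np.zeros(N), np.ones(N)]          # minimize sum(t)
    I = np.eye(N)
    A_ub = np.block([[I, -I], [-I, -I]])        # x - t <= 0 and -x - t <= 0
    A_eq = np.hstack([Phi, np.zeros((P, N))])   # Phi x = y
    bounds = [(None, None)] * N + [(0, None)] * N
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * N),
                  A_eq=A_eq, b_eq=y, bounds=bounds)
    return res.x[:N]

rng = np.random.default_rng(3)
N, s, trials = 50, 5, 20
for P in (10, 20, 30):
    ok = 0
    for _ in range(trials):
        Phi = rng.standard_normal((P, N))
        x0 = np.zeros(N)
        x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
        ok += np.linalg.norm(l1_recover(Phi, Phi @ x0) - x0) < 1e-6
    print(f"P/N = {P/N:.2f}: success rate {ok/trials:.2f}")
```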

SLIDE 38

Overview

  • Low-complexity Convex Regularization
  • Performance Guarantees: L2 Error
  • Performance Guarantees: Model Consistency
SLIDE 39

Minimal Norm Certificate

Minimal-norm certificate:

  η0 = argmin_{η = Φ∗q ∈ ∂J(x0)} ||q||

SLIDE 40

Minimal Norm Certificate

Minimal-norm certificate:

  η0 = argmin_{η = Φ∗q ∈ ∂J(x0)} ||q||

∂J(x0) ⊂ A(x0) = AffHull(∂J(x0)).

[Figure: case J = ||·||_1, showing ∂J(x0), A(x0) and T = T_{x0}(M_{x0}) at x0.]

SLIDE 41

Minimal Norm Certificate

Minimal-norm certificate:

  η0 = argmin_{η = Φ∗q ∈ ∂J(x0)} ||q||

∂J(x0) ⊂ A(x0) = AffHull(∂J(x0)).

Linearized pre-certificate:

  ηF = argmin_{η = Φ∗q ∈ A(x0)} ||q||

[Figure: case J = ||·||_1, showing ∂J(x0), A(x0) and T = T_{x0}(M_{x0}) at x0.]

SLIDE 42

Minimal Norm Certificate

Minimal-norm certificate:

  η0 = argmin_{η = Φ∗q ∈ ∂J(x0)} ||q||

∂J(x0) ⊂ A(x0) = AffHull(∂J(x0)).

Linearized pre-certificate:

  ηF = argmin_{η = Φ∗q ∈ A(x0)} ||q||

  • → ηF is computed by solving a linear system.
  • ! One does not always have ηF ∈ D(x0)!

[Figure: case J = ||·||_1, showing ∂J(x0), A(x0) and T = T_{x0}(M_{x0}) at x0.]

SLIDE 43

Minimal Norm Certificate

Minimal-norm certificate:

  η0 = argmin_{η = Φ∗q ∈ ∂J(x0)} ||q||

∂J(x0) ⊂ A(x0) = AffHull(∂J(x0)).

Linearized pre-certificate:

  ηF = argmin_{η = Φ∗q ∈ A(x0)} ||q||

  • → ηF is computed by solving a linear system.
  • ! One does not always have ηF ∈ D(x0)!

Theorem: If ker(Φ) ∩ T = {0}, then

  ηF ∈ D̄(x0) ⟹ ηF = η0,   and   η0 ∈ D̄(x0) ⟹ ηF = η0.

[Figure: case J = ||·||_1, showing ∂J(x0), A(x0) and T = T_{x0}(M_{x0}) at x0.]
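For J = ||·||_1, the linear system is explicit: A(x0) fixes η_I = sign(x0,I) on the support I, so q is the least-norm solution of Φ_I^⊤ q = sign(x0,I). A sketch (sizes and x0 are my toy choices):

```python
# Linearized pre-certificate eta_F for J = ||.||_1, and the Fuchs-type check.
import numpy as np

def eta_F(Phi, x0):
    I = np.flatnonzero(x0)
    q = np.linalg.pinv(Phi[:, I].T) @ np.sign(x0[I])   # least-norm solution
    return Phi.T @ q

rng = np.random.default_rng(4)
Phi = rng.standard_normal((25, 50))
x0 = np.zeros(50); x0[[1, 9, 30]] = [2.0, -1.0, 0.5]
eta = eta_F(Phi, x0)
Ic = np.setdiff1d(np.arange(50), np.flatnonzero(x0))
# Non-degeneracy: eta_F ∈ ri(∂J(x0)) iff ||eta_F restricted to I^c||_inf < 1.
print("||eta_F on I^c||_inf =", np.max(np.abs(eta[Ic])))
```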

SLIDE 44

Model Stability

Theorem [Vaiter et al. 2014]: If ηF ∈ D̄(x0), there exists C such that if
max(λ, ||w||/λ) ≤ C, the unique solution x(y) of the variational problem P(y)
for y = Φ x0 + w satisfies

  x(y) ∈ M_{x0}   and   ||x(y) − x0|| = O(||w||, λ).

SLIDE 45

Model Stability

Theorem [Vaiter et al. 2014]: If ηF ∈ D̄(x0), there exists C such that if
max(λ, ||w||/λ) ≤ C, the unique solution x(y) of the variational problem P(y)
for y = Φ x0 + w satisfies

  x(y) ∈ M_{x0}   and   ||x(y) − x0|| = O(||w||, λ).

Previous works:
  [Fuchs 2004]: J = ||·||_1.
  [Bach 2008]: J = ||·||_{1,2} and J = ||·||_∗.
  [Vaiter et al. 2011]: J = ||D∗ ·||_1.

SLIDE 46

Compressed Sensing Setting

Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0, 1), i.i.d.

Theorem [Wainwright 2009] [Dossal et al. 2011]:
Sparse vectors, J = ||·||_1. Let s = ||x0||_0. If

  P > 2 s log(N),

then η0 ∈ D̄(x0) with high probability on Φ.

SLIDE 47

Compressed Sensing Setting

Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0, 1), i.i.d.

Theorem [Wainwright 2009] [Dossal et al. 2011]:
Sparse vectors, J = ||·||_1. Let s = ||x0||_0. If

  P > 2 s log(N),

then η0 ∈ D̄(x0) with high probability on Φ.

Phase transitions: L2 stability, P ∼ 2 s log(N/s), vs. model stability, P ∼ 2 s log(N).

SLIDE 48

Compressed Sensing Setting

Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0, 1), i.i.d.

Theorem [Wainwright 2009] [Dossal et al. 2011]:
Sparse vectors, J = ||·||_1. Let s = ||x0||_0. If

  P > 2 s log(N),

then η0 ∈ D̄(x0) with high probability on Φ.

Phase transitions: L2 stability, P ∼ 2 s log(N/s), vs. model stability, P ∼ 2 s log(N).

  • → Similar results for ||·||_{1,2}, ||·||_∗, ||·||_∞.

SLIDE 49

Compressed Sensing Setting

Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0, 1), i.i.d.

Theorem [Wainwright 2009] [Dossal et al. 2011]:
Sparse vectors, J = ||·||_1. Let s = ||x0||_0. If

  P > 2 s log(N),

then η0 ∈ D̄(x0) with high probability on Φ.

Phase transitions: L2 stability, P ∼ 2 s log(N/s), vs. model stability, P ∼ 2 s log(N).

  • → Similar results for ||·||_{1,2}, ||·||_∗, ||·||_∞.
  • → Not using RIP techniques (non-uniform result on x0).
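Evaluating the two transitions side by side (my numbers, for illustration):

```python
# Model stability needs P ~ 2 s log N; L2 stability only P ~ 2 s log(N/s).
import numpy as np

N = 10_000
for s in (10, 100, 1000):
    print(f"s = {s:5d}: model {2*s*np.log(N):9.0f}  vs  L2 {2*s*np.log(N/s):9.0f}")
```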

SLIDE 50

1-D Sparse Spikes Deconvolution

  Φ x = Σ_i x_i ϕ(· − Δ i),   J(x) = ||x||_1.

Increasing Δ: → reduces correlation. → reduces resolution.

[Figure: x0 and Φ x0, with kernel width γ.]

SLIDE 51

1-D Sparse Spikes Deconvolution

  Φ x = Σ_i x_i ϕ(· − Δ i),   J(x) = ||x||_1.

Increasing Δ: → reduces correlation. → reduces resolution.

Let I = {j ; x0(j) ≠ 0}. Then

  ||η_{F,Iᶜ}||_∞ < 1  ⟹  η0 = ηF ∈ D̄(x0)  ⟹  support recovery.

[Figure: x0 and Φ x0, with kernel width γ; ||η_{F,Iᶜ}||_∞ against the spike spacing.]
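An illustrative toy experiment (the Gaussian kernel, its width γ and the spacings below are my own choices, not the talk's figures): build Φ from a sampled kernel and watch the support-recovery quantity ||η_{F,Iᶜ}||_∞ as the spacing Δ grows:

```python
# 1-D deconvolution: Phi x = sum_i x[i] phi(. - i), phi a Gaussian of width gamma.
import numpy as np

n_grid = 400
t = np.arange(n_grid)

def make_Phi(gamma):
    # Column i samples phi(. - i): a Gaussian centered at grid point i.
    return np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * gamma ** 2))

def fuchs_value(Phi, I):
    """||eta_F restricted to I^c||_inf for spikes of sign +1 on I."""
    s = np.ones(len(I))
    q = np.linalg.pinv(Phi[:, I].T) @ s
    eta = Phi.T @ q
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    return np.max(np.abs(eta[Ic]))

Phi = make_Phi(gamma=4.0)
for Delta in (6, 12, 24):
    I = np.arange(100, 300, Delta)             # spikes spaced by Delta
    print(f"Delta = {Delta:2d}: ||eta_F on I^c||_inf = {fuchs_value(Phi, I):.3f}")
```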

SLIDE 52

Conclusion

Partial smoothness: encodes models using singularities.

SLIDE 53

Conclusion

Partial smoothness: encodes models using singularities.

Performance measures: L2 error vs. model consistency, with different CS guarantees.

SLIDE 54

Conclusion

Partial smoothness: encodes models using singularities.

Performance measures: L2 error vs. model consistency, with different CS guarantees.

Specific certificates: η0, ηF, . . .

SLIDE 55

Conclusion

Partial smoothness: encodes models using singularities.

Performance measures: L2 error vs. model consistency, with different CS guarantees.

Specific certificates: η0, ηF, . . .

Open problems:
  – CS performance with arbitrary gauges.
  – Infinite dimensional regularizations (BV, . . . ).
  – Convergence discrete → continuous.