A framework for non-convex recovery of low dimensional models in infinite dimension
GDR MIA, October 2020
Yann Traonmilin (CNRS, IMB), Jean-François Aujol (U. Bdx, IMB), Arthur Leclaire (U. Bdx, IMB)
GDR MIA 2020, Yann Traonmilin 1

What do these problems have in common?
- Low rank matrix recovery
- Off-the-grid sparse spike recovery [figure: 1D plots on the interval [0, 1]]
- Gaussian mixture estimation from random moments [figure: 2D plot]

Non-convex inverse problems with low-dimensional models
- Linear inverse problem with a low-dimensional model
- Smooth parametrization of the model in $\mathbb{R}^d$
- Recovery guarantees for a non-convex functional given a number of measurements $m \gtrsim O(d \log(d))$
- Non-convex optimization techniques studied theoretically and used in practice (initialization + descent)

Objective
- Study the performance of non-convex techniques under the same general non-convex framework.
- Give new results for examples of non-convex recovery.

Outline
- Introduction
- Non-convex ideal decoder and the RIP
- Basins of attraction of global minimizers
- Application
- Conclusion

The setting
Measurement of $x_0$ (e.g. signal, image, object of interest):
$$y_l = \langle x_0, \alpha_l \rangle + e_l \quad \text{with } \alpha_l \in D.$$
We summarize this as $y = Ax_0 + e$.
This makes sense for $D$ a Banach space and $x_0 \in D^*$, a locally convex topological vector space.
$A$ is underdetermined ⇒ we need a low-dimensional model set: $x_0 \in \Sigma$.

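
As a minimal sketch of the measurement model above, the following Python code computes $y_l = \langle x_0, \alpha_l \rangle$ for a spike train $x_0 = \sum_i a_i \delta_{t_i}$ in one dimension. The measurement functions $\alpha_l(t) = e^{-\mathrm{j}\omega_l t}/\sqrt{m}$ with random frequencies, and the name `measure_spikes`, are illustrative assumptions and do not come from the slides.

```python
import numpy as np

def measure_spikes(amps, locs, freqs, noise=None):
    """Return y with y_l = <x0, alpha_l> + e_l for x0 = sum_i a_i * delta_{t_i}.

    Assumption (not from the slides): alpha_l(t) = exp(-1j * w_l * t) / sqrt(m).
    Pairing a Dirac at t_i with alpha_l simply evaluates alpha_l at t_i.
    """
    amps = np.asarray(amps, dtype=float)
    locs = np.asarray(locs, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    m = freqs.size
    # Column i holds (alpha_1(t_i), ..., alpha_m(t_i)); shape (m, k).
    alpha_at_locs = np.exp(-1j * np.outer(freqs, locs)) / np.sqrt(m)
    y = alpha_at_locs @ amps
    if noise is not None:
        y = y + noise
    return y

rng = np.random.default_rng(0)
freqs = rng.normal(scale=10.0, size=20)             # hypothetical random frequencies
y = measure_spikes([1.0, -0.5], [0.2, 0.7], freqs)  # noiseless case, e = 0
```

The map from amplitudes to `y` is linear for fixed positions, while the dependence on the positions is not, which is where the non-convexity of the later parametrized problem comes from.
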
Examples of low dimensional models
LR matrices:
$$\Sigma = \Sigma_r := \{ ZZ^T : Z \in \mathbb{R}^{p \times r} \}$$
Off-the-grid sparse spikes:
$$\Sigma = \Sigma_{k,\epsilon} := \left\{ \sum_{i=1}^{k} a_i \delta_{t_i} : \|t_i - t_j\|_2 > \epsilon, \ \|t_i\|_2 < R \right\}$$
Gaussian mixtures:
$$\Sigma = \Sigma_{k,\epsilon,\Gamma} := \left\{ \sum_{i=1}^{k} a_i \mu_{t_i} : \|t_i - t_j\|_\Gamma > \epsilon, \ \|t_i\|_2 < R \right\},$$
where $d\mu_{t_i}(t) = e^{-\frac{1}{2}\|t - t_i\|_\Gamma^2}\, dt$, $\|u\|_\Gamma^2 = u^T \Gamma^{-1} u$ and $\Gamma$ is the fixed known covariance matrix.

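
The separation and boundedness constraints that define the spike and Gaussian mixture models can be checked directly on a set of positions $t_i$. The helper below is a sketch under that reading; the function name `in_separated_model` and its signature are hypothetical.

```python
import numpy as np

def in_separated_model(locs, eps, R, Gamma=None):
    """Check ||t_i - t_j|| > eps for i != j and ||t_i||_2 < R.

    locs has shape (k, dim). The separation uses the 2-norm, or the
    Gamma-norm ||u||_Gamma^2 = u^T Gamma^{-1} u when Gamma is given.
    """
    T = np.asarray(locs, dtype=float)
    if np.any(np.linalg.norm(T, axis=1) >= R):
        return False
    Gamma_inv = np.eye(T.shape[1]) if Gamma is None else np.linalg.inv(Gamma)
    for i in range(len(T)):
        for j in range(i + 1, len(T)):
            u = T[i] - T[j]
            if np.sqrt(u @ Gamma_inv @ u) <= eps:
                return False
    return True

locs = [[0.0, 0.0], [1.0, 0.0]]
ok_spikes = in_separated_model(locs, eps=0.5, R=2.0)
ok_gmm = in_separated_model(locs, eps=0.4, R=2.0, Gamma=4.0 * np.eye(2))
```

With $\Gamma = 4I$ the two positions above are at $\Gamma$-distance $0.5$, so the same configuration can satisfy or violate the separation constraint depending on the covariance.
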
Ideal decoder and performance
We consider the estimation method (ideal decoder):
$$x^* \in \arg\min_{x \in \Sigma} \|Ax - y\|_2^2$$
Problem: how to quantify recovery when no norm is attached to $D^*$?
We want
$$\|x^* - x_0\|_H^2 \le C \|e\|_2^2,$$
where $\|x^* - x_0\|_H^2$ is a metric measuring estimation quality (a Hilbert norm in our case).

Restricted isometries (RIP)
For $x \in \Sigma - \Sigma$:
$$(1 - \gamma) \|x\|_H^2 \le \|Ax\|_2^2 \le (1 + \gamma) \|x\|_H^2$$
- Sufficient condition on $A$ to guarantee the success of the ideal decoder (and of convex methods in classical sparse recovery).
- Necessary for uniform recovery.
- Verified under a condition on the number of (compressive) measurements in our three examples.

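
To get a feel for the RIP inequality, one can sample elements of $\Sigma_r - \Sigma_r$ and look at the ratio $\|Ax\|_2^2 / \|x\|_F^2$. The sketch below assumes a Gaussian measurement matrix acting on vectorized matrices and takes $H$ to be the Frobenius norm; both choices, and the name `rip_ratios`, are assumptions made for illustration. Sampling only yields a lower bound on the true RIP constant, since the worst-case element may not be drawn.

```python
import numpy as np

def rip_ratios(p=5, r=1, m=2000, trials=200, seed=0):
    """Return the min and max of ||A x||_2^2 / ||x||_F^2 over sampled x.

    x = Z1 Z1^T - Z2 Z2^T is an element of Sigma_r - Sigma_r.
    A has i.i.d. N(0, 1/m) entries and acts on vec(x) (an assumption).
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(m, p * p)) / np.sqrt(m)
    ratios = np.empty(trials)
    for i in range(trials):
        Z1 = rng.normal(size=(p, r))
        Z2 = rng.normal(size=(p, r))
        x = Z1 @ Z1.T - Z2 @ Z2.T
        ratios[i] = np.sum((A @ x.ravel()) ** 2) / np.sum(x ** 2)
    return ratios.min(), ratios.max()

lo, hi = rip_ratios()
gamma_emp = max(1.0 - lo, hi - 1.0)  # empirical lower bound on gamma
```
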
Wait a minute...
The ideal decoder is far from convex. It can even be NP-hard!

Why consider the non-convex ideal decoder?
The following strategy for performing the non-convex minimization can be successful (with theoretical guarantees):
- Perform a clever initialization
- Apply a descent algorithm
This strategy is found in phase recovery [Waldspurger, 2018], low rank matrix recovery [Zhao et al., 2015], blind deconvolution [Li et al., 2018, Cambareri and Jacques, 2018] and sparse spike estimation [Flinth and Weiss, 2019, Traonmilin, Aujol, Leclaire, 2019-2020].

How can success be guaranteed with such a strategy?
- Continuous parameter space
- Conditions on the number of measurements and the dimensionality of the model set.
- Success if the initialization falls in the basin of attraction of a global minimum.
◮ Let's study the basins of attraction of our problem!

Parametrization of the model
When we look at the non-convex minimization in the parameter space, it is in fact very smooth inside the constraints:
$$\theta^* \in \arg\min_{\theta \in \Theta} g(\theta) = \arg\min_{\theta \in \Theta} \|A\phi(\theta) - y\|_2^2,$$
where
- $\phi$ is the parametrization functional
- $\Theta$ is the parameter set, $\Theta := \phi^{-1}(\Sigma) \subset \mathbb{R}^d$

Gradient descent in the parameter space
Consider the fixed-step gradient descent:
$$\theta_{n+1} = \theta_n - \tau \nabla g(\theta_n)$$
Basin of attraction: $\Lambda \subset \mathbb{R}^d$ is a $g$-basin of attraction of $\theta^*$ if there exists $\tau > 0$ such that, if $\theta_0 \in \Lambda$, then the sequence $g(\theta_n)$ converges to $g(\theta^*)$.
◮ As $g$ is smooth, make sure the Hessian is positive on a set $\Lambda$ and $\theta_n$ stays in $\Lambda$.
◮ In the LR case, there is no local convexity. We need to look at the Hessian in relevant directions.

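
The fixed-step iteration and the role of the starting point can be illustrated on a scalar toy problem. The sketch below assumes $\phi(\theta) = \theta^2$, $A$ the identity and $y = 1$, so that $g(\theta) = (\theta^2 - 1)^2$; this toy choice and the function names are not from the slides. The two global minimizers $\theta = \pm 1$ give the same $\phi(\theta)$, and $\theta = 0$ is a critical point from which the descent never moves.

```python
def gradient_descent(grad, theta0, tau, n_iter):
    """Fixed-step gradient descent: theta_{n+1} = theta_n - tau * grad(theta_n)."""
    theta = theta0
    for _ in range(n_iter):
        theta = theta - tau * grad(theta)
    return theta

def g(theta):
    # Toy objective: |phi(theta) - y|^2 with phi(theta) = theta**2 and y = 1.
    return (theta ** 2 - 1.0) ** 2

def grad_g(theta):
    return 4.0 * theta * (theta ** 2 - 1.0)

theta_pos = gradient_descent(grad_g, 0.5, tau=0.05, n_iter=200)    # reaches +1
theta_neg = gradient_descent(grad_g, -0.5, tau=0.05, n_iter=200)   # reaches -1
theta_stuck = gradient_descent(grad_g, 0.0, tau=0.05, n_iter=200)  # stays at 0
```

Starting at $\pm 0.5$ the value $g(\theta_n)$ converges to the global minimum $0$, whereas the start at $0$ keeps $g = 1$: whether the descent succeeds depends on where it is initialized.
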
Shape of the basin $\Lambda_\beta$ and indeterminacy
A simple $\ell_2$ ball is not enough to handle all cases.
Instead, use the distance to the set of equivalent parametrizations:
$$d(\theta, \theta^*) := \min_{\tilde\theta \in \Theta,\ \phi(\tilde\theta) = \phi(\theta^*)} \|\tilde\theta - \theta\|_2$$
Shape of basins of attraction:
$$\Lambda_\beta := \{ \theta \in \Theta : d(\theta, \theta^*) < \beta \}.$$

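
For the low-rank parametrization $\phi(Z) = ZZ^T$, the parametrizations equivalent to $Z^*$ are the matrices $Z^* H$ with $H$ orthogonal, so the distance $d$ reduces to an orthogonal Procrustes problem with a closed-form solution through an SVD. The sketch below computes it; the function name `dist_mod_rotation` is hypothetical.

```python
import numpy as np

def dist_mod_rotation(Z, Z_star):
    """Return min over orthogonal H of ||Z_star @ H - Z||_F.

    The minimizer is H = U @ Vt, where U, Vt come from the SVD of
    Z_star.T @ Z (orthogonal Procrustes solution).
    """
    U, _, Vt = np.linalg.svd(Z_star.T @ Z)
    H = U @ Vt
    return np.linalg.norm(Z_star @ H - Z)

rng = np.random.default_rng(0)
Z_star = rng.normal(size=(5, 2))
Q, _ = np.linalg.qr(rng.normal(size=(2, 2)))  # a random orthogonal matrix
Z = Z_star @ Q                                # same Z Z^T as Z_star

d_equiv = dist_mod_rotation(Z, Z_star)  # numerically zero
d_plain = np.linalg.norm(Z - Z_star)    # plain l2 distance, not zero
```

The plain distance treats `Z` as far from `Z_star` although both parametrize the same matrix, which is why an $\ell_2$ ball around a single $\theta^*$ is the wrong shape here.
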
General theorem: regularity hypotheses
Regularity hypotheses: Let $A$ be a weak-* continuous linear map from the space $D^*$ to $\mathbb{C}^m$. Suppose $A$ has the RIP on $\Sigma - \Sigma$ with constant $\gamma$, and $\phi$ is weak-* continuous and twice weak-* Gateaux differentiable. Let $\theta^*$ be a global minimizer of $g$ on $\Theta$. Let us assume that there exists $\beta_1 > 0$ such that
- $\theta \in \Lambda_{2\beta_1}$ implies $\phi(\theta) \in \Sigma$ (local stability of the model set) and $\tilde\theta = p(\theta, \theta^*)$ is unique;
- there is $C_{\phi,\theta^*} > 0$ such that $\|\phi(\theta) - \phi(\theta^*)\|_H \le C_{\phi,\theta^*}\, d(\theta, \theta^*)$ for all $\theta \in \Lambda_{2\beta_1}$;
- the first-order derivatives of $A\phi$ are uniformly bounded on $\phi^{-1}(\theta^*)$;
- the second-order derivatives of $A\phi$ are uniformly bounded on $\Lambda_{2\beta_1}$.

General theorem: basin of attraction
Theorem [Traonmilin, Aujol, Leclaire, 2020]. Under the previous hypotheses, let
$$\beta_2 := \frac{1 - \gamma}{C_{\phi,\theta^*}\sqrt{1 + \gamma}} \inf_{\theta \in \Lambda_{\beta_1}} \inf_{z \in [\theta, \tilde\theta]} \frac{\|\partial_{\tilde\theta - \theta}\, \phi(z)\|_H^2}{\|A\, \partial^2_{\tilde\theta - \theta}\, \phi(z)\|_2} - \frac{1}{C_{\phi,\theta^*}\sqrt{1 + \gamma}} \|e\|_2 > 0. \quad (1)$$
Then $\Lambda_{\min(\beta_1, \beta_2)}$ is a $g$-basin of attraction of $\theta^*$.
With this theorem, with the right gradient descent step and given $\theta_0 \in \Lambda_{\min(\beta_1, \beta_2)}$, we guarantee that:
$$\|\phi(\theta_n) - x_0\|_H^2 \le \frac{4}{1 - \gamma} \|e\|_2^2 + O\!\left(\frac{1}{n}\right). \quad (2)$$

Application: low rank matrix estimation
This study is somewhat outdated from a practical point of view (global convergence of stochastic gradient descent).
Ideal decoder (Burer-Monteiro):
$$\min_{Z \in \mathbb{R}^{p \times r}} \|A(ZZ^T) - y\|_2^2.$$
Basin of attraction with $y = A(Z_0 Z_0^T)$:
$$\Lambda_{\beta_{LR}} := \left\{ Z : \inf_{H \in O(r)} \|ZH - Z_0\|_F < \beta_{LR} \right\}$$
With the RIP on $\Sigma_{2r}$ with constant $\gamma$, we obtain a basin of attraction with
$$\beta_{LR} := \frac{1}{8}\, \kappa(Z_0)^{-1}\, \frac{1 - \gamma}{1 + \gamma}\, \sigma_{\min}(Z_0)$$

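
A minimal sketch of initialization plus descent for the Burer-Monteiro functional is given below. It assumes a Gaussian operator acting on vectorized matrices, a noiseless $y$, and an initialization obtained by slightly perturbing $Z_0$ (a stand-in for a real initialization procedure). The number of measurements is taken larger than $p^2$ purely to keep the toy run easy to reason about, so this is not a compressive regime. The names `burer_monteiro_gd`, `tau` and `n_iter` are hypothetical.

```python
import numpy as np

def burer_monteiro_gd(A, y, Z_init, tau, n_iter):
    """Fixed-step gradient descent on g(Z) = ||A vec(Z Z^T) - y||_2^2.

    With G = mat(A^T (A vec(Z Z^T) - y)), the gradient is 2 (G + G^T) Z.
    """
    p = Z_init.shape[0]
    Z = Z_init.copy()
    for _ in range(n_iter):
        res = A @ (Z @ Z.T).ravel() - y
        G = (A.T @ res).reshape(p, p)
        Z = Z - tau * 2.0 * (G + G.T) @ Z
    return Z

rng = np.random.default_rng(0)
p, r, m = 6, 2, 4000
Z0, _ = np.linalg.qr(rng.normal(size=(p, r)))    # ground truth, orthonormal columns
A = rng.normal(size=(m, p * p)) / np.sqrt(m)     # assumed Gaussian measurements
y = A @ (Z0 @ Z0.T).ravel()                      # noiseless: y = A(Z0 Z0^T)

Z_init = Z0 + 0.01 * rng.normal(size=(p, r))     # assumed to lie inside the basin
Z_hat = burer_monteiro_gd(A, y, Z_init, tau=0.05, n_iter=1000)

err_init = np.linalg.norm(Z_init @ Z_init.T - Z0 @ Z0.T)
err_final = np.linalg.norm(Z_hat @ Z_hat.T - Z0 @ Z0.T)
```

The error is measured on $ZZ^T$ rather than on $Z$ because $Z$ is only identifiable up to an orthogonal factor $H \in O(r)$.
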
Application: sparse spikes recovery
Ideal decoder:
$$\min_{a_i \in \mathbb{R};\ t_i \in B_2(R);\ \forall i \ne j,\ \|t_i - t_j\|_2 > \epsilon} \left\| A \sum_{i=1}^{k} a_i \delta_{t_i} - y \right\|_2^2$$
Shape of the basin: $\Lambda_{\beta_{spikes}} := \{ \theta : \|\theta - \theta^*\|_2 < \beta_{spikes} \}$ [Traonmilin and Aujol, 2019]
With the RIP on $\Sigma_{k,\frac{\epsilon}{2}} - \Sigma_{k,\frac{\epsilon}{2}}$ and $y = Ax_0$, we obtain a basin of attraction with
$$\beta_{spikes} := \min\left( \frac{\epsilon}{4},\ \frac{|a_1|}{2},\ C_{spikes} \right) \quad (3)$$
where $C_{spikes} \propto \frac{1 - \gamma}{1 + \gamma}$ and $C_{spikes} \propto \frac{|a_{\min}|}{|a_{\max}|}$; all constants can be made explicit.

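
A descent on the spike functional needs its gradient with respect to both amplitudes and positions. The sketch below writes it out for one-dimensional positions and verifies one component against a central finite difference. The Fourier-type measurement functions $\alpha_l(t) = e^{-\mathrm{j}\omega_l t}/\sqrt{m}$ and the name `spike_objective_and_grad` are assumptions made for illustration only.

```python
import numpy as np

def spike_objective_and_grad(a, t, freqs, y):
    """Return g(a, t) = ||sum_i a_i alpha(t_i) - y||_2^2 and its gradients.

    Assumption: alpha_l(t) = exp(-1j * w_l * t) / sqrt(m), 1D positions t_i.
    """
    m = freqs.size
    Phi = np.exp(-1j * np.outer(freqs, t)) / np.sqrt(m)  # (m, k), column i = alpha(t_i)
    res = Phi @ a - y
    g = np.real(np.vdot(res, res))
    dPhi = -1j * freqs[:, None] * Phi                    # derivative of each column in t_i
    grad_a = 2.0 * np.real(Phi.conj().T @ res)
    grad_t = 2.0 * np.real((dPhi * a).conj().T @ res)
    return g, grad_a, grad_t

rng = np.random.default_rng(0)
freqs = rng.normal(scale=5.0, size=30)                   # hypothetical frequencies
a0, t0 = np.array([1.0, -0.5]), np.array([0.2, 0.7])     # ground truth
y = (np.exp(-1j * np.outer(freqs, t0)) / np.sqrt(freqs.size)) @ a0

a, t = a0 + 0.1, t0 + 0.05                               # a point near the truth
g_val, grad_a, grad_t = spike_objective_and_grad(a, t, freqs, y)

# Central finite difference on the first position as a sanity check.
h = 1e-6
step = np.array([h, 0.0])
fd_t0 = (spike_objective_and_grad(a, t + step, freqs, y)[0]
         - spike_objective_and_grad(a, t - step, freqs, y)[0]) / (2 * h)
```
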
Application: Gaussian mixtures
New result!
Ideal decoder:
$$\min_{a_i \in \mathbb{R};\ t_i \in B_2(R);\ \forall i \ne j,\ \|t_i - t_j\|_\Gamma > \epsilon} \left\| A \sum_{i=1}^{k} a_i \mu_{t_i} - y \right\|_2^2$$
Shape of the basin: $\Lambda_{\beta_{GMM}} := \{ \theta : \|\theta - \theta^*\|_2 < \beta_{GMM} \}$
With the RIP on $\Sigma_{k,\frac{\epsilon}{2},\Gamma} - \Sigma_{k,\frac{\epsilon}{2},\Gamma}$ and $y = Ax_0$, we obtain a basin of attraction with
$$\beta_{GMM} = \min\left( \frac{\epsilon \sqrt{\lambda_{\min}(\Gamma)}}{8},\ \frac{|a_1|}{2},\ C_{GMM} \right)$$
where $C_{GMM} \propto \frac{1 - \gamma}{1 + \gamma}$ and $C_{GMM} \propto \frac{|a_{\min}|}{|a_{\max}|}$.

Initialization by back-projection: the real challenge?
Ideal backprojection $z \in D$ with
$$z := \sum_{l=1}^{m} y_l \alpha_l. \quad (4)$$
With the RIP,
$$(1 - \gamma) \|x_0\|_H^2 \le \langle x_0, z \rangle \le (1 + \gamma) \|x_0\|_H^2. \quad (5)$$
Problem: $z \in D$. How can we go from $z$ to $\theta_{init}$?
Possible for 2D/3D spikes imaging [Traonmilin, Aujol, Leclaire, 2020]: [figure: 2D image]

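
Inequality (5) rests on the identity $\langle x_0, z \rangle = \sum_l y_l \langle x_0, \alpha_l \rangle = \|Ax_0\|_2^2$ in the noiseless case. The sketch below checks it numerically in a real, finite-dimensional stand-in where $H$ is the Euclidean norm and the rows of a Gaussian matrix play the role of the $\alpha_l$; these choices are assumptions and not the infinite-dimensional setting of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 400
A = rng.normal(size=(m, n)) / np.sqrt(m)  # row l plays the role of alpha_l
x0 = rng.normal(size=n)

y = A @ x0           # noiseless measurements y_l = <x0, alpha_l>
z = A.T @ y          # back-projection z = sum_l y_l * alpha_l

pairing = x0 @ z     # <x0, z>
energy = np.sum(y ** 2)       # ||A x0||_2^2
ratio = pairing / (x0 @ x0)   # lies in [1 - gamma, 1 + gamma] under the RIP
```

The pairing only controls the correlation between $z$ and $x_0$; it does not by itself provide parameters $\theta_{init}$, which is the difficulty raised on this slide.
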