
Joint Blind Deconvolution and Blind Demixing via Nonconvex Optimization



  1. Joint Blind Deconvolution and Blind Demixing via Nonconvex Optimization. Shuyang Ling, Department of Mathematics, UC Davis. FOCM, Barcelona, July 19, 2017.

  2. Acknowledgements. Research in collaboration with Prof. Xiaodong Li (UC Davis), Prof. Thomas Strohmer (UC Davis), and Dr. Ke Wei (UC Davis). This work is sponsored by NSF-DMS and DARPA.

  3. Outline
     (a) Blind deconvolution meets blind demixing: applications in image processing and wireless communication
     (b) Mathematical models and convex approach
     (c) A nonconvex optimization approach towards joint blind deconvolution and blind demixing

  4. What is blind deconvolution? Suppose we observe a function $y$ that is the convolution of two unknown functions, the blurring function $f$ and the signal of interest $g$, plus noise $w$:
     $$y = f \ast g + w.$$
     How can we reconstruct $f$ and $g$ from $y$? This is a highly ill-posed bilinear inverse problem, much more difficult than ordinary deconvolution, but it has important applications in various fields.
     Solvability: what conditions on $f$ and $g$ make this problem solvable?
     How: what algorithms should we use to recover $f$ and $g$?
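
The model on this slide can be simulated in a few lines. The sketch below assumes circular convolution computed with the FFT (the slide does not say which convolution is meant), and the sizes and variable names are illustrative. It also shows one source of ill-posedness: rescaling $f$ by $\alpha$ and $g$ by $1/\alpha$ leaves $y$ unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 64                                   # signal length (illustrative)

f = np.zeros(L)
f[:5] = rng.standard_normal(5)           # short blurring function, zero-padded
g = rng.standard_normal(L)               # signal of interest
w = 0.01 * rng.standard_normal(L)        # additive noise


def circ_conv(a, b):
    """Circular convolution of two real vectors via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))


y = circ_conv(f, g) + w

# Scaling ambiguity: (alpha * f, g / alpha) explains y equally well.
alpha = 3.0
y_rescaled = circ_conv(alpha * f, g / alpha) + w

# Cross-check the FFT route against the defining sum of circular convolution.
y_direct = np.array(
    [sum(f[k] * g[(n - k) % L] for k in range(L)) for n in range(L)]
) + w
```

Because of this ambiguity, any recovery guarantee can only hold up to a scalar factor shared between $f$ and $g$.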

  5. Why do we care about blind deconvolution? Image deblurring: let $f$ be the blurring kernel and $g$ be the original image; then $y = f \ast g$ is the blurred image. Question: how can we reconstruct $f$ and $g$ from $y$?
     [Figure: blurred image $y$ = blurring kernel $f$ $\ast$ original image $g$ + noise $w$]

  6. Blind deconvolution meets blind demixing. Suppose there are $s$ users and each of them sends a message $x_i$, which is encoded by $C_i$, to a common receiver. Each encoded message $g_i = C_i x_i$ is convolved with an unknown impulse response function $f_i$.
     [Figure: for each user $i = 1, \dots, s$, the signal $g_i = C_i x_i$ passes through the channel $f_i$, giving $f_i \ast g_i$; the receiver observes $y = \sum_{i=1}^{s} f_i \ast g_i + w$, and the decoder estimates $(f_i, x_i)$ for every user.]

  7. Blind deconvolution and blind demixing. Consider the model
     $$y = \sum_{i=1}^{s} f_i \ast g_i + w.$$
     This is even more difficult than blind deconvolution ($s = 1$), since it is a "mixture" of blind deconvolution problems. It also includes phase retrieval as a special case if $s = 1$ and $\bar{g}_i = f_i$.
     More assumptions: each impulse response $f_i$ has maximum delay spread $K$ (compact support), that is, $f_i(n) = 0$ for $n > K$, so $f_i = \begin{bmatrix} h_i \\ 0 \end{bmatrix}$. Let $g_i := C_i x_i$ be the signal $x_i \in \mathbb{C}^N$ encoded by $C_i \in \mathbb{C}^{L \times N}$ with $L > N$. We also require the $C_i$ to be mutually incoherent by imposing randomness.

  8. Mathematical model. Subspace assumption in the frequency domain: denote by $F$ the $L \times L$ DFT matrix. Let $h_i \in \mathbb{C}^K$ consist of the first $K$ entries of $f_i$ (the only ones that may be nonzero) and let $B$ be a low-frequency DFT matrix (the first $K$ columns of $F$). Then
     $$\hat{f}_i = F f_i = B h_i.$$
     Let $\hat{g}_i := A_i x_i$, where $A_i := F C_i$ and $x_i \in \mathbb{C}^N$. The mathematical model is
     $$y = \sum_{i=1}^{s} \operatorname{diag}(B h_i) A_i x_i + w.$$
     Goal: recover $(h_i, x_i)_{i=1}^{s}$ from $(y, B, A_i)_{i=1}^{s}$.
     Remark: the number of degrees of freedom of the unknowns is $s(K + N)$; the number of constraints is $L$.
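
A small numerical check of the subspace model may help. The sketch below assumes a unitary DFT matrix $F$ (so that $B^* B = I_K$), circular convolution, and complex Gaussian encoding matrices $C_i$; these choices and all variable names are assumptions for illustration. With this normalization the DFT of the time-domain mixture equals the frequency-domain model up to a factor $\sqrt{L}$, which the slide leaves implicit.

```python
import numpy as np

rng = np.random.default_rng(1)
L, K, N, s = 32, 4, 3, 2                  # illustrative sizes


def crandn(*shape):
    """Complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


F = np.fft.fft(np.eye(L)) / np.sqrt(L)    # unitary L x L DFT matrix
B = F[:, :K]                              # first K columns of F

h = [crandn(K) for _ in range(s)]         # channel coefficients h_i
x = [crandn(N) for _ in range(s)]         # messages x_i
C = [crandn(L, N) for _ in range(s)]      # encoding matrices C_i
A = [F @ Ci for Ci in C]                  # A_i = F C_i

# Frequency-domain model (noiseless): sum_i diag(B h_i) A_i x_i
y_freq = sum((B @ h[i]) * (A[i] @ x[i]) for i in range(s))

# Time-domain model: sum_i f_i * g_i with f_i = [h_i; 0] and g_i = C_i x_i
f = [np.concatenate([h[i], np.zeros(L - K)]) for i in range(s)]
g = [C[i] @ x[i] for i in range(s)]
y_time = sum(np.fft.ifft(np.fft.fft(f[i]) * np.fft.fft(g[i])) for i in range(s))

# Unknowns versus constraints, as in the remark on the slide.
degrees_of_freedom = s * (K + N)
```

Here `F @ f[i]` agrees with `B @ h[i]`, and `F @ y_time` agrees with `np.sqrt(L) * y_freq`.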


  10. Naive approach: nonlinear least squares. We may want to try a nonlinear least squares approach:
     $$\min_{(h_i, x_i)} \; F(h_i, x_i) := \left\| \sum_{i=1}^{s} \operatorname{diag}(B h_i) A_i x_i - y \right\|^2.$$
     The objective function is highly nonconvex and more complicated than in blind deconvolution ($s = 1$). Gradient descent might get stuck at local minima. There are no guarantees for recoverability.
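
The nonconvexity of this objective can be seen directly. In the sketch below (noiseless data, complex Gaussian $A_i$, and illustrative names, all assumptions), the ground truth, a rescaled copy of it, and its sign-flipped copy all attain zero loss, while the midpoint between the truth and its sign-flipped copy (the origin) has loss $\|y\|^2 > 0$. A convex function cannot behave this way.

```python
import numpy as np

rng = np.random.default_rng(2)
L, K, N, s = 32, 4, 3, 2                  # illustrative sizes


def crandn(*shape):
    """Complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


B = (np.fft.fft(np.eye(L)) / np.sqrt(L))[:, :K]
A = [crandn(L, N) for _ in range(s)]
h0 = [crandn(K) for _ in range(s)]
x0 = [crandn(N) for _ in range(s)]
y = sum((B @ h) * (Ai @ x) for h, x, Ai in zip(h0, x0, A))   # noiseless data


def objective(h_list, x_list):
    """Nonlinear least squares loss || sum_i diag(B h_i) A_i x_i - y ||^2."""
    pred = sum((B @ h) * (Ai @ x) for h, x, Ai in zip(h_list, x_list, A))
    return np.linalg.norm(pred - y) ** 2


f_truth = objective(h0, x0)
f_scaled = objective([2j * h for h in h0], [x / 2j for x in x0])
f_flipped = objective([-h for h in h0], [-x for x in x0])
# Midpoint of (h0, x0) and (-h0, -x0) is the origin.
f_mid = objective([0 * h for h in h0], [0 * x for x in x0])
```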


  12. Convex relaxation and low-rank matrix recovery: lifting. Let $a_{i,l}$ be the $l$-th column of $A_i^*$ and $b_l$ be the $l$-th column of $B^*$. Then
     $$y_l = \sum_{i=1}^{s} (B h_i)_l \cdot (A_i x_i)_l = \sum_{i=1}^{s} b_l^* \underbrace{h_i x_i^*}_{\text{rank-1}} a_{i,l}.$$
     Let $X_i := h_i x_i^*$ and define the linear operator $\mathcal{A}_i : \mathbb{C}^{K \times N} \to \mathbb{C}^L$ as
     $$\mathcal{A}_i(Z) := \{ b_l^* Z a_{i,l} \}_{l=1}^{L} = \{ \langle Z, b_l a_{i,l}^* \rangle \}_{l=1}^{L}.$$
     Then $y = \sum_{i=1}^{s} \mathcal{A}_i(X_i) + w$. See [Candès-Strohmer-Voroninski 13], [Ahmed-Recht-Romberg 14].
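
The lifting step can be checked numerically for a single user. The sketch below assumes complex Gaussian $A$ and illustrative names. Note a conjugation convention that the slide leaves implicit: with $a_l$ defined as the $l$-th column of $A^*$, the rank-one form $b_l^* h x^* a_l$ equals $(Bh)_l$ times the complex conjugate of $(Ax)_l$; the two expressions coincide whenever $Ax$ is real.

```python
import numpy as np

rng = np.random.default_rng(3)
L, K, N = 16, 4, 3                        # illustrative sizes


def crandn(*shape):
    """Complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


B = (np.fft.fft(np.eye(L)) / np.sqrt(L))[:, :K]
A = crandn(L, N)
h, x = crandn(K), crandn(N)

X = np.outer(h, x.conj())                 # lifted variable X = h x^*


def lift_op(Z):
    """Linear map Z -> { b_l^* Z a_l }_l, with b_l^* = B[l, :] and a_l = conj(A[l, :])."""
    return np.einsum("lk,kn,ln->l", B, Z, A.conj())


lifted = lift_op(X)
bilinear = (B @ h) * np.conj(A @ x)       # (B h)_l * conj((A x)_l)

# The measurements are linear in the lifted variable, unlike in (h, x).
Z1, Z2 = crandn(K, N), crandn(K, N)
linear_lhs = lift_op(2.0 * Z1 + 1j * Z2)
linear_rhs = 2.0 * lift_op(Z1) + 1j * lift_op(Z2)
```

The payoff of lifting is visible in the last lines: the bilinear measurements become linear in $X$, at the cost of a rank-one constraint.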

  13. Convex relaxation and low-rank matrix recovery: rank-$s$ matrix recovery. We rewrite $y = \sum_{i=1}^{s} \operatorname{diag}(B h_i) A_i x_i$ as
     $$y_l = \left\langle \begin{bmatrix} h_1 x_1^* & 0 & \cdots & 0 \\ 0 & h_2 x_2^* & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & h_s x_s^* \end{bmatrix}, \begin{bmatrix} b_l a_{1,l}^* & 0 & \cdots & 0 \\ 0 & b_l a_{2,l}^* & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & b_l a_{s,l}^* \end{bmatrix} \right\rangle,$$
     where the first matrix is the rank-$s$ unknown. The task is to recover a rank-$s$ block-diagonal matrix satisfying convex constraints. Finding such a rank-$s$ matrix is generally an NP-hard problem.

  14. Low-rank matrix recovery: nuclear norm minimization. The ground truth is a rank-$s$ block-diagonal matrix. It is natural to recover the solution by solving
     $$\min \sum_{i=1}^{s} \| Z_i \|_* \quad \text{subject to} \quad \sum_{i=1}^{s} \mathcal{A}_i(Z_i) = y,$$
     where $\sum_{i=1}^{s} \| Z_i \|_*$ is the nuclear norm of $\operatorname{blkdiag}(Z_1, \dots, Z_s)$.
     Question: can we recover $\{ h_{i0} x_{i0}^* \}_{i=1}^{s}$ exactly?
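
Two facts used on the last two slides are easy to verify numerically: a block-diagonal matrix built from $s$ generic rank-one blocks has rank $s$, and its nuclear norm is the sum of the nuclear norms of its blocks. The sketch below uses random blocks and illustrative names; it does not solve the minimization problem itself.

```python
import numpy as np

rng = np.random.default_rng(4)
K, N, s = 4, 3, 3                         # illustrative sizes


def crandn(*shape):
    """Complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


hs = [crandn(K) for _ in range(s)]
xs = [crandn(N) for _ in range(s)]
blocks = [np.outer(h, x.conj()) for h, x in zip(hs, xs)]   # rank-one blocks h_i x_i^*

# Assemble blkdiag(h_1 x_1^*, ..., h_s x_s^*) by hand.
Z = np.zeros((s * K, s * N), dtype=complex)
for i, X in enumerate(blocks):
    Z[i * K:(i + 1) * K, i * N:(i + 1) * N] = X

rank_Z = np.linalg.matrix_rank(Z)
nuc_blkdiag = np.linalg.norm(Z, "nuc")
nuc_sum = sum(np.linalg.norm(X, "nuc") for X in blocks)

# For a rank-one block h x^*, the nuclear norm is ||h|| * ||x||.
nuc_closed_form = sum(np.linalg.norm(h) * np.linalg.norm(x) for h, x in zip(hs, xs))
```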

  15. Convex approach. Theorem: assume that $B \in \mathbb{C}^{L \times K}$ is a partial DFT matrix with $B^* B = I_K$ and that each $A_i$ is a Gaussian random matrix. Then the SDP relaxation recovers $\{ (h_{i0}, x_{i0}) \}_{i=1}^{s}$ exactly with probability at least $1 - O(L^{-\gamma})$, provided the number of measurements $L$ satisfies
     [Ling-Strohmer 15] $L \geq C_\gamma \, s^2 (K + \mu_h^2 N) \log^3 L$;
     [Jung-Krahmer-Stöger 17] $L \geq C_\gamma \, s (K + \mu_h^2 N) \log^3 L$,
     where $\mu_h^2 = L \max_{1 \leq i \leq s} \dfrac{\| B h_{i0} \|_\infty^2}{\| h_{i0} \|^2}$.
     We can jointly estimate the channels and signals for $s$ users with one simple convex program. The SDP is able to recover $\{ (h_i, x_i) \}_{i=1}^{s}$, but it is computationally expensive.
     Can we solve this problem simply with gradient descent which also ...
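
The incoherence parameter $\mu_h^2$ in the theorem can be computed directly. The sketch below assumes $B$ consists of the first $K$ columns of the unitary DFT matrix and reads the norm in the denominator as the Euclidean norm; the function name `mu_h_sq` is hypothetical. Under these assumptions $1 \leq \mu_h^2 \leq K$: a spike $h = e_1$ attains the lower bound and a constant vector attains the upper bound.

```python
import numpy as np

L, K = 64, 8                              # illustrative sizes
B = (np.fft.fft(np.eye(L)) / np.sqrt(L))[:, :K]   # satisfies B^* B = I_K


def mu_h_sq(h_list, B):
    """L * max_i ||B h_i||_inf^2 / ||h_i||_2^2 for a list of channel vectors."""
    L = B.shape[0]
    return L * max(
        np.max(np.abs(B @ h)) ** 2 / np.linalg.norm(h) ** 2 for h in h_list
    )


spike = np.zeros(K)
spike[0] = 1.0                            # B @ spike has constant modulus 1/sqrt(L)
flat = np.ones(K)                         # B @ flat peaks at frequency 0

mu_spike = mu_h_sq([spike], B)
mu_flat = mu_h_sq([flat], B)

rng = np.random.default_rng(0)
h_rand = rng.standard_normal(K) + 1j * rng.standard_normal(K)
mu_rand = mu_h_sq([h_rand], B)
```

A smaller $\mu_h^2$ means the energy of $B h_{i0}$ is spread evenly over the $L$ frequencies, which lowers the number of measurements the bounds above require.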


  17. A nonconvex optimization approach? A growing list of nonconvex approaches to various problems in machine learning and signal processing:
     Phase retrieval: Candès, Li, Soltanolkotabi, Chen, Wright, Sun, etc.
     Matrix completion: Sun, Luo, Montanari, etc.
     Various problems: Recht, Wainwright, Constantine, etc.
     Two-step philosophy for provable nonconvex optimization:
     (a) Use a spectral method to construct a starting point inside "the basin of attraction";
     (b) Run gradient descent.
     The key is to build up "the basin of attraction".
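
The two-step philosophy can be sketched for the model $y = \sum_i \operatorname{diag}(B h_i) A_i x_i$. What follows is an illustration under stated assumptions, not the speaker's algorithm: it uses noiseless data, complex Gaussian $A_i$, an initializer taken from the top singular pair of $B^* \operatorname{diag}(y) \overline{A_i}$ (whose expectation is $h_i x_i^T$ under these assumptions), and plain unregularized gradient descent with a backtracking step size. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
L, K, N, s = 256, 4, 4, 2                 # illustrative sizes


def crandn(*shape):
    """Complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


B = (np.fft.fft(np.eye(L)) / np.sqrt(L))[:, :K]
A = [crandn(L, N) for _ in range(s)]
h0 = [crandn(K) for _ in range(s)]
x0 = [crandn(N) for _ in range(s)]
y = sum((B @ h) * (Ai @ x) for h, x, Ai in zip(h0, x0, A))   # noiseless data


def residual(h, x):
    return sum((B @ hi) * (Ai @ xi) for hi, xi, Ai in zip(h, x, A)) - y


def objective(h, x):
    return np.linalg.norm(residual(h, x)) ** 2


# (a) Spectral initialization: top singular pair of M_i = B^* diag(y) conj(A_i).
h, x = [], []
for Ai in A:
    M = B.conj().T @ (y[:, None] * Ai.conj())      # K x N matrix
    U, S, Vh = np.linalg.svd(M)
    h.append(np.sqrt(S[0]) * U[:, 0])
    x.append(np.sqrt(S[0]) * Vh[0, :])

f_init = objective(h, x)

# (b) Gradient descent with backtracking: a step is accepted only if it lowers the loss.
step, f_curr = 1.0, f_init
for _ in range(200):
    r = residual(h, x)
    # Wirtinger gradients of ||r||^2 with respect to conj(h_i) and conj(x_i).
    grad_h = [B.conj().T @ (np.conj(Ai @ xi) * r) for xi, Ai in zip(x, A)]
    grad_x = [Ai.conj().T @ (np.conj(B @ hi) * r) for hi, Ai in zip(h, A)]
    for _ in range(30):
        h_new = [hi - step * gi for hi, gi in zip(h, grad_h)]
        x_new = [xi - step * gi for xi, gi in zip(x, grad_x)]
        f_new = objective(h_new, x_new)
        if f_new < f_curr:
            h, x, f_curr = h_new, x_new, f_new
            break
        step *= 0.5                        # shrink the step and retry

f_final = f_curr
```

The backtracking rule only guarantees that the loss never increases; whether the iterates reach the ground truth depends on the initializer landing in the basin of attraction, which is exactly the point the slide emphasizes.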
