Joint Blind Deconvolution and Blind Demixing via Nonconvex Optimization

Shuyang Ling
Department of Mathematics, UC Davis

FoCM, Barcelona, July 19, 2017
Acknowledgements

Research in collaboration with:
Prof. Xiaodong Li (UC Davis)
Prof. Thomas Strohmer (UC Davis)
Dr. Ke Wei (UC Davis)

This work is sponsored by NSF-DMS and DARPA.
Outline

(a) Blind deconvolution meets blind demixing: applications in image processing and wireless communication
(b) Mathematical models and the convex approach
(c) A nonconvex optimization approach to joint blind deconvolution and blind demixing
What is blind deconvolution?

Suppose we observe a function $y$ that is the convolution of two unknown functions, the blurring function $f$ and the signal of interest $g$, plus noise $w$. How do we reconstruct $f$ and $g$ from $y$?
$$y = f \ast g + w.$$
This is obviously a highly ill-posed bilinear inverse problem, much more difficult than ordinary deconvolution, yet it has important applications in various fields.

Solvability? What conditions on $f$ and $g$ make this problem solvable?
How? What algorithms shall we use to recover $f$ and $g$?
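To make the bilinear ill-posedness concrete, here is a minimal NumPy sketch of the forward model with circular convolution (all sizes, signals, and the noise level are illustrative assumptions): the pair $(\alpha f, g/\alpha)$ produces exactly the same noiseless observation for any $\alpha \neq 0$, one of the ambiguities that makes the problem hard.

```python
# Minimal sketch of the forward model y = f * g + w (circular convolution),
# illustrating the inherent scaling ambiguity. All sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
L = 128
f = rng.standard_normal(L)          # unknown blurring function
g = rng.standard_normal(L)          # unknown signal of interest
w = 0.01 * rng.standard_normal(L)   # additive noise

circ_conv = lambda u, v: np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)))
y = circ_conv(f, g) + w

# (alpha * f, g / alpha) yields the same noiseless observation for any alpha != 0
alpha = 3.7
assert np.allclose(circ_conv(alpha * f, g / alpha), circ_conv(f, g))
```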
Why do we care about blind deconvolution? Image deblurring

Let $f$ be the blurring kernel and $g$ be the original image; then $y = f \ast g$ is the blurred image. Question: how do we reconstruct $f$ and $g$ from $y$?

[Figure: blurred image $y$ = blurring kernel $f$ $\ast$ original image $g$ + noise $w$]
Blind deconvolution meets blind demixing

Suppose there are $s$ users and each of them sends a message $x_i$, which is encoded by $C_i$, to a common receiver. Each encoded message $g_i = C_i x_i$ is convolved with an unknown impulse response function $f_i$.

[Figure: users $1, \ldots, s$ send signals $g_i = C_i x_i$ through channels $f_i$; the receiver observes $z = \sum_i f_i \ast g_i + w$ and, for each user, a decoder estimates the pair (channel $f_i$, signal $x_i$).]
Blind deconvolution and blind demixing

Consider the model:
$$y = \sum_{i=1}^{s} f_i \ast g_i + w.$$
This is even more difficult than blind deconvolution ($s = 1$), since it is a "mixture" of blind deconvolution problems. It also includes phase retrieval as a special case if $s = 1$ and $\bar{g}_i = f_i$.

More assumptions:
Each impulse response $f_i$ has maximum delay spread $K$ (compact support): $f_i(n) = 0$ for $n > K$, i.e. $f_i = \begin{bmatrix} h_i \\ 0 \end{bmatrix}$ with $h_i \in \mathbb{C}^K$.
Let $g_i := C_i x_i$ be the signal $x_i \in \mathbb{C}^N$ encoded by $C_i \in \mathbb{C}^{L \times N}$ with $L > N$. We also require the $C_i$ to be mutually incoherent, which we enforce by imposing randomness.
Mathematical model

Subspace assumption in the frequency domain: denote by $F$ the $L \times L$ DFT matrix. Let $h_i \in \mathbb{C}^K$ be the first $K$ nonzero entries of $f_i$ and let $B$ be a low-frequency $L \times K$ partial DFT matrix. There holds
$$\hat{f}_i = F f_i = B h_i.$$
Let $\hat{g}_i := A_i x_i$ where $A_i := F C_i$ and $x_i \in \mathbb{C}^N$.

Mathematical model:
$$y = \sum_{i=1}^{s} \operatorname{diag}(B h_i)\, A_i x_i + w.$$

Goal: we want to recover $(h_i, x_i)_{i=1}^s$ from $(y, B, A_i)_{i=1}^s$.

Remark: the degrees of freedom in the unknowns number $s(K + N)$; the number of constraints is $L$.
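A small synthetic instance of this measurement model can be generated in a few lines. This is a sketch: the dimensions, the Gaussian choice of $A_i$, and the noise level are assumptions for illustration only.

```python
# Sketch: generate y = sum_i diag(B h_i) A_i x_i + w for a small synthetic instance.
import numpy as np

rng = np.random.default_rng(1)
L, K, N, s = 256, 8, 8, 3   # s*(K + N) = 48 unknowns vs. L = 256 constraints

# B = first K columns of the L x L DFT matrix, normalized so that B^* B = I_K
B = np.fft.fft(np.eye(L))[:, :K] / np.sqrt(L)
A = [(rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2)
     for _ in range(s)]

h = [rng.standard_normal(K) + 1j * rng.standard_normal(K) for _ in range(s)]
x = [rng.standard_normal(N) + 1j * rng.standard_normal(N) for _ in range(s)]
w = 0.01 * (rng.standard_normal(L) + 1j * rng.standard_normal(L))

# diag(B h_i) A_i x_i is the entrywise product (B h_i) ⊙ (A_i x_i)
y = sum((B @ h[i]) * (A[i] @ x[i]) for i in range(s)) + w
```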
Naive approach

Nonlinear least squares: we may want to try the nonlinear least squares approach
$$\min_{(h_i, x_i)} \; F(h_i, x_i) := \Big\| \sum_{i=1}^{s} \operatorname{diag}(B h_i)\, A_i x_i - y \Big\|^2.$$

The objective function is highly nonconvex and more complicated than in blind deconvolution ($s = 1$). Gradient descent might get stuck at local minima, and there are no a priori guarantees of recoverability.
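For reference, the objective and its Wirtinger gradients (derivatives with respect to the conjugates $\bar{h}_i$, $\bar{x}_i$) have a simple closed form. The sketch below assumes B, A, y as in the synthetic model above.

```python
# Sketch: least-squares objective and its Wirtinger gradients for the demixing model.
# B (L x K), A (list of s matrices, each L x N), y (length L) are assumed given.
import numpy as np

def objective(h, x, B, A, y):
    r = sum((B @ h[i]) * (A[i] @ x[i]) for i in range(len(h))) - y  # residual
    return np.linalg.norm(r) ** 2

def gradients(h, x, B, A, y):
    s = len(h)
    r = sum((B @ h[i]) * (A[i] @ x[i]) for i in range(s)) - y
    # dF/d(conj h_i) = B^* diag(conj(A_i x_i)) r;  dF/d(conj x_i) = A_i^* diag(conj(B h_i)) r
    grad_h = [B.conj().T @ (np.conj(A[i] @ x[i]) * r) for i in range(s)]
    grad_x = [A[i].conj().T @ (np.conj(B @ h[i]) * r) for i in range(s)]
    return grad_h, grad_x
```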
Convex relaxation and low-rank matrix recovery

Lifting: let $a_{i,l}$ be the $l$-th column of $A_i^*$ and $b_l$ the $l$-th column of $B^*$. Then
$$y_l = \sum_{i=1}^{s} (B h_i)_l \cdot (A_i x_i)_l = \sum_{i=1}^{s} b_l^* \underbrace{h_i x_i^*}_{\text{rank-1}} a_{i,l}.$$
Let $X_i := h_i x_i^*$ and define the linear operator $\mathcal{A}_i : \mathbb{C}^{K \times N} \to \mathbb{C}^L$ as
$$\mathcal{A}_i(Z) := \{ b_l^* Z a_{i,l} \}_{l=1}^{L} = \{ \langle Z, b_l a_{i,l}^* \rangle \}_{l=1}^{L}.$$
Then, there holds $y = \sum_{i=1}^{s} \mathcal{A}_i(X_i) + w$. See [Candès-Strohmer-Voroninski 13], [Ahmed-Recht-Romberg 14].
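The lifting can be checked numerically. The sketch below writes the rank-one matrix as $h x^T$ and implements the operator accordingly, to keep the conjugation conventions of the slides out of the way; sizes and names are illustrative.

```python
# Sketch: the lifted linear operator and a check that it linearizes the model.
import numpy as np

rng = np.random.default_rng(2)
L, K, N = 64, 4, 4
B = np.fft.fft(np.eye(L))[:, :K] / np.sqrt(L)
A = (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2)

def calA(Z, B, A):
    # l-th entry: (row l of B) @ Z @ (row l of A)^T, a linear functional of Z
    return np.sum((B @ Z) * A, axis=1)

h = rng.standard_normal(K) + 1j * rng.standard_normal(K)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# The bilinear measurement diag(Bh) A x equals the linear map applied to h x^T
assert np.allclose(calA(np.outer(h, x), B, A), (B @ h) * (A @ x))
```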
Convex relaxation and low-rank matrix recovery

Rank-$s$ matrix recovery: we rewrite $y = \sum_{i=1}^s \operatorname{diag}(B h_i) A_i x_i$ as
$$y_l = \left\langle \underbrace{\begin{bmatrix} h_1 x_1^* & 0 & \cdots & 0 \\ 0 & h_2 x_2^* & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & h_s x_s^* \end{bmatrix}}_{\text{rank-}s\text{ matrix}}, \begin{bmatrix} b_l a_{1,l}^* & 0 & \cdots & 0 \\ 0 & b_l a_{2,l}^* & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & b_l a_{s,l}^* \end{bmatrix} \right\rangle.$$
The task becomes recovering a rank-$s$ block-diagonal matrix satisfying convex constraints. Finding such a rank-$s$ matrix is generally an NP-hard problem.
Low-rank matrix recovery

Nuclear norm minimization: the ground truth is a rank-$s$ block-diagonal matrix, so it is natural to attempt recovery by solving
$$\min \sum_{i=1}^{s} \| Z_i \|_* \quad \text{subject to} \quad \sum_{i=1}^{s} \mathcal{A}_i(Z_i) = y,$$
where $\sum_{i=1}^s \| Z_i \|_*$ is the nuclear norm of $\operatorname{blkdiag}(Z_1, \ldots, Z_s)$.

Question: can we recover $\{ h_{i0} x_{i0}^* \}_{i=1}^s$ exactly?
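In CVXPY this relaxation takes only a few lines. This is a sketch on a small synthetic noiseless instance, not the code behind the cited results: the sizes are illustrative, the $h_i x_i^T$ convention from the earlier sketch is used for the lifted matrices, and solver behavior on complex nuclear-norm problems depends on the installed backend.

```python
# Sketch: nuclear-norm minimization for joint blind deconvolution/demixing via CVXPY.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(3)
L, K, N, s = 128, 3, 3, 2
B = np.fft.fft(np.eye(L))[:, :K] / np.sqrt(L)
A = [(rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2)
     for _ in range(s)]
h = [rng.standard_normal(K) + 1j * rng.standard_normal(K) for _ in range(s)]
x = [rng.standard_normal(N) + 1j * rng.standard_normal(N) for _ in range(s)]
y = sum((B @ h[i]) * (A[i] @ x[i]) for i in range(s))   # noiseless measurements

Z = [cp.Variable((K, N), complex=True) for _ in range(s)]
# A_i(Z_i)_l = (row l of B) @ Z_i @ (row l of A_i)^T, summed over users
measurements = sum(cp.sum(cp.multiply(B @ Z[i], A[i]), axis=1) for i in range(s))
prob = cp.Problem(cp.Minimize(sum(cp.normNuc(Z[i]) for i in range(s))),
                  [measurements == y])
prob.solve()

# Each recovered Z_i should be (close to) the rank-one matrix h_i x_i^T
for i in range(s):
    print(np.linalg.norm(Z[i].value - np.outer(h[i], x[i])))
```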
Convex approach

Theorem. Assume that
$B \in \mathbb{C}^{L \times K}$ is a partial DFT matrix with $B^* B = I_K$;
each $A_i$ is a Gaussian random matrix.
Then the SDP relaxation recovers $\{ (h_{i0}, x_{i0}) \}_{i=1}^s$ exactly with probability at least $1 - O(L^{-\gamma})$, provided the number of measurements $L$ satisfies
$L \geq C_\gamma\, s^2 (K + \mu_h^2 N) \log^3 L$  [Ling-Strohmer 15];
$L \geq C_\gamma\, s (K + \mu_h^2 N) \log^3 L$  [Jung-Krahmer-Stöger 17],
where $\mu_h^2 = L \max_{1 \leq i \leq s} \| B h_{i0} \|_\infty^2 / \| h_{i0} \|^2$.

We can jointly estimate the channels and signals of all $s$ users with one simple convex program. The SDP recovers $\{ (h_i, x_i) \}_{i=1}^s$, but it is computationally expensive. Can we instead solve this problem simply with gradient descent, which also comes with recovery guarantees?
A nonconvex optimization approach?

There is an increasing list of nonconvex approaches to various problems in machine learning and signal processing:
Phase retrieval: Candès, Li, Soltanolkotabi, Chen, Wright, Sun, etc.
Matrix completion: Sun, Luo, Montanari, etc.
Various problems: Recht, Wainwright, Constantine, etc.

Two-step philosophy for provable nonconvex optimization:
(a) use a spectral method to construct a starting point inside "the basin of attraction";
(b) run a gradient descent method.
The key is to establish "the basin of attraction"; a sketch of step (a) follows below.
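As an illustration of step (a), for each user one can back-project the measurements and take the top singular pair. This sketch follows the $h_i x_i^T$ convention used above; the normalization of $A_i$ (i.i.d. standard complex Gaussian entries) is an assumption, and in expectation the back-projected matrix is proportional to $h_i x_i^T$.

```python
# Sketch: spectral initialization for user i; the resulting (h0, x0) can seed
# gradient descent using the `gradients` function sketched earlier.
import numpy as np

def spectral_init(y, B, A_i):
    # Back-projection M = B^* diag(y) conj(A_i); E[M] ∝ h_i x_i^T for Gaussian A_i
    M = B.conj().T @ (y[:, None] * A_i.conj())
    U, S, Vh = np.linalg.svd(M)
    h0 = np.sqrt(S[0]) * U[:, 0]      # recovers (h_i, x_i) only up to the usual
    x0 = np.sqrt(S[0]) * Vh[0, :]     # scaling ambiguity (alpha, 1/alpha)
    return h0, x0
```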