

  1. Solving Underdetermined Linear Equations and Overdetermined Quadratic Equations (using Convex Programming). Justin Romberg, Georgia Tech, ECE. Caltech ROM-GR Workshop, June 7, 2013, Pasadena, California.

  2. Linear systems in data acquisition

  3. Linear systems of equations are ubiquitous. Model: y = A x, where y is the data coming off of the sensor, A is the mathematical (linear) model for the sensor, and x is the signal/image to reconstruct.

  4. Classical: When can we stably “invert” a matrix? Suppose we have an M × N observation matrix A with M ≥ N (MORE observations than unknowns), through which we observe y = A x_0 + noise.

  5. Classical: When can we stably “invert” a matrix? Suppose we have an M × N observation matrix A with M ≥ N (MORE observations than unknowns), through which we observe y = A x_0 + noise. Standard way to recover x_0: use the pseudo-inverse, x̂ = (A^T A)^{-1} A^T y, i.e. solve min_x ‖y − A x‖_2^2.
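
A minimal numpy sketch of this classical least-squares recovery; the dimensions and noise level are illustrative assumptions, not values from the talk.

    import numpy as np

    rng = np.random.default_rng(0)
    M, N = 200, 50                                  # more observations than unknowns (M >= N)
    A = rng.standard_normal((M, N))                 # observation matrix
    x0 = rng.standard_normal(N)                     # true signal
    y = A @ x0 + 0.01 * rng.standard_normal(M)      # noisy observations

    # Least-squares / pseudo-inverse solution: x_hat = (A^T A)^{-1} A^T y
    x_hat = np.linalg.lstsq(A, y, rcond=None)[0]

    print("relative error:", np.linalg.norm(x_hat - x0) / np.linalg.norm(x0))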

  6. Classical: When can we stably “invert” a matrix? Suppose we have an M × N observation matrix A with M ≥ N (MORE observations than unknowns), through which we observe y = A x_0 + noise. Standard way to recover x_0: use the pseudo-inverse, x̂ = (A^T A)^{-1} A^T y, i.e. solve min_x ‖y − A x‖_2^2. Q: When is this recovery stable? That is, when is ‖x̂ − x_0‖_2^2 ∼ ‖noise‖_2^2?

  7. Classical: When can we stably “invert” a matrix? Suppose we have an M × N observation matrix A with M ≥ N (MORE observations than unknowns), through which we observe y = A x_0 + noise. Standard way to recover x_0: use the pseudo-inverse, x̂ = (A^T A)^{-1} A^T y, i.e. solve min_x ‖y − A x‖_2^2. Q: When is this recovery stable? That is, when is ‖x̂ − x_0‖_2^2 ∼ ‖noise‖_2^2? A: When the matrix A preserves distances, i.e. ‖A(x_1 − x_2)‖_2^2 ≈ ‖x_1 − x_2‖_2^2 for all x_1, x_2 ∈ R^N.

  8. Sparsity. Decompose the signal/image x(t) in an orthobasis {ψ_i(t)}_i: x(t) = Σ_i α_i ψ_i(t). (figure: wavelet transform of x_0, with a zoom on the coefficients {α_i})

  9. Wavelet approximation. Take the 1% largest coefficients, set the rest to zero (adaptive). (figure: original vs. approximated image, rel. error = 0.031)
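
A sketch of this kind of adaptive wavelet approximation, assuming the PyWavelets (pywt) package is available; the wavelet (db4), decomposition level, and the random stand-in image are illustrative assumptions (a real image compresses far better than random noise).

    import numpy as np
    import pywt

    rng = np.random.default_rng(0)
    img = rng.standard_normal((256, 256))        # stand-in for a real image

    # Wavelet decomposition, then keep only the largest 1% of coefficients
    coeffs = pywt.wavedec2(img, "db4", level=4)
    arr, slices = pywt.coeffs_to_array(coeffs)   # flatten all coefficients into one array
    k = int(0.01 * arr.size)                     # number of coefficients to keep
    thresh = np.sort(np.abs(arr), axis=None)[-k]
    arr_approx = np.where(np.abs(arr) >= thresh, arr, 0.0)

    coeffs_approx = pywt.array_to_coeffs(arr_approx, slices, output_format="wavedec2")
    img_approx = pywt.waverec2(coeffs_approx, "db4")
    img_approx = img_approx[:img.shape[0], :img.shape[1]]   # guard against padding

    print("relative error:", np.linalg.norm(img - img_approx) / np.linalg.norm(img))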

  10. When can we stably recover an S-sparse vector? Now we have an underdetermined M × N system Φ (FEWER measurements than unknowns), and observe y = Φ x_0 + noise.

  11. Sampling a superposition of sinusoids. We take M samples of a superposition of S sinusoids. (figure: time domain x_0(t), with the M measured samples marked by red circles; frequency domain x̂_0(ω), with S nonzero components)

  12. Sampling a superposition of sinusoids. Reconstruct by solving min ‖x̂‖_{ℓ1} subject to x(t_m) = x_0(t_m), m = 1, ..., M. (figure: original and recovered x̂_0, S = 15; perfect recovery from 30 samples)
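
A sketch of this ℓ1 reconstruction using cvxpy. The spectrum is modeled as a complex variable (the slide's signal is real-valued, so this is a simplification), and the values of N, S, M and the random seed are illustrative assumptions.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    N, S, M = 256, 15, 30

    # Frequency-sparse signal: S nonzero Fourier coefficients
    xhat0 = np.zeros(N, dtype=complex)
    support = rng.choice(N, S, replace=False)
    xhat0[support] = rng.standard_normal(S) + 1j * rng.standard_normal(S)
    x0 = np.fft.ifft(xhat0)                      # time-domain signal

    # Observe M time-domain samples at random locations
    tm = rng.choice(N, M, replace=False)
    y = x0[tm]

    # min ||xhat||_1  subject to  the time samples of the reconstruction matching y
    F_inv = np.fft.ifft(np.eye(N), axis=0)       # inverse DFT matrix
    xhat = cp.Variable(N, complex=True)
    prob = cp.Problem(cp.Minimize(cp.norm1(xhat)),
                      [F_inv[tm, :] @ xhat == y])
    prob.solve()

    print("recovery error:",
          np.linalg.norm(xhat.value - xhat0) / np.linalg.norm(xhat0))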

  13. Numerical recovery curves. Resolutions N = 256, 512, 1024 (black, blue, red). Signal composed of S randomly selected sinusoids, sampled at M randomly selected locations. (figure: % success vs. S/M) In practice, perfect recovery occurs when M ≈ 2S for N ≈ 1000.

  14. A nonlinear sampling theorem. Exact Recovery Theorem (Candès, R, Tao, 2004): Unknown x̂_0 is supported on a set of size S. Select M sample locations {t_m} “at random” with M ≥ Const · S log N. Take time-domain samples (measurements) y_m = x_0(t_m). Solve min_x ‖x̂‖_{ℓ1} subject to x(t_m) = y_m, m = 1, ..., M. The solution is exactly x_0 with extremely high probability.

  15. When can we stably recover an S-sparse vector? Now we have an underdetermined M × N system Φ (FEWER measurements than unknowns), and observe y = Φ x_0 + noise.

  16. When can we stably recover an S-sparse vector? Now we have an underdetermined M × N system Φ (FEWER measurements than unknowns), and observe y = Φ x_0 + noise. We can recover x_0 when Φ keeps sparse signals separated: ‖Φ(x_1 − x_2)‖_2^2 ≈ ‖x_1 − x_2‖_2^2 for all S-sparse x_1, x_2.

  17. When can we stably recover an S-sparse vector? Now we have an underdetermined M × N system Φ (FEWER measurements than unknowns), and observe y = Φ x_0 + noise. We can recover x_0 when Φ keeps sparse signals separated: ‖Φ(x_1 − x_2)‖_2^2 ≈ ‖x_1 − x_2‖_2^2 for all S-sparse x_1, x_2. To recover x_0, we solve min_x ‖x‖_0 subject to Φ x = y, where ‖x‖_0 = number of nonzero terms in x. This program is computationally intractable.

  18. When can we stably recover an S-sparse vector? Now we have an underdetermined M × N system Φ (FEWER measurements than unknowns), and observe y = Φ x_0 + noise. We can recover x_0 when Φ keeps sparse signals separated: ‖Φ(x_1 − x_2)‖_2^2 ≈ ‖x_1 − x_2‖_2^2 for all S-sparse x_1, x_2. A relaxed (convex) program: min_x ‖x‖_1 subject to Φ x = y, where ‖x‖_1 = Σ_k |x_k|. This program is very tractable (a linear program). The convex program can recover nearly all “identifiable” sparse vectors, and it is robust.
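
One way to see the linear-program structure: split each |x_k| with an auxiliary variable u_k and the constraints −u ≤ x ≤ u. A minimal sketch with scipy.optimize.linprog; the matrix dimensions and sparsity level are illustrative assumptions.

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    M, N, S = 80, 256, 10
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)
    x0 = np.zeros(N)
    x0[rng.choice(N, S, replace=False)] = rng.standard_normal(S)
    y = Phi @ x0

    # min sum(u)  s.t.  Phi x = y,  -u <= x <= u     (variables z = [x, u])
    c = np.concatenate([np.zeros(N), np.ones(N)])
    A_eq = np.hstack([Phi, np.zeros((M, N))])
    I = np.eye(N)
    A_ub = np.vstack([np.hstack([I, -I]),       #  x - u <= 0
                      np.hstack([-I, -I])])     # -x - u <= 0
    b_ub = np.zeros(2 * N)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * (2 * N), method="highs")
    x_hat = res.x[:N]

    print("recovery error:", np.linalg.norm(x_hat - x0))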

  19. Intuition for ℓ1: min_x ‖x‖_2 s.t. Φ x = y versus min_x ‖x‖_1 s.t. Φ x = y. (figure)

  20. Sparse recovery algorithms. ℓ1 can recover sparse vectors “almost anytime” it is possible: perfect recovery with no noise, stable recovery in the presence of noise, robust recovery when the signal is not exactly sparse.

  21. Sparse recovery algorithms. Other recovery techniques have similar theoretical properties (their practical effectiveness varies with the application): greedy algorithms, iterative thresholding, belief propagation, specialized decoding algorithms.
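
As one concrete instance of the “iterative thresholding” item above, a minimal ISTA (iterative soft thresholding) sketch in numpy; this is not the specific algorithm from the talk, and the problem sizes, regularization weight lam, and iteration count are illustrative assumptions.

    import numpy as np

    def ista(Phi, y, lam=0.01, iters=500):
        """Iterative soft thresholding for min 0.5||y - Phi x||^2 + lam ||x||_1."""
        L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
        x = np.zeros(Phi.shape[1])
        for _ in range(iters):
            z = x + Phi.T @ (y - Phi @ x) / L    # gradient step on the quadratic term
            x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
        return x

    rng = np.random.default_rng(0)
    M, N, S = 80, 256, 10
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)
    x0 = np.zeros(N)
    x0[rng.choice(N, S, replace=False)] = rng.standard_normal(S)
    y = Phi @ x0

    x_hat = ista(Phi, y)
    print("recovery error:", np.linalg.norm(x_hat - x0) / np.linalg.norm(x0))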

  22. What kind of matrices keep sparse signals separated? (figure: an M × N matrix Φ with random ±1 entries, mapping N signal components, S of them nonzero, to M compressed measurements) Random matrices are provably efficient: we can recover S-sparse x from M ≳ S · log(N/S) measurements.
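
A quick numpy check of this separation property, assuming a random ±1 matrix scaled by 1/√M; the factor of 2 in M and the test sizes are illustrative assumptions. It empirically compares ‖Φd‖^2 to ‖d‖^2 for random differences d = x_1 − x_2 of sparse vectors.

    import numpy as np

    rng = np.random.default_rng(0)
    N, S = 1024, 10
    M = int(2 * S * np.log(N / S))               # M ~ S log(N/S); the constant 2 is illustrative
    Phi = rng.choice([-1.0, 1.0], size=(M, N)) / np.sqrt(M)

    ratios = []
    for _ in range(1000):
        d = np.zeros(N)                          # d = x1 - x2 is (at most) 2S-sparse
        d[rng.choice(N, 2 * S, replace=False)] = rng.standard_normal(2 * S)
        ratios.append(np.linalg.norm(Phi @ d) ** 2 / np.linalg.norm(d) ** 2)

    print("min/max of ||Phi d||^2 / ||d||^2:", min(ratios), max(ratios))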

  23. Rice single pixel camera. (figure: the scene is imaged onto a DMD array displaying a random pattern; a single photon detector collects the light, and image reconstruction/processing recovers the image) (Duarte, Davenport, Takhar, Laska, Sun, Kelly, Baraniuk ’08)

  24. Hyperspectral imaging: 256 frequency bands, 10s of megapixels, 30 frames per second ...

  25. DARPA’s Analog-to-Information: multichannel ADC/receiver for identifying radar pulses. Covers ∼ 3 GHz with ∼ 400 MHz sampling rate.

  26. Compressive sensing with structured randomness. Subsampled rows of an “incoherent” orthogonal matrix: can recover S-sparse x_0 with M ≳ S log^a N measurements. (Candès, R, Tao, Rudelson, Vershynin, Tropp, ...)
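
A sketch of one such structured measurement operator: randomly subsampled rows of the unitary DFT, applied with an FFT rather than an explicit matrix. The helper names measure/measure_adjoint and the sizes are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    N, M = 1024, 128
    omega = rng.choice(N, M, replace=False)        # random subset of DFT rows (frequencies)

    def measure(x):
        """M rows of the unitary DFT applied to x, via the FFT."""
        return np.fft.fft(x, norm="ortho")[omega]

    def measure_adjoint(y):
        """Adjoint of the subsampled DFT: zero-fill the spectrum, then inverse FFT."""
        xhat = np.zeros(N, dtype=complex)
        xhat[omega] = y
        return np.fft.ifft(xhat, norm="ortho")

    x = rng.standard_normal(N)
    y = measure(x)                                  # M "incoherent" measurements of x
    print(y.shape)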

  27. Accelerated MRI. (figure: ARC and SPIR-iT reconstructions; Lustig et al. ’08)

  28. Matrices for sparse recovery with structured randomness. Random convolution + subsampling: universal; can recover S-sparse x_0 with M ≳ S log^a N. Applications include radar imaging, sonar imaging, seismic exploration, channel estimation for communications, and super-resolved imaging. (R, Bajwa, Haupt, Tropp, Rauhut, ...)
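
A sketch of the random convolution measurement under one common construction (unit-magnitude random phases in the frequency domain), followed by random subsampling; the helper name random_conv_measure, the pulse model, and the sizes are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    N, M = 1024, 128

    # Random pulse: unit-magnitude Fourier coefficients with random phases
    sigma = np.exp(2j * np.pi * rng.random(N))
    keep = rng.choice(N, M, replace=False)          # random subsampling locations

    def random_conv_measure(x):
        """Circular convolution with the random pulse, then keep M random samples."""
        conv = np.fft.ifft(np.fft.fft(x) * sigma)
        return conv[keep]

    x = rng.standard_normal(N)
    y = random_conv_measure(x)
    print(y.shape)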

  29. Integrating compression and sensing

  30. Recovering a matrix from limited observations. Suppose we are interested in recovering the values of a matrix X, e.g. a 5 × 5 matrix with entries X_{i,j}, i, j = 1, ..., 5. We are given a series of different linear combinations of the entries: y = A(X).

  31. Example: matrix completion. Suppose we do not see all the entries in a matrix, e.g.
      X = [ X_{1,1}    −     X_{1,3}    −     X_{1,5}
              −     X_{2,2}    −     X_{2,4}    −
              −     X_{3,2} X_{3,3}    −        −
            X_{4,1}    −        −     X_{4,4} X_{4,5}
              −        −        −     X_{5,4} X_{5,5} ]
      ... can we “fill in the blanks”?

  32. Applications of matrix completion: recommender systems (data: the ratings matrix) and Euclidean embedding (data: the Gram matrix), both of low rank. (slide courtesy of Benjamin Recht)

  33. Low rank structure. A rank-R matrix X of size K × N factors as X = L R^T, where L is K × R and R^T is R × N.

  34. When can we stably recover a rank-R matrix? We have an underdetermined linear operator A : R^{K×N} → R^L, L ≪ KN, and observe y = A(X_0) + noise, where X_0 has rank R. We can recover X_0 when A keeps low-rank matrices separated: ‖A(X_1 − X_2)‖_2^2 ≈ ‖X_1 − X_2‖_F^2 for all rank-R X_1, X_2.

  35. When can we stably recover a rank-R matrix? We have an underdetermined linear operator A : R^{K×N} → R^L, L ≪ KN, and observe y = A(X_0) + noise, where X_0 has rank R. To recover X_0, we would like to solve min_X rank(X) subject to A(X) = y, but this is intractable.

  36. When can we stably recover a rank-R matrix? We have an underdetermined linear operator A : R^{K×N} → R^L, L ≪ KN, and observe y = A(X_0) + noise, where X_0 has rank R. A relaxed (convex) program: min_X ‖X‖_* subject to A(X) = y, where ‖X‖_* = the sum of the singular values of X.
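
A sketch of this convex relaxation for the matrix completion example above, using cvxpy's nuclear norm atom normNuc; the matrix size, rank, and fraction of observed entries are illustrative assumptions.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    K, N, R = 30, 30, 2
    X0 = rng.standard_normal((K, R)) @ rng.standard_normal((R, N))   # rank-R matrix
    mask = (rng.random((K, N)) < 0.5).astype(float)                  # observe ~half the entries

    X = cp.Variable((K, N))
    objective = cp.Minimize(cp.normNuc(X))                           # sum of singular values
    constraints = [cp.multiply(mask, X) == cp.multiply(mask, X0)]    # match observed entries
    cp.Problem(objective, constraints).solve()

    err = np.linalg.norm(X.value - X0) / np.linalg.norm(X0)
    print("relative recovery error:", err)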

  37. Matrix recovery. Vectorize X and stack up the vectorized A_m as the rows of a matrix A, so that y = A vec(X). Independent Gaussian entries in the A_m embed rank-R matrices when M ≳ R(K + N). (Recht, Fazel, Parrilo, Candès, Plan, ...)
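
A small numpy sketch of this vectorization step: each measurement y_m = ⟨A_m, X⟩ becomes one row of A acting on vec(X). The sizes and the constant in M are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    K, N, R = 20, 20, 2
    M = 3 * R * (K + N)                        # M ~ R(K + N); the constant 3 is illustrative

    X0 = rng.standard_normal((K, R)) @ rng.standard_normal((R, N))   # rank-R matrix
    A_list = rng.standard_normal((M, K, N))    # independent Gaussian measurement matrices A_m

    A = A_list.reshape(M, K * N)               # stack vec(A_m) as the rows of A
    y = A @ X0.ravel()                         # y_m = <A_m, X0>

    # Check one measurement against the entrywise inner-product form
    print(np.allclose(y[0], np.sum(A_list[0] * X0)))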
