Empirical Testing of Sparse Approximation and Matrix Completion Algorithms

Jared Tanner
Workshop on Sparsity, Compressed Sensing and Applications
University of Oxford
Joint with Blanchard, Donoho, and Wei
Three sparse approximation questions to test

Sparse approximation: $A \in \mathbb{R}^{m \times n}$,
$$\min_x \|x\|_0 \quad \text{s.t.} \quad \|Ax - b\|_2 \le \tau,$$
with
1. Are there algorithms that have the same behaviour for different $A$?
2. Which algorithm is fastest while keeping a high recovery probability?

Matrix completion: $\mathcal{A}$ maps $\mathbb{R}^{m \times n}$ to $\mathbb{R}^p$,
$$\min_X \operatorname{rank}(X) \quad \text{s.t.} \quad \|\mathcal{A}(X) - b\|_2 \le \tau,$$
with
3. What is the largest rank that can be recovered with an efficient algorithm?

Information about each question can be gleaned from large-scale empirical testing. Let's use some HPC resources.
Sparse approximation phase transition

◮ Problem characterized by three numbers: $k \le m \le n$
  • $n$: signal length, "Nyquist" sampling rate
  • $m$: number of inner-product measurements
  • $k$: signal complexity (sparsity), $k := \min_x \|x\|_0$
◮ Mixed under/over-sampling rates compared to naive/optimal:
  undersampling $\delta_m := m/n$, oversampling $\rho_m := k/m$
◮ Testing model: for a matrix ensemble and algorithm, draw $A$ and a $k$-sparse $x_0$; let $\Pi(k, m, n)$ be the probability of recovery
◮ For fixed $(\delta_m, \rho_m)$, $\Pi(k, m, n)$ converges to 1 or 0 with increasing $m$: separated by a phase transition curve $\rho(\delta)$
◮ Algorithm with $\rho(\delta)$ large and $\Pi(k, m, n)$ insensitive to the matrix?
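As a concrete illustration of this testing model, here is a minimal Python/NumPy sketch (not the GPU or cluster code used in the actual experiments) that estimates $\Pi(k, m, n)$ for a generic recovery routine; the helper names, the Gaussian ensemble, and the success tolerance are illustrative choices.

```python
import numpy as np

def gaussian_matrix(m, n, rng):
    """Dense Gaussian ensemble with columns of unit expected norm."""
    return rng.standard_normal((m, n)) / np.sqrt(m)

def k_sparse_signal(n, k, rng):
    """Vector with a random size-k support and i.i.d. Gaussian nonzeros."""
    x0 = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x0[support] = rng.standard_normal(k)
    return x0

def recovery_probability(recover, k, m, n, trials=100, tol=1e-3, seed=0):
    """Empirical estimate of Pi(k, m, n) for a recovery routine recover(A, b, k)."""
    rng = np.random.default_rng(seed)
    successes = 0
    for _ in range(trials):
        A = gaussian_matrix(m, n, rng)
        x0 = k_sparse_signal(n, k, rng)
        b = A @ x0                                   # noiseless measurements
        x_hat = recover(A, b, k)
        if np.linalg.norm(x_hat - x0) <= tol * np.linalg.norm(x0):
            successes += 1
    return successes / trials

# At fixed delta = m/n, sweep rho = k/m and locate the 50% success level;
# repeating over delta traces the empirical phase transition rho(delta).
```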
Phase Transition: $\ell_1$ ball, $C^n$

◮ With overwhelming probability on measurements $A_{m,n}$: for any $\epsilon > 0$, as $(k, m, n) \to \infty$
  • All $k$-sparse signals if $k/m \le \rho_S(m/n, C)(1 - \epsilon)$
  • Most $k$-sparse signals if $k/m \le \rho_W(m/n, C)(1 - \epsilon)$
  • Failure typical if $k/m \ge \rho_W(m/n, C)(1 + \epsilon)$

[Figure: $\rho_W$ (recovery of most signals) and $\rho_S$ (recovery of all signals) plotted as $k/m$ against $\delta = m/n$]

◮ Asymptotic behaviour as $\delta \to 0$: $\rho(m/n) \sim [2(e)\log(n/m)]^{-1}$
Phase Transition: Simplex, $T^{n-1}$, $x \ge 0$

◮ With overwhelming probability on measurements $A_{m,n}$: for any $\epsilon > 0$, $x \ge 0$, as $(k, m, n) \to \infty$
  • All $k$-sparse signals if $k/m \le \rho_S(m/n, T)(1 - \epsilon)$
  • Most $k$-sparse signals if $k/m \le \rho_W(m/n, T)(1 - \epsilon)$
  • Failure typical if $k/m \ge \rho_W(m/n, T)(1 + \epsilon)$

[Figure: $\rho_W$ (recovery of most signals) and $\rho_S$ (recovery of all signals) plotted as $k/m$ against $\delta = m/n$]

◮ Asymptotic behaviour as $\delta \to 0$: $\rho(m/n) \sim [2(e)\log(n/m)]^{-1}$
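To give a feel for the $\delta \to 0$ asymptotic quoted on the last two slides, the short sketch below evaluates $[2\log(n/m)]^{-1}$ with and without the parenthesized factor $e$; which variant belongs to the weak versus strong transition is treated here as an assumption, not something stated on the slide.

```python
import numpy as np

# rho(m/n) ~ [2 log(n/m)]^{-1}, or with the parenthesized factor e, [2 e log(n/m)]^{-1};
# either way the transition decays only logarithmically as delta = m/n -> 0,
# so extreme undersampling still permits a nontrivial sparsity fraction k/m.
for delta in [1e-1, 1e-2, 1e-3, 1e-4]:
    rho_no_e = 1.0 / (2.0 * np.log(1.0 / delta))
    rho_with_e = 1.0 / (2.0 * np.e * np.log(1.0 / delta))
    print(f"delta={delta:g}  [2 log(1/delta)]^-1 = {rho_no_e:.3f}  "
          f"[2e log(1/delta)]^-1 = {rho_with_e:.3f}")
```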
$\ell_1$-Weak Phase Transitions: Visual agreement

◮ Testing beyond the proven theory, 6.4 CPU years later...
◮ Black: weak phase transition, $x \ge 0$ (top) and $x$ signed (bottom)
◮ Overlaid empirical evidence of the 50% success rate:

[Figure: empirical 50% success-rate curves, $\rho = k/n$ against $\delta = n/N$, for the Gaussian, Bernoulli, Fourier, Ternary ($p = 2/3,\ 2/5,\ 1/10$), Hadamard, Expander ($p = 1/5$), and Rademacher ensembles, overlaid on $\rho(\delta, Q)$]

◮ Gaussian, Bernoulli, Fourier, Hadamard, Rademacher
◮ Ternary($p$): $P(0) = 1 - p$ and $P(\pm 1) = p/2$
◮ Expander($p$): $\lceil p \cdot n \rceil$ ones per column, otherwise zeros
◮ Rigorous statistical comparison shows $n^{-1/2}$ convergence
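For reference, here are illustrative NumPy generators for some of the random ensembles listed above (the structured Fourier and Hadamard ensembles would instead subsample rows of the corresponding transforms); the exact normalizations and the row-count convention are assumptions.

```python
import numpy as np

def ternary_matrix(m, n, p, rng):
    """Ternary(p): each entry is 0 w.p. 1-p and +1 or -1 w.p. p/2 each."""
    signs = rng.choice([-1.0, 1.0], size=(m, n))
    mask = rng.random((m, n)) < p
    return signs * mask

def rademacher_matrix(m, n, rng):
    """Rademacher: i.i.d. +1/-1 entries."""
    return rng.choice([-1.0, 1.0], size=(m, n))

def expander_matrix(m, n, p, rng):
    """Expander(p): a fixed number of ones per column, zeros elsewhere.
    Here ceil(p*m) with m rows; the slide writes ceil(p*n) in the n-by-N
    notation of the figures, where n counts the measurements."""
    d = int(np.ceil(p * m))
    A = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=d, replace=False)
        A[rows, j] = 1.0
    return A

# Example: rng = np.random.default_rng(0); A = ternary_matrix(100, 400, 1/3, rng)
```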
Bulk Z-scores

[Figure: Z-scores plotted against $\delta = n/N$ for (a) Bernoulli, (b) Fourier, (c) Ternary (1/3), (d) Rademacher]

◮ $n = 200$, $n = 400$ and $n = 1600$
◮ Linear trend with $\delta = m/n$, decays at rate $n^{-1/2}$
◮ Proven for matrices with subgaussian tails, Montanari 2012
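One plausible form of such a per-cell Z-score is sketched below: compare the observed success count for a test ensemble against a reference success probability (e.g. the Gaussian-ensemble estimate) via the normal approximation to the binomial. This is an illustrative assumption; the statistic used in the actual study may differ, for instance by comparing against a fitted logistic model.

```python
import numpy as np

def z_score(successes_obs, trials, p_ref):
    """Standardized deviation of an observed success count from a reference probability."""
    p_ref = np.clip(p_ref, 1e-12, 1.0 - 1e-12)   # guard against degenerate cells
    expected = trials * p_ref
    std = np.sqrt(trials * p_ref * (1.0 - p_ref))
    return (successes_obs - expected) / std

# Example: 100 trials in a (delta, rho) cell, reference probability 0.55,
# test ensemble observed 48 successes -> z about -1.4.
print(z_score(48, 100, 0.55))
```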
Which algorithm is fastest and with a high phase transition?

State-of-the-art algorithms for sparse approximation:

◮ Hard Thresholding, $H_k(A^T b)$, followed by a subspace-restricted linear solver: Conjugate Gradient
◮ Normalized IHT: $H_k\big(x^t + \kappa A^T(b - Ax^t)\big)$ (Steepest Descent)
◮ Hard Thresholding Pursuit: NIHT with pseudo-inverse
◮ CSAMPSP (hybrid of CoSaMP and Subspace Pursuit):
  $v^{t+1} = H_{\alpha k}\big(x^t + \kappa A^T(b - Ax^t)\big)$   (thresholded gradient step)
  $I^t = \operatorname{supp}(v^{t+1}) \cup \operatorname{supp}(x^t)$   (join support sets)
  $w_{I^t} = (A_{I^t}^T A_{I^t})^{-1} A_{I^t}^T b$   (least-squares fit)
  $x^{t+1} = H_{\beta k}(w^t)$   (second threshold)
◮ SpaRSA [Lee and Wright '08]
◮ Testing environment with random problem generation, or passing the matrix and measurements.
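As one example of these iterations, here is a minimal NumPy sketch of Normalized IHT; it uses the support-restricted exact-line-search stepsize but omits the stability safeguards of the full algorithm.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x, zero the rest."""
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out

def niht(A, b, k, iters=300, tol=1e-6):
    """Normalized Iterative Hard Thresholding (illustrative sketch)."""
    n = A.shape[1]
    x = hard_threshold(A.T @ b, k)               # initial hard threshold H_k(A^T b)
    for _ in range(iters):
        r = b - A @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        g = A.T @ r                              # (negative) gradient of 0.5*||b - Ax||^2
        support = np.flatnonzero(x)
        if support.size == 0:
            support = np.argpartition(np.abs(g), -k)[-k:]
        gs = np.zeros(n)
        gs[support] = g[support]                 # gradient restricted to the current support
        Ags = A @ gs
        denom = float(Ags @ Ags)
        kappa = float(gs @ gs) / denom if denom > 0 else 1.0   # stepsize kappa
        x = hard_threshold(x + kappa * g, k)
        # (full NIHT additionally checks a stability condition and backtracks kappa)
    return x
```

This routine can be dropped into the `recovery_probability` harness sketched earlier, e.g. `recovery_probability(niht, k=20, m=100, n=400)` estimates one point of $\Pi(k, m, n)$ for this algorithm.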