

  1. Parallel-ℓ0: A fully parallel algorithm for combinatorial compressed sensing
     Jared Tanner, joint with Rodrigo Mendoza-Smith
     University of Oxford
     2nd Int. Matheon Conf. on CS and its Applications 2015, 7th Dec. 2015
     Supported by: EPSRC, NVIDIA, & SELEX-Galileo

  2. Combinatorial Compressed Sensing (CCS)
     ◮ Let A ∈ R^{m×n}, x ∈ χ^n_k := {x ∈ R^n : ‖x‖_0 ≤ k}.
     ◮ Compressed sensing looks for the solution, with k < m < n, of y = Ax s.t. x ∈ χ^n_k.
     ◮ Most CS theory is developed for A Gaussian or partial Fourier.

  3. Combinatorial Compressed Sensing (CCS)
     ◮ Let A ∈ R^{m×n}, x ∈ χ^n_k := {x ∈ R^n : ‖x‖_0 ≤ k}.
     ◮ Compressed sensing looks for the solution, with k < m < n, of y = Ax s.t. x ∈ χ^n_k.
     ◮ Most CS theory is developed for A Gaussian or partial Fourier.

     Ensemble         Storage  Generation  A^T y       m
     Gaussian         O(mn)    O(mn)       O(mn)       O(k log(n/k))
     Partial Fourier  O(m)     O(n)        O(n log n)  O(k log^5(n))
     Expander         O(dn)    O(dn)       O(dn)       O(k log(n/k))

     ◮ In CCS, A is an expander matrix, i.e. a sparse binary matrix with d ≪ m ones per column (A ∈ E_{k,ε,d}).

  4. Expander matrices: A ∈ E_{k,ε,d}, some notation
     [n] = {1, 2, ..., n}, and similarly [m], index the two vertex sets of the expander graph.
     The neighbours N(X) of a vertex set X are the vertices connected to X by an edge.

  5. Expander matrices: A ∈ E_{k,ε,d}, some notation
     [n] = {1, 2, ..., n}, and similarly [m], index the two vertex sets of the expander graph.
     The neighbours N(X) of a vertex set X are the vertices connected to X by an edge.
     A_ij = 1{i and j are connected}.

  6. Expander matrices: A ∈ E_{k,ε,d}, some notation
     [n] = {1, 2, ..., n}, and similarly [m], index the two vertex sets of the expander graph.
     The neighbours N(X) of a vertex set X are the vertices connected to X by an edge.
     A_ij = 1{i and j are connected}.
     ∃ ε ∈ (0, 1) s.t. |Γ(X)| = |N(X)| > (1 − ε) d |X|  ∀ X ⊂ [n] with |X| ≤ k.

  7. Expander matrices: A ∈ E_{k,ε,d}, some notation
     [n] = {1, 2, ..., n}, and similarly [m], index the two vertex sets of the expander graph.
     The neighbours N(X) of a vertex set X are the vertices connected to X by an edge.
     A_ij = 1{i and j are connected}.
     ∃ ε ∈ (0, 1) s.t. |Γ(X)| = |N(X)| > (1 − ε) d |X|  ∀ X ⊂ [n] with |X| ≤ k.
     d ≡ |N(j)| ∀ j ∈ [n]: A ∈ R^{m×n} is a sparse binary matrix with d ≪ m ones per column.
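The column-regular structure above can be sketched numerically: draw a sparse binary matrix with exactly d ones per column. The function name and sizes below are illustrative; a random draw of this kind is an expander only with high probability, and this sketch does not verify the expansion condition.

```python
import numpy as np

def random_column_regular(m, n, d, seed=0):
    """Sample a sparse binary m-by-n matrix with exactly d ones per column.

    Each column j picks its d neighbours N(j) uniformly without replacement.
    Such matrices are (k, eps, d)-expanders with high probability for
    suitable k and eps, but this sketch does not certify that.
    """
    rng = np.random.default_rng(seed)
    A = np.zeros((m, n), dtype=np.int8)
    for j in range(n):
        rows = rng.choice(m, size=d, replace=False)  # N(j), the d neighbours
        A[rows, j] = 1
    return A

A = random_column_regular(m=64, n=256, d=7)
```

Storage and matrix-vector products with such an A cost O(dn), matching the Expander row of the ensemble table above.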

  8. Structure of CCS greedy algorithms
     Initialization: A ∈ E_{k,ε,d}; y ∈ R^m; x̂ = 0; r = y
     while not converged
        compute a score s_j and an update ω_j ∀ j ∈ [n]
        select T ⊂ [n] based on a rule on s_j
        x̂_j ← x̂_j + ω_j for j ∈ T
        r ← y − Ax̂
     ◮ CCS algorithms differ by their score metric s_j and by how many elements T is allowed to contain.
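The loop above can be sketched as a generic skeleton that is specialized by plugging in a score/update rule and a selection rule. All names, parameters, and defaults below are illustrative, not from the talk.

```python
import numpy as np

def ccs_greedy(A, y, score_and_update, select, max_iter=100, tol=1e-12):
    """Skeleton of a CCS greedy decoder; names and defaults are illustrative.

    score_and_update(r, A) -> (s, w): a score s_j and a candidate update
    w_j for every column j; select(s) -> the index set T of columns to
    update in this iteration.
    """
    x_hat = np.zeros(A.shape[1])
    r = y.astype(float).copy()
    for _ in range(max_iter):
        s, w = score_and_update(r, A)   # per-column scores and updates
        T = select(s)                   # the rule on s_j picks T
        x_hat[T] += w[T]                # update only the selected columns
        r = y - A @ x_hat               # refresh the residual
        if np.linalg.norm(r) <= tol:
            break
    return x_hat

# toy demo: with A = I, score |A^T r|, and greedy argmax selection, the
# skeleton reduces to coordinate descent and reproduces y exactly
A = np.eye(4)
y = np.array([3.0, 0.0, -2.0, 0.0])
x_hat = ccs_greedy(A, y,
                   score_and_update=lambda r, A: (np.abs(A.T @ r), A.T @ r),
                   select=lambda s: [int(np.argmax(s))])
```

The algorithms in the table that follows differ only in what they plug into `score_and_update` and `select`.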

  9. Overview of CCS greedy algorithms
     Algorithm           Score      Concurrency  Complexity
     SMP (EIHT) [1]      ℓ1/median  parallel     O((nd + n log n) log ‖x‖_1)
     SSMP [2]            ℓ1/median  serial       O((d³n/m + n)k + (n log n) log ‖x‖_1)
     LDDSR [3] / ER [4]  ℓ0/mode    serial       O((d³n/m + n)k)

  10. Overview of CCS greedy algorithms
      Algorithm           Score      Concurrency  Complexity
      SMP (EIHT) [1]      ℓ1/median  parallel     O((nd + n log n) log ‖x‖_1)
      SSMP [2]            ℓ1/median  serial       O((d³n/m + n)k + (n log n) log ‖x‖_1)
      LDDSR [3] / ER [4]  ℓ0/mode    serial       O((d³n/m + n)k)
      Serial-ℓ0 [5]       ℓ0/ℓ0      serial       O(dn log k)
      Parallel-ℓ0 [5]     ℓ0/ℓ0      parallel     O(dn log k)

  11. Overview of CCS greedy algorithms
      Algorithm           Score      Concurrency  Complexity
      SMP (EIHT) [1]      ℓ1/median  parallel     O((nd + n log n) log ‖x‖_1)
      SSMP [2]            ℓ1/median  serial       O((d³n/m + n)k + (n log n) log ‖x‖_1)
      LDDSR [3] / ER [4]  ℓ0/mode    serial       O((d³n/m + n)k)
      Serial-ℓ0 [5]       ℓ0/ℓ0      serial       O(dn log k)
      Parallel-ℓ0 [5]     ℓ0/ℓ0      parallel     O(dn log k)
      ◮ Only SMP was observed to take less computational time than non-combinatorial CS algorithms such as NIHT.
      ◮ Unfortunately, SMP is only able to recover x ∈ χ^n_k for k/m ≪ 1.
      ◮ Parallel-ℓ0 is computationally fast and recovers for k/m ≈ 0.3.
      ◮ Sudocodes is an alternative method, preprocessing to reduce n by determining locations in x that must be zero.

  12. Decoding by decreasing ‖r‖_0
      Parallel-ℓ0
      Initialization: A ∈ E_{k,ε,d}; y ∈ R^m; α ∈ [d − 1]; x̂ = 0; r = y
      while not converged
         T ← {(j, ω_j) ∈ [n] × R : ‖r‖_0 − ‖r − ω_j a_j‖_0 > α}
         x̂_j ← x̂_j + ω_j for (j, ω_j) ∈ T
         r ← y − Ax̂

  13. Decoding by decreasing ‖r‖_0
      Parallel-ℓ0
      Initialization: A ∈ E_{k,ε,d}; y ∈ R^m; α ∈ [d − 1]; x̂ = 0; r = y
      while not converged
         T ← {(j, ω_j) ∈ [n] × R : ‖r‖_0 − ‖r − ω_j a_j‖_0 > α}
         x̂_j ← x̂_j + ω_j for (j, ω_j) ∈ T
         r ← y − Ax̂

      Serial-ℓ0
      Initialization: A ∈ E_{k,ε,d}; y ∈ R^m; α ∈ [d − 1]; x̂ = 0; r = y
      while not converged
         for j ∈ [n]
            T ← {ω_j ∈ R : ‖r‖_0 − ‖r − ω_j a_j‖_0 > α}
            x̂_j ← x̂_j + ω_j for ω_j ∈ T
            r ← y − Ax̂

      ◮ Parallel-ℓ0: computing T and updating x̂ are suitable for GPUs.
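A minimal NumPy sketch of the Parallel-ℓ0 loop above, under the stated assumptions (binary expander columns, so for each column j the best candidate ω_j is the most frequent nonzero value of the residual on N(j)). The toy matrix in the demo is hand-built rather than a certified expander, and all names are illustrative.

```python
import numpy as np

def parallel_l0(A, y, alpha, max_iter=100):
    """Sketch of Parallel-l0 for a binary A with d ones per column.

    For each column j (independently, hence 'parallel') the candidate
    update w_j is the mode of the nonzero residual values on N(j);
    subtracting w_j * a_j decreases ||r||_0 by
    (# residual entries on N(j) equal to w_j) - (# already zero there),
    and the update is applied only when this drop exceeds alpha.
    """
    n = A.shape[1]
    x_hat = np.zeros(n)
    r = y.astype(float).copy()
    neigh = [np.flatnonzero(A[:, j]) for j in range(n)]  # N(j) per column
    for _ in range(max_iter):
        updates = {}
        for j in range(n):                   # n independent O(d) tasks
            rj = r[neigh[j]]
            vals, counts = np.unique(rj[rj != 0], return_counts=True)
            if vals.size == 0:
                continue
            w = vals[np.argmax(counts)]      # mode of residual on N(j)
            if counts.max() - np.count_nonzero(rj == 0) > alpha:
                updates[j] = w               # score test: ||r||_0 drop > alpha
        if not updates:
            break
        for j, w in updates.items():
            x_hat[j] += w
        r = y - A @ x_hat
        if not np.any(r):                    # exact recovery reached
            break
    return x_hat

# tiny hand-built instance (d = 3, alpha = 2), x = (5, 0, 0, 0)
A = np.zeros((6, 4), dtype=int)
A[[0, 1, 2], 0] = A[[3, 4, 5], 1] = A[[0, 3, 4], 2] = A[[1, 2, 5], 3] = 1
x = np.array([5.0, 0.0, 0.0, 0.0])
x_hat = parallel_l0(A, A @ x, alpha=2)
```

The inner loop over j has no data dependencies, which is what makes the score computation and the update suitable for a GPU.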

  14. Theorem (Convergence of expander ℓ0-decoders)
      Let A ∈ E_{k,ε,d} with ε < 1/4, and let x ∈ χ^n_k be a dissociated signal. Then Serial-ℓ0 and Parallel-ℓ0 with α = (1 − 2ε)d can recover x from y = Ax ∈ R^m in O(dn log k) operations.
      Dissociated: Σ_{j∈T1} x_j ≠ Σ_{j∈T2} x_j  ∀ T1, T2 ⊂ supp(x) with T1 ≠ T2.

  15. Theorem (Convergence of expander ℓ0-decoders)
      Let A ∈ E_{k,ε,d} with ε < 1/4, and let x ∈ χ^n_k be a dissociated signal. Then Serial-ℓ0 and Parallel-ℓ0 with α = (1 − 2ε)d can recover x from y = Ax ∈ R^m in O(dn log k) operations.
      Dissociated: Σ_{j∈T1} x_j ≠ Σ_{j∈T2} x_j  ∀ T1, T2 ⊂ supp(x) with T1 ≠ T2.
      ◮ Dissociation is the same signal model as considered by sudocodes.
      ◮ Parallel-ℓ0 requires log k iterations of complexity O(dn), each of which is trivially decomposed into n independent tasks of complexity O(d).
      ◮ Serial-ℓ0 requires n log k iterations of complexity O(d).
      ◮ Serial-ℓ0 is faster than Parallel-ℓ0 if both run on a single core, but Parallel-ℓ0 is substantially faster when run on high-performance GPUs with thousands of cores.
      ◮ Serial-ℓ0 and Parallel-ℓ0 have nearly identical recovery regions.
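The dissociated condition can be checked by brute force on toy signals: every pair of distinct subsets of supp(x) must have different sums. The helper below is illustrative only; the check is exponential in the sparsity, and signals with i.i.d. continuous entries satisfy the condition with probability one.

```python
from itertools import chain, combinations

def is_dissociated(x):
    """Brute-force check that all subset sums of supp(x) are distinct.

    Two distinct subsets with equal sums produce a duplicate in the list
    of subset sums, so distinctness of the list is equivalent to the
    dissociated condition. Exponential in the sparsity k: toy use only.
    """
    vals = [v for v in x if v != 0]
    subsets = chain.from_iterable(
        combinations(range(len(vals)), r) for r in range(len(vals) + 1))
    sums = [sum(vals[i] for i in T) for T in subsets]
    return len(sums) == len(set(sums))
```

For example, `[1, 2, 3]` is not dissociated (1 + 2 = 3), while `[1, 2, 4]` is.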

  16. Improved phase transition
      [Figure: 50% phase transition curves for d = 7 with n = 2^18, in the plane (δ, ρ) = (m/n, k/m); curves shown for ℓ1-regularization, Parallel-ℓ0, Serial-ℓ0, SSMP, ER, Parallel-LDDSR, and SMP.]
      ◮ Greater recovery region than other CCS algorithms.
      ◮ No apparent decrease in phase transition for m ≪ n.

  17. Fastest CS algorithm for A ∈ E_{k,ε,d}
      [Figure: algorithm selection map for d = 7 with n = 2^18, marking the fastest algorithm at each (δ, ρ) = (m/n, k/m); candidates include Parallel-ℓ0, Parallel-LDDSR, CGIHT (plain, projected, and restarted), CSMPSP, FIHT, and HTP.]
      ◮ Parallel-ℓ0 and Parallel-LDDSR are fastest when convergent.
      ◮ First examples of CCS algorithms being state-of-the-art.
