

  1. Parallel-ℓ0: A fully parallel algorithm for combinatorial compressed sensing
     Jared Tanner, joint with Rodrigo Mendoza-Smith
     University of Oxford
     2nd Int. Matheon Conf. on CS and its Applications 2015, 7th Dec. 2015
     Supported by: EPSRC, NVIDIA, & SELEX-Galileo

  2. Combinatorial Compressed Sensing (CCS)
     ◮ Let A ∈ R^{m×n}, x ∈ χ^n_k := {x ∈ R^n : ‖x‖_0 ≤ k}.
     ◮ Compressed sensing looks for the solution, with k < m < n, of y = Ax s.t. x ∈ χ^n_k.
     ◮ Most CS theory is developed for A Gaussian or partial Fourier.

  3. Combinatorial Compressed Sensing (CCS)
     ◮ Let A ∈ R^{m×n}, x ∈ χ^n_k := {x ∈ R^n : ‖x‖_0 ≤ k}.
     ◮ Compressed sensing looks for the solution, with k < m < n, of y = Ax s.t. x ∈ χ^n_k.
     ◮ Most CS theory is developed for A Gaussian or partial Fourier.

     Ensemble         Storage  Generation  A^T y       m
     Gaussian         O(mn)    O(mn)       O(mn)       O(k log(n/k))
     Partial Fourier  O(m)     O(n)        O(n log n)  O(k log^5(n))
     Expander         O(dn)    O(dn)       O(dn)       O(k log(n/k))

     ◮ In CCS, A is an expander matrix, i.e. a sparse binary matrix with d ≪ m ones per column (A ∈ E_{k,ε,d}).

  4. Expander matrices: A ∈ E_{k,ε,d}, some notation
     [n] = {1, 2, ..., n}, and similarly [m], index the two vertex sets of the expander graph.
     The neighbours N(X) of a vertex set X are the vertices connected to X by an edge.

  5. Expander matrices: A ∈ E_{k,ε,d}, some notation
     [n] = {1, 2, ..., n}, and similarly [m], index the two vertex sets of the expander graph.
     The neighbours N(X) of a vertex set X are the vertices connected to X by an edge.
     A_ij = 1{i and j are connected}.

  6. Expander matrices: A ∈ E_{k,ε,d}, some notation
     [n] = {1, 2, ..., n}, and similarly [m], index the two vertex sets of the expander graph.
     The neighbours N(X) of a vertex set X are the vertices connected to X by an edge.
     A_ij = 1{i and j are connected}.
     ∃ ε ∈ (0, 1) s.t. |Γ(X)| = |N(X)| > (1 − ε) d |X|  ∀ X ⊂ [n] with |X| ≤ k.

  7. Expander matrices: A ∈ E_{k,ε,d}, some notation
     [n] = {1, 2, ..., n}, and similarly [m], index the two vertex sets of the expander graph.
     The neighbours N(X) of a vertex set X are the vertices connected to X by an edge.
     A_ij = 1{i and j are connected}.
     ∃ ε ∈ (0, 1) s.t. |Γ(X)| = |N(X)| > (1 − ε) d |X|  ∀ X ⊂ [n] with |X| ≤ k.
     d ≡ |N(j)| ∀ j ∈ [n]: A ∈ R^{m×n} is a sparse binary matrix with d ≪ m ones per column.
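The column-regular structure above can be sketched numerically: draw a sparse binary matrix with exactly d ones per column. The function name and sizes below are illustrative; a random draw of this kind is an expander only with high probability, and this sketch does not verify the expansion condition.

```python
import numpy as np

def random_column_regular(m, n, d, seed=0):
    """Sample a sparse binary m-by-n matrix with exactly d ones per column.

    Each column j picks its d neighbours N(j) uniformly without replacement.
    Such matrices are (k, eps, d)-expanders with high probability for
    suitable k and eps, but this sketch does not certify that.
    """
    rng = np.random.default_rng(seed)
    A = np.zeros((m, n), dtype=np.int8)
    for j in range(n):
        rows = rng.choice(m, size=d, replace=False)  # N(j), the d neighbours
        A[rows, j] = 1
    return A

A = random_column_regular(m=64, n=256, d=7)
```

Storage and matrix-vector products with such an A cost O(dn), matching the Expander row of the ensemble table above.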

  8. Structure of CCS greedy algorithms
     Initialization: A ∈ E_{k,ε,d}; y ∈ R^m; x̂ = 0; r = y
     while not converged
        compute a score s_j and an update ω_j ∀ j ∈ [n]
        select T ⊂ [n] based on a rule on s_j
        x̂_j ← x̂_j + ω_j for j ∈ T
        r ← y − Ax̂
     ◮ CCS algorithms differ by their score metric s_j and by how many elements T is allowed to contain.
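The loop above can be sketched as a generic skeleton that is specialized by plugging in a score/update rule and a selection rule. All names, parameters, and defaults below are illustrative, not from the talk.

```python
import numpy as np

def ccs_greedy(A, y, score_and_update, select, max_iter=100, tol=1e-12):
    """Skeleton of a CCS greedy decoder; names and defaults are illustrative.

    score_and_update(r, A) -> (s, w): a score s_j and a candidate update
    w_j for every column j; select(s) -> the index set T of columns to
    update in this iteration.
    """
    x_hat = np.zeros(A.shape[1])
    r = y.astype(float).copy()
    for _ in range(max_iter):
        s, w = score_and_update(r, A)   # per-column scores and updates
        T = select(s)                   # the rule on s_j picks T
        x_hat[T] += w[T]                # update only the selected columns
        r = y - A @ x_hat               # refresh the residual
        if np.linalg.norm(r) <= tol:
            break
    return x_hat

# toy demo: with A = I, score |A^T r|, and greedy argmax selection, the
# skeleton reduces to coordinate descent and reproduces y exactly
A = np.eye(4)
y = np.array([3.0, 0.0, -2.0, 0.0])
x_hat = ccs_greedy(A, y,
                   score_and_update=lambda r, A: (np.abs(A.T @ r), A.T @ r),
                   select=lambda s: [int(np.argmax(s))])
```

The algorithms in the table that follows differ only in what they plug into `score_and_update` and `select`.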

  9. Overview of CCS greedy algorithms
     Algorithm           Score      Concurrency  Complexity
     SMP (EIHT) [1]      ℓ1/median  parallel     O((nd + n log n) log ‖x‖_1)
     SSMP [2]            ℓ1/median  serial       O((d³n/m + n)k + (n log n) log ‖x‖_1)
     LDDSR [3] / ER [4]  ℓ0/mode    serial       O((d³n/m + n)k)

  10. Overview of CCS greedy algorithms
      Algorithm           Score      Concurrency  Complexity
      SMP (EIHT) [1]      ℓ1/median  parallel     O((nd + n log n) log ‖x‖_1)
      SSMP [2]            ℓ1/median  serial       O((d³n/m + n)k + (n log n) log ‖x‖_1)
      LDDSR [3] / ER [4]  ℓ0/mode    serial       O((d³n/m + n)k)
      Serial-ℓ0 [5]       ℓ0/ℓ0      serial       O(dn log k)
      Parallel-ℓ0 [5]     ℓ0/ℓ0      parallel     O(dn log k)

  11. Overview of CCS greedy algorithms
      Algorithm           Score      Concurrency  Complexity
      SMP (EIHT) [1]      ℓ1/median  parallel     O((nd + n log n) log ‖x‖_1)
      SSMP [2]            ℓ1/median  serial       O((d³n/m + n)k + (n log n) log ‖x‖_1)
      LDDSR [3] / ER [4]  ℓ0/mode    serial       O((d³n/m + n)k)
      Serial-ℓ0 [5]       ℓ0/ℓ0      serial       O(dn log k)
      Parallel-ℓ0 [5]     ℓ0/ℓ0      parallel     O(dn log k)
      ◮ Only SMP was observed to take less computational time than non-combinatorial CS algorithms such as NIHT.
      ◮ Unfortunately, SMP is only able to recover x ∈ χ^n_k for k/m ≪ 1.
      ◮ Parallel-ℓ0 is computationally fast and recovers for k/m ≈ 0.3.
      ◮ Sudocodes is an alternative method, preprocessing to reduce n by determining locations in x that must be zero.

  12. Decoding by decreasing ‖r‖_0
      Parallel-ℓ0
      Initialization: A ∈ E_{k,ε,d}; y ∈ R^m; α ∈ [d − 1]; x̂ = 0; r = y
      while not converged
         T ← {(j, ω_j) ∈ [n] × R : ‖r‖_0 − ‖r − ω_j a_j‖_0 > α}
         x̂_j ← x̂_j + ω_j for (j, ω_j) ∈ T
         r ← y − Ax̂

  13. Decoding by decreasing ‖r‖_0
      Parallel-ℓ0
      Initialization: A ∈ E_{k,ε,d}; y ∈ R^m; α ∈ [d − 1]; x̂ = 0; r = y
      while not converged
         T ← {(j, ω_j) ∈ [n] × R : ‖r‖_0 − ‖r − ω_j a_j‖_0 > α}
         x̂_j ← x̂_j + ω_j for (j, ω_j) ∈ T
         r ← y − Ax̂

      Serial-ℓ0
      Initialization: A ∈ E_{k,ε,d}; y ∈ R^m; α ∈ [d − 1]; x̂ = 0; r = y
      while not converged
         for j ∈ [n]
            T ← {ω_j ∈ R : ‖r‖_0 − ‖r − ω_j a_j‖_0 > α}
            x̂_j ← x̂_j + ω_j for ω_j ∈ T
            r ← y − Ax̂

      ◮ Parallel-ℓ0: computing T and updating x̂ are suitable for GPUs.
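A minimal NumPy sketch of the Parallel-ℓ0 loop above, under the stated assumptions (binary expander columns, so for each column j the best candidate ω_j is the most frequent nonzero value of the residual on N(j)). The toy matrix in the demo is hand-built rather than a certified expander, and all names are illustrative.

```python
import numpy as np

def parallel_l0(A, y, alpha, max_iter=100):
    """Sketch of Parallel-l0 for a binary A with d ones per column.

    For each column j (independently, hence 'parallel') the candidate
    update w_j is the mode of the nonzero residual values on N(j);
    subtracting w_j * a_j decreases ||r||_0 by
    (# residual entries on N(j) equal to w_j) - (# already zero there),
    and the update is applied only when this drop exceeds alpha.
    """
    n = A.shape[1]
    x_hat = np.zeros(n)
    r = y.astype(float).copy()
    neigh = [np.flatnonzero(A[:, j]) for j in range(n)]  # N(j) per column
    for _ in range(max_iter):
        updates = {}
        for j in range(n):                   # n independent O(d) tasks
            rj = r[neigh[j]]
            vals, counts = np.unique(rj[rj != 0], return_counts=True)
            if vals.size == 0:
                continue
            w = vals[np.argmax(counts)]      # mode of residual on N(j)
            if counts.max() - np.count_nonzero(rj == 0) > alpha:
                updates[j] = w               # score test: ||r||_0 drop > alpha
        if not updates:
            break
        for j, w in updates.items():
            x_hat[j] += w
        r = y - A @ x_hat
        if not np.any(r):                    # exact recovery reached
            break
    return x_hat

# tiny hand-built instance (d = 3, alpha = 2), x = (5, 0, 0, 0)
A = np.zeros((6, 4), dtype=int)
A[[0, 1, 2], 0] = A[[3, 4, 5], 1] = A[[0, 3, 4], 2] = A[[1, 2, 5], 3] = 1
x = np.array([5.0, 0.0, 0.0, 0.0])
x_hat = parallel_l0(A, A @ x, alpha=2)
```

The inner loop over j has no data dependencies, which is what makes the score computation and the update suitable for a GPU.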

  14. Theorem (Convergence of expander ℓ0-decoders)
      Let A ∈ E_{k,ε,d} with ε < 1/4, and let x ∈ χ^n_k be a dissociated signal. Then Serial-ℓ0 and Parallel-ℓ0 with α = (1 − 2ε)d can recover x from y = Ax ∈ R^m in O(dn log k) operations.
      Dissociated: Σ_{j∈T1} x_j ≠ Σ_{j∈T2} x_j  ∀ T1, T2 ⊂ supp(x) with T1 ≠ T2.

  15. Theorem (Convergence of expander ℓ0-decoders)
      Let A ∈ E_{k,ε,d} with ε < 1/4, and let x ∈ χ^n_k be a dissociated signal. Then Serial-ℓ0 and Parallel-ℓ0 with α = (1 − 2ε)d can recover x from y = Ax ∈ R^m in O(dn log k) operations.
      Dissociated: Σ_{j∈T1} x_j ≠ Σ_{j∈T2} x_j  ∀ T1, T2 ⊂ supp(x) with T1 ≠ T2.
      ◮ Dissociation is the same signal model as considered by sudocodes.
      ◮ Parallel-ℓ0 requires log k iterations of complexity O(dn), each of which is trivially decomposed into n independent tasks of complexity O(d).
      ◮ Serial-ℓ0 requires n log k iterations of complexity O(d).
      ◮ Serial-ℓ0 is faster than Parallel-ℓ0 if both run on a single core, but Parallel-ℓ0 is substantially faster when run on high-performance GPUs with thousands of cores.
      ◮ Serial-ℓ0 and Parallel-ℓ0 have nearly identical recovery regions.
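The dissociated condition can be checked by brute force on toy signals: every pair of distinct subsets of supp(x) must have different sums. The helper below is illustrative only; the check is exponential in the sparsity, and signals with i.i.d. continuous entries satisfy the condition with probability one.

```python
from itertools import chain, combinations

def is_dissociated(x):
    """Brute-force check that all subset sums of supp(x) are distinct.

    Two distinct subsets with equal sums produce a duplicate in the list
    of subset sums, so distinctness of the list is equivalent to the
    dissociated condition. Exponential in the sparsity k: toy use only.
    """
    vals = [v for v in x if v != 0]
    subsets = chain.from_iterable(
        combinations(range(len(vals)), r) for r in range(len(vals) + 1))
    sums = [sum(vals[i] for i in T) for T in subsets]
    return len(sums) == len(set(sums))
```

For example, `[1, 2, 3]` is not dissociated (1 + 2 = 3), while `[1, 2, 4]` is.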

  16. Improved phase transition
      [Figure: 50% phase transition curves for d = 7 with n = 2^18, in the plane (δ, ρ) = (m/n, k/m); curves shown for ℓ1-regularization, Parallel-ℓ0, Serial-ℓ0, SSMP, ER, Parallel-LDDSR, and SMP.]
      ◮ Greater recovery region than other CCS algorithms.
      ◮ No apparent decrease in phase transition for m ≪ n.

  17. Fastest CS algorithm for A ∈ E_{k,ε,d}
      [Figure: algorithm selection map for d = 7 with n = 2^18, marking the fastest algorithm at each (δ, ρ) = (m/n, k/m); candidates include Parallel-ℓ0, Parallel-LDDSR, CGIHT (plain, projected, and restarted), CSMPSP, FIHT, and HTP.]
      ◮ Parallel-ℓ0 and Parallel-LDDSR are fastest when convergent.
      ◮ First examples of CCS algorithms being state-of-the-art.
