
Compressed sensing off-the-grid: The Fisher metric, support stability and optimal sampling bounds
Clarice Poon, University of Bath
Joint work with: Nicolas Keriven and Gabriel Peyré, École Normale Supérieure
February 6, 2019

Outline
1. Compressed sensing off-the-grid
2. The Fisher metric and the minimum separation condition
3. Support stability for the subsampled problem
4. Ideas behind the proofs: dual certificates
5. Removal of random signs assumption

Compressed sensing [Candès, Romberg & Tao '06; Donoho '06]
Task: Recover $a \in \mathbb{C}^N$ from $y = \Phi a$, where $\Phi \in \mathbb{C}^{m \times N}$ with $m \ll N$ and $a$ is $s$-sparse.
Typical compressed sensing statement: For certain random matrices $\Phi \in \mathbb{C}^{m \times N}$, with high probability, $a$ can be uniquely recovered from $m = O(s \log(N))$ measurements by solving
$$\min_{z \in \mathbb{C}^N} \|z\|_1 \quad \text{subject to} \quad \Phi z = y,$$
or, in the noisy case $y = \Phi a + w$, the minimizer $\hat{a}$ of
$$\min_{z \in \mathbb{C}^N} \lambda \|z\|_1 + \tfrac{1}{2} \|\Phi z - y\|_2^2$$
with $\lambda \sim \delta/\sqrt{s}$ and $\|w\| \le \delta$ satisfies
$$\|a - \hat{a}\|_1 \lesssim \sigma_s(a)_1 + \sqrt{s}\,\delta,$$
where $\sigma_s(a)_1$ is the $\ell^1$ error of the best $s$-term approximation of $a$.
In the case where $U$ is unitary, the above statement holds with $\Phi = P_\Omega U$, where $\Omega$ consists of $m = O(N \cdot \mu(U)^2 \cdot s \cdot \log(N))$ uniformly drawn indices and $\mu(U) = \max_{i,j} |U_{ij}|$ is the so-called coherence. When $U$ is the DFT, we have $\mu(U)^2 = 1/N$.
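The coherence claim for the DFT can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper names `unitary_dft` and `coherence` are hypothetical, and the unspecified constant in the $O(\cdot)$ bound is set to 1 purely for illustration.

```python
import cmath
import math


def unitary_dft(n):
    """Return the n x n unitary DFT matrix as a list of rows."""
    scale = 1.0 / math.sqrt(n)
    return [[scale * cmath.exp(-2j * math.pi * i * j / n) for j in range(n)]
            for i in range(n)]


def coherence(u):
    """Coherence mu(U) = max_{i,j} |U_ij| of a matrix given as a list of rows."""
    return max(abs(entry) for row in u for entry in row)


N = 16
s = 3
mu = coherence(unitary_dft(N))

# Scaling N * mu(U)^2 * s * log(N) from the slide; the hidden constant is
# taken to be 1 here, which is an assumption made only for illustration.
m_scale = N * mu ** 2 * s * math.log(N)
```

Since every entry of the unitary DFT has modulus $1/\sqrt{N}$, the factor $N \mu(U)^2$ equals 1 and the bound reduces to the $s \log(N)$ scaling of the generic statement.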

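Compressed sensing off the grid
Aim: Recover $\mu_0 \in \mathcal{M}(\mathcal{X})$, $\mathcal{X} \subseteq \mathbb{R}^d$, from $m$ observations $y = \Phi \mu_0 + w$.
Let $(\Omega, \Lambda)$ be a probability space. For $\omega \in \Omega$, we have random features $\varphi_\omega \in \mathscr{C}(\mathcal{X})$. For $k = 1, \ldots, m$, let $\omega_k \overset{\text{iid}}{\sim} \Lambda$. The measurement operator is
$$\Phi : \mathcal{M}(\mathcal{X}) \to \mathbb{C}^m, \qquad \Phi \mu \overset{\text{def.}}{=} \frac{1}{\sqrt{m}} \left( \int \varphi_{\omega_k}(x)\, \mathrm{d}\mu(x) \right)_{k=1}^m.$$
Typically, the measure of interest is $\mu_0 = \sum_{j=1}^s a_j \delta_{x_j}$, where $a \delta_x$ denotes the Dirac at $x \in \mathcal{X}$ with amplitude $a \in \mathbb{C}$ (also called a "spike").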

Imaging
Sampling the Fourier transform (e.g. astronomy): Recover $\mu \in \mathcal{M}(\mathbb{T}^d)$ from $(\mathcal{F}\mu(\omega_k))_{k=1}^m$, where $\mathcal{F}$ is the Fourier transform and $\omega_k$ are drawn iid from $([\![ -f_c, f_c ]\!]^d, \mathrm{Unif})$. Here, $\varphi_\omega(x) = \exp(-\mathrm{i} 2\pi x^\top \omega)$ and
$$\Phi \mu_0 = \frac{1}{\sqrt{m}} \left( \sum_{j=1}^s a_j \exp(-\mathrm{i} 2\pi x_j^\top \omega_k) \right)_{k=1}^m.$$
Sampling the Laplace transform (e.g. fluorescence microscopy): Recover $\mu \in \mathcal{M}(\mathbb{R}_+^d)$ from $(\mathcal{L}\mu(\omega_k))_{k=1}^m$, where $\mathcal{L}$ is the Laplace transform and $\omega_k$ are drawn iid from $(\mathbb{R}_+^d, \Lambda_\alpha)$ with $\Lambda_\alpha(\omega) \propto \exp(-2\alpha^\top \omega)$. Here, $\varphi_\omega(x) = \exp(-x^\top \omega)$ and
$$\Phi \mu_0 = \frac{1}{\sqrt{m}} \left( \sum_{j=1}^s a_j \exp(-x_j^\top \omega_k) \right)_{k=1}^m.$$
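For a discrete measure the integral in the measurement operator collapses to a finite sum, so both imaging models can be simulated in a few lines. The sketch below is illustrative only: the function names, spike values and `alpha` are hypothetical, the Fourier frequencies are assumed to be integers drawn uniformly from $\{-f_c, \ldots, f_c\}^d$, and the Laplace frequencies are drawn coordinate-wise from exponential distributions with rates $2\alpha_i$, which is what the density $\Lambda_\alpha(\omega) \propto \exp(-2\alpha^\top \omega)$ factorises into.

```python
import cmath
import math
import random


def dot(x, w):
    """Euclidean inner product of two equal-length tuples."""
    return sum(xi * wi for xi, wi in zip(x, w))


def fourier_feature(w, x):
    """phi_w(x) = exp(-i 2 pi x^T w)."""
    return cmath.exp(-2j * math.pi * dot(x, w))


def laplace_feature(w, x):
    """phi_w(x) = exp(-x^T w)."""
    return math.exp(-dot(x, w))


def measure(feature, omegas, amplitudes, positions):
    """Phi mu_0 for mu_0 = sum_j a_j delta_{x_j}, including the 1/sqrt(m) factor."""
    m = len(omegas)
    return [sum(a * feature(w, x) for a, x in zip(amplitudes, positions)) / math.sqrt(m)
            for w in omegas]


rng = random.Random(0)
d, m, fc = 2, 8, 5
amplitudes = [1.0, -0.5, 2.0]                      # hypothetical spike amplitudes
positions = [(0.1, 0.2), (0.5, 0.7), (0.9, 0.3)]   # hypothetical spike positions

# Fourier sampling: integer frequencies, uniform on {-fc, ..., fc}^d (assumed).
fourier_omegas = [tuple(rng.randint(-fc, fc) for _ in range(d)) for _ in range(m)]
y_fourier = measure(fourier_feature, fourier_omegas, amplitudes, positions)

# Laplace sampling: independent exponential coordinates with rates 2 * alpha_i.
alpha = (1.0, 1.0)
laplace_omegas = [tuple(rng.expovariate(2.0 * a_i) for a_i in alpha) for _ in range(m)]
y_laplace = measure(laplace_feature, laplace_omegas, amplitudes, positions)
```

Passing the feature map as an argument keeps `measure` independent of the sampling model, which mirrors how the slides treat $\varphi_\omega$ as the only problem-specific ingredient.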

Two layer neural network [Bach, 2015]
Let $\Omega \subseteq \mathbb{R}^d$, and let $\omega_1, \ldots, \omega_m$ be training samples drawn from $(\Omega, \Lambda)$, with corresponding values $y_1, \ldots, y_m \in \mathbb{R}$. Find a function of the form
$$f(\omega) = \sum_{j=1}^s a_j \max(\langle x_j, \omega \rangle, 0),$$
where $a_j \in \mathbb{R}$ and $x_j \in \mathbb{R}^d$, such that $f(\omega_k) \approx y_k$ for $k = 1, \ldots, m$. We can then use the function $f$ to predict $y$ given $\omega \in \Omega$.
This is precisely our sparse spikes problem where we let $\varphi_\omega(x) = \max(\langle x, \omega \rangle, 0)$ and
$$\Phi \mu_0 = \left( \sum_{j=1}^s a_j \max(\langle x_j, \omega_k \rangle, 0) \right)_{k=1}^m,$$
where $\mu_0 = \sum_{j=1}^s a_j \delta_{x_j}$.
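The correspondence between the network and the spike model can be made concrete with a small example. This is a minimal sketch with hypothetical weights and inputs: evaluating the network at the training inputs is the same computation as applying $\Phi$ to $\mu_0$ with the ReLU feature map (written here without a normalisation factor, as in the formula above).

```python
def relu_feature(w, x):
    """phi_w(x) = max(<x, w>, 0)."""
    return max(sum(xi * wi for xi, wi in zip(x, w)), 0.0)


def network(amplitudes, positions, w):
    """f(w) = sum_j a_j max(<x_j, w>, 0): one hidden layer with s neurons."""
    return sum(a * relu_feature(w, x) for a, x in zip(amplitudes, positions))


def phi_mu0(amplitudes, positions, omegas):
    """Phi mu_0: the network evaluated at every training input omega_k."""
    return [network(amplitudes, positions, w) for w in omegas]


# Hypothetical network: outer weights a_j (spike amplitudes) and
# hidden weights x_j (spike positions).
amplitudes = [2.0, -1.0]
positions = [(1.0, 0.0), (0.0, 1.0)]
omegas = [(1.0, 1.0), (-1.0, 2.0), (3.0, -1.0)]

predictions = phi_mu0(amplitudes, positions, omegas)
print(predictions)  # [1.0, -2.0, 6.0]
```

In this reading the number of neurons is the number of spikes $s$, and fitting the network amounts to recovering both the amplitudes and the positions of $\mu_0$.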

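Density estimation
Task: Given data on $\mathcal{T}$, estimate the parameters $(a_i)_{i=1}^s$, with $a_i \in \mathbb{R}_+$, and $(x_i)_{i=1}^s \in \mathcal{X}^s$ of a mixture
$$\xi(t) = \sum_{j=1}^s a_j \xi_{x_j}(t) = \int_{\mathcal{X}} \xi_x(t)\, \mathrm{d}\mu_0(x),$$
where $\mu_0 = \sum_j a_j \delta_{x_j}$ and $(\xi_x)_{x \in \mathcal{X}}$ is a family of template distributions. E.g. $x = (m, \sigma) \in \mathcal{X} = \mathbb{R} \times \mathbb{R}_+$ and $\xi_x = \mathcal{N}(m, \sigma^2)$.
Sketching [Gribonval, Blanchard, Keriven & Traonmilin, 2017]: There is no direct access to $\xi$, only to $n$ iid samples $(t_1, \ldots, t_n) \in \mathcal{T}^n$ drawn from $\xi$. You do not record this (possibly huge) set of data, but compute online a small set $y \in \mathbb{C}^m$ of $m$ sketches against sketching functions $\theta_\omega(t)$:
$$y_k \overset{\text{def.}}{=} \frac{1}{n} \sum_{j=1}^n \theta_{\omega_k}(t_j) \approx \int_{\mathcal{T}} \theta_{\omega_k}(t)\, \xi(t)\, \mathrm{d}t = \int_{\mathcal{X}} \int_{\mathcal{T}} \theta_{\omega_k}(t)\, \xi_x(t)\, \mathrm{d}t\, \mathrm{d}\mu_0(x).$$
So, $\varphi_\omega(x) \overset{\text{def.}}{=} \int_{\mathcal{T}} \theta_\omega(t)\, \xi_x(t)\, \mathrm{d}t$. E.g. $\theta_\omega(t) = e^{\mathrm{i} \langle \omega, t \rangle}$ and $\varphi_\cdot(x)$ is the characteristic function of $\xi_x$.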
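The sketching step can be illustrated with a one-dimensional Gaussian mixture, for which the characteristic function is known in closed form. This is a sketch under stated assumptions rather than the authors' implementation: the mixture weights (chosen to sum to one), the component parameters, the frequencies `omegas` and the sample size are all hypothetical, and the frequencies are fixed by hand instead of being drawn from $\Lambda$.

```python
import cmath
import random


def sample_mixture(rng, weights, params):
    """Draw one sample from sum_j a_j N(m_j, sigma_j^2)."""
    (mean, sigma), = rng.choices(params, weights=weights, k=1)
    return rng.gauss(mean, sigma)


def gaussian_cf(w, mean, sigma):
    """Characteristic function of N(mean, sigma^2) at frequency w."""
    return cmath.exp(1j * w * mean - 0.5 * (sigma * w) ** 2)


def sketch_online(samples, omegas):
    """Running average of exp(i w t); each sample is seen once and never stored."""
    sums = [0j] * len(omegas)
    n = 0
    for t in samples:
        n += 1
        for k, w in enumerate(omegas):
            sums[k] += cmath.exp(1j * w * t)
    return [total / n for total in sums]


rng = random.Random(0)
weights = [0.3, 0.7]                  # amplitudes a_j (assumed to sum to 1)
params = [(-2.0, 0.5), (1.0, 1.0)]    # x_j = (m_j, sigma_j)
omegas = [0.25, 0.5, 1.0]             # sketching frequencies (fixed, hypothetical)
n = 20000

stream = (sample_mixture(rng, weights, params) for _ in range(n))
y = sketch_online(stream, omegas)

# Model side: the integral of phi_w against mu_0 is a weighted sum of
# characteristic functions.
expected = [sum(a * gaussian_cf(w, mean, sigma) for a, (mean, sigma) in zip(weights, params))
            for w in omegas]
max_error = max(abs(u - v) for u, v in zip(y, expected))
```

The empirical sketch `y` and the model prediction `expected` differ by a Monte Carlo error of order $1/\sqrt{n}$, which is why the sketch can stand in for linear measurements of $\mu_0$.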

The Beurling LASSO
The BLASSO was initially proposed by [De Castro & Gamboa, 2012] and [Bredies & Pikkarainen, 2013]. Solve
$$\min_{\mu \in \mathcal{M}(\mathcal{X})} \frac{1}{2} \|\Phi \mu - y\|^2 + \lambda |\mu|(\mathcal{X}) \qquad (\hat{P}_\lambda(y))$$
where $|\mu|(\mathcal{X}) \overset{\text{def.}}{=} \sup \{ \mathrm{Re}(\langle f, \mu \rangle) \;;\; f \in \mathscr{C}(\mathcal{X}),\ \|f\|_\infty \le 1 \}$.
Noiseless problem: for $y_0 = \Phi \mu_0$,
$$\min_{\mu \in \mathcal{M}(\mathcal{X})} |\mu|(\mathcal{X}) \quad \text{subject to} \quad \Phi \mu = y_0 \qquad (\hat{P}_0(y_0))$$
NB: If $\mu = \sum_j a_j \delta_{x_j}$, then $|\mu|(\mathcal{X}) = \|a\|_1$.
Goal: A CS-type theory. Under what conditions can we recover $\mu_0 = \sum_{j=1}^s a_j \delta_{x_j}$ exactly (stably) from $m = O(s \times \text{log factors})$ (noisy) randomised linear measurements?
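Restricted to discrete measures with distinct positions, the BLASSO objective is a finite-dimensional function of the amplitudes and positions, because the total variation norm reduces to $\|a\|_1$. The sketch below only evaluates that objective and does not solve the problem; it assumes one-dimensional Fourier features, uses all integer frequencies in $\{-5, \ldots, 5\}$ instead of random draws, and every name and numerical value in it is hypothetical.

```python
import cmath
import math


def forward(omegas, amplitudes, positions):
    """Phi mu for mu = sum_j a_j delta_{x_j} with phi_w(x) = exp(-i 2 pi x w)."""
    m = len(omegas)
    return [sum(a * cmath.exp(-2j * math.pi * x * w) for a, x in zip(amplitudes, positions))
            / math.sqrt(m)
            for w in omegas]


def total_variation(amplitudes):
    """|mu|(X) = ||a||_1 for a discrete measure with distinct positions."""
    return sum(abs(a) for a in amplitudes)


def blasso_objective(omegas, y, lam, amplitudes, positions):
    """0.5 * ||Phi mu - y||^2 + lam * |mu|(X), evaluated at a discrete measure."""
    residual = [u - v for u, v in zip(forward(omegas, amplitudes, positions), y)]
    return 0.5 * sum(abs(r) ** 2 for r in residual) + lam * total_variation(amplitudes)


omegas = list(range(-5, 6))   # deterministic frequencies, for illustration only
a0 = [1.0, -0.5, 2.0]
x0 = [0.1, 0.45, 0.8]
y0 = forward(omegas, a0, x0)  # noiseless data y_0 = Phi mu_0
lam = 0.1

obj_true = blasso_objective(omegas, y0, lam, a0, x0)  # data term vanishes at mu_0
obj_zero = blasso_objective(omegas, y0, lam, [], [])  # zero measure: 0.5 * ||y_0||^2
```

At $\mu_0$ the data term is zero and only $\lambda \|a_0\|_1$ remains, while the zero measure pays the full data misfit; the regularisation parameter $\lambda$ trades these two terms against each other.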

Remarks
Other approaches include Prony-type methods (1795): MUSIC [Schmidt, 1986], ESPRIT [Roy, 1987], Finite Rate of Innovation [Vetterli, 2002] ...
- Nonvariational approaches which encode the spike positions as the zeros of some polynomial, whose coefficients are derived from the measurements.
- Generally restricted to Fourier-type measurements.
- Extension to the multivariate setting is nontrivial.
There are efficient algorithms for solving the infinite-dimensional BLASSO problem, e.g. SDP approaches [Candès & Fernandez-Granda, 2012; De Castro, Gamboa, Henrion & Lasserre 2015] and Frank-Wolfe approaches [Bredies & Pikkarainen 2013; Boyd, Schiebinger & Recht '15; Denoyelle, Duval & Peyré '18].

Background on the BLASSO
Recovery of spikes of arbitrary signs requires a minimum separation condition:
[Candès & Fernandez-Granda '12]: Given $\{ \mathcal{F}\mu_0(k) \;;\; k \in \mathbb{Z}^d,\ \|k\|_\infty \le f_c \}$, $\mu_0$ can be recovered uniquely if $\Delta = \min_{i \ne j} \|x_i - x_j\|_\infty \ge C_d / f_c$.
There are many extensions to other measurement operators; minimum separation is fundamental (for BLASSO) and often imposed via ad hoc metrics [Bendory et al '15, Tang '15].
Stability for the recovered measure $\hat{\mu}$:
- Integral type stability estimates [Candès & Fernandez-Granda '13]: $\| K_{\mathrm{hi}} \star (\hat{\mu} - \mu_0) \|_{L^1}$.
- Support concentration [Fernandez-Granda '13; Azaïs, De Castro & Gamboa '12]: bounds on $| \hat{\mu}(X_j^{\mathrm{near}}) - a_j |$ and $|\hat{\mu}|(X^{\mathrm{far}})$.
- Support stability [Duval and Peyré '15]: in the small noise regime where $\|w\|$ and $\lambda$ are sufficiently small, $\hat{\mu}$ consists of exactly $s$ spikes, and the recovered amplitudes and positions vary continuously with respect to $\lambda$ and $w$.
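The minimum separation condition is easy to check for a given spike configuration. The sketch below assumes positions on the torus $\mathbb{T}^d = [0, 1)^d$ with wrap-around distance, which is an interpretation of the Fourier setting above. The slide does not state the constant $C_d$; the value 2.0 used here is a placeholder for the one-dimensional case, and the positions and cut-off frequency are hypothetical.

```python
def torus_distance_inf(x, y):
    """Sup-norm distance on the torus [0, 1)^d, taking wrap-around into account."""
    gaps = []
    for a, b in zip(x, y):
        gap = abs(a - b) % 1.0
        gaps.append(min(gap, 1.0 - gap))
    return max(gaps)


def min_separation(positions):
    """Delta = min_{i != j} ||x_i - x_j||_inf over all pairs of spikes."""
    return min(torus_distance_inf(p, q)
               for i, p in enumerate(positions)
               for q in positions[i + 1:])


def is_separated(positions, fc, constant):
    """Check Delta >= C_d / f_c for a user-supplied constant C_d."""
    return min_separation(positions) >= constant / fc


positions = [(0.05,), (0.4,), (0.95,)]  # hypothetical 1D spike positions
delta = min_separation(positions)       # closest pair is 0.05 and 0.95, across the wrap

C_PLACEHOLDER = 2.0                     # assumed value of C_d, not taken from the slide
ok_high_fc = is_separated(positions, fc=25, constant=C_PLACEHOLDER)
ok_low_fc = is_separated(positions, fc=10, constant=C_PLACEHOLDER)
```

With these values the configuration passes the check at $f_c = 25$ but fails at $f_c = 10$, illustrating that a lower cut-off frequency demands more widely separated spikes.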
