SLIDE 1

Single Letter Formulas for Quantized Compressed Sensing with Gaussian Codebooks

Alon Kipnis (Stanford), Galen Reeves (Duke), Yonina Eldar (Technion)
ISIT, June 2018

SLIDE 2

Table of Contents

Introduction
  Motivation
  Problem Formulation
  Background
Main Results: CE w.r.t. Gaussian Codebooks
  Compress-and-Estimate
  Linear Transformation Compress-and-Estimate
Summary

SLIDES 3–7

Quantization in Linear Models / Compressed Sensing

Y = H X + N

(Y: samples, H: sampling matrix, X: signal, N: Gaussian noise)

Applications:

◮ Signal processing
◮ Communication
◮ Statistics / Machine learning

This talk: a limited bitrate is available to represent the samples Y. An encoder maps Y to a bit string 1011 · · · 01; a decoder outputs an estimate X̂ of X.

Scenarios:

◮ Limited memory (A/D conversion)
◮ Limited bandwidth
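The measurement model Y = H X + N is easy to simulate; a minimal sketch (the dimensions, scalings, and the i.i.d. Gaussian choice of H here are my own illustrative assumptions, not fixed by the slide):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, sigma2_x = 200, 100, 1.0   # signal dimension, number of samples

x = rng.normal(scale=np.sqrt(sigma2_x), size=n)    # signal X, i.i.d. entries
H = rng.normal(scale=1 / np.sqrt(n), size=(m, n))  # sampling matrix H
noise = rng.normal(size=m)                         # Gaussian noise N
y = H @ x + noise                                  # samples Y = H X + N
print(y.shape)  # (100,)
```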

SLIDES 8–9

Related Works (on quantized compressed sensing)

◮ Gaussian signals [K., E., Goldsmith, Weissman '16]
◮ Scalar quantization [Goyal, Fletcher, Rangan '08], [Jacques, Hammond, Fadili '11]
◮ 1-bit quantization [Boufounos, Baraniuk '08], [Plan, Vershynin '13], [Xu, Kabashima, Zdeborová '14]
◮ AMP reconstruction [Kamilov, Goyal, Rangan '11]
◮ Separable setting [Leinonen, Codreanu, Juntti, Kramer '16]

This talk: the fundamental limit, i.e. the minimal distortion over all finite-bit representations of the measurements.

SLIDES 10–12

Problem Formulation

X → [Linear Transform H] → [AWGN] → Y → [Enc] → index in {1, . . . , 2^{nR}} → [Dec] → X̂

◮ Signal distribution: X_i i.i.d. ∼ P_X, i = 1, . . . , n
◮ Coding rate: R bits per signal dimension
◮ Sampling matrix:
  ◮ Right-rotationally invariant: H equals HO in distribution for any orthogonal O
  ◮ The empirical spectral distribution of HᵀH converges to a compactly supported measure µ

Definition:
D(P_X, µ, R) ≜ the infimum over all D for which there exists a rate-R coding scheme such that

  limsup_{n→∞} (1/n) E‖X − X̂‖² ≤ D

SLIDES 13–16

Spectral Distribution of Sampling Matrix

◮ Example I: H is orthogonal (HᵀH = γI) ⇒ µ is the point-mass distribution δ_γ.
◮ Example II: the rows of H are randomly sampled from an orthogonal matrix ⇒ µ = (1 − ρ)δ_0 + ρδ_γ, where ρ ≜ 1 − µ({0}) is the sampling rate.
◮ Example III: H is i.i.d. Gaussian ⇒ µ is the Marchenko–Pastur law; again ρ ≜ 1 − µ({0}) is the sampling rate.
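Example III can be checked numerically; a minimal sketch (my own illustration, not from the talk), assuming H has i.i.d. N(0, 1/n) entries so that the nonzero eigenvalues of HᵀH fall in the Marchenko–Pastur bulk for aspect ratio ρ = m/n:

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 1000, 0.5            # signal dimension and sampling rate
m = int(rho * n)              # number of measurements

# H with i.i.d. N(0, 1/n) entries: H^T H has rank m, so the empirical
# spectral distribution has an atom of mass 1 - rho at zero, and the
# nonzero eigenvalues follow the Marchenko-Pastur bulk (Example III).
H = rng.normal(scale=1 / np.sqrt(n), size=(m, n))
eigs = np.linalg.eigvalsh(H.T @ H)

frac_zero = np.mean(eigs < 1e-10)
print(f"mass at zero ~ {frac_zero:.3f} (expected 1 - rho = {1 - rho})")

# Marchenko-Pastur bulk edges for aspect ratio rho under this normalization
lam_minus = (1 - np.sqrt(rho)) ** 2
lam_plus = (1 + np.sqrt(rho)) ** 2
bulk = eigs[eigs > 1e-10]
print(f"bulk in [{bulk.min():.3f}, {bulk.max():.3f}], "
      f"MP edges [{lam_minus:.3f}, {lam_plus:.3f}]")
```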

SLIDE 17

Special Cases / Lower Bounds

(Plot, MSE vs sampling rate 1 − µ({0}): the unknown curve D(P_X, µ, R) = ?, bounded below by the Shannon distortion-rate function D_Shannon(P_X, R) and by the MMSE(P_X, µ).)

SLIDES 18–20

Special Case: MMSE

No quantization / infinite bitrate:

lim_{R→∞} D(P_X, µ, R) = limsup_{n→∞} (1/n) E‖X − E[X|Y]‖²  ≜ M(P_X, µ)   (the MMSE)

◮ Under some conditions: M(P_X, µ) = M(P_X, δ_s), a single-letter characterization
[Guo & Verdú '05], [Takeda et al. '06], [Wu & Verdú '12], [Tulino et al. '13], [Reeves & Pfister '16], [Barbier et al. '16, '17], [Rangan, Schniter, Fletcher '16], [Maillard, Barbier, Macris, Krzakala, Wed 10:20]

Main result of this talk: D(P_X, µ, R) ∼ M(P_X, T_R µ), where T_R is a spectrum scaling operator.
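For a Gaussian signal, the MMSE functional has a simple single-letter form: it averages the scalar-channel MMSE σ_X²/(1 + λσ_X²) over the spectral measure µ. A sketch under that assumption (unit-variance noise; the helper name `gaussian_mmse` is mine, not from the talk):

```python
import numpy as np

def gaussian_mmse(sigma2_x, mu_atoms, mu_weights):
    """M(P_X, mu) for X ~ N(0, sigma2_x) and a discrete spectral measure mu:
    average the scalar-channel MMSE sigma2_x / (1 + lam * sigma2_x) over mu."""
    lam = np.asarray(mu_atoms, dtype=float)
    w = np.asarray(mu_weights, dtype=float)
    return float(np.sum(w * sigma2_x / (1.0 + lam * sigma2_x)))

# Example II spectrum: mu = (1 - rho) * delta_0 + rho * delta_gamma
rho, gamma, sigma2_x = 0.5, 1.0, 1.0
m = gaussian_mmse(sigma2_x, [0.0, gamma], [1 - rho, rho])
print(m)  # 0.5 * 1 + 0.5 * 0.5 = 0.75
```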

SLIDES 21–22

Previous Results

(Plot, MSE vs sampling rate: D_Shannon, MMSE, and the estimate-and-compress distortion D_EC [KREG '17].)

SLIDES 23–26

Estimate-and-Compress vs Compress-and-Estimate

Estimate-and-Compress [Kipnis, Reeves, Eldar, Goldsmith '17]:

X → [Linear Transform H] → [AWGN] → Y → [Est] → [Enc'] → nR bits → [Dec] → X̂

◮ Encoding is hard
◮ Decoding is easy

Compress-and-Estimate (this talk):

X → [Linear Transform H] → [AWGN] → Y → [Enc] → nR bits → [Dec'] → Ŷ → [Est] → X̂

◮ Encoding is easy
◮ Decoding is hard

SLIDE 27

Table of Contents (section divider): Main Results: CE w.r.t. Gaussian Codebooks

SLIDES 28–30

Result I

Theorem (CE achievability):

D(P_X, µ, R) ≤ M(P_X, Tµ),

where T is an SNR scaling operator applied to the spectral distribution of the sampling matrix:

T(λ) = [(1 − 2^(−2R/ρ)) / (1 + (γ/ρ) σ_X² 2^(−2R/ρ))] · λ

(Plot: original spectrum µ and scaled spectrum Tµ, each with an atom of mass 1 − ρ at zero.)

Quantization is equivalent to spectrum attenuation.
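The attenuation in T can be tabulated directly. A sketch assuming the theorem's scaling factor reads T(λ) = c · λ with c = (1 − 2^(−2R/ρ)) / (1 + (γ/ρ) σ_X² 2^(−2R/ρ)) (my reading of the slide; `T_scale` is a name I introduce); note c → 1 as R → ∞, recovering the unquantized MMSE:

```python
import numpy as np

def T_scale(R, rho, gamma, sigma2_x):
    """Attenuation factor c in T(lam) = c * lam (Result I, as read above)."""
    d = 2.0 ** (-2.0 * R / rho)
    return (1.0 - d) / (1.0 + (gamma / rho) * sigma2_x * d)

# Attenuation grows milder as the rate increases; at R = inf there is none,
# so M(P_X, T mu) falls back to the unquantized MMSE M(P_X, mu).
for R in (0.5, 1.0, 2.0, np.inf):
    print(f"R = {R}: c = {T_scale(R, rho=0.5, gamma=1.0, sigma2_x=1.0):.4f}")
```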

SLIDE 31

Example I: Distortion vs Sampling Rate

P_X: Bernoulli–Gauss, µ: Marchenko–Pastur law

(Plot, MSE vs sampling rate: D_EC (previous), D_CE (this talk), D_Shannon, MMSE.)

SLIDE 32

Example II: Distortion vs Sparsity

P_X: Bernoulli–Gauss, µ: Marchenko–Pastur law

(Plot, MSE vs sparsity, from sparse to dense: D_EC (previous), D_CE (this talk), MMSE, D_Shannon.)

SLIDES 33–35

Can we do better than D_CE?

(Same plot as Slide 32: MSE vs sparsity, showing D_EC, D_CE, MMSE, D_Shannon.)

Idea: apply a linear transform L to the measurements before compression:

X → [Linear Transform H] → [AWGN] → Y → [L] → Ỹ → [Enc'] → nR bits → [Dec'] → reconstruction of Ỹ → [Est] → X̂

SLIDES 36–38

Result II

Theorem (achievability using LCE):

D(P_X, µ, R) ≤ M(P_X, T_θ µ),

where

T_θ(λ) = λ · (λ/(1+λ) − θ)₊ / (λ/(1+λ) + θλ)

and θ is the water level whose corresponding coding rate is

R_θ = (1/2) ∫₀^∞ log⁺[ λ / ((1+λ)θ) ] µ(dλ).

(Plot: original spectrum µ and the non-linearly transformed spectrum T_θ µ.)

Converse when the signal is Gaussian, P_X = N(0, σ_X²)!
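The water level θ matching a target rate can be found by bisection, since R_θ decreases in θ. A sketch assuming the slide's expressions read T_θ(λ) = λ(λ/(1+λ) − θ)₊ / (λ/(1+λ) + θλ) and R_θ = (1/2) ∫ log⁺(λ/((1+λ)θ)) µ(dλ), with base-2 logarithms so rates are in bits (all of this is my reconstruction; the function names are mine):

```python
import numpy as np

def R_of_theta(theta, lam, w):
    """Rate at water level theta: (1/2) * sum w * log2+( lam / ((1+lam)*theta) )."""
    pos = lam > 0  # atoms at zero contribute no rate
    x = lam[pos] / ((1.0 + lam[pos]) * theta)
    return 0.5 * float(np.sum(w[pos] * np.maximum(np.log2(x), 0.0)))

def T_theta(theta, lam):
    """Transformed spectrum: lam * (lam/(1+lam) - theta)_+ / (lam/(1+lam) + theta*lam)."""
    s = lam / (1.0 + lam)
    num = lam * np.maximum(s - theta, 0.0)
    den = s + theta * lam
    safe = np.where(den > 0, den, 1.0)     # guard the 0/0 case at lam = 0
    return np.where(den > 0, num / safe, 0.0)

def theta_for_rate(R, lam, w, lo=1e-9, hi=1.0, iters=100):
    """Bisection on theta; R_of_theta is decreasing in theta."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if R_of_theta(mid, lam, w) > R:
            lo = mid   # rate too high -> raise the water level
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example II spectrum: mu = (1 - rho) delta_0 + rho delta_gamma, rho = 0.5, gamma = 1
lam = np.array([0.0, 1.0])
w = np.array([0.5, 0.5])
theta = theta_for_rate(1.0, lam, w)
print(theta, T_theta(theta, lam))
```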

SLIDES 39–40

Distortion vs Sparsity

P_X: Bernoulli–Gauss, µ: Marchenko–Pastur law

(Plot, MSE vs sparsity: D_LCE (this talk) and the optimal curve, together with D_EC (previous), D_CE (this talk), MMSE, D_Shannon.)

SLIDE 41

Table of Contents (section divider): Summary

SLIDES 42–45

Summary

◮ Quantized CS / optimal quantization in the linear measurement model: the optimal tradeoff between MSE, P_X, the empirical spectral distribution, and the bitrate
◮ Compression using Gaussian codebooks and a right-orthogonally invariant sampling matrix: quantization ↔ MMSE with a transformed spectral distribution
◮ Compress-and-estimate (CE): scaling of the spectrum by a constant factor
◮ Linear-transformation CE (LCE): non-linear transformation of the spectrum according to the water-filling principle
◮ LCE is optimal when the signal is Gaussian

The End!

(Closing plot: original spectrum µ and the spectrum after quantization, T_θ µ.)