Sparse Regression Codes
Andrew Barron (Yale University) and Ramji Venkataramanan (University of Cambridge)
Joint work with Antony Joseph, Sanghee Cho, Cynthia Rush, Adam Greig, Tuhin Sarkar, Sekhar Tatikonda
ISIT 2016
Part III of the tutorial:
• SPARCs for Lossy Compression
• SPARCs for Multi-terminal Source and Channel Coding
• Open questions
(Joint work with Sekhar Tatikonda, Tuhin Sarkar, Adam Greig)
Lossy Compression
• Encode a source sequence S = (S_1, ..., S_n) at rate R nats/sample using a codebook of size e^{nR}, producing a reconstruction Ŝ = (Ŝ_1, ..., Ŝ_n).
• Distortion criterion: (1/n)∥S − Ŝ∥² = (1/n) ∑_k (S_k − Ŝ_k)²
• For an i.i.d. N(0, ν²) source, the minimum achievable distortion is ν² e^{−2R}.
• Can we achieve this with low-complexity codes? (Storage & computation)
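As a quick numeric illustration of the limit above, here is a minimal sketch (not part of the original slides; it assumes Python with NumPy, and the function name is ours) evaluating D(R) = ν² e^{−2R} with the rate supplied in nats per sample:

```python
import numpy as np

def gaussian_distortion_rate(R_nats, var=1.0):
    # Shannon distortion-rate function for an i.i.d. N(0, var) source,
    # with the rate R measured in nats per sample.
    return var * np.exp(-2.0 * R_nats)

# R = 1 nat/sample (about 1.44 bits/sample): distortion e^{-2} ~ 0.135
print(gaussian_distortion_rate(1.0))
# R = 2 bits/sample = 2*ln(2) nats/sample: distortion 2^{-4} = 0.0625
print(gaussian_distortion_rate(2 * np.log(2)))
```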
SPARC Construction
[Figure: design matrix A with n rows and ML columns, split into L sections of M columns each; β^T has a single nonzero entry per section, with values c_1, c_2, ..., c_L.]
• A has n rows and ML columns, with entries A_ij ∼ N(0, 1/n).
• Each codeword is Aβ, where β has exactly one nonzero entry per section (value c_i in section i).
Choosing M and L:
• For a rate R codebook, need M^L = e^{nR}.
• Choose M polynomial in n ⇒ L ∼ n/log n.
• Storage complexity ↔ size of A: polynomial in n.
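A minimal sketch of this construction (our illustration, assuming NumPy; the function names and toy parameters are not from the slides): build the Gaussian design matrix and form a codeword Aβ from one chosen column per section.

```python
import numpy as np

def sparc_design_matrix(n, M, L, rng):
    # n rows, M*L columns, entries i.i.d. N(0, 1/n), i.e. std dev 1/sqrt(n)
    return rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, M * L))

def sparc_codeword(A, M, sections, values):
    # beta has one nonzero per section: values[i] in column sections[i] of section i
    beta = np.zeros(A.shape[1])
    for i, (j, c) in enumerate(zip(sections, values)):
        beta[i * M + j] = c
    return A @ beta

# Toy parameters: rate R = L*log(M)/n nats per sample
rng = np.random.default_rng(0)
n, M, L = 100, 64, 12
A = sparc_design_matrix(n, M, L, rng)
print("rate (nats/sample):", L * np.log(M) / n)   # ~0.50
```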
Optimal Encoding
Minimum-distance encoding: β̂ = argmin_{β ∈ SPARC} ∥S − Aβ∥²
Theorem [Venkataramanan, Tatikonda '12, '14]: For a source S i.i.d. ∼ N(0, ν²), the sequence of rate R SPARCs with n, L, M = L^b with b > b*(R) satisfies
  P( (1/n)∥S − Aβ̂∥² > D ) < e^{−n(E*(R,D) + o(1))}.
SPARCs therefore achieve the optimal rate-distortion function with the optimal error exponent E*(R, D).
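For concreteness, a brute-force sketch of minimum-distance encoding (our illustration, assuming NumPy; the equal-valued coefficients are an illustrative placeholder, not the exact values used in the theorem's analysis). The search is over all M^L codewords, so it is only feasible for toy sizes:

```python
import itertools
import numpy as np

def min_distance_encode(S, A, M, L, values):
    # Exhaustive search over all M^L codewords; exponential in L,
    # so only feasible for toy sizes.
    best_cols, best_sq_err = None, np.inf
    for cols in itertools.product(range(M), repeat=L):
        x = sum(values[i] * A[:, i * M + cols[i]] for i in range(L))
        sq_err = np.sum((S - x) ** 2)
        if sq_err < best_sq_err:
            best_cols, best_sq_err = cols, sq_err
    return best_cols, best_sq_err / len(S)

rng = np.random.default_rng(1)
n, M, L = 24, 4, 3                       # 4^3 = 64 codewords in total
R = L * np.log(M) / n                    # ~0.17 nats/sample
A = rng.normal(0.0, 1.0 / np.sqrt(n), (n, M * L))
S = rng.normal(0.0, 1.0, n)
# Equal-valued coefficients with total per-sample codeword power 1 - e^{-2R}
# (an illustrative choice, not prescribed by the slides).
values = np.full(L, np.sqrt(n * (1.0 - np.exp(-2 * R)) / L))
cols, distortion = min_distance_encode(S, A, M, L, values)
print("chosen columns:", cols, " per-sample distortion:", distortion)
```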
Successive Cancellation Encoding
Step 1: Choose the column in section 1 that minimizes ∥S − c_1 A_j∥²
- equivalently, the max among the M inner products ⟨S, A_j⟩
- c_1 = √(2ν² log M)
- residual R_1 = S − c_1 Â_1  (Â_i denotes the chosen column in section i)
Step 2: Choose the column in section 2 that minimizes ∥R_1 − c_2 A_j∥²
- the max among the inner products ⟨R_1, A_j⟩
- c_2 = √(2(log M) ν² (1 − 2R/L))
- residual R_2 = R_1 − c_2 Â_2
Step L: Choose the column in section L that minimizes ∥R_{L−1} − c_L A_j∥²
- c_L = √(2(log M) ν² (1 − 2R/L)^{L−1})
- final residual R_L = R_{L−1} − c_L Â_L
- Final distortion = (1/n)∥R_L∥²
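Putting the L steps together, here is a sketch of the successive cancellation encoder described on these slides (our code, assuming NumPy; the toy parameters are far smaller than the regime the theory requires, so the resulting distortion sits visibly above the Shannon limit):

```python
import numpy as np

def sparc_greedy_encode(S, A, M, L, R, var):
    # Successive cancellation: in section i (i = 1, ..., L), pick the column
    # with the largest inner product with the current residual, subtract
    # c_i times that column, and move on to the next section.
    n = len(S)
    chosen = np.zeros(L, dtype=int)
    residual = S.astype(float).copy()
    stage_norms = np.zeros(L)                        # per-sample |R_i|^2, for inspection
    for i in range(L):                               # python index i = slide's i - 1
        c_i = np.sqrt(2.0 * var * np.log(M) * (1.0 - 2.0 * R / L) ** i)
        section = A[:, i * M:(i + 1) * M]
        j = int(np.argmax(section.T @ residual))     # max_j <R_{i-1}, A_j>
        chosen[i] = j
        residual -= c_i * section[:, j]              # R_i = R_{i-1} - c_i * A_j
        stage_norms[i] = np.sum(residual ** 2) / n
    return chosen, stage_norms

# Toy run: rate ~1 nat/sample
rng = np.random.default_rng(2)
L, M, n = 16, 512, 100
R = L * np.log(M) / n
A = rng.normal(0.0, 1.0 / np.sqrt(n), (n, M * L))
S = rng.normal(0.0, 1.0, n)
chosen, stage_norms = sparc_greedy_encode(S, A, M, L, R, var=1.0)
print("final distortion:", stage_norms[-1], " Shannon limit:", np.exp(-2 * R))
```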
Performance
Theorem [Venkataramanan, Sarkar, Tatikonda '13]: For an ergodic source S with mean 0 and variance ν², the encoding algorithm produces a codeword Aβ̂ that, for sufficiently large M and L, satisfies
  P( (1/n)∥S − Aβ̂∥² > ν² e^{−2R} + Δ ) < e^{−κn(Δ − c log log M / log M)}
• Deviation between the actual distortion and the optimal value is O(log log n / log n).
• Encoding complexity: ML inner products and comparisons ⇒ polynomial in n.
Numerical Experiment
[Plot: distortion vs. rate (bits/sample), 0 to 5 bits/sample, for a Gaussian source with mean 0 and variance 1; parameters M = L³, L ∈ [30, 100]. Two curves are shown: the distortion achieved by the SPARC encoder and the Shannon limit.]
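A scaled-down sketch of this experiment's setup (our code; it reuses sparc_greedy_encode and numpy as np from the previous sketch, and uses a much smaller L than the plot for speed):

```python
# Sweep a few rates with M = L^3, as in the experiment's setup.
rng = np.random.default_rng(3)
L = 12; M = L ** 3
for R_bits in [1.0, 2.0, 3.0, 4.0]:
    R = R_bits * np.log(2.0)                         # convert to nats/sample
    n = int(round(L * np.log(M) / R))
    A = rng.normal(0.0, 1.0 / np.sqrt(n), (n, M * L))
    S = rng.normal(0.0, 1.0, n)
    _, stage_norms = sparc_greedy_encode(S, A, M, L, R, var=1.0)
    print(f"{R_bits:.0f} bits/sample: distortion {stage_norms[-1]:.4f}, "
          f"limit {np.exp(-2 * R):.4f}")
```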
Why does the algorithm work?
Each section is a code of rate R/L (with L ∼ n/log n).
• Step 1: S → R_1 = S − c_1 Â_1, with (per-sample)
  |R_1|² ≈ ν² e^{−2R/L} ≈ ν² (1 − 2R/L)  for c_1 = √(2ν² log M)
• Step 2: treat R_1 as the 'source' → R_2 = R_1 − c_2 Â_2
• Step i: treat R_{i−1} as the 'source' → R_i = R_{i−1} − c_i Â_i
  With c_i² = 2(log M) ν² (1 − 2R/L)^{i−1}, i = 2, ..., L:
  |R_i|² ≈ |R_{i−1}|² (1 − 2R/L) ≈ ν² (1 − 2R/L)^i
• Final distortion: |R_L|² ≈ ν² (1 − 2R/L)^L ≤ ν² e^{−2R}
An L-stage successive refinement, with L ∼ n/log n.
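A two-line check of this final bound (our illustration; assumes NumPy): (1 − 2R/L)^L stays below e^{−2R} and approaches it as L grows, since 1 − x ≤ e^{−x}.

```python
import numpy as np

# The per-stage shrinkage compounds to the Shannon limit:
# (1 - 2R/L)^L <= e^{-2R}, approaching it as L grows.
R = 1.0                                              # nats per sample
for L in [10, 50, 250, 1250]:
    print(L, (1.0 - 2.0 * R / L) ** L, np.exp(-2.0 * R))
```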
Successive Refinement Interpretation
• The encoder successively refines the source over L ∼ n/log n stages.
• The deviations in each stage can be significant:
  |R_i|² = ν² (1 − 2R/L)^i (1 + Δ_i)²,  i = 0, ..., L,
  where ν² (1 − 2R/L)^i is the 'typical value'.
• KEY to the result: controlling the final deviation Δ_L.
• Recall: successive cancellation does not work for SPARC AWGN decoding.
Open Questions in SPARC Compression
• Better encoders with a smaller gap to D*(R)? Iterative soft-decision encoding, AMP?
• The AMP algorithm used for AWGN decoding doesn't work when directly applied to compression: it may need decimation, à la LDGM codes for compression.
• But recall: with minimum-distance encoding, SPARCs attain the rate-distortion function with the optimal error exponent.
• Compression performance with ±1 dictionaries
• Compression of finite-alphabet sources
Sparse Regression Codes for Multi-terminal Networks
Codes for Multi-terminal Problems
Key ingredients:
• Superposition (multiple-access channel, broadcast channel)
• Random binning (e.g., distributed compression, channel coding with side information, ...)
SPARC is based on superposition coding!