
Sparse Regression Codes: Andrew Barron and Ramji Venkataramanan (Yale University / University of Cambridge), PowerPoint PPT Presentation



  1. Sparse Regression Codes
  Andrew Barron (Yale University) and Ramji Venkataramanan (University of Cambridge)
  Joint work with Antony Joseph, Sanghee Cho, Cynthia Rush, Adam Greig, Tuhin Sarkar, Sekhar Tatikonda
  ISIT 2016

  2. Part III of the tutorial:
  • SPARCs for Lossy Compression
  • SPARCs for Multi-terminal Source and Channel Coding
  • Open questions
  (Joint work with Sekhar Tatikonda, Tuhin Sarkar, Adam Greig)

  3. Lossy Compression
  Source $S = S_1, \ldots, S_n$ is encoded at $R$ nats/sample using a codebook of $e^{nR}$ codewords, producing a reconstruction $\hat{S} = \hat{S}_1, \ldots, \hat{S}_n$.
  • Distortion criterion: $\frac{1}{n}\|S - \hat{S}\|^2 = \frac{1}{n}\sum_k (S_k - \hat{S}_k)^2$
  • For an i.i.d. $N(0, \nu^2)$ source, the minimum distortion is $\nu^2 e^{-2R}$.

  4. Lossy Compression
  Source $S = S_1, \ldots, S_n$ is encoded at $R$ nats/sample using a codebook of $e^{nR}$ codewords, producing a reconstruction $\hat{S} = \hat{S}_1, \ldots, \hat{S}_n$.
  • Distortion criterion: $\frac{1}{n}\|S - \hat{S}\|^2 = \frac{1}{n}\sum_k (S_k - \hat{S}_k)^2$
  • For an i.i.d. $N(0, \nu^2)$ source, the minimum distortion is $\nu^2 e^{-2R}$.
  • Can we achieve this with low-complexity codes? (Storage & Computation)
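To see how far a naive scheme sits from the limit quoted above, here is a small numerical check (Python with NumPy; the scalar quantizer, the seed and the parameter values are illustrative assumptions, not anything from the slides). It measures the per-sample squared error of the best 1-bit scalar quantizer for a Gaussian source and compares it with $\nu^2 e^{-2R}$ at the same rate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, nu2 = 100_000, 1.0                 # block length and source variance (illustrative)
R = np.log(2)                         # rate: 1 bit/sample, expressed in nats

S = rng.normal(0.0, np.sqrt(nu2), n)  # i.i.d. N(0, nu^2) source

# Baseline: the best 1-bit (rate log 2 nats) scalar quantizer for a Gaussian source,
# which maps each sample to +/- nu*sqrt(2/pi).
c = np.sqrt(nu2 * 2 / np.pi)
S_hat = np.where(S >= 0, c, -c)

# Distortion criterion from the slide: (1/n) * sum_k (S_k - S_hat_k)^2
distortion = np.mean((S - S_hat) ** 2)        # ~ nu^2 * (1 - 2/pi) ~ 0.36
shannon_limit = nu2 * np.exp(-2 * R)          # nu^2 e^{-2R} = 0.25 at 1 bit/sample

print(f"1-bit scalar quantizer distortion: {distortion:.3f}")
print(f"Shannon limit nu^2 e^(-2R):        {shannon_limit:.3f}")
```

Closing this gap with feasible storage and computation is exactly the question the rest of the slides address.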

  5. SPARC Construction
  Design matrix $A$: $n$ rows and $ML$ columns, partitioned into $L$ sections of $M$ columns each, with entries $A_{ij} \sim N(0, 1/n)$.
  Coefficient vector $\beta^T = (0, \ldots, c_1, \ldots, 0 \mid 0, \ldots, c_2, \ldots, 0 \mid \cdots \mid c_L, 0, \ldots, 0)$: one non-zero entry in each section. Codeword: $A\beta$.

  6. SPARC Construction
  Design matrix $A$: $n$ rows and $ML$ columns, partitioned into $L$ sections of $M$ columns each, with entries $A_{ij} \sim N(0, 1/n)$.
  Coefficient vector $\beta^T = (0, \ldots, c_1, \ldots, 0 \mid 0, \ldots, c_2, \ldots, 0 \mid \cdots \mid c_L, 0, \ldots, 0)$: one non-zero entry in each section. Codeword: $A\beta$.
  Choosing $M$ and $L$:
  • For a rate-$R$ codebook, we need $M^L = e^{nR}$.
  • Choose $M$ polynomial in $n$ $\Rightarrow$ $L \sim n / \log n$.
  • Storage complexity $\leftrightarrow$ size of $A$: polynomial in $n$.
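As a concrete illustration of this construction, the sketch below (Python with NumPy) builds a design matrix with $N(0, 1/n)$ entries and forms a codeword $A\beta$ from one chosen column per section. The helper names (`sparc_design`, `sparc_codeword`), the toy sizes and the placeholder coefficient values are assumptions made for the example, not the authors' code.

```python
import numpy as np

def sparc_design(n: int, M: int, L: int, rng) -> np.ndarray:
    """Design matrix with n rows and M*L columns, entries i.i.d. N(0, 1/n)."""
    return rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, M * L))

def sparc_codeword(A: np.ndarray, cols, coeffs, M: int) -> np.ndarray:
    """Codeword A @ beta, where beta has value coeffs[i] at column cols[i] of section i."""
    n, ML = A.shape
    L = ML // M
    beta = np.zeros(ML)
    for i in range(L):
        beta[i * M + cols[i]] = coeffs[i]   # exactly one non-zero entry per section
    return A @ beta

# Toy example: the rate is R = (L log M) / n nats/sample.
rng = np.random.default_rng(0)
M, L, n = 64, 16, 200
R = L * np.log(M) / n
A = sparc_design(n, M, L, rng)
cols = rng.integers(0, M, size=L)      # arbitrary message: one column index per section
x = sparc_codeword(A, cols, np.ones(L), M)   # placeholder coefficients; later slides fix c_i
print(f"rate R = {R:.3f} nats/sample, codeword length {x.shape[0]}")
```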

  7. Optimal Encoding
  Minimum-distance encoding: $\hat{\beta} = \arg\min_{\beta \in \mathrm{SPARC}} \|S - A\beta\|^2$
  Theorem [Venkataramanan, Tatikonda '12, '14]: For a source $S$ i.i.d. $\sim N(0, \nu^2)$ and the sequence of rate-$R$ SPARCs with $n$, $L$, $M = L^b$ with $b > b^*(R)$:
  $P\left(\frac{1}{n}\|S - A\hat{\beta}\|^2 > D\right) < e^{-n(E^*(R,D) + o(1))}$
  This achieves the optimal rate-distortion function with the optimal error exponent $E^*(R, D)$.
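Minimum-distance encoding requires a search over all $M^L$ codewords, so it is only runnable at toy sizes; the following brute-force sketch just makes the $\arg\min$ above explicit (Python with NumPy; the equal-energy coefficient values, the seed and all sizes are illustrative assumptions, not the parameters of the theorem).

```python
import itertools
import numpy as np

# Brute-force minimum-distance SPARC encoder for toy parameters (M^L = 512 codewords).
rng = np.random.default_rng(1)
n, M, L, nu2 = 20, 8, 3, 1.0
c = np.full(L, np.sqrt(nu2 * n / L))            # illustrative equal-energy coefficients

A = rng.normal(0.0, 1.0 / np.sqrt(n), (n, M * L))
S = rng.normal(0.0, np.sqrt(nu2), n)

best_cols, best_err = None, np.inf
for cols in itertools.product(range(M), repeat=L):   # one column index per section
    beta = np.zeros(M * L)
    for i, j in enumerate(cols):
        beta[i * M + j] = c[i]
    err = np.sum((S - A @ beta) ** 2)
    if err < best_err:
        best_cols, best_err = cols, err

print(f"chosen columns per section: {best_cols}")
print(f"distortion (1/n)||S - A beta||^2 = {best_err / n:.3f}")
```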

  8. Successive Cancellation Encoding
  Step 1: Choose the column in section 1 that minimizes $\|S - c_1 A_j\|^2$
  – equivalently, the maximum among the $M$ inner products $\langle S, A_j \rangle$

  9. Successive Cancellation Encoding
  Step 1: Choose the column in section 1 that minimizes $\|S - c_1 A_j\|^2$
  – equivalently, the maximum among the $M$ inner products $\langle S, A_j \rangle$
  – $c_1 = \sqrt{2\nu^2 \log M}$
  – residual $R_1 = S - c_1 \hat{A}_1$

  10. Successive Cancellation Encoding
  Step 2: Choose the column in section 2 that minimizes $\|R_1 - c_2 A_j\|^2$
  – equivalently, the maximum among the inner products $\langle R_1, A_j \rangle$
  – $c_2 = \sqrt{2\nu^2 (\log M)\left(1 - \frac{2R}{L}\right)}$
  – residual $R_2 = R_1 - c_2 \hat{A}_2$

  11. Successive Cancellation Encoding
  Step L: Choose the column in section $L$ that minimizes $\|R_{L-1} - c_L A_j\|^2$
  – $c_L = \sqrt{2\nu^2 (\log M)\left(1 - \frac{2R}{L}\right)^{L-1}}$
  – Final residual $R_L = R_{L-1} - c_L \hat{A}_L$
  – Final distortion $= \frac{1}{n}\|R_L\|^2$
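Putting the $L$ steps together, a minimal successive cancellation encoder might look like the sketch below (Python with NumPy), using the deterministic coefficients $c_i = \sqrt{2\nu^2(\log M)\,(1 - 2R/L)^{i-1}}$ from these slides. The function name, the seed and the toy parameters are assumptions for illustration; at these modest sizes the measured distortion sits above the Shannon limit, consistent with the deviation term discussed on the later slides.

```python
import numpy as np

def sparc_sc_encode(S, A, M, L, R, nu2):
    """Successive cancellation SPARC encoder: returns chosen columns and final residual."""
    residual = S.copy()
    cols = []
    for i in range(L):
        # Coefficient for section i+1 (slide notation: c_{i+1}), i = 0, ..., L-1:
        # c = sqrt(2 nu^2 log(M) (1 - 2R/L)^i)
        c_i = np.sqrt(2 * nu2 * np.log(M) * (1 - 2 * R / L) ** i)
        section = A[:, i * M:(i + 1) * M]
        j = int(np.argmax(section.T @ residual))   # max inner product <residual, A_j>
        cols.append(j)
        residual = residual - c_i * section[:, j]  # R_i = R_{i-1} - c_i * A_hat_i
    return cols, residual

# Toy run: distortion (1/n)||R_L||^2 approaches nu^2 e^{-2R} as n, M, L grow.
rng = np.random.default_rng(0)
M, L, nu2 = 512, 40, 1.0
n = int(L * np.log(M) / 1.0)            # block length chosen so that R is about 1 nat/sample
R = L * np.log(M) / n
A = rng.normal(0.0, 1.0 / np.sqrt(n), (n, M * L))
S = rng.normal(0.0, np.sqrt(nu2), n)
cols, R_L = sparc_sc_encode(S, A, M, L, R, nu2)
print(f"R = {R:.2f} nats, distortion = {np.mean(R_L**2):.3f}, limit = {nu2*np.exp(-2*R):.3f}")
```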

  12. Performance
  Theorem [Venkataramanan, Sarkar, Tatikonda '13]: For an ergodic source $S$ with mean 0 and variance $\nu^2$, the encoding algorithm produces a codeword $A\hat{\beta}$ that satisfies the following for sufficiently large $M$, $L$:
  $P\left(\frac{1}{n}\|S - A\hat{\beta}\|^2 > \nu^2 e^{-2R} + \Delta\right) < e^{-\kappa n \left(\Delta - \frac{c \log\log M}{\log M}\right)}$
  The deviation between the actual distortion and the optimal value is $O\left(\frac{\log\log n}{\log n}\right)$.

  13. Performance
  Theorem [Venkataramanan, Sarkar, Tatikonda '13]: For an ergodic source $S$ with mean 0 and variance $\nu^2$, the encoding algorithm produces a codeword $A\hat{\beta}$ that satisfies the following for sufficiently large $M$, $L$:
  $P\left(\frac{1}{n}\|S - A\hat{\beta}\|^2 > \nu^2 e^{-2R} + \Delta\right) < e^{-\kappa n \left(\Delta - \frac{c \log\log M}{\log M}\right)}$
  The deviation between the actual distortion and the optimal value is $O\left(\frac{\log\log n}{\log n}\right)$.
  Encoding complexity: $ML$ inner products and comparisons $\Rightarrow$ polynomial in $n$.

  14. Numerical Experiment
  [Figure: distortion vs. rate (bits/sample) for a Gaussian source with mean 0 and variance 1; SPARC parameters $M = L^3$, $L \in [30, 100]$. The SPARC curve is plotted against the Shannon limit.]
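A rough way to reproduce this style of experiment (not the authors' code; the rate grid, $L = 20$, the seed and the plotting choices are assumptions) is to fix $M = L^3$, sweep the rate by varying the block length $n$, run the successive cancellation encoder, and plot the measured distortion against the Shannon limit $\nu^2\, 2^{-2R}$ with the rate in bits/sample. Drawing each section of $A$ on the fly keeps memory small and is statistically equivalent to drawing the whole matrix up front.

```python
import numpy as np
import matplotlib.pyplot as plt

def sc_encode_distortion(S, M, L, R, nu2, rng):
    """Successive cancellation encoding; sections of A (entries N(0,1/n)) drawn on the fly."""
    n = S.shape[0]
    res = S.copy()
    for i in range(L):
        c_i = np.sqrt(2 * nu2 * np.log(M) * (1 - 2 * R / L) ** i)
        sec = rng.normal(0.0, 1.0 / np.sqrt(n), (n, M))   # section i of the design matrix
        j = int(np.argmax(sec.T @ res))                    # max inner product in this section
        res = res - c_i * sec[:, j]
    return np.mean(res ** 2)                               # (1/n) ||R_L||^2

rng = np.random.default_rng(0)
nu2, L = 1.0, 20                        # smaller than the slide's L in [30, 100], for speed
M = L ** 3                              # M = L^3 as in the experiment
rates_bits = [0.5, 1.0, 1.5, 2.0, 3.0]  # illustrative rate grid (bits/sample)

dist = []
for Rb in rates_bits:
    R = Rb * np.log(2)                          # rate in nats/sample
    n = int(round(L * np.log(M) / R))           # block length that gives rate R
    S = rng.normal(0.0, np.sqrt(nu2), n)
    dist.append(sc_encode_distortion(S, M, L, R, nu2, rng))

r = np.linspace(0.25, 3.0, 100)
plt.plot(rates_bits, dist, "o-", label="SPARC (successive cancellation)")
plt.plot(r, nu2 * 2.0 ** (-2 * r), label="Shannon limit")
plt.xlabel("Rate (bits/sample)"); plt.ylabel("Distortion"); plt.legend(); plt.show()
```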

  15. Why does the algorithm work?
  Each section is a code of rate $R/L$ (with $L \sim \frac{n}{\log n}$).
  • Step 1: $S \rightarrow R_1 = S - c_1 \hat{A}_1$
  $|R_1|^2 \approx \nu^2 e^{-2R/L} \approx \nu^2\left(1 - \frac{2R}{L}\right)$ for $c_1 = \sqrt{2\nu^2 \log M}$

  16. Why does the algorithm work?
  Each section is a code of rate $R/L$ (with $L \sim \frac{n}{\log n}$).
  • Step 1: $S \rightarrow R_1 = S - c_1 \hat{A}_1$
  $|R_1|^2 \approx \nu^2 e^{-2R/L} \approx \nu^2\left(1 - \frac{2R}{L}\right)$ for $c_1 = \sqrt{2\nu^2 \log M}$
  • Step 2: 'Source' $R_1 \rightarrow R_2 = R_1 - c_2 \hat{A}_2$

  17. Why does the algorithm work?
  Each section is a code of rate $R/L$ (with $L \sim \frac{n}{\log n}$).
  • Step i: 'Source' $R_{i-1} \rightarrow R_i = R_{i-1} - c_i \hat{A}_i$
  With $c_i = \sqrt{2\nu^2 (\log M)\left(1 - \frac{2R}{L}\right)^{i-1}}$, $i = 2, \ldots, L$:
  $|R_i|^2 \approx \left(1 - \frac{2R}{L}\right)|R_{i-1}|^2 \approx \nu^2\left(1 - \frac{2R}{L}\right)^i$

  18. Why does the algorithm work?
  Each section is a code of rate $R/L$ (with $L \sim \frac{n}{\log n}$).
  Final distortion: $|R_L|^2 \approx \nu^2\left(1 - \frac{2R}{L}\right)^L \leq \nu^2 e^{-2R}$
  An $L$-stage successive refinement, with $L \sim n/\log n$.
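The inequality on this slide is just the elementary bound $1 - x \le e^{-x}$ applied with $x = 2R/L$; a short derivation (added here for completeness, not on the slides) is:

```latex
% Each of the L stages shrinks the residual energy by a factor (1 - 2R/L):
\[
  |R_L|^2 \;\approx\; \nu^2 \prod_{i=1}^{L}\Bigl(1 - \tfrac{2R}{L}\Bigr)
         \;=\; \nu^2 \Bigl(1 - \tfrac{2R}{L}\Bigr)^{L}
         \;\le\; \nu^2 \bigl(e^{-2R/L}\bigr)^{L}
         \;=\; \nu^2 e^{-2R},
\]
% with near-equality for large L, since (1 - 2R/L)^L -> e^{-2R} as L -> infinity
% (and L ~ n / log n grows with the block length n).
```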

  19. Successive Refinement Interpretation
  • The encoder successively refines the source over $\sim n/\log n$ stages.
  • The deviations in each stage can be significant:
  $|R_i|^2 = \underbrace{\nu^2\left(1 - \frac{2R}{L}\right)^i}_{\text{'typical value'}} (1 + \Delta_i)^2, \quad i = 0, \ldots, L$
  • KEY to the result: controlling the final deviation $\Delta_L$.
  • Recall: successive cancellation does not work for SPARC AWGN decoding.

  20. Open Questions in SPARC Compression
  • Better encoders with a smaller gap to $D^*(R)$? Iterative soft-decision encoding, AMP?
  • The AMP used for AWGN decoding doesn't work when directly applied to compression: may need decimation, a la LDGM codes for compression.
  • But recall: with minimum-distance encoding, SPARCs attain the rate-distortion function with the optimal error exponent.
  • Compression performance with $\pm 1$ dictionaries
  • Compression of finite-alphabet sources

  21. Sparse Regression Codes for multi-terminal networks

  22. Codes for multi-terminal problems
  Key ingredients:
  • Superposition (multiple-access channel, broadcast channel)
  • Random binning (e.g., distributed compression, channel coding with side-information, ...)
  SPARC is based on superposition coding!
