Sparse Regression Codes
Andrew Barron (Yale University) and Ramji Venkataramanan (University of Cambridge)
Joint work with Antony Joseph, Sanghee Cho, Cynthia Rush, Adam Greig, Tuhin Sarkar, Sekhar Tatikonda
ISIT 2016
Part III of the tutorial:
• SPARCs for Lossy Compression
• SPARCs for Multi-terminal Source and Channel Coding
• Open questions
(Joint work with Sekhar Tatikonda, Tuhin Sarkar, Adam Greig)
Lossy Compression
• Encode a source sequence S = (S_1, ..., S_n) at rate R nats/sample using a codebook of size e^{nR}, producing a reconstruction Ŝ = (Ŝ_1, ..., Ŝ_n).
• Distortion criterion: (1/n)∥S − Ŝ∥² = (1/n) ∑_k (S_k − Ŝ_k)²
• For an i.i.d. N(0, ν²) source, the minimum achievable distortion is ν² e^{−2R}.
• Can we achieve this with low-complexity codes? (Storage & computation)
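As a quick numeric illustration of the limit above, here is a minimal sketch (not part of the original slides; it assumes Python with NumPy, and the function name is ours) evaluating D(R) = ν² e^{−2R} with the rate supplied in nats per sample:

```python
import numpy as np

def gaussian_distortion_rate(R_nats, var=1.0):
    # Shannon distortion-rate function for an i.i.d. N(0, var) source,
    # with the rate R measured in nats per sample.
    return var * np.exp(-2.0 * R_nats)

# R = 1 nat/sample (about 1.44 bits/sample): distortion e^{-2} ~ 0.135
print(gaussian_distortion_rate(1.0))
# R = 2 bits/sample = 2*ln(2) nats/sample: distortion 2^{-4} = 0.0625
print(gaussian_distortion_rate(2 * np.log(2)))
```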
SPARC Construction
[Figure: design matrix A with n rows and ML columns, split into L sections of M columns each; β^T has a single nonzero entry per section, with values c_1, c_2, ..., c_L.]
• A has n rows and ML columns, with entries A_ij ∼ N(0, 1/n).
• Each codeword is Aβ, where β has exactly one nonzero entry per section (value c_i in section i).
Choosing M and L:
• For a rate R codebook, need M^L = e^{nR}.
• Choose M polynomial in n ⇒ L ∼ n/log n.
• Storage complexity ↔ size of A: polynomial in n.
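A minimal sketch of this construction (our illustration, assuming NumPy; the function names and toy parameters are not from the slides): build the Gaussian design matrix and form a codeword Aβ from one chosen column per section.

```python
import numpy as np

def sparc_design_matrix(n, M, L, rng):
    # n rows, M*L columns, entries i.i.d. N(0, 1/n), i.e. std dev 1/sqrt(n)
    return rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, M * L))

def sparc_codeword(A, M, sections, values):
    # beta has one nonzero per section: values[i] in column sections[i] of section i
    beta = np.zeros(A.shape[1])
    for i, (j, c) in enumerate(zip(sections, values)):
        beta[i * M + j] = c
    return A @ beta

# Toy parameters: rate R = L*log(M)/n nats per sample
rng = np.random.default_rng(0)
n, M, L = 100, 64, 12
A = sparc_design_matrix(n, M, L, rng)
print("rate (nats/sample):", L * np.log(M) / n)   # ~0.50
```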
Optimal Encoding
Minimum-distance encoding: β̂ = argmin_{β ∈ SPARC} ∥S − Aβ∥²
Theorem [Venkataramanan, Tatikonda '12, '14]: For a source S i.i.d. ∼ N(0, ν²), the sequence of rate R SPARCs with n, L, M = L^b with b > b*(R) satisfies
  P( (1/n)∥S − Aβ̂∥² > D ) < e^{−n(E*(R,D) + o(1))}.
SPARCs therefore achieve the optimal rate-distortion function with the optimal error exponent E*(R, D).
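For concreteness, a brute-force sketch of minimum-distance encoding (our illustration, assuming NumPy; the equal-valued coefficients are an illustrative placeholder, not the exact values used in the theorem's analysis). The search is over all M^L codewords, so it is only feasible for toy sizes:

```python
import itertools
import numpy as np

def min_distance_encode(S, A, M, L, values):
    # Exhaustive search over all M^L codewords; exponential in L,
    # so only feasible for toy sizes.
    best_cols, best_sq_err = None, np.inf
    for cols in itertools.product(range(M), repeat=L):
        x = sum(values[i] * A[:, i * M + cols[i]] for i in range(L))
        sq_err = np.sum((S - x) ** 2)
        if sq_err < best_sq_err:
            best_cols, best_sq_err = cols, sq_err
    return best_cols, best_sq_err / len(S)

rng = np.random.default_rng(1)
n, M, L = 24, 4, 3                       # 4^3 = 64 codewords in total
R = L * np.log(M) / n                    # ~0.17 nats/sample
A = rng.normal(0.0, 1.0 / np.sqrt(n), (n, M * L))
S = rng.normal(0.0, 1.0, n)
# Equal-valued coefficients with total per-sample codeword power 1 - e^{-2R}
# (an illustrative choice, not prescribed by the slides).
values = np.full(L, np.sqrt(n * (1.0 - np.exp(-2 * R)) / L))
cols, distortion = min_distance_encode(S, A, M, L, values)
print("chosen columns:", cols, " per-sample distortion:", distortion)
```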
Successive Cancellation Encoding
Step 1: Choose the column in section 1 that minimizes ∥S − c_1 A_j∥²
- equivalently, the max among the M inner products ⟨S, A_j⟩
- c_1 = √(2ν² log M)
- residual R_1 = S − c_1 Â_1  (Â_i denotes the chosen column in section i)
Step 2: Choose the column in section 2 that minimizes ∥R_1 − c_2 A_j∥²
- the max among the inner products ⟨R_1, A_j⟩
- c_2 = √(2(log M) ν² (1 − 2R/L))
- residual R_2 = R_1 − c_2 Â_2
Step L: Choose the column in section L that minimizes ∥R_{L−1} − c_L A_j∥²
- c_L = √(2(log M) ν² (1 − 2R/L)^{L−1})
- final residual R_L = R_{L−1} − c_L Â_L
- Final distortion = (1/n)∥R_L∥²
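Putting the L steps together, here is a sketch of the successive cancellation encoder described on these slides (our code, assuming NumPy; the toy parameters are far smaller than the regime the theory requires, so the resulting distortion sits visibly above the Shannon limit):

```python
import numpy as np

def sparc_greedy_encode(S, A, M, L, R, var):
    # Successive cancellation: in section i (i = 1, ..., L), pick the column
    # with the largest inner product with the current residual, subtract
    # c_i times that column, and move on to the next section.
    n = len(S)
    chosen = np.zeros(L, dtype=int)
    residual = S.astype(float).copy()
    stage_norms = np.zeros(L)                        # per-sample |R_i|^2, for inspection
    for i in range(L):                               # python index i = slide's i - 1
        c_i = np.sqrt(2.0 * var * np.log(M) * (1.0 - 2.0 * R / L) ** i)
        section = A[:, i * M:(i + 1) * M]
        j = int(np.argmax(section.T @ residual))     # max_j <R_{i-1}, A_j>
        chosen[i] = j
        residual -= c_i * section[:, j]              # R_i = R_{i-1} - c_i * A_j
        stage_norms[i] = np.sum(residual ** 2) / n
    return chosen, stage_norms

# Toy run: rate ~1 nat/sample
rng = np.random.default_rng(2)
L, M, n = 16, 512, 100
R = L * np.log(M) / n
A = rng.normal(0.0, 1.0 / np.sqrt(n), (n, M * L))
S = rng.normal(0.0, 1.0, n)
chosen, stage_norms = sparc_greedy_encode(S, A, M, L, R, var=1.0)
print("final distortion:", stage_norms[-1], " Shannon limit:", np.exp(-2 * R))
```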
Performance
Theorem [Venkataramanan, Sarkar, Tatikonda '13]: For an ergodic source S with mean 0 and variance ν², the encoding algorithm produces a codeword Aβ̂ that, for sufficiently large M and L, satisfies
  P( (1/n)∥S − Aβ̂∥² > ν² e^{−2R} + Δ ) < e^{−κn(Δ − c log log M / log M)}
• Deviation between the actual distortion and the optimal value is O(log log n / log n).
• Encoding complexity: ML inner products and comparisons ⇒ polynomial in n.
Numerical Experiment
[Plot: distortion vs. rate (bits/sample), 0 to 5 bits/sample, for a Gaussian source with mean 0 and variance 1; parameters M = L³, L ∈ [30, 100]. Two curves are shown: the distortion achieved by the SPARC encoder and the Shannon limit.]
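A scaled-down sketch of this experiment's setup (our code; it reuses sparc_greedy_encode and numpy as np from the previous sketch, and uses a much smaller L than the plot for speed):

```python
# Sweep a few rates with M = L^3, as in the experiment's setup.
rng = np.random.default_rng(3)
L = 12; M = L ** 3
for R_bits in [1.0, 2.0, 3.0, 4.0]:
    R = R_bits * np.log(2.0)                         # convert to nats/sample
    n = int(round(L * np.log(M) / R))
    A = rng.normal(0.0, 1.0 / np.sqrt(n), (n, M * L))
    S = rng.normal(0.0, 1.0, n)
    _, stage_norms = sparc_greedy_encode(S, A, M, L, R, var=1.0)
    print(f"{R_bits:.0f} bits/sample: distortion {stage_norms[-1]:.4f}, "
          f"limit {np.exp(-2 * R):.4f}")
```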
Why does the algorithm work?
Each section is a code of rate R/L (with L ∼ n/log n).
• Step 1: S → R_1 = S − c_1 Â_1, with (per-sample)
  |R_1|² ≈ ν² e^{−2R/L} ≈ ν² (1 − 2R/L)  for c_1 = √(2ν² log M)
• Step 2: treat R_1 as the 'source' → R_2 = R_1 − c_2 Â_2
• Step i: treat R_{i−1} as the 'source' → R_i = R_{i−1} − c_i Â_i
  With c_i² = 2(log M) ν² (1 − 2R/L)^{i−1}, i = 2, ..., L:
  |R_i|² ≈ |R_{i−1}|² (1 − 2R/L) ≈ ν² (1 − 2R/L)^i
• Final distortion: |R_L|² ≈ ν² (1 − 2R/L)^L ≤ ν² e^{−2R}
An L-stage successive refinement, with L ∼ n/log n.
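A two-line check of this final bound (our illustration; assumes NumPy): (1 − 2R/L)^L stays below e^{−2R} and approaches it as L grows, since 1 − x ≤ e^{−x}.

```python
import numpy as np

# The per-stage shrinkage compounds to the Shannon limit:
# (1 - 2R/L)^L <= e^{-2R}, approaching it as L grows.
R = 1.0                                              # nats per sample
for L in [10, 50, 250, 1250]:
    print(L, (1.0 - 2.0 * R / L) ** L, np.exp(-2.0 * R))
```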
Successive Refinement Interpretation
• The encoder successively refines the source over L ∼ n/log n stages.
• The deviations in each stage can be significant:
  |R_i|² = ν² (1 − 2R/L)^i (1 + Δ_i)²,  i = 0, ..., L,
  where ν² (1 − 2R/L)^i is the 'typical value'.
• KEY to the result: controlling the final deviation Δ_L.
• Recall: successive cancellation does not work for SPARC AWGN decoding.
Open Questions in SPARC Compression
• Better encoders with a smaller gap to D*(R)? Iterative soft-decision encoding, AMP?
• The AMP algorithm used for AWGN decoding doesn't work when directly applied to compression: it may need decimation, à la LDGM codes for compression.
• But recall: with minimum-distance encoding, SPARCs attain the rate-distortion function with the optimal error exponent.
• Compression performance with ±1 dictionaries
• Compression of finite-alphabet sources
Sparse Regression Codes for Multi-terminal Networks
Codes for Multi-terminal Problems
Key ingredients:
• Superposition (multiple-access channel, broadcast channel)
• Random binning (e.g., distributed compression, channel coding with side information, ...)
SPARC is based on superposition coding!