

  1. Sparse Regression Codes. Sekhar Tatikonda (Yale University), in collaboration with Ramji Venkataramanan (University of Cambridge), Antony Joseph (UC-Berkeley), and Tuhin Sarkar (IIT-Bombay). Information and Control in Networks, October 18, 2012

  2. Summary • Lossy coding is a fundamental component of networked control • Efficient codes for lossy Gaussian source coding • Based on sparse regression. Outline • Background • Sparse Regression Codes • Optimal Encoding • Practical Encoding • Multi-terminal Extensions • Conclusions

  3. Gaussian Data Compression
Source $S = S_1, \ldots, S_n$, i.i.d. Gaussian $\mathcal{N}(0, \sigma^2)$
Codebook of size $2^{nR}$, $R$ bits/sample; reconstruction $\hat{S} = \hat{S}_1, \ldots, \hat{S}_n$
MSE distortion: $\frac{1}{n}\|S - \hat{S}\|^2 \le D$
Possible iff $R > R^*(D) = \frac{1}{2}\log\frac{\sigma^2}{D}$
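A minimal numerical sketch of this rate-distortion trade-off (function names are illustrative; rates in nats, switch to log2 for bits):

```python
import math

def rate_distortion_gaussian(sigma2, D):
    """R*(D) = 0.5 * log(sigma^2 / D) for an i.i.d. N(0, sigma^2) source (nats)."""
    assert 0 < D <= sigma2
    return 0.5 * math.log(sigma2 / D)

def distortion_rate_gaussian(sigma2, R):
    """Inverse map: D*(R) = sigma^2 * exp(-2R), with R in nats."""
    return sigma2 * math.exp(-2.0 * R)

print(rate_distortion_gaussian(1.0, 0.25))   # ~0.693 nats = 1 bit per sample
print(distortion_rate_gaussian(1.0, 0.693))  # ~0.25
```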

  4. Achieving $R^*(D)$
Shannon random coding: $\{\hat{S}(1), \ldots, \hat{S}(2^{nR})\}$, each $\sim$ i.i.d. $\mathcal{N}(0, \sigma^2 - D)$
 - Exponential storage and encoding complexity
Lattice codes: compact representation
 - Conway-Sloane, Eyuboglu-Forney, Zamir-Shamai-Erez, ...
GOAL: compact representation + fast encoding and decoding

  5. Related Work
Sparse regression codes for source coding: [Kontoyiannis, Rad, Gitzenis ITW '10]
Computationally feasible constructions for finite-alphabet sources:
 - Gupta, Verdu, Weissman [ISIT '08]
 - Jalali, Weissman [ISIT '10]
 - Kontoyiannis, Gioran [ITW '10]
 - LDGM codes: Wainwright, Maneva, Martinian ['10]
 - Polar codes: Korada, Urbanke ['10]

  6. In this talk ...
Ensemble of codes based on sparse linear regression
 - For point-to-point and multi-terminal problems
Provably achieve rates close to the information-theoretic limits
 - with fast encoding and decoding
Based on the construction of Barron and Joseph for the AWGN channel
 - Achieves capacity with fast decoding [ISIT '10, arXiv '12]

  7. Outline • Background • Sparse Regression Codes • Optimal Encoding • Practical Encoding • Multi-terminal Extensions • Conclusions

  8. Sparse Regression Codes (SPARC)
$A$: $n \times ML$ design matrix or 'dictionary' with i.i.d. $\mathcal{N}(0,1)$ entries, organized into $L$ sections of $M$ columns each
$\beta$: sparse $ML \times 1$ vector with exactly one nonzero entry, equal to $c$, in each section, e.g. $\beta^T = (0, 0, c, 0, \ldots, c, 0, \ldots, c, 0, \ldots, 0)$
Codewords are of the form $A\beta$; codeword variance $= Lc^2$

  9. SPARC Construction
Choosing $M$ and $L$: for a rate-$R$ codebook, need $M^L = 2^{nR}$
Shannon codebook: $L = 1$, $M = 2^{nR}$
We choose $M = L^b \Rightarrow L \sim \Theta(n/\log n)$
Size of $A \sim n \times (n/\log n)^{b+1}$: polynomial in $n$
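A small helper (illustrative; rate in bits) that picks $L$ and $M$ from the relations $M^L = 2^{nR}$ and $M = L^b$, showing how the dictionary size stays polynomial in $n$:

```python
import math

def sparc_params(n, R, b):
    """Find L (and M = L**b) so that M**L ~ 2**(n*R), i.e. b * L * log2(L) ~ n * R."""
    L = 2
    while b * L * math.log2(L) < n * R:
        L += 1
    return L, L ** b

n, R, b = 1000, 0.5, 2
L, M = sparc_params(n, R, b)
print(L, M, n * M * L)   # L ~ n/log n; A has n*M*L = O(n * (n/log n)^(b+1)) entries
```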

  10. Minimum Distance Encoding
Encoder: find $\hat{\beta} = \arg\min_{\beta} \|S - A\beta\|$
Decoder: reconstruct $\hat{S} = A\hat{\beta}$
$P_n = P\left(\frac{1}{n}\|S - \hat{S}\|^2 > D\right)$
Error exponent: $T = -\limsup_n \frac{1}{n}\log P_n \;\Rightarrow\; P_n \sim e^{-nT}$
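A brute-force illustration of minimum-distance encoding by exhaustive search over all $M^L$ codewords, only feasible at toy sizes (the parameter values and names here are illustrative, not from the slides):

```python
import itertools
import numpy as np

def min_distance_encode(A, S, c, M):
    """Exhaustive search over all M^L SPARC codewords for the one closest to S.
    Exponential in L, which is exactly why low-complexity encoders are needed."""
    n, ML = A.shape
    L = ML // M
    best_d, best_idx = np.inf, None
    for idx in itertools.product(range(M), repeat=L):
        cols = [sec * M + j for sec, j in enumerate(idx)]
        x = c * A[:, cols].sum(axis=1)              # codeword A @ beta
        d = np.sum((S - x) ** 2) / n                # per-sample squared error
        if d < best_d:
            best_d, best_idx = d, idx
    return best_idx, best_d

rng = np.random.default_rng(1)
n, M, L, c = 16, 4, 3, 1.0
A = rng.standard_normal((n, M * L))                 # i.i.d. N(0,1) dictionary
S = rng.standard_normal(n)
print(min_distance_encode(A, S, c, M))
```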

  11. Outline • Background • Sparse Regression Codes • Optimal Encoding • Practical Encoding • Multi-terminal Extensions • Conclusions

  12. Correlated Codewords
Each codeword is the sum of $L$ columns, one from each section of $L^b$ columns
Codewords $\hat{S}(i)$, $\hat{S}(j)$ are dependent if they have common columns
Number of codewords dependent with $\hat{S}(i)$ = $M^L - 1 - (M-1)^L$
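The count above can be checked by enumeration at toy sizes; a quick sanity-check sketch (illustrative only):

```python
import itertools

def count_dependent_codewords(M, L):
    """Count codewords sharing at least one column with a fixed codeword
    (the all-zeros index choice), excluding that codeword itself."""
    fixed = (0,) * L
    return sum(
        1
        for idx in itertools.product(range(M), repeat=L)
        if idx != fixed and any(a == b for a, b in zip(idx, fixed))
    )

M, L = 4, 3
print(count_dependent_codewords(M, L), M**L - 1 - (M - 1)**L)  # both 36
```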

  13. Error Analysis for SPARC
$P(\mathcal{E}) \le \underbrace{P(|S|^2 \ge a^2)}_{\text{KL divergence}} + \underbrace{P(\mathcal{E} \mid |S|^2 < a^2)}_{?}$
Define $U_i(S) = 1$ if $|\hat{S}(i) - S|^2 < D$, and $0$ otherwise
$P(\mathcal{E}(S) \mid |S|^2 < a^2) = P\left(\sum_{i=1}^{2^{nR}} U_i(S) = 0 \,\middle|\, |S|^2 < a^2\right)$
The $\{U_i(S)\}$ are dependent

  14. Dependency Graph
For random variables $\{U_i\}_{i \in \mathcal{I}}$, a dependency graph is any graph with vertex set $\mathcal{I}$ such that: if $A$ and $B$ are two disjoint subsets of $\mathcal{I}$ with no edges between a vertex in $A$ and a vertex in $B$, then the families $\{U_i\}_{i \in A}$ and $\{U_i\}_{i \in B}$ are independent.

  15. For our problem ...
$U_i(S) = 1$ if $|\hat{S}(i) - S|^2 < D$, and $0$ otherwise, for $i = 1, \ldots, 2^{nR}$
For the family $\{U_i(S)\}$, the graph with edges $\{i \sim j : i \ne j$ and $\hat{S}(i), \hat{S}(j)$ share at least one common term$\}$ is a dependency graph.

  16. Suen's correlation inequality
Let $\{U_i\}_{i \in \mathcal{I}}$ be Bernoulli random variables with dependency graph $\Gamma$. Then
$P\left(\sum_{i \in \mathcal{I}} U_i = 0\right) \le \exp\left(-\min\left(\frac{\lambda}{2}, \frac{\lambda^2}{8\Delta}, \frac{\lambda}{6\delta}\right)\right)$
where
$\lambda = \sum_{i \in \mathcal{I}} E\,U_i$, $\quad \Delta = \frac{1}{2}\sum_{i \in \mathcal{I}}\sum_{j \sim i} E(U_i U_j)$, $\quad \delta = \max_{i \in \mathcal{I}} \sum_{k \sim i} E\,U_k$
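A direct transcription of the bound into code, as a purely illustrative helper (the toy numbers below are not from the slides):

```python
import math

def suen_upper_bound(lam, Delta, delta):
    """exp(-min(lambda/2, lambda^2/(8*Delta), lambda/(6*delta))) from Suen's inequality.
    Drops a term when its denominator is zero."""
    terms = [lam / 2.0]
    if Delta > 0:
        terms.append(lam ** 2 / (8.0 * Delta))
    if delta > 0:
        terms.append(lam / (6.0 * delta))
    return math.exp(-min(terms))

print(suen_upper_bound(lam=100.0, Delta=50.0, delta=2.0))  # toy numbers
```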

  17. Optimal Error Exponent for Gaussian Source [Ihara, Kubo '00]
Source variance $\sigma^2$; condition on $|S|^2 = a^2$; rate $R = \frac{1}{2}\log\frac{a^2}{D}$; $2^{nR}$ codewords i.i.d. $\mathcal{N}(0, a^2 - D)$
$P_n < \underbrace{P(|S|^2 \ge a^2)}_{\sim \exp(-n D(a^2 \| \sigma^2))} + P(|S|^2 < a^2)\cdot\underbrace{P(\text{error} \mid |S|^2 < a^2)}_{\downarrow\ \text{double-exponentially}}$

  18. Main Result
Theorem: SPARCs with minimum distance encoding achieve the rate-distortion function with the optimal error exponent when $b > \frac{3.5\,R}{R - (1 - 2^{-2R})}$. This is possible whenever $D/\sigma^2 < 0.203$.
Codebook representation polynomial in $n$: $n \times (n/\log n)^{b+1}$ elements

  19. Performance: Min-distance Encoding
[Plot: rate (bits) vs. $D/\sigma^2$, comparing the curves $0.5\log\sigma^2/D$ and $1 - D/\sigma^2$]

  20. Outline • Background • Sparse Regression Codes • Optimal Encoding • Practical Encoding • Multi-terminal Extensions • Conclusions

  21. SPARC Construction
$A$: $n$ rows, $ML$ columns, $L$ sections of $M$ columns each; $\beta^T$ has one nonzero entry per section, with value $c_\ell$ in section $\ell$
Choosing $M$ and $L$: for a rate-$R$ codebook, need $M^L = 2^{nR}$; choose $M$ polynomial in $n$ $\Rightarrow L \sim n/\log n$
Storage complexity $\leftrightarrow$ size of $A$: polynomial in $n$

  22. A Simple Encoding Algorithm
Step 1: Choose the column in Section 1 that minimizes $\|X - c_1 A_j\|^2$
 - Max among the inner products $\langle X, A_j\rangle$
 - 'Residue' $R_1 = X - c_1 \hat{A}_1$

  23. A Simple Encoding Algorithm
Step 2: Choose the column in Section 2 that minimizes $\|R_1 - c_2 A_j\|^2$
 - Max among the inner products $\langle R_1, A_j\rangle$
 - Residue $R_2 = R_1 - c_2 \hat{A}_2$

  24. A Simple Encoding Algorithm
Step $L$: Choose the column in Section $L$ that minimizes $\|R_{L-1} - c_L A_j\|^2$
 - Max among the inner products $\langle R_{L-1}, A_j\rangle$
 - Final residue $R_L = R_{L-1} - c_L \hat{A}_L$
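A compact sketch of this successive encoder. The geometric coefficient schedule $c_\ell$ in the demo is an assumption, chosen so the residual energy shrinks by roughly $(1 - 2R/L)$ per step; it is not necessarily the exact schedule analyzed in the talk, and all parameter values are illustrative:

```python
import numpy as np

def sparc_greedy_encode(A, X, c, M):
    """Successive encoding: in each section pick the column with the largest
    inner product with the current residue, then subtract c_l times that column."""
    n, ML = A.shape
    L = ML // M
    residue = X.astype(float).copy()
    chosen = []
    for l in range(L):
        section = A[:, l * M:(l + 1) * M]
        j = int(np.argmax(section.T @ residue))        # max <residue, A_j> over the section
        chosen.append(j)
        residue = residue - c[l] * section[:, j]
    return chosen, residue                             # chosen columns and final residue R_L

# Toy demo (rate R in nats per sample).
rng = np.random.default_rng(0)
L, b, R, sigma2 = 20, 2, 0.7, 1.0
M = L ** b
n = int(round(L * np.log(M) / R))                      # so that M**L ~ e**(n*R)
A = rng.standard_normal((n, M * L))
X = rng.normal(0.0, np.sqrt(sigma2), n)
ratio = 2.0 * np.log(M) / n                            # ~ 2R/L per-step energy reduction
c = [np.sqrt(ratio * sigma2) * (1.0 - ratio) ** (l / 2.0) for l in range(L)]
_, RL = sparc_greedy_encode(A, X, c, M)
print(np.sum(RL ** 2) / n, sigma2 * np.exp(-2.0 * R))  # achieved distortion vs. sigma^2 e^(-2R)
```

Each step is a single pass over one section of $M$ columns, so the whole encoder costs $ML$ inner products, which is the complexity claim made on the next slide.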

  25. Performance
Theorem (RV, Sarkar, Tatikonda '12): The proposed encoding algorithm approaches the rate-distortion function with exponentially small probability of error. In particular,
$P\left(\text{Distortion} > \sigma^2 e^{-2R} + \Delta\right) \le e^{-L\Delta}$ for $\Delta \ge \frac{1}{\log M}$.
Computational complexity: $ML$ inner products and comparisons $\Rightarrow$ polynomial in $n$
Storage complexity: design matrix $A$ is $n \times ML$ $\Rightarrow$ polynomial in $n$

  26. Outline • Background • Sparse Regression Codes • Optimal Encoding • Practical Encoding • Multi-terminal Extensions • Conclusions

  27. Point-to-point Communication
Message $M$ is encoded into $X$; the channel output is $Z = X + \text{Noise}$ with $\frac{\|X\|^2}{n} \le P$ and Noise $\sim \mathcal{N}(0, N)$; the decoder outputs $\hat{M}$
SPARCs are provably good with low-complexity decoding [Barron-Joseph, ISIT '10, '11, arXiv '12]

  28. SPARC Construction
$A$: $n$ rows, $ML$ columns, $L$ sections of $M$ columns; $\beta$ has one nonzero entry $c_\ell$ per section
$\beta \leftrightarrow$ message; codeword $A\beta$
For a rate-$R$ codebook, need $M^L = 2^{nR}$; choose $M$ polynomial in $n$ $\Rightarrow L \sim n/\log n$
Adaptive successive decoding achieves $R <$ Capacity
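The capacity referred to here is the standard AWGN limit $C = \frac{1}{2}\log(1 + P/N)$; a one-line illustrative helper:

```python
import math

def awgn_capacity(P, N):
    """AWGN channel capacity C = 0.5 * log(1 + P/N) (nats; use log2 for bits)."""
    return 0.5 * math.log(1.0 + P / N)

print(awgn_capacity(P=1.0, N=0.25))  # ~0.80 nats ~ 1.16 bits per channel use
```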

  29. Wyner-Ziv coding
The encoder compresses $X$ at rate $R$; the decoder observes side information $Y$ and produces $\hat{X}$
Side information: $Y = X + Z$, with $X \sim \mathcal{N}(0, \sigma^2)$, $Z \sim \mathcal{N}(0, N)$

  30. Wyner-Ziv coding
Encoder: quantize $X$ to $U = X + V$, $V \sim \mathcal{N}(0, Q)$, using a codebook of $2^{nR_1}$ codewords
 - Find the $U$ that minimizes $\|X - aU\|^2$, where $a = \frac{\sigma^2}{\sigma^2 + Q}$

  31. Wyner-Ziv coding
The $2^{nR_1}$ codewords for $U$ are partitioned into $2^{nR}$ bins; the encoder sends the index of the bin containing the chosen $U$

  32. Wyner-Ziv coding
Decoder: since $Y = X + Z$ can be rewritten as $Y = aU + Z'$, find the $U$ within the bin that minimizes $\|Y - aU\|^2$
 - Reconstruct $\hat{X} = E[X \mid U, Y]$
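Because $X$, $U$, $Y$ are jointly Gaussian in this model, $E[X \mid U, Y]$ is a linear estimate; a small sketch computing its coefficients from the covariances (variable names and numbers are illustrative):

```python
import numpy as np

def wyner_ziv_mmse_coefficients(sigma2, Q, N):
    """Coefficients (w_U, w_Y) such that E[X | U, Y] = w_U * U + w_Y * Y,
    for X ~ N(0, sigma2), U = X + V with V ~ N(0, Q), Y = X + Z with Z ~ N(0, N)."""
    C = np.array([[sigma2 + Q, sigma2],
                  [sigma2, sigma2 + N]])      # covariance of (U, Y)
    c_x = np.array([sigma2, sigma2])          # covariance of X with (U, Y)
    return np.linalg.solve(C, c_x)            # = C^{-1} c_x

sigma2, Q, N = 1.0, 0.25, 0.5
print(wyner_ziv_mmse_coefficients(sigma2, Q, N), sigma2 / (sigma2 + Q))  # weights and a
```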

  33. Binning with SPARCs
Quantize $X$ to $aU$ using an $n \times ML$ SPARC of rate $R_1$

  34. Binning with SPARCs
Quantize $X$ to $aU$ using an $n \times ML$ SPARC of rate $R_1$
Bins are defined by sub-dictionaries of $M'$ columns per section, so that $(M/M')^L = 2^{nR}$
