Sparse Regression Codes
Andrew Barron (Yale University) and Ramji Venkataramanan (University of Cambridge)
Joint work with Antony Joseph, Sanghee Cho, Cynthia Rush, Adam Greig, Tuhin Sarkar, Sekhar Tatikonda
ISIT 2016
Outline of Tutorial
Sparse Superposition Codes or Sparse Regression Codes (SPARCs) for:
1. Provably practical and reliable communication over the AWGN channel at rates approaching capacity
2. Efficient lossy compression at rates approaching the Shannon limit
3. Multi-terminal communication and compression models
4. Open Questions
Part I: Communication over the AWGN Channel
Quest for Provably Practical and Reliable High-Rate Communication
• The Channel Communication Problem
• Gaussian Channel
• History of Methods
• Sparse Superposition Coding
• Three efficient decoders:
  1. Adaptive successive threshold decoder
  2. Adaptive successive soft-decision decoder
  3. Approximate Message Passing (AMP) decoder
• Rate, Reliability, and Computational Complexity
• Distributional Analysis
• Simulations
The Additive White Gaussian Noise Channel

[Block diagram: U_1, ..., U_K → Transmitter → x_1, ..., x_n → channel adds noise ε_1, ..., ε_n → y_1, ..., y_n → Receiver → Û_1, ..., Û_K]

For i = 1, ..., n: y_i = x_i + ε_i, with
• Average power constraint: (1/n) Σ_i x_i² ≤ P
• Additive Gaussian noise: ε_i iid ~ N(0, σ²)
• Rate: R = K/n
• Capacity: C = (1/2) log(1 + snr)
• Reliability: want small Prob{Û ≠ U}, or a reliably small fraction of errors, for R approaching C
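A minimal numerical sketch of this channel model (not from the slides; the power P and noise variance σ² below are arbitrary example values, and capacity is computed in bits using log₂):

```python
import numpy as np

rng = np.random.default_rng(0)

P, sigma2 = 15.0, 1.0                 # example power constraint and noise variance
snr = P / sigma2
C = 0.5 * np.log2(1 + snr)            # capacity in bits per channel use

n = 10_000
x = rng.standard_normal(n) * np.sqrt(P)            # symbols meeting the power constraint on average
y = x + np.sqrt(sigma2) * rng.standard_normal(n)   # y_i = x_i + eps_i, eps_i ~ N(0, sigma^2)

print(f"snr = {snr:.1f}, C = {C:.3f} bits/channel use")
print(f"empirical power = {np.mean(x**2):.2f} (constraint P = {P})")
```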
Capacity-achieving codes
For many binary/discrete alphabet channels:
• Turbo and sparse-graph (LDPC) codes achieve rates close to capacity with efficient message-passing decoding
• Theoretical results for spatially-coupled LDPC codes [Kudekar, Richardson, Urbanke '12, '13], ...
• Polar codes achieve capacity with efficient decoding [Arikan '09], [Arikan, Telatar], ...

But we want to achieve C for the AWGN channel. Let's look at some existing approaches ...
Existing Approaches: Coded Modulation

[Block diagram: U = (U_1 ... U_K) → Channel Encoder → c_1, ..., c_m → Modulator → x_1 ... x_n → channel adds noise ε_1 ... ε_n → y_1 ... y_n → Demodulator → Channel Decoder → Û = (Û_1 ... Û_K)]

1. Fix a modulation scheme, e.g., 16-QAM, 64-QAM
2. Use a powerful binary code (e.g., LDPC, turbo code) to protect against errors
3. Channel decoder uses soft outputs from the demodulator

Surveys: [Ungerboeck, Forney '98], [Guillen i Fabregas, Martinez, Caire '08]
Existing Approaches: Coded Modulation (cont.)

[Same block diagram as above]

Coded modulation works well in practice, but cannot provably achieve capacity with a fixed constellation
Existing Approaches: Lattice Coding
Lattices are the analog of linear codes in Euclidean space; they provide both coding and shaping gain
• Achieving (1/2) log(1 + snr) on the AWGN channel with lattice encoding and decoding [Erez, Zamir '04]
• Low-density lattice codes [Sommer, Feder, Shalvi '08]
• Polar lattices [Yan, Liu, Ling, Wu '14]
Sparse Regression Codes (SPARC)
In this part of the tutorial, we discuss the basic Sparse Regression Code construction with power allocation, plus two feasible decoders.

References for this part:
– A. Joseph and A. R. Barron, "Least squares superposition codes of moderate dictionary size are reliable at rates up to capacity," IEEE Trans. Inf. Theory, May 2012
– A. Joseph and A. R. Barron, "Fast sparse superposition codes have near exponential error probability for R < C," IEEE Trans. Inf. Theory, Feb. 2014
– A. R. Barron and S. Cho, "High-rate sparse superposition codes with iteratively optimal estimates," ISIT 2012
– A. R. Barron and S. Cho, "Approximate iterative Bayes optimal estimates for high-rate sparse superposition codes," WITMSE 2013
– S. Cho, "High-dimensional regression with random design, including sparse superposition codes," PhD thesis, Yale University, 2014
Extensions and Generalizations of SPARCs

Spatially-coupled dictionaries for SPARCs:
– J. Barbier, C. Schülke, F. Krzakala, "Approximate message-passing with spatially coupled structured operators, with applications to compressed sensing and sparse superposition codes," J. Stat. Mech., 2015. http://arxiv.org/abs/1503.08040

Bernoulli ±1 dictionaries:
– Y. Takeishi, M. Kawakita, and J. Takeuchi, "Least squares superposition codes with Bernoulli dictionary are still reliable at rates up to capacity," IEEE Trans. Inf. Theory, May 2014

See also the Tuesday afternoon session on Sparse Superposition Codes.
Sparse Regression Code

[Diagram: design matrix A with n rows; coefficient vector β^T = (0, c_1, 0, ..., 0, c_2, 0, ..., c_L, 0, ..., 0)]

• A: n × N design matrix with iid N(0, 1/n) entries
• Codewords Aβ: sparse linear combinations of columns of A, with L out of N entries of β non-zero
• Message bits U = (U_1, ..., U_K) determine the locations of the L non-zeros; the values of the non-zeros c_1, ..., c_L are fixed a priori
• Blocklength of code = n; Rate R = K/n = log(N choose L)/n
• Receiver gets Y = Aβ + ε; has to decode β̂, Û from Y, A
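A quick numeric check of this rate formula, with toy sizes of my own choosing (log base 2, so R is in bits per channel use):

```python
from math import comb, log2

n, N, L = 256, 2048, 32
R = log2(comb(N, L)) / n      # R = log2(N choose L) / n
print(f"N = {N}, L = {L}, n = {n}  ->  R = {R:.2f} bits/channel use")
```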
Partitioned Sparse Regression Code

[Diagram: A split into Section 1, Section 2, ..., Section L, with M columns in each section; β^T = (0, c_1, 0, ..., 0, c_2, 0, ..., c_L, 0, ..., 0), one non-zero per section; number of columns N = ML]

• Matrix A split into L sections with M columns in each section
• β has exactly one non-zero in each section
• Total of M^L codewords ⇒ Rate R = (log M^L)/n = (L log M)/n
• Input bits U = (U_1, ..., U_K) split into L segments of log₂ M bits, with segment ℓ indexing the location of the non-zero in section ℓ
• Receiver gets Y = Aβ + ε; has to decode β̂, Û from Y, A
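A minimal sketch of the partitioned encoder (illustrative parameters of my own choosing; a flat coefficient choice c_ℓ = √(nP/L) is used here, anticipating the power allocation discussed below):

```python
import numpy as np

def sparc_encode(bits, A, L, M, c):
    """Map L*log2(M) message bits to beta (one non-zero per section); return the codeword A @ beta."""
    beta = np.zeros(L * M)
    logM = int(np.log2(M))
    for l in range(L):
        seg = bits[l * logM:(l + 1) * logM]
        j = int("".join(str(b) for b in seg), 2)   # segment l indexes the non-zero within section l
        beta[l * M + j] = c[l]
    return A @ beta, beta

rng = np.random.default_rng(0)
L, M, n, P = 32, 64, 256, 15.0
A = rng.standard_normal((n, L * M)) / np.sqrt(n)   # iid N(0, 1/n) design matrix
c = np.full(L, np.sqrt(n * P / L))                 # flat allocation: every section gets power P/L
bits = rng.integers(0, 2, size=L * int(np.log2(M)))
x, beta = sparc_encode(bits, A, L, M, c)
print(f"R = L*log2(M)/n = {L * np.log2(M) / n:.2f} bits, empirical codeword power = {np.mean(x**2):.2f}")
```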
Choosing M, L

[Diagram as before: A with L sections of M columns each; β with one non-zero per section]

Block length n; Rate R = (L log M)/n

Ultra-sparse case (impractical): M = 2^{nR/L} with L constant
• Reliable at all rates R < C [Cover 1972, 1980]
• But size of A is exponential in the block length n
Choosing M, L

Moderately-sparse case (practical): M = n^κ with L = nR/(κ log n)
• Size of A is polynomial in the block length n
• Reliability: want small Pr{fraction of section mistakes ≥ ϵ}, for small ϵ
• Outer Reed–Solomon code of rate 1 − ϵ corrects the remaining mistakes
• Overall rate: R_total = (1 − ϵ)R; can achieve R_total up to C
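A quick comparison of the two regimes, with toy numbers of my own choosing (log base 2 throughout, consistent with measuring R in bits):

```python
from math import log2

n, R = 2048, 1.0

# Ultra-sparse: L constant, so M = 2^{nR/L} grows exponentially in n.
L_ultra = 4
print(f"ultra-sparse: M = 2^{n * R / L_ultra:.0f} columns per section (infeasible)")

# Moderately sparse: M = n^kappa, L = nR/(kappa*log n), so A has polynomially many entries.
kappa = 2
M = n ** kappa
L = round(n * R / (kappa * log2(n)))
print(f"moderate: M = {M}, L = {L}, A is {n} x {M * L} (polynomial in n)")
```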
Power Allocation

[Diagram: β^T = (0, √(nP_1), 0, ..., 0, √(nP_2), 0, ..., 0, √(nP_L), 0, ..., 0), one non-zero per section]

• Indices of non-zeros: sent = (j_1, j_2, ..., j_L)
• Coefficient values: β_{j_1} = √(nP_1), β_{j_2} = √(nP_2), ..., β_{j_L} = √(nP_L)
• Power control: Σ_ℓ P_ℓ = P ⇒ codewords Aβ have average power P
• Examples:
  1) Flat: P_ℓ = P/L, for ℓ = 1, ..., L
  2) Exponentially decaying: P_ℓ ∝ e^{−2Cℓ/L}
• For all such power allocations, P_ℓ = Θ(1/L) for all ℓ, hence √(nP_ℓ) = Θ(√log M)
Variable Power Allocation
• Power control: Σ_{ℓ=1}^L P_ℓ = P, i.e., ∥β∥² = nP
• Variable power: P_ℓ proportional to e^{−2Cℓ/L} for ℓ = 1, ..., L

[Plot: power allocation vs. section index ℓ = 1, ..., 100 for P = 7, L = 100; the allocation decays exponentially across sections]
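A short sketch computing the exponentially decaying allocation for the plotted example (P = 7, L = 100; I assume σ² = 1, so snr = P). The values are normalized here as proportions summing to 1, which appears consistent with the plotted scale; multiply by P for the actual section powers:

```python
import numpy as np

P, L = 7.0, 100
snr = P                                # assumes sigma^2 = 1 (my assumption for this example)
C = 0.5 * np.log(1 + snr)              # capacity in nats, matching the exponent e^{-2*C*l/L}
ell = np.arange(1, L + 1)
P_ell = np.exp(-2 * C * ell / L)
P_ell /= P_ell.sum()                   # proportions summing to 1; P * P_ell gives Sum_l P_l = P
print(f"P_1/P = {P_ell[0]:.4f}, P_L/P = {P_ell[-1]:.4f}, sum = {P_ell.sum():.2f}")
```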