Fast Polarization for Processes with Memory
Joint work with Eren Şaşoğlu and Boaz Shuval
Polar codes in one slide
[Figure: channel W with input X_1^N and output Y_1^N]
Polar coding
◮ Information vector: Ũ_1^k
◮ Padding: U_1^N = f(Ũ_1^k)
◮ Encoding: X_1^N = U_1^N · G_N^{-1}
◮ Decoding: successively deduce U_i from U_1^{i-1} and Y_1^N
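Added for illustration (not part of the talk): a minimal sketch of the polar transform over GF(2), assuming the common choice G_N = F ⊗ · · · ⊗ F (the n-fold Kronecker power of Arıkan's kernel F = [[1, 0], [1, 1]]), with the bit-reversal permutation omitted. Because this matrix is its own inverse mod 2, the same routine computes both U_1^N = X_1^N · G_N and X_1^N = U_1^N · G_N^{-1}.

```python
import numpy as np

def polar_matrix(n):
    """Return G_N as the n-fold Kronecker power of F = [[1, 0], [1, 1]], N = 2^n."""
    F = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    G = np.array([[1]], dtype=np.uint8)
    for _ in range(n):
        G = np.kron(G, F)
    return G

def polar_transform(x):
    """Compute x · G_N over GF(2). Since G_N squares to the identity mod 2,
    applying the same map twice recovers x, so this also serves as G_N^{-1}."""
    n = int(np.log2(len(x)))
    return (np.asarray(x, dtype=np.uint8) @ polar_matrix(n)) % 2

x = np.random.randint(0, 2, 8)
u = polar_transform(x)
assert np.array_equal(polar_transform(u), x)   # involution check
```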
Polar codes in two slides: [Arıkan:09], [ArıkanTelatar:09]
◮ Setting: binary-input, symmetric, memoryless channel
◮ Polar transform: U_1^N = X_1^N · G_N; U_1^N uniform ⟺ X_1^N uniform
◮ Low entropy indices: fix β < 1/2,
      Λ_N = { i : P_error(U_i | U_1^{i-1}, Y_1^N) < 2^{-N^β} }
◮ Polarization: let X_1^N be uniform; then
      lim_{N→∞} (1/N) |Λ_N| = I(X_1; Y_1)
◮ Coding scheme:
  ◮ For i ∈ Λ_N, set U_i equal to information bits (uniform)
  ◮ Set the remaining U_i to uniform values, revealed to the decoder
  ◮ Transmit X_1^N = U_1^N · G_N^{-1} as the codeword
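Added for illustration: a toy numerical check of the polarization statement on a BEC(ε), where the Bhattacharyya parameters of the synthetic channels evolve exactly as Z → 2Z − Z² and Z → Z²; the fraction of indices with tiny Z approaches the capacity 1 − ε, and the fraction with Z near 1 approaches ε. The thresholds and parameters below are arbitrary.

```python
# Exact Bhattacharyya evolution for the BEC: a synthetic channel with parameter z
# splits into 2z - z^2 (the "minus" branch) and z^2 (the "plus" branch).
def bec_synthetic_Z(eps, n):
    zs = [eps]
    for _ in range(n):
        zs = [w for z in zs for w in (2*z - z*z, z*z)]
    return zs

eps, n = 0.3, 16                              # BEC(0.3), N = 2^16 synthetic channels
zs = bec_synthetic_Z(eps, n)
N = len(zs)
low  = sum(z < 1e-6     for z in zs) / N      # approaches capacity 1 - eps
high = sum(z > 1 - 1e-6 for z in zs) / N      # approaches eps
print(low, high)
```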
In this talk
Setting: binary-input, symmetric, memoryless channel
(with "memoryless" crossed out: we drop the memoryless assumption)
Polar codes: [Şaşoğlu+:09], [KoradaUrbanke:10], [HondaYamamoto:13]
◮ Setting: memoryless i.i.d. process (X_i, Y_i), i = 1, …, N
◮ For simplicity: assume X_i binary
◮ Polar transform: U_1^N = X_1^N · G_N
◮ Index sets:
      Low entropy:  Λ_N = { i : P_error(U_i | U_1^{i-1}, Y_1^N) < 2^{-N^β} }
      High entropy: Ω_N = { i : P_error(U_i | U_1^{i-1}, Y_1^N) > 1/2 − 2^{-N^β} }
◮ Polarization:
      lim_{N→∞} (1/N) |Λ_N| = 1 − H(X_1 | Y_1)
      lim_{N→∞} (1/N) |Ω_N| = H(X_1 | Y_1)
Polar codes: [Şaşoğlu+:09], [KoradaUrbanke:10], [HondaYamamoto:13]
Optimal rate for:
◮ Coding for non-symmetric memoryless channels
◮ Coding for memoryless channels with non-binary inputs
◮ (Lossy) compression of memoryless sources
Question
◮ How to handle memory?
Roadmap
Index sets
      Low entropy:  Λ_N(ε) = { i : P_error(U_i | U_1^{i-1}, Y_1^N) < ε }
      High entropy: Ω_N(ε) = { i : P_error(U_i | U_1^{i-1}, Y_1^N) > 1/2 − ε }
Plan
◮ Define a framework for handling memory
◮ Establish:
  ◮ Slow polarization: for fixed ε > 0,
        lim_{N→∞} (1/N) |Λ_N(ε)| = 1 − H⋆(X|Y)
        lim_{N→∞} (1/N) |Ω_N(ε)| = H⋆(X|Y)
    where H⋆(X|Y) = lim_{N→∞} (1/N) H(X_1^N | Y_1^N)
  ◮ Fast polarization: the above also holds for ε = 2^{-N^β}
  ◮ What is β?
A framework for memory
◮ Process: (X_i, Y_i, S_i), i = 1, …, N
◮ Finite number of states: S_i ∈ S, where |S| < ∞
◮ Hidden state: S_i is unknown to encoder and decoder
◮ Probability distribution: P(x_i, y_i, s_i | s_{i-1})
  ◮ Stationary: the same for all i
  ◮ Markov: P(x_i, y_i, s_i | s_{i-1}) = P(x_i, y_i, s_i | {x_j, y_j, s_j}_{j<i})
◮ State sequence: aperiodic and irreducible Markov chain
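Added for illustration: a minimal sampler for this framework. The caller supplies the stationary distribution π of the state chain and a kernel drawing (x, y, s) from P(x, y, s | s_prev); the function names, the kernel signature, and the Gilbert-Elliott-style instance at the bottom are illustrative assumptions, not from the talk.

```python
import random

def sample_process(pi, kernel, N, rng=random):
    """Draw (x_i, y_i, s_i), i = 1..N, from a stationary hidden-state model.

    pi     -- dict {state: probability}, stationary distribution of the state chain
    kernel -- function(prev_state, rng) -> (x, y, s), one draw from P(x, y, s | s_prev)
    """
    states, probs = zip(*pi.items())
    s = rng.choices(states, probs)[0]          # start the chain in stationarity
    xs, ys, ss = [], [], []
    for _ in range(N):
        x, y, s = kernel(s, rng)
        xs.append(x); ys.append(y); ss.append(s)
    return xs, ys, ss

# Example-1-style instance: i.i.d. uniform input, two channel states "good"/"bad"
# with crossover 0.01 / 0.2, and symmetric state flips with probability 0.1
# (so the uniform state distribution is stationary).
def ge_kernel(s_prev, rng):
    s = s_prev if rng.random() < 0.9 else 1 - s_prev
    x = rng.randint(0, 1)
    p = 0.01 if s == 0 else 0.2
    y = x ^ (rng.random() < p)
    return x, y, s

xs, ys, ss = sample_process({0: 0.5, 1: 0.5}, ge_kernel, N=16)
```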
Example 1
◮ Model: finite-state channel P_s(y | x), s ∈ S
◮ Input distribution: X_i i.i.d. and independent of the state
◮ State transition: π(s_i | s_{i-1})
◮ Distribution: P(x_i, y_i, s_i | s_{i-1}) = P(x_i) π(s_i | s_{i-1}) P_{s_i}(y_i | x_i)
Example 2
◮ Model: ISI + noise
      Y_i = h_0 X_i + h_1 X_{i-1} + · · · + h_m X_{i-m} + noise
◮ Input: X_i has memory, P(x_i | x_{i-1}, x_{i-2}, …, x_{i-m}, x_{i-m-1})
◮ State: S_i = [ X_i  X_{i-1}  · · ·  X_{i-m} ]
◮ Distribution: for x_i, s_i, s_{i-1} compatible,
      P(x_i, y_i, s_i | s_{i-1}) = P_noise(y_i | h^T s_i) · P(x_i | s_{i-1})
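Added for illustration: a small sketch of this example with a first-order Markov input (a simplification of the general P(x_i | x_{i-1}, …, x_{i-m-1})) and additive Gaussian noise; the taps h, the flip probability, and the noise level are placeholder values.

```python
import random

def sample_isi(h, N, flip=0.3, sigma=0.5, rng=random):
    """Y_i = sum_j h[j] * X_{i-j} + Gaussian noise; the state is the last m+1 inputs."""
    m = len(h) - 1
    state = [0] * (m + 1)                  # [X_i, X_{i-1}, ..., X_{i-m}], initially all zero
    xs, ys = [], []
    for _ in range(N):
        x_prev = state[0]
        x = x_prev if rng.random() > flip else 1 - x_prev   # Markov input: flip w.p. `flip`
        state = [x] + state[:-1]                            # shift the state register
        y = sum(hj * xj for hj, xj in zip(h, state)) + rng.gauss(0, sigma)
        xs.append(x); ys.append(y)
    return xs, ys

xs, ys = sample_isi(h=[1.0, 0.6, 0.2], N=10)
```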
Example 3
◮ Model: (d, k)-RLL constrained system with noise
◮ Here: a (1, ∞)-RLL constrained input X_1^N passed through a BSC(p) to produce Y_1^N
[Figure: two-state Markov chain for the (1, ∞)-RLL input (from state 0, emit 1 with probability α or 0 with probability 1 − α; from state 1, emit 0), combined with the BSC(p) output; the transition labels x_i / y_i carry probabilities such as α(1 − p) and (1 − α)p, giving P(x_i, y_i, s_i | s_{i-1})]
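Added for illustration: a sketch that samples this example, generating a (1, ∞)-RLL input with the two-state chain above and observing it through a BSC(p); α and p are placeholder values.

```python
import random

def sample_rll_bsc(alpha, p, N, rng=random):
    """(1, inf)-RLL source through BSC(p): after a 1 the next input must be 0."""
    s = 0                                   # state = previous input bit
    xs, ys = [], []
    for _ in range(N):
        if s == 1:
            x = 0                           # constraint: no two consecutive 1s
        else:
            x = 1 if rng.random() < alpha else 0
        y = x ^ (rng.random() < p)          # BSC(p) flip
        xs.append(x); ys.append(y)
        s = x
    return xs, ys

xs, ys = sample_rll_bsc(alpha=0.4, p=0.1, N=20)
```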
Example 4
◮ Model: lossy compression of a source with memory
      [Figure: Y_1^N → lossy compression → X_1^N]
◮ Source distribution: P_s(y), s ∈ S
◮ State transition: π(s_i | s_{i-1})
◮ Distortion: test channel P(x | y)
◮ Distribution: P(x_i, y_i, s_i | s_{i-1}) = π(s_i | s_{i-1}) P_{s_i}(y_i) P(x_i | y_i)
Polar codes: [Şaşoğlu:11], [ŞaşoğluTal:16], [ShuvalTal:17]
◮ Setting: process (X_i, Y_i, S_i), i = 1, …, N, with memory, as above
◮ Hidden state: state unknown to encoder and decoder
◮ Polar transform: U_1^N = X_1^N · G_N
      The U_i are neither independent nor identically distributed
◮ Index sets: fix β < 1/2,
      Low entropy:  Λ_N = { i : P_error(U_i | U_1^{i-1}, Y_1^N) < 2^{-N^β} }
      High entropy: Ω_N = { i : P_error(U_i | U_1^{i-1}, Y_1^N) > 1/2 − 2^{-N^β} }
◮ Polarization:
      lim_{N→∞} (1/N) |Λ_N| = 1 − H⋆(X|Y)
      lim_{N→∞} (1/N) |Ω_N| = H⋆(X|Y)
  where H⋆(X|Y) = lim_{N→∞} (1/N) H(X_1^N | Y_1^N)
Achievable rate
◮ Achievable rate: in all examples, R approaches
      I⋆(X; Y) = lim_{N→∞} (1/N) I(X_1^N; Y_1^N)
◮ Successive cancellation: [Wang+:15]
Mixing
Consider the process (X_i, Y_i); the state is hidden.
[Figure: two blocks, (X_1, Y_1), …, (X_L, Y_L) and (X_{M+1}, Y_{M+1}), …, (X_N, Y_N), separated by a gap of length M − L]
Then there exist ψ(k), k ≥ 0, such that
      P_{X_1^L, Y_1^L, X_{M+1}^N, Y_{M+1}^N} ≤ ψ(M − L) · P_{X_1^L, Y_1^L} · P_{X_{M+1}^N, Y_{M+1}^N}
where:
◮ ψ(0) < ∞
◮ ψ(k) → 1 as k → ∞
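Added for illustration: one way to see why ψ(0) is finite in this framework, sketched for adjacent blocks (M = L) under the stated Markov and stationarity assumptions; conditioning on the state S_L that separates the two blocks gives

```latex
\begin{align*}
P\bigl(x_1^L, y_1^L,\, x_{L+1}^N, y_{L+1}^N\bigr)
  &= \sum_{s} P\bigl(x_1^L, y_1^L, S_L = s\bigr)\, P\bigl(x_{L+1}^N, y_{L+1}^N \mid S_L = s\bigr) \\
  &= \sum_{s} \frac{P\bigl(x_1^L, y_1^L, S_L = s\bigr)\, P\bigl(x_{L+1}^N, y_{L+1}^N, S_L = s\bigr)}{\pi(s)} \\
  &\le \Bigl(\max_{s} \tfrac{1}{\pi(s)}\Bigr)
       \sum_{s} P\bigl(x_1^L, y_1^L, S_L = s\bigr)\,
       \sum_{s'} P\bigl(x_{L+1}^N, y_{L+1}^N, S_L = s'\bigr) \\
  &= \Bigl(\max_{s} \tfrac{1}{\pi(s)}\Bigr)\, P\bigl(x_1^L, y_1^L\bigr)\, P\bigl(x_{L+1}^N, y_{L+1}^N\bigr),
\end{align*}
```

so one may take ψ(0) = max_s 1/π(s), the quantity that reappears as ψ̂ on the proof slide for the case with memory.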
Three parameters
◮ Joint distribution P(x, y)
◮ For simplicity: X ∈ {0, 1}
◮ Parameters:
      Entropy:        H(X|Y) = − Σ_{x,y} P(x, y) log P(x | y)
      Bhattacharyya:  Z(X|Y) = 2 Σ_y √( P(0, y) P(1, y) )
      T.V. distance:  K(X|Y) = Σ_y | P(0, y) − P(1, y) |
◮ Connections:
      H ≈ 0 ⟺ Z ≈ 0 ⟺ K ≈ 1
      H ≈ 1 ⟺ Z ≈ 1 ⟺ K ≈ 0
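Added for illustration: a small routine computing the three parameters from a joint pmf table P(x, y) with x ∈ {0, 1} (entropy in bits), handy for checking the stated connections numerically; the BSC instance at the bottom is an arbitrary choice.

```python
import math

def parameters(P):
    """P[x][y]: joint pmf with x in {0, 1}. Returns (H(X|Y), Z(X|Y), K(X|Y))."""
    ys = range(len(P[0]))
    H = -sum(P[x][y] * math.log2(P[x][y] / (P[0][y] + P[1][y]))
             for x in (0, 1) for y in ys if P[x][y] > 0)
    Z = 2 * sum(math.sqrt(P[0][y] * P[1][y]) for y in ys)
    K = sum(abs(P[0][y] - P[1][y]) for y in ys)
    return H, Z, K

# Uniform X through a BSC(0.001): X is nearly determined by Y, so H and Z come out
# small while K is close to 1.
p = 0.001
P = [[0.5 * (1 - p), 0.5 * p], [0.5 * p, 0.5 * (1 - p)]]
print(parameters(P))
```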
Three processes
For n = 1, 2, …   (underlying process {X_i, Y_i, S_i}; we observe {X_i, Y_i})
◮ N = 2^n
◮ U_1^N = X_1^N · G_N
◮ Pick B_n ∈ {0, 1} uniform, i.i.d.
◮ Random index from {1, 2, …, N}: i = 1 + (B_1 B_2 · · · B_n)_2
◮ Processes:
      Entropy:        H_n = H(U_i | U_1^{i-1}, Y_1^N)
      Bhattacharyya:  Z_n = Z(U_i | U_1^{i-1}, Y_1^N)
      T.V. distance:  K_n = K(U_i | U_1^{i-1}, Y_1^N)
Proof: the memoryless case
Slow polarization
◮ H_n ∈ (ε, 1 − ε)  ⟹  |H_{n+1} − H_n| > 0
◮ Low entropy set:  (1/N) |Λ_N| → 1 − H(X_1 | Y_1) as n → ∞
◮ High entropy set: (1/N) |Ω_N| → H(X_1 | Y_1) as n → ∞
Fast polarization
◮ Z_{n+1} ≤ 2 Z_n      if B_{n+1} = 0
  Z_{n+1} ≤ Z_n^2      if B_{n+1} = 1
◮ New:
  K_{n+1} ≤ K_n^2      if B_{n+1} = 0
  K_{n+1} ≤ 2 K_n      if B_{n+1} = 1
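Added for illustration: for a BEC these recursions hold with equality (Z⁻ = 2Z − Z², Z⁺ = Z², and K = 1 − Z at every step), so one can follow the (Z_n, K_n) process along random branches B_1, B_2, … and watch it polarize; the parameters below are arbitrary.

```python
import random

def branch_ZK(eps, n, rng=random):
    """Follow one random branch B_1..B_n for a BEC(eps); return (Z_n, K_n)."""
    z = eps
    for _ in range(n):
        if rng.random() < 0.5:      # B = 0: "minus" transform
            z = 2 * z - z * z
        else:                       # B = 1: "plus" transform
            z = z * z
    return z, 1 - z                 # for the BEC, K = 1 - Z

random.seed(0)
for _ in range(5):
    print(branch_ZK(eps=0.3, n=30))   # typically (Z, K) near (0, 1) or (1, 0)
```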
Proof: the case with memory
(underlying process {X_i, Y_i, S_i}; we observe {X_i, Y_i})
Slow polarization
◮ H_n ∈ (ε, 1 − ε)  ⟹  |H_{n+1} − H_n| > 0
◮ Low entropy set:  (1/N) |Λ_N| → 1 − H⋆(X|Y) as n → ∞
◮ High entropy set: (1/N) |Ω_N| → H⋆(X|Y) as n → ∞
Fast polarization
◮ Z_{n+1} ≤ 2 ψ Z_n      if B_{n+1} = 0
  Z_{n+1} ≤ ψ Z_n^2      if B_{n+1} = 1
◮ K_{n+1} ≤ ψ̂ K_n^2     if B_{n+1} = 0
  K_{n+1} ≤ 2 ψ̂ K_n     if B_{n+1} = 1
  where ψ̂ = ψ(0) = max_s 1/π(s), and π is the stationary state distribution
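Added for illustration (informal): the constant factors can be absorbed by a rescaling. Setting W_n = ψ Z_n turns the plus branch into a clean squaring, leaving only a constant multiple on the minus branch:

```latex
W_n := \psi Z_n
\quad\Longrightarrow\quad
W_{n+1} \;\le\;
\begin{cases}
2\psi\, W_n, & B_{n+1} = 0,\\
W_n^2,       & B_{n+1} = 1.
\end{cases}
```

Since Z_n ≤ W_n, a recursion of this form is what the standard fast-polarization bootstrapping argument handles once W_n is already known to approach 0; which β < 1/2 survives in the presence of memory is the question the talk turns to next.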
Notation
◮ Two consecutive blocks: (X_1^N, Y_1^N) and (X_{N+1}^{2N}, Y_{N+1}^{2N})
◮ Polar transform:
      U_1^N = X_1^N · G_N
      V_1^N = X_{N+1}^{2N} · G_N
◮ Random index: i = 1 + (B_1 B_2 · · · B_n)_2
◮ Notation:
      Q_i = (U_1^{i-1}, Y_1^N)
      R_i = (V_1^{i-1}, Y_{N+1}^{2N})
Slow polarization
(Recall: U_1^N = X_1^N · G_N, V_1^N = X_{N+1}^{2N} · G_N, Q_i = (U_1^{i-1}, Y_1^N), R_i = (V_1^{i-1}, Y_{N+1}^{2N}))
◮ H_n is a supermartingale
      H_n = H(U_i | Q_i) = H(V_i | R_i)
      H_{n+1} = H(U_i + V_i | Q_i, R_i)           if B_{n+1} = 0
      H_{n+1} = H(V_i | U_i + V_i, Q_i, R_i)      if B_{n+1} = 1
By the chain rule:
      E[H_{n+1} | H_n, …] = (1/2) [ H(U_i + V_i | Q_i, R_i) + H(V_i | U_i + V_i, Q_i, R_i) ]
                          = (1/2) H(U_i + V_i, V_i | Q_i, R_i)
                          = (1/2) H(U_i, V_i | Q_i, R_i)
                          ≤ (1/2) H(U_i | Q_i) + (1/2) H(V_i | R_i)
                          = H_n
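Added for illustration: a tiny numerical check of the two identities used in the chain-rule step, with the conditioning on (Q_i, R_i) dropped for brevity: for any joint pmf of two bits, H(U + V) + H(V | U + V) = H(U, V) ≤ H(U) + H(V).

```python
import math, random

def H(pmf):
    """Entropy in bits of a pmf given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

random.seed(1)
w = [random.random() for _ in range(4)]
t = sum(w)
p = {(u, v): w[2*u + v] / t for u in (0, 1) for v in (0, 1)}   # random joint pmf of (U, V)

H_UV = H(list(p.values()))
H_U  = H([p[0, 0] + p[0, 1], p[1, 0] + p[1, 1]])
H_V  = H([p[0, 0] + p[1, 0], p[0, 1] + p[1, 1]])
# (U + V mod 2, V) is a relabeling of (U, V), so its joint entropy equals H(U, V).
q = {(u ^ v, v): p[u, v] for u in (0, 1) for v in (0, 1)}
H_S   = H([q[0, 0] + q[0, 1], q[1, 0] + q[1, 1]])              # H(U + V)
H_V_S = H(list(q.values())) - H_S                              # H(V | U + V)

assert abs(H_S + H_V_S - H_UV) < 1e-9      # chain rule
assert H_UV <= H_U + H_V + 1e-9            # subadditivity
```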
Slow polarization
Convergence
◮ H_n is a supermartingale
◮ 0 ≤ H_n ≤ 1
⟹ H_n converges a.s. and in L^1 to some H_∞
Polarization
◮ A priori, H_∞ ∈ [0, 1]
◮ We need: H_∞ ∈ {0, 1}
◮ This would be easy if (U_i, Q_i) and (V_i, R_i) were independent
◮ They are not: Y_N is part of Q_i and Y_{N+1} is part of R_i, and these are adjacent symbols of a process with memory
◮ But: for almost all i, we have I(U_i; V_i | Q_i, R_i) < ε
◮ Enough? No. We need to show that Q_i and R_i cannot cooperate to stop polarization