SLIDE 1

Scalable Source Coding With Causal Side Information and a Causal Helper

Shraga Bross

Faculty of Engineering, Bar-Ilan University, Israel
brosss@biu.ac.il

ISIT, June 2020

SLIDE 2

Transmission model

[Figure: the encoder observes $X^n$ and sends the index $\phi_1(X^n)$ to Decoder 1 and the indices $(\phi_1(X^n), \phi_2(X^n))$ to Decoder 2. At each time $k$, Decoder 1 observes $Z^k$, outputs $\hat{X}_{1,k}$, and forwards the helper message $g_k(\phi_1(X^n), Z^k)$, $k = 1, \ldots, n$, to Decoder 2, which observes $Y^k$ and outputs $\hat{X}_{2,k}$.]

Figure: Scalable source coding with causal side-information and causal helper.


SLIDE 3

Definition of a code

An $(n, M_1, M_2, \sum_{k=1}^{n} L_k, D_1, D_2)$ scalable code for the source $X$ with causal SI $(Y, Z)$ and a causal helper consists of:

1. A first-stage encoder map $\phi_1 : \mathcal{X}^n \to \{1, \ldots, M_1\}$ and a sequence $\psi_{1,1}, \ldots, \psi_{1,n}$ of reconstruction mappings $\psi_{1,k} : \{1, \ldots, M_1\} \times \mathcal{Z}^k \to \hat{\mathcal{X}}$, $k = 1, \ldots, n$, such that, with $E$ denoting the expectation operator,
$$E\, d\bigl[X^n, \bigl(\psi_{1,1}(\phi_1(X^n), Z^1), \ldots, \psi_{1,k}(\phi_1(X^n), Z^k), \ldots, \psi_{1,n}(\phi_1(X^n), Z^n)\bigr)\bigr] \le D_1. \quad (1)$$

2. A unidirectional conference between Decoder 1 and Decoder 2, consisting of a sequence of mappings $g_k : \{1, \ldots, M_1\} \times \mathcal{Z}^k \to \{1, \ldots, L_k\}$, $k = 1, \ldots, n$.

3. A second-stage encoder map $\phi_2 : \mathcal{X}^n \to \{1, \ldots, M_2\}$ and a sequence $\psi_{2,1}, \ldots, \psi_{2,n}$ of reconstruction mappings $\psi_{2,k} : \{1, \ldots, M_1\} \times \{1, \ldots, M_2\} \times \{1, \ldots, L_1\} \times \cdots \times \{1, \ldots, L_k\} \times \mathcal{Y}^k \to \hat{\mathcal{X}}$


SLIDE 4

Definition of achievable rate-distortion tuple

such that
$$E\, d\bigl[X^n, \bigl(\psi_{2,1}(\phi_1(X^n), \phi_2(X^n), g_1(\phi_1(X^n), Z^1), Y^1), \ldots, \psi_{2,k}(\phi_1(X^n), \phi_2(X^n), \{g_j(\phi_1(X^n), Z^j)\}_{j=1}^{k}, Y^k), \ldots,$$
$$\psi_{2,n}(\phi_1(X^n), \phi_2(X^n), \{g_j(\phi_1(X^n), Z^j)\}_{j=1}^{n}, Y^n)\bigr)\bigr] \le D_2. \quad (2)$$

The rate tuple $(R_1, R_2, R_h)$ of the scalable code is
$$R_1 = \frac{1}{n} \log M_1, \qquad R_2 = R_1 + \frac{1}{n} \log M_2, \qquad R_h = \frac{1}{n} \sum_{k=1}^{n} \log L_k.$$

A $D$-achievable tuple $(R_1, R_2, R_h)$ is defined in the regular way. The collection of all $D$-achievable rate tuples is the achievable scalable source-coding region $\mathcal{R}(D)$.
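Since the causality constraints live entirely in the domains of these mappings, the definition translates directly into code. The following minimal Python sketch (a hypothetical interface of this write-up, not from the talk) enforces causality by the signatures alone: at time $k$, Decoder 1 and the helper see only $(\phi_1(X^n), Z^k)$, while Decoder 2 additionally sees both indices, the helper messages sent so far, and $Y^k$.

```python
# Hypothetical interface illustrating the causal structure of an
# (n, M1, M2, sum_k L_k, D1, D2) scalable code; not from the talk.
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class ScalableCode:
    phi1: Callable[[Sequence[int]], int]             # X^n -> {1,...,M1}
    phi2: Callable[[Sequence[int]], int]             # X^n -> {1,...,M2}
    psi1: List[Callable[[int, Sequence[int]], int]]  # (i1, Z^k) -> X1hat_k
    g:    List[Callable[[int, Sequence[int]], int]]  # (i1, Z^k) -> {1,...,Lk}
    psi2: List[Callable[[int, int, Sequence[int], Sequence[int]], int]]
                                                     # (i1, i2, W^k, Y^k) -> X2hat_k

def run(code: ScalableCode, x, z, y):
    """One pass through the transmission model on sequences x, z, y."""
    i1, i2 = code.phi1(x), code.phi2(x)
    msgs, x1hat, x2hat = [], [], []
    for k in range(1, len(x) + 1):
        x1hat.append(code.psi1[k - 1](i1, z[:k]))    # Decoder 1: causal in Z
        msgs.append(code.g[k - 1](i1, z[:k]))        # helper message W_k
        x2hat.append(code.psi2[k - 1](i1, i2, list(msgs), y[:k]))  # causal in Y
    return x1hat, x2hat
```

Averaging $d$ between `x` and the two returned sequences then checks (1) and (2) empirically for any concrete code instance.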


SLIDE 5

Related work

With non-causal SI at the decoders:
• "On successive refinement for the Wyner-Ziv problem" [Steinberg-Merhav (2004)]: characterization of $\mathcal{R}(D)$ when $X - Y - Z$.
• "Side-information scalable source coding" [Tian-Diggavi (2008)]: inner and outer bounds on $\mathcal{R}(D)$ when $X - Z - Y$.
• "On successive refinement for the Wyner-Ziv problem with partially cooperating decoders" [Bross-Weissman (2008)]: conclusive characterization of the encoder rates, but a gap in the helper's rate.

With causal SI at the decoders:
• "On successive refinement with causal side information at the decoders" [Maor-Merhav (2008)]: characterization of $\mathcal{R}(D)$ regardless of the relative SI quality at the decoders.

This work extends the last result in the sense that Decoder 1 can communicate with Decoder 2, via a conference channel, at a rate not exceeding $R_h$.


SLIDE 6

Definition of R∗(D)

Fix a pair $D = (D_1, D_2)$. Define $\mathcal{R}^*(D)$ to be the set of all $(R_1, R_2, R_h)$ for which there exist random variables $(U, V, W)$ taking values in finite alphabets $\mathcal{U}, \mathcal{V}, \mathcal{W}$, respectively, such that:

1. $(U, V) - X - (Y, Z)$ forms a Markov chain.

2. There exist deterministic maps
$$f_1 : \mathcal{U} \times \mathcal{Z} \to \hat{\mathcal{X}}, \qquad g : \mathcal{U} \times \mathcal{Z} \to \mathcal{W}, \qquad f_2 : \mathcal{U} \times \mathcal{V} \times \mathcal{W} \times \mathcal{Y} \to \hat{\mathcal{X}}$$
such that, with $W \triangleq g(U, Z)$,
$$E\, d(X, f_1(U, Z)) \le D_1, \qquad E\, d(X, f_2(U, V, W, Y)) \le D_2.$$

3. The rates $R_1$, $R_2$ and $R_h$ satisfy
$$R_1 \ge I(X; U), \quad (3a)$$
$$R_2 \ge I(X; UV), \quad (3b)$$
$$R_h \ge H(W \mid U). \quad (3c)$$
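For a concrete candidate $(U, V, W)$ over finite alphabets, the three bounds in (3) can be computed directly from the joint distribution. Below is a minimal numerical sketch (the array conventions and function names are assumptions of this write-up, not from the talk), assuming the Markov chain $(U, V) - X - (Y, Z)$ of item 1 and a deterministic table `g[u, z]` realizing $W = g(U, Z)$.

```python
# Sketch: evaluate the rate bounds (3a)-(3c) for a candidate (U, V, W).
import numpy as np

def mutual_information(pxy):
    """I(X;Y) in bits for a joint pmf pxy[x, y]."""
    px = pxy.sum(1, keepdims=True)
    py = pxy.sum(0, keepdims=True)
    m = pxy > 0
    return float((pxy[m] * np.log2(pxy[m] / (px * py)[m])).sum())

def cond_entropy(puw):
    """H(W|U) in bits for a joint pmf puw[u, w]."""
    pu = puw.sum(1, keepdims=True)
    pu = np.where(pu > 0, pu, 1.0)        # guard rows of zero probability
    m = puw > 0
    return float(-(puw[m] * np.log2((puw / pu)[m])).sum())

def rate_bounds(px, puv_given_x, pz_given_x, g):
    """Return (I(X;U), I(X;UV), H(W|U)).

    px[x]              source pmf
    puv_given_x[x,u,v] test channel for (U, V)
    pz_given_x[x,z]    Decoder-1 SI channel
    g[u,z]             deterministic helper table, W = g(U, Z)
    """
    p_xuv = px[:, None, None] * puv_given_x
    nx, nu, nv = p_xuv.shape
    i_xu = mutual_information(p_xuv.sum(2))                  # (3a)
    i_xuv = mutual_information(p_xuv.reshape(nx, nu * nv))   # (3b): (U,V) jointly
    # By the Markov chain, p(u, z) = sum_x p(x, u) p(z | x).
    p_uz = np.einsum('xu,xz->uz', p_xuv.sum(2), pz_given_x)
    p_uw = np.zeros((nu, int(g.max()) + 1))
    for u in range(nu):
        for z in range(p_uz.shape[1]):
            p_uw[u, g[u, z]] += p_uz[u, z]
    return i_xu, i_xuv, cond_entropy(p_uw)                   # (3c)
```

Any $(R_1, R_2, R_h)$ componentwise at least the returned triple, for a candidate that also meets the two distortion constraints of item 2, lies in $\mathcal{R}^*(D)$.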


SLIDE 7

Main result

Theorem. $\mathcal{R}(D) = \mathcal{R}^*(D)$.

Remarks:
• The converse shows that $(U, V) - X - (Y, Z)$, in the definition of $\mathcal{R}^*(D)$, can be replaced by the Markov chain $U - V - X - (Y, Z)$.
• The converse holds as well when the causal helper and the reconstructors are allowed to depend also on $X^{k-1}$. That is, $g_k = g_k(\phi_1(X^n), Z^k, X^{k-1})$, $\hat{X}_{1,k} = \psi_{1,k}(\phi_1(X^n), Z^k, X^{k-1})$, and $\hat{X}_{2,k} = \psi_{2,k}(\phi_1(X^n), \phi_2(X^n), \{g_j(\cdot)\}_{j=1}^{k}, Y^k, X^{k-1})$.


SLIDE 8

Example

Let $X$ be a BSS and let $d(\cdot, \cdot)$ be the Hamming loss. The SI pair $(Z, Y)$ is conditionally independent given $X$, where $Z$ is the output of a BSC($\delta_1$) with input $X$ and $Y$ is the output of a BSC($\delta_2$) with input $X$, with $0 < \delta_1 < \delta_2 < 0.5$. Define
$$R_{X,\delta}(\Delta) = \begin{cases} 1 - H_b(\Delta), & 0 \le \Delta \le d_c(\delta) \\ H_b'(d_c(\delta))\,(\delta - \Delta), & d_c(\delta) < \Delta \le \delta, \end{cases}$$
where $d_c(\delta)$ is the solution to the equation $(1 - H_b(d_c))/(d_c - \delta) = -H_b'(d_c)$ with $0 < d_c < \delta < 0.5$. Here $R_{X,\delta}(\Delta)$ is the RDF for source coding with causal SI $Y$, where $Y$ is the output of a BSC($\delta$) with input $X$ [Weissman-El Gamal (2006)].

Fix a rate constraint $R_h$ and a distortion pair $(D_1, D_2)$, where $D_2 < D_1 < \delta_1$.
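Since $d_c(\delta)$ is only implicitly defined, evaluating $R_{X,\delta}(\Delta)$ takes a short numerical routine. A minimal sketch (this write-up's code, not from the talk): bisection on the tangency condition, then the piecewise formula.

```python
# Sketch: numerical evaluation of R_{X,delta}(Delta) for the BSS example.
import math

def hb(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def hb_prime(p):
    """Derivative of the binary entropy."""
    return math.log2((1 - p) / p)

def d_c(delta, tol=1e-12):
    """Solve the tangency condition, in the equivalent form
    (1 - H_b(d)) / (delta - d) = H_b'(d), on (0, delta) by bisection."""
    lo, hi = tol, delta - tol
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (1 - hb(mid)) / (delta - mid) < hb_prime(mid):
            lo = mid          # root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rdf_causal(delta, Delta):
    """R_{X,delta}(Delta): causal-SI RDF of a BSS with SI from a BSC(delta)."""
    dc = d_c(delta)
    if Delta <= dc:
        return 1 - hb(Delta)
    return hb_prime(dc) * (delta - Delta)   # straight-line segment

# Example: the first-stage rate R_{X,delta1}(D1) for delta1 = 0.25, D1 = 0.2.
print(d_c(0.25), rdf_causal(0.25, 0.2))
```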


SLIDE 9

Example cont.

Let $\tilde{V} = X \oplus S$, where $S \sim \mathrm{Ber}(\beta_2)$, $\beta_2 \triangleq \min\{\tilde{D}_2, d_c(\delta_2)\}$, is independent of $(X, Z, Y)$ and $\tilde{D}_2 > D_2$ will be defined in the sequel. Let
$$B_1 \sim \mathrm{Ber}\!\left(\frac{\delta_2 - \max\{\tilde{D}_2, d_c(\delta_2)\}}{\delta_2 - d_c(\delta_2)}\right)$$
independently of $(X, Y, Z, S)$. Let $T \sim \mathrm{Ber}(\Pr\{T = 1\})$, independently of $(X, Y, Z, S, B_1)$, be such that
$$\Pr\{T = 1\} * \beta_2 = \min\{D_1, d_c(\delta_1)\} \triangleq \beta_1,$$
where $*$ denotes binary convolution, $a * b = a(1 - b) + b(1 - a)$. With the assumption that
$$\gamma \triangleq \frac{\delta_1 - \max\{D_1, d_c(\delta_1)\}}{\delta_1 - d_c(\delta_1)} \cdot \frac{\delta_2 - d_c(\delta_2)}{\delta_2 - \max\{\tilde{D}_2, d_c(\delta_2)\}} \le 1,$$
let $B_2 \sim \mathrm{Ber}(\gamma)$ independently of $(X, Y, Z, S, B_1, T)$.

[Figure: the test-channel cascade $\tilde{V} = X \oplus S$, $V = \tilde{V} \otimes B_1$, $U = (V \oplus T) \otimes B_2$, so that $U - V - X - (Y, Z)$.]

Figure: Scalable source coding scheme for a BSS.


SLIDE 10

Example cont.

Let $\Theta_h$ be the time fraction during which Decoder 1 (the helper) describes $W$, and define
$$f_1(U, Z) = B_1 B_2\, U + (1 - B_1 B_2)\, Z,$$
$$f_2(U, V, W, Y) = B_1\, V + \Theta_h\, W + (1 - B_1 - \Theta_h)\, Y.$$
With the choice $W = g(U, Z) = Z$, the distortion constraint at Decoder 2 is fulfilled as long as
$$B_1 \beta_2 + \Theta_h \delta_1 + (1 - B_1 - \Theta_h)\, \delta_2 \le D_2, \quad (4)$$
and the distortion constraint at Decoder 1 is met as long as
$$B_1 B_2\, \beta_1 + (1 - B_1 B_2)\, \delta_1 \le D_1.$$
Consequently, for the first stage,
$$I(X; U) = I(X; T \oplus \tilde{V} \mid B_1 B_2 = 1) = R_{X,\delta_1}(D_1).$$
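As a sanity check on the first-stage analysis, the cascade and the gating can be simulated directly. A minimal Monte Carlo sketch (this write-up's, not from the talk): the parameter values are illustrative, $d_c(\delta_1) \approx 0.0805$ is carried over from the previous sketch, and the product of the $B_1$ and $B_2$ parameters collapses, by their definitions on the previous slide, to the ratio `p_gate` below.

```python
# Sketch: Monte Carlo check of B1 B2 beta1 + (1 - B1 B2) delta1 <= D1.
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000
delta1, D1 = 0.25, 0.20           # illustrative values with D1 < delta1
dc1 = 0.0805                      # d_c(delta1), from the previous sketch
beta1 = min(D1, dc1)              # first-stage test-channel parameter
beta2 = 0.05                      # assumed value of beta2 (must be <= beta1)
pT = (beta1 - beta2) / (1 - 2 * beta2)   # Pr{T = 1}: solves pT * beta2 = beta1
p_gate = (delta1 - max(D1, dc1)) / (delta1 - dc1)   # Pr{B1 B2 = 1}

X = rng.integers(0, 2, n)                          # the BSS
Z = X ^ (rng.random(n) < delta1).astype(int)       # BSC(delta1) SI at Decoder 1
S = (rng.random(n) < beta2).astype(int)            # V~ = X + S
T = (rng.random(n) < pT).astype(int)               # U = V + T when the gates pass
gate = rng.random(n) < p_gate                      # the event {B1 = B2 = 1}
U = X ^ S ^ T
X1hat = np.where(gate, U, Z)      # f1(U, Z) = B1 B2 U + (1 - B1 B2) Z
# The empirical average approaches p_gate * beta1 + (1 - p_gate) * delta1 = D1.
print("E d(X, f1) ~=", (X1hat != X).mean(), "  D1 =", D1)
```

With these choices the first-stage constraint holds with equality, so the printed distortion should hover around $D_1 = 0.2$.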


SLIDE 11

Example cont.

Assuming that the helper describes $Z$ using a binary quantizer followed by an entropy encoder, given $R_h$ we obtain $R_h \ge \Theta_h H(Z) = \Theta_h H_b(\delta_1)$. Thus, choosing
$$\Theta_h = \min\{R_h / H_b(\delta_1),\; 1 - B_1\}, \quad (5)$$
this quantity is an upper bound on the time fraction during which Decoder 2 can observe the lossless description of $Z$ in forming its reconstruction. The constraint (4) thus becomes
$$B_1 \beta_2 + (1 - B_1)\, \delta_2 \;\le\; D_2 + \min\{R_h / H_b(\delta_1),\; 1 - B_1\}\, (\delta_2 - \delta_1) \;\triangleq\; \tilde{D}_2, \quad (6)$$
and consequently, for the second stage,
$$I(X; UV) = I(X; V) = I(X; \tilde{V} \mid B_1 = 1) = R_{X,\delta_2}(\tilde{D}_2).$$
Comparing the first- and second-stage rate expressions with the corresponding [Maor-Merhav] expressions, inequality (6) reflects the helper's assistance in terms of the relaxation of $R_2 - R_1$.
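Numerically, the relaxation in (6) is easy to make concrete. A minimal sketch (this write-up's, with illustrative values; $B_1$ stands for $\Pr\{B_1 = 1\}$, as in the slide's own abuse of notation):

```python
# Sketch: the helper's relaxation of the second-stage target, eqs. (5)-(6).
import math

def hb(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

delta1, delta2 = 0.25, 0.40       # SI channel crossover probabilities
D2, Rh, B1 = 0.15, 0.10, 0.5      # illustrative targets and helper rate
theta_h = min(Rh / hb(delta1), 1 - B1)        # helper time share, eq. (5)
D2_tilde = D2 + theta_h * (delta2 - delta1)   # relaxed target, eq. (6)
print("theta_h =", round(theta_h, 4), " D2~ =", round(D2_tilde, 4))
# Since R_{X,delta2}(.) is nonincreasing, R_{X,delta2}(D2~) <= R_{X,delta2}(D2):
# a positive helper rate Rh lowers the second-stage rate R2 - R1.
```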


SLIDE 12

Converse

Assume that $(R_1, R_2, R_h)$ is $(D_1, D_2)$-achievable. Let $T_1 \triangleq \phi_1(X^n)$ and $T_2 \triangleq \phi_2(X^n)$. Then, with $U_j \triangleq (T_1, X^{j-1}, Y^{j-1}, Z^{j-1})$,
$$nR_1 \ge \log M_1 \ge H(T_1) = I(X^n; T_1) = \sum_{k=1}^{n} I(X_k; T_1 \mid X^{k-1}) \overset{(a)}{=} \sum_{k=1}^{n} I(X_k; T_1 X^{k-1})$$
$$\overset{(b)}{=} \sum_{k=1}^{n} I(X_k; T_1 X^{k-1} Y^{k-1} Z^{k-1}) = \sum_{k=1}^{n} I(X_k; U_k),$$
where (a) follows since $X^n$ is memoryless, and (b) follows since $X_k - (T_1, X^{k-1}) - (Y^{k-1}, Z^{k-1})$ forms a Markov chain.


SLIDE 13

Converse cont.

Next, with $V_j \triangleq (T_2, U_j)$,
$$n(R_2 - R_1) \ge \log M_2 \ge H(T_2 \mid T_1) \ge I(X^n; T_2 \mid T_1) = \sum_{k=1}^{n} I(X_k; T_2 \mid T_1 X^{k-1})$$
$$\overset{(c)}{=} \sum_{k=1}^{n} I(X_k; T_2 \mid T_1 X^{k-1} Y^{k-1} Z^{k-1}) = \sum_{k=1}^{n} I(X_k; V_k \mid U_k),$$
where (c) follows since $X_k - (T_1, X^{k-1}) - (Y^{k-1}, Z^{k-1})$ and $X_k - (T_1, T_2, X^{k-1}) - (Y^{k-1}, Z^{k-1})$ are Markov chains. Furthermore, $U_k - V_k - X_k - (Y_k, Z_k)$ forms a Markov chain.

Consequently,
$$nR_2 \ge nR_1 + \sum_{k=1}^{n} I(X_k; V_k \mid U_k) \ge \sum_{k=1}^{n} \bigl[I(X_k; U_k) + I(X_k; V_k \mid U_k)\bigr] = \sum_{k=1}^{n} I(X_k; U_k V_k).$$


SLIDE 14

Converse cont.

With $W_j \triangleq g_j(T_1, Z^j)$ we have
$$nR_h \ge \sum_{k=1}^{n} \log L_k \ge H(W_1, W_2, \ldots, W_n) \ge H(W_1, W_2, \ldots, W_n \mid T_1) = \sum_{k=1}^{n} H(W_k \mid T_1 W^{k-1})$$
$$\ge \sum_{k=1}^{n} H(W_k \mid T_1 W^{k-1} Z^{k-1} X^{k-1} Y^{k-1}) \overset{(d)}{=} \sum_{k=1}^{n} H(W_k \mid T_1 Z^{k-1} X^{k-1} Y^{k-1}) = \sum_{k=1}^{n} H(W_k \mid U_k),$$
where (d) follows since each $W_j$ is a function of $(T_1, Z^j)$, so $W^{k-1}$ is determined by $(T_1, Z^{k-1})$.

Defining $f_1(U_i, Z_i) = \psi_{1,i}(T_1, Z^i)$ we may write
$$nD_1 \ge \sum_{i=1}^{n} E[d(X_i, \hat{X}_{1,i})] = \sum_{i=1}^{n} E[d(X_i, \psi_{1,i}(T_1, Z^i))] = \sum_{i=1}^{n} E[d(X_i, f_1(U_i, Z_i))].$$


SLIDE 15

Converse cont.

Also, we may write
$$nD_2 \ge \sum_{i=1}^{n} E[d(X_i, \hat{X}_{2,i})] = \sum_{i=1}^{n} E[d(X_i, \psi_{2,i}(T_1, T_2, W_1, \ldots, W_i, Y^i))]$$
$$\overset{(e)}{\ge} \sum_{i=1}^{n} E[d(X_i, \psi^*_{2,i}(T_1, T_2, X^{i-1}, Y^{i-1}, Z^{i-1}, W_i, Y_i))] = \sum_{i=1}^{n} E[d(X_i, \psi^*_{2,i}(U_i, V_i, W_i, Y_i))] = n\, E[d(X, f_2(U, V, W, Y))].$$
Here (e) follows since $W_1, \ldots, W_{i-1}$ are deterministic functions of $(T_1, Z^{i-1})$, so that $(T_2, X^{i-1}, Y^{i-1}, W_i, Y_i) - (T_1, Z^{i-1}) - (W_1, \ldots, W_{i-1})$ forms a Markov chain. Hence, given $(U_k, V_k, W_k, Y_k)$, the tuple $(T_1, T_2, W_1, \ldots, W_k, Y^k)$ is conditionally independent of $X_k$. This guarantees the existence of a reconstruction $\hat{X}^*_{2,k}(U_k, V_k, W_k, Y_k)$ which dominates $\hat{X}_{2,k}$ in the sense that
$$E\bigl[d(X_k, \hat{X}^*_{2,k}(U_k, V_k, W_k, Y_k))\bigr] \le E\bigl[d(X_k, \hat{X}_{2,k}(T_1, T_2, W_1, \ldots, W_k, Y_k, Y^{k-1}))\bigr].$$


SLIDE 16

Summary

Scalable source coding with causal SI and a causal helper:
• Characterization of the RD region.
• Example: computation of the RD region for a BSS.
