Scalable Source Coding With Causal Side Information and a Causal Helper

Shraga Bross
Faculty of Engineering, Bar-Ilan University, Israel
brosss@biu.ac.il

ISIT, June 2020
Transmission model

[Block diagram: the encoder sends $\phi_1(X^n)$ to Decoder 1 and both $\phi_1(X^n)$ and $\phi_2(X^n)$ to Decoder 2. Decoder 1 observes $Z_k$ causally, outputs $\hat{X}_{1,1}, \dots, \hat{X}_{1,n}$, and sends conference messages $g_k(\phi_1(X^n), Z^k)$, $k = 1, \dots, n$, to Decoder 2. Decoder 2 observes $Y_k$ causally and outputs $\hat{X}_{2,1}, \dots, \hat{X}_{2,n}$.]

Figure: Scalable source coding with causal side-information and causal helper.
Definition of a code

An $(n, M_1, M_2, \sum_{k=1}^{n} L_k, D_1, D_2)$ scalable code for the source $X$ with causal SI $(Y, Z)$ and causal helper consists of:

1. A first-stage encoder map $\phi_1 : \mathcal{X}^n \to \{1, \dots, M_1\}$ and a sequence $\psi_{1,1}, \dots, \psi_{1,n}$ of reconstruction mappings
$$\psi_{1,k} : \{1, \dots, M_1\} \times \mathcal{Z}^k \to \hat{\mathcal{X}}, \qquad k = 1, \dots, n,$$
such that, with $E$ denoting the expectation operator,
$$E\, d\big[X^n, \big(\psi_{1,1}(\phi_1(X^n), Z^1), \dots, \psi_{1,k}(\phi_1(X^n), Z^k), \dots, \psi_{1,n}(\phi_1(X^n), Z^n)\big)\big] \le D_1. \qquad (1)$$

2. A unidirectional conference between Decoder 1 and Decoder 2 consisting of a sequence of mappings
$$g_k : \{1, \dots, M_1\} \times \mathcal{Z}^k \to \{1, \dots, L_k\}, \qquad k = 1, \dots, n.$$

3. A second-stage encoder map $\phi_2 : \mathcal{X}^n \to \{1, \dots, M_2\}$ and a sequence $\psi_{2,1}, \dots, \psi_{2,n}$ of reconstruction mappings
$$\psi_{2,k} : \{1, \dots, M_1\} \times \{1, \dots, M_2\} \times \{1, \dots, L_1\} \times \cdots \times \{1, \dots, L_k\} \times \mathcal{Y}^k \to \hat{\mathcal{X}}$$
Definition of achievable rate-distortion tuple

such that
$$E\, d\big[X^n, \big(\psi_{2,1}(\phi_1(X^n), \phi_2(X^n), g_1(\phi_1(X^n), Z^1), Y^1), \dots, \psi_{2,k}(\phi_1(X^n), \phi_2(X^n), \{g_j(\phi_1(X^n), Z^j)\}_{j=1}^{k}, Y^k), \dots, \psi_{2,n}(\phi_1(X^n), \phi_2(X^n), \{g_j(\phi_1(X^n), Z^j)\}_{j=1}^{n}, Y^n)\big)\big] \le D_2. \qquad (2)$$

The rate tuple $(R_1, R_2, R_h)$ of the scalable code is
$$R_1 = \frac{1}{n} \log M_1, \qquad R_2 = R_1 + \frac{1}{n} \log M_2, \qquad R_h = \frac{1}{n} \sum_{k=1}^{n} \log L_k.$$

A $D$-achievable tuple $(R_1, R_2, R_h)$ is defined in the usual way. The collection of all $D$-achievable rate tuples is the achievable scalable source-coding region $\mathcal{R}(D)$.
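To make the causal structure of the code concrete, here is a minimal Python sketch (not part of the paper): at each time $k$, Decoder 1 and the conference map act on $(T_1, Z^k)$, while Decoder 2 acts on $(T_1, T_2, W^k, Y^k)$. All mappings below are toy placeholders chosen only to exercise the interfaces and compute the rate tuple.

```python
# Sketch of the causal protocol in (1)-(2); the encoder/decoder maps here
# are arbitrary illustrative stand-ins, not the scheme from the paper.
import math
import random

n, M1, M2 = 8, 4, 4
L = [2] * n                          # conference alphabet sizes L_1,...,L_n

x = [random.randint(0, 1) for _ in range(n)]      # source block X^n
z = [xi ^ (random.random() < 0.1) for xi in x]    # causal SI at Decoder 1
y = [xi ^ (random.random() < 0.3) for xi in x]    # causal SI at Decoder 2

phi1 = lambda xn: sum(xn[:2]) % M1   # toy first-stage encoder
phi2 = lambda xn: sum(xn[2:4]) % M2  # toy second-stage encoder

t1, t2 = phi1(x), phi2(x)
xhat1, msgs, xhat2 = [], [], []
for k in range(n):                   # causal operation, time k = 1,...,n
    zk, yk = z[:k + 1], y[:k + 1]    # Decoder 1 has seen Z^k, Decoder 2 Y^k
    xhat1.append(zk[-1] if t1 % 2 else 0)         # psi_{1,k}(T_1, Z^k)
    msgs.append(zk[-1])                           # g_k(T_1, Z^k), in {0,...,L_k-1}
    xhat2.append(msgs[-1] if t2 % 2 else yk[-1])  # psi_{2,k}(T_1, T_2, W^k, Y^k)

R1 = math.log2(M1) / n
R2 = R1 + math.log2(M2) / n
Rh = sum(math.log2(Lk) for Lk in L) / n
print(R1, R2, Rh)
```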
Related work

With non-causal SI at the decoders:
- "On successive refinement for the Wyner-Ziv problem" [Steinberg-Merhav (2004)]: characterization of $\mathcal{R}(D)$ when $X$ −◦ $Y$ −◦ $Z$.
- "Side-information scalable source coding" [Tian-Diggavi (2008)]: inner and outer bounds on $\mathcal{R}(D)$ when $X$ −◦ $Z$ −◦ $Y$.
- "On successive refinement for the Wyner-Ziv problem with partially cooperating decoders" [Bross-Weissman (2008)]: conclusive characterization of the encoder rates, but a gap in the helper's rate.

With causal SI at the decoders:
- "On successive refinement with causal side information at the decoders" [Maor-Merhav (2008)]: characterization of $\mathcal{R}(D)$ regardless of the relative SI quality at the decoders.

This work extends the last in the sense that Decoder 1 can communicate with Decoder 2, via a conference channel, at a rate not exceeding $R_h$.
Definition of $\mathcal{R}^*(D)$

Fix a pair $D = (D_1, D_2)$. Define $\mathcal{R}^*(D)$ to be the set of all $(R_1, R_2, R_h)$ for which there exist random variables $(U, V, W)$ taking values in finite alphabets $\mathcal{U}, \mathcal{V}, \mathcal{W}$, respectively, such that:

1. $(U, V)$ −◦ $X$ −◦ $(Y, Z)$ forms a Markov chain.

2. There exist deterministic maps
$$f_1 : \mathcal{U} \times \mathcal{Z} \to \hat{\mathcal{X}}, \qquad g : \mathcal{U} \times \mathcal{Z} \to \mathcal{W}, \qquad f_2 : \mathcal{U} \times \mathcal{V} \times \mathcal{W} \times \mathcal{Y} \to \hat{\mathcal{X}}$$
such that, with $W \triangleq g(U, Z)$,
$$E\, d(X, f_1(U, Z)) \le D_1, \qquad E\, d(X, f_2(U, V, W, Y)) \le D_2.$$

3. The rates $R_1$, $R_2$ and $R_h$ satisfy
$$R_1 \ge I(X; U) \qquad (3a)$$
$$R_2 \ge I(X; UV) \qquad (3b)$$
$$R_h \ge H(W \mid U). \qquad (3c)$$
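For intuition, the single-letter bounds (3a)-(3c) can be evaluated numerically for any admissible choice of test channel. The sketch below (my own, with an arbitrary illustrative pmf $p(u,v \mid x)$, SI channel $p(z \mid x)$, and helper map $g$, none of which come from the paper) does this by brute-force marginalization.

```python
# Evaluate I(X;U), I(X;UV), H(W|U) for one admissible (U,V) -o- X -o- Z
# choice with W = g(U, Z); all pmfs below are illustrative assumptions.
import itertools
from math import log2
from collections import defaultdict

px = {0: 0.5, 1: 0.5}                           # BSS source
puv_x = lambda u, v, x: (0.9 if u == x else 0.1) * (1.0 if v == u else 0.0)
pz_x = lambda z, x: 0.89 if z == x else 0.11    # BSC(0.11) side information
g = lambda u, z: u ^ z                          # hypothetical helper map

p = defaultdict(float)                          # joint pmf p(x,u,v,z,w)
for x, u, v, z in itertools.product((0, 1), repeat=4):
    p[(x, u, v, z, g(u, z))] += px[x] * puv_x(u, v, x) * pz_x(z, x)

def H(idx):                                     # entropy of marginal on idx
    m = defaultdict(float)
    for key, pr in p.items():
        m[tuple(key[i] for i in idx)] += pr
    return -sum(q * log2(q) for q in m.values() if q > 0)

X, U, V, Z, W = 0, 1, 2, 3, 4
print("R1 >= I(X;U)  =", H([X]) + H([U]) - H([X, U]))
print("R2 >= I(X;UV) =", H([X]) + H([U, V]) - H([X, U, V]))
print("Rh >= H(W|U)  =", H([U, W]) - H([U]))
```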
Main result

Theorem: $\mathcal{R}(D) = \mathcal{R}^*(D)$.

Remarks:
- The converse shows that the Markov chain $(U, V)$ −◦ $X$ −◦ $(Y, Z)$ in the definition of $\mathcal{R}^*(D)$ can be replaced by the Markov chain $U$ −◦ $V$ −◦ $X$ −◦ $(Y, Z)$.
- The converse holds as well when the causal helper and the reconstructors are allowed to depend also on $X^{k-1}$. That is, $g_k = g_k(\phi_1(X^n), Z^k, X^{k-1})$, $\hat{X}_{1,k} = \psi_{1,k}(\phi_1(X^n), Z^k, X^{k-1})$, and $\hat{X}_{2,k} = \psi_{2,k}(\phi_1(X^n), \phi_2(X^n), \{g_j(\cdot)\}_{j=1}^{k}, Y^k, X^{k-1})$.
Example

Let $X$ be a BSS and let $d(\cdot, \cdot)$ be the Hamming loss. The SI pair $(Z, Y)$ is conditionally independent given $X$, where $Z$ is the output of a BSC($\delta_1$) with input $X$, and $Y$ is the output of a BSC($\delta_2$) with input $X$, where $0 < \delta_1 < \delta_2 < 0.5$. Define
$$R_{X,\delta}(\Delta) = \begin{cases} 1 - H_b(\Delta) & 0 \le \Delta \le d_c(\delta) \\ H_b'(d_c(\delta))(\delta - \Delta) & d_c(\delta) < \Delta \le \delta \end{cases}$$
where $d_c(\delta)$ is the solution to the equation $(1 - H_b(d_c))/(d_c - \delta) = -H_b'(d_c)$, with $0 < d_c < \delta < 0.5$.

$R_{X,\delta}(\Delta)$ is the RDF for source coding with causal SI $Y$ ($Y$ is the output of a BSC($\delta$) with input $X$) [Weissman-El Gamal (2006)].

Fix a rate constraint $R_h$ and a distortion pair $(D_1, D_2)$, where $D_2 < D_1 < \delta_1$.
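The critical distortion $d_c(\delta)$ has no closed form, but the defining equation is easy to solve numerically. A small sketch (mine; the bisection bracket and tolerance are implementation choices, not from the slides):

```python
# d_c(delta) solves 1 - Hb(d) = Hb'(d) * (delta - d) on (0, delta);
# R then follows the entropy curve up to d_c and its tangent line after.
from math import log2

def Hb(p):                       # binary entropy, in bits
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def dHb(p):                      # derivative H_b'(p) = log2((1-p)/p)
    return log2((1 - p) / p)

def d_c(delta, tol=1e-12):
    lo, hi = 1e-9, delta - 1e-9  # F < 0 near 0, F > 0 near delta: bisect
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 1 - Hb(mid) - dHb(mid) * (delta - mid) < 0:
            lo = mid
        else:
            hi = mid
    return lo

def R(delta, Delta):             # R_{X,delta}(Delta)
    dc = d_c(delta)
    return 1 - Hb(Delta) if Delta <= dc else dHb(dc) * (delta - Delta)

print(d_c(0.25))                 # critical distortion for delta = 0.25
print(R(0.25, 0.05), R(0.25, 0.2))
```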
Example cont.

Let $\tilde{V} = X \oplus S$, where $S \sim \mathrm{Ber}(\beta_2)$ with $\beta_2 \triangleq \min\{\tilde{D}_2, d_c(\delta_2)\}$ is independent of $(X, Z, Y)$, and $\tilde{D}_2 > D_2$ will be defined in the sequel. Let
$$B_1 \sim \mathrm{Ber}\!\left(\frac{\delta_2 - \max\{\tilde{D}_2, d_c(\delta_2)\}}{\delta_2 - d_c(\delta_2)}\right),$$
independently of $(X, Y, Z, S)$. Let $T \sim \mathrm{Ber}(\Pr\{T = 1\})$, independently of $(X, Y, Z, S, B_1)$, such that $\Pr\{T = 1\} * \beta_2 = \min\{D_1, d_c(\delta_1)\} \triangleq \beta_1$, where $*$ denotes binary convolution, $a * b = a(1 - b) + b(1 - a)$. Under the assumption that
$$\gamma \triangleq \frac{\delta_1 - \max\{D_1, d_c(\delta_1)\}}{\delta_1 - d_c(\delta_1)} \cdot \frac{\delta_2 - d_c(\delta_2)}{\delta_2 - \max\{\tilde{D}_2, d_c(\delta_2)\}} \le 1,$$
let $B_2 \sim \mathrm{Ber}(\gamma)$, independently of $(X, Y, Z, S, B_1, T)$. Then $U$ −◦ $V$ −◦ $X$ −◦ $(Y, Z)$.

[Chain diagram: $X \to \oplus\,(S) \to \tilde{V} \to \otimes\,(B_1) \to V \to \otimes\,(B_2) \to \oplus\,(T) \to U$.]

Figure: Scalable source coding scheme for a BSS.
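As a sanity check, the chain in the figure can be simulated. The sketch below is my own reading of the diagram, with made-up numbers standing in for $\beta_1$, $\beta_2$ and the Bernoulli parameters of $B_1$, $B_2$; it verifies that on the fraction where $B_1 = B_2 = 1$, the error probability of $U$ against $X$ is $\beta_1 = \Pr\{T = 1\} * \beta_2$.

```python
# Monte Carlo check of X -> (+S) -> Vtilde -> (xB1) -> V -> (xB2) -> (+T) -> U.
# Inverting pT * beta2 = beta1 (binary convolution) gives pT below.
import random

beta2, beta1 = 0.05, 0.11            # illustrative values only
pT = (beta1 - beta2) / (1 - 2 * beta2)
pB1, pB2 = 0.7, 0.8                  # stand-ins for the Ber parameters above

N, err, cnt = 200_000, 0, 0
for _ in range(N):
    x = random.random() < 0.5                    # BSS
    s = random.random() < beta2
    vt = x ^ s                                   # Vtilde = X + S
    b1, b2 = random.random() < pB1, random.random() < pB2
    v = vt & b1                                  # gate by B1
    t = random.random() < pT
    u = (v & b2) ^ t                             # gate by B2, then add T
    if b1 and b2:                                # on the active fraction,
        cnt += 1
        err += (u != x)                          # U = X + S + T ~ Ber(beta1)
print(err / cnt, "should be close to", beta1)
```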
Example cont.

Let $\Theta_h$ be the time fraction during which Decoder 1 (the helper) describes $W$, and define
$$f_1(U, Z) = B_1 B_2 \cdot U + (1 - B_1 B_2) \cdot Z$$
$$f_2(U, V, W, Y) = B_1 \cdot V + \Theta_h \cdot W + (1 - B_1 - \Theta_h) \cdot Y.$$
With the choice $W = g(U, Z) = Z$, the distortion constraint at Decoder 2 is fulfilled as long as
$$B_1 \beta_2 + \Theta_h \delta_1 + (1 - B_1 - \Theta_h) \delta_2 \le D_2, \qquad (4)$$
and the distortion constraint at Decoder 1 is met as long as
$$B_1 B_2 \beta_1 + (1 - B_1 B_2) \delta_1 \le D_1.$$
Consequently, for the first stage
$$I(X; U) = I(X; T \oplus \tilde{V} \mid B_1 B_2 = 1) = R_{X,\delta_1}(D_1).$$
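Plugging in numbers makes the two distortion constraints easy to check. Below is a tiny sketch (values illustrative only, chosen by me, with $B_1$, $B_2$ read as their Bernoulli parameters, i.e. time-sharing fractions):

```python
# Check the expected-distortion constraints at both decoders for one
# illustrative parameter choice (not the optimizing one).
delta1, delta2 = 0.15, 0.3
beta1, beta2 = 0.08, 0.05
B1, B2, Theta_h = 0.8, 0.9, 0.1
D1, D2 = 0.1, 0.11

dec2 = B1 * beta2 + Theta_h * delta1 + (1 - B1 - Theta_h) * delta2  # (4)
dec1 = B1 * B2 * beta1 + (1 - B1 * B2) * delta1
print(dec2 <= D2, dec1 <= D1)   # both constraints hold for these values
```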
Example cont.

Assume that the helper describes $Z$ using a binary quantizer followed by an entropy encoder. Given $R_h$, we obtain
$$R_h \ge \Theta_h H(Z) = \Theta_h H_b(\delta_1).$$
Thus we choose
$$\Theta_h = \min\{R_h / H_b(\delta_1),\ 1 - B_1\}, \qquad (5)$$
which upper-bounds the time fraction during which Decoder 2 can observe the lossless description of $Z$ in forming its reconstruction. The constraint (4) then becomes
$$B_1 \beta_2 + (1 - B_1) \delta_2 \le D_2 + \min\{R_h / H_b(\delta_1),\ 1 - B_1\}(\delta_2 - \delta_1) \triangleq \tilde{D}_2, \qquad (6)$$
and consequently, for the second stage,
$$I(X; UV) = I(X; V) = I(X; \tilde{V} \mid B_1 = 1) = R_{X,\delta_2}(\tilde{D}_2).$$
Comparing the first- and second-stage rate expressions with the [Maor-Merhav] expressions, inequality (6) reflects the helper's assistance in terms of the relaxation of $R_2 - R_1$.
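The effect of $R_h$ in (5)-(6) is easiest to see numerically: $\Theta_h$ grows with $R_h$ until it saturates at $1 - B_1$, after which the relaxed target $\tilde{D}_2$ stops improving. A sketch with illustrative parameters of my choosing:

```python
# Sweep R_h and print Theta_h from (5) and the relaxed target D2~ from (6),
# reading B1 as a time fraction; parameter values are illustrative only.
from math import log2

def Hb(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

delta1, delta2, D2, B1 = 0.15, 0.3, 0.11, 0.8
for Rh in (0.0, 0.05, 0.122, 0.5):
    theta = min(Rh / Hb(delta1), 1 - B1)   # (5): saturates at 1 - B1
    D2t = D2 + theta * (delta2 - delta1)   # (6): relaxed distortion target
    print(Rh, round(theta, 3), round(D2t, 4))
```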
Converse

Assume that $(R_1, R_2, R_h)$ is $(D_1, D_2)$-achievable. Let $T_1 \triangleq \phi_1(X^n)$, $T_2 \triangleq \phi_2(X^n)$. Then, with $U_j \triangleq (T_1, X^{j-1}, Y^{j-1}, Z^{j-1})$,
$$nR_1 \ge \log M_1 \ge H(T_1) = I(X^n; T_1) = \sum_{k=1}^{n} I(X_k; T_1 \mid X^{k-1}) \stackrel{(a)}{=} \sum_{k=1}^{n} I(X_k; T_1 X^{k-1}) \stackrel{(b)}{=} \sum_{k=1}^{n} I(X_k; T_1 X^{k-1} Y^{k-1} Z^{k-1}) = \sum_{k=1}^{n} I(X_k; U_k).$$
$(a)$ follows since $X^n$ is memoryless; $(b)$ follows since $X_k$ −◦ $(T_1, X^{k-1})$ −◦ $(Y^{k-1}, Z^{k-1})$ forms a Markov chain.
Converse cont.

Next, with $V_j \triangleq (T_2, U_j)$,
$$n(R_2 - R_1) \ge \log M_2 \ge H(T_2 \mid T_1) \ge I(X^n; T_2 \mid T_1) = \sum_{k=1}^{n} I(X_k; T_2 \mid T_1 X^{k-1}) \stackrel{(c)}{=} \sum_{k=1}^{n} I(X_k; T_2 \mid T_1 X^{k-1} Y^{k-1} Z^{k-1}) = \sum_{k=1}^{n} I(X_k; V_k \mid U_k).$$
$(c)$ follows since $X_k$ −◦ $(T_1, X^{k-1})$ −◦ $(Y^{k-1}, Z^{k-1})$ and $X_k$ −◦ $(T_1, T_2, X^{k-1})$ −◦ $(Y^{k-1}, Z^{k-1})$ are Markov chains. Furthermore, $U_k$ −◦ $V_k$ −◦ $X_k$ −◦ $(Y_k, Z_k)$. Consequently,
$$nR_2 \ge nR_1 + \sum_{k=1}^{n} I(X_k; V_k \mid U_k) \ge \sum_{k=1}^{n} \left[ I(X_k; U_k) + I(X_k; V_k \mid U_k) \right] = \sum_{k=1}^{n} I(X_k; U_k V_k).$$
Converse cont.

With $W_j \triangleq g_j(T_1, Z^j)$ we have
$$nR_h \ge \sum_{k=1}^{n} \log L_k \ge H(W_1, W_2, \dots, W_n) \ge H(W_1, W_2, \dots, W_n \mid T_1) = \sum_{k=1}^{n} H(W_k \mid T_1 W^{k-1}) \ge \sum_{k=1}^{n} H(W_k \mid T_1 W^{k-1} Z^{k-1} X^{k-1} Y^{k-1}) \stackrel{(d)}{=} \sum_{k=1}^{n} H(W_k \mid T_1 Z^{k-1} X^{k-1} Y^{k-1}) = \sum_{k=1}^{n} H(W_k \mid U_k),$$
where $(d)$ follows since $W_j$ is a function of $(T_1, Z^j)$, so that $W^{k-1}$ is determined by $(T_1, Z^{k-1})$. Defining $f_1(U_i, Z_i) = \psi_{1,i}(T_1, Z^i)$ we may write
$$nD_1 \ge \sum_{i=1}^{n} E[d(X_i, \hat{X}_{1,i})] = \sum_{i=1}^{n} E[d(X_i, \psi_{1,i}(T_1, Z^i))] = \sum_{i=1}^{n} E[d(X_i, f_1(U_i, Z_i))].$$