Coding Theorems for Reversible Embedding
Frans Willems and Ton Kalker
T.U. Eindhoven, Philips Research
DIMACS, March 16-19, 2003
Outline
1. Gelfand-Pinsker coding theorem
2. Noise-free embedding
3. Reversible embedding
4. Robust and reversible embedding
5. Partially reversible embedding
6. Remarks
I. The Gelfand-Pinsker Coding Theorem

[Block diagram: the encoder maps the message and the side information to the channel input, $Y^N = e(W, X^N)$; the channel $P_c(z|y,x)$ produces $Z^N$; the decoder forms $\hat{W} = d(Z^N)$. The i.i.d. side-information sequence $X^N \sim P_s$ is known to the encoder but not to the decoder, and also acts on the channel.]

Messages: $\Pr\{W = w\} = 1/M$ for $w \in \{1, 2, \ldots, M\}$.
Side information: $\Pr\{X^N = x^N\} = \prod_{n=1}^{N} P_s(x_n)$ for $x^N \in \mathcal{X}^N$.
Channel: discrete memoryless $\{\mathcal{Y} \times \mathcal{X}, P_c(z|y,x), \mathcal{Z}\}$.
Error probability: $P_E = \Pr\{\hat{W} \neq W\}$.
Rate: $R = \frac{1}{N}\log_2(M)$.
Capacity

The side-information capacity $C_{si}$ is the largest $\rho$ such that, for all $\epsilon > 0$ and all large enough $N$, there exist encoders and decoders with $R \geq \rho - \epsilon$ and $P_E \leq \epsilon$.

THEOREM (Gelfand-Pinsker [1980]):
$$C_{si} = \max_{P_t(u,y|x)} \; I(U;Z) - I(U;X). \qquad (1)$$

Achievability proof: Fix a test channel $P_t(u,y|x)$. Consider sets $A_\epsilon(\cdot)$ of strongly typical sequences, etc.
(a) For each message index $w \in \{1, \ldots, 2^{NR}\}$, generate $2^{N R_u}$ sequences $u^N$ at random according to $P(u) = \sum_{x,y} P_s(x) P_t(u,y|x)$. Give these sequences the label $w$.
(b) When message index $w$ has to be transmitted, choose a sequence $u^N$ with label $w$ such that $(u^N, x^N) \in A_\epsilon(U,X)$. Such a sequence exists almost always if $R_u > I(U;X)$ (roughly).
(c) The channel input sequence $y^N$ results from applying the "channel" $P(y|u,x) = P_t(u,y|x) / \sum_{y'} P_t(u,y'|x)$ to $u^N$ and $x^N$. Then $y^N$ is transmitted.
(d) The decoder, upon receiving $z^N$, looks for the unique sequence $u^N$ such that $(u^N, z^N) \in A_\epsilon(U,Z)$. If $R + R_u < I(U;Z)$ (roughly), such a unique sequence exists. The message index is the label of this $u^N$.

Conclusion: every $R < I(U;Z) - I(U;X)$ is achievable.

Observations
A: As an intermediate result the decoder recovers the sequence $u^N$.
B: The transmitted $u^N$ is jointly typical with the side-information sequence $x^N$, i.e. $(u^N, x^N) \in A_\epsilon(U,X)$, thus their joint composition is OK. Note that $P(u,x) = \sum_y P_s(x) P_t(u,y|x)$.
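As an illustration (not part of the original slides), here is a minimal Python sketch that evaluates the Gelfand-Pinsker rate $I(U;Z) - I(U;X)$ for a single fixed test channel on small finite alphabets; maximizing this quantity over all $P_t(u,y|x)$ would give $C_{si}$ of Eq. (1). The array layouts, function names, and the toy binary example are assumptions made for the sketch.

```python
import numpy as np

def mutual_information(p_ab):
    """I(A;B) in bits, computed from a joint probability matrix p_ab[a, b]."""
    pa = p_ab.sum(axis=1, keepdims=True)
    pb = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return float(np.sum(p_ab[mask] * np.log2(p_ab[mask] / (pa @ pb)[mask])))

def gp_rate(P_s, P_t, P_c):
    """Gelfand-Pinsker rate I(U;Z) - I(U;X) for one fixed test channel.

    P_s[x]       : side-information distribution
    P_t[u, y, x] : test channel P_t(u, y | x)
    P_c[z, y, x] : channel P_c(z | y, x)
    """
    U, Y, X = P_t.shape
    Z = P_c.shape[0]
    # Joint distribution P(u, x, y, z) = P_s(x) P_t(u, y | x) P_c(z | y, x).
    joint = np.zeros((U, X, Y, Z))
    for u in range(U):
        for x in range(X):
            for y in range(Y):
                for z in range(Z):
                    joint[u, x, y, z] = P_s[x] * P_t[u, y, x] * P_c[z, y, x]
    p_uz = joint.sum(axis=(1, 2))   # marginal P(u, z)
    p_ux = joint.sum(axis=(2, 3))   # marginal P(u, x)
    return mutual_information(p_uz) - mutual_information(p_ux)

# Hypothetical toy check: binary alphabets, noiseless channel Z = Y, and U = Y.
P_s = np.array([0.8, 0.2])
d0, d1 = 0.1, 0.0
P_t = np.zeros((2, 2, 2))           # P_t[u, y, x], with U glued to Y
P_t[0, 0, 0], P_t[1, 1, 0] = 1 - d0, d0
P_t[1, 1, 1], P_t[0, 0, 1] = 1 - d1, d1
P_c = np.zeros((2, 2, 2))
for y in range(2):
    P_c[y, y, :] = 1.0              # Z = Y with probability 1
print(gp_rate(P_s, P_t, P_c))       # equals H(Y|X), approx. 0.375 bit here
```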
II. Noise-free Embedding

[Block diagram: the embedder maps the message and the host sequence to $Y^N = e(W, X^N)$; the decoder observes $Y^N$ directly (no channel) and forms $\hat{W} = d(Y^N)$.]

Messages: $\Pr\{W = w\} = 1/M$ for $w \in \{1, 2, \ldots, M\}$.
Source (host): $\Pr\{X^N = x^N\} = \prod_{n=1}^{N} P_s(x_n)$ for $x^N \in \mathcal{X}^N$.
Error probability: $P_E = \Pr\{\hat{W} \neq W\}$.
Rate: $R = \frac{1}{N}\log_2(M)$.
Embedding distortion: $D_{xy} = E\left[\frac{1}{N}\sum_{n=1}^{N} D_{xy}(X_n, e_n(W, X^N))\right]$ for some distortion matrix $\{D_{xy}(x,y), x \in \mathcal{X}, y \in \mathcal{Y}\}$.
Achievable region for noise-free embedding

A rate-distortion pair $(\rho, \Delta_{xy})$ is said to be achievable if for all $\epsilon > 0$ there exist, for all large enough $N$, encoders and decoders such that
$$R \geq \rho - \epsilon, \quad D_{xy} \leq \Delta_{xy} + \epsilon, \quad P_E \leq \epsilon.$$

THEOREM (Chen [2000], Barron [2000]): The set of achievable rate-distortion pairs is equal to $\mathcal{G}_{nfe}$, defined as
$$\mathcal{G}_{nfe} = \Big\{(\rho, \Delta_{xy}) : 0 \leq \rho \leq H(Y|X), \;\; \Delta_{xy} \geq \sum_{x,y} P(x,y) D_{xy}(x,y), \text{ for } P(x,y) = P_s(x) P_t(y|x)\Big\}. \qquad (2)$$

Again $\{\mathcal{X}, P_t(y|x), \mathcal{Y}\}$ is called the test channel.
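The following short Python sketch (my own illustration, not from the slides) evaluates one point of $\mathcal{G}_{nfe}$: for a chosen test channel $P_t(y|x)$ it returns the maximal rate $H(Y|X)$ and the associated expected distortion. The binary host with $p_x = 0.2$ and the Hamming distortion matrix are hypothetical example values.

```python
import numpy as np

def nfe_point(P_s, P_t, D):
    """One point (rate, distortion) of G_nfe for a given test channel.

    P_s[x]    : host distribution
    P_t[x, y] : test channel P_t(y | x)
    D[x, y]   : distortion matrix D_xy(x, y)
    """
    # Rate: H(Y|X) = - sum_{x,y} P_s(x) P_t(y|x) log2 P_t(y|x).
    P_xy = P_s[:, None] * P_t
    mask = P_xy > 0
    rate = float(-np.sum(P_xy[mask] * np.log2(P_t[mask])))
    # Distortion: sum_{x,y} P(x, y) D_xy(x, y).
    dist = float(np.sum(P_xy * D))
    return rate, dist

# Hypothetical binary example with Hamming distortion (p_x = 0.2).
P_s = np.array([0.8, 0.2])
P_t = np.array([[0.9, 0.1],
                [0.0, 1.0]])
D = np.array([[0, 1],
              [1, 0]])
print(nfe_point(P_s, P_t, D))   # rate approx. 0.375 bit, distortion 0.08
```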
Proof:

Achievability: In the Gelfand-Pinsker achievability proof, note that $Z = Y$ (noiseless channel) and take the auxiliary random variable $U = Y$. Then $(x^N, y^N) \in A_\epsilon(X,Y)$, hence $D_{xy}$ is OK. For the embedding rate we obtain
$$R = I(U;Z) - I(U;X) = I(Y;Y) - I(Y;X) = H(Y|X).$$

Converse, rate part:
$$\begin{aligned}
\log_2(M) &\leq H(W) - H(W|\hat{W}) + \text{Fano term} \\
&\leq H(W|X^N) - H(W|X^N, Y^N) + \text{Fano term} \\
&= I(W; Y^N | X^N) + \text{Fano term} \\
&= H(Y^N | X^N) + \text{Fano term} \\
&\leq \sum_{n=1}^{N} H(Y_n | X_n) + \text{Fano term} \\
&\leq N H(Y|X) + \text{Fano term},
\end{aligned}$$
where $X$ and $Y$ are random variables with
$$\Pr\{(X,Y) = (x,y)\} = \frac{1}{N} \sum_{n=1}^{N} \Pr\{(X_n, Y_n) = (x,y)\},$$
for $x \in \mathcal{X}$ and $y \in \mathcal{Y}$. Note that $\Pr\{X = x\} = P_s(x)$ for $x \in \mathcal{X}$.

Distortion part:
$$D_{xy} = \sum_{x^N, y^N} \Pr\{(X^N, Y^N) = (x^N, y^N)\} \, \frac{1}{N} \sum_{n} D_{xy}(x_n, y_n) = \sum_{x,y} \Pr\{(X,Y) = (x,y)\} D_{xy}(x,y).$$

Let $P_E \downarrow 0$, etc.
III. Reversible Embedding

[Block diagram: the embedder maps the message and the host sequence to $Y^N = e(W, X^N)$; the decoder observes $Y^N$ and forms $(\hat{W}, \hat{X}^N) = d(Y^N)$, i.e. it must reconstruct the host as well.]

Messages: $\Pr\{W = w\} = 1/M$ for $w \in \{1, 2, \ldots, M\}$.
Source (host): $\Pr\{X^N = x^N\} = \prod_{n=1}^{N} P_s(x_n)$ for $x^N \in \mathcal{X}^N$.
Error probability: $P_E = \Pr\{\hat{W} \neq W \vee \hat{X}^N \neq X^N\}$.
Rate: $R = \frac{1}{N}\log_2(M)$.
Embedding distortion: $D_{xy} = E\left[\frac{1}{N}\sum_{n=1}^{N} D_{xy}(X_n, e_n(W, X^N))\right]$ for some distortion matrix $\{D_{xy}(x,y), x \in \mathcal{X}, y \in \mathcal{Y}\}$.

Inspired by Fridrich, Goljan, and Du, "Lossless data embedding for all image formats," Proc. SPIE, Security and Watermarking of Multimedia Contents, San Jose, CA, 2002.
Achievable region for reversible embedding

A rate-distortion pair $(\rho, \Delta_{xy})$ is said to be achievable if for all $\epsilon > 0$ there exist, for all large enough $N$, encoders and decoders such that
$$R \geq \rho - \epsilon, \quad D_{xy} \leq \Delta_{xy} + \epsilon, \quad P_E \leq \epsilon.$$

RESULT (Kalker-Willems [2002]): The set of achievable rate-distortion pairs is equal to $\mathcal{G}_{re}$, defined as
$$\mathcal{G}_{re} = \Big\{(\rho, \Delta_{xy}) : 0 \leq \rho \leq H(Y) - H(X), \;\; \Delta_{xy} \geq \sum_{x,y} P(x,y) D_{xy}(x,y), \text{ for } P(x,y) = P_s(x) P_t(y|x)\Big\}. \qquad (3)$$

Note that $\{\mathcal{X}, P_t(y|x), \mathcal{Y}\}$ is the test channel.
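A companion Python sketch (again my own illustration, reusing the same hypothetical binary test channel as in the noise-free case) evaluates one point of $\mathcal{G}_{re}$. The reversible rate $H(Y) - H(X)$ is smaller than the noise-free rate $H(Y|X)$ by exactly $H(X|Y)$, the price paid for host recovery.

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def re_point(P_s, P_t, D):
    """One point (rate, distortion) of G_re: rate = H(Y) - H(X)."""
    P_xy = P_s[:, None] * P_t          # joint P(x, y), with P_t[x, y] = P_t(y | x)
    rate = entropy(P_xy.sum(axis=0)) - entropy(P_s)
    dist = float(np.sum(P_xy * D))
    return rate, dist

# Same hypothetical binary test channel as in the noise-free sketch.
P_s = np.array([0.8, 0.2])
P_t = np.array([[0.9, 0.1],
                [0.0, 1.0]])
D = np.array([[0, 1],
              [1, 0]])
print(re_point(P_s, P_t, D))   # rate = h(0.28) - h(0.2), approx. 0.134; distortion 0.08
```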
Proof:

Achievability: In the Gelfand-Pinsker achievability proof, note that $Z = Y$ (noiseless channel) and take the auxiliary random variable $U = [X, Y]$. Then $x^N$ can be reconstructed by the decoder, and $(x^N, y^N) \in A_\epsilon(X,Y)$, hence $D_{xy}$ is OK. For the embedding rate we obtain
$$R = I(U;Z) - I(U;X) = I([X,Y];Y) - I([X,Y];X) = H(Y) - H(X).$$

Converse, rate part:
$$\begin{aligned}
\log_2(M) &\leq H(W) - H(W, X^N | \hat{W}, \hat{X}^N) + \text{Fano term} \\
&= H(W, X^N) - H(W, X^N | \hat{W}, \hat{X}^N) - H(X^N) + \text{Fano term} \\
&\leq H(W, X^N) - H(W, X^N | Y^N, \hat{W}, \hat{X}^N) - H(X^N) + \text{Fano term} \\
&= I(W, X^N; Y^N) - H(X^N) + \text{Fano term} \\
&= H(Y^N) - H(X^N) + \text{Fano term} \\
&\leq \sum_{n=1}^{N} [H(Y_n) - H(X_n)] + \text{Fano term} \\
&\leq N[H(Y) - H(X)] + \text{Fano term},
\end{aligned}$$
where $X$ and $Y$ are random variables with
$$\Pr\{(X,Y) = (x,y)\} = \frac{1}{N} \sum_{n=1}^{N} \Pr\{(X_n, Y_n) = (x,y)\},$$
for $x \in \mathcal{X}$ and $y \in \mathcal{Y}$. Note that $\Pr\{X = x\} = P_s(x)$ for $x \in \mathcal{X}$.

Distortion part:
$$D_{xy} = \sum_{x^N, y^N} \Pr\{(X^N, Y^N) = (x^N, y^N)\} \, \frac{1}{N} \sum_{n} D_{xy}(x_n, y_n) = \sum_{x,y} \Pr\{(X,Y) = (x,y)\} D_{xy}(x,y).$$

Let $P_E \downarrow 0$, etc.
Example: Binary source, Hamming distortion

[Test channel diagram: a binary asymmetric channel from $X$ to $Y$ with crossover probabilities $d_0 = P_t(1|0)$ and $d_1 = P_t(0|1)$; the host has $\Pr\{X = 1\} = p_x$ and the composite has $\Pr\{Y = 1\} = p_y$.]

Since
$$\Delta_{xy} \geq p_x d_1 + (1 - p_x) d_0 \quad \text{and} \quad p_y = p_x(1 - d_1) + (1 - p_x) d_0,$$
we can write
$$p_y \leq \Delta_{xy} + p_x(1 - 2 d_1) \leq \Delta_{xy} + p_x.$$

Assume w.l.o.g. that $p_x \leq 1/2$. First let $\Delta_{xy}$ be such that $\Delta_{xy} + p_x \leq 1/2$, i.e. $\Delta_{xy} \leq 1/2 - p_x$. Then we have
$$p_y \leq \Delta_{xy} + p_x \leq 1/2,$$
and hence
$$\rho \leq h(p_y) - h(p_x) \leq h(p_x + \Delta_{xy}) - h(p_x).$$

However, $\rho = h(p_x + \Delta_{xy}) - h(p_x)$ is achievable with distortion $\Delta_{xy}$ by taking
$$d_1 = 0 \quad \text{and} \quad d_0 = \frac{\Delta_{xy}}{1 - p_x}.$$

Note that the test channel is not symmetric and that
$$d_0 = \frac{\Delta_{xy}}{1 - p_x} \leq \frac{1/2 - p_x}{1 - p_x} \leq 1/2.$$

For $\Delta_{xy} + p_x \geq 1/2$ the rate is bounded as $\rho \leq 1 - h(p_x)$, and this rate is also achievable.
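For this binary/Hamming example the boundary of $\mathcal{G}_{re}$ derived above can be evaluated directly; the sketch below (my own, with hypothetical parameter values) implements $\rho(\Delta_{xy}) = h(p_x + \Delta_{xy}) - h(p_x)$ for $\Delta_{xy} \leq 1/2 - p_x$ and the saturation value $1 - h(p_x)$ beyond.

```python
import numpy as np

def h(p):
    """Binary entropy function in bits."""
    if p <= 0 or p >= 1:
        return 0.0
    return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

def re_boundary(p_x, delta):
    """Maximum reversible embedding rate at distortion delta (assumes p_x <= 1/2)."""
    if delta <= 0.5 - p_x:
        return h(p_x + delta) - h(p_x)
    return 1.0 - h(p_x)

# Hypothetical values with p_x = 0.2.
print(re_boundary(0.2, 0.08))   # h(0.28) - h(0.2); test channel d1 = 0, d0 = delta / (1 - p_x)
print(re_boundary(0.2, 0.40))   # saturated at 1 - h(0.2), approx. 0.278
```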
Plot of rate-distortion region $\mathcal{G}_{re}$

[Figure: boundary of $\mathcal{G}_{re}$; horizontal axis $\Delta_{xy}$ (distortion), vertical axis $\rho$ (rate in bits), for $p_x = 0.2$. Maximum embedding rate $1 - h(0.2) \approx 0.278$.]
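The plot can be reproduced with a few lines of matplotlib; this is a sketch assuming the boundary formula derived on the previous slides, not the original plotting code.

```python
import numpy as np
import matplotlib.pyplot as plt

def h(p):
    """Binary entropy function in bits."""
    return 0.0 if p <= 0 or p >= 1 else float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

p_x = 0.2
deltas = np.linspace(0.0, 0.5, 501)
rates = [h(min(p_x + d, 0.5)) - h(p_x) for d in deltas]   # boundary of G_re

plt.plot(deltas, rates)
plt.xlabel('DISTORTION')
plt.ylabel('RATE in BITS')
plt.title('Rate-distortion region G_re, p_x = 0.2')
plt.show()
```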
Another perspective

[Diagram: a blocked system; in block $k$ the composite $y^N(k)$ is formed from the host $x^N(k)$ and message part $w(k)$, and similarly in block $k+1$.]

Consider a blocked system with blocks of length $N$. In block $k$, message bits can be embedded noise-free at rate $H(Y|X)$ with the corresponding distortion. Then in block $k+1$, message bits are embedded that allow for reconstruction of $x^N(k)$ given $y^N(k)$. This requires $N H(X|Y)$ bits. Therefore the resulting embedding rate is
$$R = H(Y|X) - H(X|Y) = H(Y,X) - H(X) - H(X|Y) = H(Y) - H(X).$$
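A small numeric check of this two-block accounting (my own sketch, using the same hypothetical binary test channel as earlier): the gross noise-free rate $H(Y|X)$ minus the $H(X|Y)$ bits per symbol spent on restoring the previous host block indeed equals $H(Y) - H(X)$.

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a probability vector or matrix."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Hypothetical joint P(x, y) from P_s and a test channel P_t[x, y].
P_s = np.array([0.8, 0.2])
P_t = np.array([[0.9, 0.1],
                [0.0, 1.0]])
P_xy = P_s[:, None] * P_t

H_x, H_y, H_xy = entropy(P_s), entropy(P_xy.sum(axis=0)), entropy(P_xy)
H_y_given_x = H_xy - H_x   # gross rate embedded noise-free in block k
H_x_given_y = H_xy - H_y   # bits per symbol spent in block k+1 to restore x^N(k)

print(H_y_given_x - H_x_given_y)   # net rate of the blocked scheme
print(H_y - H_x)                   # equals the reversible-embedding rate
```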
IV. Robust and Reversible Embedding

[Block diagram: the embedder maps the message and the host sequence to $Y^N = e(W, X^N)$; $Y^N$ is sent over an attack channel $P_c(z|y)$; the decoder forms $(\hat{W}, \hat{X}^N) = d(Z^N)$.]

Messages: $\Pr\{W = w\} = 1/M$ for $w \in \{1, 2, \ldots, M\}$.
Source (host): $\Pr\{X^N = x^N\} = \prod_{n=1}^{N} P_s(x_n)$ for $x^N \in \mathcal{X}^N$.
Channel: discrete memoryless $\{\mathcal{Y}, P_c(z|y), \mathcal{Z}\}$.
Error probability: $P_E = \Pr\{\hat{W} \neq W \vee \hat{X}^N \neq X^N\}$.
Rate: $R = \frac{1}{N}\log_2(M)$.
Embedding distortion: $D_{xy} = E\left[\frac{1}{N}\sum_{n=1}^{N} D_{xy}(X_n, e_n(W, X^N))\right]$ for some distortion matrix $\{D_{xy}(x,y), x \in \mathcal{X}, y \in \mathcal{Y}\}$.