Gaussian Approximation of Quantization Error for Inference from Compressed Data
Alon Kipnis (Stanford), Galen Reeves (Duke)
ISIT, July 2019

Table of Contents
◮ Introduction: Motivation, Contribution
◮ Main Results: Non-Asymptotic, Asymptotic
◮ Examples / Applications: Standard Source Coding, Quantized Compressed Sensing

Motivation: Inference from Compressed Data

Setting: θ → P_{X|θ} → data X → compression (bit limitation) → Y → inference → θ̂

◮ Indirect rate-distortion [Dobrushin & Tsybakov '62], [Berger '71]
◮ Quantized compressed sensing [Kamilov et al. '12], [Xu et al. '14], [Kipnis et al. '17, '18]
◮ Estimation under communication constraints [Han '87], [Zhang & Berger '88], [Duchi et al. '17], [Duchi & Kipnis '17], [Barnes et al. '18], [Han et al. '18]
◮ Task-oriented quantization [Kassam '77], [Picinbono & Duvaut '88], [Gersho '96], [Misra et al. '08], [Shlezinger et al. '18]

Challenge: combining estimation theory and quantization.

Lossy Compression vs. AWGN Channel

Compression: θ → P_{X|θ} → X → Enc → {1, ..., 2^{nR}} → Dec → Y
AWGN channel: Z = X + N(0, (1/snr) I)

◮ Plenty of pitfalls / non-rigorous work [Gray]
◮ Some rigorous "high bit resolution" results [Lee & Neuhoff '96], [Viswanathan & Zamir '01], [Marco & Neuhoff '05]

This talk: if X is encoded using a random spherical code, then, with snr = 2^{2R} − 1, Wass₂(Y, Z | X) ≈ const.

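A quick numerical sanity check of this claim (not from the talk): quantize a white Gaussian source with a random spherical code at rate R and compare the per-coordinate error to the AWGN model at snr = 2^{2R} − 1. The sketch assumes nearest-codeword (maximum-correlation) encoding onto codewords drawn uniformly on a sphere of radius √n · ρ with ρ = 1/√(1 − 2^{−2R}) (the representation radius used in the main result below); the block length is kept small so all 2^{nR} codewords can be enumerated.

```python
import numpy as np

rng = np.random.default_rng(0)
n, R = 16, 1.0
snr = 2 ** (2 * R) - 1
rho = 1.0 / np.sqrt(1 - 2 ** (-2 * R))     # per-coordinate representation radius (E||X|| ~ sqrt(n))

# Random spherical codebook: 2^(nR) points uniform on the sphere of radius sqrt(n) * rho.
codebook = rng.standard_normal((int(2 ** (n * R)), n))
codebook *= np.sqrt(n) * rho / np.linalg.norm(codebook, axis=1, keepdims=True)

err_code, err_awgn = [], []
for _ in range(100):
    x = rng.standard_normal(n)                        # unit-power Gaussian source
    y = codebook[np.argmax(codebook @ x)]             # nearest codeword (max correlation on the sphere)
    z = x + rng.standard_normal(n) / np.sqrt(snr)     # AWGN surrogate at snr = 2^(2R) - 1
    err_code.append(np.mean((y - x) ** 2))
    err_awgn.append(np.mean((z - x) ** 2))

print("spherical code, per-coordinate error:", np.mean(err_code))   # ~ 1/snr up to finite-n effects
print("AWGN surrogate, per-coordinate error:", np.mean(err_awgn))   # ~ 1/snr = 1/3
```

At this small block length the agreement is only approximate; the theorem below quantifies how fast the gap closes with n.
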
Geometric Interpretation of Gaussian Source Coding [Sakrison '68], [Wyner '68]

This talk: [Figure: input sphere of radius √n containing X; representation sphere of radius √n ρ containing the codeword Y; error sphere around the scaled reconstruction X̄; the angle α between X and Y satisfies sin α → 2^{−R}.]

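A worked version of the picture, as I read it (taking ‖X‖ ≈ √n for a unit-power source): the chosen codeword Y sits at radius √n ρ with ρ = 1/√(1 − 2^{−2R}) and makes angle α with X, where sin α → 2^{−R}. Then

\[
\cos\alpha \to \sqrt{1 - 2^{-2R}}, \qquad \rho\cos\alpha \to 1,
\]
\[
\tfrac{1}{n}\|\bar X - X\|^2 \approx \sin^2\alpha \to 2^{-2R}
\quad\text{(the scaled reconstruction attains the Gaussian distortion-rate function),}
\]
\[
\tfrac{1}{n}\|Y - X\|^2 \approx \rho^2 - 2\rho\cos\alpha + 1 \to \frac{2^{-2R}}{1 - 2^{-2R}} = \frac{1}{2^{2R} - 1} = \frac{1}{\mathrm{snr}}.
\]

So the unscaled codeword Y looks like X seen through additive noise of variance 1/snr; making this precise is the content of the main result.
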
Overview of Contributions

Diagram: X → spherical code (rate R) → Y, versus X → + N(0, (1/snr) I) → Z, with snr = 2^{2R} − 1.

◮ Strong equivalence between quantization error using rate-R random spherical coding and AWGN with SNR 2^{2R} − 1
◮ Applications to inference from compressed data

Main Result (Non-Asymptotic): Gaussian Approximation of Quantization Error

Setup: θ → P_{X|θ} → X → spherical code (rate R) → Y, and Z = X + W with W ∼ N(0, σ² I).

Theorem. For P_X with finite second moments, let
  ρ = E[‖X‖] / √(n(1 − 2^{−2R})),   σ = ρ · 2^{−R} = E[‖X‖] / √(n(2^{2R} − 1)).
Then
  Wass₂²(Y, Z | X) ≤ (var(‖X‖) + 2σ²)/n + C_R · (E[‖X‖])² log²(n) / n².

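Two quick consequences of these definitions (for a unit-power source, E[‖X‖²] ≈ n, so E[‖X‖] ≈ √n):

\[
\sigma^2 = \frac{(\mathbb{E}\|X\|)^2}{n(2^{2R} - 1)} \approx \frac{1}{2^{2R} - 1} = \frac{1}{\mathrm{snr}},
\qquad
\rho\, 2^{-R} = \frac{\mathbb{E}\|X\|}{\sqrt{n(1 - 2^{-2R})}} \cdot 2^{-R} = \frac{\mathbb{E}\|X\|}{\sqrt{n(2^{2R} - 1)}} = \sigma,
\]

so W matches the N(0, (1/snr) I) noise in the earlier diagrams, and the right-hand side of the bound is O(log² n / n) whenever var(‖X‖) = O(1) and E[‖X‖²] = O(n).
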
Wasserstein Distance and Lipschitz Continuity

Definition (quadratic Wasserstein distance). Wass₂(Y, Z) ≜ inf_{P_{Y,Z}} √(E[‖Y − Z‖²]), where the marginals P_Y and P_Z are fixed.
(a.k.a. Kantorovich, Kantorovich–Rubinstein, "transportation", ρ̄, "earth mover's", Gini, Fréchet, Vallender, Mallows, ...)

Fact. For any L-Lipschitz f:
  | √(E[‖θ − f(Y)‖²]) − √(E[‖θ − f(Z)‖²]) | ≤ L · Wass₂(Y, Z | X)

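A short argument for the Fact (my sketch; the slide states it without proof): since Y and Z depend on θ only through X, one can couple them conditionally on X so that E[‖Y − Z‖²] = Wass₂²(Y, Z | X). Minkowski's inequality and the Lipschitz property of f then give

\[
\sqrt{\mathbb{E}\|\theta - f(Y)\|^2}
\le \sqrt{\mathbb{E}\|\theta - f(Z)\|^2} + \sqrt{\mathbb{E}\|f(Z) - f(Y)\|^2}
\le \sqrt{\mathbb{E}\|\theta - f(Z)\|^2} + L\, \mathrm{Wass}_2(Y, Z \mid X),
\]

and the same bound with Y and Z exchanged yields the absolute-value statement.
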
Main Result (Asymptotic): Asymptotic Squared Error

Setup: θ ∈ R^{d_n}, θ → P_{X|θ} → X → spherical code (rate R) → Y, and Z = X + W with W ∼ N(0, σ² I).

Corollary. If
  (1/d_n) E[‖θ − θ̂_n(Z)‖²] = M(snr) + o(1),
then
  (1/d_n) E[‖θ − θ̂_n(Y)‖²] = M(2^{2R} − 1) + o(1),
provided:
◮ var(‖X‖) = O(1)
◮ Lip(θ̂_n) = o(√d_n)

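How the corollary follows from the pieces above (a sketch under the stated conditions): apply the Fact with f = θ̂_n and L = Lip(θ̂_n), and use the Theorem to control the Wasserstein term:

\[
\left| \sqrt{\tfrac{1}{d_n}\mathbb{E}\|\theta - \hat\theta_n(Y)\|^2} - \sqrt{\tfrac{1}{d_n}\mathbb{E}\|\theta - \hat\theta_n(Z)\|^2} \right|
\le \frac{\mathrm{Lip}(\hat\theta_n)}{\sqrt{d_n}}\, \mathrm{Wass}_2(Y, Z \mid X) = o(1),
\]

since the Theorem keeps Wass₂(Y, Z | X) bounded under these conditions while Lip(θ̂_n) = o(√d_n). Evaluating the AWGN limit at snr = 2^{2R} − 1 gives the stated result.
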
Examples / Applications

Setting: θ → P_{X|θ} → X → Enc → {1, ..., 2^{nR}} → Dec → θ̂

◮ Standard source coding: X = θ
◮ Quantized compressed sensing: X = Aθ + W

Not in this talk...
◮ Parametric estimation under bit constraints
◮ Optimization with gradient compression
◮ Data compression in latent space using a generative model

Example I: Standard Source Coding

Setup: X = θ, E[θ₁²] = 1, θᵢ i.i.d. ∼ P_θ, encoded at R bits/symbol; AWGN surrogate Z = θ + W/√snr.

1) Estimator: θ̂(z) = E[θ₁ | Z₁ = z]
2) MSE function: M(snr) = mmse(θ₁ | Z₁)

Corollary. D_sp(R) = mmse(θ₁ | θ₁ + W/√(2^{2R} − 1)) is achievable with random spherical coding.

Note: D_sp(R) ≤ D_Gauss(R) = 2^{−2R}
◮ Compare to [Sakrison '68], [Lapidoth '97]

Standard Source Coding (cont'd)

Illustration: equiprobable binary source, P_θ = Unif({−1, +1}), i = 1, ..., n.
[Figure: MSE versus rate R, comparing D_sp(R), D_Gauss(R), and D_Shannon(R).]

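A Monte Carlo sketch (not from the talk) of the comparison in the figure: D_sp(R) = mmse(θ₁ | θ₁ + W/√(2^{2R} − 1)) for the equiprobable binary source, against the Gaussian-source value D_Gauss(R) = 2^{−2R}. For the ±1 prior the posterior mean has the closed form E[θ | Z = z] = tanh(snr · z).

```python
import numpy as np

rng = np.random.default_rng(0)

def mmse_binary(snr, num_samples=200_000):
    """Monte Carlo estimate of mmse(theta | theta + N(0, 1/snr)) for theta ~ Unif{-1, +1}."""
    theta = rng.choice([-1.0, 1.0], size=num_samples)
    z = theta + rng.standard_normal(num_samples) / np.sqrt(snr)
    posterior_mean = np.tanh(snr * z)          # E[theta | Z = z] under the +/-1 prior
    return np.mean((theta - posterior_mean) ** 2)

for R in [0.25, 0.5, 1.0, 1.5, 2.0]:
    D_sp, D_gauss = mmse_binary(2 ** (2 * R) - 1), 2 ** (-2 * R)
    print(f"R = {R:4.2f}   D_sp ~ {D_sp:.4f}   D_Gauss = {D_gauss:.4f}")
```

The estimated D_sp(R) stays below D_Gauss(R) at every rate, consistent with the note on the previous slide, and the relative gap widens as R grows.
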
Example II: Quantized Compressed Sensing

Setup: X = Aθ + εW → {1, ..., 2^{nR}} → θ̂, with A ∈ R^{n×d_n} and n/d_n → δ > 0.

AWGN surrogate: Z = Aθ + √(ε² + σ²) W′, where σ² = (1/snr) · E[‖X‖²]/n.

1) θ^T_AMP(Z) = T iterations of Approximate Message Passing [Donoho et al. '09]
2) M^T_AMP(snr) = T iterations of the state-evolution recursion [Bayati & Montanari '11]

Corollary. (1/d_n) E[‖θ − θ^T_AMP(Y)‖²] → M^T_AMP(2^{2R} − 1)

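A hypothetical numerical sketch (not from the talk) of the state-evolution recursion behind M^T_AMP(snr), evaluated at snr = 2^{2R} − 1. It assumes the standard AMP setting: A with i.i.d. N(0, 1/n) entries, measurement ratio n/d_n → δ, the Bayes-optimal (posterior-mean) denoiser, and a Bernoulli-Gaussian prior on the entries of θ; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
p_spike = 0.9          # P(theta_i = 0); the slab component is N(0, 1)
delta = 0.5            # measurement ratio n / d_n
eps2 = 0.01            # measurement-noise variance eps^2
var_theta = 1 - p_spike

def mmse_prior(tau2, num_samples=200_000):
    """Monte Carlo estimate of mmse(theta_1 | theta_1 + N(0, tau2)) under the Bernoulli-Gaussian prior."""
    slab = rng.random(num_samples) >= p_spike
    theta = np.where(slab, rng.standard_normal(num_samples), 0.0)
    y = theta + np.sqrt(tau2) * rng.standard_normal(num_samples)
    # Posterior mean: P(slab | y) times the slab conditional mean y / (1 + tau2).
    log_slab = -0.5 * y**2 / (1 + tau2) - 0.5 * np.log(1 + tau2) + np.log(1 - p_spike)
    log_spike = -0.5 * y**2 / tau2 - 0.5 * np.log(tau2) + np.log(p_spike)
    w_slab = 1.0 / (1.0 + np.exp(np.clip(log_spike - log_slab, -700, 700)))
    post_mean = w_slab * y / (1 + tau2)
    return np.mean((theta - post_mean) ** 2)

def state_evolution_mse(snr, T=15):
    """MSE predicted by T state-evolution iterations for Z = A theta + sqrt(eps^2 + sigma^2) W'."""
    sigma2 = (var_theta / delta + eps2) / snr   # sigma^2 = E[||X||^2] / (n snr); E[X_i^2] = var_theta/delta + eps^2
    noise = eps2 + sigma2                       # effective noise level after compression
    tau2 = noise + var_theta / delta            # initialization from the all-zero estimate
    mse = var_theta
    for _ in range(T):
        mse = mmse_prior(tau2)                  # per-coordinate MSE at the current effective noise level
        tau2 = noise + mse / delta              # tau_{t+1}^2 = noise + mse_t / delta
    return mse

R = 1.0
print("state-evolution MSE at snr = 2^(2R) - 1:", state_evolution_mse(2 ** (2 * R) - 1))
```

Per the corollary, this value is what one should expect for (1/d_n) E[‖θ − θ^T_AMP(Y)‖²] when the compressed measurements Y come from a rate-R random spherical code.
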