Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity
Vitaly Feldman, Ulfar Erlingsson, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Abhradeep Thakurta

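Local Differential Privacy (LDP)
[Diagram: each user $i \in [n]$ applies a local randomizer $A_i$ to their own data $x_i$ and sends the output to the server.]
For all $i$, $A_i$ is a local $\varepsilon$-DP randomizer: for all $v, v' \in X$ and every set of outputs $S$,
$$\Pr[A_i(v) \in S] \le e^{\varepsilon} \, \Pr[A_i(v') \in S]$$
[Warner '65; EGS '03; KLNRS '08]
Server: compute (approximately) $f(x_1, x_2, \ldots, x_n)$
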
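To make the definition concrete, the following is a minimal sketch that computes the smallest $\varepsilon$ for which a finite randomizer is $\varepsilon$-LDP. It assumes the randomizer is given as a table `channel[v][o]` of output probabilities that are all strictly positive; the function name `ldp_epsilon` and the example table are illustrative and do not come from the talk.

```python
from math import log

def ldp_epsilon(channel):
    """Smallest eps such that the randomizer is eps-LDP.

    channel[v][o] is Pr[A(v) = o]. Assumption: every entry is strictly
    positive, so all likelihood ratios are finite.
    """
    eps = 0.0
    for p in channel:          # output distribution on input v
        for q in channel:      # output distribution on input v'
            for po, qo in zip(p, q):
                eps = max(eps, log(po / qo))
    return eps

# Hypothetical example: report a bit truthfully with probability 3/4.
flip_quarter = [[3 / 4, 1 / 4],
                [1 / 4, 3 / 4]]
print(ldp_epsilon(flip_quarter))
```

For finite output spaces it is enough to check single outputs, because the worst-case ratio over sets of outputs is attained at a single output.
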
Outline
• Online monitoring with LDP
• Benefits of anonymity: privacy amplification by shuffling

Online monitoring
$x_{i,j} \in \{0,1\}$: status of user $i$ on day $j$
[Diagram: an $n \times d$ grid with one row $x_{i,1}, x_{i,2}, x_{i,3}, \ldots, x_{i,d}$ per user, time running left to right, and the daily counts $c_1, c_2, c_3, \ldots, c_d$ under the columns.]
Assume that each user's status changes at most $k$ times
• only for utility
Estimate the daily counts $c_j = \sum_{i=1}^{n} x_{i,j}$ for all $j \in [d]$

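The setup can be sketched without any privacy mechanism, just to fix the notation. The helper names and the toy data below are hypothetical. The convention that a user's status before day 1 is 0 is also an assumption made here for illustration.

```python
def daily_counts(status):
    """status[i][j] is the bit of user i on day j; returns [c_1, ..., c_d]."""
    return [sum(day) for day in zip(*status)]

def num_changes(row, initial=0):
    """Number of days on which a user's status differs from the day before.

    Assumption: the status before the first day is `initial`.
    """
    changes, prev = 0, initial
    for bit in row:
        changes += bit != prev
        prev = bit
    return changes

# Toy data: n = 3 users, d = 4 days.
status = [
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
counts = daily_counts(status)
max_changes = max(num_changes(row) for row in status)  # plays the role of k
```
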
Monitoring with LDP
There exists an $\varepsilon$-LDP algorithm that constructs estimates $\hat{c}_1, \hat{c}_2, \ldots, \hat{c}_d$ such that, with high probability, for all $j \in [d]$,
$$|c_j - \hat{c}_j| = O\!\left(\frac{k \sqrt{n} \, (\log d)^2}{\varepsilon}\right)$$
• Reports the status changes (only the first $k$)
• Maintains a tree of counters, each over an interval of time
• Based on [DNPR '10; CSS '11]

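The "tree of counters" idea can be illustrated with the standard dyadic decomposition from the continual-counting literature [DNPR '10; CSS '11]. This is a sketch of that generic technique, not the protocol from the talk. The counters here are exact; in a private protocol each interval counter would be a noisy estimate, and the point of the tree is that any prefix is covered by at most about $\log_2 d$ counters. The function names are hypothetical.

```python
def dyadic_cover(j):
    """Split the prefix of days [1, j] into disjoint dyadic intervals."""
    intervals = []
    start = 1
    # Walk the binary expansion of j from the highest set bit down.
    for bit in reversed(range(j.bit_length())):
        size = 1 << bit
        if j & size:
            intervals.append((start, start + size - 1))
            start += size
    return intervals

def prefix_from_tree(changes, j):
    """Sum of changes[0:j], assembled from one counter per dyadic interval.

    changes[t-1] is the net change of the count on day t. Each inner sum
    stands in for a tree counter; here it is computed exactly.
    """
    return sum(sum(changes[lo - 1:hi]) for lo, hi in dyadic_cover(j))

# Net daily changes of a count over d = 8 days (toy data).
changes = [2, -1, 0, 3, 1, -2, 0, 1]
count_on_day_7 = prefix_from_tree(changes, 7)
```

Because the count on day $j$ is a sum of only logarithmically many counters, the noise accumulated in an estimate grows with $\log d$ rather than with $d$.
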
Encode-Shuffle-Analyze (ESA) [Bittau et al. '17]
[Diagram: each user $i$ applies $A_i$ to $x_i$; the reports are shuffled and anonymized before they reach the server.]

Privacy amplification by shuffling
For any $\varepsilon = O(1)$ and any sequence of $\varepsilon$-LDP algorithms $(A_1, \ldots, A_n)$, let
$$A_{\mathrm{shuffle}}(x_1, \ldots, x_n) = \left(A_1(x_{\pi(1)}), A_2(x_{\pi(2)}), \ldots, A_n(x_{\pi(n)})\right)$$
for a uniformly random permutation $\pi : [n] \to [n]$.
Then $A_{\mathrm{shuffle}}$ is $(\varepsilon', \delta)$-DP in the central model for
$$\varepsilon' = O\!\left(\varepsilon \sqrt{\frac{\log(1/\delta)}{n}}\right)$$
Holds in the adaptive case: $A_i$ may depend on the outputs of $A_1, \ldots, A_{i-1}$

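A minimal sketch of the shuffled mechanism in the non-adaptive case follows. The names `shuffle_then_randomize` and `amplified_epsilon_scale` are hypothetical. The second function returns only the quantity inside the $O(\cdot)$; the constant hidden by the $O(\cdot)$ is not stated on the slide, so the returned value is a scale, not an actual privacy guarantee.

```python
import math
import random

def shuffle_then_randomize(xs, randomizers, rng):
    """Apply randomizer A_i to x_{pi(i)} for a uniformly random permutation pi.

    Non-adaptive sketch: each A_i sees only its own input, not earlier outputs.
    """
    perm = list(range(len(xs)))
    rng.shuffle(perm)
    return [A(xs[p]) for A, p in zip(randomizers, perm)]

def amplified_epsilon_scale(eps, delta, n):
    """eps * sqrt(log(1/delta) / n), i.e. the bound up to its hidden constant."""
    return eps * math.sqrt(math.log(1 / delta) / n)

def make_rr(rng, flip_prob=1 / 3):
    """Binary randomized response: flip the input bit with probability flip_prob."""
    return lambda x: x ^ (rng.random() < flip_prob)

rng = random.Random(0)
xs = [0, 1, 1, 0, 1, 0, 0, 1]
reports = shuffle_then_randomize(xs, [make_rr(rng)] * len(xs), rng)
scale = amplified_epsilon_scale(eps=math.log(2), delta=1e-6, n=10_000)
```
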
Comparison with subsampling
Running an $\varepsilon$-DP algorithm on a random $q$-fraction of the elements is $\approx q\varepsilon$-DP ($\varepsilon \le 1$) [KLNRS '08]
Shuffling includes all elements, so $q = 1$
Output $A_1(x_{i_1}), A_2(x_{i_2}), \ldots, A_n(x_{i_n})$ where $i_1, i_2, \ldots, i_n \sim [n]$ (independently) is $(\varepsilon', \delta)$-DP for $\varepsilon' = O\!\left(\varepsilon \sqrt{\frac{\log(1/\delta)}{n}}\right)$, e.g. [BST '14]
Advantages of shuffling:
• does not affect the statistics of the dataset
• does not increase LDP cost

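The first advantage can be seen in a small sketch that contrasts the two ways of choosing which input each randomizer sees. The function names are hypothetical. A permutation uses every element exactly once, whereas independent uniform draws repeat some elements and miss others.

```python
import random

def shuffled_indices(n, rng):
    """Indices pi(1), ..., pi(n) of a uniformly random permutation."""
    perm = list(range(n))
    rng.shuffle(perm)
    return perm

def iid_indices(n, rng):
    """Indices i_1, ..., i_n drawn independently and uniformly from [n]."""
    return [rng.randrange(n) for _ in range(n)]

def expected_distinct(n):
    """Expected number of distinct elements hit by n independent uniform draws."""
    return n * (1 - (1 - 1 / n) ** n)

rng = random.Random(0)
perm = shuffled_indices(10, rng)   # every index appears exactly once
draws = iid_indices(10, rng)       # repeats and omissions are possible
```

As $n$ grows, `expected_distinct(n) / n` approaches $1 - 1/e \approx 0.632$, so independent sampling leaves out more than a third of the elements in expectation, while shuffling leaves the multiset of inputs unchanged.
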
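Implications for ESA
[Diagram: each user $i$ applies $A_i$ to $x_i$; the reports are shuffled and anonymized before they reach the server.]
Set $S \subseteq [n]$ with the same randomizer
For every $i \in S$, the output is $\left(O\!\left(\varepsilon \sqrt{\frac{\log(1/\delta)}{|S|}}\right), \delta\right)$-DP for the element at position $i$
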
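Special case: binary randomized response
RR: for $x \in \{0,1\}$, return $x$ flipped with probability $1/3$. Satisfies $(\log 2)$-LDP
Output distribution is determined by $c = \#_1(\mathrm{RR}(x_1), \ldots, \mathrm{RR}(x_n))$
$$c \sim \mathrm{Bin}\!\left(m, \tfrac{2}{3}\right) + \mathrm{Bin}\!\left(n - m, \tfrac{1}{3}\right), \quad \text{where } m = \#_1(x_1, \ldots, x_n)$$
For a neighboring dataset: $m' = m \pm 1$
$$\mathrm{Bin}\!\left(m, \tfrac{2}{3}\right) + \mathrm{Bin}\!\left(n - m, \tfrac{1}{3}\right) \;\approx_{\left(O\left(\sqrt{\log(1/\delta)/n}\right),\, \delta\right)}\; \mathrm{Bin}\!\left(m + 1, \tfrac{2}{3}\right) + \mathrm{Bin}\!\left(n - m - 1, \tfrac{1}{3}\right)$$
[DKMMN '06]
Also given in [Cheu, Smith, Ullman, Zeber, Zhilyaev '18] (independently)
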
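For small $n$ the closeness of the two count distributions can be checked exactly. The sketch below builds the distribution of the number of reported ones by convolving the two binomials, then computes the smallest $\delta$ for which the two neighboring distributions are $(\varepsilon', \delta)$-indistinguishable. All function names are hypothetical, and the exact computation is an illustration rather than the analysis used in the talk.

```python
from math import comb, exp, log

def binom_pmf(n, p):
    """PMF of Bin(n, p) as a list indexed by the outcome 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def convolve(a, b):
    """PMF of the sum of two independent variables with PMFs a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def rr_count_pmf(n, m):
    """PMF of the number of reported ones when m of the n inputs equal 1."""
    return convolve(binom_pmf(m, 2 / 3), binom_pmf(n - m, 1 / 3))

def hockey_stick(p, q, eps):
    """Smallest delta with P(S) <= e^eps * Q(S) + delta for every set S."""
    return sum(max(pi - exp(eps) * qi, 0.0) for pi, qi in zip(p, q))

def delta_for(n, m, eps):
    """Smallest delta making the counts for m and m + 1 ones (eps, delta)-close."""
    p, q = rr_count_pmf(n, m), rr_count_pmf(n, m + 1)
    return max(hockey_stick(p, q, eps), hockey_stick(q, p, eps))

small = delta_for(n=10, m=5, eps=0.5)
large = delta_for(n=100, m=50, eps=0.5)
```

With $\varepsilon' = \log 2$ the returned $\delta$ is zero up to rounding for every $n$, which matches the local guarantee. For a fixed smaller $\varepsilon'$, the required $\delta$ shrinks as $n$ grows, which is the amplification effect.
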
Conclusions
• Monitoring with LDP and log dependence on time
• General privacy amplification technique
  o Matches state of the art in the central model
  o Can be used to derive lower bounds for LDP
• Provable benefits of anonymity for ESA-like architectures
• To appear in SODA 2019
• arxiv.org/abs/1811.12469