
Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity



  1. Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity. Vitaly Feldman, Ulfar Erlingsson, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Abhradeep Thakurta

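  2. Local Differential Privacy (LDP) [Warner '65; EGS '03; KLNRS '08]. Each user $i \in [n]$ holds a value $x_i$ and sends the output of a randomizer $A_i(x_i)$ to the server. For all $i$, $A_i$ is a local $\epsilon$-DP randomizer: for all $v, v' \in X$, the output distributions of $A_i(x_i = v)$ and $A_i(x_i = v')$ are within a multiplicative factor $e^{\epsilon}$ of each other. The server computes (approximately) $f(x_1, x_2, \ldots, x_n)$.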

  3. Outline
      • Online monitoring with LDP
      • Benefits of anonymity: privacy amplification by shuffling

  4. Online monitoring. $x_{i,j} \in \{0,1\}$ is the status of user $i$ on day $j$, for users $i \in [n]$ and days $j \in [d]$. Assume that each user's status changes at most $k$ times (this assumption is needed only for utility). Goal: estimate the daily counts $S_j = \sum_{i=1}^{n} x_{i,j}$ for all $j \in [d]$.

  5. Monitoring with LDP. There exists an $\epsilon$-LDP algorithm that constructs estimates $\hat{S}_1, \hat{S}_2, \ldots, \hat{S}_d$ such that, with high probability, for all $j \in [d]$, $|\hat{S}_j - S_j| = O\left(\frac{\sqrt{n}\, k\, (\log d)^2}{\epsilon}\right)$.
      • Reports the status changes (only the first $k$)
      • Maintains a tree of counters, each over an interval of time
      • Based on [DNPR '10; CSS '11]
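The tree of counters mentioned above can be illustrated with a small noise-free sketch. It is not the LDP algorithm from the slides: it omits all randomization and only shows how a daily count can be assembled from sums of status changes over dyadic time intervals. The function names and the convention that every user has status 0 before day 1 are assumptions made for this example.

```python
def dyadic_cover(j):
    """Split the prefix [1, j] into disjoint dyadic intervals (1-indexed, inclusive)."""
    intervals = []
    start = 1
    # Walk the binary expansion of j from the highest set bit down.
    for bit in reversed(range(j.bit_length())):
        length = 1 << bit
        if j & length:
            intervals.append((start, start + length - 1))
            start += length
    return intervals


def daily_counts_from_changes(x):
    """Recover S_j = sum_i x[i][j] from per-day status changes (no noise added)."""
    d = len(x[0])
    # change[t] is the net change in the count on day t+1 (0->1 flips minus 1->0 flips),
    # assuming (for this sketch) that every user has status 0 before day 1.
    change = [sum(row[t] - (row[t - 1] if t > 0 else 0) for row in x) for t in range(d)]

    def interval_sum(a, b):
        # Value of the counter for the interval [a, b]; a real tree would store it once.
        return sum(change[a - 1:b])

    # Each prefix [1, j] is covered by at most j.bit_length() counters.
    return [sum(interval_sum(a, b) for a, b in dyadic_cover(j)) for j in range(1, d + 1)]
```

Because each day's count touches only logarithmically many counters, noise added per counter would accumulate over few terms, which is consistent with the polylogarithmic dependence on $d$ in the bound above.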

  6. Encode-Shuffle-Analyze (ESA) [Bittau et al. '17]. Each user $i$ applies a local randomizer $A_i$ to $x_i$; the reports $A_1(x_1), \ldots, A_n(x_n)$ are shuffled and anonymized before they reach the server.

  7. Privacy amplification by shuffling. For any $\epsilon = O(1)$ and any sequence of $\epsilon$-LDP algorithms $(A_1, \ldots, A_n)$, let $A_{\mathrm{shuffle}}(x_1, \ldots, x_n) = \left(A_1(x_{\pi(1)}), A_2(x_{\pi(2)}), \ldots, A_n(x_{\pi(n)})\right)$ for a uniformly random permutation $\pi: [n] \to [n]$. Then $A_{\mathrm{shuffle}}$ is $(\epsilon', \delta)$-DP in the central model for $\epsilon' = O\left(\epsilon \sqrt{\frac{\log(1/\delta)}{n}}\right)$. This also holds in the adaptive case, where $A_i$ may depend on the outputs of $A_1, \ldots, A_{i-1}$.
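A minimal sketch of the shuffled mechanism defined above, restricted to the non-adaptive case. The function names are hypothetical, and the example randomizer is binary randomized response with flip probability 1/3. The helper `amplified_epsilon_scale` evaluates only the expression inside the $O(\cdot)$; the constant hidden by the $O$ is not given in the slides and is deliberately left out.

```python
import math
import random


def randomized_response(bit, rng, flip_prob=1 / 3):
    """Return the bit flipped with probability flip_prob."""
    return bit ^ 1 if rng.random() < flip_prob else bit


def shuffle_then_randomize(data, randomizers, rng):
    """Apply randomizer A_i to x_{pi(i)} for a uniformly random permutation pi."""
    permuted = list(data)
    rng.shuffle(permuted)  # uniform random permutation of the inputs
    return [A(x, rng) for A, x in zip(randomizers, permuted)]


def amplified_epsilon_scale(eps, delta, n):
    """eps * sqrt(log(1/delta) / n): the bound's scaling, without the hidden constant."""
    return eps * math.sqrt(math.log(1 / delta) / n)
```

For example, `shuffle_then_randomize(bits, [randomized_response] * len(bits), random.Random(0))` returns one randomized report per user, in an order that no longer reveals which user sent which report.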

  8. Comparison with subsampling. Running an $\epsilon$-DP algorithm on a random $q$-fraction of the elements is $\approx q\epsilon$-DP (for $\epsilon \le 1$) [KLNRS '08]. Shuffling includes all elements, so $q = 1$. The output $A_1(x_{i_1}), A_2(x_{i_2}), \ldots, A_n(x_{i_n})$, where $i_1, i_2, \ldots, i_n \sim [n]$ are drawn independently, is $(\epsilon', \delta)$-DP for $\epsilon' = O\left(\epsilon \sqrt{\frac{\log(1/\delta)}{n}}\right)$, e.g. [BST '14]. Advantages of shuffling:
      • does not affect the statistics of the dataset
      • does not increase LDP cost
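The first advantage can be seen in a small experiment, sketched here with hypothetical helper names. Under a permutation every element is used exactly once, whereas drawing $n$ indices independently from $[n]$ repeats some elements and misses others; the expected fraction of distinct indices is $1 - (1 - 1/n)^n$, which tends to $1 - 1/e$.

```python
import random


def shuffled_indices(n, rng):
    """Indices 0..n-1 in a uniformly random order (each appears exactly once)."""
    idx = list(range(n))
    rng.shuffle(idx)
    return idx


def resampled_indices(n, rng):
    """n indices drawn independently and uniformly from 0..n-1 (with replacement)."""
    return [rng.randrange(n) for _ in range(n)]


def distinct_fraction(indices):
    """Fraction of positions that refer to distinct elements."""
    return len(set(indices)) / len(indices)
```

Comparing `distinct_fraction(shuffled_indices(n, rng))` with `distinct_fraction(resampled_indices(n, rng))` for a moderately large `n` shows the gap: the first is always 1, the second is noticeably smaller.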

  9. Implications for ESA. Let $S \subseteq [n]$ be a set of users with the same randomizer. After shuffling and anonymization, for every $i \in S$, the output is $\left(O\left(\epsilon \sqrt{\frac{\log(1/\delta)}{|S|}}\right), \delta\right)$-DP for the element at position $i$.

  10. Special case: binary randomized response. RR: for $x \in \{0,1\}$, return $x$ flipped with probability $1/3$. Satisfies $(\log 2)$-LDP. The output distribution is determined by $m = \#_1(\mathrm{RR}(x_1), \ldots, \mathrm{RR}(x_n))$, where $m \sim \mathrm{Bin}(k, 2/3) + \mathrm{Bin}(n-k, 1/3)$ and $k = \#_1(x_1, \ldots, x_n)$. For a neighboring dataset, $k' = k \pm 1$, and $\mathrm{Bin}(k, 2/3) + \mathrm{Bin}(n-k, 1/3) \approx_{\left(\sqrt{\log(1/\delta)/n},\ \delta\right)} \mathrm{Bin}(k+1, 2/3) + \mathrm{Bin}(n-k-1, 1/3)$ [DKMMN '06]. Also given (independently) in [Cheu, Smith, Ullman, Zeber, Zhilyaev '18].
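The binary case is small enough to check numerically. The sketch below uses hypothetical function names: it computes the exact distribution of the count of ones after randomized response, and the smallest $\delta$ for which two such distributions are $(\epsilon', \delta)$-close (the hockey-stick divergence, taken in both directions). It is an illustration of the statement above, not a proof and not code from the paper.

```python
import math


def rr_epsilon(flip_prob=1 / 3):
    """Worst-case log-ratio of output probabilities of binary randomized response."""
    return math.log((1 - flip_prob) / flip_prob)


def binom_pmf(n, p):
    """Probability mass function of Bin(n, p) as a list indexed by the outcome."""
    return [math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1)]


def count_pmf(n, k):
    """Distribution of m ~ Bin(k, 2/3) + Bin(n - k, 1/3), via convolution."""
    ones = binom_pmf(k, 2 / 3)       # true ones that stay ones
    zeros = binom_pmf(n - k, 1 / 3)  # true zeros that are flipped to ones
    out = [0.0] * (n + 1)
    for i, p_i in enumerate(ones):
        for j, p_j in enumerate(zeros):
            out[i + j] += p_i * p_j
    return out


def delta_for_epsilon(p, q, eps):
    """Smallest delta such that p and q are (eps, delta)-indistinguishable."""
    scale = math.exp(eps)
    forward = sum(max(a - scale * b, 0.0) for a, b in zip(p, q))
    backward = sum(max(b - scale * a, 0.0) for a, b in zip(p, q))
    return max(forward, backward)
```

Evaluating `delta_for_epsilon(count_pmf(n, k), count_pmf(n, k + 1), eps)` for an `eps` well below `rr_epsilon()` and increasing `n` gives a concrete view of the amplification: the required `delta` shrinks as more users contribute.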

  11. Conclusions
      • Monitoring with LDP and log dependence on time
      • General privacy amplification technique
          o Matches the state of the art in the central model
          o Can be used to derive lower bounds for LDP
      • Provable benefits of anonymity for ESA-like architectures
      • To appear in SODA 2019
      • arxiv.org/abs/1811.12469
