
Privately Detecting Changes in Unknown Distributions, by Wanrong Zhang

  1. Privately Detecting Changes in Unknown Distributions
     Wanrong Zhang, Georgia Tech
     Joint work with Rachel Cummings, Sara Krehbiel, Yuliia Lut

  2. Motivation I: Smart-home IoT devices

  3. Motivation II: Disease outbreaks

  4. Change-point problem: identify distributional changes in a stream of highly sensitive data.
     Model: data points $x_1, \dots, x_{k^*} \sim P_0$ (pre-change) and $x_{k^*+1}, \dots, x_n \sim P_1$ (post-change).
     Need formal privacy guarantees for change-point detection algorithms.
     Question: estimate the unknown change time $k^*$.
     Previous work: parametric model [CKM+18] ($P_0$ and $P_1$ known).
     Our work: nonparametric model ($P_0$ and $P_1$ unknown).

  5. Differential privacy [DMNS '06]
     Bound the maximum amount that one person's data can change the distribution of an algorithm's output.
     An algorithm $M: T^n \to R$ is $\varepsilon$-differentially private if, for all neighboring $x, x' \in T^n$ and all $S \subseteq R$,
     $\Pr[M(x) \in S] \le e^{\varepsilon} \Pr[M(x') \in S]$.
     • $S$ as a set of "bad outcomes"
     • Worst-case guarantee

  6. Privately Detecting Changes in Unknown Distributions
     1. Offline setting: dataset known in advance
     2. Online setting: data points arrive one at a time
     3. Drift change detection (in paper)
     4. Empirical results (in paper)

  8. Mann-Whitney test [MW '47]
     Datasets: $x_1, x_2, \dots, x_k \sim P_0$ and $x_{k+1}, x_{k+2}, \dots, x_n \sim P_1$.
     Hypotheses: $H_0: P_0 = P_1$, $H_1: P_0 \ne P_1$.
     Test statistic: $V(k) = \frac{1}{k(n-k)} \sum_{i=1}^{k} \sum_{j=k+1}^{n} I(x_i > x_j)$, where the double sum counts the pairs $(x_i, x_j)$ with $x_i > x_j$.
     Under $H_1$, require $a := \Pr_{x \sim P_0,\, y \sim P_1}[x > y] \ne \frac{1}{2}$.
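
As a rough illustration of the statistic on this slide, the following Python sketch computes $V(k)$ by direct pair counting. The function name `mann_whitney_stat` and the 0-based list layout are assumptions made for this example, not part of the talk.

```python
def mann_whitney_stat(xs, k):
    """Return V(k): the fraction of pairs (x_i, x_j), i <= k < j, with x_i > x_j."""
    n = len(xs)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    left, right = xs[:k], xs[k:]  # x_1..x_k and x_{k+1}..x_n
    greater = sum(1 for xi in left for xj in right if xi > xj)
    return greater / (k * (n - k))


# Every point before the split exceeds every point after it, so V(2) = 1.
print(mann_whitney_stat([5, 6, 1, 2], 2))  # 1.0
# Only the pair (3, 2) is ordered "left > right", so V(2) = 1/4.
print(mann_whitney_stat([1, 3, 2, 4], 2))  # 0.25
```

Direct counting costs on the order of $k(n-k)$ comparisons per split; a rank-based implementation would be faster but is omitted to keep the sketch short.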

  9. Non-private nonparametric change-point detection [Darkhovsky '76]
     1. For every $k \in \{\gamma n, \dots, (1-\gamma) n\}$
     2. Compute $V(k)$
     3. Output $\hat{k} = \arg\max_k V(k)$
     Can we compute $V(k)$ or $\arg\max_k V(k)$ privately?
     [Figure: $V(k)$ plotted against $k$; the horizontal axis marks $1$, $\gamma n$, $k^*$, $(1-\gamma) n$ and $n$.]
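
The three steps of the non-private procedure can be sketched as a plain scan over candidate split points. This is a minimal sketch, not the authors' code: the names `mann_whitney_stat` and `nonprivate_change_point` are hypothetical, and the toy data assumes pre-change values tend to exceed post-change values ($a > 1/2$), the case in which maximizing $V(k)$ is the natural reading.

```python
import math


def mann_whitney_stat(xs, k):
    """V(k): fraction of pairs (x_i, x_j), i <= k < j, with x_i > x_j."""
    n = len(xs)
    greater = sum(1 for xi in xs[:k] for xj in xs[k:] if xi > xj)
    return greater / (k * (n - k))


def nonprivate_change_point(xs, gamma):
    """Return the k in [gamma*n, (1-gamma)*n] that maximizes V(k)."""
    n = len(xs)
    lo = max(1, math.ceil(gamma * n))             # keep k >= 1
    hi = min(n - 1, math.floor((1 - gamma) * n))  # keep k <= n - 1
    return max(range(lo, hi + 1), key=lambda k: mann_whitney_stat(xs, k))


# Toy data (hypothetical): five large values followed by five small ones.
xs = [9, 8, 9, 7, 8, 1, 2, 1, 0, 2]
print(nonprivate_change_point(xs, gamma=0.2))  # 5
```

The returned index follows the convention of the Mann-Whitney slide: the first $k$ points are treated as pre-change.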

  10. Adding differential privacy
      Differentially private algorithms add noise that scales with the sensitivity of a query.
      Query sensitivity: the sensitivity of a real-valued query $f$ is $\Delta f = \max_{X, X' \text{ neighbors}} |f(X) - f(X')|$.
      Laplace Mechanism: the mechanism $M(f, X, \varepsilon) = f(X) + \mathrm{Lap}(\Delta f / \varepsilon)$ is $\varepsilon$-differentially private.
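
To make the Laplace mechanism concrete, here is a small standard-library sketch. The helper names `laplace_noise` and `laplace_mechanism` and the counting-query example are assumptions for illustration. Laplace noise is drawn as the difference of two independent exponential samples, which has the Laplace distribution with the same scale.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Lap(scale) as the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(query_value, sensitivity, epsilon, rng=random):
    """Release query_value with Laplace noise of scale sensitivity / epsilon."""
    return query_value + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical counting query: one person changes the count by at most 1.
true_count = 42
print(laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5))
```

Naive floating-point sampling like this is known to be unsafe for production differential privacy, so the sketch is for intuition only.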

  11. Offline PNCPD = Mann-Whitney + ReportNoisyMax
      Private Nonparametric Change-Point Detector: $\mathrm{PNCPD}(X, \varepsilon, \gamma)$
      1. Input: database $X$, privacy parameter $\varepsilon$, constraint parameter $\gamma$
      2. For $k \in \{\gamma n, \dots, (1-\gamma) n\}$:
      3. Compute statistic $V(k)$
      4. Sample $Z_k \sim \mathrm{Lap}\left(\frac{2}{\varepsilon \gamma n}\right)$
      5. Output $\tilde{k} = \arg\max_k \{V(k) + Z_k\}$
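
The offline algorithm can be sketched by adding independent Laplace noise to each $V(k)$ and reporting the noisy argmax. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and the noise scale $2/(\varepsilon \gamma n)$ is taken to be the ReportNoisyMax calibration for a statistic whose sensitivity is at most $1/(\gamma n)$ on the allowed range of $k$.

```python
import math
import random


def mann_whitney_stat(xs, k):
    """V(k): fraction of pairs (x_i, x_j), i <= k < j, with x_i > x_j."""
    n = len(xs)
    greater = sum(1 for xi in xs[:k] for xj in xs[k:] if xi > xj)
    return greater / (k * (n - k))


def laplace_noise(scale, rng):
    """Lap(scale) as the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def offline_pncpd(xs, epsilon, gamma, rng=random):
    """Noisy argmax of V(k) over k in [gamma*n, (1-gamma)*n]."""
    n = len(xs)
    lo = max(1, math.ceil(gamma * n))
    hi = min(n - 1, math.floor((1 - gamma) * n))
    scale = 2.0 / (epsilon * gamma * n)  # assumed calibration, see lead-in
    noisy = {k: mann_whitney_stat(xs, k) + laplace_noise(scale, rng)
             for k in range(lo, hi + 1)}
    return max(noisy, key=noisy.get)


# Hypothetical data: the mean drops from 3 to 0 after the 100th point.
data = ([random.gauss(3, 1) for _ in range(100)]
        + [random.gauss(0, 1) for _ in range(100)])
print(offline_pncpd(data, epsilon=1.0, gamma=0.2))
```

Because the noise is random, repeated runs return different estimates; smaller $\varepsilon$ or smaller $\gamma n$ widens the spread.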

  12. Main results: OfflinePNCPD
      Theorem: $\mathrm{OfflinePNCPD}(X, \varepsilon, \gamma)$ is $\varepsilon$-differentially private and, with probability $1 - \beta$, it outputs a private change-point estimator $\tilde{k}$ with error at most
      $|\tilde{k} - k^*| < O\left(\left(\frac{1}{\varepsilon \gamma^4 (a - 1/2)^2} \log\frac{1}{\beta}\right)^{1.01}\right)$.
      • Previous non-private analysis [Darkhovsky '76]: $|\hat{k} - k^*| < O(n^{2/3})$
      • Our improved non-private analysis: $|\hat{k} - k^*| < O\left(\frac{1}{\gamma^4 (a - 1/2)^2} \log\frac{1}{\beta}\right) = O(1)$

  13. Privately Detecting Changes in Unknown Distributions
      1. Offline setting: dataset known in advance
      2. Online setting: data points arrive one at a time
      3. Drift change detection (in paper)
      4. Empirical results (in paper)

  14. Online setting
      More challenging: must detect change quickly without much post-change data.
      High-level approach:
      1. Privately detect online when $V(k) > T$ in the center of a sliding window of the last $n$ data points. A DP algorithm (AboveNoisyThreshold) exists for this.
      2. Run OfflinePNCPD on the identified window.

  15. Online setting
      More challenging: must detect change quickly without much post-change data.
      Our approach:
      1. Run AboveNoisyThreshold on Mann-Whitney queries in the center of a sliding window of the last $n$ data points.
      2. Run OfflinePNCPD on the identified window.
      [Figure annotation: $V(k) + Z_k < T$]

  17. Online setting
      More challenging: must detect change quickly without much post-change data.
      Our approach:
      1. Run AboveNoisyThreshold on Mann-Whitney queries in the center of a sliding window of the last $n$ data points.
      2. Run OfflinePNCPD on the identified window.
      [Figure annotation: $V(k) + Z_k \ge T$]

  18. OnlinePNCPD
      1. Input: database $X = \{x_1, \dots\}$, privacy parameter $\varepsilon$, threshold $T$
      2. Let $\hat{T} = T + \mathrm{Lap}\left(\frac{8}{\varepsilon n}\right)$
      3. For each new data point $x_k$:
      4. Compute Mann-Whitney statistic $V(k)$ in the center of the last $n$ data points
      5. Sample $Z_k \sim \mathrm{Lap}\left(\frac{16}{\varepsilon n}\right)$
      6. If $V(k) + Z_k > \hat{T}$, then
      7. Run OfflinePNCPD on the last $n$ data points with $\varepsilon/2$
      8. Else, output $\bot$
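
A sliding-window version of these steps might look like the sketch below. It is an illustration under stated assumptions rather than the authors' implementation: all function names are hypothetical, the noise scales $8/(\varepsilon n)$ and $16/(\varepsilon n)$ are assumed to be the usual AboveNoisyThreshold calibration with budget $\varepsilon/2$ and sensitivity $2/n$ at the window center, and the sketch stops at the first alarm instead of emitting $\bot$ at every step.

```python
import math
import random
from collections import deque


def mann_whitney_stat(xs, k):
    """V(k): fraction of pairs (x_i, x_j), i <= k < j, with x_i > x_j."""
    n = len(xs)
    greater = sum(1 for xi in xs[:k] for xj in xs[k:] if xi > xj)
    return greater / (k * (n - k))


def laplace_noise(scale, rng):
    """Lap(scale) as the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def offline_pncpd(xs, epsilon, gamma, rng=random):
    """Noisy argmax of V(k) over k in [gamma*n, (1-gamma)*n] (assumed noise scale)."""
    n = len(xs)
    lo = max(1, math.ceil(gamma * n))
    hi = min(n - 1, math.floor((1 - gamma) * n))
    scale = 2.0 / (epsilon * gamma * n)
    noisy = {k: mann_whitney_stat(xs, k) + laplace_noise(scale, rng)
             for k in range(lo, hi + 1)}
    return max(noisy, key=noisy.get)


def online_pncpd(stream, n, epsilon, gamma, threshold, rng=random):
    """Return an estimated change index in the stream (1-based), or None if no alarm."""
    noisy_threshold = threshold + laplace_noise(8.0 / (epsilon * n), rng)
    window = deque(maxlen=n)  # keeps only the last n points
    for t, x in enumerate(stream, start=1):
        window.append(x)
        if len(window) < n:
            continue  # window not full yet, so no query is made
        points = list(window)
        noisy_stat = (mann_whitney_stat(points, n // 2)
                      + laplace_noise(16.0 / (epsilon * n), rng))
        if noisy_stat > noisy_threshold:
            k_local = offline_pncpd(points, epsilon / 2, gamma, rng)
            return (t - n) + k_local  # map window position to stream position
    return None


# Hypothetical stream: the mean drops from 3 to 0 after the 150th point.
stream = ([random.gauss(3, 1) for _ in range(150)]
          + [random.gauss(0, 1) for _ in range(150)])
print(online_pncpd(stream, n=60, epsilon=2.0, gamma=0.2, threshold=0.8))
```

The threshold value 0.8 is an arbitrary choice for the demo; the slides only say that an appropriate $T$ must be neither too low nor too high.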

  19. Main result: OnlinePNCPD
      Theorem: $\mathrm{OnlinePNCPD}(X, T, \varepsilon, \gamma)$ is $\varepsilon$-differentially private. For an appropriate threshold $T$, with probability $1 - \beta$, it outputs a private change-point estimator $\tilde{k}$ with error at most
      $|\tilde{k} - k^*| < O\left(\frac{1}{\varepsilon} \log\frac{n}{\beta}\right)$,
      where $n$ is the window size.
      Choice of $T$:
      • Can't raise alarm too early (false positive: $T > T_L$)
      • Can't fail to raise alarm at true change (false negative: $T < T_U$)

  20. Privately Detecting Changes in Unknown Distributions
      1. Offline setting: dataset known in advance
      2. Online setting: data points arrive one at a time
      3. Drift change detection (in paper)
      4. Empirical results (in paper)

  21. References
      • Cummings, R., Krehbiel, S., Mei, Y., Tuo, R., & Zhang, W. Differentially private change-point detection. In Advances in Neural Information Processing Systems (NeurIPS '18), pp. 10848-10857, 2018.
      • Dwork, C., McSherry, F., Nissim, K., & Smith, A. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pp. 265-284, 2006.
      • Dwork, C., & Roth, A. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211-407, 2014.
      • Darkhovsky, B. A nonparametric method for the a posteriori detection of the "disorder" time of a sequence of independent random variables. Theory of Probability & Its Applications, 21(1):178-183, 1976.
      • Mann, H. B., & Whitney, D. R. On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics, pp. 50-60, 1947.
