Privately Detecting Changes in Unknown Distributions
Wanrong Zhang, Georgia Tech
Joint work with Rachel Cummings, Sara Krehbiel, Yuliia Lut
Motivation I: Smart-home IoT devices
Motivation II: Disease outbreaks
Change-point problem: identify distributional changes in a stream of highly sensitive data
Model: data points
$x_1, \dots, x_{k^*} \sim P_0$ (pre-change)
$x_{k^*+1}, \dots, x_n \sim P_1$ (post-change)
Need formal privacy guarantees for change-point detection algorithms.
Question: estimate the unknown change time $k^*$.
Previous work: parametric model [CKM+18] ($P_0$ and $P_1$ known)
Our work: nonparametric model ($P_0$ and $P_1$ unknown)
Differential privacy [DMNS '06]
Bound the maximum amount that one person's data can change the distribution of an algorithm's output.
An algorithm $M: \mathbb{R}^n \to \mathcal{R}$ is $\varepsilon$-differentially private if for all neighboring $x, x' \in \mathbb{R}^n$ and all $S \subseteq \mathcal{R}$,
$$\Pr[M(x) \in S] \le e^{\varepsilon} \Pr[M(x') \in S].$$
• $S$ as a set of "bad outcomes"
• Worst-case guarantee
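As a small illustration of the definition (not taken from the talk), the sketch below checks the inequality for randomized response on a one-bit database, where the two neighboring databases are the bit 0 and the bit 1. The mechanism, the truth probability 0.75, and the helper names are assumptions chosen for this example. Because the output space is finite, checking single outcomes is enough: the ratio for any set $S$ is bounded by the largest single-outcome ratio.

```python
import math


def randomized_response_dist(bit, p_truth=0.75):
    """Output distribution {outcome: probability} when the true bit is `bit`.

    Hypothetical mechanism for illustration: report the true bit with
    probability p_truth, otherwise report the flipped bit.
    """
    return {bit: p_truth, 1 - bit: 1.0 - p_truth}


def worst_case_ratio(dist_a, dist_b):
    """Largest ratio of outcome probabilities, checked in both directions."""
    return max(
        max(dist_a[s] / dist_b[s], dist_b[s] / dist_a[s]) for s in dist_a
    )


# Neighboring one-bit databases: x = 0 and x' = 1.
ratio = worst_case_ratio(randomized_response_dist(0), randomized_response_dist(1))
epsilon = math.log(ratio)  # smallest epsilon for which the DP inequality holds
print(f"worst-case ratio = {ratio:.3f}, epsilon = {epsilon:.3f}")
```

With truth probability 0.75 the worst-case ratio is 0.75 / 0.25 = 3, so this mechanism is $\ln 3$-differentially private.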
Privately Detecting Changes in Unknown Distributions
1. Offline setting: dataset known in advance
2. Online setting: data points arrive one at a time
3. Drift change detection (in paper)
4. Empirical results (in paper)
Mann-Whitney test [MW '47]
Datasets: $x_1, x_2, \dots, x_k \sim P_0$ and $x_{k+1}, x_{k+2}, \dots, x_n \sim P_1$
Hypotheses: $H_0: P_0 = P_1$, $H_1: P_0 \ne P_1$
Test statistic:
$$V(k) = \frac{1}{k(n-k)} \sum_{i=1}^{k} \sum_{j=k+1}^{n} I(x_i > x_j)$$
The double sum is the number of pairs $(x_i, x_j)$ such that $x_i > x_j$.
Under $H_1$, require $a := \Pr_{x \sim P_0,\, y \sim P_1}[x > y] \ne \frac{1}{2}$.
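The statistic can be computed directly from its definition. The following is a minimal sketch; the function name `mann_whitney_v` and the example data are assumptions made for illustration, and the quadratic-time double loop is chosen for clarity rather than speed.

```python
def mann_whitney_v(x, k):
    """Normalized Mann-Whitney statistic V(k) for the split x[:k] | x[k:].

    Counts the pairs (x_i, x_j) with i <= k < j (1-indexed) and x_i > x_j,
    then divides by the number of such pairs, k * (n - k).
    """
    n = len(x)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    wins = sum(1 for xi in x[:k] for xj in x[k:] if xi > xj)
    return wins / (k * (n - k))


# Every value before the split exceeds every value after it, so V(3) = 1.
high_then_low = [5.0, 6.0, 7.0, 1.0, 2.0, 3.0]
print(mann_whitney_v(high_then_low, 3))

# Interleaved values: 4 of the 9 cross pairs satisfy x_i > x_j, so V(3) = 4/9.
mixed = [1.0, 4.0, 5.0, 2.0, 3.0, 6.0]
print(mann_whitney_v(mixed, 3))
```

When both halves come from the same continuous distribution, $V(k)$ concentrates around $1/2$, which is why the test requires $a \ne 1/2$ under $H_1$.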
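Non-private nonparametric change-point detection [Darkhovsky '76]
1. For every $k \in \{\gamma n, \dots, (1-\gamma) n\}$
2. Compute $V(k)$
3. Output $\hat{k} = \arg\max_k V(k)$
Can we compute $V(k)$ or $\arg\max_k V(k)$ privately?
[Figure: plot of $V(k)$ against $k$, with $\gamma n$, $k^*$, and $(1-\gamma) n$ marked on the horizontal axis]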
Adding differential privacy
Differentially private algorithms add noise that scales with the sensitivity of a query.
Query sensitivity: the sensitivity of a real-valued query $f$ is
$$\Delta f = \max_{X, X' \text{ neighbors}} |f(X) - f(X')|.$$
Laplace Mechanism: the mechanism $M(X, f, \varepsilon) = f(X) + \mathrm{Lap}(\Delta f / \varepsilon)$ is $\varepsilon$-differentially private.
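A minimal sketch of the Laplace mechanism, using only the Python standard library. The helper names, the mean query, and the assumption that every entry lies in $[0, 1]$ (so that replacing one entry changes the mean by at most $1/n$) are illustrative choices, not part of the talk. Laplace noise is drawn as the difference of two exponential variables with the same mean, which has a Laplace distribution.

```python
import random


def sample_laplace(scale, rng=random):
    """Draw from Lap(scale) as the difference of two exponentials of mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(data, query, sensitivity, epsilon, rng=random):
    """Return query(data) + Lap(sensitivity / epsilon)."""
    return query(data) + sample_laplace(sensitivity / epsilon, rng)


def mean(values):
    return sum(values) / len(values)


data = [0.2, 0.9, 0.4, 0.5]       # ASSUMPTION: every entry lies in [0, 1]
sensitivity = 1.0 / len(data)     # replacing one entry moves the mean by <= 1/n
rng = random.Random(0)            # seeded only to make the demo repeatable
private_mean = laplace_mechanism(data, mean, sensitivity, epsilon=1.0, rng=rng)
print("true mean:", mean(data), "private mean:", private_mean)
```

Smaller $\varepsilon$ or larger sensitivity both increase the noise scale $\Delta f / \varepsilon$, trading accuracy for privacy.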
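Offline PNCPD = Mann-Whitney + ReportNoisyMax
Private Nonparametric Change-Point Detector: $\mathrm{PNCPD}(X, \varepsilon, \gamma)$
1. Input: database $X$, privacy parameter $\varepsilon$, constraint parameter $\gamma$
2. for $k \in \{\gamma n, \dots, (1-\gamma) n\}$
3. Compute statistic $V(k) = \frac{1}{k(n-k)} \sum_{i=1}^{k} \sum_{j=k+1}^{n} I(x_i > x_j)$
4. Sample Laplace noise $Z_k$ with scale proportional to $\frac{1}{\varepsilon \gamma n}$
5. Output $\tilde{k} = \arg\max_k \{V(k) + Z_k\}$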
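The offline detector can be sketched as ReportNoisyMax over the Mann-Whitney statistics. This is a hedged sketch rather than the authors' implementation. In particular, the noise scale $2/(\varepsilon \gamma n)$ is an assumption of this sketch: it combines the bound $1/(\gamma n)$ on how much one data point can move $V(k)$ on the allowed range with the usual ReportNoisyMax calibration $\mathrm{Lap}(2\Delta/\varepsilon)$, and may differ from the constant used in the paper. The function names and the synthetic data are also hypothetical.

```python
import math
import random


def mann_whitney_v(x, k):
    """V(k): fraction of cross pairs (i <= k < j) with x_i > x_j."""
    n = len(x)
    wins = sum(1 for xi in x[:k] for xj in x[k:] if xi > xj)
    return wins / (k * (n - k))


def sample_laplace(scale, rng):
    """Lap(scale) as the difference of two exponentials of mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def offline_pncpd(x, epsilon, gamma, rng=None):
    """Noisy argmax of V(k) over k in [ceil(gamma*n), floor((1-gamma)*n)]."""
    rng = rng or random.Random()
    n = len(x)
    lo = max(1, math.ceil(gamma * n))
    hi = min(n - 1, math.floor((1 - gamma) * n))
    # ASSUMPTION: sensitivity of V(k) is at most 1/(gamma*n) on this range,
    # and each candidate k receives independent Lap(2 * sensitivity / epsilon).
    scale = 2.0 / (epsilon * gamma * n)
    noisy = {k: mann_whitney_v(x, k) + sample_laplace(scale, rng)
             for k in range(lo, hi + 1)}
    return max(noisy, key=noisy.get)


# Synthetic demo: 100 points from U(1, 2), then 100 points from U(0, 1).
rng = random.Random(0)
true_change = 100
data = ([rng.uniform(1.0, 2.0) for _ in range(true_change)]
        + [rng.uniform(0.0, 1.0) for _ in range(100)])
# A deliberately large epsilon keeps the demo output stable.
estimate = offline_pncpd(data, epsilon=50.0, gamma=0.1, rng=rng)
print("estimated change point:", estimate, "true:", true_change)
```

Only the index of the noisy maximum is released, never the noisy values themselves, which is what lets ReportNoisyMax pay a privacy cost that does not grow with the number of candidate values of $k$.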
Main results: OfflinePNCPD
Theorem: $\mathrm{OfflinePNCPD}(X, \varepsilon, \gamma)$ is $\varepsilon$-differentially private and, with probability $1 - \beta$, it outputs a private change-point estimator $\tilde{k}$ with error at most
$$|\tilde{k} - k^*| < O\left( \left( \frac{1}{\varepsilon \gamma^4 (a - 1/2)^2} \log\frac{1}{\beta} \right)^{1.01} \right).$$
Previous non-private analysis [Darkhovsky '76]:
$$|\hat{k} - k^*| < O(n^{2/3})$$
Our improved non-private analysis:
$$|\hat{k} - k^*| < O\left( \frac{1}{\gamma^4 (a - 1/2)^2} \log\frac{1}{\beta} \right) = O(1)$$
Online setting
More challenging: must detect the change quickly, without much post-change data.
High-level approach:
1. Privately detect online when $V(k) > T$ in the center of a sliding window of the last $n$ data points. (A DP algorithm, AboveNoisyThreshold, exists for this.)
2. Run OfflinePNCPD on the identified window.
Online setting
Our approach:
1. Run AboveNoisyThreshold on Mann-Whitney queries in the center of a sliding window of the last $n$ data points.
2. Run OfflinePNCPD on the identified window.
While $V(k) + Z_k < T$: no alarm, and the window keeps sliding.
Online setting
Once $V(k) + Z_k \ge T$: the alarm is raised and OfflinePNCPD runs on the current window.
OnlinePNCPD
1. Input: database $X = \{x_1, \dots\}$, privacy parameter $\varepsilon$, threshold $T$
2. Let $\hat{T} = T + \mathrm{Lap}\left(\frac{8}{\varepsilon n}\right)$
3. For each new data point $x_k$:
4. Compute Mann-Whitney statistic $V(k)$ in center of last $n$ data points
5. Sample $Z_k \sim \mathrm{Lap}\left(\frac{16}{\varepsilon n}\right)$
6. If $V(k) + Z_k > \hat{T}$, then
7. Run OfflinePNCPD on last $n$ data points with $\varepsilon/2$
8. Else, output $\bot$
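A hedged sketch of the online procedure, combining AboveNoisyThreshold with the offline detector. It assumes threshold noise $\mathrm{Lap}(8/(\varepsilon n))$ and query noise $\mathrm{Lap}(16/(\varepsilon n))$, which is what AboveNoisyThreshold with budget $\varepsilon/2$ gives for center-of-window queries of sensitivity $2/n$. The offline noise scale, the function names, the use of `None` in place of $\bot$, and the synthetic stream are assumptions of this sketch, not details from the talk.

```python
import collections
import math
import random


def mann_whitney_v(x, k):
    """V(k): fraction of cross pairs (i <= k < j) with x_i > x_j."""
    n = len(x)
    wins = sum(1 for xi in x[:k] for xj in x[k:] if xi > xj)
    return wins / (k * (n - k))


def sample_laplace(scale, rng):
    """Lap(scale) as the difference of two exponentials of mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def offline_pncpd(x, epsilon, gamma, rng):
    """Noisy argmax of V(k); the noise scale is an assumption of this sketch."""
    n = len(x)
    lo = max(1, math.ceil(gamma * n))
    hi = min(n - 1, math.floor((1 - gamma) * n))
    scale = 2.0 / (epsilon * gamma * n)
    noisy = {k: mann_whitney_v(x, k) + sample_laplace(scale, rng)
             for k in range(lo, hi + 1)}
    return max(noisy, key=noisy.get)


def online_pncpd(stream, epsilon, n, gamma, threshold, rng=None):
    """Return an estimated change index, or None (standing in for the bottom symbol)."""
    rng = rng or random.Random()
    noisy_threshold = threshold + sample_laplace(8.0 / (epsilon * n), rng)
    window = collections.deque(maxlen=n)
    for t, point in enumerate(stream, start=1):
        window.append(point)
        if len(window) < n:
            continue  # wait until a full window is available
        w = list(window)
        noisy_v = mann_whitney_v(w, n // 2) + sample_laplace(16.0 / (epsilon * n), rng)
        if noisy_v > noisy_threshold:
            local_k = offline_pncpd(w, epsilon / 2.0, gamma, rng)
            return t - n + local_k  # window covers stream positions t-n+1 .. t
    return None


# Synthetic stream: 150 points from U(1, 2), then 150 points from U(0, 1).
rng = random.Random(0)
stream = ([rng.uniform(1.0, 2.0) for _ in range(150)]
          + [rng.uniform(0.0, 1.0) for _ in range(150)])
# A deliberately large epsilon keeps the demo output stable.
estimate = online_pncpd(stream, epsilon=200.0, n=60, gamma=0.1, threshold=0.9, rng=rng)
print("estimated change point:", estimate, "true: 150")
```

The sketch stops at the first alarm, so the threshold noise is drawn once and each window gets fresh query noise, matching the structure of steps 2 to 8.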
Main result: OnlinePNCPD
Theorem: $\mathrm{OnlinePNCPD}(X, T, \varepsilon, \gamma)$ is $\varepsilon$-differentially private. For an appropriate threshold $T$, with probability $1 - \beta$, it outputs a private change-point estimator $\tilde{k}$ with error at most
$$|\tilde{k} - k^*| < O\left( \frac{1}{\varepsilon} \log\frac{n}{\beta} \right),$$
where $n$ is the window size.
Choice of $T$:
• Can't raise the alarm too early (False positive: $T > T_L$)
• Can't fail to raise the alarm at the true change (False negative: $T < T_U$)
References
• Cummings, R., Krehbiel, S., Mei, Y., Tuo, R., & Zhang, W. Differentially private change-point detection. In Advances in Neural Information Processing Systems (NeurIPS '18), pp. 10848-10857, 2018.
• Dwork, C., McSherry, F., Nissim, K., & Smith, A. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pp. 265-284, 2006.
• Dwork, C., & Roth, A. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211-407, 2014.
• Darkhovsky, B. A nonparametric method for the a posteriori detection of the "disorder" time of a sequence of independent random variables. Theory of Probability & Its Applications, 21(1):178-183, 1976.
• Mann, H.B., & Whitney, D.R. On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics, pp. 50-60, 1947.