Differentially Private Testing of Identity and Closeness of Discrete Distributions
NeurIPS 2018, Montreal, Canada
Jayadev Acharya, Cornell University
Ziteng Sun, Cornell University
Huanyu Zhang, Cornell University

Hypothesis Testing
• Given data from an unknown statistical source (distribution)
• Does the distribution satisfy a postulated hypothesis?

Modern Challenges
Large domain, small samples
• Distributions over large domains/high dimensions
• Expensive data
• Sample complexity
Privacy
• Samples contain sensitive information
• Perform hypothesis testing while preserving privacy

Identity Testing (IT), Goodness of Fit
• $[k] := \{0, 1, 2, \ldots, k-1\}$, a discrete set of size $k$.
• $q$: a known distribution over $[k]$.
• Given $X^n := X_1 \ldots X_n$, independent samples from unknown $p$.
• Is $p = q$?
• Tester: $\mathcal{A} : [k]^n \to \{0, 1\}$, which satisfies the following: with probability at least $2/3$,
$$\mathcal{A}(X^n) = \begin{cases} 1, & \text{if } p = q \\ 0, & \text{if } |p - q|_{TV} > \alpha \end{cases}$$
Sample complexity: smallest $n$ where such a tester exists.
$$S(IT) = \Theta\left(\frac{\sqrt{k}}{\alpha^2}\right).$$
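To make the "far" case of the tester definition concrete, here is a minimal Python sketch that computes the total variation distance and builds one distribution at distance exactly α from the uniform distribution. The helper names `tv_distance` and `perturbed_uniform` are illustrative assumptions, not taken from the slides, and the alternating perturbation is only one convenient example of an α-far distribution.

```python
import numpy as np

def tv_distance(p, q):
    """Total variation distance: half the L1 distance between two pmfs."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

def perturbed_uniform(k, alpha):
    """Hypothetical helper: alternate (1 + 2*alpha)/k and (1 - 2*alpha)/k over [k].

    Requires even k and 0 <= alpha <= 1/2 so that all masses are valid.
    """
    if k % 2 != 0:
        raise ValueError("k must be even")
    if not 0.0 <= alpha <= 0.5:
        raise ValueError("alpha must lie in [0, 1/2]")
    signs = np.where(np.arange(k) % 2 == 0, 1.0, -1.0)
    return (1.0 + 2.0 * alpha * signs) / k

k, alpha = 100, 0.1
q = np.full(k, 1.0 / k)          # the known distribution (uniform here)
p = perturbed_uniform(k, alpha)  # an unknown distribution at TV distance alpha
print(tv_distance(p, q))
```

A tester only has to output 0 once the distance strictly exceeds α, so this example sits exactly on the boundary of the "far" case.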

Differential Privacy (DP) [Dwork et al., 2006]
A randomized algorithm $\mathcal{A} : \mathcal{X}^n \to \mathcal{S}$ is $\varepsilon$-differentially private if $\forall S \subset \mathcal{S}$ and $\forall X^n, Y^n$ with $d_H(X^n, Y^n) \le 1$, we have
$$\Pr(\mathcal{A}(X^n) \in S) \le e^{\varepsilon} \cdot \Pr(\mathcal{A}(Y^n) \in S).$$
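As an illustration of the definition, the sketch below checks the ε-DP inequality numerically for the Laplace mechanism applied to a count query. The choice of mechanism, the toy datasets, and the helper names are assumptions made for this example; the slides do not prescribe them. A count changes by at most 1 between datasets at Hamming distance 1, so Laplace noise of scale 1/ε keeps the output density ratio within $e^{\varepsilon}$, which implies the same bound for every event $S$.

```python
import math

def laplace_pdf(t, loc, scale):
    """Density of the Laplace distribution with the given location and scale."""
    return math.exp(-abs(t - loc) / scale) / (2.0 * scale)

def max_density_ratio(count_x, count_y, eps, grid):
    """Largest ratio of output densities over a grid of possible outputs."""
    scale = 1.0 / eps  # sensitivity of a count is 1
    return max(
        laplace_pdf(t, count_x, scale) / laplace_pdf(t, count_y, scale)
        for t in grid
    )

eps = 0.5
x = [0, 1, 1, 2, 1]
y = [0, 1, 0, 2, 1]          # differs from x in exactly one position
count_x = x.count(1)         # number of samples equal to symbol 1 in x
count_y = y.count(1)         # same query on the neighbouring dataset
grid = [i / 10.0 for i in range(-100, 101)]

ratio = max_density_ratio(count_x, count_y, eps, grid)
print(ratio <= math.exp(eps) * (1 + 1e-9))
```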

Previous Results
Identity Testing:
Non-private: $S(IT) = \Theta\left(\frac{\sqrt{k}}{\alpha^2}\right)$ [Paninski, 2008]
$\varepsilon$-DP algorithms: $S(IT, \varepsilon) = O\left(\frac{\sqrt{k}}{\alpha^2} + \frac{\sqrt{k \log k}}{\alpha^{3/2} \varepsilon}\right)$ [Cai et al., 2017]
What is the sample complexity of identity testing?

Our Results
Theorem
$$S(IT, \varepsilon) = \Theta\left(\frac{\sqrt{k}}{\alpha^2} + \max\left\{\frac{k^{1/2}}{\alpha \varepsilon^{1/2}}, \frac{k^{1/3}}{\alpha^{4/3} \varepsilon^{2/3}}, \frac{1}{\alpha \varepsilon}\right\}\right)$$
$$S(IT, \varepsilon) = \begin{cases} \Theta\left(\frac{\sqrt{k}}{\alpha^2} + \frac{k^{1/2}}{\alpha \varepsilon^{1/2}}\right), & \text{if } n \le k \\ \Theta\left(\frac{\sqrt{k}}{\alpha^2} + \frac{k^{1/3}}{\alpha^{4/3} \varepsilon^{2/3}}\right), & \text{if } k < n \le \frac{k}{\alpha^2} \\ \Theta\left(\frac{\sqrt{k}}{\alpha^2} + \frac{1}{\alpha \varepsilon}\right), & \text{if } n \ge \frac{k}{\alpha^2}. \end{cases}$$
New algorithms for achieving upper bounds
New methodology to prove lower bounds for hypothesis testing
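To see how the terms in the theorem trade off, the following Python sketch evaluates each term for given $k$, $\alpha$ and $\varepsilon$. It ignores the constants hidden by $\Theta(\cdot)$, so the numbers indicate only which term dominates, not actual sample sizes. The function and key names are hypothetical.

```python
import math

def private_it_terms(k, alpha, eps):
    """Individual terms of the private identity-testing rate (constants dropped)."""
    return {
        "non_private": math.sqrt(k) / alpha**2,
        "sparse": math.sqrt(k) / (alpha * math.sqrt(eps)),          # regime n <= k
        "medium": k ** (1 / 3) / (alpha ** (4 / 3) * eps ** (2 / 3)),  # k < n <= k/alpha^2
        "dense": 1.0 / (alpha * eps),                               # n >= k/alpha^2
    }

def private_it_rate(k, alpha, eps):
    """Non-private term plus the largest of the three privacy terms."""
    t = private_it_terms(k, alpha, eps)
    return t["non_private"] + max(t["sparse"], t["medium"], t["dense"])

terms = private_it_terms(k=10_000, alpha=0.1, eps=1.0)
rate = private_it_rate(k=10_000, alpha=0.1, eps=1.0)
for name, value in terms.items():
    print(f"{name:12s} {value:12.2f}")
```

With these inputs the non-private term is the largest, so privacy adds only a lower-order cost. Shrinking `eps` makes the privacy terms grow and eventually dominate.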

Upper Bound
Privatizing the statistic used by [Diakonikolas et al., 2017], which is sample optimal in the non-private case.
Independent work of [Aliakbarpour et al., 2017] gives a different upper bound.
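One generic way to privatize a test statistic is to add Laplace noise calibrated to its sensitivity. The sketch below does this for the empirical total variation distance to $q$. It is an assumption-laden illustration, not the paper's algorithm: the exact statistic, any modifications used to reduce sensitivity, and the decision threshold are not given in the slides, and all function names are hypothetical.

```python
import numpy as np

def empirical_tv_statistic(samples, q):
    """Half the L1 distance between the empirical distribution and q."""
    q = np.asarray(q, dtype=float)
    n = len(samples)
    counts = np.bincount(samples, minlength=len(q))  # samples must lie in [k]
    return 0.5 * np.abs(counts / n - q).sum()

def private_tv_statistic(samples, q, eps, rng):
    """Release the statistic with Laplace noise (assumed mechanism, eps-DP).

    Changing one sample moves two counts by 1 each, so the statistic
    changes by at most 1/n; that is the sensitivity used for the noise.
    """
    sensitivity = 1.0 / len(samples)
    noise = rng.laplace(loc=0.0, scale=sensitivity / eps)
    return empirical_tv_statistic(samples, q) + noise

rng = np.random.default_rng(0)
k, n, eps = 10, 5000, 1.0
q = np.full(k, 1.0 / k)
samples = rng.integers(0, k, size=n)  # drawn from q itself in this toy run
noisy_stat = private_tv_statistic(samples, q, eps, rng)
print(noisy_stat)
```

A tester would compare the noisy statistic with a threshold. Choosing that threshold so that both error cases hold with probability 2/3 is the part that needs the analysis in the paper.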

Lower Bound - Coupling Lemma
Lemma
Suppose there is a coupling between $p$ and $q$ over $\mathcal{X}^n$, such that
$$\mathbb{E}[d_H(X^n, Y^n)] \le D.$$
Then, any $\varepsilon$-differentially private hypothesis testing algorithm must satisfy
$$\varepsilon = \Omega\left(\frac{1}{D}\right).$$
Use LeCam's two-point method.
Construct two hypotheses and a coupling between them with small expected Hamming distance.
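The simplest coupling to plug into the lemma couples the two product distributions coordinate by coordinate with a maximal coupling, so that each coordinate differs with probability equal to the total variation distance and $\mathbb{E}[d_H(X^n, Y^n)] = n \cdot d_{TV}(p, q)$. With $d_{TV}(p, q) = \alpha$ this gives $D = n\alpha$, hence $\varepsilon = \Omega(1/(n\alpha))$, which is consistent with the $1/(\alpha\varepsilon)$ term in the sample complexity. The Python sketch below simulates this coupling; the function name is hypothetical, and the couplings used in the paper for the other terms are more involved.

```python
import numpy as np

def sample_maximal_coupling(p, q, n, rng):
    """Draw (X^n, Y^n) with X_i ~ p, Y_i ~ q and Pr(X_i != Y_i) = TV(p, q)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    overlap = np.minimum(p, q)   # mass shared by p and q
    tv = 1.0 - overlap.sum()     # total variation distance
    k = len(p)
    x = np.empty(n, dtype=int)
    y = np.empty(n, dtype=int)
    for i in range(n):
        if rng.random() < 1.0 - tv:
            # Shared part: both coordinates take the same value.
            x[i] = y[i] = rng.choice(k, p=overlap / overlap.sum())
        else:
            # Residual parts have disjoint supports, so x[i] != y[i].
            x[i] = rng.choice(k, p=(p - overlap) / tv)
            y[i] = rng.choice(k, p=(q - overlap) / tv)
    return x, y

rng = np.random.default_rng(0)
p = [0.6, 0.4]
q = [0.4, 0.6]                   # TV(p, q) = 0.2
n = 2000
x, y = sample_maximal_coupling(p, q, n, rng)
hamming = int((x != y).sum())
print(hamming / n)               # should be close to TV(p, q)
```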

The End
Paper available on arXiv: https://arxiv.org/abs/1707.05128.
See you at the poster session! Tue Dec 4th 05:00 – 07:00 PM @ Room 210 and 230 AB #151.

Aliakbarpour, M., Diakonikolas, I., and Rubinfeld, R. (2017). Differentially private identity and closeness testing of discrete distributions. arXiv preprint arXiv:1707.05497.
Cai, B., Daskalakis, C., and Kamath, G. (2017). Priv'it: Private and sample efficient identity testing. In ICML.
Diakonikolas, I., Gouleakis, T., Peebles, J., and Price, E. (2017). Sample-optimal identity testing with high probability. arXiv preprint arXiv:1708.02728.
Dwork, C., McSherry, F., Nissim, K., and Smith, A. (2006). Calibrating noise to sensitivity in private data analysis. In Proceedings of the 3rd Theory of Cryptography Conference.
Paninski, L. (2008). A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Transactions on Information Theory, 54(10):4750–4755.
