No Free Lunch in Data Privacy


  1. No Free Lunch in Data Privacy. CompSci 590.03, Lecture 15. Instructor: Ashwin Machanavajjhala.

  2. Outline
     • Background: Domain-independent privacy definitions
     • No Free Lunch in Data Privacy [Kifer-M SIGMOD '11]
     • Correlations: A case for domain-specific privacy definitions [Kifer-M SIGMOD '11]
     • Pufferfish Privacy Framework [Kifer-M PODS '12]
     • Defining Privacy for Correlated Data [Kifer-M PODS '12 & Ding-M '13] – Next class

  3. The Data Privacy Problem
     [Figure: individuals 1, 2, ..., N each contribute a record r_1, r_2, ..., r_N to a server database D. Utility: the server can answer queries over D. Privacy: no breach about any individual.]

  4. Data Privacy in the Real World
     Application             | Data Collector | Third Party (adversary) | Private Information    | Function (utility)
     Medical                 | Hospital       | Epidemiologist          | Disease                | Correlation between disease and geography
     Genome analysis         | Hospital       | Statistician/Researcher | Genome                 | Correlation between genome and disease
     Advertising             | Google/FB/Y!   | Advertiser              | Clicks/Browsing        | Number of clicks on an ad by age/region/gender
     Social recommendations  | Facebook       | Another user            | Friend links / profile | Recommend other users or ads based on the social network

  5. Semantic Privacy
     "... nothing about an individual should be learnable from the database that cannot be learned without access to the database." (T. Dalenius, 1977)

  6. Can we achieve semantic privacy?
     • ... or is there one ("precious...") privacy definition to rule them all?

  7. Defining Privacy
     • In order to allow utility, a non-negligible amount of information about an individual must be disclosed to the adversary.
     • Measuring the information disclosed to an adversary involves carefully modeling the background knowledge already available to the adversary.
     • ... but we do not know what information is available to the adversary.

  8. Many Definitions & Several Attacks
     Attacks: linkage attack, background-knowledge attack, minimality/reconstruction attack, de Finetti attack, composition attack.
     Definitions: K-Anonymity [Sweeney et al., IJUFKS '02], L-Diversity [Machanavajjhala et al., TKDD '07], T-Closeness [Li et al., ICDE '07], E-Privacy [Machanavajjhala et al., VLDB '09], Differential Privacy [Dwork et al., ICALP '06].

  9. Composability [Dwork et al., TCC '06]
     Theorem (Composability): If algorithms A_1, A_2, ..., A_k use independent randomness and each A_i satisfies ε_i-differential privacy, then outputting all of their answers together satisfies ε-differential privacy with ε = ε_1 + ε_2 + ... + ε_k.
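As a concrete illustration of sequential composition, here is a minimal Python sketch (the database, the predicates, and the ε_i values are invented for illustration): each query is answered by an independent Laplace mechanism at its own ε_i, and the total privacy budget spent is simply the sum of the ε_i.

```python
import numpy as np

def laplace_count(data, predicate, epsilon, rng):
    """Answer a counting query (sensitivity 1) under epsilon-differential privacy."""
    true_count = sum(1 for row in data if predicate(row))
    return true_count + rng.laplace(scale=1.0 / epsilon)

rng = np.random.default_rng(0)
# Hypothetical database: one (age, has_cancer) record per individual.
D = [(34, True), (61, False), (45, True), (29, False), (52, True)]

queries = [
    (lambda r: r[1], 0.1),                 # A1: cancer count at eps_1 = 0.1
    (lambda r: r[0] > 40, 0.2),            # A2: count of people over 40 at eps_2 = 0.2
    (lambda r: r[0] > 60 and r[1], 0.2),   # A3: cancer patients over 60 at eps_3 = 0.2
]

answers = [laplace_count(D, q, eps, rng) for q, eps in queries]
total_epsilon = sum(eps for _, eps in queries)   # sequential composition: eps = eps_1 + eps_2 + eps_3
print(answers, "-> total privacy budget:", total_epsilon)   # 0.5
```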

  10. Differential Privacy
     • A privacy definition that is independent of the data domain and of the attacker.
     • Resists many attacks that other definitions are susceptible to.
       – Avoids composition attacks.
       – Claimed to be tolerant against adversaries with arbitrary background knowledge.
     • Allows simple, efficient and useful privacy mechanisms.
       – Used in a live US Census product [M et al., ICDE '08].
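Recall the guarantee behind these claims: an algorithm A satisfies ε-differential privacy if, for every pair of databases D, D' that differ in one tuple and every output x, Pr[A(D) = x] ≤ e^ε · Pr[A(D') = x]. The toy check below is only a sketch (the neighboring counts 42 and 43 are hypothetical); it verifies numerically that the Laplace mechanism on a count query satisfies this inequality everywhere.

```python
import numpy as np

epsilon = 0.5

def laplace_density(x, mu, scale):
    """Density of the Laplace mechanism's output when the true count is mu."""
    return np.exp(-np.abs(x - mu) / scale) / (2 * scale)

# Hypothetical neighboring databases: adding one record changes the count 42 -> 43.
xs = np.linspace(-50, 150, 4001)
ratio = laplace_density(xs, 42, 1 / epsilon) / laplace_density(xs, 43, 1 / epsilon)
print(ratio.max() <= np.exp(epsilon) + 1e-9)   # True: Pr[A(D) = x] <= e^eps * Pr[A(D') = x]
```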

  11. Outline
     • Background: Domain-independent privacy definitions
     • No Free Lunch in Data Privacy [Kifer-M SIGMOD '11]
     • Correlations: A case for domain-specific privacy definitions [Kifer-M SIGMOD '11]
     • Pufferfish Privacy Framework [Kifer-M PODS '12]
     • Defining Privacy for Correlated Data [Kifer-M PODS '12 & Ding-M '13] – Current research

  12. No Free Lunch Theorem [Kifer-Machanavajjhala SIGMOD '11]
     It is not possible to guarantee any utility in addition to privacy without making assumptions about
     • the data-generating distribution, and
     • the background knowledge available to an adversary [Dwork-Naor JPC '10].

  13. Discriminant: Sliver of Utility
     • Does an algorithm A provide any utility? w(k, A) > c if there are k inputs {D_1, ..., D_k} such that the outputs A(D_i) differ with probability > c.
     • Example: If A can distinguish between tables of size < 100 and size > 1,000,000,000, then w(2, A) = 1.

  14. Discriminant: Sliver of Utility
     Theorem: The discriminant of the Laplace mechanism is 1.
     Proof sketch:
     • Let D_i be a database with n records, n·i/k of which are cancer patients.
     • Let S_i be the range [n·i/k − n/(3k), n·i/k + n/(3k)]. All S_i are disjoint.
     • Let M be the Laplace mechanism on the query "how many cancer patients are there?".
     • Pr(M(D_i) ∈ S_i) = Pr(|Noise| ≤ n/(3k)) = 1 − e^(−εn/(3k)) = 1 − δ.
     • Hence the discriminant w(k, M) > 1 − δ.
     • As n tends to infinity, the discriminant tends to 1.
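The proof can be sanity-checked by simulation. The sketch below (the parameters n, k, and ε are arbitrary choices, not from the paper) estimates the probability that the Laplace mechanism's output on D_i lands in its own interval S_i; the minimum over i lower-bounds the discriminant and approaches 1 as n grows.

```python
import numpy as np

def bucket_hit_probability(n, k, epsilon, trials=20_000, seed=0):
    """Estimate min_i Pr[M(D_i) lands in its own interval S_i] for the Laplace
    mechanism M on the query 'how many cancer patients are there?'."""
    rng = np.random.default_rng(seed)
    half_width = n / (3 * k)                  # S_i = [n*i/k - n/(3k), n*i/k + n/(3k)]
    hits = []
    for i in range(1, k + 1):
        noise = rng.laplace(scale=1.0 / epsilon, size=trials)
        hits.append(np.mean(np.abs(noise) <= half_width))
    return min(hits)

for n in (100, 1_000, 10_000):
    print(n, bucket_hit_probability(n=n, k=5, epsilon=0.1))
# The estimates track 1 - exp(-eps*n/(3k)) and approach 1 as n grows,
# so the discriminant w(k, M) tends to 1.
```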

  15. Discriminant: Sliver of Utility
     • Does an algorithm A provide any utility? w(k, A) > c if there are k inputs {D_1, ..., D_k} such that the outputs A(D_i) differ with probability > c.
     • If w(k, A) is close to 1, we may get some utility after using A.
     • If w(k, A) is close to 0, A cannot distinguish any k inputs – no utility.

  16. Non-privacy
     • D is randomly drawn from P_data.
     • q is a sensitive query with k answers, such that the adversary knows P_data but cannot guess the value of q(D) a priori.
     • A is not private if the adversary can guess q(D) correctly based on P_data and the output of A.

  17. No Free Lunch Theorem
     • Let A be a privacy mechanism with discriminant w(k, A) > 1 − ε.
     • Let q be a sensitive query with k possible outcomes.
     • Then there exists a data-generating distribution P_data such that
       – q(D) is uniformly distributed a priori, but
       – the adversary wins (guesses q(D) correctly) with probability greater than 1 − ε.
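A simulation in the spirit of this construction (the cancer-count mechanism, the uniform prior, and all parameters are illustrative choices, not the paper's exact construction): P_data puts probability 1/k on each of the databases D_1, ..., D_k that the high-discriminant Laplace mechanism distinguishes, and the sensitive query is q(D_i) = i, so q(D) is uniform a priori; yet the adversary who sees the mechanism's output guesses q(D) almost every time.

```python
import numpy as np

rng = np.random.default_rng(1)
eps_noise, k, n = 0.1, 5, 10_000
trials = 20_000

def mechanism(i):
    # Noisy cancer count for D_i, which has n*i/k cancer patients.
    return n * i / k + rng.laplace(scale=1.0 / eps_noise)

def adversary_guess(output):
    # Pick the index whose expected count the output is nearest to.
    return int(round(output * k / n))

wins = 0
for _ in range(trials):
    i = int(rng.integers(1, k + 1))   # draw D ~ P_data, i.e. q(D) is uniform on {1, ..., k}
    wins += adversary_guess(mechanism(i)) == i
print("adversary guesses q(D) correctly with probability ~", wins / trials)   # close to 1
```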

  18. Outline
     • Background: Domain-independent privacy definitions
     • No Free Lunch in Data Privacy [Kifer-M SIGMOD '11]
     • Correlations: A case for domain-specific privacy definitions [Kifer-M SIGMOD '11]
     • Pufferfish Privacy Framework [Kifer-M PODS '12]
     • Defining Privacy for Correlated Data [Kifer-M PODS '12 & Ding-M '13] – Current research

  19. Correlations & Differential Privacy
     • When an adversary knows that individuals in a table are correlated, he or she can learn sensitive information about individuals even from the output of a differentially private mechanism.
     • Example 1: Contingency tables with pre-released exact counts
     • Example 2: Social networks

  20. Contingency Tables
     Each tuple takes one of k = 4 different values, and the table D induces the counts Count(·,·).
     [Figure: a 2×2 contingency table over D with cell counts 2, 2, 2, 8.]

  21. Contingency Tables
     We want to release the counts Count(·,·) privately.
     [Figure: the same 2×2 table with each cell count replaced by "?".]

  22. Laplace Mechanism
     Release each cell with independent Laplace noise: 2 + Lap(1/ε), 2 + Lap(1/ε), 2 + Lap(1/ε), 8 + Lap(1/ε).
     Each released count is centered at the true value (e.g., mean 8 for the last cell) with variance 2/ε².
     This guarantees ε-differential privacy.
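A minimal sketch of this release, assuming the four cell counts 2, 2, 2, 8: adding or removing one tuple changes exactly one cell by 1, so adding independent Lap(1/ε) noise to every cell satisfies ε-differential privacy, and each released count has variance 2/ε².

```python
import numpy as np

rng = np.random.default_rng(0)
epsilon = 0.5
true_cells = np.array([2, 2, 2, 8])                 # the k = 4 contingency-table counts

# One tuple changes exactly one cell by 1 (L1 sensitivity 1), so independent
# Lap(1/eps) noise on every cell gives eps-differential privacy.
noisy_cells = true_cells + rng.laplace(scale=1.0 / epsilon, size=true_cells.shape)
print(noisy_cells)   # each count: mean = true value, variance = 2/eps^2
```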

  23. Marginal Counts
     [Figure: the noisy cells 2 + Lap(1/ε), 2 + Lap(1/ε), 2 + Lap(1/ε), 8 + Lap(1/ε) are released alongside the exact row and column marginals 4, 10 and 4, 10.]
     Auxiliary marginals are published for the following reasons:
     1. Legal: 2002 Supreme Court case Utah v. Evans.
     2. Contractual: Advertisers must know exact demographics at coarse granularities.
     Does the Laplace mechanism still guarantee privacy?

  24. Marginal Counts
     Combining the noisy cells with the exact marginals yields four independent estimates of the large cell:
     • directly: Count(·,·) = 8 + Lap(1/ε)
     • via the row marginal: 10 − (2 + Lap(1/ε)) = 8 − Lap(1/ε)
     • via the column marginal: 10 − (2 + Lap(1/ε)) = 8 − Lap(1/ε)
     • via the remaining cell and both marginals: Count(·,·) = 8 + Lap(1/ε)

  25. Marginal Counts
     Averaging the k independent estimates gives mean 8 and variance 2/(kε²): the adversary can reconstruct the table with high precision for large k.
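The variance reduction can be reproduced directly. In the sketch below (ε and the number of trials are arbitrary), the exact marginals are combined with the noisy cells to form four independent estimates of the large cell; averaging them empirically drops the variance from 2/ε² to roughly 2/(kε²) with k = 4.

```python
import numpy as np

rng = np.random.default_rng(0)
epsilon, trials = 0.5, 50_000
c = np.array([[2, 2], [2, 8]])                               # true 2x2 contingency table
row, col, total = c.sum(axis=1), c.sum(axis=0), c.sum()      # exact, pre-released marginals

estimates = []
for _ in range(trials):
    noisy = c + rng.laplace(scale=1.0 / epsilon, size=(2, 2))   # eps-DP Laplace release
    # Four estimates of the bottom-right cell, each using a different noise variable:
    e1 = noisy[1, 1]                             # 8 + L22
    e2 = row[1] - noisy[1, 0]                    # 8 - L21
    e3 = col[1] - noisy[0, 1]                    # 8 - L12
    e4 = total - row[0] - col[0] + noisy[0, 0]   # 8 + L11
    estimates.append(np.mean([e1, e2, e3, e4]))

print("variance of a single noisy cell:", 2 / epsilon**2)
print("variance after combining with exact marginals:", np.var(estimates))  # ~ 2/(4*eps^2)
```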

  26. Reason for Privacy Breach
     [Figure: the space of all possible tables. Differential privacy only guarantees that the adversary cannot distinguish pairs of tables that differ in one tuple, but many of these tables do not satisfy the adversary's background knowledge.]

  27. Reason for Privacy Breach
     [Figure: within the space of all possible tables, only the tables consistent with the exact marginals remain, and the adversary can distinguish between every pair of these tables based on the output.]

  28. Correlations & Differential Privacy
     • When an adversary knows that individuals in a table are correlated, he or she can learn sensitive information about individuals even from the output of a differentially private mechanism.
     • Example 1: Contingency tables with pre-released exact counts
     • Example 2: Social networks

  29. A Count Query in a Social Network
     [Figure: a social network with a blue community and a green community; Bob and Alice lie in different communities.]
     • We want to release the number of edges between the blue and green communities.
     • The release should not disclose the presence or absence of the Bob-Alice edge.

  30. Adversary Knows How Social Networks Evolve
     • Let d_1 and d_2 be the expected cross-community edge counts in the two worlds (Bob-Alice edge absent vs. present) under the evolution model. Depending on the model, the gap (d_2 − d_1) is linear or even super-linear in the size of the network.

  31. Differential Privacy Fails to Avoid the Breach
     • World 1 (no Bob-Alice edge): output d_1 + δ, where δ ~ Laplace(1/ε).
     • World 2 (Bob-Alice edge present): output d_2 + δ.
     • The adversary can distinguish between the two worlds if d_2 − d_1 is large relative to the noise.
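A toy simulation of this distinguishing attack (the expected counts d_1 = 400 and d_2 = 700 are invented numbers standing in for whatever the evolution model predicts): the Laplace noise has scale 1/ε regardless of network size, so once the gap d_2 − d_1 dwarfs that scale, the adversary's nearest-mean guess is essentially always correct even though the release is ε-DP with respect to a single edge.

```python
import numpy as np

rng = np.random.default_rng(2)
epsilon = 0.5

# Invented expected cross-community edge counts in the two worlds; under the
# evolution models on the previous slide, the gap d2 - d1 grows with the network,
# while the Laplace noise scale 1/eps does not.
d1, d2 = 400, 700   # world 1: no Bob-Alice edge; world 2: edge present

def adversary_guess(noisy_output):
    # Nearest-mean test between the two hypothesized worlds.
    return "edge present" if abs(noisy_output - d2) < abs(noisy_output - d1) else "edge absent"

for true_count, world in [(d1, "world 1 (edge absent)"), (d2, "world 2 (edge present)")]:
    output = true_count + rng.laplace(scale=1.0 / epsilon)   # eps-DP w.r.t. a single edge
    print(world, "->", adversary_guess(output))              # guess is essentially always correct
```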

  32. Outline
     • Background: Domain-independent privacy definitions
     • No Free Lunch in Data Privacy [Kifer-M SIGMOD '11]
     • Correlations: A case for domain-specific privacy definitions [Kifer-M SIGMOD '11]
     • Pufferfish Privacy Framework [Kifer-M PODS '12]
     • Defining Privacy for Correlated Data [Kifer-M PODS '12 & Ding-M '13] – Current research
