Privacy in Recommender Systems CompSci 590.03 Instructor: Ashwin Machanavajjhala Lecture 21: 590.03 Fall 12 1
Outline
• What is a Recommender System?
• Recommender Systems & Privacy Breaches
• Algorithms for privacy-preserving recommender systems
  – Untrusted Server
  – Trusted Server
• Social Recommendations
  – A theoretical trade-off between privacy and utility

Recommender systems appear in everyday life …

Recommendation Engine
• User activity:
  – Rate items (movies/products/etc.)
  – Click items (news articles/advertisements)
  – Browse items (products/webpages)
• Task: Predict the utility of items to a particular user, based on a database of the activity history of many users.

Database of ratings
[Figure: ratings matrix indexed by users and items.]

Algorithms for Collaborative Filtering
• Neighborhood-based
  – Utility of a new item to a user is proportional to the utility of the item to similar users.
• Latent Factor Models
  – Users and items are described by a latent model with a small number of dimensions
  – Example dimension: user likes “science fiction action movies”
• Accounting for temporal dynamics and biases
  – Popular items have higher utility in general
  – Items may have higher utility due to presentation bias
  – …
• See Yehuda Koren’s tutorial

Example: Neighbor-based Algorithm
Known ratings per user over items I1–I9 (unrated items omitted):
U1: 2 1 1 4 4 3 4 3   (average rating of User 1 = 2.8)
U2: 1 1 5 5 4 4 3
U3: 4 5 5 2 3 3 2     (average rating of User 3 = 3.4)
U4: 5 4 1 3 2 2

Example: Neighbor-based Algorithm
Rescale the users: subtract each user's average rating from that user's ratings (unrated items omitted).
U1: -0.8 -1.8 -1.8 1.2 1.2 0.2 1.2 0.2
U2: -2.3 -2.3 1.7 1.7 0.7 0.7 -0.3
U3: 0.6 1.6 1.6 -1.4 -0.4 -0.4 -1.4
U4: 2.2 1.2 -1.8 0.2 -0.8 -0.8

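The rescaling step can be sketched as follows. This is a minimal illustration that assumes each user's ratings are centered by subtracting that user's own mean; the slide appears to round each mean to one decimal (2.8 for U1) before subtracting, so its entries differ slightly from the exact values computed here. The function name `mean_center` is hypothetical.

```python
def mean_center(ratings):
    """Return (mean, centered ratings) for one user's known ratings."""
    mean = sum(ratings) / len(ratings)
    return mean, [r - mean for r in ratings]

# Known ratings of U1 and U3 from the example (unrated items omitted).
u1 = [2, 1, 1, 4, 4, 3, 4, 3]
u3 = [4, 5, 5, 2, 3, 3, 2]

u1_mean, u1_centered = mean_center(u1)
u3_mean, u3_centered = mean_center(u3)

print(round(u1_mean, 2), [round(x, 2) for x in u1_centered])
print(round(u3_mean, 2), [round(x, 2) for x in u3_centered])
```

Centering removes the effect of users who rate everything high or everything low, so that similarities compare preferences rather than rating habits.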
Example: Neighbor-based Algorithm
Compute similarities between users from their rescaled ratings (unrated items omitted):
U1: -0.8 -1.8 -1.8 1.2 1.2 0.2 1.2 0.2
U2: -2.3 -2.3 1.7 1.7 0.7 0.7 -0.3
U3: 0.6 1.6 1.6 -1.4 -0.4 -0.4 -1.4
U4: 2.2 1.2 -1.8 0.2 -0.8 -0.8

Example: Neighbor-based Algorithm
Predict a missing rating (user-user similarities, plus each user's rescaled rating for item I3):
       U1     U2     U3     U4   |  I3
U1     1      0.78  -0.96  -0.85 | -1.8
U2     0.78   1     -0.74  -0.77 | -2.3
U3    -0.96  -0.74   1      0.83 |  1.6
U4    -0.85  -0.77   0.83   1    |  ?

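One way to fill in the "?" is sketched below. The slides do not state the prediction formula, so this assumes a common choice: a similarity-weighted average of the neighbors' rescaled ratings, normalized by the sum of absolute similarities, then shifted back by the target user's mean. The function name `predict_offset` and the use of 2.8 as U4's mean (read off the rescaled table) are assumptions.

```python
def predict_offset(similarities, neighbor_deviations):
    """Similarity-weighted average of the neighbors' mean-centered ratings."""
    numerator = sum(s * d for s, d in zip(similarities, neighbor_deviations))
    denominator = sum(abs(s) for s in similarities)
    return numerator / denominator if denominator else 0.0

# Similarities of U4 to U1, U2, U3 and those users' rescaled ratings for I3.
sims_u4 = [-0.85, -0.77, 0.83]
devs_i3 = [-1.8, -2.3, 1.6]
u4_mean = 2.8  # assumed from the rescaled table

offset = predict_offset(sims_u4, devs_i3)
prediction = u4_mean + offset
print(round(offset, 2), round(prediction, 2))
```

Under these assumptions the predicted rating is roughly 4.7: U4 is negatively correlated with U1 and U2, who disliked I3, and positively correlated with U3, who liked it.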
Active Privacy Attack
• Adversary knows a subset of items rated/purchased by the target user.
• Adversary creates a new fake account and rates the same set of items.
• Other items highly rated by the target user are recommended to the fake user (adversary).
Known ratings per user over items I1–I9 (unrated items omitted):
U1: 2 1 1 4 4 3 4 3
U2: 1 1 5 5 4 4 3
U3: 4 5 5 2 3 3 2
U4: 5 4 1 3 2 2

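The attack steps above can be illustrated with a toy sketch. Everything here is hypothetical: the database, the exact-match similarity, and the `recommend` helper stand in for a real neighborhood-based recommender, and the adversary is assumed to know the target's ratings (not only the item identities) for the known subset.

```python
def similarity(a, b):
    """Toy similarity: number of items both profiles rated identically."""
    return sum(1 for item, rating in a.items() if b.get(item) == rating)

def recommend(db, user, min_rating=4):
    """Recommend unseen items that the most similar other user rated highly."""
    others = [u for u in db if u != user]
    nearest = max(others, key=lambda u: similarity(db[user], db[u]))
    return sorted(item for item, rating in db[nearest].items()
                  if rating >= min_rating and item not in db[user])

# Hypothetical ratings database (user -> {item: rating}).
db = {
    "target": {"I1": 5, "I2": 4, "I3": 1, "I7": 5, "I9": 4},
    "other1": {"I1": 1, "I4": 5, "I5": 4},
    "other2": {"I2": 2, "I6": 5, "I8": 3},
}

# The adversary knows part of the target's profile and copies it into a fake account.
known = {"I1": 5, "I2": 4, "I3": 1}
db["fake"] = dict(known)

# Recommendations to the fake account reveal other items the target rated highly.
leaked = recommend(db, "fake")
print(leaked)
```

In this toy setup the fake account's nearest neighbor is the target, so the recommendations it receives are exactly the target's other highly rated items.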
Untrusted Server
• The users do not trust the server and do not want to disclose their true set of ratings.
• Distributed Recommendations [Canny SIGIR 02]
  – Protects information from untrusted server
  – Does not protect against active attack
• Randomized Response [Evfimievski et al PODS 02]
  – Protects information from untrusted server
  – Protects against active attack

Randomized Response
[Figure, built up over four slides: Alice (J.S. Bach, painting, nasa.gov, …), Bob (B. Spears, baseball, cnn.com, …) and Chris (B. Marley, camping, linux.org, …) each send their items to the Server, which builds a Data Mining Model for Usage. With randomized response, each user perturbs the items before sending them: Alice sends Metallica, painting, nasa.gov, …; Bob sends B. Spears, soccer, bbc.co.uk, …; Chris sends B. Marley, camping, microsoft.com, …. The Server applies Statistics Recovery before building the Data Mining Model.]

One Algorithm: Select-a-size
• Pick a number j at random
• Select j original items
• Insert new items with probability ρ

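The three steps above can be sketched as a per-user randomization function. This is an illustration under stated assumptions, not the exact operator from the cited paper: `size_probs` (the distribution used to pick j), the item `universe`, and the function name `select_a_size` are hypothetical, and each item outside the original set is assumed to be inserted independently with probability ρ.

```python
import random

def select_a_size(transaction, universe, size_probs, rho, rng=random):
    """Randomize one user's item set before it is sent to the server.

    size_probs[j] is the probability of keeping exactly j original items,
    for j = 0..len(transaction); rho is the per-item insertion probability.
    """
    original = sorted(transaction)
    if len(size_probs) != len(original) + 1:
        raise ValueError("size_probs needs one entry for each j in 0..len(transaction)")
    # Step 1: pick the number j of original items to keep.
    j = rng.choices(range(len(original) + 1), weights=size_probs, k=1)[0]
    # Step 2: keep j original items, chosen uniformly without replacement.
    kept = set(rng.sample(original, j))
    # Step 3: insert each item outside the original set with probability rho.
    outside = sorted(set(universe) - set(transaction))
    inserted = {item for item in outside if rng.random() < rho}
    return kept | inserted

universe = {"J.S. Bach", "painting", "nasa.gov", "Metallica", "soccer", "bbc.co.uk"}
alice = {"J.S. Bach", "painting", "nasa.gov"}

randomized = select_a_size(alice, universe, [0.1, 0.2, 0.3, 0.4], 0.2, random.Random(0))
print(sorted(randomized))
```

The server only ever sees the randomized set; aggregate statistics are recovered afterwards from the known values of `size_probs` and ρ.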
Trusted Server
• Differentially Private Recommendations [McSherry et al KDD 09]

Personalized Social Recommendations
Recommend ads based on private shopping histories of “friends” in the social network.
[Figure: Alice and Betty in a social network, with the brands Armani, Gucci, Prada, Nikon, HP, Nike.]

Social Advertising … in real world
Items (products/people) liked by Alice’s friends are better recommendations for Alice.
[Figure: a product that is followed by your friends …]

Social Advertising … privacy problem
Suppose only the items (products/people) liked by Alice’s friends are recommended to Alice.
Then the fact that “Betty” liked “VistaPrint” is leaked to “Alice”.
[Figure: Betty and Alice.]

Social Advertising … privacy problem
Recommending irrelevant items sometimes improves privacy, but reduces accuracy.
[Figure: Betty and Alice.]

Social Advertising: privacy problem
Alice is recommended ‘X’.
Can we provide accurate recommendations to Alice based on the social network, while ensuring that Alice cannot deduce that Betty likes ‘X’?
[Figure: Alice and Betty.]

Social Recommendations
• A set of agents
  – Yahoo/Facebook users, medical patients
• A set of recommended items
  – Other users (friends), advertisements, products (drugs)
• A network of edges connecting the agents and items
  – Social network, patient-doctor and patient-drug history
• Problem:
  – Recommend a new item i to agent a based on the network

Social Recommendations (this talk)
• A set of agents
  – In this talk: Yahoo/Facebook users
• A set of recommended items
  – In this talk: other users (friends)
• A network of edges connecting the agents and items
  – In this talk: the social network
• Problem:
  – Recommend a new friend i to target user a based on the social network

Social Recommendations
Utility Function: u(a, i) is the utility of recommending candidate i to target a.
[Figure: target node a and candidate recommendations i1, i2, i3, with utilities u(a, i1), u(a, i2), u(a, i3).]

Non-Private Recommendation Algorithm
Utility Function: u(a, i) is the utility of recommending candidate i to target a.
[Figure: target node a and candidates with utilities u(a, i1), u(a, i2), u(a, i3).]
Algorithm
  For each target node a
    For each candidate i
      Compute p(a, i) that maximizes Σ u(a,i) p(a,i)
    endfor
    Randomly pick one of the candidates with probability p(a,i)
  endfor

Good utility functions for link prediction [Liben-Nowell, Kleinberg 2003]
• 2-hop neighborhood
  – Common Neighbors
  – Adamic/Adar
• Holistic
  – Katz (weighted paths)
  – Personalized PageRank

Example: Common Neighbors Utility
Utility Function: u(a, i) is the utility of recommending candidate i to target a.
Common Neighbors Utility: “Alice and Bob are likely to be friends if they have many common neighbors”
u(a, i1) = f(2), u(a, i2) = f(3), u(a, i3) = f(1)
Non-Private Algorithm (two options)
• Return the candidate with max u(a, i)
• Randomly pick a candidate with probability proportional to u(a, i)

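The common-neighbors utility and the two non-private selection rules above can be sketched as follows. The graph is a hypothetical one built to reproduce the slide's counts (2, 3 and 1 common neighbors for i1, i2 and i3), f is assumed to be the identity, and all function names are illustrative.

```python
import random

def common_neighbors(graph, a, i):
    """u(a, i) with f taken as the identity: number of shared neighbors."""
    return len(graph[a] & graph[i])

def candidates(graph, a):
    """Nodes that are neither a itself nor already connected to a."""
    return sorted(n for n in graph if n != a and n not in graph[a])

def recommend_max(graph, a):
    """Option 1: return the candidate with maximum utility."""
    return max(candidates(graph, a), key=lambda i: common_neighbors(graph, a, i))

def recommend_proportional(graph, a, rng=random):
    """Option 2: sample a candidate with probability proportional to utility."""
    cands = candidates(graph, a)
    weights = [common_neighbors(graph, a, i) for i in cands]
    if sum(weights) == 0:  # no shared neighbors at all: fall back to uniform
        return rng.choice(cands)
    return rng.choices(cands, weights=weights, k=1)[0]

# Hypothetical undirected graph as adjacency sets.
graph = {
    "a":  {"n1", "n2", "n3"},
    "n1": {"a", "i1", "i2"},
    "n2": {"a", "i1", "i2"},
    "n3": {"a", "i2", "i3"},
    "i1": {"n1", "n2"},
    "i2": {"n1", "n2", "n3"},
    "i3": {"n3"},
}

utilities = {i: common_neighbors(graph, "a", i) for i in candidates(graph, "a")}
print(utilities, recommend_max(graph, "a"))
print(recommend_proportional(graph, "a", random.Random(0)))
```

Both rules depend directly on the edges around a, which is why a recommendation can reveal the existence of those edges.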
Other utility functions
• Adamic/Adar
  – Two nodes are more similar if they have more common neighbors that have smaller degrees
• Katz
  – Two nodes are similar if they are connected by shorter paths

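Both utilities can be sketched on a small hypothetical graph. The Adamic/Adar score below uses the usual 1/log(degree) weighting of common neighbors. The Katz score is truncated at `max_len` and counts walks (paths that may revisit nodes), damped by `beta` per step; the truncation, the default `beta`, and the function names are assumptions made for illustration.

```python
import math

def adamic_adar(graph, a, i):
    """Sum of 1/log(degree) over common neighbors; low-degree neighbors count more."""
    return sum(1.0 / math.log(len(graph[z]))
               for z in graph[a] & graph[i] if len(graph[z]) > 1)

def katz(graph, a, i, beta=0.1, max_len=3):
    """Truncated Katz score: sum over l of beta**l * (number of length-l walks a -> i)."""
    walks = {a: 1}  # number of walks of the current length ending at each node
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {}
        for node, count in walks.items():
            for nb in graph[node]:
                nxt[nb] = nxt.get(nb, 0) + count
        walks = nxt
        score += (beta ** length) * walks.get(i, 0)
    return score

# Hypothetical undirected graph as adjacency sets.
graph = {
    "a":  {"n1", "n2", "n3"},
    "n1": {"a", "i1", "i2"},
    "n2": {"a", "i1", "i2"},
    "n3": {"a", "i2", "i3"},
    "i1": {"n1", "n2"},
    "i2": {"n1", "n2", "n3"},
    "i3": {"n3"},
}

for cand in ("i1", "i2", "i3"):
    print(cand, round(adamic_adar(graph, "a", cand), 3), round(katz(graph, "a", cand), 4))
```

Because shorter walks are damped less, candidates reachable through many short connections receive the highest Katz scores.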
Privacy
• Should not disclose the existence of private edges in the network
vs
• Allow recommendations based on private edges