A Privacy-Protecting Architecture for Collaborative Filtering via Forgery and Suppression of Ratings
Javier Parra-Arnau, David Rebollo-Monedero and Jordi Forné
http://sites.google.com/site/javierparraarnau/
Department of Telematics Engineering, Technical University of Catalonia (UPC), Barcelona, Spain
Leuven, Belgium, September 15, 2011
Outline
Introduction
State of the Art
An Architecture for Privacy Protection in Collaborative Filtering based Recommendation Systems
Formulation of the Optimal Trade-Off between Privacy and Utility
Conclusions
Introduction
Information Overload
The amount of information on the Web has grown exponentially since the advent of the Internet.
Collaborative Filtering
A recommendation system is a filtering system that suggests information items that are likely to be of interest to the user.
Recommendation systems based on collaborative filtering (CF) algorithms include Amazon, Digg, Movielens and Netflix.
User Profiles
Users need to communicate their preferences to the recommender in order to obtain a prediction for those items they have not yet considered.
[Figure: histogram of a user profile over movie categories. Categories, as listed: Drama, Thriller, Comedy, Action, Romance, Sci-Fi, War, Documentary, Animation, Western, Adventure, Crime, Mystery, Fantasy, Horror, Children, Musical, Film-Noir, IMAX. Bar values, as listed: 80, 76, 71, 71, 67, 62, 54, 51, 38, 34, 25, 25, 16, 12, 7, 7, 7, 3, 3.]
Privacy Risk
The privacy risks perceived by users include computers “figuring things out” about them, unsolicited marketing, court subpoenas, and government surveillance [Cranor 03].
[Figure: a recommendation system whose predictions reveal that a user is pregnant.]
Forgery and Suppression of Ratings
Submitting false information and refusing to give private information are strategies accepted by users concerned with their privacy [Fox 00, Hoffman 99].
Our approach relies upon the forgery and suppression of ratings.
[Figure: the user has read several books; some of the corresponding ratings are suppressed before submission to the recommendation system, which returns predictions.]
Contribution (I)
Our architecture protects user privacy to a certain extent, at the cost of some utility loss.
Utility loss is measured as the forgery rate and the suppression rate.
Contribution (II)
Mathematical formulation of the optimal trade-off among privacy, forgery rate $\rho$ and suppression rate $\sigma$.
Privacy as the Shannon entropy of the user’s apparent profile:
$$\mathcal{P}(\rho,\sigma) = \max_{\substack{r,\,s \\ r_i \ge 0,\ \sum_i r_i = \rho \\ q_i \ge s_i \ge 0,\ \sum_i s_i = \sigma}} \mathrm{H}\!\left(\frac{q + r - s}{1 + \rho - \sigma}\right)$$
Our proposal could be used in combination with other existing approaches.
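As a rough illustration of the quantities in this formulation, the sketch below builds the apparent profile (q + r - s)/(1 + ρ - σ) from an actual profile q, a forgery strategy r and a suppression strategy s, and checks the constraints on r and s. The function name `apparent_profile`, the tolerance and the example numbers are assumptions made for illustration; the sketch does not solve the maximization itself.

```python
def apparent_profile(q, r, s, tol=1e-9):
    """Return (q + r - s) / (1 + rho - sigma) after checking the constraints.

    q: actual profile (PMF), r: forgery strategy, s: suppression strategy.
    All names here are illustrative, not taken from the original slides.
    """
    if any(ri < 0 for ri in r):
        raise ValueError("forgery components must satisfy r_i >= 0")
    if any(si < 0 or si > qi + tol for qi, si in zip(q, s)):
        raise ValueError("suppression components must satisfy 0 <= s_i <= q_i")
    rho, sigma = sum(r), sum(s)      # forgery rate and suppression rate
    norm = 1 + rho - sigma           # renormalization so the result sums to 1
    if norm <= 0:
        raise ValueError("1 + rho - sigma must be positive")
    return [(qi + ri - si) / norm for qi, ri, si in zip(q, r, s)]


# Hypothetical three-category example.
q = [0.5, 0.3, 0.2]
r = [0.0, 0.0, 0.1]   # forge ratings in the least represented category
s = [0.1, 0.0, 0.0]   # suppress ratings in the most represented category
t = apparent_profile(q, r, s)
```

In this example the apparent profile is flatter than the actual one, which is the effect the forgery and suppression strategies aim for.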
State of the Art
Privacy Protection in Recommendation Systems
The state-of-the-art approaches may be classified according to three main strategies: perturbing the information provided by users [Polat 03, 05, Agrawal 01, Kargupta 03, Huang 05], using cryptographic techniques [Canny 02, Ahmad 07, Zhan 10], and distributing the information collected [Miller 04, Berkovsky 07].
[Figure: perturbation. Users add noise to some of their ratings before sending them to the recommendation system, e.g. 3.2 + 1.5, 2.9 - 0.7, 4.4 - 2.7, 3.3 + 1.0, 3.4 - 0.1 [Polat 03].]
[Figure: cryptographic techniques. Users hold profiles $q_1, \dots, q_5$ and combine them in encrypted form, using $\mathrm{Enc}(q_1) + \cdots + \mathrm{Enc}(q_5) = \mathrm{Enc}(q_1 + \cdots + q_5)$ [Canny 02].]
[Figure: distributing the information collected. Ratings are contrasted with storage at a central server [Miller 04].]
An Architecture for Privacy Protection in CF-based Recommendation Systems
Overview
Profiling is accomplished on the basis of user ratings.
Information items are classified as known or unknown.
Users may wish to submit ratings for unknown items (forgery) and refrain from rating known items (suppression).
[Figure: known items and unknown items, and the recommendation system.]
User Profile Model
[Figure: two examples of user profiles. Movielens: histogram over movie genres (Drama through IMAX). Jinni: tag cloud with descriptors such as Witty, Clever, Buddies, Fall in Love, Humorous, Couple Relations, Parents and Children, Feel Good, Best Friends, Offbeat, Emotional, Slow, Teenage Life, Sincere, Human Spirit, Human Nature, Coming of Age, Touching, Village Life.]
[Toubiana 10, Fredrikson 11] suggest representing user profiles as histograms of absolute frequencies.
We model the profile of a user as a probability mass function (PMF).
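To make the step from a histogram of absolute frequencies to a PMF concrete, a minimal sketch follows. The function name `histogram_to_pmf` and the rating counts are hypothetical and chosen only for illustration.

```python
def histogram_to_pmf(counts):
    """Normalize a histogram of absolute frequencies into relative frequencies."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("the histogram must contain at least one rating")
    return {category: n / total for category, n in counts.items()}


# Hypothetical number of rated items per category.
counts = {"Drama": 8, "Comedy": 6, "Sci-Fi": 4, "Western": 2}
q = histogram_to_pmf(counts)
```

The resulting values are non-negative and sum to one, so q can be treated as a probability distribution over categories.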
User Profile Construction
Our architecture requires estimating the actual profile of a user to help them decide which items should be rated and which should not.
Histogram based on the categories provided by the recommender.
Categorize items by exploring web pages and using the vector space model [Salton 75].
[Figure: an uncategorized item is assigned the category books \ literature & fiction \ genre fiction.]
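One possible way to categorize an item with a vector space model is sketched below: texts are turned into term-frequency vectors and the item is assigned to the category whose reference text has the highest cosine similarity. This is only an assumption about how such a step could look. It uses raw term counts rather than any particular weighting scheme, and the helper names and the reference texts for each category are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words vector: lowercase alphabetic tokens mapped to their counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def categorize(item_text, category_texts):
    """Return the category whose reference text is most similar to the item."""
    item = term_vector(item_text)
    return max(category_texts,
               key=lambda c: cosine(item, term_vector(category_texts[c])))


# Hypothetical reference texts for two categories.
category_texts = {
    "science": "physics universe theory scientists cosmology",
    "sports": "soccer coach team players training",
}
best = categorize("A coach explains how to build a youth soccer team",
                  category_texts)
```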
Adversarial Model
Passive attacker capable of crawling through the items rated by a user.
The attacker observes the apparent user profile t, a perturbed version of the actual user profile q.
[Figure: ratings and predictions exchanged with the recommender. With no protection, the attacker observes q; with forgery and suppression of ratings, the attacker observes t.]
Privacy Measure
We measure privacy as the Shannon entropy of the user’s apparent profile t,
$$\mathrm{H}(t) = -\sum_{i=1}^{n} t_i \log_2 t_i,$$
where n is the number of categories.
Accordingly, privacy is compromised whenever the user’s preferences are biased towards certain categories of interest.
[Figure: two example profiles over categories 1 to 4, labeled minimum privacy and maximum privacy.]
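The two extremes of this measure can be checked with a short sketch. The function name `entropy_bits` and the two four-category profiles are illustrative assumptions; terms with zero probability are skipped, following the usual convention that 0 log 0 = 0.

```python
import math


def entropy_bits(t):
    """Shannon entropy in bits of a PMF given as a sequence of probabilities."""
    return sum(-p * math.log2(p) for p in t if p > 0)


concentrated = [1.0, 0.0, 0.0, 0.0]   # all interest in a single category
uniform = [0.25, 0.25, 0.25, 0.25]    # interest spread evenly over 4 categories

h_min = entropy_bits(concentrated)
h_max = entropy_bits(uniform)
```

A profile concentrated on one category gives zero entropy, while the uniform profile over n categories gives log2 n bits, the largest value the measure can take.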
Architecture
[Figure: block diagram of the architecture, divided into a user side and a network side. Labeled elements: Communication Manager, Category Extractor, Known / Unknown Items Classifier, User Profile Constructor, Forgery and Suppression Generator, Forgery Alarm, Suppression Alarm, Information Provider, Recommendation System. Legend: uncategorized item, categorized item, known item, unknown item, rated item.]
Architecture: Block Functionality (Communication Manager)
Communication with the recommender.
Retrieve information about the items explored by the user.
Example items retrieved:
Description - Starting at the beginning, the book explores how JavaScript originated and evolved into what it is today. A detailed discussion of the components that make up a JavaScript implementation follows, with specific focus on standards such as ECMAScript and the Document Object Model (DOM). Category - books \ computers & internet \ web development. Average Customer Review 4.5/5.
Description - Stephen Hawking, one of the most brilliant theoretical physicists in history, wrote the modern classic A Brief History of Time to help nonscientists understand the questions being asked by scientists today. Category - books \ science. Average Customer Review 4/5.
Description - Written by soccer great and championship Stanford coach Bobby Clark, this book tells you how, starting at point zero, an uninitiated coach can meld kids into a team and help them enjoy one of the most rewarding experiences of their youth. Category - books \ sports \ coaching \ soccer. Average Customer Review 4.5/5.
Description - You’ve made it! Your baby has turned one! Now the real fun begins. From temper tantrums to toilet training, raising a toddler brings its own set of challenges and questions, and Toddler 411 has the answers. Category - books \ parenting & families \ parenting. Average Customer Review 3/5.
Architecture: Block Functionality (Category Extractor)
Obtain categories associated with the items downloaded by the Communication Manager.
Architecture: Block Functionality (Known / Unknown Items Classifier)
The user classifies the items as known or unknown.
[Figure: the example items, labeled with the categories books \ computers & internet \ web development, books \ science, books \ sports \ coaching \ soccer and books \ parenting & families \ parenting, are split into unknown items and known items.]