Recommender Systems Francesco Ricci Free University of Bozen-Bolzano fricci@unibz.it
Content
• Paradox of choice and information overload
• Personalization
• Recommender system
• Step 1: Preference elicitation
• Step 2: Preference prediction - rating estimation techniques
  - Contextualization
• Step 3: Recommendations' presentation
• Issues and problems
• Questions
Explosion of Choice
• A trip to a local supermarket:
  - 85 different varieties and brands of crackers
  - 285 varieties of cookies
  - 165 varieties of "juice drinks"
  - 75 iced teas
  - 275 varieties of cereal
  - 120 different pasta sauces
  - 80 different pain relievers
  - 40 options for toothpaste
  - 95 varieties of snacks (chips, pretzels, etc.)
  - 61 varieties of sun tan oil and sunblock
  - 360 types of shampoo, conditioner, gel, and mousse
  - 90 different cold remedies and decongestants
  - 230 soups, including 29 different chicken soups
  - 175 different salad dressings and, if none of them suited, 15 extra-virgin olive oils and 42 vinegars to make one's own
New Domains for Choice
• Telephone services
• Retirement pensions
• Medical care
• News
• Choosing how to work
• Choosing how to love
• Choosing how to be
Choice and Well-Being
• We have more choice, more freedom, autonomy, and self-determination
• It seems that increased choice improves well-being:
  - added options can only make us better off: those who care will benefit, and those who do not care can always ignore the added options
• Various assessments of well-being have shown that increased affluence has been accompanied by decreased well-being.
Neuroscience and Information Overload
• Neuroscientists have discovered that unproductivity and loss of drive can result from decision overload
• Our brains (which can process about 120 bits per second) are configured to make a certain number of decisions per day; once we reach that limit, we cannot make any more
• Information processing has a cost: we can have trouble separating the trivial from the important, and this processing makes us tired.
Information Overload
• Internet = information overload = having too much information to make a decision or remain informed about a topic
• To make a decision or remain informed about a topic you must perform exploratory search (e.g., comparison, knowledge acquisition, product selection), but you:
  - may not be aware of the range of available options
  - may not know what to search for
  - if presented with some results, may not be able to choose.
Personalization
• "If I have 3 million customers on the Web, I should have 3 million stores on the Web"
  - Jeff Bezos, CEO and founder, Amazon.com
  - Degree in Computer Science
  - $34.2 billion (net worth), ranked no. 15 in the Forbes list of America's wealthiest people
Amazon.it
Movie Recommendation – YouTube
Recommendations account for about 60% of all video clicks from the home page.
Consumer Attitudes
The Long Tail
• An economic model in which the market for non-hits (typically large numbers of low-volume items) can be significant, and sometimes even greater than the market for big hits (typically small numbers of high-volume items).
Goal
• Recommend items that are good for you!
  - relevant
  - improve well-being
  - rational choices
  - optimal
Step 1: Preference Elicitation
Last.fm – Preference Elicitation
Rating Recommendations
Alternative Methods
Remembering
• D. Kahneman (Nobel Prize): what we remember about an experience is determined by the peak-end rule:
  - how the experience felt when it was at its peak (best or worst)
  - how it felt when it ended
• We rely on this summary later to recall how the experience felt and to decide whether to have that experience again
• So how well do we know what we want?
  - It is doubtful that we really prefer one experience to another, very similar one just because the first ended better.
Bias of remembered utility
Step 2: Model Building
Movie rating data

Training data                      Test data
user  movie  date      score       user  movie  date      score
1     21     5/7/02    1           1     62     1/6/05    ?
1     213    8/2/04    5           1     96     9/13/04   ?
2     345    3/6/01    4           2     7      8/18/05   ?
2     123    5/1/05    4           2     3      11/22/05  ?
2     768    7/15/02   3           3     47     6/13/02   ?
3     76     1/22/01   5           3     15     8/12/01   ?
4     45     8/3/00    4           4     41     9/1/00    ?
5     568    9/10/05   1           4     28     8/27/05   ?
5     342    3/5/03    2           5     93     4/4/05    ?
5     234    12/28/00  2           5     74     7/16/03   ?
6     76     8/11/02   5           6     69     2/14/04   ?
6     56     6/15/03   4           6     83     10/3/03   ?
Matrix of ratings
[Figure: matrix of ratings, with axes labelled Items and Users]
Item-to-Item Collaborative Filtering
[Figure: ratings table with the target item and its two neighbour items highlighted]
• Suppose the prediction is made using two nearest neighbours, and that the items most similar to "Titanic" are "Forrest Gump" and "Wall-E"
• w_{titanic, forrest} = 0.85
• w_{titanic, wall-e} = 0.75
• r*_{eric, titanic} = (0.85*5 + 0.75*4)/(0.85 + 0.75) = 4.53, where 5 and 4 are Eric's ratings of "Forrest Gump" and "Wall-E"
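The weighted average on this slide can be reproduced with a few lines of Python. This is a minimal sketch rather than a reference implementation: the function name `predict_item_based` and the dictionary layout are illustrative assumptions, and all similarity weights are assumed to be positive, as they are in the example.

```python
def predict_item_based(user_ratings, similarities, k=2):
    """Similarity-weighted average of the user's ratings on the k items
    most similar to the target item (hypothetical helper)."""
    # Keep only neighbour items that the user has actually rated.
    rated = [(w, user_ratings[item])
             for item, w in similarities.items() if item in user_ratings]
    # Take the k neighbours with the largest similarity weight.
    neighbours = sorted(rated, reverse=True)[:k]
    total_weight = sum(w for w, _ in neighbours)
    if total_weight == 0:
        return None  # no usable neighbours
    return sum(w * r for w, r in neighbours) / total_weight


# Values taken from the slide: Eric's ratings and the similarities to "Titanic".
eric_ratings = {"Forrest Gump": 5, "Wall-E": 4}
titanic_similarities = {"Forrest Gump": 0.85, "Wall-E": 0.75}

prediction = predict_item_based(eric_ratings, titanic_similarities)
print(round(prediction, 2))  # 4.53
```

Dividing by the sum of the weights keeps the prediction on the same scale as the ratings, which is why the result falls between the two neighbour ratings 4 and 5.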
User-Based Collaborative Filtering [Breese et al., 1998]
• A collection of n users U and a collection of m items I
• An n × m matrix of ratings r_{ui}, with r_{ui} = ? if user u did not rate item i
• The prediction for user u and item j is computed as

  $r^*_{uj} = \bar{r}_u + K \sum_{v \in N_j(u)} w_{uv} \, (r_{vj} - \bar{r}_v)$

  where N_j(u) is a set of neighbours of u that have rated j
• Here \bar{r}_u is the average rating of user u, K is a normalization factor such that the absolute values of w_{uv} sum to 1, and w_{uv} is the Pearson correlation of users u and v:

  $w_{uv} = \frac{\sum_{j \in I_{uv}} (r_{uj} - \bar{r}_u)(r_{vj} - \bar{r}_v)}{\sqrt{\sum_{j \in I_{uv}} (r_{uj} - \bar{r}_u)^2 \, \sum_{j \in I_{uv}} (r_{vj} - \bar{r}_v)^2}}$

  with I_{uv} the set of items rated by both u and v
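The prediction rule and the Pearson weight can be sketched in plain Python as follows. The toy ratings, the user names, and the helper names (`mean_rating`, `pearson`, `predict_user_based`) are assumptions made for illustration; every user who rated the target item is treated as a neighbour, whereas a real system would typically keep only the most similar ones.

```python
from math import sqrt


def mean_rating(ratings, u):
    """Average rating of user u over all items u has rated."""
    values = list(ratings[u].values())
    return sum(values) / len(values)


def pearson(ratings, u, v):
    """Pearson correlation of users u and v over their co-rated items."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    mu, mv = mean_rating(ratings, u), mean_rating(ratings, v)
    num = sum((ratings[u][j] - mu) * (ratings[v][j] - mv) for j in common)
    den = sqrt(sum((ratings[u][j] - mu) ** 2 for j in common)
               * sum((ratings[v][j] - mv) ** 2 for j in common))
    return num / den if den else 0.0


def predict_user_based(ratings, u, j):
    """Mean rating of u plus the normalized, weighted deviations of the neighbours."""
    neighbours = [v for v in ratings if v != u and j in ratings[v]]
    weights = {v: pearson(ratings, u, v) for v in neighbours}
    norm = sum(abs(w) for w in weights.values())  # 1/K in the formula
    base = mean_rating(ratings, u)
    if norm == 0:
        return base  # no informative neighbour: fall back to the user's mean
    deviation = sum(w * (ratings[v][j] - mean_rating(ratings, v))
                    for v, w in weights.items())
    return base + deviation / norm


# Hypothetical toy data: user -> {item: rating}
toy_ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 5},
    "cat": {"a": 1, "b": 5, "d": 2},
}

w_ann_bob = pearson(toy_ratings, "ann", "bob")
w_ann_cat = pearson(toy_ratings, "ann", "cat")
ann_d = predict_user_based(toy_ratings, "ann", "d")
print(w_ann_bob, w_ann_cat, ann_d)
```

In this toy data "bob" agrees with "ann" and liked item "d", while "cat" disagrees with "ann" and disliked it, so both neighbours push the prediction above ann's mean. Because the rule adds deviations to a mean, the result is not guaranteed to stay inside the rating scale and may need clipping.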
Latent Factor Models
[Figure: a two-dimensional latent factor space. One axis runs from "serious" to "escapist", the other from "geared towards males" to "geared towards females". Labelled points: Braveheart, The Color Purple, Amadeus, Lethal Weapon, Sense and Sensibility, Ocean's 11, Dave, The Lion King, Dumb and Dumber, The Princess Diaries, Independence Day, Gus]
Basic Matrix Factorization Model
[Figure: a sparse ratings matrix of 6 users × 12 items (at most 72 entries) is approximated by the product of a user-factor matrix (6 × 3 entries) and an item-factor matrix (12 × 3 entries): a rank-3 approximation with 54 total entries]

User factors (6 users × 3 factors):
  .1  -.4   .2
 -.5   .6   .5
 -.2   .3   .5
 1.1  2.1   .3
 -.7  2.1   -2
  -1   .7   .3

Item factors (3 factors × 12 items):
 1.1  -.2   .3   .5   -2  -.5   .8  -.4   .3  1.4  2.4  -.9
 -.8   .7   .5  1.4   .3   -1  1.4  2.9  -.7  1.2  -.1  1.3
 2.1  -.4   .6  1.7  2.4   .9  -.3   .4   .8   .7  -.6   .1
Estimate Unknown Ratings
[Figure: the same 6 × 12 ratings matrix with one unknown entry marked "?", next to the user-factor and item-factor matrices of the rank-3 approximation]
Estimate Unknown Ratings
The unknown rating is estimated as the inner product of the user's factor vector (-0.5, 0.6, 0.5) and the item's factor vector (-2, 0.3, 2.4):
-0.5*(-2) + 0.6*0.3 + 0.5*2.4 = 2.38 ≈ 2.4
[Figure: the same 6 × 12 ratings matrix with the "?" entry replaced by 2.4, next to the user-factor and item-factor matrices of the rank-3 approximation]
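The estimate on this slide is a plain inner product, which the following sketch reproduces. The factor values are transcribed from the rank-3 approximation shown on the slides; reading them as a 6 × 3 user-factor matrix and a 3 × 12 item-factor matrix, and the zero-based indices used to pick the user and the item, are assumptions about the figure layout.

```python
# User factors: one row per user, three latent factors each (6 x 3).
user_factors = [
    [0.1, -0.4, 0.2],
    [-0.5, 0.6, 0.5],
    [-0.2, 0.3, 0.5],
    [1.1, 2.1, 0.3],
    [-0.7, 2.1, -2.0],
    [-1.0, 0.7, 0.3],
]

# Item factors: one row per latent factor, one column per item (3 x 12).
item_factors = [
    [1.1, -0.2, 0.3, 0.5, -2.0, -0.5, 0.8, -0.4, 0.3, 1.4, 2.4, -0.9],
    [-0.8, 0.7, 0.5, 1.4, 0.3, -1.0, 1.4, 2.9, -0.7, 1.2, -0.1, 1.3],
    [2.1, -0.4, 0.6, 1.7, 2.4, 0.9, -0.3, 0.4, 0.8, 0.7, -0.6, 0.1],
]


def estimate_rating(user, item):
    """Inner product of the user's factor row and the item's factor column."""
    return sum(user_factors[user][f] * item_factors[f][item]
               for f in range(len(item_factors)))


# Second user (index 1) and fifth item (index 4), as in the slide's computation.
estimate = estimate_rating(1, 4)
print(round(estimate, 2))  # 2.38
```

The same function fills in any other cell of the matrix, which is how a factor model produces predictions for all unrated user-item pairs at once.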
Matrix factorization as a cost function

$\min_{p_*, q_*} \sum_{\text{known } r_{ui}} (r_{ui} - p_u^T q_i)^2 + \lambda \left( \|p_u\|^2 + \|q_i\|^2 \right)$

• p_u - user-factors of u
• q_i - item-factors of i
• r_{ui} - rating by u for i
• λ(‖p_u‖² + ‖q_i‖²) - regularization
• Optimize by either stochastic gradient descent or alternating least squares
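As a rough illustration of the stochastic gradient descent option, the sketch below minimizes this cost on a handful of made-up ratings. The data, the hyperparameters (`lr`, `reg`, `epochs`, `n_factors`), and the function names are all assumptions, and the regularization term is applied once per observed rating, which is one common reading of the formula.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cost(samples, p, q, reg):
    """Squared error plus L2 regularization, summed over the known ratings."""
    return sum((r - dot(p[u], q[i])) ** 2
               + reg * (dot(p[u], p[u]) + dot(q[i], q[i]))
               for u, i, r in samples)


def train_mf(samples, n_users, n_items, n_factors=2,
             lr=0.02, reg=0.05, epochs=300, seed=0):
    """Learn user factors p and item factors q by stochastic gradient descent."""
    rng = random.Random(seed)
    p = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    order = list(samples)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        rng.shuffle(order)
        for u, i, r in order:
            err = r - dot(p[u], q[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                # Gradient step; the constant factor 2 is absorbed into lr.
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return p, q


# Hypothetical known ratings as (user, item, rating) triples.
toy_samples = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
               (2, 1, 2), (2, 2, 5), (3, 0, 1), (3, 2, 4)]

p_init, q_init = train_mf(toy_samples, 4, 3, epochs=0)  # untrained factors
p_fit, q_fit = train_mf(toy_samples, 4, 3)
cost_before = cost(toy_samples, p_init, q_init, 0.05)
cost_after = cost(toy_samples, p_fit, q_fit, 0.05)
print(cost_before, cost_after)
```

Alternating least squares would instead fix q and solve for p in closed form, then swap the roles; it is not shown here.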
"Core" Recommendation Techniques [Burke, 2007]
• U is a set of users
• I is a set of items/products
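Content-Based Recommender with Centroid
[Figure: documents plotted in a two-dimensional space with axes "sports" and "politics", showing the interesting documents, the not interesting documents, their centroids, the user model, and two candidate documents Doc1 and Doc2]
Doc1 is estimated more interesting than Doc2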
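One simple way to realize the idea on this slide is to take the centroid of the documents the user found interesting as the user model and to rank new documents by cosine similarity to it. The sketch below does this under stated assumptions: the two-dimensional vectors (weights for "sports" and "politics") are invented, and cosine similarity is one possible scoring function, not necessarily the one behind the original figure.

```python
from math import sqrt


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


# Hypothetical document vectors: [weight of "sports", weight of "politics"].
interesting_docs = [[0.9, 0.1], [0.8, 0.3], [0.7, 0.2]]
doc1 = [0.85, 0.20]
doc2 = [0.20, 0.90]

user_model = centroid(interesting_docs)  # centroid of the interesting documents
score_doc1 = cosine(user_model, doc1)
score_doc2 = cosine(user_model, doc2)
print(score_doc1 > score_doc2)  # True
```

A variant that also uses the centroid of the not interesting documents would score each candidate by the difference between its similarities to the two centroids.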
Recommendations can be wrong
• Recommenders tend to recommend items similar to those browsed or purchased in the past
Context-Aware Computing
• Gartner Top 10 strategic technology trends for IT
• Context-aware computing is a style of computing in which situational and environmental information about people, places and things is used to anticipate immediate needs and proactively offer enriched, situation-aware and usable content, functions and experiences.
http://www.gartner.com/it-glossary/context-aware-computing-2
Google Now
https://www.google.com/landing/now/
Types of Context - Mobile [Fling, 2009]
• Physical context
  - time, position, and activity of the user, weather, light, and temperature ...
• Social context
  - the presence and role of other people around the user
• Interaction media context
  - the device used to access the system and the type of media that are browsed and personalized (text, music, images, movies, …)
• Modal context
  - the state of mind of the user, the user's goals, mood, experience, and cognitive capabilities.