
Data Mining Techniques CS 6220 - Section 3 - Fall 2016, Lecture 14



  1. Data Mining Techniques CS 6220 - Section 3 - Fall 2016, Lecture 14. Jan-Willem van de Meent (credit: Andrew Ng, Alex Smola, Yehuda Koren, Stanford CS246)

  2. Recommender Systems

  3. The Long Tail (from: https://www.wired.com/2004/10/tail/)

  4. The Long Tail (from: https://www.wired.com/2004/10/tail/)

  5. The Long Tail (from: https://www.wired.com/2004/10/tail/)

  6. Problem Setting

  7. Problem Setting

  8. Problem Setting

  9. Problem Setting • Task: Predict user preferences for unseen items

  10. Content-based Filtering [Figure: movies such as Braveheart, The Color Purple, Amadeus, Lethal Weapon, Sense and Sensibility, Ocean's 11, Dave, The Lion King, Dumb and Dumber, The Princess Diaries, Independence Day, and Gus plotted along two axes: serious vs. escapist, and geared towards males vs. geared towards females]

  11. Content-based Filtering [Figure: same movie plot as above] Idea: Predict ratings using item features on a per-user basis

  12. Content-based Filtering [Figure: same movie plot as above] Idea: Predict ratings using user features on a per-item basis

  13. Collaborative Filtering [Figure: user Joe connected to movies #1 through #4] Idea: Predict rating based on similarity to other users

  14. Problem Setting • Task: Predict user preferences for unseen items • Content-based filtering: Model user/item features • Collaborative filtering: Implicit similarity of users and items

  15. Recommender Systems
 • Movie recommendation (Netflix)
 • Related product recommendation (Amazon)
 • Web page ranking (Google)
 • Social recommendation (Facebook)
 • News content recommendation (Yahoo)
 • Priority inbox & spam filtering (Google)
 • Online dating (OK Cupid)
 • Computational advertising (Everyone)

  16. Challenges
 • Scalability: millions of objects, 100s of millions of users
 • Cold start: changing user base, changing inventory
 • Imbalanced dataset: user activity / item reviews are power-law distributed
 • Ratings are not missing at random

  17. Running Example: Netflix Data

 Training data                     Test data
 user  movie  date       score    user  movie  date       score
 1     21     5/7/02     1        1     62     1/6/05     ?
 1     213    8/2/04     5        1     96     9/13/04    ?
 2     345    3/6/01     4        2     7      8/18/05    ?
 2     123    5/1/05     4        2     3      11/22/05   ?
 2     768    7/15/02    3        3     47     6/13/02    ?
 3     76     1/22/01    5        3     15     8/12/01    ?
 4     45     8/3/00     4        4     41     9/1/00     ?
 5     568    9/10/05    1        4     28     8/27/05    ?
 5     342    3/5/03     2        5     93     4/4/05     ?
 5     234    12/28/00   2        5     74     7/16/03    ?
 6     76     8/11/02    5        6     69     2/14/04    ?
 6     56     6/15/03    4        6     83     10/3/03    ?

 • Released as part of a $1M competition by Netflix in 2006
 • Prize awarded to BellKor's Pragmatic Chaos in 2009

  18. Running Yardstick: RMSE
 $\mathrm{rmse}(S) = \sqrt{|S|^{-1} \sum_{(i,u) \in S} (\hat{r}_{ui} - r_{ui})^2}$

  19. Running Yardstick: RMSE
 $\mathrm{rmse}(S) = \sqrt{|S|^{-1} \sum_{(i,u) \in S} (\hat{r}_{ui} - r_{ui})^2}$
 (doesn't tell you how to actually do recommendation)
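
 For reference, a minimal NumPy sketch of this yardstick (function and variable names are my own, not from the slides):

```python
import numpy as np

def rmse(r_hat, r):
    """RMSE over a test set S: sqrt(|S|^-1 * sum of squared errors)."""
    r_hat = np.asarray(r_hat, dtype=float)
    r = np.asarray(r, dtype=float)
    return np.sqrt(np.mean((r_hat - r) ** 2))

# Example: predicted vs. true ratings for five (user, item) pairs
print(rmse([3.5, 4.1, 2.0, 4.8, 3.0], [4, 4, 1, 5, 3]))
```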

  20. Ratings aren't everything [Figure: screenshots of the Netflix interface, then and now]

  21. Content-based Filtering

  22. Item-based Features

  23. Item-based Features

  24. Item-based Features

  25. Per-user Regression. Learn a set of regression coefficients for each user:
 $w_u = \mathrm{argmin}_w \, |r_u - Xw|^2$
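
 A small sketch of this per-user fit, assuming a feature matrix restricted to the items the user has rated (the optional ridge term and all names here are assumptions):

```python
import numpy as np

def fit_user_weights(X_u, r_u, lam=0.0):
    """Fit one user's weights w_u = argmin_w |r_u - X w|^2.

    X_u : (n_rated, n_features) features of the items this user rated
    r_u : (n_rated,) this user's ratings
    lam : optional ridge penalty (the slide shows plain least squares)
    """
    n_features = X_u.shape[1]
    A = X_u.T @ X_u + lam * np.eye(n_features)
    b = X_u.T @ r_u
    return np.linalg.solve(A, b)

# Hypothetical example: 4 rated items, 2 features (e.g. action, romance)
X_u = np.array([[0.3, 0.2], [0.9, 0.1], [0.1, 0.8], [0.5, 0.5]])
r_u = np.array([4.0, 5.0, 2.0, 3.5])
w_u = fit_user_weights(X_u, r_u, lam=0.1)
print(X_u @ w_u)  # predicted ratings for the same items
```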

  26. Bias

  27. Bias

  28. Bias [Figure: example user ratings for Moonrise Kingdom (4, 5, 4, 4) alongside item feature values (0.3, 0.2)]

  29. Bias [Figure: same Moonrise Kingdom example] Problem: Some movies are universally loved / hated

  30. Bias [Figure: same Moonrise Kingdom example, with additional ratings (3, 3) from pickier users] Problem: Some movies are universally loved / hated; some users are more picky than others

  31. Bias [Figure: same Moonrise Kingdom example] Problem: Some movies are universally loved / hated; some users are more picky than others. Solution: Introduce a per-movie and per-user bias
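
 A minimal sketch of one way to estimate these biases, by sequential averaging (the slides don't prescribe a fitting procedure; regularized least squares is the other common choice):

```python
from collections import defaultdict

def fit_biases(ratings):
    """ratings: list of (user, item, score) triples.
    Returns the global mean mu and per-user / per-item biases,
    so that the baseline prediction is b_ui = mu + b_u[u] + b_i[i]."""
    mu = sum(r for _, _, r in ratings) / len(ratings)

    item_resid = defaultdict(list)
    for _, i, r in ratings:
        item_resid[i].append(r - mu)
    b_i = {i: sum(v) / len(v) for i, v in item_resid.items()}

    user_resid = defaultdict(list)
    for u, i, r in ratings:
        user_resid[u].append(r - mu - b_i[i])
    b_u = {u: sum(v) / len(v) for u, v in user_resid.items()}
    return mu, b_u, b_i

# Made-up example data
ratings = [(1, "Moonrise Kingdom", 4), (2, "Moonrise Kingdom", 5),
           (3, "Moonrise Kingdom", 4), (1, "Gus", 2), (2, "Gus", 3)]
mu, b_u, b_i = fit_biases(ratings)
print(mu + b_u[1] + b_i["Moonrise Kingdom"])  # baseline prediction
```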

  32. Temporal Effects

  33. Changes in user behavior: Netflix changed its rating labels in 2004

  34. Are movies getting better with time?

  35. Temporal Effects: Are movies getting better with time? Solution: Model temporal effects in the biases, not the weights
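
 One concrete way to write this down (a sketch in the spirit of Koren's Netflix-prize models; this exact parameterization is an assumption, not given on the slide) is to let the item bias vary over coarse time bins:

 $b_{ui}(t) = \mu + b_u + b_i + b_{i,\mathrm{Bin}(t)}$

 where $\mathrm{Bin}(t)$ maps the rating date to, say, a 10-week bin, so shifts such as the 2004 label change are absorbed by the bias rather than by the interaction weights.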

  36. Neighborhood Methods

  37. Neighborhood Based Methods [Figure: user Joe connected to movies #1 through #4] Users and items form a bipartite graph (edges are ratings)

  38. Neighborhood Based Methods
 (user, user) similarity:
 • predict rating based on the average from the k-nearest users
 • good if the item base is smaller than the user base
 • good if the item base changes rapidly
 (item, item) similarity:
 • predict rating based on the average from the k-nearest items
 • good if the user base is small
 • good if the user base changes rapidly

  39. Parzen-Window Style CF
 • Define a similarity $s_{ij}$ between items
 • Find the set $s^k(i, u)$ of $k$-nearest neighbors to $i$ that were rated by user $u$
 • Predict the rating using a weighted average over this set:
 $\hat{r}_{ui} = b_{ui} + \frac{\sum_{j \in s^k(i,u)} s_{ij} (r_{uj} - b_{uj})}{\sum_{j \in s^k(i,u)} s_{ij}}$, where $b_{ui} = \mu + b_u + b_i$
 • How should we define $s_{ij}$?
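
 A sketch of this prediction rule, assuming a precomputed similarity table and a baseline function (all names here are hypothetical):

```python
def predict_rating(u, i, ratings, sim, baseline, k=20):
    """Parzen-window style prediction for user u on item i.

    ratings  : dict mapping user -> {item: rating}
    sim      : dict mapping (i, j) -> similarity s_ij (precomputed)
    baseline : function (u, i) -> b_ui = mu + b_u + b_i
    """
    rated = ratings[u]
    # the k items most similar to i among those u has rated
    neighbors = sorted((j for j in rated if j != i),
                       key=lambda j: sim.get((i, j), 0.0),
                       reverse=True)[:k]
    num = sum(sim.get((i, j), 0.0) * (rated[j] - baseline(u, j))
              for j in neighbors)
    den = sum(sim.get((i, j), 0.0) for j in neighbors)
    # fall back to the baseline when there is no usable neighborhood
    return baseline(u, i) + (num / den if den > 0 else 0.0)
```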

  40. Pearson Correlation Coefficient
 $s_{ij} = \frac{\mathrm{Cov}[r_{ui}, r_{uj}]}{\mathrm{Std}[r_{ui}]\,\mathrm{Std}[r_{uj}]}$
 Each item is rated by a distinct set of users:
 User ratings for item i: ? ? 1 ? ? 5 5 3 ? ? 4 2 ? ? ? 4 ? 5 4 1 ?
 User ratings for item j: 2 ? ? ? 4 2 5 ? ? 1 5 ? ? 2 ? 3 ? ? ? 5 4

  41. (item,item) similarity
 Empirical estimate of the Pearson correlation coefficient:
 $\hat{\rho}_{ij} = \frac{\sum_{u \in U(i,j)} (r_{ui} - b_{ui})(r_{uj} - b_{uj})}{\sqrt{\sum_{u \in U(i,j)} (r_{ui} - b_{ui})^2 \sum_{u \in U(i,j)} (r_{uj} - b_{uj})^2}}$
 Regularize towards 0 for small support:
 $s_{ij} = \frac{|U(i,j)| - 1}{|U(i,j)| - 1 + \lambda} \hat{\rho}_{ij}$
 Regularize towards the baseline for small neighborhoods:
 $\hat{r}_{ui} = b_{ui} + \frac{\sum_{j \in s^k(i,u)} s_{ij} (r_{uj} - b_{uj})}{\lambda + \sum_{j \in s^k(i,u)} s_{ij}}$
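
 The same estimate and shrinkage in a short NumPy sketch (assuming the ratings and baselines of the common raters have already been aligned; names are mine):

```python
import numpy as np

def shrunk_pearson(r_i, r_j, b_i, b_j, lam=100.0):
    """Pearson correlation between items i and j on baseline residuals
    over their common raters U(i, j), shrunk toward 0 for small support.

    r_i, r_j : ratings of the common raters for items i and j
    b_i, b_j : the corresponding baseline values b_ui, b_uj
    lam      : shrinkage strength (lambda in the slide)
    """
    d_i = np.asarray(r_i, dtype=float) - np.asarray(b_i, dtype=float)
    d_j = np.asarray(r_j, dtype=float) - np.asarray(b_j, dtype=float)
    n = len(d_i)  # |U(i, j)|
    rho = (d_i @ d_j) / np.sqrt((d_i @ d_i) * (d_j @ d_j))
    return (n - 1) / (n - 1 + lam) * rho
```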

  42. Similarity for binary labels
 Pearson correlation is not meaningful for binary labels (e.g. views, purchases, clicks).
 Jaccard similarity:
 $s_{ij} = \frac{m_{ij}}{\alpha + m_i + m_j - m_{ij}}$
 Observed / expected ratio:
 $s_{ij} = \frac{\text{observed}}{\text{expected}} \approx \frac{m_{ij}}{\alpha + m_i m_j / m}$
 where $m_i$ is the number of users acting on $i$, $m_{ij}$ the number of users acting on both $i$ and $j$, and $m$ the total number of users.
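
 Both similarity measures as one-liners (the example counts are made up):

```python
def jaccard_sim(m_ij, m_i, m_j, alpha=1.0):
    """Jaccard similarity with smoothing term alpha."""
    return m_ij / (alpha + m_i + m_j - m_ij)

def obs_exp_sim(m_ij, m_i, m_j, m, alpha=1.0):
    """Observed / expected ratio: under independence we'd expect
    roughly m_i * m_j / m users to act on both items."""
    return m_ij / (alpha + m_i * m_j / m)

# Hypothetical counts: 1M users, item i seen by 10k, j by 20k, both by 500
print(jaccard_sim(500, 10_000, 20_000))
print(obs_exp_sim(500, 10_000, 20_000, 1_000_000))
```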

  43. Matrix Factorization Methods

  44. Matrix Factorization [Figure: example user ratings for Moonrise Kingdom (4, 5, 4, 4) alongside item feature values (0.3, 0.2)]

  45. Matrix Factorization [Figure: same Moonrise Kingdom example] Idea: pose as a (biased) matrix factorization problem

  46. Matrix Factorization [Figure: an items × users rating matrix approximated by the product of an item-factor matrix and a factor-user matrix] A rank-3 SVD approximation
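
 The rank-3 approximation on a small made-up matrix, using NumPy's SVD (the values here are for illustration only, not the matrix from the slide):

```python
import numpy as np

# A small fully-observed ratings matrix (items x users)
R = np.array([[1, 3, 5, 5, 4],
              [5, 4, 4, 2, 1],
              [2, 4, 1, 2, 3],
              [2, 4, 5, 4, 2],
              [1, 3, 3, 2, 4]], dtype=float)

# Full SVD, then keep only the top-3 singular values/vectors
U, s, Vt = np.linalg.svd(R, full_matrices=False)
R3 = U[:, :3] @ np.diag(s[:3]) @ Vt[:3, :]
print(np.round(R3, 1))  # best rank-3 approximation in Frobenius norm
```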

  47. Prediction [Figure: the same rank-3 SVD approximation, with one unknown rating marked "?"]

  48. Prediction [Figure: the same rank-3 SVD approximation; the unknown rating is predicted as 2.4]

  49. SVD with missing values [Figure: the same rank-3 factorization, now with unknown entries in the rating matrix]
 • SVD isn't defined when entries are unknown
 • Pose as a regression problem over the observed entries
 • Regularize using the Frobenius norm
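
 Spelled out, the regression problem fits the factors only on the observed entries $S$, with a Frobenius-norm penalty (this reconstruction of the objective is an assumption based on the bullets; bias terms can be added as in the earlier slides):

 $\min_{W, X} \; \sum_{(u,i) \in S} \left( r_{ui} - w_u^\top x_i \right)^2 + \lambda \left( \lVert W \rVert_F^2 + \lVert X \rVert_F^2 \right)$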

  50. Alternating Least Squares [Figure: the same rank-3 factorization with unknown entries]
 • SVD isn't defined when entries are unknown
 • Alternate: fix the item factors $X$ and regress each $w_u$ given $X$; then fix the user factors and regress each item's factors; repeat
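
 A compact ALS sketch under these assumptions (plain factorization without bias terms; function and variable names are mine, not from the slides):

```python
import numpy as np

def als(R, mask, k=3, lam=0.1, iters=20, seed=0):
    """Alternating least squares for R ~ W @ X.T on observed entries.

    R    : (n_users, n_items) rating matrix (zeros where unknown)
    mask : boolean matrix, True where R is observed
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    W = rng.normal(scale=0.1, size=(n_users, k))
    X = rng.normal(scale=0.1, size=(n_items, k))
    I = lam * np.eye(k)
    for _ in range(iters):
        # regress each w_u given X, using only the items u has rated
        for u in range(n_users):
            J = mask[u]
            W[u] = np.linalg.solve(X[J].T @ X[J] + I, X[J].T @ R[u, J])
        # regress each x_i given W, using only the users who rated i
        for i in range(n_items):
            J = mask[:, i]
            X[i] = np.linalg.solve(W[J].T @ W[J] + I, W[J].T @ R[J, i])
    return W, X
```

 Each inner update is an ordinary ridge regression, which is why fixing one factor matrix makes the problem tractable even though the joint objective is non-convex.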
