

  1. CSE 6240: Web Search and Text Mining, Spring 2020
     Recommendation Systems: Part II
     Prof. Srijan Kumar, http://cc.gatech.edu/~srijan
     Srijan Kumar, Georgia Tech, CSE6240 Spring 2020: Web Search and Text Mining

  2. Announcements
     • Project:
       – Final report rubric: released
       – Final presentation: details forthcoming

  3. Recommendation Systems
     • Content-based
     • Collaborative Filtering
     • Latent Factor Models
     • Case Study: Netflix Challenge
     • Deep Recommender Systems
     Slide reference: Mining Massive Datasets, http://mmds.org/

  4. Latent Factor Models
     • These models learn latent factors to represent users and items from the rating matrix
       – Latent factors are not directly observable
       – They are derived from the data
     • Recall: network embeddings
     • Methods:
       – Singular Value Decomposition (SVD)
       – Principal Component Analysis (PCA)
       – Eigendecomposition

  5. Latent Factors: Example
     • Embedding axes are a type of latent factor
     • In a user-movie rating matrix, movie latent factors can represent axes such as:
       – Comedy vs. drama
       – Degree of action
       – Appropriateness for children
     • User latent factors measure a user's affinity towards the corresponding movie factors

  6. Latent Factors: Example
     [Figure: movies plotted on two latent factors. Factor 1 runs from "geared towards males" to "geared towards females"; Factor 2 runs from "funny" to "serious". Movies shown: Braveheart, Amadeus, Lethal Weapon, Sense and Sensibility, Ocean's 11, The Lion King, The Princess Diaries, Independence Day, Dumb and Dumber.]

  7. SVD
     • SVD decomposes an input matrix into multiple factor matrices:
       A = U S Vᵀ
       where, for an m × n input matrix:
       – A: input data matrix
       – U: left singular vectors
       – V: right singular vectors
       – S: diagonal matrix of singular values
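As a sketch of this decomposition (assuming NumPy; the matrix values here are illustrative, not from the slides):

```python
import numpy as np

# Illustrative input matrix A (values made up for this sketch).
A = np.array([[1.0, 3.0, 5.0],
              [5.0, 4.0, 4.0],
              [2.0, 4.0, 1.0]])

# SVD: A = U S V^T.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
S = np.diag(s)                      # singular values on the diagonal

A_rebuilt = U @ S @ Vt              # reconstruct A from the three factors
print(np.allclose(A, A_rebuilt))    # True
```

Multiplying the three factors back together recovers A up to floating-point error.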

  8. SVD
     • SVD gives the minimum reconstruction error (sum of squared errors, SSE):
       min over U, Σ, V of Σ_{ij ∈ A} (A_ij − [U Σ Vᵀ]_ij)²
     • SSE and RMSE are monotonically related:
       RMSE = √(SSE / c), where c is the number of entries
       ⇒ SVD is minimizing RMSE
     • Complication: the sum in the SVD error term is over all entries, but our rating matrix R has missing entries
       – Solution: a missing rating is interpreted as a zero rating
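A quick numeric check of the SSE/RMSE monotonicity (a hedged sketch; the rating vectors are made up):

```python
import numpy as np

# Made-up observed ratings and two candidate reconstructions.
r      = np.array([4.0, 3.0, 5.0, 2.0, 1.0])
pred_a = np.array([3.8, 3.2, 4.6, 2.4, 1.1])
pred_b = np.array([3.0, 4.0, 4.0, 3.0, 2.0])

def sse(p):
    return float(np.sum((r - p) ** 2))

def rmse(p):
    return float(np.sqrt(sse(p) / len(r)))   # RMSE = sqrt(SSE / c)

# Monotonic relation: lower SSE implies lower RMSE.
print((sse(pred_a) < sse(pred_b)) == (rmse(pred_a) < rmse(pred_b)))  # True
```

Since the square root is increasing and c is fixed, any factorization that minimizes SSE also minimizes RMSE.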

  9. SVD on Rating Matrix
     • "SVD" on rating data: R ≈ Q · Pᵀ
     • Each row of Q represents an item
     • Each column of Pᵀ represents a user
     [Figure: a ratings matrix R (items × users) approximated by the product of an item-factor matrix Q and a user-factor matrix Pᵀ.]
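A minimal sketch of this factorization via truncated SVD (assumptions: missing entries filled with 0 as the slides suggest, k = 2 latent factors; the ratings are illustrative):

```python
import numpy as np

# Items in rows, users in columns; 0 marks a missing rating (treated as 0).
R = np.array([[1, 3, 5, 5, 4],
              [5, 4, 4, 2, 1],
              [2, 4, 1, 2, 3],
              [2, 4, 5, 4, 2]], dtype=float)

k = 2                                    # assumed number of latent factors
U, s, Vt = np.linalg.svd(R, full_matrices=False)
Q  = U[:, :k] * s[:k]                    # each row of Q: one item's factors
Pt = Vt[:k, :]                           # each column of Pᵀ: one user's factors

R_hat = Q @ Pt                           # rank-k approximation of R
r_11  = Q[1] @ Pt[:, 1]                  # predicted rating: item 1, user 1
print(R_hat.shape == R.shape)            # True
```

A single predicted rating is just the dot product of one item row of Q with one user column of Pᵀ.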

  10. Ratings as Products of Factors
     • How to estimate the missing rating of user x for item i?
       r̂_xi = q_i · p_x = Σ_f q_if · p_xf
       where q_i = row i of Q and p_x = column x of Pᵀ
     [Figure: the ratings matrix R ≈ Q · Pᵀ, with the unknown entry marked "?".]


  12. Ratings as Products of Factors
     • Estimating the missing rating of user x for item i:
       r̂_xi = q_i · p_x = Σ_f q_if · p_xf = 2.4
       where q_i = row i of Q and p_x = column x of Pᵀ
     [Figure: same matrices as before; the unknown entry is filled in with the estimate 2.4.]

  13. Latent Factor Models: Example
     [Figure: movies plotted in two dimensions; the dimensions have meaning. Factor 1: geared towards males vs. geared towards females; Factor 2: funny vs. serious. Movies shown: Braveheart, Amadeus, Lethal Weapon, Sense and Sensibility, Ocean's 11, The Lion King, The Princess Diaries, Independence Day, Dumb and Dumber.]

  14. Latent Factor Models
     [Figure: users fall in the same two-dimensional space as the movies, showing their preferences.]

  15. SVD: Problems
     • SVD minimizes SSE on the training data
       – We want a large k (number of factors) to capture all the signals
       – But the error on test data begins to rise for k > 2
     • This is a classic example of overfitting:
       – With too much freedom (too many free parameters), the model starts fitting noise
       – The model fits the training data too well and thus does not generalize to unseen test data

  16. Preventing Overfitting
     • To solve overfitting we introduce regularization:
       – Allow a rich model where there are sufficient data
       – Shrink aggressively where data are scarce
       min over P, Q of  Σ_{(x,i) ∈ training} (r_xi − q_i · p_x)²  +  λ₁ Σ_x ‖p_x‖²  +  λ₂ Σ_i ‖q_i‖²
                         "error"                                      "length"
       λ₁, λ₂ … user-set regularization parameters
     • Note: we do not care about the "raw" value of the objective function; we care about the P, Q that achieve the minimum of the objective
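One common way to minimize such a regularized objective is stochastic gradient descent over the observed ratings; a minimal sketch (the rating matrix, k, learning rate, λ values, and epoch count are all illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative rating matrix; 0 marks a missing (unobserved) rating.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
observed = [(i, x) for i in range(R.shape[0])
            for x in range(R.shape[1]) if R[i, x] > 0]

k, lr, lam1, lam2, epochs = 2, 0.01, 0.1, 0.1, 500
Q = 0.1 * rng.standard_normal((R.shape[0], k))   # item factors
P = 0.1 * rng.standard_normal((R.shape[1], k))   # user factors

def objective():
    err = sum((R[i, x] - Q[i] @ P[x]) ** 2 for i, x in observed)
    return err + lam1 * np.sum(P ** 2) + lam2 * np.sum(Q ** 2)

before = objective()
for _ in range(epochs):
    for i, x in observed:                # SGD step per observed rating
        e = R[i, x] - Q[i] @ P[x]        # prediction error on this entry
        qi = Q[i].copy()
        Q[i] += lr * (e * P[x] - lam2 * Q[i])   # gradient step on item factors
        P[x] += lr * (e * qi   - lam1 * P[x])   # gradient step on user factors
after = objective()
print(after < before)
```

The regularization terms in each update shrink the factors towards zero, which is what "shrink aggressively where data are scarce" amounts to in practice.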

  17. The Effect of Regularization
     [Figure: movies plotted in the two-factor space (Factor 1: geared towards males vs. geared towards females; Factor 2: funny vs. serious), now including The Color Purple, illustrating the effect of regularization.]

  18. Modeling Biases and Interactions
     r̂_xi = μ + b_x + b_i + q_i · p_x
     (user bias + movie bias + user-movie interaction)
     • Baseline predictor (μ + b_x + b_i):
       – Separates users and movies
       – Benefits from insights into the user's behavior
       – Among the main practical contributions of the competition
     • User-movie interaction (q_i · p_x):
       – Characterizes the matching between users and movies
       – Attracts most research in the field
       – Benefits from algorithmic and mathematical innovations
     • μ = overall mean rating; b_x = bias of user x; b_i = bias of movie i

  19. Baseline Predictor
     • We have expectations on the rating by user x of movie i, even without estimating x's attitude towards movies like i:
       – Rating scale of user x
       – Values of other ratings the user gave recently (day-specific mood, anchoring, multi-user accounts)
       – (Recent) popularity of movie i
       – Selection bias; related to the number of ratings the user gave on the same day ("frequency")

  20. Putting It All Together
     r̂_xi = μ + b_x + b_i + q_i · p_x
     (overall mean rating + bias for user x + bias for movie i + user-movie interaction)
     • Example:
       – Mean rating: μ = 3.7
       – You are a critical reviewer: your ratings are 1 star lower than the mean: b_x = −1
       – Star Wars gets a mean rating 0.5 higher than the average movie: b_i = +0.5
       – Predicted rating for you on Star Wars: 3.7 − 1 + 0.5 = 3.2
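The slide's arithmetic as a tiny sketch (the latent-factor vectors are assumed zero here, so only the baseline terms contribute):

```python
import numpy as np

mu  = 3.7                 # overall mean rating
b_x = -1.0                # user bias: a critical reviewer
b_i = 0.5                 # movie bias: Star Wars rated above average
q_i = np.zeros(2)         # assumed zero interaction for this example
p_x = np.zeros(2)

r_hat = mu + b_x + b_i + q_i @ p_x
print(round(r_hat, 1))    # 3.2
```

With a learned, nonzero q_i · p_x term, the interaction would shift this baseline prediction up or down.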
