CSE 6240: Web Search and Text Mining, Spring 2020
Recommendation Systems: Part II
Prof. Srijan Kumar
http://cc.gatech.edu/~srijan
Srijan Kumar, Georgia Tech, CSE6240 Spring 2020: Web Search and Text Mining
Announcements
• Project:
  – Final report rubric: released
  – Final presentation: details forthcoming
Recommendation Systems
• Content-based
• Collaborative Filtering
• Latent Factor Models
• Case Study: Netflix Challenge
• Deep Recommender Systems
Slide reference: Mining Massive Datasets, http://mmds.org/
Latent Factor Models
• These models learn latent factors to represent users and items from the rating matrix
  – Latent factors are not directly observable
  – They are derived from the data
• Recall: network embeddings
• Methods:
  – Singular Value Decomposition (SVD)
  – Principal Component Analysis (PCA)
  – Eigendecomposition
Latent Factors: Example
• Embedding axes are a type of latent factor
• In a user-movie rating matrix:
  – Movie latent factors can represent axes such as:
    » Comedy vs. drama
    » Degree of action
    » Appropriateness for children
  – User latent factors measure a user's affinity towards the corresponding movie factors
Latent Factors: Example
[Figure: movies plotted in a two-dimensional latent space. Factor 1 runs from "geared towards males" to "geared towards females"; Factor 2 runs from "serious" to "funny". Movies shown: Braveheart, Amadeus, Lethal Weapon, Sense and Sensibility, Ocean's 11, The Lion King, The Princess Diaries, Independence Day, Dumb and Dumber.]
SVD
• SVD decomposes an input matrix into multiple factor matrices:
  A = U S V^T
  – A: input data matrix (m × n)
  – U: left singular vectors
  – S: diagonal matrix of singular values
  – V: right singular vectors
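A minimal numpy sketch of the decomposition; the matrix values are illustrative, not taken from the slides:

```python
import numpy as np

# Toy input data matrix A (m = 4 rows, n = 3 columns); values are illustrative.
A = np.array([[1.0, 3.0, 5.0],
              [5.0, 4.0, 2.0],
              [2.0, 4.0, 1.0],
              [4.0, 3.0, 4.0]])

# A = U @ diag(s) @ Vt: left singular vectors, singular values
# (sorted in decreasing order), right singular vectors (transposed).
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Multiplying the factors back together recovers A (up to float error).
A_rec = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rec))  # True
```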
SVD
• SVD gives the minimum reconstruction error (sum of squared errors, SSE):
  min_{U,Σ,V} Σ_{ij∈A} (A_ij − [UΣV^T]_ij)²
• SSE and RMSE are monotonically related:
  – RMSE = √(SSE / c), where c is the number of entries ⇒ SVD is minimizing RMSE
• Complication: the sum in the SVD error term is over all entries. But our rating matrix R has missing entries.
  – Solution: no-rating is interpreted as zero-rating
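The SSE/RMSE relationship can be checked numerically; the rating values below are hypothetical:

```python
import numpy as np

# Hypothetical true vs. predicted ratings over c entries.
true = np.array([3.0, 4.0, 5.0, 2.0])
pred = np.array([2.5, 4.5, 4.0, 2.0])

c = len(true)
sse = np.sum((true - pred) ** 2)  # sum of squared errors
rmse = np.sqrt(sse / c)          # RMSE = sqrt(SSE / c)

# For a fixed number of entries c, RMSE is a monotone function of SSE,
# so a factorization that minimizes SSE also minimizes RMSE.
print(sse, rmse)
```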
SVD on Rating Matrix
• "SVD" on rating data: R ≈ Q · P^T
• Each row of Q represents an item
• Each column of P^T represents a user
[Figure: an item × user rating matrix R with missing entries, approximated by the product of an items × factors matrix Q and a factors × users matrix P^T.]
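A minimal sketch of the rank-k factorization R ≈ Q · P^T via truncated SVD; the toy ratings below use 0 for a missing entry, following the slide's "no-rating as zero" convention:

```python
import numpy as np

# Toy item-by-user rating matrix; 0 stands in for a missing rating.
R = np.array([[1, 3, 5, 5, 4, 0],
              [5, 4, 4, 2, 1, 3],
              [2, 4, 0, 1, 2, 3],
              [4, 3, 4, 2, 2, 5]], dtype=float)

k = 2  # number of latent factors
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Split the truncated factorization as R ≈ Q @ Pt:
# each row of Q represents an item, each column of Pt a user.
Q = U[:, :k] * s[:k]   # items x k
Pt = Vt[:k, :]         # k x users
R_hat = Q @ Pt         # rank-k approximation of R
print(R_hat.round(1))
```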
Ratings as Products of Factors
• How to estimate the missing rating of user x for item i?
  r̂_xi = q_i · p_x = Σ_f q_if · p_xf
  – q_i = row i of Q
  – p_x = column x of P^T
[Figure: the item × user rating matrix R with a missing entry "?", filled in with the estimate 2.4 computed from the corresponding row of Q and column of P^T.]
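The dot-product estimate in a few lines; the factor values here are hypothetical, not the ones in the figure:

```python
import numpy as np

# Hypothetical latent factors (k = 3).
q_i = np.array([1.2, 0.8, -0.3])  # row i of Q (item factors)
p_x = np.array([2.0, 0.5, 1.0])   # column x of P^T (user factors)

# Estimated rating: r̂_xi = q_i · p_x = Σ_f q_if · p_xf
r_hat = q_i @ p_x
print(r_hat)  # ≈ 2.5
```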
Latent Factor Models: Example
Movies plotted in two dimensions. The dimensions have meaning.
[Figure: the two-dimensional latent space (Factor 1: geared towards males vs. females; Factor 2: serious vs. funny) with the movies placed along both axes.]
Latent Factor Models
Users fall in the same space, showing their preferences.
[Figure: the same two-dimensional latent space, now with users placed alongside the movies.]
SVD: Problems
• SVD minimizes SSE on the training data
  – We want a large k (number of factors) to capture all the signals
  – But the error on test data begins to rise for k > 2
• This is a classic example of overfitting:
  – With too much freedom (too many free parameters) the model starts fitting noise
  – The model fits the training data too well and thus does not generalize to unseen test data
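The overfitting effect can be illustrated on synthetic data: reconstruction error on the observed (noisy) matrix always falls as k grows, while error against the underlying signal starts rising once k exceeds the true rank. All values below are synthetic, not from the Netflix data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for ratings: a rank-2 "signal" plus noise.
signal = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
noisy = signal + 0.5 * rng.normal(size=signal.shape)

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
fit_err, gen_err = {}, {}
for k in (1, 2, 5, 10):
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    fit_err[k] = np.sum((noisy - approx) ** 2)   # error on the "training" data
    gen_err[k] = np.sum((signal - approx) ** 2)  # error vs. the true signal
    print(k, round(fit_err[k], 1), round(gen_err[k], 1))

# fit_err keeps falling as k grows (more parameters always fit the observed
# data better), while gen_err bottoms out near the true rank (2) and then
# rises again: the extra factors are fitting noise.
```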
Preventing Overfitting
• To solve overfitting we introduce regularization:
  – Allow a rich model where there are sufficient data
  – Shrink aggressively where data are scarce
  min_{P,Q} Σ_{(x,i)∈training} (r_xi − q_i · p_x)² + λ₁ Σ_x ‖p_x‖² + λ₂ Σ_i ‖q_i‖²
  – The first term is the "error"; the remaining terms penalize the "length" of the factors
  – λ₁, λ₂ are user-set regularization parameters
• Note: we do not care about the "raw" value of the objective function, but we care about the P, Q that achieve the minimum of the objective
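A minimal stochastic gradient descent sketch of this regularized objective, on a tiny hypothetical training set; the gradient steps fold the constant factor 2 into the learning rate, as is common:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny hypothetical training set of (user, item, rating) triples.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
n_users, n_items, k = 3, 3, 2
lam1 = lam2 = 0.1   # regularization parameters lambda_1, lambda_2
lr = 0.05           # learning rate

P = 0.1 * rng.normal(size=(n_users, k))  # user factors p_x
Q = 0.1 * rng.normal(size=(n_items, k))  # item factors q_i

# SGD on: sum (r_xi - q_i.p_x)^2 + lam1 * sum ||p_x||^2 + lam2 * sum ||q_i||^2
for epoch in range(500):
    for x, i, r in ratings:
        p_old = P[x].copy()              # use the same p_x in both updates
        err = r - Q[i] @ p_old
        P[x] += lr * (err * Q[i] - lam1 * p_old)
        Q[i] += lr * (err * p_old - lam2 * Q[i])

# Training error should now be small (but not zero, due to regularization).
print(sum((r - Q[i] @ P[x]) ** 2 for x, i, r in ratings))
```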
The Effect of Regularization
[Figure: the two-dimensional latent space (Factor 1: geared towards males vs. females; Factor 2: serious vs. funny), showing how regularization shifts the learned positions. Movies shown: Braveheart, The Color Purple, Amadeus, Lethal Weapon, Sense and Sensibility, Ocean's 11, The Lion King, The Princess Diaries, Dumb and Dumber, Independence Day.]
Modeling Biases and Interactions
r_xi = μ + b_x + b_i + q_i · p_x
(user bias + movie bias + user-movie interaction)
• Baseline predictor (μ + b_x + b_i):
  – Separates users and movies
  – Benefits from insights into users' behavior
  – Among the main practical contributions of the competition
• User-movie interaction (q_i · p_x):
  – Characterizes the matching between users and movies
  – Attracts most research in the field
  – Benefits from algorithmic and mathematical innovations
• Notation:
  – μ = overall mean rating
  – b_x = bias of user x
  – b_i = bias of movie i
Baseline Predictor
• We have expectations on the rating by user x of movie i, even without estimating x's attitude towards movies like i:
  – Rating scale of user x
  – (Recent) popularity of movie i
  – Values of other ratings the user gave recently (day-specific mood, anchoring, multi-user accounts)
  – Selection bias; related to the number of ratings the user gave on the same day ("frequency")
Putting It All Together
r̂_xi = μ + b_x + b_i + q_i · p_x
(overall mean rating + bias for user x + bias for movie i + user-movie interaction)
• Example:
  – Mean rating: μ = 3.7
  – You are a critical reviewer: your ratings are 1 star lower than the mean: b_x = −1
  – Star Wars gets a mean rating 0.5 higher than the average movie: b_i = +0.5
  – Predicted rating for you on Star Wars: 3.7 − 1 + 0.5 = 3.2
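The Star Wars example as a few lines of arithmetic; the latent-factor values in the last step are hypothetical add-ons, not from the slide:

```python
# Slide's example numbers for the baseline part.
mu  = 3.7    # overall mean rating
b_x = -1.0   # critical reviewer: 1 star below the mean
b_i = 0.5    # Star Wars: 0.5 above the average movie

# Baseline prediction (no interaction term yet): mu + b_x + b_i
r_baseline = mu + b_x + b_i
print(r_baseline)  # 3.2

# The full model adds the user-movie interaction q_i . p_x on top;
# these factor values are hypothetical.
q_i = [0.2, -0.4]
p_x = [0.5, 0.1]
r_full = r_baseline + sum(q * p for q, p in zip(q_i, p_x))
print(round(r_full, 2))  # 3.26
```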