Recommendation Systems, Stony Brook University CSE545, Spring 2019

  1. Recommendation Systems Stony Brook University CSE545, Spring 2019

  2. Recommendation Systems ● What other item will this user like? (based on previously liked items) ● How much will user like item X?

  5. Recommendation Systems

  6. Recommendation Systems Past User Ratings

  7. Recommendation Systems Why Big Data? ● Data with many potential features (and sometimes observations) ● An application of techniques for finding similar items ○ locality sensitive hashing ○ dimensionality reduction

  8. Recommendation System: Example

  11. Enabled by Web Shopping ● Does Wal-Mart have everything you need? ● Many products are of interest only to a small population (i.e. “long-tail products”). ● However, most people buy many products from the long tail. ● Web shopping enables more choices ○ Harder to search ○ Recommendation engines to the rescue (figure source: thelongtail.com)

  15. A Model for Recommendation Systems Given: users, items, utility matrix. Shows (columns): Game of Thrones, Fargo, Brooklyn Nine-Nine, Silicon Valley, Walking Dead. Known ratings per user (rows): A: 4, 5, 3, 3; B: 5, 4, 2; C: 5, 2. The remaining cells (?) are unknown.
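A minimal sketch of how such a utility matrix could be held in memory, assuming NumPy and using `np.nan` for unknown ratings. The rating values come from the slide, but which cells are blank is an assumption made for illustration, as are the variable names.

```python
import numpy as np

shows = ["Game of Thrones", "Fargo", "Brooklyn Nine-Nine",
         "Silicon Valley", "Walking Dead"]
users = ["A", "B", "C"]

# Rows are users, columns are shows; np.nan marks an unknown rating.
# The positions of the blanks are assumed, not taken from the slide.
utility = np.array([
    [4.0, 5.0, np.nan, 3.0, 3.0],        # user A
    [5.0, np.nan, 4.0, np.nan, 2.0],     # user B
    [np.nan, 5.0, np.nan, 2.0, np.nan],  # user C
])

known = ~np.isnan(utility)       # boolean mask of observed ratings
sparsity = 1.0 - known.mean()    # fraction of cells still to predict

# (user, show) pairs whose rating a recommender would have to extrapolate.
unknown_cells = [(users[r], shows[c]) for r, c in zip(*np.where(~known))]
```

Real utility matrices are far sparser than this toy one, so a sparse representation would usually be preferable at scale.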

  16. Recommendation Systems Problems to tackle: 1. Gathering ratings 2. Extrapolate unknown ratings a. Explicit: based on user ratings and reviews (problem: only a few users engage in such tasks) b. Implicit: Learn from actions (e.g. purchases, clicks) (problem: hard to learn low ratings) 3. Evaluation

  17. Recommendation Systems Problems to tackle: 1. Gathering ratings 2. Extrapolate unknown ratings a. Explicit: based on user ratings and reviews (problem: only a few users engage in such tasks) b. Implicit: Learn from actions (e.g. purchases, clicks) (problem: hard to learn low ratings) 3. Evaluation. Common Approaches: 1. Content-based 2. Collaborative 3. Latent Factor

  19. Utility Matrix: rows: N observations (users) o1, o2, o3, …, oN; columns: p features (movies) f1, f2, f3, f4, …, fp

  20. Goal: Complete Matrix (users o1, …, oN by movies f1, …, fp)

  21. Problem: Given Incomplete Matrix (users o1, …, oN by movies f1, …, fp)

  22. Complete Matrix using Latent Factors Dimensionality reduction: the N rows o1, …, oN with p columns f1, f2, f3, f4, …, fp are mapped to the same N rows with p’ columns c1, c2, c3, c4, …, cp’. Try to best represent the original matrix, but with only p’ columns.

  23. Complete Matrix using Latent Factors Find latent factors Reconstruct matrix
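One simple way to sketch the "find latent factors, reconstruct matrix" idea is shown below. It assumes NumPy, fills unknown ratings with each item's mean before factoring, and keeps only `n_factors` singular vectors. The function name, the mean-filling step, and the toy matrix `R` are assumptions for illustration, not necessarily the procedure the slides have in mind.

```python
import numpy as np

def complete_with_latent_factors(utility, n_factors):
    """Fill unknown (NaN) ratings from a rank-n_factors SVD reconstruction."""
    known = ~np.isnan(utility)

    # Assumption: seed unknown cells with the item's mean known rating
    # (or the global mean when an item has no ratings at all).
    global_mean = utility[known].mean()
    col_counts = known.sum(axis=0)
    col_sums = np.where(known, utility, 0.0).sum(axis=0)
    col_means = np.where(col_counts > 0,
                         col_sums / np.maximum(col_counts, 1),
                         global_mean)
    filled = np.where(known, utility, col_means)

    # Find latent factors: filled = U @ diag(d) @ Vt.
    U, d, Vt = np.linalg.svd(filled, full_matrices=False)

    # Reconstruct the matrix from the top n_factors factors only.
    approx = (U[:, :n_factors] * d[:n_factors]) @ Vt[:n_factors, :]

    # Keep observed ratings; use the reconstruction for unknown cells.
    return np.where(known, utility, approx)

# Hypothetical 4-user by 3-item ratings matrix.
R = np.array([
    [5.0, 4.0, np.nan],
    [4.0, np.nan, 1.0],
    [np.nan, 5.0, 2.0],
    [1.0, 1.0, 5.0],
])
completed = complete_with_latent_factors(R, n_factors=2)
```

Treating filled-in means as if they were observed is a crude shortcut; methods that fit the factors only to the known entries avoid that bias.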

  24. Dimensionality Reduction - PCA Linear approximations of the data in $r$ dimensions. Found via Singular Value Decomposition: $X_{[n \times p]} = U_{[n \times r]} D_{[r \times r]} V^T_{[p \times r]}$ X: original matrix, U: “left singular vectors”, D: “singular values” (diagonal), V: “right singular vectors”. Projection (dimensionality-reduced space) in 3 dimensions: $(U_{[n \times 3]} D_{[3 \times 3]} V^T_{[p \times 3]})$. To reduce features in a new dataset: $X_{new} V = X_{new\_small}$
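The projection step can be sketched with NumPy's SVD as follows. The random data, the sizes, and the variable names are hypothetical. PCA is usually applied to mean-centered data; centering is omitted here to mirror the formula on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))   # hypothetical data: n=6 observations, p=4 features

# Thin SVD: X = U @ np.diag(d) @ Vt, singular values in descending order.
U, d, Vt = np.linalg.svd(X, full_matrices=False)

k = 3                         # number of dimensions to keep
V_k = Vt[:k, :].T             # p x k matrix of the top right singular vectors

# Project the original data into the reduced space (equals U_k @ D_k).
X_small = X @ V_k

# Reduce the features of a new dataset with the same V: X_new V = X_new_small.
X_new = rng.normal(size=(2, 4))
X_new_small = X_new @ V_k
```

Because the columns of V are orthonormal, `X @ V_k` and `U[:, :k] * d[:k]` give the same coordinates, which is why either form can serve as the reduced representation.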

  25. Dimensionality Reduction - PCA (figure: the $n \times p$ matrix $X$ ≈ its reduced-rank $n \times p$ approximation)

  26. Dimensionality Reduction - PCA - Example $X_{[n \times p]} = U_{[n \times r]} D_{[r \times r]} V^T_{[p \times r]}$ (users-to-movies matrix)

  28. Dimensionality Reduction - PCA Linear approximations of the data in $r$ dimensions. Found via Singular Value Decomposition: $X_{[n \times p]} = U_{[n \times r]} D_{[r \times r]} V^T_{[p \times r]}$ X: original matrix, U: “left singular vectors”, D: “singular values” (diagonal), V: “right singular vectors”. To check how well the original matrix can be reproduced: $Z_{[n \times p]} = U D V^T$. How does Z compare to the original X?
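One way to answer "how does Z compare to X?" is to measure the relative Frobenius-norm error of the rank-k reconstruction. The sketch below assumes NumPy; the helper name `reconstruction_error` and the random test matrix are hypothetical.

```python
import numpy as np

def reconstruction_error(X, k):
    """Relative Frobenius error between X and its rank-k reconstruction Z."""
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    Z = (U[:, :k] * d[:k]) @ Vt[:k, :]    # Z = U_k D_k V_k^T
    return np.linalg.norm(X - Z) / np.linalg.norm(X)

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 5))   # hypothetical n=8 by p=5 matrix

# The error shrinks as more dimensions are kept and vanishes at full rank.
errors = [reconstruction_error(X, k) for k in range(1, 6)]
```

Plotting `errors` against k is a common way to choose how many dimensions to keep.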

  29. Dimensionality Reduction - PCA - Example $X_{[n \times p]} = U_{[n \times r]} D_{[r \times r]} V^T_{[p \times r]}$

  30. Common Approaches 1. Content-based 2. Collaborative 3. Latent Factor

  32. Content-based Rec Systems Recommend items based on their similarity to items the user has rated in the past.

  36. Content-based Rec Systems Based on similarity of items to past items that the user has rated. 1. Build profiles of items (set of features); examples: shows: producer, actors, theme, review; people: friends, posts; pick words with tf-idf. 2. Construct user profile from item profiles; approach: average all item profiles of items they’ve purchased; variation: weight by difference from their average ratings. 3. Predict ratings for new items; approach: compare the user profile $x$ with each item profile $i$.
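The three steps can be sketched as below, assuming NumPy. The item features, the ratings, and the helper names are hypothetical, and cosine similarity is an assumed choice for comparing a user profile with an item profile, since the slide's prediction formula did not survive extraction.

```python
import numpy as np

def user_profile(item_profiles, ratings):
    """Average the rated items' profiles, weighting each by how far its
    rating is from the user's mean rating (the 'variation' on the slide)."""
    rated = ~np.isnan(ratings)
    weights = ratings[rated] - ratings[rated].mean()
    return weights @ item_profiles[rated] / rated.sum()

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Step 1: hypothetical item profiles with features [comedy, drama, fantasy].
item_profiles = np.array([
    [0.0, 1.0, 1.0],   # item 0
    [0.0, 1.0, 0.0],   # item 1
    [1.0, 0.0, 0.0],   # item 2
    [1.0, 0.0, 0.0],   # item 3 (not yet rated)
    [0.0, 1.0, 1.0],   # item 4 (not yet rated)
])
ratings = np.array([5.0, 4.0, 1.0, np.nan, np.nan])

# Step 2: build the user profile from the items they have rated.
profile = user_profile(item_profiles, ratings)

# Step 3: score each unrated item against the user profile.
scores = {int(i): cosine(profile, item_profiles[i])
          for i in np.where(np.isnan(ratings))[0]}
```

In this toy data the user rated drama and fantasy highly and comedy poorly, so the unrated drama/fantasy item scores above the unrated comedy item.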

  39. Why Content Based? Advantages: ● Only need the user’s history ● Captures unique tastes ● Can recommend new items ● Can provide explanations. Drawbacks: ● Need good features ● New users don’t have history ● Doesn’t venture “outside the box” (overspecialized; not exploiting other users’ judgments)

  40. Collaborative Filtering Rec Systems Drawbacks of content-based systems: ● Need good features ● New users don’t have history ● Doesn’t venture “outside the box” (overspecialized; not exploiting other users’ judgments)

  41. Common Approaches 1. Content-based 2. Collaborative 3. Latent Factor

  42. Collaborative Filtering Rec Systems -- neighborhood

  43. Collaborative Filtering Rec Systems Shows (columns): Game of Thrones, Fargo, Brooklyn Nine-Nine, Silicon Valley, Walking Dead. Known ratings per user (rows): A: 4, 5, 2, 3; B: 5, 4, 2; C: 5, 2.
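The transcript ends before the neighborhood method is spelled out, so the following is only a sketch of one common user-user formulation: compute a mean-centered cosine similarity between users, then predict a rating as the similarity-weighted average over the most similar users who rated the item. It assumes NumPy. The rating values come from the slide, but the positions of the blank cells, the function names, and the choice of `k` are assumptions.

```python
import numpy as np

def centered_cosine(u, v):
    """Cosine similarity of mean-centered ratings; unknowns count as 0."""
    cu = np.where(np.isnan(u), 0.0, u - np.nanmean(u))
    cv = np.where(np.isnan(v), 0.0, v - np.nanmean(v))
    denom = np.linalg.norm(cu) * np.linalg.norm(cv)
    return float(cu @ cv / denom) if denom > 0 else 0.0

def predict(utility, user, item, k=2):
    """Similarity-weighted average over the k most similar users who
    rated the item; falls back to the user's own mean rating."""
    candidates = [
        (centered_cosine(utility[user], utility[other]), utility[other, item])
        for other in range(utility.shape[0])
        if other != user and not np.isnan(utility[other, item])
    ]
    # Keep the top-k neighbors, dropping any with non-positive similarity.
    neighbors = [(s, r) for s, r in sorted(candidates, reverse=True)[:k] if s > 0]
    if not neighbors:
        return float(np.nanmean(utility[user]))
    sims, ratings = zip(*neighbors)
    return float(np.dot(sims, ratings) / np.sum(sims))

# Columns: Game of Thrones, Fargo, Brooklyn Nine-Nine, Silicon Valley,
# Walking Dead. Blank positions are assumed for illustration.
utility = np.array([
    [4.0, 5.0, np.nan, 2.0, 3.0],        # user A
    [5.0, np.nan, 4.0, np.nan, 2.0],     # user B
    [np.nan, 5.0, np.nan, 2.0, np.nan],  # user C
])

# Predicted rating of user A for Brooklyn Nine-Nine.
prediction = predict(utility, user=0, item=2)
```

With these assumed blanks, only user B has rated Brooklyn Nine-Nine and B's centered ratings correlate positively with A's, so the prediction for A equals B's rating for that show.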
