Effective Missing Data Prediction for Collaborative Filtering
Hao Ma, Irwin King, and Michael R. Lyu
Department of Computer Science and Engineering

Outline: Introduction; Missing Data Prediction; Empirical Analysis; Conclusions and Future Work


  1. Introduction
     Section topics: Simple Examples of Recommender System; Definitions of Some Concepts; A Simple CF Example; Pearson Correlation Coefficient; Significance Weighting
     Definition of Recommendation Systems
     - Computer programs
     - Predict items that a user may be interested in
     - Items could be movies, music, books, news, web pages, etc.
     - Given some information about the user’s profile

  6. Definition of Collaborative Filtering
     - Making automatic predictions (filtering) about the interests of a user
     - By collecting taste information from many other users (collaborating)

  9. User-based Collaborative Filtering

  15. User-based Collaborative Filtering
      - User-based collaborative filtering predicts the ratings of active users based on the ratings of similar users found in the user-item matrix
      - The similarity between users could be defined as:
        $$\mathrm{Sim}(a,u) = \frac{\sum_{i \in I(a) \cap I(u)} (r_{a,i} - \bar{r}_a) \cdot (r_{u,i} - \bar{r}_u)}{\sqrt{\sum_{i \in I(a) \cap I(u)} (r_{a,i} - \bar{r}_a)^2} \cdot \sqrt{\sum_{i \in I(a) \cap I(u)} (r_{u,i} - \bar{r}_u)^2}}$$
      - $\mathrm{Sim}(a,u)$ ranges over $[-1, 1]$, and a larger value means users $a$ and $u$ are more similar
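The user similarity above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ratings are held in a nested dict (user to item to rating), the barred terms are read as each user's mean over all of their own ratings (the slide does not say which mean is meant), and the names `ratings` and `pearson_user_sim` are made up for the example.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def pearson_user_sim(ratings, a, u):
    """Pearson correlation between users a and u over their co-rated items.

    ratings: dict mapping user -> dict mapping item -> rating (assumed layout).
    Returns 0.0 when there is no overlap or no variance (a convention chosen
    for this sketch, not taken from the slides).
    """
    common = set(ratings[a]) & set(ratings[u])
    if not common:
        return 0.0
    mean_a = mean(ratings[a].values())
    mean_u = mean(ratings[u].values())
    num = sum((ratings[a][i] - mean_a) * (ratings[u][i] - mean_u) for i in common)
    den_a = sqrt(sum((ratings[a][i] - mean_a) ** 2 for i in common))
    den_u = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common))
    if den_a == 0 or den_u == 0:
        return 0.0
    return num / (den_a * den_u)


# Toy data, invented for illustration.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 1},
    "bob": {"m1": 4, "m2": 3, "m3": 2},
    "carol": {"m1": 1, "m2": 3, "m3": 5},
}
sim_ab = pearson_user_sim(ratings, "alice", "bob")    # same taste, close to +1
sim_ac = pearson_user_sim(ratings, "alice", "carol")  # opposite taste, close to -1
```

Alice and Bob rank the three items in the same order, so their similarity sits at the top of the range; Alice and Carol rank them in opposite order, so theirs sits at the bottom.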

  21. Item-based Collaborative Filtering
      - Item-based collaborative filtering predicts the ratings of active users based on the computed information of similar items
      - The similarity between items could be defined as:
        $$\mathrm{Sim}(i,j) = \frac{\sum_{u \in U(i) \cap U(j)} (r_{u,i} - \bar{r}_i) \cdot (r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U(i) \cap U(j)} (r_{u,i} - \bar{r}_i)^2} \cdot \sqrt{\sum_{u \in U(i) \cap U(j)} (r_{u,j} - \bar{r}_j)^2}}$$
      - Like user similarity, item similarity $\mathrm{Sim}(i,j)$ also ranges over $[-1, 1]$
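A comparable sketch for the item similarity, again under assumptions: the same nested-dict layout (user to item to rating), the barred terms read as each item's mean over all users who rated it, and a hypothetical function name `pearson_item_sim`. The only real difference from the user case is that the sums run over users who rated both items.

```python
from math import sqrt


def pearson_item_sim(ratings, i, j):
    """Pearson correlation between items i and j over users who rated both.

    ratings: dict mapping user -> dict mapping item -> rating (assumed layout).
    Returns 0.0 when no user rated both items or a column has no variance
    (a convention chosen for this sketch).
    """
    # Pull out the two item "columns" of the user-item matrix.
    col_i = {u: r[i] for u, r in ratings.items() if i in r}
    col_j = {u: r[j] for u, r in ratings.items() if j in r}
    common = set(col_i) & set(col_j)
    if not common:
        return 0.0
    mean_i = sum(col_i.values()) / len(col_i)
    mean_j = sum(col_j.values()) / len(col_j)
    num = sum((col_i[u] - mean_i) * (col_j[u] - mean_j) for u in common)
    den_i = sqrt(sum((col_i[u] - mean_i) ** 2 for u in common))
    den_j = sqrt(sum((col_j[u] - mean_j) ** 2 for u in common))
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / (den_i * den_j)


# Toy data, invented for illustration.
ratings = {
    "alice": {"m1": 5, "m2": 4, "m3": 1},
    "bob": {"m1": 3, "m2": 3, "m3": 3},
    "carol": {"m1": 1, "m2": 2, "m3": 5},
}
sim_12 = pearson_item_sim(ratings, "m1", "m2")  # rated alike, close to +1
sim_13 = pearson_item_sim(ratings, "m1", "m3")  # rated oppositely, close to -1
```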

  25. An Example

  28. Significance Weighting
      - To avoid overestimating the similarity of users who have only a few co-rated items, we use the following equation:
        $$\mathrm{Sim}'(a,u) = \frac{\min(|I_a \cap I_u|, \gamma)}{\gamma} \cdot \mathrm{Sim}(a,u)$$
        where $|I_a \cap I_u|$ is the number of items which user $a$ and user $u$ rated in common
      - Then the similarity between items could be defined as:
        $$\mathrm{Sim}'(i,j) = \frac{\min(|U_i \cap U_j|, \delta)}{\delta} \cdot \mathrm{Sim}(i,j)$$
        where $|U_i \cap U_j|$ is the number of users who rated both item $i$ and item $j$
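The weighting is a one-line scaling, sketched below. The function name `significance_weight` and the cap value of 30 are assumptions made for illustration; the slides do not give values for γ or δ here. The same function covers both the user case (cap = γ) and the item case (cap = δ).

```python
def significance_weight(sim, n_common, cap):
    """Return Sim' = min(n_common, cap) / cap * sim.

    sim:      raw Pearson similarity
    n_common: number of co-rated items (users) or co-rating users (items)
    cap:      gamma for users, delta for items
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    return min(n_common, cap) / cap * sim


GAMMA = 30  # hypothetical value, not taken from the slides

# Only 3 items in common: the similarity is scaled down to a tenth.
weak = significance_weight(0.9, 3, GAMMA)

# 50 items in common: the cap is reached and the similarity is unchanged.
strong = significance_weight(0.9, 50, GAMMA)
```

With few co-ratings the raw correlation is devalued in proportion to the overlap; once the overlap reaches the cap, the raw value is kept as is.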

  31. Missing Data Prediction
      Section topics: Collaborative Filtering Challenges; User-Item Matrix; Similar Neighbors Selection; Missing Data Prediction; Parameter Discussion
      User-Item Matrix
      Challenges of Collaborative Filtering
      - Data Sparsity
      - Prediction Accuracy
      - Scalability

  36. Challenges of Collaborative Filtering: Data Sparsity, Prediction Accuracy, Scalability
      Data Sparsity
      - Propose an algorithm to increase the density of the User-Item Matrix
      - Only predict some of the missing data
      Prediction Accuracy
      - Adopt significance weighting
      - Linearly combine user information with item information
      - Predict the missing data with high confidence
      - On average, our algorithm improves prediction accuracy by 6.24% over other state-of-the-art methods

  45. User-Item Matrix; Predicted User-Item Matrix

  46. Similar Neighbors Selection
      - For every missing data $r_{u,i}$, a set of similar users $S(u)$ towards user $u$ can be generated according to:
        $$S(u) = \{ u_a \mid \mathrm{Sim}'(u_a, u) > \eta,\ u_a \neq u \}$$
        where $\mathrm{Sim}'(u_a, u)$ is computed using Significance Weighting, and $\eta$ is the user similarity threshold
      - At the same time, for every missing data $r_{u,i}$, a set of similar items $S(i)$ towards item $i$ can be generated according to:
        $$S(i) = \{ i_k \mid \mathrm{Sim}'(i_k, i) > \theta,\ i_k \neq i \}$$
        where $\theta$ is the item similarity threshold
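Neighbor selection is a threshold filter, which the following sketch makes concrete. It assumes the weighted similarities to the target have already been computed and stored in a dict; `select_neighbors`, the user labels, and the threshold value 0.5 are illustrative only. The same function serves for S(u) with threshold η and for S(i) with threshold θ.

```python
def select_neighbors(target, weighted_sims, threshold):
    """Return the candidates whose weighted similarity to `target`
    is strictly greater than `threshold`, excluding `target` itself.

    weighted_sims: dict mapping candidate -> Sim'(candidate, target)
    """
    return {
        cand
        for cand, sim in weighted_sims.items()
        if cand != target and sim > threshold
    }


# Hypothetical weighted similarities of other users to user "u".
sims_to_u = {"u1": 0.8, "u2": 0.3, "u3": 0.5, "u": 1.0}
ETA = 0.5  # hypothetical user similarity threshold

S_u = select_neighbors("u", sims_to_u, ETA)
```

Because the comparison is strict, "u3" with a similarity of exactly 0.5 is left out, and the target is never its own neighbor.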

  49. Missing Data Prediction Algorithm
      - Given the missing data $r_{u,i}$, if $S(u) \neq \emptyset \wedge S(i) \neq \emptyset$, the prediction of missing data $P(r_{u,i})$ is defined as:
        $$P(r_{u,i}) = \lambda \times \left( \bar{u} + \frac{\sum_{u_a \in S(u)} \mathrm{Sim}'(u_a, u) \cdot (r_{u_a,i} - \bar{u}_a)}{\sum_{u_a \in S(u)} \mathrm{Sim}'(u_a, u)} \right) + (1 - \lambda) \times \left( \bar{i} + \frac{\sum_{i_k \in S(i)} \mathrm{Sim}'(i_k, i) \cdot (r_{u,i_k} - \bar{i}_k)}{\sum_{i_k \in S(i)} \mathrm{Sim}'(i_k, i)} \right)$$
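The combined prediction can be sketched as two deviation-from-mean estimates blended by λ. The code below is an illustration, not the authors' implementation. It reads the barred symbols as mean ratings, assumes the caller passes only neighbors that actually have a rating for the entry in question, and assumes all weights are positive (which holds when the thresholds are positive). All names and numbers are hypothetical.

```python
def deviation_estimate(target_mean, neighbors):
    """target_mean + sum(w * (r - m)) / sum(w).

    neighbors: list of (weight, rating, neighbor_mean) triples, where
    weight is Sim', rating is the neighbor's rating for the entry being
    predicted, and neighbor_mean is that neighbor's mean rating.
    """
    total_weight = sum(w for w, _, _ in neighbors)
    weighted_dev = sum(w * (r - m) for w, r, m in neighbors)
    return target_mean + weighted_dev / total_weight


def predict_combined(lam, user_mean, user_neighbors, item_mean, item_neighbors):
    """Blend the user-based and item-based estimates with weight lam."""
    user_part = deviation_estimate(user_mean, user_neighbors)
    item_part = deviation_estimate(item_mean, item_neighbors)
    return lam * user_part + (1 - lam) * item_part


# Toy numbers, invented for illustration.
user_neighbors = [(0.5, 4.0, 3.0), (0.5, 5.0, 4.0)]  # both rate 1 above their mean
item_neighbors = [(1.0, 3.0, 2.5)]                   # rated 0.5 above its mean

prediction = predict_combined(
    lam=0.5,
    user_mean=3.0, user_neighbors=user_neighbors,
    item_mean=2.0, item_neighbors=item_neighbors,
)
```

Here the user-based estimate is 3.0 + 1.0 = 4.0 and the item-based estimate is 2.0 + 0.5 = 2.5, so with λ = 0.5 the blended prediction is 3.25.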

  51. Missing Data Prediction Algorithm
      - If $S(u) \neq \emptyset \wedge S(i) = \emptyset$, the prediction of missing data $P(r_{u,i})$ is defined as:
        $$P(r_{u,i}) = \bar{u} + \frac{\sum_{u_a \in S(u)} \mathrm{Sim}'(u_a, u) \cdot (r_{u_a,i} - \bar{u}_a)}{\sum_{u_a \in S(u)} \mathrm{Sim}'(u_a, u)}$$
      - If $S(u) = \emptyset \wedge S(i) \neq \emptyset$, the prediction of missing data $P(r_{u,i})$ is defined as:
        $$P(r_{u,i}) = \bar{i} + \frac{\sum_{i_k \in S(i)} \mathrm{Sim}'(i_k, i) \cdot (r_{u,i_k} - \bar{i}_k)}{\sum_{i_k \in S(i)} \mathrm{Sim}'(i_k, i)}$$

  54. Missing Data Prediction Algorithm
      If $S(u) = \emptyset \wedge S(i) = \emptyset$, the prediction of missing data $P(r_{u,i})$ is defined as:
      $$P(r_{u,i}) = 0$$
      This differs from all other existing prediction or smoothing methods, which always try to predict every missing entry in the user-item matrix and therefore predict some of the missing data with poor quality.
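The four cases of the prediction rule (both neighbor sets non-empty, only S(u) non-empty, only S(i) non-empty, both empty) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `Neighbor`, `weighted_offset`, and `predict_missing` are hypothetical, each neighbor is assumed to carry a known rating for the target entry, the two mean arguments are the average ratings of user u and item i that lead each fraction, and similarities are assumed to be positive so the denominators are non-zero.

```python
from typing import List, Tuple

# Assumption: each neighbor is (similarity, neighbor's known rating, neighbor's mean rating).
Neighbor = Tuple[float, float, float]


def weighted_offset(neighbors: List[Neighbor]) -> float:
    """Similarity-weighted average of (rating - mean) over the given neighbors."""
    numerator = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    denominator = sum(sim for sim, _, _ in neighbors)
    return numerator / denominator


def predict_missing(user_mean: float, item_mean: float,
                    user_neighbors: List[Neighbor],
                    item_neighbors: List[Neighbor],
                    lam: float) -> float:
    """Return P(r_{u,i}) for the four combinations of empty/non-empty S(u) and S(i)."""
    if user_neighbors and item_neighbors:
        user_part = user_mean + weighted_offset(user_neighbors)
        item_part = item_mean + weighted_offset(item_neighbors)
        return lam * user_part + (1.0 - lam) * item_part
    if user_neighbors:   # S(i) is empty: user-based term only
        return user_mean + weighted_offset(user_neighbors)
    if item_neighbors:   # S(u) is empty: item-based term only
        return item_mean + weighted_offset(item_neighbors)
    return 0.0           # both empty: the entry is left unpredicted


# Illustrative numbers only (not taken from the talk).
similar_users = [(0.8, 5.0, 4.0), (0.2, 3.0, 3.0)]
similar_items = [(0.5, 4.0, 3.0)]
blended = predict_missing(3.0, 3.5, similar_users, similar_items, lam=0.5)
unpredicted = predict_missing(3.0, 3.5, [], [], lam=0.5)
```

Returning 0 in the last branch keeps the entry effectively missing, which is the point of the fourth case: no value is invented when neither users nor items offer reliable neighbors.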

  57. Parameter Discussion: Discussion on γ and δ
      - Employed to avoid overestimating the user similarities and item similarities
      - Too high ⇒ users or items do not have enough neighbors ⇒ decrease of prediction accuracy
      - Too low ⇒ overestimate problem still exists ⇒ decrease of prediction accuracy
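To make the role of γ and δ concrete, here is a small sketch of significance weighting. It assumes the common form in which a raw similarity is scaled by min(n, γ)/γ, where n is the number of co-rated items (or, with δ, the number of common users for an item pair). That exact form is an assumption here, and `significance_weighted` is a hypothetical helper name.

```python
def significance_weighted(raw_similarity: float, n_common: int, threshold: int) -> float:
    """Scale a raw similarity by min(n_common, threshold) / threshold.

    Assumption: 'threshold' plays the role of gamma (user pairs) or delta (item pairs).
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return min(n_common, threshold) / threshold * raw_similarity


# Two users who agree perfectly on only 3 co-rated items are heavily discounted.
few_common = significance_weighted(1.0, n_common=3, threshold=30)
# With at least 'threshold' co-rated items the similarity is left unchanged.
many_common = significance_weighted(0.7, n_common=45, threshold=30)
```

With a very large threshold almost every pair is discounted, so few neighbors survive later selection. With a very small one, similarities built on a handful of shared ratings pass through nearly untouched.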

  61. Parameter Discussion: Discussion on η and θ
      - Thresholds to select neighbors
      - Too high ⇒ few missing data need to be predicted ⇒ user-item matrix is very sparse
      - Too low ⇒ almost all the missing data need to be predicted ⇒ user-item matrix is very dense
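The effect of the thresholds can be illustrated with a short sketch. It assumes that a neighbor set keeps every other user (or item) whose weighted similarity exceeds the threshold, with η applied to users and θ to items. The function name `select_neighbors` and the similarity values are made up for illustration.

```python
from typing import Dict, Set


def select_neighbors(similarities: Dict[str, float], target: str, threshold: float) -> Set[str]:
    """Keep every candidate other than the target whose similarity exceeds the threshold."""
    return {name for name, sim in similarities.items()
            if name != target and sim > threshold}


# Hypothetical weighted similarities between user "u" and four other users.
sims_to_u = {"u": 1.0, "a": 0.9, "b": 0.6, "c": 0.3, "d": 0.05}

strict = select_neighbors(sims_to_u, "u", threshold=0.95)  # empty set: nothing is predicted
loose = select_neighbors(sims_to_u, "u", threshold=0.0)    # every other user qualifies
```

An empty neighbor set means the corresponding missing entries stay unpredicted, so the matrix stays sparse. A near-zero threshold admits almost everyone, so almost every entry gets filled.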

  65. Parameter Discussion: Discussion on λ
      - Determines how closely the rating prediction relies on user information or item information
      - λ = 1 ⇒ prediction depends completely upon user-based information
      - λ = 0 ⇒ prediction depends completely upon item-based information
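As a quick check of the two extremes, substituting λ into the combined prediction rule (the case where both S(u) and S(i) are non-empty) removes one of the two terms. The worked substitution below assumes the mean-centered form of that rule, with $\bar{u}$ and $\bar{i}$ read as the mean ratings of user u and item i:

$$\lambda = 1: \quad P(r_{u,i}) = \bar{u} + \frac{\sum_{u_a \in S(u)} Sim'(u_a, u) \cdot (r_{u_a,i} - \bar{u}_a)}{\sum_{u_a \in S(u)} Sim'(u_a, u)}$$

$$\lambda = 0: \quad P(r_{u,i}) = \bar{i} + \frac{\sum_{i_k \in S(i)} Sim'(i_k, i) \cdot (r_{u,i_k} - \bar{i}_k)}{\sum_{i_k \in S(i)} Sim'(i_k, i)}$$

Any value strictly between 0 and 1 gives a weighted average of these two predictions.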

  69. Parameter Discussion
      Table: The relationship between the parameters and other CF approaches (MDP: Missing Data Prediction)
      λ   η   θ   Related CF Approaches
      1   1   1   User-based CF without MDP
      0   1   1   Item-based CF without MDP
      1   0   0   User-based CF with full MDP
      0   0   0   Item-based CF with full MDP

  71. Datasets: MovieLens
      Section outline (Empirical Analysis): Datasets, Metrics, Summary of Experiments, Comparisons, Impact of Parameters
      - It contains 100,000 ratings (on a 1-5 scale) given by 943 users to 1,682 movies, and each user rated at least 20 movies.
      - The density of the user-item matrix is:
        $$\frac{100000}{943 \times 1682} = 6.30\%$$
      - The statistics of the MovieLens dataset are summarized in the following table:
      Table: Statistics of Dataset MovieLens
      Statistics             User     Item
      Min. Num. of Ratings   20       1
      Max. Num. of Ratings   737      583
      Avg. Num. of Ratings   106.04   59.45
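The density and the two averages follow directly from the three counts quoted for MovieLens. The short check below uses only those counts, and the variable names are illustrative.

```python
n_ratings, n_users, n_items = 100_000, 943, 1_682

# Fraction of the user-item matrix that holds an observed rating.
density = n_ratings / (n_users * n_items)
avg_per_user = n_ratings / n_users   # average number of ratings per user
avg_per_item = n_ratings / n_items   # average number of ratings per movie

print(f"density      = {density:.2%}")       # density      = 6.30%
print(f"avg per user = {avg_per_user:.2f}")  # avg per user = 106.04
print(f"avg per item = {avg_per_item:.2f}")  # avg per item = 59.45
```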

  74. Metrics: Mean Absolute Error
      - We use the Mean Absolute Error (MAE) metric to compare the prediction quality of our proposed approach with that of other collaborative filtering methods
      - MAE is defined as:
        $$MAE = \frac{\sum_{u,i} |r_{u,i} - \hat{r}_{u,i}|}{N},$$
        where $r_{u,i}$ denotes the rating that user $u$ gave to item $i$, $\hat{r}_{u,i}$ denotes the rating of item $i$ by user $u$ as predicted by our approach, and $N$ denotes the number of tested ratings
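A minimal sketch of the MAE computation, assuming the tested ratings and their predictions are given as two equally long sequences. The function name and the sample values are illustrative only.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| over the N tested ratings."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Three held-out ratings and their predictions: errors are 0.5, 0.0 and 1.0.
mae = mean_absolute_error([4.0, 3.0, 5.0], [3.5, 3.0, 4.0])
print(mae)  # 0.5
```

A lower MAE means the predictions are, on average, closer to the ratings users actually gave.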

  77. Summary of Experiments
      - Comparisons with Traditional PCC Methods
      - Comparisons with State-of-the-Art Algorithms
      - Impact of Missing Data Prediction
      - Impact of γ and δ
      - Impact of λ
      - Impact of η and θ

  78. Summary of Experiments: Comparisons with Traditional PCC Methods
      - User-based collaborative filtering using Pearson Correlation Coefficient
      - Item-based collaborative filtering using Pearson Correlation Coefficient

  79. Summary of Experiments: Comparisons with State-of-the-Art Algorithms
      - Similarity Fusion (SF) [J. Wang, et al., SIGIR 2006]
      - Smoothing and Cluster-Based PCC (SCBPCC) [G. Xue, et al., SIGIR 2005]
      - Aspect Model (AM) [T. Hofmann, TOIS 2004]
      - Personality Diagnosis (PD) [D. M. Pennock, et al., UAI 2000]

  80. Summary of Experiments: Impact of Missing Data Prediction
      - Effective Missing Data Prediction (EMDP)
      - Predict Every Missing Data (PEMD)
