Hybrid algorithms for recommending new items

2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems (HetRec 2011), Chicago, IL (USA), October 27th, 2011. Hybrid algorithms for recommending new items. http://dx.doi.org/10.1145/2039320.2039325


  1. Hybrid algorithms for recommending new items
     2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems (HetRec 2011), Chicago, IL (USA), October 27th, 2011
     http://dx.doi.org/10.1145/2039320.2039325
     Roberto Turrin (Moviri, R&D), Paolo Cremonesi (Politecnico di Milano), Fabio Airoldi (Moviri, R&D)

  2. ..in a nutshell (image credits: http://dpaki.com/?p=2591)
     • Hybrid algorithms
     • Real domain requirements
       • scalability
       • modularity
       • many unrated items
     • New-item stressing experiments
     • Datasets
       • Private TV dataset
       • MovieLens

  3. Traditional recommender systems
     • Collaborative (CF)
       • Pros: high quality
       • Cons: new-item problem (since new items do not have ratings); popularity bias
     • Content-based (CBF)
       • Pros: works on new items
       • Cons: low quality (since user ratings are ignored); profile overfitting
     R. Turrin, P. Cremonesi, F. Airoldi - Hybrid algorithms for recommending new items

  4. ..so CF or CBF? ..many variables
     [Figure: quality versus time for CF and CBF, from a new system to a mature system]

  5. TV domain: new items
     • The EPG is characterized by many unrated, new TV programs
     • The percentage of new items cannot be neglected

  6. Existing hybrid algorithms
     • Several hybrid algorithms mix CF and CBF (but also demographic and social data), e.g.:
       • P. Melville, R. J. Mooney, and R. Nagarajan. “Content-boosted collaborative filtering for improved recommendations”, 2002
       • B. Mobasher, X. Jin, and Y. Zhou. “Semantically Enhanced Collaborative Filtering on the Web”, 2003
     • Pros
       • Some approaches show better quality than CF/CBF
     • Cons
       • Low scalability / no real-time recommendations
       • Only partial focus on the new-item problem
       • Do not work with implicit, binary ratings

  7. Our hybrid algorithms
     • GOALS
       • New-item
       • Quality comparable to collaborative
     • REQUIREMENTS:
       • Batch/real-time scalability/complexity
       • Updated recommendations
       • Modularity: ability to re-use existing CF and CBF algorithms
       • Implicit/explicit ratings

  8. Main contributions
     • GOALS
       • New-item
       • Quality comparable to collaborative
     • REQUIREMENTS:
       • Batch/real-time scalability/complexity
       • Updated recommendations
       • Modularity: ability to re-use existing CF and CBF algorithms
       • Implicit/explicit ratings
     • Two hybrid algorithms:
       • extension of the SimComb algorithm
       • introduction of a new hybrid algorithm
     • New-item stressing evaluation

  9. STATE-OF-THE-ART RECOMMENDER ALGORITHMS

  10. Collaborative algorithms
      • User Rating Matrix (URM): the entry in row u and column i is the rating given by user u to item i; in an implicit dataset it is either 1 or 0
      • Implemented strategies:
        • Item-item neighborhood-based (NNCos): recommendations are based on item-item similarities computed with the cosine metric
        • Latent factor models (PureSVD): recommendations are based on hidden factors implicitly discovered by means of a matrix factorization (SVD)
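
The item-item cosine similarity behind a neighborhood model such as NNCos can be sketched as follows. This is a minimal illustration on a toy binary URM, not the authors' implementation; the function name `item_cosine_similarities` and the toy data are assumptions made for the example.

```python
import math

def item_cosine_similarities(urm):
    """Cosine similarity between the item columns of a user rating matrix.

    urm is a list of user rows; urm[u][i] is the rating of user u for item i
    (1 or 0 in an implicit dataset).
    """
    n_items = len(urm[0])
    # One rating profile (column vector) per item.
    cols = [[row[i] for row in urm] for i in range(n_items)]
    norms = [math.sqrt(sum(r * r for r in col)) for col in cols]
    sim = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            if i == j or norms[i] == 0 or norms[j] == 0:
                continue  # no self-similarity; unrated items stay at 0
            dot = sum(a * b for a, b in zip(cols[i], cols[j]))
            sim[i][j] = dot / (norms[i] * norms[j])
    return sim

# Toy implicit URM: 3 users, 4 items; the last item has no ratings.
urm = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
]
sim = item_cosine_similarities(urm)
```

In this toy matrix the last item is unrated, so its whole similarity row is zero and a purely collaborative model has no way to recommend it.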

  11. Content-based algorithm
      • Item-content matrix (ICM): the entry in row i and column f is the weight of feature f in item i, computed as TF-IDF
      • Example of features: genre, actors, directors,…
      • LSA (Latent Semantic Analysis): the ICM is factorized by means of SVD in order to discover latent semantic
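
A small sketch of how an ICM with TF-IDF weights might be built. The slides do not say which TF-IDF variant is used, so the weighting here (term frequency normalized by the number of features of the item, times the log of the inverse document frequency) is an assumption, as are the function name `build_icm` and the toy features. The LSA factorization step is not shown.

```python
import math

def build_icm(item_features):
    """Return (feature list, ICM rows) with TF-IDF weights.

    item_features maps each item to its list of features (genre, actors, ...).
    Assumed weighting: tf = occurrences of f in the item / number of features
    of the item; idf = log(number of items / number of items containing f).
    """
    vocab = sorted({f for feats in item_features.values() for f in feats})
    n_items = len(item_features)
    df = {f: sum(1 for feats in item_features.values() if f in feats)
          for f in vocab}
    icm = []
    for feats in item_features.values():
        row = []
        for f in vocab:
            tf = feats.count(f) / len(feats) if feats else 0.0
            idf = math.log(n_items / df[f])
            row.append(tf * idf)
        icm.append(row)
    return vocab, icm

# Hypothetical items and features, for illustration only.
vocab, icm = build_icm({
    "A": ["drama", "actor_x"],
    "B": ["drama", "actor_y"],
    "C": ["comedy", "actor_x"],
})
```

Features shared by many items (here `drama` and `actor_x`) receive a lower weight than features that identify a single item.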

  12. Hybrid algorithms
      • Interleaved (INTL): trivial hybrid implementation where the final recommendation list is formed by alternating items recommended by the CF algorithm with items recommended by the CBF algorithm
        CF list: Item A, Item B, Item C
        CBF list: Item Z, Item Y, Item X
        Hybrid list: Item A, Item Z, Item B, Item Y
      • SimComb [Mobasher et al. 2004]: two item-item similarity matrices are computed and linearly combined
        $S_{\mathrm{HYBRID}} = (1-\alpha)\, S_{\mathrm{CF}} + \alpha\, S_{\mathrm{CBF}}$, where each $S$ is a matrix of item-item similarities
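
Both baseline hybrids are simple enough to sketch in a few lines. The code below is an illustration under two assumptions: the example lists on the slide read as CF = A, B, C and CBF = Z, Y, X, and α weights the CBF similarities, with (1-α) on the CF similarities. The function names `interleave` and `sim_comb` are made up for the example.

```python
from itertools import zip_longest

def interleave(cf_list, cbf_list, n):
    """Alternate CF and CBF recommendations, skipping duplicates, up to n items."""
    hybrid = []
    for pair in zip_longest(cf_list, cbf_list):
        for item in pair:
            if item is not None and item not in hybrid and len(hybrid) < n:
                hybrid.append(item)
    return hybrid

def sim_comb(sim_cf, sim_cbf, alpha):
    """Linear combination of two item-item similarity matrices of equal shape."""
    return [
        [(1 - alpha) * a + alpha * b for a, b in zip(row_cf, row_cbf)]
        for row_cf, row_cbf in zip(sim_cf, sim_cbf)
    ]

hybrid_list = interleave(["A", "B", "C"], ["Z", "Y", "X"], 4)
# -> ['A', 'Z', 'B', 'Y']
hybrid_sim = sim_comb([[0.0, 0.8], [0.8, 0.0]],
                      [[0.0, 0.2], [0.2, 0.0]],
                      alpha=0.25)
```

With α = 0 the combined matrix reduces to the pure CF similarities, and with α = 1 to the pure CBF similarities.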

  13. PROPOSED HYBRID ALGORITHMS
      • FFA (Filtered Feature Augmentation)
      • SIMinjKnn (Similarity Injection Knn)

  14. Collaborative filtering as main brick
      • CF: we trust CF recommendations when the model has been trained with “enough” information (i.e., ratings)
      • CBF: we add CBF-based data (i.e., ratings) to better train the CF when not enough information is available

  15. Collaborative filtering as main brick
      • CF: we trust CF recommendations when the model has been trained with “enough” information (i.e., ratings)
      • CBF: we add CBF-based data (i.e., features) to better train the CF when not enough information is available

  16. Item-item model
      • KNN item-item similarity matrix, indexed by items i and j
      • A number of recommendation algorithms (both CF and CBF) can compute item-item similarities

  17. Item-item model: real-time recommendations
      [Figure: a user-ratings vector (+, -, ? for unknown) next to the KNN item-item similarity matrix, indexed by items i and j]

  18. Item-item model: real-time recommendations
      [Figure: the user-ratings vector (+, -, ? for unknown) multiplied by the MODEL]
      Real-time requirements:
      • Memory: K * #items
      • Memory: #items
      • Time: f(#ratings, K) * #items
      • Use of existing algorithms
      • Updated recommendations
      • Implicit/explicit ratings
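
A rough sketch of why an item-item KNN model fits these requirements: the model stores at most K neighbors per item, and a recommendation only needs the user's current ratings. The scoring rule below (each rated item adds its similarity to the stored neighbors) is one common choice and is an assumption here, as are the names `top_k_neighbors` and `recommend`.

```python
def top_k_neighbors(sim, k):
    """Keep, for each item, only its k most similar items (KNN sparsification)."""
    model = []
    for i, row in enumerate(sim):
        candidates = [j for j in range(len(row)) if j != i and row[j] > 0]
        ranked = sorted(candidates, key=lambda j: row[j], reverse=True)
        model.append({j: row[j] for j in ranked[:k]})
    return model  # memory: at most k entries per item

def recommend(model, user_ratings, n):
    """Score unrated items from the user's current ratings; return the top n."""
    scores = {}
    for i, rating in user_ratings.items():
        for j, s in model[i].items():
            if j not in user_ratings:
                scores[j] = scores.get(j, 0.0) + rating * s
    # Highest score first; ties broken by item index for determinism.
    return sorted(scores, key=lambda j: (-scores[j], j))[:n]

# Toy symmetric similarity matrix over 4 items (could come from CF or CBF).
sim = [
    [0.0, 0.9, 0.1, 0.5],
    [0.9, 0.0, 0.4, 0.2],
    [0.1, 0.4, 0.0, 0.7],
    [0.5, 0.2, 0.7, 0.0],
]
model = top_k_neighbors(sim, k=2)
top = recommend(model, {0: 1}, n=2)
```

Because the user profile is passed in at request time, a rating added a moment ago is reflected in the next call without retraining the model.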

  19. Filtered Feature Augmentation (FFA)
      • Idea: add pseudo-ratings to the item profiles
      • Motivation
        • Pseudo-ratings model new items
        • Less sparse item profiles
      [Diagram: boxes CONTENT, CBF, Filter, RATINGS, CF, Model]

  20. Filtered Feature Augmentation (FFA)
      • Idea: add pseudo-ratings to the item profiles
      • Motivation
        • Pseudo-ratings model new items
        • Less sparse item profiles
      • Entropy-based filtering (e.g., Gini impurity measure)
      [Diagram: boxes CONTENT, CBF, Filter, RATINGS, CF, Model; one arrow is labeled “predicted ratings”]

  21. Similarity Injection Knn (SIMinjKnn)
      • Idea: mixing CF and CBF similarities
      • Motivation
        • Discovering relationships between new and old items
      [Diagram: boxes CONTENT, CBF, CBF Model, RATINGS, CF, CF Model, Combiner, Model]

  22. EVALUATION

  23. Datasets
      • ML: 1M MovieLens
        • ~6K users, ~3.9K items, 1M ratings
      • TV: an implicit, binary dataset collected from 15,000 IPTV users over a period of six months
        • ~15K users, ~800 rated items out of ~4K, ~26K ratings
        • Multilanguage (mainly German, French) content data; available at http://home.dei.polimi.it/cremones/memo/downloads/TV2.zip

  24. Testing methodology (1)
      • H1: set of existing items
      • H2: set of new items
      • Training set: extracted from H1
      • Test set:
        • (100-β)% existing items, extracted from H1
        • β% new items, extracted from H2
      • Discarded ratings
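
One way to read this split is sketched below. The slides do not give the sampling procedure or the test-set size, so everything beyond the H1/H2 roles and the β proportion is an assumption: ratings are (user, item) pairs, the test set has a fixed size, sampling is uniform, and the H2 ratings that are not sampled into the test set are the discarded ones. The function name `new_item_split` is hypothetical.

```python
import random

def new_item_split(ratings, new_items, test_size, beta, seed=0):
    """Split (user, item) ratings into a training set and a test set.

    new_items plays the role of H2; every other item belongs to H1.
    beta is the percentage of test ratings that refer to new items.
    """
    rng = random.Random(seed)
    h1 = [r for r in ratings if r[1] not in new_items]
    h2 = [r for r in ratings if r[1] in new_items]
    n_new = round(test_size * beta / 100)
    test_new = rng.sample(h2, n_new)              # beta% from H2
    test_old = rng.sample(h1, test_size - n_new)  # (100 - beta)% from H1
    held_out = set(test_old)
    # Training uses H1 ratings only; unsampled H2 ratings are discarded.
    train = [r for r in h1 if r not in held_out]
    return train, test_old + test_new

# Toy data: 10 users x 10 items, with items 8 and 9 treated as new.
ratings = [(u, i) for u in range(10) for i in range(10)]
train, test = new_item_split(ratings, new_items={8, 9}, test_size=20, beta=25)
```

Raising β stresses the algorithms on new items, since a larger share of the test ratings refers to items that never appear in the training set.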

  26. Testing methodology (2)
      • For each <user, item> pair <u,i> in H1+2:
        • Generate the rating prediction for i
        • Generate the rating predictions for every other item
        • Sort the items according to predicted rating
        • There is a “hit” if rank(i) < N, i.e., item i appears in the top-N
      • In our tests, N=20
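
The hit test can be sketched as below, using a 0-based rank so that rank(i) < N means the item is in the top-N. Aggregating hits as a simple fraction of test cases is an assumption, and the names `hit` and `hit_rate` and the toy scores are illustrative.

```python
def hit(scores, test_item, n=20):
    """Return True if test_item is among the n items with the highest score.

    scores maps every candidate item (the test item included) to its
    predicted rating.
    """
    ranked = sorted(scores, key=lambda item: scores[item], reverse=True)
    return ranked.index(test_item) < n  # 0-based rank

def hit_rate(cases, n=20):
    """Fraction of (scores, test_item) cases that produce a hit."""
    return sum(hit(scores, item, n) for scores, item in cases) / len(cases)

# Toy predicted ratings for three items.
scores = {"a": 0.9, "b": 0.5, "c": 0.1}
rate = hit_rate([(scores, "b"), (scores, "c")], n=2)
```

Here item `b` ranks second and counts as a hit for N=2, while item `c` ranks third and does not.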

  27. Non-hybrid algorithms
      [Figure: panels for the ML and TV datasets]

  28. Hybrid algorithms: ML
      [Figure: ML dataset]

  30. Hybrid algorithms: TV
      [Figure: TV dataset]
