On the Effectiveness of Linear Models for One-Class Collaborative Filtering
Suvash Sedhain¹,², Aditya Menon²,¹, Scott Sanner³,¹, Darius Braziunas⁴
¹Australian National University, ²NICTA, ³Oregon State University, ⁴Rakuten Kobo Inc
Recommender Systems
• Recommender systems
 – Objective: present personalized items to users
• Collaborative filtering
 – The de facto method for multi-user recommender systems
 – Find people like you and leverage their preferences
 – One-class: only positive feedback is observed
Sneak Peek: Model Proposal
• Personalized, user-focused linear model
• Convex
• Embarrassingly parallel
 – Each user's model is trained individually
State-of-the-art Collaborative Filtering • Neighborhood methods • Matrix Factorization • SLIM (Sparse Linear Method)
Nearest Neighbors: A Matrix View
[Figure: similarity matrix × interaction matrix = predicted score matrix]
• {Jaccard, Cosine} similarity S_I used in practice
• Keep only the top-k similarities
• Simple, but learning is limited
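The matrix view can be sketched in a few lines. This is an illustrative user-side variant (not the authors' code), using cosine similarity and a top-k cutoff:

```python
import numpy as np

# User-KNN sketch: score(u, i) = sum over u's top-k most similar
# users of similarity(u, v) * R[v, i].
def user_knn_scores(R, k=2):
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    norms[norms == 0] = 1.0               # guard against empty rows
    Rn = R / norms
    S = Rn @ Rn.T                         # cosine similarity between users
    np.fill_diagonal(S, 0.0)              # exclude self-similarity
    for u in range(S.shape[0]):
        drop = np.argsort(S[u])[:-k]      # keep only the top-k neighbors
        S[u, drop] = 0.0
    return S @ R                          # predicted affinity scores

R = np.array([[1, 0, 1, 0],
              [1, 1, 1, 0],
              [0, 1, 0, 1]], dtype=float)
print(user_knn_scores(R).shape)  # (3, 4)
```

Note there is no learning here: the similarity metric and the cutoff k are fixed a priori, which is exactly the limitation the slide points out.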
(Weighted) Matrix Factorization
[Figure: user-item matrix R (m × n) ≈ User Projection (m × k) × Item Projection (k × n)]
• Works well in general, but non-convex!
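As a sketch of the factorization idea: a minimal unweighted matrix factorization by full-batch gradient descent, jointly optimizing both factor matrices (illustrative only, not WRMF itself). The joint objective over (U, V) is non-convex, which is the slide's point:

```python
import numpy as np

# Factor the binary matrix R into U (m x k) and V (k x n) by
# gradient descent on squared error with L2 regularization.
def factorize(R, k=2, steps=200, lr=0.05, reg=0.01, seed=0):
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = 0.1 * rng.standard_normal((m, k))
    V = 0.1 * rng.standard_normal((k, n))
    for _ in range(steps):
        E = R - U @ V                     # residual on all entries
        dU = E @ V.T - reg * U
        dV = U.T @ E - reg * V
        U += lr * dU
        V += lr * dV
    return U, V

R = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 1]], dtype=float)
U, V = factorize(R)
print((U @ V).shape)  # (3, 4)
```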
SLIM
[Figure: user × item matrix R ≈ R × item-item weight matrix]
• Effectively trying to learn item-to-item similarities
• Not user-focused; complicated optimization
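A simplified SLIM-style sketch, with an L2 penalty only and non-negativity approximated by clipping, standing in for the original elastic-net coordinate descent:

```python
import numpy as np

# Learn item-item weights so that R[:, j] ~ R @ w_j with w_j[j] = 0.
def slim_item_weights(R, j, lam=1.0):
    n = R.shape[1]
    mask = np.arange(n) != j              # item j may not explain itself
    X, y = R[:, mask], R[:, j]
    w = np.linalg.solve(X.T @ X + lam * np.eye(n - 1), X.T @ y)
    w = np.maximum(w, 0.0)                # crude non-negativity stand-in
    full = np.zeros(n)
    full[mask] = w
    return full

R = np.array([[1, 0, 1, 1, 0],
              [1, 1, 1, 0, 0],
              [0, 1, 0, 1, 1]], dtype=float)
W = np.column_stack([slim_item_weights(R, j) for j in range(R.shape[1])])
print((R @ W).shape)  # (3, 5)
```

The contrast with the model proposed next: SLIM fits one model per item (item-to-item similarities), not one per user.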
Recommender Systems Desiderata • Learning-based • Convex objective • User-focused • Parallelizable
Comparison of recommendation methods for OC-CF
Outline • Problem statement • Background • LRec Model • Experiments • Results • Summary
LRec
[Figure: per-user regression: the interaction matrix times weights W_u yields recommendations for user u]
• Recommendation as learning a model per user
• Each item is a training instance
• Can be interpreted as learning user-user affinities
• Regularizer prevents the trivial solution
• Any loss function: squared, logistic
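A minimal squared-loss sketch of the per-user model (illustrative, not the authors' implementation): each item is a training instance whose features are its interaction column across all users, and ridge regression plays the role of the L2-regularized loss.

```python
import numpy as np

# For user u, each ITEM is a training instance: its features are the
# item's interaction column across all users; its label is R[u, i].
def lrec_user_weights(R, u, lam=1.0):
    X = R.T                               # items x users
    y = R[u]                              # user u's interactions
    m = X.shape[1]
    # Ridge solution; the L2 term keeps w_u away from the trivial e_u
    return np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)

def lrec_scores(R, lam=1.0):
    # Each user's model is fully independent: embarrassingly parallel
    W = np.vstack([lrec_user_weights(R, u, lam) for u in range(R.shape[0])])
    return W @ R                          # recommendation scores

R = np.array([[1, 0, 1, 1, 0],
              [1, 1, 1, 0, 0],
              [0, 1, 0, 1, 1]], dtype=float)
print(lrec_scores(R).shape)  # (3, 5)
```

Each ridge problem is convex, and the per-user loop can be farmed out to independent workers, which is the parallelism claim on the next slide.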
Properties of LRec
• User-focused
 – Recommendation as learning a model per user
• Convex objective
 – Guarantees the optimal solution for the formulation
• Embarrassingly parallel
 – Each model is completely independent of the others
Relationship with Existing Models
• SLIM
 – Item-focused
 – Elastic-net penalty + non-negativity constraints
 – Optimization: coordinate descent
 – Levy et al. relaxed the non-negativity constraints; optimization via SGD with truncated gradient
• LRec
 – User-focused
 – L2 penalty
 – Optimization: L2 loss or logistic loss via Liblinear (dual iff #users >> #items)
Relationship with Existing Models
• LRec
 – Learns the weight matrix via a classification/regression problem
 – Can be interpreted as learning user-user similarities
• Neighborhood models
 – Compute similarities using predefined similarity metrics (e.g., Cosine, Jaccard)
Relationship with Existing Models
• LRec
 – Learns the weight matrix via a classification/regression problem (learned user-user similarities)
 – Convex objective
 – Full rank
 – Embarrassingly parallel
• Matrix Factorization
 – Non-convex objective
 – Low rank
 – Parallelism via distributed communication
Other Advantages of LRec • Efficient hyper-parameter tuning for ranking – Validate on a small subset of users • Model can be fine-tuned per user
Other Advantages of LRec: Incorporating Side Information
[Figure: item features (genre, actors) appended to the interaction columns in the per-user regression]
• Can easily incorporate abundant item-side information
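The side-information idea can be sketched by augmenting each item instance with its feature vector; the genre-style features below are hypothetical placeholders:

```python
import numpy as np

# Per-user ridge model with item-side features: each item instance is
# the concatenation [interaction column; item feature vector].
def lrec_user_weights_side(R, F, u, lam=1.0):
    X = np.hstack([R.T, F])               # items x (users + side features)
    y = R[u]
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

R = np.array([[1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 1, 0]], dtype=float)
F = np.array([[1, 0],                     # hypothetical genre indicators
              [0, 1],
              [1, 0],
              [0, 1]], dtype=float)
w = lrec_user_weights_side(R, F, u=0)
print(w.shape)  # (5,)
```

The learned weight vector now covers both user-user affinities and per-user preferences over the item features, with no change to the convex per-user formulation.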
Outline • Problem statement • Background • LRec Model • Experiment & Results • Summary
Dataset Description and Evaluation
• Datasets: Movielens 1M (ML1M), Kobo, Last.fm (LASTFM), Million Song Dataset (MSD)
• Protocol
 – 10 random 80%-20% train-test splits
 – For MSD, we evaluate on a random sample of 500 users
 – Error bars => 95% confidence intervals
• Evaluation metrics: precision@k, mean Average Precision@100
Experiment Setup
• Baselines
 – Most Popular
 – Neighborhood: User KNN (U-KNN), Item KNN (I-KNN)
 – Matrix Factorization: PureSVD, WRMF, LogisticMF, Bayesian Personalized Ranking (BPR)
 – SLIM: Elastic Net
• LRec variants
 – LRec + non-negativity (LRec + Sq + L1 + NN)
 – Squared-loss LRec (LRec + Sq)
 – Logistic-loss LRec (LRec)
Results
[Chart: Precision@20 on the ML1M and LastFM datasets]
[Chart: Precision@20 on the Kobo and LastFM datasets]
("Did not finish" annotations mark baselines that did not complete)
Performance Evaluation
[Chart: % improvement over WRMF on the ML1M dataset, users segmented by number of observations]
Case Study
• Recommendations from WRMF vs. LRec
• LRec is more personalized
Summary • LRec – Personalized user focused linear recommender – Convex objective – Embarrassingly parallel • Future work – Further scale LRec • Computational • Memory footprint
Thanks