Recommendation on Data Missing Not at Random: A Doubly Robust Joint Learning Approach
Rating Matrix (blank cells are missing ratings)

|        | Item 1 | Item 2 | Item 3 | ... | Item M |
|--------|--------|--------|--------|-----|--------|
| User 1 | 4      |        |        | ... |        |
| User 2 |        |        | 2      | ... |        |
| User 3 |        | 5      |        | ... | 5      |
| ...    | ...    | ...    | ...    | ... | ...    |
| User N |        |        | 2      | ... | 1      |
Rating Prediction (a predicted rating for every entry)

|        | Item 1 | Item 2 | Item 3 | ... | Item M |
|--------|--------|--------|--------|-----|--------|
| User 1 | 4.5    | 2.3    | 3.5    | ... | 1.8    |
| User 2 | 6.7    | 3.9    | 2.9    | ... | 3.8    |
| User 3 | 2.3    | 4.8    | 1.1    | ... | 5.2    |
| ...    | ...    | ...    | ...    | ... | ...    |
| User N | 2.6    | 3.5    | 1.8    | ... | 0.7    |
Prediction Error (computable only on the observed entries)

|        | Item 1        | Item 2        | Item 3        | ... | Item M        |
|--------|---------------|---------------|---------------|-----|---------------|
| User 1 | 4.5 - 4 = 0.5 |               |               | ... |               |
| User 2 |               |               | 2.9 - 2 = 0.9 | ... |               |
| User 3 |               | 5 - 4.8 = 0.2 |               | ... | 5.2 - 5 = 0.2 |
| ...    | ...           | ...           | ...           | ... | ...           |
| User N |               |               | 2 - 1.8 = 0.2 | ... | 1 - 0.7 = 0.3 |
Prediction Error (errors on observed entries; bare predictions where the rating is missing)

|        | Item 1        | Item 2        | Item 3        | ... | Item M        |
|--------|---------------|---------------|---------------|-----|---------------|
| User 1 | 4.5 - 4 = 0.5 | 2.3           | 3.5           | ... | 1.8           |
| User 2 | 6.7           | 3.9           | 2.9 - 2 = 0.9 | ... | 3.8           |
| User 3 | 2.3           | 5 - 4.8 = 0.2 | 1.1           | ... | 5.2 - 5 = 0.2 |
| ...    | ...           | ...           | ...           | ... | ...           |
| User N | 2.6           | 3.5           | 2 - 1.8 = 0.2 | ... | 1 - 0.7 = 0.3 |
Handling Missing Ratings: Ignore Them

|        | Item 1 | Item 2 | Item 3 | ... | Item M |
|--------|--------|--------|--------|-----|--------|
| User 1 | 0.5    |        |        | ... |        |
| User 2 |        |        | 0.9    | ... |        |
| User 3 |        | 0.2    |        | ... | 0.2    |
| ...    | ...    | ...    | ...    | ... | ...    |
| User N |        |        | 0.2    | ... | 0.3    |

When missing ratings are missing at random (MAR), averaging the error over only the observed entries \(\mathcal{O}\) gives an unbiased estimate of the prediction error over all entries \(\mathcal{D}\), i.e.

\( \mathbb{E}\Big[\frac{1}{|\mathcal{O}|}\sum_{(u,i)\in\mathcal{O}} e_{u,i}\Big] = \frac{1}{|\mathcal{D}|}\sum_{(u,i)\in\mathcal{D}} e_{u,i} \)
Missing Ratings: Missing Not at Random
○ Missing ratings: missing not at random (MNAR)
○ Whether the rating for an item is missing often depends on the user's rating for that item
○ Producer:
  ○ Tens of thousands of items, not randomly chosen to present
  ○ Selection / ranking / filtering process
○ User:
  ○ Normally doesn't choose items randomly to watch/buy/visit
  ○ After watching/buying/visiting, doesn't choose items randomly to rate, either
    ■ Rates those they have an opinion about
Can we do better when ratings are MNAR?
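A small simulation makes the contrast concrete (a sketch with hypothetical error values, not the slides' data): under MAR the observed-entry average matches the true average error, while under MNAR, where entries with larger error are more likely to be rated, the naive average is biased.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-entry prediction errors for a full rating matrix.
e = rng.uniform(0, 2, size=10_000)
true_error = e.mean()

# MAR: every entry is observed with the same probability.
o_mar = rng.random(e.size) < 0.2

# MNAR: entries with larger error are more likely to be observed
# (e.g. users tend to rate items they feel strongly about).
o_mnar = rng.random(e.size) < (0.05 + 0.15 * e)

# The MAR average is close to the truth; the MNAR average is inflated.
print(true_error, e[o_mar].mean(), e[o_mnar].mean())
```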
Handling Missing Ratings: Error Imputation

|        | Item 1 | Item 2 | Item 3 | ... | Item M |
|--------|--------|--------|--------|-----|--------|
| User 1 | 0.5    | 2.2    | 1.0    | ... | 2.7    |
| User 2 | 2.2    | 0.6    | 0.9    | ... | 0.7    |
| User 3 | 2.2    | 0.2    | 3.4    | ... | 0.2    |
| ...    | ...    | ...    | ...    | ... | ...    |
| User N | 1.9    | 1.0    | 0.2    | ... | 0.3    |

The imputed errors \(\hat e_{u,i}\) can be based on heuristics; for example, an existing work [Steck 2010] imputes each missing rating with a fixed low value and computes the error against it. The resulting estimator averages true errors on observed entries and imputed errors elsewhere:

\( \varepsilon_{EIB} = \frac{1}{|\mathcal{D}|}\sum_{(u,i)\in\mathcal{D}} \big( o_{u,i}\, e_{u,i} + (1 - o_{u,i})\, \hat e_{u,i} \big) \)

If the imputed errors are accurate, the prediction error estimate is unbiased.
Handling Missing Ratings: Inverse Propensity

|        | Item 1  | Item 2  | Item 3  | ... | Item M  |
|--------|---------|---------|---------|-----|---------|
| User 1 | 0.5*1.3 |         |         | ... |         |
| User 2 |         |         | 0.9*2.7 | ... |         |
| User 3 |         | 0.2*3.4 |         | ... | 0.2*1.4 |
| ...    | ...     | ...     | ...     | ... | ...     |
| User N |         |         | 0.2*3.9 | ... | 0.3*1.2 |

Each observed error is weighted by the inverse of its estimated propensity \(\hat p_{u,i}\), an estimate of the probability \(P(o_{u,i} = 1)\) that the rating is observed:

\( \varepsilon_{IPS} = \frac{1}{|\mathcal{D}|}\sum_{(u,i)\in\mathcal{D}} \frac{o_{u,i}\, e_{u,i}}{\hat p_{u,i}} \)

If the estimated propensities are accurate, the prediction error estimate is unbiased.
Weakness
○ Error imputation based (EIB)
  ○ Hard to accurately estimate the imputed errors
  ○ It is almost as hard as predicting the original ratings
○ Inverse propensity scoring (IPS)
  ○ Often suffers from the large-variance issue
  ○ When an estimated propensity is very small, its inverse creates a very large weight
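The IPS variance issue is easy to see numerically. The sketch below (hypothetical single-entry setup, accurate propensities) Monte Carlo samples the IPS term o*e/p: its mean stays near the true error, but its variance blows up roughly as (1-p)/p when the propensity p is tiny.

```python
import numpy as np

rng = np.random.default_rng(0)
e, n = 1.0, 100_000  # one entry's true error; number of Monte Carlo draws

for p in (0.5, 0.01):                       # moderate vs. tiny propensity
    o = (rng.random(n) < p).astype(float)   # observation indicator
    term = o * e / p                        # IPS term with accurate propensity
    print(f"p={p}: mean={term.mean():.3f}, var={term.var():.1f}")
```

With p = 0.5 the variance is about 1; with p = 0.01 it is about 99, even though both are unbiased.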
Handling Missing Ratings: Proposed Doubly Robust

\( \varepsilon_{DR} = \frac{1}{|\mathcal{D}|}\sum_{(u,i)\in\mathcal{D}} \Big( \hat e_{u,i} + \frac{o_{u,i}\,(e_{u,i} - \hat e_{u,i})}{\hat p_{u,i}} \Big) \)

where \(\hat p_{u,i}\) is the estimated propensity and \(\hat e_{u,i}\) is the imputed error; the correction term vanishes when the imputed error is close to the true error.

Doubly robust: the prediction error estimate is unbiased when
○ either the estimated propensities are accurate
○ or the imputed errors are accurate
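A minimal NumPy sketch of the four estimators (naive, EIB, IPS, DR) on hypothetical data. The helper `dr_expectation` averages the DR estimator over the Bernoulli observation model, which makes the double-robustness property checkable directly: plugging in the true propensities with bad imputation, or true errors with bad propensities, both recover the true average error.

```python
import numpy as np

def naive(e, o):
    """Average error over observed entries only."""
    return e[o == 1].mean()

def eib(e, e_hat, o):
    """Error imputation based: true error where observed, imputed elsewhere."""
    return np.mean(o * e + (1 - o) * e_hat)

def ips(e, o, p_hat):
    """Inverse propensity scoring: observed errors reweighted by 1 / p_hat."""
    return np.mean(o * e / p_hat)

def dr(e, e_hat, o, p_hat):
    """Doubly robust: imputed error plus a propensity-weighted correction."""
    return np.mean(e_hat + o * (e - e_hat) / p_hat)

def dr_expectation(e, e_hat, p, p_hat):
    """E[dr(...)] when each o[u, i] ~ Bernoulli(p[u, i])."""
    return np.mean(e_hat + p * (e - e_hat) / p_hat)

# Hypothetical toy data (not the slides' toy example, which is not reproduced).
rng = np.random.default_rng(0)
e = rng.uniform(0, 2, size=(4, 5))        # true per-entry errors
p = rng.uniform(0.2, 0.8, size=(4, 5))    # true propensities

bad_e_hat = np.zeros_like(e)              # a deliberately wrong imputation
bad_p_hat = np.full_like(p, 0.5)          # deliberately wrong propensities
```

With accurate propensities, `dr_expectation(e, bad_e_hat, p, p)` equals `e.mean()` exactly; with accurate imputation, `dr_expectation(e, e, p, bad_p_hat)` does as well.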
Toy Example
○ True prediction error = 10 / 6
○ Estimated error from EIB = 8 / 6
○ Estimated error from IPS = 9.2 / 6
○ Estimated error from DR = 9.92 / 6, closest to the true error
Joint Learning
○ Imputed errors are closely related to predicted ratings
  ○ Accuracy of imputed errors changes when predicted ratings change
  ○ In turn, changed imputed errors affect rating prediction training
○ Joint learning alternates between two models:
  ○ The rating prediction model minimizes the error estimated by the DR estimator
  ○ The error imputation model minimizes the squared deviation between imputed and true errors
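The alternating scheme can be sketched as follows, under illustrative assumptions that are not the paper's exact setup: tiny matrix-factorization models for both rating prediction and error imputation, constant estimated propensities, and plain full-batch gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny problem: n_users x n_items ratings, rank-k factors.
n_users, n_items, k = 8, 6, 3
R = rng.integers(1, 6, size=(n_users, n_items)).astype(float)  # true ratings
o = (rng.random((n_users, n_items)) < 0.4).astype(float)       # observation mask
p_hat = np.full((n_users, n_items), 0.4)                       # estimated propensities

U = 0.5 * rng.normal(size=(n_users, k))   # rating model factors
V = 0.5 * rng.normal(size=(n_items, k))
A = 0.5 * rng.normal(size=(n_users, k))   # error imputation model factors
B = 0.5 * rng.normal(size=(n_items, k))

lr, size = 0.01, n_users * n_items

def observed_mse():
    return float((o * (U @ V.T - R) ** 2).sum() / o.sum())

mse_before = observed_mse()
for _ in range(500):
    r_hat = U @ V.T
    e = (r_hat - R) ** 2        # true squared errors (valid where o == 1)
    e_hat = A @ B.T             # imputed errors

    # Rating model step: gradient of the DR loss w.r.t. r_hat
    # (only the o * e / p_hat part depends on the rating model).
    g_r = (o / p_hat) * 2 * (r_hat - R) / size
    U, V = U - lr * (g_r @ V), V - lr * (g_r.T @ U)

    # Imputation model step: minimize the propensity-weighted squared
    # deviation between imputed and true errors on the observed entries.
    g_e = (o / p_hat) * 2 * (e_hat - e) / size
    A, B = A - lr * (g_e @ B), B - lr * (g_e.T @ A)

mse_after = observed_mse()
print(round(mse_before, 3), round(mse_after, 3))
```

Each model's update uses the other's current output, which is the point of joint training: better predictions sharpen the imputed-error targets, and better imputed errors reduce the variance of the loss the rating model descends.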
Analysis of DR Estimator
○ Bias
○ Tail bound
○ Generalization bound
Bias of DR Estimator
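A sketch of the bias computation, taking the expectation over \(o_{u,i} \sim \mathrm{Bernoulli}(p_{u,i})\) and writing \(\delta_{u,i} = e_{u,i} - \hat e_{u,i}\) for the imputation error (notation carried over from the DR estimator; this reconstructs the standard argument, not necessarily the slide's exact statement):

```latex
\mathbb{E}[\varepsilon_{DR}] - \frac{1}{|\mathcal{D}|}\sum_{(u,i)\in\mathcal{D}} e_{u,i}
  = \frac{1}{|\mathcal{D}|}\sum_{(u,i)\in\mathcal{D}}
      \Big( \hat e_{u,i} + \frac{p_{u,i}\,(e_{u,i} - \hat e_{u,i})}{\hat p_{u,i}} - e_{u,i} \Big)
  = \frac{1}{|\mathcal{D}|}\sum_{(u,i)\in\mathcal{D}}
      \frac{p_{u,i} - \hat p_{u,i}}{\hat p_{u,i}}\,\delta_{u,i}
```

Each summand vanishes when \(\hat p_{u,i} = p_{u,i}\) (accurate propensities) or \(\delta_{u,i} = 0\) (accurate imputed errors), which is exactly the double-robustness claim.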
Tail Bound of DR Estimator
Generalization Bound
Experiments ○ MAE and MSE when testing on MAR ratings
Experiments ○ Estimation bias and standard deviation using synthetic data under MSE
Take Away ○ Missing ratings are not always missing at random ○ Accurate estimation of the prediction error on MNAR ratings improves generalization and performance ○ Doubly robust estimator often gives more accurate estimation ○ Joint learning of rating prediction and error imputation achieves further improvements
Poster: Today @ Pacific Ballroom #217 Thanks for your time! Questions?
Appendix