Recommendations as Treatments: Debiasing Learning and Evaluation
ICML 2016, NYC
Tobias Schnabel†, Adith Swaminathan†, Ashudeep Singh†, Navin Chandak§, Thorsten Joachims†
†Cornell University, §Google
Funded in part through NSF Awards IIS-1247637, IIS-1217686, IIS-1513692.
Movie recommendation
[Figure: rating matrices for Horror Lovers and Romance Lovers over Horror, Romance, and Drama movies. O: observed (Y/N); Y: true rating.]
⇒ Data is Missing Not At Random (MNAR)
Example adapted from (Steck et al., 2010)
Selection Bias in Recommendation
Why is there selection bias?
o User-induced bias (e.g., browsing)
o System-induced bias (e.g., advertising)
Question: What happens if we ignore selection bias?
(Marlin et al., 2007; Steck, 2011; Hernández-Lobato et al., 2014)
Evaluating Recommendations under Selection Bias
[Figure: recommendation matrix Ŷ (Recommend), observation matrix O (Observed Y/N), and true rating matrix Y, over Horror, Romance, and Drama movies for Horror Lovers and Romance Lovers.]
⇒ Observed ratings are misleading due to selection bias
Evaluating Predicted Ratings under Selection Bias
[Figure: two predicted rating matrices, Ŷ_1 (worse) and Ŷ_2 (better), over Horror, Romance, and Drama movies for Horror Lovers and Romance Lovers.]
⇒ Observed losses are misleading due to selection bias
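A minimal sketch of why a loss averaged over observed entries only can mislead. The ratings, the observation pattern, and both prediction matrices below are made-up illustrative values (they are not the numbers from the slide's figure), and `mse` is a hypothetical helper.

```python
# Toy data (illustrative assumption, not the slide's figure).
# Rows: horror lover, romance lover. Columns: horror movie, romance movie.
Y = [[5, 1], [1, 5]]             # true ratings
O = [[1, 0], [0, 1]]             # each user only rates the genre they like
Y_HAT_WORSE = [[5, 5], [5, 5]]   # predicts 5 everywhere
Y_HAT_BETTER = [[4, 1], [1, 4]]  # close to the truth on every entry


def mse(Y, Y_hat, mask=None):
    """Mean squared error over all entries, or only over entries where mask is 1."""
    errors = [
        (y - y_hat) ** 2
        for u, (row, row_hat) in enumerate(zip(Y, Y_hat))
        for i, (y, y_hat) in enumerate(zip(row, row_hat))
        if mask is None or mask[u][i] == 1
    ]
    return sum(errors) / len(errors)


# Loss measured on observed entries only (what a naive evaluation sees).
naive_worse = mse(Y, Y_HAT_WORSE, O)
naive_better = mse(Y, Y_HAT_BETTER, O)

# Loss measured on the full matrix (not available in practice).
true_worse = mse(Y, Y_HAT_WORSE)
true_better = mse(Y, Y_HAT_BETTER)

print("naive:", naive_worse, naive_better)
print("true: ", true_worse, true_better)
```

In this toy setup the observed-only loss prefers the prediction that is worse on the full matrix, because the entries where it fails are exactly the ones users never rate.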
Recommendations as Treatments
Question: How can we fix the effects of selection bias?
o Connection to potential outcomes framework
[Figure: matrices of counterfactual outcomes Y and observed outcomes Ỹ; rows are patients (users), columns are treatments (movies).]
⇒ Understand assignment mechanism (Imbens & Rubin, 2015)
Debiasing Evaluation
Assignment mechanism for recommendation:
o $P_{u,i} = P(O_{u,i} = 1)$

Propensities P:
                 Horror   Romance   Drama
Horror Lovers    p        p/10      p/2
Romance Lovers   p/10     p         p/2

Use the Inverse-Propensity-Scoring (IPS) estimator to obtain an unbiased estimate:
$$\hat{R}_{IPS}(\hat{Y} \mid P) = \frac{1}{U \cdot I} \sum_{(u,i):\, O_{u,i}=1} \frac{1}{P_{u,i}} \left(Y_{u,i} - \hat{Y}_{u,i}\right)^2$$
(Little & Rubin, 2002; Cortes et al., 2008; Bickel et al., 2009; Sugiyama & Kawanabe, 2012).
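The IPS estimator on this slide can be sketched in a few lines. The toy ratings and propensities are illustrative assumptions, the function names are hypothetical, and the unbiasedness check assumes each entry is observed independently with probability P_{u,i} (an assumption made here to keep the enumeration simple).

```python
from itertools import product


def ips_risk(Y, Y_hat, O, P):
    """IPS estimate of the MSE: squared errors on observed entries,
    each weighted by 1 / P[u][i], divided by the total number of entries."""
    n_users, n_items = len(Y), len(Y[0])
    total = 0.0
    for u in range(n_users):
        for i in range(n_items):
            if O[u][i] == 1:
                total += (Y[u][i] - Y_hat[u][i]) ** 2 / P[u][i]
    return total / (n_users * n_items)


def true_risk(Y, Y_hat):
    """MSE over the full rating matrix."""
    n_users, n_items = len(Y), len(Y[0])
    total = sum(
        (Y[u][i] - Y_hat[u][i]) ** 2
        for u in range(n_users)
        for i in range(n_items)
    )
    return total / (n_users * n_items)


def expected_ips(Y, Y_hat, P):
    """Exact expectation of ips_risk over all observation patterns,
    assuming independent Bernoulli(P[u][i]) observations (toy sizes only)."""
    n_users, n_items = len(Y), len(Y[0])
    expectation = 0.0
    for bits in product([0, 1], repeat=n_users * n_items):
        O = [list(bits[u * n_items:(u + 1) * n_items]) for u in range(n_users)]
        prob = 1.0
        for u in range(n_users):
            for i in range(n_items):
                prob *= P[u][i] if O[u][i] == 1 else 1.0 - P[u][i]
        expectation += prob * ips_risk(Y, Y_hat, O, P)
    return expectation


# Illustrative values: users mostly rate the genre they like.
Y = [[5, 1], [1, 5]]
Y_HAT = [[5, 5], [5, 5]]
P = [[0.8, 0.2], [0.2, 0.8]]

print(true_risk(Y, Y_HAT), expected_ips(Y, Y_HAT, P))
```

A single observed pattern can still give an estimate far from the true risk; the check above only concerns the average over observation patterns.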
Propensity estimation
Two settings:
o Experimental: propensities are under our control; known by design (e.g., ad placement)
o Observational: users self-select; need to estimate $P_{u,i}$
[Figure: binary observation matrix O over Horror, Romance, and Drama movies.]
Estimate the parameter of binary random variables:
$$P_{u,i} = P(O_{u,i} = 1 \mid X, \tilde{Y})$$
Variety of models: Logistic Regression, Naïve Bayes, etc.
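As one possible instance of the Naïve Bayes option, the sketch below conditions only on the rating value and applies Bayes' rule, P(O=1 | Y=r) = P(Y=r | O=1) · P(O=1) / P(Y=r). It assumes a small sample of ratings collected uniformly at random is available to estimate P(Y=r); the slide does not state this, and the function name and toy numbers are hypothetical.

```python
def naive_bayes_propensities(observed_ratings, n_total, mcar_ratings, levels=(1, 2, 3, 4, 5)):
    """Estimate P(O=1 | Y=r) for each rating level r via Bayes' rule.

    observed_ratings: ratings of the entries that were observed (O = 1)
    n_total:          total number of user-item entries (U * I)
    mcar_ratings:     ratings from a uniformly random sample (assumed available)
    """
    p_obs = len(observed_ratings) / n_total  # P(O = 1)
    propensities = {}
    for r in levels:
        p_r_given_obs = observed_ratings.count(r) / len(observed_ratings)  # P(Y=r | O=1)
        p_r = mcar_ratings.count(r) / len(mcar_ratings)                    # P(Y=r)
        if p_r > 0:  # skip levels that never appear in the random sample
            propensities[r] = p_r_given_obs * p_obs / p_r
    return propensities


# Illustrative data: 8 entries in total, 5 observed; high ratings are over-represented.
observed = [5, 5, 5, 5, 1]
random_sample = [5, 1, 5, 1]
estimates = naive_bayes_propensities(observed, n_total=8, mcar_ratings=random_sample, levels=(1, 5))
print(estimates)
```

With few observations the ratio can exceed 1, so a real pipeline would likely clip or smooth the estimates; that step is omitted here.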
Debiasing Evaluation
Robustness to selection bias:
[Figure: two panels; x-axis: Severity of Selection Bias.]
Debiasing Evaluation
Robustness to inaccurate propensities:
[Figure: two panels showing IPS-est; x-axis: More accurate propensities.]
Debiasing Learning
Empirical Risk Minimization (ERM) has been successful in many settings (Cortes & Vapnik, 1995)
Use ERM together with the Inverse-Propensity-Scoring (IPS) estimator:
$$\hat{Y}^{ERM} = \operatorname{argmin}_{\hat{Y} \in \mathcal{H}} \hat{R}_{IPS}(\hat{Y} \mid P)$$
For matrix factorization with MSE loss:
$$\hat{Y}^{ERM} = \operatorname{argmin}_{V,W} \left[ \sum_{O_{u,i}=1} \frac{1}{P_{u,i}} \left(Y_{u,i} - V_u^T W_i\right)^2 + \lambda \left( \|V\|_F^2 + \|W\|_F^2 \right) \right]$$
where $1/P_{u,i}$ is the propensity weight.
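The propensity-weighted matrix factorization objective can be minimized in several ways; the sketch below uses plain full-batch gradient descent in pure Python, which is a choice made here for brevity and not something the slides specify. The function names, hyperparameters (`rank`, `lam`, `lr`, `steps`), and toy data are all illustrative assumptions.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def objective(Y, O, P, V, W, lam):
    """Propensity-weighted squared error on observed entries plus Frobenius penalty."""
    loss = sum(
        (Y[u][i] - dot(V[u], W[i])) ** 2 / P[u][i]
        for u in range(len(Y))
        for i in range(len(Y[0]))
        if O[u][i] == 1
    )
    penalty = sum(x * x for row in V + W for x in row)
    return loss + lam * penalty


def fit_ips_mf(Y, O, P, rank=2, lam=0.01, lr=0.005, steps=2000, seed=0):
    """Full-batch gradient descent on the objective above (hyperparameters are guesses)."""
    rng = random.Random(seed)
    n_users, n_items = len(Y), len(Y[0])
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    W = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(steps):
        # Gradient of the regularizer.
        grad_V = [[2 * lam * x for x in row] for row in V]
        grad_W = [[2 * lam * x for x in row] for row in W]
        # Gradient of the weighted loss, accumulated over observed entries only.
        for u in range(n_users):
            for i in range(n_items):
                if O[u][i] == 1:
                    c = -2.0 * (Y[u][i] - dot(V[u], W[i])) / P[u][i]
                    for k in range(rank):
                        grad_V[u][k] += c * W[i][k]
                        grad_W[i][k] += c * V[u][k]
        V = [[x - lr * g for x, g in zip(row, g_row)] for row, g_row in zip(V, grad_V)]
        W = [[x - lr * g for x, g in zip(row, g_row)] for row, g_row in zip(W, grad_W)]
    return V, W


# Illustrative data: entry (1, 0) is unobserved.
Y = [[5, 1], [1, 5]]
O = [[1, 1], [0, 1]]
P = [[0.8, 0.2], [0.2, 0.8]]

V, W = fit_ips_mf(Y, O, P)
final_objective = objective(Y, O, P, V, W, lam=0.01)
print(final_objective, dot(V[0], W[0]))
```

Entries with small propensity receive large weights, so in practice the learning rate may need to be reduced or the propensities clipped to keep the updates stable.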