Understanding the Implications of Recommender Systems on Our Views and Behaviors

Gedas Adomavicius
University of Minnesota
Joint work with Jesse Bockstedt, Shawn Curley, Jingjing Zhang

Recommender Systems: Feedback Loop
[Figure: feedback loop between the Recommender System (consumer preference estimation) and the Consumer (preference, purchasing, consumption). The system sends Predicted Ratings (expressing recommendations for unknown items) to the consumer; the consumer returns Actual Ratings (expressing preferences for consumed items) to the system. The loop is labeled "Accuracy".]

Relevant Notion: Decision Heuristics
Anchoring: The process of seeding a thought in people’s minds and having that thought influence their later actions (Ariely 2008).
Anchoring and Adjustment: A person begins with a first approximation (anchor) and then makes incremental adjustments based on additional information (Tversky & Kahneman 1974).

Anchoring and Adjustment Heuristic
• Decision makers use implicitly suggested reference points (the anchor) as a starting point and make adjustments from it until they reach a reasonable estimate (Tversky & Kahneman 1974)
• Example: numeric anchoring (Ariely et al. 2006)
  – Think of the last two digits of your social security number
  – Now bid on products…
  – People with higher social security numbers made bids 60-120% higher
[Figure: SSN digits (14, 86) shown alongside estimates (45, 67)]

Related Literature: Anchoring Effects
• Three waves of anchoring research (Epley and Gilovich 2010)
  – First: establishment of anchoring and adjustment as leading to biases in judgment
    • E.g., Tversky & Kahneman 1974; Chapman & Bornstein 1996; Northcraft & Neale 1987
  – Second: psychological explanations for anchoring effects (Russo 2010)
    • Uncertainty leads to a search from the anchor to the first plausible value among the distribution of uncertain values
    • Anchor leads to biased retrieval of anchor-consistent knowledge
    • Numerical priming
    • Providing content relevant to one’s preference (e.g., anchor is viewed as a suggestion to the correct answer; “trust” in the system)
  – Third: anchoring in real-world contexts
    • E.g., Johnson, Schnytzer & Liu (2009) study anchoring in horserace betting; Ku, Galinsky & Murnighan (2006) investigate anchoring effects in auctions
    • Recommender systems (Cosley et al. 2003)

Anchoring in Recommendations
[Figure: two panels for an item whose unbiased preference is 3.0/5. One panel shows the recommendation “We think you’ll like it: 2.5/5”, the other “We think you’ll like it: 3.5/5”; each is followed by the user’s Preference Rating.]

Related Literature: Anchoring and Recommender Systems
Comparison of Cosley et al. (2003) with our prior studies:
• Setting
  – Cosley et al. (2003): Recommender systems
  – Our prior studies: Recommender systems
• Type of task
  – Cosley et al. (2003): Preference (no objective standard)
  – Our prior studies: Preference and Willingness-to-Pay (no objective standard)
• Stimuli
  – Cosley et al. (2003): Multiple movies
  – Our prior studies: Single/multiple TV shows, jokes, songs
• Recommendations
  – Cosley et al. (2003): System-based
  – Our prior studies: System-based, plus artificially generated
• Manipulations
  – Cosley et al. (2003): Two: High vs. Low
  – Our prior studies: Multiple: High vs. Low; also range of manipulations
• Timing (process implications)
  – Cosley et al. (2003): Retrospective (Retrieval; Uncertainty)
  – Our prior studies: Point of Consumption (Integrating & Responding; No Uncertainty)
• Explanations
  – Cosley et al. (2003): None
  – Our prior studies: Directly (timing, perceived reliability hypotheses) and indirectly provide evidence relative to possible explanations that have been posited for anchoring

Prior Research on Biases in Recommender Systems
Cosley et al. (2003)
• Impact of system-generated recommendations on user re-ratings of movies
• Recall task
• High test-retest rating consistency with no recommendations
• Showing the system’s ratings biased users’ subsequently submitted ratings in the direction of the recommendation

Are Biases Bad?
Biases could be undesirable for recommender systems (Cosley et al. 2003, Adomavicius et al. 2013)
  – Contaminate the recommender system’s inputs, weakening the system’s ability to provide high-quality recommendations in subsequent iterations
  – Can lead to users having a distorted view of items’ relevance
  – Can lead to the recommender system having a distorted view of users’ preferences
  – Provide opportunities for manipulation

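The contamination concern above can be illustrated with a toy calculation. This is a minimal sketch, not a model taken from the studies: the function name `submitted_rating`, the `anchor_weight` parameter, and its value of 0.3 are hypothetical, and the linear pull toward the displayed recommendation is an assumption made only for illustration.

```python
def submitted_rating(true_pref: float, shown_rec: float, anchor_weight: float = 0.3) -> float:
    """Hypothetical anchoring model: the submitted rating is pulled from the
    user's true preference toward the displayed recommendation."""
    return true_pref + anchor_weight * (shown_rec - true_pref)


true_pref = 3.0  # assumed unbiased preference for one item
for shown in (2.5, 3.0, 3.5):
    rating = submitted_rating(true_pref, shown)
    # The system would store `rating` as if it were the true preference,
    # so the difference below is what enters the next round of predictions.
    print(f"shown={shown:.1f} submitted={rating:.2f} contamination={rating - true_pref:+.2f}")
```

Under these assumptions, any gap between the displayed recommendation and the true preference leaks into the rating the system later treats as ground truth.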
Our Prior Studies
• Motivation:
  – Deepen our understanding of anchoring biases within the important context of recommender systems
  – Anchoring effects in preference setting (both in terms of item ratings and willingness to pay) and at the time of consumption
  – Provide evidence relative to the proposed explanations for anchoring effects

Prior Studies: General Research Question
Whether and to what extent do system ratings that are displayed to users influence users’ preferences and behaviors at the time of consumption?
Studies 1-3: Preference ratings
Studies 4-5: Willingness-to-pay

Studies 1-3: Impact on Preference Ratings
• Effect of system’s recommendations on self-reported preference ratings
  – Observed with different information good types: TV shows, jokes
• Research issues:
  – Anchoring issue (High/Low recommendation)
  – Timing issue (Before/After consumption)
  – Perceived system reliability issue (Strong/Weak perceived system reliability)
  – Perturbation size issue (impact of perturbation size on anchoring effect)
  – Symmetry/asymmetry of effects

General Procedure
• Rate multiple items – inputs for recommender system
• See a recommendation for the viewed instance(s)
• View 1 or more instances of item to be rated
  – Preference at time of consumption!
  – Minimal uncertainty and biased recall
• Provide a preference rating for the viewed instance(s)

Study 1 - Design
• Rated 105 TV shows
• Watched an episode of 1 show (all saw same episode)
• Received an artificial rating of 4.5 or 1.5
• DV: Actual Rating (submitted by user after consumption)
• Tested 3 hypotheses
  – Anchoring (i.e., anchoring direction)
    • High (4.5 out of 5) vs. Low (1.5 out of 5) anchor
  – Timing (of recommendation)
    • Before viewing vs. After viewing
  – Perceived reliability (of recommendation)
    • Weak vs. Strong
  – Control Group

Study 1 - Results
• Anchoring hypothesis – supported
  – Significant observed anchoring effect of the provided artificial recommendation (High vs. Low)
• Timing hypothesis – not supported
  – No significant difference of Before vs. After
• Perceived system reliability – supported
  – No significant impact in the Weak condition (WeakHigh vs. WeakLow)
• Asymmetry of the anchoring effect
  – Artificial high recommendation did not raise ratings significantly (High vs. Control)
  – Artificial low recommendation significantly lowered ratings (Low vs. Control)

Study 2 - Design
• Anchors were based on an actual recommender system
  – Seven recommendation techniques were tested on the dataset
  – Item-based collaborative filtering approach was the best performer
• Test of Anchoring Hypothesis
  – High (predicted rating plus 1.5)
  – Accurate (predicted rating)
  – Low (predicted rating minus 1.5)
  – Control (no prediction)
• Each subject watched a show (not all the same)
  – She/he had never seen before
  – Had predicted rating for this user between 2.5 and 3.5
• DV: Rating Drift = Actual Rating - Predicted Rating

Study 2 - Results
• Effects of providing recommendation
  – Accurate prediction had no impact (Accurate vs. Control)
• Anchoring effect
  – High recommendation condition led to significant difference in rating drift compared to the Low condition (High vs. Low)
• Symmetry
  – Aggregate over multiple shows: High/Low effects are symmetric
  – Single show (Show effect): High/Low effects are asymmetric (& different from Study 1)

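The Study 2 conditions and dependent variable can be sketched in a few lines. This is an illustrative sketch only: the function names (`displayed_rating`, `rating_drift`, `eligible`) are hypothetical, and the example values are made up rather than taken from the study.

```python
from typing import Optional

PERTURBATION = 1.5  # offset applied in the High and Low conditions


def displayed_rating(predicted: float, condition: str) -> Optional[float]:
    """Rating shown to the subject; None for the Control group (no prediction)."""
    if condition == "Control":
        return None
    offsets = {"High": PERTURBATION, "Accurate": 0.0, "Low": -PERTURBATION}
    return predicted + offsets[condition]


def rating_drift(actual: float, predicted: float) -> float:
    """DV from the design slide: Actual Rating - Predicted Rating."""
    return actual - predicted


def eligible(predicted: float) -> bool:
    """Shows were selected with a predicted rating between 2.5 and 3.5."""
    return 2.5 <= predicted <= 3.5


# Hypothetical subject: system predicts 3.0, subject later rates the show 3.5.
predicted, actual = 3.0, 3.5
for condition in ("High", "Accurate", "Low", "Control"):
    print(condition, displayed_rating(predicted, condition))
print("drift:", rating_drift(actual, predicted))
```

Note that restricting predictions to the 2.5 to 3.5 range means the perturbed values shown to subjects fall between 1.0 and 5.0.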
Study 3: Granularity of Anchoring Effects
What is the functional form of the anchoring effect?
Three possibilities: [figure not preserved]

Study 3 - Design
• Anchors were based on an actual recommender system
• Anchoring: Within-Subjects Design
  – Each subject evaluated 50 jokes
  – Among the remaining 50 jokes
    • Perturbations of -1.5, -1, -.5, 0, .5, 1, 1.5
    • Control (no prediction)
• Used jokes to get multiple ratings, still at time of consumption
• DV: Rating Drift = Actual Rating - Predicted Rating
• Regression done for each individual subject (N = 40 per subject)

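The per-subject regression of rating drift on perturbation can be sketched as follows. This is a minimal sketch under stated assumptions: ordinary least squares is assumed as the fitting method, `fit_line` is a hypothetical helper, and the two subjects' data are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data (not from the study): for each subject, the perturbation
# applied to each displayed prediction and the resulting rating drift.
subjects = {
    "s1": ([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5],
           [-0.6, -0.3, -0.2, 0.0, 0.1, 0.4, 0.5]),
    "s2": ([-1.5, -0.5, 0.5, 1.5],
           [-0.3, -0.1, 0.1, 0.3]),
}

# One slope per subject, then the mean slope across subjects.
slopes = {sid: fit_line(xs, ys)[0] for sid, (xs, ys) in subjects.items()}
mean_slope = sum(slopes.values()) / len(slopes)
print(slopes, mean_slope)
```

A positive slope for a subject means that subject's ratings moved in the direction of the perturbation; averaging the slopes gives a single summary of the effect across subjects.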
Study 3 - Aggregated Analysis
• Aggregated across items and subjects, for each perturbation
[Figure: bar chart of Mean Rating Drift by Perturbation of Recommendation (-1.5, -1, -0.5, 0.5, 1, 1.5, plus Control). Data labels as extracted: 0.53, 0.28, -0.04, 0.07, -0.20, -0.23, -0.41, -0.53.]

Study 3 - Results
• Anchoring effect occurs at the individual level
• Effect is linear (Mean Slope = .35)
  – No significant curvilinearity found
  – Positive and negative slopes did not significantly differ
• Symmetry
  – Aggregate over multiple jokes: High/Low effects are symmetric

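The aggregation and the comparison of positive and negative slopes can be sketched as below. This is only an illustration: the helper names are hypothetical, the records are synthetic (generated with a slope of 0.35 to echo the reported mean slope), and no significance test is performed here.

```python
from collections import defaultdict


def mean_drift_by_perturbation(records):
    """records: (perturbation, drift) pairs pooled over subjects and items."""
    groups = defaultdict(list)
    for perturbation, drift in records:
        groups[perturbation].append(drift)
    return {p: sum(v) / len(v) for p, v in sorted(groups.items())}


def slope(points):
    """Least-squares slope through a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def split_slopes(means):
    """Slopes fitted separately on the negative and positive sides (0 in both)."""
    negative = [(p, d) for p, d in means.items() if p <= 0]
    positive = [(p, d) for p, d in means.items() if p >= 0]
    return slope(negative), slope(positive)


# Synthetic records: drift = 0.35 * perturbation plus a small offset per rating.
perturbations = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
records = [(p, 0.35 * p + e) for p in perturbations for e in (-0.1, 0.1)]

means = mean_drift_by_perturbation(records)
neg_slope, pos_slope = split_slopes(means)
print(means)
print(neg_slope, pos_slope)
```

Similar slopes on both sides would be consistent with a symmetric, linear effect; establishing that the difference is not significant would require a statistical test that this sketch omits.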
Takeaways – Preference Ratings
• Biased recommendations influence consumers’ preference ratings
  – Anchoring not only impacts recalled preferences (e.g., Cosley et al. 2003), but also impacts preference ratings at the point of consumption
• Perceived reliability of the recommendation matters
• Timing of recommendation has no significant effect
• Perturbations have a proportional (linear) effect on user-submitted ratings (both negative and positive)
• Asymmetry of anchoring effects
  – Context-specific (e.g., item-specific?)
  – Interesting direction for future work
User preference ratings are malleable and can be significantly influenced by the recommendations received.

General Research Question
Whether and to what extent do system ratings that are displayed to users influence users’ preferences and behaviors at the time of consumption?
Studies 1-3: Preference ratings
Studies 4-5: Willingness-to-pay