Influence and Homophily in Networked User Behavior. Eytan Bakshy, Facebook. Mining Social Network Dynamics workshop @ WWW 2012, April 16, 2012
Motivation ▪ To what extent do social networks shape our behaviors online? ▪ Homophily and heterogeneity confound social influence effects. ▪ Online behavior resembles well-studied forms of contagion ▪ Statistical controls are not enough (Shalizi & Thomas, 2011) ▪ How do we measure influence? ▪ Experiments.
Outline ▪ What is a reasonable model of social contagion on the Web? ▪ The homophily confound ▪ Study 1: Influence in information diffusion ▪ Study 2: Influence in sharing decisions ▪ Implications
Information as biological contagion ▪ Standard models assume constant probability of infection ▪ Interesting things happen when reproduction rates are high: R ≥ β / γ ▪ On the web, most information doesn’t appear to spread [Figure: two plots; axis labels not recoverable] Bakshy, Hofman, Mason, Watts 2011
Threshold models of social contagion ▪ Threshold models: become activated after k contacts are activated ▪ Not clear that local consensus factors into individual decisions in sharing content ▪ Positive externalities: e.g. adoption of a technology ▪ Utility of visiting a page is often unrelated to the number of visiting friends
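The threshold rule on this slide (a node becomes activated after k of its contacts are activated) can be sketched in a few lines. This is a minimal illustration, not a model from the talk: the function name `run_threshold_cascade`, the adjacency-dict representation, and the toy graph are all assumptions made for the example.

```python
def run_threshold_cascade(neighbors, seeds, k):
    """Return the set of active nodes once no further activations occur.

    neighbors: dict mapping each node to a list of its contacts (hypothetical format).
    seeds: nodes that start out active.
    k: number of active contacts required before a node activates.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, contacts in neighbors.items():
            if node in active:
                continue
            # Count how many of this node's contacts are already active.
            if sum(1 for c in contacts if c in active) >= k:
                active.add(node)
                changed = True
    return active


# Toy graph, invented for illustration only.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

if __name__ == "__main__":
    print(sorted(run_threshold_cascade(toy_graph, {"a", "b"}, k=2)))
```

With k = 2 and two seeds, the cascade stops short of node "e", which has only one contact. With k = 1 it reaches every node, which shows how sensitive this family of models is to the threshold.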
Diffusion of innovations ▪ Focuses on the spread of ideas and technologies ▪ Entail costly decisions ▪ Embeddedness, authority, and interpersonal trust play an important role ▪ Much of online activity is cheap and informal
Some Similarities: Data and Theory [Figure: several panels. Probability of joining a community when k friends are already members, plotted against k; rate of adoption against number of neighbors (k), for small and large assets; probability of buying against incoming recommendations.] Sources cited on the slide: Backstrom et al 2006, Leskovec et al 2007, Watts & Dodds 2007, Bakshy et al 2009, Wei et al 2010
The Homophily Confound [Figure: causal diagram with nodes X_i, X_j, U_i, U_j, Y_ja(t_0), D_ija, Y_ia(t_1). Labels: unknown characteristics; known characteristics (e.g. Web browsing behavior, interests); unknown correlation between friends’ characteristics (expected to be stronger for closer friends); Y_ja(t_0): alter’s sharing behavior; Y_ia(t_1): ego’s sharing behavior.] Figure stolen from Bakshy, Eckles, Yan & Rosenn, 2012
Influence (and homophily) [Figure: causal diagram with nodes X_i, X_j, U_i, U_j, Y_ja(t_0), D_ija, Y_ia(t_1). Labels: unknown characteristics; known characteristics (e.g. Web browsing behavior, interests); unknown correlation between friends’ characteristics (expected to be stronger for closer friends); Y_ja(t_0): alter’s sharing behavior; Y_ia(t_1): ego’s sharing behavior; mechanism (e.g. News Feed, social cues); other forms of influence.] Figure stolen from Bakshy, Eckles, Yan & Rosenn, 2012
Study 1: Effect of Feed on Information Diffusion, with Itamar Rosenn, Cameron Marlow, and Lada Adamic. Published as The Role of Social Networks in Information Diffusion, WWW 2012.
Study 1: Outline ▪ Field experiment tests how much sharing would occur in the absence of exposure via the Facebook feed ▪ Answers causal questions about influence & diffusion: ▪ To what extent does feed increase sharing? ▪ Are weak ties responsible for disseminating information?* ▪ How is tie strength predictive of user activity?* *to be continued on April 19th, The Role of Social Networks in Information Diffusion
Correlated Information Sources: regularly visit same site; visit sites that link to the same content. External Influence: mass + interpersonal media; interpersonal communication. Examples shown on the slide: Web revisitation (Adar et al, 2009), Blogs, News Aggregators, RSS, Face-to-face, Telephone, IM, Email
Influence on Feed [Figure: causal diagram with nodes X_i, X_j, U_i, U_j, Y_ja(t_0), D_ija, Y_ia(t_1). Labels: unknown characteristics; known characteristics (e.g. Web browsing behavior, interests); unknown correlation between friends’ characteristics (expected to be stronger for closer friends); Y_ja(t_0): alter’s sharing behavior; Y_ia(t_1): ego’s sharing behavior; Facebook news feed; other forms of influence.] Figure stolen from Bakshy, Eckles, Yan & Rosenn, 2012
Details ▪ Assignment procedure: ▪ (viewer, URL) pairs are deterministically assigned to either the feed or the no feed condition ▪ Directed shares (via messages, wall posts) are not subject to treatment and are removed from the experiment ▪ Evaluating outcomes: ▪ Compare the likelihood of sharing in the feed (treatment) condition with the no feed (control) condition
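One common way to assign (viewer, URL) pairs deterministically is to hash the pair and bucket the result. The sketch below illustrates that idea only; the slide does not say how the assignment was implemented. The function name `assign_condition`, the salt, the use of SHA-256, and the `no_feed_fraction` value are all assumptions.

```python
import hashlib


def assign_condition(viewer_id, url, no_feed_fraction=0.5, salt="study1"):
    """Map a (viewer, URL) pair to 'feed' or 'no feed', identically on every call.

    no_feed_fraction and salt are hypothetical parameters; the allocation
    ratio used in the actual experiment is not given on the slide.
    """
    key = f"{salt}|{viewer_id}|{url}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Turn the first 8 bytes of the digest into a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "no feed" if bucket < no_feed_fraction else "feed"


if __name__ == "__main__":
    print(assign_condition(12345, "http://example.com/story"))
```

Because the condition depends only on the pair itself, a viewer gets the same condition for a given URL on every visit, without any stored assignment table.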
Data ▪ Random sample of all (user, URL) pairs eligible to be shown in the Facebook news feed during a 7 week period in 2010 ▪ 253,238,367 subjects ▪ 75,888,466 URLs ▪ 1,168,633,941 distinct subject-URL pairs (random trials)
Temporal Clustering [Figure: two cumulative density plots by condition (feed, no feed). Panel titles: Absolute time; Relative to first exposure. x-axes: share time - exposure time (days); share time - alter's share time (days). Annotations: shared before seeing story on feed; users shared at exact same time; shared within the first hour of exposure; within one day; within one week.]
What is the overall effect of feed on sharing? ▪ Two methods for comparing probabilities: ▪ Average treatment effect on the treated: p_feed - p_nofeed ▪ Relative risk ratio: p_feed / p_nofeed ▪ Average effect: +0.2047% increase in sharing ▪ Risk ratio: 7.3x more likely to share
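The two comparisons on this slide are a difference and a ratio of the same two sharing probabilities. A minimal sketch of both follows. The function name `compare_conditions` and the counts in the usage example are invented for illustration and are not the study's data.

```python
def compare_conditions(shares_feed, trials_feed, shares_nofeed, trials_nofeed):
    """Compute the difference and ratio of sharing probabilities across conditions."""
    p_feed = shares_feed / trials_feed
    p_nofeed = shares_nofeed / trials_nofeed
    return {
        "p_feed": p_feed,
        "p_nofeed": p_nofeed,
        "difference": p_feed - p_nofeed,   # additive effect of feed exposure
        "risk_ratio": p_feed / p_nofeed,   # multiplicative effect of feed exposure
    }


if __name__ == "__main__":
    # Made-up counts, chosen only to show the arithmetic.
    result = compare_conditions(30, 10_000, 5, 10_000)
    for name, value in result.items():
        print(f"{name}: {value:.6f}")
```

The two measures can tell different stories. When the baseline probability is very small, a large risk ratio can correspond to a small absolute difference.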
How does sharing increase with exposure? [Figure: probability of sharing vs number of sharing friends (1 to 6), by condition (feed, no feed). Annotations: influence on feed + external correlation; external correlation.]
How does sharing increase with exposure? [Figure: p_feed − p_nofeed vs number of sharing friends (1 to 6).]
Study 1: Recap ▪ Experiments are necessary to disentangle influence from other factors ▪ Significant temporal clustering exists even for unexposed users ▪ Probability of sharing increases with number of friends ▪ Even if you don’t see those friends! ▪ Influence appears stronger when more friends are shown
Study 2: Effect of Social Cues on Sharing Decisions. Chapter IV, Information Diffusion and Social Influence in Online Networks (dissertation chapter)
Motivation ▪ Social influence in information diffusion occurs via two stages ▪ 1. Exposure (study 1) ▪ 2. Decision to share ▪ Trend in previous experiment is not causal ▪ Need a way to experimentally manipulate the number of social signals received by the user [Figure: p_feed − p_nofeed vs number of sharing friends (1 to 6).]
Study 2: Outline ▪ Field experiment tests how the number of friends shown (social cues) increases sharing via randomization of cues ▪ Answers causal questions about influence: ▪ How does seeing a certain number of peers affect information diffusion? ▪ How is tie strength predictive of user activity? ▪ Are strong ties more influential?
Experimental Design ▪ Subjects: Users who arrive at pages independently of Facebook ▪ Are or would have been assigned to the no feed condition in Study 1 ▪ Not arriving via Facebook ▪ Assignment procedure: randomly assign (viewer, URL) to a number of cues
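Randomly assigning a (viewer, URL) pair to a number of cues can be sketched as drawing how many friends to display, bounded by how many friends actually liked the page. The function name `assign_cues`, the uniform draw, and the cap `max_cues=3` are assumptions for illustration; the slide does not specify the distribution or the cap.

```python
import random


def assign_cues(num_liking_friends, max_cues=3, rng=random):
    """Pick how many friends to show for one (viewer, URL) trial.

    The draw is uniform over 0..min(num_liking_friends, max_cues).
    Both the uniform choice and max_cues are hypothetical.
    """
    upper = min(num_liking_friends, max_cues)
    return rng.randint(0, upper)  # randint includes both endpoints


if __name__ == "__main__":
    rng = random.Random(2012)  # seeded so the demo is repeatable
    print([assign_cues(2, rng=rng) for _ in range(10)])
```

Because the number shown is drawn independently of the viewer, differences in sharing across cue counts, among viewers with the same number of liking friends, can be attributed to the cues themselves.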
Data ▪ Same 7 week period as Study 1 ▪ 1,891,768 randomized trials (unique subject-page pairs) consisting of: ▪ 1,156,608 unique subjects ▪ 470,089 distinct web pages ▪ Record demographic features, tie strength measures between subjects and their alters for each impression and click event
Social Correlation ▪ Probability when number of friends liking = number shown [Figure: probability of sharing vs number of sharing friends (k), 0 to 4. Annotation: Not causal!]
[Figure: probability of sharing vs number of friends shown (o), one panel per number of actual liking friends (0, 1, 2, 3). Annotations: baseline: homophily + heterogeneity (zero friends shown); observed effect.]
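The comparison in this figure can be read as follows: within each group of trials that share the same number of actual liking friends, the sharing rate at zero friends shown is the baseline, and the gap between that baseline and the rate at o friends shown is the observed effect. A minimal sketch of that computation, with hypothetical function names and a made-up trial format of (actual liking friends, friends shown, shared):

```python
from collections import defaultdict


def sharing_rates(trials):
    """Sharing rate for each (actual_liking, shown) cell.

    trials: iterable of (actual_liking, shown, shared) tuples (hypothetical format).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [shares, trials]
    for actual, shown, shared in trials:
        cell = counts[(actual, shown)]
        cell[0] += int(shared)
        cell[1] += 1
    return {key: shares / n for key, (shares, n) in counts.items()}


def cue_effects(rates):
    """Rate at `shown` cues minus the zero-cue baseline, within each stratum."""
    effects = {}
    for (actual, shown), rate in rates.items():
        baseline = rates.get((actual, 0))
        if shown > 0 and baseline is not None:
            effects[(actual, shown)] = rate - baseline
    return effects


if __name__ == "__main__":
    # Invented trials, for illustration only.
    toy = [(1, 0, False), (1, 0, False), (1, 0, True), (1, 0, False),
           (1, 1, True), (1, 1, False)]
    print(cue_effects(sharing_rates(toy)))
```

Holding the number of actual liking friends fixed is what separates the effect of the cue from homophily and heterogeneity, which are already reflected in the zero-cue baseline.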