Quantifying Social Influence in Epinions
Akshay Patil, Golnaz Ghasemiesfeh, Roozbeh Ebrahimi & Jie Gao
Introduction
Social Network:
• Structure made up of entities & their relationships, e.g.: Facebook, G+, Y!, etc.
Content Generation Websites:
• e.g.: Wikipedia, YouTube, Epinions, Instagram, etc.
Explosion in “online social activity” + “content generation”:
• Large-scale data availability.
• Quantitative research into dynamics.
Overlay of “content generation” + “social structure”:
• Study the mutual influence of content and social structure on each other.
Epinions: A Consumer Review Site
Epinions.com incorporates a social structure into its rating system:
• Rating system
  • Users write reviews
  • Users rate other users’ reviews [1-5] (1: bad -> 5: excellent)
• Social structure
  • Trust other users to form a “web of trust” (public)
  • Distrust other users to form a “block list” (private)
• We are interested in the interplay of “rating system data” and “social structure data”.
Dataset
Statistic          Value
#Users             131,828
#Reviews           1,197,816
#Trust Edges       717,667 (85%)
#Distrust Edges    123,705 (15%)
#Ratings           12,943,546
Time Range         Jan ’01 to Aug ’03
[Figure: Epinions Ratings, shares of ratings 1-5; values shown: 2.13%, 4.63%, 0.01%, 14.30%, 78.93% (order not tied to rating value).]
1. Relationship Formation
Scenario:
• User A has a couple of trustees (his “web of trust”, or “friends”), labeled 1, 2, …, n.
• A’s friends have a trust/distrust relationship with user B.
Classification:
• If the majority of A’s friends trust B, we say they collectively trust B.
• If the majority of A’s friends distrust B, we say they collectively distrust B.
• Otherwise they are neutral or in disagreement.
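The classification rule can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code: the function name `collective_opinion` and the +1/-1 edge encoding are hypothetical, and it assumes "majority" is taken over those friends of A who have an edge to B, with ties and empty inputs counted as neutral.

```python
from collections import Counter

def collective_opinion(friend_signs):
    """Classify the collective opinion of A's friends about B.

    friend_signs: iterable of +1 (friend trusts B) or -1 (friend distrusts B),
    one entry per friend of A that has an edge to B (assumed encoding).
    """
    counts = Counter(friend_signs)
    trust, distrust = counts[+1], counts[-1]
    if trust > distrust:
        return "trust"
    if distrust > trust:
        return "distrust"
    # Tie, or no friend has an opinion about B.
    return "neutral"
```

For example, three friends with signs `[+1, +1, -1]` would be classified as collectively trusting B.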
1. Relationship Formation
Question 1: Is there a correlation between the collective opinion of A’s friends about B and A’s future relationship with B?
• If A’s friends collectively trust B, is A more likely to trust B as well?
• If A’s friends collectively distrust B, is A more likely to distrust B as well?
1. Relationship Formation
Raw Observations:
[Figure: bar chart, vertical axis from 80% to 100%; series Collectively Trust, Collectively Distrust, and Opinionated; bar values shown: 95.09%, 93.85%, 85.93%.]
How meaningful (statistically significant) are these results?
1. Relationship Formation, Random Shuffle
Measure over/under-representation compared to mere chance (approach by Leskovec et al. ’10):
• Randomly shuffle trust/distrust edges, while maintaining the same percentages.
• Redo the “relationship formation” analysis.
• Compute the “Surprise” value: $s = \frac{Q - E[Q]}{\sqrt{E[Q]\,(1 - p_0)}}$
• Q: actual quantity of a scenario, E[Q]: expected quantity under shuffling, p_0: a priori probability of the scenario.
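The shuffle-and-compare procedure can be sketched as follows. This is an illustrative sketch under stated assumptions: edges are hypothetical `(source, target, sign)` tuples, the helper names are invented here, and the surprise is assumed to take the form s = (Q - E[Q]) / sqrt(E[Q] (1 - p_0)), following the "number of standard deviations" reading.

```python
import math
import random

def shuffle_signs(edges, rng):
    """Randomly reassign signs to edges.

    Endpoints stay fixed and the overall trust/distrust counts are preserved,
    because the existing signs are only permuted.
    """
    signs = [sign for _, _, sign in edges]
    rng.shuffle(signs)
    return [(u, v, s) for (u, v, _), s in zip(edges, signs)]

def surprise(actual, expected, p0):
    """Standard deviations between actual and expected counts.

    Assumes expected > 0 and p0 < 1 (otherwise the denominator is zero).
    """
    return (actual - expected) / math.sqrt(expected * (1.0 - p0))
```

In practice one would rerun the relationship formation analysis on many shuffled copies to estimate E[Q] and p_0 before calling `surprise`.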
1. Relationship Formation, Random Shuffle
Surprise is the number of standard deviations by which the actual quantity differs from the expected number under the random shuffling model.
• s > 0: overrepresentation
• s < 0: underrepresentation
• s = 6: p-value ≈ 10^-8
• A value of s > 6 results in excellent statistical significance.
1. Relationship Formation, Random Shuffle
“Agreeing with Friends”: surprise values in excess of 70!
[Figure: bar chart of Surprise (axis from -120 to 90) for Agreeing with Friends, Disagreeing with Friends, and Neutral Friends.]
Strong correlation between the opinions of a user’s friends and the formation of his future relationships (distrust is hidden).
1. Relationship Formation: Linking Habits
The dataset exhibits users with very different linking habits:
• Some are very trustworthy / trustful compared to others.
Analysis should not overlook:
• The quality of the reviews written by the user (trustworthiness).
• The degree of trustfulness of the person creating the link.
Looking through the lens of linking habits (Leskovec et al. ’10):
Receptive Baseline (trustworthiness):
• Fraction of a user’s received links that are trust links.
Generative Baseline (trustfulness):
• Fraction of a user’s given links that are trust links.
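Both baselines are simple per-user fractions, which a short sketch makes concrete. The function name `linking_baselines` and the `(source, target, sign)` edge format are assumptions made for illustration; the slides do not show the authors' implementation.

```python
from collections import defaultdict

def linking_baselines(edges):
    """Return (generative, receptive) baselines per user.

    edges: iterable of (source, target, sign), sign +1 = trust, -1 = distrust.
    generative[u]: fraction of links given by u that are trust links.
    receptive[u]:  fraction of links received by u that are trust links.
    """
    given = defaultdict(lambda: [0, 0])     # user -> [trust links, all links]
    received = defaultdict(lambda: [0, 0])
    for src, dst, sign in edges:
        is_trust = 1 if sign > 0 else 0
        given[src][0] += is_trust
        given[src][1] += 1
        received[dst][0] += is_trust
        received[dst][1] += 1
    generative = {u: t / n for u, (t, n) in given.items()}
    receptive = {u: t / n for u, (t, n) in received.items()}
    return generative, receptive
```

Users with no outgoing (or incoming) links simply do not appear in the corresponding dictionary.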
1. Relationship Formation: Linking Habits
Receptive/Generative Surprise: number of standard deviations by which the quantity is above the expected number.
• If B was trusted/distrusted based solely on his trustworthiness, Receptive Surprise = 0.
• If A made his decision based solely on his trustfulness, Generative Surprise = 0.
Observations:
                         Receptive Surprise   Generative Surprise
Collectively Trust                    96.76                 34.99
Collectively Distrust               -104.15                -56.31
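One way to compute a surprise relative to a baseline is sketched below. The slides do not give the formula, so this is an assumption: each link is treated as an independent Bernoulli trial whose success probability is the relevant baseline (generative for the link creator, receptive for the link target), in the spirit of Leskovec et al. ’10. The name `baseline_surprise` is hypothetical.

```python
import math

def baseline_surprise(observed_trust, baselines):
    """Standard deviations by which observed trust links exceed expectation.

    observed_trust: number of links in the scenario that are trust links.
    baselines: one baseline probability per link in the scenario
               (assumed independent Bernoulli trials).
    """
    expected = sum(baselines)
    variance = sum(p * (1.0 - p) for p in baselines)
    if variance == 0:
        # All baselines are 0 or 1: no randomness to be surprised by.
        return 0.0
    return (observed_trust - expected) / math.sqrt(variance)
```

Under this reading, a surprise of 0 means the links are fully explained by the baseline, positive values mean more trust than the baseline predicts, and negative values mean less.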
1. Relationship Formation: Linking Habits
Collectively Trust
• Users exceed both generative and receptive baselines in trusting and being trusted.
• This can be explained by homophily or influence of friends.
Collectively Distrust
• Users fall behind generative/receptive baselines.
• This can be explained by heterophobia or lack of context by friends (distrust edges are hidden).
2. Friend of Friend (FoF) Dynamics
Scenario:
• User A trusts user B and user B trusts user C.
• A does not have a trust/distrust edge to C.
• A rates a review by C.
Question 2: Is A more likely to give a favorable rating to C’s review?
Why is this about the influence of the social structure on the rating system data?
2. Friend of Friend Dynamics
[Figure: four scenario diagrams labeled FoF, FoE, EoF, and EoE, each showing users A, B, and C with A rating C’s review.]
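The four scenarios can be counted with a small helper. This sketch rests on an assumption about the labels: "XoY" is read as "C is an X (Friend/Enemy) of B, who is a Y (Friend/Enemy) of A", so FoF means A trusts B and B trusts C, and EoF means A trusts B and B distrusts C. The function names and the `(source, target, sign)` edge format are hypothetical.

```python
from collections import Counter

def path_type(sign_ab, sign_bc):
    """Label a two-hop path A -> B -> C, e.g. 'EoF' = enemy of a friend."""
    c_to_b = "F" if sign_bc > 0 else "E"   # B's relation to C
    b_to_a = "F" if sign_ab > 0 else "E"   # A's relation to B
    return f"{c_to_b}o{b_to_a}"

def count_paths(edges, a, c):
    """Count two-hop paths a -> b -> c by type.

    Returns an empty Counter when a already has a direct edge to c,
    since the scenario requires that no such edge exists.
    """
    sign = {(u, v): s for u, v, s in edges}
    counts = Counter()
    if (a, c) in sign:
        return counts
    for (u, b), s_ab in sign.items():
        if u == a and (b, c) in sign:
            counts[path_type(s_ab, sign[(b, c)])] += 1
    return counts
```

The resulting counts correspond to per-pair quantities such as the number of FoF or EoF paths between a rater and a review author.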
2. FoF Dynamics: Random Shuffling
[Figure: two bar charts, EoF and FoF, showing Surprise for each rating 1-5.]
2. FoF Dynamics: Random Shuffling
[Figure: two bar charts, FoE and EoE, showing Surprise for each rating 1-5.]
2. FoF Dynamics: Rating Habits Analysis
Rating   Gen./Rec. Surprise       FoF       EoF       FoE       EoE
1        Gen. sur.             -43.77     66.03     -8.36     -1.43
1        Rec. sur.             -10.91     19.80     57.13      2.76
2        Gen. sur.            -627.54    789.36   -108.83     26.54
2        Rec. sur.            -527.77     89.58    206.16      4.03
3        Gen. sur.            -360.72      2.01   -304.42   -124.23
3        Rec. sur.            -181.17     10.16     65.53      5.89
4        Gen. sur.            -847.21   -115.23   -381.94   -190.69
4        Rec. sur.            -370.22     -3.57     81.06     -4.27
5        Gen. sur.            1065.09   -189.93    531.88    214.75
5        Rec. sur.             519.91    -61.03   -173.88     -1.36
2. Friend of Friend Dynamics: Summary
Distinct trends in 2 (out of 4) scenarios:
• FoF: Shift towards assigning higher ratings to C’s review (especially 5).
  • Homophily/Influence
• EoF: Shift towards assigning lower ratings to C’s review (especially 1 & 2).
  • Heterophobia/Influence?
In the remaining 2 scenarios (FoE & EoE), it is hard to get a solid interpretation.
3. Building a Predictor: Correlation Analysis
Utilize FoF dynamics as features and “actual rating” as target value.
Dynamics   Correlation Coefficient
FoF         0.1112
EoF        -0.0918
FoE         0.0105
EoE        -0.0001
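A correlation between a path-count feature and the actual rating can be computed as below. The slides do not say which coefficient was used, so Pearson correlation is an assumption here, and the function name `pearson` is illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

Here `xs` would hold, for instance, the number of FoF paths for each (rater, author) pair and `ys` the rating actually given.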
3. Building a Predictor: Picking Features
Class          Feature                                 Information Gain
Trust          A’s Generative Baseline                 0.1595
Trust          C’s Generative Baseline                 0.2291
Trust          A’s Receptive Baseline                  0.1943
Trust          C’s Receptive Baseline                  0.4496
Rating         Avg. Rating given by A                  0.3316
Rating         Avg. Rating given by C                  0.3776
Rating         Avg. Rating received by A’s Reviews     0.2453
Rating         Avg. Rating received by C’s Reviews     0.5362
FoF Dynamics   Number of FoF Paths                     0.3813
FoF Dynamics   Number of EoF Paths                     0.1894
FoF Dynamics   Number of FoE Paths                     0.0119
FoF Dynamics   Number of EoE Paths                     0.0198
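Information gain measures how much knowing a feature reduces the entropy of the target. The sketch below computes it for a discrete feature; it assumes continuous features (such as baselines or average ratings) have already been binned, which the slides do not describe. The function names are illustrative.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of labels minus the entropy remaining after splitting on the feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional
```

A feature that perfectly separates two equally frequent classes has a gain of 1 bit, while a feature unrelated to the class has a gain of 0.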
3. Prediction Results (Bootstrap Aggregating)
ROC Area (AUC) = 0.91
Overall Accuracy = 76%
[Figure: bar chart of Precision, Recall, and F-Score (axis from 0.65 to 0.85) for the rating classes Low (1, 2), Medium (3), and High (4, 5).]
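The per-class metrics reported here can be computed as in the following sketch. It covers only the evaluation step (binning ratings into Low/Medium/High and computing precision, recall, and F-score), not the bagging classifier itself; the function names are hypothetical.

```python
def rating_class(rating):
    """Map a 1-5 rating to the classes Low (1, 2), Medium (3), High (4, 5)."""
    if rating <= 2:
        return "Low"
    if rating == 3:
        return "Medium"
    return "High"

def per_class_metrics(actual, predicted):
    """Return {class: (precision, recall, f_score)} for the three classes."""
    metrics = {}
    for cls in ("Low", "Medium", "High"):
        pairs = list(zip(actual, predicted))
        tp = sum(1 for a, p in pairs if a == cls and p == cls)
        fp = sum(1 for a, p in pairs if a != cls and p == cls)
        fn = sum(1 for a, p in pairs if a == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        metrics[cls] = (precision, recall, f_score)
    return metrics
```

`actual` and `predicted` are lists of class labels, for example produced by applying `rating_class` to true ratings and to a classifier's output.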
Conclusion
Relationship Formation
• Random Shuffle and Linking Habits: strong correlation. Users exceed both generative and receptive baselines in trusting and being trusted.
Friend of Friend Dynamics
• FoF: Shift towards assigning higher ratings to C’s review.
• EoF: Shift towards assigning lower ratings to C’s review.
Building a Predictor
• This alignment can be used to predict (recommend) content that a user would like.
• We achieve good prediction accuracy with a simple feature set.
Thank You!