evolution dynamics in social networks



  1. EVOLUTION DYNAMICS IN SOCIAL NETWORKS • Ashwin Bahulkar • Advisor & Collaborators: Boleslaw K. Szymanski, Kevin Chan (1), Omar Lizardo (2) • (1) US Army Research Laboratory • (2) University of Notre Dame, Notre Dame, IN, USA • Supported by Network Science CTA, ARL

  2. Overview • Link Formation and Dissolution in attribute-rich networks • Can we predict the state of a network from node attributes? • Which node attributes can predict formation and dissolution of edges in networks? • Coevolution of node-aligned multiple layers in networks • Multiple layers: several networks sharing the same node set, with different relations among nodes. • Coevolution: Do edges occur in one network before they do so in another? • Groups and Influence

  3. Motivation • Find out which factors affect the evolution of networks • Sociological interests: influence policy making in organizations, based on these factors • Bring stability to networks in organizations through policies, if desired • Infer the cause of instability in networks • Build strong, stable teams in organizations • Commercial interests: influence advertising, marketing and outreach strategies

  4. Part 1: Link Formation and Dissolution in Attribute-rich Networks

  5. Introduction • How much does knowledge of node attributes improve link formation and dissolution prediction? • How should these attributes be used to make predictions? • Find which attributes are correlated with formation of new links • We introduce the preference model • Find which attributes are correlated with dissolution and persistence of existing links. • Track network stability with link prediction

  6. Link Prediction • Link Formation Prediction: • Given a social network that evolves over time, with the evolution recorded as a sequence of network snapshots. • From one snapshot to the next, some new edges are created, some old edges dissolve and some nodes are removed. • At any given snapshot, which edges will be created in the next snapshot? • Highly unbalanced classification: very few potential links are actually created • [Figure: training set and test set, each showing visible snapshots followed by a hidden snapshot with the new links] • Link Dissolution Prediction: similar, predict which existing links will dissolve.
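The labeling step described on this slide can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the toy snapshots are hypothetical, and edges are treated as undirected.

```python
from itertools import combinations


def formation_labels(nodes, edges_now, edges_next):
    """Label each currently unlinked pair: 1 if it links in the next snapshot."""
    now = {frozenset(e) for e in edges_now}
    nxt = {frozenset(e) for e in edges_next}
    labels = {}
    for u, v in combinations(sorted(nodes), 2):
        pair = frozenset((u, v))
        if pair not in now:  # only unlinked pairs can form a new link
            labels[(u, v)] = 1 if pair in nxt else 0
    return labels


def dissolution_labels(edges_now, edges_next):
    """Label each existing edge: 1 if it is gone in the next snapshot."""
    nxt = {frozenset(e) for e in edges_next}
    return {tuple(sorted(e)): 0 if frozenset(e) in nxt else 1 for e in edges_now}


# Hypothetical toy snapshots (not NetSense data).
nodes = ["a", "b", "c", "d"]
snapshot_t = [("a", "b"), ("b", "c")]
snapshot_t1 = [("a", "b"), ("a", "c")]

formed = formation_labels(nodes, snapshot_t, snapshot_t1)
dissolved = dissolution_labels(snapshot_t, snapshot_t1)
print(sum(formed.values()), "new link(s) among", len(formed), "candidate pairs")
print(sum(dissolved.values()), "dissolved link(s) among", len(dissolved), "existing edges")
```

Even in this toy case only one of four candidate pairs is positive, which is the class imbalance the slide points to; on real networks the ratio is far more skewed.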

  7. Related Work • Existing link prediction approaches: • Topology-based link predictors • Machine learning based • Markov model or graphical model based • Little work on attribute-rich networks; attributes are used in a very simplistic manner • Little work on dissolution prediction • Attribute-rich data has recently become available to us, although the networks are relatively small

  8. Attribute-rich data: NetSense • Nodes: around 200 students at the University of Notre Dame, followed for around 2 years, from freshman to junior year. • Data collected: • Call and message logs between students in the study. • Contact data based on Bluetooth-recorded proximity. • Nominations of significant peers, opinions on social & political issues, student background and university activities for every student. • Frequency: • Nominations and opinions were collected in the form of surveys at the beginning of every semester.

  9. Evolving NetSense Networks • Network snapshots are taken for every semester of the year: Fall and Spring. • Behavioral Networks: Based on calls and texts made in the semester. An edge exists if there is a call or text exchange between two nodes. Typical network size ranges from 150 to 200 nodes and 200 to 350 edges. We have snapshots for 4 semesters. • Nominative Network: Based on survey answers by students to “Who are your top contacts”.

  10. Node Attributes • Student background: • Major in the Notre Dame programs • Behavioral traits • Family income, race and religion • Opinions on: • Politics • Abortion and marijuana legalization • Homosexuality and gay marriage • Habits and Lifestyle: • Drinking habits • Time spent on weekly activities: studying, partying etc.

  11. Attributes for link prediction • We use machine learning for link prediction • The Homophily Model: • “Birds of a feather flock together” • Nodes n1, n2; attribute values a1 = a2; feature value = 1 • Nodes n1, n2; attribute values a1 ≠ a2; feature value = 0 • Does this work? Not so much. • Why? Nodes have different “preferences” for different attribute values • We introduce the “preference model”.
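The homophily feature above amounts to an equality test per attribute. A minimal sketch, with a hypothetical function name and made-up attribute values:

```python
def homophily_features(attrs_n1, attrs_n2, attribute_names):
    """Return 1 for each attribute on which both nodes agree, else 0."""
    return [1 if attrs_n1[name] == attrs_n2[name] else 0 for name in attribute_names]


# Hypothetical attribute values for two nodes (not NetSense data).
n1 = {"politics": "liberal", "major": "physics"}
n2 = {"politics": "liberal", "major": "history"}

features = homophily_features(n1, n2, ["politics", "major"])
print(features)  # [1, 0]
```

The feature is identical for every pair that shares a value, so it cannot express that one node seeks out a value far more strongly than another does.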

  12. A case for the preference model • Different groups of people have different attributes • Still, it is difficult to generalize preferences on a group basis • Different nodes have different preferences for attributes • [Figure: values > 1 indicate preference for, values < 1 indicate preference against]

  13. Intuition of the Preference Model • Population: 60% liberals, 40% conservatives • Node 1: liberal; 90% of contacts are liberals, 10% conservatives • Strong bias towards liberals, strong bias against conservatives • Node 2: conservative; 50% liberals, 50% conservatives • Only slight bias towards conservatives • We capture the bias, or “anomaly”, for every attribute value, for each node, with reference to the population.

  14. Individual Preferences of Nodes • Features for machine learning: • Node Preferences -> Edge Preferences • Some network features: number of common neighbors • Node preference feature: • For an edge with nodes n1 and n2, for attribute a: • Feature-value(a) = n1->preference(n2.a) * n2->preference(n1.a) • Calculate preference of node n1 for attribute-value v: • n1 has n contacts with attribute-value v. • Calculate the Z-score of having n contacts • Z-score = (value - expected mean) / standard deviation • Obtain scores, which can vary from -3.4 to +3.4 • Convert to a range of 0 to 1
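The preference computation can be sketched as below. The slide does not say how the expected mean and standard deviation are obtained, nor how scores are mapped to 0 to 1, so two assumptions are made here and marked in the code: a binomial null model (each contact carries value v with probability equal to its population share) and a linear rescaling after clipping to ±3.4. All function names and the contact counts are hypothetical; the percentages follow the liberal/conservative example from the previous slide.

```python
import math

Z_LIMIT = 3.4  # score range quoted on the slide


def preference_z(n_with_value, n_contacts, population_share):
    """Z-score of having n_with_value contacts with a given attribute value.

    ASSUMPTION: binomial null model, so the expected mean is n*p and the
    standard deviation is sqrt(n*p*(1-p)).
    """
    expected = n_contacts * population_share
    std = math.sqrt(n_contacts * population_share * (1 - population_share))
    if std == 0:
        return 0.0
    return (n_with_value - expected) / std


def to_unit_range(z):
    """ASSUMPTION: clip to [-Z_LIMIT, Z_LIMIT], then rescale linearly to [0, 1]."""
    clipped = max(-Z_LIMIT, min(Z_LIMIT, z))
    return (clipped + Z_LIMIT) / (2 * Z_LIMIT)


def edge_feature(pref_n1_for_n2_value, pref_n2_for_n1_value):
    """Feature-value(a) = n1->preference(n2.a) * n2->preference(n1.a)."""
    return pref_n1_for_n2_value * pref_n2_for_n1_value


# Population: 60% liberal, 40% conservative. Assume 10 contacts per node.
# Node 1 (liberal): 9 liberal and 1 conservative contact.
# Node 2 (conservative): 5 liberal and 5 conservative contacts.
z_node1_liberal = preference_z(9, 10, 0.6)       # strong bias towards liberals
z_node2_conservative = preference_z(5, 10, 0.4)  # slight bias towards conservatives

# Edge feature for the pair (node 1, node 2) on the political attribute:
pref_1_for_conservative = to_unit_range(preference_z(1, 10, 0.4))
pref_2_for_liberal = to_unit_range(preference_z(5, 10, 0.6))
feature = edge_feature(pref_1_for_conservative, pref_2_for_liberal)
print(round(z_node1_liberal, 2), round(z_node2_conservative, 2), round(feature, 3))
```

Under these assumptions node 1's score for liberals is about three times node 2's score for conservatives, matching the "strong" versus "slight" bias described in the intuition slide.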

  15. Results with the Preference method • Link Prediction: We get about 90% recall with good accuracy, using SVM, linear regression and logistic regression. • Link Dissolution Prediction: 80-90% accuracy • Below are the plots of recall vs. false positives for different thresholds in linear regression.
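A recall versus false-positive trade-off of this kind is obtained by sweeping a decision threshold over the regression scores. The sketch below shows the bookkeeping only; the scores and labels are made up and do not reproduce the reported results, and the function name is hypothetical.

```python
def recall_and_fpr(scores, labels, threshold):
    """Recall and false-positive rate when predicting a link for score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    recall = tp / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return recall, fpr


# Hypothetical model scores and true labels (1 = link formed).
scores = [0.9, 0.8, 0.4, 0.35, 0.2, 0.1]
labels = [1, 0, 1, 0, 0, 0]

for threshold in (0.3, 0.5, 0.85):
    recall, fpr = recall_and_fpr(scores, labels, threshold)
    print(f"threshold={threshold}: recall={recall:.2f}, false-positive rate={fpr:.2f}")
```

Lowering the threshold raises recall at the cost of more false positives, which is why a single recall figure is best read alongside the corresponding false-positive rate.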

  16. Results and Ranking of Attributes • Ranking of attributes: leave-feature-out, weight in linear regression • Link Creation, Nomination network: 1. Political Views 2. Parental Income 3. Common Neighbors 4. Time Volunteering 5. Time Exercising • Link Creation, Behavior network: 1. Gay Marriage Legalization 2. Political Views 3. Parental Income 4. Views on Homosexuality 5. Time Camping • Link Dissolution, Nomination network: 1. Views on Homosexuality 2. Political Views 3. Time Socializing 4. Time Partying 5. Marijuana Legalization • Link Dissolution, Behavior network: 1. Time Socializing 2. Time in Clubs 3. Marijuana Legalization 4. Time Exercising 5. Time Studying

  17. Track Network Stability by Link Prediction • Networks evolve over time • Patterns of new link formation also change over time • We look at the network of researchers studying Leishmaniasis, a rare disease • The network spreads over several countries, including Brazil, India, the US and European Union countries • From 1980 to 2015, the leaders of research changed, and the nature of link formation changed with them • We use link prediction to track the change

  18. Experiment • Perform link prediction over the period 1980 to 2015, divided into seven 5-year snapshots • Perform link prediction using older snapshots, see if the models still apply • Perform link prediction only on newly emerging nodes, and compare with older nodes • Features: • Network topology features, common areas of research, country of origin, recency and strength of collaboration • Network size: • Ranges from 700 to 5000 nodes, and 1200 to 34,000 edges.

  19. Results • Using the most recent snapshot, we get recall values between 60% and 80% • Using older snapshots, recall and accuracy both drop, by about 8-10% • Edges between old nodes vs. edges with new nodes: • Until 2000, recall is equivalent for edges between old nodes and edges with new nodes. • After 2000, recall of edges with new nodes is very poor, and increases a little by 2015 • Possible large-scale disruption in the network in 2000 • Leadership in research passes from the USA and Europe to India and Brazil, and focus shifts from fundamental research to more diagnostic and trials-based work

  20. Part 2: Coevolution of a Multilayer Node-aligned Network whose Layers Represent Different Social Relations

  21. Coevolution of Multiple Layers in Social Networks • Continuously evolving cognitive and behavioral layers. • Are behavioral edges formed before nominative edges are formed? • How likely is a behavioral edge to dissolve after the corresponding edge disappears in the nominative network? • [Figure: nominative network (red edges) and behavioral network (green edges)]

  22. Questions • Are behavioral edges formed before nomination edges are formed? • How likely is a behavioral edge to dissolve after the corresponding edge disappears in the nomination network? • Are there any patterns of communication decay following link dissolution in the nomination network? • Do symmetric nominations differ from asymmetric nominations?
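The first question can be approached by comparing, for each nominated pair, the snapshot in which the edge first appears in each layer. The sketch below is one possible way to count this, not the authors' method: the function names and toy snapshots are hypothetical, and nominations are treated as undirected for simplicity, which ignores the symmetric versus asymmetric distinction raised in the last question.

```python
def first_appearance(snapshots):
    """Map each undirected edge to the index of the first snapshot containing it."""
    first = {}
    for index, edges in enumerate(snapshots):
        for u, v in edges:
            first.setdefault(frozenset((u, v)), index)
    return first


def precedence_counts(behavioral_snapshots, nomination_snapshots):
    """For every nomination edge, record which layer showed the edge first."""
    behavioral_first = first_appearance(behavioral_snapshots)
    nomination_first = first_appearance(nomination_snapshots)
    counts = {"behavioral_first": 0, "same_snapshot": 0, "nomination_first": 0}
    for edge, nom_index in nomination_first.items():
        beh_index = behavioral_first.get(edge)
        if beh_index is None or beh_index > nom_index:
            counts["nomination_first"] += 1  # no earlier or concurrent behavioral edge
        elif beh_index == nom_index:
            counts["same_snapshot"] += 1
        else:
            counts["behavioral_first"] += 1
    return counts


# Hypothetical per-semester edge lists (not NetSense data).
behavioral = [
    [("a", "b")],
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("b", "c"), ("c", "d")],
]
nomination = [
    [],
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("d", "c"), ("a", "d")],
]

counts = precedence_counts(behavioral, nomination)
print(counts)
```

With semester-level snapshots, many pairs will fall into the same-snapshot bucket, so finer-grained timestamps from the call and text logs would be needed to order those cases.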

  23. Dataset: NetSense • NetSense communication and nomination data is used. • Bluetooth interaction data is also used. • Behavioral Layers: One layer based on communication edges, and one based on Bluetooth proximity measures. The Bluetooth proximity layer is much denser. We have snapshots for 4 semesters. • Nominative Layer: Based on survey answers by students to “Who are your top contacts”.
