
Sociology and CS: How close are people connected? (“Small World”)



  1. Sociology and CS
     Philip Chan

     Sociology Problems
     • Problem 1: How close are people connected? (“Small World”)
     • Problem 2: Who is the most connected? (“Connector”)

     Problem 1: Small World

     How close are people connected? (Problem Understanding)
     • Are people
       • closely connected,
       • not closely connected,
       • isolated into groups,
       • …

     Degree of Separation
     • The number of connections to reach another person

     Milgram’s Experiment
     • Stanley Milgram, psychologist
     • Experiment in the late 1960s
     • Chain letter to gather data
       • Stockbroker in Boston
       • 160 people in Omaha, Nebraska
       • Given a packet
       • Add name and forward it to another person who might be closer to the stockbroker
     • Partial “social network”

  2. Small World
     • Six degrees of separation
     • Everyone is connected to everyone by a few people (about 6 on average).
     • Obama might be 6 connections away from you
     • “Small world” phenomenon

     Bacon Number
     • Number of connections to reach actor Kevin Bacon
     • http://oracleofbacon.org/
     • Is a connection in this network different from the one in Milgram’s experiment?

     Problem Formulation
     • Given (input)
       • People
       • Connections/links/friendships
     • Find (output)
       • the average number of connections between two people
     • Simplification
       • don’t care about how long/strong/… the friendships are

     Problem Formulation
     • Formulate it into a graph problem (abstraction)
     • Given (input)
       • People -> vertices
       • Connections -> edges
     • Find (output)
       • the average number of connections between two people -> ?

  3. Problem Formulation
     • Formulate it into a graph problem (abstraction)
     • Given (input)
       • People -> vertices
       • Connections -> edges
     • Find (output)
       • the average number of connections between two people -> average shortest path length

     Algorithm
     • Ideas?

     Algorithm
     • Shortest Path
       • Dijkstra’s algorithm
       • Limitations? Single-source
     • All-pair Shortest Path
       • Floyd’s algorithm
       • This could be an overkill, why?

     Algorithm
     • Unweighted edges
       • Each edge has the same weight of 1
     • Simpler algorithm?
       • Breadth-first search (BFS)
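To make the BFS idea on this page concrete, here is a minimal Python sketch that counts the connections from one person to everyone they can reach in an unweighted graph. The friendship graph, the names in it, and the function name `bfs_distances` are made up for illustration and do not come from the slides.

```python
from collections import deque

def bfs_distances(graph, source):
    """Return {vertex: number of edges from source} for every reachable vertex."""
    dist = {source: 0}            # also serves as the record of visited vertices
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for friend in graph[person]:
            if friend not in dist:        # the first visit in BFS is along a shortest path
                dist[friend] = dist[person] + 1
                queue.append(friend)
    return dist

# Hypothetical undirected friendship graph: every edge is listed in both directions.
friends = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Ann", "Dan"],
    "Cat": ["Ann", "Dan"],
    "Dan": ["Bob", "Cat", "Eve"],
    "Eve": ["Dan"],
}

print(bfs_distances(friends, "Ann"))
# {'Ann': 0, 'Bob': 1, 'Cat': 1, 'Dan': 2, 'Eve': 3}
```

Because every edge has weight 1, a plain queue is enough here. Dijkstra's priority queue and Floyd's all-pairs table are only needed when edges carry different weights.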

  4. Algorithm
     • Breadth-first search (BFS)
       • Data structure to remember visited vertices
       • Single source; repeat for each vertex to start
       • shortestPath(x, y) = shortestPath(y, x)

     Implementation
     • Which data structure to represent a graph (vertices and edges)?
       • Adjacency matrix
       • Adjacency list
     • Tradeoffs?
       • Time
       • Space
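The "repeat for each vertex" step can be sketched as follows. This is one possible reading rather than the course's reference solution: it runs BFS from every vertex, counts each unordered pair once (relying on shortestPath(x, y) = shortestPath(y, x)), and skips pairs that cannot reach each other. That last choice is an assumption, since the slides do not say how disconnected pairs should be handled. The graph and function names are hypothetical.

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest edge counts from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for friend in graph[person]:
            if friend not in dist:
                dist[friend] = dist[person] + 1
                queue.append(friend)
    return dist

def average_separation(graph):
    """Average shortest-path length over all connected unordered pairs."""
    vertices = list(graph)
    total = 0
    pairs = 0
    for i, source in enumerate(vertices):
        dist = bfs_distances(graph, source)
        # Only look at vertices later in the list, so each pair is counted once.
        for target in vertices[i + 1:]:
            if target in dist:            # assumption: unreachable pairs are ignored
                total += dist[target]
                pairs += 1
    return total / pairs if pairs else 0.0

# Hypothetical undirected friendship graph.
friends = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Ann", "Dan"],
    "Cat": ["Ann", "Dan"],
    "Dan": ["Bob", "Cat", "Eve"],
    "Eve": ["Dan"],
}

print(average_separation(friends))  # 1.6 (16 total steps over 10 pairs)
```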

  5. Adjacency Matrix vs List
     • Time: speed of what?
       • Speed of key operations in the algorithm
       • Algorithm: BFS
       • Key operation: identifying children
     • Space
       • Amount of data in the problem
       • Number of people/vertices
       • Number of friends/edges each person has

     Problem 2: Connector
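The tradeoff can be seen in a small sketch that stores the same network both ways and performs the key BFS operation, identifying a vertex's children, on each. The 4-person network and the helper names are invented for illustration.

```python
# Hypothetical network of 4 people (numbered 0..3) with 3 friendships.
n = 4
edges = [(0, 1), (0, 2), (2, 3)]

# Adjacency matrix: n * n cells, no matter how few friendships exist.
matrix = [[0] * n for _ in range(n)]
# Adjacency list: one stored entry per endpoint of each friendship.
adj_list = [[] for _ in range(n)]
for u, v in edges:
    matrix[u][v] = matrix[v][u] = 1
    adj_list[u].append(v)
    adj_list[v].append(u)

def children_matrix(matrix, u):
    """Scan the whole row: work proportional to n, even if u has few friends."""
    return [v for v, connected in enumerate(matrix[u]) if connected]

def children_list(adj_list, u):
    """Read the stored neighbors: work proportional to the degree of u."""
    return adj_list[u]

print(children_matrix(matrix, 0))           # [1, 2]
print(children_list(adj_list, 0))           # [1, 2]
print(sum(len(row) for row in matrix))      # 16 cells stored
print(sum(len(nbrs) for nbrs in adj_list))  # 6 entries stored
```

When most people know only a small fraction of everyone else, the list is smaller and yields children faster. The matrix is better when the algorithm mostly asks "are u and v connected?", which it answers with a single lookup.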

  6. Who is the most connected? (Problem understanding)
     • What does that mean?

     Revolutionary War
     • Spreading the word that the British are going to attack
     • Paul Revere vs William Dawes
       • Revere was more successful than Dawes
       • History books remember Revere more

     Who is the most connected?
     • What does that mean?
       • The person with the most friends?
     • Phone book experiment
       • 250 random surnames
       • Number of friends with those surnames
       • The number of friends varies widely
         • Random sample: 9 to 118
         • Conference in Princeton: 16 to 108
     • How to formulate it into a graph problem?

  7. Who is the most connected?
     • What does that mean?
       • The person with the most friends?
     • How to formulate it into a graph problem?
       • Output: the vertex with the highest degree

     Who is the most connected?
     • What does that mean?
       • The person with the most friends?
     • Are all friends equal?
       • You have 100 friends
       • Michelle Obama has only one friend: Barack Obama, who has a lot of friends
       • Not just how many, but who you know

     Milgram’s Experiment
     • 24 letters get to the stockbroker at home
       • 16 from Mr. Jacobs
     • The rest get to the stockbroker at work
       • Majority from Mr. Brown and Mr. Jones
     • Overall, half of the letters came through the three people
     • But Milgram started from a random set of people
     • What does this suggest?

     Milgram’s Experiment
     • Average degree of separation is six, but:
       • A small number of special people connect to many people in a few steps
         • Small degree of separation
       • The rest of us are connected to those special people
     • Called “Connectors” by Gladwell

     Getting a Job experiment
     • Mark Granovetter, sociologist
     • Experiment in 1974
     • 19%: formal means (advertisements, headhunters)
     • 20%: apply directly
     • 56%: personal connection
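Returning to the first formulation on this page (output: the vertex with the highest degree), a minimal sketch with an adjacency list might look like this. The graph and the function name `most_friends` are hypothetical.

```python
def most_friends(graph):
    """Return (person, degree) for the vertex with the highest degree."""
    person = max(graph, key=lambda p: len(graph[p]))
    return person, len(graph[person])

# Hypothetical undirected friendship graph.
friends = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Ann", "Dan"],
    "Cat": ["Ann", "Dan"],
    "Dan": ["Bob", "Cat", "Eve"],
    "Eve": ["Dan"],
}

print(most_friends(friends))  # ('Dan', 3)
```

Degree only counts direct friends. It cannot tell Eve, whose single friend is the well-connected Dan, apart from someone whose single friend knows nobody else, which is the "not just how many, but who you know" point above.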

  8. Getting a Job experiment
     • Personal connection
       • 17%: see often -> friends
       • 56%: see occasionally -> acquaintances
       • 28%: see rarely -> almost strangers?
     • What does this suggest?
       • Getting jobs via acquaintances
       • Why?
         • connect you to a different world
         • might have a lot of connections
     • “The Strength of Weak Ties”

     Who is the most connected?
     • “Connector”
       • How many friends does one have?
       • What kind of friends does one have?
     • How do you find Connectors?
       • How do you formulate it into a graph problem?

     Problem Formulation
     • Given (input)
       • People -> vertices
       • Connections -> edges
     • Find (output)
       • Person with the “best” Connector score
       • Part of the algorithm is to define the Connector score
     • Simplification
       • Don’t care about how strong/long/… the friendships/connections are

     Algorithm 1: Connector Score
     • Motivation:
       • “Friends” who are closer have higher scores
     • Friends of
       • distance 1, score = ?
       • distance 2, score = ?
       • distance 3, score = ?
       • distance d, score = ?

  9. Algorithm 1: Connector Score
     • Motivation:
       • “Friends” who are closer have higher scores
     • Friends of
       • distance 1, score = ?
       • distance 2, score = ?
       • distance 3, score = ?
       • distance d, score = 1/d, 1/d^2, …

     Algorithm 1: Adding the scores
     • How to enumerate the people so that we can add the scores?
       • BFS
     • Is score(x, y) the same as score(y, x)?

     Algorithm 2: Connector Score
     • Motivation:
       • Degree of separation (number of connections) to other people is small
     • Connector score:
       • Ideas?
       • Average degree of separation from a person to every other person
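Both Connector scores can be sketched on top of the same BFS. This is an illustrative reading of the slides, not a definitive implementation: Algorithm 1 is taken as the sum of 1/d^power over everyone reachable at distance d (with `power` a made-up parameter covering the 1/d and 1/d^2 options), and Algorithm 2 as the average distance to every other reachable person. The graph, function names, and the handling of isolated people are assumptions.

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest edge counts from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for friend in graph[person]:
            if friend not in dist:
                dist[friend] = dist[person] + 1
                queue.append(friend)
    return dist

def closeness_weighted_score(graph, person, power=1):
    """Algorithm 1 (one reading): add 1/d**power for each person at distance d >= 1."""
    dist = bfs_distances(graph, person)
    return sum(1 / d ** power for d in dist.values() if d > 0)

def average_separation_from(graph, person):
    """Algorithm 2 (one reading): mean distance to every other reachable person."""
    others = [d for d in bfs_distances(graph, person).values() if d > 0]
    if not others:                 # assumption: an isolated person gets the worst score
        return float("inf")
    return sum(others) / len(others)

# Hypothetical undirected friendship graph.
friends = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Ann", "Dan"],
    "Cat": ["Ann", "Dan"],
    "Dan": ["Bob", "Cat", "Eve"],
    "Eve": ["Dan"],
}

# Higher is better for Algorithm 1; lower is better for Algorithm 2.
best_1 = max(friends, key=lambda p: closeness_weighted_score(friends, p))
best_2 = min(friends, key=lambda p: average_separation_from(friends, p))
print(best_1, closeness_weighted_score(friends, best_1))  # Dan 3.5
print(best_2, average_separation_from(friends, best_2))   # Dan 1.25
```

In an undirected graph the distance from x to y equals the distance from y to x, so y contributes the same amount to x's score as x contributes to y's. The totals still differ from person to person.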
