Sociology and CS
  Philip Chan

Sociology Problems
  Problem 1
    How closely are people connected?
    “Small World”
  Problem 2
    Who is the most connected?
    “Connector”

Small World
  Problem 1

How closely are people connected? (Problem Understanding)
  Are people closely connected, not closely connected, isolated into groups, …

Degree of Separation
  The number of connections needed to reach another person

Milgram’s Experiment
  Stanley Milgram, psychologist
  Experiment in the late 1960s
  Chain letter to gather data
    Stockbroker in Boston
    160 people in Omaha, Nebraska
      Given a packet
      Add name and forward it to another person who might be closer to the stockbroker
  Partial “social network”
Small World
  Six degrees of separation
    Everyone is connected to everyone else by a few people, about 6 on average.
    Obama might be 6 connections away from you
  “Small world” phenomenon

Bacon Number
  Number of connections to reach actor Kevin Bacon
    http://oracleofbacon.org/
  Is a connection in this network different from the one in Milgram’s experiment?

Problem Formulation
  Given (input)
    People
    Connections/links/friendships
  Find (output)
    the average number of connections between two people
  Simplification
    don’t care about how long/strong/… the friendships are

Problem Formulation
  Formulate it into a graph problem (abstraction)
  Given (input)
    People -> vertices
    Connections -> edges
  Find (output)
    the average number of connections between two people -> ?
Problem Formulation
  Formulate it into a graph problem (abstraction)
  Given (input)
    People -> vertices
    Connections -> edges
  Find (output)
    the average number of connections between two people -> average shortest path length

Algorithm
  Ideas?

Algorithm
  Shortest Path
    Dijkstra’s algorithm
      Limitations? Single-source
  All-pair Shortest Path
    Floyd’s algorithm
      This could be overkill. Why?

Algorithm
  Unweighted edges
    Each edge has the same weight of 1
  Simpler algorithm?

Algorithm
  Breadth-first search (BFS)
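A minimal sketch of BFS as the simpler single-source shortest-path algorithm for unweighted edges. The function name `bfs_distances`, the dictionary-of-lists graph, and the five-person network are assumptions made for illustration; they do not come from the slides.

```python
from collections import deque

def bfs_distances(graph, source):
    """Return {vertex: number of edges on a shortest path from source}."""
    dist = {source: 0}           # also serves as the record of visited vertices
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:       # neighbours ("children") of u
            if v not in dist:    # the first visit is along a shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical friendship network (undirected: each edge is listed both ways).
friends = {
    "Ann": ["Bob", "Cal"],
    "Bob": ["Ann", "Dee"],
    "Cal": ["Ann", "Dee"],
    "Dee": ["Bob", "Cal", "Eve"],
    "Eve": ["Dee"],
}

print(bfs_distances(friends, "Ann"))
```

Because every edge has weight 1, the first time BFS reaches a vertex is already along a shortest path, so no priority queue is needed as in Dijkstra's algorithm.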
Algorithm
  Breadth-first search (BFS)
    Data structure to remember visited vertices
    Single source; repeat for each vertex to start
      shortestPath(x, y) = shortestPath(y, x)

Implementation
  Which data structure to represent a graph (vertices and edges)?
    Adjacency matrix
    Adjacency list
  Tradeoffs?
    Time
    Space
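One possible way to turn "single source; repeat for each vertex to start" into the average shortest path length is sketched here. The names `bfs_distances` and `average_separation`, the adjacency-list dictionary, and the three-person chain are illustrative assumptions. Unreachable pairs are simply skipped, which is also an assumption rather than something the slides specify.

```python
from collections import deque

def bfs_distances(graph, source):
    """Single-source shortest path lengths in an unweighted graph."""
    dist = {source: 0}                  # visited vertices and their distances
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_separation(graph):
    """Average shortest path length over all connected pairs of distinct people."""
    total = 0
    pairs = 0
    for source in graph:                # repeat BFS with each vertex as the start
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                total += d
                pairs += 1
    # Each unordered pair is counted twice (x to y and y to x); since the two
    # distances are equal in an undirected graph, the average is unchanged.
    return total / pairs if pairs else 0.0

# Hypothetical chain of three people: a - b - c
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(average_separation(chain))
```

For the chain, the pairwise distances are 1, 1, and 2, so the average is 4/3.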
Adjacency Matrix vs List
  Time
    Speed of what?
    Speed of key operations in the algorithm
      Algorithm: BFS
      Key operation: identifying children
  Space
    Amount of data in the problem
      Number of people/vertices
      Number of friends/edges each person has

Connector
  Problem 2
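The adjacency matrix versus adjacency list tradeoff discussed for Problem 1 can be illustrated with a small sketch. The four-person network and the helper names `children_matrix` and `children_list` are hypothetical; the point is only how the key BFS operation, identifying children, differs between the two representations.

```python
# Hypothetical 4-person network; vertices are numbered 0..3.
edges = [(0, 1), (0, 2), (2, 3)]
n = 4

# Adjacency matrix: n x n table, matrix[u][v] is True when u and v are friends.
matrix = [[False] * n for _ in range(n)]
# Adjacency list: for each vertex, only the friends it actually has.
adj_list = [[] for _ in range(n)]
for u, v in edges:
    matrix[u][v] = matrix[v][u] = True
    adj_list[u].append(v)
    adj_list[v].append(u)

def children_matrix(matrix, u):
    # Scans the whole row: n checks no matter how many friends u has.
    return [v for v, linked in enumerate(matrix[u]) if linked]

def children_list(adj_list, u):
    # The friends are already stored: work proportional to u's friend count.
    return list(adj_list[u])

print(children_matrix(matrix, 0), children_list(adj_list, 0))
print("matrix cells:", n * n, "list entries:", sum(len(a) for a in adj_list))
```

When each person has far fewer friends than there are people, the list stores less and finds children faster; the matrix needs n * n cells regardless of how many edges exist.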
Who is the most connected? (Problem understanding)
  What does that mean?

Revolutionary War
  Spreading the word that the British are going to attack
  Paul Revere vs William Dawes
    Revere was more successful than Dawes
    History books remember Revere more

Who is the most connected?
  What does that mean?
    The person with the most friends?
  Phone book experiment
    250 random surnames
    Number of friends with those surnames
  The number of friends varies widely
    Random sample: 9-118
    Conference in Princeton: 16-108
  How to formulate it into a graph problem?
Who is the most connected?
  What does that mean?
    The person with the most friends?
  How to formulate it into a graph problem?
    Output: the vertex with the highest degree

Who is the most connected?
  What does that mean?
    The person with the most friends?
  Are all friends equal?
    You have 100 friends
    Michelle Obama has only one friend: Barack Obama, who has a lot of friends
  Not just how many, but who you know

Milgram’s Experiment
  24 letters get to the stockbroker at home
    16 from Mr. Jacobs
  The rest get to the stockbroker at work
    Majority from Mr. Brown and Mr. Jones
  Overall, half of the letters came through these three people
  But Milgram started from a random set of people
    What does this suggest?

Milgram’s Experiment
  Average degree of separation is six, but:
    A small number of special people connect to many people in a few steps
      Small degree of separation
    The rest of us are connected to those special people
  Called “Connectors” by Gladwell

Getting a Job experiment
  Mark Granovetter, sociologist
  Experiment in 1974
    19%: formal means (advertisements, headhunters)
    20%: apply directly
    56%: personal connection
Getting a Job experiment
  Personal connection
    17%: see often (friends)
    56%: see occasionally (acquaintances)
    28%: see rarely (almost strangers?)
  What does this suggest?
    Getting jobs via acquaintances
  Why?
    connect you to a different world
    might have a lot of connections
  “The Strength of Weak Ties”

Who is the most connected?
  “Connector”
    How many friends does one have?
    What kind of friends does one have?
  How do you find Connectors?
    How do you formulate it into a graph problem?

Problem Formulation
  Given (input)
    People -> vertices
    Connections -> edges
  Find (output)
    Person with the “best” Connector score
      Part of the algorithm is to define the Connector score
  Simplification
    Don’t care about how strong/long/… the friendships/connections are

Algorithm 1: Connector Score
  Motivation:
    “Friends” who are closer have higher scores
  Friends of
    distance 1, score = ?
    distance 2, score = ?
    distance 3, score = ?
    distance d, score = ?
Algorithm 1: Connector Score
  Motivation:
    “Friends” who are closer have higher scores
  Friends of
    distance 1, score = ?
    distance 2, score = ?
    distance 3, score = ?
    distance d, score = 1/d, 1/d^2, …

Algorithm 1: Adding the scores
  How to enumerate the people so that we can add the scores?
    BFS
  Is score(x, y) the same as score(y, x)?

Algorithm 2: Connector Score
  Motivation:
    Degree of separation (number of connections) to other people is small
  Connector score:
    Ideas?
    Average degree of separation from a person to every other person
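The two Connector scores above could be sketched as follows. This is one reading of the slides, not a definitive implementation: the function names, the `power` parameter (1 for 1/d, 2 for 1/d^2), the handling of unreachable people, and the example network are all assumptions.

```python
from collections import deque

def bfs_distances(graph, source):
    """Enumerate people by distance from source using BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def connector_score_sum(graph, person, power=1):
    """Algorithm 1 sketch: add 1/d**power for every reachable person at distance d."""
    return sum(1 / d ** power
               for other, d in bfs_distances(graph, person).items()
               if other != person)

def connector_score_avg(graph, person):
    """Algorithm 2 sketch: average degree of separation to every reachable person."""
    ds = [d for other, d in bfs_distances(graph, person).items() if other != person]
    return sum(ds) / len(ds) if ds else float("inf")

# Hypothetical friendship network (undirected: each edge is listed both ways).
friends = {
    "Ann": ["Bob", "Cal"],
    "Bob": ["Ann", "Dee"],
    "Cal": ["Ann", "Dee"],
    "Dee": ["Bob", "Cal", "Eve"],
    "Eve": ["Dee"],
}

# Algorithm 1: higher is better. Algorithm 2: lower is better.
best_1 = max(friends, key=lambda p: connector_score_sum(friends, p))
best_2 = min(friends, key=lambda p: connector_score_avg(friends, p))
print(best_1, best_2)
```

In this small network both scores pick the same person, but they need not agree in general. Because distances in an undirected graph are symmetric, the contribution of y to x's score equals the contribution of x to y's score under either definition.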