Experiments (Guru) Competition-based Dollar-based • Boston University Slideshow Title Goes Here ClusterHire evimaria@cs.bu.edu
Experiments (Freelancer) • ClusterHire • Panels: Competition-based, Dollar-based
Experiments • Performance of CliqueGreedy • Freelancer: Nodes: 1764, Cliques: 1660 • Guru: Nodes: 721, Cliques: 520
Roadmap • Background • Team formation and cluster hires • Team formation in the presence of a social network • Inferring abilities of experts • Team formation in educational settings
Setting [LLT’09] • Experts (defining the set V, with |V| = n): every expert i is associated with a set of skills $X_i$ and a price $p_i$ • Tasks: every task T is associated with a set of skills (T) required for performing the task • A social network of experts (G = (V, E)): edges indicate ability to work well together • Team Formation: experts’ skills are known; participation of experts in teams is unknown; network structure is known
Team formation in the presence of a social network • Given a task and a set of experts organized in a network, find the subset of experts that can effectively perform the task • Task: set of required skills • Expert: has a set of skills • Network: represents strength of relationships
Coverage is NOT enough • T = {algorithms, java, graphics, python} • Alice: {algorithms} • Bob: {python} • Cynthia: {graphics, java} • David: {graphics} • Eleanor: {graphics, java, python} • A, E could perform the task if they could communicate • A, B, C form an effective group that can communicate • Communication: the members of the team must be able to efficiently communicate and work together
Problem definition (EffectiveTeam) • Given a task and a social network of individuals, find the subset (team) of individuals that can effectively perform the given task • Thesis: good teams are teams that have the necessary skills and can also communicate effectively
How to measure effective communication? • Diameter of the subgraph defined by the group members: the longest shortest path between any two nodes in the subgraph • Example: team {A, B, C} has diameter = 1; team {A, E} has diameter = infinity
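A minimal Python sketch of the diameter measure, assuming an unweighted network. The edge list is hypothetical (the slide's figure is not available) and `induced_diameter` is an illustrative name; the edges are chosen so that A, B, C form a triangle and A, E share no edge, which reproduces the two values on the slide.

```python
import math
from collections import deque

# Hypothetical edges, not taken from the slide's figure.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "E"), ("A", "D")]

def induced_diameter(edges, team):
    """Longest shortest path inside the subgraph induced by `team`.

    Returns math.inf when the induced subgraph is disconnected.
    """
    team = set(team)
    adj = {u: set() for u in team}
    for u, v in edges:
        if u in team and v in team:  # keep only edges between team members
            adj[u].add(v)
            adj[v].add(u)
    diameter = 0
    for source in team:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(team):  # some member is unreachable
            return math.inf
        diameter = max(diameter, max(dist.values()))
    return diameter
```

Note that paths through non-members do not count: A and E are connected through C in the full network, but the team {A, E} alone is disconnected.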
How to measure effective communication? • MST (minimum spanning tree) of the subgraph defined by the group members: the total weight of the edges of a tree that spans all the team nodes • Example: team {A, B, C} has MST = 2; team {A, E} has MST = infinity
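A minimal Python sketch of the MST measure, using Kruskal's algorithm on the subgraph induced by the team. The weighted edge list is hypothetical (unit weights, chosen to match the values on the slide), and `induced_mst_cost` is an illustrative name.

```python
import math

# Hypothetical unit-weight edges, not taken from the slide's figure.
WEIGHTED_EDGES = [("A", "B", 1), ("A", "C", 1), ("B", "C", 1),
                  ("C", "E", 1), ("A", "D", 1)]

def induced_mst_cost(weighted_edges, team):
    """MST cost of the subgraph induced by `team` (math.inf if disconnected)."""
    team = set(team)
    parent = {u: u for u in team}  # union-find forest over team members

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    cost, used = 0, 0
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):
        if u in team and v in team:
            root_u, root_v = find(u), find(v)
            if root_u != root_v:  # edge joins two components: keep it
                parent[root_u] = root_v
                cost += w
                used += 1
    # A spanning tree on |team| nodes has exactly |team| - 1 edges.
    return cost if used == len(team) - 1 else math.inf
```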
Problem definition (MinDiameter) • Given a task and a social network G of experts, find the subset (team) of experts that can perform the given task and that defines a subgraph of G with the minimum diameter • The problem is NP-hard
The RarestFirst algorithm • Find the rarest skill $\alpha_{rare}$ required for the task • $S_{rare}$: the group of people that have $\alpha_{rare}$ • Evaluate star graphs centered at individuals from $S_{rare}$ • Report the cheapest star • Running time: quadratic in the number of nodes • Approximation factor: 2 × OPT
The RarestFirst algorithm • T = {algorithms, java, graphics, python} • Skills: A {graphics, python, java}; B {algorithms, graphics}; E {algorithms, graphics, java}; C {python, java}; D {python} • $\alpha_{rare}$ = algorithms • $S_{rare}$ = {Bob, Eleanor} • Diameter = 2
The RarestFirst algorithm • T = {algorithms, java, graphics, python} • Skills: A {graphics, python, java}; B {algorithms, graphics}; E {algorithms, graphics, java}; C {python, java}; D {python} • $\alpha_{rare}$ = algorithms • $S_{rare}$ = {Bob, Eleanor} • Diameter = 1
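The steps of RarestFirst can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: the network is unweighted, the edge structure in `ADJ` is hypothetical (a path A-D-C-E-B, since the figure's edges are not available), and the function names are illustrative. The skill sets are the ones listed on the slides. The sketch scores each star by its radius, the largest distance from the center to a chosen skill holder.

```python
from collections import deque

SKILLS = {
    "A": {"graphics", "python", "java"},
    "B": {"algorithms", "graphics"},
    "C": {"python", "java"},
    "D": {"python"},
    "E": {"algorithms", "graphics", "java"},
}
# Hypothetical adjacency lists: the path A - D - C - E - B.
ADJ = {"A": ["D"], "D": ["A", "C"], "C": ["D", "E"],
       "E": ["C", "B"], "B": ["E"]}

def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def rarest_first(adj, skills, task):
    """Return (radius, team) of the cheapest star, or None if infeasible."""
    holders = {s: sorted(p for p in skills if s in skills[p]) for s in task}
    if any(not h for h in holders.values()):
        return None  # some required skill has no holder
    rare = min(sorted(task), key=lambda s: len(holders[s]))
    best = None
    for center in holders[rare]:
        dist = bfs_distances(adj, center)
        team, radius, feasible = {center}, 0, True
        for s in task:
            reachable = [p for p in holders[s] if p in dist]
            if not reachable:
                feasible = False
                break
            nearest = min(reachable, key=lambda p: dist[p])
            team.add(nearest)
            radius = max(radius, dist[nearest])
        if feasible and (best is None or radius < best[0]):
            best = (radius, team)
    return best
```

On this hypothetical network the star centered at B has radius 2 and the star centered at E has radius 1, so the reported team is {C, E}.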
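Analysis of RarestFirst • (Figure: skill groups $S_1, \dots, S_\ell, \dots, S_k$ around $S_{rare}$, with distances $d_1, d_\ell, d_k, d_{\ell k}$) • $D = \max\{d_\ell, d_k, d_{\ell k}\}$ • Fact: $OPT \ge d_\ell$ • Fact: $OPT \ge d_k$ • By the triangle inequality, $d_{\ell k} \le d_\ell + d_k$, so $D \le d_\ell + d_k \le 2 \cdot OPT$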
Problem definition (MinMST) • Given a task and a social network G of experts, find the subset (team) of experts that can perform the given task and that defines a subgraph of G with the minimum MST cost • The problem is NP-hard
The SteinerTree problem • Graph G(V, E) • Partition of V into V = {R, N}, where R is the set of required vertices • Find a subgraph G’ of G such that G’ contains all the required vertices (R) and MST(G’) is minimized
The EnhancedSteiner algorithm • T = {algorithms, java, graphics, python} • Skills: A {graphics, python, java}; B {algorithms, graphics}; E {algorithms, graphics, java}; C {python, java}; D {python} • (Figure: the expert graph with additional nodes labeled graphics, java, algorithms, python) • MST cost = 1
Exploiting the SteinerTree problem further • Graph G(V, E) • Partition of V into V = {R, N}, where R is the set of required vertices • Find a subgraph G’ of G such that G’ contains all the required vertices (R) and MST(G’) is minimized
The CoverSteiner algorithm • T = {algorithms, java, graphics, python} • Skills: A {graphics, python, java}; B {algorithms, graphics}; E {algorithms, graphics, java}; C {python, java}; D {python} • 1. Solve SetCover • 2. Solve Steiner • MST cost = 1
How good is CoverSteiner? • T = {algorithms, java, graphics, python} • Skills: A {graphics, python, java}; B {algorithms, graphics}; E {algorithms, graphics, java}; C {python, java}; D {python} • 1. Solve SetCover • 2. Solve Steiner • MST cost = infinity
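Step 1 of CoverSteiner can be illustrated with the standard greedy heuristic for set cover. This is a sketch that assumes the greedy heuristic is an acceptable stand-in for "Solve SetCover" (the slides do not say which solver is used); `greedy_set_cover` is an illustrative name and ties are broken alphabetically. Step 2 (the Steiner tree) is not shown.

```python
SKILLS = {
    "A": {"graphics", "python", "java"},
    "B": {"algorithms", "graphics"},
    "C": {"python", "java"},
    "D": {"python"},
    "E": {"algorithms", "graphics", "java"},
}

def greedy_set_cover(skills, task):
    """Repeatedly pick the expert who covers the most uncovered skills."""
    uncovered = set(task)
    cover = []
    while uncovered:
        # max() keeps the first maximum, so ties go to the alphabetically
        # smallest name because the candidates are sorted.
        best = max(sorted(skills), key=lambda p: len(skills[p] & uncovered))
        if not skills[best] & uncovered:
            raise ValueError("task cannot be covered by the given experts")
        cover.append(best)
        uncovered -= skills[best]
    return cover
```

Because this step never looks at the network, it can return experts who are far apart or not connected at all, which is how an infinite MST cost can arise in step 2.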
Experiments – Cardinality of teams • Dataset: DBLP graph (DB, Theory, ML, DM) • ~6000 authors • ~2000 features • Features: keywords appearing in papers • Tasks: subsets of keywords with different cardinality k
Example teams (I) • S. Brin, L. Page: The anatomy of a large-scale hypertextual Web search engine • Paolo Ferragina, Patrick Valduriez, H. V. Jagadish, Alon Y. Levy, Daniela Florescu, Divesh Srivastava, S. Muthukrishnan • P. Ferragina, J. Han, H. V. Jagadish, Kevin Chen-Chuan Chang, A. Gulli, S. Muthukrishnan, Laks V. S. Lakshmanan
Example teams (II) • J. Han, J. Pei, Y. Yin: Mining frequent patterns without candidate generation • F. Bronchi • A. Gionis, H. Mannila, R. Motwani
Extensions • Other measures of effective communication: density, number of times a team member participates as a mediator, information propagation • Other practical restrictions • Incorporate ability levels • Online team formation [ABCGL’12]
Setting • Pool of people/experts with different skills • People are connected through a social network • Stream of jobs/tasks arriving online • Jobs have some skill requirements • Goal: create teams on-the-fly for each job – Select the right team – Satisfy various criteria
Criteria • Fitness – E.g., if fitness is success rate, maximize the expected number of successful jobs – Depends on: people’s skills, ability to coordinate • Efficiency – Do not overload people • Fairness – Everybody should be involved in roughly the same number of jobs
Basic formulation • Stream of tasks arriving online, each with a vector of skills (e.g., 00010101, 10011101) • People, each with a vector of skills (e.g., 10010010, 10001101)
Basic formulation • Stream of tasks arriving online, each with a vector of skills (e.g., 00010101, 10011101) • People, each with a vector of skills (e.g., 10010010, 10001101) • Coordination cost
Basic formulation: skills and people • n people/experts • m skills • Each person has some skills (e.g., 10001101, 10010010)
Basic formulation: jobs & teams • Stream of k jobs/tasks (e.g., 00010101, 10011101) • A job requires some skills • k teams are created online • A team must cover all job skills
Basic formulation: jobs & teams • Stream of k jobs/tasks (e.g., 00010101, 10011101) • A job requires some skills • k teams are created online • A team must cover all job skills • Load of p: L(p) = total number of teams containing p
Coordination cost • Coordination cost measures the compatibility of the team members • Examples: – Degree of knowledge – Time-zone difference – Past collaboration • Select teams that minimize coordination cost: – Steiner-tree cost – Diameter – Sum of distances
Coordination cost • Steiner-tree cost • Diameter • Sum of distances
Conflicting goals • We want to create teams online that cover each job and minimize – Load – Unfairness – Coordination cost • How can we model all these requirements?
Our modeling approach • Set a desirable coordination cost upper bound B • Solve online [optimization problem not recovered; its annotations read: load of person i, team j covers job j, bounded coordination cost] • Must concurrently solve various combinatorial problems: – Set cover – Steiner tree – Online makespan minimization
Our modeling approach • Job 1: $Q_1 = \{p_2, p_4, p_5\}$ • Job 2: $Q_2 = \{p_1, p_4, p_6\}$ • Job 3: $Q_3 = \{p_3, p_4\}$ • Job 4: $Q_4 = \{p_1, p_5, p_7\}$ • Job 5: $Q_5 = \{p_2, p_3, p_4, p_5\}$ • Job 6: $Q_6 = \{p_3, p_5, p_6\}$ • Job 7: $Q_7 = \{p_1, p_2\}$ • Job 8: $Q_8 = \{p_1, p_2, p_3, p_4, p_7\}$ • Job 9: $Q_9 = \{p_3, p_4, p_5\}$ • Load of $p_1, \dots, p_7$: 4, 4, 5, 6, 5, 2, 2
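The load row of the table follows directly from the definition L(p) = number of teams containing p. A short Python check, using the nine teams from the slide; the names `TEAMS`, `PEOPLE`, and `loads` are illustrative.

```python
from collections import Counter

# Teams Q_1 .. Q_9 as listed on the slide.
TEAMS = {
    1: {"p2", "p4", "p5"},
    2: {"p1", "p4", "p6"},
    3: {"p3", "p4"},
    4: {"p1", "p5", "p7"},
    5: {"p2", "p3", "p4", "p5"},
    6: {"p3", "p5", "p6"},
    7: {"p1", "p2"},
    8: {"p1", "p2", "p3", "p4", "p7"},
    9: {"p3", "p4", "p5"},
}
PEOPLE = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]

def loads(teams, people):
    """L(p) for every person p: the number of teams that contain p."""
    counts = Counter(p for members in teams.values() for p in members)
    return {p: counts[p] for p in people}
```

The maximum load (here 6, for $p_4$) is the quantity an online makespan-style objective would try to keep small.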
Algorithm ExpLoad • At each time step t, when a task arrives: • Weight each person p by [formula not recovered; it depends on the load of p at time t] • Select team Q that – Covers all required skills – Satisfies [constraint not recovered] – Minimizes [objective not recovered] • Theorem. If we can solve this problem optimally, then competitive ratio = [value not recovered]. This is the best possible.
The ExpLoad algorithm • At each time step t, when a task arrives: • Weight each person p by [formula not recovered; it depends on the load of p at time t] • Select team Q that – Covers all required skills – Satisfies [constraint not recovered] – Minimizes [objective not recovered] • We can solve this problem only approximately. • Theorem. If we can solve this problem optimally, then competitive ratio = [value not recovered]. This is the best possible.
Roadmap • Background • Team formation and cluster hires • Team formation in the presence of a social network • Inferring abilities of experts • Team formation in educational settings
Setting [GLT’12] • Experts (defining the set V, with |V| = n): every expert i is associated with a set of skills $X_i$ and a price $p_i$ • Tasks: every task T is associated with a set of skills (T) required for performing the task • A social network of experts (G = (V, E)): edges indicate ability to work well together • Team Formation vs. Skill Attribution: experts’ skills are Known vs. Unknown; participation of experts in teams is Unknown vs. Known; network structure is Known vs. Irrelevant
The Skill-Attribution problem • Input: a set of teams and the tasks they performed – Team $T_1$ = {A, B} performed task $S_1$ = {algorithms, databases} – Team $T_2$ = {B, C, D} performed task $S_2$ = {algorithms, systems, programming} – Team $T_3$ = {A, B, C} performed task $S_3$ = {databases, algorithms, systems} • Question: what are the contributions of each team member? Team {A, B} appears to know algorithms and databases, but who knows algorithms and who knows databases? • Assumptions: – Complementarity: a team has a skill if at least one of its members has that skill – Parsimony: it is hard to imagine a world where all individuals have all skills
The Skill-Attribution problem • The input introduces a set of constraints – Team $T_1$ = {A, B} performed task $S_1$ = {algorithms, databases} – Team $T_2$ = {B, C, D} performed task $S_2$ = {algorithms, systems, programming} – Team $T_3$ = {A, B, C} performed task $S_3$ = {databases, algorithms, systems} • A skill assignment is consistent if for every team $T_i$ and every skill $s \in S_i$ there exists at least one expert in $T_i$ who has s • A skill assignment is consistent if and only if it is consistent for every skill separately • Focus on the single-skill attribution problem
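The consistency condition can be checked mechanically. In the Python sketch below, the teams and tasks are the ones on the slide (with "systems" spelled uniformly), while the skill assignments in the usage note and the name `is_consistent` are hypothetical illustrations.

```python
TEAMS = {
    "T1": {"A", "B"},
    "T2": {"B", "C", "D"},
    "T3": {"A", "B", "C"},
}
TASKS = {
    "T1": {"algorithms", "databases"},
    "T2": {"algorithms", "systems", "programming"},
    "T3": {"databases", "algorithms", "systems"},
}

def is_consistent(assignment, teams, tasks):
    """True if every skill of every task is held by some member of its team.

    `assignment` maps an expert to the set of skills attributed to them;
    experts missing from the mapping are treated as having no skills.
    """
    for name, members in teams.items():
        for skill in tasks[name]:
            if not any(skill in assignment.get(m, set()) for m in members):
                return False
    return True
```

For example, attributing {algorithms, databases, systems} to B and {programming} to C is consistent with all three teams, whereas dropping C's skill leaves "programming" in $S_2$ unexplained.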
Skill vectors and hitting sets • s = algorithms • Team $T_1$ = {A, B} • Team $T_2$ = {B, C} • Team $T_3$ = {C, D} • Team $T_4$ = {D, E} • A skill vector assigns skill s to individuals from V • Any consistent skill vector is a hitting set for the set system $(T_1, T_2, \dots, T_m, V)$ • Teams: subsets of individuals • V: universe of individuals
Minimum skill attribution (v 0.0) • For a single skill s and input teams $T_1, T_2, \dots, T_m$, find a consistent skill attribution with the minimum number of individuals possessing s • Example: s = algorithms; Team $T_1$ = {A, B}, Team $T_2$ = {B, C}, Team $T_3$ = {C, D}, Team $T_4$ = {D, E} • Minimum skill attribution: X* = {B, D} • Minimum skill attribution is as hard as the minimum hitting set problem • X* is a strictly parsimonious solution • One solution is not enough: near-optimal attributions are ignored, e.g., X’ = {A, C, D}, X’’ = {A, C, E}, X’’’ = {B, C, D}, X’’’’ = {B, C, E}
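For an instance this small, the minimum skill attribution can be found by exhaustive search. The sketch below is a brute-force illustration (exponential in |V|, so only suitable for toy inputs); `minimum_hitting_sets` is an illustrative name.

```python
from itertools import combinations

UNIVERSE = {"A", "B", "C", "D", "E"}
TEAMS = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]

def minimum_hitting_sets(universe, teams):
    """All smallest subsets of `universe` that intersect every team."""
    people = sorted(universe)
    for size in range(len(people) + 1):  # try the smallest sizes first
        hits = [set(c) for c in combinations(people, size)
                if all(set(c) & team for team in teams)]
        if hits:
            return hits
    return []
```

On the slide's example the search returns the single set {B, D}; the four sets X’ through X’’’’ have size 3 and are therefore never reported.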
Counting all consistent skill vectors • For a single skill s and input teams $T_1, T_2, \dots, T_m$, count for every individual in V the number of consistent skill vectors that individual participates in • Equivalent to counting hitting sets for input $(T_1, T_2, \dots, T_m, V)$ • #P-complete problem
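On a toy instance the counts can be obtained exactly by enumerating all $2^n$ subsets. The sketch below reuses the four-team example with V = {A, B, C, D, E}; the function name is illustrative, and the approach does not scale, which is consistent with the problem being #P-complete.

```python
from itertools import combinations

UNIVERSE = {"A", "B", "C", "D", "E"}
TEAMS = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]

def count_consistent_vectors(universe, teams):
    """Return (total, per_person) counts of hitting sets, by enumeration."""
    people = sorted(universe)
    total = 0
    per_person = {p: 0 for p in people}
    for size in range(len(people) + 1):
        for c in combinations(people, size):
            chosen = set(c)
            if all(chosen & team for team in teams):  # consistent vector
                total += 1
                for p in chosen:
                    per_person[p] += 1
    return total, per_person
```

Here 13 of the 32 subsets are consistent; B and D each appear in 10 of them, C in 9, and A and E in 8, so the counts rank B and D as the most likely holders of the skill.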
The lattice of skill vectors • Top: V (everyone has skill s) • Bottom: Ø (no one has skill s) • Each lattice element is a subset of V that possesses skill s • Consistent subsets: the minimal sets and the supersets of a minimal set • Inconsistent subsets: everything else
Counting all consistent skill vectors • Naïve Monte-Carlo sampling: C = 0; for i = 1…N: sample an element from the lattice; if it is consistent, C++; return (C/N) × $2^n$
Counting all consistent skill vectors • Naïve Monte-Carlo sampling: C = 0; for i = 1…N: sample an element from the lattice; if it is consistent, C++; return (C/N) × $2^n$ • Does not work when there are few consistent vectors
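The naïve estimator translates almost line for line into Python. The sketch assumes uniform sampling over all $2^n$ subsets; the seed parameter and the function name are illustrative additions.

```python
import random

UNIVERSE = {"A", "B", "C", "D", "E"}
TEAMS = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]

def naive_monte_carlo(universe, teams, num_samples, seed=0):
    """Estimate the number of consistent vectors as (C / N) * 2^n."""
    rng = random.Random(seed)
    people = sorted(universe)
    consistent = 0
    for _ in range(num_samples):
        # Uniform lattice element: include each person with probability 1/2.
        sample = {p for p in people if rng.random() < 0.5}
        if all(sample & team for team in teams):
            consistent += 1
    return consistent / num_samples * 2 ** len(people)
```

On the five-person example roughly 40% of the samples are consistent, so the estimate converges quickly to the exact value of 13. When consistent vectors are a tiny fraction of the lattice, almost every sample is wasted and C is often 0, which is the failure mode the slide points out.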
The ImportanceSampling algorithm • Assume we know the set of minimal sets that contain r: M(r) = $\{M_1, \dots, M_k\}$ • Sample consistent vectors from the space of hitting sets only (the supersets of the minimal sets) • Running time: polynomial in k
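One standard way to sample only from supersets of known minimal sets is the Karp-Luby estimator for the size of a union of sets. The Python sketch below uses it as a stand-in: it is an assumption that this matches the slide's ImportanceSampling in spirit, it estimates the total number of consistent vectors rather than the per-expert counts, and the function name is illustrative. The four minimal hitting sets are those of the example with teams {A,B}, {B,C}, {C,D}, {D,E}.

```python
import random

UNIVERSE = {"A", "B", "C", "D", "E"}
# Minimal hitting sets of the four-team example.
MINIMAL_SETS = [frozenset("BD"), frozenset("ACD"),
                frozenset("ACE"), frozenset("BCE")]

def union_of_supersets_estimate(universe, minimal_sets, num_samples, seed=0):
    """Karp-Luby style estimate of |union_i {X : M_i is a subset of X}|."""
    rng = random.Random(seed)
    people = sorted(universe)
    n = len(people)
    # Each M_i has exactly 2^(n - |M_i|) supersets within the universe.
    sizes = [2 ** (n - len(m)) for m in minimal_sets]
    total = sum(sizes)
    acc = 0.0
    for _ in range(num_samples):
        # Pick a minimal set proportionally to its number of supersets ...
        (m,) = rng.choices(minimal_sets, weights=sizes)
        # ... then a uniform superset of it (always a consistent vector).
        sample = set(m) | {p for p in people
                           if p not in m and rng.random() < 0.5}
        # Correct for vectors counted once per minimal set they contain.
        cover_count = sum(1 for other in minimal_sets if other <= sample)
        acc += 1.0 / cover_count
    return total * acc / num_samples
```

Every sample is consistent by construction, so no samples are wasted, and the cost per sample grows with the number k of minimal sets rather than with the fraction of consistent vectors in the lattice.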
ImportanceSampling Speedups • Run ImportanceSampling for all experts simultaneously • View the input as a bipartite graph and partition it into (almost) independent components • Cluster together experts that participate in identical sets of teams into super-experts
(Figure: bipartite graph of experts 1–5 and teams T1, T2, T3, T5, T6, T7, T8, partitioned into components A and B) • ConsistentVectors(1) = ConsistentVectors(1, A) × ConsistentVectors(B)
Ranking of experts • social networks: P. Mika (1), J. Golbeck (5), M. Richardson (5), P. Singla (19), L. Zhou (7), A. Java (19), L. Ding (2), T. Finin (2), A. Joshi (2), R. Agrawal (19) • privacy: A. Acquisti (1), M. S. Ackerman (3), L. Faith Cranor (3), B. Berendt (5), S. Spiekermann (5), O. Gunther (19), J. Grossklags (5), G. Hsieh (19), K. Vaniea (19), N. Sadeh (19) • graphs: C. Faloutsos (1), J. Kleinberg (2), J. Leskovec (2), R. Kumar (3), A. Tomkins (3), L. A. Adamic (3), E. Vee (4), P. Ginsparg (4), J. Gehrke (4), B. A. Huberman (3)
Roadmap • Background • Team formation and cluster hires • Team formation in the presence of a social network • Inferring abilities of experts • Team formation in educational settings
Team formation in educational settings [AGT’14] • Consider a class of students • Different ability levels (single scores) • Example: GRE, TOEFL, SAT, … • How to form study groups?
Team formation in educational settings [AGT’14] • Classical methods • Ability-based grouping: grouping students with similar abilities together • Pseudo-random grouping: grouping students based on some arbitrary ordering (alphabetically, FCFS, …)
Team formation in educational settings [AGT’14] • Classical methods • Ability-based grouping: grouping students with similar abilities together • Pseudo-random grouping: grouping students based on some arbitrary ordering (alphabetically, FCFS, …) • Which method to use? Inconclusive verdict from empirical studies (Kulik 92, Loveless 13, McPartland 87) • Let’s take a computational approach
Framework
Framework • Two groups of students in a study group • Students below the collective ability • Students above the collective ability
Framework • Two groups of students in a study group • Students below the collective ability: mostly learn from other members of the group • Students above the collective ability: mostly improve by teaching others
Framework • Two groups of students in a study group • Students below the collective ability: mostly learn from other members of the group • Students above the collective ability: mostly improve by teaching others • Our focus: maximize the number of such students