
Algorithms for Team Formation
Evimaria Terzi (Boston University)

Team-formation problems: given a task and a set of experts (organized in a network), find the subset of experts that can effectively perform the task.


  1. Experiments (Guru)
     [Figure labels: Competition-based, Dollar-based, ClusterHire]
     evimaria@cs.bu.edu

  2. Experiments (Freelancer)
     [Figure labels: Competition-based, Dollar-based, ClusterHire]

  3. Experiments
     • Performance of CliqueGreedy
     • Freelancer: 1764 nodes, 1660 cliques
     • Guru: 721 nodes, 520 cliques

  4. Roadmap
     • Background
     • Team formation and cluster hires
     • Team formation in the presence of a social network
     • Inferring abilities of experts
     • Team formation in educational settings

  5. Setting [LLT’09]
     • Experts (defining the set V, with |V| = n): every expert i is associated with a set of skills X_i and a price p_i
     • Tasks: every task T is associated with a set of skills (T) required for performing the task
     • A social network of experts, G = (V, E): edges indicate ability to work well together
     • Team Formation: experts’ skills are known; participation of experts in teams is unknown; network structure is known

  6. Team formation in the presence of a social network
     • Given a task and a set of experts organized in a network, find the subset of experts that can effectively perform the task
     • Task: set of required skills
     • Expert: has a set of skills
     • Network: represents strength of relationships

  7. Coverage is NOT enough
     • T = {algorithms, java, graphics, python}
     • Alice: {algorithms}; Bob: {python}; Cynthia: {graphics, java}; David: {graphics}; Eleanor: {graphics, java, python}
     • A, E could perform the task if they could communicate
     • A, B, C form an effective group that can communicate
     • Communication: the members of the team must be able to efficiently communicate and work together
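The coverage half of this slide can be checked mechanically. The sketch below uses the skill sets listed on the slide; the helper name `covers` is hypothetical, and the check deliberately ignores the network, which is exactly why coverage alone is not enough.

```python
# Skill sets copied from the slide; `covers` is a hypothetical helper name.
skills = {
    "Alice": {"algorithms"},
    "Bob": {"python"},
    "Cynthia": {"graphics", "java"},
    "David": {"graphics"},
    "Eleanor": {"graphics", "java", "python"},
}
task = {"algorithms", "java", "graphics", "python"}


def covers(team, task, skills):
    """Return True if the union of the team's skills contains every task skill."""
    covered = set()
    for member in team:
        covered |= skills[member]
    return task <= covered


# Both teams named on the slide cover T; a team lacking "algorithms" does not.
print(covers({"Alice", "Eleanor"}, task, skills))
print(covers({"Alice", "Bob", "Cynthia"}, task, skills))
print(covers({"Bob", "David"}, task, skills))
```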

  8. Problem definition (EffectiveTeam)
     • Given a task and a social network of individuals, find the subset (team) of individuals that can effectively perform the given task.
     • Thesis: good teams are teams that have the necessary skills and can also communicate effectively

  9. How to measure effective communication?
     • Diameter of the subgraph defined by the group members: the longest shortest path between any two nodes in the subgraph
     • [Figure: team {A, B, C} has diameter = 1; team {A, E} has diameter = ∞]

  10. How to measure effective communication?
     • MST (minimum spanning tree) of the subgraph defined by the group members: the total weight of the edges of a tree that spans all the team nodes
     • [Figure: team {A, E} has MST = ∞; team {A, B, C} has MST = 2]
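Both communication measures can be computed on the subgraph induced by a team. The following is a minimal sketch, not the authors' code: the edge list is an assumption (the figure's edges are not preserved), chosen so that team {A, B, C} is a unit-weight triangle and A and E share no edge, and the diameter is measured in hops.

```python
from collections import deque

INF = float("inf")

# Hypothetical unit-weight edges; only the node names come from the slides.
edges = {("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 1, ("C", "D"): 1, ("D", "E"): 1}


def diameter(edges, team):
    """Longest shortest path (in hops) inside the induced subgraph; inf if disconnected."""
    adj = {u: [] for u in team}
    for (u, v) in edges:
        if u in adj and v in adj:
            adj[u].append(v)
            adj[v].append(u)
    worst = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(adj):
            return INF  # some team member is unreachable
        worst = max(worst, max(dist.values()))
    return worst


def mst_cost(edges, team):
    """Kruskal's algorithm restricted to edges with both endpoints in the team."""
    parent = {u: u for u in team}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    cost, used = 0, 0
    for (u, v), w in sorted(edges.items(), key=lambda item: item[1]):
        if u in parent and v in parent and find(u) != find(v):
            parent[find(u)] = find(v)
            cost += w
            used += 1
    return cost if used == len(parent) - 1 else INF


print(diameter(edges, {"A", "B", "C"}), mst_cost(edges, {"A", "B", "C"}))
print(diameter(edges, {"A", "E"}), mst_cost(edges, {"A", "E"}))
```

Under these assumed edges the values match the ones shown on slides 9 and 10: the triangle has diameter 1 and MST cost 2, while {A, E} is disconnected in its induced subgraph.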

  11. Problem definition (MinDiameter)
     • Given a task and a social network G of experts, find the subset (team) of experts that can perform the given task and that induces a subgraph of G with the minimum diameter.
     • The problem is NP-hard

  12. The RarestFirst algorithm
     • Find the rarest skill α_rare required for the task
     • S_rare: the group of people that have α_rare
     • Evaluate star graphs centered at individuals from S_rare
     • Report the cheapest star
     • Running time: quadratic in the number of nodes
     • Approximation factor: 2 × OPT

  13. The RarestFirst algorithm
     • T = {algorithms, java, graphics, python}
     • Skills: A: {graphics, python, java}; B: {algorithms, graphics}; C: {python, java}; D: {python}; E: {algorithms, graphics, java}
     • α_rare = algorithms, S_rare = {Bob, Eleanor}
     • Diameter = 2

  14. The RarestFirst algorithm
     • T = {algorithms, java, graphics, python}
     • Skills: A: {graphics, python, java}; B: {algorithms, graphics}; C: {python, java}; D: {python}; E: {algorithms, graphics, java}
     • α_rare = algorithms, S_rare = {Bob, Eleanor}
     • Diameter = 1
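The RarestFirst steps listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the graph is unweighted, the cost of a star is the largest hop distance from its center to the closest holder of each skill, intermediate path nodes are not added to the team, and the adjacency list is hypothetical (only the skill sets come from the slides).

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def rarest_first(adj, skills, task):
    """Return (radius, leader, team) for the cheapest star, or None if infeasible."""
    support = {a: {v for v in adj if a in skills[v]} for a in task}
    if any(not holders for holders in support.values()):
        return None  # some required skill has no holder
    rare = min(sorted(task), key=lambda a: len(support[a]))  # rarest skill
    best = None
    for leader in sorted(support[rare]):  # star centers come from S_rare
        dist = bfs_distances(adj, leader)
        team, radius, feasible = {leader}, 0, True
        for a in sorted(task):
            reachable = sorted(v for v in support[a] if v in dist)
            if not reachable:
                feasible = False
                break
            closest = min(reachable, key=dist.get)  # nearest holder of skill a
            team.add(closest)
            radius = max(radius, dist[closest])
        if feasible and (best is None or radius < best[0]):
            best = (radius, leader, team)
    return best


skills = {
    "A": {"graphics", "python", "java"},
    "B": {"algorithms", "graphics"},
    "C": {"python", "java"},
    "D": {"python"},
    "E": {"algorithms", "graphics", "java"},
}
# Hypothetical edges: A-C, B-E, C-E, D-E.
adj = {"A": ["C"], "B": ["E"], "C": ["A", "E"], "D": ["E"], "E": ["B", "C", "D"]}
task = {"algorithms", "java", "graphics", "python"}
result = rarest_first(adj, skills, task)
print(result)
```

With these assumed edges the star centered at B has cost 2 and the star centered at E has cost 1, so E is reported as the leader, which mirrors the two values shown on slides 13 and 14.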

  15. Analysis of RarestFirst
     • D = max{d_ℓ, d_k, d_{ℓk}}
     • Fact: OPT ≥ d_ℓ
     • Fact: OPT ≥ d_k
     • D ≤ d_{ℓk} ≤ d_ℓ + d_k ≤ 2 × OPT
     • [Figure: skill groups S_1, …, S_rare, S_ℓ, S_k with distances d_1, d_ℓ, d_k, d_{ℓk}]

  16. Problem definition (MinMST)
     • Given a task and a social network G of experts, find the subset (team) of experts that can perform the given task and that induces a subgraph of G with the minimum MST cost.
     • The problem is NP-hard

  17. The SteinerTree problem
     • Graph G(V, E)
     • Partition of V into V = {R, N}, where R is the set of required vertices
     • Find a subgraph G’ of G such that G’ contains all the required vertices (R) and MST(G’) is minimized

  18. The EnhancedSteiner algorithm
     • T = {algorithms, java, graphics, python}
     • [Figure: graph with experts A: {graphics, python, java}; B: {algorithms, graphics}; C: {python, java}; D: {python}; E: {algorithms, graphics, java}, and skill labels graphics, java, algorithms, python]
     • MST cost = 1

  19. Exploiting the SteinerTree problem further
     • Graph G(V, E)
     • Partition of V into V = {R, N}, where R is the set of required vertices
     • Find a subgraph G’ of G such that G’ contains all the required vertices (R) and MST(G’) is minimized

  20. The CoverSteiner algorithm
     • T = {algorithms, java, graphics, python}
     • Skills: A: {graphics, python, java}; B: {algorithms, graphics}; C: {python, java}; D: {python}; E: {algorithms, graphics, java}
     • Step 1: solve SetCover
     • Step 2: solve Steiner
     • MST cost = 1

  21. How good is CoverSteiner?
     • T = {algorithms, java, graphics, python}
     • Skills: A: {graphics, python, java}; B: {algorithms, graphics}; C: {python, java}; D: {python}; E: {algorithms, graphics, java}
     • Step 1: solve SetCover
     • Step 2: solve Steiner
     • MST cost = ∞
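The failure mode on slide 21 comes from running the cover step without looking at the graph. The sketch below illustrates this under assumptions: the standard greedy heuristic stands in for "solve SetCover" (the slides do not say which set-cover routine is used), the adjacency list is hypothetical with A isolated from the other experts, and the Steiner step is replaced by a plain connectivity check, since no tree of finite cost exists when the chosen experts lie in different components.

```python
from collections import deque


def greedy_set_cover(task, skills):
    """Repeatedly pick the expert covering the most still-uncovered skills."""
    uncovered, chosen = set(task), []
    while uncovered:
        best = max(sorted(skills), key=lambda v: len(skills[v] & uncovered))
        if not skills[best] & uncovered:
            return None  # the task cannot be covered at all
        chosen.append(best)
        uncovered -= skills[best]
    return chosen


def connected_in_graph(adj, nodes):
    """True if all `nodes` lie in one connected component of the whole graph."""
    nodes = list(nodes)
    seen, queue = {nodes[0]}, deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return all(v in seen for v in nodes)


skills = {
    "A": {"graphics", "python", "java"},
    "B": {"algorithms", "graphics"},
    "C": {"python", "java"},
    "D": {"python"},
    "E": {"algorithms", "graphics", "java"},
}
# Hypothetical edges: B-E, C-E, C-D; A has no neighbours.
adj = {"A": [], "B": ["E"], "C": ["D", "E"], "D": ["C"], "E": ["B", "C"]}
task = {"algorithms", "java", "graphics", "python"}

cover = greedy_set_cover(task, skills)
print(cover, connected_in_graph(adj, cover))          # cover cannot be connected
print(connected_in_graph(adj, ["E", "C"]))            # an alternative cover that can
```

Here the greedy cover picks A and B, which cannot be joined by any tree, while the cover {E, C} is a single edge. A method that considers skills and distances together avoids committing to the disconnected cover.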

  22. Experiments – Cardinality of teams
     • Dataset: DBLP graph (DB, Theory, ML, DM), ~6000 authors, ~2000 features
     • Features: keywords appearing in papers
     • Tasks: subsets of keywords with different cardinality k

  23. Example teams (I)
     • S. Brin, L. Page: The anatomy of a large-scale hypertextual Web search engine
     • Paolo Ferragina, Patrick Valduriez, H. V. Jagadish, Alon Y. Levy, Daniela Florescu, Divesh Srivastava, S. Muthukrishnan
     • P. Ferragina, J. Han, H. V. Jagadish, Kevin Chen-Chuan Chang, A. Gulli, S. Muthukrishnan, Laks V. S. Lakshmanan

  24. Example teams (II)
     • J. Han, J. Pei, Y. Yin: Mining frequent patterns without candidate generation
     • F. Bronchi
     • A. Gionis, H. Mannila, R. Motwani

  25. Extensions
     • Other measures of effective communication: density, number of times a team member participates as a mediator, information propagation
     • Other practical restrictions: incorporate ability levels
     • Online team formation [ABCGL’12]

  26. Setting
     • Pool of people/experts with different skills
     • People are connected through a social network
     • Stream of jobs/tasks arriving online
     • Jobs have some skill requirements
     • Goal: create teams on-the-fly for each job
       – Select the right team
       – Satisfy various criteria

  27. Criteria
     • Fitness
       – E.g. if fitness is success rate, maximize the expected number of successful jobs
       – Depends on people's skills and their ability to coordinate
     • Efficiency
       – Do not load people very much
     • Fairness
       – Everybody should be involved in roughly the same number of jobs

  28. Basic formulation
     • Stream of tasks arriving online
     • [Figure: skill vectors 00010101, 10011101, 10010010, 10001101]

  29. Basic formulation
     • Stream of tasks arriving online
     • Coordination cost
     • [Figure: skill vectors 00010101, 10011101, 10010010, 10001101]

  30. Basic formulation
     • Stream of tasks arriving online
     • Coordination cost
     • [Figure: skill vectors 00010101, 10011101, 10010010, 10001101]

  31. Basic formulation: Skills and people
     • n people/experts
     • m skills
     • Each person has some skills
     • [Figure: skill vectors 10001101, 10010010]

  32. Basic formulation: jobs & teams
     • Stream of k jobs/tasks
     • A job requires some skills
     • k teams are created online
     • A team must cover all job skills
     • [Figure: skill vectors 00010101, 10011101, 10010010, 10001101]

  33. Basic formulation: jobs & teams
     • Stream of k jobs/tasks
     • A job requires some skills
     • k teams are created online
     • A team must cover all job skills
     • Load of p: L(p) = total number of teams that include p
     • [Figure: skill vectors 00010101, 10011101, 10010010, 10001101]

  34. Coordination cost
     • Coordination cost measures the compatibility of the team members
     • Examples:
       – Degree of knowledge
       – Time-zone difference
       – Past collaboration
     • Select teams that minimize coordination cost:
       – Steiner-tree cost
       – Diameter
       – Sum of distances

  35. Coordination cost
     • Steiner-tree cost
     • Diameter
     • Sum of distances

  36. Conflicting goals
     • We want to create teams online that cover each job and minimize
       – Load
       – Unfairness
       – Coordination cost
     • How can we model all these requirements?

  37. Our modeling approach
     • Set a desirable coordination cost upper bound B
     • Online solve: [optimization problem not preserved; its annotations read: load of person i; team j covers job j; bounded coordination cost]
     • Must concurrently solve various combinatorial problems:
       – Set cover
       – Steiner tree
       – Online makespan minimization

  38. Our modeling approach
     • Job 1: Q_1 = {p_2, p_4, p_5}
     • Job 2: Q_2 = {p_1, p_4, p_6}
     • Job 3: Q_3 = {p_3, p_4}
     • Job 4: Q_4 = {p_1, p_5, p_7}
     • Job 5: Q_5 = {p_2, p_3, p_4, p_5}
     • Job 6: Q_6 = {p_3, p_5, p_6}
     • Job 7: Q_7 = {p_1, p_2}
     • Job 8: Q_8 = {p_1, p_2, p_3, p_4, p_7}
     • Job 9: Q_9 = {p_3, p_4, p_5}
     • Load: p_1 = 4, p_2 = 4, p_3 = 5, p_4 = 6, p_5 = 5, p_6 = 2, p_7 = 2
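The load row of this table follows directly from the definition L(p) = number of teams that include p. A minimal sketch, using the nine teams from the slide (the function name `loads` is hypothetical):

```python
from collections import Counter

# Teams Q_1 .. Q_9 as listed on the slide.
teams = {
    1: {"p2", "p4", "p5"},
    2: {"p1", "p4", "p6"},
    3: {"p3", "p4"},
    4: {"p1", "p5", "p7"},
    5: {"p2", "p3", "p4", "p5"},
    6: {"p3", "p5", "p6"},
    7: {"p1", "p2"},
    8: {"p1", "p2", "p3", "p4", "p7"},
    9: {"p3", "p4", "p5"},
}


def loads(teams):
    """L(p): the number of teams in which person p participates."""
    counter = Counter()
    for members in teams.values():
        counter.update(members)
    return counter


load = loads(teams)
print([load[f"p{i}"] for i in range(1, 8)])  # per-person loads for p1..p7
print(max(load.values()))                    # the maximum load over all people
```

The maximum load is the quantity that makespan-style objectives try to keep small, which is why p_4, with six jobs, is the bottleneck in this example.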

  39. Algorithm ExpLoad
     • At each time step t, when a task arrives:
       – Weight each person p by [formula not preserved; annotated "load of p at time t"]
       – Select team Q that covers all required skills, satisfies [constraint not preserved], and minimizes [objective not preserved]
     • Theorem. If we can solve this problem optimally, then competitive ratio = [value not preserved]. This is the best possible.

  40. The ExpLoad algorithm
     • At each time step t, when a task arrives:
       – Weight each person p by [formula not preserved; annotated "load of p at time t"]
       – Select team Q that covers all required skills, satisfies [constraint not preserved], and minimizes [objective not preserved]
     • Theorem. If we can solve this problem optimally, then competitive ratio = [value not preserved]. This is the best possible.

  41. The ExpLoad algorithm
     • At each time step t, when a task arrives:
       – Weight each person p by [formula not preserved; annotated "load of p at time t"]
       – Select team Q that covers all required skills, satisfies [constraint not preserved], and minimizes [objective not preserved]
     • We can solve this problem only approximately.
     • Theorem. If we can solve this problem optimally, then competitive ratio = [value not preserved]. This is the best possible.

  42. Roadmap
     • Background
     • Team formation and cluster hires
     • Team formation in the presence of a social network
     • Inferring abilities of experts
     • Team formation in educational settings

  43. Setting [GLT’12]
     • Experts (defining the set V, with |V| = n): every expert i is associated with a set of skills X_i and a price p_i
     • Tasks: every task T is associated with a set of skills (T) required for performing the task
     • A social network of experts, G = (V, E): edges indicate ability to work well together
     • Experts’ skills: known (Team Formation), unknown (Skill Attribution)
     • Participation of experts in teams: unknown (Team Formation), known (Skill Attribution)
     • Network structure: known (Team Formation), irrelevant (Skill Attribution)

  44. The Skill-Attribution problem
     • Input: a set of teams and the tasks they performed
       – Team T_1 = {A, B} performed task S_1 = {algorithms, databases}
       – Team T_2 = {B, C, D} performed task S_2 = {algorithms, systems, programming}
       – Team T_3 = {A, B, C} performed task S_3 = {databases, algorithms, systems}
     • Question: what are the contributions of each team member? Team {A, B} appears to know algorithms and databases, but who knows algorithms and who knows databases?
     • Assumptions:
       – Complementarity: a team has a skill if at least one of its members has that skill
       – Parsimony: it is hard to imagine a world where all individuals have all skills

  45. The Skill-Attribution problem
     • The input introduces a set of constraints
       – Team T_1 = {A, B} performed task S_1 = {algorithms, databases}
       – Team T_2 = {B, C, D} performed task S_2 = {algorithms, systems, programming}
       – Team T_3 = {A, B, C} performed task S_3 = {databases, algorithms, systems}
     • A skill assignment is consistent if, for every team T_i and every skill s ∈ S_i, there exists at least one expert in T_i who has s.
     • A skill assignment is consistent if and only if it is consistent for every skill separately
     • Focus on the single-skill attribution problem
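The consistency condition translates directly into a check over teams and their tasks. The sketch below uses the three teams from the slide; the two candidate assignments and the function name `is_consistent` are hypothetical illustrations.

```python
# Teams and tasks as given on the slide (skill "systems" spelled uniformly here).
teams = {"T1": {"A", "B"}, "T2": {"B", "C", "D"}, "T3": {"A", "B", "C"}}
tasks = {
    "T1": {"algorithms", "databases"},
    "T2": {"algorithms", "systems", "programming"},
    "T3": {"databases", "algorithms", "systems"},
}


def is_consistent(assignment, teams, tasks):
    """Every skill of every task must be held by at least one member of its team."""
    for name, members in teams.items():
        for skill in tasks[name]:
            if not any(skill in assignment.get(m, set()) for m in members):
                return False
    return True


# Hypothetical assignments, for illustration only.
good = {"A": {"databases"}, "B": {"algorithms"}, "C": {"systems"}, "D": {"programming"}}
bad = {"A": {"databases"}, "B": {"algorithms"}, "C": {"systems"}}  # nobody in T2 has "programming"

print(is_consistent(good, teams, tasks))
print(is_consistent(bad, teams, tasks))
```

Because the check loops over skills independently, it also shows why the problem decomposes: an assignment is consistent exactly when it passes the check for each skill on its own.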

  46. Skill vectors and hitting sets
     • s = algorithms
     • Teams: T_1 = {A, B}, T_2 = {B, C}, T_3 = {C, D}, T_4 = {D, E}
     • A skill vector assigns skill s to individuals from V
     • Any consistent skill vector is a hitting set for the set system (T_1, T_2, …, T_m, V), where the teams are subsets of individuals and V is the universe of individuals

  47. Minimum skill attribution (v 0.0)
     • For a single skill s and input teams T_1, T_2, …, T_m, find a consistent skill attribution with the minimum number of individuals possessing s.
     • Example: s = algorithms; T_1 = {A, B}, T_2 = {B, C}, T_3 = {C, D}, T_4 = {D, E}
     • Minimum skill attribution: X* = {B, D}
     • Minimum skill attribution is as hard as the minimum hitting set problem
     • X* is a strictly parsimonious solution
     • One solution is not enough: near-optimal attributions are ignored, e.g. X’ = {A, C, D}, X’’ = {A, C, E}, X’’’ = {B, C, D}, X’’’’ = {B, C, E}
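For an instance this small, the minimum skill attribution can be found by exhaustive search over subsets of increasing size. This is only a brute-force sketch for the slide's four-team example (the problem is as hard as minimum hitting set in general); the function name is hypothetical.

```python
from itertools import combinations

universe = {"A", "B", "C", "D", "E"}
teams = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]


def minimum_hitting_sets(universe, teams):
    """Return all smallest subsets of `universe` that intersect every team."""
    members = sorted(universe)
    for size in range(len(members) + 1):
        hits = [
            set(subset)
            for subset in combinations(members, size)
            if all(set(subset) & team for team in teams)
        ]
        if hits:
            return hits  # first non-empty size level is the minimum
    return []


print(minimum_hitting_sets(universe, teams))
```

On this input the search stops at size 2 with the single solution {B, D}, the X* of the slide; the three-element attributions listed there are consistent but are never reported by a minimum-only search.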

  48. Counting all consistent skill vectors
     • For a single skill s and input teams T_1, T_2, …, T_m, count for every individual in V the number of consistent skill vectors in which that individual participates.
     • Equivalent to counting hitting sets for input (T_1, T_2, …, T_m, V)
     • #P-complete problem
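On a toy instance the counts can be obtained by enumerating all 2^n subsets, which makes the quantity concrete even though it does not scale. The sketch below reuses the four teams from the hitting-set example on slide 46; the function name is hypothetical.

```python
from itertools import combinations

universe = {"A", "B", "C", "D", "E"}
teams = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]


def count_consistent_vectors(universe, teams):
    """Return (total, per-person counts) over all hitting sets, by full enumeration."""
    members = sorted(universe)
    counts = {v: 0 for v in members}
    total = 0
    for size in range(len(members) + 1):
        for subset in combinations(members, size):
            chosen = set(subset)
            if all(chosen & team for team in teams):  # consistent skill vector
                total += 1
                for v in chosen:
                    counts[v] += 1
    return total, counts


total, counts = count_consistent_vectors(universe, teams)
print(total, counts)
```

Ranking individuals by these counts gives a softer answer than a single minimum solution: B and D appear in the most consistent vectors, but A, C, and E are not ruled out.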

  49. The lattice of skill vectors
     • [Figure: lattice of subsets of V, from Ø (no one has skill s) to V (everyone has skill s); each element is a subset of V that possesses skill s; labels: minimal sets, supersets of a minimal set (consistent subsets), inconsistent subsets]

  50. Counting all consistent skill vectors
     • Naïve Monte-Carlo sampling:
       – C = 0
       – for i = 1…N: sample an element from the lattice; if it is consistent, C++
       – return (C/N) × 2^n
     • [Figure: lattice from Ø (no one has skill s) to V (everyone has skill s)]

  51. Counting all consistent skill vectors
     • Naïve Monte-Carlo sampling:
       – C = 0
       – for i = 1…N: sample an element from the lattice; if it is consistent, C++
       – return (C/N) × 2^n
     • Does not work when there are few consistent vectors
     • [Figure: lattice from Ø (no one has skill s) to V (everyone has skill s)]
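The naïve sampler from the slide can be written in a few lines. This sketch assumes that "sample an element from the lattice" means drawing a uniformly random subset of V, and it reuses the four-team example from slide 46; the function name and the choice of seed are hypothetical.

```python
import random

universe = {"A", "B", "C", "D", "E"}
teams = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]


def monte_carlo_count(universe, teams, num_samples, rng):
    """Estimate the number of consistent skill vectors as (C / N) * 2^n."""
    members = sorted(universe)
    consistent = 0
    for _ in range(num_samples):
        # Uniform random subset: each person is included with probability 1/2.
        sample = {v for v in members if rng.random() < 0.5}
        if all(sample & team for team in teams):
            consistent += 1
    return consistent / num_samples * 2 ** len(members)


estimate = monte_carlo_count(universe, teams, 20000, random.Random(0))
print(estimate)  # should land near the exact count for this small instance
```

The estimate is reasonable here because a large fraction of the 32 subsets is consistent. When consistent vectors are a tiny fraction of the lattice, almost every sample is rejected, which is the weakness noted on slide 51.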

  52. The ImportanceSampling algorithm
     • Assume we know the set of minimal sets that contain r: M(r) = {M_1, …, M_k}
     • Sample consistent vectors from the space of hitting sets only
     • Running time: polynomial in k
     • [Figure: lattice from Ø to V, with the supersets of the minimal sets marked]

  53. ImportanceSampling Speedups
     • Run ImportanceSampling for all experts simultaneously
     • View the input as a bipartite graph and partition it into (almost) independent components
     • Cluster together experts that participate in identical sets of teams into super-experts

  54. [Figure labels: A, B; 1, 2, 3, 4, 5; T1, T2, T3, T5, T6, T7, T8]
     • ConsistentVectors(1) = ConsistentVectors(1, A) × ConsistentVectors(B)

  55. Ranking of experts
     • social networks: P. Mika (1), J. Golbeck (5), M. Richardson (5), P. Singla (19), L. Zhou (7), A. Java (19), L. Ding (2), T. Finin (2), A. Joshi (2), R. Agrawal (19)
     • privacy: A. Acquisti (1), M. S. Ackerman (3), L. Faith Cranor (3), B. Berendt (5), S. Spiekermann (5), O. Gunther (19), J. Grossklags (5), G. Hsieh (19), K. Vaniea (19), N. Sadeh (19)
     • graphs: C. Faloutsos (1), J. Kleinberg (2), J. Leskovec (2), R. Kumar (3), A. Tomkins (3), L. A. Adamic (3), E. Vee (4), P. Ginsparg (4), J. Gehrke (4), B. A. Huberman (3)

  56. Roadmap
     • Background
     • Team formation and cluster hires
     • Team formation in the presence of a social network
     • Inferring abilities of experts
     • Team formation in educational settings

  57. Team formation in educational settings [AGT’14]
     • Consider a class of students
       – Different ability levels (single scores)
       – Example: GRE, TOEFL, SAT, …
     • How to form study groups?

  58. Team formation in educational settings [AGT’14]
     • Classical methods
       – Ability-Based Grouping: grouping students with similar abilities together
       – Pseudo-Random Grouping: grouping students based on some arbitrary ordering (alphabetically, FCFS, …)

  59. Team formation in educational settings [AGT’14]
     • Classical methods
       – Ability-Based Grouping: grouping students with similar abilities together
       – Pseudo-Random Grouping: grouping students based on some arbitrary ordering (alphabetically, FCFS, …)
     • Which method to use? Inconclusive verdict from empirical studies (Kulik 92, Loveless 13, McPartland 87)
     • Let’s take a computational approach
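The two classical methods differ only in the ordering applied before the class is cut into groups. A minimal sketch, assuming fixed-size groups and a single score per student; the student names, scores, and function names are all hypothetical.

```python
# Hypothetical class roster: name -> single ability score.
scores = {"Ann": 90, "Ben": 55, "Cat": 85, "Dan": 60, "Eve": 70, "Fay": 65}


def chunk(ordered, group_size):
    """Cut an ordered list of students into consecutive groups."""
    return [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]


def ability_based_grouping(scores, group_size):
    """Group students with similar abilities: sort by score, then cut."""
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return chunk(ordered, group_size)


def alphabetical_grouping(scores, group_size):
    """Pseudo-random grouping: an arbitrary (here alphabetical) order, then cut."""
    return chunk(sorted(scores), group_size)


print(ability_based_grouping(scores, 3))
print(alphabetical_grouping(scores, 3))
```

Ability-based grouping puts the three strongest students together, while the alphabetical order mixes strong and weak students in each group; which of the two helps more is the question the computational approach sets out to answer.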

  60. Framework

  61. Framework
     • Two groups of students in a study group
       – Students below the collective ability
       – Students above the collective ability

  62. Framework
     • Two groups of students in a study group
       – Students below the collective ability: mostly learn from other members of the group
       – Students above the collective ability: mostly improve by teaching others

  63. Framework
     • Two groups of students in a study group
       – Students below the collective ability: mostly learn from other members of the group
       – Students above the collective ability: mostly improve by teaching others
     • Our focus: maximize the number of such students
