A Correlated Worker Model for Grouped, Imbalanced and Multitask Data


  1. A Correlated Worker Model for Grouped, Imbalanced and Multitask Data
     An T. Nguyen (presenter), Byron C. Wallace, Matthew Lease
     University of Texas at Austin. UAI 2016.

  2. Overview
     ◮ A model of workers in crowdsourcing.
     ◮ Idea: transfer knowledge of worker quality.
     ◮ Variational EM learning.
     ◮ Applied to two datasets:
       ◮ Biomedical citation screening: imbalanced, grouped.
       ◮ Galaxy classification: multiple tasks.

  3. Background
     ◮ Crowdsourcing: collect labels quickly at low cost.
     ◮ But (usually) at lower quality.
     ◮ Common solution: collect 5 labels for each instance, then aggregate them.
     ◮ Most previous work: improve (the estimates of) labels.
     ◮ Our work: improve (the estimates of) worker qualities.

  4. Motivation for estimating worker qualities
     ◮ Diagnostic insights.
     ◮ Helping workers improve.
     ◮ Intelligent task routing (assigning work to workers).

  5. Worker Quality Measure
     ◮ Accuracy: simple but not enough.
       → Confusion matrix: Pr(worker label | true label).
     ◮ Binary task (this work):
       ◮ Sensitivity: Pr(positive | positive).
       ◮ Specificity: Pr(negative | negative).
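
For concreteness, a minimal sketch (not from the paper) of how one worker's sensitivity and specificity would be computed against gold labels; the function name and toy labels are made up:

```python
import numpy as np

def sensitivity_specificity(worker_labels, gold_labels):
    """Sensitivity = Pr(worker says 1 | gold 1); specificity = Pr(worker says 0 | gold 0)."""
    worker = np.asarray(worker_labels)
    gold = np.asarray(gold_labels)
    sen = np.mean(worker[gold == 1] == 1)  # fraction of true positives labeled positive
    spe = np.mean(worker[gold == 0] == 0)  # fraction of true negatives labeled negative
    return sen, spe

# Toy example: the worker misses one positive and mislabels one negative.
sen, spe = sensitivity_specificity([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
print(sen, spe)  # 0.666..., 0.5
```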

  6. Setting
     Input:
     ◮ Crowd labels for each instance.
     ◮ No instance-level features (future work).
     Output:
     ◮ For each worker: sensitivity and specificity.
     Evaluation metric:
     ◮ RMSE of the estimated sensitivity and specificity.
     ◮ Gold sensitivity/specificity: computed from gold labels on the whole dataset.
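
A short sketch of the metric with toy numbers (all values invented for illustration); the same formula applies to specificity:

```python
import numpy as np

est_sen = np.array([0.70, 0.90, 0.55])    # model estimates, one per worker
gold_sen = np.array([0.75, 0.85, 0.60])   # computed from gold labels
rmse = np.sqrt(np.mean((est_sen - gold_sen) ** 2))
print(rmse)  # 0.05
```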

  7. Challenges
     ◮ Sparsity: many workers label only a few instances.
     ◮ Imbalance: far more negative than positive instances, which makes sensitivity difficult to estimate.

  8. Idea: transfer knowledge of worker quality
     ◮ Between classes.
     ◮ Within a group.
     ◮ Across multiple tasks.

  9. Previous models (Raykar et al. 2010; Liu & Wang 2012; Kim & Ghahramani 2012)
     Hidden variables:
     ◮ The true label for each instance.
     ◮ The confusion matrix (sensitivity + specificity) for each worker.
     Assumptions:
     ◮ Sensitivity and specificity are independent parameters.
     ◮ A single group of workers.
     ◮ Multiple tasks: independent models.

  10. Our Model
      Assumptions:
      ◮ Sensitivity and specificity are correlated.
      ◮ Multiple groups of workers (group membership is known).
      ◮ Sensitivity and specificity in multiple tasks are correlated.

  11. The Base Model (i indexes instances, j indexes workers)
      (U_j, V_j) ∼ N(µ, C)
      Z_i ∼ Ber(θ)
      L_ij | Z_i = 1 ∼ Ber(S(U_j))
      L_ij | Z_i = 0 ∼ Ber(S(V_j))
      (S is the logistic function; U_j and V_j are worker j's logit sensitivity and logit specificity.)
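
To make the generative story concrete, here is a minimal sampling sketch. All parameter values are made up, S is taken to be the logistic sigmoid, and labels are coded 1 = positive, so under the specificity reading of V_j a negative instance receives a positive label with probability 1 − S(V_j):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 100                        # workers, instances
mu = np.array([1.5, 2.0])             # prior mean of (logit sen, logit spe)
C = np.array([[1.0, 0.5],             # correlated sensitivity/specificity
              [0.5, 1.0]])
theta = 0.1                           # Pr(Z_i = 1): imbalanced data

UV = rng.multivariate_normal(mu, C, size=m)  # (U_j, V_j) ~ N(mu, C)
sen = 1 / (1 + np.exp(-UV[:, 0]))            # S(U_j): sensitivity
spe = 1 / (1 + np.exp(-UV[:, 1]))            # S(V_j): specificity
Z = rng.binomial(1, theta, size=n)           # true labels, Z_i ~ Ber(theta)
L = np.where(Z[:, None] == 1,
             rng.binomial(1, sen, size=(n, m)),       # positives: Ber(S(U_j))
             rng.binomial(1, 1 - spe, size=(n, m)))   # negatives: positive label w.p. 1 - S(V_j)
```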

  12. Extensions
      1. Worker groups:
      ◮ Group membership is known.
      ◮ Model each group k with its own Normal distribution, N(µ_k, C_k).
      2. Multiple tasks:
      ◮ Assume two tasks.
      ◮ (Sen_1, Spe_1) correlates with (Sen_2, Spe_2).
      ◮ (U_1, V_1, U_2, V_2) ∼ N(µ, C)
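
A small sketch of the two-task prior (all covariance values invented for illustration): one 4-dimensional Gaussian ties a worker's (logit sensitivity, logit specificity) pairs across tasks, so labels in one task inform quality estimates in the other:

```python
import numpy as np

mu = np.zeros(4)                        # mean of (U_1, V_1, U_2, V_2)
C = np.array([[1.0, 0.3, 0.6, 0.2],     # within-task sen/spe correlation (0.3)
              [0.3, 1.0, 0.2, 0.6],     # and cross-task correlation (0.6)
              [0.6, 0.2, 1.0, 0.3],     # couple all four quality parameters
              [0.2, 0.6, 0.3, 1.0]])
u1, v1, u2, v2 = np.random.default_rng(1).multivariate_normal(mu, C)
```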

  13. Inference for the Base Model
      Approach: variational EM.
      ◮ E-step: infer Pr(U_{1..m}, V_{1..m}, Z_{1..n} | L).
      ◮ M-step: maximize over the parameters µ, C, θ.
      Variational inference:
      ◮ Approximate the (complex) posterior by a simpler function q.
      ◮ Minimizing KL(q || p) is equivalent to maximizing a lower bound on the log-likelihood.

  14. Inference: mean-field assumptions
      ◮ q factorizes:
        q(U_{1..m}, V_{1..m}, Z_{1..n}) = ∏_{j=1}^{m} q(U_j) q(V_j) ∏_{i=1}^{n} q(Z_i)
      ◮ Factors:
        q(U_j) = N(µ̃_{uj}, σ̃²_{uj})
        q(V_j) = N(µ̃_{vj}, σ̃²_{vj})
        q(Z_i) = Ber(θ̃_i)
      ◮ Optimize with respect to {µ̃_{uj}, σ̃²_{uj}, µ̃_{vj}, σ̃²_{vj} | j = 1..m} and {θ̃_i | i = 1..n}.
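
As a data-structure sketch (sizes and initial values assumed, names my own), the variational parameters are just flat arrays: one Gaussian per worker-side variable and one Bernoulli per instance:

```python
import numpy as np

m, n = 20, 100
mu_u, s2_u = np.zeros(m), np.ones(m)   # q(U_j) = N(mu_u[j], s2_u[j])
mu_v, s2_v = np.zeros(m), np.ones(m)   # q(V_j) = N(mu_v[j], s2_v[j])
theta_t = np.full(n, 0.5)              # q(Z_i) = Ber(theta_t[i])
```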

  15. Optimization
      Coordinate descent: update one variable at a time.
      Update Z_i:
        q*(Z_i = 1) ∝ exp( log Ber(1 | θ) + ∑_j E_{U_j ∼ q(U_j)} [log Ber(L_ij | S(U_j))] )
        q*(Z_i = 0) ∝ exp( log Ber(0 | θ) + ∑_j E_{V_j ∼ q(V_j)} [log Ber(L_ij | S(V_j))] )
      Intuition:
      ◮ Z_i ≈ prior + ∑ E[crowd labels for i],
      ◮ where the expectation is taken with respect to worker quality.
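
A minimal sketch of this update. The function name, the Monte Carlo approximation of the expectations, and the label coding carried over from the earlier sampler are all my assumptions, not the paper's exact derivation:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def log_ber(l, p):
    return l * np.log(p) + (1 - l) * np.log(1 - p)

def update_z(L_i, mu_u, s2_u, mu_v, s2_v, theta, rng, n_samples=200):
    """L_i: 0/1 labels for instance i from the workers who labeled it;
    mu_u, s2_u, mu_v, s2_v: variational parameters for those workers."""
    L_i = np.asarray(L_i)
    u = rng.normal(mu_u, np.sqrt(s2_u), size=(n_samples, len(L_i)))
    v = rng.normal(mu_v, np.sqrt(s2_v), size=(n_samples, len(L_i)))
    # Monte Carlo estimate of E_{U_j ~ q(U_j)} log Ber(L_ij | S(U_j))
    log1 = np.log(theta) + log_ber(L_i, sigmoid(u)).mean(axis=0).sum()
    # For Z_i = 0, specificity governs the flipped label (coding assumption)
    log0 = np.log(1 - theta) + log_ber(1 - L_i, sigmoid(v)).mean(axis=0).sum()
    return 1 / (1 + np.exp(log0 - log1))   # normalized q*(Z_i = 1)
```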

  16. Optimization
      Update U_j:
        q*(U_j) ∝ exp( E_{V_j ∼ q(V_j)} [log N(U_j, V_j | µ, C)] + ∑_i q(Z_i = 1) log Ber(L_ij | S(U_j)) )
      Intuition:
      ◮ U_j = logit sensitivity of worker j.
      ◮ U_j ≈ E(correlation with specificity) + ...
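
This update has no closed form. One plausible implementation (an assumption on my part, not necessarily the paper's) fits a Gaussian at the mode of the unnormalized objective, Laplace-style; taking the expectation over q(V_j) reduces the prior term to a quadratic in U_j:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def update_u(L_j, qz1, mu, C, mu_v_j):
    """L_j: worker j's 0/1 labels (array); qz1: q(Z_i = 1) for those instances;
    mu, C: current prior parameters; mu_v_j: mean of q(V_j)."""
    P = np.linalg.inv(C)  # precision of the 2-d prior

    def neg_log_q(u):
        # E_{V_j ~ q} [-log N(u, V_j | mu, C)], dropping u-free terms
        prior = 0.5 * P[0, 0] * (u - mu[0]) ** 2 \
              + P[0, 1] * (u - mu[0]) * (mu_v_j - mu[1])
        p = 1 / (1 + np.exp(-u))
        lik = np.sum(qz1 * (L_j * np.log(p) + (1 - L_j) * np.log(1 - p)))
        return prior - lik

    u_hat = minimize_scalar(neg_log_q).x          # mode (convex objective)
    eps = 1e-4                                    # curvature at the mode
    h = (neg_log_q(u_hat + eps) - 2 * neg_log_q(u_hat)
         + neg_log_q(u_hat - eps)) / eps ** 2
    return u_hat, 1.0 / h    # (mean, variance) of the new q(U_j)
```

The V_j update is symmetric, with q(Z_i = 0) weighting the likelihood term.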
