A Correlated Worker Model for Grouped, Imbalanced and Multitask Data
An T. Nguyen (presenter), Byron C. Wallace, Matthew Lease
University of Texas at Austin
UAI 2016
Overview
◮ A model of workers in crowdsourcing.
◮ Idea: transfer knowledge of worker quality.
◮ Variational EM learning.
◮ Applied to two datasets:
    ◮ Biomedical citation screening: imbalanced, grouped.
    ◮ Galaxy classification: multiple tasks.
Background
◮ Crowdsourcing: collect labels quickly at low cost.
◮ But (usually) lower quality.
◮ Common solution: collect 5 labels for each instance, then aggregate them.
◮ Most previous work: improve (the estimates of) labels.
◮ Our work: improve (the estimates of) worker qualities.
Motivation for estimating worker qualities
◮ Diagnostic insights.
◮ Help workers improve.
◮ Intelligent task routing (assign work to workers).
Worker Quality Measure
◮ Accuracy: simple but not enough.
◮ → Confusion matrix: Pr(worker label | true label).
◮ Binary task (this work):
    ◮ Sensitivity: Pr(positive | positive).
    ◮ Specificity: Pr(negative | negative).
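As a concrete sketch, sensitivity and specificity can be computed directly from a worker's labels when gold labels are available (the function and variable names here are illustrative, not from the paper):

```python
import numpy as np

def sensitivity_specificity(worker_labels, true_labels):
    """Estimate a worker's confusion-matrix diagonal against gold labels.

    Sensitivity = Pr(worker says positive | truly positive)
    Specificity = Pr(worker says negative | truly negative)
    """
    worker = np.asarray(worker_labels)
    truth = np.asarray(true_labels)
    sen = worker[truth == 1].mean()        # fraction of positives labeled 1
    spe = (1 - worker[truth == 0]).mean()  # fraction of negatives labeled 0
    return sen, spe

sen, spe = sensitivity_specificity([1, 0, 1, 0, 0], [1, 1, 1, 0, 0])
# sen ≈ 0.667 (2 of 3 positives caught), spe = 1.0 (both negatives caught)
```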
Setting
Input:
◮ Crowd labels for each instance.
◮ No instance-level features (future work).
Output:
◮ For each worker: sensitivity and specificity.
Evaluation metric:
◮ RMSE on sensitivity and specificity, against gold values computed from gold labels over the whole dataset.
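The evaluation metric above is straightforward to compute; a minimal sketch (names are illustrative):

```python
import numpy as np

def rmse(estimates, gold):
    """RMSE between estimated and gold sensitivities (or specificities)."""
    e, g = np.asarray(estimates, float), np.asarray(gold, float)
    return float(np.sqrt(np.mean((e - g) ** 2)))

err = rmse([0.8, 0.6], [0.9, 0.5])  # sqrt((0.01 + 0.01) / 2) = 0.1
```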
Challenges
◮ Sparsity: many workers label only a few instances.
◮ Imbalance: far more negative than positive instances, so sensitivity is difficult to estimate.
Idea
Transfer knowledge of worker quality:
◮ Between classes.
◮ Within a group.
◮ Across multiple tasks.
Previous Models (Raykar et al. 2010; Liu & Wang 2012; Kim & Ghahramani 2012)
Hidden variables:
◮ True label for each instance.
◮ Confusion matrix (sensitivity + specificity) for each worker.
Assumptions:
◮ Sensitivity & specificity are independent parameters.
◮ A single group of workers.
◮ Multiple tasks: independent models.
Our Model
Assumptions:
◮ Sensitivity & specificity are correlated.
◮ Multiple groups of workers (group membership is known).
◮ Sensitivity & specificity in multiple tasks are correlated.
The Base Model
(i indexes instances, j indexes workers; S is the sigmoid, so U_j and V_j live on the logit scale)

    (U_j, V_j) ∼ N(μ, C)
    Z_i ∼ Ber(θ)
    L_ij | Z_i = 1 ∼ Ber(S(U_j))
    L_ij | Z_i = 0 ∼ Ber(S(V_j))
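The generative process above can be sampled as follows. This is an illustrative sketch, not the paper's code: S is taken to be the sigmoid, and Ber(S(V_j)) under Z_i = 0 is read literally as the positive-label probability on negative instances (adjust the label coding if the paper defines V_j directly on the specificity scale).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_base_model(n_items, n_workers, mu, C, theta, seed=0):
    """Draw one synthetic dataset from the base model (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # (U_j, V_j) ~ N(mu, C): per-worker quality parameters, correlated via C.
    UV = rng.multivariate_normal(np.asarray(mu, float),
                                 np.asarray(C, float), size=n_workers)
    U, V = UV[:, 0], UV[:, 1]
    # Z_i ~ Ber(theta): hidden true label of each instance.
    Z = rng.binomial(1, theta, size=n_items)
    # L_ij | Z_i=1 ~ Ber(S(U_j)); L_ij | Z_i=0 ~ Ber(S(V_j)), read literally.
    p_pos = np.where(Z[:, None] == 1, sigmoid(U)[None, :], sigmoid(V)[None, :])
    L = rng.binomial(1, p_pos)
    return U, V, Z, L

# Workers with high sigmoid(U) and low sigmoid(V) label positives 1 and
# negatives 0 most of the time.
U, V, Z, L = sample_base_model(200, 10, mu=[2.0, -2.0],
                               C=[[1.0, 0.5], [0.5, 1.0]], theta=0.3)
```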
Extensions
1. Worker groups:
    ◮ Group membership is known.
    ◮ Model each group k with its own Normal distribution (μ_k, C_k).
2. Multiple tasks:
    ◮ Assume two tasks.
    ◮ (Sen_1, Spe_1) correlates with (Sen_2, Spe_2).
    ◮ (U_j^1, V_j^1, U_j^2, V_j^2) ∼ N(μ, C)
Inference for the Base Model
Approach: variational EM.
◮ E-step: infer Pr(U_{1..m}, V_{1..m}, Z_{1..n} | L).
◮ M-step: maximize over the parameters μ, C, θ.
Variational inference:
◮ Approximate the (complex) posterior Pr(· | ·) by a simpler function q.
◮ Minimize KL(q || p), which is equivalent to maximizing a lower bound on the log-likelihood.
Inference
Mean-field assumptions:
◮ q factorizes:

    q(U_{1..m}, V_{1..m}, Z_{1..n}) = ∏_{j=1}^{m} q(U_j) q(V_j) · ∏_{i=1}^{n} q(Z_i)

◮ Factors:

    q(U_j) = N(μ̃_uj, σ̃²_uj)
    q(V_j) = N(μ̃_vj, σ̃²_vj)
    q(Z_i) = Ber(θ̃_i)

◮ Optimize with respect to {μ̃_uj, σ̃²_uj, μ̃_vj, σ̃²_vj | j = 1...m} and {θ̃_i | i = 1...n}.
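In code, the mean-field parameters listed above can be held in a flat structure like this (a plausible initialization; the paper may initialize differently, and all names are illustrative):

```python
import numpy as np

def init_variational(n_items, n_workers):
    """Mean-field parameters: one Normal per U_j and V_j, one Bernoulli per Z_i."""
    return {
        "mu_u": np.zeros(n_workers), "s2_u": np.ones(n_workers),  # q(U_j)
        "mu_v": np.zeros(n_workers), "s2_v": np.ones(n_workers),  # q(V_j)
        "theta_tilde": np.full(n_items, 0.5),                     # q(Z_i)
    }

q = init_variational(n_items=100, n_workers=20)
```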
Optimization
Coordinate descent: update one variable at a time.

Update Z_i:

    q*(Z_i = 1) ∝ exp( log Ber(1 | θ) + Σ_j E_{U_j ∼ q(U_j)} log Ber(L_ij | S(U_j)) )
    q*(Z_i = 0) ∝ exp( log Ber(0 | θ) + Σ_j E_{V_j ∼ q(V_j)} log Ber(L_ij | S(V_j)) )

Intuition:
◮ Z_i ≈ prior + Σ_j E(crowd labels for i),
◮ where the expectation is with respect to worker quality.
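The Z_i update above can be sketched numerically. The expectations E[log Ber(L_ij | S(·))] under a Gaussian have no closed form, so this sketch estimates them by Monte Carlo; the paper's actual approximation may differ, and all names are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def update_qz(L_i, mu_u, s2_u, mu_v, s2_v, theta, n_mc=4000, seed=0):
    """One coordinate update for q(Z_i); returns the new q(Z_i = 1)."""
    rng = np.random.default_rng(seed)

    def e_log_ber(l, mu, s2):
        # E_{x ~ N(mu, s2)} [ log Ber(l | sigmoid(x)) ], by Monte Carlo.
        p = sigmoid(rng.normal(mu, np.sqrt(s2), size=n_mc))
        return np.mean(l * np.log(p) + (1 - l) * np.log1p(-p))

    log1 = np.log(theta) + sum(e_log_ber(l, mu_u[j], s2_u[j])
                               for j, l in enumerate(L_i))
    log0 = np.log(1 - theta) + sum(e_log_ber(l, mu_v[j], s2_v[j])
                                   for j, l in enumerate(L_i))
    m = max(log1, log0)                       # normalize in log space
    w1, w0 = np.exp(log1 - m), np.exp(log0 - m)
    return w1 / (w1 + w0)

# Three workers who label positives 1 often (sigmoid(2) ≈ 0.88) and label
# negatives 1 rarely (sigmoid(-2) ≈ 0.12) all say "1": the update should
# lean strongly toward Z_i = 1.
qz1 = update_qz(L_i=[1, 1, 1],
                mu_u=[2.0, 2.0, 2.0], s2_u=[0.1, 0.1, 0.1],
                mu_v=[-2.0, -2.0, -2.0], s2_v=[0.1, 0.1, 0.1],
                theta=0.5)
```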
Optimization
Update U_j:

    q*(U_j) ∝ exp( E_{V_j ∼ q(V_j)} log N(U_j, V_j | μ, C) + Σ_i q(Z_i = 1) log Ber(L_ij | S(U_j)) )

Intuition:
◮ U_j = logit sensitivity of worker j.
◮ U_j ≈ E(correlation with specificity) + ...
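The E_{V_j}[log N(U_j, V_j | μ, C)] term is quadratic in U_j: writing Λ = C⁻¹, it acts as a Gaussian prior with precision Λ₁₁ and a mean shifted through the correlation with E[V_j]. One concrete way to update q(U_j) is then a Laplace-style Newton fit of the resulting 1-D log-concave objective. This is a sketch of that idea, not necessarily the paper's exact derivation; all names are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def update_qu(L_j, qz1, mu, C, mu_v_tilde, iters=50):
    """Laplace-style coordinate update for q(U_j) = N(mean, var).

    L_j: worker j's labels; qz1: current q(Z_i = 1) per instance;
    mu_v_tilde: current mean of q(V_j).
    """
    L_j = np.asarray(L_j, float)
    qz1 = np.asarray(qz1, float)
    Lam = np.linalg.inv(np.asarray(C, float))
    # Expected prior over U_j after integrating out V_j under q(V_j):
    prior_prec = Lam[0, 0]
    prior_mean = mu[0] - (Lam[0, 1] / Lam[0, 0]) * (mu_v_tilde - mu[1])
    u = prior_mean
    for _ in range(iters):
        p = sigmoid(u)
        grad = -prior_prec * (u - prior_mean) + np.sum(qz1 * (L_j - p))
        hess = -prior_prec - np.sum(qz1 * p * (1 - p))
        u -= grad / hess                 # Newton step (hess < 0, log-concave)
    return u, -1.0 / hess                # Laplace mean and variance for q(U_j)

# A worker who labels 20 (believed-positive) instances all 1 should get a
# clearly positive logit sensitivity.
u, var = update_qu(L_j=[1] * 20, qz1=[1.0] * 20, mu=[0.0, 0.0],
                   C=[[1.0, 0.5], [0.5, 1.0]], mu_v_tilde=0.0)
```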