Contribution I: Bayesian Preference Model

Bayesian Preference Model

Labelers express accumulated, shared preferences.
◮ Parameter γ_b models the effect of preference b = 1, ..., K.
◮ An m × K binary matrix Z models parameter sharing.
◮ If z_{l,b} = 1, labeler l expresses preference b.
◮ Parameter β_l accumulates preferences:

    β_l = Σ_b z_{l,b} γ_b

◮ Likelihood:

    p(Y | X, Z, γ) = Π_l Π_{i : y_{i,l} ≠ 0} p(y_{i,l} | β_l^⊤ x_i)

◮ Similar preferences ⇒ similar labelling.

Fabian L. Wauthier: Bayesian Bias Mitigation for Crowdsourcing
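As a concrete sketch, the likelihood factorization can be evaluated directly. The slides leave p(y_{i,l} | β_l^⊤ x_i) generic; the probit link Φ below matches the observation model used later in the synthetic experiments, and all shapes are illustrative:

```python
import numpy as np
from math import erf, sqrt

def Phi(t):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def log_likelihood(Y, X, Z, gamma):
    """log p(Y | X, Z, gamma), assuming a probit link.
    Y: (n, m) labels in {-1, 0, +1}, where 0 means "not labelled";
    X: (n, d) task features; Z: (m, K) binary sharing matrix;
    gamma: (K, d) per-preference parameters."""
    beta = Z @ gamma                          # (m, d): beta_l = sum_b z_{l,b} gamma_b
    n, m = Y.shape
    total = 0.0
    for l in range(m):
        for i in range(n):
            if Y[i, l] != 0:                  # the product runs over observed labels only
                total += np.log(Phi(Y[i, l] * (X[i] @ beta[l])))
    return total

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
Z = np.array([[1, 0], [1, 1]])                # labeler 2 accumulates both preferences
gamma = rng.standard_normal((2, 3))
Y = np.sign(X @ (Z @ gamma).T)                # fully observed, noise-free labels
ll = log_likelihood(Y, X, Z, gamma)
```

Because labeler 2's β_2 = γ_1 + γ_2 while labeler 1's β_1 = γ_1, labelers who share preference rows produce correlated labellings.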
Priors

◮ Prior on γ_b: p(γ_b) = N(0, σ² I) for each b.
◮ Prior on Z: fix Z to be m × K, with

    π_b | α ∼ Beta(α/K, 1),   b = 1, ..., K    (1)
    z_{l,b} | π_b ∼ Bern(π_b),   l = 1, ..., m    (2)

◮ As K → ∞, the distribution over Z converges to the Indian Buffet Process (IBP).
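Eqs. (1)-(2) are straightforward to simulate; a minimal sketch (the value of α and the shapes are illustrative):

```python
import numpy as np

def sample_Z(m, K, alpha, rng):
    """Draw Z from the finite Beta-Bernoulli prior of Eqs. (1)-(2):
    pi_b ~ Beta(alpha/K, 1), then z_{l,b} ~ Bern(pi_b).
    As K grows this converges to the Indian Buffet Process."""
    pi = rng.beta(alpha / K, 1.0, size=K)         # Eq. (1): per-preference popularity
    Z = (rng.random((m, K)) < pi).astype(int)     # Eq. (2): who expresses which preference
    return Z

rng = np.random.default_rng(0)
Z = sample_Z(m=30, K=10, alpha=2.0, rng=rng)
```

Note the Beta(α/K, 1) shape parameter shrinks as K grows, so only a few of the K columns end up heavily used, which is what makes the K → ∞ limit well behaved.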
Complete model

    p(Y, Z, γ | X) = p(Y | X, Z, γ) p(γ | Z) p(Z)

◮ Recall bias: different labellers can have different β's.
  Example: disagreement over whether the guitar is behind or next to the couch.
◮ Want to predict labeller l's labels.
◮ Labeller l could be in the crowd, or the gold standard.
◮ Required inference: p(β_l | X, Y), or equivalently p(z_{l,b}, γ_b, b = 1, ..., K | X, Y).
◮ The model is complex; exact inference is intractable.
◮ Possible alternatives: Gibbs sampling, variational inference, slice sampling, etc.
Overview

Contribution I: Bayesian Preference Model
BBMC Results
Contribution II: Approximate Active Learning
Active Learning Results
Conclusion
BBMC Results

Results: Synthetic Data

◮ X is a 2000 × 4 Gaussian matrix.
◮ Z is a 30 × 2 uniform binary matrix (m = 30, K = 2).
◮ γ_b Gaussian, b = 1, 2; β_l = Σ_b z_{l,b} γ_b.
◮ Observation probability ε = 0.1:

    y_{i,l} = 0    w.p. 1 − ε
    y_{i,l} = +1   w.p. ε · Φ(x_i^⊤ β_l)
    y_{i,l} = −1   otherwise

◮ Inference: want to recover β_1 (say).
◮ Requires p(z_{1,b}, γ_b, b = 1, ..., K | X, Y).
◮ For inference, set K = 10.
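The generative process above can be sketched directly (shapes taken from the slide; the probit link Φ matches the observation model):

```python
import numpy as np
from math import erf, sqrt

def Phi(t):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

rng = np.random.default_rng(0)
n, d, m, K, eps = 2000, 4, 30, 2, 0.1

X = rng.standard_normal((n, d))             # 2000 x 4 Gaussian matrix
Z = (rng.random((m, K)) < 0.5).astype(int)  # 30 x 2 uniform binary matrix
gamma = rng.standard_normal((K, d))         # Gaussian preference effects
beta = Z @ gamma                            # beta_l = sum_b z_{l,b} gamma_b

# Each entry is unobserved (0) w.p. 1 - eps; otherwise +1 w.p. Phi(x_i^T beta_l), else -1.
Y = np.zeros((n, m), dtype=int)
observed = rng.random((n, m)) < eps
for i, l in zip(*np.nonzero(observed)):
    Y[i, l] = 1 if rng.random() < Phi(X[i] @ beta[l]) else -1
```

About 10% of the 60,000 (task, labeller) entries come out labelled, matching ε = 0.1.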
◮ Latent Z mostly correct after 1000 Gibbs steps.
◮ Gibbs sequence for γ_{1,1}.
  (Figure: trace plot of γ_{1,1} over Gibbs iterations 1000-2000, values roughly in [0.55, 0.75].)
◮ True β_1 and its posterior mean after 1000 burn-in iterations:

    β_1 = (0.6915, 0.0754, −0.6815, 0.6988)^⊤,   β̂_1 = (0.6514, 0.0535, −0.6473, 0.6957)^⊤    (3)
Results: Crowdsourced data

◮ Task: Is the triangle to the left of or above the rectangle?
◮ Labelled on Amazon Mechanical Turk: 523 tasks, 3 labels per task, 76 labellers.
◮ Want to predict the gold standard: compare centroid positions.
◮ All 26 labellers with over 20 labels have error above 0.16.
◮ The researcher also labels, and gives 60 gold standard labels.
◮ Averaged log likelihood and error rate on the test set (no active learning).
◮ Our model: BBMC.

    Algorithm | Final Loglik   | Final Error
    GOLD      | −3716 ± 1695   | 0.0547 ± 0.0102
    CONS      | −421.1 ± 2.6   | 0.0935 ± 0.0031
    BBMC      | −219.1 ± 3.1   | 0.0309 ± 0.0033
Contribution II: Approximate Active Learning

Overview

Contribution I: Bayesian Preference Model
BBMC Results
Contribution II: Approximate Active Learning
Active Learning Results
Conclusion
Active Learning

◮ Want to predict labeller l's labels. Need β_l.
◮ Not all labellers are useful for inferring β_l.
◮ If l and l′ share parameters ⇒ can learn about β_l from l′:

    β_l = Σ_b z_{l,b} γ_b,   β_{l′} = Σ_b z_{l′,b} γ_b    (4)

◮ Active learning: repeatedly select training data that helps in learning β_l.
◮ Goal: cheaper training data, faster learning.
Approximate inference and Active Learning

◮ Suppose we start with training data Y.
◮ Query the task-labeller pair (i, l) that maximizes the expected utility of adding it:

    (i, l) = argmax_{(i′, l′)} E_{y_{i′,l′}} [ U( p(β | y_{i′,l′}, X, Y) ) ]

◮ Examples: U(·) = −Entropy(·);  U_µ(·) = ‖Mean(·) − µ‖₂².
◮ To score each (i′, l′), we need the posterior p(β | y_{i′,l′}, X, Y).
◮ Gibbs sampling ⇒ separate Gibbs samplers to score each (i′, l′).
◮ We are already running one Gibbs sampler for basic inference.
◮ Problem: can we avoid running the extra scoring chains?
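A sketch of the naive scoring loop the slide describes. Here `draw_posterior`, the uniform label distribution, and the trace-of-covariance utility are stand-ins (the slides use −Entropy(·) or a mean-based utility):

```python
import numpy as np

def neg_trace_cov(samples):
    """U(.) = -trace of the posterior covariance: a simple, concrete utility."""
    return -np.trace(np.cov(samples.T))

def expected_utility(i, l, X, Y, draw_posterior, utility):
    """Naive score for candidate pair (i, l):
    E_{y_{i,l}}[ U( p(beta | y_{i,l}, X, Y) ) ], with the label distribution
    taken uniform over {-1, +1} for simplicity (an assumption here).
    `draw_posterior` stands in for a full re-run Gibbs sampler -- exactly the
    per-candidate cost the talk's approximate method avoids."""
    score = 0.0
    for y in (-1, +1):
        Y_aug = Y.copy()
        Y_aug[i, l] = y                       # hypothetically add the label
        samples = draw_posterior(X, Y_aug)    # one scoring chain per candidate
        score += 0.5 * utility(samples)
    return score

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3))
Y = np.zeros((10, 2), dtype=int)
# Placeholder "sampler": returns fake posterior draws of beta.
draw = lambda X, Y: rng.standard_normal((200, 3))
best = max(((i, l) for i in range(10) for l in range(2)),
           key=lambda p: expected_utility(p[0], p[1], X, Y, draw, neg_trace_cov))
```

Even this toy loop makes the cost visible: every candidate pair triggers two fresh posterior approximations, which is why running a real Gibbs chain per candidate is prohibitive.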
Contribution II: Approximate Active Learning

◮ The Gibbs sampler for p(β | X, Y) is a Markov chain for inference.
◮ The sampler for p(β | y_{i′,l′}, X, Y) is a perturbed chain for scoring.
◮ Naïve scoring:
  • Run the perturbed chain; sample from its stationary distribution.
  • Compute U(p(β | y_{i′,l′}, X, Y)).
◮ Our method:
  • Get approximate samples of p(β | y_{i′,l′}, X, Y) by transforming samples of p(β | X, Y).
  • Approximate U(p(β | y_{i′,l′}, X, Y)) from these.

(Diagram: inference chain β_{t−1} → β_t → β_{t+1} → β_{t+2} alongside the perturbed scoring chain β̂_t → β̂_{t+1} → β̂_{t+2} → β̂_{t+3}.)
Approximate Scoring for Active Learning

◮ Suppose a chain p(β_t | β_{t−1}) and a perturbed chain p̂(β̂_t | β̂_{t−1}).
◮ Their stationary distributions are p_∞(β) and p̂_∞(β̂).
◮ Let β^s ∼ p_∞(β), s = 1, ..., S, and approximate

    p̂_∞(β̂) ≈ ∫ p̂(β̂ | β) p_∞(β) dβ ≈ (1/S) Σ_{s=1}^S p̂(β̂ | β^s).

◮ If p_∞(β) = p̂_∞(β), the first approximation is exact.
◮ Specialize to active learning:
  • Unperturbed chain = Gibbs sampler for p(β | X, Y).
  • Perturbed chain = Gibbs sampler for p(β | y_{i′,l′}, X, Y).
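A toy discrete illustration of this one-step transformation (the two-state chains are invented for this sketch): pushing the unperturbed stationary distribution through a single step of the perturbed kernel already lands close to the perturbed stationary distribution.

```python
import numpy as np

# Two-state illustration of  p̂_inf(b̂) ≈ ∫ p̂(b̂ | b) p_inf(b) db:
# transform samples (here, the exact distribution) of the unperturbed
# chain by one step of the perturbed kernel.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # unperturbed chain
P_hat = np.array([[0.85, 0.15],
                  [0.25, 0.75]])    # slightly perturbed chain

def stationary(P):
    """Left eigenvector of P for eigenvalue 1, normalized to sum to 1."""
    w, V = np.linalg.eig(P.T)
    v = np.real(V[:, np.argmin(np.abs(w - 1.0))])
    return v / v.sum()

p_inf = stationary(P)               # exactly (2/3, 1/3)
p_hat_exact = stationary(P_hat)     # exactly (5/8, 3/8)
p_hat_approx = p_inf @ P_hat        # one perturbed step starting from p_inf

err = np.abs(p_hat_exact - p_hat_approx).max()
```

The one-step approximation (0.65, 0.35) sits between p_inf and the exact perturbed stationary distribution (0.625, 0.375); when the perturbation is small, one transformed step recovers most of the shift without running the perturbed chain to convergence.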
Special Case: Discrete Random Walks

◮ Suppose W is n × n, positive, symmetric. P = D^{−1} W.
◮ The stationary distribution is a left eigenvector of P. Decompose

    A = D^{−1/2} W D^{−1/2}    (5)
      = V Λ V^⊤,   λ_1 ≤ λ_2 ≤ ... ≤ λ_n = 1    (6)
    p_∞ ∝ D^{1/2} v_n    (7)

◮ Perturb the matrix: Ŵ = W + dW ≥ 0, with dW 1 = 0.
◮ Then P̂ = D^{−1} Ŵ = P + D^{−1} dW = P + dP.
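Eqs. (5)-(7) can be checked numerically; a sketch with a random symmetric W (NumPy's `eigh` returns eigenvalues in ascending order, so v_n is the last column):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
W = rng.random((n, n))
W = W + W.T                          # n x n, positive, symmetric
d = W.sum(axis=1)                    # degrees; D = diag(d)
P = W / d[:, None]                   # P = D^{-1} W, the random-walk kernel

# Eqs. (5)-(7): symmetrize, eigendecompose, read off the stationary distribution.
A = W / np.sqrt(np.outer(d, d))      # A = D^{-1/2} W D^{-1/2}
lam, V = np.linalg.eigh(A)           # ascending eigenvalues, so lam[-1] = 1
p_inf = np.sqrt(d) * V[:, -1]        # p_inf ∝ D^{1/2} v_n
p_inf = p_inf / p_inf.sum()
```

For a symmetric W the stationary distribution is proportional to the degrees d, and the eigenvector recipe above recovers exactly that.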
◮ Matrix perturbation theory:

    p̃_∞ ≈ p_∞ + D^{1/2} Σ_{k ≠ n} (v_k v_k^⊤ / (1 − λ_k)) dP^⊤ D^{−1/2} p_∞    (8)