Modeling Annotator Accuracies for Supervised Learning

Abhimanu Kumar, Department of Computer Science, University of Texas at Austin
abhimanu@cs.utexas.edu (http://abhimanukumar.com)
Matthew Lease, School of Information, University of Texas at Austin
ml@ischool.utexas.edu
Supervised learning from noisy labels
• Labeling is inherently uncertain
  – Even experts disagree and make mistakes
  – Crowd labels tend to be noisier, with higher variance
• Use the wisdom of crowds to reduce uncertainty
  – Multiple labels + aggregation = consensus labels
• How to maximize the learning rate per unit of labeling effort?
  – Label a new example?
  – Get another label for an already-labeled example?
• See: Sheng, Provost & Ipeirotis, KDD'08
Task Setup
• Task: binary classification
• Learner: C4.5 decision tree
• Given
  – An initial seed set of single-labeled examples (64)
  – An unlimited pool of unlabeled examples
• Cost model
  – Fixed unit cost for labeling any example
  – Unlabeled examples are freely obtained
• Goal: maximize learning rate (for labeling effort)
Compare 3 methods: SL, MV & NB
• Single Labeling (SL): label a new example from the unlabeled pool
• Multi-Labeling: get another label for an already-labeled example
  – Majority Vote (MV): consensus by simple vote
  – Naïve Bayes (NB): weight each vote by annotator accuracy
(Both consensus rules are sketched in the code below.)
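A minimal sketch, not the authors' implementation, of how the two consensus rules could work for binary labels, assuming annotator accuracies are known and the class prior is uniform (as stated on the Assumptions slide); the function names and the toy example are illustrative only:

```python
from collections import Counter

def majority_vote(labels):
    """MV consensus: pick the label with the most votes (ties broken arbitrarily)."""
    return Counter(labels).most_common(1)[0][0]

def naive_bayes_vote(labels, accuracies):
    """NB consensus: weight each vote by the annotator's accuracy p_i.
    Assumes binary labels {0, 1}, a uniform class prior, and known accuracies."""
    scores = {}
    for y in (0, 1):
        score = 1.0  # uniform prior cancels out of the argmax
        for label, p in zip(labels, accuracies):
            score *= p if label == y else (1.0 - p)
        scores[y] = score
    return max(scores, key=scores.get)

# Example: three annotators, the most accurate one disagrees with the other two.
labels = [1, 1, 0]
accuracies = [0.55, 0.55, 0.95]
print(majority_vote(labels))                  # -> 1
print(naive_bayes_vote(labels, accuracies))   # -> 0 (accurate annotator outweighs the pair)
```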
Assumptions
• Example selection: random
  – From the pool for SL, from the seed set for multi-labeling
  – No selection based on active learning
• Fixed commitment to a single method a priori
  – No switching between methods at run-time
• Balanced classes
  – Model & measure simple accuracy (not P/R or ROC)
  – Assume a uniform class prior for NB
• Annotator accuracies are known to the system
  – In practice, these must be estimated, e.g. from gold data (Snow et al. '08) or via EM (Dawid & Skene '79)
Simulation
• Each annotator
  – Has a parameter p (probability of producing the correct label)
  – Generates exactly one label
• Accuracies are drawn from a uniform distribution U(min, max)
• Generative model for simulation (sketched below)
  – Pick an example x (with true label y*) at random
  – Draw annotator accuracy p ~ U(min, max)
  – Generate label y ~ P(y | p, y*)
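A minimal sketch of one step of this generative model for binary labels; the function name and defaults are illustrative, not from the talk:

```python
import random

def simulate_label(y_true, acc_min, acc_max, rng=random):
    """Draw an annotator accuracy p ~ U(acc_min, acc_max), then emit the true
    label with probability p and the flipped label otherwise (binary labels)."""
    p = rng.uniform(acc_min, acc_max)
    y = y_true if rng.random() < p else 1 - y_true
    return y, p

# Example: a very noisy annotator pool, p ~ U(0.4, 0.6)
label, p = simulate_label(y_true=1, acc_min=0.4, acc_max=0.6)
```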
Evaluation
• Data: 4 datasets from the UCI ML Repository (http://archive.ics.uci.edu/ml/datasets.html)
  – Mushroom
  – Spambase
  – Tic-Tac-Toe
  – Chess: King-Rook vs. King-Pawn
• Same trends across all 4, so we report the first 2
• Random 70/30 split of the data into seed+pool / test (protocol sketched below)
• Repeat each run 10 times and average the results
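A rough sketch of the evaluation protocol (70/30 split, 10 repeated runs, averaged). Assumptions not in the talk: scikit-learn's DecisionTreeClassifier (CART) stands in for the C4.5 learner, and a bundled binary dataset stands in for the UCI datasets; the annotator-simulation step is elided.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier  # stand-in for C4.5 (assumption)

X, y = load_breast_cancer(return_X_y=True)  # placeholder binary-classification data

def one_run(seed):
    # 70% of the data forms the seed set + unlabeled pool, 30% is held out for testing.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, random_state=seed)
    # ... here the simulated annotators would label examples and consensus labels
    # would be formed (SL / MV / NB) before training at each labeling budget ...
    clf = DecisionTreeClassifier(random_state=seed).fit(X_train, y_train)
    return clf.score(X_test, y_test)

# Repeat each run 10 times and average the results.
print(np.mean([one_run(seed) for seed in range(10)]))
```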
p ~ U(0.6, 1.0)
• Fairly accurate annotators (mean = 0.8)
• Little uncertainty -> little gain from multi-labeling
p ~ U(0.4, 0.6)
• Very noisy (mean = 0.5, a random coin flip)
• SL and MV learning rates are flat
• NB wins by weighting more accurate workers more heavily
p ~ U(0.3, 0.7)
• Same noisy mean (0.5), but a wider range
• SL and MV stay flat
• NB outperforms by an even larger margin
p ~ U(0.1, 0.7)
• Worsen accuracies further (mean = 0.4)
• NB is virtually unchanged
• SL and MV predictions become anti-correlated with the true labels
  – We should actually flip their predictions…
p ~ U(0.2, 0.6)
• Keep the noisy mean of 0.4, but tighten the range
• NB is the best of a bad lot, but reaches only ~50% accuracy
• Again, it seems we should be flipping labels…
Label flipping
• Is NB doing better because of how it uses accuracy, or simply because it uses more information?
• If a worker's average accuracy is below 50%, we know he tends to be wrong (we have ignored this so far)
  – Whatever he says, we should guess the opposite
• Flipping puts all methods on an even footing (sketched below)
  – Suppose a given p < 0.5 produces label y
  – Use label 1 - y instead; for NB, use accuracy 1 - p
  – Equivalent to changing the distribution so that p is always > 0.5
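A minimal sketch of the flipping rule for binary labels, assuming the annotator's accuracy p is known; the function name is illustrative:

```python
def flip_if_unreliable(label, p):
    """If an annotator's accuracy p is below 0.5, take the opposite of what they
    said and treat their effective accuracy as 1 - p (binary labels {0, 1})."""
    if p < 0.5:
        return 1 - label, 1.0 - p
    return label, p

# Example: an annotator with p = 0.3 says 0; after flipping we use label 1
# with effective accuracy 0.7.
print(flip_if_unreliable(0, 0.3))  # -> (1, 0.7)
```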
p ~ U(0.1, 0.7): no flipping vs. with flipping
[Plots: SL, MV, and NB test accuracy (%) vs. number of labels (64 to 4096) on the Mushroom and Spambase datasets]
p ~ U(0.2, 0.6): no flipping vs. with flipping
[Plots: SL, MV, and NB test accuracy (%) vs. number of labels (64 to 4096) on the Mushroom and Spambase datasets]
Conclusion
• Take-home: modeling annotator accuracies matters, even compared to single labeling and majority voting
• But what about…
  – When accuracies are estimated (and therefore noisy)?
  – With real annotation errors (a real error distribution)?
  – With different learners or tasks (e.g. ranking)?
  – With a dynamic choice between labeling a new example and re-labeling an old one?
  – With active-learning example selection?
  – With imbalanced classes?
  – …
Recent Events (2010 was big!)   http://ir.ischool.utexas.edu/crowd
• Human Computation: HCOMP 2009 & HCOMP 2010 at KDD
• IR: Crowdsourcing for Search Evaluation at SIGIR 2010
• NLP
  – The People's Web Meets NLP: Collaboratively Constructed Semantic Resources, 2009 at ACL-IJCNLP & 2010 at COLING
  – Creating Speech and Language Data With Mechanical Turk, NAACL 2010
  – Maryland Workshop on Crowdsourcing and Translation, June 2010
• ML: Computational Social Science and the Wisdom of Crowds, NIPS 2010
• Vision: Advancing Computer Vision with Humans in the Loop at CVPR 2010
• Conference: CrowdConf 2010 (organized by CrowdFlower)
Upcoming Crowdsourcing Events   http://ir.ischool.utexas.edu/crowd
• Special issue of the Information Retrieval journal on Crowdsourcing (papers due May 6, 2011)
Upcoming Conferences & Workshops
• CHI 2011 workshop (May 8)
• HCOMP 2011 workshop at AAAI (papers due April 22)
• CrowdConf 2011 (TBA)
• SIGIR 2011 workshop? (in review)
• TREC 2011 Crowdsourcing Track
Thanks!
Special thanks to our diligent crowd annotators and their relentless dedication to science…