Keepin’ It Real: Semi-Supervised Learning with Realistic Tuning
Andrew B. Goldberg and Xiaojin Zhu
goldberg@cs.wisc.edu, jerryzhu@cs.wisc.edu
Computer Sciences Department, University of Wisconsin-Madison
Gap between Semi-Supervised Learning (SSL) research and practical applications

Semi-Supervised Learning: using unlabeled data to build better classifiers

Real World
• natural language processing
• computer vision
• web search & IR
• bioinformatics
• etc.

Assumptions
• manifold? clusters?
• low-density gap?
• multiple views?

Parameters
• regularization?
• graph weights?
• kernel parameters?

Model Selection
• little labeled data
• many parameters
• computational costs

Wrong choices could hurt performance!
How can we ensure that SSL is never worse than supervised learning?

Andrew B. Goldberg (UW-Madison), SSL with Realistic Tuning
Our Focus
• Two critical issues
  • Parameter tuning
  • Choosing which (if any) SSL algorithm to use
• Interested in realistic settings:
  • Practitioner is given some new labeled and unlabeled data
  • Must produce the best classifier possible
Our Contributions
• Medium-scale empirical study
  • Compares one supervised learning (SL) and two SSL methods
  • Eight less-familiar NLP tasks, three evaluation metrics
  • Experimental protocol explores several real-world settings
  • All parameters are tuned realistically via cross validation
• Findings under these conditions:
  • Each SSL method can be worse than SL on some data sets
  • Can achieve agnostic SSL by using cross validation accuracy to select among SL and SSL algorithms
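The "agnostic SSL" idea above can be sketched in code: run cross validation on the small labeled set for each candidate algorithm (SSL candidates also see the unlabeled points during training) and keep whichever scores best. This is a minimal illustration assuming scikit-learn; the dataset, candidate models, and fold count are hypothetical stand-ins, not the paper's actual SL/SSL methods.

```python
# Sketch of agnostic SSL: pick between a supervised baseline and an SSL
# method by cross-validation accuracy on the labeled data alone.
# All specifics (toy data, models, k=5) are illustrative assumptions.
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.semi_supervised import LabelSpreading

X, y = make_moons(n_samples=200, noise=0.1, random_state=0)
# Pretend only 20 points are labeled (10 per class); the rest are unlabeled.
labeled = np.concatenate([np.where(y == c)[0][:10] for c in (0, 1)])
unlabeled = np.setdiff1d(np.arange(len(X)), labeled)

def cv_accuracy(model, is_ssl, k=5):
    """Mean accuracy over k folds of the labeled set.
    SSL models additionally train on unlabeled points marked with y = -1."""
    accs = []
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    for tr, te in skf.split(X[labeled], y[labeled]):
        m = clone(model)
        if is_ssl:
            Xtr = np.vstack([X[labeled][tr], X[unlabeled]])
            ytr = np.concatenate([y[labeled][tr],
                                  -np.ones(len(unlabeled), dtype=int)])
        else:
            Xtr, ytr = X[labeled][tr], y[labeled][tr]
        m.fit(Xtr, ytr)
        accs.append(np.mean(m.predict(X[labeled][te]) == y[labeled][te]))
    return float(np.mean(accs))

candidates = {
    "supervised LR": (LogisticRegression(), False),
    "label spreading": (LabelSpreading(kernel="knn", n_neighbors=7), True),
}
scores = {name: cv_accuracy(m, ssl) for name, (m, ssl) in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Because the winner is chosen by labeled-data CV accuracy, the procedure falls back to the supervised baseline whenever the SSL candidate's assumptions do not fit the data, which is the safeguard the contribution describes.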
Outline
• Introduce “realistic tuning” for SSL
• Empirical study protocol
  • Data sets
  • Algorithms
  • Meta algorithm for SSL model selection
  • Performance metrics
• Results
• Conclusions
SSL With Realistic Tuning
• Given labeled and unlabeled data, {(x_1, y_1), ..., (x_l, y_l), x_{l+1}, ..., x_{l+u}}, how should you set parameters for some algorithm?
  • Tune based on test set performance? No, this is cheating.
  • Use default values based on heuristics/experience? May fail on new data.
  • k-fold cross validation? Little labeled data, but best available option.
• Cross validation choices:
  • number of folds
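The k-fold recipe above also applies to hyperparameter tuning within a single SSL method: score each candidate value by cross validation on the few labeled points and keep the best. The sketch below assumes scikit-learn; the graph-based method, the neighbor grid, and the small fold count (small because labels are scarce) are illustrative choices, not the deck's actual setup.

```python
# Sketch of realistic tuning: choose an SSL hyperparameter (here, the
# number of graph neighbors) by cross validation on the labeled set only.
# Toy data, the (3, 7, 15) grid, and k=3 are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import StratifiedKFold
from sklearn.semi_supervised import LabelSpreading

X, y = make_moons(n_samples=300, noise=0.1, random_state=1)
# Only 30 labeled points (15 per class); the rest are unlabeled.
labeled = np.concatenate([np.where(y == c)[0][:15] for c in (0, 1)])
unlabeled = np.setdiff1d(np.arange(len(X)), labeled)

def cv_score(n_neighbors, k=3):  # small k: only 30 labels to split
    accs = []
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=1)
    for tr, te in skf.split(X[labeled], y[labeled]):
        # Train on the labeled fold plus all unlabeled points (y = -1).
        Xtr = np.vstack([X[labeled][tr], X[unlabeled]])
        ytr = np.concatenate([y[labeled][tr],
                              -np.ones(len(unlabeled), dtype=int)])
        model = LabelSpreading(kernel="knn", n_neighbors=n_neighbors)
        model.fit(Xtr, ytr)
        accs.append(np.mean(model.predict(X[labeled][te]) == y[labeled][te]))
    return float(np.mean(accs))

grid = {nn: cv_score(nn) for nn in (3, 7, 15)}
best_nn = max(grid, key=grid.get)
print(best_nn, grid)
```

With so few labeled examples each fold's estimate is noisy, which is exactly the tension the slide raises: cross validation is imperfect here, but it is the best realistic option.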