Active Learning with Model Selection
Neil Rubens, Sugiyama Lab / Tokyo Institute of Technology
Active Learning (NLP Motivation)
• NLP (common scenario)
  – Large amounts of unlabeled data
  – Labeling data is expensive
• Active Learning (Optimal Experimental Design)
  – Allows selecting the most informative examples
Supervised Learning as Function Approximation
• Target function f(x); learned function f̂(x); training samples (x_1, y_1), …, (x_n, y_n)
• Goal: from the training samples, obtain an f̂ that minimizes the generalization error G = ∫ (f̂(x) − f(x))² q(x) dx, where q(x) is the test input density
Design Cycle (Common)
Collect Data (Active Learning) → Model Selection → Parameter Learning → Evaluation
There is a problem with this flow.
Active Learning (AL)
[Figure: target vs. learned functions for good and poor choices of input points]
• The choice of training input points can significantly affect the learned function.
• Active Learning: choose the training input points so that the generalization error is minimized.
Setting
• Linear model (the learned function is a linear combination of fixed basis functions)
• Least-squares learning
• In AL, the training output values cannot be used for estimating the generalization error (the inputs must be chosen before their outputs are observed)
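To make the setting concrete, here is a minimal least-squares sketch. The Gaussian basis, its centers and width, and the toy target are illustrative assumptions; the slides only specify a linear model with fixed basis functions.

```python
import numpy as np

# Gaussian basis functions (an illustrative assumption; the slides only
# specify a linear model with fixed basis functions).
def design_matrix(x, centers, width=1.0):
    # Phi[i, j] = phi_j(x_i)
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

def least_squares_fit(x, y, centers):
    # theta = argmin_theta || Phi theta - y ||^2  (ordinary least squares)
    Phi = design_matrix(x, centers)
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta

def predict(x, theta, centers):
    return design_matrix(x, centers) @ theta

# Toy target function and noisy training outputs
rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, 20)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(20)

centers = np.linspace(-3, 3, 5)   # model: 5 Gaussian basis functions
theta = least_squares_fit(x_train, y_train, centers)
x_test = np.linspace(-3, 3, 200)
y_hat = predict(x_test, theta, centers)
```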
Orthogonal Decomposition
Bias-Variance Decomposition
The generalization error decomposes into model error C, bias B, and variance V.
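As a rough illustration of the decomposition, the bias and variance terms of a least-squares fit can be estimated empirically by resampling the output noise and refitting. The polynomial model, noise level, and target function here are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.sin                       # target function
x = np.linspace(-3, 3, 50)       # fixed training inputs
noise = 0.3                      # output noise standard deviation

def fit_poly(y, degree):
    # least-squares polynomial fit, evaluated back on the training inputs
    coef = np.polyfit(x, y, degree)
    return np.polyval(coef, x)

# Repeatedly resample noisy outputs and refit to estimate B and V
preds = np.array([fit_poly(f(x) + noise * rng.standard_normal(x.size), 3)
                  for _ in range(500)])
mean_pred = preds.mean(axis=0)
bias2 = np.mean((mean_pred - f(x)) ** 2)   # B: squared bias of the estimator
variance = np.mean(preds.var(axis=0))      # V: variance of the estimator
```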
Active Learning
• Nothing can be done about the model error C
• Bias → 0 (least-squares is unbiased)
• Minimize variance → minimize error
Variance-based AL (assuming zero bias)
Active Learning - Approximation
• In general, simultaneously optimizing all n points is not tractable
• Approximation approaches:
  – Optimize the points one by one (greedy)
  – Optimize a probability distribution from which the points are drawn
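The greedy, one-by-one option can be sketched with a classical variance (A-optimality) criterion: for least squares, the estimator's variance is governed by (ΦᵀΦ)⁻¹, so each step adds the candidate input that most reduces its trace. The candidate pool, Gaussian basis, and small ridge term are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def gaussian_basis(x, centers, width=1.0):
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

def greedy_variance_al(candidates, centers, n_pick, ridge=1e-6):
    """Greedy A-optimal design: pick inputs minimizing tr[(Phi'Phi)^-1],
    which governs the least-squares estimator's variance."""
    chosen = []
    for _ in range(n_pick):
        best_x, best_score = None, np.inf
        for x in candidates:
            trial = np.array(chosen + [x])
            Phi = gaussian_basis(trial, centers)
            # ridge term keeps the matrix invertible while few points are chosen
            A = Phi.T @ Phi + ridge * np.eye(len(centers))
            score = np.trace(np.linalg.inv(A))
            if score < best_score:
                best_x, best_score = x, score
        chosen.append(best_x)
    return np.array(chosen)

candidates = np.linspace(-3, 3, 61)   # pool of unlabeled candidate inputs
centers = np.linspace(-3, 3, 5)       # model: 5 Gaussian basis functions
X = greedy_variance_al(candidates, centers, n_pick=10)
```

The greedy choice spreads the inputs so that every basis direction is covered, which is what drives the variance down.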
Bias / Variance (no unbiasedness guarantee)
[Figure: target function f and learned functions illustrating bias and variance]
Bias / Variance: best fit ("min error")
Model Selection (MS)
• Model: could be represented by the number and type of basis functions
• Model Selection: select an appropriate model M
[Figure: target vs. learned functions for models that are too simple, appropriate, and too complex]
Model Selection
• Cross-validation: measure generalization accuracy by testing on data unused during training
• Regularization: penalize complex models, E′ = (error on data) + λ·(model complexity); e.g. Akaike's information criterion (AIC), Bayesian information criterion (BIC)
• Minimum description length (MDL): Kolmogorov complexity, the shortest description of the data
• Structural risk minimization (SRM)
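The cross-validation option can be sketched as follows: candidate models (here, polynomial degrees, an illustrative stand-in for "number and type of basis functions") are scored by k-fold validation error, and the best-scoring model is selected. The toy data and degree range are assumptions.

```python
import numpy as np

def cv_error(x, y, degree, k=5, seed=0):
    """Mean squared validation error of a degree-`degree` polynomial,
    estimated by k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        val = folds[i]
        trn = np.concatenate([folds[j] for j in range(k) if j != i])
        coef = np.polyfit(x[trn], y[trn], degree)       # train on k-1 folds
        errs.append(np.mean((np.polyval(coef, x[val]) - y[val]) ** 2))
    return np.mean(errs)

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 60)
y = np.sin(x) + 0.2 * rng.standard_normal(60)

# Model selection: pick the polynomial degree with the lowest CV error
degrees = range(1, 10)
best = min(degrees, key=lambda d: cv_error(x, y, d))
```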
Active Learning with Model Selection
Active Learning and Model Selection share the same goal: minimizing the generalization error.
Possible approaches: naïve, sequential, batch
Naïve Approach
• Naïve approach: combine existing AL and MS methods
• The naïve approach is not possible, due to the ALMS dilemma:
  – Active Learning: the model should already be fixed (MS already performed) [Fedorov 78, MacKay 92, Kanamori and Shimodaira 04]
  – Model Selection: the points should already be fixed (AL already performed) [Akaike 78, Rissanen 78, Schwarz 78]
Sequential Approach
Alternate Model Selection and Active Learning as samples are collected.
[Figure: model complexity b vs. number of samples n]
• The optimal points depend on the model
• Has a risk of large error (due to overfitting to a different model)
Batch Approach
Initial Model Selection → Active Learning → Final Model Selection
[Figure: model complexity b vs. number of samples n]
• The initial MS is not reliable
• Has a risk of large error (due to overfitting to a different model)
Motivation – Hedge the Risk of Large Error
Active Learning with Model Selection:
• Naïve – impossible
• Batch, Sequential – risk of large error (due to overfitting to a different model)
Goal: hedge the risk of large error (minimize the risk of overfitting to a different model)
Ensemble Active Learning Approach (Proposed)
• Hedge the risk of overfitting by designing the input points for all of the candidate models.
• Criterion ensemble (C-EAL): combine the generalization-error criteria G_1, G_2, … of the candidate models into a single criterion C_EAL.
• Data ensemble (D-EAL): combine the training input locations X_1, X_2, … designed for the individual models.
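The criterion-ensemble idea can be sketched as follows: instead of optimizing a variance criterion for a single model, each candidate input is scored by a combination (here, the worst case) of the criteria of all candidate models. The model set, variance criterion, and max-combination rule are illustrative assumptions; the paper's exact C-EAL/D-EAL criteria differ.

```python
import numpy as np

def poly_design(x, degree):
    # polynomial basis as a stand-in for each candidate model's basis
    return np.vander(np.atleast_1d(x), degree + 1, increasing=True)

def variance_criterion(X, degree, ridge=1e-6):
    # tr[(Phi'Phi)^-1] governs the least-squares estimator's variance;
    # the small ridge keeps the matrix invertible for few points
    Phi = poly_design(X, degree)
    A = Phi.T @ Phi + ridge * np.eye(degree + 1)
    return np.trace(np.linalg.inv(A))

def ensemble_al(candidates, degrees, n_pick):
    """Greedy criterion-ensemble AL: each new point minimizes the worst
    (max) variance criterion over all candidate models."""
    chosen = []
    for _ in range(n_pick):
        def score(x):
            X = np.array(chosen + [x])
            return max(variance_criterion(X, d) for d in degrees)
        chosen.append(min(candidates, key=score))
    return np.array(chosen)

candidates = np.linspace(-3, 3, 31)          # pool of candidate inputs
X = ensemble_al(candidates, degrees=[1, 2, 3], n_pick=6)
```

Because the worst-case score is taken over all candidate models, the design stays useful whichever model the final model selection step picks.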
Evaluation
Legend: D – D-EAL, C – C-EAL (proposed); B – Batch; S – Sequential; P – Passive
• Compares favorably with existing methods:
  o Minimized worst-case performance (in most cases)
  o Surprisingly, improved average performance (in some cases)
Current / Future Work
• Improving AL by utilizing existing data
• My work mostly deals with theoretical aspects; I am also looking for practical applications.
  – If you have any problems that involve active learning, I would be very glad to help.
References
• Sugiyama. Active learning in approximately linear regression based on conditional expectation of generalization error. JMLR, 2006.
• Bishop. Pattern Recognition and Machine Learning.
• Alpaydin. Introduction to Machine Learning.
• Rubens and Sugiyama. Coping with active learning with model selection dilemma: Minimizing expected generalization error. IBIS, 2006.