Learning From Data, Lecture 13: Validation and Model Selection (M. Magdon-Ismail, CSCI 4100/6100)


  1. Learning From Data, Lecture 13: Validation and Model Selection. Topics: The Validation Set; Model Selection; Cross Validation. M. Magdon-Ismail, CSCI 4100/6100.

  2. recap: Regularization. Regularization combats the effects of noise by putting a leash on the algorithm: E_aug(h) = E_in(h) + (λ/N) Ω(h), where the regularizer Ω(h) prefers smooth, simple h (noise is rough, complex). Different regularizers give different results, and we can choose λ, the amount of regularization. [Figure: fits of the same data and target for λ = 0, 0.0001, 0.01, 1, moving from overfitting to underfitting.] The optimal λ balances approximation and generalization, bias and variance. (Creator: Malik Magdon-Ismail, slide 2/31.)

  3. Validation: A Sneak Peek at E_out. E_out(g) = E_in(g) + overfit penalty. The VC analysis bounds the overfit penalty with a complexity error bar based on Ω(H); regularization estimates it through a heuristic complexity penalty Ω(g). Validation goes directly for the jugular: it estimates the overfit penalty directly. An in-sample estimate of E_out is the Holy Grail of learning from data.


  6. The Test Set. E_test is an estimate of E_out(g). Besides the data D (N points), we have a test set D_test of K fresh points. For each test point, e_k = e(g(x_k), y_k), and E_test = (1/K) Σ_{k=1..K} e_k. Since E_{D_test}[e_k] = E_out(g), we get E[E_test] = (1/K) Σ_k E[e_k] = E_out(g): E_test is unbiased. Since e_1, ..., e_K are independent, Var[E_test] = (1/K²) Σ_k Var[e_k] = (1/K) Var[e], which decreases like 1/K. Bigger K ⇒ more reliable E_test.
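The unbiasedness and the 1/K variance can be checked numerically. The sketch below is illustrative, not from the lecture: it invents a target sin(πx) with Gaussian noise and a fixed hypothesis g(x) = 0.8x, estimates E_out(g) with one very large sample, and then measures the mean and variance of E_test over many independent test sets of size K.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented setup: target f(x) = sin(pi*x) on [-1, 1] with Gaussian noise,
# and a fixed, already-learned hypothesis g(x) = 0.8*x.
def sample(K):
    x = rng.uniform(-1, 1, K)
    return x, np.sin(np.pi * x) + rng.normal(0, 0.1, K)

def E_test(K):
    x, y = sample(K)
    return np.mean((0.8 * x - y) ** 2)      # E_test = (1/K) * sum of e_k

# Accurate proxy for E_out(g): one huge test set.
x, y = sample(1_000_000)
E_out = np.mean((0.8 * x - y) ** 2)

# Over many independent test sets: mean(E_test) ~ E_out for every K,
# while Var[E_test] shrinks like 1/K.
means, variances = {}, {}
for K in (10, 100, 1000):
    draws = np.array([E_test(K) for _ in range(2000)])
    means[K], variances[K] = draws.mean(), draws.var()
    print(K, means[K], variances[K])
```

For every K the mean of E_test stays near E_out, while the variance drops by roughly a factor of 10 each time K grows by 10.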

  7. The Validation Set. Split the data D (N points) into D_train (N−K training points) and D_val (K validation points). 1. Remove K points from D: D = D_train ∪ D_val. 2. Learn using D_train only: D_train → g⁻. 3. Test g⁻ on D_val: e_k = e(g⁻(x_k), y_k), and E_val = (1/K) Σ_{k=1..K} e_k. 4. Use E_val to estimate E_out(g⁻).

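The four steps translate directly into code. A minimal sketch, with an invented dataset (noisy quadratic target) and a least-squares line as the learning algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

N, K = 100, 20                                 # N data points, K held out
x = rng.uniform(-1, 1, N)
y = x ** 2 + rng.normal(0, 0.1, N)             # invented noisy target

# 1. Remove K points from D: D = D_train ∪ D_val.
perm = rng.permutation(N)
train, val = perm[K:], perm[:K]

# 2. Learn using D_train only -> g⁻ (here, a least-squares line).
A = np.column_stack([np.ones(N - K), x[train]])
w0, w1 = np.linalg.lstsq(A, y[train], rcond=None)[0]

# 3. Test g⁻ on D_val: e_k = e(g⁻(x_k), y_k), and E_val is their mean.
e = (w0 + w1 * x[val] - y[val]) ** 2
E_val = e.mean()

# 4. Use E_val as the estimate of E_out(g⁻).
print(f"E_val = {E_val:.3f}")
```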

  9. The Validation Set. E_val is an estimate of E_out(g⁻): E_{D_val}[e_k] = E_out(g⁻), so E[E_val] = (1/K) Σ_k E[e_k] = E_out(g⁻). The e_1, ..., e_K are independent, so Var[E_val] = (1/K²) Σ_k Var[e_k] = (1/K) Var[e(g⁻)], which decreases like 1/K. But the error bar now depends on g⁻, not on H, and g⁻ itself depends on K: a larger K leaves fewer training points, so g⁻ gets worse. Bigger K ⇒ more reliable E_val? Not necessarily.

  10. Choosing K. [Figure: expected E_val versus the size of the validation set, K = 10, 20, 30.] Rule of thumb: K* = N/5.
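As code, the rule of thumb is just a split helper (a trivial convenience function, not something from the lecture):

```python
def split_sizes(N):
    """K* = N/5 rule of thumb: hold out ~20% of the N points for validation."""
    K = N // 5
    return N - K, K

print(split_sizes(100))   # -> (80, 20): 80 training points, 20 validation points
```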

  11. Restoring D. Primary goal: output the best hypothesis. We output g, trained on all the data; g⁻ (trained on D_train only) stays behind closed doors. Secondary goal: estimate E_out. Two candidates: E_in(g), which estimates E_out(g), and E_val(g⁻), which estimates E_out(g⁻). Which should we quote to the customer?

  12. E_val Versus E_in. E_in is biased, and its error bar depends on H: E_out(g) ≤ E_in(g) + O(√((d_vc/N) log N)). E_val is unbiased, and its error bar depends only on g⁻: E_out(g) ≤ E_out(g⁻) ≤ E_val(g⁻) + O(1/√K). (The first inequality holds because the learning curve is decreasing — a practical truth, not a theorem.) E_val(g⁻) usually wins as an estimate for E_out(g), especially when the learning curve is not steep.
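The claim can be probed with a small simulation (the setup is invented here: noisy quadratic target, least-squares line as the model). Over many datasets, E_in(g) sits systematically below E_out(g), while E_val(g⁻) tracks it without systematic optimism:

```python
import numpy as np

rng = np.random.default_rng(2)

def make_data(n):
    x = rng.uniform(-1, 1, n)
    return x, x ** 2 + rng.normal(0, 0.1, n)   # invented noisy target

def fit_line(x, y):                            # least-squares line
    return np.linalg.lstsq(np.column_stack([np.ones(len(x)), x]), y,
                           rcond=None)[0]

def mse(w, x, y):
    return np.mean((w[0] + w[1] * x - y) ** 2)

N, K, trials = 50, 10, 500
x_big, y_big = make_data(200_000)              # proxy for E_out
E_in = E_val = E_out = 0.0
for _ in range(trials):
    x, y = make_data(N)
    g = fit_line(x, y)                         # trained on all N points
    g_minus = fit_line(x[K:], y[K:])           # g⁻: trained on N - K points
    E_in += mse(g, x, y) / trials              # optimistic for E_out(g)
    E_val += mse(g_minus, x[:K], y[:K]) / trials   # unbiased for E_out(g⁻)
    E_out += mse(g, x_big, y_big) / trials

print(f"E_in = {E_in:.4f}  E_val = {E_val:.4f}  E_out = {E_out:.4f}")
```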

  13. Model Selection. The most important use of validation. Given candidate models H_1, H_2, H_3, ..., H_M: train each on D_train to get g⁻_1, g⁻_2, ..., g⁻_M, and evaluate each on D_val.

  14. Validation Estimate for (H_1, g⁻_1). The most important use of validation. Train H_1 on D_train to get g⁻_1; evaluate g⁻_1 on D_val to get E_val(g⁻_1).

  15. Validation Estimate for (H_1, g⁻_1). Call the validation estimate E_1: E_1 = E_val(g⁻_1).

  16. Compute Validation Estimates for All Models. For each model m = 1, ..., M: D_train → g⁻_m, then D_val → E_m = E_val(g⁻_m), giving E_1, E_2, E_3, ..., E_M.

  17. Pick the Best Model According to Validation Error. Select m* = argmin_m E_m, i.e. the model H_{m*} whose g⁻_{m*} has the smallest validation error.
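The whole selection pipeline in miniature (everything here is invented for illustration: the candidate models H_m are polynomial fits of increasing degree, the data a noisy sinusoid):

```python
import numpy as np

rng = np.random.default_rng(3)

N, K = 60, 12
x = rng.uniform(-1, 1, N)
y = np.sin(np.pi * x) + rng.normal(0, 0.2, N)  # invented noisy target

x_tr, y_tr = x[K:], y[K:]                      # D_train (N - K points)
x_v,  y_v  = x[:K], y[:K]                      # D_val   (K points)

degrees = [1, 2, 3, 5, 8, 12]                  # the M candidate models H_m
E = []
for d in degrees:
    w = np.polyfit(x_tr, y_tr, d)              # g⁻_m: trained on D_train only
    E.append(np.mean((np.polyval(w, x_v) - y_v) ** 2))  # E_m = E_val(g⁻_m)

m_star = int(np.argmin(E))                     # pick smallest validation error
print("chosen model: degree", degrees[m_star], f"with E_val = {E[m_star]:.3f}")
```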

  18. E_val(g⁻_{m*}) Is Not Unbiased for E_out(g⁻_{m*}) ... because we chose one of the M finalists. [Figure: expected error versus validation set size K = 5, 15, 25; the curve for E_out(g⁻_{m*}) lies above the curve for E_val(g⁻_{m*}).] The VC error bar for selecting one hypothesis from M using a data set of size K gives E_out(g⁻_{m*}) ≤ E_val(g⁻_{m*}) + O(√(ln M / K)).
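The optimism from picking the minimum shows up in a toy calculation (invented here, not from the lecture): take M hypotheses that all have true error exactly 0.5 on binary labels, so each validation error is Binomial(K, 0.5)/K. The minimum of M such draws sits well below 0.5, by an amount on the order of √(ln M / K):

```python
import numpy as np

rng = np.random.default_rng(4)

# M hypotheses, each with true error 0.5; K validation points for each.
M, K, trials = 25, 40, 4000
E_val_min = rng.binomial(K, 0.5, size=(trials, M)).min(axis=1) / K

bias = 0.5 - E_val_min.mean()                  # optimism from selection
print(f"E[min_m E_val(g_m)] = {E_val_min.mean():.3f}  (every E_out = 0.5)")
print(f"optimistic bias = {bias:.3f}; sqrt(ln M / K) = {np.sqrt(np.log(M) / K):.3f}")
```

Even though no hypothesis is better than random, the selected validation error looks substantially better than 0.5.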

  19. Restoring D. Restore D and retrain: for each model, H_m trained on all of D gives g_m; output g_{m*}. The model with the best g⁻ also has the best g ← a leap of faith. We can find the model with the best g⁻ using validation ← true modulo the E_val error bar.
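A sketch of this final step (models and data invented for illustration: polynomial fits to a noisy sinusoid). Validation picks the model using g⁻; the winner is then retrained on all of D before being delivered:

```python
import numpy as np

rng = np.random.default_rng(5)

N, K = 60, 12
x = rng.uniform(-1, 1, N)
y = np.sin(np.pi * x) + rng.normal(0, 0.2, N)  # invented noisy target

# Select the model with validation (g⁻_m trained on N - K points only)...
degrees = [1, 3, 8]
E = [np.mean((np.polyval(np.polyfit(x[K:], y[K:], d), x[:K]) - y[:K]) ** 2)
     for d in degrees]
d_star = degrees[int(np.argmin(E))]            # chosen model H_{m*}

# ...then restore D: retrain the chosen model on ALL N points -> g_{m*}.
w_final = np.polyfit(x, y, d_star)
print(f"deliver g_m*: degree-{d_star} fit trained on all {N} points")
```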
