

  1. Empirical Confidence Models for Supervised Machine Learning
     Margarita Castro¹, Meinolf Sellmann², Zhaoyuan Yang², Nurali Virani²
     ¹ University of Toronto, Mechanical and Industrial Engineering
     ² General Electric, Global Research Center
     May 2020

  2. Motivation
     ML in high-stakes contexts: self-driving cars, cyber security, healthcare diagnosis.
     Main issues:
     - We can't expect the models to be perfect.
     - Summary statistics (e.g., accuracy) can be misleading for assessing a specific prediction.

  3. Empirical Confidence for Regression
     We propose a model that can declare its own incompetence: a regression model
     paired with a competence assessor.
     "We develop techniques that learn when models generated by certain learning
     techniques on a particular data set can be expected to perform well, and when not."
     At run time, an instance y yields both a prediction z′ and a competence
     level C: Trusted, Cautioned, or Not Trusted.

  4. Outline of the Talk
     Part 1: Competence Assessor
     - Overall framework.
     - Meta-features.
     - Meta training data.
     Part 2: Numerical Evaluation
     - Experimental setting.
     - Results.
     - Conclusions.

  5. PART 1: Empirical Competence Assessor

  6. Competence Assessor Pipeline
     Two components are trained on the same training set (Y, Z):
     1. The primary technique: a regressor model (e.g., random forest).
     2. The competence assessor, trained on meta-features derived from (Y, Z).
     At run time, an input y goes to the regressor, which returns the prediction z′,
     and to the meta-feature builder, whose output the competence assessor maps to a
     competence level C. A minimal sketch of this run-time path follows.
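A minimal Python sketch of that run-time path, assuming scikit-learn-style fitted models; the function name `predict_with_competence` is a hypothetical stand-in, and `meta_features` is the builder sketched on slides 7-9:

```python
# Sketch only: `regressor` and `assessor` are assumed already fitted,
# and `meta_features` is defined on slides 8-9.
def predict_with_competence(y, regressor, assessor, Y_train, Z_train):
    z_prime = regressor.predict([y])[0]                # primary prediction z'
    m = meta_features(y, Y_train, Z_train, regressor)  # the six meta-features
    level = assessor.predict([m])[0]  # "Trusted", "Cautioned", "Not Trusted"
    return z_prime, level
```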

  7. Meta Feature Builder
     Relates the run-time input to the training data and to the regressor technique,
     where the regressor g maps an input y to the prediction g(y) = z′:
     - Different distance measures $d: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$,
       depending on the regressor technique.
     - A neighborhood N(y) based on the distance measure d(·,·).
     - We consider the k nearest neighbors, with k = 5.
     From the training set (Y, Z), the builder outputs six meta-features.
     (A neighborhood sketch follows.)
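A small sketch of the neighborhood computation, assuming a plain Euclidean distance for illustration (the talk itself varies the distance measure with the regressor technique):

```python
import numpy as np

def neighborhood(y, Y_train, k=5):
    """Indices and distances of the k training inputs nearest to y.

    Euclidean distance is an assumption here; the deck uses
    regressor-specific distance measures."""
    dists = np.linalg.norm(Y_train - y, axis=1)
    idx = np.argsort(dists)[:k]        # the k nearest neighbors N(y)
    return idx, dists[idx]
```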

  8. Our Six Meta-Features (1-3)
     1. Average distance to the neighborhood:
        $M_1(y) := \frac{1}{k} \sum_{(y_i, z_i) \in N(y)} d(y, y_i)$
     2. Average prediction distance:
        $M_2(y) := \frac{1}{k} \sum_{(y_i, z_i) \in N(y)} \lvert g(y) - g(y_i) \rvert$
     3. Deviation from the regressor's prediction, against a distance-weighted
        average of the neighbors' targets:
        $M_3(y) := \bigl\lvert g(y) - \frac{1}{s(y)} \sum_{(y_i, z_i) \in N(y)} \frac{z_i}{d(y, y_i)} \bigr\rvert$,
        where $s(y) := \sum_{(y_i, z_i) \in N(y)} \frac{1}{d(y, y_i)}$.
     - M1 measures how far the run-time input lies from the training data set.
     - M2 and M3 capture the relationship between predictions in the vicinity of the current input.

  9. Our Six Meta-Features (4-6)
     4. Average training error on N(y), weighted by distance:
        $M_4(y) := \frac{1}{s(y)} \sum_{(y_i, z_i) \in N(y)} \frac{\lvert g(y_i) - z_i \rvert}{d(y_i, y)}$
     5. Variance of the training error on N(y):
        $M_5(y) := \frac{1}{k - 1} \sum_{(y_i, z_i) \in N(y)} \bigl( \lvert g(y_i) - z_i \rvert - M_4(y) \bigr)^2$
     6. Target value variability on N(y):
        $M_6(y) := \frac{1}{k - 1} \sum_{(y_i, z_i) \in N(y)} (z_i - \bar{z})^2$, with $\bar{z}$ the mean target in N(y).
     - M4 and M5 capture the accuracy of the regressor in the immediate vicinity.
     - M6 is the variance of the true target values in N(y).
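A sketch of all six features in Python, under the same assumptions as above (Euclidean distance, inverse-distance weights for the s(y) normalization; the guard against zero distances is an added implementation detail):

```python
import numpy as np

def meta_features(y, Y_train, Z_train, g, k=5):
    """The six meta-features for run-time input y, as reconstructed
    from the formulas on slides 8-9. `g` is a fitted regressor."""
    d = np.linalg.norm(Y_train - y, axis=1)
    idx = np.argsort(d)[:k]                       # neighborhood N(y)
    d_n, Y_n, Z_n = d[idx], Y_train[idx], Z_train[idx]
    g_y, g_n = g.predict([y])[0], g.predict(Y_n)
    w = 1.0 / np.maximum(d_n, 1e-12)              # inverse-distance weights
    w /= w.sum()                                  # the s(y) normalization
    err = np.abs(g_n - Z_n)                       # training errors on N(y)

    M1 = d_n.mean()                               # 1. avg distance to N(y)
    M2 = np.abs(g_y - g_n).mean()                 # 2. avg prediction distance
    M3 = abs(g_y - (w * Z_n).sum())               # 3. deviation from weighted targets
    M4 = (w * err).sum()                          # 4. weighted avg training error
    M5 = ((err - M4) ** 2).sum() / (k - 1)        # 5. variance of training error
    M6 = Z_n.var(ddof=1)                          # 6. target variability in N(y)
    return np.array([M1, M2, M3, M4, M5, M6])
```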

  10. Training Data for the Competence Assessor
      A splitter divides the training set into base and validation parts. The base
      technique fits a regressor on the base part; on the validation part, the
      meta-feature builder produces the features Y′, and the regressor's errors
      yield the competence labels C. Together these form the training data for the
      competence assessor, as sketched below.
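A minimal sketch of that generation loop; `splits` (the splitter's output) and `fit_base` are hypothetical stand-ins, `meta_features` is from slides 8-9, and `competence_labels` is the labeling rule from slide 12:

```python
import numpy as np

def build_meta_training_data(Y, Z, splits, fit_base, k=5):
    """One (meta-features, label) pair per validation point, per split."""
    meta_X, meta_C = [], []
    for base_idx, val_idx in splits:
        g = fit_base(Y[base_idx], Z[base_idx])          # regressor on the base part
        residuals = g.predict(Y[val_idx]) - Z[val_idx]  # true errors on validation
        meta_C.extend(competence_labels(residuals))     # labels C (slide 12)
        meta_X.extend(meta_features(y, Y[base_idx], Z[base_idx], g, k)
                      for y in Y[val_idx])              # features (slides 8-9)
    return np.array(meta_X), np.array(meta_C)
```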

  11. Splitter Procedure
      Standard cross-validation:
      - Random splitting into h ∈ {3, 5, 10} buckets.
      - One bucket for validation, the rest as base.
      Projection splits:
      - Assess the i.i.d. assumption of the technique.
      - Create interpolation and extrapolation scenarios.
      - Project onto the 1st and 2nd PC dimensions and sort the base training
        data before splitting.
      (Diagram: training set split into base and validation over projected and
      sorted data. A projection-split sketch follows.)
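A sketch of one projection split along a chosen principal component; holding out one contiguous end of the sorted axis gives an extrapolation scenario (a middle block would give interpolation), and the 20% validation fraction is an assumption for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA

def projection_split(Y, pc=0, val_fraction=0.2):
    """Sort the data along one PC dimension, hold out one contiguous end."""
    proj = PCA(n_components=pc + 1).fit_transform(Y)[:, pc]  # chosen PC score
    order = np.argsort(proj)                 # sort data along the projection
    n_val = int(len(Y) * val_fraction)
    return order[:-n_val], order[-n_val:]    # base indices, validation indices
```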

  12. Training the Meta Model
      Classification label (C):
      - Based on the true error of the learned model.
      - Sort the absolute residual values in ascending order and set the labels as:
        - smallest 80% → Trusted
        - 80-95% → Cautioned
        - last 5% → Not Trusted
      - Note: the labeling can be modified for specific applications
        (a sketch of the rule follows).
      Training techniques:
      - Off-the-shelf SVM and random forest classifiers.
      - Our goal is to test the framework on several data sets.
      - Note: more sophisticated techniques can be used for specific applications.
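The labeling rule is simple enough to state directly; a minimal Python version using quantiles of the absolute residuals:

```python
import numpy as np

def competence_labels(residuals):
    """Smallest 80% of |residual| -> Trusted, next 15% -> Cautioned,
    last 5% -> Not Trusted, per slide 12."""
    abs_res = np.abs(residuals)
    q80, q95 = np.quantile(abs_res, [0.80, 0.95])
    return np.where(abs_res <= q80, "Trusted",
                    np.where(abs_res <= q95, "Cautioned", "Not Trusted"))
```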

  13. PART 2: Numerical Evaluation

  14. Experimental Setting
      Objective: evaluate our Empirical Confidence Model (ECM) over different scenarios.
      - Six UCI benchmark data sets.
      - Regressors: linear, random forest, and SVR (off-the-shelf).
      - Tasks: standard, interpolation, and extrapolation.
      Cross-validation tasks:
      - Standard cross-validation.
      - Interpolation and extrapolation:
        - Cluster the data and take complete clusters as the test set
          (sketched below).
        - PC projections (1st and 3rd).
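A sketch of the cluster-based hold-out; KMeans with five clusters is an assumption for illustration, not the deck's stated configuration:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_split(Y, n_clusters=5, held_out=0):
    """Hold out one complete cluster as the test set, so test inputs come
    from a region the regressor never saw during training (non-i.i.d.)."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(Y)
    train = np.where(labels != held_out)[0]
    test = np.where(labels == held_out)[0]   # one complete, unseen cluster
    return train, test
```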

  15. Proof-of-Concept Experiment
      Setting:
      - 1-dimensional data following a linear model with random noise.
      - Interpolation task.
      - Regressors: linear regression and random forest.
      (Figures: ECM predictions for the linear regression model and for the
      random forest model.)

  16. Evaluating ECM over the Airfoil Dataset
      (Figure: MSE per competence class: Trusted, Cautioned, Not Trusted.)
      MSE is larger for the Cautioned and Not Trusted classes.

  17. Evaluating the Effectiveness of the Pipeline
      Baseline: a competence assessor trained on the original data
      (only standard splitting and no meta-features).
      (Figure: MSE for the Trusted and Warned classes, ECM vs. baseline.)
      ECM has lower MSE for the Trusted class and higher MSE for the Warned class.

  18. Conclusions & Future Work
      - We present an Empirical Confidence Model (ECM) that assesses the
        reliability of a regression model's predictions.
      - We show the effectiveness of ECM for i.i.d. and non-i.i.d. train/test splits.
      - Future work:
        - Study other reliability measures as meta-features.
        - Integrate our methodology into an active learning setting.

  19. Thank You! Empirical Confidence Models for Supervised Machine Learning
