Week 2 Video 4 Metrics for Regressors

Metrics for Regressors
- Linear Correlation
- MAE/RMSE
- Information Criteria

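Linear correlation (Pearson’s correlation)
- r(A,B) = cov(A,B) / (σ_A σ_B), the covariance of A and B divided by the product of their standard deviations
- When A’s value changes, does B change in the same direction?
- Assumes a linear relationship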
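As a rough illustration of the definition above, here is a minimal sketch of Pearson's correlation in plain Python. The function name `pearson_r` and the `hours`/`score` data are hypothetical, made up only to show the sign and magnitude of r.

```python
import math

def pearson_r(a, b):
    """Covariance of a and b divided by the product of their standard deviations."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # The 1/n factors cancel between numerator and denominator, so sums suffice.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical data: hours of practice and a test score.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]

r_up = pearson_r(hours, score)                   # close to +1: B rises with A
r_down = pearson_r(hours, [-s for s in score])   # same magnitude, opposite sign
print(round(r_up, 3), round(r_down, 3))
```

Flipping the sign of one variable flips the sign of r but not its magnitude, which matches the "same direction" reading of the measure.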

What is a “good correlation”?
- 1.0 – perfect
- 0.0 – none
- -1.0 – perfectly negatively correlated
- In between – depends on the field
  - In physics, a correlation of 0.8 is weak!
  - In education, a correlation of 0.3 is good

Why are small correlations OK in education?
- Lots and lots of factors contribute to just about any dependent measure

Examples of correlation values
- Figure from Denis Boigelot, available on Wikipedia

Same correlation, different functions
- Anscombe’s Quartet

r²
- The correlation, squared
- Also a measure of what percentage of variance in the dependent measure is explained by a model
- If you are predicting A with B, C, D, E
  - r² is often used as the measure of model goodness rather than r (depends on the community)
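The "variance explained" reading can be checked numerically. The sketch below assumes a one-predictor least-squares line (an assumption made for illustration; the slide does not specify the model) and compares 1 - SS_residual / SS_total against the squared Pearson correlation. The data and function names are hypothetical.

```python
import math

def pearson_r(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def variance_explained(x, y):
    """Fit y = intercept + slope * x by least squares; return 1 - SS_res / SS_tot."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.5, 6.9]

r_squared = pearson_r(x, y) ** 2
explained = variance_explained(x, y)
print(round(r_squared, 4), round(explained, 4))  # the two values agree
```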

Spearman’s Correlation (ρ)
- Rank correlation
- Turn each variable into ranks
- 1 = highest value, 2 = 2nd highest value, 3 = 3rd highest value, and so on
- Then compute Pearson’s correlation on the ranks
- (There’s actually an easier formula, but not relevant here)

Spearman’s Correlation (ρ)
- Interpreted exactly the same way as Pearson’s correlation
- 1.0 – perfect
- 0.0 – none
- -1.0 – perfectly negatively correlated

Why use Spearman’s Correlation (ρ)?
- More robust to outliers
- Determines how monotonic a relationship is, not how linear it is
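The rank-then-correlate recipe can be sketched as follows. This is a minimal illustration, not a reference implementation: the helper names are hypothetical, and giving tied values the average of their ranks is a common convention assumed here rather than something the slides specify.

```python
import math

def pearson_r(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def ranks(values):
    """Rank values so that 1 = highest; tied values share the average rank (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(a, b):
    """Pearson's correlation computed on the ranks."""
    return pearson_r(ranks(a), ranks(b))

# A monotonic but non-linear relationship (hypothetical data).
x = [1, 2, 3, 4, 5, 6]
y = [xi ** 3 for xi in x]

rho = spearman_rho(x, y)  # perfectly monotonic
r = pearson_r(x, y)       # not perfectly linear
print(round(rho, 3), round(r, 3))
```

Because y always increases with x, ρ reaches 1 even though Pearson's r falls short of it.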

RMSE/MAE

Mean Absolute Error (MAE)
- The average of the absolute value of (actual value minus predicted value)

Root Mean Squared Error (RMSE)
- The square root of the average of (actual value minus predicted value)²

MAE vs. RMSE
- MAE tells you the average amount by which the predictions deviate from the actual values
  - Very interpretable
- RMSE can be interpreted the same way (mostly), but penalizes large deviations more than small deviations
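The two definitions, and the way RMSE penalizes large deviations, can be sketched in a few lines. The function names and the toy data are hypothetical, chosen so that both sets of predictions have the same MAE.

```python
import math

def mae(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Square root of the average squared difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [10, 10, 10, 10]
even_errors = [12, 8, 12, 8]       # every prediction is off by 2
one_big_error = [10, 10, 10, 18]   # three exact predictions, one off by 8

print(mae(actual, even_errors), rmse(actual, even_errors))      # 2.0 2.0
print(mae(actual, one_big_error), rmse(actual, one_big_error))  # 2.0 4.0
```

Both prediction sets have the same total absolute error, so MAE cannot tell them apart; RMSE doubles for the set whose error is concentrated in one large miss.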

However
- RMSE is largely preferred to MAE
- The example to follow is courtesy of Radek Pelanek, Masaryk University

Radek’s Example
- Take a student who makes correct responses 70% of the time
- And two models
  - Model A predicts 70% correctness
  - Model B predicts 100% correctness

In other words
- 70% of the time the student gets it right
  - Response = 1
- 30% of the time the student gets it wrong
  - Response = 0
- Model A Prediction = 0.7
- Model B Prediction = 1.0
- Which of these seems more reasonable?

MAE
- 70% of the time the student gets it right
  - Response = 1
  - Model A (0.7) Absolute Error = 0.3
  - Model B (1.0) Absolute Error = 0
- 30% of the time the student gets it wrong
  - Response = 0
  - Model A (0.7) Absolute Error = 0.7
  - Model B (1.0) Absolute Error = 1

MAE
- Model A
  - (0.7)(0.3) + (0.3)(0.7)
  - 0.21 + 0.21
  - 0.42
- Model B
  - (0.7)(0) + (0.3)(1)
  - 0 + 0.3
  - 0.3
- Model B is better, according to MAE
- Do you believe it?

RMSE
- 70% of the time the student gets it right
  - Response = 1
  - Model A (0.7) Squared Error = 0.09
  - Model B (1.0) Squared Error = 0
- 30% of the time the student gets it wrong
  - Response = 0
  - Model A (0.7) Squared Error = 0.49
  - Model B (1.0) Squared Error = 1

RMSE
- Model A
  - Mean squared error: (0.7)(0.09) + (0.3)(0.49)
  - 0.063 + 0.147
  - 0.21
  - RMSE = √0.21 ≈ 0.46
- Model B
  - Mean squared error: (0.7)(0) + (0.3)(1)
  - 0 + 0.3
  - 0.3
  - RMSE = √0.3 ≈ 0.55
- Model A is better, according to RMSE
- Does this seem more reasonable?
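Radek's example can be reproduced numerically. The sketch below assumes a run of 10 responses (7 correct, 3 incorrect) standing in for the 70% student; the list length and variable names are illustrative choices, not part of the original example.

```python
import math

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# 7 correct responses (1) and 3 incorrect responses (0): 70% correct.
responses = [1] * 7 + [0] * 3
model_a = [0.7] * len(responses)  # always predicts 70% correctness
model_b = [1.0] * len(responses)  # always predicts 100% correctness

mae_a, mae_b = mae(responses, model_a), mae(responses, model_b)
rmse_a, rmse_b = rmse(responses, model_a), rmse(responses, model_b)

print(round(mae_a, 2), round(mae_b, 2))    # MAE favours Model B
print(round(rmse_a, 2), round(rmse_b, 2))  # RMSE favours Model A
```

The mean squared errors are 0.21 and 0.3, so taking the square root gives roughly 0.46 and 0.55; the ordering of the two models is the same before and after the square root.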

Note
- Low RMSE is good
- High Correlation is good

What does it mean?
- Low RMSE/MAE, High Correlation = Good model
- High RMSE/MAE, Low Correlation = Bad model

What does it mean?
- High RMSE/MAE, High Correlation = Model goes in the right direction, but is systematically biased
  - A model that says that adults are taller than children
  - But that adults are 8 feet tall, and children are 6 feet tall

What does it mean?
- Low RMSE/MAE, Low Correlation = Model values are in the right range, but model doesn’t capture relative change
  - Particularly common if there’s not much variation in data
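The two mixed cases can be illustrated with small made-up data sets. All heights below are hypothetical values in feet, chosen only to mimic the "8-foot adults, 6-foot children" model and a low-variation data set.

```python
import math

def pearson_r(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Case 1: right direction, systematically biased.
heights = [4.0, 4.2, 3.9, 5.8, 5.6, 6.0]   # three children, three adults
biased = [6.0, 6.0, 6.0, 8.0, 8.0, 8.0]    # children "6 feet", adults "8 feet"
r_biased, rmse_biased = pearson_r(heights, biased), rmse(heights, biased)

# Case 2: right range, but relative change is not captured (little variation in the data).
flat_actual = [5.0, 5.1, 4.9, 5.0, 5.1, 4.9]
flat_pred = [5.0, 4.9, 5.1, 5.1, 5.0, 4.9]
r_flat, rmse_flat = pearson_r(flat_actual, flat_pred), rmse(flat_actual, flat_pred)

print(round(r_biased, 2), round(rmse_biased, 2))  # high correlation, high RMSE
print(round(r_flat, 2), round(rmse_flat, 2))      # low correlation, low RMSE
```

Reporting both kinds of metric is what exposes each failure: correlation alone would pass the biased model, and RMSE alone would pass the flat one.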

Information Criteria

BiC
- Bayesian Information Criterion (Raftery, 1995)
- Makes trade-off between goodness of fit and flexibility of fit (number of parameters)
- Formula for linear regression
  - BiC’ = n log(1 - r²) + p log n
- n is number of students, p is number of variables

BiC’
- Values over 0: worse than expected given number of variables
- Values under 0: better than expected given number of variables
- Can be used to understand significance of difference between models (Raftery, 1995)
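The linear-regression formula BiC’ = n log(1 - r²) + p log n is easy to evaluate directly. The sketch below assumes natural logarithms, and the function name `bic_prime` and the example numbers (100 students, 3 variables) are hypothetical.

```python
import math

def bic_prime(n, r_squared, p):
    """BiC' = n * log(1 - r^2) + p * log(n), using natural logs (assumed)."""
    return n * math.log(1 - r_squared) + p * math.log(n)

# Hypothetical models fit to n = 100 students with p = 3 variables.
strong_fit = bic_prime(n=100, r_squared=0.30, p=3)
weak_fit = bic_prime(n=100, r_squared=0.05, p=3)

print(round(strong_fit, 2))  # below 0: better than expected for 3 variables
print(round(weak_fit, 2))    # above 0: worse than expected for 3 variables
```

The first term rewards fit (it becomes more negative as r² grows) and the second term charges log n for each variable, so a model only scores below 0 when its fit outweighs its parameter count.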

BiC
- Said to be statistically equivalent to k-fold cross-validation for optimal k
- The derivation is… somewhat complex
- BiC is easier to compute than cross-validation, but different formulas must be used for different modeling frameworks
  - No BiC formula available for many modeling frameworks

AIC
- Alternative to BiC
- Stands for
  - An Information Criterion (Akaike, 1971)
  - Akaike’s Information Criterion (Akaike, 1974)
- Makes slightly different trade-off between goodness of fit and flexibility of fit (number of parameters)

AIC
- Said to be statistically equivalent to leave-one-out cross-validation

AIC or BIC: Which one should you use?
- <shrug>

All the metrics: Which one should you use?
- “The idea of looking for a single best measure to choose between classifiers is wrongheaded.” – Powers (2012)

Next Lecture
- Cross-validation and over-fitting
