  1. Experimental Setup, Multi-class vs. Multi-label classification, and Evaluation CMSC 678 UMBC

  2. Central Question: How Well Are We Doing?
     The task: what kind of problem are you solving?
     • Classification: Precision, Recall, F1; Accuracy; Log-loss; ROC-AUC; …
     • Regression: (Root) Mean Square Error; Mean Absolute Error; …
     • Clustering: Mutual Information; V-score; …

  3. Central Question: How Well Are We Doing?
     The task: what kind of problem are you solving?
     • Classification: Precision, Recall, F1; Accuracy; Log-loss; ROC-AUC; …
     • Regression: (Root) Mean Square Error; Mean Absolute Error; …
     • Clustering: Mutual Information; V-score; …
     Note: the evaluation metric does not have to be the same thing as the loss function you optimize.
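As a concrete instance of the classification metrics listed above, here is a minimal sketch of precision, recall, and F1 on made-up binary labels (the data and function name are illustrative, not from the course):

```python
# Toy precision/recall/F1 computation for binary labels (1 = positive).

def precision_recall_f1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# tp=2, fp=1, fn=1 -> precision = recall = f1 = 2/3
p, r, f = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```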

  4. Outline
     • Experimental Design: Rule 1
     • Multi-class vs. Multi-label Classification
     • Evaluation: Regression Metrics; Classification Metrics

  5. Experimenting with Machine Learning Models
     All your data is split into: Training Data | Dev Data | Test Data

  6. Rule #1

  7. Experimenting with Machine Learning Models
     What is “correct?” What is working “well?”
     • Training Data: learn model parameters from the training set
     • Dev Data: set hyperparameters
     • Test Data

  8. Experimenting with Machine Learning Models
     What is “correct?” What is working “well?”
     • Training Data: learn model parameters from the training set
     • Dev Data: set hyperparameters; evaluate the learned model on dev with that hyperparameter setting
     • Test Data

  9. Experimenting with Machine Learning Models
     What is “correct?” What is working “well?”
     • Training Data: learn model parameters from the training set
     • Dev Data: set hyperparameters; evaluate the learned model on dev with that hyperparameter setting
     • Test Data: perform the final evaluation on test, using the hyperparameters that optimized dev performance and retraining the model

  10. Experimenting with Machine Learning Models
      What is “correct?” What is working “well?”
      • Training Data: learn model parameters from the training set
      • Dev Data: set hyperparameters; evaluate the learned model on dev with that hyperparameter setting
      • Test Data: perform the final evaluation on test, using the hyperparameters that optimized dev performance and retraining the model
      Rule 1: DO NOT ITERATE ON THE TEST DATA
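The train/dev/test protocol above can be sketched as follows. The "model" is a toy threshold classifier and the data are made up for illustration; they are not from the course:

```python
# Rule-1 protocol sketch: fit on train, tune on dev, touch test exactly once.

def fit(train, threshold):
    # "Learning" is trivial here; a real model would fit its parameters
    # on the training set. threshold plays the role of a hyperparameter.
    return lambda x: 1 if x > threshold else 0

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

train = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
dev   = [(0.2, 0), (0.7, 1)]
test  = [(0.3, 0), (0.8, 1)]

# Set hyperparameters: pick the threshold that maximizes dev accuracy.
best_t = max([0.35, 0.5, 0.75],
             key=lambda t: accuracy(fit(train, t), dev))

# Final evaluation on test, once, with the dev-chosen hyperparameter.
final = accuracy(fit(train, best_t), test)
```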

  11. On-board Exercise Produce dev and test tables for a linear regression model with learned weights and set/fixed (non-learned) bias

  12. Outline
      • Experimental Design: Rule 1
      • Multi-class vs. Multi-label Classification
      • Evaluation: Regression Metrics; Classification Metrics

  13. Multi-class Classification
      Given input x, predict a discrete label y.

  14. Multi-class Classification
      Given input x, predict a discrete label y.
      • If y ∈ {0, 1} (or y ∈ {True, False}), then it is a binary classification task

  15. Multi-class Classification
      Given input x, predict a discrete label y.
      • If y ∈ {0, 1} (or y ∈ {True, False}), then it is a binary classification task
      • If y ∈ {0, 1, …, K−1} (for finite K), then it is a multi-class classification task
      Q: What are some examples of multi-class classification?

  16. Multi-class Classification
      Given input x, predict a discrete label y.
      • If y ∈ {0, 1} (or y ∈ {True, False}), then it is a binary classification task
      • If y ∈ {0, 1, …, K−1} (for finite K), then it is a multi-class classification task
      Q: What are some examples of multi-class classification?
      A: Many possibilities. See A2, Q{1,2,4-7}

  17. Multi-class Classification
      Given input x, predict a discrete label y.
      • Single output: if y ∈ {0, 1} (or y ∈ {True, False}), then a binary classification task; if y ∈ {0, 1, …, K−1} (for finite K), then a multi-class classification task
      • Multi-output: if multiple y_m are predicted, then a multi-label classification task

  18. Multi-class Classification
      Given input x, predict a discrete label y.
      • Single output: if y ∈ {0, 1} (or y ∈ {True, False}), then a binary classification task; if y ∈ {0, 1, …, K−1} (for finite K), then a multi-class classification task
      • Multi-output: if multiple y_m are predicted, then a multi-label classification task
      Multi-label Classification
      Given input x, predict multiple discrete labels y = (y_1, …, y_M).

  19. Multi-class Classification
      Given input x, predict a discrete label y.
      • Single output: if y ∈ {0, 1} (or y ∈ {True, False}), then a binary classification task; if y ∈ {0, 1, …, K−1} (for finite K), then a multi-class classification task
      • Multi-output: if multiple y_m are predicted, then a multi-label classification task
      Multi-label Classification
      Given input x, predict multiple discrete labels y = (y_1, …, y_M).
      Each y_m could be binary or multi-class.
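To make the single-output vs. multi-output distinction concrete, here are made-up labels for the two settings (not data from the course):

```python
# Multi-class, single output: exactly one label out of K classes per input.
K = 3
y_multiclass = 2              # y is a single value in {0, 1, ..., K-1}

# Multi-label, multi-output: a vector of M labels per input (each y_m is
# binary here, though each could itself be multi-class).
M = 4
y_multilabel = [1, 0, 0, 1]   # y = (y_1, ..., y_M)
```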

  20. Multi-Label Classification…
      • Will not be a primary focus of this course
      • Many of the single-output classification methods apply to multi-label classification
      • Predicting “in the wild” can be trickier
      • Evaluation can be trickier

  21. We’ve only developed binary classifiers so far…
      • Option 1: Develop a multi-class version
      • Option 2: Build a one-vs-all (OvA) classifier
      • Option 3: Build an all-vs-all (AvA) classifier
      (there can be others)

  22. We’ve only developed binary classifiers so far…
      Option 1: Develop a multi-class version.
      The loss function may (or may not) need to be extended, and the model structure may need to change (in big or small ways).

  23. We’ve only developed binary classifiers so far…
      Option 1: Develop a multi-class version.
      The loss function may (or may not) need to be extended, and the model structure may need to change (in big or small ways).
      Common change: instead of a single weight vector w, keep a weight vector w^(c) for each class c, and compute class-specific scores, e.g.,
      ŷ^(c) = (w^(c))^T x + b^(c)
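The per-class scoring above can be sketched as follows; the weights, bias values, and input are made up for illustration:

```python
# One weight vector w_c and bias b_c per class; predict the argmax score.

def class_scores(W, b, x):
    # yhat_c = w_c^T x + b_c, for each class c
    return [sum(wi * xi for wi, xi in zip(w_c, x)) + b_c
            for w_c, b_c in zip(W, b)]

W = [[1.0, 0.0],    # w for class 0
     [0.0, 1.0],    # w for class 1
     [-1.0, -1.0]]  # w for class 2
b = [0.0, 0.0, 0.5]
x = [1.0, 2.0]

s = class_scores(W, b, x)                      # [1.0, 2.0, -2.5]
pred = max(range(len(s)), key=lambda c: s[c])  # argmax -> class 1
```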

  24. Multi-class Option 1: Linear Regression/Perceptron
      [diagram: input x, weights w, output y]
      y = w^T x + b
      output: if y > 0: class 1, else: class 2

  25. Multi-class Option 1: Linear Regression/Perceptron: A Per-Class View
      [diagram: input x feeds per-class weight vectors w_1 and w_2, producing scores y_1 and y_2]
      y_1 = w_1^T x + b_1
      y_2 = w_2^T x + b_2
      output: i = argmax { y_1, y_2 }; predict class i
      The binary version (y = w^T x + b; if y > 0: class 1, else: class 2) is a special case.

  26. Multi-class Option 1: Linear Regression/Perceptron: A Per-Class View (alternative)
      Concatenate the per-class weight vectors into one long vector and zero-pad the input:
      y_1 = [w_1; w_2]^T [x; 0] + b_1
      y_2 = [w_1; w_2]^T [0; x] + b_2
      output: i = argmax { y_1, y_2 }; predict class i
      Q: (For discussion) Why does this work?
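A numeric check of why the concatenation view works: zero-padding the input means the "other" class's weights contribute nothing to each score. The vectors below are made up for illustration:

```python
# Stacking all class weights into one long vector and zero-padding the
# input recovers each per-class score exactly.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

w1, w2 = [1.0, -2.0], [0.5, 3.0]
b1, b2 = 0.1, -0.2
x = [2.0, 1.0]
zero = [0.0, 0.0]

w_all = w1 + w2                 # [w1; w2] (list concatenation)
y1 = dot(w_all, x + zero) + b1  # = w1^T x + b1: w2 only sees zeros
y2 = dot(w_all, zero + x) + b2  # = w2^T x + b2: w1 only sees zeros

# Same answers as the per-class view:
assert y1 == dot(w1, x) + b1 and y2 == dot(w2, x) + b2
```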

  27. We’ve only developed binary classifiers so far…
      Option 2: Build a one-vs-all (OvA) classifier. With C classes:
      • Train C different binary classifiers f_c(x)
      • f_c(x) predicts 1 if x is likely class c, 0 otherwise

  28. We’ve only developed binary classifiers so far…
      Option 2: Build a one-vs-all (OvA) classifier. With C classes:
      • Train C different binary classifiers f_c(x)
      • f_c(x) predicts 1 if x is likely class c, 0 otherwise
      • To test/predict a new instance z: get scores s_c = f_c(z), then output the class with the max score, ŷ = argmax_c s_c
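The OvA train/predict recipe above, sketched with toy per-class scorers (the scorer functions are made up; real f_c would be trained binary classifiers):

```python
# One-vs-all: C binary scorers f_c; predict the argmax over their scores.

def make_ova(classifiers):
    # classifiers: list of C functions f_c(z) -> score for "is class c"
    def predict(z):
        s = [f(z) for f in classifiers]           # s_c = f_c(z)
        return max(range(len(s)), key=lambda c: s[c])
    return predict

fs = [lambda z: 1.0 - z,        # toy scorer for class 0 (likes small z)
      lambda z: -abs(z - 2.0),  # toy scorer for class 1 (likes z near 2)
      lambda z: z - 4.0]        # toy scorer for class 2 (likes large z)

predict = make_ova(fs)
```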

  29. We’ve only developed binary classifiers so far…
      Option 3: Build an all-vs-all (AvA) classifier. With C classes:
      • Train C-choose-2 = C(C−1)/2 different binary classifiers f_{c1,c2}(x)

  30. We’ve only developed binary classifiers so far…
      Option 3: Build an all-vs-all (AvA) classifier. With C classes:
      • Train C-choose-2 = C(C−1)/2 different binary classifiers f_{c1,c2}(x)
      • f_{c1,c2}(x) predicts 1 if x is likely class c1, 0 otherwise (i.e., likely class c2)

  31. We’ve only developed binary classifiers so far…
      Option 3: Build an all-vs-all (AvA) classifier. With C classes:
      • Train C-choose-2 = C(C−1)/2 different binary classifiers f_{c1,c2}(x)
      • f_{c1,c2}(x) predicts 1 if x is likely class c1, 0 otherwise (i.e., likely class c2)
      • To test/predict a new instance z: get scores or predictions s_{c1,c2} = f_{c1,c2}(z)
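An AvA sketch, assuming the standard majority-vote rule for combining the pairwise predictions (the slide stops at collecting the scores, so the voting step is an assumption here); the pairwise decider is a toy function:

```python
# All-vs-all: one binary classifier per unordered class pair (c1, c2);
# each votes for its winner, and the class with the most votes wins.
from itertools import combinations

def make_ava(C, decide):
    # decide(c1, c2, z) -> 1 if z looks like c1, 0 if it looks like c2
    pairs = list(combinations(range(C), 2))  # C-choose-2 classifiers
    def predict(z):
        votes = [0] * C
        for c1, c2 in pairs:
            winner = c1 if decide(c1, c2, z) == 1 else c2
            votes[winner] += 1
        return max(range(C), key=lambda c: votes[c])
    return predict

# Toy decider: class c "wants" z near c; pick whichever center is closer.
predict = make_ava(3, lambda c1, c2, z: 1 if abs(z - c1) < abs(z - c2) else 0)
```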
