

  1. Model evaluation
     Practicing Machine Learning Interview Questions in R
     Rafael Falcon, Data Scientist at Shopify

  2. Q: What aspects need to be considered when you evaluate a Machine Learning model?
     1. Type of Machine Learning task: classification, regression, or clustering
     2. Carefully choose your performance metrics
     3. Get a realistic performance estimate: split the data into training/validation/test sets, or use cross-validation

  3. Classification: confusion matrix

  4. Classification: accuracy
     Accuracy: the proportion of correctly classified examples
         Accuracy = (TP + TN) / (TP + TN + FP + FN)
     Useful when errors in predicting all classes are equally important.
     Beware of class imbalance scenarios! Always predicting the most frequent class → high accuracy.
     Cost-sensitive accuracy, with costs c1 and c2 for false positives and false negatives:
         (TP + TN) / (TP + TN + c1*FP + c2*FN)
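     Both formulas are easy to verify from a confusion matrix in base R; a minimal sketch with made-up labels (the data and the costs c1, c2 are illustrative, not from the course):

         # Hypothetical predicted vs. actual labels for a binary classifier
         actual    <- factor(c("pos", "pos", "neg", "neg", "neg", "pos", "neg", "neg"))
         predicted <- factor(c("pos", "neg", "neg", "neg", "pos", "pos", "neg", "neg"))

         # Confusion matrix (rows = predicted, columns = actual)
         cm <- table(predicted, actual)

         TP <- cm["pos", "pos"]; TN <- cm["neg", "neg"]
         FP <- cm["pos", "neg"]; FN <- cm["neg", "pos"]

         # Plain accuracy
         accuracy <- (TP + TN) / (TP + TN + FP + FN)

         # Cost-sensitive accuracy: c1, c2 weigh false positives/negatives
         c1 <- 2; c2 <- 5   # example costs, chosen arbitrarily
         cs_accuracy <- (TP + TN) / (TP + TN + c1 * FP + c2 * FN)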

  5. Q: What is the ROC curve?
     Receiver Operating Characteristic (ROC): for models that return class probabilities.
     Is the model able to distinguish between the classes?
     For each possible classification threshold:
         True Positive Rate:  TPR = TP / (TP + FN)
         False Positive Rate: FPR = FP / (FP + TN)
     Area Under the ROC Curve (AUC): higher is better.
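     A base-R sketch of how the curve is traced, sweeping the threshold over toy scores (not the course's data):

         set.seed(42)
         labels <- rbinom(100, 1, 0.3)   # 1 = positive class
         scores <- runif(100)            # hypothetical predicted probabilities

         thresholds <- sort(unique(scores), decreasing = TRUE)
         tpr <- sapply(thresholds, function(t) {
           pred <- as.integer(scores >= t)
           sum(pred == 1 & labels == 1) / sum(labels == 1)   # TP / (TP + FN)
         })
         fpr <- sapply(thresholds, function(t) {
           pred <- as.integer(scores >= t)
           sum(pred == 1 & labels == 0) / sum(labels == 0)   # FP / (FP + TN)
         })
         plot(fpr, tpr, type = "l", main = "ROC curve")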

  6. Regression: Root Mean Squared Error (RMSE)
     Average distance between model predictions and the ground truth (actual values).
     Easy to compute.
     In the same units as the response variable.
     Example: y = house price, RMSE = 8,000 → the model is $8,000 off from the true house price on average.
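     A one-line computation in base R (the prices below are invented for illustration):

         # Hypothetical actual and predicted house prices
         actual    <- c(250000, 310000, 180000, 420000)
         predicted <- c(243000, 325000, 171000, 410000)

         rmse <- sqrt(mean((actual - predicted)^2))  # same units as y (dollars here)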

  7. Clustering: validity indices
     No label information. Two criteria to consider:
         compact clusters
         well-separated clusters
     Several validity indices: Dunn's index, Davies-Bouldin index, Silhouette index, etc.
     Use multiple indices!
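     As one example, the silhouette index can be computed with the cluster package (a recommended package bundled with most R installations); a small sketch on iris:

         library(cluster)

         set.seed(1)
         km  <- kmeans(iris[, 1:4], centers = 3)
         sil <- silhouette(km$cluster, dist(iris[, 1:4]))

         # Mean silhouette width: closer to 1 = compact, well-separated clusters
         mean(sil[, "sil_width"])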

  8. Let's practice!

  9. Handling imbalanced data

  10. Nuclear submarine detection

  11. Results are in!

  12. Frequency of the decision classes

  13. Q: What is imbalanced classification?
     A large disparity in the frequencies of the decision classes.
     The accuracy metric is especially sensitive to these scenarios: always predicting the majority class → high accuracy!
     Two popular avenues:
         cost-sensitive classification
         subsampling the imbalanced data

  14. Cost-sensitive classification
     The misclassification cost for the minority classes is higher than for the majority class.
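     One way to express this in R is a loss matrix passed to rpart; a hedged sketch with invented class labels and costs:

         library(rpart)

         # Toy imbalanced data: "fail" is the rare class we care about
         set.seed(2)
         df <- data.frame(
           x1 = c(rnorm(180, 0), rnorm(20, 2)),
           x2 = rnorm(200),
           y  = factor(c(rep("ok", 180), rep("fail", 20)))
         )

         # Loss matrix: L[i, j] = cost of predicting class j when the truth is class i.
         # Factor levels are alphabetical ("fail", "ok"); missing a "fail" costs 5x.
         L <- matrix(c(0, 5,
                       1, 0), nrow = 2, byrow = TRUE)

         fit <- rpart(y ~ x1 + x2, data = df, method = "class",
                      parms = list(loss = L))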

  15. Q: How to subsample imbalanced data?
     Subsample the training data in a way that mitigates the class imbalance.
     Three common approaches:
         downsampling
         upsampling
         SMOTE

  16. Downsampling
     Reduce the frequency of the overrepresented classes to match the frequency of the underrepresented classes.
     Example:
         Before: majority class (80 samples), minority class (20 samples)
         After:  majority class (20 samples), minority class (20 samples)

  17. Upsampling
     Increase the frequency of the underrepresented classes (random sampling with replacement) to match the frequency of the overrepresented classes.
     Example:
         Before: majority class (80 samples), minority class (20 samples)
         After:  majority class (80 samples), minority class (80 samples)
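     Both operations are available in caret; a minimal sketch matching the 80/20 example above (the data is made up):

         library(caret)

         # 80 majority / 20 minority samples, as in the slides
         df <- data.frame(x = rnorm(100),
                          y = factor(c(rep("maj", 80), rep("min", 20))))

         down <- downSample(x = df["x"], y = df$y)  # 20 maj / 20 min
         up   <- upSample(x = df["x"], y = df$y)    # 80 maj / 80 min (with replacement)

         table(down$Class)
         table(up$Class)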

  18. SMOTE: Synthetic Minority Oversampling TEchnique
     Generates new (synthetic) instances from the minority class.

  19-20. SMOTE (illustrations)
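     A bare-bones sketch of the core SMOTE idea in base R (interpolating between a minority point and one of its nearest minority neighbors); real implementations, such as the themis package, do more:

         # Minority-class points only (toy 2-D data)
         set.seed(3)
         minority <- matrix(rnorm(20), ncol = 2)

         smote_one <- function(X, k = 5) {
           i  <- sample(nrow(X), 1)                      # pick a random minority point
           d  <- sqrt(rowSums((X - matrix(X[i, ], nrow(X), 2, byrow = TRUE))^2))
           nn <- order(d)[2:(k + 1)]                     # its k nearest minority neighbors
           j  <- sample(nn, 1)
           X[i, ] + runif(1) * (X[j, ] - X[i, ])         # random point on the segment i -> j
         }

         synthetic <- t(replicate(20, smote_one(minority)))  # 20 synthetic minority samples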

  21. Subsampling before model evaluation

  22. Subsampling as part of model evaluation
      (Subsampling only the training portion inside each resampling iteration avoids leaking information into the performance estimate.)

  23. Let's practice!

  24. Hyperparameter tuning

  25. Q: Model parameter vs. hyperparameter?
     Parameters: learned by the model during training, often in an iterative manner.
     Hyperparameters: not learned, but specified prior to training;
         influence different aspects of the training process;
         do not change as training unfolds;
         tuned as part of a meta-learning process.

  26. Example: Multi-Layer Perceptron (MLP)
     Parameters: weight matrix, bias vector
     Hyperparameters: learning rate, number of hidden layers, number of hidden neurons per layer

  27. Example: K-means clustering
     Parameters: cluster prototypes (centroids)
     Hyperparameters: number of clusters K, centroid initialization method
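     The split is easy to see in base R's kmeans; a small sketch:

         # Hyperparameters: chosen before training
         k        <- 3    # number of clusters
         n_starts <- 10   # number of random initializations

         km <- kmeans(iris[, 1:4], centers = k, nstart = n_starts)

         # Parameters: learned during training
         km$centers       # cluster prototypes (centroids)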

  28. Q: What is hyperparameter tuning and how is it done?
     Finding adequate hyperparameter values. An iterative process:
         generate a hyperparameter vector
         train the model with this vector
         evaluate model performance
     Computationally expensive!
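     The loop itself is simple; a schematic sketch where evaluate_model() is a hypothetical stand-in for train-plus-validate:

         candidates <- list(c(alpha = 0.2, beta = 2),   # hyperparameter vectors to try
                            c(alpha = 0.5, beta = 3),
                            c(alpha = 0.8, beta = 5))

         # Hypothetical: "train the model, then score it on validation data";
         # a dummy formula here so the sketch runs end to end.
         evaluate_model <- function(hp) -abs(hp["alpha"] - 0.5) - abs(hp["beta"] - 3)

         scores <- sapply(candidates, evaluate_model)
         best   <- candidates[[which.max(scores)]]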

  29. Hyperparameter tuning strategies
     Three main strategies:
         grid search
         random search
         informed search

  30. Grid search
     Exhaustive search over a manually specified subset of the hyperparameter space.
     All possible combinations are considered: expensive, but highly parallelizable.
     Example: α ∈ [0,1], β ∈ [2,5]. Sample each hyperparameter's range:
         α ∈ {0.2, 0.5, 0.8}
         β ∈ {2, 3, 4, 5}
     12 hyperparameter vectors are tested (the Cartesian product).
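     The slide's Cartesian product, built with base R's expand.grid:

         grid <- expand.grid(alpha = c(0.2, 0.5, 0.8),
                             beta  = c(2, 3, 4, 5))
         nrow(grid)   # 12 hyperparameter vectors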

  31. Random search
     Randomly selects hyperparameter vectors: discrete hyperparameters are sampled from a set, continuous ones from a distribution.
     Highly parallelizable.
     Can outperform grid search when only a few hyperparameters actually affect model performance.
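     The same α/β example with random sampling instead of a grid, again in base R:

         set.seed(1)
         n <- 12
         random_vectors <- data.frame(
           alpha = runif(n, 0, 1),                 # continuous: sampled from a distribution
           beta  = sample(2:5, n, replace = TRUE)  # discrete: sampled from a set
         )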

  32. Informed search
     Bayesian optimization:
         probabilistically maps a hyperparameter vector to a model performance indicator
         samples more frequently around promising hyperparameter vectors
         better results in fewer evaluations compared to grid and random search

  33. Hyperparameter tuning in R
     Several R packages available: caret, mlr, h2o.
     Check out the Hyperparameter Tuning in R DataCamp course!
     The exercises show how to tune hyperparameters using caret.
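     A hedged caret sketch using standard caret API (the dataset and grid are placeholders, not the course's; method = "rf" requires the randomForest package):

         library(caret)

         ctrl <- trainControl(method = "cv", number = 5)   # 5-fold cross-validation

         # Grid search over mtry for a random forest
         fit <- train(Species ~ ., data = iris,
                      method    = "rf",
                      trControl = ctrl,
                      tuneGrid  = expand.grid(mtry = 1:4))

         fit$bestTune   # best hyperparameter vector found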

  34. Let's practice!

  35. Random Forests or Gradient Boosted Trees?

  36. Q: Commonalities between RFs and GBTs?
     Both are top-performing ensemble models.
     Suitable for both classification and regression tasks.
     Decision trees as base learners.
     Can handle missing values.
     Provide a model-specific variable importance metric.

  37. Q: Main differences between RFs and GBTs?
     Random Forests              | Gradient Boosted Trees
     ----------------------------|--------------------------
     Bagging ensemble            | Boosting ensemble
     Deeper decision trees       | Shallower decision trees
     Aimed at reducing variance  | Aimed at reducing bias
     Trees grown in parallel     | Trees grown sequentially
     Easier to tune              | Harder to tune
     All trees used              | Trees added as needed

  38. R implementation
     There are multiple R packages that implement RFs and GBTs:
         library(randomForest)
         library(ranger)
         library(gbm)
         library(xgboost)
         library(caret)

  39. Hyperparameter tuning: RFs
         library(randomForest)
         # tuneRF tunes mtry
         tunedModel <- tuneRF(x = predictors, y = response, ntreeTry = 500)

         library(caret)
         # train() tunes mtry by default; other hyperparameters if configured
         tunedModel <- train(x = predictors, y = response, method = "rf")

  40. Hyperparameter tuning: GBTs
         library(gbm)
         # gbm.perf estimates the optimal n.trees from the CV or OOB error
         opt_ntree_cv  <- gbm.perf(model, method = "cv")
         opt_ntree_oob <- gbm.perf(model, method = "OOB")

         library(caret)
         # train() tunes several hyperparameters
         model <- train(x = predictors, y = response, method = "xgbLinear")
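     For context, a runnable end-to-end sketch around the gbm calls above (toy data, not the course's; note that cv.folds must be set at fit time for method = "cv" to work):

         library(gbm)

         set.seed(7)
         df <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
         df$y <- as.integer(df$x1 + rnorm(300) > 0)   # toy 0/1 binary response

         model <- gbm(y ~ x1 + x2, data = df,
                      distribution = "bernoulli",
                      n.trees  = 500,
                      cv.folds = 5)                   # enables method = "cv" below

         opt_ntree_cv <- gbm.perf(model, method = "cv")  # optimal number of trees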

  41. Let's practice!

  42. You made it!
