DataCamp Supervised Learning in R: Case Studies
Welcome!
Julia Silge, Data Scientist at Stack Overflow
In this course, you will...
  use exploratory data analysis to prepare for predictive modeling
  explore which modeling approaches to use for different kinds of data
  practice implementing supervised machine learning for classification and regression

Supervised machine learning
  Regression
  Classification

Case studies
  Fuel efficiency for cars
  Stack Overflow Developer Survey
  Voter turnout
  Predict age of nuns from survey responses
Fuel efficiency
From the US Department of Energy
> cars2018
# A tibble: 1,144 x 15
   Model          `Model Index` Displacement Cylinders Gears Transmission   MPG
   <chr>                  <dbl>        <dbl>     <dbl> <dbl> <chr>        <dbl>
 1 Acura NSX               57.0         3.50      6.00  9.00 Manual        21.0
 2 ALFA ROMEO 4C            410         1.80      4.00  6.00 Manual        28.0
 3 Audi R8 AWD             65.0         5.20      10.0  7.00 Manual        17.0
 4 Audi R8 RWD             71.0         5.20      10.0  7.00 Manual        18.0
 5 Audi R8 Spyde…          66.0         5.20      10.0  7.00 Manual        17.0
 6 Audi R8 Spyde…          72.0         5.20      10.0  7.00 Manual        18.0
 7 Audi TT Roads…          46.0         2.00      4.00  6.00 Manual        26.0
 8 BMW M4 DTM Ch…           488         3.00      6.00  7.00 Manual        20.0
 9 Bugatti Chiron          38.0         8.00      16.0  7.00 Manual        11.0
10 Chevrolet COR…           278         6.20      8.00  8.00 Automatic     18.0
# ... with 1,134 more rows, and 8 more variables: Aspiration <chr>, `Lockup
#   Torque Converter` <chr>, Drive <chr>, `Max Ethanol` <dbl>, `Recommended
#   Fuel` <fct>, `Intake Valves Per Cyl` <dbl>, `Exhaust Valves Per Cyl` <dbl>,
#   `Fuel injection` <chr>

> names(cars2018)
 [1] "Model"                   "Model Index"
 [3] "Displacement"            "Cylinders"
 [5] "Gears"                   "Transmission"
 [7] "MPG"                     "Aspiration"
 [9] "Lockup Torque Converter" "Drive"
[11] "Max Ethanol"             "Recommended Fuel"
[13] "Intake Valves Per Cyl"   "Exhaust Valves Per Cyl"
[15] "Fuel injection"

Special characters in variable names
> cars2018 %>%
+     select(`Fuel injection`)
# A tibble: 1,144 x 1
   `Fuel injection`
   <chr>
 1 Direct ignition
 2 Direct ignition
 3 Direct ignition
 4 Direct ignition
 5 Direct ignition
 6 Direct ignition
 7 Direct ignition
 8 Direct ignition
 9 Multipoint/sequential ignition
10 Direct ignition
# ... with 1,134 more rows
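
The backtick quoting on the slide above is not specific to dplyr. As a minimal base R sketch, assuming a two-row toy data frame (`toy_cars` and its contents are made up for illustration and stand in for `cars2018`), a column name containing a space can be reached with backticks or with a quoted string:

```r
# Toy stand-in for cars2018; the rows are invented for illustration only
toy_cars <- data.frame(
  Model = c("Car A", "Car B"),
  `Fuel injection` = c("Direct ignition", "Multipoint/sequential ignition"),
  check.names = FALSE,       # keep the space in the column name
  stringsAsFactors = FALSE
)

# Backticks make R treat a name with a space as a single symbol
with_backticks <- toy_cars$`Fuel injection`

# A quoted string inside [[ ]] reaches the same column
with_string <- toy_cars[["Fuel injection"]]

# make.names() shows what a syntactic version of each name would look like
syntactic <- make.names(names(toy_cars))
print(syntactic)
```

Renaming columns to syntactic names is one option when backticks become tedious; whether that is worthwhile depends on how often the names are typed.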
Exploratory data analysis
library(tidyverse)
  ggplot2
  dplyr
  tidyr
  others!
To learn more about the tidyverse, visit this page.

Time to train some models!

Getting started with caret

Predicting fuel efficiency

Tools for predictive modeling
The caret package
Training data and testing data with caret
> library(caret)
>
> in_train <- createDataPartition(cars_vars$Aspiration,
+                                 p = 0.8, list = FALSE)
> training <- cars_vars[in_train,]
> testing <- cars_vars[-in_train,]
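
The split above is stratified on `Aspiration`, so both pieces keep a similar mix of aspiration types. The idea can be sketched in base R as follows. This is an illustration of stratified sampling, not caret's implementation: `stratified_index()` is a hypothetical helper, and `toy_cars` with its group labels and MPG values is made up.

```r
set.seed(1234)

# Made-up stand-in for cars_vars: 100 rows in two groups (labels are hypothetical)
toy_cars <- data.frame(
  Aspiration = rep(c("A", "B"), times = c(20, 80)),
  MPG = seq(15, 40, length.out = 100),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sample a share p of the row indices within each group
stratified_index <- function(groups, p) {
  idx_by_group <- split(seq_along(groups), groups)
  picked <- lapply(idx_by_group, function(idx) {
    idx[sample.int(length(idx), size = round(p * length(idx)))]
  })
  sort(unlist(picked, use.names = FALSE))
}

in_train <- stratified_index(toy_cars$Aspiration, p = 0.8)
training <- toy_cars[in_train, ]
testing  <- toy_cars[-in_train, ]

# Both pieces keep the original 20/80 mix of groups
print(table(training$Aspiration))
print(table(testing$Aspiration))
```

A plain `sample()` over all rows would usually give a similar mix on data this size, but with rare groups a stratified split avoids ending up with a testing set that is missing a group entirely.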
Training data and testing data with caret
  Build your model with your training data
  Choose your model with your validation data
  Evaluate your model with your testing data
Training a model
> fit_lm <- train(log(MPG) ~ ., method = "lm", data = training,
+                 trControl = trainControl(method = "none"))
  Train a model
  Evaluate that model using yardstick

Evaluating a model
The yardstick package
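
The train-then-evaluate workflow on the two slides above can be sketched without caret or yardstick installed. This is a base R stand-in, not the course's code: `lm()` plays the role of `train(..., method = "lm")` with resampling turned off, the root mean squared error is computed by hand (the kind of metric yardstick reports with `rmse()`), and a few columns of the built-in `mtcars` data replace `cars2018`, which is not available here.

```r
set.seed(42)

# Stand-in data: a few columns of mtcars, which ships with R
cars <- mtcars[, c("mpg", "wt", "hp", "disp", "cyl")]

# Simple random 75/25 split (not stratified, unlike createDataPartition)
in_train <- sample(nrow(cars), size = 24)
training <- cars[in_train, ]
testing  <- cars[-in_train, ]

# Same formula shape as the slide: log outcome, every other column as a predictor
fit_lm <- lm(log(mpg) ~ ., data = training)

# Predictions come back on the log scale, so compare them with log(mpg)
pred <- predict(fit_lm, newdata = testing)
rmse_log <- sqrt(mean((log(testing$mpg) - pred)^2))
print(rmse_log)
```

Because the outcome was log-transformed, the error is in log units; predictions would need `exp()` before being read as miles per gallon.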
Let's practice!

Training a model with resampling

Bootstrap resampling
  Sample with replacement from the original dataset
Bootstrap resampling with caret
> cars_rf_bt <- train(log(MPG) ~ ., method = "rf",
+                     data = training,
+                     trControl = trainControl(method = "boot"))
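
To make the resampling step concrete, here is a hand-rolled sketch of bootstrap evaluation in base R. It is an illustration under stated assumptions rather than what caret runs internally: a linear model replaces the random forest so that no extra package is needed, `mtcars` stands in for the training data, 25 resamples is a choice made for this example, and each fit is scored on the rows left out of its bootstrap sample.

```r
set.seed(2018)

# Stand-in data: a few columns of mtcars, which ships with R
cars <- mtcars[, c("mpg", "wt", "hp", "disp")]
n <- nrow(cars)
n_boot <- 25  # number of bootstrap resamples, chosen for this example

rmse_oob <- vapply(seq_len(n_boot), function(i) {
  # Sample row indices with replacement: same size as the data, with duplicates
  boot_idx <- sample(n, size = n, replace = TRUE)
  # Rows never drawn are "out of bag" and serve as the held-out set
  oob_idx <- setdiff(seq_len(n), boot_idx)

  fit <- lm(log(mpg) ~ ., data = cars[boot_idx, ])
  pred <- predict(fit, newdata = cars[oob_idx, ])
  sqrt(mean((log(cars$mpg[oob_idx]) - pred)^2))
}, numeric(1))

# Averaging over resamples gives a steadier estimate than a single split
print(mean(rmse_oob))
```

The spread of `rmse_oob` across resamples is also informative: a wide spread suggests the performance estimate depends heavily on which rows the model happened to see.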
Comparing predicted to real values
   `log(MPG)` `Linear regression` `Random forest`
        <dbl>               <dbl>           <dbl>
 1       2.89                2.79            2.83
 2       2.89                3.00            2.89
 3       3.26                3.22            3.26
 4       3.14                3.09            3.10
 5       3.26                3.22            3.26
 6       2.89                3.11            2.98
 7       2.48                2.59            2.51
 8       2.71                2.81            2.82
 9       3.37                3.29            3.27
10       2.83                2.90            2.90
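
One way to summarize a comparison like this is the root mean squared error of each prediction column against `log(MPG)`. The sketch below types in the ten rows shown on the slide and computes it by hand; the variable names (`actual`, `pred_lm`, `pred_rf`) and the `rmse()` helper are made up for the example.

```r
# The ten rows shown on the slide, typed in by hand
actual  <- c(2.89, 2.89, 3.26, 3.14, 3.26, 2.89, 2.48, 2.71, 3.37, 2.83)
pred_lm <- c(2.79, 3.00, 3.22, 3.09, 3.22, 3.11, 2.59, 2.81, 3.29, 2.90)
pred_rf <- c(2.83, 2.89, 3.26, 3.10, 3.26, 2.98, 2.51, 2.82, 3.27, 2.90)

# Hypothetical helper: root mean squared error between truth and estimate
rmse <- function(truth, estimate) sqrt(mean((truth - estimate)^2))

rmse_lm <- rmse(actual, pred_lm)
rmse_rf <- rmse(actual, pred_rf)
print(c(linear_regression = rmse_lm, random_forest = rmse_rf))
```

On these ten rows the random forest column has the lower error, but ten rows are far too few to rank the two models; the full testing set is the right basis for that.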
Visualizing model predictions

Let's practice!