The intuition behind tree-based methods
Supervised Learning in R: Regression
Nina Zumel and John Mount, Win-Vector, LLC
Example: Predict animal intelligence from Gestation Time and Litter Size
Decision Trees
Rules of the form: if a AND b AND c THEN y
Model non-linear concepts:
- intervals
- non-monotonic relationships
- non-additive interactions
AND: similar to multiplication
Decision Trees
IF Litter < 1.15 AND Gestation ≥ 268 → intelligence = 0.315
IF Litter IN [1.15, 4.3) → intelligence = 0.131
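Rules like these come directly from a fitted regression tree. A minimal sketch with rpart, assuming a hypothetical data frame animals with columns intelligence, Gestation, and Litter (names taken from the example above):

library(rpart)

# Fit a regression tree: intelligence as a function of
# gestation time and litter size ("anova" = regression)
tree_model <- rpart(intelligence ~ Gestation + Litter,
                    data = animals, method = "anova")

# Print the learned if/then rules, one leaf per line
print(tree_model)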
Decision Trees
Pro: Trees Have an Expressive Concept Space

Model   RMSE
linear  0.1200419
tree    0.1072732
Decision Trees
Con: Coarse-Grained Predictions
It's Hard for Trees to Express Linear Relationships
Trees predict axis-aligned regions.
It's Hard for Trees to Express Linear Relationships
It's hard to express lines with steps.
Other Issues with Trees
- Tree with too many splits (deep tree): too complex, danger of overfit
- Tree with too few splits (shallow tree): predictions too coarse-grained
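Both issues are easy to see by fitting trees of different depths to purely linear data. A minimal illustrative sketch (the data and control settings are made up for the demonstration):

library(rpart)

# Noisy linear data
df <- data.frame(x = seq(0, 10, by = 0.1))
df$y <- 2 * df$x + rnorm(nrow(df), sd = 0.5)

# Shallow tree: few splits, coarse step-like predictions
shallow <- rpart(y ~ x, data = df,
                 control = rpart.control(maxdepth = 2))

# Deep tree: many splits, closer fit but danger of overfit
deep <- rpart(y ~ x, data = df,
              control = rpart.control(maxdepth = 10, cp = 0.001))

# Both approximate the line with axis-aligned steps
plot(df$x, df$y)
lines(df$x, predict(shallow, df), type = "s", col = "red")
lines(df$x, predict(deep, df), type = "s", col = "blue")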
Ensembles of Trees
Ensembles Give Finer-Grained Predictions than Single Trees
Ensembles of Trees
Ensemble Model Fits Animal Intelligence Data Better than Single Tree

Model          RMSE
linear         0.1200419
tree           0.1072732
random forest  0.0901681
Let's practice!
Random forests
Supervised Learning in R: Regression
Nina Zumel and John Mount, Win-Vector, LLC
Random Forests
Multiple diverse decision trees averaged together:
- Reduces overfit
- Increases model expressiveness
- Finer-grained predictions
Building a Random Forest Model
1. Draw a bootstrapped sample from the training data.
2. For each sample, grow a tree:
   - At each node, pick the best variable to split on (from a random subset of all variables).
   - Continue until the tree is grown.
3. To score a datum, evaluate it with all the trees and average the results.
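To make the procedure concrete, here is a minimal hand-rolled sketch using rpart trees (illustrative only, not how ranger works internally; note rpart considers all variables at each split, while a true random forest also restricts each split to a random subset of variables):

library(rpart)

# Grow ntree trees, each on a bootstrapped sample of the training data
simple_forest <- function(fmla, data, ntree = 50) {
  lapply(seq_len(ntree), function(i) {
    boot <- data[sample(nrow(data), replace = TRUE), ]
    rpart(fmla, data = boot)
  })
}

# Score a datum: evaluate it with all the trees, average the results
predict_forest <- function(trees, newdata) {
  preds <- sapply(trees, predict, newdata = newdata)
  rowMeans(preds)
}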
Example: Bike Rental Data

cnt ~ hr + holiday + workingday +
  weathersit + temp + atemp + hum + windspeed
Random Forests with ranger()

model <- ranger(fmla, bikesJan,
                num.trees = 500,
                respect.unordered.factors = "order")

- formula, data
- num.trees (default 500) - use at least 200
- mtry - number of variables to try at each node
  - default: square root of the total number of variables
- respect.unordered.factors - recommended: set to "order"
  - "safe" hashing of categorical variables
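For reproducible runs you can also pass mtry and a seed explicitly; a hedged variant (the values here are illustrative, not the course's settings):

library(ranger)

model <- ranger(fmla, bikesJan,
                num.trees = 500,
                mtry = 3,                              # variables tried per split
                respect.unordered.factors = "order",
                seed = 423)                            # ranger's seed argument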
Random Forests with ranger()

model

Ranger result
...
OOB prediction error (MSE): 3103.623
R squared (OOB): 0.7837386

The random forest algorithm returns estimates of out-of-sample performance (the out-of-bag, or OOB, error).
Predicting with a ranger() model

bikesFeb$pred <- predict(model, bikesFeb)$predictions

predict() inputs:
- model
- data
Predictions can be accessed in the element predictions.
Evaluating the model

Calculate RMSE:

bikesFeb %>%
  mutate(residual = pred - cnt) %>%
  summarize(rmse = sqrt(mean(residual^2)))

      rmse
1 67.15169

Model           RMSE
Quasipoisson    69.3
Random forests  67.15
Evaluating the model
(plots)
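The original slides show evaluation plots at this point. A minimal sketch of one such plot, assuming ggplot2 and the bikesFeb frame with the pred and cnt columns created above:

library(ggplot2)

# Predictions vs. actual hourly counts; points near the line
# y = x indicate accurate predictions
ggplot(bikesFeb, aes(x = pred, y = cnt)) +
  geom_point() +
  geom_abline(color = "darkblue") +
  labs(x = "predicted bike rentals", y = "actual bike rentals")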
Let's practice!
One-Hot-Encoding Categorical Variables
Supervised Learning in R: Regression
Nina Zumel and John Mount, Win-Vector, LLC
Why Convert Categoricals Manually?
- Most R functions manage the conversion for you (via model.matrix())
- xgboost() does not
  - Must convert categorical variables to a numeric representation
  - Conversion to indicators: one-hot encoding
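A quick illustration of what the conversion produces, using base R on a hypothetical toy frame:

# A toy frame with one categorical and one numeric input
df <- data.frame(x = c("one", "two", "three"), u = c(44, 24, 66))

# model.matrix() expands factors into indicator (one-hot) columns;
# by default the first level is dropped as the reference level
model.matrix(~ x + u, data = df)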
One-hot-encoding and data cleaning with vtreat

Basic idea:
- designTreatmentsZ() to design a treatment plan from the training data
- then prepare() to create "clean" data:
  - all numerical
  - no missing values
- use prepare() with the treatment plan for all future data
A Small vtreat Example

Training Data:
  x      u          y
  one   44  0.4855671
  two   24  1.3683726
  three 66  2.0352837
  two   22  1.6396267

Test Data:
  x      u          y
  one    5  2.6488148
  three 12  1.5012938
  one   56  0.1993731
  two   28  1.2778516
Create the Treatment Plan

vars <- c("x", "u")
treatplan <- designTreatmentsZ(dframe, vars, verbose = FALSE)

Inputs to designTreatmentsZ():
- dframe: training data
- varlist: list of input variable names
- set verbose = FALSE to suppress progress messages
Get the New Variables

The scoreFrame describes the variable mapping and types:

(scoreFrame <- treatplan$scoreFrame %>%
   select(varName, origName, code))

        varName origName  code
1   x_lev_x.one        x   lev
2 x_lev_x.three        x   lev
3   x_lev_x.two        x   lev
4        x_catP        x  catP
5       u_clean        u clean

Get the names of the new lev and clean variables:

(newvars <- scoreFrame %>%
   filter(code %in% c("clean", "lev")) %>%
   use_series(varName))

"x_lev_x.one" "x_lev_x.three" "x_lev_x.two" "u_clean"
Prepare the Training Data for Modeling

training.treat <- prepare(treatplan, dframe, varRestriction = newvars)

Inputs to prepare():
- treatmentplan: treatment plan
- dframe: data frame
- varRestriction: list of variables to prepare (optional)
  - default: prepare all variables
Before and After Data Treatment

Training Data:
  x      u          y
  one   44  0.4855671
  two   24  1.3683726
  three 66  2.0352837
  two   22  1.6396267

Treated Training Data:
  x_lev_x.one x_lev_x.three x_lev_x.two u_clean
1           1             0           0      44
2           0             0           1      24
3           0             1           0      66
4           0             0           1      22
Prepare the Test Data Before Model Application

(test.treat <- prepare(treatplan, test, varRestriction = newvars))

  x_lev_x.one x_lev_x.three x_lev_x.two u_clean
1           1             0           0       5
2           0             1           0      12
3           1             0           0      56
4           0             0           1      28
vtreat Treatment is Robust

Previously unseen x level: four
- four encodes to (0, 0, 0)

prepare(treatplan, toomany, ...)

New data (toomany):
  x      u          y
  one    4  0.2331301
  two   14  1.9331760
  three 66  3.1251029
  four  25  4.0332491

Treated data:
  x_lev_x.one x_lev_x.three x_lev_x.two u_clean
1           1             0           0       4
2           0             0           1      14
3           0             1           0      66
4           0             0           0      25
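Putting the steps together, a self-contained sketch of the workflow on the toy data above (y values abbreviated; dplyr supplies filter(), magrittr supplies %>% and use_series()):

library(vtreat)
library(dplyr)
library(magrittr)

dframe <- data.frame(x = c("one", "two", "three", "two"),
                     u = c(44, 24, 66, 22),
                     y = c(0.486, 1.368, 2.035, 1.640))

# Design the treatment plan from the training data
treatplan <- designTreatmentsZ(dframe, c("x", "u"), verbose = FALSE)

# Keep the indicator (lev) and cleaned numeric (clean) variables
newvars <- treatplan$scoreFrame %>%
  filter(code %in% c("clean", "lev")) %>%
  use_series(varName)

# Apply the same plan to new data with an unseen level "four"
toomany <- data.frame(x = c("one", "two", "three", "four"),
                      u = c(4, 14, 66, 25))
prepare(treatplan, toomany, varRestriction = newvars)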
Let's practice!
Gradient boosting machines
Supervised Learning in R: Regression
Nina Zumel and John Mount, Win-Vector, LLC
How Gradient Boosting Works
1. Fit a shallow tree T_1 to the data: M_1 = T_1
How Gradient Boosting Works
1. Fit a shallow tree T_1 to the data: M_1 = T_1
2. Fit a tree T_2 to the residuals. Find γ such that M_2 = M_1 + γT_2 is the best fit to the data.
How Gradient Boosting Works
Regularization: learning rate η ∈ (0, 1)

M_2 = M_1 + ηγT_2

- Larger η: faster learning
- Smaller η: less risk of overfit
How Gradient Boosting Works
1. Fit a shallow tree T_1 to the data: M_1 = T_1
2. Fit a tree T_2 to the residuals: M_2 = M_1 + ηγ_2 T_2
3. Repeat (2) until the stopping condition is met.

Final Model: M = M_1 + η ∑ γ_i T_i
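To make the loop concrete, a minimal hand-rolled sketch of this procedure for squared-error regression with rpart trees (it starts from the mean rather than a first tree and fixes each γ_i = 1, so only η scales the trees; all names and settings are illustrative):

library(rpart)

boost <- function(fmla, data, yvar, eta = 0.1, nrounds = 50) {
  # Start from a trivial model: predict the mean
  pred <- rep(mean(data[[yvar]]), nrow(data))
  trees <- list()
  for (i in seq_len(nrounds)) {
    # Fit a shallow tree to the current residuals
    data$resid <- data[[yvar]] - pred
    tree <- rpart(update(fmla, resid ~ .), data = data,
                  control = rpart.control(maxdepth = 2))
    # Add a damped (eta-scaled) copy of the tree's predictions
    pred <- pred + eta * predict(tree, data)
    trees[[i]] <- tree
  }
  list(trees = trees, pred = pred)
}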
Cross-Validation to Guard Against Overfit
Training error keeps decreasing, but test error doesn't.
Best Practice (with xgboost())
1. Run xgb.cv() with a large number of rounds (trees).
Best Practice (with xgboost())
1. Run xgb.cv() with a large number of rounds (trees).
2. xgb.cv()$evaluation_log records estimated RMSE for each round. Find the number of trees that minimizes estimated RMSE: n_best
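A minimal sketch of this procedure (the input matrix, label vector, and parameter values are illustrative; test_rmse_mean is the column xgboost writes to the evaluation log for the RMSE metric):

library(xgboost)

# 1. Run xgb.cv() with a large number of rounds
cv <- xgb.cv(data = as.matrix(train_input), label = train_y,
             nrounds = 100, nfold = 5,
             objective = "reg:squarederror",
             eta = 0.3, max_depth = 6, verbose = FALSE)

# 2. Find the number of trees that minimizes estimated RMSE
elog <- cv$evaluation_log
nbest <- which.min(elog$test_rmse_mean)

# 3. Fit the final model with nbest rounds
model <- xgboost(data = as.matrix(train_input), label = train_y,
                 nrounds = nbest,
                 objective = "reg:squarederror",
                 eta = 0.3, max_depth = 6, verbose = FALSE)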