Multiple explanatory variables
INTERMEDIATE STATISTICAL MODELING IN R
Danny Kaplan, Instructor
The statisticalModeling package
- To evaluate the model, you need to set values for the explanatory variables
  - Commonly use the mean, median, or mode
- To visualize the model, you need to select several different levels of the explanatory variables to include

    # Load statisticalModeling package
    library(statisticalModeling)
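As a rough illustration of "set values for the explanatory variables", the sketch below picks one representative value per variable in base R: the median for a quantitative variable and the mode for a categorical one. The helper `typical_value()` and the data frame `toy` are hypothetical names with made-up values; they are not part of statisticalModeling.

```r
# Hypothetical helper (not from statisticalModeling): choose one
# representative value for a single variable.
typical_value <- function(x) {
  if (is.numeric(x)) {
    median(x, na.rm = TRUE)        # quantitative: use the median
  } else {
    names(which.max(table(x)))     # categorical: use the most common level
  }
}

# Made-up data, for illustration only
toy <- data.frame(
  educ   = c(10, 12, 12, 14, 16),
  sector = c("prof", "sales", "prof", "prof", "sales")
)

# One typical value per explanatory variable
typical <- lapply(toy, typical_value)
typical
```

Values chosen this way can then be used as the fixed inputs when evaluating a model.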
Using effect_size()

    # Train model
    wage_model <- lm(wage ~ educ + sector + sex + exper, data = CPS85)

    # Effect size of education on wage: a slope
    effect_size(wage_model, ~ educ)

          slope educ  to:educ sector sex exper
    1 0.7179628   12 14.61537   prof   M    15
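To make the idea of "effect size as a slope" concrete, here is a minimal base-R sketch that computes a slope by hand: evaluate the model at two values of one input, hold the other input fixed, and divide the change in output by the change in input. The data are simulated and the names `toy`, `toy_model`, `from`, and `to` are assumptions for illustration. This is not how `effect_size()` is implemented, and the step of 2 years is arbitrary.

```r
set.seed(1)

# Simulated data standing in for a wage data set (illustration only)
toy <- data.frame(educ  = runif(200, 8, 18),
                  exper = runif(200, 0, 40))
toy$wage <- 1 + 0.7 * toy$educ + 0.1 * toy$exper + rnorm(200)

toy_model <- lm(wage ~ educ + exper, data = toy)

# Two sets of inputs that differ only in educ
from <- data.frame(educ = 12, exper = 15)
to   <- data.frame(educ = 14, exper = 15)

# Slope = change in model output / change in input
slope <- (predict(toy_model, newdata = to) - predict(toy_model, newdata = from)) /
  (to$educ - from$educ)
slope
```

Because this toy model has no interaction terms, the slope computed this way matches the `educ` coefficient from `coef(toy_model)`.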
Using fmodel()

    # A model of the probability of being married
    married_model <- glm(married == "Married" ~ educ * sector * sex + age,
                         data = CPS85, family = "binomial")

    fmodel(married_model, ~ age + sex + sector + educ,
           data = CPS85, type = "response", educ = c(10, 16))
Using fmodel()
[Figure: plot produced by the fmodel() call on married_model]
Designing graphs of models

    fmodel(married_model, ~ age + sex + sector + educ,
           data = CPS85, type = "response", educ = c(10, 16))

1. Response variable always on the y-axis
2. Explanatory variable of primary interest on the x-axis
3. Choose one, two, or three variables you want in the display
4. For any other explanatory variables, choose a fixed value that is of interest

fmodel() does 2-3 automatically and 4 either automatically or manually
Let's practice!
Categorical response variables
The question at hand
For a quantitative response variable and a ...
- Quantitative explanatory variable: effect size is a rate
- Categorical explanatory variable: effect size is a difference
But what happens when the response variable is categorical?
Model output for categorical response
Two ways to frame the output:
- As categories or classes
- As probabilities
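The two framings can be sketched in base R. The example below swaps in a logistic regression (`glm()`) on simulated data rather than the rpart model used in this lesson, and the 0.5 cutoff for turning a probability into a class is an assumption made for illustration. The names `toy`, `toy_model`, and `inputs` are hypothetical.

```r
set.seed(4)

# Simulated data: the chance of being married rises with age (illustration only)
toy <- data.frame(age = round(runif(300, 18, 65)))
toy$married <- rbinom(300, 1, plogis(-3 + 0.1 * toy$age)) == 1

toy_model <- glm(married ~ age, data = toy, family = "binomial")

inputs <- data.frame(age = c(25, 30))

# Framing 1: output as a probability
prob_married <- predict(toy_model, newdata = inputs, type = "response")

# Framing 2: output as a class, using an assumed 0.5 cutoff
class_output <- ifelse(prob_married >= 0.5, "Married", "Single")

data.frame(inputs, prob_married, class_output)
```

The class framing hides how confident the model is. Two inputs can land in the same class while their probabilities differ noticeably.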
Example: marital status

    # Create model and set inputs
    married_model <- rpart(married ~ educ + sex + age, data = CPS85, cp = 0.005)

    # Output as a category (i.e. class)
    evaluate_model(married_model, type = "class", age = c(25, 30), educ = 12, sex = "F")

      educ sex age model_output
    1   12   F  25      Married
    2   12   F  30      Married
Example: marital status

    # Output as a probability
    evaluate_model(married_model, type = "prob", age = c(25, 30), educ = 12, sex = "F")

      educ sex age model_output.Married model_output.Single
    1   12   F  25            0.6333333           0.3666667
    2   12   F  30            0.7425743           0.2574257

An extra 5 years of age is associated with an increase of about 11 percentage points in the probability of being married
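The effect size quoted on this slide is the difference between the two `model_output.Married` values. The arithmetic can be checked directly; the variable names below are introduced only for this check.

```r
# Probabilities of "Married" copied from the model output on this slide
p_married_25 <- 0.6333333
p_married_30 <- 0.7425743

# Effect size as a difference in probability over the 5-year change in age
change <- p_married_30 - p_married_25
round(change, 2)   # 0.11

# The same effect expressed as a rate: change in probability per year of age
per_year <- change / (30 - 25)
per_year
```

The difference is on the probability scale, so it is a change of about 0.11 in probability, not an 11% relative increase.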
Let's practice!
Interactions among explanatory variables
Interaction
The effect size of one variable may change with the other explanatory variables
Probability of being married

    married_model <- glm(married == "Married" ~ educ * sector * sex + age,
                         data = CPS85, family = "binomial")

    fmodel(married_model, ~ age + sex + sector + educ,
           data = CPS85, type = "response")
Interactions and model architecture
- lm() includes interactions only if you ask for them
- rpart() has interactions built into the method
World swimming records
[Figure]
World swimming records

    mod1 <- rpart(time ~ sex + year, data = SwimRecords)
    mod2 <- lm(time ~ sex + year, data = SwimRecords)
Formulas with interactions
Must specify interaction explicitly in lm()

    mod3 <- lm(time ~ sex * year, data = SwimRecords)
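In R's formula syntax, `sex * year` is shorthand for `sex + year + sex:year`, where `sex:year` is the interaction term. The sketch below fits both spellings to simulated data and shows that they give the same coefficients. The data frame `toy` is made up for illustration and only loosely imitates swimming records.

```r
set.seed(2)

# Simulated records: the time trend differs by sex (illustration only)
toy <- data.frame(
  sex  = rep(c("F", "M"), each = 20),
  year = rep(seq(1950, 2007, by = 3), times = 2)
)
toy$time <- ifelse(toy$sex == "F",
                   70 - 0.25 * (toy$year - 1950),
                   60 - 0.15 * (toy$year - 1950)) + rnorm(40, sd = 0.5)

# Two spellings of the same model
with_star  <- lm(time ~ sex * year, data = toy)
with_colon <- lm(time ~ sex + year + sex:year, data = toy)

# Both include an interaction coefficient named "sexM:year"
names(coef(with_star))
all.equal(coef(with_star), coef(with_colon))
```

The `sexM:year` coefficient measures how much the yearly slope for one sex differs from the slope for the other, which is what "effect size changes with another variable" means for a linear model.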
Does an interaction improve a model?
Use cross validation to see which is better:
mod2: ~ year + sex   vs.   mod3: ~ year * sex

    t.test(mse ~ model, data = cv_pred_error(mod2, mod3))

    data:  mse by model
    t = 20, df = 18, p-value = 1.323e-13
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     4.2 5.2
    sample estimates:
    mean in group mod2 mean in group mod3
                    17                 12
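For readers who want to see what a cross-validated comparison involves, here is a minimal base-R sketch of k-fold cross validation on simulated data. It is not the implementation of `cv_pred_error()`: that function returns one mean squared error per trial, which is what the t-test compares, while this sketch makes a single pass per model. The helper `cv_mse()` and the data frame `toy` are hypothetical.

```r
set.seed(3)

# Simulated records with a genuine sex-by-year interaction (illustration only)
toy <- data.frame(
  sex  = rep(c("F", "M"), each = 30),
  year = rep(seq(1950, 2008, by = 2), times = 2)
)
toy$time <- ifelse(toy$sex == "F",
                   70 - 0.25 * (toy$year - 1950),
                   60 - 0.10 * (toy$year - 1950)) + rnorm(60, sd = 0.5)

# Hypothetical helper: k-fold cross-validated mean squared error for lm()
cv_mse <- function(formula, data, k = 5) {
  response <- all.vars(formula)[1]                       # name of the response
  fold <- sample(rep(1:k, length.out = nrow(data)))      # random fold labels
  sq_err <- numeric(0)
  for (i in 1:k) {
    fit  <- lm(formula, data = data[fold != i, ])        # train on k - 1 folds
    held <- data[fold == i, ]                            # test on the held-out fold
    sq_err <- c(sq_err, (held[[response]] - predict(fit, newdata = held))^2)
  }
  mean(sq_err)
}

mse_additive    <- cv_mse(time ~ sex + year, toy)
mse_interaction <- cv_mse(time ~ sex * year, toy)
c(additive = mse_additive, interaction = mse_interaction)
```

A lower cross-validated error for the interaction model suggests the interaction is capturing real structure rather than noise. Repeating the procedure over several random splits, as the t-test on the slide does, shows whether the difference is consistent.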
Let's practice!