DataCamp Human Resources Analytics in R: Exploring Employee Data

Paying new hires fairly
Ben Teusch, HR Analytics Consultant

The data

> head(pay)
# A tibble: 6 x 5
  employee_id department     salary new_hire job_level
        <int> <chr>           <dbl> <chr>    <chr>
1           1 Sales       103263.64 No       Salaried
2           2 Engineering  80708.64 No       Hourly
3           4 Engineering  60737.05 Yes      Hourly
4           5 Engineering  99116.32 Yes      Salaried
5           7 Engineering  51021.64 No       Hourly
6           8 Engineering  98399.87 No       Salaried

Introducing broom::tidy()

> chisq.test(survey$in_sales, survey$disengaged)

        Pearson's Chi-squared test with Yates' continuity correction

data:  survey$in_sales and survey$disengaged
X-squared = 25.524, df = 1, p-value = 4.368e-07

> chisq.test(survey$in_sales, survey$disengaged) %>%
+   tidy()
  statistic      p.value parameter method
1  25.52441 4.368222e-07         1 Pearson's Chi-squared test ...

> chisq.test(survey$in_sales, survey$disengaged) %>%
+   tidy() %>%
+   pull(p.value)
[1] 4.368222e-07
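
The pattern above can be tried end to end without the course data. The sketch below is only an illustration: the `survey` data frame is simulated here (the engagement rates of 0.35 and 0.20 are made-up assumptions), so its numbers will not match the slide. It shows that `tidy()` returns a one-row data frame and that `pull(p.value)` gives the same p-value stored in the raw test object.

```r
library(dplyr)  # %>% and pull()
library(broom)  # tidy()

set.seed(42)
n <- 500

# Hypothetical stand-in for the course's `survey` data (simulated, not real)
survey <- data.frame(
  in_sales = sample(c("Sales", "Other"), n, replace = TRUE)
)
# Assumed disengagement rates: 35% in Sales, 20% elsewhere
survey$disengaged <- rbinom(n, 1, ifelse(survey$in_sales == "Sales", 0.35, 0.20))

# Raw test object: prints nicely but is awkward to use in a pipeline
test_result <- chisq.test(survey$in_sales, survey$disengaged)

# tidy() converts the test result into a one-row data frame
tidy_result <- tidy(test_result)

# pull() extracts a single column as a plain vector
p_value <- tidy_result %>% pull(p.value)
p_value
```

Keeping the result as a data frame makes it easy to filter, join, or compare p-values across many tests.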

Omitted variable bias

When group compositions differ

Two groups of people
  A: eats little to no meat
  B: eats normal amount of meat
Group A gains weight
Conclusion: weight gain plans should exclude meat?

Omitted piece of data:
  group A is made up of infants
  group B is made up of adults

Omitted variable bias

Omitted variable bias occurs when an omitted variable is correlated with:
  the dependent variable, and
  the way the groups are divided
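
A small simulation can make the definition concrete. Everything in the sketch below is invented for illustration: salary depends only on department, and new hires are simply more common in the higher-paying department, so the true effect of being a new hire is zero. The omitted variable (department) is correlated with both the dependent variable (salary) and the grouping (new hire or not), which is exactly the situation described above.

```r
set.seed(1)
n <- 5000

# Simulated data; all rates and pay levels are assumptions for illustration
department <- sample(c("Engineering", "Sales"), n, replace = TRUE)

# Group composition differs: new hires are more common in Engineering
hire_rate <- ifelse(department == "Engineering", 0.6, 0.2)
new_hire <- ifelse(runif(n) < hire_rate, "Yes", "No")

# Salary depends on department only; the true new-hire effect is zero
salary <- ifelse(department == "Engineering", 90000, 70000) + rnorm(n, sd = 5000)

sim <- data.frame(department, new_hire, salary)

# Naive comparison: ignores department, so it absorbs the department gap
naive_gap <- mean(sim$salary[sim$new_hire == "Yes"]) -
  mean(sim$salary[sim$new_hire == "No"])

# Comparison within each department: the apparent gap largely disappears
within_gap <- sapply(split(sim, sim$department), function(d) {
  mean(d$salary[d$new_hire == "Yes"]) - mean(d$salary[d$new_hire == "No"])
})

naive_gap
within_gap
```

The naive gap is large even though no new-hire effect was built into the data, while the within-department gaps stay close to zero.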

Visualizing group composition

100% stacked bar charts

> pay %>%
+   ggplot(aes(x = new_hire, fill = department)) +
+   geom_bar(position = "fill")
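
The heights in a 100% stacked bar chart are within-group shares, and it can help to compute them directly. The sketch below does so for a toy data frame, `pay_head`, built from only the six rows printed by `head(pay)` earlier in the deck; the name `pay_head` is introduced here, and six rows are far too few to represent the full `pay` data.

```r
library(dplyr)

# Toy data: the six rows shown by head(pay), not the full dataset
pay_head <- data.frame(
  department = c("Sales", rep("Engineering", 5)),
  new_hire   = c("No", "No", "Yes", "Yes", "No", "No")
)

composition <- pay_head %>%
  count(new_hire, department) %>%   # employees per new_hire/department cell
  group_by(new_hire) %>%
  mutate(share = n / sum(n)) %>%    # share of each department within the group
  ungroup()

composition
```

Each `share` value corresponds to one colored segment of a bar, and the shares within a `new_hire` group sum to 1.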

Using linear regression

Linear regression

  Focus on testing differences between groups
  Learn more about other uses and the math under the hood at DataCamp

Simple linear regression

> lm(salary ~ new_hire, data = pay) %>%
+   tidy()
         term  estimate std.error  statistic    p.value
1 (Intercept) 73424.603  577.2369 127.200112 0.00000000
2 new_hireYes  2649.672 1109.3568   2.388476 0.01704414

# A tibble: 2 x 2
  new_hire avg_salary
     <chr>      <dbl>
1       No   73424.60
2      Yes   76074.28

> 76074.28 - 73424.60
[1] 2649.68
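
The slide's point is that, with a single two-level predictor, the regression coefficient equals the difference between the two group means. The sketch below checks this on a toy data frame, `pay_head`, built from only the six rows printed by `head(pay)`; the name is introduced here, and the resulting numbers will differ from those of the full dataset.

```r
library(dplyr)

# Toy data: the six rows shown by head(pay), not the full dataset
pay_head <- data.frame(
  salary   = c(103263.64, 80708.64, 60737.05, 99116.32, 51021.64, 98399.87),
  new_hire = c("No", "No", "Yes", "Yes", "No", "No")
)

# Average salary per group
avg_by_group <- pay_head %>%
  group_by(new_hire) %>%
  summarize(avg_salary = mean(salary))

# Difference in means: new hires minus existing employees
diff_in_means <- avg_by_group$avg_salary[avg_by_group$new_hire == "Yes"] -
  avg_by_group$avg_salary[avg_by_group$new_hire == "No"]

# Coefficient on new_hireYes from the simple regression
slope <- coef(lm(salary ~ new_hire, data = pay_head))[["new_hireYes"]]

# TRUE when the two quantities agree up to floating-point tolerance
all.equal(slope, diff_in_means)
```

The intercept plays the matching role for the baseline group: it equals the average salary where `new_hire` is "No".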

Significance for linear regression

> lm(salary ~ new_hire, data = pay) %>%
+   tidy()
         term  estimate std.error  statistic    p.value
1 (Intercept) 73424.603  577.2369 127.200112 0.00000000
2 new_hireYes  2649.672 1109.3568   2.388476 0.01704414
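
The `statistic` and `p.value` columns can be reproduced by hand from the `estimate` and `std.error` columns. The sketch below uses the `new_hireYes` row from the output above. The sample size of 1470 is an assumption, inferred from the `summary()` output elsewhere in this deck (1466 residual degrees of freedom plus 4 coefficients); the 0.05 threshold is a conventional choice, not something the slide states.

```r
# Values copied from the new_hireYes row of the tidy() output
estimate  <- 2649.672
std_error <- 1109.3568

# Assumption: 1470 observations; this model estimates two coefficients
n_obs <- 1470
df_residual <- n_obs - 2

# t statistic: how many standard errors the estimate is from zero
t_statistic <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_statistic), df = df_residual)

# Conventional 5% significance threshold (an assumption)
alpha <- 0.05
is_significant <- p_value < alpha

c(t_statistic = t_statistic, p_value = p_value)
is_significant
```

A p-value below the threshold suggests the salary difference between new hires and existing employees is unlikely to be due to chance alone, though it says nothing yet about why the difference exists.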

Multiple linear regression

> lm(salary ~ new_hire + department, data = pay) %>%
+   tidy()
               term  estimate std.error  statistic    p.value
1       (Intercept) 72844.040  679.3007 107.233869 0.00000000
2       new_hireYes  2649.028 1108.9698   2.388728 0.01903265
3 departmentFinance  3092.807 2457.0717   1.258737 0.20832572
4   departmentSales  1477.215 1082.4749   1.364665 0.17256792
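
Reading these coefficients is easier with a worked prediction. The sketch below assumes that Engineering is the reference department (it has no coefficient of its own and sorts first alphabetically) and that existing employees are the reference for `new_hire`; under those assumptions, a predicted salary is the intercept plus whichever coefficients apply.

```r
# Coefficients copied from the tidy() output on the slide
coefs <- c(
  "(Intercept)"       = 72844.040,
  "new_hireYes"       = 2649.028,
  "departmentFinance" = 3092.807,
  "departmentSales"   = 1477.215
)

# Assumed baseline: an existing employee in Engineering
existing_engineering <- coefs[["(Intercept)"]]

# Existing employee in Sales: add the Sales adjustment
existing_sales <- coefs[["(Intercept)"]] + coefs[["departmentSales"]]

# New hire in Sales: add both the new-hire and the Sales adjustments
new_hire_sales <- coefs[["(Intercept)"]] + coefs[["new_hireYes"]] +
  coefs[["departmentSales"]]

# Within the same department, the new-hire gap is the new_hireYes coefficient
gap_within_sales <- new_hire_sales - existing_sales

c(existing_sales = existing_sales,
  new_hire_sales = new_hire_sales,
  gap_within_sales = gap_within_sales)
```

Because department is held fixed in this comparison, the `new_hireYes` estimate is the new-hire difference after accounting for department.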

Using summary()

> lm(salary ~ new_hire + department, data = pay) %>%
+   summary()

Call:
lm(formula = salary ~ new_hire + department, data = pay)

Residuals:
   Min     1Q Median     3Q    Max
-31674 -14446  -3629  10657  88580

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)        72844.0      679.3 107.234   <2e-16 ***
new_hireYes         2649.0     1109.0   2.389    0.017 *
departmentFinance   3092.8     2457.1   1.259    0.208
departmentSales     1477.2     1082.5   1.365    0.173

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 18890 on 1466 degrees of freedom
Multiple R-squared:  0.005923, Adjusted R-squared:  0.003889
F-statistic: 2.912 on 3 and 1466 DF,  p-value: 0.03338