Parameters and confidence intervals
FOUNDATIONS OF INFERENCE
Jo Hardin, Instructor
Research questions
Hypothesis test: Under which diet plan will participants lose more weight on average?
Confidence interval: How much should participants expect to lose on average?
Hypothesis test: Which of two car manufacturers are users more likely to recommend to their friends?
Confidence interval: What percent of users are likely to recommend Subaru to their friends?
Hypothesis test: Are education level and average income linearly related?
Confidence interval: For each additional year of education, what is the predicted average income?
Parameter
A parameter is a numerical value from the population.
Examples (continued):
The true average amount all dieters will lose on a particular program
The proportion of individuals in a population who recommend Subaru cars
The average income of all individuals in the population with a particular education level
Confidence interval
Range of numbers that (hopefully) captures the true parameter
"95% confident that between 12% and 34% of the entire population recommends Subarus"
Let's practice!
Bootstrapping
Hypothesis testing
How do samples from the null population vary?
Statistic, proportion of successes in sample → p-hat
Parameter, proportion of successes in population → p
Confidence intervals
No null population, unlike in hypothesis testing
How do p and p-hat vary?
Polling

Candidate X   Total voters   Proportion X
17            30             0.5667

# Original data
Source: local data frame [30 x 3]
  flip_num  flip
     <int> <chr>
1        1     H
2        2     H
3        3     H
4        4     T
5        5     H
6        6     H
# ... with 24 more rows
Polling

Candidate X   Total voters   Proportion X
17            30             0.5667
14            30             0.4667

# First resample
Source: local data frame [30 x 3]
  replicate flip_num  flip
      <dbl>    <int> <chr>
1         1        7     H
2         1       17     T
3         1       13     H
4         1       14     H
5         1       24     H
6         1       28     T
# ... with 24 more rows
Polling

Candidate X   Total voters   Proportion X
17            30             0.5667
14            30             0.4667
18            30             0.6

# Second resample
Source: local data frame [30 x 3]
   replicate flip_num  flip
       <dbl>    <int> <chr>
1          2       21     H
2          2       19     T
3          2       25     H
4          2       24     T
5          2       21     H
6          2       28     T
7          2       13     H
8          2       23     H
9          2       24     T
10         2       24     T
# ... with 20 more rows
Polling

Candidate X   Total voters   Proportion X
17            30             0.5667
14            30             0.4667
18            30             0.6
12            30             0.4

# Third resample
Source: local data frame [30 x 3]
   replicate flip_num  flip
       <dbl>    <int> <chr>
1          3        6     H
2          3       19     H
3          3        1     H
4          3       24     T
5          3       11     H
6          3       28     T
7          3       16     H
8          3       13     H
9          3       21     T
10         3       29     H
# ... with 20 more rows
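The resampling step shown in the polling slides can be sketched in base R. This is a minimal illustration, not the course's own code: the vector `votes`, the helper `resample_once`, and the seed are hypothetical names and values chosen here; only the counts (17 of 30 for Candidate X) come from the slides.

```r
# Hypothetical poll matching the slide: 17 of 30 voters favor Candidate X.
set.seed(1)  # arbitrary seed, used only so the sketch is reproducible
votes <- c(rep("X", 17), rep("other", 13))

# One bootstrap resample: draw 30 votes from the original 30, with replacement.
# (resample_once is an illustrative helper, not a library function.)
resample_once <- function(x) sample(x, size = length(x), replace = TRUE)

one_resample <- resample_once(votes)

# Proportion for Candidate X in the original poll and in the resample
p_hat_original <- mean(votes == "X")
p_hat_resample <- mean(one_resample == "X")

print(p_hat_original)
print(p_hat_resample)
```

Because sampling is done with replacement, some voters appear more than once in a resample and others not at all, which is why the resampled proportion can differ from the original one.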
Standard error
Obtained standard error of 0.09 by resampling many times
Describes how the statistic varies around the parameter
Bootstrap provides an approximation of the standard error
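As a rough sketch of how a bootstrap standard error like this could be obtained, the base R code below resamples a poll of 30 voters many times and takes the standard deviation of the resampled proportions. The 17-of-30 poll is taken from the polling slides; the number of resamples, the seed, and the variable names are assumptions made for illustration.

```r
set.seed(2)  # arbitrary seed, for reproducibility only

# Poll from the polling slides, coded 1 = Candidate X, 0 = other (17 of 30)
votes <- c(rep(1, 17), rep(0, 13))

# Assumed number of bootstrap resamples; the slides only say "many times"
n_reps <- 5000

# p-hat for each resample of the same size, drawn with replacement
boot_props <- replicate(n_reps, mean(sample(votes, replace = TRUE)))

# Bootstrap approximation of the standard error of p-hat
boot_se <- sd(boot_props)
print(boot_se)
```

With these inputs the result should land near the 0.09 quoted on the slide, although the exact value depends on the random draws.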
Variability of p-hat from the population

library(dplyr)

# Compute p-hat for each poll
ex1_props <- recommend %>%
  group_by(poll) %>%
  summarize(prop_yes = mean(vote == "yes"))

# Variability of p-hat
ex1_props %>%
  summarize(sd(prop_yes))

# A tibble: 1 × 1
  `sd(prop_yes)`
           <dbl>
1     0.08523512
Variability of p-hat from the sample (bootstrapping)

library(dplyr)
library(infer)

# Select one poll from which to resample
one_poll <- all_polls %>%
  filter(poll == 1) %>%
  select(vote)

# Compute p-hat for each resampled poll
ex2_props <- one_poll %>%
  specify(response = vote, success = "yes") %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "prop")  # stat = proportion of "yes" in each resample

# Variability of p-hat
ex2_props %>%
  summarize(sd(stat))

# A tibble: 1 × 1
  `sd(stat)`
       <dbl>
1 0.08691885
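The comparison between the two approaches can be reproduced without the course data. The base R sketch below is only an illustration under stated assumptions: the population proportion `true_p = 0.6`, the poll size of 30, and the count of 1000 polls and resamples are hypothetical values, since the `all_polls` data set is not included in the slides.

```r
set.seed(3)      # arbitrary seed, for reproducibility only
true_p  <- 0.6   # hypothetical population proportion (assumption)
n       <- 30    # voters per poll (assumption)
n_polls <- 1000  # number of polls and of bootstrap resamples (assumption)

# (1) Variability of p-hat across many polls drawn from the population
pop_props     <- rbinom(n_polls, size = n, prob = true_p) / n
sd_population <- sd(pop_props)

# (2) Variability of p-hat across bootstrap resamples of a single poll
one_poll     <- rbinom(n, size = 1, prob = true_p)
boot_props   <- replicate(n_polls, mean(sample(one_poll, replace = TRUE)))
sd_bootstrap <- sd(boot_props)

print(c(population = sd_population, bootstrap = sd_bootstrap))
```

The two standard deviations are usually similar in size, which is the point of the slides: resampling one poll approximates the variability that would be seen across many polls.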
Let's practice!
Variability in p-hat
How far are the data from the parameter?
Standard error of p-hat
Empirical rule
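The empirical rule says that, for a roughly bell-shaped distribution, about 95% of values fall within 2 standard errors of the center. A small simulation can illustrate this for p-hat. Everything specific below is an assumption for illustration: the population proportion 0.6, the sample size 30, the number of simulated samples, and the use of the standard formula sqrt(p(1 - p) / n) for the standard error.

```r
set.seed(4)    # arbitrary seed, for reproducibility only
true_p <- 0.6  # hypothetical population proportion (assumption)
n      <- 30   # sample size (assumption)

# Standard error of p-hat from the usual formula for a proportion
se <- sqrt(true_p * (1 - true_p) / n)

# p-hat from many simulated samples of size n
p_hats <- rbinom(10000, size = n, prob = true_p) / n

# Fraction of p-hats that land within 2 standard errors of the parameter
within_2se <- mean(abs(p_hats - true_p) <= 2 * se)
print(within_2se)
```

The fraction is typically close to 0.95, though not exactly, because p-hat from 30 observations is discrete and only approximately bell-shaped.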
Let's practice!
Interpreting CIs and technical conditions
Creating CIs

# Compare confidence intervals
one_poll_boot %>% summarize(
  lower = p_hat - 2 * sd(prop_yes_boot),
  upper = p_hat + 2 * sd(prop_yes_boot))

# A tibble: 1 × 2
     lower    upper
     <dbl>    <dbl>
1 0.536148 0.863852

# Find 2.5% and 97.5% of p-hat vals
one_poll_boot %>% summarize(
  q025_prop = quantile(prop_yes_boot, p = .025),
  q975_prop = quantile(prop_yes_boot, p = .975))

# A tibble: 1 × 2
  q025_prop q975_prop
      <dbl>     <dbl>
1 0.5333333 0.8333333
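Both intervals can be sketched end to end in base R. The slides do not show how `one_poll_boot` or `p_hat` were built, so the code below assumes a poll of 30 voters with 21 answering "yes" (p-hat = 0.7, the midpoint of the first interval on the slide) and 1000 resamples; these inputs and the variable names are illustrative, not the course's data.

```r
set.seed(5)  # arbitrary seed, for reproducibility only

# Assumed poll: 21 "yes" (1) and 9 "no" (0) out of 30, so p-hat = 0.7
one_poll <- c(rep(1, 21), rep(0, 9))
p_hat    <- mean(one_poll)

# Bootstrap distribution of p-hat (1000 resamples is an assumption)
prop_yes_boot <- replicate(1000, mean(sample(one_poll, replace = TRUE)))

# Interval 1: p-hat plus or minus 2 bootstrap standard errors
t_ci <- c(lower = p_hat - 2 * sd(prop_yes_boot),
          upper = p_hat + 2 * sd(prop_yes_boot))

# Interval 2: middle 95% of the bootstrap distribution
percentile_ci <- quantile(prop_yes_boot, probs = c(0.025, 0.975))

print(t_ci)
print(percentile_ci)
```

The first interval is symmetric around p-hat by construction, while the percentile interval follows the shape of the bootstrap distribution, so the two agree closely only when that distribution is roughly symmetric.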
Motivating CIs
Goal is to find the parameter when all we know is the statistic
We never know whether the interval built from the sample we collected actually contains the true parameter
Interpreting the CIs
Bootstrap t-CI: (0.536, 0.864)
Percentile interval: (0.533, 0.833)
We are 95% confident that the true proportion of people planning to vote for candidate X is between 0.536 and 0.864 (or 0.533 and 0.833)
Technical conditions
Sampling distribution of the statistic is reasonably symmetric and bell-shaped
Sample size is reasonably large
[Figure: Variability of resampled proportions]
Let's practice!
Summary of statistical inference
Inference
Testing
H0: There is no gender discrimination in hiring
HA: Men are more likely to be promoted than women
Estimation
What proportion of the voters will select candidate X?
Bootstrapping
Congratulations!