Welcome to the course!
Foundations of Inference
Jo Hardin, Instructor

What is statistical inference?
The process of making claims about a population based on information from a sample.

Assume two populations prefer cola at the same rate
The sample data
The sample data (take 2)

Vocabulary
Null hypothesis (H0): the claim that is not that interesting
Alternative hypothesis (HA): the claim corresponding to the research hypothesis
The "goal" is to disprove the null hypothesis

Example: cheetah speed
Compare the speed of two different subspecies of cheetah
H0: Asian and African cheetahs run at the same speed, on average
HA: African cheetahs are faster than Asian cheetahs, on average

Example: election
From a sample, the researchers would like to claim that Candidate X will win
H0: Candidate X will get half the votes
HA: Candidate X will get more than half the votes

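The hypotheses in the two examples above can also be written in symbols. This is only a sketch in notation the slides do not use: $\mu_{\text{African}}$ and $\mu_{\text{Asian}}$ are assumed names for the average running speed of each subspecies, and $p$ is an assumed name for the true proportion of votes for Candidate X.

```latex
% Cheetah speed: one-sided comparison of two population means
H_0: \mu_{\text{African}} = \mu_{\text{Asian}}
\qquad
H_A: \mu_{\text{African}} > \mu_{\text{Asian}}

% Election: one-sided claim about a single population proportion
H_0: p = 0.5
\qquad
H_A: p > 0.5
```

In both cases the null hypothesis carries the equality and the alternative carries the direction the researchers hope to show.
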
Let's practice!

Randomized distributions

Logic of inference

Understanding the null distribution
Generating a distribution of the statistic from the null population gives information about whether the observed data are inconsistent with the null hypothesis.

Understanding the null distribution: original data

    Location   Cola   Orange
    East         28        6
    West         19        7

$\hat{p}_{\text{east}} = 28/(28 + 6) \approx 0.82$
$\hat{p}_{\text{west}} = 19/(19 + 7) \approx 0.73$

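The two sample proportions can be recomputed from the counts. The following is a minimal base R sketch, not code from the slides; the object names `counts`, `p_hat`, and `diff_obs` are illustrative assumptions.

```r
# Counts copied from the original-data table; `counts` is an illustrative name.
counts <- matrix(c(28, 6,
                   19, 7),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(location = c("East", "West"),
                                 drink = c("cola", "orange")))

# Proportion of cola drinkers within each location (row-wise proportions).
p_hat <- counts[, "cola"] / rowSums(counts)
round(p_hat, 2)

# Observed difference in proportions, West minus East.
diff_obs <- unname(p_hat["West"] - p_hat["East"])
diff_obs
```

The West-minus-East ordering is chosen here to match the `order = c("west", "east")` argument used later in the slides.
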
Understanding the null distribution: first shuffle, same as original

    Location   Cola   Orange
    East         28        6
    West         19        7

Understanding the null distribution: second shuffle

    Location   Cola   Orange
    East         27        7
    West         20        6

Understanding the null distribution: third shuffle

    Location   Cola   Orange
    East         26        8
    West         21        5

Understanding the null distribution: fourth shuffle

    Location   Cola   Orange
    East         25        9
    West         22        4

Understanding the null distribution: fifth shuffle

    Location   Cola   Orange
    East         29        5
    West         18        8

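A single shuffle like the ones tabulated above can be sketched in base R. This is an illustration under assumptions, not the slides' own code: the `soda` data frame is rebuilt here from the original counts, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Rebuild the 60 individual responses from the original counts (assumed layout).
soda <- data.frame(
  location = rep(c("East", "West"), times = c(34, 26)),
  drink    = c(rep(c("cola", "orange"), times = c(28, 6)),
               rep(c("cola", "orange"), times = c(19, 7)))
)

# One shuffle: permute the drink labels while leaving location fixed.
shuffled <- transform(soda, drink = sample(drink))

# The row and column totals match the original table; only the cells change.
tab <- table(shuffled$location, shuffled$drink)
tab
rowSums(tab)  # East 34, West 26
colSums(tab)  # cola 47, orange 13
```

Because shuffling only reassigns drink labels to locations, every shuffled table keeps 34 East and 26 West respondents, and 47 cola and 13 orange responses.
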
One random permutation

    # Observed difference in proportions (West minus East), using dplyr
    library(dplyr)
    soda %>%
      group_by(location) %>%
      summarize(prop_cola = mean(drink == "cola")) %>%
      summarize(diff(prop_cola))

    # A tibble: 1 x 1
      `diff(prop_cola)`
                  <dbl>
    1       -0.09276018

    # Difference in proportions for one permuted data set, using infer
    library(infer)
    soda %>%
      specify(drink ~ location, success = "cola") %>%
      hypothesize(null = "independence") %>%
      generate(reps = 1, type = "permute") %>%
      calculate(stat = "diff in props", order = c("west", "east"))

    # A tibble: 1 x 2
      replicate        stat
          <int>       <dbl>
    1         1 -0.02488688

Many random permutations

    soda %>%
      specify(drink ~ location, success = "cola") %>%
      hypothesize(null = "independence") %>%
      generate(reps = 5, type = "permute") %>%
      calculate(stat = "diff in props", order = c("west", "east"))

    # A tibble: 5 x 2
      replicate        stat
          <int>       <dbl>
    1         1  0.04298643
    2         2 -0.09276018
    3         3  0.11085973
    4         4  0.17873303
    5         5 -0.16063348

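To see what the permutation pipeline is doing, the same idea can be sketched without `infer`. The base R code below is an assumption-laden illustration: the vectors are rebuilt from the original counts, `diff_in_props` is a hypothetical helper, and the seed is arbitrary, so the printed values will not match the output shown in the slides.

```r
set.seed(42)  # arbitrary seed for reproducibility of this sketch

# Individual responses rebuilt from the original counts (assumed layout).
location <- rep(c("East", "West"), times = c(34, 26))
drink <- c(rep(c("cola", "orange"), times = c(28, 6)),
           rep(c("cola", "orange"), times = c(19, 7)))

# Hypothetical helper: cola proportion in West minus cola proportion in East.
diff_in_props <- function(drink, location) {
  sum(drink == "cola" & location == "West") / sum(location == "West") -
    sum(drink == "cola" & location == "East") / sum(location == "East")
}

# Five permuted statistics, analogous to reps = 5 in the pipeline.
perm_stats <- replicate(5, diff_in_props(sample(drink), location))
perm_stats
```

Each call to `sample(drink)` breaks any link between drink and location, which is what the independence null hypothesis asserts.
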
Random distribution

Let's practice!

Using the randomization distribution

Understanding the null distribution

Data consistent with null?

    table(soda)

            location
    drink    East West
      cola     28   19
      orange    6    7

    soda %>%
      group_by(location) %>%
      summarize(mean(drink == "cola"))

    # A tibble: 2 × 2
      location `mean(drink == "cola")`
        <fctr>                   <dbl>
    1     East               0.8235294
    2     West               0.7307692

Significance

How extreme are the observed data?

    diff_orig <- soda %>%
      group_by(location) %>%
      summarize(prop_cola = mean(drink == "cola")) %>%
      summarize(diff(prop_cola)) %>%
      pull()

    soda_perm <- soda %>%
      specify(drink ~ location, success = "cola") %>%
      hypothesize(null = "independence") %>%
      generate(reps = 100, type = "permute") %>%
      calculate(stat = "diff in props", order = c("west", "east"))

    soda_perm %>%
      summarize(proportion = mean(diff_orig >= stat))

    # A tibble: 1 x 1
      proportion
           <dbl>
    1      0.380

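The proportion above is based on only 100 permutations, so it will vary from run to run. As a rough check, the sketch below repeats the calculation in base R with more permutations and compares it with an exact hypergeometric probability. None of this is from the slides: the vectors are rebuilt from the original counts, `diff_in_props` is a hypothetical helper, and the seed and the choice of 10,000 permutations are arbitrary.

```r
set.seed(2024)  # arbitrary seed for reproducibility of this sketch

# Individual responses rebuilt from the original counts (assumed layout).
location <- rep(c("East", "West"), times = c(34, 26))
drink <- c(rep(c("cola", "orange"), times = c(28, 6)),
           rep(c("cola", "orange"), times = c(19, 7)))

# Hypothetical helper: cola proportion in West minus cola proportion in East.
diff_in_props <- function(drink, location) {
  sum(drink == "cola" & location == "West") / sum(location == "West") -
    sum(drink == "cola" & location == "East") / sum(location == "East")
}

diff_orig <- diff_in_props(drink, location)

# Proportion of permuted statistics at or below the observed difference.
perm_stats <- replicate(10000, diff_in_props(sample(drink), location))
prop_sim <- mean(diff_orig >= perm_stats)

# Exact counterpart: a permuted statistic is at or below the observed one
# exactly when East receives at least 28 of the 47 cola responses. Under
# shuffling, the East cola count is hypergeometric (47 cola, 13 orange,
# 34 drawn for East), so we need P(X >= 28) = P(X > 27).
prop_exact <- phyper(27, m = 47, n = 13, k = 34, lower.tail = FALSE)

c(simulated = prop_sim, exact = prop_exact)
```

With many permutations the simulated proportion should settle close to the exact value, whereas a 100-permutation estimate can sit noticeably above or below it.
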
Let's practice!

Study conclusions

Significance
We fail to reject the null hypothesis: there is no evidence that our data are inconsistent with the null hypothesis.

NHANES: random sample
Representative sample of the US population
Conclusions from the sample may apply to the population
Nothing to report in this case

Let's practice!