
Introduction to data display Useful questions to ask when - PowerPoint PPT Presentation

Introduction to data display. Useful questions to ask when considering how to display information: What do you want to show? What methods are available for this? Is the method chosen the best? Would another have been better?


  1. Chi-square • One of the most commonly used statistical tests. • Used to test whether two or more percentages differ. • For example, suppose that in a study of 933 patients with a hip fracture, 10% of the men (22/219) develop pneumonia compared with 5% of the women (36/714). • What is the probability that this could happen by chance alone? • Univariate, difference, unmatched, nominal, ≥2 groups, n ≥ 20.
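
The hip-fracture example can be worked through directly. The sketch below uses only the Python standard library to compute the Pearson chi-square statistic for the 2 by 2 table implied by the slide's counts, then converts it to a p-value with the 1-degree-of-freedom identity p = erfc(sqrt(chi2 / 2)). The function name `chi_square_2x2` is illustrative, and the sketch assumes no continuity correction; the slides do not say whether one was applied.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts from the slide: 22 of 219 men and 36 of 714 women developed pneumonia.
chi2, p = chi_square_2x2(22, 219 - 22, 36, 714 - 36)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

On these counts the statistic is about 7.2, which exceeds the 0.01 critical value of 6.635 for 1 degree of freedom, so a difference this large would be unlikely under chance alone.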

  2. Chi-square example (SPSS-style crosstabulation and chi-square test output for the 933 cases; the values are not recoverable from the extracted text.)

  3. Fisher’s Exact Test • This test can be used for 2 by 2 tables when the number of cases is too small to satisfy the assumptions of the chi-square test: – Total number of cases is <20 or – The expected number of cases in any cell is <1 or – More than 25% of the cells have expected frequencies <5.
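
For a table this small the exact p-value can be computed from the hypergeometric distribution. The following is a minimal sketch using only the standard library; the function name `fisher_exact_2x2` and the example counts are hypothetical, and the two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is one common convention, not necessarily the one used to produce the slides.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with margins fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical small table (8 cases in total), not data from the slides.
p_value = fisher_exact_2x2(3, 1, 1, 3)
print(round(p_value, 4))
```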

  4. (SPSS-style crosstabulation and test output for the 933 cases, apparently the example for Fisher’s exact test; the slide title and values are not recoverable from the extracted text.)

  5. Student’s t-test • Used to compare the average (mean) in one group with the average in another group. • Is the average age of patients significantly different between those who developed pneumonia and those who did not? • Univariate, Difference, Unmatched, Interval, Normal, 2 groups.
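
The comparison of two group means can be sketched as follows. This is an illustrative standard-library implementation of the pooled-variance t statistic, assuming equal variances in the two groups; the function name `student_t` and the ages are hypothetical. It returns the statistic and its degrees of freedom only; the p-value would then be read from a t distribution.

```python
import math
import statistics

def student_t(group_a, group_b):
    """Pooled-variance (Student's) t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, df

# Hypothetical ages, not data from the slides.
ages_pneumonia = [81, 85, 79, 90, 84]
ages_no_pneumonia = [75, 80, 78, 72, 77]
t, df = student_t(ages_pneumonia, ages_no_pneumonia)
print(f"t = {t:.2f} on {df} degrees of freedom")
```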

  6. (SPSS-style output, apparently the independent-samples test table for the Student’s t-test example; the values are not recoverable from the extracted text.)

  7. Mann-Whitney U test • Same as the Wilcoxon rank-sum test. • Used in place of the Student’s t-test when the data are skewed. • A nonparametric test that uses the rank of the value rather than the actual value. • Univariate, Difference, Unmatched, Interval, Nonnormal, 2 groups.
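
To show what "uses the rank of the value" means in practice, here is a minimal sketch that ranks the pooled observations (ties share the average rank) and derives the U statistic for each group from the rank sums. The helper names and the data are hypothetical, and the sketch stops at U rather than computing a p-value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) computed from the rank sums of the pooled sample."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a

# Hypothetical lengths of stay in days, not data from the slides.
print(mann_whitney_u([3, 4, 5, 30], [6, 7, 8, 9]))
```

The extreme value 30 contributes only its rank (8), so it cannot dominate the result the way it would dominate a mean.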

  8. Paired t-test • Used to compare the average for measurements made twice within the same person - before vs. after. • Used to compare a treatment group and a matched control group. • For example, did the systolic blood pressure change significantly from the scene of the injury to admission? • Univariate, Difference, Matched, Interval, Normal, 2 groups.
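
Because each person serves as their own control, the paired test works on the within-pair differences. A minimal sketch, with a hypothetical function name `paired_t` and hypothetical blood pressures:

```python
import math
import statistics

def paired_t(before, after):
    """Paired t statistic: mean within-pair difference divided by its standard error."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

# Hypothetical systolic pressures (mmHg) for four patients, not data from the slides.
scene = [120, 130, 140, 150]
admission = [118, 125, 138, 141]
t, df = paired_t(scene, admission)
print(f"t = {t:.2f} on {df} degrees of freedom")
```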

  9. Wilcoxon signed-rank test • Used to compare two skewed continuous variables that are paired or matched. • Nonparametric equivalent of the paired t-test. • For example, “Was the Glasgow Coma Scale score different between the scene and admission?” • Univariate, Difference, Matched, Interval, Nonnormal, 2 groups.

  10. ANOVA • One-way: used to compare 3 or more means from independent groups. “Is the age different between White, Black, and Hispanic patients?” • Two-way: used to compare 2 or more means by 2 or more factors. “Is the age different between males and females, with and without pneumonia?”

  11. Tests of Between-Subjects Effects. Dependent Variable: AGE

      Source          Type III Sum of Squares   df    Mean Square   F          Sig.
      Model           5769944 (a)               4     1442486       8664.775   .000
      SEX             1981.683                  1     1981.683      11.904     .001
      PNEUMON         1299.320                  1     1299.320      7.805      .005
      SEX * PNEUMON   519.282                   1     519.282       3.119      .078
      Error           154657.2                  929   166.477
      Total           5924601                   933

      a. R Squared = .974 (Adjusted R Squared = .974)
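
Each F value in this output is the effect's mean square divided by the error mean square. The short check below recomputes the three effect F ratios from the sums of squares and degrees of freedom as read from the slide (the error sum of squares is read as 154657.2); the variable names are illustrative.

```python
# Values copied from the Tests of Between-Subjects Effects output.
ss_error, df_error = 154657.2, 929
effects = {
    "SEX": (1981.683, 1),
    "PNEUMON": (1299.320, 1),
    "SEX * PNEUMON": (519.282, 1),
}

ms_error = ss_error / df_error  # error mean square
f_ratios = {}
for name, (ss, df) in effects.items():
    # F = (effect sum of squares / effect df) / error mean square
    f_ratios[name] = (ss / df) / ms_error
    print(f"{name}: F = {f_ratios[name]:.3f}")
```

The recomputed ratios agree with the F column to three decimals, which also confirms the reading of the garbled Error row.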

  12. Kruskal-Wallis One-Way ANOVA • Used to compare continuous variables that are not normally distributed between more than 2 groups. • Nonparametric equivalent to the one-way ANOVA. • Is the length of stay different by ethnicity? • Menu path: Analyze, Nonparametric Tests, K Independent Samples.

  13. Repeated-Measures ANOVA • Used to assess the change in 2 or more continuous measurements made on the same person. Can also compare groups and adjust for covariates. • Do changes in the vital signs within the first 24 hours of a hip fracture predict which patients will develop pneumonia? • Menu path: Analyze, General Linear Model, Repeated Measures.

  14. Pearson Correlation • Used to assess the linear association between two continuous variables. – r=1.0 perfect positive correlation – r=0.0 no linear correlation – r=-1.0 perfect inverse correlation • Univariate, Association, Interval
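
The three reference values of r can be reproduced with a few lines of standard-library Python. This is a sketch with an illustrative function name `pearson_r` and made-up data; the third data set shows that r near 0 means no linear association, not necessarily no relationship at all.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))  # perfectly linear, increasing
print(pearson_r(x, [10, 8, 6, 4, 2]))  # perfectly linear, decreasing
print(pearson_r(x, [2, 1, 0, 1, 2]))   # symmetric V shape, no linear trend
```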

  15. Correlations (Pearson). Variables: (1) AGE; (2) 49-DAYS IN HOSPITAL; (3) NUMBER OF COMORBIDITIES (0-9); (4) 43-TOTAL NUMBER OF COMPLICATIONS; (5) 35-SYSTOLIC BLOOD PRESSURE FIRST ER; (6) 35-GLASGOW COMA SCALE FIRST ER; (7) 35-PULSE FIRST ER.

      Row  Statistic            (1)      (2)      (3)      (4)      (5)      (6)      (7)
      (1)  Pearson Correlation  1.000    .088**   .211**   .137**   .149**   -.030    -.008
           Sig. (2-tailed)      .        .007     .000     .000     .000     .356     .809
           N                    933      933      933      933      925      926      923
      (2)  Pearson Correlation  .088**   1.000    .167**   .453**   .039     .016     .022
           Sig. (2-tailed)      .007     .        .000     .000     .237     .633     .499
           N                    933      933      933      933      925      926      923
      (3)  Pearson Correlation  .211**   .167**   1.000    .222**   .034     -.079*   .055
           Sig. (2-tailed)      .000     .000     .        .000     .296     .017     .093
           N                    933      933      933      933      925      926      923
      (4)  Pearson Correlation  .137**   .453**   .222**   1.000    -.033    -.028    .046
           Sig. (2-tailed)      .000     .000     .000     .        .310     .393     .161
           N                    933      933      933      933      925      926      923
      (5)  Pearson Correlation  .149**   .039     .034     -.033    1.000    .043     .069*
           Sig. (2-tailed)      .000     .237     .296     .310     .        .196     .035
           N                    925      925      925      925      925      925      923
      (6)  Pearson Correlation  -.030    .016     -.079*   -.028    .043     1.000    -.100**
           Sig. (2-tailed)      .356     .633     .017     .393     .196     .        .002
           N                    926      926      926      926      925      926      923
      (7)  Pearson Correlation  -.008    .022     .055     .046     .069*    -.100**  1.000
           Sig. (2-tailed)      .809     .499     .093     .161     .035     .002     .
           N                    923      923      923      923      923      923      923

      **. Correlation is significant at the 0.01 level (2-tailed).
      *. Correlation is significant at the 0.05 level (2-tailed).

  16. Spearman rank-order correlation • Used to assess the relationship between two ordinal variables or two skewed continuous variables. • Nonparametric equivalent of the Pearson correlation. • Univariate, Association, Ordinal (or skewed).
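
Spearman's coefficient can be obtained by replacing each value with its rank and then computing the ordinary Pearson correlation on the ranks. The sketch below does exactly that with the standard library; the helper names and the data are hypothetical, and tied values are assumed to share the average of their ranks.

```python
import math

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    ordered = sorted(values)
    # ordered.index(v) is the first 0-based position of v; count(v) is the tie size.
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y rises with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, any strictly increasing relationship gives the maximum coefficient even when the relationship is far from a straight line.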

  17. Correlations (Spearman's rho). Variables: (1) AGE; (2) 49-DAYS IN HOSPITAL; (3) NUMBER OF COMORBIDITIES (0-9); (4) 43-TOTAL NUMBER OF COMPLICATIONS; (5) 35-SYSTOLIC BLOOD PRESSURE FIRST ER; (6) 35-GLASGOW COMA SCALE FIRST ER; (7) 35-PULSE FIRST ER.

      Row  Statistic                (1)      (2)      (3)      (4)      (5)      (6)      (7)
      (1)  Correlation Coefficient  1.000    .089**   .158**   .145**   .091**   -.146**  -.008
           Sig. (2-tailed)          .        .007     .000     .000     .005     .000     .806
           N                        933      933      933      933      925      926      923
      (2)  Correlation Coefficient  .089**   1.000    .142**   .389**   .073*    .048     .037
           Sig. (2-tailed)          .007     .        .000     .000     .027     .149     .268
           N                        933      933      933      933      925      926      923
      (3)  Correlation Coefficient  .158**   .142**   1.000    .229**   .037     -.091**  .042
           Sig. (2-tailed)          .000     .000     .        .000     .257     .006     .202
           N                        933      933      933      933      925      926      923
      (4)  Correlation Coefficient  .145**   .389**   .229**   1.000    -.014    -.076*   .043
           Sig. (2-tailed)          .000     .000     .000     .        .676     .020     .196
           N                        933      933      933      933      925      926      923
      (5)  Correlation Coefficient  .091**   .073*    .037     -.014    1.000    .079*    .080*
           Sig. (2-tailed)          .005     .027     .257     .676     .        .017     .015
           N                        925      925      925      925      925      925      923
      (6)  Correlation Coefficient  -.146**  .048     -.091**  -.076*   .079*    1.000    -.038
           Sig. (2-tailed)          .000     .149     .006     .020     .017     .        .252
           N                        926      926      926      926      925      926      923
      (7)  Correlation Coefficient  -.008    .037     .042     .043     .080*    -.038    1.000
           Sig. (2-tailed)          .806     .268     .202     .196     .015     .252     .
           N                        923      923      923      923      923      923      923

      **. Correlation is significant at the .01 level (2-tailed).
      *. Correlation is significant at the .05 level (2-tailed).

  18. Summary of Inferential Tests

  19. Unpaired vs. Paired (each unpaired test with its paired counterpart) • Student’s t-test vs. paired t-test • Chi-square vs. McNemar’s test • One-way ANOVA vs. repeated-measures ANOVA • Mann-Whitney U test vs. Wilcoxon signed-rank test • Kruskal-Wallis H test vs. Friedman ANOVA

  20. Parametric vs. Nonparametric (each parametric test with its nonparametric counterpart) • Student’s t-test vs. Mann-Whitney U test • One-way ANOVA vs. Kruskal-Wallis test • Paired t-test vs. Wilcoxon signed-rank test • Pearson correlation vs. Spearman’s r • Correlated F ratio (repeated-measures ANOVA) vs. Friedman ANOVA

  21. A Good Rule to Follow • Always check your results with a nonparametric test. • If you test your null hypothesis with a Student’s t-test, also check it with a Mann-Whitney U test. • It will only take an extra 25 seconds.

  22. Linear Regression • Used to assess how one or more predictor variables can be used to predict a continuous outcome variable. • “Do age, number of comorbidities, or admission vital signs predict the length of stay in the hospital after a hip fracture?” • Multivariate, Association, Interval/Ordinal dependent variable.

  23. Coefficients (a)

      Model 1                               B (unstandardized)   Std. Error   Beta (standardized)   t        Sig.
      (Constant)                            -4.451               18.889                             -.236    .814
      AGE                                   7.136E-02            .045         .053                  1.571    .117
      NUMBER OF COMORBIDITIES (0-9)         2.606                .548         .159                  4.757    .000
      35-SYSTOLIC BLOOD PRESSURE FIRST ER   1.562E-02            .022         .024                  .726     .468
      35-GLASGOW COMA SCALE FIRST ER        1.067                1.170        .030                  .912     .362
      35-PULSE FIRST ER                     2.581E-02            .047         .019                  .554     .580
      35-RESPIRATION RATE FIRST ER          -8.00E-02            .188         -.014                 -.425    .671

      a. Dependent Variable: 49-DAYS IN HOSPITAL
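
The unstandardized coefficients (B) define the fitted equation: predicted days in hospital equals the constant plus each B multiplied by the patient's value for that variable. The sketch below applies the coefficients as read from this slide to a hypothetical patient; the dictionary keys and the patient's values are illustrative and do not come from the study.

```python
# Unstandardized coefficients (B) copied from the Coefficients output.
coefficients = {
    "constant": -4.451,
    "age": 7.136e-02,
    "comorbidities": 2.606,
    "systolic_bp": 1.562e-02,
    "glasgow_coma_scale": 1.067,
    "pulse": 2.581e-02,
    "respiration_rate": -8.00e-02,
}

def predicted_days(patient):
    """Predicted days in hospital: constant plus the sum of B * value."""
    total = coefficients["constant"]
    for name, value in patient.items():
        total += coefficients[name] * value
    return total

# Hypothetical patient, not a case from the study.
patient = {
    "age": 80,
    "comorbidities": 2,
    "systolic_bp": 140,
    "glasgow_coma_scale": 15,
    "pulse": 80,
    "respiration_rate": 18,
}
print(round(predicted_days(patient), 1))
```

Only the number of comorbidities has a significant coefficient in this output (Sig. .000); the other predictors all have p-values above 0.1, so the prediction illustrates the arithmetic rather than a reliable forecast.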

  24. Logistic Regression • Used to assess the predictive value of one or more variables on an outcome that is a yes/no question. • “Do age, gender, and comorbidities predict which hip fracture patients will develop pneumonia?” • Multivariate, Difference, Nominal dependent variable, not time-dependent, 2 groups.
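
A logistic model predicts the log-odds of the yes/no outcome as a linear combination of the predictors, and the logistic function turns that into a probability. The sketch below shows the conversion only. The coefficients and intercept are hypothetical placeholders chosen for illustration, NOT estimates from the hip-fracture study, and the function name `predicted_probability` is likewise an assumption.

```python
import math

def predicted_probability(intercept, coefs, values):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(coef * value))))."""
    log_odds = intercept + sum(coefs[k] * values[k] for k in coefs)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical coefficients on the log-odds scale, NOT estimates from the study.
coefs = {"age": 0.02, "male": 0.7, "comorbidities": 0.3}
intercept = -5.0

patient = {"age": 80, "male": 1, "comorbidities": 2}
p = predicted_probability(intercept, coefs, patient)
odds_ratio_male = math.exp(coefs["male"])  # odds multiplier for male vs. female
print(f"predicted probability = {p:.3f}, odds ratio (male) = {odds_ratio_male:.2f}")
```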

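  25. (Logistic regression results; only the list of variables is recoverable from the extracted text.) 1. Total number of comorbidities 2. Cirrhosis 3. COPD 4. Gender 5. Age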

  26. Draw Conclusions • We reject the null hypothesis. • Patients who are at high risk of developing pneumonia during their hospitalization for a hip fracture can be identified by: – total number of pre-existing conditions – cirrhosis – COPD – male gender

  27. Survival Analysis • Kaplan-Meier method – Used to plot cumulative survival • Log-rank test – Used to compare survival curves • Cox proportional-hazards model – Used to adjust for covariates in survival analysis
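
The Kaplan-Meier (product-limit) estimate behind a cumulative survival plot can be sketched in a few lines: at each distinct event time, the running survival is multiplied by the fraction of patients still at risk who did not have the event. The function name `kaplan_meier` and the follow-up data are hypothetical; censored patients (event flag 0) leave the risk set without causing a step.

```python
def kaplan_meier(times, events):
    """Product-limit estimate. events[i] is 1 for an observed event, 0 if censored.

    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    curve = []
    survival = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up times in days, not data from the slides.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```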

  28. Thanks for your attention

  29. Introduction to Statistics Descriptive Analysis

  30. Review of Descriptive Stats. • Descriptive statistics are used to present quantitative descriptions in a manageable form. • They work by reducing large amounts of data to a simpler summary.

  31. Univariate Analysis • This is the examination across cases of one variable at a time. • Frequency distributions are used to group data. • Ranges (category boundaries) can be set up to group cases into categories. • Examples include: – age categories – price categories – temperature categories.
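
Grouping cases into categories and counting them can be sketched as follows. The category boundaries, the function names, and the ages are all hypothetical choices made for this illustration; they are not taken from the slides.

```python
from collections import Counter

# Hypothetical category boundaries: (label, lower bound inclusive, upper bound exclusive).
BINS = [("Under 35", 0, 35), ("35-44", 35, 45), ("45-54", 45, 55),
        ("55-64", 55, 65), ("65+", 65, float("inf"))]

def categorize(age):
    """Return the label of the bin that contains this age."""
    for label, low, high in BINS:
        if low <= age < high:
            return label
    raise ValueError(f"age out of range: {age}")

def frequency_table(ages):
    """Return (label, count, percent) rows in bin order."""
    counts = Counter(categorize(a) for a in ages)
    return [(label, counts[label], 100 * counts[label] / len(ages))
            for label, _, _ in BINS]

# Hypothetical ages, not data from the slides.
ages = [28, 34, 41, 47, 52, 53, 58, 61, 66, 49]
for label, count, percent in frequency_table(ages):
    print(f"{label:<9}{count:>3}{percent:>7.1f}%")
```

Making each lower bound inclusive and each upper bound exclusive guarantees that every case falls into exactly one category.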

  32. Distributions. Two ways to describe a univariate distribution: • a table • a graph (histogram, bar chart)

  33. Distributions (cont.) Distribution of participants of the research methodology workshop by sex

      Sex     No.   %
      Men     12    60
      Women   8     40
      Total   20    100

      (Figure: bar chart of the same distribution; percent axis from 0% to 70%, bars for Men and Women.)

  34. Distributions (cont.) Workshop participants by specialty (Figure: charts of workshop participants by specialty; categories Fishery, Nursing, Environmental sciences, Microbiology, Others; percent axis from 0% to 40%. The values are not recoverable from the extracted text.)

  35. Distributions (cont.) A Frequency Distribution Table

      Category   Percent
      Under 35   9
      36-45      21
      46-55      45
      56-65      19
      66+        6

  36. Distributions (cont.) A Histogram (Figure: histogram of the frequency distribution; y-axis Percent, from 0 to 50; x-axis categories Under 35, 36-45, 46-55, 56-65, 66+.)

  37. Central Tendency • An estimate of the “center” of a distribution • Three different types of estimates: – Mean – Median – Mode
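
The three estimates can differ noticeably on skewed data. A minimal sketch using Python's standard `statistics` module, with hypothetical lengths of stay that include one long stay:

```python
import statistics

# Hypothetical lengths of stay in days; one long stay skews the data.
days = [2, 3, 3, 5, 7, 40]

mean = statistics.mean(days)      # arithmetic average, pulled up by the long stay
median = statistics.median(days)  # middle value of the sorted data
mode = statistics.mode(days)      # most frequent value
print(mean, median, mode)
```

Here the mean (10) is well above the median (4) and the mode (3), which is why the median is often preferred as the "center" of a skewed distribution.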
