Statistical Analysis for Medical and Public Health Data

  1. Statistical Analysis for Medical and Public Health Data, Qazvin University of Medical Sciences, 2017

  2. Workshop Schedule
     1- Types of variables
     2- Types of studies
     3- Types of data summaries
     4- Types of statistical inference
     5- Statistical graphs and data analysis with STATA

  3. 1. Types of variables
     • Qualitative variables: responses are not numbers
       – Nominal variable: divides people into groups; the groups cannot be ordered. Examples: gender, health status (ill, healthy)
       – Ordinal variable: divides people into groups that can be ranked (<, =, >). Examples: education, social class (I, II, III, IV)

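  4. 1. Types of variables
     • Quantitative variables: responses are numbers
       1. Interval variables: form groups and allow comparison; the zero point (origin) is a convention chosen by scientists, so differences are meaningful but ratios are not.
          Examples: temperature (0 °C = 32 °F = 273.15 K), poverty line (Toman, $, …)
          In Celsius: 20 °C – 10 °C = 10 °C and 20 °C / 10 °C = 2
          Conversion: F = 32 + 1.8 × C, so 20 °C = 32 + 1.8 × 20 = 68 °F and 10 °C = 32 + 1.8 × 10 = 50 °F
          In Fahrenheit: 68 °F – 50 °F = 18 °F, which is the same 10 °C difference (18 / 1.8 = 10), but 68 °F / 50 °F = 1.36, not 2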
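
The arithmetic on this slide can be checked with a short script. This is a minimal sketch in Python rather than the workshop's Stata; the function name `c_to_f` and the variable names are illustrative assumptions, not part of the original slides.

```python
def c_to_f(celsius):
    """Convert Celsius to Fahrenheit using F = 32 + 1.8 * C (formula from the slide)."""
    return 32 + 1.8 * celsius


c_high, c_low = 20.0, 10.0
f_high, f_low = c_to_f(c_high), c_to_f(c_low)

# Differences survive the change of scale once the unit size is accounted for.
diff_c = c_high - c_low        # difference in Celsius degrees
diff_f = f_high - f_low        # difference in Fahrenheit degrees
diff_f_in_c = diff_f / 1.8     # the same difference expressed in Celsius degrees

# Ratios do not survive, because the zero point differs between the scales.
ratio_c = c_high / c_low
ratio_f = f_high / f_low

print(f"difference: {diff_c:.1f} C vs {diff_f:.1f} F (= {diff_f_in_c:.1f} C)")
print(f"ratio: {ratio_c:.2f} in Celsius vs {ratio_f:.2f} in Fahrenheit")
```

The two differences agree after rescaling, while the two ratios disagree, which is the practical meaning of "difference is OK but ratio is not" for an interval variable.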

  5. 1. Types of variables
       2. Ratio variables: form groups and allow comparison; the zero point (origin) is a true zero, so both differences and ratios are meaningful.
          Examples: age, weight, height
          180 cm – 170 cm = 10 cm and 180 cm / 170 cm = 1.06
          180 kg – 170 kg = 10 kg and 180 kg / 170 kg = 1.06
     Statistical methods for interval and ratio variables are the same.

  6. 1. Types of variables
     • Dependent variable (Y), also called the outcome, response, or end point: a function of many factors
     • Independent variables (X1, X2, …, Xk), also called predictors, factors, explanatory variables, or treatments: possible causes of Y

  7. 2. Types of studies
     • Observational study
       Definition: an observational study
       1. draws inferences from a sample to a population
       2. has independent variables that are not under the control of the researcher, because of ethical concerns or logistical constraints
       3. cannot randomize the treatment

  8. Types of observational studies
     • Case-control study: a design originally developed in epidemiology, in which two existing groups that differ in outcome are identified and compared on the basis of some supposed causal attribute.
     • Cross-sectional study: involves data collection from a population, or a representative subset, at one specific point in time.
     • Longitudinal study: a correlational research study that involves repeated observations of the same variables over long periods of time.
     • Cohort study or panel study: a particular form of longitudinal study in which a group of patients is closely monitored over a span of time.
     • Ecological study: an observational study in which at least one variable is measured at the group level.

  9. Types of observational studies
     Disadvantage: they cannot be used as reliable sources for statements of fact about the "safety, efficacy, or effectiveness" of a practice
     Advantages:
     1- provide information on "real world" use and practice
     2- detect signals about the benefits and risks of practices in the general population
     3- help formulate hypotheses to be tested in subsequent experiments
     4- provide data needed to design more informative pragmatic clinical trials
     5- inform clinical practice

  10. Experimental study
      Definition: the investigator actively manipulates which groups receive the agent or exposure under study
      Randomized controlled trials (RCT)
      The steps in an RCT are:
      1. State the hypothesis
      2. Select the participants. This step includes sample size, inclusion and exclusion criteria, and informed consent
      3. Allocate participants randomly to either the treatment or the control group (randomization)
      4. Administer the intervention in a blinded fashion (single blind or double blind)
      5. Monitor the outcomes at a pre-determined time
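
Step 3 above, random allocation, can be sketched in a few lines. This is a minimal illustration of simple randomization with equal allocation, written in Python rather than Stata; the function name `randomize`, the seed value, and the participant IDs 1 to 20 are hypothetical. Real trials often use blocked or stratified schemes, which this sketch does not cover.

```python
import random


def randomize(participant_ids, seed=2017):
    """Shuffle the participants and split them into two arms of (near) equal size.

    A fixed seed (an arbitrary choice here) makes the allocation reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical example: 20 enrolled participants with IDs 1..20.
arms = randomize(range(1, 21))
print("treatment:", sorted(arms["treatment"]))
print("control:  ", sorted(arms["control"]))
```

Because assignment depends only on the shuffle, neither the investigator nor the participant characteristics influence who ends up in which arm.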

  11. 3- Types of data summaries • Tables • Graphs • Descriptive statistics

  12. 3- Types of data summaries
      One-way table: shows the distribution of one variable
      Table 1. Distribution of blood group (who, where, when)
      Blood group   Freq.   Percent
      A                25     18.52
      B                40     29.63
      AB               55     40.74
      O                15     11.11
      Total           135    100
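
The percent column of Table 1 follows directly from the frequencies. A minimal sketch in Python (the workshop itself uses Stata), using the counts from the slide; the variable names are illustrative.

```python
# Frequencies copied from Table 1 on the slide.
freq = {"A": 25, "B": 40, "AB": 55, "O": 15}

total = sum(freq.values())
# Percent of the total for each blood group, rounded to two decimals as on the slide.
percent = {group: round(100 * n / total, 2) for group, n in freq.items()}

print(f"{'Group':<6}{'Freq.':>6}{'Percent':>9}")
for group, n in freq.items():
    print(f"{group:<6}{n:>6}{percent[group]:>9.2f}")
print(f"{'Total':<6}{total:>6}{100:>9.2f}")
```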

  13. 3- Types of data summaries
      Two-way table: shows the distribution of one variable by a second one
      Table 2. Distribution of … by … (who, when, where)
      Blood group   Disease: Yes    Disease: No     Total
                    Freq.   %       Freq.   %       Freq.   %
      A                20              5              25
      B                20             20              40
      AB               40             15              55
      O                10              5              15
      Total            90             45             135
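
The slide leaves the % columns of Table 2 empty. One common way to fill them is with row percentages (the share of each blood group with and without the disease); that choice is an assumption here, since the slide does not say whether row or column percentages are intended. A minimal Python sketch using the counts from the slide:

```python
# (disease yes, disease no) counts per blood group, copied from Table 2.
counts = {"A": (20, 5), "B": (20, 20), "AB": (40, 15), "O": (10, 5)}

row_percent = {}
for group, (yes, no) in counts.items():
    row_total = yes + no
    # Row percentages: each row sums to 100 (assumed convention, see above).
    row_percent[group] = (
        round(100 * yes / row_total, 2),
        round(100 * no / row_total, 2),
    )

col_totals = (
    sum(yes for yes, _ in counts.values()),
    sum(no for _, no in counts.values()),
)

for group, (pct_yes, pct_no) in row_percent.items():
    print(f"{group:<3} yes {pct_yes:6.2f}%   no {pct_no:6.2f}%")
print("column totals:", col_totals)
```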

  14. 3- Types of data summaries
      • Three-way table
        Application: effect of an exposure on an outcome after controlling for a confounder
        Age group   Exposure   Disease +   Disease –
        25–30       Yes
                    No
        …           …          …
        >= 75       Yes
                    No

  15. Statistical graphs
      • For qualitative variables:
        1. Simple bar chart
        2. Clustered bar chart
        3. Pie chart
        4. Clustered pie chart

  16. Bar chart for race [Figure: bar chart; y-axis: count of id; bars: white 96, black 26, other 67]

  17. Distribution of low birth weight by race [Figure: clustered bar chart; y-axis: count of id; low birth weight 0 / 1 within each race: white 73 / 23, black 15 / 11, other 42 / 25]

  18. Distribution of race [Figure: pie chart; white 50.79%, black 13.76%, other 35.45%]

  19. Distribution of low birth weight by race [Figure: pie charts by race; low birth weight 0 / 1: white 76.04% / 23.96%, black 57.69% / 42.31%, other 62.69% / 37.31%]

  20. Statistical graphs
      • For quantitative variables (continuous or discrete):
        • Histogram
        • Box plot
        • Scatter plot
        • Line plot
        • ROC (receiver operating characteristic) curve

  21. Distribution of volume as a continuous variable [Figure: histogram; x-axis: Volume (thousands), 5,000 to 25,000; y-axis: Percent]

  22. Distribution of mileage as a discrete variable [Figure: histogram; x-axis: Mileage (mpg), 10 to 40; y-axis: Percent]

  23. Distribution of blood pressure (bp) by sex: effect of sex on bp [Figure: y-axis: Blood pressure, 120 to 180; groups: Male, Female]

  24. Distribution of blood pressure (bp) by age group and sex: effects of age group and sex on bp [Figure: y-axis: Blood pressure, 120 to 180; groups: Male and Female within age groups 30-45, 46-59, 60+]

  25. Scatter plot of life expectancy by population growth [Figure: scatter plot; variables: Life expectancy at birth (50 to 80) and Avg. annual % growth (0 to 4)]

  26. Line chart for life expectancy over years [Figure: line chart; x-axis: Year, 1900 to 1940; y-axis: life expectancy, 40 to 65]

  27. Line charts for life expectancy and inflation over years [Figure: line chart; x-axis: Year, 1900 to 1940; series: life expectancy, inflation]

  28. Receiver operating characteristic (ROC) curve
      • Used to examine whether a clinical marker or a new clinical test is suitable for diagnosing a disease
      • Used to find a cutoff point for a marker or a test, together with its sensitivity and specificity
      • The ROC analysis gives the area under the curve (AUC) and a p-value for examining how well the marker or test performs
      • An AUC above 0.5, and the closer to 1.0 the better, indicates an acceptable marker or test for diagnosis
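
One way to see what the AUC measures: it equals the probability that a randomly chosen diseased subject has a higher marker value than a randomly chosen non-diseased subject, with ties counted as one half. The sketch below computes it by comparing all pairs. It is written in Python rather than Stata, the function name `auc` is illustrative, and the marker values are made-up numbers, not data from the workshop.

```python
def auc(diseased, healthy):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (diseased, healthy) pairs in which the diseased
    subject has the higher marker value; ties contribute 0.5.
    """
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


# Hypothetical marker values, for illustration only.
diseased = [6, 7, 8, 5, 7]
healthy = [2, 3, 5, 1, 4]

print("AUC:", auc(diseased, healthy))
# Swapping the groups mimics a marker that points the wrong way (AUC below 0.5).
print("AUC with groups swapped:", auc(healthy, diseased))
```

A marker whose values are unrelated to disease status gives an AUC near 0.5, which is why 0.5 is the reference value on the slide.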

  29. An example of a bad marker
      [Figure: ROC curve; axes: Sensitivity and Specificity, each from 0.00 to 1.00; area under ROC curve = 0.3870]
      ROC (asymptotic normal)
      Obs    Area     Std. Err.   [95% Conf. Interval]
      189    0.3870   0.0452      0.29841   0.47564

  30. ROC curve for a good marker
      [Figure: ROC curve; axes: Sensitivity and Specificity, each from 0.00 to 1.00; area under ROC curve = 0.9964]
      ROC (asymptotic normal)
      Obs    Area     Std. Err.   [95% Conf. Interval]
      2000   0.9964   0.0013      0.99390   0.99893

  31. Choosing a cutoff point
      Detailed report of sensitivity and specificity
      Cutpoint   Sensitivity   Specificity   Correctly classified
      ( >= 1 )     100.00%         0.00%          50.00%
      ( >= 2 )      99.70%        94.20%          96.95%
      ( >= 3 )      99.50%        96.00%          97.75%
      ( >= 4 )      99.30%        97.60%          98.45%
      ( >= 5 )      98.80%        98.30%          98.55%
      ( >= 6 )      97.80%        98.50%          98.15%
      ( >= 7 )      97.30%        98.80%          98.05%
      ( >= 8 )      96.50%        99.70%          98.10%
      ( > 8 )        0.00%       100.00%          50.00%
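
The slide does not state a rule for picking the cutoff. One widely used rule, assumed here for illustration, is to maximize Youden's index J = sensitivity + specificity – 1. The Python sketch below applies it to the table above; the names `table` and `youden` are illustrative.

```python
# (cutpoint, sensitivity %, specificity %) copied from the report on the slide.
table = [
    (">= 1", 100.00, 0.00),
    (">= 2", 99.70, 94.20),
    (">= 3", 99.50, 96.00),
    (">= 4", 99.30, 97.60),
    (">= 5", 98.80, 98.30),
    (">= 6", 97.80, 98.50),
    (">= 7", 97.30, 98.80),
    (">= 8", 96.50, 99.70),
    ("> 8", 0.00, 100.00),
]


def youden(row):
    """Youden's index J = sensitivity + specificity - 1, on the 0-1 scale."""
    _, sensitivity, specificity = row
    return (sensitivity + specificity) / 100 - 1


best = max(table, key=youden)
print(f"cutpoint {best[0]}: sensitivity {best[1]:.2f}%, "
      f"specificity {best[2]:.2f}%, J = {youden(best):.3f}")
```

In this table the "Correctly classified" column equals the average of sensitivity and specificity, so the cutoff with the largest J is also the one with the highest share correctly classified.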

  32. Fundamentals of statistical testing and confidence intervals

  33. Fundamentals of statistical testing
      Research loop: [Diagram: population with unknown parameters; representative sample; statistics]

  34. Fundamentals of statistical testing
      Methods for statistical inference:
      1- Estimation
         1-1 Point estimation
         1-2 Confidence interval estimation
      2- Statistical testing (test of hypothesis)

  35. What is a point estimate?
      A point estimate is a statistical measure calculated from the data obtained in a sample. Examples: sample mean, sample proportion, etc.
      Population parameter               Point estimate
      Mean = µ                           Xbar
      Proportion = P                     X/n (X = number of successes, n = sample size)
      Standard deviation = σ             s
      Standard error (Std. Err.)         s/√n
      Coefficient of variation = σ/µ     s/Xbar
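
The point estimates in the table can be computed with Python's standard library. This is a minimal sketch, not the workshop's Stata workflow; the sample values are made-up blood pressure readings, and "success" is arbitrarily defined as a reading of 130 or more.

```python
import math
import statistics

# Hypothetical sample of systolic blood pressure readings (illustration only).
sample = [120, 130, 125, 140, 135]
n = len(sample)

xbar = statistics.mean(sample)    # estimates the population mean (mu)
s = statistics.stdev(sample)      # sample standard deviation (n - 1 in the denominator)
std_err = s / math.sqrt(n)        # standard error of the mean, s / sqrt(n)
cv = s / xbar                     # coefficient of variation, s / Xbar

# Sample proportion X / n, where X counts the "successes" (assumed: reading >= 130).
successes = sum(1 for value in sample if value >= 130)
p_hat = successes / n

print(f"Xbar = {xbar}, s = {s:.3f}, Std. Err. = {std_err:.3f}, CV = {cv:.4f}")
print(f"sample proportion = {p_hat}")
```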

  36. Major problem with point estimates
      • To what extent can we be confident in generalizing a point estimate to its parameter in the population?
      • There is no specific answer!
      • A point estimate may have a confidence of anywhere from 0% to 100%
      • The question is answered by building an interval with the desired confidence level, centered on the point estimate
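
An interval centered on the point estimate can be sketched with the normal approximation: estimate ± z × standard error. The Python example below applies this to the AUC and standard error reported on slide 29 (0.3870 and 0.0452). The function name `normal_ci` is an illustrative assumption, and the normal approximation is one common choice rather than the only way to build a confidence interval.

```python
from statistics import NormalDist


def normal_ci(estimate, std_err, confidence=0.95):
    """Confidence interval centered on the estimate, using the normal approximation."""
    # For 95% confidence this is the 97.5th percentile of the standard normal (about 1.96).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate - z * std_err, estimate + z * std_err


# Point estimate and standard error taken from the Stata output on slide 29.
lower, upper = normal_ci(0.3870, 0.0452)
print(f"95% CI: [{lower:.5f}, {upper:.5f}]")
```

The result is close to the interval on slide 29 (0.29841 to 0.47564); small differences in the last digits are expected because the slide shows the estimate and standard error rounded to four decimals.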
