
Statistics in medicine. Lecture 1, part 1: Describing variation, and graphical presentation

11/4/2016. Statistics in medicine, Lecture 1, part 1: Describing variation, and graphical presentation. Fatma Shebl, MD, MS, MPH, PhD, Assistant Professor, Chronic Disease Epidemiology. Outline: Sources of variation; Types of variables.


  1. Slides 0 and 1

     Statistics in medicine
     Lecture 1, part 1: Describing variation, and graphical presentation
     Fatma Shebl, MD, MS, MPH, PhD
     Assistant Professor, Chronic Disease Epidemiology Department
     Yale School of Public Health
     Fatma.shebl@yale.edu

     Outline
     • Sources of variation
     • Types of variables

     Slides 2 and 3

     • Almost every characteristic that is measured on a patient varies
     • THAT IS WHY IT IS CALLED A VARIABLE
     • EXAMPLES
       • Blood glucose level
       • Blood pressure
       • Diet
       • Electrolytes
       • etc.

     Readings and resources
     • Chapter 9, p105-118: Jekel's epidemiology, biostatistics, preventive medicine, and public health by David L. Katz et al (4th edition).

  2. Slides 4 and 5

     There are different sources of variation
     Let us consider blood pressure as an example
     • Biologic differences
       – Age, race, and diet affect blood pressure
         • Older patients, patients of African descent, and those who consume a high-salt diet tend to have high blood pressure
     • Measurement conditions
       – Time of the day, anxiety, fatigue, etc.
         • High blood pressure is observed following exercise, and with anxiety

     There are different sources of variation
     Let us consider blood pressure as an example
     • Measurement error
       – Systematic error
         • Distorts the data in one direction, leading to bias that obscures the truth
         • Ex. A defective BP cuff that tends to give high readings
       – Random error
         • Slight, inevitable inaccuracies
         • Not systematic, because it makes some readings too high and some too low
     Statistics can adjust for random error, but cannot fix systematic error

     Slides 6 and 7

     To understand variation, you have to describe it
     • Descriptive statistics definition:
       – Statistics, such as the mean, the standard deviation, the proportion, and the rate, used to describe attributes of a set of data

     A variable can be quantitative or qualitative
     • Qualitative
       • Skin color
       • Jaundice
       • Heart murmurs
     • Quantitative
       – Blood pressure
       – Electrolyte levels
     Image source: http://clinicalgate.com/wp-content/uploads/2015/06/B9781437729306000483_f48-02-9781437729306.jpg
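The statement on the measurement-error slide, that statistics can adjust for random error but cannot fix systematic error, can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the true pressure, the size of the cuff bias, and the spread of the random error are all hypothetical values chosen for illustration, not figures from the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_SBP = 120.0  # hypothetical true systolic pressure (mmHg)
CUFF_BIAS = 8.0   # hypothetical defective cuff that reads 8 mmHg too high
NOISE_SD = 5.0    # hypothetical spread of the random error (mmHg)


def mean_reading(n, bias):
    """Average of n simulated readings: truth + systematic bias + random noise."""
    readings = [TRUE_SBP + bias + random.gauss(0.0, NOISE_SD) for _ in range(n)]
    return statistics.mean(readings)


# Averaging many readings cancels the random error in both cases,
# but the systematic error of the defective cuff stays in the mean.
good_cuff_mean = mean_reading(10_000, bias=0.0)
bad_cuff_mean = mean_reading(10_000, bias=CUFF_BIAS)

print(f"good cuff mean:      {good_cuff_mean:.1f} mmHg")
print(f"defective cuff mean: {bad_cuff_mean:.1f} mmHg")
```

With many readings the good cuff's mean settles close to the true value, while the defective cuff's mean settles close to the true value plus the bias, however many readings are averaged.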

  3. Slides 8 and 9

     There are different types of variables
     – Nominal
     – Dichotomous (binary)
     – Ordinal (ranked)
     – Continuous (interval)
     – Continuous (ratio)
     – Risks and proportions
     – Counts and units of observation
     – Combining data

     Nominal variables (qualitative)
     • Nominal variables are “naming” variables
     • Definition:
       – The simplest scale of measurement. Used for characteristics that have no numerical values, no measurement scales and no rank order. It is also called a categorical or qualitative scale.
     • Ex. Skin color
       – A different number can be assigned to each color
         • E.g. 1: purple, 2: black, 3: white, 4: blue, 5: tan
       – It makes no difference to the statistical analysis which number is assigned to which color, because the number is merely a numerical name for a color
     • Percentages and proportions are commonly used to summarize the data

     Slides 10 and 11

     Dichotomous variables (qualitative)
     • Dichotomous: from the Greek for “cut into two”
     • Ex.: Normal/abnormal skin color, living/dead
     • Sometimes it is not enough to describe the data as two categories (living/dead); it is also important to know how long the patient survived, which calls for survival analysis

     Ordinal “ranked” variables
     • Definition:
       – Used for characteristics that have an underlying order to their values, with a clearly implied direction from better to worse.
     • Are categorical (qualitative) scales
     • Three or more levels
     • Although there is an order among the categories, the difference between two adjacent categories is not the same throughout the scale
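The point on the nominal-variables slide, that it makes no difference which number is assigned to which color, can be checked directly. The sketch below assumes a small made-up list of observations and a second, arbitrary coding (`coding_b`) invented for comparison; only the first coding follows the slide's example. The proportions per color come out the same under both codings.

```python
from collections import Counter

# Hypothetical observations of a nominal variable (made-up data).
colors = ["tan", "white", "tan", "black", "tan", "white", "blue", "tan"]

# Coding from the slide, and a second arbitrary coding invented for comparison.
coding_a = {"purple": 1, "black": 2, "white": 3, "blue": 4, "tan": 5}
coding_b = {"purple": 5, "black": 4, "white": 1, "blue": 3, "tan": 2}


def proportions(values):
    """Map each distinct value to its share of all observations."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def proportions_by_label(labels, coding):
    """Encode labels as numbers, summarize the codes, then decode back to labels."""
    decode = {code: label for label, code in coding.items()}
    coded = [coding[label] for label in labels]
    return {decode[code]: share for code, share in proportions(coded).items()}


summary_a = proportions_by_label(colors, coding_a)
summary_b = proportions_by_label(colors, coding_b)

# The numbers are only names, so the summary does not depend on the coding.
print(summary_a == summary_b)
```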

  4. Slides 12 and 13

     Ordinal “ranked” variables
     • Ex. Pitting edema grading scale: “0 - no edema” to “4+ - severe edema”
       Image source: http://biology-forums.com/gallery/2137_18_05_12_2_25_00.jpeg
     • Ex. Pain scale: “0 - no pain” to “10 - worst imaginable pain”
       Image source: https://openclipart.org/detail/218053/pain-scale
     • Percentages and proportions are commonly used to summarize the data
     • Medians are sometimes used to describe the whole data

     Numerical scales (quantitative)
     • Definition:
       – The highest level of measurement. It is used for characteristics that can be given numerical values; the differences between numbers have meaning, e.g. BMI, height.
     • Types
       • Interval
       • Ratio
       • Discrete
     • Measures of central tendency are usually used to summarize: means, medians

     Slides 14 and 15

     Numerical scales (continuous)
     • Has a value on a continuum
     • Interval: arbitrary zero point
       • Ex. Centigrade temperature scale
     • Ratio: absolute zero point
       • Ex. Kelvin temperature scale
     Image source: https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&ved=0ahUKEwiuo6nf8sjOAhUEkh4KHXTZAnUQjRwIBw&url=http%3A%2F%2Fwww.livescience.com%2F39994-kelvin.html&psig=AFQjCNFGVvg1wdLx78W2V44wDlZQDQB17A&ust=1471538633651130

     Numerical scales (Discrete)
     • Has values equal to integers
     • Units of observation: person, animal, thing, etc.
     • Presented in frequency tables
     • One characteristic on the x-axis, one characteristic on the y-axis, and counts in the cells

     Frequency table of gender by whether serum total cholesterol was checked or not

                 Cholesterol level
     Gender      Checked     Not checked   Total
     Female      17 (63%)    10 (37%)      27 (100%)
     Male        25 (57%)    19 (43%)      44 (100%)
     Total       42 (59%)    29 (41%)      71 (100%)

     Source: Jekel's epidemiology, biostatistics, preventive medicine, and public health by David L. Katz et al (4th edition).
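The percentages in the cholesterol frequency table above are row percentages: each cell count divided by its row total. A short sketch of that arithmetic, using the counts from the slide, is given here; the function and variable names are illustrative choices, not anything from the lecture.

```python
# Cell counts taken from the slide's table (gender by cholesterol checked or not).
counts = {
    "Female": {"Checked": 17, "Not checked": 10},
    "Male": {"Checked": 25, "Not checked": 19},
}


def row_percentages(row):
    """Percentage of the row total in each cell, rounded to a whole percent."""
    total = sum(row.values())
    return {column: round(100 * count / total) for column, count in row.items()}


# The "Total" row is the column-wise sum of the other rows.
totals = {
    column: sum(row[column] for row in counts.values())
    for column in ("Checked", "Not checked")
}

table = dict(counts)
table["Total"] = totals
percentages = {name: row_percentages(row) for name, row in table.items()}

for name, row in table.items():
    cells = ", ".join(f"{col}: {n} ({percentages[name][col]}%)" for col, n in row.items())
    print(f"{name:<7} {cells}  (n = {sum(row.values())})")
```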

  5. Slides 16 and 17

     Risks and proportions
     • Risk is the conditional probability of an event (e.g. death) in a defined population in a defined period.
     • Risks and proportions share some characteristics of discrete variables and some characteristics of continuous variables
       • Ex. A discrete event (e.g., death) occurred in a fraction of the population
     • Calculated as the ratio of counts in the numerator to counts in the denominator

     Combining data
     • A continuous variable can be converted to an ordinal variable
     • When data are converted to categories, individual information is lost
     • The fewer the categories, the greater the amount of information lost
     [Figure: Histogram of neonatal mortality rate per 1000 live births, by birth weight group, United States 1980. Horizontal axis: birth weight (g). Source: Buehler W et al. Public Health Rep 102:151-161, 1987]

     Slides 18 and 19

     Statistics in medicine
     Lecture 1, part 2: Describing variation, and graphical presentation
     Fatma Shebl, MD, MS, MPH, PhD
     Assistant Professor, Chronic Disease Epidemiology Department
     Yale School of Public Health
     Fatma.shebl@yale.edu

     Outline
     • Frequency distributions
       – Frequency distribution of continuous data
       – Frequency distribution of binary data
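Two ideas from the risks and combining-data slides above lend themselves to a short sketch: a risk is a ratio of counts, and grouping a continuous variable into categories discards individual detail. Everything numeric below is hypothetical (the event count, the population, the birth weights, and the cut points are made up for illustration and are not the data behind the slide's histogram).

```python
from bisect import bisect_right
from collections import Counter


def rate_per(events, population, per=1000):
    """Risk expressed per `per` people: events (numerator) over population at risk (denominator)."""
    if population <= 0:
        raise ValueError("population at risk must be positive")
    return events * per / population


def to_groups(values, cut_points, labels):
    """Assign each continuous value to an ordered category defined by sorted cut points."""
    return [labels[bisect_right(cut_points, value)] for value in values]


# Hypothetical figures: 12 deaths among 1000 live births.
neonatal_rate = rate_per(events=12, population=1000)

# Hypothetical birth weights in grams.
birth_weights_g = [900, 1400, 2100, 2450, 2600, 3200, 3300, 3900, 4100]

# Four categories keep more detail than two.
fine = Counter(to_groups(birth_weights_g, [1500, 2500, 4000],
                         ["<1500", "1500-2499", "2500-3999", ">=4000"]))
coarse = Counter(to_groups(birth_weights_g, [2500], ["<2500", ">=2500"]))

print(f"rate per 1000: {neonatal_rate}")
print(dict(fine))
print(dict(coarse))
```

In the coarse grouping a 900 g and a 2450 g infant become indistinguishable, which is the information loss the slide describes.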

  6. Slides 20 and 21

     Readings and resources
     • Chapter 9, p105-118: Jekel's epidemiology, biostatistics, preventive medicine, and public health by David L. Katz et al (4th edition).

     Frequency distribution is

     Slides 22 and 23

     Frequency distribution is
     • A TABLE of data displaying the VALUE of each data point (or range of data points) in one column and the FREQUENCY with which that value occurs in the other column
     • A PLOT of data displaying the VALUE of each data point (or range of data points) on one axis and the FREQUENCY with which that value occurs on the other axis

     Frequency tables
     • Definition
       – A table showing the number and/or the percentage of observations occurring at different values (or ranges of values) of a variable.
     • Steps of creating a frequency table
       – Decide on the number of non-overlapping intervals
         • It is better to have equal-width intervals
         • Usually 6 to 14 intervals are adequate to demonstrate the shape of the distribution
         • Creating intervals means a continuous variable is converted to an ordinal variable
           – Information at the individual level is lost
       – Count the number of observations in each interval
         • Percentages can be calculated as well
           – Percentage = the number of observations in the interval divided by the total number of observations, multiplied by 100
     • Presented graphically by a histogram
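The steps for creating a frequency table listed above can be sketched in a few lines of Python. The data values, the starting point, and the interval width are hypothetical choices made for illustration; seven equal-width intervals are used so that the count falls within the 6 to 14 range the slide suggests.

```python
def frequency_table(values, start, width, n_intervals):
    """Count observations in equal-width, non-overlapping intervals [low, high)."""
    counts = [0] * n_intervals
    for value in values:
        index = (value - start) // width
        if not 0 <= index < n_intervals:
            raise ValueError(f"{value} falls outside the chosen intervals")
        counts[index] += 1
    total = len(values)
    # Percentage = observations in the interval / total observations * 100
    return [
        (start + i * width, start + (i + 1) * width, count, 100 * count / total)
        for i, count in enumerate(counts)
    ]


# Hypothetical serum cholesterol values (mg/dL), made up for this sketch.
cholesterol = [150, 162, 171, 175, 183, 188, 190, 195, 198, 204,
               207, 211, 215, 222, 228, 236, 241, 255, 263, 279]

table = frequency_table(cholesterol, start=150, width=20, n_intervals=7)

for low, high, count, percent in table:
    # A row of '#' characters gives a rough text histogram of the counts.
    print(f"{low}-{high - 1}: {count:2d} ({percent:4.1f}%) {'#' * count}")
```

Once the table is built, only the interval each observation fell into is kept, which is the loss of individual-level information the slide mentions.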
