correlation learning objectives



  1. Chapter 4.1 Scatter Diagrams and Linear Correlation

  2. Learning Objectives At the end of this lecture, the student should be able to: • Explain what a scattergram is and how to make one • State what “strength” and “direction” mean with respect to correlations • Compute correlation coefficient r using the computational formula • Describe why correlation is not necessarily causation

  3. Introduction • Making a scatter diagram • Correlation coefficient r • Causation and lurking variables Photograph provided by Dr. John Bollinger

  4. Scattergram Also called Scatter Plots

  5. Scattergrams Graph x,y Pairs • Explanatory (independent) variable is called x • Graphed on x-axis [Figure: empty plot, x axis running from 0 to 8]

  6. Scattergrams Graph x,y Pairs • Explanatory (independent) variable is called x • Graphed on x-axis • Response (dependent) variable is called y • Graphed on y-axis [Figure: empty plot, x axis and y axis each running from 0 to 8]

  7. Scattergrams Graph x,y Pairs • Explanatory (independent) variable is called x • Graphed on x-axis • Response (dependent) variable is called y • Graphed on y-axis • Trick to memorizing: x → y, x comes before y, so x “causes” y. • Scatter diagram is a graph of these x,y pairs [Figure: empty plot, x axis and y axis each running from 0 to 8]

  8. Scattergrams Graph x,y Pairs Does the number of diagnoses a patient has correlate with the number of medications s/he takes? Data as (x = # of dx, y = # of meds): (1, 3), (3, 5), (4, 4), (7, 6) [Figure: empty plot, x axis and y axis each running from 0 to 8]

  9. Scattergrams Graph x,y Pairs Does the number of diagnoses a patient has correlate with the number of medications s/he takes? Data as (x = # of dx, y = # of meds): (1, 3), (3, 5), (4, 4), (7, 6) [Figure: x axis Number of Diagnoses, y axis Number of Medications; point (1, 3) plotted]

  10. Scattergrams Graph x,y Pairs Does the number of diagnoses a patient has correlate with the number of medications s/he takes? Data as (x = # of dx, y = # of meds): (1, 3), (3, 5), (4, 4), (7, 6) [Figure: x axis Number of Diagnoses, y axis Number of Medications; point (3, 5) plotted]

  11. Scattergrams Graph x,y Pairs Does the number of diagnoses a patient has correlate with the number of medications s/he takes? Data as (x = # of dx, y = # of meds): (1, 3), (3, 5), (4, 4), (7, 6) [Figure: x axis Number of Diagnoses, y axis Number of Medications; point (4, 4) plotted]

  12. Scattergrams Graph x,y Pairs Does the number of diagnoses a patient has correlate with the number of medications s/he takes? Data as (x = # of dx, y = # of meds): (1, 3), (3, 5), (4, 4), (7, 6) [Figure: x axis Number of Diagnoses, y axis Number of Medications; point (7, 6) plotted]
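The point-by-point plotting on the slides above can be sketched in a few lines of Python. This is only an illustration, not the lecture's method: the function name `ascii_scatter` and the text-grid rendering are assumptions made here, chosen so the sketch needs no plotting library. The four (diagnoses, medications) pairs are the ones from the slides.

```python
def ascii_scatter(pairs, size=8):
    """Return a text scatter diagram as a list of rows, top row first.

    Hypothetical helper for illustration: each '*' marks an (x, y) pair,
    and both axes run from 0 to `size`, like the axes on the slides.
    """
    points = set(pairs)
    rows = []
    for y in range(size, -1, -1):  # y axis: top row is 8, bottom row is 0
        cells = ["*" if (x, y) in points else "." for x in range(size + 1)]
        rows.append(f"{y} " + " ".join(cells))
    # Final row holds the x-axis tick labels.
    rows.append("  " + " ".join(str(x) for x in range(size + 1)))
    return rows


# x = number of diagnoses, y = number of medications (from the slides)
pairs = [(1, 3), (3, 5), (4, 4), (7, 6)]
plot = ascii_scatter(pairs)
print("\n".join(plot))
```

Reading the printed grid left to right, the stars drift upward, which is the pattern the lecture goes on to call a positive correlation.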

  13. Linear Correlation • Linear correlation means that when you make a scatterplot of x,y pairs, it looks kind of like a line • “Perfect” linear correlation looks like graphing points in algebra Data as (x, y): (1, 2), (2, 4), (3, 6), (4, 8) [Figure: the four points plotted on axes running from 0 to 8]

  14. Facts About Linear Correlation • The line can go up. This is a positive correlation. [Figure: x axis Number of Diagnoses, y axis Number of Medications]

  15. Facts About Linear Correlation • The line can go up. This is a positive correlation. • The line can go down. This is a negative correlation. [Figure: x axis Number of Patient Complaints, y axis Number of Nurses Staffed on Shift]

  16. Facts About Linear Correlation • The line can go up. This is a positive correlation. • The line can go down. This is a negative correlation. • The line can be straight (flat). This is no correlation. [Figure: x axis Total Unique Visitors, y axis Days Spent in Hospital]

  17. Facts About Linear Correlation • The line can go up. This is a positive correlation. • The line can go down. This is a negative correlation. • The line can be straight (flat). This is no correlation. • The line can be goofy. This is also no correlation. [Figure: x axis Number of Games, y axis Number of Books]

  18. Correlation Has Two Attributes Direction: • Positive correlation • Negative correlation • No correlation Strength: • Strength refers to how close to the line all the dots fall. • If they fall really close to the line, it is strong • If they fall kind of close to the line, it is moderate • If they aren’t very close to the line, it is weak

  19. Correlation Has Two Attributes Strength: • Strength refers to how close to the line all the dots fall. • If they fall really close to the line, it is strong • If they fall kind of close to the line, it is moderate • If they aren’t very close to the line, it is weak [Figure: scatterplot labeled Strong Negative]

  20. Correlation Has Two Attributes Strength: • Strength refers to how close to the line all the dots fall. • If they fall really close to the line, it is strong • If they fall kind of close to the line, it is moderate • If they aren’t very close to the line, it is weak [Figure: scatterplot labeled Strong Positive]

  21. Correlation Has Two Attributes Strength: • Strength refers to how close to the line all the dots fall. • If they fall really close to the line, it is strong • If they fall kind of close to the line, it is moderate • If they aren’t very close to the line, it is weak [Figure: scatterplot labeled Moderate Positive]

  22. Correlation Has Two Attributes Strength: • Strength refers to how close to the line all the dots fall. • If they fall really close to the line, it is strong • If they fall kind of close to the line, it is moderate • If they aren’t very close to the line, it is weak [Figure: scatterplot labeled Weak Positive, with one point called out: “Hey, what’s that?? Outlier!”]

  23. Outliers in Correlation • Outliers can have a very powerful effect on a correlation • An outlier in any of the 4 corners of the plot can really affect the direction of the line • An outlier can also change the correlation from strong or moderate to weak • It’s good to look at a scatterplot to make sure you identify outliers
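A small sketch can show how much one corner outlier matters. The data here is made up for illustration and does not come from the lecture; the helper name `pearson_r` is also an assumption. It uses the computational formula for r that the lecture introduces a few slides later.

```python
from math import sqrt


def pearson_r(pairs):
    """Sample correlation coefficient r via the computational formula."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxy = sum(x * y for x, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    syy = sum(y * y for _, y in pairs)
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))


# Made-up illustration data (NOT from the lecture): six points on a perfect line.
clean = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)]
# The same points plus one hypothetical outlier in the bottom-right corner.
with_outlier = clean + [(8, 0)]

r_clean = pearson_r(clean)
r_outlier = pearson_r(with_outlier)
print(f"without outlier: r = {r_clean:.2f}")
print(f"with outlier:    r = {r_outlier:.2f}")
```

With these made-up points, a single outlier takes r from 1.00 down to about 0.13, which is why looking at the scatterplot first is worthwhile.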

  24. Correlation Coefficient r Putting a Number on Correlation

  25. Correlation Coefficient r • Remember “coefficient” from CV (coefficient of variation)? • Coefficient just means a number • r stands for the sample correlation coefficient • Remember! Corrrrrrrrrrrrrrrrrrelation • Population correlation coefficient = ρ (the Greek letter rho) • We will only focus on r

  26. What is r? What it is: • A numerical quantification of how correlated a set of x,y pairs are • Calculated from plugging x,y pairs into an equation • Has a defining formula and a computational formula • I will demonstrate computational formula How to interpret it: • The r calculation produces a number • The lowest number possible is -1.0 • Perfect negative correlation • The highest possible number is 1.0 • Perfect positive correlation • All others are in-between

  27. Examples of Negative r [Figure: three scatterplots labeled r = -0.25, r = -0.70, r = -0.44] OPINION!!! For negative correlations: • 0.0 to -0.40: Weak • -0.40 to -0.70: Moderate • -0.70 to -1.0: Strong

  28. Examples of Positive r [Figure: two scatterplots labeled r = 0.66, r = 0.92] OPINION!!! For positive correlations: • 0.0 to 0.40: Weak • 0.40 to 0.70: Moderate • 0.70 to 1.0: Strong
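The rule of thumb on the two slides above can be written as a small lookup. This is a sketch under stated assumptions, not an official scale: the slides themselves label the cutoffs as opinion, and because the ranges share their endpoints (0.40 and 0.70), this sketch assumes a boundary value goes to the stronger category. The function name `describe_r` is hypothetical.

```python
def describe_r(r):
    """Return (direction, strength) for a correlation coefficient r.

    Cutoffs follow the lecture's stated opinion. Assumption made here:
    a value exactly on a boundary (0.40 or 0.70) gets the stronger label.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1.0 and 1.0")

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"

    size = abs(r)  # strength depends only on distance from 0
    if size >= 0.70:
        strength = "strong"
    elif size >= 0.40:
        strength = "moderate"
    else:
        strength = "weak"
    return direction, strength


# The example values shown on the slides
for r in (-0.25, -0.44, -0.70, 0.66, 0.92):
    print(r, describe_r(r))
```

Direction comes from the sign of r and strength from its absolute value, so the same cutoffs serve both the negative and the positive slide.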

  29. Calculating r Computational Formula

  30. Computational Formula • FLASHBACK! …to Chapter 3.2 • Notice all the Σ’s • As before, we will • make columns • make calculations • Then add up the columns to get these Σ’s
      $$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
      Hypothetical Scenario • We have 7 patients • They have come to the clinic for appointments throughout the year. • We predict those with a higher diastolic blood pressure (DBP) will have more appointments • We take DBP at last appointment as “x” • We take number of appointments over the year as “y”

  31. x=DBP, y=# of Appointments
      $$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
      #     x      y      x²     y²     xy
      1     70     3
      2     115    45
      3     105    21
      4     82     7
      5     93     16
      6     125    62
      7     88     12
      Σx = 678, Σy = 166 (the x², y², and xy columns are not yet filled in)

  32. x=DBP, y=# of Appointments [Same formula and table as slide 31]

  33. x=DBP, y=# of Appointments [Same formula and table as slide 31, with two callouts: “NOT!” and “Σxy will go here”]
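The column-and-sum procedure from the table can be sketched in Python as follows. The seven (DBP, appointments) pairs and the formula are the lecture's; the variable names are assumptions, and the computed sums and r are worked out here rather than taken from the slides, which stop before the columns are filled in.

```python
from math import sqrt

# (x = DBP, y = number of appointments) for the 7 patients in the table
patients = [(70, 3), (115, 45), (105, 21), (82, 7), (93, 16), (125, 62), (88, 12)]

n = len(patients)

# Build each column, then add it up to get the Σ's in the formula.
sum_x = sum(x for x, _ in patients)        # Σx
sum_y = sum(y for _, y in patients)        # Σy
sum_x2 = sum(x ** 2 for x, _ in patients)  # Σx²
sum_y2 = sum(y ** 2 for _, y in patients)  # Σy²
sum_xy = sum(x * y for x, y in patients)   # Σxy

# Computational formula for the sample correlation coefficient r
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator

print(f"Σx={sum_x}, Σy={sum_y}, Σx²={sum_x2}, Σy²={sum_y2}, Σxy={sum_xy}")
print(f"r = {r:.3f}")
```

The first two sums match the table (Σx = 678, Σy = 166), and r comes out near 0.95, which the lecture's rule of thumb would call a strong positive correlation.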
