correlation and regression

Correlation and Regression 9-1 Overview 9-2 Correlation 9-3 - PowerPoint PPT Presentation



  1. Chapter 9: Correlation and Regression. Sections: 9-1 Overview; 9-2 Correlation; 9-3 Regression; 9-4 Variation and Prediction Intervals; 9-5 Multiple Regression; 9-6 Modeling. Chapter 9, Triola, Elementary Statistics, MATH 1342

  2. Section 9-1 & 9-2: Overview and Correlation and Regression. Created by Erin Hodgess, Houston, Texas

  3. Overview: Paired Data (p.506). Is there a relationship? If so, what is the equation? Use that equation for prediction.

  4. Definition: A correlation exists between two variables when one of them is related to the other in some way.

  5. Definition: A scatterplot (or scatter diagram) is a graph in which the paired (x, y) sample data are plotted with a horizontal x-axis and a vertical y-axis. Each individual (x, y) pair is plotted as a single point.

  6. Scatter Diagram of Paired Data (p.507). [Figure: scatter diagram]

  7. Positive Linear Correlation (p.498). [Figure 9-2: Scatter Plots]

  8. Negative Linear Correlation. [Figure 9-2: Scatter Plots]

  9. No Linear Correlation. [Figure 9-2: Scatter Plots]

  10. Definition (p.509): The linear correlation coefficient r measures the strength of the linear relationship between paired x and y values in a sample.

  11. Assumptions (p.507): (1) The sample of paired data (x, y) is a random sample. (2) The pairs of (x, y) data have a bivariate normal distribution.

  12. Notation for the Linear Correlation Coefficient
     - n = number of pairs of data presented.
     - $\sum$ denotes the addition of the items indicated.
     - $\sum x$ denotes the sum of all x-values.
     - $\sum x^2$ indicates that each x-value should be squared and then those squares added.
     - $(\sum x)^2$ indicates that the x-values should be added and the total then squared.
     - $\sum xy$ indicates that each x-value should first be multiplied by its corresponding y-value. After obtaining all such products, find their sum.
     - r represents the linear correlation coefficient for a sample.
     - ρ represents the linear correlation coefficient for a population.

  13. Definition: The linear correlation coefficient r measures the strength of a linear relationship between the paired values in a sample. Formula 9-1:

     $$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$$

     Calculators can compute r. ρ (rho) is the linear correlation coefficient for all paired data in the population.

  14. Rounding the Linear Correlation Coefficient r: Round to three decimal places so that it can be compared to critical values in Table A-6 (see p.510). Use a calculator or computer if possible.

  15. Calculating r. Data (from exercise #7 on p.521):

     x: 1, 1, 3, 5
     y: 2, 8, 6, 4

  16. Calculating r

  17. Calculating r. Data: x = 1, 1, 3, 5 and y = 2, 8, 6, 4, so n = 4, $\sum x = 10$, $\sum y = 20$, $\sum xy = 48$, $\sum x^2 = 36$, $\sum y^2 = 120$.

     $$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$$

     $$r = \frac{4(48) - (10)(20)}{\sqrt{4(36) - (10)^2}\,\sqrt{4(120) - (20)^2}} = \frac{-8}{59.329} = -0.135$$
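The arithmetic on this slide can be checked with a short Python sketch. The function name `linear_correlation` is an illustrative choice, not something from the slides; the body simply follows the sums used in Formula 9-1.

```python
import math

def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r, computed from the sums in Formula 9-1."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # multiply each pair, then add
    sum_x2 = sum(x * x for x in xs)               # square each x, then add
    sum_y2 = sum(y * y for y in ys)               # square each y, then add
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Paired data from the slide (exercise #7 on p.521)
x = [1, 1, 3, 5]
y = [2, 8, 6, 4]

r = linear_correlation(x, y)
print(round(r, 3))  # -0.135
```

Rounding to three decimal places matches the rounding rule given for comparison against Table A-6.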

  18. Interpreting the Linear Correlation Coefficient (p.511): If the absolute value of r exceeds the value in Table A-6, conclude that there is a significant linear correlation. Otherwise, there is not sufficient evidence to support the conclusion of significant linear correlation.

  19. Example: Boats and Manatees. Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant linear correlation between the number of registered boats and the number of manatees killed by boats. Using the same procedure previously illustrated, we find that r = 0.922. Referring to Table A-6, we locate the row for which n = 10. Using the critical value for α = 0.05, we have 0.632. Because the absolute value of r = 0.922 exceeds 0.632, we conclude that there is a significant linear correlation between the number of registered boats and the number of manatee deaths from boats.

  20. Properties of the Linear Correlation Coefficient r (see also p.512): (1) $-1 \le r \le 1$. (2) The value of r does not change if all values of either variable are converted to a different scale. (3) The value of r is not affected by the choice of x and y: interchange x and y and the value of r will not change. (4) r measures the strength of a linear relationship.
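Properties 2 and 3 can be illustrated numerically. The sketch below reuses the small data set from the worked calculation of r; the helper `linear_correlation` and the conversion `2.54 * v + 10` are illustrative assumptions, and the rescaling is assumed to use a positive factor (a negative factor would flip the sign of r).

```python
import math

def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r from the sums in Formula 9-1."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    return (n * sum_xy - sum_x * sum_y) / (
        math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    )

x = [1, 1, 3, 5]
y = [2, 8, 6, 4]
r = linear_correlation(x, y)

# Property 2: convert x to a different scale (hypothetical conversion, positive factor)
x_rescaled = [2.54 * v + 10 for v in x]
r_rescaled = linear_correlation(x_rescaled, y)

# Property 3: interchange the roles of x and y
r_swapped = linear_correlation(y, x)

print(math.isclose(r, r_rescaled), math.isclose(r, r_swapped))
```

Both comparisons should agree up to floating-point rounding, which is why `math.isclose` is used rather than `==`.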

  21. Interpreting r: Explained Variation. The value of $r^2$ is the proportion of the variation in y that is explained by the linear relationship between x and y (p.503 and p.533).

  22. Example: Boats and Manatees. Using the boat/manatee data in Table 9-1, we have found that the value of the linear correlation coefficient is r = 0.922. What proportion of the variation of the manatee deaths can be explained by the variation in the number of boat registrations? With r = 0.922, we get $r^2 = 0.850$. We conclude that 0.850 (or about 85%) of the variation in manatee deaths can be explained by the linear relationship between the number of boat registrations and the number of manatee deaths from boats. This implies that 15% of the variation of manatee deaths cannot be explained by the number of boat registrations.

  23. Common Errors Involving Correlation (pp.503-504): (1) Causation: It is wrong to conclude that correlation implies causality. (2) Averages: Averages suppress individual variation and may inflate the correlation coefficient. (3) Linearity: There may be some relationship between x and y even when there is no significant linear correlation.

  24. Common Errors Involving Correlation. [Figure 9-3: Scatterplot of Distance above Ground and Time for Object Thrown Upward]
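The linearity error can be made concrete with a small sketch. The heights below are hypothetical values from the textbook-style model h(t) = 64t - 16t² (feet, seconds); they are not the data behind Figure 9-3. Height is completely determined by time, yet r comes out as zero because the relationship is not linear.

```python
import math

def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r from the sums in Formula 9-1."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    return (n * sum_xy - sum_x * sum_y) / (
        math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    )

# Hypothetical object thrown upward: times symmetric about the peak at t = 2
times = [0, 1, 2, 3, 4]
heights = [64 * t - 16 * t ** 2 for t in times]   # [0, 48, 64, 48, 0]

r = linear_correlation(times, heights)
print(r)  # 0.0
```

A scatterplot of these points would show a clear parabolic pattern, which is why a plot should accompany any computed value of r.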

  25. Formal Hypothesis Test (p.504): We wish to determine whether there is a significant linear correlation between two variables. We present two methods. Both methods let $H_0: \rho = 0$ (no significant linear correlation) and $H_1: \rho \neq 0$ (significant linear correlation).

  26. [Figure 9-4: Testing for a Linear Correlation (p.505)]

  27. Method 1: Test Statistic is t (follows format of earlier chapters). Test statistic:

     $$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

     Critical values: Use Table A-3 with degrees of freedom = n - 2.

  28. Method 2: Test Statistic is r (uses fewer calculations). Test statistic: r. Critical values: Refer to Table A-6 (no degrees of freedom).

  29. Example: Boats and Manatees. Using the boat/manatee data in Table 9-1, test the claim that there is a linear correlation between the number of registered boats and the number of manatee deaths from boats. Use Method 1.

     $$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{0.922}{\sqrt{\dfrac{1 - 0.922^2}{10 - 2}}} = 6.735$$
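As a check on the Method 1 arithmetic, here is a minimal Python sketch. The function name `t_statistic` is illustrative. The critical value 2.306 is the two-tailed α = 0.05 entry for 8 degrees of freedom in a standard t table; it is not quoted on the slides and is included here as an assumption for the comparison.

```python
import math

def t_statistic(r, n):
    """Test statistic t for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

r, n = 0.922, 10            # values from the boats and manatees example
t = t_statistic(r, n)
print(round(t, 3))  # 6.735

# Assumed critical value: two-tailed, alpha = 0.05, df = n - 2 = 8
t_critical = 2.306
reject_h0 = abs(t) > t_critical
print(reject_h0)  # True
```

Because the test statistic falls well beyond the critical value, the sketch reaches the same conclusion as the table lookup with r: there is a significant linear correlation.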

  30. Method 1: Test Statistic is t (follows format of earlier chapters). [Figure 9-5 (p.516)]

  31. Example: Boats and Manatees. Using the boat/manatee data in Table 9-1, test the claim that there is a linear correlation between the number of registered boats and the number of manatee deaths from boats. Use Method 2. The test statistic is r = 0.922. The critical values of r = ±0.632 are found in Table A-6 with n = 10 and α = 0.05.
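The two methods are consistent with each other: solving the Method 1 test statistic for r gives a critical r in terms of the critical t. The sketch below is an illustration under one assumption that is not on the slides, namely that the two-tailed α = 0.05 critical t for 8 degrees of freedom is 2.306 (a standard t-table value). The function name `critical_r` is hypothetical.

```python
import math

def critical_r(t_critical, n):
    """Critical value of r implied by a critical t with n - 2 degrees of freedom.

    Obtained by solving t = r / sqrt((1 - r^2) / (n - 2)) for r.
    """
    return t_critical / math.sqrt(t_critical ** 2 + (n - 2))

n = 10
r_crit = critical_r(2.306, n)        # 2.306 is an assumed t-table value (df = 8)
print(round(r_crit, 3))  # 0.632

r = 0.922                            # test statistic from the example
significant = abs(r) > r_crit
print(significant)  # True
```

The recovered value agrees with the ±0.632 critical values quoted from Table A-6, so either method leads to the same decision for these data.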

  32. Method 2: Test Statistic is r (uses fewer calculations). Test statistic: r. Critical values: Refer to Table A-6 (n = 10; the table is indexed by n, not by degrees of freedom). [Figure 9-6 (p.507)]
