Day 4: Linear regression

Day 4: Linear regression. Susanne Rosthøj, Section of Biostatistics


  1. University of Copenhagen, Department of Biostatistics. Day 4: Linear regression. Susanne Rosthøj, Section of Biostatistics, Department of Public Health, University of Copenhagen. sr@biostat.ku.dk

  2. Example: vitamin D as a function of BMI

  3. Linear regression model

     $n$ independent observations $Y_1, Y_2, \ldots, Y_n$. We assume

     $$Y_i = a + b \cdot X_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2) \text{ independent},$$

     where $\epsilon_i$ is the residual error and $X_i$ is a quantitative explanatory variable.

     The outcome $Y_i$ has a normal distribution with mean $a + b \cdot X_i$ and variance $\sigma^2$.

  4. Estimation

     Estimated regression line:

     $$\text{vitaminD} = 111.05 - 2.39 \cdot \text{BMI}$$

     SE of the effect of BMI: 0.69. 95% CI: (-3.78; -1.00).

     Test of the effect of BMI by a t-test:

     $$t = \frac{-2.39}{0.69} = -3.47, \qquad p = 0.001.$$

     Interpretation: For a 1 unit increase in BMI, vitamin D is lowered by 2.39 nmol/L (95% CI 1.00 to 3.78), p = 0.001.
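
The numbers on this slide can be checked against each other. The sketch below assumes the usual t-based interval for a regression slope; the quantile $t_{0.975,\,n-2}$ and the sample size $n$ are not stated on the slide, so the value 2.01 is only what the reported limits imply.

```latex
% Assumed form of the interval: estimate plus/minus t-quantile times SE
\[
\hat b \pm t_{0.975,\,n-2} \cdot \mathrm{SE}(\hat b)
\]
% Midpoint and half-width of the reported interval (-3.78; -1.00)
\[
\frac{-3.78 + (-1.00)}{2} = -2.39,
\qquad
\frac{-1.00 - (-3.78)}{2} = 1.39 \approx 2.01 \times 0.69
\]
% t statistic recomputed from the rounded estimate and SE
\[
t = \frac{-2.39}{0.69} \approx -3.46
\]
```

The recomputed value of about -3.46 differs from the reported -3.47 in the last digit, presumably because the software divides the unrounded estimate by the unrounded standard error.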

  5. Analysis in SAS

     We use proc glm (General Linear Model):

         data vitamin;
           infile 'http://publicifsv.sund.ku.dk/~sr/MPH/datasets/vitamin.txt' URL firstobs=2;
           input country vitd age bmi sunexp intake;
         run;

         proc glm data=vitamin plots=DiagnosticsPanel;
           model vitd = bmi / solution clparm;
           where country=4; * Ireland;
         run;

     Discuss the output in the handout and find the numbers on the previous slide.

  6. Does the model fit the data?

     Our conclusions based on the model are valid only if the model is valid.

     Assumptions:

     1) Independence between observations
     2) Linearity
     3) Normality (of the residual errors $\epsilon$)
     4) Homogeneity of variance (of the residual errors $\epsilon$)

     Normality and homogeneity are assessed through the residuals.

  7. Assessment of 2) Linearity

     Extend the model:

     $$Y_i = a + b \cdot \text{BMI}_i + c \cdot \text{BMI}_i^2 + \epsilon_i$$

     A test of linearity: $H_0: c = 0$.

         proc glm data=vitamin plots=DiagnosticsPanel;
           model vitd = bmi bmi*bmi / solution clparm;
           where country=4; * Ireland;
         run;

     Is the linear model plausible?
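
Written out, the test of linearity compares the extended model with the linear one. The notation $\hat c$, $\mathrm{SE}(\hat c)$ and the degrees of freedom $n - 3$ are not on the slide; they are the standard t-test for a single coefficient in a model with three mean parameters, stated here as an assumption.

```latex
% Under H0 the quadratic term drops out and the linear model is recovered
\[
H_0: c = 0 \;\Longrightarrow\; Y_i = a + b \cdot \mathrm{BMI}_i + \epsilon_i
\]
% Assumed test statistic: estimate of c divided by its standard error
\[
t = \frac{\hat c}{\mathrm{SE}(\hat c)} \sim t_{n-3} \quad \text{under } H_0
\]
```

In the proc glm output this should correspond to the row for bmi*bmi in the parameter estimates table requested with the solution option. A large p-value gives no evidence against linearity.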

  8. Predicted values and residuals

     The predicted or fitted values:

     $$\hat{Y}_i = 111.05 - 2.39 \cdot \text{BMI}_i$$

     Expected vitamin D level for a woman with BMI = 21.8:

     $$111.05 - 2.39 \times 21.8 = 58.9$$

     For each woman we determine the residual

     $$r_i = Y_i - (111.05 - 2.39 \cdot \text{BMI}_i)$$

     as the difference between the observed and the predicted value.

     Residual for the woman with BMI = 21.8 and Y = vitamin D = 89.1:

     $$89.1 - (111.05 - 2.39 \cdot 21.8) = 89.1 - 58.9 = 30.2$$
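
Rather than computing these by hand, the fitted values and residuals can be saved from proc glm. This is a minimal sketch, assuming the data set vitamin from the earlier data step already exists; the names res, pred and resid are illustrative and do not come from the slides.

```sas
/* Refit the model for Ireland and save fitted values and residuals. */
/* res, pred and resid are hypothetical names chosen for this sketch. */
proc glm data=vitamin;
  model vitd = bmi / solution clparm;
  where country=4;
  output out=res predicted=pred residual=resid;
run;
quit;

/* List BMI, observed vitamin D, fitted value and residual */
/* for the first 10 women in the output data set. */
proc print data=res(obs=10);
  var bmi vitd pred resid;
run;
```

For each row, resid should equal vitd minus pred, which is the calculation shown above for the woman with BMI = 21.8.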

  9. Residuals

     [Figure: scatter plot of vitamin D (vertical axis, 20 to 100) against BMI (horizontal axis, 20 to 35).]

  10. Assessment of 3) Normality

      QQ-plot of the residuals (plot 4 in the output), optionally supplemented by the histogram (plot 7).

  11. Assessment of 4) Homogeneity

      Plot the residuals as a function of the predicted values (plot 1): Constant variance? Trumpet shape?
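
The normality and homogeneity checks can also be drawn separately from the diagnostics panel. The sketch below is one possible way to do this, assuming the data set vitamin from the earlier data step exists; the names res, pred and resid are illustrative, and proc univariate and proc sgplot are swapped in here as alternatives to the panel used in the slides.

```sas
/* Save fitted values and residuals (res, pred, resid are hypothetical names). */
proc glm data=vitamin;
  model vitd = bmi;
  where country=4;
  output out=res predicted=pred residual=resid;
run;
quit;

/* 3) Normality: QQ-plot and histogram of the residuals, */
/* each with a normal reference using estimated mean and SD. */
proc univariate data=res noprint;
  var resid;
  qqplot resid / normal(mu=est sigma=est);
  histogram resid / normal;
run;

/* 4) Homogeneity: residuals against predicted values, */
/* with a horizontal reference line at zero. */
proc sgplot data=res;
  scatter x=pred y=resid;
  refline 0 / axis=y;
run;
```

Points close to the reference line in the QQ-plot support normality. A roughly even vertical spread around zero in the scatter plot supports constant variance, whereas a spread that widens with the predicted value is the trumpet shape.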
