STAT 401A - Statistical Methods for Research Workers: Case statistics

  1. STAT 401A - Statistical Methods for Research Workers: Case statistics. Jarad Niemi (Dr. J), Iowa State University. Last updated: November 17, 2014. Jarad Niemi (Iowa State), Case statistics, November 17, 2014, 1 / 9.

  2. Influential observations: Case statistics
     Definition: Leverage ($h_i$) is a measure of the distance between an observation’s explanatory variable values and the average of the explanatory variable values in the entire data set.
     Rule of thumb: possible concern when $h_i > 2p/n$, where $p$ is the number of regression coefficients and $n$ is the number of observations.
     Definition: Cook’s distance ($D$) is a measure of the overall effect on the estimated regression coefficients of removing an observation.
     Rule of thumb: concern when Cook’s $D \approx 1$ or larger.
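
The two case statistics defined on this slide can be computed directly in R. The following is a minimal sketch, not the course's own code: the data are simulated, and the names `d`, `fit`, `h`, and `D` are illustrative assumptions. `hatvalues()` and `cooks.distance()` are functions in base R's stats package.

```r
# Simulated data (an assumption for illustration; not the course data set)
set.seed(1)
n <- 30
d <- data.frame(x = c(rnorm(n - 1), 6))  # last point sits far from the mean of x
d$y <- 1 + 2 * d$x + rnorm(n)
d$y[n] <- d$y[n] - 10                    # ...and is pulled well off the line

fit <- lm(y ~ x, data = d)

p <- length(coef(fit))                   # number of regression coefficients
h <- hatvalues(fit)                      # leverage h_i
D <- cooks.distance(fit)                 # Cook's distance

which(h > 2 * p / n)                     # possible concern: leverage > 2p/n
which(D >= 1)                            # concern: Cook's D at or above about 1
```

The thresholds are rules of thumb, so flagged observations are candidates for a closer look rather than automatic deletions.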

  3. Influential observations: Leverage and influence
     Consider simple linear regression (the point of interest is the open circle). [Figure: four panels, arranged by influence (rows) and leverage (columns).]
     - Low leverage, low influence: leverage = 0.05, Cook's D = 0
     - High leverage, low influence: leverage = 0.42, Cook's D = 0.05
     - Low leverage, high influence: leverage = 0.05, Cook's D = 0.36
     - High leverage, high influence: leverage = 0.42, Cook's D = 4.11

  4. Influential observations: Residuals
     Residual (observed minus predicted):
     $$ r_i = \hat{e}_i = Y_i - \hat{\mu}_i $$
     (Internally) studentized residual:
     $$ \frac{r_i}{\widehat{SD}(r_i)} = \frac{r_i}{\hat{\sigma}\sqrt{1 - h_i}} $$
     Externally studentized residual:
     $$ \frac{r_i}{\hat{\sigma}_{(i)}\sqrt{1 - h_i}} $$
     where $\hat{\sigma}_{(i)}$ is the estimate of the standard deviation about the regression line from the fit that excludes observation $i$.
     About 95% of studentized residuals should fall between -2 and 2.
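
The residual formulas on this slide can be checked numerically against R's built-in `rstandard()` (internally studentized) and `rstudent()` (externally studentized). This is a minimal sketch that uses R's built-in `cars` data purely for illustration; the variable names are assumptions, and the leave-one-out loop is written for clarity rather than speed.

```r
# Built-in `cars` data, used only for illustration (not the course data set)
fit <- lm(dist ~ speed, data = cars)

r     <- residuals(fit)          # r_i = Y_i - mu-hat_i
h     <- hatvalues(fit)          # leverage h_i
sigma <- summary(fit)$sigma      # sigma-hat from the full fit

# Internally studentized: r_i / (sigma-hat * sqrt(1 - h_i))
internal <- r / (sigma * sqrt(1 - h))

# sigma-hat_(i): residual SD from the fit that excludes observation i
sigma_i <- sapply(seq_len(nrow(cars)), function(i) {
  summary(lm(dist ~ speed, data = cars[-i, ]))$sigma
})

# Externally studentized: r_i / (sigma-hat_(i) * sqrt(1 - h_i))
external <- r / (sigma_i * sqrt(1 - h))

all.equal(internal, rstandard(fit))   # compare with R's built-in
all.equal(external, rstudent(fit))    # compare with R's built-in
mean(abs(external) <= 2)              # proportion between -2 and 2
```

In practice `rstandard()` and `rstudent()` give the same quantities without refitting the model once per observation.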

  5. Influential observations: Residuals
     SAT residuals after adjusting for % taking and median class rank. [Figure: three panels (Residuals, Studentized residuals, Externally studentized residuals), each plotting value against case number.]

  6. Influential observations: Residuals

         /* Read the SAT data, log-transform takers, and drop Alaska */
         DATA case1201;
           INFILE 'case1201.csv' DSD FIRSTOBS=2;
           INPUT state $ sat takers income years public expend rank;
           ltakers = log(takers);
           IF state='Alaska' THEN DELETE;
         RUN;

         /* Regress SAT on log(takers) and rank */
         PROC GLM DATA=case1201;
           MODEL sat = ltakers rank;
         RUN;

  7. Influential observations: Residuals
     SAS diagnostics: [figure]

  8. Influential observations: Residuals

         mod = lm(SAT~log(Takers)+Rank, case1201)
         opar = par(mfrow=c(2,3)); plot(mod, 1:6, ask=FALSE); par(opar)

     [Figure: six diagnostic panels (Residuals vs Fitted, Normal Q-Q, Scale-Location, Cook's distance, Residuals vs Leverage, Cook's dist vs Leverage), with observations 16, 48, and 50 labeled.]

  9. Influential observations: Summary
     Summary of case statistics:
     - Leverage: identifies observations that might be influential.
     - Cook’s distance: identifies observations that, on their own, have a large overall influence. If an observation is influential, fit the model with and without it to determine its impact on the questions of interest.
     - Residuals: identify observations that are not fit accurately by the model.
     Check out this app (on campus or VPN): http://shiny1.stat.iastate.edu/_Statistics/14-outlier/
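
The "fit with and without" step in the summary can be sketched as follows. This is only an illustration on R's built-in `cars` data, not the course data set, and the names `fit_all`, `fit_drop`, and `i` are assumptions. Picking the observation with the largest Cook's distance is one simple way to choose a candidate; whether it actually counts as influential still depends on the rule of thumb.

```r
# Built-in `cars` data, used only for illustration (not the course data set)
fit_all <- lm(dist ~ speed, data = cars)

# Candidate: the observation with the largest Cook's distance
i <- which.max(cooks.distance(fit_all))

# Refit the same model without that observation
fit_drop <- lm(dist ~ speed, data = cars[-i, ])

# Compare the coefficient estimates side by side
cbind(with = coef(fit_all), without = coef(fit_drop))
```

If the two columns lead to the same answer for the questions of interest, the observation has little practical impact; if they differ, both fits are worth reporting.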
