
Nonparametric Methods Marc H. Mehlman marcmehlman@yahoo.com - PowerPoint PPT Presentation



  1. Nonparametric Methods
     Marc H. Mehlman, marcmehlman@yahoo.com, University of New Haven
     Nonparametric methods, or distribution–free methods, are for testing hypotheses about a population without knowing anything about the population's distribution.
     Marc Mehlman (University of New Haven) Nonparametric Methods 1 / 44

  2. Table of Contents
     1 Sign Test for Median of Ordinal Data
     2 Wilcoxon Signed–Ranks Test for Matched Pairs
     3 Wilcoxon Rank–Sum Test for Two Independent Samples
     4 Kruskal–Wallis Test for Multiple Independent Samples
     5 Spearman Rank Correlation Test for Bivariate Data
     6 Chapter #12 R Assignment

  3. Advantages and Disadvantages of Nonparametric Methods
     Advantages of Nonparametric Methods
       Fewer assumptions, so nonparametric methods can be used in more situations.
       Computations are often simpler and easier to understand.
       Less sensitive to outliers that are actually incorrect observations.
     Disadvantages of Nonparametric Methods
       Information is wasted when numerical data is converted into rank data, so conclusions tend to be weaker.
       The effect of correct outliers is underestimated.

  4. Definition
     Let $T_1$ and $T_2$ be two test statistics of $H_0$ versus $H_1$. Let $n_1$ and $n_2$ be the minimum sample sizes needed to achieve a test of power $\beta$ and size $\alpha$ using $T_1$ or $T_2$ respectively. The Pitman asymptotic relative efficiency, $\mathrm{ARE}(T_1, T_2)$, of test statistic $T_1$ to test statistic $T_2$ is the limit, if it exists, of the ratios $n_2 / n_1$ as $n_1 \uparrow \infty$.
     The ARE of a nonparametric test to a parametric test is generally less than one when the parametric assumptions hold, because one must give up some efficiency in return for fewer assumptions.

  5. Sign Test for Median of Ordinal Data
     A nonparametric version of the one sample $t$–test, with $\mathrm{ARE} = 2/\pi \approx 0.64$.

  6. Sign Test for Median of Ordinal Data
     Sign Test
     Situations for using the Sign Test
     Test of Magnitude between Matched Pairs, $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$. For any $(x_j, y_j)$ such that $x_j = y_j$, delete $(x_j, y_j)$ from the sample and readjust $n$, the sample size. Let
       $H_0 : P(X > Y) = P(Y > X)$
       $d_j \overset{\text{def}}{=} y_j - x_j$
       $x \overset{\text{def}}{=} \min(\#\text{ of positive } d_j\text{'s},\ \#\text{ of negative } d_j\text{'s})$.
     Test for Median of a sample $x_1, x_2, \cdots, x_n$ with fixed $M$. For any $x_j$ such that $x_j = M$, delete $x_j$ from the sample and readjust $n$, the sample size. Let
       $H_0 : \text{median} = M$
       $d_j \overset{\text{def}}{=} x_j - M$
       $x \overset{\text{def}}{=} \min(\#\text{ of positive } d_j\text{'s},\ \#\text{ of negative } d_j\text{'s})$.
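The median case above can be sketched in base R. This is only an illustrative sketch: the helper name `sign_stat` and the data vector are hypothetical and do not come from the slides.

```r
# Hypothetical helper: sign-test statistic x for a hypothesized median M.
sign_stat <- function(x, M) {
  d <- x - M              # differences d_j = x_j - M
  d <- d[d != 0]          # delete observations equal to M, which readjusts n
  n_pos <- sum(d > 0)     # number of positive d_j's
  n_neg <- sum(d < 0)     # number of negative d_j's
  list(n = length(d), x = min(n_pos, n_neg))
}

# Made-up data for illustration only
res <- sign_stat(c(5, 7, 3, 9, 5, 8, 6), M = 5)
res$n   # adjusted sample size
res$x   # test statistic
```

For matched pairs, the same helper could be called as `sign_stat(y - x, M = 0)`, since pairs with $x_j = y_j$ give a zero difference and are dropped in the same way.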

  7. Sign Test for Median of Ordinal Data
     Sign Test
     Situations for using the Sign Test
     Test of Distribution of Nominal Data $x_1, \cdots, x_n$. Let
       $H_0 : P(X = \text{first value}) = P(X = \text{second value})$
       $d_j \overset{\text{def}}{=} \begin{cases} 1 & \text{if } x_j = \text{first value} \\ 0 & \text{if } x_j = \text{second value} \end{cases}$
       $x \overset{\text{def}}{=} \min\left( \sum_{j=1}^{n} d_j,\ \sum_{j=1}^{n} (1 - d_j) \right) = \min(\#\text{ first values},\ \#\text{ second values})$.
     Nominal data need not be numbers. The two “Situations for using the Sign Test” on the previous slide are just special cases of the situation above, where what was being counted was +’s and −’s.

  8. Sign Test for Median of Ordinal Data
     Sign Test
     Theorem (Sign Test for $n \le 25$)
       The test statistic is $x$ and the critical values are found in Table A–7.
     Theorem (Sign Test for $n > 25$)
       The test statistic is
       $z = \dfrac{(x + 0.5) - n/2}{\sqrt{n}/2}$,
       which is approximately standard normal.
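A minimal sketch of the large-sample statistic in base R follows. The function name `sign_z` and the inputs `x = 10`, `n = 30` are hypothetical. Reading the lower-tail probability off `pnorm` is an assumption about how the statistic would be used: because $x$ is the smaller of the two counts, $z$ falls in the left tail.

```r
# Hypothetical helper: continuity-corrected normal approximation for the sign test.
sign_z <- function(x, n) {
  ((x + 0.5) - n / 2) / (sqrt(n) / 2)
}

# Made-up inputs: 10 of the less frequent sign out of n = 30 nonzero differences
z <- sign_z(x = 10, n = 30)
p_lower <- pnorm(z)   # lower-tail probability under N(0, 1)
```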

  9. Sign Test for Median of Ordinal Data

  10. Sign Test for Median of Ordinal Data
     Sign Test
     Sign Test Example
     You’re a marketing analyst for Chefs-R-Us. You’ve asked 8 people to rate a new ravioli on a 5-point Likert scale, from 1 = terrible to 5 = excellent. The ratings are:
       2 4 1 2 1 1 2 1
     At the .05 level, is there evidence that the median rating is less than 3?
       $H_0 : \text{median} = 3$ versus $H_A : \text{median} < 3$
     No rating equals 3, so $n = 8$. One rating is above 3 and seven are below, so $x = 1$. Table A–7 says to reject at the 0.05 significance level since $x = 1$.

  11. Sign Test for Median of Ordinal Data
     Sign Test
     Example (example solution using R)
       > install.packages("BSDA")
       > library(BSDA)
       > dat=c(2,4,1,2,1,1,2,1)
       > SIGN.test(dat,md=3,alternative="less")

               One-sample Sign-Test

       data:  dat
       s = 1, p-value = 0.03516
       alternative hypothesis: true median is less than 3
       95 percent confidence interval:
        -Inf    2
       sample estimates:
       median of x
               1.5

                         Conf.Level L.E.pt U.E.pt
       Lower Achieved CI     0.8555   -Inf      2
       Interpolated CI       0.9500   -Inf      2
       Upper Achieved CI     0.9648   -Inf      2
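The reported p-value can also be checked in base R without the BSDA package. As a sketch: under $H_0$ the number of ratings above 3 is Binomial($n$, 0.5), so the one-sided p-value is a lower binomial tail. The variable names are illustrative.

```r
dat <- c(2, 4, 1, 2, 1, 1, 2, 1)   # ratings from the example
d <- dat - 3                        # differences from the hypothesized median
d <- d[d != 0]                      # no rating equals 3, so nothing is dropped
n <- length(d)                      # sample size after removing ties
x <- sum(d > 0)                     # number of ratings above 3

# P(X <= x) for X ~ Binomial(n, 0.5); this is 9/256 for these data
p_exact <- pbinom(x, size = n, prob = 0.5)

# The same tail probability via the exact binomial test in the stats package
bt <- binom.test(x, n, p = 0.5, alternative = "less")
bt$p.value
```

Both values agree with the p-value of 0.03516 printed by `SIGN.test`.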

  12. Wilcoxon Signed–Ranks Test for Matched Pairs
     A nonparametric version of the matched pair one sample $t$–test, with $\mathrm{ARE} = 3/\pi \approx 0.955$.

  13. Wilcoxon Signed–Ranks Test for Matched Pairs
     Wilcoxon Signed–Ranks Test
     Given $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$, define $d_j \overset{\text{def}}{=} y_j - x_j$. Discard all bivariate data with $d_j = 0$ (and readjust the sample size $n$). Next rank the $|d_j|$’s from smallest to largest. For ties, reassign them the average of their would-be ranks. (For ties, there is a better way than averaging the ranks, but it is more complicated.)
     Definition
     The Wilcoxon Signed–Rank Statistics are
       $t^{+}(\omega) \overset{\text{def}}{=}$ sum of ranks of the positive $d_j$’s,
       $t^{-}(\omega) \overset{\text{def}}{=}$ sum of ranks of the negative $d_j$’s, and
       $t \overset{\text{def}}{=} \min(t^{-}, t^{+})$.
     Since the Wilcoxon Signed–Ranks Test takes into account both the signs and the ranks of the $d_j$’s, it obtains a higher ARE than the Sign Test.
     Photo: Frank Wilcoxon (1892–1965), American

  14. Wilcoxon Signed–Ranks Test for Matched Pairs
     Wilcoxon Signed–Ranks Test
     Example
     Let (7.4, 5.5), (5.5, 3.2), (5.6, 6.0), (7.9, 4.5), (6.7, 6.7), (6.5, 3.0), (7.8, 4.8), (3.5, 6.3) be a matched pair random sample. Calculate the Wilcoxon Signed–Rank Statistics.
     Solution: After eliminating the fifth pair (whose difference is 0), reducing the sample size to 7, and relabeling,
       $d_1 = 1.9,\ d_2 = 2.3,\ d_3 = -0.4,\ d_4 = 3.4,\ d_5 = 3.5,\ d_6 = 3.0,\ d_7 = -2.8$.
     Here the differences are taken as $x_j - y_j$; reversing the sign convention swaps $t^{+}$ and $t^{-}$ but leaves $t$ unchanged. So one has

       rank        1      2      3      4      5      6      7
       magnitude  |d_3|  |d_1|  |d_2|  |d_7|  |d_6|  |d_4|  |d_5|
       sign        −      +      +      −      +      +      +

     Thus
       $t^{+} = 2 + 3 + 5 + 6 + 7 = 23$, $t^{-} = 1 + 4 = 5$, $t = 5$.
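The hand calculation above can be reproduced in base R. This sketch uses the sign convention of the worked numbers ($x_j - y_j$); the variable names are illustrative.

```r
x <- c(7.4, 5.5, 5.6, 7.9, 6.7, 6.5, 7.8, 3.5)
y <- c(5.5, 3.2, 6.0, 4.5, 6.7, 3.0, 4.8, 6.3)

d <- x - y                  # differences, with the sign convention of the worked solution
d <- d[d != 0]              # discard the zero difference, so n drops from 8 to 7
r <- rank(abs(d))           # rank the magnitudes; rank() averages tied ranks by default

t_plus  <- sum(r[d > 0])    # sum of ranks of the positive differences
t_minus <- sum(r[d < 0])    # sum of ranks of the negative differences
t_stat  <- min(t_plus, t_minus)
```

`rank()` uses `ties.method = "average"` by default, which matches the averaging rule for ties described earlier, although this sample has no ties.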

  15. Wilcoxon Signed–Ranks Test for Matched Pairs
     Wilcoxon Signed–Ranks Test
     Theorem (Wilcoxon Signed–Rank Test)
     Given a random bivariate sample, $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$, such that the distribution of the differences, $d_j \overset{\text{def}}{=} y_j - x_j$, is approximately symmetric about zero and has a median of $M$, consider
       $H_0 : M = 0$ versus $H_A : \text{not } H_0$.
     Then a conservative test of $H_0$ versus $H_A$ is to use
       $\begin{cases} t & \text{if } n \le 30 \\ z = \dfrac{t - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}} & \text{if } n > 30 \end{cases}$
     as the test statistic. Use Table A–8 if $n \le 30$ and assume that $Z \sim N(0, 1)$ if $n > 30$.
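For the $n > 30$ case, a short sketch of the standardized statistic in base R follows. The helper name `signed_rank_z` and the inputs `t = 150`, `n = 35` are hypothetical. Doubling the lower tail to get a two-sided p-value is an assumption about how the normal approximation would be applied, not something stated on the slide.

```r
# Hypothetical helper: normal approximation to the Wilcoxon signed-rank statistic.
signed_rank_z <- function(t, n) {
  mu    <- n * (n + 1) / 4                        # mean of t+ (and of t-) under H0
  sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24)   # standard deviation under H0
  (t - mu) / sigma
}

# Made-up inputs: t = min(t-, t+) = 150 with n = 35 nonzero differences
z <- signed_rank_z(t = 150, n = 35)
p_two_sided <- 2 * pnorm(z)   # t is the smaller rank sum, so z lies in the lower tail
```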

  16. Wilcoxon Signed–Ranks Test for Matched Pairs

  17. Wilcoxon Signed–Ranks Test for Matched Pairs
     Wilcoxon Signed–Ranks Test
     Example (continued)
     Using data from the above Example, find the $p$–value of $H_0$: population differences have a median of 0 versus $H_A$: not $H_0$.
     Solution: Here $n = 7$. The approximate $p$–value is the smallest $\alpha$ such that $t = 5 < t_\alpha$. From Table A–8 one has that the $p$–value $> 0.1$. Using R:
       > xdat=c(7.4,5.5,5.6,7.9,6.5,7.8,3.5)
       > ydat=c(5.5,3.2,6.0,4.5,3.0,4.8,6.3)
       > wilcox.test(xdat,ydat,paired=TRUE)

               Wilcoxon signed rank test

       data:  xdat and ydat
       V = 23, p-value = 0.1563
       alternative hypothesis: true location shift is not equal to 0
     R reports V, the sum of the ranks of the positive differences xdat − ydat, which is $t^{+} = 23$ rather than $t = 5$.

  18. Wilcoxon Rank–Sum Test for Two Independent Samples
     A nonparametric version of the two sample $t$–test, with $\mathrm{ARE} = 3/\pi \approx 0.955$.
