
Business Statistics: Median, Non-Parametric Tests



  1. MEDIAN: NON-PARAMETRIC TESTS Business Statistics

  2. CONTENTS
     ▪ Hypotheses on the median
     ▪ The sign test
     ▪ The Wilcoxon signed ranks test
     ▪ Old exam question
     ▪ Further study

  3. HYPOTHESES ON THE MEDIAN
     ▪ The median is a central value that may be more suitable for strongly asymmetric distributions
        ▪ and for distributions with fat tails
     ▪ Can we test a population median?
        ▪ e.g., $H_0: M = 400$
        ▪ $M$ is here the population median. Think of it as a Greek letter ...
     ▪ Note:
        ▪ for a more or less symmetric distribution, $M \approx \mu$, so a $t$-test of the mean is appropriate (if $n \ge 15$)
        ▪ although perhaps more sensitive to large positive or negative outliers in the sample

  4. HYPOTHESES ON THE MEDIAN
     ▪ What is the median of a sample?
        ▪ it is the middle value of the ordered data, i.e. roughly $x_{(n/2)}$
     ▪ So, if $H_0: M = 400$ were true, approximately half of the data in the sample would be lower than 400, and half would be higher
     ▪ Therefore, if we count the number of data points that are lower and compare it to the number of observations, we can develop a test statistic
     ▪ Two varieties of such non-parametric tests today:
        ▪ sign test
        ▪ Wilcoxon signed rank test

  5. THE SIGN TEST
     The sign test
     ▪ involves simply counting the number of positive or negative signs in a sequence of $n$ signs
     ▪ is based on the binomial distribution
     ▪ can be applied without requirements on the population distribution

  6. THE SIGN TEST
     Computational steps:
     ▪ for each data point $x_i$ compute the difference with the median ($M$) of the null hypothesis ($H_0$): $d_i = x_i - M$
     ▪ omit zero differences ($d_i = 0$); the effective sample size is $n'$
     ▪ assign $+1$ to positive differences ($d_i > 0$) and $-1$ to negative differences ($d_i < 0$)
     ▪ the test statistic $X$ is the sum of the positive numbers (= the number of positive differences)

  7. THE SIGN TEST
     Example:
     Context: battery life until failure (in hours)
     ▪ $H_0: M = 400$; $H_1: M \ne 400$
     ▪ use $\alpha = 0.05$
     ▪ sample of $n = 13$ observations ($x_1, \ldots, x_{13}$)
     ▪ reject for large and for small numbers of positive signs

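  8. THE SIGN TEST
     Example ($H_0: M = 400$):
     ▪ data: $x_i$ ($i = 1, \ldots, 13$)
     ▪ difference with $M$: $d_i = x_i - 400$
     ▪ no cases where $d_i = 0$, so $n' = n$
     ▪ $s_i = \begin{cases} 1 & \text{if } d_i > 0 \\ -1 & \text{if } d_i < 0 \end{cases}$
     ▪ $s_i^{+} = \begin{cases} 1 & \text{if } d_i > 0 \\ 0 & \text{if } d_i < 0 \end{cases}$
     ▪ $x = \sum_{i=1}^{n'} s_i^{+} = 8$

     | $x_i$ | $x_i - 400$ | $s_i$ | $s_i^{+}$ |
     |-------|-------------|-------|-----------|
     | 342   | -58         | -1    | 0         |
     | 426   | 26          | 1     | 1         |
     | 317   | -83         | -1    | 0         |
     | 545   | 145         | 1     | 1         |
     | 264   | -136        | -1    | 0         |
     | 451   | 51          | 1     | 1         |
     | 1049  | 649         | 1     | 1         |
     | 631   | 231         | 1     | 1         |
     | 512   | 112         | 1     | 1         |
     | 266   | -134        | -1    | 0         |
     | 492   | 92          | 1     | 1         |
     | 562   | 162         | 1     | 1         |
     | 298   | -102        | -1    | 0         |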
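The counting steps of the sign test can be sketched in a few lines of Python. This is only an illustration using the battery data from the slide; the function name `sign_statistic` is a made-up helper, not part of the course material.

```python
# Battery lifetimes (hours) from the example slide.
lifetimes = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]

def sign_statistic(data, m0):
    """Return (x, n_eff): the number of positive differences d_i = x_i - m0
    and the effective sample size after dropping zero differences."""
    diffs = [value - m0 for value in data]
    nonzero = [d for d in diffs if d != 0]   # omit zero differences
    x = sum(1 for d in nonzero if d > 0)     # count the +1 signs
    return x, len(nonzero)

x, n_eff = sign_statistic(lifetimes, 400)
print(x, n_eff)  # 8 13
```
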

  9. THE SIGN TEST
     Example (continued):
     ▪ $x = 8$
     ▪ under $H_0$: $X \sim bin(13, 0.5)$
     ▪ $P_{bin(13,0.5)}(X \ge 8) = 0.291$
        ▪ why $\ge 8$?
        ▪ if we would reject for 8, we would also reject for 9
     ▪ $p$-value: $2 \times 0.291 = 0.581$
        ▪ why $2 \times$?
        ▪ because the alternative hypothesis is two-sided
     ▪ there is no reason to reject $H_0$
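The tail probability on this slide can be checked directly from the binomial formula. The sketch below uses only the Python standard library; `binom_upper_tail` is a hypothetical helper name, and doubling the upper tail follows the slide's approach for this two-sided test.

```python
from math import comb

def binom_upper_tail(x, n, p=0.5):
    """P(X >= x) for X ~ bin(n, p), summed term by term."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

tail = binom_upper_tail(8, 13)       # P(X >= 8) under H0
p_value = min(1.0, 2 * tail)         # two-sided: double the tail, cap at 1
print(round(tail, 3), round(p_value, 3))  # 0.291 0.581
```

Since 0.581 is far above $\alpha = 0.05$, the calculation agrees with the slide's conclusion not to reject $H_0$.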

  10. EXERCISE 1
     Suppose we have more observations ($n = 130$) and find $x = 80$. Can you look up $P_{bin(130,0.5)}(X \ge 80)$?
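Printed binomial tables rarely go as far as $n = 130$, so one way to approach the exercise is to compute the tail numerically. The sketch below computes the exact binomial tail and, as an assumption not stated on the slide, compares it with a normal approximation with continuity correction. The helper names are illustrative only.

```python
from math import comb, erfc, sqrt

def binom_upper_tail(x, n, p=0.5):
    """Exact P(X >= x) for X ~ bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))

n, x = 130, 80
exact = binom_upper_tail(x, n)

# Normal approximation (an assumption here): mean n/2, variance n/4,
# with a continuity correction of 0.5.
z = (x - 0.5 - n / 2) / sqrt(n / 4)
approx = normal_upper_tail(z)

print(exact, approx)
```
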

  11. THE SIGN TEST
     In the sign test, we replace the numerical values by signs ($+$ or $-$)
     Advantage:
     ▪ we don't need any assumption on normality, symmetry, etc.
     ▪ that's why we say it's non-parametric: we don't have to assume a certain distribution with parameters
     Disadvantage:
     ▪ we discard much information, so that the test is not very sensitive (has low "power"; see later)
     Are there other non-parametric tests that are more powerful?
     ▪ is there a compromise between value and sign that still needs some assumptions, but not too many assumptions?
     Yes, replacing data by their rank

  12. THE WILCOXON SIGNED RANK TEST
     Wilcoxon signed rank test
     ▪ involves comparing the sum of ranks of the values larger than the test value with the sum of ranks of the values smaller than the test value
     Computational steps:
     ▪ for each data point $x_i$ compute the difference with the median ($M$) of the null hypothesis, $d_i = x_i - M$, and its absolute value $|d_i|$
     ▪ omit zero differences ($d_i = 0$); the effective sample size is $n'$
     ▪ assign ranks ($1, \ldots, n'$) to the $|d_i|$
     ▪ reassign $+$ and $-$ to the ranks
     ▪ the test statistic ($W$) is the sum of the positive ranks

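  13. THE WILCOXON SIGNED RANK TEST
     Example ($H_0: M = 400$):
     ▪ data: $x_i$ ($i = 1, \ldots, 13$)
     ▪ difference with $M$: $d_i = x_i - 400$
     ▪ no cases where $d_i = 0$, so $n' = n$
     ▪ $w = \sum_{i=1}^{n'} r_i^{+} = 61$
     ▪ under $H_0$: $W \sim\ ?$ (use table)
     ▪ $P_{H_0}(W \ge 61) =\ ?$

     | $x_i$ | $x_i - 400$ | $\lvert x_i - 400 \rvert$ | $r_i$ | $r_i^{+}$ |
     |-------|-------------|---------------------------|-------|-----------|
     | 342   | -58         | 58                        | -3    |           |
     | 426   | 26          | 26                        | 1     | 1         |
     | 317   | -83         | 83                        | -4    |           |
     | 545   | 145         | 145                       | 10    | 10        |
     | 264   | -136        | 136                       | -9    |           |
     | 451   | 51          | 51                        | 2     | 2         |
     | 1049  | 649         | 649                       | 13    | 13        |
     | 631   | 231         | 231                       | 12    | 12        |
     | 512   | 112         | 112                       | 7     | 7         |
     | 266   | -134        | 134                       | -8    |           |
     | 492   | 92          | 92                        | 5     | 5         |
     | 562   | 162         | 162                       | 11    | 11        |
     | 298   | -102        | 102                       | -6    |           |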
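The ranking steps can be reproduced with a short Python sketch. It assumes there are no ties among the absolute differences (true for this sample); with ties, average ranks would be needed, which this illustration does not handle. The function name `wilcoxon_w` is hypothetical.

```python
# Battery lifetimes (hours) from the example slide.
lifetimes = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]

def wilcoxon_w(data, m0):
    """Return (w, n_eff): the sum of ranks of the positive differences and the
    effective sample size. Assumes no tied absolute differences."""
    diffs = [value - m0 for value in data if value != m0]  # omit zero differences
    ordered = sorted(diffs, key=abs)                       # rank by |d_i|
    w = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
    return w, len(diffs)

w, n_eff = wilcoxon_w(lifetimes, 400)
print(w, n_eff)  # 61 13
```
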

  14. THE WILCOXON SIGNED RANK TEST
     Testing the median using the Wilcoxon $W$ statistic
     ▪ small samples: using a table of critical values
        ▪ included in tables at exam
     ▪ large samples: using a normal approximation of $W$
        ▪ valid when $n \ge 20$
     ▪ The test is only valid for symmetrically distributed populations
        ▪ if not, use the sign test

  15. THE WILCOXON SIGNED RANK TEST
     Small samples: critical values of the Wilcoxon statistic
     Lower and Upper Critical Values $W$ of Wilcoxon Signed-Ranks Test (lower , upper)

     | one-tail: | $\alpha = 0.05$ | $\alpha = 0.025$ | $\alpha = 0.01$ | $\alpha = 0.005$ |
     | two-tail: | $\alpha = 0.10$ | $\alpha = 0.05$  | $\alpha = 0.02$ | $\alpha = 0.01$  |
     |-----------|-----------------|------------------|-----------------|------------------|
     | $n = 5$   | 0 , 15          | --- , ---        | --- , ---       | --- , ---        |
     | $n = 6$   | 2 , 19          | 0 , 21           | --- , ---       | --- , ---        |
     | $n = 7$   | 3 , 25          | 2 , 26           | 0 , 28          | --- , ---        |
     | $n = 8$   | 5 , 31          | 3 , 33           | 1 , 35          | 0 , 36           |
     | $n = 9$   | 8 , 37          | 5 , 40           | 3 , 42          | 1 , 44           |
     | $n = 10$  | 10 , 45         | 8 , 47           | 5 , 50          | 3 , 52           |
     | $n = 11$  | 13 , 53         | 10 , 56          | 7 , 59          | 5 , 61           |
     | $n = 12$  | 17 , 61         | 13 , 65          | 10 , 68         | 7 , 71           |
     | $n = 13$  | 21 , 70         | 17 , 74          | 12 , 79         | 10 , 81          |

     Table is available at the exam (and on the course website)
     ▪ two-sided, $\alpha = 0.05$, $n = 13$: $w_{lower} = 17$ and $w_{upper} = 74$
     ▪ $R_{crit} = [0, 17] \cup [74, 91]$
     ▪ $w_{calc} = 61$, so do not reject $H_0$ at $\alpha = 0.05$
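The decision rule on this slide can be written out as a small check. The critical values 17 and 74 are taken from the table row for $n = 13$ (two-tail $\alpha = 0.05$); the upper end of the rejection region, 91, is the largest possible value of $W$, namely the sum of all ranks $n(n+1)/2$. The variable names are illustrative.

```python
n = 13
w_calc = 61
w_lower, w_upper = 17, 74        # table values: two-tail alpha = 0.05, n = 13

w_max = n * (n + 1) // 2         # sum of ranks 1..n, the largest possible W
reject = w_calc <= w_lower or w_calc >= w_upper

print(w_max, reject)  # 91 False
```
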

  16. THE WILCOXON SIGNED RANK TEST
     Large samples: under $H_0$, it can be shown that
     ▪ $E(W) = \frac{n(n+1)}{4}$
     ▪ $\mathrm{var}(W) = \frac{n(n+1)(2n+1)}{24}$
     Further, for $n \ge 20$, approximately:
     ▪ $\frac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \sim N(0, 1)$
     ▪ so you can compute $z_{calc} = \frac{w_{calc} - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$
     ▪ and compare it to $z_{crit}$ (e.g., $\pm 1.96$)

  17. THE WILCOXON SIGNED RANK TEST
     Example, continued:
     (In fact, not a good idea because $n = 13 \ngeq 20$. We do it just to show how it works ...)
     ▪ $w = \sum_{i=1}^{n'} r_i^{+} = 61$
     ▪ under $H_0$: $W \sim N(E(W), \mathrm{var}(W))$
     ▪ so, under $H_0$: $\frac{W - E(W)}{\sqrt{\mathrm{var}(W)}} \sim N(0, 1)$
     ▪ $P_N(W \ge 61) = P\left(\frac{W - E(W)}{\sqrt{\mathrm{var}(W)}} \ge \frac{61 - 45.5}{14.31}\right) = P(Z \ge 1.08) = 0.1401$
     ▪ $p$-value: $2 \times 0.1401 = 0.2802$
     ▪ there is no reason to reject $H_0$
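The normal approximation can be reproduced numerically as a check on the arithmetic. The sketch below uses only the standard library; `wilcoxon_normal_approx` is a hypothetical helper. Note that the slide rounds $z$ to 1.08 before looking up the table, so an unrounded calculation gives a slightly smaller tail probability than 0.1401.

```python
from math import erfc, sqrt

def wilcoxon_normal_approx(w, n):
    """Return (mean, sd, z, upper_tail) for the normal approximation of W."""
    mean = n * (n + 1) / 4                     # E(W) under H0
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)  # sqrt(var(W)) under H0
    z = (w - mean) / sd
    upper_tail = 0.5 * erfc(z / sqrt(2))       # P(Z >= z), Z standard normal
    return mean, sd, z, upper_tail

mean, sd, z, upper_tail = wilcoxon_normal_approx(61, 13)
p_value = 2 * upper_tail                       # two-sided test
print(mean, round(sd, 2), round(z, 2))  # 45.5 14.31 1.08
```
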

  18. OLD EXAM QUESTION 23 March 2015, Q1l-m

  19. FURTHER STUDY
     ▪ Doane & Seward 5/E 16.1-16.3
     ▪ Tutorial exercises week 3: Wilcoxon signed rank test, sign test
