MEDIAN: NON-PARAMETRIC TESTS Business Statistics
CONTENTS
▪ Hypotheses on the median
▪ The sign test
▪ The Wilcoxon signed ranks test
▪ Old exam question
▪ Further study
HYPOTHESES ON THE MEDIAN
▪ The median is a central value that may be more suitable than the mean for strongly asymmetric distributions
▪ and for distributions with fat tails
▪ Can we test a population median?
  ▪ e.g., $H_0: M = 400$
  ▪ ($M$ is here the population median. Think of it as a Greek letter ...)
▪ Note:
  ▪ for a more or less symmetric distribution, $M \approx \mu$, so a $t$-test of the mean is appropriate (if $n \geq 15$)
  ▪ although perhaps more sensitive to large positive or negative outliers in the sample
HYPOTHESES ON THE MEDIAN
▪ What is the median of a sample?
  ▪ it is the middle value of the sorted data, i.e. $x_{n/2}$
▪ So, if $H_0: M = 400$ were true, approximately half of the data in the sample would be lower than 400, and half would be higher
▪ Therefore, if we count the number of data points that are lower and compare it to the number of observations, we can develop a test statistic
▪ Two varieties of such non-parametric tests today:
  ▪ sign test
  ▪ Wilcoxon signed rank test
THE SIGN TEST
The sign test
▪ involves simply counting the number of positive or negative signs in a sequence of $n$ signs
▪ is based on the binomial distribution
▪ can be applied without requirements on the population distribution
THE SIGN TEST
Computational steps:
▪ for each data point $x_i$ compute the difference with the median ($M$) of the null hypothesis ($H_0$): $d_i = x_i - M$
▪ omit zero differences ($d_i = 0$); the effective sample size is $n'$
▪ assign $+1$ to positive differences ($d_i > 0$) and $-1$ to negative differences ($d_i < 0$)
▪ the test statistic $X$ is the number of $+1$ values (= number of positive differences)
THE SIGN TEST
Example:
Context: battery life until failure (in hours)
▪ $H_0: M = 400$; $H_1: M \neq 400$
▪ use $\alpha = 0.05$
▪ sample of $n = 13$ observations ($x_1, \ldots, x_{13}$)
▪ reject for large and for small numbers of positive signs
THE SIGN TEST
Example ($H_0: M = 400$):
▪ data: $x_i$ ($i = 1, \ldots, 13$)
▪ difference with $M$: $d_i = x_i - 400$
▪ no cases where $d_i = 0$, so $n' = n$
▪ $s_i = \begin{cases} 1 & \text{if } d_i > 0 \\ -1 & \text{if } d_i < 0 \end{cases}$
▪ $s_i^{+} = \begin{cases} 1 & \text{if } d_i > 0 \\ 0 & \text{if } d_i < 0 \end{cases}$
▪ $x = \sum_{i=1}^{n'} s_i^{+} = 8$

   x_i    x_i - 400    s_i    s_i(+)
   342       -58       -1       0
   426        26        1       1
   317       -83       -1       0
   545       145        1       1
   264      -136       -1       0
   451        51        1       1
  1049       649        1       1
   631       231        1       1
   512       112        1       1
   266      -134       -1       0
   492        92        1       1
   562       162        1       1
   298      -102       -1       0
THE SIGN TEST
Example (continued):
▪ $x = 8$
▪ under $H_0$: $X \sim \mathrm{bin}(13, 0.5)$
▪ $P_{\mathrm{bin}(13, 0.5)}(X \geq 8) = 0.291$
  ▪ why $\geq 8$?
  ▪ if we would reject for 8, we would also reject for 9 or more
▪ $p$-value: $2 \times 0.291 = 0.581$
  ▪ why $2 \times$?
  ▪ because the alternative hypothesis is two-sided
▪ $0.581 > \alpha = 0.05$, so there is no reason to reject $H_0$
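The sign test example can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the function name `sign_test` and its return values are chosen here for illustration and do not come from the course material. The two-sided $p$-value is taken as twice the smaller tail probability, which matches the $2 \times 0.291$ on the slide.

```python
from math import comb

# Battery-life data from the example (hours until failure)
data = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]
m0 = 400  # median under H0


def sign_test(values, median0):
    """Sign test for H0: M = median0 (illustrative helper, not a library API)."""
    # Differences with the hypothesised median; zero differences are omitted
    diffs = [v - median0 for v in values if v != median0]
    n_eff = len(diffs)                   # effective sample size n'
    x = sum(1 for d in diffs if d > 0)   # number of positive signs
    # Exact binomial tail probabilities under X ~ bin(n', 0.5)
    upper = sum(comb(n_eff, k) for k in range(x, n_eff + 1)) / 2 ** n_eff
    lower = sum(comb(n_eff, k) for k in range(0, x + 1)) / 2 ** n_eff
    # Assumption: two-sided p-value = twice the smaller tail, capped at 1
    p_two_sided = min(1.0, 2 * min(upper, lower))
    return n_eff, x, upper, p_two_sided


n_eff, x, p_upper, p_two_sided = sign_test(data, m0)
print(n_eff, x, round(p_upper, 3), round(p_two_sided, 3))  # 13 8 0.291 0.581
```

Since 0.581 is far above $\alpha = 0.05$, the sketch leads to the same decision as the slide.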
EXERCISE 1
Suppose we have more observations ($n = 130$) and find $x = 80$. Can you look up $P_{\mathrm{bin}(130, 0.5)}(X \geq 80)$?
THE SIGN TEST
In the sign test, we replace the numerical values by signs ($+$ or $-$)
Advantage:
▪ we don't need any assumption on normality, symmetry, etc.
▪ that's why we say it's non-parametric: we don't have to assume a certain distribution with parameters
Disadvantage:
▪ we discard much information, so that the test is not very sensitive (has low "power"; see later)
Are there other non-parametric tests that are more powerful?
▪ is there a compromise between value and sign that still needs some assumptions, but not too many assumptions?
Yes, replacing data by their rank
THE WILCOXON SIGNED RANK TEST
Wilcoxon signed rank test
▪ involves comparing the sum of ranks of the values larger than the test value with the sum of ranks of the values smaller than the test value
Computational steps:
▪ for each data point $x_i$ compute the difference with the median ($M$) of the null hypothesis, $d_i = x_i - M$, and its absolute value $|d_i|$
▪ omit zero differences ($d_i = 0$); the effective sample size is $n'$
▪ assign ranks ($1, \ldots, n'$) to the absolute differences $|d_i|$, from smallest to largest
▪ reassign $+$ and $-$ to the ranks, according to the sign of $d_i$
▪ the test statistic ($W$) is the sum of the positive ranks
THE WILCOXON SIGNED RANK TEST
Example ($H_0: M = 400$):
▪ data: $x_i$ ($i = 1, \ldots, 13$)
▪ difference with $M$: $d_i = x_i - 400$
▪ no cases where $d_i = 0$, so $n' = n$
▪ $w = \sum_{i=1}^{n'} r_i^{+} = 61$
▪ under $H_0$: $W \sim\, ?$ (use table)
▪ $P_{H_0}(W \geq 61) =\, ?$

   x_i    x_i - 400    |x_i - 400|    r_i    r_i(+)
   342       -58            58         -3
   426        26            26          1       1
   317       -83            83         -4
   545       145           145         10      10
   264      -136           136         -9
   451        51            51          2       2
  1049       649           649         13      13
   631       231           231         12      12
   512       112           112          7       7
   266      -134           134         -8
   492        92            92          5       5
   562       162           162         11      11
   298      -102           102         -6
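The ranking steps can be checked with a short Python sketch. The helper name `signed_rank_statistic` is made up for this illustration, and the code assumes there are no ties among the absolute differences (true for this data set); tied values would need average ranks, which this sketch does not handle.

```python
# Battery-life data from the example (hours until failure)
data = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]
m0 = 400  # median under H0


def signed_rank_statistic(values, median0):
    """Return (sum of positive ranks, sum of negative ranks).

    Illustrative helper; assumes no ties among the |d_i|.
    """
    # Differences with the hypothesised median; zero differences are omitted
    diffs = [v - median0 for v in values if v != median0]
    # Positions of the differences, ordered by increasing absolute value
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    # Reattach the signs: split the ranks by the sign of the difference
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


w_plus, w_minus = signed_rank_statistic(data, m0)
print(w_plus, w_minus)  # 61 30
```

The two sums always add up to $n'(n'+1)/2$, here $13 \times 14 / 2 = 91$, which is a quick check on the ranking.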
THE WILCOXON SIGNED RANK TEST
Testing the median using the Wilcoxon $W$ statistic
▪ small samples: using a table of critical values
  ▪ included in tables at exam
▪ large samples: using a normal approximation of $W$
  ▪ valid when $n \geq 20$
▪ The test is only valid for symmetrically distributed populations
  ▪ if not, use sign test
THE WILCOXON SIGNED RANK TEST
Small samples: critical values of Wilcoxon statistic

Lower and Upper Critical Values W of Wilcoxon Signed-Ranks Test
(each entry is lower , upper)

  one-tail:   α = 0.05    α = 0.025   α = 0.01    α = 0.005
  two-tail:   α = 0.10    α = 0.05    α = 0.02    α = 0.01
   n
   5           0 , 15     --- , ---   --- , ---   --- , ---
   6           2 , 19      0 , 21     --- , ---   --- , ---
   7           3 , 25      2 , 26      0 , 28     --- , ---
   8           5 , 31      3 , 33      1 , 35      0 , 36
   9           8 , 37      5 , 40      3 , 42      1 , 44
  10          10 , 45      8 , 47      5 , 50      3 , 52
  11          13 , 53     10 , 56      7 , 59      5 , 61
  12          17 , 61     13 , 65     10 , 68      7 , 71
  13          21 , 70     17 , 74     12 , 79     10 , 81

Table is available at the exam (and on the course website)

▪ two-sided, $\alpha = 0.05$, $n = 13$: $w_{\text{lower}} = 17$ and $w_{\text{upper}} = 74$
▪ $R_{\text{crit}} = [0, 17] \cup [74, 91]$
▪ $w_{\text{calc}} = 61$, so do not reject $H_0$ at $\alpha = 0.05$
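For readers who wonder where such tail probabilities come from: under $H_0$ (symmetric population, no ties) each rank $1, \ldots, n$ is positive or negative with probability 0.5, independently, so the exact distribution of $W$ can be obtained by counting subsets of ranks. The sketch below does this counting; the function name `signed_rank_counts` is hypothetical, and the code is meant as an illustration of the idea rather than as the procedure used to build the exam table.

```python
def signed_rank_counts(n):
    """counts[w] = number of subsets of the ranks 1..n whose sum equals w.

    Illustrative helper; assumes no ties, so the ranks are exactly 1..n.
    """
    max_w = n * (n + 1) // 2
    counts = [0] * (max_w + 1)
    counts[0] = 1  # the empty subset: no positive ranks, W = 0
    for r in range(1, n + 1):
        # Go downwards so that each rank is used at most once
        for s in range(max_w, r - 1, -1):
            counts[s] += counts[s - r]
    return counts


n = 13
counts = signed_rank_counts(n)
total = 2 ** n  # each of the n ranks is + or -, all sign patterns equally likely

p_upper_61 = sum(counts[61:]) / total   # exact P(W >= 61) under H0
p_lower_17 = sum(counts[:18]) / total   # exact P(W <= 17) under H0
print(p_upper_61, p_lower_17)
```

The first number is the exact one-sided tail probability for the observed $w = 61$; the second is the lower-tail probability at the tabulated lower critical value for two-sided $\alpha = 0.05$.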
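THE WILCOXON SIGNED RANK TEST
Large samples: under $H_0$, it can be shown that
▪ $E(W) = \dfrac{n(n+1)}{4}$
▪ $\mathrm{var}(W) = \dfrac{n(n+1)(2n+1)}{24}$
Further, for $n \geq 20$, approximately:
▪ $\dfrac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \sim N(0, 1)$
▪ so you can compute $z_{\text{calc}} = \dfrac{w_{\text{calc}} - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$
▪ and compare it to $z_{\text{crit}}$ (e.g., $\pm 1.96$)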
THE WILCOXON SIGNED RANK TEST
Example, continued:
(In fact, not a good idea because $n = 13 \ngeq 20$. We do it just to show how it works ...)
▪ $w = \sum_{i=1}^{n'} r_i^{+} = 61$
▪ under $H_0$: $W \sim N\big(E(W), \mathrm{var}(W)\big)$
▪ so, under $H_0$: $\dfrac{W - E(W)}{\sqrt{\mathrm{var}(W)}} \sim N(0, 1)$
▪ $P_N(W \geq 61) = P\left(\dfrac{W - E(W)}{\sqrt{\mathrm{var}(W)}} \geq \dfrac{61 - 45.5}{14.31}\right) = P(Z \geq 1.08) = 0.1401$
▪ $p$-value: $2 \times 0.1401 = 0.2802$
▪ there is no reason to reject $H_0$
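The normal approximation is easy to reproduce. The sketch below uses only the Python standard library; the function name `wilcoxon_normal_approx` is invented for this illustration. The code does not round $z$ before evaluating the normal distribution, whereas the slide rounds to $z = 1.08$ for the table lookup, so the probabilities differ slightly in the third decimal.

```python
from math import sqrt
from statistics import NormalDist


def wilcoxon_normal_approx(w, n):
    """Normal approximation for the Wilcoxon W statistic (illustrative helper)."""
    mean_w = n * (n + 1) / 4                  # E(W) under H0
    var_w = n * (n + 1) * (2 * n + 1) / 24    # var(W) under H0
    z = (w - mean_w) / sqrt(var_w)            # standardised test statistic
    p_upper = 1 - NormalDist().cdf(z)         # P(Z >= z), no continuity correction
    return mean_w, var_w, z, p_upper


mean_w, var_w, z, p_upper = wilcoxon_normal_approx(61, 13)
p_two_sided = 2 * p_upper  # appropriate here because z > 0 (upper tail is the smaller one)
print(mean_w, var_w, round(z, 2))  # 45.5 204.75 1.08
print(p_upper, p_two_sided)
```

Since $|z| = 1.08 < 1.96$, the conclusion is the same as on the slide: no reason to reject $H_0$ at $\alpha = 0.05$.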
OLD EXAM QUESTION 23 March 2015, Q1l-m
FURTHER STUDY Doane & Seward 5/E 16.1-16.3 Tutorial exercises week 3 Wilcoxon signed rank test, sign test