Permutation tests for coefficients of variation in general one-way ANOVA models
Markus Pauly (1) and Łukasz Smaga (2)
(1) Faculty of Statistics, TU Dortmund University, Dortmund, Germany
(2) Faculty of Mathematics and Computer Science, Adam Mickiewicz University, Poznań, Poland
Statistical Learning Seminars 2020

Coefficient of variation
• The coefficient of variation (CV) $c = \sigma / \mu$ is a unitless dispersion measure for data given on a ratio scale.
• It has many applications, for example as:
  • a guide to the performance and repeatability of measurements in clinical trials (Feltz and Miller, 1996),
  • a reliability tool in engineering control charts (Castagliola et al., 2013),
  • a measure of risk in empirical finance and psychology (Ferri and Jones, 1979; Weber et al., 2004),
  • a tool for supply chain management in distribution and procurement logistics (Wanke and Zinn, 2004),
  • a way of quantifying variability in genetics (Wright, 1952).

Statistical inference for CV
• Many confidence intervals and tests for CVs are based on the estimator $\hat c = s / \bar x$ of the CV or on unbiased modifications thereof.
• Because the asymptotic distribution of $\hat c$ depends on potentially unknown model parameters such as the kurtosis, many methods are derived for parametric models.
• Assuming normality, tests for equality of CVs have been proposed by, e.g., Feltz and Miller (1996), the current gold standard, Forkman (2009), and Krishnamoorthy and Lee (2014), whereas Aerts and Haesbroeck (2017) investigate multivariate CVs under elliptical symmetry.

Statistical inference for CV
• If the model is correctly specified, most of these methods perform fairly well. Under misspecification, however, the procedures may not be reliable.
• We therefore consider a Wald-type statistic in a general model and equip it with a permutation technique to ensure good finite-sample behaviour.
• The resulting permutation test is exact in finite samples if the data are exchangeable and is asymptotically correct in general.

Model
• We consider a general $k$-sample model ($k \ge 2$) given by independent random variables
$$X_{ij} = \mu_i + \epsilon_{ij}, \quad 1 \le i \le k, \quad 1 \le j \le n_i,$$
where $\epsilon_{i1}, \dots, \epsilon_{in_i}$ are independent and identically distributed with
$$E(\epsilon_{i1}) = 0, \quad E(\epsilon_{i1}^2) = \sigma_i^2 > 0, \quad \sup_{1 \le i \le k} E(\epsilon_{i1}^4) < \infty.$$
• $X_{ij}$ describes the $j$-th observation in group $i$, $n_i$ the $i$-th sample size, and $N = n_1 + \dots + n_k$ denotes the total sample size.

Hypotheses
• We define the CV of the $i$-th group as $c_i = \sigma_i / \mu_i$ (additionally assuming $\mu_i \neq 0$) and set $\beta_i = \mu_i / \sigma_i$ for the corresponding standardized mean.
• The null hypothesis of equal CVs is
$$H_0 : c_1 = \dots = c_k$$
(assuming $\mu_i \neq 0$, $i = 1, \dots, k$), or equivalently, that of equal standardized means,
$$H_0 : \beta_1 = \dots = \beta_k.$$

Estimators
• The natural plug-in estimators are
$$\hat c_i = \frac{\hat\sigma_i}{\bar X_{i\cdot}} \quad \text{and} \quad \hat\beta_i = \frac{\bar X_{i\cdot}}{\hat\sigma_i},$$
where $\bar X_{i\cdot} = n_i^{-1} \sum_{j=1}^{n_i} X_{ij}$ and $\hat\sigma_i^2 = n_i^{-1} \sum_{j=1}^{n_i} (X_{ij} - \bar X_{i\cdot})^2$ are the sample mean and variance in group $i$, $i = 1, \dots, k$.
• The estimators $\hat c_i$ and $\hat\beta_i$ are consistent and asymptotically normal (as $\min(n_1, \dots, n_k) \to \infty$) under the given assumptions.

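The plug-in estimators above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code; the function name `cv_and_standardized_mean` and the sample values are hypothetical. Note the divisor $n_i$ (not $n_i - 1$) in the variance, as on the slide.

```python
import math

def cv_and_standardized_mean(sample):
    """Return (c_hat, beta_hat) for one group.

    Uses the variance with divisor n, as in the slide's definition.
    Assumes a non-zero sample mean and a non-constant sample;
    otherwise a ZeroDivisionError is raised.
    """
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / n
    sd = math.sqrt(var)
    return sd / mean, mean / sd

# Hypothetical data: mean 5, variance (divisor n) 4.
group = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
c_hat, beta_hat = cv_and_standardized_mean(group)
print(c_hat, beta_hat)
```

By construction the two estimates are reciprocals of each other, which is why the two formulations of $H_0$ are equivalent whenever all $\mu_i \neq 0$.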
Estimators
Lemma
Set $\theta_i := (\mu_i, \sigma_i^2)^\top \in \mathbb{R} \times (0, \infty)$. Then we have for each $i = 1, \dots, k$:
(a) for the CV estimator $\hat c_i = \hat\sigma_i / \bar X_{i\cdot}$, assuming that $\mu_i \neq 0$:
$$\sqrt{n_i}\,(\hat c_i - c_i) = \frac{1}{\sqrt{n_i}} \sum_{j=1}^{n_i} h_{\theta_i}(X_{ij}) + o_P(1),$$
where
$$h_{\theta_i}(X_{i1}) = \frac{(X_{i1} - \mu_i)^2 - \sigma_i^2}{2 \mu_i \sigma_i} - \frac{\sigma_i (X_{i1} - \mu_i)}{\mu_i^2}.$$
Moreover, $E(h_{\theta_i}(X_{i1})) = 0$ and
$$\operatorname{Var}(h_{\theta_i}(X_{i1})) = \frac{\sigma_i^4}{\mu_i^4} - \frac{E(X_{i1}^3) - 3 \mu_i \sigma_i^2 - \mu_i^3}{\mu_i^3} + \frac{E(X_{i1}^4) - 4 \mu_i E(X_{i1}^3) + 6 \mu_i^2 \sigma_i^2 + 3 \mu_i^4 - \sigma_i^4}{4 \mu_i^2 \sigma_i^2}.$$

Estimators
Lemma
(b) for the standardized mean estimator $\hat\beta_i = \bar X_{i\cdot} / \hat\sigma_i$ we have
$$\sqrt{n_i}\,(\hat\beta_i - \beta_i) = \frac{1}{\sqrt{n_i}} \sum_{j=1}^{n_i} h_{\theta_i,\mathrm{inv}}(X_{ij}) + o_P(1),$$
where $h_{\theta_i,\mathrm{inv}}(X_{i1}) = -(\mu_i/\sigma_i)^2\, h_{\theta_i}(X_{i1})$ for $\mu_i \neq 0$ and $h_{\theta_i,\mathrm{inv}}(X_{i1}) = X_{i1}/\sigma_i$ for $\mu_i = 0$. Furthermore, $E(h_{\theta_i,\mathrm{inv}}(X_{i1})) = 0$ and $\operatorname{Var}(h_{\theta_i,\mathrm{inv}}(X_{i1})) = (\mu_i/\sigma_i)^4 \operatorname{Var}(h_{\theta_i}(X_{i1}))$ for $\mu_i \neq 0$ and $\operatorname{Var}(h_{\theta_i,\mathrm{inv}}(X_{i1})) = 1$ for $\mu_i = 0$.

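The form of the two influence functions can be motivated by a first-order Taylor (delta-method) expansion. The following is only a heuristic sketch for the case $\mu_i \neq 0$, not the authors' proof; the remainder terms are assumed to be of order $o_P(n_i^{-1/2})$.

```latex
% Expand c = \sigma/\mu around (\mu_i, \sigma_i^2):
\hat c_i - c_i \approx \frac{\hat\sigma_i^2 - \sigma_i^2}{2\mu_i\sigma_i}
                 - \frac{\sigma_i}{\mu_i^2}\,(\bar X_{i\cdot} - \mu_i),
\qquad
\hat\sigma_i^2 - \sigma_i^2
  = \frac{1}{n_i}\sum_{j=1}^{n_i}\bigl[(X_{ij}-\mu_i)^2 - \sigma_i^2\bigr]
    - (\bar X_{i\cdot} - \mu_i)^2 .
% The last term is O_P(1/n_i); averaging the remaining terms gives h_{\theta_i}.
% For \beta = 1/c, one further expansion yields
\hat\beta_i - \beta_i \approx -\frac{1}{c_i^2}\,(\hat c_i - c_i)
  = -\Bigl(\frac{\mu_i}{\sigma_i}\Bigr)^{2} (\hat c_i - c_i).
```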
Estimators
Lemma
Under the assumptions of the above Lemma and presuming $\mu_i \neq 0$ we set
$$p(\mu_i, \sigma_i) = \frac{\sqrt{\frac{\sigma_i^2}{\mu_i^2} + 1} - \frac{\sigma_i}{\mu_i}}{2 \sqrt{\frac{\sigma_i^2}{\mu_i^2} + 1}} \in (0, 1).$$
Then we have $\operatorname{Var}(h_{\theta_i}(X_{i1})) = 0$ (as well as $\operatorname{Var}(h_{\theta_i,\mathrm{inv}}(X_{i1})) = 0$) if and only if $X_{i1}$ has a specific two-point distribution given by
$$X_{i1} = \begin{cases} \mu_i + \dfrac{\sigma_i^2}{\mu_i} + \sigma_i \sqrt{\dfrac{\sigma_i^2}{\mu_i^2} + 1}, & \text{with probability } p(\mu_i, \sigma_i), \\[2ex] \mu_i + \dfrac{\sigma_i^2}{\mu_i} - \sigma_i \sqrt{\dfrac{\sigma_i^2}{\mu_i^2} + 1}, & \text{with probability } 1 - p(\mu_i, \sigma_i). \end{cases} \qquad (1)$$

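The degenerate case in (1) can be checked numerically. The sketch below, with the hypothetical choice $\mu_i = 2$, $\sigma_i = 1.5$ and illustrative helper names, confirms that the two-point distribution has mean $\mu_i$ and variance $\sigma_i^2$, and that $h_{\theta_i}$ vanishes at both support points, so its variance is zero.

```python
import math

def h_theta(x, mu, sigma):
    """Influence function of the CV estimator from part (a) of the Lemma."""
    return ((x - mu) ** 2 - sigma ** 2) / (2 * mu * sigma) - sigma * (x - mu) / mu ** 2

def two_point(mu, sigma):
    """Return (p, upper, lower) of the two-point distribution in (1)."""
    root = math.sqrt(sigma ** 2 / mu ** 2 + 1)
    p = (root - sigma / mu) / (2 * root)
    upper = mu + sigma ** 2 / mu + sigma * root
    lower = mu + sigma ** 2 / mu - sigma * root
    return p, upper, lower

mu, sigma = 2.0, 1.5  # hypothetical parameter values
p, upper, lower = two_point(mu, sigma)

# First two moments of the two-point distribution.
mean = p * upper + (1 - p) * lower
var = p * (upper - mean) ** 2 + (1 - p) * (lower - mean) ** 2

print(p, upper, lower)          # support points and probability
print(mean, var)                # should reproduce mu and sigma**2
print(h_theta(upper, mu, sigma), h_theta(lower, mu, sigma))  # both zero
```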
Estimators
• Assuming $n_i / N \to p_i > 0$ for $i = 1, \dots, k$, we have
$$\sqrt{N}\,(\hat c_i - c_i) \xrightarrow{d} N\!\left(0, \frac{\operatorname{Var}(h_{\theta_i}(X_{i1}))}{p_i}\right)$$
and
$$\sqrt{N}\,(\hat\beta_i - \beta_i) \xrightarrow{d} N\!\left(0, \frac{\operatorname{Var}(h_{\theta_i,\mathrm{inv}}(X_{i1}))}{p_i}\right)$$
for each $i = 1, \dots, k$.
• To estimate $\operatorname{Var}(h_{\theta_i}(X_{i1}))$ and $\operatorname{Var}(h_{\theta_i,\mathrm{inv}}(X_{i1}))$, we use the empirical sample means $\bar X_{i\cdot}$, variances $\hat\sigma_i^2$, and third and fourth moments $n_i^{-1} \sum_{j=1}^{n_i} X_{ij}^3$ and $n_i^{-1} \sum_{j=1}^{n_i} X_{ij}^4$, respectively.
• Denote the resulting estimators of $\operatorname{Var}(h_{\theta_i}(X_{i1}))$ and $\operatorname{Var}(h_{\theta_i,\mathrm{inv}}(X_{i1}))$ by $S_i^2$ and $S_{i,\mathrm{inv}}^2$, respectively.

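A minimal sketch of the plug-in variance estimators follows. It substitutes the empirical moments into the variance formula from part (a) of the Lemma. The relation $S_{i,\mathrm{inv}}^2 = (\bar X_{i\cdot}/\hat\sigma_i)^4 S_i^2$ is an assumption here: it mirrors the Lemma's formula for $\mu_i \neq 0$, and the slides do not spell out the estimator. The function name and data are hypothetical.

```python
def variance_estimators(sample):
    """Return (S2, S2_inv) for one group by plugging in empirical moments.

    Assumes a non-zero sample mean and a non-constant sample.
    """
    n = len(sample)
    m1 = sum(sample) / n                         # sample mean
    m3 = sum(x ** 3 for x in sample) / n         # raw third moment
    m4 = sum(x ** 4 for x in sample) / n         # raw fourth moment
    var = sum((x - m1) ** 2 for x in sample) / n # variance with divisor n
    s2 = (var ** 2 / m1 ** 4
          - (m3 - 3 * m1 * var - m1 ** 3) / m1 ** 3
          + (m4 - 4 * m1 * m3 + 6 * m1 ** 2 * var + 3 * m1 ** 4 - var ** 2)
          / (4 * m1 ** 2 * var))
    # Assumed form for the standardized-mean version (case mu != 0).
    s2_inv = (m1 ** 2 / var) ** 2 * s2
    return s2, s2_inv

# Hypothetical data: mean 5, variance (divisor n) 4.
group = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s2, s2_inv = variance_estimators(group)
print(s2, s2_inv)
```

Raw third and fourth moments can lose precision when the mean is large relative to the spread; computing central moments directly is an algebraically equivalent alternative.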
Test statistics
• Following, e.g., Feltz and Miller (1996) and Chung and Romano (2013), we propose to use statistics of James-type (James, 1951):
$$Z_N = \sum_{i=1}^{k} \frac{n_i}{S_i^2} \left( \hat c_i - \frac{\sum_{i=1}^{k} n_i \hat c_i / S_i^2}{\sum_{i=1}^{k} n_i / S_i^2} \right)^{2},$$
$$Z_{N,\mathrm{inv}} = \sum_{i=1}^{k} \frac{n_i}{S_{i,\mathrm{inv}}^2} \left( \hat\beta_i - \frac{\sum_{i=1}^{k} n_i \hat\beta_i / S_{i,\mathrm{inv}}^2}{\sum_{i=1}^{k} n_i / S_{i,\mathrm{inv}}^2} \right)^{2}.$$

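Given per-group summaries, the James-type statistic is a weighted sum of squared deviations from a precision-weighted average. The sketch below takes the sample sizes, estimates, and variance estimates as inputs; the function name `james_statistic` and the numbers are hypothetical. The same function yields $Z_{N,\mathrm{inv}}$ when fed $\hat\beta_i$ and $S_{i,\mathrm{inv}}^2$.

```python
def james_statistic(sizes, estimates, variances):
    """James-type statistic: sum_i w_i * (est_i - weighted mean)^2, w_i = n_i / S_i^2.

    Assumes all variance estimates are strictly positive.
    """
    weights = [n / s2 for n, s2 in zip(sizes, variances)]
    weighted_mean = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return sum(w * (e - weighted_mean) ** 2 for w, e in zip(weights, estimates))

# Hypothetical summaries for k = 2 groups.
sizes = [8, 8]
estimates = [0.4, 0.3]       # c_hat_1, c_hat_2
variances = [0.05485, 0.04]  # S_1^2, S_2^2
z = james_statistic(sizes, estimates, variances)
print(z)
```

For $k = 2$ the statistic reduces to $(\hat c_1 - \hat c_2)^2 / (S_1^2/n_1 + S_2^2/n_2)$, a squared Welch-type two-sample statistic, which gives a quick sanity check.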
Asymptotic $\chi^2$-tests
• Under the above assumptions, under $H_0$:
$$Z_N \xrightarrow{d} \chi^2_{k-1} \quad \text{and} \quad Z_{N,\mathrm{inv}} \xrightarrow{d} \chi^2_{k-1}.$$
• Asymptotic $\chi^2$-tests for $H_0$:
$$\varphi_N = 1\{Z_N > \chi^2_{k-1, 1-\alpha}\} \quad \text{and} \quad \varphi_{N,\mathrm{inv}} = 1\{Z_{N,\mathrm{inv}} > \chi^2_{k-1, 1-\alpha}\},$$
where $\chi^2_{l,\alpha}$ denotes the $\alpha$-quantile of the $\chi^2_l$-distribution.

Permutation tests
• Keeping the pooled data
$$(Y_1, \dots, Y_N) = (X_{11}, \dots, X_{1n_1}, X_{21}, \dots, X_{2n_2}, \dots, X_{k1}, \dots, X_{kn_k})$$
fixed, a permutation $\pi$ is chosen uniformly from the symmetric group $S_N$ and the test statistic, say $Z_N$, is recalculated with the permuted sample $(Y_{\pi(1)}, \dots, Y_{\pi(N)})$.
• The permutation $Z_N$-test is given by $\varphi_N^{\pi} = 1\{Z_N > c^{\pi}_{1-\alpha}\}$, where $c^{\pi}_{1-\alpha}$ denotes the (conditional) $(1-\alpha)$-quantile of the distribution function of $Z_N(Y_{\pi(1)}, \dots, Y_{\pi(N)})$ given by
$$t \mapsto \hat R_{Z_N}(t) = \frac{1}{N!} \sum_{\pi \in S_N} 1\bigl(Z_N(Y_{\pi(1)}, \dots, Y_{\pi(N)}) \le t\bigr).$$
• By construction, this test is exact if $(Y_1, \dots, Y_N)$ are exchangeable.

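Summing over all $N!$ permutations is rarely feasible, so in practice the permutation distribution is usually approximated by random permutations. The sketch below is such a Monte Carlo approximation and not the authors' implementation: the function names, the number of permutations, the seed, the p-value convention $(1 + \#\{Z_N^{\pi} \ge Z_N\})/(B + 1)$, and the data are all assumptions. The variance estimate is computed as the empirical second moment of the plug-in influence values, which is algebraically the same as inserting empirical moments into the Lemma's variance formula.

```python
import math
import random

def summary(sample):
    """Return (n, c_hat, S2) for one group (divisor n throughout)."""
    n = len(sample)
    mean = sum(sample) / n
    dev = [x - mean for x in sample]
    var = sum(d * d for d in dev) / n
    sd = math.sqrt(var)
    # Plug-in influence values; their empirical mean is zero.
    h = [(d * d - var) / (2 * mean * sd) - sd * d / mean ** 2 for d in dev]
    return n, sd / mean, sum(v * v for v in h) / n

def z_statistic(groups):
    """James-type statistic Z_N for a list of samples."""
    stats = [summary(g) for g in groups]
    weights = [n / s2 for n, _, s2 in stats]
    centre = sum(w * c for w, (_, c, _) in zip(weights, stats)) / sum(weights)
    return sum(w * (c - centre) ** 2 for w, (_, c, _) in zip(weights, stats))

def permutation_p_value(groups, n_perm=999, seed=0):
    """Monte Carlo permutation p-value for H0: equal CVs."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [x for g in groups for x in g]
    z_obs = z_statistic(groups)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # uniform random permutation of the pooled data
        start, permuted = 0, []
        for size in sizes:   # re-split into groups of the original sizes
            permuted.append(pooled[start:start + size])
            start += size
        if z_statistic(permuted) >= z_obs:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Hypothetical data for k = 3 groups.
groups = [
    [9.1, 10.4, 9.8, 10.9, 10.2, 9.5, 10.6, 9.9],
    [4.2, 15.1, 8.3, 12.7, 6.0, 13.9, 10.5, 9.4],
    [19.0, 21.3, 20.4, 18.2, 22.1, 20.9, 19.6, 21.8],
]
p_value = permutation_p_value(groups)
print(p_value)
```

Rejecting when the p-value is at most $\alpha$ corresponds to comparing $Z_N$ with an estimated quantile $c^{\pi}_{1-\alpha}$. The sketch does not guard against degenerate permuted samples (zero mean or zero variance estimate), which would raise a `ZeroDivisionError`.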
Permutation tests
• Chung and Romano (2013)
Theorem
Under the above assumptions and some additional ones, the permutation distributions of $Z_N$ and $Z_{N,\mathrm{inv}}$ mimic the asymptotic null distribution, that is, we have convergence in probability under $H_0$ as $\min_i n_i \to \infty$:
$$\hat R_{Z_N}(t) = \frac{1}{N!} \sum_{\pi \in S_N} 1\bigl(Z_N(Y_{\pi(1)}, \dots, Y_{\pi(N)}) \le t\bigr) \xrightarrow{P} \chi^2_{k-1}(t),$$
$$\hat R_{Z_{N,\mathrm{inv}}}(t) = \frac{1}{N!} \sum_{\pi \in S_N} 1\bigl(Z_{N,\mathrm{inv}}(Y_{\pi(1)}, \dots, Y_{\pi(N)}) \le t\bigr) \xrightarrow{P} \chi^2_{k-1}(t),$$
where $\chi^2_{k-1}(t)$ denotes the distribution function of the $\chi^2_{k-1}$-distribution. Moreover, the probabilities that the $Z_N$- and $Z_{N,\mathrm{inv}}$-permutation tests reject $H_0$ tend to $\alpha$.