Ch 8: Models for Matched Pairs

8.1 McNemar’s Test

Example (Crossover Study: Drug vs Placebo I)

86 subjects. Randomly assign each to either “drug then placebo” or “placebo then drug”. Binary response (S, F) for each.

Treatment    S    F   Total
Drug        61   25     86
Placebo     22   64     86

Methods so far (e.g., $X^2$ and $G^2$ tests of independence, CI for $\theta$, logistic regression) assume independent samples. They are inappropriate for dependent samples (e.g., the same subjects in each sample, yielding matched pairs of responses).
Example (Crossover Study: Drug vs Placebo II)

To reflect dependence, display data as 86 obs rather than 2 × 86 obs.

              Placebo
              S     F
Drug    S    12    49    61
        F    10    15    25
             22    64    86

Population probabilities:

              Placebo
              S            F
Drug    S    $\pi_{11}$   $\pi_{12}$   $\pi_{1+}$
        F    $\pi_{21}$   $\pi_{22}$   $\pi_{2+}$
             $\pi_{+1}$   $\pi_{+2}$   1

There is marginal homogeneity if $\pi_{1+} = \pi_{+1}$.
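Marginal homogeneity, $\pi_{1+} = \pi_{+1}$, is equivalent to $\pi_{12} = \pi_{21}$, since $\pi_{1+} = \pi_{11} + \pi_{12}$ and $\pi_{+1} = \pi_{11} + \pi_{21}$. So under $H_0$: marginal homogeneity,

\[ \frac{\pi_{12}}{\pi_{12} + \pi_{21}} = \frac{1}{2}. \]

Under $H_0$, each of the $n^* = n_{12} + n_{21}$ observations has probability 1/2 of contributing to $n_{12}$ and 1/2 of contributing to $n_{21}$:

\[ n_{12} \sim \mathrm{Bin}\left(n^*, \tfrac{1}{2}\right), \qquad \text{mean} = \frac{n^*}{2}, \qquad \text{std dev} = \sqrt{n^* \left(\tfrac{1}{2}\right)\left(\tfrac{1}{2}\right)}. \]

By the normal approximation to the binomial, for large $n^*$,

\[ z = \frac{n_{12} - n^*/2}{\sqrt{n^* \left(\tfrac{1}{2}\right)\left(\tfrac{1}{2}\right)}} = \frac{n_{12} - n_{21}}{\sqrt{n_{12} + n_{21}}} \sim N(0, 1), \]

or equivalently

\[ z^2 = \frac{(n_{12} - n_{21})^2}{n_{12} + n_{21}} \sim \chi^2_1. \]

Called McNemar’s test.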
Example (Crossover Study: Drug vs Placebo III)

              Placebo
              S     F
Drug    S    12    49    61 (71%)
        F    10    15    25
             22    64    86
            (26%)

\[ z = \frac{n_{12} - n_{21}}{\sqrt{n_{12} + n_{21}}} = \frac{49 - 10}{\sqrt{49 + 10}} = 5.1 \qquad (z^2 = 25.8,\ df = 1) \]

p-value < 0.0001 for $H_0: \pi_{1+} = \pi_{+1}$ vs $H_a: \pi_{1+} \neq \pi_{+1}$.

Extremely strong evidence that probability of success is higher for drug than placebo.
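The arithmetic in this example can be reproduced by hand in R. This is a minimal sketch, assuming only base R; the variable names are illustrative and the p-value uses the large-sample $\chi^2_1$ reference distribution.

```r
# Discordant counts from the crossover table:
# n12 = drug S / placebo F, n21 = drug F / placebo S
n12 <- 49
n21 <- 10

# McNemar z statistic and its square
z  <- (n12 - n21) / sqrt(n12 + n21)
z2 <- z^2

# Two-sided p-value from the chi-squared(1) reference distribution
p_value <- pchisq(z2, df = 1, lower.tail = FALSE)

round(c(z = z, z2 = z2), 2)
p_value
```

Only the discordant pairs enter the statistic; the 12 + 15 concordant pairs play no role in the test.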
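CI for $\pi_{1+} - \pi_{+1}$

Estimate $\pi_{1+} - \pi_{+1}$ by the difference of sample proportions, $p_{1+} - p_{+1}$:

\[ p_{1+} - p_{+1} = \frac{n_{1+} - n_{+1}}{n} = \frac{n_{12} - n_{21}}{n} \]

\[ SE = \frac{1}{n} \sqrt{n_{12} + n_{21} - \frac{(n_{12} - n_{21})^2}{n}} \]

Example (Crossover Study: Drug vs Placebo IV)

\[ \begin{array}{cc|c} n_{11} & n_{12} & \\ n_{21} & n_{22} & \\ \hline & & n \end{array} = \begin{array}{cc|c} 12 & 49 & \\ 10 & 15 & \\ \hline & & 86 \end{array} \]

\[ p_{1+} - p_{+1} = \frac{49 - 10}{86} = \frac{39}{86} = 0.453 \]

\[ SE = \frac{1}{86} \sqrt{49 + 10 - \frac{(49 - 10)^2}{86}} = 0.075 \]

95% CI: $0.453 \pm (1.96)(0.075) = 0.453 \pm 0.146 = (0.31, 0.60)$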
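The interval can be checked numerically. The following is a sketch in base R under the same large-sample (Wald) assumptions as the slide; `diff_hat`, `se`, and `ci` are illustrative names.

```r
# Counts from the crossover table
n   <- 86
n12 <- 49
n21 <- 10

# Estimated difference of marginal proportions, p_{1+} - p_{+1}
diff_hat <- (n12 - n21) / n

# Standard error for dependent (matched-pairs) proportions
se <- sqrt(n12 + n21 - (n12 - n21)^2 / n) / n

# 95% Wald confidence interval
ci <- diff_hat + c(-1, 1) * qnorm(0.975) * se

round(c(estimate = diff_hat, SE = se, lower = ci[1], upper = ci[2]), 3)
```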
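Aside: How is the SE derived?

\[ (n_{11}, n_{12}, n_{21}, n_{22}) \sim \mathrm{MN}\big(n, (\pi_{11}, \pi_{12}, \pi_{21}, \pi_{22})\big) \]

\[ \mathrm{Var}(n_{ij}) = n \pi_{ij} (1 - \pi_{ij}), \qquad \mathrm{Cov}(n_{ij}, n_{i'j'}) = -n \pi_{ij} \pi_{i'j'} \quad \text{if } i \neq i' \text{ or } j \neq j' \]

\[ \begin{aligned}
\mathrm{Var}(p_{1+} - p_{+1}) &= \mathrm{Var}\left(\frac{n_{12} - n_{21}}{n}\right) = \frac{\mathrm{Var}(n_{12} - n_{21})}{n^2} \\
&= \frac{\mathrm{Var}(n_{12}) + \mathrm{Var}(n_{21}) - 2\,\mathrm{Cov}(n_{12}, n_{21})}{n^2} \\
&= \frac{n \pi_{12}(1 - \pi_{12}) + n \pi_{21}(1 - \pi_{21}) + 2 n \pi_{12} \pi_{21}}{n^2} \\
&= \frac{\pi_{12} + \pi_{21} - (\pi_{12}^2 - 2 \pi_{12} \pi_{21} + \pi_{21}^2)}{n} \\
&= \frac{\pi_{12} + \pi_{21} - (\pi_{12} - \pi_{21})^2}{n}
\end{aligned} \]

(ctd next frame)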
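\[ \mathrm{Var}(p_{1+} - p_{+1}) = \frac{\pi_{12} + \pi_{21} - (\pi_{12} - \pi_{21})^2}{n} \]

Estimate:

\[ \begin{aligned}
\widehat{\mathrm{Var}}(p_{1+} - p_{+1}) &= \frac{p_{12} + p_{21} - (p_{12} - p_{21})^2}{n} \\
&= \frac{\dfrac{n_{12}}{n} + \dfrac{n_{21}}{n} - \left(\dfrac{n_{12}}{n} - \dfrac{n_{21}}{n}\right)^2}{n} \\
&= \frac{\dfrac{n_{12} + n_{21}}{n} - \dfrac{(n_{12} - n_{21})^2}{n^2}}{n} \times \frac{n}{n} \\
&= \frac{n_{12} + n_{21} - \dfrac{(n_{12} - n_{21})^2}{n}}{n^2}
\end{aligned} \]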
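Another way:

\[ \mathrm{Var}(p_{1+} - p_{+1}) = \mathrm{Var}(p_{1+}) + \mathrm{Var}(p_{+1}) - 2\,\mathrm{Cov}(p_{1+}, p_{+1}) \]

\[ \mathrm{Var}(p_{1+}) = \frac{\pi_{1+}(1 - \pi_{1+})}{n}, \qquad \mathrm{Var}(p_{+1}) = \frac{\pi_{+1}(1 - \pi_{+1})}{n} \]

\[ \begin{aligned}
\mathrm{Cov}(p_{1+}, p_{+1}) &= \mathrm{Cov}\left(\frac{n_{1+}}{n}, \frac{n_{+1}}{n}\right) = \mathrm{Cov}\left(\frac{n_{11} + n_{12}}{n}, \frac{n_{11} + n_{21}}{n}\right) \\
&= \frac{1}{n^2}\,\mathrm{Cov}(n_{11} + n_{12},\ n_{11} + n_{21}) \\
&= \frac{1}{n^2}\big[\mathrm{Var}(n_{11}) + \mathrm{Cov}(n_{11}, n_{21}) + \mathrm{Cov}(n_{12}, n_{11}) + \mathrm{Cov}(n_{12}, n_{21})\big] \\
&= \frac{1}{n^2}\big[n \pi_{11}(1 - \pi_{11}) - n \pi_{11} \pi_{21} - n \pi_{12} \pi_{11} - n \pi_{12} \pi_{21}\big] \\
&= \frac{1}{n}\big[\pi_{11}\underbrace{(1 - \pi_{11} - \pi_{12} - \pi_{21})}_{\pi_{22}} - \pi_{12} \pi_{21}\big] \\
&= \frac{\pi_{11} \pi_{22} - \pi_{12} \pi_{21}}{n}
\end{aligned} \]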
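Both derivations can be checked numerically against the full multinomial covariance matrix. This is a sketch in base R; the cell probabilities are hypothetical values (the crossover sample proportions, used only for illustration), and `Sigma`, `a`, and `b` are names introduced here.

```r
# Hypothetical cell probabilities (pi11, pi12, pi21, pi22): the crossover
# sample proportions, used here only as illustrative values.
n    <- 86
prob <- c(12, 49, 10, 15) / n

# Multinomial covariance matrix of (n11, n12, n21, n22):
# Var(n_ij) = n*pi_ij*(1 - pi_ij), Cov(n_ij, n_i'j') = -n*pi_ij*pi_i'j'
Sigma <- n * (diag(prob) - outer(prob, prob))

a <- c(1, 1, 0, 0) / n   # weights giving p_{1+} = (n11 + n12) / n
b <- c(1, 0, 1, 0) / n   # weights giving p_{+1} = (n11 + n21) / n

# Covariance of the two marginal proportions: direct vs closed form
cov_direct  <- drop(t(a) %*% Sigma %*% b)
cov_formula <- (prob[1] * prob[4] - prob[2] * prob[3]) / n

# Variance of the difference: direct vs closed form
var_direct  <- drop(t(a - b) %*% Sigma %*% (a - b))
var_formula <- (prob[2] + prob[3] - (prob[2] - prob[3])^2) / n

all.equal(cov_direct, cov_formula)
all.equal(var_direct, var_formula)
```

Writing each proportion as a linear combination of the four counts turns the term-by-term covariance expansion into a single quadratic form, which makes the check mechanical.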
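Thus,

\[ \mathrm{Var}(p_{1+} - p_{+1}) = \frac{1}{n}\big[\pi_{1+}(1 - \pi_{1+}) + \pi_{+1}(1 - \pi_{+1}) - 2(\pi_{11} \pi_{22} - \pi_{12} \pi_{21})\big] \]

Often matched pairs exhibit positive association (odds ratio greater than 1), i.e., $\pi_{11} \pi_{22} > \pi_{12} \pi_{21}$. The covariance is then positive, so the covariance term $-2(\pi_{11} \pi_{22} - \pi_{12} \pi_{21})$ is negative. Compare to two independent samples of size $n$ each, where there is no covariance term: with positive association, the matched-pairs variance is smaller.

Continuing,

\[ \widehat{\mathrm{Var}}(p_{1+} - p_{+1}) = \frac{1}{n}\big[p_{1+}(1 - p_{1+}) + p_{+1}(1 - p_{+1}) - 2(p_{11} p_{22} - p_{12} p_{21})\big] \]

After algebra, this simplifies to the expression given before.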
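One way to carry out that algebra is sketched here. The shorthand $a = p_{11}$, $b = p_{12}$, $c = p_{21}$, $d = p_{22}$ (with $a + b + c + d = 1$) is introduced only for this aside and is not the slides' notation.

```latex
\begin{aligned}
&p_{1+}(1 - p_{1+}) + p_{+1}(1 - p_{+1}) - 2(p_{11} p_{22} - p_{12} p_{21}) \\
&\quad = (a + b)(c + d) + (a + c)(b + d) - 2ad + 2bc \\
&\quad = ab + ac + bd + cd + 4bc, \\[4pt]
&p_{12} + p_{21} - (p_{12} - p_{21})^2 \\
&\quad = b(a + b + c + d) + c(a + b + c + d) - b^2 - c^2 + 2bc \\
&\quad = ab + ac + bd + cd + 4bc.
\end{aligned}
```

Both expressions reduce to the same polynomial, so dividing by $n$ gives the same estimated variance either way.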
> crossover <- matrix(c(12,10,49,15), nrow=2, dimnames=list(Drug=c("S","F"), Placebo=c("S","F")))
> crossover <- as.table(crossover)
> crossover
    Placebo
Drug  S  F
   S 12 49
   F 10 15
> mcnemar.test(crossover, correct = FALSE)

        McNemar's Chi-squared test

data:  crossover
McNemar's chi-squared = 25.78, df = 1, p-value = 3.827e-07
8.5 Rater Agreement

Example (Movie Reviews by Siskel and Ebert)

                Ebert
Siskel     Con   Mixed   Pro   Total
Con         24       8    13      45
Mixed        8      13    11      32
Pro         10       9    64      83
Total       42      30    88     160

How strong is their agreement?
8.5.5 Cohen’s Kappa

Let $\pi_{ij} = \Pr(S = i, E = j)$.

\[ \Pr(\text{agree}) = \pi_{11} + \pi_{22} + \pi_{33} = \sum_i \pi_{ii} \qquad (= 1 \text{ if perfect agreement}) \]

If ratings are independent, then $\pi_{ii} = \pi_{i+} \pi_{+i}$ and

\[ \Pr(\text{agree} \mid \text{indep}) = \sum_i \pi_{i+} \pi_{+i} \]

Cohen’s kappa is

\[ \kappa = \frac{\Pr(\text{agree}) - \Pr(\text{agree} \mid \text{indep})}{1 - \Pr(\text{agree} \mid \text{indep})} = \frac{\sum_i \pi_{ii} - \sum_i \pi_{i+} \pi_{+i}}{1 - \sum_i \pi_{i+} \pi_{+i}} \]
Note:
- $\kappa = 0$ if agreement only equals that expected under independence.
- $\kappa = 1$ if perfect agreement.
- Denominator = maximum difference for numerator, attained if agreement is perfect.
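Example (Siskel and Ebert (ctd))

\[ \sum_i \hat{\pi}_{ii} = \frac{24 + 13 + 64}{160} = 0.63 \]

\[ \sum_i \hat{\pi}_{i+} \hat{\pi}_{+i} = \left(\frac{45}{160}\right)\left(\frac{42}{160}\right) + \left(\frac{32}{160}\right)\left(\frac{30}{160}\right) + \left(\frac{83}{160}\right)\left(\frac{88}{160}\right) = 0.40 \]

\[ \hat{\kappa} = \frac{0.63 - 0.40}{1 - 0.40} = 0.39 \quad \text{(computed from unrounded values)} \]

Moderate agreement: the difference between observed agreement and agreement expected under independence is about 40% of the maximum possible difference.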
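The point estimate can be reproduced from the table of counts. This is a minimal sketch in base R that does not rely on any package; the object name `reviews` is illustrative.

```r
# Siskel (rows) by Ebert (columns) ratings
reviews <- matrix(c(24,  8, 13,
                     8, 13, 11,
                    10,  9, 64),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(Siskel = c("Con", "Mixed", "Pro"),
                                  Ebert  = c("Con", "Mixed", "Pro")))

# Sample cell proportions
p <- reviews / sum(reviews)

# Observed agreement: sum of the diagonal proportions
p_agree <- sum(diag(p))

# Agreement expected under independence: sum of products of the margins
p_indep <- sum(rowSums(p) * colSums(p))

# Cohen's kappa
kappa_hat <- (p_agree - p_indep) / (1 - p_indep)

round(c(agree = p_agree, indep = p_indep, kappa = kappa_hat), 3)
```

Working from unrounded proportions matters here: plugging the rounded 0.63 and 0.40 into the formula gives about 0.38 rather than 0.39.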
- 95% CI for $\kappa$:
  \[ \hat{\kappa} \pm 1.96\, SE = 0.39 \pm (1.96)(0.06) = 0.39 \pm 0.12 = (0.27, 0.51) \]
- For $H_0: \kappa = 0$,
  \[ z = \frac{\hat{\kappa}}{SE} = \frac{0.39}{0.06} = 6.49 \]
  Very strong evidence that agreement is better than “chance”.
- A very simple cohens.kappa() is in the icda package. More sophisticated versions can be found in several packages on CRAN (e.g., irr, concord, and psy).
> data(moviereviews)
> moviereviews
       Ebert
Siskel  Con Mixed Pro
  Con    24     8  13
  Mixed   8    13  11
  Pro    10     9  64
> cohens.kappa(moviereviews)
$kappa
[1] 0.38884

$SE
[1] 0.059917
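Given the estimate and standard error printed above, the interval and test statistic follow directly. This sketch takes the two reported numbers as inputs rather than recomputing the SE, since the slides do not state its formula.

```r
# Values reported by cohens.kappa(moviereviews) above
kappa_hat <- 0.38884
se        <- 0.059917

# 95% Wald confidence interval for kappa
ci <- kappa_hat + c(-1, 1) * qnorm(0.975) * se

# z statistic for H0: kappa = 0
z <- kappa_hat / se

round(c(lower = ci[1], upper = ci[2], z = z), 2)
```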