On calculation of Blaker’s binomial confidence limits Jan Klaschka klaschka@cs.cas.cz Inst. of Computer Science, Academy of Sciences, Prague, Czech Republic COMPSTAT 2010, Paris, August 22-27, 2010
Summary • Blaker (2000) proposed - a new type of binomial confidence limts - a numerical algorithm for their calculation • Theoretical work good, but numerics a bit careless • My contribution: A better algorithm
Outline • Blaker’s confidence interval – what is it (recap) • Original Blaker’s algortithm . . . and what is wrong with it • Remedy – new algorithm • Concluding remarks
Blaker’s confidence interval • Task: Exact (conservative) two-sided confidence interval (CI) for binomial parameter p • Clopper-Pearson (1934): Both probabilities P (whole CI below true p ) ≤ α/ 2, P (whole CI above true p ) ≤ α/ 2 controlled • Les conservative exact alternatives: P (. . . below . . . ) + P (. . . above . . . ) ≤ α only controlled
Blaker’s confidence interval • Proposals of alternatives: - Sterne, Crow (1954, 1956) - Blyth, Still, Casella (1983, 1986) - Blaker (2000) • Virtues of Blaker’s CI: - Blaker’s CI ⊆ Clopper-Pearson CI - Monotonicity w.r.t. α - Easy calculation - short R program in Blaker (2000)
Blaker’s confidence interval • Based on confidence function β ( · ) defined in terms of binomial tail probabilities (details: Blaker (2000)) Blaker’s confidence curve n = 10, k= 4 1.0 0.8 0.6 confidence 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 1.0 p
Blaker’s confidence interval • Based on confidence function β ( · ) defined in terms of binomial tail probabilities (details: Blaker (2000)) Blaker’s confidence curve n = 10, k= 4 1.0 0.8 0.6 confidence 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 1.0 p
Blaker’s confidence interval • Roughly: (1 − α ) confidence interval is where β ( p ) ≥ α Blaker’s confidence curve n = 10, k= 4 1.0 0.8 0.6 confidence 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 1.0 p • Calculation of confidence limits: • Numerical search for points where β ( · ) crosses α level
Blaker’s confidence interval • More precisely: • C = { p ; β ( p ) ≥ α } often not an interval • Blaker’s interval defined as conv ( C ) (convex hull) • Calculation of confidence limits: Numerical search for the leftmost and rightmost points where β ( · ) crosses α level
Original Blaker’s algorithm • Short R program in Blaker (2000), correction Blaker (2001) (to be found at A. Agresti’s web, too) • Calculation of p L (lower confidence limit): 1 Start in Clopper-Pearson limit p CP L 2 Iterate p := p + ∆ p (fixed step ∆ p > 0, default 10 − 4 ) while β ( p ) < α 3 Once β ( p ) ≥ α , set p L := p − ∆ p , finish • Calculation of p U (upper confidence limit) analogous
Original Blaker’s algorithm • What is wrong? • Constant step → drastic slowdown when higher accuracy required • Algorithm may skip short intervals and fail
Original Blaker’s algorithm • Example of failure: Original Blaker’s algorithm, step = 1e−04 n = 134, k = 131, alpha = 0.05 1.0 0.8 0.6 confidence 0.4 0.2 0.0 0.90 0.92 0.94 0.96 0.98 1.00 p
Original Blaker’s algorithm • Example of failure: Original Blaker’s algorithm, step = 1e−04 n = 134, k = 131, alpha = 0.05 1.0 0.8 0.6 confidence 0.4 0.2 0.0 0.90 0.92 0.94 0.96 0.98 1.00 p
Original Blaker’s algorithm • Example of failure: Original Blaker’s algorithm, step = 1e−04 n = 134, k = 131, alpha = 0.05 0.10 0.09 0.08 0.07 confidence 0.06 0.05 0.04 0.03 0.934 0.936 0.938 0.940 0.942 p
Original Blaker’s algorithm • Example of failure: Original Blaker’s algorithm, step = 1e−04 n = 134, k = 131, alpha = 0.05 0.10 0.09 0.08 0.07 confidence 0.06 0.05 0.04 0.03 0.934 0.936 0.938 0.940 0.942 p
Original Blaker’s algorithm • Example of failure: Original Blaker’s algorithm, step = 1e−04 n = 134, k = 131, alpha = 0.05 0.0502 0.0501 0.0500 confidence 0.0499 0.0498 0.0497 0.9360 0.9365 0.9370 0.9375 0.9380 0.9385 p
Original Blaker’s algorithm • Example of failure: Original Blaker’s algorithm, step = 1e−04 n = 134, k = 131, alpha = 0.05 0.0502 0.0501 0.0500 confidence 0.0499 0.0498 0.0497 0.9360 0.9365 0.9370 0.9375 0.9380 0.9385 p
Original Blaker’s algorithm • Example of failure: Original Blaker’s algorithm, step = 1e−04 n = 134, k = 131, alpha = 0.05 0.05004 0.05003 0.05002 confidence 0.05001 0.05000 0.04999 0.93600 0.93605 0.93610 0.93615 0.93620 p
Original Blaker’s algorithm • Statistics of coverage deficits – n = 1 , . . . , 1000: Original Blaker’s algorithm − coverage deficit step = 1e−4, alpha = 0.05 −4 log(deficit) −5 −6 −7 0 200 400 600 800 1000 n
Original Blaker’s algorithm • Statistics of coverage deficits – n = 1 , . . . , 1000: Original Blaker’s algorithm − coverage deficit step = 1e−5, alpha = 0.05 −4 log(deficit) −5 −6 −7 0 200 400 600 800 1000 n
Original Blaker’s algorithm • Statistics of coverage deficits – n = 1 , . . . , 1000: Original Blaker’s algorithm − coverage deficit step = 1e−6, alpha = 0.05 −4 log(deficit) −5 −6 −7 0 200 400 600 800 1000 n
New algorithm – lemmas • p CP . . . Clopper-Pearson lower limit L p ∗ . . . first discontinuity point of β ( · ) to the right • Lemma 1: p CP ≤ p L ≤ p ∗ L • Lemma 2: β ( · ) crosses α level only once on [ p CP L , p ∗ ] • Analogously for p U
New algorithm – description • Search for the lower confidence limit p L : (For p U analogously) 1 Modify β ( · ): � β ( p ) p < p ∗ β ∗ ( p ) = + ∞ p ≥ p ∗ Remark: Modification is computationally easy 2 Apply interval halving to β ∗ ( · ) between p CP and 1 L Remark: With unmodified β ( · ), halving safe only on [ p CP L , p ∗ ], naive “global” use fails
Concluding remarks • New algorithm is fast and accurate • Implementation: ∼ 50 lines of R code • R package to come (hopefully) soon • Slightly extended version: www.cs.cas.cz/~klaschka/c10/417_ext.pdf
Thank you for your attention
Recommend
More recommend