Quantile Estimation

Peter J. Haas
CS 590M: Simulation
Spring Semester 2020

Outline
◮ Definition and Examples
◮ Point Estimates
◮ Confidence Intervals
◮ Further Comments
◮ Checking Normality
◮ Bootstrap Confidence Intervals

Quantiles

[Figure: pdf f_X(x) with the quantile q_{0.01} marked, so that 1% of the probability mass lies to its left and 99% to its right]

Example: Value-at-Risk
◮ X = return on an investment; we want to measure downside risk
◮ q = return such that P(worse return than q) ≤ 0.01
◮ q is called the 0.01-quantile of X
◮ A "probabilistic worst-case scenario"

Quantile Definition

Definition of p-quantile q_p:

    q_p = F_X^{-1}(p)   (for 0 < p < 1)

◮ When F_X is continuous and increasing: solve F_X(q) = p
◮ In general: use our generalized definition of F^{-1} (as in the inversion method)

Alternative definition of p-quantile q_p:

    q_p = min{ q : F_X(q) ≥ p }
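To make the generalized inverse concrete, here is a minimal Python sketch (the function name and the fair-die example are ours, not from the slides) that evaluates q_p = min{ q : F_X(q) ≥ p } for a distribution with finitely many support points:

```python
import numpy as np

def quantile_from_cdf(support, cdf_values, p):
    """Generalized inverse q_p = min{q : F_X(q) >= p} for a discrete
    distribution whose cdf jumps at the given (sorted) support points."""
    # index of the first support point whose cumulative probability reaches p
    idx = np.searchsorted(cdf_values, p, side="left")
    return support[idx]

# Fair die: F_X jumps by 1/6 at each of 1, ..., 6
support = np.arange(1, 7)
cdf = support / 6.0                            # F_X(j) = j/6
print(quantile_from_cdf(support, cdf, 0.5))    # 3, since F_X(3) = 0.5 >= 0.5
```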
Example: Robust Statistics

Median
◮ Median = q_{0.5}
◮ Alternative to the mean as a measure of central tendency
◮ Robust to outliers

Inter-quartile range (IQR)
◮ IQR = q_{0.75} − q_{0.25}
◮ Robust measure of dispersion

Point Estimate of Quantile

◮ Given i.i.d. observations X_1, ..., X_n ~ F
◮ Natural choice is the pth sample quantile:

    Q_n = F̂_n^{-1}(p)

◮ I.e., the generalized inverse of the empirical cdf F̂_n
◮ Q: Can you ever use the simple (non-generalized) inverse here?
◮ Equivalently, sort the data as X_(1) ≤ X_(2) ≤ ··· ≤ X_(n) and set

    Q_n = X_(j), where j = ⌈np⌉

◮ Ex: q_{0.5} for {6, 8, 4, 2}: the sorted data are 2, 4, 6, 8 and j = ⌈4 × 0.5⌉ = 2, so Q_n = X_(2) = 4
◮ Other definitions are possible (e.g., interpolating between adjacent order statistics), but we will stick with the above definition (implemented in the sketch below)
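A minimal sketch of this point estimator (the function name is ours; the later examples reuse it):

```python
import math
import numpy as np

def sample_quantile(x, p):
    """p-th sample quantile Q_n = X_(j) with j = ceil(n*p): the
    generalized inverse of the empirical cdf, no interpolation."""
    x_sorted = np.sort(np.asarray(x))
    j = math.ceil(len(x_sorted) * p)   # 1-based rank of the order statistic
    return x_sorted[j - 1]             # convert to 0-based indexing

print(sample_quantile([6, 8, 4, 2], 0.5))   # sorted: 2, 4, 6, 8; j = 2, so Q_n = 4
```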
Confidence Interval Attempt #1: Direct Use of CLT

CLT for Quantiles (Bahadur Representation)
Suppose that X_1, ..., X_n are i.i.d. with pdf f_X. Then for large n,

    Q_n ≈ N(q_p, σ²/n)  in distribution, with  σ = √(p(1 − p)) / f_X(q_p)

Can derive via the Delta Method for stochastic root-finding:
◮ Recall: to find θ̄ such that E[g(X, θ̄)] = 0
◮ Point estimate θ_n solves (1/n) Σ_{i=1}^n g(X_i, θ_n) = 0
◮ For large n, we have θ_n ≈ N(θ̄, σ²/n), where σ² = Var[g(X, θ̄)] / c² with c = E[∂g(X, θ̄)/∂θ]
◮ For quantile estimation, take g(X, θ) = I(X ≤ θ) − p
◮ Then θ̄ = q_p and θ_n = Q_n, since E[g(X, θ̄)] = P(X ≤ θ̄) − p = 0
◮ c = E[∂g(X, θ̄)/∂θ] = ∂E[g(X, θ̄)]/∂θ = ∂(F_X(θ̄) − p)/∂θ = f_X(θ̄)
◮ Var[g(X, θ̄)] = E[g(X, θ̄)²] (since E[g(X, θ̄)] = 0) = E[I² − 2pI + p²] = E[I − 2pI + p²] (since I² = I) = p − 2p² + p² = p(1 − p)

◮ So if we can find an estimator s_n of σ, then a 100(1 − δ)% CI is

    [Q_n − z_δ s_n/√n, Q_n + z_δ s_n/√n]

◮ Problem: estimating a pdf f_X is hard (e.g., one must choose a "bandwidth" for a "kernel density estimator")
◮ So we want to avoid estimating σ

Confidence Interval Attempt #2: Sectioning

◮ Assume that n = mk and divide X_1, ..., X_n into m sections of k observations each
◮ m is small (around 10–20) and k is large
◮ Let Q_n(i) be the estimator of q_p based on the data in the ith section
◮ Observe that Q_n(1), ..., Q_n(m) are i.i.d.
◮ By the prior CLT, each Q_n(i) is approximately distributed as N(q_p, σ²/k)
◮ For i.i.d. normals, the standard 100(1 − δ)% CI for the mean is

    [Q̄_n − t_{m−1,δ} √(v_n/m), Q̄_n + t_{m−1,δ} √(v_n/m)]

◮ Q̄_n = (1/m) Σ_{i=1}^m Q_n(i)
◮ v_n = (1/(m−1)) Σ_{i=1}^m (Q_n(i) − Q̄_n)²
◮ t_{m−1,δ} is the 1 − (δ/2) quantile of the Student-t distribution with m − 1 degrees of freedom
◮ (A sketch of this procedure follows the bias discussion below)

Sectioning: So What's the Problem?

◮ Can show, as with nonlinear functions of means, that

    E[Q_n] ≈ q_p + b/n + c/n²

◮ It follows that

    E[Q_n(i)] ≈ q_p + b/k + c/k² = q_p + mb/n + m²c/n²

◮ So

    E[Q̄_n] ≈ q_p + mb/n + m²c/n²

◮ The bias of Q̄_n is roughly m times larger than the bias of Q_n!
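A sketch of the sectioning CI, reusing sample_quantile from above (our illustration, assuming n is divisible by m; not the course's reference code):

```python
import numpy as np
from scipy import stats

def sectioning_ci(x, p, m, delta=0.05):
    """Sectioning CI for q_p: split the n = m*k observations into m
    sections, estimate the quantile within each section, then apply the
    usual t-based CI to the m (approximately normal) section estimates."""
    sections = np.asarray(x).reshape(m, -1)      # m rows of k = n/m observations
    q = np.array([sample_quantile(s, p) for s in sections])
    q_bar = q.mean()                             # sectioned point estimate
    v_n = q.var(ddof=1)                          # sample variance of section estimates
    t = stats.t.ppf(1 - delta / 2, df=m - 1)     # t_{m-1,delta}
    half_width = t * np.sqrt(v_n / m)
    return q_bar - half_width, q_bar + half_width
```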
Attempt #3: Sectioning + Jackknifing

Sectioning + Jackknifing: General Algorithm for a Statistic α
1. Generate n = mk i.i.d. observations X_1, ..., X_n
2. Divide the observations into m sections, each of size k
3. Compute the point estimator α_n based on all observations
4. For i = 1, 2, ..., m:
   4.1 Compute the estimator α̃_n(i) using all observations except those in section i
   4.2 Form the pseudovalue α_n(i) = m α_n − (m − 1) α̃_n(i)
5. Compute the point estimator: α_n^J = (1/m) Σ_{i=1}^m α_n(i)
6. Set v_n^J = (1/(m−1)) Σ_{i=1}^m (α_n(i) − α_n^J)²
7. Compute the 100(1 − δ)% CI:

    [α_n^J − t_{m−1,δ} √(v_n^J/m), α_n^J + t_{m−1,δ} √(v_n^J/m)]

◮ This general procedure is also called the "delete-k jackknife" (see the sketch at the end of this section)

Application to Quantile Estimation
◮ Q̃_n(i) = quantile estimate ignoring section i
◮ Clearly, Q̃_n(i) has the same distribution as Q_{(m−1)k}, so

    E[Q̃_n(i)] ≈ q_p + b/((m−1)k) + c/((m−1)²k²)

◮ It follows that, for the pseudovalue Q_n(i) = m Q_n − (m − 1) Q̃_n(i),

    E[Q_n(i)] = E[m Q_n − (m − 1) Q̃_n(i)] ≈ q_p − c/((m−1)mk²)

◮ Averaging does not affect the bias, so, since n = mk,

    E[Q_n^J] = q_p + O(1/n²)

Further Comments

A confession
◮ There exist special-purpose methods for quantile estimation [Sections 2.6.1 and 2.6.3 in the Serfling book]
◮ We focus on sectioning + jackknifing because the method is general
◮ Can also use the bias-elimination method from the prior lecture

Conditioning the data for q_p when p ≈ 1
◮ Fix r > 1 and get n = rmk i.i.d. observations X_1, ..., X_n
◮ Divide the data into blocks of size r
◮ Set Y_j = maximum value in the jth block, for 1 ≤ j ≤ mk
◮ Run the quantile estimation procedure on Y_1, ..., Y_mk
◮ Key observation: F_Y(q_p) = [F_X(q_p)]^r = p^r
◮ So the p-quantile for X equals the p^r-quantile for Y
◮ Ex: if r = 50, then q_{0.99} for X equals q_{0.61} for Y (since 0.99^50 ≈ 0.61)
◮ Often, the reduction in sample size outweighs the cost of the extra runs
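A sketch of the delete-k jackknife specialized to quantiles, following Steps 1–7 above with α_n taken to be the sample p-quantile (again reusing sample_quantile; assumes n = mk exactly):

```python
import numpy as np
from scipy import stats

def jackknife_quantile_ci(x, p, m, delta=0.05):
    """Sectioning + jackknifing CI for q_p (delete-k jackknife)."""
    x = np.asarray(x)
    n = len(x)
    k = n // m                                        # section size (assumes n = m*k)
    alpha_n = sample_quantile(x, p)                   # estimator from all n observations
    pseudo = np.empty(m)
    for i in range(m):
        keep = np.ones(n, dtype=bool)
        keep[i * k:(i + 1) * k] = False               # drop section i
        alpha_tilde = sample_quantile(x[keep], p)     # delete-k estimate
        pseudo[i] = m * alpha_n - (m - 1) * alpha_tilde   # pseudovalue
    alpha_J = pseudo.mean()                           # bias-reduced point estimate
    v_J = pseudo.var(ddof=1)
    t = stats.t.ppf(1 - delta / 2, df=m - 1)
    half_width = t * np.sqrt(v_J / m)
    return alpha_J, (alpha_J - half_width, alpha_J + half_width)
```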
Checking Normality

Undercoverage
◮ E.g., when a "95% confidence interval" for the mean only brackets the mean 70% of the time
◮ Due to failure of the CLT at finite sample sizes
◮ Note: if the data are truly normal, then there is no error in the CI for the mean

Simple diagnostics
◮ Skewness (measures symmetry; equals 0 for a normal)
  ◮ Definition: skewness(X) = E[(X − E(X))³] / (Var X)^{3/2}
  ◮ Estimator: [n^{−1} Σ_{i=1}^n (X_i − X̄_n)³] / [n^{−1} Σ_{i=1}^n (X_i − X̄_n)²]^{3/2}
◮ Kurtosis (measures fatness of tails; equals 0 for a normal)
  ◮ Definition: kurtosis(X) = E[(X − E(X))⁴] / (Var X)² − 3
  ◮ Estimator: [n^{−1} Σ_{i=1}^n (X_i − X̄_n)⁴] / [n^{−1} Σ_{i=1}^n (X_i − X̄_n)²]² − 3

Bootstrap Confidence Intervals

The general method works for quantiles (no normality assumptions needed).

Bootstrap Confidence Intervals (Pivot Method)
1. Run the simulation n times to get D = {X_1, ..., X_n}
2. Compute Q_n = sample quantile based on D
3. Compute a bootstrap sample D* = {X*_1, ..., X*_n}
4. Set Q*_n = sample quantile based on D*
5. Set the pivot π* = Q*_n − Q_n
6. Repeat Steps 3–5 B times to create π*_1, ..., π*_B
7. Sort the pivots to obtain π*_(1) ≤ π*_(2) ≤ ··· ≤ π*_(B)
8. Set l = ⌈(1 − δ/2)B⌉ and u = ⌈(δ/2)B⌉
9. Return the 100(1 − δ)% CI [Q_n − π*_(l), Q_n − π*_(u)]
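A sketch of the pivot method for quantiles (reusing sample_quantile; the random seed and default B are arbitrary choices of ours):

```python
import math
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_pivot_ci(x, p, B=1000, delta=0.05):
    """Bootstrap pivot-method CI for q_p (Steps 1-9 above)."""
    x = np.asarray(x)
    n = len(x)
    q_n = sample_quantile(x, p)
    # Steps 3-6: B bootstrap resamples of size n, drawn with replacement
    pivots = np.sort([
        sample_quantile(rng.choice(x, size=n, replace=True), p) - q_n
        for _ in range(B)
    ])
    l = math.ceil((1 - delta / 2) * B)               # rank of upper-tail pivot
    u = math.ceil((delta / 2) * B)                   # rank of lower-tail pivot
    return q_n - pivots[l - 1], q_n - pivots[u - 1]  # [Q_n - pi*_(l), Q_n - pi*_(u)]
```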