Comparing Frequentist and Bayesian Fixed-Confidence Guarantees for Selection-of-the-Best Problems

David Eckman (dje88@cornell.edu) and Shane Henderson (sgh9@cornell.edu)
Cornell University, ORIE

INFORMS Annual Meeting, November 4, 2018
Selection of the Best

Selecting from among a finite number of simulated alternatives.
• Optimize a scalar performance measure.
• An alternative's (mean) performance is observed with error.

Example: positioning ambulance bases in a city.
• Minimize the expected call response time.

Multiple alternatives → ranking and selection (R&S) and exploratory multi-armed bandits (MAB).
Two alternatives → A/B testing.

(Outline: Intro | Bayesian vs. Frequentist Guarantees | Example | Conclusions)
Setup

For each alternative i = 1, …, k, the observations X_{i1}, X_{i2}, … are i.i.d. from a distribution F_i with mean μ_i. Assume that observations across alternatives are independent.

The vector μ = (μ_1, …, μ_k) represents the (unknown) problem instance.
• Assume that larger μ_i is better.
Selection Events

Let K be the index of the selected alternative.
• Correct Selection: "Select one of the best alternatives."
  CS := { μ_K = max_{1 ≤ i ≤ k} μ_i }.
• Good Selection: "Select a δ-good alternative."
  GS := { μ_K > max_{1 ≤ i ≤ k} μ_i − δ }.

Fixed-confidence guarantee: P(CS) ≥ 1 − α (or P(GS) ≥ 1 − α).
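As a small illustration (function names are hypothetical, and μ is treated as known, which it never is in practice), the two selection events are easy to state in code:

```python
def correct_selection(mu, K):
    """CS: the selected alternative attains the maximum mean."""
    return mu[K] == max(mu)

def good_selection(mu, K, delta):
    """GS: the selected alternative's mean is within delta of the best."""
    return mu[K] > max(mu) - delta

mu = [0.0, 0.3, 0.5]
print(correct_selection(mu, 2))     # selecting the true best is a correct selection
print(good_selection(mu, 1, 0.25))  # 0.3 > 0.5 - 0.25, so alternative 1 is delta-good
```

Note that every correct selection is also a good selection for any δ > 0, but not vice versa.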
Frequentist and Bayesian Frameworks

Different perspectives on what is random and what is fixed.

Frequentist PCS (PGS) = the probability that the random alternative chosen by the procedure is correct (good) for the fixed problem instance.

Bayesian PCS (PGS) = the posterior probability that, given the observed data, the random problem instance is one for which the fixed alternative chosen by the procedure is correct (good).

"How do these guarantees differ on a practical level?"
Design for Frequentist Guarantees

Design the procedure to satisfy the guarantee for the least favorable configuration (LFC), i.e., the hardest problem instance.

The LFC is often the so-called slippage configuration (SC).
• Fix a best alternative j and set μ_i = μ_j − δ for all i ≠ j.

Frequentist procedures are conservative.
• They often overdeliver on PCS/PGS.
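A minimal sketch of constructing the slippage configuration (the function name is illustrative):

```python
def slippage_configuration(k, j, mu_best, delta):
    """Slippage configuration: alternative j has mean mu_best, and every
    other alternative sits exactly delta below it."""
    return [mu_best if i == j else mu_best - delta for i in range(k)]

print(slippage_configuration(4, 0, 1.0, 0.25))  # [1.0, 0.75, 0.75, 0.75]
```

In the SC, every inferior alternative is as close to the best as the indifference-zone parameter δ allows, which is what makes it a candidate for the hardest instance.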
Design for Bayesian Guarantees

By the Stopping Rule Principle, it is valid to stop and select an alternative whenever its posterior PCS/PGS exceeds 1 − α.
• Use posterior PCS/PGS as a stopping rule for procedures.
• E.g., VIP, OCBA, and top-two Thompson sampling (TTTS).

Advantages:
• Can repeatedly compute posterior PCS/PGS without sacrificing statistical validity.
• Complete flexibility in allocating simulation runs across alternatives.
Interpreting Bayesian Guarantees

A Bayesian guarantee will NOT deliver a frequentist guarantee that PCS/PGS exceeds 1 − α for all problem instances. Its guarantee can still be interpreted in a frequentist sense:

1. Draw μ from the prior distribution.
2. Run the Bayesian procedure (with the stopping rule) on μ.

For repeated runs of Steps 1 and 2, the procedure will make a correct (good) selection with probability 1 − α.
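Under toy assumptions (two alternatives, known variance, a proper conjugate normal prior that the procedure also uses for its posterior, and a posterior-PGS stopping rule), Steps 1 and 2 can be sketched as a Monte Carlo check; all names and parameter values here are illustrative:

```python
import math
import random

def posterior_pgs(sums, n, sigma2, tau2, delta):
    """Posterior P(good selection of the current leader) for two alternatives
    with independent N(0, tau2) priors, known variance sigma2, n obs each."""
    v = 1.0 / (n / sigma2 + 1.0 / tau2)      # common posterior variance
    m = [v * s / sigma2 for s in sums]        # posterior means
    gap = abs(m[0] - m[1])                    # gap in favor of the leader
    # P(mu_leader - mu_other > -delta): a normal tail probability
    return 0.5 * (1.0 + math.erf((gap + delta) / (2.0 * math.sqrt(v))))

def run_once(rng, sigma=1.0, tau=1.0, delta=0.5, alpha=0.25, n_max=200):
    mu = [rng.gauss(0.0, tau) for _ in range(2)]   # Step 1: draw mu from the prior
    sums, n = [0.0, 0.0], 0
    while n < n_max:                               # Step 2: sample until PGS >= 1 - alpha
        sums = [s + rng.gauss(m, sigma) for s, m in zip(sums, mu)]
        n += 1
        if posterior_pgs(sums, n, sigma**2, tau**2, delta) >= 1.0 - alpha:
            break
    K = 0 if sums[0] >= sums[1] else 1             # select the leader
    return mu[K] > max(mu) - delta                 # was it a good selection?

rng = random.Random(1)
reps = 2000
rate = sum(run_once(rng) for _ in range(reps)) / reps
print(rate)  # averaged over prior draws, roughly 1 - alpha = 0.75 or above
```

The key point of the sketch: the empirical good-selection rate is computed by averaging over fresh prior draws of μ, not by fixing one hard instance, which is exactly the frequentist reading of the Bayesian guarantee on this slide.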
Bayesian PCS/PGS for Two Alternatives

Example: X_{1j} ~ N(μ_1, σ²) and X_{2j} ~ N(μ_2, σ²) for j = 1, …, n, with σ² known, a noninformative prior on μ_1 and μ_2, and independent beliefs.
Continuation Regions

Stop if |n(X̄_1 − X̄_2)| ≥ √(2n) σ Φ⁻¹(1 − α) − δn.

[Figures: continuation regions for the rules "posterior PCS = 1 − α" and "posterior PGS = 1 − α".]
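As a sanity check (a stdlib-only sketch of the noninformative-prior setup above; `phi_inv` and the other names are hypothetical helpers), the stopping inequality on this slide is algebraically the same as the condition "posterior PGS ≥ 1 − α":

```python
import math

def phi_inv(p):
    """Standard normal quantile via bisection on the CDF (stdlib only)."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 0.5 * (1.0 + math.erf(mid / math.sqrt(2.0))) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def posterior_pgs_noninf(xbar1, xbar2, n, sigma, delta):
    """Posterior PGS under the noninformative prior:
    mu_1 - mu_2 | data ~ N(xbar1 - xbar2, 2 sigma^2 / n)."""
    gap = abs(xbar1 - xbar2)
    return 0.5 * (1.0 + math.erf((gap + delta) * math.sqrt(n) / (2.0 * sigma)))

def stop(xbar1, xbar2, n, sigma, delta, alpha):
    """The slide's continuation-region form of posterior PGS >= 1 - alpha."""
    lhs = abs(n * (xbar1 - xbar2))
    rhs = math.sqrt(2.0 * n) * sigma * phi_inv(1.0 - alpha) - delta * n
    return lhs >= rhs

# the two criteria agree across a range of observed gaps:
sigma, delta, alpha, n = 1.0, 0.1, 0.05, 25
for gap in [0.0, 0.2, 0.4, 0.6]:
    assert (posterior_pgs_noninf(gap, 0.0, n, sigma, delta) >= 1.0 - alpha) \
        == stop(gap, 0.0, n, sigma, delta, alpha)
```

Setting δ = 0 in `stop` recovers the correct-selection (PCS) continuation region; with δ > 0, the right-hand side eventually becomes negative, so the PGS rule is guaranteed to stop.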
Experimental Results: Noninformative Prior

[Figure: empirical PGS vs. the true difference in means (μ_1 − μ_2), for δ = 0, 0.05, 0.10, and 0.25, with the nominal level 1 − α marked.]
Experimental Results: N(0, 2σ²) Prior

[Figure: empirical PGS vs. the true difference in means (μ_1 − μ_2), for δ = 0, 0.05, 0.10, and 0.25, with the nominal level 1 − α marked.]
Findings

1. For hard problem instances, procedures with Bayesian guarantees underdeliver on empirical PCS/PGS.
   • More pronounced for good selection.
2. Hard problems look easier because of a "means-spreading" phenomenon.
   • Similar issues arise in predicting the runtime of a procedure.
Practical Implications

A decision-maker's preference may depend on the situation:
1. A one-time, critical decision.
2. Repeated problem instances (i.e., using R&S for control).
3. R&S after search, where the problem instance is random.

What if the prior is wrong?
Ongoing Work

Improving the computational efficiency of Bayesian procedures:
• Calculating or estimating posterior PCS/PGS.
• Checking whether the stopping condition has been met.

Frequentist and Bayesian guarantees for subset selection.
Acknowledgments

This material is based upon work supported by the Army Research Office under grant W911NF-17-1-0094 and by the National Science Foundation under grants DGE-1650441 and CMMI-1537394. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
References

• James O. Berger (1993). Statistical Decision Theory and Bayesian Analysis.
• Koichiro Inoue and Stephen E. Chick (1998). Comparison of Bayesian and frequentist assessments of uncertainty for selecting the best system. In Proceedings of the 1998 Winter Simulation Conference, 727–734.
• Jürgen Branke, Stephen E. Chick, and Christian Schmidt (2007). Selecting a selection procedure. Management Science 53(12), 1916–1932.
• Adam N. Sanborn and Thomas T. Hills (2014). The frequentist implications of optional stopping on Bayesian hypothesis tests. Psychonomic Bulletin & Review 21(2), 283–300.
• Alex Deng, Jiannan Lu, and Shouyuan Chen (2016). Continuous monitoring of A/B tests without pain: Optional stopping in Bayesian testing. In Data Science and Advanced Analytics.
• Daniel Russo (2016). Simple Bayesian algorithms for best-arm identification. arXiv:1602.08448.
• Sijia Ma and Shane G. Henderson (2018). Predicting the simulation budget in ranking and selection procedures. Submitted.