A New Method for Tackling Limited Monte Carlo

  1. A New Method for Tackling Limited Monte Carlo. Carlos Argüelles, Austin Schneider, Tianlu Yuan

  2. Analysis Scenario: Binned data, Poisson likelihood, and simulation

  3. Analysis Scenario. Three requirements: 1. binned data counts, 2. independent rare processes, 3. modelled by simulation. Applies to much of particle physics and astrophysics.

  4. Analysis Scenario. Three requirements: 1. binned data counts, 2. independent rare processes, 3. modelled by simulation. Applies to much of particle physics and astrophysics. Example: Phys. Rev. Lett. 121, 221801.

  5. Analysis Scenario. Three requirements: 1. binned data counts, 2. independent rare processes, 3. modelled by simulation. Applies to much of particle physics and astrophysics. "It is well known that the count of independent, rare natural processes can be described by the Poisson distribution."

  6. Analysis Scenario. [Diagram: generated event properties → detector / analysis response] Three requirements: 1. binned data counts, 2. independent rare processes, 3. modelled by simulation. Applies to much of particle physics and astrophysics.

  7. Reweighting. Simulating every physical hypothesis θ is too expensive. Reweighting instead modifies the physical hypothesis applied to the same simulation set. [Diagram: event properties + physical hypothesis → generated event properties → detector / analysis response]
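The reweighting step can be sketched as importance reweighting: each generated event gets a weight equal to the ratio of the target density to the generation density, so one simulation set serves many hypotheses. A minimal sketch under assumed numbers (the power-law setup echoes the toy experiment later in the talk, but the indices, sample size, and function names here are ours):

```python
import math
import random

def power_law_pdf(x, gamma, x_min):
    """Normalized power-law density p(x) = (gamma-1) * x_min^(gamma-1) * x^-gamma on [x_min, inf)."""
    return (gamma - 1.0) * x_min ** (gamma - 1.0) * x ** (-gamma)

def sample_power_law(gamma, x_min, n, rng):
    """Inverse-CDF sampling from the power law."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(n)]

rng = random.Random(42)
gamma_gen = 2.0          # spectral index used to *generate* the MC (hypothetical)
gamma_new = 3.0          # physical hypothesis we want to test (hypothetical)
x_min = 1.0
events = sample_power_law(gamma_gen, x_min, 100_000, rng)

# Reweight each event to the new hypothesis without re-simulating:
weights = [power_law_pdf(x, gamma_new, x_min) / power_law_pdf(x, gamma_gen, x_min)
           for x in events]

# The weighted sample now estimates expectations under gamma_new, e.g. the
# mean of x (analytically (gamma-1)/(gamma-2) = 2.0 for gamma = 3):
mean_new = sum(w * x for w, x in zip(weights, events)) / sum(weights)
```

The same weighted events can then be binned to build per-bin expectations under any hypothesis, which is what the following slides assume.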

  8. Approximate Expectations. Sum the weights to obtain the expected number of events in a bin: λ ≈ μ = Σᵢ wᵢ. Using this approximation we construct the AdHoc likelihood from the Poisson likelihood: L_AdHoc(k|θ) = μ^k e^(-μ) / k!. The error of this approximation vanishes as the simulation size grows.
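The AdHoc construction can be sketched in a few lines: estimate the bin expectation by the summed weights and plug it into the Poisson likelihood. A minimal sketch; the function name is ours, not from the talk:

```python
import math

def adhoc_log_likelihood(k, weights):
    """AdHoc per-bin log-likelihood: Poisson(k | mu) with mu = sum of MC weights.

    k:       observed data count in the bin
    weights: MC event weights falling in the bin
    """
    mu = sum(weights)
    # log Poisson(k | mu) = k*log(mu) - mu - log(k!)
    return k * math.log(mu) - mu - math.lgamma(k + 1)
```

For example, six MC events of weight 0.5 give μ = 3, so `adhoc_log_likelihood(3, [0.5] * 6)` is just log Poisson(3 | 3); note the weights enter only through their sum, which is exactly the limitation the next slides address.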

  9. The Curse of Rare Processes and Small Signals. The signals we look for are small, and our virtual detector works similarly to the real thing: rare processes leave few simulated events per bin. Sometimes we can work around this; sometimes we can't, or the MC is too expensive.

  10. Accounting for Errors

  11. Incorporating Errors. We lack exact knowledge of λ → treat λ probabilistically using Bayes' theorem. The likelihood is informed by the MC: L(k|θ) = ∫ Poisson(k|λ) P(λ|θ) dλ. This generalizes the likelihood; note that we recover the AdHoc likelihood when P(λ|θ) = δ(λ − μ).

  12. Obtaining P(λ). The Monte Carlo itself can be modelled by a Poisson process: the number of MC events m is Poisson distributed. For simplicity, consider the case of equal weights w. The data expectation 𝛍 is related to the weight by μ = m·w, so the likelihood of λ becomes L(m|λ) = (λ/w)^m e^(-λ/w) / m!.

  13. Extension to Arbitrary Weights. For arbitrary weights we can describe μ and σ² in terms of "effective" weights and counts, with μ = Σᵢ wᵢ and σ² = Σᵢ wᵢ². Proceed exactly as before, but the effective count is generally non-integer, so the factorial is replaced by a gamma function*. *Note: this is fine because this factor does not depend on λ and cancels in the normalization step, although for an un-normalized likelihood it might present a problem.
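The "effective" quantities can be sketched as a moment-matching step: treat the bin as if it contained m_eff equal-weight events of weight w_eff, chosen so that the first two moments of the weight sum are reproduced. The function name is ours:

```python
def effective_counts(weights):
    """Map arbitrary weights onto an equivalent equal-weight sample by
    matching the first two moments of the weight sum:
      mu     = m_eff * w_eff       (first moment:  sum of weights)
      sigma2 = m_eff * w_eff**2    (second moment: sum of squared weights)
    """
    mu = sum(weights)
    sigma2 = sum(w * w for w in weights)
    m_eff = mu * mu / sigma2   # effective (generally non-integer) event count
    w_eff = sigma2 / mu        # effective per-event weight
    return m_eff, w_eff
```

Equal weights recover the actual sample: ten events of weight 0.3 give m_eff = 10 and w_eff = 0.3, while unequal weights always give m_eff smaller than the raw event count.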

  14. Equal → Arbitrary (an implicit assumption).
      Equal weights:     true distribution = scaled Poisson;   used in the likelihood = scaled Poisson
      Arbitrary weights: true distribution = compound Poisson; used in the likelihood = scaled Poisson
      Bohm and Zech (2012) showed that a scaled Poisson distribution is a good approximation to the compound Poisson when the first and second moments are matched. The effective treatment uses the scaled Poisson distribution. More details can be found in our paper, DOI:10.1007/JHEP06(2019)030.

  15. With the likelihood of λ, we use Bayes' theorem to compute the probability of λ, assuming a uniform prior: P(λ) = G(λ; α, β), where G is the gamma distribution, α = μ²/σ² + 1, and β = μ/σ².
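Spelled out in the notation of the preceding slides (scaled-Poisson likelihood for λ with effective count m_eff = μ²/σ² and effective weight w_eff = σ²/μ, times a uniform prior), the update is:

```latex
% Scaled-Poisson likelihood for \lambda times a uniform prior:
P(\lambda \mid \{w_i\}) \;\propto\;
  \left(\frac{\lambda}{w_\mathrm{eff}}\right)^{m_\mathrm{eff}}
  e^{-\lambda / w_\mathrm{eff}}
  \;\propto\; \lambda^{\alpha - 1} e^{-\beta \lambda}
  \;\Longrightarrow\;
  P(\lambda \mid \{w_i\}) = \mathrm{G}(\lambda;\, \alpha, \beta),
\qquad
\alpha = m_\mathrm{eff} + 1 = \frac{\mu^2}{\sigma^2} + 1,
\qquad
\beta = \frac{1}{w_\mathrm{eff}} = \frac{\mu}{\sigma^2}.
```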

  16. The Effective Likelihood. Integrating over the true expectation, we now have the effective likelihood: L_Eff(k) = ∫ Poisson(k|λ) G(λ; α, β) dλ = β^α Γ(k+α) / (k! Γ(α) (1+β)^(k+α)). This accounts for the uncertainty from the finite Monte Carlo sample size.
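The gamma-Poisson integral has a closed form (a negative-binomial-type expression), so the effective likelihood costs little more than the Poisson one. A sketch using only the standard library; function names are ours:

```python
import math

def effective_log_likelihood(k, weights):
    """Per-bin effective log-likelihood: Poisson marginalized over a
    Gamma(alpha, beta) posterior for the bin expectation, with
      alpha = mu^2/sigma^2 + 1,  beta = mu/sigma^2,
      mu = sum(w_i),  sigma^2 = sum(w_i^2)."""
    mu = sum(weights)
    sigma2 = sum(w * w for w in weights)
    alpha = mu * mu / sigma2 + 1.0
    beta = mu / sigma2
    # log of  beta^alpha * Gamma(k + alpha) / (k! * Gamma(alpha) * (1 + beta)^(k + alpha))
    return (alpha * math.log(beta)
            + math.lgamma(k + alpha)
            - math.lgamma(k + 1)
            - math.lgamma(alpha)
            - (k + alpha) * math.log(1.0 + beta))

def poisson_log_likelihood(k, mu):
    """AdHoc reference: log Poisson(k | mu)."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)
```

As a sanity check, the expression is normalized over k, and for a very large MC sample at fixed μ it converges to the Poisson (AdHoc) value, as the slides state.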

  17. Performance

  18. A Toy Experiment. Measure a resonance component on top of a steeply falling background. ● Simulate comparable amounts of signal and background ● Generated according to power-law distributions ● Smeared with different uncertainties

  19. Point Estimation. The effective likelihood produces results similar to the Poisson description; the maximum-likelihood estimator is unbiased for large MC sample sizes.

  20. Coverage. We produce 500 independent Monte Carlo sets and 500 data sets to test the coverage, comparing the true coverage to Wilks' asymptotically approximated coverage. The effective likelihood provides a good estimate of the coverage; the AdHoc likelihood vastly underestimates it.
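A coverage test of this kind can be mocked up on a single bin (this is our illustrative construction with assumed numbers, not the talk's 500-set study): each pseudo-experiment draws a fresh small MC set and a fresh data count, builds a 68% likelihood-ratio interval for an overall scale factor s, and checks whether the truth s = 1 is inside.

```python
import math
import random

def effective_log_likelihood(k, mu, sigma2):
    """Effective log-likelihood in terms of the bin's weight sum mu
    and squared-weight sum sigma2 (see previous slides)."""
    alpha = mu * mu / sigma2 + 1.0
    beta = mu / sigma2
    return (alpha * math.log(beta) + math.lgamma(k + alpha)
            - math.lgamma(k + 1) - math.lgamma(alpha)
            - (k + alpha) * math.log(1.0 + beta))

def poisson_log_likelihood(k, mu):
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def sample_poisson(lam, rng):
    """Knuth's Poisson sampler (fine for modest lam > 0)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def covers(loglike, s_true, grid):
    """68% likelihood-ratio interval in s: keep the grid points with
    2*(lnL_max - lnL(s)) < 1, then check whether s_true lies inside."""
    lls = [loglike(s) for s in grid]
    ll_max = max(lls)
    accepted = [s for s, ll in zip(grid, lls) if 2.0 * (ll_max - ll) < 1.0]
    return min(accepted) <= s_true <= max(accepted)

rng = random.Random(7)
lam_true = 10.0                          # true per-bin expectation at s = 1 (assumed)
n_mc = 20                                # deliberately small MC sample (assumed)
grid = [0.05 * i for i in range(1, 81)]  # scan s over (0, 4]

hits_eff = hits_adhoc = 0
trials = 300
for _ in range(trials):
    # Fresh MC set: weights drawn so their sum fluctuates around lam_true.
    weights = [rng.expovariate(n_mc / lam_true) for _ in range(n_mc)]
    mu, sig2 = sum(weights), sum(w * w for w in weights)
    k = sample_poisson(lam_true, rng)    # fresh data count

    # A scale factor s multiplies every weight: mu -> s*mu, sigma2 -> s^2*sigma2.
    if covers(lambda s: effective_log_likelihood(k, s * mu, s * s * sig2), 1.0, grid):
        hits_eff += 1
    if covers(lambda s: poisson_log_likelihood(k, s * mu), 1.0, grid):
        hits_adhoc += 1

cov_eff = hits_eff / trials
cov_adhoc = hits_adhoc / trials
```

Because the AdHoc likelihood ignores the scatter of the MC weight sum, its intervals are too narrow when n_mc is small, and its observed coverage falls below the nominal 68%, which is the qualitative effect the slide reports.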

  21. A 2D Bayesian Example. The effective likelihood is also suitable for Bayesian analyses. It broadens in the case of low MC sample size, providing robust error regions, whereas the AdHoc likelihood is liable to underestimate the width. [Figure: posterior regions with increasing MC size]

  22. Performance Comparison. Comparing the runtime of the effective likelihood to other treatments.

  23. Caveats
      ● Bin-to-bin correlations are not directly built into the likelihood. We assume that correlated shape uncertainties are handled implicitly by the reweighting.
      ● The estimate of the variance of a bin's expectation relies on the Monte Carlo events in that bin. If a population of events with a large possible contribution to the variance is missing, the variance estimate may be incorrect.
      ● Monte Carlo is needed in every bin.

  24. Summary
      ● The exact expectation in a bin is usually unknown.
      ● It is important to account for the uncertainty inherent in limited Monte Carlo samples.
      ● The effective likelihood:
        ○ provides a robust treatment of these errors, provided MC is available;
        ○ converges to the AdHoc likelihood for large MC;
        ○ has improved coverage properties;
        ○ can be substituted directly for the AdHoc likelihood.
      https://austinschneider.github.io/MCLLH/

  25. Likelihood Summary. Implementations and paper links can be found here: https://austinschneider.github.io/MCLLH/
