A Test Statistic for Weighted Runs. Frederik Beaujean, Allen Caldwell



  1. A Test Statistic for Weighted Runs
     Frederik Beaujean, Allen Caldwell
     http://arxiv.org/abs/1005.3233v2
     COMPSTAT 2010, Paris, 23.8.2010

  2. Motivating example
     Suppose:
     - Measurements $y_i$ with Gaussian uncertainty
     - Standard Model (SM) background is quadratic
     - New physics (NP) predicts a signal peak
     23.08.2010 Frederik Beaujean #2

  3. Goodness of Fit: standard approach
     - Test statistic: any scalar function of the data, $T(D)$
     - Interpret: large $T(D)$ = poor model
     - Example: probability density of the data,
       $P(D \mid \theta) \propto \prod_i \exp\left\{ -\frac{(y_i - f(x_i \mid \theta))^2}{2\sigma_i^2} \right\} = \exp\left\{ -\frac{\chi^2}{2} \right\}$
     - Familiar choice: $T(D) \equiv \chi^2(D)$
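
The familiar choice on this slide, $\chi^2$ as a test statistic for measurements with Gaussian uncertainties, can be sketched as follows. The function name `chi2` and the three data points are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def chi2(y, f, sigma):
    """Return sum_i ((y_i - f_i) / sigma_i)^2 for equal-length sequences."""
    if not (len(y) == len(f) == len(sigma)):
        raise ValueError("y, f and sigma must have the same length")
    return sum(((yi - fi) / si) ** 2 for yi, fi, si in zip(y, f, sigma))


# Hypothetical data: measurements, model predictions f(x_i), and uncertainties.
y = [1.0, 2.5, 2.0]
f = [1.0, 2.0, 3.0]
sigma = [0.5, 0.5, 1.0]
print(chi2(y, f, sigma))  # 0 + 1 + 1 = 2.0
```

Note that reordering the data points leaves this value unchanged.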

  4. p-value
     - Definition: $p \equiv P(T > T(D))$
     - Assuming the model and before data is taken: $p$ is uniform in [0,1]
     - Critical values: $p < 0.05, 0.01$ ⇒ reject model
     - Warning: the p-value is not the probability that the model is true
     - Example: $p_{SM} = 10\%$, $p_{NP} = 37\%$ ⇒ both OK
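
The definition of the p-value can be illustrated by simulation: draw many data sets from the model itself, compute $T$ for each, and count how often it exceeds the observed value. This is a plain Monte Carlo sketch, not a method taken from the slides; the helper names, the seed, and the observed value 34.4 are assumptions for illustration.

```python
import random


def mc_p_value(t_obs, simulate_t, n_trials=10000, seed=1):
    """Estimate p = P(T > t_obs) from n_trials values of T simulated under the model."""
    rng = random.Random(seed)
    exceed = sum(1 for _ in range(n_trials) if simulate_t(rng) > t_obs)
    return exceed / n_trials


def simulate_chi2(rng, n=25):
    """chi^2 of n standard-normal residuals, i.e. data drawn from the model itself."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


# Hypothetical observed value: 34.4 is roughly the 90th percentile of chi^2
# with 25 degrees of freedom, so the estimate should come out close to 0.10.
p = mc_p_value(34.4, simulate_chi2)
print(p)
```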

  5. Runs
     - Most statistics ignore the order of the data, so information is wasted
     - The human brain is good at simple problems
     - Example: N = 25 data points, each Gaussian with mean 0 and variance 1
     Can we combine information about the order and the magnitude of the deviations?

  6. Runs statistic
     - Proposal: split the data into runs
     - Each run has a weight
     - Gaussian case: [formula not preserved]
     - Test statistic: largest weight of any run
     - p-value becomes: [formula not preserved]
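
The proposal can be sketched in a few lines. The slide's own formulas are not preserved, so two things here are assumptions: a run is taken to be a maximal stretch of consecutive residuals with the same sign, and its weight is taken to be the sum of squared standardized residuals in that stretch. The hypothetical flag `above_only` restricts the search to runs above the model prediction.

```python
from itertools import groupby


def run_weights(residuals, above_only=False):
    """Weights of all runs, where residuals are (y_i - f_i) / sigma_i.

    Assumptions: a run is a maximal stretch of consecutive residuals on the
    same side of zero (a residual of exactly 0 counts as "below"), and its
    weight is the sum of squared residuals in the run.
    """
    weights = []
    for above, group in groupby(residuals, key=lambda r: r > 0):
        if above or not above_only:
            weights.append(sum(r * r for r in group))
    return weights


def runs_statistic(residuals, above_only=False):
    """Largest weight of any run (0.0 if there is no qualifying run)."""
    return max(run_weights(residuals, above_only), default=0.0)


# Hypothetical standardized residuals.
res = [0.5, 1.0, -0.5, 2.0, 1.0, 1.0, -1.5]
print(run_weights(res))     # [1.25, 0.25, 6.0, 2.25]
print(runs_statistic(res))  # 6.0
```

Unlike $\chi^2$, this value changes when the same residuals are shuffled, which is the order information the statistic is meant to use.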

  7. Runs distribution
     - Gaussian case: distribution of T calculated exactly for any N (non-parametric)
     - Requires a sum over integer partitions
     [Figure: N = 25]
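
The slide does not show how the partitions enter the exact distribution, so the sketch below only illustrates the enumeration step: a generator that lists every integer partition of n. The function name `partitions` is hypothetical.

```python
def partitions(n, largest=None):
    """Yield every partition of n as a non-increasing tuple of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    # Choose the first (largest) part, then partition the remainder with
    # parts no bigger than it, so each partition is produced exactly once.
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


print(list(partitions(4)))
# [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]
```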

  8. Power
     - New physics contribution $y(x)$: Lorentz peak with amplitude A
     - At the 5% level, T is up to 35% more powerful than the classic statistic in detecting departures of this type
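
A power comparison of this kind can be sketched by Monte Carlo. Everything specific below is an assumption rather than the authors' setup: the equally spaced x values, the peak position `x0` and width `gamma`, unit uncertainties, a background that is known exactly (so residuals under the null are standard normal), and the run definition (consecutive same-sign residuals, weighted by their $\chi^2$). The sketch is not expected to reproduce the 35% figure.

```python
import random
from itertools import groupby

N = 25
X = [i / (N - 1) for i in range(N)]  # hypothetical equally spaced x values in [0, 1]


def lorentz(x, amplitude, x0=0.5, gamma=0.1):
    """Lorentz peak of height `amplitude` at x0 (position and width are assumptions)."""
    return amplitude * gamma ** 2 / ((x - x0) ** 2 + gamma ** 2)


def chi2(res):
    """Sum of squared standardized residuals."""
    return sum(r * r for r in res)


def runs_t(res):
    """Largest chi^2 of any run of consecutive same-sign residuals (assumed definition)."""
    return max(sum(r * r for r in g) for _, g in groupby(res, key=lambda r: r > 0))


def sample(rng, amplitude):
    """Residuals with respect to the background-only model when the data contain a peak."""
    return [rng.gauss(lorentz(x, amplitude), 1.0) for x in X]


def power(stat, amplitude, n_trials=2000, level=0.05, seed=7):
    """Fraction of peak-containing data sets rejected at the given level."""
    rng = random.Random(seed)
    # Empirical critical value from background-only simulations.
    null = sorted(stat(sample(rng, 0.0)) for _ in range(n_trials))
    critical = null[int((1 - level) * n_trials)]
    hits = sum(stat(sample(rng, amplitude)) > critical for _ in range(n_trials))
    return hits / n_trials


for name, stat in (("chi2", chi2), ("runs", runs_t)):
    print(name, power(stat, amplitude=2.0))
```

With `amplitude=0.0` both statistics should reject about 5% of the time, which is a quick sanity check on the critical values.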

  9. Conclusions
     - Choose a statistic with specific alternative models in mind
     - The runs statistic T is excellent for “bump hunting”
     FINIS

  10. Backup

  11. Exact runs distribution I

  12. Exact runs distribution II

  13. Exact runs distribution III

  14. Computational complexity: Integer partitions
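
The content of this backup slide is not preserved. As a rough indication of why the complexity matters, the number of integer partitions p(N), and with it the number of terms in a sum over partitions, grows quickly with N. The sketch below counts partitions with a standard dynamic-programming recurrence; the function name is hypothetical.

```python
def count_partitions(n):
    """Number of integer partitions p(n), by dynamic programming over allowed parts."""
    ways = [1] + [0] * n  # ways[m]: partitions of m using only the parts seen so far
    for part in range(1, n + 1):
        for m in range(part, n + 1):
            ways[m] += ways[m - part]
    return ways[n]


for n in (5, 10, 25, 50, 100):
    print(n, count_partitions(n))
# 5 7
# 10 42
# 25 1958
# 50 204226
# 100 190569292
```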

  15. Goodness of Fit: Bayesian approach
      - Model selection: need explicit alternatives $M_1$, $M_2$
      - Posterior odds: $\frac{P(M_1 \mid D)}{P(M_2 \mid D)} = \frac{P(M_1)}{P(M_2)} \times \frac{P(D \mid M_1)}{P(D \mid M_2)}$
      - Bayes factor: (very) sensitive to parameter range, $P(D \mid M_1) = \int p(D \mid \theta)\, p_0(\theta)\, d\theta$
      - Occam's razor built in
      - Example: $\frac{P(SM \mid D)}{P(NP \mid D)} = \frac{P(SM)}{P(NP)} \times 61.7$, with six (NP) vs three (SM) parameters
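
The posterior-odds relation is simple arithmetic once the Bayes factor is known. The sketch below applies it to the slide's Bayes factor of 61.7. The equal prior probabilities are an assumption made here for illustration, and the function names are hypothetical.

```python
def posterior_odds(prior_m1, prior_m2, bayes_factor):
    """Posterior odds P(M1|D) / P(M2|D) = prior odds x Bayes factor."""
    return prior_m1 / prior_m2 * bayes_factor


def posterior_prob_m1(odds):
    """P(M1|D) when M1 and M2 are the only models under consideration."""
    return odds / (1.0 + odds)


odds = posterior_odds(0.5, 0.5, 61.7)      # equal priors are an assumption
print(odds)                                # 61.7
print(round(posterior_prob_m1(odds), 3))   # 0.984
```

With equal priors the data favor the SM at odds of 61.7 to 1. A different choice of priors rescales the odds by the prior ratio.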
