SLIDE 1

A Test Statistic for Weighted Runs

Frederik Beaujean, Allen Caldwell http://arxiv.org/abs/1005.3233v2

COMPSTAT 2010

Paris, 23.8.2010

SLIDE 2

23.08.2010 Frederik Beaujean #2

Motivating example

Suppose:

  • Measurements with Gaussian uncertainty
  • Standard Model (SM) background is quadratic
  • New physics (NP) predicts signal peak

[Figure: data points $y_i$]
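A toy data set of this kind can be sketched as follows. This is only an illustration: the background coefficients, the peak position and width, the uncertainty and the seed are all hypothetical values chosen here, and the Lorentz shape of the peak is borrowed from the power study later in the talk.

```python
import random

def sm_background(x, a=1.0, b=0.5, c=-0.02):
    """Quadratic background; the coefficients are illustrative assumptions."""
    return a + b * x + c * x * x

def np_peak(x, amplitude=2.0, x0=12.0, gamma=1.5):
    """Lorentz-shaped signal peak; position and width are illustrative assumptions."""
    return amplitude * gamma ** 2 / ((x - x0) ** 2 + gamma ** 2)

def simulate(n=25, sigma=0.5, with_peak=True, seed=1):
    """Draw n measurements y_i with Gaussian uncertainty sigma around the model."""
    rng = random.Random(seed)
    xs = [float(i) for i in range(n)]
    ys = []
    for x in xs:
        mean = sm_background(x) + (np_peak(x) if with_peak else 0.0)
        ys.append(rng.gauss(mean, sigma))
    return xs, ys, [sigma] * n

xs, ys, sigmas = simulate()
```

Setting `with_peak=False` gives data that follow the background-only model.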

SLIDE 3

Goodness of Fit: standard approach

Test statistic:

  • Any scalar function of data, T(D)
  • Interpret: large T(D) = poor model

T D≡

2D

Example:

  • Prob. density of the data
  • Familiar choice

PD∣ ∝∏ exp{− yi−f  xi∣ 

2

2i

2

}=exp{

−

2

2 }
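For Gaussian uncertainties the familiar statistic is the sum of squared, standardized residuals. A minimal sketch, with made-up numbers and a hypothetical function name:

```python
def chi_squared(ys, predictions, sigmas):
    """Sum of squared residuals, each divided by its Gaussian uncertainty."""
    return sum(((y - f) / s) ** 2 for y, f, s in zip(ys, predictions, sigmas))

# Illustrative values only: every point is one standard deviation off.
ys = [1.2, 1.9, 3.4]
predictions = [1.0, 2.0, 3.0]
sigmas = [0.2, 0.1, 0.4]
t_obs = chi_squared(ys, predictions, sigmas)  # close to 3
```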

SLIDE 4

p-value

Def: $p \equiv P(T > T(D))$

  • Assuming the model and before data is taken: p uniform in [0,1]
  • Critical values: $p < 0.05, 0.01$ ⇒ reject model
  • Warning: the p-value is not the probability that the model is true

Example: $p_{SM} = 10\%$, $p_{NP} = 37\%$ ⇒ both OK
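When the distribution of a statistic is not known in closed form, the p-value can be approximated by simulation: generate data from the model, recompute the statistic, and count how often it is at least as large as the observed value. The sketch below assumes a fully specified model with Gaussian uncertainties; the function names, the number of simulations and the seed are hypothetical choices.

```python
import random

def chi_squared(ys, predictions, sigmas):
    """Sum of squared standardized residuals."""
    return sum(((y - f) / s) ** 2 for y, f, s in zip(ys, predictions, sigmas))

def mc_p_value(t_obs, predictions, sigmas, statistic, n_sim=5000, seed=0):
    """Fraction of simulated data sets whose statistic reaches t_obs."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_sim):
        sim = [rng.gauss(f, s) for f, s in zip(predictions, sigmas)]
        # '>=' and '>' agree for a continuous statistic
        if statistic(sim, predictions, sigmas) >= t_obs:
            exceed += 1
    return exceed / n_sim

# 25 points with mean 0 and variance 1, observed statistic of 25
predictions = [0.0] * 25
sigmas = [1.0] * 25
p = mc_p_value(25.0, predictions, sigmas, chi_squared)
```

The result carries a Monte Carlo uncertainty of roughly `sqrt(p * (1 - p) / n_sim)`.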

SLIDE 5

Runs

  • Most statistics ignore the order of the data, information wasted
  • Human brain good for simple problems

Example:

  • N=25 datapoints
  • Each Gaussian with mean = 0 and variance = 1

Can we combine information about order and magnitude of deviation?
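The point about order can be checked directly: a sum of squared deviations is unchanged by any rearrangement of the data. A small sketch with 25 standard normal draws (the seed is arbitrary):

```python
import random

def chi_squared(residuals):
    """Sum of squared deviations from a mean of 0 with unit variance."""
    return sum(r * r for r in residuals)

rng = random.Random(3)
residuals = [rng.gauss(0.0, 1.0) for _ in range(25)]

# Extreme reordering: all negative deviations first, then all positive ones.
ordered = sorted(residuals)

same = abs(chi_squared(residuals) - chi_squared(ordered)) < 1e-9
```

The sorted sequence would look highly suspicious to the eye, yet the statistic cannot tell the two sequences apart.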

SLIDE 6

Runs statistic

Proposal:

  • Split data into runs
  • Each run has a weight

Gaussian case:

  • Test statistic: largest weight of any run

  • p-value becomes
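A minimal sketch of such a statistic, under assumptions the slide does not spell out: a run is taken to be a maximal sequence of consecutive points lying above the model prediction, and its weight is the sum of squared standardized residuals of its points. The function name is hypothetical.

```python
def weighted_runs_statistic(ys, predictions, sigmas):
    """Largest weight of any run of consecutive points above the prediction.

    Assumption: weight of a run = sum of ((y - f) / sigma)**2 over its points.
    """
    best = 0.0
    current = 0.0
    for y, f, s in zip(ys, predictions, sigmas):
        if y > f:
            current += ((y - f) / s) ** 2  # extend the current run
            best = max(best, current)
        else:
            current = 0.0  # a point at or below the prediction ends the run
    return best

# Runs above zero: [0.5], [1.0, 2.0, 1.0], [2.0] with weights 0.25, 6.0, 4.0
ys = [0.5, -1.0, 1.0, 2.0, 1.0, -0.5, 2.0]
t = weighted_runs_statistic(ys, [0.0] * 7, [1.0] * 7)  # 6.0
```

Unlike a plain sum over all points, this value changes when the same deviations are rearranged, so it uses both order and magnitude.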

SLIDE 7

Runs distribution

Gaussian case:

  • Distribution of T exactly calculated for any N (non-parametric)
  • Requires sum over integer partitions

[Figure: distribution of T for N = 25]
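To give a feeling for the size of a sum over integer partitions, the sketch below enumerates the partitions of N. It does not reproduce the exact distribution, whose formula is not shown here; the generator name is hypothetical.

```python
def partitions(n, largest=None):
    """Yield the integer partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest

small = list(partitions(4))
# [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]

n_partitions = sum(1 for _ in partitions(25))  # 1958
```

The number of partitions grows quickly with N, which is what drives the computational cost of a sum of this kind.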

SLIDE 8

Power

  • New physics contribution: Lorentz peak with amplitude A
  • T up to 35% more powerful than classic $\chi^2$ in detecting departures of type y(x)

[Figure: power, 5% level]
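A power comparison of this kind can be sketched by simulation. Everything specific below is an assumption made for illustration: 25 unit-variance points around zero as the null model, a Lorentz peak with hypothetical center and width, runs defined as consecutive positive residuals weighted by their summed squares, and critical values taken from simulation rather than from the exact distribution. The numbers it yields are therefore not the results quoted on the slide.

```python
import random

N = 25  # number of data points (assumed)

def chi_squared(res):
    return sum(r * r for r in res)

def runs_statistic(res):
    """Largest summed square over any run of consecutive positive residuals."""
    best = current = 0.0
    for r in res:
        if r > 0:
            current += r * r
            best = max(best, current)
        else:
            current = 0.0
    return best

def lorentz(i, amplitude, center=12.0, gamma=2.0):
    """Lorentz peak; center and width are hypothetical."""
    return amplitude * gamma ** 2 / ((i - center) ** 2 + gamma ** 2)

def simulate(statistic, amplitude, n_sim, rng):
    values = []
    for _ in range(n_sim):
        res = [rng.gauss(lorentz(i, amplitude), 1.0) for i in range(N)]
        values.append(statistic(res))
    return values

def power(statistic, amplitude, n_sim=2000, level=0.05, seed=7):
    """Fraction of data sets with a peak that are rejected at the given level."""
    rng = random.Random(seed)
    null = sorted(simulate(statistic, 0.0, n_sim, rng))
    critical = null[int((1.0 - level) * n_sim)]  # simulated critical value
    alt = simulate(statistic, amplitude, n_sim, rng)
    return sum(v > critical for v in alt) / n_sim

power_runs = power(runs_statistic, amplitude=2.0)
power_chi2 = power(chi_squared, amplitude=2.0)
```

With `amplitude=0.0` the function returns the false rejection rate, which should sit near the chosen level and serves as a sanity check.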

SLIDE 9

Conclusions

  • Choose a statistic with specific alternative models in mind
  • Runs statistic T excellent for “bump hunting”

FINIS

SLIDE 10

Backup

SLIDE 11

Exact runs distribution I

SLIDE 12

Exact runs distribution II

SLIDE 13

Exact runs distribution III

SLIDE 14

Computational complexity: Integer partitions

SLIDE 15

Goodness of Fit: Bayesian approach

Model selection:

  • Need explicit alternatives M1, M2
  • Posterior odds: $\frac{P(M_1 \mid D)}{P(M_2 \mid D)} = \frac{P(M_1)}{P(M_2)} \times \frac{P(D \mid M_1)}{P(D \mid M_2)}$

Bayes factor:

  • (very) sensitive to parameter range
  • Occam's razor built in

$P(D \mid M_1) = \int p(D \mid \theta)\, p_0(\theta)\, d\theta$

Example:

  • Six (NP) vs three (SM) parameters: $\frac{P(SM \mid D)}{P(NP \mid D)} = \frac{P(SM)}{P(NP)} \times 61.7$