
Some statistics for high-energy astrophysics with illustrations from XSPEC



  1. Some statistics for high-energy astrophysics with illustrations from XSPEC
     Andy Pollock, European Space Agency, XMM-Newton RGS Calibration Scientist
     Urbino Workshop in High-Energy Astrophysics, 2008 July 31
     A.M.T. Pollock, European Space Astronomy Centre, XMM-Newton SOC, Villanueva de la Cañada, Madrid, Spain

  2. Make every photon count. Account for every photon.

  3. Analysis in high-energy astrophysics: data and models, linked by statistics
     • data: $\{n_i\}_{i=1,N}$; models: $\{\mu_i\}_{i=1,N} \ge 0$
     • data: individual events; models: continuously distributed
     • data: detector coordinates; models: physical parameters
     • data: never change; models: change limited only by physics
     • data: have no errors; models: subject to fluctuations
     • data: most precious resource; models: predictions possible
     • data: kept forever in archives; models: kept forever in journals and textbooks

  4. “There are three sorts of lies: lies, damned lies and statistics.”

  5. Statistical nature of scientific truth
     • Measurements in high-energy astrophysics collect individual events
     • Many different things could have happened to give those events
     • Alternatives are governed by the laws of probability
     • Direct inversion impossible
     • Information derived about the universe is not certain
     • Statistics quantifies the uncertainties:
        • What do we know?
        • How well do we know it?
        • Can we avoid mistakes?
        • What should we do next?

  6. There are two sorts of statistical inference
     • Classical statistical inference
        • infinite series of identical measurements (Frequentist)
        • hypothesis testing and rejection
        • the usual interpretation
     • Bayesian statistical inference
        • prior and posterior probabilities
        • currently popular
     • Neither especially relevant for astrophysics
        • one universe
        • irrelevance of prior probabilities and cost analysis
        • choice among many models driven by physics

  7. There are two sorts of statistic
     • $\chi^2$-statistic: Gaussian statistics
     • C-statistic: Poisson statistics

  8. There are two sorts of statistics
     • Gaussian statistics: $\chi^2$
     • Poisson statistics: C

  9. Gaussian statistics
     The Normal probability distribution $P(x|\mu,\sigma)$ for data $=\{x \in \Re\}$ and model $=\{\mu,\sigma\}$:
     $$P(x|\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
     $$\int_{-\infty}^{+\infty} P(x|\mu,\sigma)\,dx = 1$$
     $$\ln P = -\frac{(x-\mu)^2}{2\sigma^2} - \ln\left(\sigma\sqrt{2\pi}\right)$$
     $$\int_{\mu-1\sigma}^{\mu+1\sigma} P(x|\mu,\sigma)\,dx \approx 0.6827$$
     Probability inside and outside $\pm n\sigma$:
     • 1σ: 68.3% inside, 1/3 outside
     • 2σ: 95.45% inside, 1/22 outside
     • 3σ: 99.730% inside, 1/370 outside
     • 4σ: 99.99367% inside, 1/15787 outside
     • 5σ: 99.999943% inside, 1/1744277 outside
     [Figure: $P(x|\mu,\sigma)$ against $x$, with $\mu$ and the interval from −1σ to +1σ marked]
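The table of σ levels on this slide can be reproduced from the error function. The sketch below is an illustration only, not part of the original slides; the function names `gaussian_coverage` and `gaussian_tail` are made up for this example.

```python
import math

def gaussian_coverage(n_sigma):
    """Probability that x lies within mu +/- n_sigma * sigma."""
    return math.erf(n_sigma / math.sqrt(2.0))

def gaussian_tail(n_sigma):
    """Probability that x lies outside mu +/- n_sigma * sigma (both tails)."""
    return math.erfc(n_sigma / math.sqrt(2.0))

# Print the inside percentage and the "1 in N" chance of falling outside.
for n_sigma in range(1, 6):
    inside = gaussian_coverage(n_sigma)
    outside = gaussian_tail(n_sigma)
    print(f"{n_sigma} sigma: {100.0 * inside:.6f}% inside, "
          f"about 1 in {1.0 / outside:.0f} outside")
```

Using `math.erfc` for the tail, rather than `1 - erf`, avoids losing precision at 4σ and 5σ where the tail probability is tiny.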

  10. Poisson statistics
     The Poisson probability distribution for data $=\{n \ge 0\}$ and model $=\{\mu > 0\}$:
     $$P(n|\mu) = \frac{e^{-\mu}\mu^n}{n!} \quad \forall\, n = 0,1,2,3,\ldots,\infty$$
     $$P(0|\mu) = e^{-\mu}$$
     $$P(1|\mu) = e^{-\mu}\,\frac{\mu}{1}$$
     $$P(2|\mu) = e^{-\mu}\,\frac{\mu}{1}\,\frac{\mu}{2}$$
     $$P(3|\mu) = e^{-\mu}\,\frac{\mu}{1}\,\frac{\mu}{2}\,\frac{\mu}{3}$$
     $$\sum_{n=0}^{\infty} P(n|\mu) = 1$$
     $$\ln P = n\ln\mu - \mu - \ln n!$$
     $$P(n|\mu) = P(n-1|\mu)\,\frac{\mu}{n}$$
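The recurrence $P(n|\mu) = P(n-1|\mu)\,\mu/n$ gives a simple way to tabulate Poisson probabilities without computing factorials. The following is a minimal sketch; the function names are hypothetical and chosen only for this illustration.

```python
import math

def poisson_pmf_table(mu, n_max):
    """Return [P(0|mu), ..., P(n_max|mu)] built with P(n) = P(n-1) * mu / n."""
    probs = [math.exp(-mu)]           # P(0|mu) = exp(-mu)
    for n in range(1, n_max + 1):
        probs.append(probs[-1] * mu / n)
    return probs

def poisson_log_prob(n, mu):
    """ln P = n ln(mu) - mu - ln(n!), with ln(n!) taken from lgamma(n + 1)."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)

table = poisson_pmf_table(2.5, 60)
print("sum of P(n|2.5) for n = 0..60:", sum(table))
print("P(3|2.5) from the recurrence :", table[3])
print("P(3|2.5) from ln P           :", math.exp(poisson_log_prob(3, 2.5)))
```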

  11. Likelihood of data on models
     Data $\{n_i\}_{i=1,N}$ and models $\{\mu_i\}_{i=1,N}$ are linked by statistics:
     $$L = \prod_{i=1}^{N} P(n_i|\mu_i)$$
     • Gaussian
     $$L = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\left(-\frac{(n_i-\mu_i)^2}{2\sigma_i^2}\right)dn_i$$
     $$\ln L = -\frac{1}{2}\sum_{i=1}^{N}\frac{(n_i-\mu_i)^2}{\sigma_i^2} - \sum_{i=1}^{N}\ln\sigma_i + \kappa(\ln dn_i)$$
     $$-2\ln L = \chi^2$$
     • Poisson
     $$L = \prod_{i=1}^{N} \frac{e^{-\mu_i}\mu_i^{n_i}}{n_i!}$$
     $$\ln L = \sum_{i=1}^{N}\left(n_i\ln\mu_i - \mu_i\right) - \kappa(\ln n_i!)$$
     $$-2\ln L = C$$
     Cash 1979, ApJ, 228, 939
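As a rough illustration of the two statistics on this slide, the sketch below evaluates $\chi^2$ and C for binned data. It assumes that the $\kappa$ terms are constants that do not depend on the model and can therefore be dropped, and that the $\sigma_i$ are fixed; the function names are hypothetical.

```python
import math

def chi_squared(data, model, sigma):
    """Gaussian statistic: sum over bins of ((n_i - mu_i) / sigma_i)^2."""
    return sum(((n - m) / s) ** 2 for n, m, s in zip(data, model, sigma))

def cash_statistic(data, model):
    """Poisson statistic: -2 * sum(n_i ln(mu_i) - mu_i).

    Assumption: the model-independent ln(n_i!) term is dropped.
    Every mu_i must be > 0.
    """
    return 2.0 * sum(m - n * math.log(m) for n, m in zip(data, model))

counts = [0, 2, 5, 1]
model_a = [0.5, 2.0, 4.0, 1.5]
model_b = [1.0, 1.0, 1.0, 1.0]
# Only differences in C between models carry information.
print("C(a) =", cash_statistic(counts, model_a))
print("C(b) =", cash_statistic(counts, model_b))
print("C(b) - C(a) =", cash_statistic(counts, model_b) - cash_statistic(counts, model_a))
```

Because constants are dropped, the absolute value of this C is not meaningful on its own; differences between models are.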

  12. Numerical model of the life of a photon
     Detected data are governed by the laws of physics. The numerical model should reproduce as completely as possible every process that gives rise to events in the detector:
     • photon production in the source (or sources) of interest
     • intervening absorption
     • effects of the instrument
        • calibration
     • background components
        • cosmic X-ray background
        • local energetic particles
        • instrumental noise
        • model it, don’t subtract it

  13. An XMM-Newton RGS instrument

  14. RGS SAS & CCF
     • Dispersion relation: $m\lambda = d(\cos\beta - \cos\alpha)$
     • SAS tasks run by rgsproc: atthkgen, rgsoffsetcalc, rgssources, rgsframes, rgsbadpix, rgsevents, evlistcomb, gtimerge, rgsangles, rgsfilter, rgsregions, rgsspectrum, rgsrmfgen, rgsfluxer
     • CCF components: BORESIGHT, LINCOORDS, MISCDATA, HKPARMINT, ADUCONV, BADPIX, CROSSPSF, CTI, LINESPREADFUNC, QUANTUMEFF, REDIST, EFFAREACORR
     • 5-10% accuracy is a common calibration goal

  15. The final data model
     $$\mu(\theta,\beta,\Delta,D) = S(\theta(\Omega)) \otimes R(\Omega \langle\Delta\rangle D) + B(\beta(D))$$
     • D = set of detector coordinates {X, Y, t, PI, …}
     • S = source of interest
     • θ = set of source parameters
     • R = instrumental response
     • Ω = set of physical coordinates {α, δ, τ, υ, …}
     • Δ = set of instrumental calibration parameters
     • B = background
     • β = set of background parameters
     $$\ln L(\theta,\beta,\Delta) \rightarrow \ln L(\theta)$$
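In a discretised form, the model on this slide amounts to folding a source spectrum through a response matrix and adding a background term. The sketch below is a toy version under strong assumptions: the response is a small made-up matrix, the physical coordinate is a single binned axis, and the function name `predicted_counts` is hypothetical. It is not how XSPEC or the SAS store or apply responses.

```python
def predicted_counts(source_flux, response, background):
    """mu_j = sum_k R[j][k] * S_k + B_j for each detector channel j.

    source_flux : S_k, source model in physical bins k
    response    : R[j][k], chance that a photon in bin k lands in channel j
                  (toy values, not a real instrument response)
    background  : B_j, background model per detector channel
    """
    return [
        sum(r_jk * s_k for r_jk, s_k in zip(row, source_flux)) + b_j
        for row, b_j in zip(response, background)
    ]

# Two physical bins folded into three detector channels.
source = [10.0, 20.0]
toy_response = [
    [0.5, 0.1],
    [0.2, 0.6],
    [0.0, 0.3],
]
toy_background = [1.0, 1.0, 1.0]
print(predicted_counts(source, toy_response, toy_background))
```

The resulting $\mu_j$ are what would be compared with the detected counts $n_j$ through the likelihood.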

  16. Uses of the log-likelihood, $\ln L(\theta)$
     • $\ln L$ is what you need to assess all and any data models
     • locate the maximum-likelihood model when $\theta = \theta^*$
        • minimum $\chi^2$ is a maximum-likelihood Gaussian statistic
        • minimum C is a maximum-likelihood Poisson statistic
     • compute a goodness-of-fit statistic
        • reduced chi-squared $\chi^2/\nu \sim 1$ ideally
        • reduced C $C/\nu \sim 1$ ideally
        • $\nu$ = number of degrees of freedom
     • estimate model parameters and uncertainties
        • $\ln L(\theta)$
        • $\theta^* = \{p_1, p_2, p_3, p_4, \ldots, p_M\}$
        • investigate the whole multi-dimensional surface $\ln L(\theta)$
     • compare two or more models
        • calibrating $\ln L$: $2\Delta\ln L$ corresponds to a significance in $\sigma$ of roughly $\sqrt{2\Delta\ln L}$
        • $2\Delta\ln L < 1$ is not interesting
        • $2\Delta\ln L > 10$ is worth thinking about (e.g. 2XMM DET_ML ≥ 8)
        • $2\Delta\ln L > 100$. Hmmm…
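The model-comparison rule of thumb can be tried on a toy example. The sketch below assumes Poisson data, drops the model-independent $\ln n_i!$ term, and reads the slide's calibration as "number of σ is roughly $\sqrt{2\Delta\ln L}$", which is only a guide for one extra free parameter. All function names are hypothetical.

```python
import math

def poisson_log_likelihood(data, model):
    """ln L = sum(n_i ln(mu_i) - mu_i), dropping the constant ln(n_i!) term."""
    return sum(n * math.log(m) - m for n, m in zip(data, model))

def two_delta_ln_l(data, model_simple, model_complex):
    """2 * (ln L of the more complex model - ln L of the simpler model)."""
    return 2.0 * (poisson_log_likelihood(data, model_complex)
                  - poisson_log_likelihood(data, model_simple))

def approx_sigma(two_delta):
    """Rough significance in sigma, assumed here to be sqrt(2 * Delta ln L)."""
    return math.sqrt(max(two_delta, 0.0))

counts = [3, 12, 4]
flat = [sum(counts) / len(counts)] * len(counts)   # constant model
peaked = [3.0, 12.0, 4.0]                          # model with a central excess
improvement = two_delta_ln_l(counts, flat, peaked)
print("2 Delta ln L =", improvement, "-> about", approx_sigma(improvement), "sigma")
```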

  17. Example of a maximum-likelihood solution
     N-pixel image: data $\{n_i\}$ photons; model $\{\mu_i = s p_i + b\}$; PSF $p_i$; unknown parameters $\{s, b\}$
     $$\ln L = \sum_{i=1}^{N} n_i\ln\mu_i - \mu_i = \sum_{i=1}^{N} n_i\ln(s p_i + b) - (s p_i + b)$$
     $$\frac{\partial\ln L}{\partial s} = \sum_{i=1}^{N}\frac{n_i p_i}{s p_i + b} - p_i = 0$$
     $$\frac{\partial\ln L}{\partial b} = \sum_{i=1}^{N}\frac{n_i}{s p_i + b} - 1 = 0$$
     $$s\frac{\partial\ln L}{\partial s} + b\frac{\partial\ln L}{\partial b} = \sum_{i=1}^{N}\frac{n_i s p_i}{s p_i + b} - s p_i + \frac{n_i b}{s p_i + b} - b = 0$$
     $$\sum_{i=1}^{N} n_i = s\sum_{i=1}^{N} p_i + b\sum_{i=1}^{N} 1$$
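The two gradient equations have no closed-form solution in general, so $s$ and $b$ are found numerically. The sketch below uses a simple multiplicative (EM-style) update, which is a choice made for this illustration and not necessarily the method used in the original talk. Its fixed point satisfies both gradient equations, and each update enforces $\sum n_i = s\sum p_i + bN$. The counts and PSF values are invented test numbers.

```python
def fit_source_and_background(counts, psf, n_iter=5000):
    """Maximise ln L = sum(n_i ln(s p_i + b) - (s p_i + b)) over s and b.

    Uses multiplicative updates that keep s and b positive:
        s <- s * sum(n_i p_i / mu_i) / sum(p_i)
        b <- b * sum(n_i / mu_i) / N
    """
    n_pix = len(counts)
    psf_sum = sum(psf)
    total = sum(counts)
    s = 0.5 * total / psf_sum      # start with half the counts in the source
    b = 0.5 * total / n_pix        # and half in the flat background
    for _ in range(n_iter):
        mu = [s * p + b for p in psf]
        s_new = s * sum(n * p / m for n, p, m in zip(counts, psf, mu)) / psf_sum
        b_new = b * sum(n / m for n, m in zip(counts, mu)) / n_pix
        s, b = s_new, b_new
    return s, b

def log_likelihood_gradient(counts, psf, s, b):
    """Return (d ln L / d s, d ln L / d b) for the model mu_i = s p_i + b."""
    d_s = sum(n * p / (s * p + b) - p for n, p in zip(counts, psf))
    d_b = sum(n / (s * p + b) - 1.0 for n, p in zip(counts, psf))
    return d_s, d_b

counts = [2, 5, 11, 4, 3]                 # invented photon counts per pixel
psf = [0.05, 0.2, 0.5, 0.2, 0.05]         # invented PSF, normalised to 1
s_hat, b_hat = fit_source_and_background(counts, psf)
print("s =", s_hat, " b =", b_hat)
print("gradient at solution:", log_likelihood_gradient(counts, psf, s_hat, b_hat))
print("sum n_i =", sum(counts), " s sum p_i + b N =", s_hat * sum(psf) + b_hat * len(counts))
```

The last line checks the final equation on the slide: the fitted model accounts for every detected photon.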

  18. Goodness-of-fit
     • Gaussian model and data are consistent if $\chi^2/\nu \sim 1$
        • $\nu$ = “number of degrees of freedom” = number of bins − number of free model parameters = N − M
        • cf. $\langle (x-\mu)^2/\sigma^2 \rangle = 1$
        • same as comparison with best-possible $\nu = 0$ model, $\mu = x$
        • $\chi^2 = 2\left(\ln L(\mu = x) - \ln L(\theta)\right)$
     • Poisson model and data are consistent if $C/\nu \sim 1$
        • comparison with best-possible $\nu = 0$ model, $\mu = n$
        • $2\sum (n_i\ln n_i - n_i) - 2\sum (n_i\ln\mu_i - \mu_i) = 2\sum \left[ n_i\ln(n_i/\mu_i) - (n_i - \mu_i) \right]$
        • XSPEC definition
        • What happens when many $\mu_i \ll 1$ and $n_i = 0$?
     Estimate model parameters and their uncertainties
     • Parameter error estimates, $d\theta$, around maximum-likelihood solution, $\theta^*$
        • $-2\ln L(\theta^* + d\theta) = -2\ln L(\theta^*) + 1$ for 1σ (other choices than 1 sometimes made)
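The Poisson goodness-of-fit form and the "increase by 1" rule for 1σ errors can be sketched together. The example below is an illustration under stated assumptions: a one-parameter constant model, invented counts, a bisection search, and hypothetical function names. Bins with $n_i = 0$ contribute $2\mu_i$, since $n\ln n \to 0$ as $n \to 0$.

```python
import math

def cstat(data, model):
    """C = 2 * sum(n_i ln(n_i / mu_i) - (n_i - mu_i)); bins with n_i = 0 give 2 mu_i."""
    total = 0.0
    for n, m in zip(data, model):
        total += m - n
        if n > 0:
            total += n * math.log(n / m)
    return 2.0 * total

def upper_one_sigma(data, delta=1.0):
    """For a constant model mu, return (mu_best, mu_upper) where C rises by delta."""
    n_bins = len(data)
    mu_best = sum(data) / n_bins              # maximum-likelihood constant rate
    c_min = cstat(data, [mu_best] * n_bins)
    lo = mu_best
    hi = mu_best + 10.0 * math.sqrt(mu_best) + 10.0   # generous upper bracket
    for _ in range(200):                      # bisection on C(mu) - C_min = delta
        mid = 0.5 * (lo + hi)
        if cstat(data, [mid] * n_bins) - c_min < delta:
            lo = mid
        else:
            hi = mid
    return mu_best, 0.5 * (lo + hi)

counts = [3, 0, 5, 2, 4, 1, 0, 6]             # invented counts per bin
best, upper = upper_one_sigma(counts)
print("best-fit rate:", best)
print("upper 1-sigma error:", upper - best)
print("empty bins with mu = 0.1 and 0.2 give C =", cstat([0, 0], [0.1, 0.2]))
```

For these numbers the upper error comes out close to the Gaussian estimate $\sqrt{\mu^*/N}$ but slightly larger, reflecting the asymmetry of the Poisson likelihood.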
