Probabilistic verification

Chiara Marsigli, with the help of the WG and Laurie Wilson in particular

Goals of this session
• Increase understanding of scores used for probability forecast verification
• Characteristics, strengths and weaknesses
• Know which scores to choose for different verification questions

Topics
• Introduction: review of essentials of probability forecasts for verification
• Brier score: Accuracy
• Brier skill score: Skill
• Reliability Diagrams: Reliability, resolution and sharpness
• Exercise
• Discrimination
• Exercise
• Relative operating characteristic
• Exercise
• Ensembles: The CRPS and Rank Histogram

Probability forecast
• Applies to a specific, completely defined event
• Examples: Probability of precipitation over 6h
• …
• Question: What does a probability forecast “POP for Melbourne for today (6am to 6pm) is 0.40” mean?

[Figures: the same weather forecast under the deterministic approach and under the probabilistic approach, where alternative outcomes carry probabilities such as 50%, 30% and 20%.]

Deterministic forecast
Event E, e.g.: 24 h accumulated precipitation at one point (rain gauge, radar pixel, catchment, area) exceeds 20 mm.
• The event is observed with frequency o(E): o(E) = 0 (no) or o(E) = 1 (yes)
• The event is forecast with probability p(E): p(E) = 0 (no) or p(E) = 1 (yes)

Probabilistic forecast
Event E, e.g.: 24 h accumulated precipitation at one point (rain gauge, radar pixel, catchment, area) exceeds 20 mm.
• The event is observed with frequency o(E): o(E) = 0 (no) or o(E) = 1 (yes)
• The event is forecast with probability p(E), where p(E) ∈ [0,1]

Ensemble forecast
Event E, e.g.: 24 h accumulated precipitation at one point (rain gauge, radar pixel, catchment, area) exceeds 20 mm.
• The event is observed with frequency o(E): o(E) = 0 (no) or o(E) = 1 (yes)
• Ensemble of M elements: the event is forecast with probability p(E) = k/M, where k is the number of members forecasting the event
• p(E) = 0 if no member forecasts the event, p(E) = 1 if all members do
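
The relation p(E) = k/M can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `ensemble_probability` and the precipitation values are made up for the example, and NumPy is assumed to be available.

```python
import numpy as np

def ensemble_probability(members, threshold):
    """Return p(E) = k/M: the fraction of ensemble members exceeding the threshold.

    members has shape (M, n_points): one row per ensemble member.
    """
    members = np.asarray(members, dtype=float)
    M = members.shape[0]
    k = np.count_nonzero(members > threshold, axis=0)  # members forecasting the event
    return k / M

# Hypothetical 5-member ensemble of 24 h precipitation (mm) at 3 points
precip = np.array([
    [25.0, 3.0, 21.0],
    [18.0, 0.0, 30.0],
    [22.0, 1.0, 19.0],
    [30.0, 0.0, 24.0],
    [12.0, 5.0, 40.0],
])
p = ensemble_probability(precip, threshold=20.0)
print(p)
```

With M members the forecast can only take the M + 1 values 0, 1/M, …, 1, which is why the ensemble size matters for verification.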

[Figures: deterministic approach, probabilistic approach, ensemble forecast.]

Forecast evaluation
• Verification is possible only in a statistical sense, not for a single forecast
• E.g.: correspondence between forecast probabilities and observed frequencies
• Dependence on the ensemble size

Brier Score

$$ BS = \frac{1}{n} \sum_{i=1}^{n} (f_i - o_i)^2 $$

Scalar summary measure for the assessment of the forecast performance: the mean square error of the probability forecast.
• n = number of points in the “domain” (spatio-temporal)
• o_i = 1 if the event occurs, 0 if the event does not occur
• f_i is the probability of occurrence according to the forecast system (e.g. the fraction of ensemble members forecasting the event)
• BS can take on values in the range [0,1], a perfect forecast having BS = 0
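
As a minimal sketch of the definition above (the function name `brier_score` and the sample values are illustrative, not from the slides), the Brier score is simply the mean squared difference between forecast probabilities and binary outcomes:

```python
import numpy as np

def brier_score(f, o):
    """Mean square error of probability forecasts f against binary outcomes o."""
    f = np.asarray(f, dtype=float)
    o = np.asarray(o, dtype=float)
    return float(np.mean((f - o) ** 2))

# Hypothetical sample: four forecasts and the corresponding observations
f = [0.9, 0.1, 0.6, 0.0]
o = [1, 0, 0, 0]
bs = brier_score(f, o)
print(bs)
```

The squared errors here are 0.01, 0.01, 0.36 and 0, so the single poor forecast (0.6 for a non-event) dominates the score.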

Brier Score
• Gives a result for a single forecast, but a perfect score is possible only if the forecast is categorical (probability 0 or 1).
• A “summary” score: measures accuracy, summarized into one value over a dataset.
• Weights larger errors more than smaller ones.
• Sensitive to the climatological frequency of the event: the rarer an event, the easier it is to get a good BS without having any real skill.
• Brier Score decomposition: components of the error

Components of probability error
The Brier score can be decomposed into 3 terms (for K probability classes and a sample of size N):

$$ BS = \frac{1}{N} \sum_{k=1}^{K} n_k (p_k - \bar{o}_k)^2 - \frac{1}{N} \sum_{k=1}^{K} n_k (\bar{o}_k - \bar{o})^2 + \bar{o}(1 - \bar{o}) $$

The three terms are, in order, reliability, resolution and uncertainty.
• Reliability: if, for all occasions when forecast probability p_k is predicted, the observed frequency of the event is \bar{o}_k = p_k, then the forecast is said to be reliable. Similar to bias for a continuous variable.
• Resolution: the ability of the forecast to distinguish situations with distinctly different frequencies of occurrence.
• Uncertainty: the variability of the observations. Maximized when the climatological frequency (base rate) = 0.5. Has nothing to do with forecast quality! Use the Brier skill score to overcome this problem.
The presence of the uncertainty term means that Brier Scores should not be compared on different samples.

Probabilistic forecasts
An accurate probability forecast system has:
• reliability: agreement between forecast probability and mean observed frequency
• sharpness: tendency to forecast probabilities near 0 or 1, as opposed to values clustered around the mean
• resolution: ability of the forecast to resolve the set of sample events into subsets with characteristically different outcomes

Brier Score decomposition, Murphy (1973)

$$ BS = \frac{1}{N} \sum_{k=0}^{M} N_k (f_k - \bar{o}_k)^2 - \frac{1}{N} \sum_{k=0}^{M} N_k (\bar{o}_k - \bar{o})^2 + \bar{o}(1 - \bar{o}) $$

The three terms are, in order, reliability, resolution and uncertainty.
• M = ensemble size
• k = 0, …, M = number of ensemble members forecasting the event (probability classes), with forecast probability f_k = k/M
• N = total number of points in the verification domain
• N_k = number of points where the event is forecast by k members
• \bar{o}_k = \frac{1}{N_k} \sum_{i=1}^{N_k} o_i = frequency of the event in the sub-sample
• \bar{o} = total frequency of the event (sample climatology)

Brier Score decomposition, Murphy (1973)

$$ BS = \frac{1}{N} \sum_{k=0}^{M} N_k (f_k - \bar{o}_k)^2 - \frac{1}{N} \sum_{k=0}^{M} N_k (\bar{o}_k - \bar{o})^2 + \bar{o}(1 - \bar{o}) $$

The first term is a reliability measure: for forecasts that are perfectly reliable, the sub-sample relative frequency is exactly equal to the forecast probability in each sub-sample.
The second term is a resolution measure: if the forecasts sort the observations into sub-samples having substantially different relative frequencies than the overall sample climatology, the resolution term will be large. This is a desirable situation, since the resolution term is subtracted. It is large if there is resolution enough to produce very high and very low probability forecasts.

Brier Score decomposition

$$ BS = \frac{1}{N} \sum_{k=0}^{M} N_k (f_k - \bar{o}_k)^2 - \frac{1}{N} \sum_{k=0}^{M} N_k (\bar{o}_k - \bar{o})^2 + \bar{o}(1 - \bar{o}) $$

The third term, the uncertainty b_unc = \bar{o}(1 - \bar{o}), ranges from 0 to 0.25. If E was either so common, or so rare, that it either always occurred or never occurred within the sample of years studied, then b_unc = 0; in this case, always forecasting the climatological probability generally gives good results. When the climatological probability is near 0.5, there is substantially more uncertainty inherent in the forecasting situation: if E occurred 50% of the time within the sample, then b_unc = 0.25. Uncertainty is a function of the climatological frequency of E, and is not dependent on the forecasting system itself.
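
The decomposition can be checked numerically. The sketch below is illustrative only: the function name `brier_decomposition` and the sample data are assumptions made for the example, with the forecast probability of class k taken as f_k = k/M. It computes the three terms class by class and compares reliability − resolution + uncertainty with the directly computed Brier score.

```python
import numpy as np

def brier_decomposition(k, o, M):
    """Return (reliability, resolution, uncertainty) for ensemble forecasts.

    k: number of members forecasting the event at each point (0..M)
    o: binary observations (1 = event occurred, 0 = it did not)
    M: ensemble size
    """
    k = np.asarray(k, dtype=int)
    o = np.asarray(o, dtype=float)
    N = o.size
    obar = o.mean()                    # sample climatology
    rel = 0.0
    res = 0.0
    for kk in range(M + 1):
        in_class = k == kk
        N_k = in_class.sum()
        if N_k == 0:
            continue                   # empty probability class contributes nothing
        f_k = kk / M                   # forecast probability of this class
        obar_k = o[in_class].mean()    # observed frequency in the sub-sample
        rel += N_k * (f_k - obar_k) ** 2
        res += N_k * (obar_k - obar) ** 2
    unc = obar * (1.0 - obar)
    return rel / N, res / N, unc

# Hypothetical sample: 2-member ensemble verified at 8 points
M = 2
k = [0, 0, 1, 1, 2, 2, 2, 0]
o = [0, 0, 0, 1, 1, 1, 0, 0]
rel, res, unc = brier_decomposition(k, o, M)
bs = np.mean((np.array(k) / M - np.array(o)) ** 2)
print(rel, res, unc, rel - res + unc, bs)
```

The identity is exact here because every forecast in a class shares the same probability k/M; if continuous probabilities are binned instead, the three terms only approximate the Brier score.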

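Brier Score decomposition II, Talagrand et al. (1997)

$$ BS = \bar{o} \sum_{k=0}^{M} H_k \left[ 1 - \frac{k}{M} \right]^2 + (1 - \bar{o}) \sum_{k=0}^{M} F_k \left[ \frac{k}{M} \right]^2 $$

The first term is the Hit Rate term, the second the False Alarm Rate term.
• M = ensemble size
• k = 0, …, M = number of ensemble members forecasting the event (probability classes)
• \bar{o} = total frequency of the event (sample climatology)
• Cumulative hit rate and false alarm rate for class k are obtained by summing over the classes i ≥ k: \sum_{i=k}^{M} H_i and \sum_{i=k}^{M} F_i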
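
A numerical check of this second form is sketched below, under an assumption the slide does not spell out: H_k is taken as the fraction of observed events that were forecast by exactly k members, and F_k as the same fraction among non-events. The function name and the data are illustrative.

```python
import numpy as np

def brier_score_hit_false_alarm(k, o, M):
    """Brier score written as a hit-rate term plus a false-alarm-rate term.

    Assumes the sample contains at least one event and one non-event.
    """
    k = np.asarray(k, dtype=int)
    o = np.asarray(o, dtype=int)
    obar = o.mean()
    classes = np.arange(M + 1)
    # H[c]: share of events forecast by exactly c members (assumed definition)
    H = np.array([np.mean(k[o == 1] == c) for c in classes])
    # F[c]: share of non-events forecast by exactly c members (assumed definition)
    F = np.array([np.mean(k[o == 0] == c) for c in classes])
    hit_term = obar * np.sum(H * (1.0 - classes / M) ** 2)
    false_alarm_term = (1.0 - obar) * np.sum(F * (classes / M) ** 2)
    return hit_term, false_alarm_term

# Hypothetical sample: 2-member ensemble verified at 8 points
M = 2
k = [0, 0, 1, 1, 2, 2, 2, 0]
o = [0, 0, 0, 1, 1, 1, 0, 0]
hit_term, false_alarm_term = brier_score_hit_false_alarm(k, o, M)
bs_direct = np.mean((np.array(k) / M - np.array(o)) ** 2)
print(hit_term, false_alarm_term, hit_term + false_alarm_term, bs_direct)
```

Under that reading the split simply separates the squared errors made on events, (1 − k/M)², from those made on non-events, (k/M)².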

Brier Skill Score
Measures the improvement of the accuracy of the probabilistic forecast relative to a reference forecast (e.g. climatology or persistence):

$$ BSS = \frac{BS_{ref} - BS}{BS_{ref}} $$

The forecast system has predictive skill if BSS is positive, a perfect system having BSS = 1.
If the sample climatology is used as the reference, the score can be expressed as:

$$ BS_{cli} = \bar{o}(1 - \bar{o}), \qquad BSS = \frac{Res - Rel}{Unc} $$
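
A minimal sketch of the skill score with the sample climatology as reference follows. The function name and data are illustrative; the reference score uses the fact that always forecasting the sample frequency gives a Brier score of ō(1 − ō).

```python
import numpy as np

def brier_skill_score_climatology(f, o):
    """BSS relative to a constant forecast of the sample climatology."""
    f = np.asarray(f, dtype=float)
    o = np.asarray(o, dtype=float)
    bs = np.mean((f - o) ** 2)
    obar = o.mean()
    bs_ref = obar * (1.0 - obar)   # Brier score of the climatological forecast
    # Undefined (division by zero) if the event always or never occurs in the sample
    return float((bs_ref - bs) / bs_ref)

# Hypothetical sample: probabilities from a 2-member ensemble at 8 points
f = [0.0, 0.0, 0.5, 0.5, 1.0, 1.0, 1.0, 0.0]
o = [0, 0, 0, 1, 1, 1, 0, 0]
bss = brier_skill_score_climatology(f, o)
print(bss)
```

For this sample BS = 0.1875 and BS_cli = 0.234375, so the forecast improves on climatology by about 20%.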

Brier Score and Skill Score: Summary
• Measure accuracy and skill respectively
• “Summary” scores
• Cautions:
• Cannot compare BS on different samples
• BSS: take care about the underlying climatology
• BSS: take care about small samples

Ranked Probability Score

$$ RPS = \frac{1}{M - 1} \sum_{m=1}^{M} \left[ \left( \sum_{k=1}^{m} f_k \right) - \left( \sum_{k=1}^{m} o_k \right) \right]^2 $$

Extension of the Brier Score to the multi-event situation. The squared errors are computed with respect to the cumulative probabilities in the forecast and observation vectors.
• M = number of forecast categories
• o_k = 1 if the event occurs in category k, 0 if the event does not occur in category k
• f_k is the probability of occurrence in category k according to the forecast system (e.g. the fraction of ensemble members forecasting the event)
• RPS takes on values in the range [0,1], a perfect forecast having RPS = 0
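
The role of the cumulative sums is easiest to see in code. The sketch below is illustrative (function name and category probabilities are made up) and scores one forecast over M ordered categories:

```python
import numpy as np

def ranked_probability_score(f, o):
    """RPS for one forecast over M ordered categories.

    f: forecast probabilities per category (summing to 1)
    o: observation vector, 1 in the observed category and 0 elsewhere
    """
    f = np.asarray(f, dtype=float)
    o = np.asarray(o, dtype=float)
    M = f.shape[-1]
    cum_f = np.cumsum(f, axis=-1)   # cumulative forecast probabilities
    cum_o = np.cumsum(o, axis=-1)   # cumulative observation (step function)
    return float(np.sum((cum_f - cum_o) ** 2, axis=-1) / (M - 1))

# Hypothetical three-category forecast; the middle category is observed
f = [0.2, 0.5, 0.3]
o = [0, 1, 0]
rps = ranked_probability_score(f, o)
print(rps)
```

Because the errors are taken on cumulative probabilities, probability placed in a category far from the observed one is penalised more than probability placed in a neighbouring category.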

Reliability Diagram
o(p) is plotted against p for some finite binning of width dp. In a perfectly reliable system o(p) = p and the graph is a straight line oriented at 45° to the axes.

Reliability Diagram
[Figure: observed frequency against forecast probability, both from 0 to 1, with the diagonal, the horizontal climatology line, the no skill line, and a histogram of the number of forecasts per forecast probability.]
• Reliability: proximity to the diagonal
• Resolution: variation about the horizontal (climatology) line
• No skill line: where reliability and resolution are equal, the Brier skill score goes to 0
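
The points of a reliability diagram can be computed by binning the forecast probabilities. This is a minimal sketch under stated assumptions: equal-width bins on [0, 1], illustrative names (`reliability_curve`, `n_bins`) and made-up data; plotting is left out.

```python
import numpy as np

def reliability_curve(p, o, n_bins=10):
    """Return (mean forecast probability, observed frequency, count) per non-empty bin."""
    p = np.asarray(p, dtype=float)
    o = np.asarray(o, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index of each forecast; p == 1.0 is assigned to the last bin
    idx = np.clip(np.digitize(p, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    sum_p = np.bincount(idx, weights=p, minlength=n_bins)
    sum_o = np.bincount(idx, weights=o, minlength=n_bins)
    used = counts > 0               # skip empty bins to avoid dividing by zero
    return sum_p[used] / counts[used], sum_o[used] / counts[used], counts[used]

# Hypothetical forecasts and observations
p = [0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 0.9, 0.1]
o = [0, 0, 0, 1, 1, 1, 0, 0]
mean_p, obs_freq, counts = reliability_curve(p, o, n_bins=5)
print(mean_p, obs_freq, counts)
```

Plotting `obs_freq` against `mean_p` gives the reliability curve, and `counts` gives the forecast-frequency histogram that indicates sharpness. Bins with few forecasts yield noisy observed frequencies, which is one reason small samples call for caution.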
