Sub-seasonal and seasonal forecast verification


  1. Sub-seasonal and seasonal forecast verification
     Young Scientists School, CITES 2019
     Debbie Hudson (Bureau of Meteorology, Australia)

  2. Overview
     1. Introduction
     2. Attributes of forecast quality
     3. Metrics: full ensemble
     4. Metrics: probabilistic forecasts
     5. Metrics: ensemble mean
     6. Key considerations: sampling issues; stratification; uncertainty; communicating verification

  3. Purposes of ensemble verification
     User-oriented:
     Operations
     • How accurate are the forecasts?
     • Do they enable better decisions than could be made using alternative information (persistence, climatology)?
     Intercomparison and monitoring
     • How do forecast systems differ in performance?
     • How does performance change over time?
     Calibration
     • Assist in bias removal and downscaling
     Research:
     Diagnosis
     • Pinpoint sources of error in the ensemble forecast system
     • Diagnose the impact of model improvements, changes to data assimilation and/or ensemble generation, etc.
     • Diagnose/understand mechanisms and sources of predictability

  4. Evaluating forecast quality
     • Need a large number of forecasts and observations to evaluate ensembles and probability forecasts
     • Forecast quality vs. value
     Attributes of forecast quality:
     • Accuracy
     • Skill
     • Reliability
     • Discrimination and resolution
     • Sharpness

  5. Accuracy and Skill
     Accuracy: overall correspondence/level of agreement between forecasts and observations.
     Skill: a set of forecasts is skilful if better than a reference set, i.e. skill is a comparative quantity.
     Reference set: e.g. persistence, climatology, random.
     Skill score = (Score_forecast - Score_reference) / (Score_perfect - Score_reference)
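As a concrete illustration of the skill-score formula on this slide, here is a minimal Python sketch using the mean squared error against a climatological reference. The function name and the toy numbers are invented for illustration only.

```python
import numpy as np

def skill_score(score_fcst, score_ref, score_perfect=0.0):
    # Generic skill score: 1 = perfect, 0 = no better than the
    # reference, < 0 = worse than the reference.
    return (score_fcst - score_ref) / (score_perfect - score_ref)

# Toy example: MSE skill score against climatology (made-up data).
obs = np.array([21.0, 19.5, 23.2, 18.8, 20.1])
fcst = np.array([20.5, 20.0, 22.0, 19.5, 20.0])
clim = obs.mean()  # climatological "forecast"

mse_fcst = np.mean((fcst - obs) ** 2)
mse_clim = np.mean((clim - obs) ** 2)
print(skill_score(mse_fcst, mse_clim))  # > 0: beats climatology
```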

  6. Reliability
     • Ability to give unbiased probability estimates for dichotomous (yes/no) forecasts
     • Can I trust the probabilities? Defines whether the certainty communicated in the forecasts is appropriate
     • The forecast distribution represents the distribution of observations
     • Reliability can be improved by calibration

  7. Discrimination and Resolution
     Resolution
     • How much does the observed outcome change as the forecast changes, i.e. "Do outcomes differ given different forecasts?"
     • Conditioned on the forecasts
     Discrimination
     • Can different observed outcomes be discriminated by the forecasts?
     • Conditioned on the observations
     Both indicate potential "usefulness" and cannot be improved by calibration

  8. Discrimination
     [Figure: three panels (a)-(c), each showing frequency distributions of the forecasts for observed events vs. observed non-events. In panels (a) and (c) the two distributions are well separated (good discrimination); in panel (b) they overlap (poor discrimination).]

  9. Sharpness
     • Sharpness is the tendency to forecast extreme values (probabilities near 0 or 100%) rather than values clustered around the mean; a forecast of climatology has no sharpness
     • A property of the forecast only
     • Sharp forecasts are "useful", BUT sharp forecasts that are not reliable imply unrealistic confidence

  10. What are we verifying? How are the forecasts being used?
      • Ensemble distribution: the set of forecasts making up the ensemble distribution; use individual members or fit a distribution
      • Probabilistic forecasts generated from the ensemble: create probabilities by applying thresholds
      • Ensemble mean

  11. Commonly used verification metrics: characteristics of the full ensemble
      • Rank histogram
      • Spread vs. skill
      • Continuous Ranked Probability Score (CRPS) (discussed under probability forecasts)

  12. Rank histogram
      Measures consistency and reliability: the observation should be statistically indistinguishable from the ensemble members.
      For each observation, rank the N ensemble members from lowest to highest and identify the rank of the observation with respect to the forecasts, as in the sketch below.
      [Figure: example for a 10-member temperature ensemble (degC); in the three cases shown, the observation ranks 2, 8 and 3 out of the 11 possible ranks.]
      Need lots of samples to evaluate the ensemble.
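A minimal sketch of the ranking step described above, assuming a (cases x members) array. Ties between the observation and members are ignored here; in practice they are often resolved at random.

```python
import numpy as np

def rank_histogram(ensemble, obs):
    # ensemble: (n_cases, n_members); obs: (n_cases,)
    # Rank = 1 + number of members strictly below the observation,
    # giving N+1 possible ranks for N members.
    n_members = ensemble.shape[1]
    ranks = 1 + np.sum(ensemble < obs[:, None], axis=1)
    return np.bincount(ranks, minlength=n_members + 2)[1:]

# Synthetic check: obs drawn from the same distribution as the
# members should give a roughly flat histogram (consistent).
rng = np.random.default_rng(0)
ens = rng.normal(size=(500, 10))  # 10-member ensemble, 500 cases
obs = rng.normal(size=500)
print(rank_histogram(ens, obs))   # 11 bins, roughly equal counts
```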

  13. Rank histogram
      [Figure: rank histograms (counts vs. rank of observation, 1-11) illustrating typical shapes: observations piling up at low ranks (positive/overforecasting bias); at high ranks (negative/underforecasting bias); flat (consistent/reliable); U-shaped (under-dispersive, i.e. overconfident); dome-shaped (over-dispersive, i.e. underconfident).]
      Common problem in seasonal forecasting: the ensemble does not have enough spread.

  14. Rank histogram
      A flat rank histogram does not necessarily indicate a skilful forecast. The rank histogram shows conditional/unconditional biases, BUT not the full picture:
      • It only measures whether the observed probability distribution is well represented by the ensemble
      • It does NOT show sharpness: climatological forecasts are perfectly consistent (flat rank histogram) but not useful

  15. Spread-skill evaluation
      [Figure: RMSE of the ensemble mean vs. ensemble spread (S_ens) for 500 hPa geopotential height (20-60S), for a seasonal prediction system whose ensemble is generated using (A) stochastic physics only. The system is underdispersed (overconfident): S_ens < RMSE.]

  16. Spread-skill evaluation
      [Figure: as slide 15, but comparing ensemble generation by (A) stochastic physics only with (B) stochastic physics AND perturbed initial conditions. Interpretation: underdispersed (overconfident) when S_ens < RMSE; consistent/reliable when S_ens ≈ RMSE; overdispersed (underconfident) when S_ens > RMSE. Hudson et al. (2017).]
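A minimal sketch of the spread-skill comparison in these figures: the RMSE of the ensemble mean against the average ensemble spread S_ens. Definitions of spread vary slightly between centres; this uses the simple mean ensemble variance, and the data are synthetic.

```python
import numpy as np

def spread_and_rmse(ensemble, obs):
    # ensemble: (n_cases, n_members); obs: (n_cases,)
    ens_mean = ensemble.mean(axis=1)
    rmse = np.sqrt(np.mean((ens_mean - obs) ** 2))
    # S_ens: square root of the mean ensemble variance
    spread = np.sqrt(np.mean(ensemble.var(axis=1, ddof=1)))
    return spread, rmse

# Synthetic consistent ensemble: members and observation are drawn
# from the same distribution about a common truth, so S_ens should
# come out close to the RMSE of the ensemble mean.
rng = np.random.default_rng(1)
truth = rng.normal(size=1000)
ens = truth[:, None] + rng.normal(size=(1000, 11))
obs = truth + rng.normal(size=1000)
s, r = spread_and_rmse(ens, obs)
print(f"S_ens={s:.2f}  RMSE={r:.2f}")  # S_ens ~ RMSE: reliable
```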

  17. Commonly used verification metrics: probability forecasts
      • Reliability/attributes diagram
      • Brier Score (BS and BSS)
      • Ranked Probability Score (RPS and RPSS)
      • Continuous Ranked Probability Score (CRPS and CRPSS)
      • Relative Operating Characteristic (ROC and ROCS)
      • Generalized Discrimination Score (GDS)
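Of the scores listed above, the Brier score is the simplest to write down. A minimal sketch, with the Brier skill score taken against the climatological base rate; the data are synthetic and for illustration only.

```python
import numpy as np

def brier_score(p_fcst, outcome):
    # Mean squared error of probability forecasts for a
    # binary (0/1) event; 0 is perfect, 1 is worst.
    return np.mean((p_fcst - outcome) ** 2)

def brier_skill_score(p_fcst, outcome):
    # BSS = 1 - BS / BS_climatology: 1 = perfect,
    # 0 = no better than climatology, < 0 = worse.
    bs_clim = brier_score(np.full_like(p_fcst, outcome.mean()), outcome)
    return 1.0 - brier_score(p_fcst, outcome) / bs_clim

# Synthetic example: forecasts carrying some information about the event.
rng = np.random.default_rng(2)
outcome = (rng.random(2000) < 0.3).astype(float)
p_fcst = np.clip(0.2 + 0.5 * outcome + rng.normal(0, 0.1, 2000), 0, 1)
print(brier_skill_score(p_fcst, outcome))  # positive: skilful
```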

  18. Reliability (attributes) diagram
      For dichotomous forecasts. Measures how well the predicted probabilities of an event correspond to their observed frequencies (reliability). Conditioned on the forecasts.
      • Plot observed frequency against forecast probability for all probability categories (see the sketch below)
      • Need a big enough sample
      [Figure: reliability diagram, observed relative frequency (0-1) vs. forecast probability (0-1). The curve tells what the observed frequency was for a given forecast probability; the horizontal "no resolution" line corresponds to climatology. An inset histogram of how often each probability was issued shows sharpness and potential sampling issues.]
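A minimal sketch of the binning behind a reliability diagram, assuming equal-width probability bins; it returns the curve (mean forecast probability vs. observed frequency per bin) and the bin counts used for the inset sharpness histogram.

```python
import numpy as np

def reliability_curve(p_fcst, outcome, n_bins=10):
    # Assign each forecast probability to one of n_bins
    # equal-width bins on [0, 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(p_fcst, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    mean_p = np.full(n_bins, np.nan)    # mean forecast prob per bin
    obs_freq = np.full(n_bins, np.nan)  # observed frequency per bin
    for b in range(n_bins):
        if counts[b] > 0:
            mean_p[b] = p_fcst[idx == b].mean()
            obs_freq[b] = outcome[idx == b].mean()
    return mean_p, obs_freq, counts
```

To draw the diagram, plot obs_freq against mean_p together with the 1:1 diagonal; bins with few forecasts (small counts) are subject to the sampling issues noted on the slide.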

  19. Interpretation of reliability diagrams
      [Figure: four example reliability diagrams (observed frequency vs. forecast probability): no resolution; underforecasting; overconfident; probably under-sampled.]
