Spatial forecast verification


Spatial forecast verification. Manfred Dorninger, University of Vienna, Vienna, Austria (manfred.dorninger@univie.ac.at). Thanks to: B. Ebert, B. Casati, C. Keil. 7th Verification Tutorial Course, Berlin, 3-6 May 2017.


  1. Spatial forecast verification. Manfred Dorninger, University of Vienna, Vienna, Austria, manfred.dorninger@univie.ac.at. Thanks to: B. Ebert, B. Casati, C. Keil. 7th Verification Tutorial Course, Berlin, 3-6 May 2017

  2. Motivation: Model – VERA: RMSE of pmsl, 13-24 h: Aladin (1.76), ECMWF (1.33), LM (1.80). [Figure: RMSE panels; grid spacings shown: ~9 km, ~25 km, ~2 km]

  3. Motivation: the "double penalty" problem
     [Figure panels: Analysis; FC-model I (coarse); FC-model II (fine); FC-AN differences (+ / -)]
     • The fine-scale model catches the small-scale trough, but at the wrong place (or time)
     • It gets penalized twice
     • This increases quadratic measures compared to the coarse model
     • The same is true for other continuous variables (e.g., precipitation, wind speed, etc.)

  4. Traditional spatial verification (grid-point-wise approach)
     Compute statistics on forecast-observation pairs:
     – Continuous values (e.g., precipitation amount, temperature, NWP variables):
       • mean error, MSE, RMSE, correlation
       • anomaly correlation, S1 score
     – Categorical values (e.g., precipitation occurrence):
       • Contingency table statistics (POD, FAR, etc.)

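  5. Anomaly correlation
     Uncentered:
     $$ AC = \frac{\sum (F - C)(O - C)}{\sqrt{\sum (F - C)^2 \, \sum (O - C)^2}} $$
     Centered:
     $$ AC = \frac{\sum \left[(F - C) - \overline{(F - C)}\right]\left[(O - C) - \overline{(O - C)}\right]}{\sqrt{\sum \left[(F - C) - \overline{(F - C)}\right]^2 \, \sum \left[(O - C) - \overline{(O - C)}\right]^2}} $$
     where F is the forecast, O the observation, and C the climatology.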
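
A minimal sketch of the two anomaly correlation variants on this slide, assuming the fields are given as flat lists of paired grid-point values and that C is a climatology on the same grid. The function name and the sample numbers are illustrative, not taken from the talk.

```python
import math

def anomaly_correlation(forecast, observed, climatology, centered=False):
    """Anomaly correlation over paired grid-point values (flat lists)."""
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    o_anom = [o - c for o, c in zip(observed, climatology)]
    if centered:
        # Centered variant: remove the domain-mean anomaly before correlating.
        f_mean = sum(f_anom) / len(f_anom)
        o_mean = sum(o_anom) / len(o_anom)
        f_anom = [a - f_mean for a in f_anom]
        o_anom = [a - o_mean for a in o_anom]
    numerator = sum(a * b for a, b in zip(f_anom, o_anom))
    denominator = math.sqrt(sum(a * a for a in f_anom) * sum(b * b for b in o_anom))
    return numerator / denominator

# Illustrative values: the forecast is the observed pattern plus a uniform +1 bias.
clim = [10.0, 10.0, 10.0, 10.0]
obs = [11.0, 9.0, 12.0, 8.0]
fcst = [12.0, 10.0, 13.0, 9.0]
ac_uncentered = anomaly_correlation(fcst, obs, clim)
ac_centered = anomaly_correlation(fcst, obs, clim, centered=True)
```

In this example the centered score ignores the uniform bias, while the uncentered score is reduced by it.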

  6. Traditional spatial verification using categorical scores
     Contingency table:
                        Observed yes    Observed no
       Predicted yes    hits            false alarms
       Predicted no     misses          correct negatives
     [Figure: overlapping forecast and observed areas marking hits, misses and false alarms; axes x, y]
     $$ FBI = \frac{hits + false\ alarms}{hits + misses} $$
     $$ POD = \frac{hits}{hits + misses} \qquad FAR = \frac{false\ alarms}{hits + false\ alarms} $$
     $$ TS = \frac{hits}{hits + misses + false\ alarms} $$
     $$ ETS = \frac{hits - hits_{random}}{hits + misses + false\ alarms - hits_{random}} $$
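
The scores on this slide can be sketched as a small function of the four contingency table counts. The counts below are made up for illustration, and the expression for the random hits (the hits expected by chance given the forecast and observed event frequencies) is the usual definition, which the slide does not spell out.

```python
def categorical_scores(hits, misses, false_alarms, correct_negatives):
    """Contingency table scores from the four cell counts."""
    total = hits + misses + false_alarms + correct_negatives
    # Hits expected by chance for the same forecast and observed event frequencies
    # (assumed standard definition, not written out on the slide).
    hits_random = (hits + misses) * (hits + false_alarms) / total
    return {
        "FBI": (hits + false_alarms) / (hits + misses),
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "TS": hits / (hits + misses + false_alarms),
        "ETS": (hits - hits_random) / (hits + misses + false_alarms - hits_random),
    }

# Illustrative counts only.
scores = categorical_scores(hits=30, misses=20, false_alarms=10, correct_negatives=40)
```

TS is the same quantity that the next slide reports as CSI.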

  7. POD = 0.39, FAR = 0.63, CSI = 0.24

  8. Traditional spatial verification
     • Requires an exact match between forecasts and observations at every grid point
     → Problem of "double penalty": event predicted where it did not occur, no event predicted where it did occur
     [Figure: hi-res forecast (fcst 10, obs 10) and low-res forecast (fcst 3, obs 10)]
       Hi res forecast:  RMS ~ 4.7, POD = 0, FAR = 1, TS = 0
       Low res forecast: RMS ~ 2.7, POD ~ 1, FAR ~ 0.7, TS ~ 0.3
     → Traditional scores do not say very much about the source or nature of the errors
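
The double penalty can be reproduced on a toy one-dimensional grid. This is a sketch with invented values, not the fields shown on the slide: a sharp forecast places the right feature at the wrong location, while a smooth forecast spreads a weak signal over a wide area. An event is assumed to mean a value at or above the threshold.

```python
import math

def rmse(forecast, observed):
    """Root-mean-square error over paired grid-point values."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed))

def pod_far(forecast, observed, threshold):
    """POD and FAR, counting an event where the value is at or above the threshold."""
    hits = sum(f >= threshold and o >= threshold for f, o in zip(forecast, observed))
    misses = sum(f < threshold and o >= threshold for f, o in zip(forecast, observed))
    false_alarms = sum(f >= threshold and o < threshold for f, o in zip(forecast, observed))
    return hits / (hits + misses), false_alarms / (hits + false_alarms)

# Invented 1-D fields (not the slide's data).
observed = [0, 0, 10, 10, 0, 0, 0, 0, 0, 0]
sharp = [0, 0, 0, 0, 0, 0, 10, 10, 0, 0]    # correct intensity and size, displaced
smooth = [0, 3, 3, 3, 3, 3, 3, 3, 3, 0]     # weak, broad signal covering the feature

rmse_sharp, rmse_smooth = rmse(sharp, observed), rmse(smooth, observed)
scores_sharp = pod_far(sharp, observed, threshold=1.0)
scores_smooth = pod_far(smooth, observed, threshold=1.0)
```

The sharp forecast is charged once for the miss and once for the false alarm, so it ends up with the larger RMSE and with POD = 0, FAR = 1, even though it looks more realistic.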

  9. What do traditional scores not tell us?
     • Traditional approaches provide overall measures of skill but…
     • They don't provide much diagnostic information about the forecast:
       – What went wrong? What went right?
       – How close is the forecast to the observation (in terms of spatial thinking)?
       – Does the forecast look realistic?
       – How can I improve this forecast?
       – How can I use it to make a decision?
     • Best performance for smooth forecasts!
     • Some scores are insensitive to the size of the errors…

  10. Spatial forecasts
      [Figure panels: WRF model; Stage II radar]
      Weather variables defined over spatial domains have coherent spatial structure and features.
      New spatial verification techniques aim to:
      • account for field spatial structure
      • provide information on error in physical terms
      • account for uncertainties in location (and timing)

  11. Spatial verification types
      • Neighborhood (fuzzy) verification methods → give credit to "close" forecasts
      • Scale separation methods → measure scale-dependent error
      • Features-based methods → evaluate attributes of identifiable features
      • Field deformation → evaluate phase errors

  12. Spatial verification types (Gilleland et al., 2009)

  13. Gilleland et al., 2009

  14. Neighborhood (fuzzy) verification methods → give credit to "close" forecasts
      [Figure: two forecast/observation pairs (fcst 10, obs 10 in both), one labeled "not close", the other "close"]

  15. Neighborhood verification methods
      • Don't require an exact match between forecasts and observations
        – Unpredictable scales
        – Uncertainty in observations
      → Look in a space / time neighborhood around the point of interest
      → Evaluate using categorical, continuous, probabilistic scores / methods
      Why is it called "fuzzy"? Squint your eyes!
      [Figure: neighborhoods at times t - 1, t, t + 1; frequency distribution of forecast values; observation and forecast panels]

  16. Neighborhood verification methods
      Treatment of forecast data within a window:
      – Mean value (upscaling)
      – Occurrence of event* somewhere in window
      – Frequency of events in window → probability
      – Distribution of values within window
      May also look in a neighborhood of observations.
      [Figure: frequency distributions of rainfall for observation and forecast]
      * Event defined as a value exceeding a given threshold, for example, rain exceeding 1 mm/hr
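
The four window treatments listed on this slide can be sketched for a single window as follows. The helper names, the square window clipped at the domain edge, and the small rain field are assumptions made for illustration; an event is taken here as a value at or above the threshold.

```python
def window_values(field, row, col, half_width):
    """Values in the square window centred on (row, col), clipped at the domain edge."""
    rows = range(max(0, row - half_width), min(len(field), row + half_width + 1))
    cols = range(max(0, col - half_width), min(len(field[0]), col + half_width + 1))
    return [field[r][c] for r in rows for c in cols]

def summarize_window(values, threshold):
    """The four treatments of data within one window."""
    events = [v >= threshold for v in values]
    return {
        "mean": sum(values) / len(values),            # mean value (upscaling)
        "event_anywhere": any(events),                # event somewhere in the window
        "event_fraction": sum(events) / len(events),  # frequency of events -> probability
        "sorted_values": sorted(values),              # empirical distribution of values
    }

# Invented rain rates in mm/hr on a 5 x 5 grid.
rain = [
    [0.0, 0.0, 0.2, 0.0, 0.0],
    [0.0, 1.5, 2.0, 0.0, 0.0],
    [0.0, 0.5, 3.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
summary = summarize_window(window_values(rain, row=2, col=2, half_width=1), threshold=1.0)
```

Applying the same summary to the observations gives the neighborhood-observation variants mentioned on the slide.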

  17. Moving windows
      For each combination of neighborhood size and intensity threshold, accumulate scores as windows are moved through the domain.
      [Figure: observation and forecast windows at successive positions]

  18. Neighborhood verification framework
      Neighborhood methods use one of two approaches to compare forecasts and observations:
      • single observation – neighborhood forecast (SO-NF, user-oriented)
      • neighborhood observation – neighborhood forecast (NO-NF, model-oriented)
      [Figure: observation and forecast windows for each approach]

  19. Different neighborhood verification methods have different decision models for what makes a useful forecast
      [Table from Ebert, Meteorol. Appl., 2008]
      * NO-NF = neighborhood observation-neighborhood forecast, SO-NF = single observation-neighborhood forecast

  20. Detailed description of Fractions Skill Score (FSS) (Roberts and Lean, 2008)
      • We want to know:
        – How forecast skill varies with neighborhood size
        – The smallest neighborhood size that can be used to give sufficiently accurate forecasts
        – Whether higher-resolution NWP provides more accurate forecasts on scales of interest (e.g., river catchments)
      Step 1: Forecast (FC) and observation/analysis (AN) have to be on the same grid.
      Step 2: Choose suitable thresholds q (e.g., 0.5, 1, 2, 4 mm).
      Step 3: Convert the FC/AN fields to binary fields I_O and I_M according to the threshold.

  21. Detailed description of Fractions Skill Score (FSS) (Roberts and Lean, 2008)
      Step 4: Generate fractions for all thresholds.
      [Figure panels: P_obs 1x1, P_fcst 1x1, P_obs 35x35, P_fcst 35x35]
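
Steps 3 and 4 can be sketched as a thresholding followed by a moving-window event count. This is a minimal illustration under stated assumptions: the window length is odd, an event is a value at or above the threshold, and points outside the domain count as "no event". The function name and test field are hypothetical.

```python
def neighborhood_fractions(field, threshold, window):
    """Fraction of event points in a window x window box centred on each grid point.

    Assumes an odd window length; points outside the domain count as no event.
    """
    n_rows, n_cols = len(field), len(field[0])
    # Step 3: binary field, 1 where the value is at or above the threshold.
    binary = [[1 if value >= threshold else 0 for value in row] for row in field]
    half = window // 2
    fractions = []
    for i in range(n_rows):
        row_out = []
        for j in range(n_cols):
            count = 0
            for r in range(i - half, i + half + 1):
                for c in range(j - half, j + half + 1):
                    if 0 <= r < n_rows and 0 <= c < n_cols:
                        count += binary[r][c]
            # Step 4: event fraction within the window.
            row_out.append(count / (window * window))
        fractions.append(row_out)
    return fractions

# Invented 4 x 4 field with a single rainy grid point.
field = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
p1 = neighborhood_fractions(field, threshold=1.0, window=1)  # 1x1: the binary field itself
p3 = neighborhood_fractions(field, threshold=1.0, window=3)  # 3x3: smoothed fractions
```

With a 1x1 window the fractions are just the binary field; larger windows spread each event over its surroundings, which is what the 35x35 panels on the slide show.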

  22. Detailed description of Fractions Skill Score (FSS) (Roberts and Lean, 2008)
      Step 5: Compute the fractions skill score for all thresholds:
      $$ FSS = 1 - \frac{\frac{1}{N}\sum_{i=1}^{N} (P_{fcst} - P_{obs})^2}{\frac{1}{N}\sum_{i=1}^{N} P_{fcst}^2 + \frac{1}{N}\sum_{i=1}^{N} P_{obs}^2} $$
      Maximum estimate (low-skill reference) of the MSE, obtained by dropping the non-negative cross term:
      $$ (P_{fcst} - P_{obs})^2 = P_{fcst}^2 - 2 P_{fcst} P_{obs} + P_{obs}^2 \le P_{fcst}^2 + P_{obs}^2 = MSE_{ref} $$
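
A sketch of step 5, assuming the forecast and observed fractions are already available as two equally shaped 2-D lists for one threshold and one window size. Returning NaN when neither field contains an event is a choice made here, not something the slide specifies.

```python
def fractions_skill_score(p_fcst, p_obs):
    """FSS from forecast and observed neighborhood fractions on the same grid."""
    pairs = [(pf, po) for row_f, row_o in zip(p_fcst, p_obs) for pf, po in zip(row_f, row_o)]
    n = len(pairs)
    mse = sum((pf - po) ** 2 for pf, po in pairs) / n
    # Low-skill reference: the MSE with the cross term dropped.
    mse_ref = sum(pf ** 2 for pf, _ in pairs) / n + sum(po ** 2 for _, po in pairs) / n
    if mse_ref == 0.0:
        return float("nan")  # no events in either field: score undefined (assumed convention)
    return 1.0 - mse / mse_ref

# Invented fraction fields.
fss_perfect = fractions_skill_score([[1.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]])
fss_no_overlap = fractions_skill_score([[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]])
fss_partial = fractions_skill_score([[0.5, 0.5]], [[1.0, 0.0]])
```

The score is 1 for identical fraction fields and 0 when the forecast and observed fractions never overlap.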

  23. Detailed description of Fractions Skill Score (FSS) (Roberts and Lean, 2008)
      Step 6: Graphical presentation for each threshold and spatial scale.
      Interpretation:
      • Skill increases with spatial scale
      • The smaller the displacement error, the faster the skill increases with increasing spatial scale
      • When the length of the moving window is smaller than or equal to the displacement error, there is no skill and FSS = 0

  24. Detailed description of Fractions Skill Score (FSS) (Roberts and Lean, 2008)
      Q: What happens if the size of the moving window is equal to the domain size?
      Q: What are useful (skillful) values of FSS?
      $$ FSS_{target} = 0.5 + \frac{f_0}{2} $$
      where f_0 is the domain observed fraction on the grid scale. For f_0 = 0.2 (20%), the target skill is FSS = 0.5 + 0.2/2 = 0.6.
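
Both questions can be explored numerically. The sketch below assumes that a window as large as the domain reduces each field to a single number, its domain event fraction, so the FSS formula is applied to that one pair of fractions. The function names and the forecast fraction of 0.4 are illustrative.

```python
def useful_fss_threshold(obs_fraction):
    """Target skill from the slide: FSS = 0.5 + f_0 / 2."""
    return 0.5 + obs_fraction / 2.0

def whole_domain_fss(fcst_fraction, obs_fraction):
    """FSS when one window covers the domain (assumption: a single pair of fractions)."""
    mse = (fcst_fraction - obs_fraction) ** 2
    mse_ref = fcst_fraction ** 2 + obs_fraction ** 2
    return 1.0 - mse / mse_ref

target = useful_fss_threshold(0.2)         # the slide's example, f_0 = 0.2
fss_unbiased = whole_domain_fss(0.2, 0.2)  # forecast and observed event fractions agree
fss_biased = whole_domain_fss(0.4, 0.2)    # forecast has twice as many events
```

Under this assumption the largest-scale FSS depends only on the frequency bias: it reaches 1 for an unbiased forecast and stays below 1 otherwise, regardless of where the events are placed.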

  25. Detailed description of Fractions Skill Score (FSS) (Roberts and Lean, 2008)

  26. Scale separation methods → scale-dependent error
      1. Which spatial scales are well represented and which scales have error?
      2. How does the skill depend on the precipitation intensity?
      NOTE: scale = single-band spatial filter → features of different scales → feedback on different physical processes and model parameterizations.
      In neighborhood-based (fuzzy) verification, the scale is the neighborhood size (low-pass filter): as the scale increases, the exact positioning requirements are more and more relaxed.

  27. What is the difference between neighborhood and scale separation approaches?
      • Neighborhood verification methods → get scale information by filtering out higher-resolution scales
      • Scale separation methods → get scale information by isolating scales of interest

  28. Nimrod case study: intense storm displaced
      Step 1: Gridded data, square domain with dimension 2^n.
      It can be applied to any meteorological field; however, it was specifically designed for spatial precipitation forecasts.

  29. Step 2: Intensity: threshold to obtain binary images (categorical approach)
      [Figure panels for u = 1 mm/h: Binary Analysis; Binary Forecast; Binary Error Image with values -1, 0, 1]
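
Step 2 can be sketched as follows, assuming the binary error is the binary forecast minus the binary analysis, which gives the three values -1, 0 and 1 shown in the figure. Treating values at or above the threshold u as events is an assumption, and the 2 x 2 fields are invented.

```python
def binary_error(forecast, analysis, threshold):
    """Binary forecast minus binary analysis: 1 = false alarm, -1 = miss, 0 = agreement."""
    return [
        [int(f >= threshold) - int(a >= threshold) for f, a in zip(row_f, row_a)]
        for row_f, row_a in zip(forecast, analysis)
    ]

# Invented rain rates in mm/h.
forecast = [[0.0, 2.0], [0.0, 1.5]]
analysis = [[3.0, 0.0], [0.0, 4.0]]
error = binary_error(forecast, analysis, threshold=1.0)
```

Hits and correct negatives both map to 0, so the error image only carries the misses and false alarms.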

  30. Step 3: Scale → wavelet decomposition of the binary error
      [Figure: scale components of the binary error, color scale -1 to 1: mean (1280 km), scale l=8 (640 km), l=7 (320 km), l=6 (160 km), l=5 (80 km), l=4 (40 km), l=3 (20 km), l=2 (10 km), l=1 (5 km)]
      $$ E_u = \sum_{l=1}^{L} E_{u,l} \qquad MSE_u = \sum_{l=1}^{L} MSE_{u,l} $$
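
The additivity of the error and of its MSE across scales can be checked with a small sketch. It assumes a Haar-type decomposition built from dyadic block averages (the slide only says "wavelet decomposition"): each scale component is the difference between two successive block averages, and the last component is the domain mean. All names and the 4 x 4 error field are illustrative.

```python
def block_average(field, block):
    """Replace each block x block tile of a square field by its mean."""
    size = len(field)
    out = [[0.0] * size for _ in range(size)]
    for r0 in range(0, size, block):
        for c0 in range(0, size, block):
            tile_mean = sum(
                field[r][c] for r in range(r0, r0 + block) for c in range(c0, c0 + block)
            ) / (block * block)
            for r in range(r0, r0 + block):
                for c in range(c0, c0 + block):
                    out[r][c] = tile_mean
    return out

def haar_scale_components(error):
    """Split a 2^n x 2^n error field into n detail components plus the domain mean."""
    size = len(error)
    components = []
    finer = [[float(v) for v in row] for row in error]
    block = 2
    while block <= size:
        coarser = block_average(error, block)
        # Detail at this scale: what is lost when smoothing to the next coarser blocks.
        components.append([[f - c for f, c in zip(rf, rc)] for rf, rc in zip(finer, coarser)])
        finer = coarser
        block *= 2
    components.append(finer)  # last component: the domain mean
    return components

def mean_square(field):
    """Mean of the squared values of a 2-D field."""
    return sum(v * v for row in field for v in row) / (len(field) * len(field[0]))

# Invented binary error on a 4 x 4 domain (n = 2).
error = [
    [1, 0, 0, 0],
    [0, 0, 0, -1],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
components = haar_scale_components(error)
mse_by_scale = [mean_square(component) for component in components]
```

Because the components are mutually orthogonal, their mean squares add up to the mean square of the full binary error, which is the second identity on the slide.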

  31. Step 4: MSE skill score for each threshold and scale component
      $$ SS_{u,l} = \frac{MSE_{u,l} - MSE_{u,l,random}}{MSE_{u,l,best} - MSE_{u,l,random}} = 1 - \frac{MSE_{u,l}}{2\varepsilon(1 - \varepsilon)/L} $$
      where ε is the sample climatology (base rate).
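
A sketch of the step 4 skill score, assuming the best MSE is zero and the random reference 2ε(1 - ε) is split equally over the L scale components, as the formula on this slide suggests. The function name and the numbers are illustrative.

```python
def scale_skill_score(mse_component, base_rate, n_components):
    """Skill of one scale component relative to an equally partitioned random reference."""
    # MSE of a random binary forecast with the observed base rate, shared over L components.
    mse_random = 2.0 * base_rate * (1.0 - base_rate) / n_components
    return 1.0 - mse_component / mse_random

# Illustrative values: base rate 0.1 and L = 8 components.
reference = 2.0 * 0.1 * (1.0 - 0.1) / 8
ss_perfect = scale_skill_score(0.0, base_rate=0.1, n_components=8)
ss_random = scale_skill_score(reference, base_rate=0.1, n_components=8)
ss_worse = scale_skill_score(2.0 * reference, base_rate=0.1, n_components=8)
```

Positive values mean the component has less error than the random reference at that scale, zero means no skill, and negative values mean the error at that scale is larger than for a random forecast.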
