

  1. Progress on Nowcasting convection occurrence from space-borne instability predictors. P. Antonelli (1), A. Manzato (2), T. Cherubini (3), S. Tjemkes (4), R. Stuhlmann (4), E. Holm (5), C. Serio (6), G. Masiello (6). (1) Space Science Engineering Center - University of Wisconsin - Madison; (2) OSMER Arpa Friuli Venezia Giulia; (3) Mauna Kea Weather Center; (4) EUMETSAT; (5) ECMWF; (6) University of Basilicata. Workshop on NWC application using MTG-IRS, Darmstadt, 25-26 July 2013. Thursday, July 25, 13

  2. History PART I Previous Studies
  • The project started in 2010 and received contributions, in various forms, from several people at ARPA-FVG (OSMER), CNMCA, the Italian Civil Protection, and the Space Science Engineering Center;
  • the project aimed to derive instability indices from IASI Level 2 products and to improve short-term forecasts of severe convective events;
  • in 3 years the project expanded to involve ECMWF (through EUMETSAT), AER (which provided OSS), and the University of Hawaii (MKWC);
  • the project represents a significant effort toward achieving the standards needed to make products from high-spectral-resolution IR data available to forecasters.

  3. Area of interest PART I Previous Studies. Generation of the IASI Full dataset

  4. Strategy PART I Previous Studies
  • define the occurrence of a convective event by setting a threshold of 10 strikes, converting the discrete distribution of lightning strikes into a binary output event yes/no (1/0);
  • build the Full Dataset with the occurrence of the convective event (yes/no) and the values of all available predictors;
  • divide the Full Dataset into 2 subsets: 1) the Total Set, used to build the classifier; 2) the Test Set, used for the final evaluation of the classifier's predictive capacity;
  • divide, in 12 different ways, the Total Set into: 1) a Training Set (75%) and 2) a Validation Set (25%), both used to sub-select the optimal predictors using the Repeated Holdout technique [Witten:2005]
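The splitting scheme above can be sketched as follows. This is a minimal illustration, not the authors' code: the dataset contents, the fixed seed, and the function name are placeholders; only the 12 repetitions and the 75/25 split come from the slide.

```python
import random

def repeated_holdout_splits(total_set, n_repeats=12, train_frac=0.75, seed=0):
    """Split the Total Set into (Training 75%, Validation 25%) pairs,
    repeated n_repeats times, as in the Repeated Holdout technique."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        shuffled = total_set[:]          # copy; leave the original intact
        rng.shuffle(shuffled)
        n_train = int(len(shuffled) * train_frac)
        splits.append((shuffled[:n_train], shuffled[n_train:]))
    return splits

# Example: 100 dummy cases -> 12 (75-case, 25-case) splits
splits = repeated_holdout_splits(list(range(100)))
```

Each split reuses the whole Total Set, so every case appears in several Training and Validation instances across the 12 repetitions.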

  5. Preprocessing PART I Previous Studies
  • divide the inputs into 21 bins and, for each bin i, calculate the ratio: ψ_i(x) = N_i^act / N_i^tot

  6. Preprocessing PART I Previous Studies
  • divide the inputs into 21 bins and, for each bin i, calculate the ratio: ψ_i(x) = N_i^act / N_i^tot
  • these values are then fit with ad-hoc functions to define, for each predictor (for example KI), an empirical posterior probability (EPP(KI)), i.e. a mathematical relationship which associates the probability of event=1 with continuous values of the predictor, and to use it as pre-processing;

  7. Preprocessing PART I Previous Studies
  • divide the inputs into 21 bins and, for each bin i, calculate the ratio: ψ_i(x) = N_i^act / N_i^tot
  • these values are then fit with ad-hoc functions to define, for each predictor (for example KI), an empirical posterior probability (EPP(KI)), i.e. a mathematical relationship which associates the probability of event=1 with continuous values of the predictor, and to use it as pre-processing:
  EPP(KI) = 0.33 · exp(0.074 · (KI − 20)) → x
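The binning ratio ψ_i and the fitted EPP for the K index can be sketched as below. The bin count (21) and the fit coefficients (0.33, 0.074, offset 20) come from the slides; the binning details (equal-width bins over the data range) and the function names are assumptions.

```python
import math

def bin_activation_ratio(x, event, n_bins=21):
    """psi_i(x) = N_i^act / N_i^tot: the fraction of event=1 cases in each
    of n_bins equal-width bins of the predictor x."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins
    n_tot = [0] * n_bins
    n_act = [0] * n_bins
    for xi, ev in zip(x, event):
        i = min(int((xi - lo) / width), n_bins - 1)  # clamp the top edge
        n_tot[i] += 1
        n_act[i] += ev
    return [a / t if t else 0.0 for a, t in zip(n_act, n_tot)]

def epp_ki(ki):
    """Fitted empirical posterior probability for the K index (slide 7)."""
    return 0.33 * math.exp(0.074 * (ki - 20.0))

# One case per bin, all with event=1, gives psi_i = 1 everywhere
ratios = bin_activation_ratio([i + 0.5 for i in range(21)], [1] * 21)
```

EPP(20) = 0.33, i.e. the fitted curve passes through a 33% event probability at KI = 20.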

  8. Strategy PART I Previous Studies
  • implement a forward selection algorithm (based on Artificial Neural Networks, namely a single-layer feedforward network trained with backpropagation [Manzato:2004, Manzato:2007]) to choose the optimal subset of predictors;
  • the ANN first chooses the one predictor that gives the best classification of the event occurrences, starting from its empirical probability distribution. It then selects the predictor which gives the best fit when used together with the first one. New predictors are added until the system's predictive skill stops increasing. During the input selection process, the number of hidden neurons varies according to a predefined function of the number of inputs;
  • the number of input predictors was chosen taking into consideration the mean skill of the 12 ANNs built with the different instances of the Total Set. The prediction skill of the ANN was measured by the mean cross-entropy error (CEE):
  CEE = − Σ_{n=1}^{N} [ t_n ln(y_n) + (1 − t_n) ln(1 − y_n) ]
  • where y_n is the output of the ANN and t_n is the boolean for the convective event (1|yes, 0|no), calculated over the 12 instances of the Validation Sets;
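The CEE above, written out directly (a transcription of the slide's formula; the example targets and outputs are illustrative, not project data):

```python
import math

def cross_entropy_error(t, y):
    """CEE = -sum_n [ t_n ln(y_n) + (1 - t_n) ln(1 - y_n) ]
    t: boolean targets (1 = convective event, 0 = no event)
    y: continuous ANN outputs, strictly inside (0, 1)."""
    return -sum(tn * math.log(yn) + (1 - tn) * math.log(1 - yn)
                for tn, yn in zip(t, y))

# Two confident, correct predictions: CEE = -2 ln(0.9) ~ 0.211
cee = cross_entropy_error([1, 0], [0.9, 0.1])
```

Lower CEE means better probabilistic fit; confident wrong outputs are penalized heavily, which is why it drives both the input selection and the architecture choice.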

  9. TV diagram PART I Previous Studies

  10. TV diagram PART I Previous Studies

  11. Strategy PART I Previous Studies
  • once the optimal subset of predictors was identified, the final ANN architecture was chosen among different candidates (different numbers of hidden neurons in the hidden layer) as the one with the lowest combined CEE on the Total Set (Training + Validation) without overfitting it, that is, with similar performance also on the independent Test Set;
  [Diagram: feedforward network with inputs X1..X6, weights W, hidden neurons H1, H2 with activation φ(Σwx + α), and a single output o]
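The architecture in the diagram (inputs X1..X6, one hidden layer, logistic activation φ(Σwx + α), one output) reduces to a few lines. The weights below are illustrative placeholders, not the trained values from the study:

```python
import math

def phi(z):
    """Logistic activation phi(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def ann_forward(x, w_hidden, a_hidden, w_out, a_out):
    """Single-hidden-layer feedforward net: 6 inputs -> 2 hidden -> 1 output.
    Each neuron computes phi(sum(w * x) + alpha)."""
    h = [phi(sum(w * xi for w, xi in zip(ws, x)) + a)
         for ws, a in zip(w_hidden, a_hidden)]
    return phi(sum(w * hi for w, hi in zip(w_out, h)) + a_out)

# Placeholder weights: 2 hidden neurons, each connected to all 6 inputs
w_hidden = [[0.1] * 6, [-0.2] * 6]
a_hidden = [0.0, 0.0]
w_out = [0.5, -0.5]
a_out = 0.0
y = ann_forward([1.0] * 6, w_hidden, a_hidden, w_out, a_out)
```

The continuous output y in (0, 1) is what later gets dichotomized with the probability threshold.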

  12. Strategy PART I Previous Studies
  • quantitative evaluation of the learning and generalization of knowledge during ANN supervised training was performed using the Relative Operating Characteristic (ROC) [Swets:1973]. Once the output of the ANN was dichotomized using the event prior probability as threshold, a contingency table was calculated and different statistical scores were determined [Manzato:2007].

                   Event (Y)   Event (N)
  Prediction (Y)       a           b
  Prediction (N)       c           d

  POD = a / (a + c)    POFD = b / (b + d)    FAR = b / (a + b)    PSS = (ad − bc) / ((a + c)(b + d))
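The four scores follow directly from the a/b/c/d cells; a minimal sketch (the dictionary layout and the HIT definition as fraction correct are assumptions, checked here against the Total-set table on the rawinsonde results slide):

```python
def contingency_scores(a, b, c, d):
    """a: hits, b: false alarms, c: misses, d: correct negatives."""
    pod = a / (a + c)                             # probability of detection
    pofd = b / (b + d)                            # probability of false detection
    far = b / (a + b)                             # false alarm ratio
    pss = (a * d - b * c) / ((a + c) * (b + d))   # Peirce skill score = POD - POFD
    hit = (a + d) / (a + b + c + d)               # fraction of correct forecasts
    return {"POD": pod, "POFD": pofd, "FAR": far, "PSS": pss, "HIT": hit}

# Total-set table from the rawinsonde results:
# 316 hits, 95 false alarms, 63 misses, 475 correct negatives
scores = contingency_scores(316, 95, 63, 475)
```

Note that PSS = (ad − bc)/((a + c)(b + d)) expands to exactly POD − POFD, which is why a high POD with a low POFD yields a high skill score.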

  13. Event Climatology PART I Previous Studies

  14. Results using RAWINSONDES PART I Previous Studies
  • During the input selection phase (forward selection algorithm) only the Total Set was used; it included 949 cases and was used to train the different ANN candidates. To select the best architecture (hidden neurons) for the prediction system, the consistency between the results obtained on the Total Set and on the Test Set (of 350 cases) was also taken into account. The architecture chosen had 8 inputs, 2 neurons in the hidden layer, and 1 output.

  TRAINING: Applying the ANN to the Total Set led to a Total CEE of 0.335; applying the probability threshold (0.40) to the continuous ANN output led to the following contingency table:

  TOTAL              Event (Y)   Event (N)
  Prediction: YES       316          95
  Prediction: NO         63         475

  TOTAL scores: POD 0.83, HIT 0.83, FAR 0.23, POFD 0.17, PSS 0.67

  TESTING: Applying the ANN to the Test Set led to a Test CEE of 0.375; applying the probability threshold (0.40) to the continuous ANN output led to the following contingency table:

  TEST               Event (Y)   Event (N)
  Prediction: YES       114          33
  Prediction: NO         23         180

  TEST scores: POD 0.83, HIT 0.84, FAR 0.22, POFD 0.15, PSS 0.68

  15. Results using IASI data PART I Previous Studies
  • By focusing on a single area of interest, it was possible to train an ANN on an IASI dataset twice as large as the Full IASI Set. In this case the event occurrence was defined by at least 3 (IC+C2G) lightning strikes. The best ANN inputs were ShowI, PCS MW 7, and MRH. The best ANN had 3 inputs and 1 hidden neuron.

  TRAINING: Applying the ANN to the Total Set (1338 cases) led to a Total CEE of 0.30; applying the probability threshold (0.14) to the continuous ANN output led to the following contingency table:

  TOTAL              Event (Y)   Event (N)
  Prediction: YES       138         330
  Prediction: NO         41         829

  TOTAL scores: POD 0.77, HIT 0.72, FAR 0.70, POFD 0.28, PSS 0.48

  TESTING: Applying the ANN to the Test Set (657 cases) led to a Test CEE of 0.36; applying the probability threshold (0.14) to the continuous ANN output led to the following contingency table:

  TEST               Event (Y)   Event (N)
  Prediction: YES        92         197
  Prediction: NO         17         349

  TEST scores: POD 0.84, HIT 0.67, FAR 0.68, POFD 0.36, PSS 0.48

  Analysis of IASI results

  16. Outcome PART I Previous Studies
  • The false alarm ratio was found to be too high, and the PSS too low;
  • results clearly indicated the need for retrieval improvements:
    • in terms of the number of successful retrievals;
    • in terms of retrieval accuracy;
  • areas of potential improvement identified:
    • a-priori;
    • surface emissivity representation;
    • numerical stability.

  17. PART II A-priori. Improvements in the characterization of a-priori information: retrieval of high spectral resolution observations over Hawai'i using WRF- and ECMWF-derived a-priori. Tiziana Cherubini (1), Paolo Antonelli (2). (1) Mauna Kea Weather Center - University of Hawai'i; (2) Space Science Engineering Center - University of Wisconsin - Madison. Presented at MIST VIII - KNMI - 6-7 Dec 2012

  18. Study over Hawai'i PART II A-priori
  • the main idea was to transition from a retrieval system based on a climatological a-priori to a system based on an NWP-derived a-priori;
  • transition steps:
    • application of UWPHYSRET to a selected Hawai'i case (21 Jul 2012) using a climatological a-priori derived from local rawinsondes (same approach used over Udine), with retrievals performed on IASI, CrIS, and AIRS data;
    • application of UWPHYSRET to the same case using a single profile generated by the WRF model as First Guess and a simple a-priori derived from the MKWC WRF (12hr forecast − analysis);
  • results presented at MIST VIII by Dr. Cherubini

  19. CrIS Retrieval from Climatology PART II A-priori. Temperature

  20. CrIS Retrieval from Climatology PART II A-priori. Temperature

  21. CrIS Retrieval from WRF: a-priori PART II A-priori. [Figure panels: a-priori T and WV (log q); T 12hr Forecast − Analysis; WV 12hr Forecast − Analysis]
