
Big data in critical infrastructure: Production and failover

Big data in critical infrastructure: Production and failover infrastructure in DWD's central data management. Daniel Lee, German Weather Service (DWD). TI12b, Sep. 2015.


  1. Big data in critical infrastructure: Production and failover infrastructure in DWD's central data management. Daniel Lee, German Weather Service (DWD). TI12b – Sep. 2015

  2. Agenda
      1. DWD's goals
      2. Operational systems
      3. Technical infrastructure
      4. Current challenges
      5. Future plans

  3. What does DWD do?

  4. The DWD law
      • Monitoring of meteorological interactions between the atmosphere and other environmental systems
      • Prediction of meteorological events
      • Monitoring and prediction of the movements of radioactive trace particles
      • Operation of the necessary observation systems
      • Storage, archival and documentation of meteorological data and products

  5. Target audiences
      • DWD aids in protecting lives and property, as well as in planning and maintaining critical infrastructure in the areas of:
        – Aviation
        – Seafaring
        – Agriculture
        – Energy
        – Climate
        – Weather warnings
        – Protection and recovery from high-impact weather
        – Etc.

  6. Multiscale topics
      [Timeline figure; axis marks: 100 years, 10 years, 1 year, 1 month, today]
      • Future:
        – Climate projection
        – Yearly, seasonal and monthly forecasts
        – Long-term forecasts (72-360 hours)
        – Mid-term forecasts (12-72 hours)
        – Short-term forecasts (2-12 hours)
        – Nowcasting (< 2 hours)
      • Past:
        – Weather and climate observation

  7. Example: Model output

  8. Example: Radar map

  9. Example: Short-term storm cell forecast

  10. Example: Ensemble probabilities forecast

  11. Example: Flight cross-section

  12. Example storm event

  13. Example storm event

  14. Automated product generation

  15. Operational systems

  16. Models for multiple spatiotemporal scales: ICON, ICON-EU, COSMO-DE

  17. Physical observation system

  18. Bringing the data together
      [Diagram; labels translated from German: EUMETCast, RMDCN, measurement network, dissemination, radar data, Bundeswehr, satellite data, radiology]

  19. DWD's weather models
      • COSMO-DE (-EPS): grid spacing Δx = 2.2 km; vertical layers: 80; forecast range: 27 hours; runs per day: 8; EPS members: 20
      • ICON-EU: grid spacing Δx = 6.5 km; vertical layers: 54; forecast range: 78 hours; runs per day: 8
      • ICON: grid spacing Δx = 13 km; vertical layers: 90; forecast range: 174 / 78 hours; runs per day: 4
      • Daily deterministic output: ~2.5 TByte
      • Daily probabilistic output: ~3.5 TByte

  20. Routine operation (24 / 7, DIN EN ISO 9001:2008)
      • Global observations
      • International data exchange
      • Database persistence + archival
      • Numerical weather prediction
      • Model output persistence
      • Data distribution
      • Visualization + interpretation

  21. Routine operation

  22. Daily data ingress / egress
      Type of data   Source           Producer                     # reports   Approx. daily data volume
      Observation    In-situ          Manned ground station           12,000   350 GB
                                      Automatic ground station        50,000
                                      Ship observation                 2,500
                                      Buoy observation                   750
                                      Radiosonde                         900
                                      Aircraft                         3,000
                     Remote sensing   Radar                               17
                                      Lidar                                2
                                      Wind profiler                        4
                                      Satellite                           20
      Model output   Model            COSMO-DE                             8   6,000 GB
                                      COSMO-DE-EPS                         8
                                      ICON-EU                              8
                                      ICON                                 4
                                      Wave model / other models           12   500 GB
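
The daily volumes on the slide above can be put in perspective with a little arithmetic. The following is a minimal sketch that sums the three volume figures and converts the total into an average sustained data rate. It assumes that the 350 GB figure covers all observation data, that the 6,000 GB figure covers the four COSMO and ICON model rows, and that decimal units apply (1 GB = 1,000 MB). The variable names are illustrative.

```python
# Daily volumes in GB, taken from the slide above.
# Assumption: 350 GB covers all observations, 6,000 GB covers the
# COSMO-DE, COSMO-DE-EPS, ICON-EU and ICON output.
daily_volume_gb = {
    "observations": 350,
    "nwp_models": 6_000,
    "wave_and_other_models": 500,
}

SECONDS_PER_DAY = 24 * 60 * 60

total_gb = sum(daily_volume_gb.values())

# Assumption: decimal units, 1 GB = 1,000 MB.
average_rate_mb_s = total_gb * 1_000 / SECONDS_PER_DAY

print(f"Total daily volume: {total_gb} GB")
print(f"Average sustained rate: {average_rate_mb_s:.1f} MB/s")
```

The real load is not evenly spread over the day, since model output arrives in bursts after each run, so the average is only a lower bound on the rate the storage systems have to sustain.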

  23. Physical infrastructure

  24. DWD's processing power
      [Chart: sustained performance of DWD's computers compared with Moore's law, with axis marks at 1 MegaFLOP/s, 1 GigaFLOP/s and 1 TeraFLOP/s; systems shown: CDC-3800, Cyber 76, Cray YMP, Cray T3E, IBM pSeries, NEC SX-9, XC40]

  25. Multiple supercomputers
      • Production:
        – 24 / 7 routine production
        – Data assimilation for computing model initial states
        – All numerical weather prediction models
      • Research & Development:
        – R&D
        – NUMEX: Numerical experiments
        – Backup if routine computer unavailable
        – Parallel routine for evaluating model changes
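
The backup role of the research and development machine can be illustrated with a small selection routine. This is only a sketch under stated assumptions: the `Cluster` type, the cluster names and the `available` flag are hypothetical and do not describe DWD's actual failover mechanism.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str        # hypothetical identifier
    role: str        # "production" or "development"
    available: bool  # hypothetical health flag


def select_routine_host(clusters):
    """Prefer an available production machine, else fall back to development."""
    for preferred_role in ("production", "development"):
        for cluster in clusters:
            if cluster.role == preferred_role and cluster.available:
                return cluster
    raise RuntimeError("no cluster available for routine production")


clusters = [
    Cluster("production-hpc", "production", available=False),  # simulated outage
    Cluster("development-hpc", "development", available=True),
]

active = select_routine_host(clusters)
print(active.name)
```

With the production machine marked unavailable, the routine selects the development machine, which mirrors the "backup if routine computer unavailable" role listed on the slide.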

  26. HPC specifications
      • 2 Cray XC40 supercomputers
        – Each: 796 nodes, 17,648 compute modules, 79 TB RAM
        – Xeon processors, connected with Aries network
        – Top performance: 550 TeraFLOP/s per computer
      • 2 Linux clusters (Megware Slashtwo)
        – Each: 523 nodes, 4.5 TB RAM
        – Each: 500 Xeon compute modules connected with Infiniband network
        – Top performance: 16.7 TeraFLOP/s per computer

  27. Volume in meteorological databases
      [Chart; annotation: start of probabilistic forecasts]

  28. Storage systems
      • Cray Sonexion & NEC/NetApp E5500 with ca. 4.2 PB storage capacity
      • 1.2 PB global storage for HPC
      • Write speed: 15 GB/s
      • Also:
        – 6 data servers (NEC/SUN Fire X2-8 with 80 Intel Xeon compute modules & 1 TB RAM each)
        – 2 mirrored data servers
        – Availability: 99.9%
        – Server group manages up to 3 PB meteorological data
      • Archive of 2 StorageTek SL8500 tape libraries with 10,000 tape cassettes each
        – > 40 cassettes can be read and written to simultaneously
        – 16 robots physically access the tapes and insert them into the archive server
      • Estimated data volume by 2016: 60 PB
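
To make the 99.9% availability figure above concrete, it can be converted into a downtime budget. The sketch below assumes that availability is measured over a non-leap calendar year; the slide does not state the measurement period, and the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap calendar year


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability figure."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


downtime = allowed_downtime_hours(99.9)
print(f"{downtime:.2f} hours of downtime per year")
```

Under these assumptions, 99.9% availability leaves a budget of a little under nine hours of downtime per year.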

  29. Georedundancy
      [Diagram of the computing sites; labels translated from German. Concept: A. Pielicke, M. Jonas. Current: November 2014. Image sources: ZGeoBw EUS, mygeo.info]
      • Sites: DMRZ Offenbach (Hall West and Hall East), Meteorology of the Bundeswehr Geoinformation Service in Euskirchen (MetBw), Konrad-Zuse-Institut Berlin (ZIB), ECMWF Reading
      • DMRZ Offenbach, Hall West: production supercomputer, XC40 with 364 Ivy Bridge nodes + 432 Haswell nodes, 35,296 cores, 77 TiB main memory
      • DMRZ Offenbach, Hall East: development supercomputer (takes over production if Hall West fails), XC40 with 364 Ivy Bridge nodes + 432 Haswell nodes, 35,296 cores, 77 TiB main memory
      • XCT: XC40 with 8 Sandy Bridge nodes + 8 Haswell nodes, 640 cores, 1 TiB main memory
      • In the two halls: Sonexion Lustre filesystems (2,668 TiB and 1,012 TiB capacity), Megware clusters (18 and 24 nodes with 128 GB RAM), Sun X2-4 database servers (240 TiB capacity each), Sun X2-8 data access servers (4 nodes with 512 GB RAM each), Panasas filesystems (1,300 TiB and 1,600 TiB capacity)
      • Archive system (SUN/IBM HPSS): SUN STK SL8500 with 2 cassette silos, 20,000 slots, 36 drives and a data volume of 6 PiB; IBM X3650 with 9 nodes, 72 processor cores and 216 GiB main memory; 600 TiB disk systems
      • Euskirchen: Megware cluster with 6 nodes with 128 GB RAM, 2 nodes with 32 GB RAM for PBS, and 2 NFS servers with 64 GB RAM and 2x8 cores (incl. 16 hyperthreads); Sun X2-4 database server and data access server with a shared total capacity of 150 TiB
      • Further capacities shown: 120 TiB and 171 TiB

  30. Software configuration

  31. Job management
      Job dispatching: SMS / ecFlow
      • Timed job execution
      • Interjob dependency
      • Status reports + output capture
      • Manual starts, restarts, aborts
      • Transferability between halls & computing centers
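
The idea of interjob dependency can be sketched with Python's standard library. This example does not use the SMS or ecFlow APIs; it only shows, with hypothetical job names, how a dispatcher can start each job once all of its prerequisites have finished.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs, each mapped to the set of jobs it depends on.
dependencies = {
    "ingest_observations": set(),
    "data_assimilation": {"ingest_observations"},
    "run_forecast_model": {"data_assimilation"},
    "archive_output": {"run_forecast_model"},
    "distribute_products": {"run_forecast_model"},
}


def run_all(deps):
    """Process every job after its prerequisites and return the order used."""
    order = []
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    while sorter.is_active():
        for job in sorted(sorter.get_ready()):  # sorted for a stable order
            order.append(job)  # a real dispatcher would start the job here
            sorter.done(job)   # mark finished so dependent jobs become ready
    return order


execution_order = run_all(dependencies)
print(execution_order)
```

Jobs that become ready at the same time, such as the archival and distribution steps here, do not depend on each other and could be started in parallel by a real scheduler.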

  32. Monitoring & integration
      Several monitoring systems, depending on target components:
      • Nagios
      • Icinga
      • Big Brother
      • Custom software
      Testing and building with Jenkins

  33. Data management overview
      [Diagram components: external users; NinJo meteorological workstations; NinJo Server (uses GloBUS); binary DBs; decoding (GloBUS); obs. data; internal model data; external model data; numerical weather prediction; WISO (WMO data exchange); data conversion (BUFR-TO-ROUTKLI); relational weather data; DWD stations; MIRAKEL] TI12b – Feb. 2015

  34. AFD: Automated file distributor (shown on the same data management diagram as slide 33)

  35. GloBUS and BUFR-TO-ROUTKLI (shown on the same data management diagram as slide 33)

  36. SKY (shown on the same data management diagram as slide 33)
