Big data in critical infrastructure: Production and failover infrastructure in DWD's central data management Daniel Lee, German Weather Service (DWD) TI12b – Sep. 2015
Agenda
1. DWD's goals
2. Operational systems
3. Technical infrastructure
4. Current challenges
5. Future plans
What does DWD do?
The DWD law
• Monitoring of meteorological interactions between the atmosphere and other environmental systems
• Prediction of meteorological events
• Monitoring and prediction of the movements of radioactive trace particles
• Operation of the necessary observation systems
• Storage, archival and documentation of meteorological data and products
Target audiences
• DWD aids in protecting lives and property, as well as in planning and maintaining critical infrastructure in the areas of:
– Aviation
– Seafaring
– Agriculture
– Energy
– Climate
– Weather warnings
– Protection against and recovery from high-impact weather
– Etc.
Multiscale topics (time axis from the past through today to 1 month, 1 year, 10 years and 100 years into the future)
• Climate projection
• Yearly, seasonal and monthly forecasts
• Long-term forecasts (72-360 hours)
• Mid-term forecasts (12-72 hours)
• Short-term forecasts (2-12 hours)
• Nowcasting (< 2 hours)
• Weather and climate observation (past)
Example: Model output
Example: Radar map
Example: Short-term storm cell forecast
Example: Ensemble probabilities forecast
Example: Flight cross-section
Example storm event
Automated product generation
Operational systems
Models for multiple spatiotemporal scales: ICON, ICON-EU, COSMO-DE
Physical observation system
Bringing the data together
• EUMETCast
• RMDCN
• Observation network (Messnetz)
• Dissemination
• Radar data
• Bundeswehr
• Satellite data
• Radiological data (Radiologie)
DWD's weather models
COSMO-DE (-EPS):
• Grid spacing: 2.2 km
• Vertical layers: 80
• Forecast range: 27 hours
• Runs per day: 8
• EPS members: 20
ICON-EU:
• Grid spacing: 6.5 km
• Vertical layers: 54
• Forecast range: 78 hours
• Runs per day: 8
ICON:
• Grid spacing: 13 km
• Vertical layers: 90
• Forecast range: 174 / 78 hours
• Runs per day: 4
Daily output:
• Deterministic: ~ 2.5 TByte
• Probabilistic: ~ 3.5 TByte
Routine operation (24 / 7, DIN EN ISO 9001:2008)
• Global observations
• International data exchange
• Database persistence + archival
• Numerical weather prediction
• Model output persistence
• Data distribution
• Visualization + interpretation
Routine operation
Daily data ingress / egress

Type of data   Source          Producer                     # reports   Approx. daily data volume
Observation    in-situ         Manned ground station        12,000      350 GB
                               Automatic ground station     50,000
                               Ship observation             2,500
                               Buoy observation             750
                               Radiosonde                   900
                               Aircraft                     3,000
               remote sensing  Radar                        17
                               Lidar                        2
                               Wind profiler                4
                               Satellite                    20
Model output   model           COSMO-DE                     8           6,000 GB
                               COSMO-DE-EPS                 8
                               ICON-EU                      8
                               ICON                         4
                               Wave model / other models    12          500 GB
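As a rough worked example, the three volume figures on this slide can be summed and converted into an average sustained data rate. The sketch below assumes decimal units (1 GB = 1000 MB) and an even spread over 24 hours; both are simplifications, and real traffic is likely to peak around model runs. The category names in the dictionary are shorthand chosen for illustration.

```python
# Approximate daily volumes from the slide, in GB (decimal units assumed).
DAILY_VOLUME_GB = {
    "observations": 350,
    "numerical weather prediction models": 6_000,
    "wave model / other models": 500,
}

SECONDS_PER_DAY = 24 * 60 * 60


def total_daily_volume_gb(volumes):
    """Sum the per-category daily volumes."""
    return sum(volumes.values())


def average_rate_mb_per_s(total_gb):
    """Average rate if the daily volume were spread evenly over 24 hours."""
    return total_gb * 1000 / SECONDS_PER_DAY


if __name__ == "__main__":
    total = total_daily_volume_gb(DAILY_VOLUME_GB)
    print(f"Total: {total} GB/day")
    print(f"Average rate: {average_rate_mb_per_s(total):.1f} MB/s")
```

Under these assumptions the total is 6,850 GB per day, or roughly 79 MB/s on average.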
Physical infrastructure
DWD's processing power (chart: sustained performance (Dauerleistung) compared with Moore's law, on an axis marked at 1 MegaFLOP/s, 1 GigaFLOP/s and 1 TeraFLOP/s; systems shown: CDC-3800, Cyber 76, Cray YMP, Cray T3E, IBM pSeries, NEC SX-9, XC40)
Multiple supercomputers
Production
● 24 / 7 routine production
● Data assimilation for computing model initial states
● All numerical weather prediction models
Research & Development
● R&D
● NUMEX: Numerical experiments
● Backup if routine computer unavailable
● Parallel routine for evaluating model changes
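The backup role of the research and development machine can be illustrated with a small selection routine. This is a hypothetical sketch, not DWD's actual mechanism: the `Host` class, the host names `hall_west` and `hall_east`, and the `available` flag (standing in for some external health check) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Host:
    name: str        # hypothetical identifier
    role: str        # "production" or "development"
    available: bool  # outcome of an external health check (assumed)


def select_production_host(hosts: Sequence[Host]) -> Optional[Host]:
    """Prefer an available production machine, else fall back to R&D."""
    for role in ("production", "development"):
        for host in hosts:
            if host.role == role and host.available:
                return host
    return None  # no machine can currently run the routine


if __name__ == "__main__":
    hosts = [
        Host("hall_west", "production", available=False),
        Host("hall_east", "development", available=True),
    ]
    chosen = select_production_host(hosts)
    print(chosen.name if chosen else "no host available")
```

Keeping the preference order in one tuple makes the fallback policy explicit and easy to extend, for example with a third site.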
HPC specifications
• 2 Cray XC40 supercomputers
– Each: 796 nodes, 17,648 compute modules, 79 TB RAM
– Xeon processors, connected with Aries network
– Top performance: 550 TeraFLOP/s per computer
• 2 Linux clusters (Megware Slashtwo)
– Each: 523 nodes, 4.5 TB RAM
– Each: 500 Xeon compute modules connected with InfiniBand network
– Top performance: 16.7 TeraFLOP/s per computer
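For orientation, the figures on this slide can be combined into an aggregate peak value and an average amount of RAM per node. The sketch assumes that the quoted peak values are simply additive across machines and that RAM is spread evenly over the nodes; neither is stated in the slides.

```python
# (name, number of machines, peak TeraFLOP/s per machine), from the slide.
SYSTEMS = [
    ("Cray XC40", 2, 550.0),
    ("Linux cluster (Megware Slashtwo)", 2, 16.7),
]


def aggregate_peak_tflops(systems):
    """Sum of peak performance over all machines (additivity assumed)."""
    return sum(count * peak for _, count, peak in systems)


def ram_per_node_gb(total_ram_tb, nodes):
    """Average RAM per node in GB, assuming decimal units and even spread."""
    return total_ram_tb * 1000 / nodes


if __name__ == "__main__":
    print(f"Aggregate peak: {aggregate_peak_tflops(SYSTEMS):.1f} TeraFLOP/s")
    print(f"XC40 RAM per node: {ram_per_node_gb(79, 796):.1f} GB")
```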
Volume in meteorological databases (chart; annotation: start of probabilistic forecasts)
Storage systems
• Cray Sonexion & NEC/NetApp E5500 with ca. 4.2 PB storage capacity
• 1.2 PB global storage for HPC
• Write speed: 15 GB/s
• Also:
• 6 data servers (NEC/SUN Fire X2-8 with 80 Intel Xeon compute modules & 1 TB RAM each)
• 2 mirrored data servers
• Availability: 99.9%
• Server group manages up to 3 PB of meteorological data
• Archive of 2 StorageTek SL8500 tape libraries with 10,000 tape cassettes each
• > 40 cassettes can be read and written to simultaneously
• 16 robots physically access the tapes and insert them into the archive server
• Estimated data volume by 2016: 60 PB
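Two of the figures above translate into more tangible numbers with a little arithmetic. The sketch below assumes that the 99.9% availability is measured over a non-leap year and that the 15 GB/s write speed applies to the 1.2 PB global HPC storage; the slide states neither explicitly, and filling the storage in one go is purely hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # non-leap year assumed


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Downtime budget implied by an availability percentage."""
    return (1 - availability_percent / 100) * period_hours


def write_time_hours(volume_pb, write_speed_gb_per_s):
    """Time to write a volume at a constant speed (1 PB = 1,000,000 GB)."""
    seconds = volume_pb * 1_000_000 / write_speed_gb_per_s
    return seconds / 3600


if __name__ == "__main__":
    print(f"Downtime budget: {allowed_downtime_hours(99.9):.2f} h/year")
    print(f"Time to write 1.2 PB: {write_time_hours(1.2, 15):.1f} h")
```

Under these assumptions, 99.9% availability leaves a budget of about 8.76 hours of downtime per year, and writing 1.2 PB at 15 GB/s takes a little over 22 hours.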
Georedundancy (diagram; concept: A. Pielicke, M. Jonas; current as of November 2014; sources: mygeo.info, ZGeoBw EUS)
Sites:
• DMRZ Offenbach, Hall West: high-performance computer, production
• DMRZ Offenbach, Hall East: high-performance computer, development, XCT (takes over production if Hall West fails)
• DMRZ Offenbach: models
• Meteorology of the Bundeswehr Geoinformation Service in Euskirchen (MetBw)
• Konrad-Zuse-Institut Berlin (ZiB)
• ECMWF Reading
Systems at DMRZ Offenbach:
• XC40 (shown twice): 364 Ivy Bridge nodes + 432 Haswell nodes, 35296 cores, 77 TiB main memory
• XC40: 8 Sandy Bridge nodes + 8 Haswell nodes, 640 cores, 1 TiB main memory
• Sonexion Lustre file systems with 2668 TiB and 1012 TiB capacity
• Megware; database server Sun X2-4: 18 nodes with 128 GB RAM, 240 TiB capacity (second instance: 24 nodes with 128 GB RAM, 240 TiB capacity)
• Data access server Sun X2-8: 4 nodes with 512 GB RAM (shown twice)
• Panasas file systems with 1300 TiB and 1600 TiB capacity
• Two further capacity labels without recoverable context: 120 TiB and 171 TiB
Archive system SUN/IBM-HPSS:
• SUN STK SL8500: 2 tape silos, 20000 slots, 36 drives, data volume 6 PiB
• IBM X3650: 9 nodes, 72 processor cores, 216 GiB main memory
• 600 TiB disk systems
Further systems shown:
• Megware; database server Sun X2-4; data access server Sun X2-4: 6 nodes with 128 GB RAM + 2 nodes with 32 GB RAM for PBS
• Shared total capacity: 150 TiB
• 2 NFS servers with 64 GB RAM, 2x8 cores (incl. 16 hyperthreads)
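The diagram labels use binary units (TiB, PiB), while other slides quote decimal units (TB, PB). When comparing figures across slides, a conversion such as the following minimal sketch may help; it relies only on the standard definitions 1 TiB = 2^40 bytes and 1 TB = 10^12 bytes, and the function names are chosen for illustration.

```python
def tib_to_tb(tib):
    """Convert tebibytes (2**40 bytes) to terabytes (10**12 bytes)."""
    return tib * 2**40 / 10**12


def pib_to_pb(pib):
    """Convert pebibytes (2**50 bytes) to petabytes (10**15 bytes)."""
    return pib * 2**50 / 10**15


if __name__ == "__main__":
    print(f"77 TiB = {tib_to_tb(77):.2f} TB")
    print(f"6 PiB  = {pib_to_pb(6):.3f} PB")
```

The difference is about 10% at the tera scale and about 12.6% at the peta scale, which is large enough to matter in capacity planning.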
Software configuration
Job management
Job dispatch: SMS / ecFlow
● Timed job execution
● Interjob dependency
● Status reports + output capture
● Manual starts, restarts, aborts
● Transferability between halls & computing centers
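The interjob dependency handling listed above can be illustrated with Python's standard `graphlib` module. This sketch does not use the SMS or ecFlow APIs; the job names and the dependency graph are hypothetical and only show how a scheduler might derive a valid execution order from declared dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs: each key depends on the jobs in its value set.
DEPENDENCIES = {
    "decode_observations": set(),
    "data_assimilation": {"decode_observations"},
    "model_run": {"data_assimilation"},
    "archive_output": {"model_run"},
    "generate_products": {"model_run"},
}


def execution_order(dependencies):
    """Return one order in which every job runs after its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, job in enumerate(execution_order(DEPENDENCIES), start=1):
        print(step, job)
```

If the declared dependencies contain a cycle, `static_order()` raises `graphlib.CycleError`, which gives a natural place to reject an invalid job suite before anything is started.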
Monitoring & integration
Several monitoring systems, depending on target components
● Nagios
● Icinga
● Big Brother
● Custom software
Testing and building with Jenkins
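As an illustration of what a custom check for such monitoring systems can look like, the sketch below follows the Nagios plugin convention of returning 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN) together with a one-line status message. The disk-usage check, the thresholds and the function names are assumptions chosen for illustration and are not taken from DWD's setup.

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify_usage(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a status code (thresholds are examples)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (status code, message) for the file system holding path."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero capacity"
    used_percent = 100.0 * usage.used / usage.total
    status = classify_usage(used_percent, warn, crit)
    return status, f"DISK {LABELS[status]} - {used_percent:.1f}% used on {path}"


def main(path="/"):
    """Print the status line and return the plugin exit code."""
    code, message = check_disk(path)
    print(message)
    return code
```

When run as a plugin, a wrapper script would typically end with `sys.exit(main())` so that the monitoring system can read the exit code.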
Data management overview (diagram components)
• External users
• NinJo meteorological workstations
• NinJo Server (uses GloBUS)
• Binary DBs
• Internal model
• External model data
• Obs. data
• Decoding (GloBUS)
• Numerical weather prediction
• WISO (WMO data exchange)
• Data conversion (BUFR-TO-ROUTKLI)
• Relational weather data
• DWD stations
• MIRAKEL
TI12b – Feb. 2015
AFD: Automated file distributor
GloBUS and BUFR-TO-ROUTKLI
SKY