Processing and analysis of Earth Observation data
Carsten Brockmann, Brockmann Consult GmbH
ESA Climate Change Initiative Toolbox Science Lead
Big Data Analytics & GIS, Münster, 20-21 September 2017
Earth Observation Managing big EO data is increasingly complex. But not just technically. Collaboration. Culture. Organisation. Structure.
Lake MacKay, Australia
Tonga, Pacific
Arctic Ocean
Karavasta Lagoon, Albania
Satellite Images = Measurement Data
Satellite Images = Measurement Data
Turning image data into information
Generic tools
• Data selection & access
• Visualisation
• Analysis
• Processing
• Export
Instrument-specific tools
• System correction
• Data processing (L1 -> L2)
• Thematic processing, synergy
Programming tools
• API for Python and Java
• Scripting
• Graph model builder
SNAP Architecture
Any combination of toolbox add-ons is allowed, even none, as SNAP Desktop is already a useful stand-alone application for EO data exploitation.
• Toolbox add-ons: Sentinel-1 Toolbox (S1TBX), Sentinel-2 Toolbox (S2TBX), Sentinel-3 Toolbox (S3TBX)
• SNAP layer: SNAP Desktop, SNAP Engine
• 3rd-party library layer: NetBeans RCP, GeoTools, JAI, NetCDF, …
• Programming language layer: Java SE 8 Platform, Python
SNAP Application Modes
Golden Age of Earth Observation
By the end of 2017, the operational Sentinel-1, -2, -3 and -5P satellites alone will continuously collect a volume of 27 Terabytes per day / 10 PB per year. It could take around 2.5 years to download 1 Petabyte of Sentinel-1 data and a staggering 63 years to pre-process it on your own computer (Wagner, 2015).
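The figures above can be checked with a little arithmetic. The sketch below assumes decimal units (1 PB = 1000 TB = 1e15 bytes) and a 365-day year. The sustained download rate it derives is an inference from the quoted 2.5 years, not a number taken from Wagner (2015).

```python
# Back-of-the-envelope check of the volumes quoted on the slide.
TB_PER_DAY = 27                      # daily volume from the slide
DAYS_PER_YEAR = 365                  # assumption: non-leap year
SECONDS_PER_YEAR = DAYS_PER_YEAR * 24 * 3600


def yearly_volume_pb(tb_per_day: float) -> float:
    """Convert a daily volume in TB into a yearly volume in PB (1 PB = 1000 TB)."""
    return tb_per_day * DAYS_PER_YEAR / 1000


def implied_rate_mbit_s(petabytes: float, years: float) -> float:
    """Sustained rate in Mbit/s needed to move `petabytes` within `years`."""
    bits = petabytes * 1e15 * 8
    return bits / (years * SECONDS_PER_YEAR) / 1e6


print(yearly_volume_pb(TB_PER_DAY))      # close to the 10 PB per year on the slide
print(implied_rate_mbit_s(1, 2.5))       # roughly a 100 Mbit/s line, fully used
```

Under these assumptions, 27 TB per day is just under 10 PB per year, and downloading 1 PB in 2.5 years corresponds to a connection of about 100 Mbit/s running without interruption.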
Data Local Processing
• Optimising data transfer
• Sharing input data
• Sharing results
• Rapid turn-around cycles
Data Local Processing
Archive-centric approach
• Network storage
• Data are transferred over the network
• Risk of network bottleneck
Hadoop approach
• Concurrent data-local processing
• Tasks are transferred over the network
• Good scalability
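The difference between the two approaches can be illustrated with a toy model. This is a minimal sketch, not Hadoop's scheduler: the file names, sizes, node names and the least-loaded placement rule are all hypothetical.

```python
from collections import defaultdict


def archive_centric_transfer(file_sizes):
    """Archive-centric: every input file crosses the network to a worker."""
    return sum(file_sizes.values())


def data_local_schedule(file_locations):
    """Data-local: send each task to the least-loaded node that already stores the file."""
    load = defaultdict(int)
    plan = {}
    for name in sorted(file_locations):
        # Sort the candidate nodes so that ties are broken deterministically.
        node = min(sorted(file_locations[name]), key=lambda n: load[n])
        plan[name] = node
        load[node] += 1
    return plan


# Hypothetical inputs: sizes in MB and the nodes holding a replica of each file.
sizes = {"a.nc": 500, "b.nc": 700, "c.nc": 300}
locations = {
    "a.nc": {"node1", "node2"},
    "b.nc": {"node2"},
    "c.nc": {"node1", "node3"},
}

print(archive_centric_transfer(sizes))   # MB of input moved over the network
print(data_local_schedule(locations))    # only task descriptions are moved
```

In the data-local plan no input data leaves the node that stores it, which is why the network stops being the bottleneck as nodes are added.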
Calvalus Processing System for EO Data
Apply Apache Hadoop to Earth observation
• Transfer the algorithm to the data (data-local in a narrower sense)
• Avoid an archive-centric approach
Add Calvalus software layers for EO data processing and validation
• EO processing workflows
• Data processor plug-in framework
• Bulk production control
• Portal
Integrate data processors
• Linux executables
• SNAP & BEAM GPF operators and aggregators
• Open for other frameworks
[Figure: Calvalus layers: Calvalus On-demand Portal, Calvalus Bulk Production; Calvalus Adapters; SNAP GPF Operators and Graphs, SNAP Aggregators, Linux Executables]
Data Local Processing - Level 2 Processing Workflow on Calvalus
• MERIS RR L1, North Sea, 3 days
• CoastColour NN L2 processor
• 6 minutes (22 nodes)
• Output: L2 files
• Only Mapper Tasks, no reduce step necessary
• Distributed file system HDFS on local disks of compute nodes
• Transparent, optimised data-local access
[Figure: each L1 File is processed by an L2 Processor (Mapper Task) into one L2 File]
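A map-only job of this kind can be sketched in a few lines. The code below is an illustration only: `l2_processor` is a hypothetical stand-in that renames its input, and a local thread pool takes the place of Hadoop mapper tasks.

```python
from concurrent.futures import ThreadPoolExecutor


def l2_processor(l1_name: str) -> str:
    """Stand-in for a Level-2 processor: one L1 input yields one L2 output."""
    return l1_name.replace("_L1", "_L2")


def run_map_only(l1_files, workers=4):
    """Process every input independently; no reduce step combines the outputs."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, whatever the completion order.
        return list(pool.map(l2_processor, l1_files))


l2_files = run_map_only(["orbit1_L1.nc", "orbit2_L1.nc", "orbit3_L1.nc"])
print(l2_files)
```

Because each output depends on exactly one input, the job scales with the number of workers until the inputs run out.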
Map-Reduce on Calvalus
• Algorithm defined by Google employees Dean + Ghemawat in 2004
• Idea: partition data into chunks, compute chunks locally (map), concatenate intermediate results to final result (reduce)
• Allows for high degree of parallelisation
• Fits very well with principle of data locality
• Distributed file system HDFS on local disks of compute nodes
• Transparent, optimised data-local access
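The map and reduce steps can be sketched as follows. This is a single-process illustration of the idea, not the Hadoop API: `map_reduce`, `mapper` and `reducer` are hypothetical names, and the example computes a global mean from per-chunk partial sums.

```python
from collections import defaultdict


def map_reduce(chunks, mapper, reducer):
    """Run mapper on each chunk, group the emitted pairs by key, then reduce each group."""
    groups = defaultdict(list)
    for chunk in chunks:                     # map: each chunk is handled independently
        for key, value in mapper(chunk):
            groups[key].append(value)        # shuffle: collect values per key
    return {key: reducer(key, values) for key, values in groups.items()}


def mapper(chunk):
    """Emit a partial (sum, count) for one chunk of measurements."""
    yield "mean", (sum(chunk), len(chunk))


def reducer(key, values):
    """Combine the partial sums and counts into the overall mean."""
    total = sum(s for s, _ in values)
    count = sum(n for _, n in values)
    return total / count


result = map_reduce([[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]], mapper, reducer)
print(result)
```

Only the small (sum, count) pairs travel from the map step to the reduce step, which is what makes the pattern a good fit for data-local processing.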
Map-Reduce on Calvalus - Temporal and Spatial Integration
• MERIS RR L1, global, 10-day
• SNAP "C2RCC" Water processor
• 20 mins (100 nodes)
• Output: 1 L3 product
• Distributed file system HDFS on local disks of compute nodes
• Transparent, optimised data-local access
[Figure: each L1 File passes through L2 Proc. + Spat. Binning (Mapper Task), producing Spatial Bins; Temp. Binning (Reducer Task) merges them into Temp. Bins; L3 Formatting writes the L3 File]
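The split between spatial binning in the mapper and temporal binning in the reducer can be sketched like this. The regular 1-degree grid, the mean as the only aggregator and all function names are assumptions for illustration; they are not the binning grid or the aggregators used by Calvalus or SNAP.

```python
import math
from collections import defaultdict

CELL_DEG = 1.0  # hypothetical grid cell size in degrees


def bin_index(lat, lon):
    """Map a coordinate to the index of its grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def spatial_binning(pixels):
    """Mapper side: aggregate the pixels of one product into (sum, count) per cell."""
    bins = defaultdict(lambda: [0.0, 0])
    for lat, lon, value in pixels:
        cell = bins[bin_index(lat, lon)]
        cell[0] += value
        cell[1] += 1
    return {key: tuple(acc) for key, acc in bins.items()}


def temporal_binning(spatial_bins_per_product):
    """Reducer side: merge the spatial bins of all products and take the mean."""
    merged = defaultdict(lambda: [0.0, 0])
    for bins in spatial_bins_per_product:
        for key, (total, count) in bins.items():
            merged[key][0] += total
            merged[key][1] += count
    return {key: total / count for key, (total, count) in merged.items()}


# Hypothetical (lat, lon, value) pixels from two acquisitions.
day1 = [(54.2, 7.1, 2.0), (54.8, 7.9, 4.0), (55.1, 7.5, 1.0)]
day2 = [(54.5, 7.3, 6.0)]
l3 = temporal_binning([spatial_binning(day1), spatial_binning(day2)])
print(l3)
```

Keeping sums and counts instead of means in the intermediate bins lets the reducer merge any number of products without revisiting the pixels.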
Map-Reduce on Calvalus - Match-up analysis
• MERIS RR L1, global, 3 months
• "CoastColour C2W" processor
• NOMAD in-situ dataset
• 1.5 minutes (100 nodes)
• Output: scatter-plots and pixel extraction tables
[Figure: each L1 File passes through L2 Proc. & Matcher (Mapper Task), producing Output Records; Matchup Analysis (Reducer Task) combines them with the Input Records into the MA Report]
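A match-up workflow of this shape can be sketched as a matcher per product followed by one analysis step. The record layout, the 3-hour time window and the bias/RMSE statistics are assumptions chosen for illustration; the slide does not state which criteria or statistics Calvalus applies.

```python
import math


def matcher(product, insitu, max_hours=3.0):
    """Mapper side: emit (in-situ value, satellite value) pairs for one product."""
    records = []
    for site in insitu:
        pixel = product["pixels"].get(site["cell"])
        close_in_time = abs(product["time_h"] - site["time_h"]) <= max_hours
        if pixel is not None and close_in_time:
            records.append((site["value"], pixel))
    return records


def matchup_analysis(record_lists):
    """Reducer side: merge all records and compute simple agreement statistics."""
    records = [rec for recs in record_lists for rec in recs]
    n = len(records)
    if n == 0:
        return {"n": 0}
    bias = sum(sat - ref for ref, sat in records) / n
    rmse = math.sqrt(sum((sat - ref) ** 2 for ref, sat in records) / n)
    return {"n": n, "bias": bias, "rmse": rmse}


# Hypothetical in-situ measurements and satellite products (times in hours).
insitu = [
    {"cell": (54, 7), "time_h": 10.0, "value": 1.0},
    {"cell": (55, 8), "time_h": 30.0, "value": 2.0},
]
products = [
    {"time_h": 11.0, "pixels": {(54, 7): 1.5, (55, 8): 9.0}},
    {"time_h": 29.0, "pixels": {(55, 8): 2.5}},
]
report = matchup_analysis([matcher(p, insitu) for p in products])
print(report)
```

The pixel at (55, 8) in the first product is rejected because it lies 19 hours away from the in-situ measurement, so only two pairs enter the statistics.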
Streaming on Calvalus - With SNAP
Supported by SNAP Graph Processing Framework
• Access to data via reader/writer objects instead of files
• Operator chaining to build processors from modules
• Tile cache and pull principle for in-memory processing
• Hadoop MapReduce for partitioning and streaming
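Operator chaining with a tile cache and the pull principle can be illustrated with a small toy model. The classes below are hypothetical and are not the SNAP GPF API; they only show how a request for one output tile pulls the required input tile through the chain, without intermediate files.

```python
class Source:
    """Leaf of the chain: produces a tile on request (stands in for a product reader)."""

    def __init__(self):
        self.reads = 0

    def get_tile(self, index):
        self.reads += 1
        return [index * 10 + i for i in range(4)]


class Operator:
    """Pulls tiles from its upstream source and caches the tiles it has computed."""

    def __init__(self, source, func):
        self.source = source
        self.func = func
        self.cache = {}

    def get_tile(self, index):
        if index not in self.cache:
            tile = self.source.get_tile(index)       # pull from upstream on demand
            self.cache[index] = [self.func(v) for v in tile]
        return self.cache[index]


reader = Source()
# Chain two operators: scale the values, then add an offset.
chain = Operator(Operator(reader, lambda v: v * 2), lambda v: v + 1)

tile = chain.get_tile(1)
chain.get_tile(1)            # second request is served from the tile cache
print(tile, reader.reads)
```

Nothing is computed until a tile is requested, and a repeated request never reaches the reader again.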
EO Data & Data-Processing Platforms
European Space Agency & national Space Agencies
• Thematic Exploitation Platforms
• Mission Exploitation Platforms
European Commission:
• Copernicus Data and Information Access Services (DIAS)
• Copernicus Collaborative Ground Segments
Private offers
• Google Earth Engine
• Amazon Web Services
The Urban Thematic Exploitation Platform
[Figure: Urban TEP portal + gateway; Visualisation & Analysis; gateway to ... Urban TEP Processing Centres]
Portal functions
• Processing request forms and result access
• Datasets and services
• Geo-browser
Analysis and visualisation
• Combination of satellite products and socio-economic data
• Derivation of new criteria
Processing request form
GUF: global binary raster mask showing location of human settlements (12m/75m). Examples: Istanbul, Sao Paulo, Moscow
Urban growth, Beijing: ERS-2 PRI & ASAR IMP VV, 2002-2003, 15m spatial resolution, 48 scenes ▪ SAR4Urban (2015-2016)
Urban growth, Beijing: S1A IW GRDH VV, 2014-2015, 10m spatial resolution, 31 scenes
Urban TEP is ...
• ... attractive high-quality datasets that meet space, time and feature dimensions of the domain
• ... the capability to generate them
• ... the facilities to access and use them, and to generate more of them
Processor development model
[Figure: VM for download; Local test processing; Package; Upload; Deployment; Urban TEP portal; Browser for request submission; Processing Request; Urban TEP processing centres; Concurrent processing]
Systematic or on-demand processing
• Datasets may be pre-generated, providing access to them as product
  • for long-running processes
  • for global datasets with high complexity/information reduction
  • in order to be able to visualise them
  • Example: GUF
• Datasets may be processed on-demand, providing a service instead
  • for short-running processes
  • for selected areas
  • in case of user-defined parameterisation
  • to avoid storage of large output datasets
  • Example: Sentinel-2 timescan service (unless generated systematically)
Urban TEP processing centres
IT4Innovations
• cluster (Salomon HPC)
• Geoserver WPS + own backend implementation
• Geoserver WMS
• large-scale global Landsat timescan processing
• fast internet access, HPC
Brockmann Consult
• cluster (Calvalus/Hadoop), YARN scheduler
• Sentinel-2 (urban areas, Africa), OLCI, MERIS
• Calvalus WPS + Urban TEP config + extension
• Geoserver WMS
• GUF subsetting, Sentinel-2 timescan processing
• distributed data-local processing and concurrent aggregation
DLR
• virtualised env. (GeoFarm) + cluster (Calvalus/Hadoop)
• Sentinel-1 and other datasets
• GUF and other Urban datasets
• systematic generation of datasets, host of portal and analysis/visualisation
Copernicus Data and Exploitation Platform – Deutschland
• National entry point to the EU Copernicus Sentinel Satellite Systems, their data products and the products of the Copernicus Services
• Processing facilities on the platform
EU DIAS
Confusing? YES!
Climate Monitoring Data Climate change is a global challenge. Open climate data is crucial.
ESA Climate Change Initiative (CCI) The objective of the Climate Change Initiative (CCI) is to realise the full potential of the long-term global Earth Observation archives that ESA together with its Member States have established over the last 30 years, as a significant and timely contribution to the Essential Climate Variable databases required by the United Nations Framework Convention on Climate Change (UNFCCC).