EDC Forum 2017 Big Spatial Data in Agriculture Marlena Götza, Thilo Steckel, Heinrich Warkentin CLAAS E-Systems
1. CLAAS & GIS Technologies 2. Hadoop as a „Big Data“ Ecosystem 3. Big Data & GIS Technologies 4. Résumé 21.09.2017
Product Range: Combine harvesters, Forage harvesters, Tractors, Forage harvesting machines, Telehandler, Balers, Service & Parts, Software and systems 21.09.2017 Company Presentation CLAAS E-Systems | CLAAS Group
Agricultural Engineering in the past 21.09.2017 Company Presentation CLAAS E-Systems | Trends and Challenges
Have we reached our limits?
Precision Agriculture
GIS in Agriculture 21.09.2017
2. Hadoop as a „Big Data“ Ecosystem
Research Project AGATA - Analyse großer Datenmengen in Verarbeitungsprozessen (analysis of large data volumes in processing workflows) 21.09.2017
Hadoop Core Concepts: HDFS (physical replication of data; fault tolerant through redundancy) and the MapReduce framework
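The MapReduce idea can be illustrated without a cluster. The following is a minimal single-process sketch in Python, not Hadoop code: the record layout (machine ID, speed) and the sample values are assumptions made for illustration. It shows the three stages that the framework distributes across nodes: map, shuffle (group by key), and reduce.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one (machine_id, speed) pair per input record."""
    for machine_id, speed in records:
        yield machine_id, speed


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce the values of each key to their mean."""
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Hypothetical sample records: (machine_id, speed)
records = [("m1", 4.0), ("m2", 6.0), ("m1", 8.0)]
mean_speed = reduce_phase(shuffle(map_phase(records)))
print(mean_speed)
```

On a real cluster the map and reduce functions run in parallel on the nodes that hold the HDFS blocks; the sketch only mirrors the data flow.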
Data Machine Data GPS position Operating data Master data Field Data Polygons Documentation 09/10/2017
Hadoop. Good at: storing, processing and querying big data sets; „batch“ processing of data. Bad at: processing spatial data; handling time and space components; visualization of (spatial / temporal) data.
3. Big Data & GIS Technologies
Research cooperation with 52N and ESRI. Question 1: How can the current big data infrastructure be extended by ESRI technology to support the spatial component of the data during processing at CLAAS? Question 2: Does the use of GIS technologies add value to information processing at CLAAS?
Configuration 1: GIS Tools for Hadoop (open source). Components: Esri Geometry API for Java: Java-based API. Spatial Framework for Hadoop: adds User Defined Functions (UDFs) for spatial queries. Geoprocessing Tools for Hadoop: connection to ArcGIS Desktop.
Configuration 1: GIS Tools for Hadoop - Example: enrichment of altitude data
Code (the explanations from the slide are given as SQL comments; DGM is the German abbreviation for DEM, digital elevation model):
    -- calculating the average of the selected DEM points
    SELECT tm_gps.*, AVG(dgm_dt.dgm_height) AS avg_gps_height
    FROM tm_gps, dgm_dt
    -- selecting DEM points that are within a buffer of 5 m around the GPS point
    WHERE ST_Contains(
        ST_Buffer(ST_Point(tm_gps.gps_long, tm_gps.gps_lat), 0.000045),
        ST_Point(dgm_dt.dgm_long, dgm_dt.dgm_lat))
    -- assigning the average height to the GPS point
    GROUP BY tm_gps.id, tm_gps.gps_long, tm_gps.gps_lat, tm_gps.gps_height;
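The buffer distance 0.000045 is given in degrees because the coordinates are longitude/latitude. The following Python sketch shows where that value plausibly comes from, assuming a spherical Earth with a mean radius of 6,371 km; the function names and the latitude of 52° N are illustrative assumptions, not part of the original slides.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation (assumption)


def metres_to_degrees_lat(metres):
    """Angle in degrees of latitude that spans the given north-south distance."""
    return math.degrees(metres / EARTH_RADIUS_M)


def metres_to_degrees_lon(metres, latitude_deg):
    """Angle in degrees of longitude that spans the given east-west distance
    at the given latitude (meridians converge towards the poles)."""
    radius_at_lat = EARTH_RADIUS_M * math.cos(math.radians(latitude_deg))
    return math.degrees(metres / radius_at_lat)


buffer_lat = metres_to_degrees_lat(5.0)
buffer_lon = metres_to_degrees_lon(5.0, 52.0)  # 52 deg N is an assumed example latitude
print(f"5 m in latitude:  {buffer_lat:.6f} deg")
print(f"5 m in longitude: {buffer_lon:.6f} deg")
```

Under these assumptions, 5 m corresponds to about 0.000045° of latitude but to a larger angle in longitude, so a buffer of 0.000045° is only roughly 3 m wide in the east-west direction at 52° N. A buffer defined in degrees is therefore an approximation of a 5 m circle.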
Configuration 1: GIS Tools for Hadoop - Example: enrichment of altitude data by spatial binning
Code (the explanations from the slide are given as SQL comments):
    -- spatial binning: partitioning space by a grid of fixed resolution,
    -- aggregation of height values from the DEM in each cell
    CREATE VIEW height_agg_bin AS
    SELECT bin_id,
           ST_BinEnvelope(0.0005, bin_id) shape,
           COUNT(*) count,
           AVG(dgm_height) height,
           MAX(dgm_height) max,
           MIN(dgm_height) min
    FROM (
        -- each cell has a unique ID; assigning the aggregated height values to the grid cells
        SELECT ST_Bin(0.0005, ST_Point(dgm_dt.dgm_long, dgm_dt.dgm_lat)) bin_id, *
        FROM dgm_dt
    ) bins
    GROUP BY bin_id;

    SELECT * FROM (
        -- determining the bin IDs of the GPS points
        SELECT *, ST_Bin(0.0005, ST_Point(gps_long, gps_lat)) AS bin_id
        FROM tm_gps
    ) t1
    -- joining the machine data to the height values
    LEFT OUTER JOIN height_agg_bin t2
    ON (t1.bin_id = t2.bin_id);
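The binning approach replaces a costly point-in-buffer test with an equality join on a cell ID. The following Python sketch mirrors that logic in memory. It is an illustration only: the cell key used here (column and row index) is a hypothetical stand-in and not the ID scheme of `ST_Bin`, and the sample coordinates are invented.

```python
import math
from collections import defaultdict

BIN_SIZE = 0.0005  # grid resolution in degrees, as in the query above


def bin_id(lon, lat, size=BIN_SIZE):
    """Hypothetical cell key: (column, row) index of the cell containing the point."""
    return (math.floor(lon / size), math.floor(lat / size))


def aggregate_heights(dem_points):
    """Aggregate DEM heights per grid cell (count, mean, max, min)."""
    cells = defaultdict(list)
    for lon, lat, height in dem_points:
        cells[bin_id(lon, lat)].append(height)
    return {
        cell: {"count": len(h), "height": sum(h) / len(h), "max": max(h), "min": min(h)}
        for cell, h in cells.items()
    }


def enrich(gps_points, height_by_bin):
    """Left outer join: every GPS point is kept; points in cells without
    DEM data get None instead of an aggregate."""
    return [(lon, lat, height_by_bin.get(bin_id(lon, lat))) for lon, lat in gps_points]


# Invented sample data: (lon, lat, height) and (lon, lat)
dem = [(8.00010, 52.00010, 100.0), (8.00020, 52.00020, 102.0), (8.00110, 52.00010, 110.0)]
gps = [(8.00030, 52.00030), (9.12340, 53.12340)]
enriched = enrich(gps, aggregate_heights(dem))
print(enriched)
```

The trade-off is accuracy at cell borders: a GPS point close to a cell edge only sees the DEM values of its own cell, not those of a nearer neighbouring cell.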
Configuration 2: ArcGIS Enterprise. ArcGIS Enterprise stack: the Hadoop cluster is integrated as a Big Data File Share, which extends the Hadoop system with ArcGIS software. API for Python: to use the ArcGIS Enterprise components in scripts. ArcGIS Pro: to operate processes on ArcGIS Enterprise.
Use Case example Field boundaries: detection of field boundaries on the basis of GPS points with GIS technologies, using the GeoAnalytics Server
Use Case example Field boundaries Step 1: Preprocessing Input data points: GPS position + timestamp Attributes = sensor data 21.09.2017
Use Case example Field boundaries Step 1: Preprocessing. Filtering of non-relevant data (street data, U-turns, null values, …)
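A minimal Python sketch of such a filter follows. The slides do not state how street data or U-turns were identified, so everything specific here is an assumption: the field names (`lon`, `lat`, `timestamp`, `speed_kmh`) are hypothetical, and road travel is approximated by a speed threshold. U-turn detection is not covered.

```python
MAX_FIELD_SPEED_KMH = 20.0  # hypothetical threshold separating field work from road travel

REQUIRED_FIELDS = ("lon", "lat", "timestamp", "speed_kmh")  # hypothetical field names


def is_relevant(record):
    """Keep a record only if all required values are present and the
    speed is plausible for field work."""
    if any(record.get(field) is None for field in REQUIRED_FIELDS):
        return False
    return record["speed_kmh"] <= MAX_FIELD_SPEED_KMH


# Invented sample records
raw = [
    {"lon": 8.1, "lat": 52.0, "timestamp": 0, "speed_kmh": 6.5},    # field work
    {"lon": 8.1, "lat": None, "timestamp": 10, "speed_kmh": 6.0},   # null value
    {"lon": 8.2, "lat": 52.1, "timestamp": 20, "speed_kmh": 38.0},  # likely road travel
]
filtered = [r for r in raw if is_relevant(r)]
print(len(filtered), "of", len(raw), "records kept")
```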
Use Case example Field boundaries Step 2: Reconstruction of field trajectories “Reconstruct Tracks” Tool 21.09.2017
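The principle behind track reconstruction can be sketched in a few lines of Python. This is not the implementation of the "Reconstruct Tracks" tool; it is a simplified illustration under assumed rules: points are grouped per machine, ordered by time, and a new track starts whenever the gap between consecutive points exceeds a threshold. The field names `machine_id` and `t` (seconds) and the 60 s threshold are hypothetical.

```python
from collections import defaultdict


def reconstruct_tracks(points, max_gap_s=60):
    """Return a list of tracks; each track is a time-ordered list of points
    from one machine without a time gap larger than max_gap_s."""
    by_machine = defaultdict(list)
    for point in points:
        by_machine[point["machine_id"]].append(point)

    tracks = []
    for machine_points in by_machine.values():
        machine_points.sort(key=lambda p: p["t"])
        current = [machine_points[0]]
        for prev, cur in zip(machine_points, machine_points[1:]):
            if cur["t"] - prev["t"] > max_gap_s:
                tracks.append(current)  # gap too large: close the track
                current = []
            current.append(cur)
        tracks.append(current)
    return tracks


# Invented sample points (timestamps in seconds)
points = [
    {"machine_id": "m1", "t": 10}, {"machine_id": "m1", "t": 0},
    {"machine_id": "m2", "t": 5},
    {"machine_id": "m1", "t": 200}, {"machine_id": "m1", "t": 210},
]
tracks = reconstruct_tracks(points)
print([[p["t"] for p in track] for track in tracks])
```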
Use Case example Field boundaries Step 3: Grouping Grouping field tracks to fields by dissolving or rastering and grouping the tracks. 21.09.2017
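The rastering variant can be sketched as follows in Python. This is an assumed, simplified reading of the step, not the workflow used on the GeoAnalytics Server: track points in planar coordinates (metres) are assigned to raster cells, and cells that touch each other (8-neighbourhood) are grouped into one field candidate. The cell size and the sample coordinates are invented.

```python
import math
from collections import deque


def rasterize(points, cell_size):
    """Map planar (x, y) points to the set of raster cells they occupy."""
    return {(math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points}


def group_cells(cells):
    """Group occupied cells into 8-connected components (breadth-first search)."""
    remaining = set(cells)
    groups = []
    while remaining:
        start = remaining.pop()
        group = {start}
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (cx + dx, cy + dy)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        group.add(neighbour)
                        queue.append(neighbour)
        groups.append(group)
    return groups


# Invented track points in metres; two clusters far apart
track_points = [(2.0, 3.0), (12.0, 4.0), (18.0, 14.0), (205.0, 207.0), (214.0, 216.0)]
field_groups = group_cells(rasterize(track_points, cell_size=10.0))
print(sorted(len(g) for g in field_groups))
```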
Use Case example Field boundaries Step 4: Generating field boundaries. Extracting the field polygons by expanding the field tracks to the work width of the machine. Known issue: Hadoop is secured by the Apache Knox Gateway, so a workaround is required.
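Expanding a track to the work width amounts to buffering the track line by half the work width on each side. The Python sketch below shows this for a single straight track segment in planar coordinates (metres). It is an illustration under simplifying assumptions, not the GIS tool used in the project: flat end caps, one segment only, and invented sample values.

```python
import math


def segment_to_rectangle(p0, p1, work_width):
    """Return the four corners of the rectangle covered when a machine of the
    given work width drives in a straight line from p0 to p1."""
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("segment has zero length")
    # Unit vector perpendicular to the driving direction
    nx, ny = -dy / length, dx / length
    half = work_width / 2
    return [
        (x0 + nx * half, y0 + ny * half),
        (x1 + nx * half, y1 + ny * half),
        (x1 - nx * half, y1 - ny * half),
        (x0 - nx * half, y0 - ny * half),
    ]


def polygon_area(vertices):
    """Area of a simple polygon (shoelace formula)."""
    total = 0.0
    for (xa, ya), (xb, yb) in zip(vertices, vertices[1:] + vertices[:1]):
        total += xa * yb - xb * ya
    return abs(total) / 2


# Invented example: 10 m segment, 6 m work width
corners = segment_to_rectangle((0.0, 0.0), (10.0, 0.0), work_width=6.0)
print(corners, polygon_area(corners))
```

A field polygon would then be the union of such shapes over all segments of the grouped tracks; that union step needs a geometry library and is left out here.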
4. Résumé
Conclusion: 1. GIS Tools for Hadoop: the Spatial Framework for Hadoop is open source and easy to integrate; next steps: using Spark on Hadoop, with GeoSpark and other geospatial packages for Spark. 2. Desktop GIS provides additional GIS tools that are not available in the GIS Tools for Hadoop. 3. ArcGIS Enterprise: the big data technology stack can enhance the analysis of machine data, but the integration with the existing big data infrastructure does not work reliably.
Occupational Areas 21.09.2017 Company Presentation CLAAS E-Systems | CLAAS E-Systems
Employment Figures
Entry Opportunities
Thank you for your attention!