1st International KEYSTONE Conference (IKC 2015)
Coimbra, Portugal, 8-9 September 2015
Keyword-Based Search over Environmental Datasets
José R.R. Viqueira, Alberto Bugarín, Joaquín Triñanes, Jaime Martínez-Urtaza
Outline
  MOTIVATIONAL EXAMPLES
  ENVIRONMENTAL DATASETS
  KEYWORD-BASED SEARCH
  PROPOSED METHODOLOGY
MOTIVATIONAL EXAMPLES
Public health and water microbiology (Cholera risk)
  High water temperatures and rainfall close to sea level during monsoon season
"Water temperatures" and "rainfall": environmental properties
  Many datasets, even for the same property
    Sea Surface Temperature (SST) [NOAA]
    Rainfall (TRMM) [NASA]
  Geographic Coverages (fields, grids, etc.)
    Very large multidimensional arrays of scientific data
    Dimensions: latitude, longitude, time, depth, etc.
"High": fuzzy linguistic value
  Meaning depends on the context of application, geographic location, time period, …
  Relative to the mean of the same month during the last 7 years for the specific location
  World Ocean Atlas (WOA): statistical mean of temperature
  [Figures: NOAA cwblendednightSST, 22/07/2015; WOA statistical mean temperature, July]
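One possible reading of "high" as a fuzzy linguistic value can be sketched as follows. This is only an illustration under stated assumptions: the function name `high_membership`, the linear ramp, the `full_at` threshold of 2 degrees and the sample temperatures are all hypothetical and do not come from the slides. The only idea taken from the slide is that the value is judged relative to the mean of the same month over previous years at the same location.

```python
from statistics import mean


def high_membership(value, same_month_history, full_at=2.0):
    """Degree in [0, 1] to which `value` counts as "high".

    `same_month_history` holds the values observed at the same location
    in the same month of previous years (for example the last 7 years).
    `full_at` is a hypothetical tuning parameter: the positive anomaly,
    in the units of `value`, at which the membership reaches 1.
    """
    baseline = mean(same_month_history)
    anomaly = value - baseline
    # Linear ramp clipped to [0, 1]: 0 at or below the baseline,
    # 1 once the anomaly reaches `full_at`.
    return max(0.0, min(1.0, anomaly / full_at))


# Made-up July sea surface temperatures (degrees Celsius) for one location.
july_sst_last_7_years = [27.0, 27.5, 28.0, 27.5, 28.0, 27.5, 27.5]
print(high_membership(29.0, july_sst_last_7_years))
```

Because the baseline is computed per location and per month, the same absolute temperature can be "high" in one place or season and unremarkable in another, which is the context dependence the slide points out.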
"Close to sea level": fuzzy geographic restriction
  Close to the coastline: Geographic Entity (E/R data)
  Elevation of about 0 meters above sea level: Geographic Coverage
  [Figures: coastline, NOAA GSHHG; elevation, NOAA ETOPO1]
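The two readings of "close to sea level" can be illustrated with two simple membership functions, one over a distance to a coastline entity and one over an elevation coverage value. This is a minimal sketch, not the authors' method: the function names, the triangular and ramp shapes, and the thresholds (10 m, 1 km, 20 km) are assumptions chosen only for the example.

```python
def about_sea_level(elevation_m, tolerance_m=10.0):
    """Coverage-based reading: elevation of about 0 m above sea level.

    Triangular membership centred on 0 m. `tolerance_m` is a hypothetical
    parameter: the absolute elevation at which membership drops to 0.
    """
    return max(0.0, 1.0 - abs(elevation_m) / tolerance_m)


def close_to_coast(distance_km, full_km=1.0, zero_km=20.0):
    """Entity-based reading: close to the coastline.

    Membership is 1 up to `full_km` from the coast, 0 beyond `zero_km`,
    and decreases linearly in between. Both thresholds are hypothetical.
    """
    if distance_km <= full_km:
        return 1.0
    if distance_km >= zero_km:
        return 0.0
    return (zero_km - distance_km) / (zero_km - full_km)


# Made-up location: 3 m above sea level, 5 km from the nearest coastline.
print(about_sea_level(3.0), close_to_coast(5.0))
```

The first function needs a value sampled from a coverage such as an elevation grid, while the second needs a distance computed against a geometric entity, so the same phrase leads to two different kinds of dataset.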
"During monsoon season": fuzzy temporal restriction
  Monsoon season (Malaysia)
    Northeast Monsoon, from November to March (more rainfall)
    Southwest Monsoon, from May to September (less rainfall)
  [Figures: NOAA Blended Sea Winds, January; NOAA Blended Sea Winds, July]
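The temporal restriction can be sketched as a fuzzy membership over calendar months. The month ranges below are the ones given on the slide for Malaysia; everything else (the names, the 0.5 degree assigned to the month just before and just after each season, and taking the maximum over the two monsoons) is a hypothetical choice made for illustration.

```python
# Month ranges from the slide (Malaysia), as (start_month, end_month).
NORTHEAST = (11, 3)  # November to March, wraps around the year end
SOUTHWEST = (5, 9)   # May to September


def in_season(month, start, end):
    """True if `month` (1-12) lies in the inclusive range, allowing wrap-around."""
    if start <= end:
        return start <= month <= end
    return month >= start or month <= end


def season_membership(month, season, edge=0.5):
    """Degree to which `month` belongs to `season`.

    Months inside the range get 1. The month immediately before the start
    and the month immediately after the end get `edge`, a hypothetical
    value modelling the fuzzy boundary. All other months get 0.
    """
    start, end = season
    if in_season(month, start, end):
        return 1.0
    before = (start - 2) % 12 + 1  # month preceding `start`
    after = end % 12 + 1           # month following `end`
    return edge if month in (before, after) else 0.0


def monsoon_membership(month):
    """Degree to which `month` is in either monsoon (fuzzy union via max)."""
    return max(season_membership(month, NORTHEAST),
               season_membership(month, SOUTHWEST))


for m in range(1, 13):
    print(m, monsoon_membership(m))
```

With these assumptions almost every month belongs to one monsoon or the other, so a query that mentions rainfall may need to be resolved to a specific monsoon rather than to "monsoon season" in general.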
MOTIVATIONAL EXAMPLES
Severe weather reporting
  Hurricane of high category close to highly populated areas
  Strong winds and heavy swell on marine traffic routes
  Hailstorm on the highway
Tourism
  Beach protected from wind, with warm water and air temperature, and few waves
ENVIRONMENTAL DATASETS
Do not always fit the E/R model
  Geographic Entities
    Entities with geometric properties
    Relational data
    Rivers, Meteostations, Municipalities, etc.
  Geographic Coverages
    Mappings with geographic domain
    Multidimensional array data
    Temperature, rainfall, elevation, salinity, etc.
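The contrast between the two kinds of data can be sketched with two small Python types. This is only an illustration: the class names, fields and the regular-grid layout are assumptions, and real coverages are usually stored in formats such as NetCDF rather than nested lists.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeographicEntity:
    """An entity with a geometric property, which fits a relational (E/R) model."""
    name: str
    kind: str                             # e.g. "river", "municipality"
    geometry: List[Tuple[float, float]]   # (lon, lat) vertices


@dataclass
class GridCoverage:
    """A mapping from a geographic domain to values, stored as a regular grid."""
    prop: str                  # e.g. "temperature"
    lat0: float                # latitude of row 0
    lon0: float                # longitude of column 0
    step: float                # grid spacing in degrees (hypothetical layout)
    values: List[List[float]]  # values[row][col]

    def value_at(self, lat: float, lon: float) -> float:
        """Return the value of the nearest grid cell to (lat, lon)."""
        row = round((lat - self.lat0) / self.step)
        col = round((lon - self.lon0) / self.step)
        if not (0 <= row < len(self.values) and 0 <= col < len(self.values[row])):
            raise ValueError("location outside the coverage domain")
        return self.values[row][col]


# Made-up data for illustration only.
station = GeographicEntity("Station A", "meteostation", [(-8.5, 42.9)])
sst = GridCoverage("temperature", 0.0, 0.0, 1.0, [[1.0, 2.0], [3.0, 4.0]])
print(station.name, sst.value_at(1.0, 0.0))
```

An entity is looked up by its attributes and geometry, whereas a coverage is evaluated at a location, which is why coverage data does not map naturally onto entities and relationships.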
ENVIRONMENTAL DATASETS
Many large or very large datasets
  Stations, buoys, satellites, radars, etc.
    Blended Sea Winds: over 300 gigabytes
  Prediction, reanalysis, etc.
    Climate Forecast System Version 2 (CFSv2): over 500 terabytes
Highly heterogeneous
  Formats and encodings
    Shape Files, GML, GeoJSON, etc.
    GeoTiff, ASC, NetCDF, HDF, GRIB, etc.
  Semantics
    Temperature, Sea Temperature, Sea Surface Temperature, air temperature, etc.
ENVIRONMENTAL DATASETS
Open Data
  Spatial Data Infrastructures (SDIs)
    Global Spatial Data Infrastructure (GSDI)
    INSPIRE
    NSDI
  Unidata
  OpenDAP
Data models, Vocabularies, Ontologies
  Observations and Measurements (O&M)
  Sensor Model Language (SensorML)
  Semantic Web for Earth and Environmental Terminology (SWEET)
  CF Conventions and Metadata
  World Meteorological Organization Manual on Codes
  SeaDataNet Sensor Vocabularies
  etc.
KEYWORD-BASED SEARCH
Unstructured data sources
  Information Retrieval (IR)
    Text data
  Multimedia Information Retrieval (MIR)
    Audio, images, video
Structured data sources
  Relational DBMSs
  Linked Data
  E/R based data!
PROPOSED METHODOLOGY
[Architecture diagram. Recoverable components:]
  Fuzzy Spatio-temporal Keyword-based Search
    Language/grammar
    Fuzzy Spatio-temporal extension of each keyword in each dataset
  Search Engine
  Spatio-temporal Keyword-Based Index
  Machine Learning, Data to Text Systems, Fuzzy-based approaches, Computing with Words
  Crawling, Annotation, Indexing, ETL?
  Vocabulary
  Metadata Repositories
  Entity-Based Datasets
  Coverage-Based Datasets
1st International KEYSTONE Conference (IKC 2015)
Coimbra, Portugal, 8-9 September 2015
Keyword-Based Search over Environmental Datasets
  José R.R. Viqueira (jrr.viqueira@usc.es)
  Alberto Bugarín (alberto.bugarin@usc.es)
  Joaquín Triñanes (joaquin.trinanes@usc.es)
  Jaime Martínez-Urtaza (j.l.martinez-urtaza@bath.ac.uk)