Data Science for scaling water research Jordan S Read, USGS Office of Water Information U.S. Department of the Interior U.S. Geological Survey
Water Issues at Continental Scales • Water quality and quantity are changing at a scale never seen before • Global food and energy networks place demands on water resources, and we are only beginning to understand the implications Societal need: Understand water use trade-offs for energy, the environment, and human health
Existing USGS resources and strengths A network of… Science Integrity People Data
Existing USGS resources and strengths: Data The National Hydrography Dataset • 6.7M lakes, ponds, and impoundments • 2.6M stream reaches https://nhd.usgs.gov
Existing USGS resources and strengths: Data National Water Information System • > 850,000 station years of stream levels, discharge, reservoir and lake levels, surface-water quality, and rainfall https://waterdata.usgs.gov
Existing USGS resources and strengths: Data National Water Information System • Water use data: How and where we use water (1950-2010) • Various categories and spatial resolution of reporting https://water.usgs.gov/watuse
Existing USGS resources and strengths: Data The Water Quality Portal • > 450 monitoring groups • 2.7M sites, ~300M records • Upstream/downstream queries https://www.waterqualitydata.us
Existing USGS resources and strengths: Data USGS Landsat • > 40 years of moderate resolution multispectral data • 51M+ scenes downloaded https://landsat.usgs.gov/
Emerging challenges Understand water use trade-offs for energy, the environment, and human health • A new observation paradigm • Shifts in the design of research collaborations • Declining research budgets
Emerging challenges: A new observation paradigm Understand water use trade-offs for energy, the environment, and human health Familiar datasets for water resources: • Continuous or discrete measurements from a site (e.g., gage height) • Uniform grids (e.g., images; climate data) Familiar data exchange formats, with familiar tools: • WaterML 2.0; *.csv • GeoTIFF; netCDF
Emerging challenges: A new observation paradigm Understand water use trade-offs for energy, the environment, and human health • Environmental DNA (eDNA) • “census the wild with a jar” • Moving sensors or frames of reference • Unmanned vehicles/sensors; structure from motion • Footprint or integrator measurements • Environmental exposure/duration; passive samplers • Data variety • Citizen; hyperspectral; WQ lab sample; RS image • Internet of things • Lab on a chip; personalized water use data; water infrastructure monitoring
Emerging challenges: A new observation paradigm Understand water use trade-offs for energy, the environment, and human health Can we afford to leave data on the table?
Emerging challenges: Research collaboration shifts Understand water use trade-offs for energy, the environment, and human health Do we need more people at the table?
Emerging challenges: Research collaboration shifts Understand water use trade-offs for energy, the environment, and human health • The size and makeup of our collaborations are changing • Larger and more diverse teams • New specializations • Increasing role for technologists [Figure: team makeup from "Data poor" to "Data rich"; labels: Domain, Technology, Domain Scientists, Data Scientists]
How does Data Science function at a science agency? image: The Economist
Data Science for scaling water research • Domain research may leave information on the table • Business-oriented data science may ignore systems understanding [Figure: "Use of Scientific Knowledge" (Low to High) versus "Use of Data" (Low to High); labels: Theory-based Models, Machine learning models. Adapted from Karpatne et al. 2017]
Data Science for scaling water research [Figure: "Use of Scientific Knowledge" (Low to High) versus "Use of Data" (Low to High); labels: Theory-based Models, Theory-guided Data Science Models, Machine learning models. Adapted from Karpatne et al. 2017]
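One way to picture a theory-guided data science model is a loss function that combines a data term with a penalty for physically inconsistent predictions. The sketch below is illustrative only and is not taken from the slides: the lake temperature setting, the function names `water_density` and `theory_guided_loss`, the `weight` parameter, and the empirical density formula are all assumptions chosen for the example.

```python
def water_density(temp_c):
    """Approximate freshwater density (kg/m^3) at temperature temp_c (deg C).

    Assumption: a common empirical formula, used here only for illustration.
    """
    return 1000.0 * (
        1.0
        - (temp_c + 288.9414) * (temp_c - 3.9863) ** 2
        / (508929.2 * (temp_c + 68.12963))
    )


def theory_guided_loss(pred_temp, obs_temp, weight=1.0):
    """Mean squared error plus a penalty for density decreasing with depth.

    Both sequences hold temperatures (deg C) ordered from shallow to deep.
    """
    n = len(pred_temp)
    # Data term: how far predictions are from observations.
    data_loss = sum((p - o) ** 2 for p, o in zip(pred_temp, obs_temp)) / n

    # Theory term: a shallower layer should not be denser than the layer below.
    density = [water_density(t) for t in pred_temp]
    violations = [
        max(upper - lower, 0.0)
        for upper, lower in zip(density[:-1], density[1:])
    ]
    physics_loss = sum(violations) / len(violations) if violations else 0.0

    return data_loss + weight * physics_loss


if __name__ == "__main__":
    observed = [20.0, 15.0, 10.0]          # warm surface, cool bottom
    print(theory_guided_loss([20.0, 15.0, 10.0], observed))
    print(theory_guided_loss([10.0, 15.0, 20.0], observed))  # inverted profile
```

Setting `weight=0.0` reduces this to a purely data-driven loss, and a large `weight` pushes the model toward the theory, which mirrors the two axes of the figure.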
Data Science for scaling water research How does Data Science function at a science agency? [Figure: "Use of Scientific Knowledge" (Low to High) versus "Use of Data" (Low to High); labels: Domain Scientist, Data Scientist]
Data Science for scaling water research • Thinking across scales • Interdisciplinary research teams • “Macrosystems” science • Computational thinking and practice • Embedded computer science concepts within collaborative teams • Prioritization of democratized data and technology • Building relevant tools and training scientists • Access to data and computing resources • Thoughtful data web-service APIs • Infrastructure and resources for HPC/HTC • Long-term sustainability [Slide annotations: "Practical near-term limits to scaling"; a "science" to "technology" label running from the top of the list to the bottom; "USGS Water enterprise data systems emphasis" beside the data and computing resources items]