Transdisciplinary Foundations of Spatial Data Science



  1. Transdisciplinary Foundations of Spatial Data Science
     April 27th, 2018
     Workshop on Illuminating Space and Time in Data Science, Center for Geographical Information Systems, Harvard University
     Shashi Shekhar, McKnight Distinguished University Professor, Dept. of Computer Sc. and Eng., University of Minnesota
     www.cs.umn.edu/~shekhar : shekhar@umn.edu

  2. NSF 1737633: Connecting the Smart-City Paradigm with a Sustainable Urban Infrastructure Systems Framework to Advance Equity in Communities (2017-2020)
     S. Shekhar, A. Ramaswami, R. Feiock, V. Merwade, J. Marshall
     Major Research Innovations
     • Comprehensive fine intra-urban scale data (SEIU-EHW parameters in Figure 1)
     • Spatial Data Science to understand relationships (Figure 2)
     • Model & visualize multi-infrastructure spatial smart city futures
     • Knowledge co-production theories, science and practice
     Figure 1. Complex Interactions among SEIU and EHW parameters
     Figure 2. Spatial Patterns

  3. NSF 1737633: Connecting the Smart-City Paradigm with a Sustainable Urban Infrastructure Systems Framework to Advance Equity in Communities (2017-2020)
     S. Shekhar, A. Ramaswami, R. Feiock, V. Merwade, J. Marshall
     • Co-Visioning via meetings
     • Plan infrastructure for driver-less, post-carbon future with climate change
     • Advance Environment, Health, Wellbeing & Equity via infrastructure refinement
     • Co-select Questions
       – Understand spatial equity in infrastructure & outcomes (wellbeing, health, environment)?
       – How does an equity-first approach differ from average-outcome based approaches?
     • Problem Co-Definition: How to measure spatial equity? Well-being?
     • Co-Discovery
     • Co-Evaluation
     Diagram labels: Research; Social Diversity; Education; Equity; Community Partners & Outreach
     • Details: University of Minnesota secures $2.5 million grant to improve quality of life in cities, October 20, 2017 ( https://www.cs.umn.edu/news/filter/highlights/professor-shekhar-leads-u-m-team-granted-25-million-nsf-grant )

  4. History of Spatial Data Science in S&CC
     • 1854: What causes cholera?
     • Discovery cycle (diagram): (1) Collect & Curate Data; (2) Discover Patterns, Generate Hypothesis (water pump?); (3) Test Hypothesis via Controlled Experiments (remove pump handle); (4) Develop Theory (Germ Theory)
     • Impact on cities: health & well-being, parks, sewage system, drinking water supply, …
     • Q: What are the choleras of today?
     • Q: How may spatial data science help?

  5. Today’s Transdisciplinary Spatial Data Science
     • Spatial Statistics: Test to reduce spurious patterns
     • Computer Sc.: Algorithms for large (e.g., national) data
     • Mathematics: Reduce missed patterns
       – SaTScan enumerates only 2-point circles
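The "test to reduce spurious patterns" idea can be illustrated with a Monte Carlo significance test. The sketch below is not the procedure used by SaTScan or by the speaker: it assumes a single candidate circle fixed in advance, uses the raw point count inside it as the test statistic, and assumes complete spatial randomness in the unit square as the null model. The function name `monte_carlo_p_value` and all parameters are hypothetical.

```python
import random
from math import dist


def monte_carlo_p_value(points, center, radius, n_sims=99, seed=0):
    """Estimate how often uniformly random points fill the circle as densely as the data.

    Assumptions (illustrative only): the study area is the unit square, the
    candidate circle is fixed before looking at the data, and the statistic
    is the plain count of points inside the circle.
    """
    rng = random.Random(seed)

    def count_inside(pts):
        return sum(dist(p, center) <= radius for p in pts)

    observed = count_inside(points)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Null model: same number of points, placed uniformly at random.
        simulated = [(rng.random(), rng.random()) for _ in points]
        if count_inside(simulated) >= observed:
            at_least_as_extreme += 1
    # Standard Monte Carlo p-value: the observed data counts as one replicate.
    return (1 + at_least_as_extreme) / (1 + n_sims)


# Toy data: 20 points packed near (0.5, 0.5) plus 10 background points.
toy_rng = random.Random(1)
cluster = [(0.5 + toy_rng.uniform(-0.05, 0.05), 0.5 + toy_rng.uniform(-0.05, 0.05))
           for _ in range(20)]
background = [(toy_rng.random(), toy_rng.random()) for _ in range(10)]
p_value = monte_carlo_p_value(cluster + background, center=(0.5, 0.5), radius=0.1)
print(p_value)
```

A small p-value suggests the concentration is unlikely under the null model. Scanning many candidate circles and keeping the best one would require comparing against the maximum statistic in each simulation, which this sketch omits.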

  6. Theme 2: Spatial Data Analysis of SEIU-WHE Parameters
     • Task 2A: Develop algorithms to discover statistically significant linear and buffer hotspots, e.g., of income-poverty, consumption, pollution exposure, and low wellbeing
     • Task 2B: Discover co-location and teleconnection patterns: Develop scalable algorithms for identifying correlations in SEIU-WHE parameters, e.g., hotspots and deprived areas
     • Task 2C: Data-Driven and Discipline-inspired hypotheses

  7. Task 2A: Discovering Linear and Buffer Hotspots
     • Hotspots often lie along a spatial network (e.g., air pollution hotspots along roads)
     • Preliminary results: Linear hotspot detection, which models the linear semantics
       – However, only along shortest paths between end-points
       – Does not include the information surrounding the network
     • Proposed approach:
       – Novel notion: Non-shortest-path simple paths, buffer hotspots
       – Potential solution: graph partitioning based divide and conquer
     Figure panels: (a) Circular hotspots for pedestrian fatalities; (b) Linear hotspots for pedestrian fatalities; (c) Example of non-shortest path
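To make "linear hotspots along shortest paths between end-points" concrete, here is a minimal sketch. It assumes a toy undirected road graph, activity counts attached to edges, and a density-ratio score (activity per unit length on the path divided by the same rate off the path). The score, the data, and the helper names `shortest_path` and `density_ratio` are illustrative assumptions, not the published algorithm, which also tests statistical significance.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra over graph[u] = {v: length}; returns the node list or None."""
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in best:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def density_ratio(graph, activity, path):
    """Activity per unit length on the path relative to the rest of the network."""
    on_path = {frozenset(e) for e in zip(path, path[1:])}
    lengths = {frozenset((u, v)): w for u in graph for v, w in graph[u].items()}
    len_in = sum(lengths[e] for e in on_path)
    len_out = sum(w for e, w in lengths.items() if e not in on_path)
    act_in = sum(activity.get(e, 0) for e in on_path)
    act_out = sum(n for e, n in activity.items() if e not in on_path)
    if act_out == 0 or len_out == 0:
        return float("inf")  # nothing to compare against off the path
    return (act_in / len_in) / (act_out / len_out)


# Hypothetical 4-node road network (edge lengths) and per-edge activity counts.
roads = {
    "a": {"b": 1.0, "d": 2.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 2.0},
    "d": {"a": 2.0, "c": 2.0},
}
activity = {
    frozenset(("a", "b")): 4,
    frozenset(("b", "c")): 4,
    frozenset(("a", "d")): 1,
    frozenset(("c", "d")): 1,
}
path = shortest_path(roads, "a", "c")
ratio = density_ratio(roads, activity, path)
print(path, ratio)
```

Because only shortest paths between end-points are scored, a concentration along the longer route a-d-c would never be evaluated as a single candidate, which is the limitation the slide's non-shortest-path proposal targets.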

  8. Task 2B: Discover co-location and teleconnection patterns
     • Challenge: Spatial partitioning distorts (& misses) spatial interactions!
     • Spatial Statistical Methods are computationally expensive
     • Prelim. Results: Fast algorithms for mining Co-location (& Teleconnection)
     • Proposed: address data with multiple levels of aggregation, e.g., areal summary
     Figure panels: (a) a map of 3 features; (b) Spatial Partitions; (c) Neighbor graph
     Table columns: Pearson’s Correlation, Ripley’s cross-K, Participation Index. Values as extracted: - -0.90 0.33 0.5 - 1 0.5 1
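As a rough illustration of the Participation Index named in the table, the sketch below computes it for a pair of features. For each feature it takes the fraction of instances that have a neighbor of the other feature, then keeps the minimum of the two fractions. The distance-threshold neighbor relation, the toy coordinates, and the function name `participation_index` are assumptions for illustration. This brute-force version is not one of the fast algorithms the slide refers to.

```python
from math import dist


def participation_index(instances, feat_a, feat_b, radius):
    """Participation index of the pair {feat_a, feat_b}.

    instances: list of (feature_label, (x, y)) tuples.
    Two instances are treated as neighbors when they lie within `radius`
    of each other (an assumed neighbor relation).
    """
    a_pts = [xy for f, xy in instances if f == feat_a]
    b_pts = [xy for f, xy in instances if f == feat_b]
    if not a_pts or not b_pts:
        return 0.0
    # Participation ratio: share of a feature's instances with a neighbor of the other feature.
    a_hits = sum(any(dist(a, b) <= radius for b in b_pts) for a in a_pts)
    b_hits = sum(any(dist(a, b) <= radius for a in a_pts) for b in b_pts)
    return min(a_hits / len(a_pts), b_hits / len(b_pts))


# Toy map: both A instances have a nearby B, but only 2 of 4 B instances have a nearby A.
toy = [
    ("A", (0.0, 0.0)), ("A", (5.0, 5.0)),
    ("B", (0.5, 0.0)), ("B", (5.0, 5.5)), ("B", (9.0, 9.0)), ("B", (0.0, 9.0)),
]
pi_ab = participation_index(toy, "A", "B", radius=1.0)
print(pi_ab)  # 0.5
```

Unlike a correlation computed on per-partition counts, this measure works directly on the neighbor relation, so it does not depend on where partition boundaries fall.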

  9. Data-Intensive Science of S&CC in 21st Century
     • Cycle (diagram): (1) Collect & Curate Big Data (Volume, Variety); (2) Spatial Patterns, Hypothesis Generation; (3) Test Hypothesis (Policy Intervention); (4) S&CC Theory
     • Spatial patterns: hotspots of infrastructure deprivation, consumption, pollution, investment, disease & well-being. Correlates?
     • Diagram labels: SEIU, EHW; equity first policies; role of policies & urban forms
     • Data-driven and Discipline-inspired hypothesis generation

  10. Challenges Ahead
     • Non-stationarity
     • Change, e.g., climate, Web, …
     • Feedback Loops, e.g., Social
     • Fairness
     • Accountability
     • Transparency

  11. References: Surveys, Overviews
     • Spatial Computing, Communications of the ACM, 59(1):72-81, January 2016.
     • Transdisciplinary Foundations of Geospatial Data Science, ISPRS Intl. Jr. of Geo-Information, 6(12):395-429, 2017. (DOI: 10.3390/ijgi6120395)
     • Spatiotemporal Data Mining: A Computational Perspective, ISPRS Intl. Jr. of Geo-Information, 4(4):2306-2338, 2015. (DOI: 10.3390/ijgi4042306)
     • Identifying patterns in spatial information: a survey of methods, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(3):193-214, May/June 2011. (DOI: 10.1002/widm.25)
     • Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data, IEEE Transactions on Knowledge and Data Engineering, 29(10):2318-2331, June 2017. (DOI: 10.1109/TKDE.2017.2720168)
     • Parallel Processing over Spatial-Temporal Datasets from Geo, Bio, Climate and Social Science Communities: A Research Roadmap, IEEE BigData Congress 2017: 232-250.
     • Spatial Databases: Accomplishments and Research Needs, IEEE Transactions on Knowledge and Data Engineering, 11(1):45-55, 1999.

  12. References: Details
     Colocations
     • Discovering colocation patterns from spatial data sets: a general approach, IEEE Trans. on Know. and Data Eng., 16(12), 2004 (w/ Y. Huang et al.).
     • A join-less approach for mining spatial colocation patterns, IEEE Trans. on Know. and Data Eng., 18(10), 2006 (w/ J. Yoo).
     • Cascading Spatio-Temporal Pattern Discovery, IEEE Trans. Knowl. Data Eng. 24(11): 1977-1992, 2012 (w/ P. Mohan et al.).
     Spatial Outliers
     • Detecting graph-based spatial outliers: algorithms and applications (a summary of results), Proc. ACM Intl. Conf. on Knowledge Discovery & Data Mining, 2001 (w/ Q. Lu et al.).
     • A unified approach to detecting spatial outliers, Springer GeoInformatica, 7(2), 2003 (w/ C. Lu et al.).
     • Discovering Flow Anomalies: A SWEET Approach, IEEE Intl. Conf. on Data Mining, 2008 (w/ J. Kang).
     Hot Spots
     • Discovering personally meaningful places: An interactive clustering approach, ACM Trans. on Info. Systems (TOIS) 25(3), 2007 (w/ C. Zhou et al.).
     • A K-Main Routes Approach to Spatial Network Activity Summarization, IEEE Trans. on Know. & Data Eng., 26(6), 2014 (w/ D. Oliver et al.).
     • Significant Linear Hotspot Discovery, IEEE Trans. Big Data 3(2): 140-153, 2017 (w/ X. Tang et al.).
     Location Prediction
     • Spatial contextual classification and prediction models for mining geospatial data, IEEE Transactions on Multimedia, 4(2), 2002 (w/ P. Schrater et al.).
     • Focal-Test-Based Spatial Decision Tree Learning, IEEE Trans. Knowl. Data Eng. 27(6): 1547-1559, 2015 (summary in Proc. IEEE Intl. Conf. on Data Mining, 2013) (w/ Z. Jiang et al.).
     Change Detection
     • Spatiotemporal change footprint pattern discovery: an inter-disciplinary survey, Wiley Interdisc. Rev.: Data Mining and Know. Discovery 4(1), 2014 (w/ X. Zhou et al.).

  13. Knowledge Co-Production: NSF Smart & Connected Communities Grant 1737633 (2017-2020)
     • Co-Visioning via meetings
     • Plan infrastructure for driver-less, post-carbon future with climate change
     • Advance Environment, Health, Wellbeing & Equity via infrastructure refinement
     • Co-select Questions
       – Understand spatial equity in infrastructure & outcomes (wellbeing, health, environment)?
       – How does an equity-first approach differ from average-outcome based approaches?
     • Problem Co-Definition: How to measure spatial equity? Well-being?
     • Co-Discovery
     • Co-Evaluation
     Diagram labels: Research; Social Diversity; Education; Equity; Community Partners & Outreach
     • Details: University of Minnesota secures $2.5 million grant to improve quality of life in cities, October 20, 2017 ( https://www.cs.umn.edu/news/filter/highlights/professor-shekhar-leads-u-m-team-granted-25-million-nsf-grant )
