Beyond GIS: Spatial On-Line Analytical Processing and Big Data

Beyond GIS: Spatial On-Line Analytical Processing and Big Data. Professor Yvan Bedard, PhD, P.Eng., Centre for Research in Geomatics, Laval Univ., Quebec, Canada. The Dangermond Lecture, UCSB Dept of Geography, Santa Barbara, CA, USA, February 6th


  1. Beyond GIS: Spatial On-Line Analytical Processing and Big Data. Professor Yvan Bedard, PhD, P.Eng., Centre for Research in Geomatics, Laval Univ., Quebec, Canada. The Dangermond Lecture, UCSB Dept of Geography, Santa Barbara, CA, USA, February 6th, 2014

  2. Presentation
     - Origin of SOLAP
     - Nature
     - Evolution
     - Examples of applications
     - State-of-the-art for today’s technology
     - Challenges that remain
     - SOLAP and Big Data

  3. ORIGINS OF SOLAP

  4. Origins
     - Organisations worldwide invest hundreds of millions of dollars annually to acquire large amounts of data about the land, its resources and uses
     - These data, however, prove difficult to use by managers who need:
       - aggregated information
       - trends analysis
       - spatial comparisons
       - space-time correlations
       - fast synthesis over time
       - unexpected queries
       - interactive exploration
       - crosstab analysis
       - geographic knowledge discovery
       - hypothesis development
       - etc.

  5. Barriers to analysis with transactional systems
     - GIS and DBMS designs are transactional by nature
       - Oriented towards data acquisition, storing, updating, integrity checking, simple querying
     - Transactional databases are usually normalized so that duplication of data is kept to a minimum:
       - To preserve data integrity and simplify data update
     - Strong normalization makes the analysis of data more complex:
       - High number of tables, therefore high number of joins between tables (less efficient)
       - Long processing time
       - Development of complex queries

  6. Analytical approach vs transactional approach
     - No single data structure is good for BOTH managing transactions and supporting complex queries.
     - Therefore, two categories of databases must co-exist: transactional and analytical (E.F. Codd).
     - Example of co-existence: one source -> several datacubes
     - [Diagram: legacy transactional source database -> ETL -> read-only analytical datacubes holding restructured and aggregated data]
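
The co-existence of a transactional source and an analytical store can be sketched with Python's built-in `sqlite3` module. This is only a minimal illustration, not the architecture used in the talk: the table names (`city`, `sale`, `cube_sales`), the columns and the sample rows are all hypothetical. The normalized source is joined once during the ETL step, and later analytical queries read the pre-aggregated table without any join.

```python
import sqlite3


def build_cube(conn):
    """ETL sketch: join the normalized source once and store aggregated rows."""
    cur = conn.cursor()
    # Hypothetical normalized (transactional) source tables and sample data.
    cur.executescript("""
        CREATE TABLE city (city_id INTEGER PRIMARY KEY, name TEXT, province TEXT);
        CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, city_id INTEGER,
                           year INTEGER, amount REAL);
        INSERT INTO city VALUES (1, 'Montreal', 'Quebec'), (2, 'Levis', 'Quebec'),
                                (3, 'Ottawa', 'Ontario');
        INSERT INTO sale VALUES (1, 1, 2008, 5.0), (2, 2, 2008, 1.0),
                                (3, 3, 2008, 2.0), (4, 1, 2009, 3.0);
    """)
    # Transform and load: restructure and pre-aggregate into an analytical table.
    cur.execute("""
        CREATE TABLE cube_sales AS
        SELECT c.province AS province, s.year AS year, SUM(s.amount) AS total
        FROM sale s JOIN city c ON c.city_id = s.city_id
        GROUP BY c.province, s.year
    """)
    conn.commit()


def total_sales(conn, province, year):
    """Analytical query: a single-table lookup, no join needed."""
    row = conn.execute(
        "SELECT total FROM cube_sales WHERE province = ? AND year = ?",
        (province, year),
    ).fetchone()
    return row[0] if row else 0.0


conn = sqlite3.connect(":memory:")
build_cube(conn)
print(total_sales(conn, "Quebec", 2008))  # sum of the Montreal and Levis rows
```

In a real deployment the aggregated table would be refreshed by a scheduled ETL job and exposed read-only; here both live in one in-memory database for brevity.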

  7. BI Market
     - Business Intelligence has existed since the early 1990s and its market is larger than the GIS market. However, it didn’t address spatial data until recently.
     - BI and GIS evolved in different silos for many years.
     - (Eckerson, 2007)

  8. Today’s Level of Integration
     - Integrating GIS and BI is a recent field with a lot of potential
     - Spatially-enabling BI is becoming more common
     - [Figure legend: larger smiley = more has been achieved; larger lightning = more difficult challenge]

  9. SOLAP Epochs
     - 1996-2000: pioneering
       - early prototypes in universities
       - Laval U., Simon Fraser U., U. Minnesota
     - 2001-2004: early adopters
       - advanced prototypes in universities
       - first applications in industry
     - 2005-...: maturing
       - larger number of ad hoc applications
       - first commercial SOLAP technologies
     - 2010-...: wide adoption
       - about 40 commercial products

  10. NATURE OF SOLAP

  11. A Natural Evolution
      - Add capabilities to existing systems, don't aim at replacing them
      - Add value to existing data, no attempt to manage these data
      - [Figure: GIS, SOLAP, DBMS and OLAP positioned along two axes describing the nature of the data: spatial vs non-spatial, and details vs synthesis (decisional)]

  12. Analytical System Architectures (ex. standard data warehouse)
      - [Diagram: legacy OLTP systems -> DW -> datamarts -> OLAP, dashboards, reporting]

  13. Analytical System Architectures (ex. direct, without data warehouse)
      - [Diagram: legacy transactional databases -> datacubes]
      - Most projects we do have such an architecture: simpler, faster, less costly
      - Requires highly open SOLAP technology to connect to a variety of legacy systems (DBMS, BI, GIS, CAD, Big Data engines, etc.)

  14. Datacube Concepts
      - Dimension = axis of analysis organized hierarchically
      - Ex.: a Time dimension
        - Levels: All years, Years, Months, Days
        - Members: e.g. 2008, 2009, 2010 at the Years level
      - Hypercube = N dimensions
      - Datacube = casual name for hypercube
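
A hierarchical dimension can be sketched as a function that maps a detailed member to its ancestor at any coarser level. The example below is a minimal illustration for a Time dimension, assuming day-level source values and a sum as the aggregation; the names `LEVELS`, `member_at` and `roll_up`, and the sample values, are hypothetical.

```python
from datetime import date

# Levels of a hypothetical Time dimension, from most aggregated to most detailed.
LEVELS = ["all", "year", "month", "day"]


def member_at(day, level):
    """Return the member of `level` that contains the given day."""
    if level == "all":
        return "All years"
    if level == "year":
        return f"{day.year}"
    if level == "month":
        return f"{day.year}-{day.month:02d}"
    if level == "day":
        return day.isoformat()
    raise ValueError(f"unknown level: {level}")


def roll_up(values_by_day, level):
    """Aggregate (sum) day-level values to a coarser level of the hierarchy."""
    totals = {}
    for day, value in values_by_day.items():
        key = member_at(day, level)
        totals[key] = totals.get(key, 0) + value
    return totals


# Illustrative day-level measure values.
sales = {date(2008, 1, 5): 2, date(2008, 3, 9): 1, date(2009, 7, 1): 4}
print(roll_up(sales, "year"))
print(roll_up(sales, "all"))
```

Moving from `"day"` to `"year"` is a roll-up and the reverse is a drill-down; an OLAP engine would typically store these aggregates rather than recompute them.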

  15. Datacube Concepts: Structure of datacubes
      - [Figure: measures (ex. sales) linked to Dimension 1 (ex. balanced), Dimension 2 (ex. simpler), Dimension 3 (ex. unbalanced), Dimension 4 (ex. inconsistent), ..., Dimension N (ex. N:N paths)]
      - Members = filters (similar to independent variables)
      - Measures = result (similar to dependent variables)

  16. Datacube Concepts: Fine-grained analysis
      - [Figure: same datacube structure as slide 15]
      - Uses detailed members of the dimension hierarchies

  17. Datacube Concepts: Global analysis
      - [Figure: same datacube structure as slide 15]
      - Uses highly-aggregated members
      - As fast as fine-grained analysis (always <10 sec.)
      - Requires only a few mouse clicks (no query language)

  18. Datacube Concepts
      - Cube (hypercube) = all facts
      - Fact: each unique combination of fine-grained or aggregated members and of their resulting measures
        - Ex.: sold for 2M$ of shirts in Ottawa in 2010
        - Ex.: sold for 8M$ of pants in Ontario in 2010
        - Ex.: sold for 5M$ of jeans in Montreal in 2008
      - [Figure: a "sales" data cube whose cells hold sales amounts (1M to 5M), with three dimensions:]
        - A "sales territory" dimension: Provinces (Quebec, Ontario) and Cities (Montreal, Levis, Ottawa, Toronto)
        - A "time" dimension: 2008, 2009, 2010
        - A "product" dimension: Item level (jeans, shirts, blouses, trousers) and Category level (pants, tops)
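
The notion of a fact as a combination of members at any level can be sketched with a small in-memory cube. This is a minimal illustration only: two fine-grained facts come from the slide's examples, the other cell values are hypothetical and were chosen so that the Ontario/pants/2010 aggregate equals the 8M$ quoted above, and the grouping of items into categories (jeans and trousers as pants, shirts and blouses as tops) is an assumption. The names `FACTS` and `measure` are illustrative.

```python
# City -> province and item -> category hierarchies (item grouping is assumed).
PROVINCE = {"Montreal": "Quebec", "Levis": "Quebec",
            "Ottawa": "Ontario", "Toronto": "Ontario"}
CATEGORY = {"jeans": "pants", "trousers": "pants",
            "shirts": "tops", "blouses": "tops"}

# Fine-grained facts: (city, item, year) -> sales in M$.
FACTS = {
    ("Ottawa", "shirts", 2010): 2,     # from the slide
    ("Montreal", "jeans", 2008): 5,    # from the slide
    ("Ottawa", "jeans", 2010): 2,      # hypothetical
    ("Ottawa", "trousers", 2010): 2,   # hypothetical
    ("Toronto", "jeans", 2010): 3,     # hypothetical
    ("Toronto", "trousers", 2010): 1,  # hypothetical
}


def measure(territory, product, year):
    """Sum sales for one member (of any level, or "all") in each dimension."""
    total = 0
    for (city, item, fact_year), amount in FACTS.items():
        in_territory = territory in (city, PROVINCE[city], "all")
        in_product = product in (item, CATEGORY[item], "all")
        in_time = year in (fact_year, "all")
        if in_territory and in_product and in_time:
            total += amount
    return total


print(measure("Ottawa", "shirts", 2010))  # fine-grained fact
print(measure("Ontario", "pants", 2010))  # aggregated fact
```

This sketch computes aggregated facts on the fly by scanning the detailed ones; a datacube engine would normally pre-compute and store them, which is what keeps global analysis as fast as fine-grained analysis.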

  19. Datacube Concepts
      - Data structures (MOLAP, ROLAP, HOLAP):
        - Multidimensional (proprietary)
        - Relational implementation of datacubes
          - Client and server provide the multidimensional view
          - Star schemas, snowflake schemas, constellation schemas
        - Hybrid solutions
      - Query languages:
        - SQL = standard for transactional databases
          - Used in ROLAP
        - MDX = standard for datacubes
          - Used in MOLAP
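
A relational (ROLAP-style) datacube can be sketched as a star schema: one fact table holding the measure and one table per dimension. The example below uses Python's built-in `sqlite3` module; the schema (`fact_sales`, `dim_time`, `dim_product`), the helper `sales_by_category_and_year` and the sample rows are hypothetical and only illustrate the idea, not any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables (one row per most detailed member).
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY,
                              item TEXT, category TEXT);
    -- Fact table: a foreign key to each dimension plus the measure.
    CREATE TABLE fact_sales (time_id INTEGER REFERENCES dim_time,
                             product_id INTEGER REFERENCES dim_product,
                             amount REAL);
    INSERT INTO dim_time VALUES (1, 2008), (2, 2009);
    INSERT INTO dim_product VALUES (1, 'jeans', 'pants'), (2, 'shirts', 'tops'),
                                   (3, 'trousers', 'pants');
    INSERT INTO fact_sales VALUES (1, 1, 5.0), (1, 2, 2.0),
                                  (2, 3, 1.0), (2, 1, 3.0);
""")


def sales_by_category_and_year(connection):
    """Roll the measure up to the category and year levels with plain SQL."""
    return connection.execute("""
        SELECT p.category, t.year, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_time t ON t.time_id = f.time_id
        JOIN dim_product p ON p.product_id = f.product_id
        GROUP BY p.category, t.year
        ORDER BY p.category, t.year
    """).fetchall()


for row in sales_by_category_and_year(conn):
    print(row)
```

A snowflake schema would further normalize `dim_product` into separate item and category tables; a constellation schema would add more fact tables sharing the same dimensions.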

  20. Spatial Datacube Concepts: Spatial dimensions
      - Non-geometric spatial dimension
      - Geometric spatial dimension
      - Mixed spatial dimension
      - [Figure: example hierarchies for each type, with members such as Canada; CB, Québec, NB; Montréal, Québec, …]
      - N.B. more concepts exist

  21. Spatial Datacube Concepts: Spatial measures
      - Metric operators: Distance, Area, Perimeter, …
      - Topological operators: Adjacent, Within, Intersect, …
      - [Figure: a spatial measure derived from spatial dimension 1 and spatial dimension 2]
      - N.B. more concepts exist
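
The metric and topological operators listed above can be sketched in plain Python for simple planar geometries. This is a minimal illustration, assuming Cartesian coordinates and simple (non self-intersecting) polygons given as lists of `(x, y)` vertices; the function names are hypothetical, and a production system would rely on a spatial DBMS or geometry library instead.

```python
import math


def distance(p, q):
    """Metric operator: Euclidean distance between two points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def perimeter(polygon):
    """Metric operator: sum of the edge lengths of a closed polygon."""
    n = len(polygon)
    return sum(distance(polygon[i], polygon[(i + 1) % n]) for i in range(n))


def area(polygon):
    """Metric operator: polygon area using the shoelace formula."""
    edges = zip(polygon, polygon[1:] + polygon[:1])
    twice_area = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)
    return abs(twice_area) / 2


def within(point, polygon):
    """Topological operator: ray-casting test for a point inside a polygon.

    Points lying exactly on the boundary are not handled specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal ray starting at the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


region = [(0, 0), (4, 0), (4, 3), (0, 3)]  # illustrative rectangular region
print(area(region), perimeter(region))
print(within((1, 1), region), within((5, 1), region))
```

In a spatial datacube such values would be stored or derived as measures for combinations of spatial members, for example the area of the intersection of a region and a land-use zone.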

  22. Spatial Datacube and SOLAP
      - Spatial OLAP (On-Line Analytical Processing)
      - SOLAP is the most widely used tool to harness the power of spatial datacubes
      - It provides operators that don’t exist in GIS
      - SOLAP = generic software supporting rapid and easy navigation within spatial datacubes for the interactive exploration of spatio-temporal data having many levels of information granularity, themes, epochs and display modes, which are synchronized or not: maps, tables and diagrams

  23. Characteristics of SOLAP
      - Provides a high level of interactivity
        - response times < 10 seconds independently of
          - the level of data aggregation
          - today's vs historic or future data
          - measured vs simulated data
      - Ease-of-use and intuitiveness
        - requires no SQL-type query language
        - no need to know the underlying data structure
      - Supports intuitive, interactive and synchronized exploration of spatio-temporal data for different levels of granularity in maps, tables and charts that are synchronized at will

  24. The Power of SOLAP Lies in its Capability to Support Fast and Easy Interactive Exploration of Spatial Data
      - Select 1 year -> Select all years -> Select 4 years -> Multimap view: 7 clicks, 5 seconds

  25. The Power of SOLAP Lies in its Capability to Support Fast and Easy Interactive Exploration of Spatial Data
      - Select all regions -> Drill-down on one region -> Roll-up -> Show synchronized views: 6 clicks, 5 seconds

  26. The Power of SOLAP Lies in its Capability to Support Fast and Easy Interactive Exploration of Spatial Data
      - Change data -> Roll-up -> Roll-up -> Pivot …: 6 clicks, 5 seconds

  27. Functionalities: Exploration-oriented
      - Visualization and synchronized displays
        - An operation on one type of display (e.g. drill, pivot or filter) must automatically replicate on all other types of display (when enabled).
      - [Figure: a drill-down from "All data per country" (Canada) to "All data per province" (Provinces), applied to every display]
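
The synchronization rule above can be sketched as a small observer-style design, in which an operation applied to one display is replicated on every other display that has synchronization enabled. This is a hedged illustration only: the classes `View` and `Explorer`, the two-level hierarchy and the view names are hypothetical and do not describe any specific SOLAP product.

```python
class View:
    """A display (map, table or chart) showing data at one level."""

    def __init__(self, name, synchronized=True):
        self.name = name
        self.synchronized = synchronized  # the "when enabled" switch
        self.level = "country"


class Explorer:
    """Replicates navigation operations across synchronized views."""

    LEVELS = ["country", "province"]  # from most aggregated to most detailed

    def __init__(self, views):
        self.views = views

    def drill_down(self, source):
        """Drill down on `source`, then replicate on synchronized views."""
        index = self.LEVELS.index(source.level)
        finer = self.LEVELS[min(index + 1, len(self.LEVELS) - 1)]
        for view in self.views:
            # The view the user acted on always changes; others follow
            # only when their synchronization is enabled.
            if view is source or view.synchronized:
                view.level = finer


map_view = View("map")
table_view = View("table")
chart_view = View("chart", synchronized=False)
explorer = Explorer([map_view, table_view, chart_view])

explorer.drill_down(map_view)
for view in (map_view, table_view, chart_view):
    print(view.name, view.level)
```

Pivot and filter operations would follow the same pattern: apply the change to the source view, then loop over the views that opted in.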

  28. Functionalities: Exploration-oriented
      - Visualization and intelligent automatic mapping
      - Manual processing:
        - Involves specific knowledge by the user (database, semiology, mapping)
        - Is time-consuming
        - Questions the user must answer: What color, symbol, pattern? Which map display type? Which thematic classification? Advanced map?
      - Intelligent automatic mapping:
        - Supports user’s knowledge
        - Generates coherent maps by using predefined display rules in accordance with the user’s selection
        - Instantaneous display
        - No SQL involved

  29. EVOLUTION OF SOLAP and TODAY’S STATE OF THE ART
