Database Management Course Content Systems • Introduction • Database Design Theory • Query Processing and Optimisation Winter 2002 • Concurrency Control • Data Base Recovery and Security CMPUT 391: Spatial Data Management • Object-Oriented Databases • Inverted Index for IR Dr. Jörg Sander & Dr. Osmar R. Zaïane • Spatial Data Management • XML • Data Warehousing • Data Mining University of Alberta Chapter 26 • Parallel and Distributed Databases of Textbook 1 2 2 Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 CMPUT 391 – Database Management Systems University of Alberta Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 CMPUT 391 – Database Management Systems University of Alberta Spatial Data Management Objectives of Lecture ? Spatial Data Management Spatial Data Management • This lecture will give you a basic • Modeling Spatial Data understanding of spatial data management • Spatial Queries • Space-Filling Curves + B-Trees – What is special about spatial data • R-trees – What are spatial queries – How do typical spatial index structures work Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 3 Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 4 CMPUT 391 – Database Management Systems University of Alberta CMPUT 391 – Database Management Systems University of Alberta
Relational Representation of Spatial Data Relational Representation of Spatial Data • Example : Representation of geometric objects (here: parcels/fields of land) • For (spatial) queries involving parcels it is necessary to reconstruct in normalized relations the spatial information from the different tables Parcels Borders Points – E.g.: if we want to determine if a given point P is inside parcel F 2 , we FNr BNr BNr PNr 1 PNr 2 PNr X-Coord Y-Coord have to find all corner-points of parcel F 2 first F 5 F 1 B 1 F 2 B 1 P 1 P 2 Y P1 P 1 X P1 F 1 B 2 B 2 P 2 P 3 P 2 X P2 Y P2 SELECT Points.PNr, X-Coord, Y-Coord F 1 B 3 B 3 P 3 P 4 Y P3 P 3 X P3 F 1 B 4 B 4 P 4 P 1 FROM Parcels, Border, Points P 4 X P4 Y P4 F 4 B 2 F 1 B 5 P 2 P 5 Y P5 P 5 X P5 F 4 B 5 WHERE FNr = ‘F 2 ’ AND F 4 B 6 P 5 P 6 P 6 X P6 Y P6 F 4 B 6 B 7 P 6 P 7 Y P7 P 7 X P7 F 7 F 4 B 7 Parcel.BNr = Borders.BNr AND B 8 P 7 P 8 P 8 X P8 Y P8 F 4 B 8 B 9 P 8 P 3 P 9 X P9 Y P9 F 4 B 9 ( Borders.PNr 1 = Points.PNr OR B 10 P 6 P 9 P 10 X P10 Y P10 F 7 B 7 B 11 P 9 P 10 Borders.PNr 2 = Points.PNr) F 7 B 10 F 3 F 6 B 12 P 10 P 7 F 7 B 11 F 7 B 12 • Even this simple query requires expensive joins of three tables … … • Querying the geometry (e.g., P in F 2 ?) is not directly supported. Redundancy free representation requires distribution of the information over 3 tables: Parcels , Borders , Points 5 6 Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 CMPUT 391 – Database Management Systems University of Alberta Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 CMPUT 391 – Database Management Systems University of Alberta Extension of the Relational Model Extension of the Relational Model to Support Spatial Data to Support Spatial Data – Example • Integration of spatial data types and operations into the core of Relation: ForestZones (Zone: Polygon , ForestOfficial: String , Area: Cardinal ) a DBMS ( � object-oriented and object-relational databases) ForestZones – Data types such as Point , Line , Polygon ForestOfficial Area (m 2 ) Zone R 4 R 2 – Operations such as ObjectIntersect , RangeQuery , etc. R 1 Stevens 3900 R 6 R 2 Behrens 4250 R 3 Lee 6700 R 3 R 4 Goebel 5400 • Advantages R 1 R 5 Jones 1900 R 5 R 6 Kent 4600 – Natural extension of the relational model and query languages – Facilitates design and querying of spatial databases • The province decides that a reforestation is necessary in an area described – Spatial data types and operations can be supported by spatial index by a polygon S. Find all forest officials affected by this decsion. structures and efficient algorithms, implemented in the core of a DBMS SELECT ForestOfficial FROM ForestZones • All major database vendors today implement support for spatial data and WHERE ObjectIntersects (S, Zone) operations in their database systems via object-relational extensions Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 7 Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 8 CMPUT 391 – Database Management Systems University of Alberta CMPUT 391 – Database Management Systems University of Alberta
Data Types for Spatial Objects Spatial Data Management • Spatial objects are described by – Spatial Extent • Modeling Spatial Data • location and/or boundary with respect to a reference point in a coordinate system, which is at least 2-dimensional. • Spatial Queries • Basic object types: Point , Lines , Polygon – Other Non-Spatial Attributes • Space-Filling Curves + B-Trees • Thematic attributes such as height, area, name, land-use, etc. • R-trees 2-dim. points 2-dim. lines 2-dim. polygons Y Forest Crop Water X 9 10 Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 CMPUT 391 – Database Management Systems University of Alberta Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 CMPUT 391 – Database Management Systems University of Alberta Basic Spatial Queries Spatial Query Processing • DBMS has to support two types of operations • Containment Query : Given a spatial P object R, find all objects that completely – Operations to retrieve certain subsets of spatial object from the database R contain R. If R is a Point: Point Query • “Spatial Queries/Selections”, e.g., window query, point query, etc. • Region Query : Given a region R Containment Query Point Query – Operations that perform basic geometric computations and tests (polygon or circle), find all spatial R • E.g., point in polygon test, intersection of two polygons etc. objects that intersect with R. If R is a rectangle: Window Query • Spatial selections, e.g. in geographic information systems, are R often supported by an interactive graphical user interface • Enclosure Query : Given a polygon region R, find all objects that are Region Query Window Query Point Query Window Query completely contained in R R • K-Nearest Neighbor Query : Given an P object P, find the k objects that are closest to P (typically for points) P W Enclosure Query 2-nn Query Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 11 Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 12 CMPUT 391 – Database Management Systems University of Alberta CMPUT 391 – Database Management Systems University of Alberta
Basic Spatial Queries – Spatial Join Index Support for Spatial Queries • Conventional index structures such as B-trees are not designed • Given two sets of spatial objects (typically minimum bounding rectangles) to support spatial queries – S 1 = {R 1 , R 2 , …, R m } and S 2 = {R’ 1 , R’ 2 , …, R’ n } – Group objects only along one dimension • Spatial Join: Compute all pairs of objects (R, R’) such that – Do not preserve spatial proximity – R ∈ S 1 , R’ ∈ S 2 , – and R intersects R’ (R ∩ R’ ≠ ∅ ) • E.g. nearest neighbor query: Nearest neighbor of Q is typically not the nearest neighbor in any single – Spatial predicates other that intersection are also possible, e.g. all pairs of dimension objects that are within a certain distance from each other Y NN(Q) A5 B1 Answer Set A4 C Q (A5, B1) {A1, …, A6} {B1, …, B3} D A3 (A4, B1) A A1 (A1, B2) B A2 B2 (A6, B2) X (A2, B3) A and B closer in the X dimension; A6 B3 C and D closer in the Y dimension. Spatial-Join 13 14 Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 CMPUT 391 – Database Management Systems University of Alberta Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 CMPUT 391 – Database Management Systems University of Alberta Index Support for Spatial Queries Query Processing Using Approximations • Spatial index structures try to preserve spatial proximity Two-Step Procedure – Group objects that are close to each other on the same data page 1. Filter Step: – Use the index to find all approximations that satisfy the query – Problem: the number of bytes to store extended spatial objects (lines, polygons) varies – Some objects already satisfy the query based on the approximation, others have to be checked in the refinement step � Candidate Set – Solution: 2. Refinement Step: • Store Approximations of spatial objects in the index structure, – Load the exact object representations for candidates left after the filter typically axis-parallel minimum bounding rectangles (MBR) step and test whether they satisfies the query • Exact object representation (ER) stored separately; pointers to ER in the index false Why? hits Query Window query-window a b MBR a und b sind sicher Antworten Refinement Spatial - a and b are certain answers f, d und g sind sicher keine Antworten final candidates - f, d, and g are certainly (exact Index e c und e sind Kandidaten results e Filter c Query not answers evaluation) ER c ist ein Fehltreffer ( false hit , d. h. ein - c and e are candidates d Kandidat, der keine Antwort ist) ... (MBR, , ...)... ( MBR , , ...) f g - c is a false hit Not an answer ( ER ) Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 15 Dr. Jörg Sander, Dr. Osmar R. Zaïane 2002 16 CMPUT 391 – Database Management Systems University of Alberta CMPUT 391 – Database Management Systems University of Alberta
Recommend
More recommend