oragist how to make user defined indexing become usable
play

OraGIST How to Make User-Defined Indexing Become Usable and Useful - PowerPoint PPT Presentation

1 OraGIST How to Make User-Defined Indexing Become Usable and Useful Carsten Kleiner, Udo Lipeck Universit at Hannover BTW 2003, 27.02.2003 2 The Situation (+) Object-relational DBMSs are extensible by: user-defined datatypes


  1. 1 OraGIST – How to Make User-Defined Indexing Become Usable and Useful Carsten Kleiner, Udo Lipeck Universit¨ at Hannover BTW 2003, 27.02.2003

  2. 2 The Situation (+) Object-relational DBMSs are extensible by: • user-defined datatypes (UDT) • together with corresponding query operators • user-defined indexing • together with corresponding user-defined optimizer tuning ⇒ ”data blades”, ”cartridges”, ”extenders”, ...

  3. 3 The Situation ( − ) But user-defined indexing • needs expensive implementation • needs experimental selection for new query scenarios • is restricted: a user-defined index supports only single operators, no combinations • assumes that an operator is supported by at most one index

  4. 4 Our Proposal • Use combined datatypes and operators to allow index support for attribute combinations • Provide medium-dimensional index structures • Apply a generic indexing framework ⋆ based on generalized search trees (GiST) ⋆ easily specializable into indexes for particular UDTs and operators ⋆ coupled with DB storage and index definition ⇒ prototypical tool OraGiST (GiST for Oracle)

  5. 5 Generalized Search Trees (GiST) — Class Hierarchy GiST #insert(e:GiSTEntry,level:int) #chooseSubtree(e:GiSTEntry,level:int) #split(n:GiSTNode,e:GiSTEntry) #adjustKeys(n:GiSTNode) #delete(e:GiSTEntry) OrderedGiST UnorderedGiST #findMin(q:predicate) #search(q:predicate) #next(q:predicate,e:GiSTEntry) BTreeGiST RTreeGiST #consistent(e:GiSTEntry,q:predicate) #consistent(e:GiSTEntry,q:predicate) #union(l:ListOfGiSTEntry) #union(l:ListOfGiSTEntry) #penalty(e1:GiSTEntry,e2:GiSTEntry) #penalty(e1:GiSTEntry,e2:GiSTEntry) #pickSplit(l:ListOfGiSTEntry) #pickSplit(l:ListOfGiSTEntry) #compare(e1:GiSTEntry,e2:GiSTEntry) RStarTreeGiST SSTreeGiST #penalty(e1:GiSTEntry,e2:GiSTEntry) #penalty(e1:GiSTEntry,e2:GiSTEntry) #pickSplit(l:ListOfGiSTEntry) RSSTreeGiST #penalty(e1:GiSTEntry,e2:GiSTEntry)

  6. 6 Index Usage CREATE INDEXTYPE polygonRTree FOR overlaps(polygon,polygon) USING overlaps fct ; CREATE TABLE county (id NUMBER, population NUMBER, ... shape polygon, ...); CREATE INDEX idx geoCounty ON county(shape) INDEXTYPE IS polygonRTree PARAMETERS(...); SELECT * FROM county WHERE overlaps(shape,polygon( rectangle(0,0,20,10) ))) = true;

  7. 7 Extensible Indexing in ORDBS • Sequence of index operations over index lifetime (e.g.): for each tuple ODCIIndexCreate ODCIIndexInsert ODCIIndexDrop • Sequence of index operations in query execution: until all tuples fetched ODCIIndexStart ODCIIndexFetch ODCIIndexClose set up structures read and modify delete structures Scan Context

  8. 8 OraGiST — Architecture and Functionality libgist library Oracle ORDBS OraGiST GiST DBSExtensibleIndexing initiates calls 1 1 OraGiST Library 1 1 GiSTIndexFile DBSIndexTable is stored in n 1 GiSTExtension UserDefinedIndexStructure 1 1 OraGiSTExtension OraGiST Toolbox +getExtension() +getQuery() +needVerify() n n DBSUserDefinedObject GiSTEntry TypeMap +approximate()

  9. 9 OraGiST — Architecture and Functionality libgist library Oracle ORDBS OraGiST GiST OracleExtensibleIndexing 1 1 1 1 GiSTIndexFile OracleIndexTable getExtension RTree UserDefinedRTree getQuery 1 1 n n approximate RTreeEntry GeometryObject

  10. 10 Case Study — Spatial/Thematic Analysis • Typical query: Find all counties overlapping a given window where the median rent is below ... and the population is higher than ... • Definition of combined datatypes and operators: CREATE TYPE integerGeometry AS OBJECT (shape OGCGeometry, theme INTEGER); CREATE TYPE twoIntegerGeometry AS OBJECT (shape OGCGeometry, theme1,theme2 INTEGER); CREATE OPERATOR [two]BetweenOverlaps ... ; CREATE TABLE county OF [two]IntegerGeometry;

  11. 11 Performance Evaluation on 2D Spatial Data 60 Oracle R-Tree Oracle Quadtree User-Def-R*-Tree 50 40 response time 30 20 10 0 0 5 10 15 20 25 30 35 40 selectivity in %

  12. 12 ... on 2D spatial data + one thematic dimension 80 Oracle R-Tree User-Def-3D-RSS-Tree User-Def-3D-R*-Tree 70 Oracle Quadtree 60 50 response time 40 30 20 10 0 0 2 4 6 8 10 12 14 selectivity in %

  13. 13 ... on 2D spatial data + two thematic dimensions User-Def-4D-RSS-Tree User-Def-4D-R*-Tree 80 User-Def-3D-RSS-Tree User-Def-2D-RSS-Tree 70 60 response time 50 40 30 20 10 0 0 5 10 15 20 selectivity in %

  14. 14 Conclusions • R ∗ - and RSS-trees can be recommended as (medium- dimensional) index types for combined query scenarios. • Combined scenarios occur often in spatio-temporal- thematic DBS; think of, e.g., ” validtimeInteger ” ! • For new UDTs/operators, for combined UDTs/operators, and for experimental index selection, a framework for adaptable indexing in ORDBS is required.

  15. 15 Conclusions (cont.) • OraGiST is such an indexing framework: ⋆ extensible library + toolbox ⋆ coupling the GiST-family with an ORDBMS ⋆ for adoption of existing index types ⋆ for fast development of new index types by object-oriented specialization

  16. 16 Future Questions • How to improve behaviour on ’complicated’ objects ? • How to include user-defined cost/selectivity estimation ? • How to generate and hide the combined operators ?

Recommend


More recommend