
OrientX: an Integrated, Schema-Based Native XML Database System



  1. OrientX: an Integrated, Schema-Based Native XML Database System. Meng Xiaofeng, Wang Xiaofeng, Xie Min, Zhang Xin, Zhou Junfeng. School of Information, Renmin University of China. WISA2006

  2. Introduction
     • OrientX means: Original RUC IDKE Native XML Database (the name takes O from Original, R from RUC, I and E from IDKE, N and T from Native, and X from XML).
       – RUC: Renmin University of China
       – IDKE: Institute of Data and Knowledge Engineering
       – Native XML database: exposes a logical model for storing and retrieving XML documents, in contrast to a non-native XML database, which is built, for example, on a relational database.

  3. Outline
     • Architecture and Features
     • Storage and data management
     • Indexing Schema
     • Query processing
     • Conclusion and Future Work

  4. Architecture

  5. Features
     • Full support for XML Schema
     • Support for the XQuery 1.0 and XPath 2.0 Data Model
     • Various native storage techniques
     • Path index and value index
     • Multiple query processing strategies based on native storage


  7. Different storage granularities
     • Document:
       – Does not decompose the document; an index built on it is used to navigate the structure.
       – Query complexity and efficiency are limited by the power of the index.
     • Sub tree:
       – Decomposes the document into sub trees according to the storage space partition.
       – Preserves the structure within each sub tree.
       – Saves space.
     • Node:
       – Decomposes the document into a sequence of nodes, each node corresponding to a type (element, attribute, …).
       – May need too many links to persist the relationships between nodes.

  8. Storage Techniques in OrientX

                      Element-based   SubTree-based   Document-based
     Depth-first      DEB             DSB
     Breadth-first    BEB             BSB             DB
     Clustered        CEB             CSB

     • DEB: one node is a record; records follow a preorder traversal of the tree.
     • CEB: one element is a record, but all nodes with the same tag name are stored clustered together.
     • DSB: like DEB, but each record is a sub tree whose size is close to the physical page size.
     • CSB: akin to DSB, each record is a sub tree, but all sub trees with the same structure are stored clustered together.
     • Implemented techniques are marked in red on the original slide.
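The record orderings implied by the three element-based layouts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy document and the function names (`deb_order`, `beb_order`, `ceb_clusters`) are hypothetical and are not taken from OrientX.

```python
import xml.etree.ElementTree as ET
from collections import deque

# Toy document (an assumption for illustration, not the slide's example).
DOC = "<r><t/><a><l/><f/></a><a><l/><f/></a></r>"

def deb_order(root):
    """Depth-first element-based: one record per element, in preorder."""
    return [el.tag for el in root.iter()]

def beb_order(root):
    """Breadth-first element-based: one record per element, level by level."""
    order, queue = [], deque([root])
    while queue:
        el = queue.popleft()
        order.append(el.tag)
        queue.extend(el)  # iterating an Element yields its children
    return order

def ceb_clusters(root):
    """Clustered element-based: elements with the same tag name share a cluster.

    Returns the number of records in each cluster.
    """
    clusters = {}
    for el in root.iter():
        clusters.setdefault(el.tag, []).append(el)
    return {tag: len(els) for tag, els in clusters.items()}

root = ET.fromstring(DOC)
print(deb_order(root))
print(beb_order(root))
print(ceb_clusters(root))
```

The depth-first layout keeps each element next to its descendants, the breadth-first layout keeps siblings together, and the clustered layout groups records by tag name regardless of position.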

  9. Example: element-based storage (figure: a source document tree with nodes r, t1, a1, a2, l1, l2, f1, f2, shown alongside its DEB and CEB layouts)

  10. Example: subtree-based storage (figure: a tree with nodes r, t1, a1, a2, l1, f1, l2, f2, stored as DSB (depth-first sub-tree based) and as CSB (clustered sub-tree based) records). The figure marks a proxy node (virtual node); there is also a proxy DOC node.
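A depth-first sub-tree partition might look roughly like the sketch below. It rests on several assumptions that the slides do not confirm: record capacity is counted in nodes rather than bytes, a proxy entry is taken to be a placeholder that points to the record holding the spilled sub tree, and the names `partition` and `subtree_size` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Toy document (an assumption for illustration).
DOC = "<r><t/><a><l/><f/></a><a><l/><f/></a></r>"

def subtree_size(el):
    """Number of elements in the sub tree rooted at el."""
    return sum(1 for _ in el.iter())

def partition(root, capacity):
    """Split a tree into records holding at most `capacity` real nodes each.

    Each record is a list of (depth, label) entries in preorder. A child
    sub tree that does not fit is moved to a new record and replaced by a
    proxy entry (assumed role of the proxy node).
    """
    records = []  # records[i]: entries of record i
    used = []     # used[i]: count of real (non-proxy) nodes in record i

    def new_record(el):
        rid = len(records)
        records.append([])
        used.append(0)
        fill(el, rid, 0)
        return rid

    def fill(el, rid, depth):
        records[rid].append((depth, el.tag))
        used[rid] += 1
        for child in el:
            if used[rid] + subtree_size(child) <= capacity:
                fill(child, rid, depth + 1)
            else:
                child_rid = new_record(child)
                records[rid].append((depth + 1, f"proxy->{child_rid}"))

    new_record(root)
    return records

records = partition(ET.fromstring(DOC), capacity=5)
for rid, entries in enumerate(records):
    print(rid, entries)
```

With a capacity of five nodes, the root record holds `r`, `t`, and the first `a` sub tree, while the second `a` sub tree is moved to its own record and referenced through a proxy entry.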


  12. SUPEX: Index Architecture (figure label: Path index)

  13. Features of SUPEX
      • Constructed from the DTD or XML Schema
      • Integrates the path index with value indexes
      • Supports twig queries efficiently
      • Supports label path expressions (bib//author)
      • Supports the evaluation of value-based condition predicates (//author[firstname = “jone”])
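How a path index and a value index can cooperate on the two example expressions is sketched below. This is not SUPEX itself: SUPEX is built from the DTD or schema, whereas this sketch, as a simplifying assumption, derives both indexes from a toy document. All names (`build_indexes`, `descendants_by_path`, `authors_with_firstname`) are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Toy document (an assumption for illustration).
DOC = """<bib>
  <book><author><firstname>jone</firstname></author></book>
  <book><author><firstname>mary</firstname></author></book>
  <article><author><firstname>jone</firstname></author></article>
</bib>"""

def build_indexes(root):
    path_index = defaultdict(list)   # label path (tuple of tags) -> elements
    value_index = defaultdict(list)  # (tag, text value) -> elements
    parent = {}                      # child element -> parent element

    def visit(el, path):
        path = path + (el.tag,)
        path_index[path].append(el)
        text = (el.text or "").strip()
        if text:
            value_index[(el.tag, text)].append(el)
        for child in el:
            parent[child] = el
            visit(child, path)

    visit(root, ())
    return path_index, value_index, parent

def descendants_by_path(path_index, first, last):
    """Evaluate /first//last by scanning label paths instead of the document."""
    return [el
            for path, els in path_index.items()
            if len(path) > 1 and path[0] == first and path[-1] == last
            for el in els]

def authors_with_firstname(value_index, parent, name):
    """Evaluate //author[firstname = name] by starting from the value index."""
    hits = value_index.get(("firstname", name), [])
    return [parent[fn] for fn in hits
            if fn in parent and parent[fn].tag == "author"]

root = ET.fromstring(DOC)
path_index, value_index, parent = build_indexes(root)
print(len(descendants_by_path(path_index, "bib", "author")))
print(len(authors_with_firstname(value_index, parent, "jone")))
```

The label path query touches only the distinct paths in the index, and the predicate query starts from the matching values and walks up to the parent, so neither needs a full document scan.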


  15. Query processing
      • Navigation strategy
        – Supports XPath 2.0 and XQuery 1.0.
        – Combines consecutive steps of one XPath expression into a single path.
        – Rewrites the syntax tree into a reduced execution plan.
        – Introduces pipelined operators into XQuery processing.
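The idea of pipelined operators can be illustrated with Python generators, where each operator pulls one node at a time from its input instead of materializing a full intermediate result. The `step` function, the `trace` list, and the toy document are assumptions made for this sketch and do not describe OrientX's operator interface.

```python
import xml.etree.ElementTree as ET

# Toy document (an assumption for illustration).
DOC = ("<bib>"
       "<book><author><firstname>jone</firstname></author></book>"
       "<book><author><firstname>mary</firstname></author></book>"
       "</bib>")

trace = []  # records which operator produced a node, in order

def step(nodes, tag):
    """Child-axis step as a pipelined operator: yields matches lazily."""
    for node in nodes:
        for child in node:
            if child.tag == tag:
                trace.append(tag)
                yield child

root = ET.fromstring(DOC)

# Three chained operators; nothing is evaluated until a result is requested.
pipeline = step(step(step([root], "book"), "author"), "firstname")

first = next(pipeline)
print(first.text)
print(trace)
```

Requesting only the first result drives a single node through each operator, so the second `book` element is never visited unless the consumer asks for more.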

  16. Operators in Navigation
      Currently, Navigation contains 13 operators:
      1. Step
      2. CondTreeNode
      3. Path
      4. ForVarBind
      5. LetVarBind
      6. FLWR
      7. EleConstructor
      8. AttrConstructor
      9. BuiltInFun
      10. IfThenElse
      11. Quanlify
      12. SetOpt
      13. SortBy

  17. General steps to process XQuery: XQuery -> Query Parser and Translator -> initial query plan -> Optimizer -> optimized query plan -> Evaluator Engine
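The flow from query text to result can be mimicked with a deliberately tiny sketch. It handles only slash-separated child steps, and its one optimization merges consecutive steps into a single path, in the spirit of the navigation strategy. The `Step` and `Path` classes here are hypothetical stand-ins that borrow operator names from the slides, not OrientX's implementation.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Toy document (an assumption for illustration).
DOC = ("<bib>"
       "<book><author><firstname>jone</firstname></author></book>"
       "<book><author><firstname>mary</firstname></author></book>"
       "</bib>")

@dataclass
class Step:
    tag: str

@dataclass
class Path:
    tags: tuple

def parse(query):
    """Parser and translator: one Step per name in the initial plan."""
    return [Step(tag) for tag in query.split("/")]

def optimize(plan):
    """Optimizer: merge runs of consecutive Steps into one Path operator."""
    optimized = []
    for op in plan:
        if isinstance(op, Step) and optimized and isinstance(optimized[-1], Path):
            optimized[-1] = Path(optimized[-1].tags + (op.tag,))
        elif isinstance(op, Step):
            optimized.append(Path((op.tag,)))
        else:
            optimized.append(op)
    return optimized

def evaluate(plan, root):
    """Evaluator: run each Path operator against the current node set."""
    nodes = [root]
    for op in plan:  # the optimized plan contains only Path operators here
        nodes = [m for n in nodes for m in n.findall("/".join(op.tags))]
    return nodes

query = "book/author/firstname"
initial_plan = parse(query)
optimized_plan = optimize(initial_plan)
result = evaluate(optimized_plan, ET.fromstring(DOC))
print(initial_plan)
print(optimized_plan)
print([n.text for n in result])
```

The initial plan has three operators and the optimized plan has one, which is the kind of reduction the optimizer stage is meant to achieve before the evaluator runs.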

  18. The query plan


  20. Conclusion and Future Work
      • Conclusion:
        – OrientX is an integrated, schema-based native XML database system.
        – It implements storage and querying of XML data.
      • Future work:
        – XQuery optimization.
        – XML update and other XQuery processing engines.

  21. Thanks. Q&A ☺ Visit our website, http://idke.ruc.edu.cn, for more information about OrientX.
