XML and databases
Dennis Andersson, FOI
Andreas Borg, LiU/IDA/PELAB

XML
  <Foo>
    <Bars>
      <Bar Number="2" String="ABC" />
      <Bar Number="1" />
      <Bar String="XTC"> Baz </Bar>
    </Bars>
    <Bar> Booze </Bar>
  </Foo>

XML database
• XML in RDBs
  – Adding semi-structured features to strongly typed databases
  – Example: MS SQL Server 2005
  – Dense vs. sparse
• XML as DBs
  – An XML file IS a database
  – A set of XML files is also a database
  – Semi-structured

XML as a DB
• In order to effectively use an XML document as a database we need:
  – A method for persistence
    • (a filesystem?)
  – To allow placing constraints on data
    • (a data model)
  – A method for querying
    • (a query language)

XQuery background
• W3C defines several XML standards:
  – XML Schema: notation for defining new types
  – XSLT: notation for transforming XML documents from one representation to another
  – XPath: notation for selecting elements within an XML document
  – XQuery: a query language designed expressly for XML data sources

XQuery
• Design in progress
  – Only retrieval of elements and documents
  – Updating existing XML documents may follow
• XQuery 1.0
  – W3C recommendation 23 January 2007
• Two syntaxes:
  – Expressed in XML
  – Human-oriented version
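To make the "XML as a DB" idea concrete, the following is a minimal Python sketch that treats the sample Foo document as a tiny database and queries it with the standard library's xml.etree.ElementTree. It is an illustration only: the attribute values are quoted here so that a standard parser accepts the document, and the helper name bars_with_attribute is hypothetical, not something defined in the slides.

```python
import xml.etree.ElementTree as ET

# The sample document from the "XML" slide, with attribute values quoted
# so that a standard parser accepts it.
FOO_XML = """
<Foo>
  <Bars>
    <Bar Number="2" String="ABC" />
    <Bar Number="1" />
    <Bar String="XTC"> Baz </Bar>
  </Bars>
  <Bar> Booze </Bar>
</Foo>
"""


def bars_with_attribute(root, name):
    """Return every <Bar> directly under <Bars> that carries attribute `name`.

    Hypothetical helper for illustration; not part of the slides.
    """
    # ElementTree supports a small XPath subset; [@attr] tests for presence.
    return root.findall(f"./Bars/Bar[@{name}]")


root = ET.fromstring(FOO_XML)

# "Query" 1: the Number values of all numbered Bar elements.
numbers = sorted(int(bar.get("Number")) for bar in bars_with_attribute(root, "Number"))

# "Query" 2: every Bar element at any depth, in document order.
all_bars = root.findall(".//Bar")

print(numbers)
print(len(all_bars))
```

The parsed tree plays the role of the data store and the path strings play the role of the query language; persistence and constraints, the other two requirements listed above, are not addressed by this sketch.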
XQuery data model
• Sequence: an ordered collection of zero or more items
• Item: node or atomic value
  – Node
    • Element
    • Attribute
    • Text
    • Document
    • Comment
    • Processing instruction
    • Namespace nodes
  – Atomic value: e.g. strings, integers, decimals
• Typed value: a sequence of zero or more atomic values
• Document order: each node appears before its children

XQuery Expressions
• Basics (literals, variables, core function library)
• Path expressions (child, descendant, parent …)
• Predicates [example illegible in source]
• Element constructors (to construct new elements)
• Iteration and sorting (FLWR: for-let-where-return)
• Arithmetic (+, -, *, div)
• Operations on sequences
• Conditional expressions
• Quantified expressions (some, every)

Example data: items.xml
Example data: bids.xml
[Figures: tree diagrams of the two example documents; the labels name the node kinds shown: document, element, attribute, text node]

Path expressions
• The result of each step is a sequence of nodes
• The value is the node sequence resulting from the last step
• Q1: List the descriptions of all items offered for sale by Smith.
  – XML: [query illegible in source]
  – Human-oriented: [query illegible in source]

Predicates
• Q3: Find the status attribute of the item that is the parent of a given description
  [query illegible in source; its parts are annotated: variable, parent, attribute node]
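The behaviour of Q1 and Q3 can be approximated in Python with the XPath subset of xml.etree.ElementTree. This is a sketch under stated assumptions, not the XQuery text from the slides: the layout of items.xml is not reproduced here, so the element names (items, item, itemno, seller, description), the status attribute values, and the sample rows are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for items.xml; element and attribute names are assumed.
ITEMS_XML = """
<items>
  <item status="open">
    <itemno>1001</itemno>
    <seller>Smith</seller>
    <description>Red bicycle</description>
  </item>
  <item status="closed">
    <itemno>1002</itemno>
    <seller>Jones</seller>
    <description>Tennis racket</description>
  </item>
  <item status="open">
    <itemno>1003</itemno>
    <seller>Smith</seller>
    <description>Oak desk</description>
  </item>
</items>
"""

root = ET.fromstring(ITEMS_XML)

# Q1: a path with a predicate. Each step yields a sequence of nodes and the
# result is the sequence produced by the last step (description).
q1 = [d.text for d in root.findall("./item[seller='Smith']/description")]

# Q3: ElementTree nodes carry no parent pointer, so build a child-to-parent
# map once and use it for the "parent" step.
parent_of = {child: parent for parent in root.iter() for child in parent}


def status_of_parent(description):
    """Return the status attribute of the item that contains `description`."""
    return parent_of[description].get("status")


# The "given description" plays the role of the bound variable in Q3.
given = root.find("./item[itemno='1002']/description")
q3 = status_of_parent(given)

print(q1)
print(q3)
```

The explicit parent map is a workaround specific to ElementTree; a full XPath or XQuery engine provides the parent step directly.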
Iteration and sorting
• Q4: For each item that has more than ten bids, generate a popular-item element containing the item number, description, and bid count.
  [FLWR query illegible in source; its clauses are labelled F (for), L (let), W (where), R (return)]

A model for XML databases
• An XML document is well-formed
  – tags are properly nested
  – no need to conform to a particular schema
  – semi-structured data
  – relational and object-oriented modeling techniques become complex
  – efficient data models are needed

XDD, XML Declarative Description
• A simple yet expressive mechanism
  – explicit and implicit info
• A description in XDD consists of
  – XML elements
  – XML expressions (extended XML elements with variables)
  – XML clauses (constraints and relationships)

XML elements
• Ground XML expression, i.e. an XML element
• Example:
    <Element id="1" type="foo">
      <SubElement>Bar</SubElement>
      <SubElement>Baz</SubElement>
      <SubElement>Boz</SubElement>
    </Element>
• Example:
    <AnotherElement />

(Non-ground) XML expressions
• XML element with variable
  – Name
  – String
  – Attribute-value-pair
  – XML-expression
  – Intermediate-expression
  [the text following each variable kind is illegible in source]
• Example:
    <$N:element id=$S:id $P:att1>
      $E:subelements
    </$N:element>

Generalization
    <AirTrip from="Bangkok" to="London">
      <Path>
        <City>Bangkok</City>
        <City>Singapore</City>
        <City>London</City>
      </Path>
      <Price>650</Price>
    </AirTrip>
    a (ground XML expression)

    <AirTrip from=[…] to="London">
      […]
    </AirTrip>
    (generalization of a)

    <[…]>
      […]
      <City>Singapore</City>
      […]
    </[…]>
    (another generalization of a)
  [variable names in the two generalizations are illegible in source]
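The for-let-where-return shape of Q4 from the "Iteration and sorting" slide can be mirrored step by step in Python. This is a hedged sketch rather than the XQuery from the slide: the element names (items, item, itemno, description, bids, bid, bid_count), the sample data, and the helpers build_bids and popular_items are all assumptions made for illustration.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical stand-in for items.xml; element names are assumed.
ITEMS_XML = """<items>
  <item><itemno>1001</itemno><description>Red bicycle</description></item>
  <item><itemno>1002</itemno><description>Tennis racket</description></item>
</items>"""


def build_bids(counts):
    """Build a hypothetical <bids> tree with counts[itemno] <bid> children."""
    bids = ET.Element("bids")
    for itemno, n in counts.items():
        for _ in range(n):
            bid = ET.SubElement(bids, "bid")
            ET.SubElement(bid, "itemno").text = itemno
    return bids


def popular_items(items, bids, threshold=10):
    """Return a <popular-item> element for every item with more than
    `threshold` bids, following the for / let / where / return clauses."""
    result = []
    for item in items.findall("item"):                       # for
        itemno = item.findtext("itemno")
        matching = bids.findall(f"bid[itemno='{itemno}']")   # let
        if len(matching) > threshold:                        # where
            popular = ET.Element("popular-item")             # return
            popular.append(copy.deepcopy(item.find("itemno")))
            popular.append(copy.deepcopy(item.find("description")))
            ET.SubElement(popular, "bid_count").text = str(len(matching))
            result.append(popular)
    return result


items = ET.fromstring(ITEMS_XML)
bids = build_bids({"1001": 11, "1002": 2})
popular = popular_items(items, bids)

for element in popular:
    print(ET.tostring(element, encoding="unicode"))
```

The copies made with copy.deepcopy keep the constructed popular-item elements independent of the source tree, which is the same effect an element constructor has in a query language.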
Specialization
[Figure: successive specializations of a non-ground expression of the form <AirTrip from=[…] to="London"> […] </AirTrip>; the variable names and their bindings are illegible in source]

XDD database modeling
• XML document
  – Formalized as an XDD description containing ground XML unit clauses (facts, see definition 5)
• Extensional XML DB (XDB_E)
  – One or more XML documents formalized as above
• Intensional XML DB (XDB_I)
  – Comprised of XML non-unit clauses defining axioms, relationships or deducible knowledge
• Set of structural and integrity constraints (XDB_C)
  – XML non-unit clauses defining particular constraints
• XDD Description: XDB = XDB_E ∪ XDB_I ∪ XDB_C

Extensional XML DB (XDB_E)
    <Flight number="TG916" airline="TG">
      <Origin>Bangkok</Origin>
      <Destination>London</Destination>
      <Price>750</Price>
    </Flight>
    <Flight number="SQ61" airline="SQ">
      <Origin>Bangkok</Origin>
      <Destination>Singapore</Destination>
      <Price>150</Price>
    </Flight>
    <Flight number="SQ320" airline="SQ">
      <Origin>Singapore</Origin>
      <Destination>London</Destination>
      <Price>500</Price>
    </Flight>

Intensional XML DB (XDB_I)
• Example axiom
  – Minimum waiting time between two connecting flights is 1 hour
• Example deducible information
  – There is a flight from Singapore to Bangkok
  – There is a flight from Bangkok to London
  – Hence there is a 2-step flight from Singapore to London
• Can be expressed in XML
  – see definition 5 and figure 4

Constraints (XDB_C)
• Example constraints
  – A flight cannot have the same origin and destination
  – The price of a flight must be an integer
  – The price of a flight must be less than 1500
  – The flight number must be unique
  – Elements in the database must conform to a certain schema
• Can be expressed in XML
  – see definition 4 and figure 5

XDD querying
• An XML query can be formalized as an XML non-unit clause (query clause)
• The result of the query is a sequence of all possible specializations of the query clause in the database.
• An example query is presented in figure 7
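The split into extensional facts, intensional rules and constraints can be imitated in a few lines of Python. This is only a rough procedural analogue of the XDD idea, not the clause formalism of the definitions referred to above: the Flights wrapper element and the functions two_step_routes and violations are hypothetical, and the waiting-time axiom is left out because the sample flights carry no departure or arrival times.

```python
import xml.etree.ElementTree as ET

# Extensional part: the three Flight facts from the slide, wrapped in an
# assumed <Flights> root so they form one well-formed document.
FLIGHTS_XML = """
<Flights>
  <Flight number="TG916" airline="TG">
    <Origin>Bangkok</Origin><Destination>London</Destination><Price>750</Price>
  </Flight>
  <Flight number="SQ61" airline="SQ">
    <Origin>Bangkok</Origin><Destination>Singapore</Destination><Price>150</Price>
  </Flight>
  <Flight number="SQ320" airline="SQ">
    <Origin>Singapore</Origin><Destination>London</Destination><Price>500</Price>
  </Flight>
</Flights>
"""


def two_step_routes(db):
    """Intensional part: derive (origin, via, destination, total price)
    for every pair of flights where the first lands where the second departs."""
    flights = db.findall("Flight")
    routes = []
    for first in flights:
        for second in flights:
            origin = first.findtext("Origin")
            via = first.findtext("Destination")
            destination = second.findtext("Destination")
            if via == second.findtext("Origin") and origin != destination:
                total = int(first.findtext("Price")) + int(second.findtext("Price"))
                routes.append((origin, via, destination, total))
    return routes


def violations(db, max_price=1500):
    """Constraint part: report flights that break the example constraints."""
    problems = []
    seen = set()
    for flight in db.findall("Flight"):
        number = flight.get("number")
        if flight.findtext("Origin") == flight.findtext("Destination"):
            problems.append(f"{number}: origin equals destination")
        try:
            price = int(flight.findtext("Price", default=""))
        except ValueError:
            problems.append(f"{number}: price is not an integer")
        else:
            if price >= max_price:
                problems.append(f"{number}: price is not below {max_price}")
        if number in seen:
            problems.append(f"{number}: duplicate flight number")
        seen.add(number)
    return problems


db = ET.fromstring(FLIGHTS_XML)
routes = two_step_routes(db)
problems = violations(db)
print(routes)
print(problems)
```

With the three sample flights, the only derivable connection goes from Bangkok via Singapore to London, and none of the checked constraints is violated. Schema conformance, the last constraint in the list, is not checked here.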