
COMP6037 Semi-structured Data and the Web: Uniqueness in Trees, Repercussion on interesting problems, and finite Graphs



  1. COMP6037 Semi-structured Data and the Web
     Uniqueness in Trees, Repercussion on interesting problems, and finite Graphs
     5.2
     Uli Sattler
     University of Manchester

     Clarification: a grammar, its language, and their types
     • We know
       – when a grammar is local: i.e., if none of its non-terminal symbols compete…
       – given a grammar G, what the language (set of trees) L(G) of G is:
         L(G) := { t | t is a tree accepted by G }
       – what it means for a language (set of trees) L to be local: i.e., if we can find a local grammar G such that L = L(G)
     • hence to find out whether L is local (and perhaps L is given through a grammar G, i.e., L = L(G)) you need to determine whether we can find/construct a local grammar F such that L = L(F)
     • ...the above works analogously if “local” is replaced with “single-type”

     Clarification: a grammar, its language, and their types
     • Remember: we saw
     • G is not single-type
         G = (N, Σ, S, P) with
         N = {Book, Author, Editor, Affilia, Paper, F, L}
         Σ = {B, P, Name, F, L, A}
         S = {Book, Paper}
         P = { Book → B Editor|Author, Paper → P Author,
               Editor → Name F,L, Author → Name L,Affilia,
               F → F ε, L → L ε, Affilia → A ε }
     • G’ is single-type: Author and BA still compete, but don’t occur together in a rule!
         G’ = (N’, Σ’, S’, P’) with
         N’ = {Book, Author, Editor, Affilia, Paper, F, L}
         Σ’ = {B, P, Name, F, L, A}
         S’ = {Book, Paper}
         P’ = { Book → B BA, Paper → P Author,
                BA → Name (F,L)|(L,Affilia), Author → Name L,Affilia,
                F → F ε, L → L ε, Affilia → A ε }
     • L(G’) = L(G)
     • hence L(G) is single-type!

     Things done so far
     • [structures] semi-structured data, XML, datamodels, trees
     • [description mechanisms] schema languages
       – of different styles, strengths, purposes
       – validation, validate-as, PSVIs
       – a useful abstraction: tree grammars
     • [‘difficult’ extensibility mechanism] namespaces, schemas
     • [interaction mechanisms] query languages, parsers,
       – possibly schema aware
       – namespace aware
     • error handling
     • [modelling] attributes vs elements, deep vs flat, ...
     • ...today:
       – we go back to [structures]: beyond trees, and
       – other ‘tasks’ around schemas
       – more modelling, human factors
       – exam preview
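The competition test behind "local" and "single-type" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary encoding, the names `G`, `G_PRIME`, `START`, `compete`, `is_local` and `is_single_type` are all hypothetical, each non-terminal is assumed to have exactly one rule, and a content model is reduced to the set of non-terminals it mentions. Two distinct non-terminals are taken to compete when their rules produce the same terminal. `G_PRIME` includes the BA rule from P’, although the slide's N’ does not list BA. Note that the code checks a grammar, not a language: as the slide points out, L(G) is single-type even though G itself is not.

```python
from itertools import combinations

# Hypothetical encoding: non-terminal -> (terminal, non-terminals in its content model).
G = {
    "Book":    ("B",    {"Editor", "Author"}),
    "Paper":   ("P",    {"Author"}),
    "Editor":  ("Name", {"F", "L"}),
    "Author":  ("Name", {"L", "Affilia"}),
    "F":       ("F",    set()),
    "L":       ("L",    set()),
    "Affilia": ("A",    set()),
}

G_PRIME = {
    "Book":    ("B",    {"BA"}),
    "Paper":   ("P",    {"Author"}),
    "BA":      ("Name", {"F", "L", "Affilia"}),
    "Author":  ("Name", {"L", "Affilia"}),
    "F":       ("F",    set()),
    "L":       ("L",    set()),
    "Affilia": ("A",    set()),
}

START = {"Book", "Paper"}


def compete(rules, a, b):
    """Two distinct non-terminals compete if their rules share a terminal."""
    return a != b and rules[a][0] == rules[b][0]


def is_local(rules):
    """Local: no two non-terminals anywhere in the grammar compete."""
    return not any(compete(rules, a, b) for a, b in combinations(rules, 2))


def is_single_type(rules, start):
    """Single-type: no competition among start symbols, nor within any
    single content model."""
    groups = [start] + [content for _, content in rules.values()]
    return not any(
        compete(rules, a, b)
        for group in groups
        for a, b in combinations(sorted(group), 2)
    )


if __name__ == "__main__":
    print("G local:", is_local(G), "single-type:", is_single_type(G, START))
    print("G' local:", is_local(G_PRIME), "single-type:", is_single_type(G_PRIME, START))
```

In this encoding, Editor and Author compete inside the Book rule of G, whereas in G’ the competing pair BA and Author never share a content model.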

  2. So far, there were trees everywhere
     • trees in semi-structured data
       – apart from when object identifiers are used
     • parse trees from XML documents
     • DOM trees
     • infosets
     • XPath datamodel tree
     • trees that tree grammars run on
       – and that Relax NG and Schematron work on
     • ...but is everything really a tree?
       – e.g., you, your friends and family, and the relationships between them?

     Trees and families: family trees!
     • Assume you want to work with/display/search/combine/... family trees
       – you are interested in genealogy
       – you work with a solicitor who handles inheritance cases
       – you study genetics
       – ....
     • easy:
       – information is patchy & varied, thus use XML
       – let’s build a DTD for this

         <?xml version="1.0" encoding="UTF-8"?>
         <!ELEMENT family-tree (person | family)*>
         <!ELEMENT person (name, birth?, death?, father?, mother?, note?)>
         <!ELEMENT family (husband?, wife?, child*, marriage*, divorce*, note*)>
         ...
         <!ELEMENT father (name, birth?, death?, father?, mother?, note?)>
         <!ELEMENT mother (name, birth?, death?, father?, mother?, note?)>
         ...
         <!ELEMENT name (firstname?, middle?, lastname)>
         <!ELEMENT middle (#PCDATA)>
         <!ELEMENT firstname (#PCDATA)>
         <!ELEMENT given (#PCDATA)>
         ....

     example taken from & modified: http://penguin.dcs.bbk.ac.uk/academic/xml/family/index.php

     Trees and families: family trees!
     • in order to ensure
       – integrity: a person’s DoB should be the same regardless of where they occur in our tree
       – maintainability: when we change a person’s data (e.g., add DoD), we should only have to do it once
     → we can make use of IDs & IDREFs

       before:
         <!ELEMENT father (name, birth?, death?, father?, mother?, note?)>
         <!ELEMENT mother (name, birth?, death?, father?, mother?, note?)>

         <?xml version="1.0" encoding="UTF-8"?>
         <!ENTITY % reference "person IDREF #REQUIRED">
         <!ELEMENT family-tree (person | family)*>
         <!ELEMENT person (name, birth?, death?, father?, mother?, note?)>
         <!ELEMENT family (husband?, wife?, child*, marriage*, divorce*, note*)>
         <!ELEMENT name (firstname?, middle?, lastname)>
         <!ELEMENT middle (#PCDATA)>
         <!ELEMENT firstname (#PCDATA)>
         <!ELEMENT given (#PCDATA)>
         <!ATTLIST person id  ID      #REQUIRED
                          sex (m | f) #IMPLIED>
         <!ELEMENT father EMPTY>
         <!ATTLIST father %reference;>
         <!ELEMENT mother EMPTY>
         <!ATTLIST mother %reference;>
         <!ELEMENT wife EMPTY>
         <!ATTLIST wife %reference;>
         ....

     Trees and families: family trees!
     • things work nicely:
     • e.g., to retrieve all pairs of persons and their fathers, we can use a simple XQuery:

         let $d := doc("family.xml")
         for $p in $d//person
         return
           <childAndParents>
             <child>{ $p/name }</child>
             { if ($p/father/@father != "")
               then <father>{ id($p/father/@father)/name }</father>
               else <fatherUnknown/> }
             { if ($p/mother/@mother != "")
               then <mother>{ id($p/mother/@mother)/name }</mother>
               else <motherUnknown/> }
           </childAndParents>

         <?xml version="1.0" encoding="UTF-8"?>
         <!DOCTYPE family-tree SYSTEM "family.dtd">
         <family-tree>
           <person id="p5" sex="m">
             <name>
               <firstname>Alfred Ernest</firstname>
               <lastname>Farmer</lastname>
             </name>
             <death>
               <place>Finsbury Park, London</place>
               <date>8 January, 1964</date>
             </death>
           </person>
           <person id="p6" sex="m">
             <name>
               <firstname>Ronald Alfred</firstname>
               <lastname>Farmer</lastname>
             </name>
             <birth>
               <place>London</place>
               <date>27 April, 1922</date>
             </birth>
             <death>
               <place>Hill House Nursing Home, Kenley, Surrey</place>
               <date>23 November, 2003</date>
             </death>
             <father father="p5"/>
           </person>
           <person id="p7" sex="f">
             <name>
               <firstname>Daisy May</firstname>
               <lastname>Farmer</lastname>
             </name>
             <death>
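To make the ID/IDREF lookup concrete, here is a minimal Python sketch of the same child-and-father pairing that the XQuery performs. It is an illustration, not the course's implementation: `FAMILY_XML` is a shortened, hypothetical excerpt of the sample document, the helper names `full_name` and `child_and_father` are made up, and because the standard-library `xml.etree.ElementTree` parser does not read the DTD, the sketch builds its own index over the `id` attribute instead of relying on a built-in `id()` function.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical excerpt of the family document (no DTD attached).
FAMILY_XML = """<family-tree>
  <person id="p5" sex="m">
    <name><firstname>Alfred Ernest</firstname><lastname>Farmer</lastname></name>
  </person>
  <person id="p6" sex="m">
    <name><firstname>Ronald Alfred</firstname><lastname>Farmer</lastname></name>
    <father father="p5"/>
  </person>
</family-tree>"""


def full_name(person):
    """Join the name parts that are present for a <person> element."""
    name = person.find("name")
    parts = [name.findtext(tag) for tag in ("firstname", "middle", "lastname")]
    return " ".join(p for p in parts if p)


def child_and_father(xml_text):
    """Return (child name, father name or None) for every person."""
    root = ET.fromstring(xml_text)
    # Manual stand-in for ID lookup: map each id value to its element.
    by_id = {p.get("id"): p for p in root.iter("person")}
    pairs = []
    for person in root.iter("person"):
        father = person.find("father")
        ref = father.get("father") if father is not None else None
        target = by_id.get(ref)
        pairs.append(
            (full_name(person), full_name(target) if target is not None else None)
        )
    return pairs


if __name__ == "__main__":
    for child, father in child_and_father(FAMILY_XML):
        print(child, "<-", father or "father unknown")
```

Because each person is stored once and referenced by identifier, changing a person's data in one place is reflected wherever that person is looked up, which is the integrity and maintainability point made on the slide.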
