COMP6037
Semi-structured Data and the Web
4.1 Relax NG and Tree Grammars and XSLT
U. Sattler
University of Manchester

General Stuff
• Read Blackboard's Announcements
• Read Blackboard's Discussions
• Forward your Blackboard email to your email account

Unique Particle Attribution Constraint:
A content model must be formed such that
• during validation of an element information item sequence (child node sequence, cns),
• the particle component contained directly, indirectly or implicitly therein (in the content model)
• with which to attempt to validate each item in the sequence (cns) in turn can be uniquely determined
• without examining the content or attributes of that item (the element name suffices), and without any information about the items in the remainder of the sequence (no look-ahead into the rest of the cns).
[Annotation on slide, partly lost: "... is unnecessary"]

Schema Languages and Tree Grammars
[Figure: classes of tree grammars, labelled Loc, ST, Reg]
• Last week, we saw how to translate schema languages into tree grammars: we saw that
  – each DTD can be faithfully translated into a local tree grammar, and therefore into a single-type one
    • hence each DTD corresponds to a single-type grammar
    • hence there is exactly 1 PSVI for each document that validates against a DTD
  – each XML schema can be faithfully translated into a single-type tree grammar,
    • hence there is exactly 1 PSVI for each document that validates against an XML schema
• ...we also saw that parts of the UPA constraint help to generate the PSVI: do we need other parts?
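The two grammar classes named in this recap can be checked mechanically. The following is a minimal sketch, assuming the standard definitions: two different non-terminals compete if their rules share a terminal; a grammar is local if no non-terminals compete, and single-type if competing non-terminals never occur together in one content model or among the start symbols. The three small grammars and all function names are hypothetical illustrations, and each non-terminal is assumed to have exactly one rule.

```python
import re
from itertools import combinations

# A rule "X -> a r" is stored as X: (a, r): a is the terminal (element name),
# r is the content model written as a string over non-terminal names.

def nonterminals_in(content_model):
    """Return the set of non-terminal names mentioned in a content model."""
    return set(re.findall(r"[A-Za-z][\w^-]*", content_model))

def competing_pairs(productions):
    """Two different non-terminals compete if their rules share a terminal."""
    return {
        frozenset((x, y))
        for (x, (a, _)), (y, (b, _)) in combinations(productions.items(), 2)
        if a == b
    }

def is_local(productions):
    """Local: no competing non-terminals at all."""
    return not competing_pairs(productions)

def is_single_type(productions, start):
    """Single-type: competing non-terminals never share a content model or S."""
    scopes = [set(start)] + [nonterminals_in(r) for _, r in productions.values()]
    return not any(pair <= scope
                   for pair in competing_pairs(productions)
                   for scope in scopes)

# Hypothetical DTD-like grammar: every element name has exactly one type.
dtd_like = {
    "Book": ("book", "Title, Author*"),
    "Title": ("title", "Pcdata"),
    "Author": ("author", "Pcdata"),
    "Pcdata": ("pcdata", ""),
}

# "section" has two types, but they never meet in one content model.
nested = {
    "Doc": ("doc", "Sec1*"),
    "Sec1": ("section", "Title, Sec2*"),
    "Sec2": ("section", "Title"),
    "Title": ("title", "Pcdata"),
    "Pcdata": ("pcdata", ""),
}

# Same rules, except that both section types are offered side by side.
mixed = dict(nested, Doc=("doc", "(Sec1 | Sec2)*"))

print(is_local(dtd_like), is_single_type(dtd_like, {"Book"}))
print(is_local(nested), is_single_type(nested, {"Doc"}))
print(is_local(mixed), is_single_type(mixed, {"Doc"}))
```

Since a local grammar has no competing pairs, the single-type test passes trivially for it, which mirrors the step from "local" to "therefore single-type" in the recap.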
See the paper by Murata, Lee, Mani, Kawaguchi

Schema Languages and Tree Grammars
• Last week, we also learned about a third, flexible, liberal schema language, Relax NG
  – but we haven't translated Relax NG schemas into tree grammars yet
  – so we don't know whether being more liberal gives more than single-type
• Also, the whole approach brings another opportunity:
  – we can investigate the problem of validating a document against a tree grammar
  – we can discuss algorithms for this problem:
    [Figure: Tree T and Grammar G are fed to an algorithm, which answers "yes" if T ∈ L(G) and "no" otherwise]
• oh, and we will learn about XSLT today as well

Translating Relax NG schema into tree grammars by example 1

  grammar {
    start = AddressBook
    AddressBook = element addressBook { Card* }
    Card = element card { Inline }
    Inline = Name, Email+
    Name = element name { text }
    Email = element email { text }
  }

Translate into G = (N, Σ, S, P) with
  N = {AddressBook, Card, Inline, Name, Email, Pcdata}
  Σ = {addressBook, card, name, email, pcdata}
  S = {AddressBook}
  P = {AddressBook → addressBook Card*,
       Card → card Inline,
       Inline → Name, Email+,
       Name → name Pcdata,
       Email → email Pcdata,
       Pcdata → pcdata ε }

• "element y" ↦ y ∈ Σ ...possibly also an "uppercased copy" Y ∈ N
• all other user-defined symbols X ↦ X ∈ N
• ...translating Relax NG rules is easy (depending on Relax NG style)
• ...let's see one more

Translating Relax NG schema into tree grammars by example 2
This Relax NG style makes translation of rules less easy…

  grammar {
    start = p-el
    p-el = element people { per-el+ }
    per-el = element person {
      attribute age { text },        (Ignore!)
      na-el,
      ad-el+,
      pro-el* }
    na-el = element name {
      element first { text },
      element middle { text }?,
      element last { text } }
    ad-el = element address { text }
    pro-el = element project {
      attribute type { text },       (Ignore!)
      attribute id { text },
      text } }

Translate into G = (N, Σ, S, P) with
  N = {P-EL, PER-EL, NA-EL, AD-EL, PRO-EL, FIRST, MIDDLE, LAST, Pcdata}
  Σ = {people, person, name, first, middle, last, address, project}
  S = {P-EL}
  P = {P-EL → people PER-EL, PER-EL*,
       PER-EL → person NA-EL, AD-EL, AD-EL*, PRO-EL*,
       NA-EL → name FIRST, (MIDDLE|ε), LAST,
       FIRST → first Pcdata,
       MIDDLE → middle Pcdata,
       LAST → last Pcdata,
       AD-EL → address Pcdata,
       PRO-EL → project Pcdata,
       Pcdata → pcdata ε }

Translating Relax NG schema into tree grammars by example 3
This Relax NG style makes translation of rules easy and leads to generalized rules!

  grammar {
    start = element people { people-content }
    people-content = element person { person-content }+
    person-content = attribute age { text },        (Ignore!)
                     element name { name-content },
                     element address { text }+,
                     element project { project-content }*
    name-content = element first { text },
                   element middle { text }?,
                   element last { text }
    project-content = attribute type { text },      (Ignore!)
                      attribute id { text },
                      text }

Translate into G = (N, Σ, S, P) with
  N = {PEOPLE, P-C, PER-C, NA, NA-C, PERSON, PRO-C, ADR, PROJ, FIRST, MIDDLE, LAST, Pcdata}
  Σ = {people, person, name, first, middle, last, address, project}
  S = {PEOPLE}
  P = {PEOPLE → people P-C,
       P-C → PERSON, PERSON*,             (expand!)
       PERSON → person PER-C,
       PER-C → NA, ADR, ADR*, PROJ*,      (expand!)
       NA → name NA-C,
       ADR → address Pcdata,
       PROJ → project PRO-C,
       PRO-C → pcdata ε,
       NA-C → FIRST, (MIDDLE|ε), LAST,
       FIRST → first Pcdata,
       MIDDLE → middle Pcdata,
       LAST → last Pcdata,
       Pcdata → pcdata ε }
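The validation problem posed earlier (given a tree T and a grammar G, is T ∈ L(G)?) can be sketched for the address book grammar of example 1. This is a minimal bottom-up sketch, not an algorithm taken from the slides: it assumes the generalized rule for Inline has already been expanded into the Card rule, represents a tree as a (label, children) pair, and encodes each content model as a Python regular expression over non-terminal names followed by a space. The names `types_of` and `validates` are hypothetical.

```python
import re
from itertools import product

# Rules of example 1, with Inline expanded into the Card rule.
# X -> a r is stored as X: (a, r); r is a regex over "Name " tokens.
P = {
    "AddressBook": ("addressBook", r"(Card )*"),
    "Card": ("card", r"Name (Email )+"),
    "Name": ("name", r"Pcdata "),
    "Email": ("email", r"Pcdata "),
    "Pcdata": ("pcdata", r""),
}
S = {"AddressBook"}

def types_of(tree):
    """Return every non-terminal X that derives the tree (label, children)."""
    label, children = tree
    child_types = [types_of(child) for child in children]
    found = set()
    for x, (terminal, content_model) in P.items():
        if terminal != label:
            continue
        # Try every way of picking one non-terminal per child.
        for choice in product(*child_types):
            word = "".join(name + " " for name in choice)
            if re.fullmatch(content_model, word):
                found.add(x)
                break
    return found

def validates(tree):
    """T is in L(G) if some start symbol derives the root."""
    return bool(types_of(tree) & S)

text = ("pcdata", [])
card = ("card", [("name", [text]), ("email", [text]), ("email", [text])])
good = ("addressBook", [card, card])
no_email = ("addressBook", [("card", [("name", [text])])])

print(validates(good))      # a card with one name and two emails, twice
print(validates(no_email))  # Email+ requires at least one email
```

Trying every combination of child types is exponential in general; for a single-type grammar each child has at most one candidate type, so the loop over choices collapses to a single check.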
Translating Relax NG schema into tree grammars by example 3 (continued)

  ...
  people-content = element person { person-content }+
  person-content = attribute age { text },
                   element name { name-content },
                   element address { text }+,
                   element project { project-content }*
  ...

  ...
  PERSON → person PER-C,
  PER-C → NA, ADR, ADR*, PROJ*,      (expand!)
  NA → name NA-C,
  ADR → address Pcdata,
  ...

Two things we have already seen when translating WXS:
• that we might need to introduce "generalized" rules, which can & need to be expanded, as for WXS:
  for each illegal rule X → e:
  – remove X → e from the rule set
  – replace all occurrences of X in the rule set with e
• we might have to "contextualise" names and types of elements:

Translating Relax NG schema into tree grammars by example 4
2. we might have to "contextualise" names and types of elements, to handle schemas where the same element name is used in different contexts with different types:

  ...
  people-content = element person { person-content }+,
                   element friend { friend-content }+
  ...
  person-content = attribute age { text },
                   element name { name-content },
                   ...
  friend-content = attribute age { text },
                   element name { friend-name-content },
                   ...

  ...
  P-C → PERSON, PERSON*, FRIEND, FRIEND*
  PERSON → person PER-C,
  FRIEND → friend FRIE-C,
  PER-C → NA^NA-C, ...
  FRIE-C → NA^FRIE-NA-C, ...
  NA^NA-C → name NA-C,
  NA^FRIE-NA-C → name FRIE-NA-C,
  ...

Translating Relax NG schema into tree grammars
• each Relax NG schema can be faithfully translated into a tree grammar:
  – local? no: the example on the previous slide leads to competing non-terminals (NA^PER-C and NA^FRIE-C)
      ...
      NA^PER-C → name NA-C,
      NA^FRIE-C → name NA-C,
      ...
  – single-type? no: see the example below, where NA^NA-C and NA^FO-NA-C compete and occur in the same RHS
      ...
      person-content = attribute age { text },
                       element name { name-content } |
                       element name { foreign-name-content },
                       ...
      ...
      PER-C → NA^NA-C | NA^FO-NA-C
      NA^NA-C → name NA-C,
      NA^FO-NA-C → name FO-NA-C,
      ...
  – so is Relax NG as powerful as tree grammars?

Relax NG schema is indeed as powerful as tree grammars
Every tree grammar can be faithfully translated into a Relax NG schema.
• Proof (not too hard): given a tree grammar G = (N, Σ, S, P),
  1. translate each production rule N → t regexp in P into
       N = element t { regexp }
     (fortunately, the tree grammar regular expression syntax is very close to, and more strict than, Relax NG regular expression syntax)
  2. put the resulting statements into a grammar, where N1, ..., Nk are all start symbols, i.e., S = {N1, ..., Nk}:
       grammar {
         start = N1 | ... | Nk
         ...
       }
  3. call the resulting schema G_S
Then T ∈ L(G) if and only if T validates against G_S.
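Steps 1 and 2 of the proof can be sketched as a small generator of Relax NG compact syntax. This is a sketch under stated assumptions rather than a complete translator: content models are assumed to be written already in Relax NG compact syntax, the rule Pcdata → pcdata ε is rendered as the definition `Pcdata = text` (a choice made for this sketch, since the slides do not spell out how text leaves are handled), and non-terminal names are assumed to be valid Relax NG identifiers (so contextualised names containing `^` would need renaming first). The function name `to_relax_ng` is hypothetical.

```python
def to_relax_ng(productions, start):
    """Emit one definition per rule and start = N1 | ... | Nk."""
    lines = ["grammar {", "  start = " + " | ".join(sorted(start))]
    for nonterminal, (terminal, regexp) in productions.items():
        if terminal == "pcdata":
            # Assumption of this sketch: text leaves become the text pattern.
            lines.append(f"  {nonterminal} = text")
        else:
            # Step 1: N -> t regexp becomes N = element t { regexp }.
            lines.append(f"  {nonterminal} = element {terminal} {{ {regexp} }}")
    lines.append("}")
    return "\n".join(lines)

# The grammar of example 1, with the generalized Inline rule expanded.
productions = {
    "AddressBook": ("addressBook", "Card*"),
    "Card": ("card", "Name, Email+"),
    "Name": ("name", "Pcdata"),
    "Email": ("email", "Pcdata"),
    "Pcdata": ("pcdata", ""),
}

schema = to_relax_ng(productions, {"AddressBook"})
print(schema)
```

Applied to the grammar of example 1, the output has the same shape as the schema that example started from, except that Inline has been expanded and text is reached through the Pcdata definition.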