Module 2: XML Basics
(XML, Namespaces, Usage scenarios, DTDs)
History: SGML vs. HTML vs. XML
• SGML (1960s roots as GML; ISO standard in 1986)
• HTML (1990)
• XML (1996 first working draft; W3C Recommendation in 1998)
• XHTML (2000)
http://www.w3.org/TR/2006/REC-xml-20060816/
Why XML?
• HTML is meant to be interpreted by browsers
  • It is shown on the screen to a human
• Desire to separate the "content" from the "presentation"
  • Presentation has to please the human eye
  • Content can be interpreted by machines; for machines, presentation is a handicap
• Semantic markup of the data
Information about a book in HTML
  <td><h1 class="Books">Politics of experience by Ronald Laing, published in 1967</h1></td>
  <td align="right" nowrap> Item number:320070381076</td>
  <td align="right" valign="top"><img src="http://pics.booksstatic.com/aw/pics/globalAssets/rtCurve.gif" width="8" height="8"></td></tr>
  <tr><td colspan="6" valign="middle" bgcolor="#5F66EE"><img src="http://pics.booksstatic.com/aw/pics/s.gif" width="1" height="4"></td></tr></table>
  <table width="100%" border="0" cellpadding="0" cellspacing="0"><tr><td bgcolor="#CCCCFF"><img src="http://pics.booksstatic.com/aw/pics/s.gif" width="1" height="1"></td>
  <td bgcolor="#EEEEFF"><div id="FastVIPBIBO"><table border="0" cellpadding="0" cellspacing="0" width="100%">
The same information in XML
  <book year="1967">
    <title>Politics of experience</title>
    <author>
      <firstname>Ronald</firstname>
      <lastname>Laing</lastname>
    </author>
  </book>
• Information is (1) decoupled from presentation, then (2) chopped into smaller pieces (elements), and then (3) marked with semantic meaning
• It can be processed by machines
• Like HTML, XML defines only a syntax, not a logical abstract data model
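To illustrate "it can be processed by machines", here is a minimal sketch that reads the book example with Python's standard `xml.etree.ElementTree` module. The helper name `describe_book` and the shape of the returned dictionary are assumptions made for this illustration, not part of the slides.

```python
import xml.etree.ElementTree as ET

# The book example from the slide, as a string.
BOOK_XML = """
<book year="1967">
  <title>Politics of experience</title>
  <author>
    <firstname>Ronald</firstname>
    <lastname>Laing</lastname>
  </author>
</book>
""".strip()


def describe_book(xml_text):
    """Hypothetical helper: pull the marked-up pieces out of a <book> element."""
    book = ET.fromstring(xml_text)
    first = book.findtext("author/firstname")
    last = book.findtext("author/lastname")
    return {
        "title": book.findtext("title"),
        "author": f"{first} {last}",
        "year": book.get("year"),  # attribute values are plain strings
    }


print(describe_book(BOOK_XML))
```

No screen-scraping of layout tags is needed here: each piece of information is addressed by its element or attribute name.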
XML key concepts
• Documents
• Elements
• Attributes
• Namespace declarations
• Text
• Comments
• Processing Instructions
All inherited from SGML, then HTML
The key concepts of XML
  <book year="1967">
    <title>Politics of experience</title>
    <author>
      <firstname>Ronald</firstname>
      <lastname>Laing</lastname>
    </author>
  </book>
• Documents
• Elements
• Attributes
• Text
• Nested structure
• Conceptual tree
• Order is important
• Only "characters", not integers, etc.
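The points about the conceptual tree, document order, and "only characters" can be sketched as follows. This assumes Python's standard `xml.etree.ElementTree`; the function name `outline` is hypothetical and only serves this illustration.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    '<book year="1967">'
    "<title>Politics of experience</title>"
    "<author><firstname>Ronald</firstname><lastname>Laing</lastname></author>"
    "</book>"
)


def outline(element, depth=0):
    """Return the conceptual tree as indented tag names, in document order."""
    lines = ["  " * depth + element.tag]
    for child in element:  # children are iterated in document order
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(book)
print("\n".join(tree_lines))

# XML itself carries only characters: the attribute value is a string,
# and turning it into an integer is the application's job.
raw_year = book.get("year")
year = int(raw_year)
print(type(raw_year).__name__, year)
```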
Elements
• Enclosed in tags
  • Begin tag: e.g., <bibliography>
  • End tag: e.g., </bibliography>
  • Element without content: e.g., <bibliography /> is a shorthand for <bibliography></bibliography>
• Elements can be nested
    <bib> <book> Wilde Wutz </book> </bib>
• Subelements can implement multisets
    <bib> <book> ... </book> <book> ... </book> </bib>
• Order is important!
• Documents must be well-formed
    <a> <b> </a> </b> is forbidden! (tags overlap instead of nesting)
    <a> <b> </b> is forbidden! (<a> is never closed)
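The well-formedness rules above can be tried out with a small check. This is a sketch that assumes Python's standard `xml.etree.ElementTree` parser, which raises `ParseError` on input that is not well-formed; the helper name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Hypothetical helper: True if the parser accepts the text as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Examples taken from the slide.
print(is_well_formed("<bibliography />"))                # empty-element shorthand
print(is_well_formed("<bibliography></bibliography>"))   # the long form
print(is_well_formed("<a> <b> </a> </b>"))               # overlapping tags
print(is_well_formed("<a> <b> </b>"))                    # <a> is never closed

# Repeated subelements are allowed (multiset) and keep their order.
bib = ET.fromstring("<bib><book>Wilde Wutz</book><book>Wilde Wutz</book></bib>")
book_count = len(bib.findall("book"))
print(book_count)
```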
Attributes
• Attributes are associated with elements
    <book price="55" year="1967">
      <title> ... </title>
      <author> ... </author>
    </book>
• Elements can consist of attributes only
    <person name="Wutz" age="33"/>
• Attribute names must be unique within an element! (No multisets)
    <person name="Wilde" name="Wutz"/> is illegal!
• What is the difference between a nested element and an attribute? Are attributes useful?
• Modeling decision: should "name" be an attribute or a subelement of a person? What about "age"?
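The uniqueness rule and the modeling question can be explored with a short experiment. This sketch assumes Python's standard `xml.etree.ElementTree`; the second `<person>` document, with `name` and `age` as subelements, is a hypothetical alternative modeling and does not come from the slides.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Hypothetical helper: True if the text is accepted by the XML parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# An element that carries only attributes.
person = ET.fromstring('<person name="Wutz" age="33"/>')
print(person.attrib)

# The same attribute name twice is rejected by the parser.
duplicate_ok = parses('<person name="Wilde" name="Wutz"/>')
print(duplicate_ok)

# Assumed alternative modeling: as subelements, "name" may repeat.
person_el = ET.fromstring(
    "<person><name>Wilde</name><name>Wutz</name><age>33</age></person>"
)
names = [n.text for n in person_el.findall("name")]
print(names)
```

One way to read the result: an attribute fits a single, unstructured value, while a subelement is needed as soon as the value can repeat or has internal structure.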
Text and Mixed Content
• Text appears in element content
    <title>The politics of experience</title>
• Text can be mixed with other subelements (mixed content)
    <title>The politics of <em>experience</em></title>
• Mixed content is very useful for "document" data
• The need does not arise in "data" processing, which deals only with entities and relationships
• People speak in sentences, not entities and relationships
• XML makes it possible to preserve the structure of natural language while adding semantic markup that machines can interpret
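How a parser exposes mixed content depends on the API. As one illustration, the sketch below uses Python's standard `xml.etree.ElementTree`, where text before the first subelement is stored in `.text` and text following a subelement is stored in that subelement's `.tail`. The variable names are chosen for this example only.

```python
import xml.etree.ElementTree as ET

# The mixed-content example from the slide.
title = ET.fromstring("<title>The politics of <em>experience</em></title>")

leading_text = title.text   # text before the first subelement
em = title.find("em")
emphasized = em.text        # text inside <em>
trailing_text = em.tail     # text after </em>; None here, nothing follows it

# itertext() walks all text pieces in document order, so the
# natural-language sentence can be recovered without the markup.
plain = "".join(title.itertext())

print(repr(leading_text), repr(emphasized), repr(trailing_text))
print(plain)
```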