Data Presentation and Markup Languages
MIE456 Tutorial

Acknowledgements
• Some contents of this presentation are borrowed from a tutorial given at VLDB 2000, Cairo, Egypt (www.vldb.org) by D. Florescu and J. Siméon.

Agenda
• Web Servers and HTML
• Dynamic Web Page Generation
• XML
• DTD

HTML
• Most web content is formatted in HTML
• Displayed by web browsers
• Uses "tags" to indicate formatting information (presentation logic)
• For example:

    <html>
      <body>
        <h1>Heading</h1>
        normal text
        <b>bold text</b>
        <i>italic</i>
      </body>
    </html>

Static vs. Dynamic Content
• Static content
    - Preformatted HTML pages
• Dynamic content
    - The content is determined at runtime
    - Presentation depends on user input and data from the database
    - The HTML page is generated upon request

Web Server
• HTTP: Hypertext Transfer Protocol
• URL: Uniform Resource Locator
• The web server parses the incoming request and sends a reply to the client
• Static page
    - http://www.eecg.toronto.edu/~jacobsen/mie456/index.html
    - The server locates the "index.html" file in its file system and sends the file back to the client
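
The static-page case above can be sketched in a few lines of Java. This is a simplified illustration, not how any particular web server is implemented: the class name `StaticPathSketch`, the method `resolve`, and the document root `/var/www` are all assumptions made for the example. It only shows the mapping from the path part of a URL to a file under the server's document root, plus a check that the request does not escape that root.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class StaticPathSketch {
    /**
     * Maps the path part of a URL to a file under the document root.
     * Returns null when the path would point outside the document root.
     */
    static Path resolve(Path docRoot, String urlPath) {
        Path root = docRoot.toAbsolutePath().normalize();
        // Drop the leading "/" so the URL path is resolved relative to the root.
        String relative = urlPath.startsWith("/") ? urlPath.substring(1) : urlPath;
        Path candidate = root.resolve(relative).normalize();
        // Reject requests such as "/../etc/passwd" that leave the root.
        return candidate.startsWith(root) ? candidate : null;
    }

    public static void main(String[] args) {
        Path root = Paths.get("/var/www"); // hypothetical document root
        System.out.println(resolve(root, "/mie456/index.html"));
        System.out.println(resolve(root, "/../etc/passwd"));
    }
}
```

A real server would go on to read the resolved file and send its bytes back in the HTTP reply; that part is omitted here.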

Dynamic Web Page Generation
• URLs can indicate a request for invoking a program
    - http://www.ibm.com/webapp/wcs/stores/servlet/CategoryDisplay?categoryId=2035724&storeId=124&catalogId=-124&langId=124
• CGI: Common Gateway Interface
    - CGI scripts, e.g. Perl
    - CGI programs, e.g. C/C++ programs
• Java servlets, which run in servlet engines
• JSP, ASP: embed code (scripts) in HTML pages

Java Server Pages (JSP)
• An easier way to write server programs for dynamic content generation
• JSP pages are compiled into servlet code
    - Manually, or
    - When the JSP page is invoked for the first time
• Directives control how the web container translates and executes the JSP page
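
To make the idea of a URL invoking a program more concrete, here is a minimal plain-Java sketch of what such a program does: it reads the parameters from the query string and builds the HTML reply at request time. It deliberately avoids the servlet API so that it runs on its own; the class name `DynamicPageSketch` and the methods `parseQuery` and `render` are hypothetical. A real program would also percent-decode the values and escape them before writing them into HTML.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DynamicPageSketch {
    /** Splits "a=1&b=2" into name/value pairs (no percent-decoding, for brevity). */
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(name, value);
        }
        return params;
    }

    /** Builds the reply at request time, as a servlet or CGI program would. */
    static String render(Map<String, String> params) {
        StringBuilder html = new StringBuilder("<html><body><ul>");
        for (Map.Entry<String, String> p : params.entrySet()) {
            html.append("<li>").append(p.getKey()).append(" = ")
                .append(p.getValue()).append("</li>");
        }
        return html.append("</ul></body></html>").toString();
    }

    public static void main(String[] args) {
        String query = "categoryId=2035724&storeId=124&catalogId=-124&langId=124";
        System.out.println(render(parseQuery(query)));
    }
}
```

A JSP page expresses the same thing the other way around: the HTML is written literally and the Java code is embedded in it, and the container generates code of roughly this shape from the page.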

JSP example

    <%
      User[] users;
      users = getUsers(); // get a list of users from the database
    %>
    <table>
      <tr>
        <th>First name</th>
        <th>Last name</th>
      </tr>
      <%
        for (int i = 0; i < users.length; i++) {
          User u = users[i];
          String first = u.getFirstName();
          String last = u.getLastName();
      %>
      <tr>
        <td><%= first %></td>
        <td><%= last %></td>
      </tr>
      <% } // end for %>
    </table>

Extensible Markup Language (XML)
• A subset of Standard Generalized Markup Language (SGML)
• A markup language
    - Uses tags to describe the semantics and structure of data
• Self-descriptive
    - Uses user-defined tags with meaningful tag names
• Allows a tree-like, nested data structure
• Semi-structured data
    - Meaningful with or without a schema
• Extensible

XML Example

    <?xml version="1.0"?>
    <!-- a list of students -->
    <Students>
      <Student age="20">
        <Lastname>Smith</Lastname>
        <Firstname>John</Firstname>
        <Male/>
      </Student>
      <Student age="21">
        <Lastname>Brown</Lastname>
        <Firstname>Jane</Firstname>
        <Female/>
      </Student>
    </Students>

• Prolog: <?xml version="1.0"?>
• Attribute: age="20"
• Empty tags: <Male/>, <Female/>
• Elements: Students, Student, Lastname, Firstname

Well-formed XML
• Correct syntax
• Starts with an XML declaration (prolog); XML 1.0 recommends it but does not strictly require it
• Start and end tags match
• Empty tags end with />
• Has a root element that completely contains all other elements
• Tags may nest but may not overlap
• Attribute values must be quoted
• (more…)
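
One practical way to test the rules above is to hand the document to an XML parser, which must reject input that is not well-formed. The sketch below uses the JAXP DOM parser that ships with the JDK; the class name `WellFormedSketch` and the method `isWellFormed` are names made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class WellFormedSketch {
    /** Returns true if the parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler, which prints errors to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // well-formedness violations are fatal errors
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<Student age=\"20\"><Firstname>John</Firstname><Male/></Student>";
        String bad = "<Student age=\"20\"><Firstname>John</First><Male/></Student>";
        System.out.println(isWellFormed(good));
        System.out.println(isWellFormed(bad));
    }
}
```

The second document in `main` fails because the end tag `</First>` does not match the start tag `<Firstname>`.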

Valid XML
• A valid XML document is associated with a Document Type Definition (DTD)
• An XML document that conforms to its DTD is valid
• A valid XML document must also be well-formed
• The DTD can be declared inline with the XML document or in a separate file

Document Type Definition (DTD)
• Provides a grammar for the document
• Contains or points to markup declarations for: elements, attributes, entities, notations
• Optional
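
The difference between well-formed and valid can be seen by switching a parser into validating mode. The following sketch uses the JDK's JAXP DOM parser with a DTD declared inline; the class name `DtdValidationSketch`, the method `validate`, and the shortened Students DTD are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {
    // A shortened, inline DTD for a list of students (illustrative only).
    static final String DTD =
        "<!DOCTYPE Students ["
        + "<!ELEMENT Students (Student*)>"
        + "<!ELEMENT Student (Lastname, Firstname)>"
        + "<!ATTLIST Student age CDATA #REQUIRED>"
        + "<!ELEMENT Lastname (#PCDATA)>"
        + "<!ELEMENT Firstname (#PCDATA)>"
        + "]>";

    /** Returns the problems reported by a validating parse; empty means valid. */
    static List<String> validate(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) {
                    problems.add(e.getMessage()); // validity violation
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not even well-formed
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            problems.add(e.getMessage());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String body = "<Students><Student><Lastname>Smith</Lastname>"
            + "<Firstname>John</Firstname></Student></Students>";
        // The required "age" attribute is missing, so problems are reported.
        System.out.println(validate(DTD + body));
    }
}
```

The document in `main` is well-formed, but it is not valid because the DTD declares the `age` attribute as `#REQUIRED`.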

DTD Example

XML document:

    <Students>
      <Student age="20" weight="150">
        <Lastname>Smith</Lastname>
        <Firstname>John</Firstname>
        <Male/>
      </Student>
      <Student age="21">
        <Lastname>Brown</Lastname>
        <Firstname>Jane</Firstname>
        <Female/>
      </Student>
    </Students>

DTD:

    <!ELEMENT Students (Student*)>
    <!ELEMENT Student (Lastname, Firstname, Male?, Female?)>
    <!ATTLIST Student age CDATA #REQUIRED>
    <!ATTLIST Student weight CDATA #IMPLIED>
    <!ELEMENT Lastname (#PCDATA)>
    <!ELEMENT Firstname (#PCDATA)>
    <!ELEMENT Male EMPTY>
    <!ELEMENT Female EMPTY>

Elements
• Element:
    - the logical atomic unit of data
    - has a name, a content and a set of attributes
    - the content is an ordered list of children that can be elements, character data, comments, processing instructions and references
• Element declaration:
    - describes constraints on the content of an element
    - EMPTY: no content allowed
    - ANY: can contain any elements defined in the DTD, in any order
    - MIXED: character data mixed with the additional declared elements
    - CHILDREN: the children can be only elements and they have to satisfy the given regular expression

    <?xml version="1.0"?>
    <!DOCTYPE book [
      <!ELEMENT book (title, author*, publisher?, section+)>
      <!ATTLIST book year CDATA #IMPLIED>
      <!ELEMENT title (#PCDATA)>
      <!ENTITY % macro "publisher (#PCDATA)">
      <!-- The declaration of the <publisher> element -->
      <!ELEMENT %macro;>
      <!ELEMENT author (#PCDATA)>
      <!ELEMENT section (#PCDATA | title | section)*>
    ]>
    <book year="1967">
      <title>The politics of experience</title>
      <author>R.D.Laing</author>
      <section>
        The great and true Amphibian, whose nature is disposed to…..
        <title>Persons and experience</title>
        Even facts become fictions without adequate ways to...
      </section>
      <section>
        <section>
          <![CDATA[Exploitation <must> not been….]]>
        </section>
      </section>
    </book>
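
The CHILDREN rule above says that the sequence of child elements must satisfy a regular expression. As a rough illustration, the content model `(Lastname, Firstname, Male?, Female?)` from the Students DTD can be translated by hand into a `java.util.regex` pattern and matched against the list of child names. This is not how a validating parser is implemented; the class name `ContentModelSketch` and the encoding (each name followed by one space) are assumptions made for the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class ContentModelSketch {
    // Hand translation of <!ELEMENT Student (Lastname, Firstname, Male?, Female?)>.
    // Each child name is followed by one space so names cannot run together.
    static final Pattern STUDENT =
        Pattern.compile("Lastname Firstname (Male )?(Female )?");

    /** Checks an ordered list of child element names against a content model. */
    static boolean matches(Pattern model, List<String> childNames) {
        StringBuilder sequence = new StringBuilder();
        for (String name : childNames) {
            sequence.append(name).append(' ');
        }
        return model.matcher(sequence).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(STUDENT, Arrays.asList("Lastname", "Firstname", "Male")));
        System.out.println(matches(STUDENT, Arrays.asList("Firstname", "Lastname")));
    }
}
```

The second call fails because a DTD sequence (the comma operator) fixes the order of the children. Note that this content model also accepts a Student with both `Male` and `Female`; a choice such as `(Male | Female)` would be needed to rule that out.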

Attributes
• Attribute:
    - (name, string value) pair
    - associated with an element
• Attribute declaration:
    - a triple (name, type, defaultValue)
• Type:
    - string type (CDATA)
    - tokenized type (ID, IDREF, IDREFS, ENTITY, NMTOKEN, etc.)
    - enumerated
• Default declaration:
    - #REQUIRED
    - #IMPLIED
    - #FIXED
    - default value

    <?xml version="1.0"?>
    <!DOCTYPE book [
      <!ELEMENT book (title, author*, publisher?, section+)>
      <!ATTLIST book year CDATA #IMPLIED>
      <!ELEMENT title (#PCDATA)>
      <!ENTITY % macro "publisher (#PCDATA)">
      <!-- The declaration of the <publisher> element -->
      <!ELEMENT %macro;>
      <!ELEMENT author (#PCDATA)>
      <!ELEMENT section (#PCDATA | title | section)*>
    ]>
    <book year="1967">
      <title>The politics of experience</title>
      <author>R.D.Laing</author>
      <section>
        The great and true Amphibian, whose nature is disposed to…..
        <title>Persons and experience</title>
        Even facts become fictions without adequate ways to...
      </section>
      <section>
        <section>
          <![CDATA[Exploitation <must> not been….]]>
        </section>
      </section>
    </book>

Entities
• Entities:
    - the physical storage unit for the XML data
    - have a name and a content
    - can be referenced by name
• First classification:
    - parsed entities
    - unparsed entities
• Second classification:
    - internal entities
    - external entities
• Parsed entities:
    - general entities: can occur in the data content of the document
    - parameter entities: can occur in the DTD
• Entity references:
    - general entity: &name;
    - parameter entity: %name;
• Note: XML 1.0 allows a parameter-entity reference inside a markup declaration, such as <!ELEMENT %macro;>, only in an external DTD subset; in an inline DTD a conforming parser rejects it.

    <?xml version="1.0"?>
    <!DOCTYPE book [
      <!ELEMENT book (title, author*, publisher?, section+)>
      <!ATTLIST book year CDATA #IMPLIED>
      <!ELEMENT title (#PCDATA)>
      <!ENTITY % macro "publisher (#PCDATA)">
      <!-- The declaration of the <publisher> element -->
      <!ELEMENT %macro;>
      <!ELEMENT author (#PCDATA)>
      <!ELEMENT section (#PCDATA | title | section)*>
      <!ENTITY macro2 "<![CDATA[Exploitation <must> not been….]]>">
    ]>
    <book year="1967">
      <title>The politics of experience</title>
      <author>R.D.Laing</author>
      <section>
        The great and true Amphibian, whose nature is disposed to…..
        <title>Persons and experience</title>
        Even facts become fictions without adequate ways to...
      </section>
      <section>
        <section>
          &macro2;
        </section>
      </section>
    </book>
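
A general entity reference of the form `&name;` is replaced by the entity's content when the document is parsed. The sketch below shows this with the JDK's JAXP DOM parser. The entity `pub`, its value, and the class name `EntitySketch` are hypothetical and chosen only to keep the example small; they do not appear in the book example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntitySketch {
    /** Parses the document and returns the text content of its root element. */
    static String textOf(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Internal general entities have been expanded by the parser here.
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<!DOCTYPE book [<!ENTITY pub \"Penguin Books\">]>"
            + "<book>Published by &pub;.</book>";
        System.out.println(textOf(xml));
    }
}
```

Because `pub` is an internal parsed entity, the parser substitutes its content in place of `&pub;` before the application sees the text.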

XML and DTD Syntax
• Get the XML and DTD specifications from http://www.xml.com/axml/testaxml.htm

Applications of XML
• Communicating data between distributed applications
• Web services
• Data integration and transformation
• Structuring data in flat files, e.g. configuration files
• Large textual databases
• B2B e-business
• Enterprise Application Integration (EAI)
• Voice XML
• Many others…