Web Engineering
Prof. Dr. Dr. h.c. mult. Gerhard Krüger, Albrecht Schmidt
Universität Karlsruhe
Fakultät für Informatik
Institut für Telematik
Winter semester 2000/2001

Prof. Dr. Dr. h.c. mult. Gerhard Krüger, Albrecht Schmidt: Web Engineering, WS00/01

Web Engineering
Chapter 3: The Web – An Information System

SGML
• Standard Generalized Markup Language
• ISO standard [ISO/IS 8879, 1986]
• developed from GML, IBM 1969 (Goldfarb, Mosher, Lorie)
• distinction between content and presentation
• dilemma of a specific markup language:
  • What is an appropriate set of tags?
• generalized markup
• documents are described in three parts
  • SGML declaration: mapping of the abstract SGML syntax onto concrete characters; definition of STAGO, e.g. '<', or TAGC, e.g. '>', or the charset
  • Document Type Definition, DTD: definition of tags and their meaning
  • the document (content), marked up with the tags defined in the DTD
• semantics of markup is context dependent

A General SGML Parser
[Figure: the SGML declaration, the Document Type Definition, and the document are the inputs of the SGML processing system (parser)]
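The role of the SGML declaration, mapping abstract delimiter roles such as STAGO and TAGC onto concrete characters, can be illustrated with a small Python sketch. It is not an SGML parser: it only recognizes simple start and end tags. `BRACKET_SYNTAX` is a hypothetical alternative concrete syntax invented for this example, and the function name `tag_names` is likewise an assumption.

```python
import re

# Delimiter roles of the abstract syntax mapped onto concrete strings.
# These are the familiar '<', '</' and '>' delimiters.
REFERENCE_SYNTAX = {"STAGO": "<", "ETAGO": "</", "TAGC": ">"}

# Hypothetical alternative concrete syntax, invented for this sketch.
BRACKET_SYNTAX = {"STAGO": "[", "ETAGO": "[/", "TAGC": "]"}


def tag_names(text, syntax):
    """Return (kind, name) pairs for every simple start or end tag in text."""
    stago, etago, tagc = (
        re.escape(syntax[role]) for role in ("STAGO", "ETAGO", "TAGC")
    )
    # ETAGO is tried first because it begins with the same character as STAGO.
    pattern = re.compile(
        f"(?P<open>{etago}|{stago})(?P<name>[A-Za-z][A-Za-z0-9.-]*){tagc}"
    )
    return [
        ("end" if m.group("open") == syntax["ETAGO"] else "start", m.group("name"))
        for m in pattern.finditer(text)
    ]


reference_tags = tag_names("<title>Web Engineering</title>", REFERENCE_SYNTAX)
bracket_tags = tag_names("[title]Web Engineering[/title]", BRACKET_SYNTAX)
print(reference_tags)
print(bracket_tags)
```

Both calls find the same abstract structure, a start tag and an end tag for `title`, although the concrete characters differ. A specialized parser simply hard-codes one such mapping.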
Specialized SGML Parser I
• SGML syntax is predefined (included in the parser)
• e.g. XML parser (Extensible Markup Language)
[Figure: the SGML declaration is built into the SGML processing system (parser); the Document Type Definition and the document are its inputs]

Specialized SGML Parser II
• Document Type Definition and SGML syntax are predefined
• e.g. HTML (Hypertext Markup Language)
[Figure: the SGML declaration and the Document Type Definition are built into the SGML processing system (parser); the document is its only input]

SGML Concepts
• descriptive instead of procedural markup
  • “this is an X” instead of “do X here”
• entity
  • collection of characters that can be referenced as a unit
• element
  • component of the hierarchical structure defined by a document type definition (DTD)
• in SGML, tags are used to mark the structure of elements
• entities are reference points

Document Type Definition - Example
    <!ELEMENT announcement (head,content)>
    <!ELEMENT head (title,date?)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT date (#PCDATA)>
    <!ELEMENT content (abstract?, paragraph+)>
    <!ELEMENT paragraph ((#PCDATA|course|name)*)>
    <!ELEMENT (abstract|course|name) (#PCDATA)>
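To make the example DTD more concrete, the following Python sketch checks a document instance against its content models. It is a minimal sketch under stated assumptions: each content model has been translated by hand into a regular expression over the sequence of child element names, the instances are written fully tagged so that Python's XML parser can read them, and stray text inside element-only content is not checked. The names `MODELS` and `conforms` and the two sample documents are invented for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Content models of the example DTD, hand-translated into regular
# expressions over a sequence of child names, each followed by a space.
# Elements declared as (#PCDATA) allow no child elements at all.
MODELS = {
    "announcement": r"head content ",
    "head": r"title (date )?",
    "title": r"",
    "date": r"",
    "content": r"(abstract )?(paragraph )+",
    "paragraph": r"((course|name) )*",  # mixed content: text, course, name
    "abstract": r"",
    "course": r"",
    "name": r"",
}


def conforms(element):
    """Return True if element and all its descendants match their model."""
    model = MODELS.get(element.tag)
    if model is None:
        return False  # element type is not declared
    children = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, children) is None:
        return False
    return all(conforms(child) for child in element)


GOOD = """<announcement>
  <head><title>Web Engineering</title><date>WS00/01</date></head>
  <content>
    <paragraph>The course <course>Web Engineering</course> is given by
      <name>Albrecht Schmidt</name>.</paragraph>
  </content>
</announcement>"""

# The required head element is missing here.
BAD = """<announcement>
  <content><paragraph>no head element</paragraph></content>
</announcement>"""

good_ok = conforms(ET.fromstring(GOOD))
bad_ok = conforms(ET.fromstring(BAD))
print(good_ok, bad_ok)
```

A real validating parser derives these checks from the DTD itself; the hand-written table only shows what the declarations constrain.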
Document Type Definition (SGML)
• a DTD defines:
  • elements of a document class
  • rules how these elements are combined
• structure:
  • MDO ELEMENT name model MDC
  • <!ELEMENT name model >
• name
  • is the name of the element, e.g. content, or
  • a group of element names joined by "or", e.g. (abstract|course|name)
• model
  • specifies the content that is allowed for this element

Model Group I (SGML)
• (#PCDATA) parsed character data
• connecting elements:
  • , sequence in the order specified
  • & all elements must exist (order not specified)
  • | one of the listed elements must exist (combined with + or *, an arbitrary combination)
• example:
    <!ELEMENT announcement (head,content)>
    <!ELEMENT head (name & author & date)>
    <!ELEMENT content (image | paragraph | table)>
• occurrence of elements:
  • + at least once, repeatable
  • * optional, may be repeated
  • ? optional, can occur at most once
• example:
    <!ELEMENT document (author,chapter+,appendix*,abstract?)>

Model Group II (SGML)
• exceptions
  • Inclusion
    • the element can be used anywhere in the model group
    • in the current element and in all elements embedded in it
  • Exclusion
    • the element must not be used in this model group
  • example:
    <!ELEMENT document (author,chapter+) +(italic)>
    <!ELEMENT italic (#PCDATA) -(italic)>
  • combinations are possible
• minimizing markup: OMITTAG
  • only if tag omission is enabled
  • specifies the tags that can be omitted
  • example (start tag may not be omitted, end tag may be omitted):
    <!ELEMENT author - O (#PCDATA) >

DTD Attributes
• structure of the attribute definition list declaration:
  • MDO ATTLIST name attribute-definition MDC
  • <!ATTLIST name attribute-definition >
• structure of an attribute definition:
  • attribute-name declared-value default-value
  • declared-value: alphanumeric (like variables or reserved words)
  • default-value: alphanumeric or reserved word
    • #REQUIRED: must be provided
    • #IMPLIED: if not available, the program interpreting the document may imply a value
    • #FIXED: the attribute has a fixed value
• example:
    <!ATTLIST image
        name CDATA #REQUIRED
        type (bw|color) bw
        date CDATA #IMPLIED>
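How the default values in an attribute definition list behave can be sketched in Python. The table below transcribes the `image` attribute list example by hand; the function name `apply_attlist` and the sample file name `logo.gif` are assumptions made for this illustration, and #FIXED is deliberately left out to keep the sketch short.

```python
# Attribute definitions for <image>, transcribed from the ATTLIST example:
# attribute name -> (declared value, default value).
IMAGE_ATTLIST = {
    "name": ("CDATA", "#REQUIRED"),
    "type": (("bw", "color"), "bw"),  # enumerated tokens, literal default
    "date": ("CDATA", "#IMPLIED"),
}


def apply_attlist(attlist, given):
    """Return the effective attributes or raise ValueError if invalid."""
    for attr in given:
        if attr not in attlist:
            raise ValueError(f"undeclared attribute: {attr}")
    effective = {}
    for attr, (declared, default) in attlist.items():
        if attr in given:
            value = given[attr]
            # An enumerated declared value restricts the allowed tokens.
            if isinstance(declared, tuple) and value not in declared:
                raise ValueError(f"{attr}={value!r} is not one of {declared}")
            effective[attr] = value
        elif default == "#REQUIRED":
            raise ValueError(f"missing required attribute: {attr}")
        elif default == "#IMPLIED":
            continue  # no value; the application may imply one
        else:
            effective[attr] = default  # literal default taken from the DTD
    return effective


result = apply_attlist(IMAGE_ATTLIST, {"name": "logo.gif"})
print(result)
```

With only `name` supplied, `type` is filled in with its declared default `bw`, while `date` stays absent because it is #IMPLIED.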
Document Processing / Layout I
• HTML
  • layout can be generated automatically
  • restricted to a specific document type with defined tags
• SGML
  • arbitrary document types can be defined
  • layout based on the Document Style Semantics and Specification Language, DSSSL
[Figure: an SGML document and a DSSSL style sheet are fed into a DSSSL processor, which produces an SPDL document. SPDL = Standard Page Description Language]

Document Processing / Layout II
• DSSSL: Document Style Semantics and Specification Language [ISO/IS 10179, 1996]
  • transformation of SGML documents of one type into another document type
  • transformation into an output format
• SPDL: Standard Page Description Language (similar to PostScript)
• transforming SGML documents
  • DSSSL creates a tree structure of the SGML document based on the DTD
  • the Standard Tree Formatting Process, STFP, produces an output according to the style sheet

Fully-Tagged SGML
• fully-tagged SGML document
  • parsing without DTD possible
  • all information required is included in the tags
  • requirement: type validation when creating the document
• Extensible Markup Language (XML)
  • restricted subset of SGML
  • to make parsing simpler
  • only fully-tagged documents are valid (incl. DTD)

Extensible Markup Language (XML)
Design Goals (http://www.w3.org/TR/REC-xml):
1. XML shall be straightforwardly usable over the Internet.
2. XML shall support a wide variety of applications.
3. XML shall be compatible with SGML.
4. It shall be easy to write programs which process XML documents.
5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
6. XML documents should be human-legible and reasonably clear.
7. The XML design should be prepared quickly.
8. The design of XML shall be formal and concise.
9. XML documents shall be easy to create.
10. Terseness in XML markup is of minimal importance.
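The point that a fully-tagged document can be parsed without a DTD can be tried out with Python's standard XML parser, which checks well-formedness only. This is a small sketch; the helper name `is_well_formed` and the sample fragments are invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML; no DTD is consulted."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Every element has a start tag and an end tag.
fully_tagged = "<entry><name>James Bond</name><phone>+44 007</phone></entry>"
# SGML-style tag omission: the end tag of <name> is missing.
omitted_end_tag = "<entry><name>James Bond<phone>+44 007</phone></entry>"
# XML names are case sensitive, so </Name> does not close <name>.
wrong_case = "<entry><name>James Bond</Name></entry>"
# The empty-element form replaces a start tag plus end tag.
empty_element = "<entry><name>James Bond</name><phone/></entry>"

checks = {
    "fully_tagged": is_well_formed(fully_tagged),
    "omitted_end_tag": is_well_formed(omitted_end_tag),
    "wrong_case": is_well_formed(wrong_case),
    "empty_element": is_well_formed(empty_element),
}
print(checks)
```

Only the fully-tagged fragment and the one using the empty-element form are accepted. Tag omission, which SGML permits when OMITTAG is enabled, is rejected because the parser has no DTD from which to infer the missing tag.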
XML
• general way to describe structured information
• XML namespaces (http://www.w3.org/TR/REC-xml-names/)
• “well-formed documents”
  • end tags are required, or the empty-element form <../>
  • minimizing tagging is not allowed
  • case sensitive
• user / developer defined DTDs
• family of languages
  • XLink
  • XPointer
  • XPath
  • XSL
  • ...

DTD - Example
• example DTD:
    <?xml version="1.0"?>
    <!DOCTYPE name-register [
      <!ELEMENT name-register (entry*) >
      <!ELEMENT entry (name, phone?, email?)+ >
      <!ELEMENT name (#PCDATA) >
      <!ELEMENT phone (#PCDATA) >
      <!ELEMENT email (#PCDATA) >
    ]>

XML Tree Structure
[Figure: tree with root <name-register>, three <entity> child nodes, and leaf nodes <name>, <phone>, <email> beneath them]
• Any XML document has a hierarchical structure. Each document can be visualized as a tree structure.

XML Example
• name-register
    <name-register>
      <entry><name>Albrecht Schmidt</name>
        <phone>690229</phone></entry>
      <entry><name>Bill Clinton</name>
        <email>bill@white_house.gov</email></entry>
      <entry><name>James Bond</name>
        <phone>+44 007</phone>
        <email>007@mi5.gov.uk</email>
      </entry>
    </name-register>
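The hierarchical structure of the name-register example can be made visible with a few lines of Python. The sketch below parses the example document with the standard library and prints one indented line per element; the function name `outline` is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET

# The name-register example document from the slide.
NAME_REGISTER = """<name-register>
  <entry><name>Albrecht Schmidt</name>
    <phone>690229</phone></entry>
  <entry><name>Bill Clinton</name>
    <email>bill@white_house.gov</email></entry>
  <entry><name>James Bond</name>
    <phone>+44 007</phone>
    <email>007@mi5.gov.uk</email>
  </entry>
</name-register>"""


def outline(element, depth=0):
    """Return one indented line per element, in document order."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(NAME_REGISTER)
tree_lines = outline(root)
print("\n".join(tree_lines))

# Navigating the tree: collect the text of every entry's name element.
names = [entry.findtext("name") for entry in root.findall("entry")]
print(names)
```

The indentation depth of each printed line corresponds to the level of the element in the document tree: the root, then the three entries, then their name, phone, and email leaves.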