XML Technologies and Applications Jukka Teuhola University of Turku Dept. of Information Technology Computer Science Spring 2013 XML-1 J. Teuhola 2013 1
1. General
• Extent: 5 study points
• Level: Advanced (syventävä)
• Form: Self-study course
• Components: Written material, exercise project, exam
• Starting lecture (2 h): Tue 15.1.2013 at 8:15-10 in Beta
• Exercise project:
  – See instructions at: http://staff.cs.utu.fi/kurssit/XML_technologies_and_applications/spring_2013/project/
  – Must be finished before the examination
• Exam dates: March 11th, 2013; two others to be announced
• Recommended prior knowledge:
  – HTML
  – Programming in Java
  – Principles of databases
Course material
• PowerPoint slides: http://staff.cs.utu.fi/kurssit/XML_technologies_and_applications/spring_2013/slides
The slides are in principle sufficient for passing the course, but for a more detailed presentation an XML textbook can be useful, such as:
  – Elliotte Rusty Harold, W. Scott Means: "XML in a Nutshell", O'Reilly, 2nd ed. 2002
  – Neil Bradley: "The XML Companion", Addison-Wesley, 2002
  – Anders Møller, Michael Schwartzbach: "An Introduction to XML and Web Technologies", Addison-Wesley, 2006
  – P. J. Deitel, H. M. Deitel: "Internet and World Wide Web: How to Program", Prentice Hall, 2008
  – Ossi Nykänen: "XML", Docendo, 2001 (in Finnish)
Useful links: mainly recommendations by the WWW Consortium (W3C XML)
• XML 1.0, XML 1.1
• Namespaces in XML
• XML Schema
• Extensible Stylesheet Language: XSL 1.0, XSL 1.1
• XSL Transformations: XSLT 1.0, XSLT 2.0
• XML Path Language: XPath 1.0, XPath 2.0, XPath 3.0
• XML Linking
• Cascading Style Sheets (CSS)
• Document Object Model (DOM)
• Simple API for XML (SAX)
• XML Query (XQuery)
• HTML5
Other useful web sources
• Tutorials: W3Schools, Møller & Schwartzbach, OASIS Cover Pages
• Frequently Asked Questions: The XML FAQ
• Java & XML: Oracle's pages
• Software: Apache XML
• Tools: XMLSpy (free trial), Cooktop (free), XMLFox (free), and many others
• XML News and Resources: Cafe con Leche
Contents
1. General
2. XML syntax
3. Defining the document structure: DTD
4. Designing the document structure
5. Namespaces
6. Defining the document structure: XML schema
7. Character sets
8. Transformations (XSLT)
9. Selecting parts: XPath
10. XML links and pointers
11. Formatting documents: CSS and XSL-FO
12. Application programming interfaces (APIs) for XML
13. XML databases and querying
14. Application areas
What is XML?
• "Extensible Markup Language"
• A generalized way of representing the structure of documents
• Actually a meta-language: a formalism enabling the definition of application-specific markup
• A simplification of SGML (Standard Generalized Markup Language, ISO 1986)
• Developed by the World Wide Web Consortium (W3C, recommendation 1998)
• An open standard, independent of vendors and operating systems
What is XML? (Cont.)
• XML was originally developed to enhance HTML
• It soon turned out to have much wider use in document processing
• Numerous application areas
• XML has been extended by several attached technologies
• What is markup? Tags and other additional descriptions of structure, content, layout, etc.
• XML is based on a strict grammar of tags
• Different tag sets for different applications
What is XML NOT?
Misconceptions corrected:
• XML is not a programming language (but could be used to mark up program text)
• XML is not a transport protocol (but is commonly used to mark up documents transferred in computer networks)
• XML is not a database structure (but XML documents may be stored in databases and queried there)
What is an XML document?
• Contains only text, not binary data
• Roughly divided into tags and character data
• Rules for well-formedness, e.g. start and end tags must match
• Tags can be chosen freely, but the XML application program must know the tags
• Nesting of tag pairs defines hierarchical documents (tree structures)
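The tree structure implied by nested tag pairs can be made visible with a short sketch. It uses Python's standard `xml.etree.ElementTree` module; the sample document and the helper name `outline` are invented for this illustration and are not part of the course material.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = "<book><title>XML</title><chapter><section>Syntax</section></chapter></book>"

def outline(element, depth=0):
    """Return (depth, tag) pairs in document order (pre-order traversal)."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows

root = ET.fromstring(DOC)          # raises ParseError if DOC is not well-formed
tree_rows = outline(root)
for depth, tag in tree_rows:
    print("  " * depth + tag)      # indentation mirrors the nesting depth
```

Each level of indentation in the printed outline corresponds to one level of tag nesting in the document.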
Kinds of XML documents
• Narrative documents:
  – Long text paragraphs
  – 'Semi-structured data': flexible component structure
  – E.g. books, articles, web pages, mail, etc.
• Data-oriented documents:
  – Shorter data units
  – More uniform structuring
  – Resemble formatted databases (though in a textual representation)
• XML was originally planned for narrative documents.
Example document (narrative)

    <?xml version="1.0"?>
    <article>
      <title>Adaptive Text Compression</title>
      <author>John W. Smith</author>
      <text>Due to correlations between subsequent characters in natural
      language texts, it is possible to predict the next character on the
      basis of predecessors, which enables efficient compression. An adaptive
      compression method learns the correlations gradually, so that the
      properties of the text already processed are utilized when making
      predictions of the followers.
      </text>
    </article>
Example document (data-oriented)

    <?xml version="1.0"?>
    <course name="Advanced databases">
      <teacher>Jukka</teacher>
      <semester>Spring 2013</semester>
      <audience>
        <student>Pekka</student>
        <student>Pirkko</student>
      </audience>
    </course>
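A data-oriented document like the one above maps naturally to a record in a program. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the variable names and the dictionary layout are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# The course document from the slide, with straight quotes around the attribute value.
COURSE_XML = """<?xml version="1.0"?>
<course name="Advanced databases">
  <teacher>Jukka</teacher>
  <semester>Spring 2013</semester>
  <audience>
    <student>Pekka</student>
    <student>Pirkko</student>
  </audience>
</course>"""

course = ET.fromstring(COURSE_XML)

# Attributes are read with get(), element content with findtext()/findall().
record = {
    "name": course.get("name"),
    "teacher": course.findtext("teacher"),
    "semester": course.findtext("semester"),
    "students": [s.text for s in course.findall("audience/student")],
}
print(record)
```

Note that the parser rejects the document if the attribute value is opened with a typographic quote instead of a straight one, since attribute values must be delimited by matching `"` or `'` characters.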
XML goals
• Interoperability among users in the same field
• Portability of data objects between applications
• Flexibility in transforming one XML representation to another
• Customizability of the tag sets
• The markup should also convey the semantics of documents
• The markup should not define how the document is displayed (although separate tools are defined for that purpose)
Comparison: HTML

    <html>
      <head><title>Advanced databases</title></head>
      <body>
        <h1>Advanced databases</h1>
        <p>Teacher: Jukka</p>
        <p>Semester: Fall 2011</p>
        <p>Students:
        <ul>
          <li>Pekka</li>
          <li>Pirkko</li>
        </ul>
      </body>
    </html>
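The HTML version fixes the layout (headings, paragraphs, list) but says nothing about what the pieces mean, whereas the XML version names the data and leaves display open. As a rough sketch of that separation, the snippet below generates an HTML page from a course document. The function name `course_to_html` and the chosen layout are assumptions for illustration only; the course itself covers XSLT as the standard tool for such transformations.

```python
import xml.etree.ElementTree as ET
from html import escape

COURSE_XML = """<course name="Advanced databases">
  <teacher>Jukka</teacher>
  <semester>Spring 2013</semester>
  <audience>
    <student>Pekka</student>
    <student>Pirkko</student>
  </audience>
</course>"""

def course_to_html(xml_text):
    """Render course data as a small HTML page; the layout is decided here, not in the XML."""
    course = ET.fromstring(xml_text)
    name = escape(course.get("name", ""))
    teacher = escape(course.findtext("teacher", default=""))
    semester = escape(course.findtext("semester", default=""))
    students = "".join(
        f"<li>{escape(s.text or '')}</li>"
        for s in course.findall("audience/student")
    )
    return (
        f"<html><head><title>{name}</title></head><body>"
        f"<h1>{name}</h1>"
        f"<p>Teacher: {teacher}</p>"
        f"<p>Semester: {semester}</p>"
        f"<p>Students:</p><ul>{students}</ul>"
        "</body></html>"
    )

page = course_to_html(COURSE_XML)
print(page)
```

A different function could render the same XML as plain text or as a table, without touching the data.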
Writing XML applications
Alternatives:
• Self-made programs, e.g. in Java, C++, Python, Perl, etc., using ready-made (often free) libraries for auxiliary tasks
• Off-the-shelf software:
  – General: editors, validators, transformers (note: any text editor can be used for editing)
  – Application-specific (for specialized tag sets)
Parsing and validation
• No fixed tag set, yet a strict syntax: well-formedness must be verified. This is usually done by an XML parser.
• Application-specific markup is defined in a schema; the document is valid if it matches the schema, otherwise invalid. Two ways of defining the schema (by W3C):
  – Document Type Definition (DTD)
  – XML Schema Language
• Not all constraints can be specified in a declarative language; the rest must be handled in application software.
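The two levels of checking can be sketched as follows. The first function relies on the parser to verify well-formedness; the second shows a constraint handled in application code. Python's standard `xml.etree.ElementTree` parser does not validate against a DTD or an XML Schema, so schema validation is not shown here. The rule in `check_course` (a course needs a name and at least one student) is a hypothetical application rule invented for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def check_course(text):
    """Hypothetical application-level check; returns a list of problems found."""
    root = ET.fromstring(text)
    problems = []
    if not root.get("name"):
        problems.append("course has no name attribute")
    if not root.findall("audience/student"):
        problems.append("course has no students")
    return problems

print(is_well_formed("<a><b/></a>"))   # matching tags
print(is_well_formed("<a><b></a>"))    # <b> is never closed
print(check_course("<course/>"))       # well-formed, but breaks both rules
```

A document can thus be well-formed and still be rejected by the application, which is the situation the last bullet describes.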
Development of XML
1. Starting point: SGML (1986); a very complicated, long specification with many special cases; no widespread use
2. XML 1.0 (1998): simplified from SGML, an immediate success
3. Namespaces (1999): extended generality and cross-application usage
4. Transformations (XSLT, 1999) were defined for portability, and formatting objects (XSL-FO, 2001) for output
5. Addressing elements of documents: XPath (1999)
Development of XML (cont.)
6. Definition of Application Programming Interfaces:
  – DOM (2000): Document Object Model, objects within objects (OO view)
  – SAX (2000): Simple API for XML (sequential processing, developed outside W3C)
7. Pointers for hypertext (2001): XLink (between documents) and XPointer (within documents)
8. XML Schema (2001): complex, not very successful; external parties developed their own schema languages
9. Later extensions: XQuery, XInclude, RDF, Signatures, ...
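The difference between the two API styles is easiest to see side by side. The sketch below reads the same small document twice, once through a DOM tree and once through SAX callbacks, using Python's standard `xml.dom.minidom` and `xml.sax` modules. The sample document and the class name `StudentHandler` are assumptions made for this illustration.

```python
import xml.dom.minidom
import xml.sax

DOC = "<audience><student>Pekka</student><student>Pirkko</student></audience>"

# DOM: the whole document is loaded into a tree of objects that can be navigated freely.
dom = xml.dom.minidom.parseString(DOC)
dom_names = [node.firstChild.data for node in dom.getElementsByTagName("student")]

# SAX: the parser reports events in document order; the handler keeps its own state.
class StudentHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []
        self._buffer = None            # collects text while inside <student>

    def startElement(self, name, attrs):
        if name == "student":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "student":
            self.names.append("".join(self._buffer))
            self._buffer = None

handler = StudentHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_names = handler.names

print(dom_names, sax_names)
```

DOM is convenient when the program needs random access to the document; SAX avoids building the tree, which matters for documents too large to hold in memory.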
Example application domains for XML
• MathML: Mathematical Markup Language
• SVG: Scalable Vector Graphics
• MusicXML: Music notation format
• SMIL: Synchronized Multimedia Integration Language
• CML: Chemical Markup Language
• X3D: Extensible 3D, successor of VRML (Virtual Reality Modeling Language)
• GML: Geography Markup Language
• Office Open XML: zipped XML, e.g. docx (Word 2007)
• Web Services: Simple Object Access Protocol (SOAP), Web Services Description Language (WSDL)