ITEC 2336 – Internet Application Development
XML
What is XML?
XML stands for EXtensible Markup Language
A markup language designed to describe data
Tags are not predefined; you define your own tags
Uses a Document Type Definition (DTD) or an XML Schema to describe the tag system used to mark up the data
XML + (DTD or Schema) is self-describing
XML is a W3C Recommendation
The 10 Primary XML Design Goals
1. XML must be easily usable over the Internet
2. XML must support a wide variety of applications
3. XML must be compatible with SGML
4. It must be easy to write programs that process XML documents
5. The number of optional features in XML must be kept small
The 10 Primary XML Design Goals (continued)
6. XML documents should be clear and easily understood
7. The XML design should be prepared quickly
8. The design of XML must be exact and concise
9. XML documents must be easy to create
10. Keeping an XML document size small is of minimal importance
XML Parsers
An XML processor (also called an XML parser) evaluates the document to make sure it conforms to all XML specifications for structure and syntax.
There are two categories of XML documents:
  Well-formed
  Valid
XML Parsers (continued)
Microsoft's parser is called MSXML and is built into IE.
The .dll file for MSXML can be downloaded and used with other applications
See http://support.microsoft.com/kb/269238 to read all about the parser versions (from 1.0 to 6.0)
Mozilla used the eXpat XML parser in Firefox
eXpat is an XML parser library written in C: http://expat.sourceforge.net/
Well-Formed and Valid XML Documents
An XML document is well-formed if it contains no syntax errors and fulfills all of the specifications for XML code as defined by the W3C
An XML document is valid if it is well-formed and also satisfies the rules laid out in the DTD or schema attached to the document
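The well-formedness check can be sketched in a few lines. This is a minimal illustration that assumes Python's standard xml.etree.ElementTree module, which uses a non-validating parser: it reports syntax errors but does not check a document against a DTD or schema. The function name is_well_formed and the sample strings are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses without a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = "<note><to>Dave</to></note>"
bad = "<note><to>Dave</note>"  # <to> is never closed

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

Checking validity would need a validating parser in addition to this step.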
Valid XML Documents
http://www.w3schools.com/Schema/schema_intro.asp
Different types of schemas for XML
All are rules that define the elements, attributes, and structure of a particular markup language
When you point to a DTD or a schema document, those are the rules for that particular document's markup
Different types of schemas for XML (continued)
XML Document Type Definition (DTD)
W3C XML Schema
RELAX NG: http://relaxng.org/
  A simple schema language for XML based on [RELAX] and [TREX]
Schematron: http://www.schematron.com/overview.html
  Differs in basic concept from other schema languages in that it is not based on grammars but on finding tree patterns in the parsed document
The building blocks of XML documents
Seen from a DTD point of view, all XML documents (and XHTML documents) are made up of the following simple building blocks:
  Elements
  Tags
  Attributes
  Entities
  PCDATA
  CDATA
Document Type Definition (DTD)
An XML document can only have one Document Type Declaration
A DTD provides a list of the elements, tags, attributes and entity references contained in an XML document and describes their relationships to each other.
Inline Definition:
  <?xml version="1.0"?>
  <!DOCTYPE documentelement [definition]>
External Definition:
  <?xml version="1.0"?>
  <!DOCTYPE documentelement SYSTEM "documentelement.dtd">
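To see how a parser exposes an external Document Type Declaration, here is a small sketch that assumes Python's standard xml.dom.minidom module. The root element note and the file name note.dtd are hypothetical; the parser records the declaration but, in this setup, does not load or apply the DTD file.

```python
from xml.dom import minidom

# Hypothetical document: root element "note", external DTD "note.dtd".
external = b'<?xml version="1.0"?><!DOCTYPE note SYSTEM "note.dtd"><note/>'

doc = minidom.parseString(external)

doctype_name = doc.doctype.name           # name given after <!DOCTYPE
doctype_system_id = doc.doctype.systemId  # the SYSTEM identifier
root_name = doc.documentElement.tagName   # the document element

print(doctype_name, doctype_system_id, root_name)
```

The name in the declaration matches the document element, which is what the documentelement placeholder in the syntax stands for.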
XML – Elements
Elements are the main building blocks of both XML and XHTML documents.
XML elements can contain text, other elements, or be empty
For example, elements could be "note" and "message"
XML - Elements and Attributes
Element names are case sensitive
Elements can be nested, as follows:
  <CD>Kind of Blue
    <TRACK>So What (:22)</TRACK>
    <TRACK>Blue in Green (5:37)</TRACK>
  </CD>
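The nesting and case-sensitivity rules can be tried out with a short sketch. It assumes Python's standard xml.etree.ElementTree module and reuses the CD fragment from the slide; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

cd_xml = """<CD>Kind of Blue
  <TRACK>So What (:22)</TRACK>
  <TRACK>Blue in Green (5:37)</TRACK>
</CD>"""

cd = ET.fromstring(cd_xml)

# Text that appears directly inside <CD>, before the first child element.
title = cd.text.strip()

# Child elements nested inside <CD>.
tracks = [track.text for track in cd.findall("TRACK")]

# Element names are case sensitive, so "track" does not match "TRACK".
lowercase_matches = cd.findall("track")

print(title)
print(tracks)
print(len(lowercase_matches))
```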
Elements and Attributes: Adding elements to the Jazz.XML File
[Figure: an XML document, with the prolog and the document elements labeled]
XML - Tags
Tags are used to mark up elements.
A starting tag like <element_name> marks the beginning of an element, and an ending tag like </element_name> marks the end of an element.
Examples:
  body element marked up with body tags:
    <body>some body text in between</body>
  message element marked up with message tags:
    <message>some message in between</message>
XML - Entities
Variables used to define common text
Entity references provide access to character entities that are used for part of the syntax
Entities are expanded when a document is parsed by an XML parser
XHTML entity reference: "&nbsp;", the "non-breaking-space" entity, is used to insert an extra space in a document
The following entities are predefined in XML:
  Entity Reference    Character
  &lt;                <
  &gt;                >
  &amp;               &
  &quot;              "
  &apos;              '
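Entity expansion can be observed directly. The sketch below assumes Python's standard xml.etree.ElementTree module and the escape helper from xml.sax.saxutils; the sample text is invented. It also shows that &nbsp; comes from XHTML and is not one of the five entities predefined in XML, so a plain XML parser rejects it when no DTD declares it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The five predefined entity references are expanded during parsing.
elem = ET.fromstring(
    "<text>a &lt; b &amp;&amp; b &gt; 0 &quot;ok&quot; &apos;ok&apos;</text>"
)
expanded = elem.text

# Going the other way: escape &, < and > before placing text in a document.
escaped = escape("a < b && b > 0")

# &nbsp; is not predefined in XML; without a DTD it is an undefined entity.
try:
    ET.fromstring("<p>&nbsp;</p>")
    nbsp_accepted = True
except ET.ParseError:
    nbsp_accepted = False

print(expanded)
print(escaped)
print(nbsp_accepted)
```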
XML - PCDATA
PCDATA is text that will be parsed by a parser.
Tags inside the text will be treated as markup and entities will be expanded
  <?xml version="1.0"?>
  <!DOCTYPE message [
    <!ELEMENT message (to,from,subject,text)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT subject (#PCDATA)>
    <!ELEMENT text (#PCDATA)>
  ]>
  <message>
    <to>Dave</to>
    <from>Susan</from>
    <subject>Reminder</subject>
    <text>Don't forget to buy milk on the way home.</text>
  </message>
#PCDATA is character data that must be parsed and expanded.
If a #PCDATA section contains elements, those elements must also be declared
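The message document can be read back as a quick check. This sketch assumes Python's standard xml.etree.ElementTree module, whose parser accepts the inline DTD but does not validate against it; the dictionary name fields is illustrative.

```python
import xml.etree.ElementTree as ET

message_xml = """<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (to,from,subject,text)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT text (#PCDATA)>
]>
<message>
  <to>Dave</to>
  <from>Susan</from>
  <subject>Reminder</subject>
  <text>Don't forget to buy milk on the way home.</text>
</message>"""

root = ET.fromstring(message_xml)

# Each child holds parsed character data (#PCDATA), available as .text.
fields = {child.tag: child.text for child in root}

print(root.tag)
print(fields["to"], fields["from"], fields["subject"])
```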
XML - CDATA
Everything inside a CDATA section is ignored by the parser: it is treated as plain character data, not as markup
  <script>
  <![CDATA[
  // Returns 1 when a is less than both b and 0, otherwise 0
  function match(a, b) {
    if (a < b && a < 0) {
      return 1;
    } else {
      return 0;
    }
  }
  ]]>
  </script>
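The effect of a CDATA section can be shown by parsing the same text with and without it. This is a sketch assuming Python's standard xml.etree.ElementTree module; the one-line script body is a shortened stand-in for the function on the slide.

```python
import xml.etree.ElementTree as ET

with_cdata = """<script><![CDATA[
if (a < b && a < 0) { return 1; }
]]></script>"""

# Inside CDATA, < and & are passed through as ordinary characters.
code = ET.fromstring(with_cdata).text

# Without CDATA, the same characters are read as markup and the parse fails.
without_cdata = "<script>if (a < b && a < 0) { return 1; }</script>"
try:
    ET.fromstring(without_cdata)
    parsed_without_cdata = True
except ET.ParseError:
    parsed_without_cdata = False

print(code.strip())
print(parsed_without_cdata)
```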
CDATA Sections
[Figure: the CDATA section in the Jazz.XML file]
The Document Creation Process
[Figure: the document creation process]
Structure of an XML Document
XML documents consist of three parts:
  The prolog
  The document body
  The epilog
The prolog is optional and provides information about the document itself
The document body contains the document's content in a hierarchical tree structure.
The epilog is also optional and contains any final comments or processing instructions.
The Structure of an XML Document: Creating the Prolog
The prolog consists of four parts in the following order:
  XML declaration
  Miscellaneous statements or comments
  Document type declaration
  Miscellaneous statements or comments
This order has to be followed or the parser will generate an error message.
None of these four parts is required, but it is good form to include them.
The Structure of an XML Document: The XML Declaration
The XML declaration is always the first line of code in an XML document. It tells the processor that what follows is written using XML. It can also provide information about how the parser should interpret the code.
The complete syntax is:
  <?xml version="version number" encoding="encoding type" standalone="yes | no" ?>
A sample declaration might look like this:
  <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
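The declaration's three attributes can be read back after parsing. This sketch assumes Python's standard xml.dom.minidom and xml.etree.ElementTree modules; the root element jazz is a hypothetical name. It also checks the rule that the declaration must come first: even one leading space before it is a syntax error.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

declared = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><jazz/>'

doc = minidom.parseString(declared)

# minidom records what the XML declaration stated.
info = (doc.version, doc.encoding, doc.standalone)

# The declaration must be the very first thing in the document.
try:
    ET.fromstring(b' <?xml version="1.0"?><jazz/>')
    leading_space_ok = True
except ET.ParseError:
    leading_space_ok = False

print(info)
print(leading_space_ok)
```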
Linking to a Style Sheet
A style sheet is linked to an XML document to format the document.
An XML processor combines the style sheet with the XML document to display a formatted document
There are two main style sheet languages used with XML:
  Cascading Style Sheets (CSS)
  Extensible Stylesheet Language (XSL)
HTML AND XML
XHTML tags can be added to an XML document
Add the namespace to the XML document: http://www.w3.org/TR/REC-html40
An XML processor recognizes any tag associated with this namespace as an HTML tag, and a browser treats those tags as if they came from an HTML file
Mixing HTML and XML allows you to place an inline image into an XML document or to create hypertext links
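How a processor ties a tag to the namespace can be sketched as follows, assuming Python's standard xml.etree.ElementTree module. The html prefix, the TITLE element, and the example.com address are hypothetical; only the namespace URI comes from the slide.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/TR/REC-html40"

# Hypothetical document: the html prefix is bound to the namespace URI.
doc_xml = f"""<CD xmlns:html="{HTML_NS}">
  <TITLE>Kind of Blue</TITLE>
  <html:a href="http://example.com/kind-of-blue">More information</html:a>
</CD>"""

root = ET.fromstring(doc_xml)

# ElementTree identifies namespaced tags as {namespace-uri}local-name.
link = root.find(f"{{{HTML_NS}}}a")

print(link.tag)
print(link.get("href"))
print(link.text)
```

The prefix itself is arbitrary; the parser matches on the namespace URI, which is why the lookup uses the URI rather than the text html:a.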
MIXING XHTML AND XML
[Figure: hyperlink example]