bryan-hunter
XML
Extensible Markup Language
What is XML?
o XML stands for EXtensible Markup Language
o XML is a markup language much like HTML
o XML was designed to describe data
o XML tags are not predefined. You must define your own tags.
o XML uses a Document Type Definition (DTD) or XML Schema to describe the data
o XML with a DTD or XML Schema is designed to be self-descriptive
XML Vs HTML
XML was designed to carry data.
XML is not a replacement for HTML.
XML and HTML were designed with different goals:
XML was designed to describe data and to focus on what data is.
HTML was designed to display data and to focus on how data looks.
HTML is about displaying information; XML is about describing information.
XML was designed NOT to do anything
XML was created to structure, store, and send information.
Example of a note to Tove from Jani, stored as XML:
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
The note has a header and a message body. It also has sender and receiver information. But still, this XML document does not do anything. It is just pure information wrapped in XML tags. Someone must write a piece of software to send, receive, or display it.
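As a rough illustration of the "piece of software" mentioned above, here is a minimal Python sketch that reads the note and displays it. It assumes Python's standard xml.etree.ElementTree module; the function name format_note and the output layout are hypothetical choices made for this example, not part of the original slides.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
"""

def format_note(xml_text: str) -> str:
    """Turn the note document into a short, human-readable message."""
    note = ET.fromstring(xml_text)  # parse the text into an element tree
    return (
        f"To: {note.findtext('to')}\n"
        f"From: {note.findtext('from')}\n"
        f"Subject: {note.findtext('heading')}\n\n"
        f"{note.findtext('body')}"
    )

print(format_note(NOTE_XML))
```

The XML itself stays passive: all of the display behaviour lives in the program that reads it.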
XML is free and extensible
XML tags are not predefined. You must invent your own tags.
The tags used to mark up HTML documents and the structure of HTML documents are predefined. The author of an HTML document can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.).
XML allows the author to define their own tags and their own document structure.
The tags like <to> and <from> in the example above are not defined in any XML standard. These tags are invented by the author of the XML document.
XML is a complement to HTML
XML is not a replacement for HTML.
In future Web development, it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data.
XML is a cross-platform, software- and hardware-independent tool for transmitting information.
XML is expected to be as important to the future of the Web as HTML has been to the foundation of the Web, and to be the most common tool for data manipulation and data transmission.
XML was not designed to display data.
It is important to understand that XML was designed to store, carry, and exchange data.
XML Features
XML is used to Exchange Data
With XML, data can be exchanged between incompatible systems.
XML can be used to Share Data
With XML, plain text files can be used to share data.
XML can be used to Store Data
With XML, plain text files can be used to store data.
XML can make your Data more Useful
With XML, your data is available to more users.
XML can be used to Create new Languages
XML is the mother of WAP and WML.
The Wireless Markup Language (WML), used to mark up Internet applications for handheld devices like mobile phones, is written in XML.
An example XML document
XML documents use a self-describing and simple syntax.
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
The first line in the document, <note>, describes the root element of the document (like: “this document is a note”).
An example XML document (Cont.)
The next 4 lines describe 4 child elements of the root (to, from, heading, and body):
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don’t forget me this weekend!</body>
And finally the last line defines the end of the root element: </note>.
From the example, don’t you agree that XML is pretty self-descriptive?
Tags
All XML elements must have a closing tag
With XML it is illegal to omit the closing tag. For example:
<p>this is a paragraph</p>
<p>this is another paragraph</p>
XML Tags are case sensitive
Unlike HTML, XML tags are case sensitive.
The tag <Letter> is different from the tag <letter>:
<Message>This is incorrect</message>
<Message>This is correct</Message>
Properly Nested
All XML elements must be properly nested
<x><y>This is not properly nested</x></y>
<x><y>This is properly nested</y></x>
Improperly nested tags make no sense to XML.
All XML documents must have a root tag
The first tag in an XML document is the root tag.
All XML documents must contain a single tag pair to define the root element.
All other elements must be nested within the root element.
Properly Nested (Cont.)
All elements can have sub elements (children).
Sub elements must be correctly nested within their parent element:
<root>
  <child>
    <subchild>......</subchild>
  </child>
</root>
Attribute values must always be quoted
With XML, it is illegal to omit quotation marks around attribute values.
For example:
<note date=12/11/02>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
This is incorrect!
The date attribute in the note element is not quoted.
Attribute values must always be quoted (Cont.)
<note date="12/11/02">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
This is correct!
This is correct: date="12/11/02"
This is incorrect: date=12/11/02
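The syntax rules above (closing tags, case sensitivity, proper nesting, a single root, quoted attribute values) are enforced by any XML parser. The following is a small sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample labels are made up for this illustration, and the "two root elements" sample is an extra case not taken from the slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Each sample breaks one of the rules, except the last one.
samples = {
    "missing closing tag": "<p>this is a paragraph",
    "case mismatch": "<Message>This is incorrect</message>",
    "bad nesting": "<x><y>This is not properly nested</x></y>",
    "two root elements": "<a></a><b></b>",
    "unquoted attribute": "<note date=12/11/02></note>",
    "correct": '<note date="12/11/02"><to>Tove</to></note>',
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the sample labelled "correct" should be accepted; the parser rejects the others with a ParseError.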
Comments in XML
The syntax for writing comments in XML is similar to that of HTML:
<!-- this is a comment -->
XML Elements are Extensible
XML documents can be extended to carry more information.
E.g.
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
Imagine that an application extracts the <to>, <from>, and <body> elements from this document, and that later on the author decides to add some extra information to it:
XML Elements are Extensible (Cont.)
<note>
  <date>2002-10-12</date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
Should the application break or crash?
No. The application should still be able to find the <to>, <from>, and <body> elements in the XML document and produce the same output.
XML documents are Extensible!
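A minimal sketch of why the application keeps working, assuming it is written in Python with xml.etree.ElementTree and looks elements up by name rather than by position. The function name summarize is hypothetical.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

EXTENDED = """<note>
  <date>2002-10-12</date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

def summarize(xml_text):
    """Extract the fields the application cares about."""
    note = ET.fromstring(xml_text)
    # Elements are found by name, so the extra <date> element is simply ignored.
    return tuple(note.findtext(name) for name in ("to", "from", "body"))

print(summarize(ORIGINAL) == summarize(EXTENDED))
```

An application that instead relied on element positions (for example "the first child is the recipient") would break when <date> is added, which is one reason to prefer lookups by name.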
XML Elements have Relationships
Elements are related as parents and children.
To understand XML terminology, you have to know how relationships between XML elements are named, and how element content is described.
XML Elements have Relationships (Cont.)
For instance, this is the description of a book:
Book title: My First XML
Chapter 1: Introduction to XML
  What is HTML
  What is XML
Chapter 2: XML Syntax
  Elements must have a closing tag
  Elements must be properly nested
XML Elements have Relationships (Cont.)
Then, this is the XML document that describes the book:
<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>
Explanation
• Book is the root element
• Title, prod, and chapter are child elements of book
• Book is the parent element of title, prod, and chapter
• Title, prod, and chapter are siblings (or sister elements) because they have the same parent.
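The parent, child, and sibling relationships described above can be inspected programmatically. This is a sketch assuming Python's xml.etree.ElementTree; the variable names (children, parent_of) are illustrative only.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>"""

book = ET.fromstring(BOOK_XML)  # the root element

# Iterating over an element yields its direct children, which are siblings of each other.
children = [child.tag for child in book]
print(book.tag, children)

# ElementTree elements do not store a link to their parent,
# so build a child -> parent map by walking the whole tree.
parent_of = {child: parent for parent in book.iter() for child in parent}
title = book.find("title")
print(parent_of[title].tag)
```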
Elements have contents
Elements can have different content types
An XML element is everything from (and including) the element’s start tag to (and including) the element’s end tag.
An element can have element content, mixed content, simple content, or empty content. An element can also have attributes.
In the previous example, book has element content, because it contains other elements. Chapter has mixed content because it contains both text and other elements. Para has simple content (or text content) because it contains only text. Prod has empty content, because there is nothing between its start tag and its end tag.
Elements have contents (Cont.)
In that example, only the prod element has attributes. The attribute named id has the value “33-657”. The attribute named media has the value “paper”.
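The four content types and the prod attributes can be checked with a short script. This is a sketch under two assumptions: Python's xml.etree.ElementTree is used, and whitespace-only text (indentation between tags) is not counted as text content. The function name content_type is hypothetical.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
</book>"""

def content_type(element):
    """Classify an element as 'element', 'mixed', 'simple', or 'empty' content."""
    has_children = len(element) > 0
    # Text directly inside the element: before the first child, or after any child.
    texts = [element.text] + [child.tail for child in element]
    has_text = any(t and t.strip() for t in texts)
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "simple"
    return "empty"

book = ET.fromstring(BOOK_XML)
for element in (book, book.find("chapter"), book.find("chapter/para"), book.find("prod")):
    print(element.tag, content_type(element))

# Attributes are exposed as a dictionary of name -> value.
print(book.find("prod").attrib)
```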
Element Naming
XML elements must follow these naming rules:
Names can contain letters, numbers, and other characters.
Names must not start with a number or punctuation character (a leading underscore is allowed).
Names must not start with the letters xml (or XML or Xml …).
Any name can be used and no words are reserved, but the idea is to make names descriptive. Names with an underscore separator are nice.
Examples: <first_name>, <last_name>.
XML Attributes
XML elements can have attributes.
Attributes are used to provide additional information about the elements.
Attributes often provide information that is not a part of the data.
For example:
<file type="gif">computer.gif</file>
In the example, the file type is irrelevant to the data, but important to the software that wants to manipulate the element.
Quote Styles
Attribute values must always be enclosed in quotes, but either double or single quotes can be used.
E.g.
<person sex="female">
or
<person sex='female'>
Double quotes are the most common, but sometimes (if the attribute value itself contains double quotes) it is necessary to use single quotes, or to escape the inner quotes as &quot;.
<gangster name='George "Shotgun" Ziegler'>
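A quick check that the quote style does not change the attribute value, assuming Python's xml.etree.ElementTree. The tags are written as self-closing elements here only so that each snippet is a complete document; the &quot; variant is an addition for comparison and does not appear in the slides.

```python
import xml.etree.ElementTree as ET

# The same attribute written with double and with single quotes.
double = ET.fromstring('<person sex="female"/>')
single = ET.fromstring("<person sex='female'/>")
print(double.get("sex") == single.get("sex"))

# A value that itself contains double quotes: wrap it in single quotes...
gangster = ET.fromstring("""<gangster name='George "Shotgun" Ziegler'/>""")
# ...or keep double quotes and escape the inner ones with &quot;.
escaped = ET.fromstring('<gangster name="George &quot;Shotgun&quot; Ziegler"/>')
print(gangster.get("name"))
print(gangster.get("name") == escaped.get("name"))
```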
Elements Vs Attributes
E.g.1:
<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
E.g.2:
<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
In the first example, sex is an attribute. In the second, sex is a child element. Both examples provide the same information.
There are no rules about when to use attributes and when to use elements.
Should we avoid using attributes?
Some problems with using attributes are:
- Attributes can’t contain multiple values (child elements can)
- Attributes are not easily extendable (for future changes)
- Attributes can’t describe structure (child elements can)
- Attributes are more difficult to manipulate by program code
- Attribute values are not easy to test against a DTD
Use attributes only to provide information that is not relevant to the data! Don’t end up like this:
<note day="12" month="8" year="02"
      to="Tove" from="Jani" heading="Reminder"
      body="Don’t forget me this weekend!">
</note>
Metadata
Metadata (or data about data) should be stored as attributes, and the data itself should be stored as elements.
For example:
<messages>
  <note id="p501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don’t forget me this weekend!</body>
  </note>
  <note id="p502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not!</body>
  </note>
</messages>
Metadata (Cont.)
The ID in the previous example is just a counter, or a unique identifier, to identify the different notes in the XML file, and not a part of the note data.
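One way an application might use the id metadata is as a lookup key, while reading the actual note data from the child elements. This is a sketch assuming Python's xml.etree.ElementTree; the dictionary name notes_by_id is hypothetical.

```python
import xml.etree.ElementTree as ET

MESSAGES_XML = """<messages>
  <note id="p501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
  <note id="p502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not!</body>
  </note>
</messages>"""

messages = ET.fromstring(MESSAGES_XML)

# Index the notes by their id attribute (the metadata)...
notes_by_id = {note.get("id"): note for note in messages.findall("note")}

# ...and read the data itself from the child elements.
reply = notes_by_id["p502"]
print(reply.findtext("from"), "->", reply.findtext("to"), ":", reply.findtext("body"))
```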
XML DTD
A DTD defines the legal elements of an XML document.
The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements.
A DTD can be declared inline in your XML document, or as an external reference.
Internal DOCTYPE Declaration
If the DTD is included in your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
Example XML document with a DTD:
<!DOCTYPE note [
  <!ELEMENT note (to, from, heading, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend</body>
</note>
DTD Interpretation
!DOCTYPE note defines that this is a document of the type note.
!ELEMENT note defines the note element as having four child elements: “to, from, heading, body”.
!ELEMENT to defines the to element to be of type “#PCDATA”.
!ELEMENT from defines the from element to be of type “#PCDATA”.
And so on . . .
External DOCTYPE declaration
If the DTD is external to your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element SYSTEM "filename">
E.g.:
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
“note.dtd”
And then the file “note.dtd” will look like this:
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Why use a DTD?
With a DTD, each of your XML files can carry a description of its own format with it.
With a DTD, independent groups of people can agree to use a common DTD for interchanging data.
Your application can use a standard DTD to verify that the data you receive from the outside world is valid.
You can also use a DTD to verify your own data.
DTD-XML building blocks
The building blocks of XML documents
Seen from a DTD point of view, all XML documents are made up of the following simple building blocks:
- Elements
- Tags
- Attributes
- Entities
- PCDATA
- CDATA
DTD-Elements
In a DTD, XML elements are declared with a DTD element declaration.
Declaring an element
An element declaration has the following syntax:
<!ELEMENT element-name category>
OR
<!ELEMENT element-name (element-content)>
Empty elements
Empty elements are declared with the category keyword EMPTY:
<!ELEMENT element-name EMPTY>
example: <!ELEMENT br EMPTY>
in XML: <br />
DTD-Elements (Cont.)
Elements with only character data
Elements with only character data are declared with #PCDATA inside parentheses:
<!ELEMENT element-name (#PCDATA)>
example: <!ELEMENT from (#PCDATA)>
Elements with any contents
Elements declared with the category keyword ANY can contain any combination of parsable data:
<!ELEMENT element-name ANY>
example: <!ELEMENT note ANY>
DTD-Elements (Cont.)
Elements with children (sequences)
Elements with one or more children are defined with the names of the child elements inside parentheses:
<!ELEMENT element-name (child-element-name, child-element-name, …)>
example: <!ELEMENT note (to, from, heading, body)>
When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. In a full declaration, the children must also be declared, and the children can also have children.
DTD-Elements (Cont.)
The full declaration of the “note” element will be:
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Declaring at least one occurrence of the same element
<!ELEMENT element-name (child-name+)>
example: <!ELEMENT note (message+)>
The + sign in the example declares that the child element message must occur one or more times inside the “note” element.
DTD-Elements (Cont.)
Declaring zero or more occurrences of the same element
<!ELEMENT element-name (child-name*)>
example: <!ELEMENT note (message*)>
The * sign in the example above declares that the child element message can occur zero or more times inside the “note” element.
DTD-Elements (Cont.)
Declaring zero or one occurrences of the same element
<!ELEMENT element-name (child-name?)>
example: <!ELEMENT note (message?)>
The ? sign in the example above declares that the child element message can occur zero or one times inside the “note” element.
DTD-Elements (Cont.)
Declaring either/or content
Example: <!ELEMENT note (to, from, header, (message|body))>
The example declares that the “note” element must contain a “to” element, a “from” element, a “header” element, and either a “message” or a “body” element.
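The sequence, +, *, ?, and | operators above behave much like regular-expression operators applied to the list of child element names. The sketch below makes that analogy concrete in Python. It is not a DTD validator: it handles only children-only content models (no #PCDATA, EMPTY, or ANY), and the function names model_to_regex and children_match are hypothetical helpers invented for this illustration.

```python
import re

# Simplified (ASCII-only) pattern for an element name inside a content model.
NAME = re.compile(r"[A-Za-z_][\w.-]*")

def model_to_regex(model):
    """Translate a children-only content model into a compiled regex.

    Each element name becomes a group that matches the name followed by ';'.
    The operators + * ? | and the parentheses keep their regex meaning.
    """
    pattern = NAME.sub(lambda m: f"(?:{re.escape(m.group(0))};)", model)
    # Commas only express "followed by", which is plain concatenation in a regex.
    pattern = pattern.replace(",", "").replace(" ", "")
    return re.compile(pattern)

def children_match(model, child_names):
    """Check a list of child element names against a content model."""
    encoded = "".join(name + ";" for name in child_names)
    return model_to_regex(model).fullmatch(encoded) is not None

print(children_match("(to, from, heading, body)", ["to", "from", "heading", "body"]))
print(children_match("(to, from, heading, body)", ["from", "to", "heading", "body"]))
print(children_match("(message+)", []))
print(children_match("(message*)", []))
print(children_match("(to, from, header, (message|body))", ["to", "from", "header", "body"]))
```

A real validating parser does considerably more (for example, it checks that every child is itself declared), so this is only meant to show how the occurrence indicators constrain the children.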
DTD-Attributes
In a DTD, attributes are declared with an ATTLIST declaration.
Declaring Attributes
An attribute declaration has the following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>
DTD example: <!ATTLIST payment type CDATA "check">
XML example: <payment type="check"/>
DTD-Attributes (Cont.)
The attribute-type can have the following values:
Value          Explanation
CDATA          The value is character data
(en1|en2|..)   The value must be one from an enumerated list
ID             The value is a unique ID
IDREF          The value is the ID of another element
IDREFS         The value is a list of other IDs
NMTOKEN        The value is a valid XML name token
NMTOKENS       The value is a list of valid XML name tokens
ENTITY         The value is an entity
ENTITIES       The value is a list of entities
NOTATION       The value is a name of a notation
xml:           The value is a predefined xml value
DTD-Attributes (Cont.)
The default-value can have the following values:
Value          Explanation
value          The default value of the attribute
#REQUIRED      The attribute value must be included in the element
#IMPLIED       The attribute does not have to be included
#FIXED value   The attribute value is fixed
Attribute Declaration Example
DTD example:
    <!ELEMENT square EMPTY>
    <!ATTLIST square width CDATA "0">
XML example:
    <square width="100" />
The square element is defined to be an empty element with a "width" attribute of type CDATA. If no width is specified, it has a default value of 0.
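The default-value behavior can be observed with a short Python sketch. It uses the standard-library expat binding, which is a non-validating parser: it does not check documents against the DTD, but it does report attribute defaults declared in the internal DTD subset. The helper name parse_attributes is hypothetical and exists only for this illustration.

```python
import xml.parsers.expat

def parse_attributes(document: str) -> dict:
    """Return {element name: attributes} as reported by the expat parser."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        # attrs includes values defaulted from ATTLIST declarations,
        # because specified_attributes is left at its default (off).
        seen[name] = dict(attrs)

    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return seen

# Internal DTD subset taken from the slide above.
DTD = (
    "<!DOCTYPE square [\n"
    "<!ELEMENT square EMPTY>\n"
    '<!ATTLIST square width CDATA "0">\n'
    "]>\n"
)

with_width = parse_attributes(DTD + '<square width="100"/>')
without_width = parse_attributes(DTD + "<square/>")
print(with_width, without_width)
```

When the width attribute is omitted, the parser supplies the declared default, so the application still sees a width value.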
Default Attribute Value
Syntax:
    <!ATTLIST element-name attribute-name attribute-type "default-value">
DTD example:
    <!ATTLIST payment type CDATA "check">
XML example:
    <payment type="check" />
Specifying a default value for an attribute ensures that the attribute will get a value even if the author of the XML document does not include it.
Implied Attribute
Syntax:
    <!ATTLIST element-name attribute-name attribute-type #IMPLIED>
DTD example:
    <!ATTLIST contact fax CDATA #IMPLIED>
XML example:
    <contact fax="555-6677" />
Use an implied attribute if you don't want to force the author to include the attribute and you have no sensible default value for it.
Required Attribute
Syntax:
    <!ATTLIST element-name attribute-name attribute-type #REQUIRED>
DTD example:
    <!ATTLIST person number CDATA #REQUIRED>
XML example:
    <person number="5678" />
Use a required attribute if you don't have an option for a default value, but still want to force the attribute to be present.
Fixed Attribute Value
Syntax:
    <!ATTLIST element-name attribute-name attribute-type #FIXED "value">
DTD example:
    <!ATTLIST sender company CDATA #FIXED "Microsoft">
XML example:
    <sender company="Microsoft" />
Use a fixed attribute value when you want an attribute to have a fixed value without allowing the author to change it. If an author includes another value, a validating XML parser will return an error.
Enumerated Attribute Values
Syntax:
    <!ATTLIST element-name attribute-name (en1|en2) default-value>
DTD example:
    <!ATTLIST payment type (check|cash) "check">
XML example:
    <payment type="check" />
or
    <payment type="cash" />
Use enumerated attribute values when you want the attribute value to be one of a fixed set of legal values.
DTD-Entities
Entities are variables used to define shortcuts to common text.
o Entity references are references to entities.
o Entities can be declared internal or external.
Internal Entity Declaration
Syntax:
    <!ENTITY entity-name "entity-value">
DTD example:
    <!ENTITY writer "Donald Duck.">
    <!ENTITY copyright "Copyright W3Schools.">
XML example:
    <author>&writer;&copyright;</author>
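As a small illustration of how internal entity references are expanded, the sketch below parses the example with Python's standard xml.etree.ElementTree module. The entity declarations are placed in an internal DTD subset on the document itself; the variable names DOC and expanded are chosen only for this example.

```python
import xml.etree.ElementTree as ET

# The two internal entities from the slide, declared in an internal DTD subset.
DOC = """<!DOCTYPE author [
  <!ENTITY writer "Donald Duck.">
  <!ENTITY copyright "Copyright W3Schools.">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOC)
# The parser replaces each entity reference with its declared text.
expanded = author.text
print(expanded)
```

The application never sees the references themselves, only the replacement text.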
DTD-Entities (Cont.)
External Entity Declaration
Syntax:
    <!ENTITY entity-name SYSTEM "URI/URL">
DTD example:
    <!ENTITY writer SYSTEM "http://www.w3schools.com/entities/entities.xml">
    <!ENTITY copyright SYSTEM "http://www.w3schools.com/entities/entities.dtd">
XML example:
    <author>&writer;&copyright;</author>
XML Schema
Example in DTD:
    <!DOCTYPE bank [
      <!ELEMENT bank ((account | customer | depositor)+)>
      <!ELEMENT account (account-number, branch-name, balance)>
      <!ELEMENT customer (customer-name, customer-street, customer-city)>
      <!ELEMENT depositor (customer-name, account-number)>
      <!ELEMENT account-number (#PCDATA)>
      <!ELEMENT branch-name (#PCDATA)>
      <!ELEMENT balance (#PCDATA)>
      <!ELEMENT customer-name (#PCDATA)>
      <!ELEMENT customer-street (#PCDATA)>
      <!ELEMENT customer-city (#PCDATA)>
    ]>
Can be rewritten in XML Schema:
XML Schema (Cont.)
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="bank" type="BankType"/>
    <xsd:element name="account">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="account-number" type="xsd:string"/>
          <xsd:element name="branch-name" type="xsd:string"/>
          <xsd:element name="balance" type="xsd:decimal"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="customer">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="customer-name" type="xsd:string"/>
          <xsd:element name="customer-street" type="xsd:string"/>
          <xsd:element name="customer-city" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
XML Schema (Cont.)
    <xsd:element name="depositor">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="customer-name" type="xsd:string"/>
          <xsd:element name="account-number" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:complexType name="BankType">
      <xsd:sequence>
        <xsd:element ref="account" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="customer" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="depositor" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
    </xsd:schema>
XML Schema (Cont.)
The benefits of XML Schema over DTDs are:
* It allows user-defined types to be created.
* It allows the text that appears in elements to be constrained to specific types, such as numeric types in specific formats or even more complicated types such as lists or unions.
* It allows types to be restricted to create specialized types, for instance by specifying max and min values.
* It allows complex types to be extended by using a form of inheritance.
* It is a superset of DTDs.
* It allows uniqueness and foreign key constraints.
However, the price for these features is that XML Schema is significantly more complicated than DTDs.
Querying and Transformation
Tools for querying and transformation of XML data are essential to extract information from large bodies of XML data, and to convert data between different representations (schemas) in XML.
Several languages provide increasing degrees of querying and transformation capabilities:
- XPath
- XSLT
- XQuery
XPath
XPath addresses parts of an XML document by means of path expressions.
A path expression in XPath is a sequence of location steps separated by "/".
The result of a path expression is a set of values. For example:
    /bank-2/customer/name
would return:
    <name>Joe</name>
    <name>Lisa</name>
    <name>Mary</name>
The expression:
    /bank-2/customer/name/text()
would return the same names, but without the enclosing tags.
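The two path expressions above can be tried out with Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. The sketch below is an approximation under stated assumptions: the bank-2 document is hypothetical sample data built around the three names from the slide, and because ElementTree evaluates paths relative to the root element, /bank-2/customer/name is written as ./customer/name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the three names come from the slide.
BANK = """
<bank-2>
  <customer><name>Joe</name></customer>
  <customer><name>Lisa</name></customer>
  <customer><name>Mary</name></customer>
</bank-2>
"""

root = ET.fromstring(BANK)  # root is the bank-2 element

# Counterpart of /bank-2/customer/name: the name elements themselves.
name_elements = root.findall("./customer/name")
with_tags = [ET.tostring(e, encoding="unicode").strip() for e in name_elements]

# Counterpart of /bank-2/customer/name/text(): only the text content.
text_only = [e.text for e in name_elements]

print(with_tags)
print(text_only)
```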
XPath (Cont.)
XPath supports a number of other features:
    /bank-2/account[balance > 400]
returns account elements with a balance value greater than 400, while
    /bank-2/account[balance > 400]/@account-number
returns the account numbers of those accounts.
    /bank-2/account[count(customer) > 2]
returns accounts with more than 2 customers.
    /bank-2/account/id(@owner)
returns all customers referred to from the owner attribute of account elements.
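The balance predicate can be emulated in Python. This is only a sketch: the standard xml.etree.ElementTree module does not support numeric comparisons inside predicates, so the condition [balance > 400] is applied as an ordinary Python filter after selecting the account elements. The sample accounts and their account-number values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the account-number attribute follows the slide's
# @account-number step.
BANK = """
<bank-2>
  <account account-number="A-401"><balance>500</balance></account>
  <account account-number="A-402"><balance>900</balance></account>
  <account account-number="A-403"><balance>350</balance></account>
</bank-2>
"""

root = ET.fromstring(BANK)

# Emulates /bank-2/account[balance > 400]
rich_accounts = [
    account
    for account in root.findall("./account")
    if float(account.findtext("balance")) > 400
]

# Emulates /bank-2/account[balance > 400]/@account-number
account_numbers = [account.get("account-number") for account in rich_accounts]
print(account_numbers)
```

A full XPath engine would evaluate the predicate itself; the filter here only mirrors its effect.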
XPath (Cont.)
    /bank-2/account/id(@owner) | /bank-2/loan/id(@borrower)
gives customers with either accounts or loans. However, the | operator cannot be nested inside other operators.
    /bank-2//name
finds any name element anywhere under the /bank-2 element, regardless of the element in which it is contained.
Note: the "//" described above is a short form for specifying "all descendants", while ".." specifies the parent.
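Both shorthands are available in the XPath subset of Python's standard xml.etree.ElementTree module, so they can be illustrated directly. In this sketch the sample document is hypothetical, and the paths are written relative to the bank-2 root element: .//name stands in for /bank-2//name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data with name elements at different places in the tree.
BANK = """
<bank-2>
  <customer><name>Joe</name></customer>
  <branch><name>Downtown</name></branch>
</bank-2>
"""

root = ET.fromstring(BANK)

# "//" selects descendants at any depth, whatever their enclosing element.
all_names = [e.text for e in root.findall(".//name")]

# ".." steps from each matched name element up to its parent.
parent_tags = [p.tag for p in root.findall(".//name/..")]

print(all_names)
print(parent_tags)
```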
XSLT
A style sheet is a representation of formatting options for a document, usually stored outside the document itself.
The Extensible Stylesheet Language (XSL) was originally designed for generating HTML from XML. It includes a general-purpose transformation mechanism, called XSL Transformations (XSLT), which can be used to transform one XML document into another XML document, or to other formats such as HTML.
XSLT transformations are expressed as a series of recursive rules, called templates.
XSLT (Cont.)
A simple template for XSLT consists of a match part and a select part.
For instance:
    <xsl:template match="/bank-2/customer">
      <xsl:value-of select="customer-name"/>
    </xsl:template>
    <xsl:template match="*"/>
The xsl:template match statement contains an XPath expression that selects one or more nodes. The first template matches customer elements that occur as children of the bank-2 root element. The xsl:value-of statement enclosed in the match statement outputs values from the nodes in the result of the XPath expression. The first template outputs the value of the customer-name subelement; note that the value does not contain the element tag.
XSLT (Cont.)
The second template matches all elements. This is required because the default behavior of XSLT on subtrees of the input document that do not match any template is to copy the text content of those subtrees to the output document.
Structural recursion is a key part of XSLT. When a template matches an element in the tree structure, XSLT can use structural recursion to apply template rules recursively by means of the xsl:apply-templates directive, which appears inside other templates.
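The idea of structural recursion can be sketched outside XSLT. The Python function below is only an analogue, not XSLT and not the behavior of any XSLT processor: one rule handles customer elements, and a fallback rule recurses into the children of every other element, which is the role xsl:apply-templates plays inside a template. The function name apply_templates, the sample document, and the city values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data using the element names from the slides.
BANK = """
<bank-2>
  <customer>
    <customer-name>Joe</customer-name>
    <customer-city>Rye</customer-city>
  </customer>
  <customer>
    <customer-name>Lisa</customer-name>
    <customer-city>Avon</customer-city>
  </customer>
</bank-2>
"""

def apply_templates(node, output):
    """Apply the matching rule to node, recursing along the tree structure."""
    if node.tag == "customer":
        # Rule for customer elements: emit the customer-name value, no tags.
        output.append(node.findtext("customer-name"))
        return
    # Fallback rule: emit nothing for this element, recurse into its children.
    for child in node:
        apply_templates(child, output)

names = []
apply_templates(ET.fromstring(BANK), names)
print(names)
```

The recursion follows the shape of the document: each rule decides what to emit for its own node and whether to hand its children back to the rule set.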