
Extensible Markup Language

Natawut Nupairoj, Ph.D.

Department of Computer Engineering, Chulalongkorn University

Outline

Overview. Basic XML Syntax. User-Defined XML Structure. Document Type Definition.

Overview

What is a markup language? Old-style communication for editing, between writer and editor. Example:

This are is a mark-up. (the word "are" is struck out)

We can ^ more text. (the caret marks where the word "add" is inserted)

Sometimes called a “metalanguage”.

Overview

Family of computer markup languages

Standard Generalized Markup Language (SGML). Father of them all. Complex.

HyperText Markup Language (HTML) The most popular child. Focus on presentation: for human.

Overview

Extensible Markup Language (XML). Becoming increasingly popular. Similar to HTML. Focus on describing data: for human and machine.

Extensible: a language for creating other languages. Base syntax. User-defined structure.

Example

<?xml version="1.0" encoding="UTF-8"?>
<endangered_species>
<animal>
<name language="English">Tiger</name>
<name language="Latin">Panthera tigris</name>
<threats>
<threat>poachers</threat>
<threat>habitat destruction</threat>
<threat>trade in tiger bones for traditional Chinese medicine (TCM)</threat>
</threats>
<weight>500 pounds</weight>
<length>3 yards from nose to tail</length>

...

</endangered_species>
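As a rough illustration of how a program might read such a document, the sketch below uses Python's standard xml.etree.ElementTree module. The document string is a shortened but complete variant of the example (the elided parts are simply left out), and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened, complete variant of the slide example (assumed for illustration).
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<endangered_species>
  <animal>
    <name language="English">Tiger</name>
    <name language="Latin">Panthera tigris</name>
    <threats>
      <threat>poachers</threat>
      <threat>habitat destruction</threat>
    </threats>
    <weight>500 pounds</weight>
  </animal>
</endangered_species>"""

# Parse from bytes so the encoding declaration is honoured.
root = ET.fromstring(DOC.encode("utf-8"))

# Map each name's language attribute to its text content.
names = {n.get("language"): n.text for n in root.iter("name")}

# Collect the text of every threat under animal/threats.
threats = [t.text for t in root.findall("./animal/threats/threat")]
```

The same data is readable by a person in the raw text and by a program through the parsed tree, which is the "for human and machine" point made earlier.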

XML Siblings

XML structure definition: Document Type Definition (DTD), XML Schema.

XML parsers: DOM, SAX.

XML-related technologies: XSLT, XPath.
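To make the difference between the two parser styles concrete, here is a minimal sketch using Python's standard library (xml.dom.minidom for DOM, xml.sax for SAX). The sample document and the TagCollector class are assumptions made up for this illustration.

```python
import xml.dom.minidom
import xml.sax

DOC = "<animal><name>Tiger</name><weight>500 pounds</weight></animal>"

# DOM: the whole document is loaded into a tree that can be navigated freely.
dom = xml.dom.minidom.parseString(DOC)
dom_name = dom.getElementsByTagName("name")[0].firstChild.data


# SAX: the parser reports events (start tag, text, end tag) while reading.
class TagCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records every start tag in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


collector = TagCollector()
xml.sax.parseString(DOC.encode("utf-8"), collector)
sax_tags = collector.tags
```

DOM is convenient when the program needs random access to the document; SAX keeps memory use low because nothing is stored unless the handler stores it.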

XML Components

Element: tag and content. Data.

<name>Tiger</name>

XML Components

Attribute: name and value. Metadata = data about data.

<name language="English">Tiger</name>

<name>
<language>English</language>
<text>Tiger</text>
</name>
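The two forms above carry the same information but are read differently by a program. A small sketch with Python's xml.etree.ElementTree (variable names are illustrative):

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring('<name language="English">Tiger</name>')
as_elements = ET.fromstring(
    "<name><language>English</language><text>Tiger</text></name>"
)

# Attribute form: metadata comes from get(), data from the element text.
lang_a, text_a = as_attribute.get("language"), as_attribute.text

# Element form: both pieces are child elements.
lang_b, text_b = as_elements.findtext("language"), as_elements.findtext("text")
```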

XML Components

Nested element

<animal>

<name language="English">Tiger</name>

<name language="Latin">Panthera tigris</name>

<weight>500 pounds</weight>

</animal>

XML Components

Empty element

<animal></animal>

<animal />

<picture filename="tiger.jpg" />
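A quick way to see that the long and short spellings of an empty element are equivalent is to parse both. This is a sketch using Python's xml.etree.ElementTree, not part of the original slides:

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<animal></animal>")
short_form = ET.fromstring("<animal />")
picture = ET.fromstring('<picture filename="tiger.jpg" />')

# Both spellings yield an element with no text and no children.
same = (
    long_form.tag == short_form.tag
    and long_form.text is None
    and short_form.text is None
    and len(long_form) == 0
    and len(short_form) == 0
)

# An empty element can still carry attributes.
filename = picture.get("filename")
```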

XML Components

Special symbols

&amp; for ampersand (&).
&lt; for less-than sign (<).
&gt; for greater-than sign (>).
&quot; for double quotation mark (").
&apos; for single quotation mark or apostrophe (').

<weight>&lt;500 pounds</weight>
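In practice these replacements are usually done by a library rather than by hand. The sketch below uses escape and unescape from Python's xml.sax.saxutils; the sample text is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = "<500 pounds & rising"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)

# The parser turns the references back into the original characters.
weight = ET.fromstring("<weight>" + escaped + "</weight>")

# Extra replacements, such as the double quote, can be passed explicitly.
quoted = escape('5" tall', {'"': "&quot;"})
```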

XML Components

Comment

<!-- This is a comment. It can span multiple lines. -->

Basic XML Syntax

All XML files/applications must conform to basic XML syntax.

The XML declaration is not required (but recommended).

<?xml version="1.0"?>
<endangered_species>
<name>Tiger</name>
</endangered_species>

Basic XML Syntax

One and only one root element.

<?xml version="1.0"?>

<endangered_species>

<name>Tiger</name>

</endangered_species>

Basic XML Syntax

Balanced and matched opening/closing tags.

<?xml version="1.0"?>

<endangered_species>

<name>Tiger</name>

<picture filename="tiger.jpg" />

</endangered_species>

Basic XML Syntax

Case-sensitive.

<name>Tiger</Name>

Not well-formed: name and Name are different tag names.

Case-sensitive.

<picture filename="tiger.jpg" />
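The basic syntax rules in this part can be tried out with any non-validating parser. Below is a minimal sketch with Python's xml.etree.ElementTree; the helper is_well_formed and the sample strings are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


checks = {
    # The XML declaration is optional.
    "no declaration": is_well_formed(
        "<endangered_species><name>Tiger</name></endangered_species>"
    ),
    # One and only one root element.
    "two roots": is_well_formed("<name>Tiger</name><name>Lion</name>"),
    # Opening and closing tags must be balanced.
    "unclosed tag": is_well_formed("<animal><name>Tiger</animal>"),
    # Tag names are case-sensitive.
    "wrong case": is_well_formed("<name>Tiger</Name>"),
    # An empty element closes itself.
    "empty element": is_well_formed(
        '<animal><picture filename="tiger.jpg" /></animal>'
    ),
}
```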

User-Defined XML Structure

XML basic syntax: the pattern of all XML documents. Says nothing about “structure”. Follows the basic syntax = well-formed document.

User-defined XML structure: which “tags” and “attributes” are allowed. Describes the structure. Follows the “structure” = valid document.
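The difference between well-formed and valid can be sketched in a few lines. The code below is a deliberate simplification, not real validation: ALLOWED_CHILDREN is a hypothetical hand-written rule table standing in for a DTD, and follows_structure only checks which child tags appear.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure rule: which child tags each element may contain.
ALLOWED_CHILDREN = {
    "endangered_species": {"animal"},
    "animal": {"name", "weight"},
    "name": set(),
    "weight": set(),
}


def follows_structure(text):
    """Return True if every element only contains children allowed above."""
    root = ET.fromstring(text)  # raises ParseError if not well-formed
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None:
            return False  # unknown tag
        if any(child.tag not in allowed for child in parent):
            return False
    return True


good = "<endangered_species><animal><name>Tiger</name></animal></endangered_species>"
# Well-formed (it parses), but <car> is not part of the structure.
odd = "<endangered_species><car><name>Tiger</name></car></endangered_species>"

good_ok = follows_structure(good)
odd_ok = follows_structure(odd)
```

Both strings pass the basic syntax check; only the first one also follows the user-defined structure.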

Parser and DTD

[Diagram: an XML document and a DTD are fed into the XML parser, which answers Yes/No.]

The parser checks the input against the basic syntax and the DTD.

Document Type Definition (DTD)

Old-fashioned, simple, but widely used.

Internal DTD.

<?xml version="1.0"?>
<!DOCTYPE endangered_species [...]>
<endangered_species>
<animal>
...

Document Type Definition (DTD)

External DTD.

<?xml version="1.0" standalone="no"?>

<!DOCTYPE endangered_species SYSTEM "http://www.natawut.com/xml/my_xml.dtd">

<endangered_species>

<animal>

...
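A program can inspect the DOCTYPE declaration without validating against it. The sketch below uses Python's xml.dom.minidom, which, as far as this example relies on it, records the declaration but neither fetches the external DTD nor validates. The document body is a shortened stand-in for the elided example.

```python
import xml.dom.minidom

# Shortened stand-in for the slide example; the body is assumed.
DOC = """<?xml version="1.0" standalone="no"?>
<!DOCTYPE endangered_species SYSTEM "http://www.natawut.com/xml/my_xml.dtd">
<endangered_species><animal/></endangered_species>"""

doc = xml.dom.minidom.parseString(DOC)

# The DOCTYPE names the root element and points at the external DTD.
root_name = doc.doctype.name
dtd_location = doc.doctype.systemId
```

Checking the document against the DTD itself requires a validating parser, which is outside what this sketch shows.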

Defining Elements

<!ELEMENT endangered_species (animal)>

<!ELEMENT picture EMPTY>

<!ELEMENT endangered_species ANY>

Defining Elements

<!ELEMENT name (#PCDATA)>

<!ELEMENT weight (#PCDATA)>

<!ELEMENT threat (#PCDATA)>

<name language="English">Tiger</name>

<weight>500 pounds</weight>

...

Defining Elements

<!ELEMENT animal (name, threats, weight, length, source, picture, subspecies)>

<animal>

<name language="English">Tiger</name>

<threats>

<threat>poachers</threat>

</threats>

<weight>500 pounds</weight>

...

</animal>

Defining Elements

<!ELEMENT characteristics ((weight, length) | picture)>

<characteristics>

<weight>500 pounds</weight>

<length>3 yards from nose to tail</length>

</characteristics>

<characteristics>

<picture filename="tiger.jpg"/>

</characteristics>

Defining Elements

<!ELEMENT animal (name+, threats, weight?, length?, source, picture, subspecies*)>

<!ELEMENT threats (threat, threat, threat+)>
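The symbols in these content models behave much like regular-expression operators: a comma means "followed by", + means one or more, ? means optional, and * means zero or more. The sketch below translates the two declarations by hand into Python regular expressions over the sequence of child tag names. It is an analogy for reading the notation, not how a validating parser is implemented, and the helper children_match is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of
#   (name+, threats, weight?, length?, source, picture, subspecies*)
# Each child tag is written as "tag " so names cannot run together.
ANIMAL_MODEL = re.compile(
    r"(name )+(threats )(weight )?(length )?(source )(picture )(subspecies )*"
)

# Hand translation of (threat, threat, threat+): at least three threats.
THREATS_MODEL = re.compile(r"(threat )(threat )(threat )+")


def children_match(element, model):
    """Check the order and count of an element's direct children."""
    sequence = "".join(child.tag + " " for child in element)
    return model.fullmatch(sequence) is not None


ok_animal = ET.fromstring(
    "<animal><name/><name/><threats/><source/><picture/></animal>"
)
bad_animal = ET.fromstring(
    "<animal><threats/><name/><source/><picture/></animal>"
)
two_threats = ET.fromstring("<threats><threat/><threat/></threats>")
three_threats = ET.fromstring("<threats><threat/><threat/><threat/></threats>")

results = (
    children_match(ok_animal, ANIMAL_MODEL),
    children_match(bad_animal, ANIMAL_MODEL),
    children_match(two_threats, THREATS_MODEL),
    children_match(three_threats, THREATS_MODEL),
)
```

The second animal fails because name must come before threats; the two-threat list fails because the model asks for at least three.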

Defining Attributes

<!ELEMENT population (#PCDATA)>

<!ATTLIST population year CDATA #IMPLIED>

<population>445</population>

<population year="2002">445</population>

<population year="year-rabbit">445</population>

Defining Attributes

<!ELEMENT population (#PCDATA)>

<!ATTLIST population year CDATA #REQUIRED>

<!ELEMENT population (#PCDATA)>

<!ATTLIST population year (2002|2003) #REQUIRED>

<!ELEMENT population (#PCDATA)>

<!ATTLIST population year CDATA "2002">

Defining Attributes

<!ELEMENT population (#PCDATA)>

<!ATTLIST population year CDATA #FIXED "2002">
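The attribute declarations above differ in what happens when the year attribute is missing or carries an unexpected value. The sketch below spells out those rules in plain Python. The function effective_year and its mode names are hypothetical and only mirror the DTD keywords; a real validating parser applies these rules internally.

```python
def effective_year(attrs, mode, default=None, allowed=None):
    """Return the year value after applying one declaration, or raise ValueError.

    mode is "IMPLIED", "REQUIRED", "DEFAULT" or "FIXED" (hypothetical names).
    default is the declared value for DEFAULT and FIXED.
    allowed is the set of values for an enumerated type such as (2002|2003).
    """
    value = attrs.get("year")
    if value is None:
        if mode == "REQUIRED":
            raise ValueError("year is required")
        # DEFAULT and FIXED supply the declared value; IMPLIED leaves it unset.
        value = default if mode in ("DEFAULT", "FIXED") else None
    elif mode == "FIXED" and value != default:
        raise ValueError("year must be " + default)
    if value is not None and allowed is not None and value not in allowed:
        raise ValueError("year must be one of " + ", ".join(sorted(allowed)))
    return value


# CDATA #IMPLIED: optional, and any string is accepted.
implied_missing = effective_year({}, "IMPLIED")
implied_odd = effective_year({"year": "year-rabbit"}, "IMPLIED")

# CDATA "2002": the default is used only when the attribute is absent.
default_missing = effective_year({}, "DEFAULT", default="2002")
default_given = effective_year({"year": "2003"}, "DEFAULT", default="2002")

# CDATA #FIXED "2002": absent or equal to the fixed value.
fixed_missing = effective_year({}, "FIXED", default="2002")
```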

Putting Them Together

<!ELEMENT endangered_species (animal*)>

<!ELEMENT animal (name+, threats, weight?, length?, source, picture, subspecies+)>

<!ELEMENT name (#PCDATA)>

<!ATTLIST name language (English | Latin) #IMPLIED>

...

