donald-briggs
XML
eXtensible Markup Language
XML
• A method of defining a format for exchanging documents and data
– Allows one to define a dialect of XML
– A library of tags, with associated structure
<config>
<descriptor type="FILE" name="source">
<attribute name="media_type" type="svalue"/>
<attribute name="frame_rate" type="svalue"/>
</descriptor>
</config>
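As a rough illustration of how little code it takes to read a dialect like this, the sketch below parses the fragment above with Python's standard xml.etree.ElementTree module. Python is chosen here only for illustration (the slides do not name a language), and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The <config> fragment from the slide, embedded as a string.
CONFIG_XML = """
<config>
  <descriptor type="FILE" name="source">
    <attribute name="media_type" type="svalue"/>
    <attribute name="frame_rate" type="svalue"/>
  </descriptor>
</config>
"""

root = ET.fromstring(CONFIG_XML.strip())

# Map each descriptor name to the names of the attributes it defines.
descriptors = {
    d.get("name"): [a.get("name") for a in d.findall("attribute")]
    for d in root.findall("descriptor")
}
print(descriptors)  # {'source': ['media_type', 'frame_rate']}
```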
The Social Benefits
• Can specify an interchange format concisely and accurately enough to set up a validation service easily
• There is plenty of available software for dealing with XML files and translating from one format into another
Downsides
• Sometimes defining a representation can be a pain
– Deciding what to leave as content and what to move to attributes
– XML Schemas are confusing, while DTDs do not offer enough control
• Verbose
– ViPER increased about 2x uncompressed, 4/3x gzip compressed
• Difficult to read
– Lots of </…> and end tags get in the way of the data
The Real Benefits to The Programmer
• XML Schema (or DTDs) allow you to validate a document without having to examine it
• XPath allows you to specify a node, or set of nodes, in a document quickly and easily
• SAX makes it easy to write a quick parser
• DOM makes it so you don’t even have to do that
• XSLT allows you to transform an XML document into another document, possibly not even standard XML
• Etc.
XML As A File Format
• Makes parsing simpler, but currently no methods for making saving easier
• Saves you from dealing with things like character encoding and date formatting
• No more difficult than making up your own
• An unfamiliar or forgotten XML file grants more affordances than a custom text or binary file
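To make the character-encoding point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The element name, attribute, and text are hypothetical; the point is only that the library handles escaping and UTF-8 encoding on the way out and reverses both on the way in.

```python
import xml.etree.ElementTree as ET

# Hypothetical record whose values contain markup and non-ASCII characters.
root = ET.Element("note", {"author": "Jos\u00e9 & Co."})
root.text = "frame_rate < 30"

# tostring() escapes & and < and encodes the result as UTF-8 bytes.
data = ET.tostring(root, encoding="utf-8")

# Parsing the bytes back recovers the original strings unchanged.
parsed = ET.fromstring(data)
print(parsed.get("author"), "|", parsed.text)
```

No hand-written escaping or decoding code is involved on either side of the round trip.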
Defining A Dialect
• XML Schema – Structure and Data
– Define elements and attributes
– Associate them with data types
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://lamp.cfar.umd.edu/viper"
xmlns:viper="http://lamp.cfar.umd.edu/viper"
elementFormDefault="qualified">
<xsd:element name="viper"/>
<xsd:element name="config"/>
</xsd:schema>
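Because a schema is itself an XML document, ordinary XML tools can read it. The sketch below, assuming Python's standard library, lists the top-level elements the schema above declares. It only inspects the schema. Validating a document against it needs a schema-aware validator, which the Python standard library does not include.

```python
import xml.etree.ElementTree as ET

# The schema from the slide, as bytes so the encoding declaration is honored.
SCHEMA = b"""<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://lamp.cfar.umd.edu/viper"
    xmlns:viper="http://lamp.cfar.umd.edu/viper"
    elementFormDefault="qualified">
  <xsd:element name="viper"/>
  <xsd:element name="config"/>
</xsd:schema>"""

# ElementTree spells namespaced tags as {namespace-uri}localname.
XSD = "{http://www.w3.org/2001/XMLSchema}"

schema_root = ET.fromstring(SCHEMA)
declared = [e.get("name") for e in schema_root.findall(XSD + "element")]
target_ns = schema_root.get("targetNamespace")
print(target_ns, declared)
```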
Schema Datatypes
• Can create and assign datatypes to attributes and elements. For example:
<xsd:element name="data" type="xsd:base64Binary"/>
<xsd:attribute name="span" type="viper:framespanType"/>
<xsd:simpleType name="framespanType">
<xsd:restriction base="xsd:string">
<xsd:pattern value="\d+:\d+" />
</xsd:restriction>
</xsd:simpleType>
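For a feel of what these datatypes accept, the sketch below mimics them in plain Python. It is an approximation rather than a schema validator: the regular expression is a Python equivalent of the framespanType pattern, and the helper name is hypothetical.

```python
import base64
import re

# Python equivalent of the framespanType pattern. XSD patterns must match the
# whole value, so fullmatch() is used rather than search().
FRAMESPAN = re.compile(r"\d+:\d+")


def is_framespan(value: str) -> bool:
    """Return True if value looks like 'start:end', e.g. '12:340'."""
    return FRAMESPAN.fullmatch(value) is not None


print(is_framespan("12:340"))  # True
print(is_framespan("12-340"))  # False

# xsd:base64Binary content is ordinary base64 text; decoding yields raw bytes.
raw = base64.b64decode("AAEC", validate=True)
print(list(raw))  # [0, 1, 2]
```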
Schema Structures
• Can specify order and contents of elements
– Sequence, choice, mixed, etc. allow specifying how and where elements appear
– Substitution groups allow one tag to take the place of another
• Can group elements without placing them into types
Extensibility
• Inheritance
– Can extend complex elements by adding more attributes and elements to the bottom
– Can restrict the data using the <restriction/> elements
• The <any/> and <anyAttribute/> elements
– The ultimate in extensibility, allow any valid XML from a given namespace or range of namespaces
Parsing
• Using the DOM:
– The DOM provides a tree structure that represents the document
– Memory heavy
• Using SAX:
– Event driven
– Lightweight
– Better for large documents
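The two styles can be compared side by side. The sketch below, assuming Python's standard xml.dom.minidom and xml.sax modules, collects the same attribute names both ways. The sample document reuses the earlier config fragment, and the handler class name is hypothetical.

```python
import xml.sax
from xml.dom import minidom

DOC = b"""<config>
  <descriptor type="FILE" name="source">
    <attribute name="media_type" type="svalue"/>
    <attribute name="frame_rate" type="svalue"/>
  </descriptor>
</config>"""

# DOM: the whole document is loaded into memory as a tree, then queried.
dom = minidom.parseString(DOC)
dom_names = [a.getAttribute("name") for a in dom.getElementsByTagName("attribute")]


# SAX: the parser streams through the input and calls back on each event,
# so only the state the handler chooses to keep stays in memory.
class AttributeNames(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        if name == "attribute":
            self.names.append(attrs.get("name"))


handler = AttributeNames()
xml.sax.parseString(DOC, handler)
sax_names = handler.names

print(dom_names == sax_names)  # True
```

The DOM version is shorter to write. The SAX version never builds the tree, which is what makes it the better fit for large documents.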
XPath
• The common language for selecting individual pieces of an XML document, shared between XLink and XSLT
– Also used for defining uniqueness constraints in Schemas
– DOM Level 3 will support selecting by XPath
• Looks sort of like a JavaScript DOM call:
– /viper/config/descriptor[@type="FILE"]
• Selects all of the descriptor nodes whose type attribute is "FILE"
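A minimal sketch of the same selection in Python: the standard xml.etree.ElementTree module supports only a subset of XPath and does not accept absolute paths on an element, so the expression is written relative to the <viper> root. The second descriptor in the sample document is hypothetical, added only so the predicate has something to filter out.

```python
import xml.etree.ElementTree as ET

DOC = b"""<viper>
  <config>
    <descriptor type="FILE" name="source"/>
    <descriptor type="OBJECT" name="person"/>
  </config>
</viper>"""

root = ET.fromstring(DOC)  # root is the <viper> element

# Relative form of /viper/config/descriptor[@type="FILE"].
file_descriptors = root.findall("./config/descriptor[@type='FILE']")
names = [d.get("name") for d in file_descriptors]
print(names)  # ['source']
```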
Resources
• www.xml.com
– O'Reilly's XML resource
• www.w3.org
– The standards themselves, and lots of good links to implementations
• xml.apache.org
– DOM, SAX, and XSLT for C and Java