XML, HTML and All That
What do they Mean, and Why Should you Care?
Ian GRAHAM
Centre for Academic Technology
Tel: 978-4548
Email: <[email protected]>
Talk: http://www.utoronto.ca/ian/talks/
Overview
A little Web history and the birth of HTML
HTML is not enough -- why?
XML for universal data
  For communicating information of all types
Examples of XML in action
Profound conclusions ...
The Birth of the Web
The HyperText Markup Language
  A simple language for distributing text
All the other parts -- URLs, HTTP, CGI ...
Four Main Components
URLs: For addressing things
HTTP: For transporting data
CGI: For adding functionality
HTML: For encoding text information
[Diagram labels: HTML, HTTP, Shoutcast, NNTP, FTP, URLs, CGI, databases & other software]
HTML
A simple, general-purpose language
Simple hypermedia
Original idea -- collaborative authoring
  Merging of the concepts of authoring and viewing
HTML Evolution
Started with very few tags …
  Simple requirements (you only need to know a little bit about the tags, and then just muddle through)
Language evolved as more tags were added
  Forms, images, tables, frames, fonts, ...
HTML Problems (1)
Everyone wanted personalized tags
Want to put other data into HTML
  Mathematics, database entries, literary text, poems, purchase orders ….
HTML just isn’t designed for that!
HTML Problems (2)
Software processing
  Server management of data
But -- HTML is so ill-formed, this is hard!
[Diagram: a Web server engine managing many HTML documents]
HTML Problems (3)
Software processing
  Client data processing (machine--machine communication)
But -- HTML is so ill-formed, this is hard!
[Diagram: HTML (from somewhere on the Web ...) arriving at client software: database, viewer, whatever ...]
Idea: Back to Basics
HTML was defined using SGML
  Standard Generalized Markup Language
  A meta-language for defining languages
Complex, sophisticated, powerful
Idea: Use SGML
Languages based on SGML
[Diagram: SGML as the parent of HTML, TEI, DocBook, ...]
Problems with SGML
SGML is too complicated
Rules too strict
  Can’t distribute ‘loosely’ formatted text (like HTML)
Not good in a distributed environment
  Can’t mix different data together
  Can’t add arbitrary tags
Idea (2): “Webified” SGML
New eXtensible Markup Language: XML
Can use XML to define new languages
Distributes easily on the Web
Can mix different types of data together
  Can easily add new tags, and tell a browser what to do with them (more or less ...)
Basic XML Rules
Tags written as with HTML, but ...
Technical details
  Tag names are case-sensitive
  Always need end tags
  Special empty-element tags
  Always quote attribute values
Like this example …..
<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/TR/xhtml1" >
<head>
  <title> Title of text XHTML Document </title>
</head>
<body>
<div class="myDiv">
  <h1> Heading of Page </h1>
  …..
  <p>And here is another paragraph, this one containing an
     <img src="image.gif" alt="waste of time" /> inline image, and a
     <br /> line break. </p>
</div>
</body>
</html>
XML stuff
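The four rules above can be checked mechanically, because an XML parser rejects any document that breaks them. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the sample strings are illustrative assumptions, not part of the original talk.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Obeys all four rules: consistent case, end tags, empty-element tag, quoted attribute.
good = '<p class="note">A line<br />break</p>'

# Each sample below is tolerated by a forgiving HTML browser but breaks one XML rule.
bad_case = '<P>text</p>'                # tag names are case-sensitive
bad_end = '<p>text'                     # end tag is missing
bad_empty = '<p>A line<br>break</p>'    # <br> is never closed
bad_quote = '<p class=note>text</p>'    # attribute value is not quoted

for sample in (good, bad_case, bad_end, bad_empty, bad_quote):
    print(is_well_formed(sample), sample)
```

Only the first sample is accepted; the strictness is what makes XML easier for software to process than loosely written HTML.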
Special XML Things
<?xml version="1.0" encoding="iso-8859-1" ?>
  Says that this is an XML document
<html xmlns="http://www.w3.org/TR/xhtml1">
  Says that the meanings of the tags inside (and including) the html element are defined as belonging to the same “space” of names.
  xmlns = XML namespace
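One way to see what the xmlns declaration does is to look at how a namespace-aware parser reports element names. The sketch below uses Python's standard `xml.etree.ElementTree`, which stores each name as `{namespace-URI}local-name`; the short document and the prefix `x` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the slide's example.
XHTML_NS = "http://www.w3.org/TR/xhtml1"

doc = ET.fromstring(
    f'<html xmlns="{XHTML_NS}"><head><title>Demo</title></head><body/></html>'
)

# The parser qualifies every element name with the namespace it belongs to.
print(doc.tag)

# Lookups must name the namespace too; "x" is an arbitrary local prefix.
title = doc.find("x:head/x:title", {"x": XHTML_NS})
print(title.text)
```

Because head and title inherit the namespace declared on html, a plain search for "head" without the namespace would find nothing.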
Evolution of XML
Many XML languages, optimised for different Web roles
  MathML -- for mathematics
  SMIL -- for synchronised multimedia
  RDF -- for describing “things”
  XUL -- for describing the Nav5 user interface
  SpeechML -- for synthesised voices
MathML
Designed to express layout of maths
Also can express semantics
Cut & paste into Maple, Mathematica
x^2 + 4x + 4 = 0

<mrow>
  <mrow>
    <msup> <mi>x</mi> <mn>2</mn> </msup>
    <mo>+</mo>
    <mrow> <mn>4</mn> <mo>&InvisibleTimes;</mo> <mi>x</mi> </mrow>
    <mo>+</mo>
    <mn>4</mn>
  </mrow>
  <mo>=</mo>
  <mn>0</mn>
</mrow>
SMIL
Synchronised Multimedia Integration Language
Integration of multimedia with text, audio, video
Support in RealPlayer G2
SMIL Example
<smil>
<head>
  <meta name="title" content="Online Teaching Services promo" />
  <meta name="author" content="Jay Moonah, CAT" />
  <layout type="text/smil-basic-layout">
    <root-layout width="280" height="316" background-color="white"/>
    <region id="AnimChannel1" title="AnimChannel1" left="0" top="0"
            height="265" width="280" fit="hidden"/>
  </layout>
</head>
<body>
  <par title="Online Teaching Services promo" author="Jay Moonah, CAT" >
    <audio src="final.rm" id="Soundtrack" title="Soundtrack"/>
    <animation src="otscompfin.swf" id="Animation" region="AnimChannel1"
               title="Animation" fill="freeze"/>
    <text src="cc.rt" id="caption" region="cc" title="cc" fill="freeze"/>
  </par>
</body>
</smil>
XHTML: NextGen HTML
<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/TR/xhtml1" >
<head>
  <title> Title of text XHTML Document </title>
</head>
<body>
<div class="myDiv">
  <h1> Heading of Page </h1>
  <p> here is a paragraph of text. I will include inside this paragraph
      a bunch of wonky text so that it looks fancy. </p>
  <p>Here is another paragraph with <em>inline emphasized</em> text,
     and <b> absolutely no</b> sense of humor. </p>
  <p>And another paragraph, this one with an
     <img src="image.gif" alt="waste of time" /> image, and a
     <br /> line break. </p>
</div>
</body>
</html>
XHTML
Just like HTML, but based on XML rules
Will support integration of different data into a single document
  (Doesn’t work that way now, unfortunately)
XHTML and other Data
<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/TR/xhtml1" >
<head>
  <title> Title of XHTML Document </title>
</head>
<body>
<div class="myDiv">
  <h1> Heading of Page </h1>
  <mathml xmlns="http://www.w3.org/TR/mathml">
    … MathML markup …
  </mathml>
  <p> more html stuff goes here </p>
  <smil xmlns="http://www.w3.org/TR/smil1">
    … SMIL markup …
  </smil>
</div>
</body>
</html>
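Software reading such a mixed document can use the namespace of each element to decide which handler it belongs to. The sketch below, in Python with the standard `xml.etree.ElementTree` module, counts elements per namespace in a cut-down version of the example; the `<mi>` and `<par>` children stand in for the elided markup and are assumptions, as is the helper name `namespace_of`.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Cut-down version of the slide's document; <mi> and <par/> are placeholder content.
mixed = """<html xmlns="http://www.w3.org/TR/xhtml1">
  <body>
    <h1>Heading of Page</h1>
    <mathml xmlns="http://www.w3.org/TR/mathml"><mi>x</mi></mathml>
    <p>more html stuff goes here</p>
    <smil xmlns="http://www.w3.org/TR/smil1"><par/></smil>
  </body>
</html>"""


def namespace_of(element):
    """Extract the namespace URI from ElementTree's '{uri}name' tag form."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


# Count how many elements belong to each vocabulary.
counts = Counter(namespace_of(el) for el in ET.fromstring(mixed).iter())
for uri, n in counts.items():
    print(n, uri)
```

A real browser would hand each group to a different renderer; here the counts simply show that the three vocabularies stay distinguishable inside one file.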
Displaying XML
More complicated than HTML
  XML represents data only, not how it looks
  Need extra instructions (a “style sheet” document) to define how things should look
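The separation of data from presentation can be illustrated with a toy example. The Python sketch below is not XSL or CSS; it uses a hypothetical lookup table (`STYLE`) that plays the role of a style sheet by mapping each data element to an HTML tag for display. The sample data and all names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# XML that says what the data is, with no hint of how it should look.
data = "<order><desc>Grommets</desc><quantity>12</quantity></order>"

# Hypothetical "style sheet": data element name -> HTML tag used to display it.
STYLE = {"order": "div", "desc": "h2", "quantity": "p"}


def render(element):
    """Recursively turn a data element into HTML using the STYLE table."""
    html_tag = STYLE.get(element.tag, "span")  # unknown elements become <span>
    inner = escape(element.text or "")
    for child in element:
        inner += render(child) + escape(child.tail or "")
    return f"<{html_tag}>{inner}</{html_tag}>"


print(render(ET.fromstring(data)))
```

Swapping in a different table changes the appearance without touching the data, which is the point of keeping the style instructions in a separate document.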
What Do Browsers Do Now?
Navigator 4, Internet Explorer 4
  Uggh…… (can’t handle XML at all)
Internet Explorer 5 -- shows a tree of elements
Netscape 5 -- ignores the tags ... or so it seems ...
Other Use: Data Abstraction
XML as a universal format for data interchange
Machines exchange data as XML-format messages
Eliminates proprietary data formats
Lots of XML processing software available
XML Messaging: Business
[Diagram: a factory exchanging “Place order” and “Response” messages with several suppliers]
XML Messaging: Database
[Diagram: a database exchanging “Request/send data” messages with several other databases]
Example Message
<partorders xmlns="http://myco.org/Spec/partorders.desc">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <delivery-date date="27aug1999-12:00h" />
  </order>
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    …. Order something else …..
  </order>
</partorders>
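A receiving system can pull the fields out of such a message with any off-the-shelf XML parser. The following is a minimal sketch in Python using the standard `xml.etree.ElementTree` module; it assumes a single-order version of the message with the delivery-date element written as an empty-element tag, and the function name `read_orders` and the prefix `po` are illustrative choices.

```python
import xml.etree.ElementTree as ET

# Local prefix "po" is arbitrary; the URI is the one used in the example message.
NS = {"po": "http://myco.org/Spec/partorders.desc"}

message = """<partorders xmlns="http://myco.org/Spec/partorders.desc">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc>Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <delivery-date date="27aug1999-12:00h" />
  </order>
</partorders>"""


def read_orders(xml_text):
    """Return one dictionary per <order> element in the message."""
    root = ET.fromstring(xml_text)
    orders = []
    for order in root.findall("po:order", NS):
        quantity = order.find("po:quantity", NS)
        orders.append({
            "ref": order.get("ref"),
            "part": order.find("po:part", NS).get("number"),
            "quantity": int(quantity.text),  # int() ignores surrounding spaces
            "units": quantity.get("units"),
            "deliver_by": order.find("po:delivery-date", NS).get("date"),
        })
    return orders


for order in read_orders(message):
    print(order)
```

Neither side needs to know how the other stores its data internally; both only need to agree on the element and attribute names in the message.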
Other Examples
XUL: XML User Interface Language
  How Navigator 5 configures its interface
  Defines structure and software integration (www.mozilla.org)
RDF: Resource Description Framework
  For describing things
  Used by Netscape Open Catalog project to define Web accessible resources (www.dmoz.org)
The XML Family Tree
[Diagram: SGML is the parent of XML, HTML, TEI, ...; XML is the parent of XHTML, SMIL, MathML, SpeechML, RDF, XUL, ...]
XML Summary
an integration tool for mixing different types of data
a universal format for exchanging data between machines
a framework for distributing information on the Web