
XML, HTML and All That

What do they Mean, and Why Should you Care?

Ian GRAHAM
Centre for Academic Technology
Tel: 978-4548
Email: <ian.graham@utoronto.ca>
Talk: http://www.utoronto.ca/ian/talks/

Overview

A little Web history and the birth of HTML

HTML is not enough -- why?

XML for universal data
  For communicating information of all types

Examples of XML in action

Profound conclusions ...

The Birth of the Web

The HyperText Markup Language
  A simple language for distributing text

All the other parts -- URLs, HTTP, CGI ...

Four Main Components

URLs: For addressing things
HTTP: For transporting data
CGI: For adding functionality
HTML: For encoding text information

[Diagram labels: HTML; HTTP, Shoutcast, NNTP, FTP; URLs; CGI; databases & other software]

HTML

A simple, general-purpose language
Simple hypermedia
Original idea --
  Collaborative authoring
  Merging the concepts of authoring and viewing

HTML Evolution

Started with very few tags …
  Simple requirements (only need to know a little bit about the tags, and then just muddle through)

Language evolved, as more tags were added
  forms, images, tables, frames, fonts, ...

HTML Problems (1)

Everyone wanted personalized tags

Want to put other data into HTML
  mathematics, database entries, literary text, poems, purchase orders ….

HTML just isn’t designed for that!

HTML Problems (2)

Software processing
  Server management of data

But -- HTML is so ill-formed, this is hard!

[Diagram: several HTML documents and a Web server engine]

HTML Problems (3)

Software processing
  Client data processing (machine-to-machine communication)

But -- HTML is so ill-formed, this is hard!

[Diagram: HTML (from somewhere on the Web ...) arriving at client software: database, viewer, whatever ...]
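The "ill-formed" complaint can be made concrete. The following is a minimal sketch, not part of the original talk: it uses Python's standard `xml.etree.ElementTree` parser, and both sample strings are invented for illustration. A strict parser rejects typical hand-written HTML, which is why generic software has trouble processing it.

```python
import xml.etree.ElementTree as ET

# Typical hand-written HTML: unclosed <p> and <br>, unquoted attribute value.
# (Invented sample, not taken from the talk.)
SLOPPY_HTML = "<body><p align=center>First paragraph<br><p>Second paragraph</body>"

# The same content with every element closed and every attribute quoted.
STRICT_MARKUP = (
    '<body><p align="center">First paragraph<br /></p>'
    "<p>Second paragraph</p></body>"
)


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("sloppy HTML accepted:", is_well_formed(SLOPPY_HTML))
    print("strict markup accepted:", is_well_formed(STRICT_MARKUP))
```

Browsers cope with the first string by guessing what the author meant; a program that wants to extract data reliably cannot depend on such guesses.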

Idea: Back to Basics

HTML was defined using SGML
  Standard Generalized Markup Language
  A meta-language for defining languages
  Complex, sophisticated, powerful

Idea: Use SGML

Languages based on SGML

SGML
  HTML
  TEI
  DocBook
  . . .

Problems with SGML

SGML is too complicated
Rules too strict
  Can’t distribute ‘loosely’ formatted text (like HTML)
Not good in a distributed environment
  Can’t mix different data together
  Can’t add arbitrary tags

Idea (2): “Webified” SGML

New eXtensible Markup Language: XML
Can use XML to define new languages
Distributes easily on the Web
Can mix different types of data together
  Can easily add new tags, and tell a browser what to do with them (more or less ....)

Basic XML Rules

Tags written as with HTML, but ...

Technical details
  Tag names are case-sensitive
  Always need end tags
  Special empty-element tags
  Always quote attribute values

Like this example …..

<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/TR/xhtml1">
<head>
  <title> Title of text XHTML Document </title>
</head>
<body>
<div class="myDiv">
  <h1> Heading of Page </h1>
  …..
  <p>And here is another paragraph, this one containing an
     <img src="image.gif" alt="waste of time" /> inline image, and a
     <br /> line break. </p>
</div>
</body>
</html>

[Slide callout "XML stuff": marks the XML-specific parts, explained on the next slide]
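Each of the four rules can be checked mechanically. The sketch below is an illustration added here, not something from the talk: it assumes Python's standard `xml.etree.ElementTree` parser, and every snippet in `EXAMPLES` is invented so that the "bad" form breaks exactly one rule.

```python
import xml.etree.ElementTree as ET

# rule name -> (markup that breaks the rule, markup that follows it)
EXAMPLES = {
    "tag names are case-sensitive": ("<P>text</p>", "<p>text</p>"),
    "always need end tags": ("<div><p>text</div>", "<div><p>text</p></div>"),
    "empty-element tags": ("<p>line<br>break</p>", "<p>line<br />break</p>"),
    "always quote attribute values": (
        "<div class=myDiv></div>",
        '<div class="myDiv"></div>',
    ),
}


def parses(markup: str) -> bool:
    """Return True if the markup is accepted by a strict XML parser."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def check_rules() -> dict:
    """Map each rule to (bad form accepted?, good form accepted?)."""
    return {rule: (parses(bad), parses(good)) for rule, (bad, good) in EXAMPLES.items()}


if __name__ == "__main__":
    for rule, (bad_ok, good_ok) in check_rules().items():
        print(f"{rule}: bad form accepted={bad_ok}, good form accepted={good_ok}")
```

An HTML browser would display all eight snippets without complaint; the XML parser accepts only the second of each pair.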

Special XML Things

<?xml version="1.0" encoding="iso-8859-1" ?>
  Says that this is an XML document

<html xmlns="http://www.w3.org/TR/xhtml1">
  Says that the tags inside (and including) the html element all belong to the same "space" of names, identified by the xmlns value.

xmlns = XML namespace
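What "belonging to the same space of names" means to software can be seen by parsing a small document. This is a sketch under stated assumptions: it uses Python's standard `xml.etree.ElementTree`, which reports each element name as `{namespace-uri}localname`; the short document and the helper `split_tag` are invented for illustration, and the namespace URI is the one shown on the slide.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/TR/xhtml1"  # namespace URI as written on the slide

# Small invented document; only the root element carries the xmlns attribute.
DOC = (
    f'<html xmlns="{XHTML_NS}">'
    "<head><title>Demo</title></head>"
    "<body><p>Hi</p></body>"
    "</html>"
)

root = ET.fromstring(DOC)


def split_tag(tag: str) -> tuple:
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


# Every element, in document order, with the namespace it inherited.
names = [split_tag(element.tag) for element in root.iter()]

if __name__ == "__main__":
    for uri, local in names:
        print(f"{local:6} belongs to {uri}")
```

The default namespace declared on `html` is inherited by every descendant, so `title` and `p` end up in the same space without repeating the declaration.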

Evolution of XML

Many XML languages, optimised for different Web roles
  MathML -- for mathematics
  SMIL -- for synchronised multimedia
  RDF -- for describing “things”
  XUL -- for describing the Navigator 5 (Nav5) user interface
  SpeechML -- for synthesised voices

MathML

Designed to express layout of maths

Also can express semantics

Cut & paste into Maple, Mathematica

x^2 + 4x + 4 = 0

<mrow>
  <mrow>
    <msup> <mi>x</mi> <mn>2</mn> </msup>
    <mo>+</mo>
    <mrow> <mn>4</mn> <mo>&InvisibleTimes;</mo> <mi>x</mi> </mrow>
    <mo>+</mo>
    <mn>4</mn>
  </mrow>
  <mo>=</mo>
  <mn>0</mn>
</mrow>
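The "cut and paste into Maple, Mathematica" idea works because the markup is regular enough for a program to walk. The sketch below is an assumption-laden illustration, not how those products do it: it uses Python's standard `xml.etree.ElementTree`, handles only the four element types in this example, and writes the invisible-times operator as the numeric character reference `&#x2062;` so that no DTD is needed. The function name `to_text` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The equation from the slide, with invisible times as a numeric reference.
MATHML = """
<mrow>
  <mrow>
    <msup> <mi>x</mi> <mn>2</mn> </msup>
    <mo>+</mo>
    <mrow> <mn>4</mn> <mo>&#x2062;</mo> <mi>x</mi> </mrow>
    <mo>+</mo>
    <mn>4</mn>
  </mrow>
  <mo>=</mo>
  <mn>0</mn>
</mrow>
"""

INVISIBLE_TIMES = "\u2062"


def to_text(node: ET.Element) -> str:
    """Convert a tiny subset of presentation MathML to a plain-text formula."""
    if node.tag == "msup":
        base, exponent = list(node)  # msup has exactly two children
        return f"{to_text(base)}^{to_text(exponent)}"
    if node.tag == "mrow":
        return "".join(to_text(child) for child in node)
    token = (node.text or "").strip()
    if node.tag == "mo":
        # Make the implied multiplication explicit; pad other operators.
        return "*" if token == INVISIBLE_TIMES else f" {token} "
    return token  # mi (identifier) or mn (number)


if __name__ == "__main__":
    print(to_text(ET.fromstring(MATHML)))
```

The invisible-times operator is what lets software tell "4 times x" apart from a two-character name "4x", which a purely visual format could not express.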

SMIL

Synchronised Multimedia Integration Language

Integration of multimedia with text, audio, video

Support in RealPlayer G2

SMIL Example

<smil>
<head>
  <meta name="title" content="Online Teaching Services promo" />
  <meta name="author" content="Jay Moonah, CAT" />
  <layout type="text/smil-basic-layout">
    <root-layout width="280" height="316" background-color="white"/>
    <region id="AnimChannel1" title="AnimChannel1" left="0" top="0"
            height="265" width="280" fit="hidden"/>
  </layout>
</head>
<body>
  <par title="Online Teaching Services promo" author="Jay Moonah, CAT" >
    <audio src="final.rm" id="Soundtrack" title="Soundtrack"/>
    <animation src="otscompfin.swf" id="Animation" region="AnimChannel1"
               title="Animation" fill="freeze"/>
    <text src="cc.rt" id="caption" region="cc" title="cc" fill="freeze"/>
  </par>
</body>
</smil>

XHTML: NextGen HTML

<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/TR/xhtml1">
<head>
  <title> Title of text XHTML Document </title>
</head>
<body>
<div class="myDiv">
  <h1> Heading of Page </h1>
  <p> here is a paragraph of text. I will include inside this
      paragraph a bunch of wonky text so that it looks fancy. </p>
  <p>Here is another paragraph with <em>inline emphasized</em> text,
     and <b> absolutely no</b> sense of humor. </p>
  <p>And another paragraph, this one with an
     <img src="image.gif" alt="waste of time" /> image, and a
     <br /> line break. </p>
</div>
</body>
</html>

XHTML

Just like HTML, but based on XML rules

Will support integration of different data into a single document
  (Doesn’t work that way now, unfortunately)

XHTML and other Data

<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/TR/xhtml1">
<head>
  <title> Title of XHTML Document </title>
</head>
<body>
<div class="myDiv">
  <h1> Heading of Page </h1>
  <mathml xmlns="http://www.w3.org/TR/mathml">
    … MathML markup …
  </mathml>
  <p> more html stuff goes here </p>
  <smil xmlns="http://www.w3.org/TR/smil1">
    … SMIL markup …
  </smil>
</div>
</body>
</html>
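Namespaces are what let a program sort out which parts of such a mixed document belong to which language. A minimal sketch, assuming Python's standard `xml.etree.ElementTree`: the namespace URIs are the ones on the slide, while the `mi` and `audio` children are invented stand-ins for the elided MathML and SMIL markup, and `namespace_of` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Cut-down version of the slide's document with placeholder content.
DOC = """
<html xmlns="http://www.w3.org/TR/xhtml1">
  <body>
    <h1>Heading of Page</h1>
    <mathml xmlns="http://www.w3.org/TR/mathml">
      <mi>x</mi>
    </mathml>
    <p>more html stuff goes here</p>
    <smil xmlns="http://www.w3.org/TR/smil1">
      <audio src="final.rm" />
    </smil>
  </body>
</html>
"""


def namespace_of(element: ET.Element) -> str:
    """Return the namespace URI from ElementTree's '{uri}local' tag form."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


# Count how many elements each language contributes to the document.
counts = Counter(namespace_of(element) for element in ET.fromstring(DOC).iter())

if __name__ == "__main__":
    for uri, count in counts.items():
        print(f"{count} element(s) in {uri}")
```

A browser could use the same information to hand each subtree to the component that understands it: the maths to a MathML renderer, the multimedia to a SMIL player.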

Displaying XML

More complicated than HTML
  XML represents data only, not how it looks
  Need extra instructions (a “style sheet” document) to define how things should look
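The split between data and appearance can be sketched in a few lines. This is only an illustration of the idea, not a real style sheet language such as CSS or XSL: the `poem` document, the `STYLE` tables, and the `render` function are all invented, and Python's standard `xml.etree.ElementTree` and `html.escape` do the parsing and escaping.

```python
import xml.etree.ElementTree as ET
from html import escape

# The data: says what things are, not how they look. (Invented sample.)
DATA = "<poem><title>Ode</title><line>First line</line><line>Second line</line></poem>"

# Two hypothetical "style sheets": element name -> HTML tag used for display.
STYLE = {"poem": "div", "title": "h1", "line": "p"}
LIST_STYLE = {"poem": "ul", "title": "li", "line": "li"}


def render(element: ET.Element, style: dict) -> str:
    """Render an element as HTML using the display tag chosen by the style."""
    tag = style.get(element.tag, "span")  # unknown elements fall back to <span>
    inner = escape(element.text or "")
    for child in element:
        inner += render(child, style) + escape(child.tail or "")
    return f"<{tag}>{inner}</{tag}>"


if __name__ == "__main__":
    poem = ET.fromstring(DATA)
    print(render(poem, STYLE))
    print(render(poem, LIST_STYLE))
```

The same data produces two different presentations, and neither version of the look had to be stored in the XML itself.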

What Browsers Do Now?

Navigator 4, Internet Explorer 4
  Uggh …… (can’t handle XML at all)

Internet Explorer 5 -- shows a tree of elements

Netscape 5 -- ignores the tags ... or so it seems ...

Other Use: Data Abstraction

XML as a universal format for data interchange

Machines exchange data as XML-format messages

Eliminates proprietary data formats

Lots of XML processing software available

XML Messaging: Business

[Diagram: Factory and three Suppliers, with arrows labelled "Place order" and "Response"]

XML Messaging: Database

[Diagram: Database and three other DBs, with arrows labelled "Request/send data"]

Example Message

<partorders xmlns="http://myco.org/Spec/partorders.desc">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <delivery-date date="27aug1999-12:00h" />
  </order>
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    …. Order something else …..
  </order>
</partorders>
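On the receiving side, a supplier's software might read such a message roughly as follows. This is a sketch under stated assumptions, not a real ordering system: it uses Python's standard `xml.etree.ElementTree`, the function `read_orders` and the `po` prefix are hypothetical, and the message holds a single order with `delivery-date` written as a closed empty-element tag, which a strict parser requires.

```python
import xml.etree.ElementTree as ET

# Prefix chosen here for lookups; the URI is the one used in the message.
NS = {"po": "http://myco.org/Spec/partorders.desc"}

MESSAGE = """
<partorders xmlns="http://myco.org/Spec/partorders.desc">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <delivery-date date="27aug1999-12:00h" />
  </order>
</partorders>
"""


def read_orders(xml_text: str) -> list:
    """Extract the fields of every order in a partorders message."""
    root = ET.fromstring(xml_text)
    orders = []
    for order in root.findall("po:order", NS):
        quantity = order.find("po:quantity", NS)
        orders.append(
            {
                "ref": order.get("ref"),
                "part": order.find("po:part", NS).get("number"),
                "quantity": int(quantity.text),  # int() ignores the padding spaces
                "units": quantity.get("units"),
                "deliver_by": order.find("po:delivery-date", NS).get("date"),
            }
        )
    return orders


if __name__ == "__main__":
    for order in read_orders(MESSAGE):
        print(order)
```

Nothing in the code depends on who produced the message or on what platform, which is the point of using XML as the interchange format.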

Other Examples

XUL: XML User Interface Language
  How Navigator 5 configures its interface
  Defines structure and software integration (www.mozilla.org)

RDF: Resource Description Framework
  For describing things
  Used by Netscape Open Catalog project to define Web accessible resources (www.dmoz.org)

The XML Family Tree

SGML
  HTML
  TEI
  . . .
  XML
    XHTML
    SMIL
    MathML
    SpeechML
    RDF
    XUL
    . . .

XML Summary

an integration tool for mixing different types of data

a universal format for exchanging data between machines

a framework for distributing information on the Web
