INTRODUCTION TO XML

Fergus Fahey, Training officer, ARA(I)

Fergus Fahey - DRI/ARA(I) Training: Introduction to EAD - Introduction to XML




Format of Workshop

• Description of XML features

• Practical exercise


What is XML

• XML stands for eXtensible Markup Language.

• XML was designed to store and transport data.

• XML was designed to be both human- and machine-readable

• XML is a software- and hardware-independent tool for storing and transporting data

• “XML does not DO anything”

• Very widely used to store and share data:

• By libraries to share bibliographic data

• By software applications, e.g. podcast metadata

• By banks, e.g. to process Single Euro Payments Area (SEPA) payments


eXtensible MARK-UP Language XML


XML Does Not DO Anything

<tramTicket>

<type>return</type>

<from>Central 1</from>

<to>Red 2</to>

<validUntil>Last Tram</validUntil>

<date>31 Jul 06</date>

<for>Adult</for>

<on>Luas only</on>

<timeIssued>21:15</timeIssued>

<price>2.90</price>

<number>6004375019</number>

</tramTicket>
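
The ticket above is only data; nothing happens until a program reads it. As a minimal sketch, assuming Python and its standard `xml.etree.ElementTree` module (neither is part of the workshop), the snippet below parses the ticket and builds a sentence from it. The function name `describe_ticket` and the wording of the sentence are invented for illustration, and the ticket is repeated with `</timeIssued>` spelled to match its start tag, since XML tag names are case sensitive.

```python
import xml.etree.ElementTree as ET

# The tram ticket from the slide, with matching start and end tags throughout.
TICKET_XML = """
<tramTicket>
  <type>return</type>
  <from>Central 1</from>
  <to>Red 2</to>
  <validUntil>Last Tram</validUntil>
  <date>31 Jul 06</date>
  <for>Adult</for>
  <on>Luas only</on>
  <timeIssued>21:15</timeIssued>
  <price>2.90</price>
  <number>6004375019</number>
</tramTicket>
"""


def describe_ticket(xml_text):
    """Turn the ticket data into a sentence; the wording is our choice, not XML's."""
    ticket = ET.fromstring(xml_text)
    return (
        f"{ticket.findtext('for')} {ticket.findtext('type')} ticket "
        f"from {ticket.findtext('from')} to {ticket.findtext('to')}, "
        f"price {ticket.findtext('price')}"
    )


print(describe_ticket(TICKET_XML))
```

A different program could use the same file to check whether the ticket is still valid or to add up ticket sales; the XML itself stays the same.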


Before XML… HTML… before HTML…


MARC record processed

000 02617cam 22004931a 450

001 1197435

005 20030227130037.0

008 940923s1840 enkabcf 00 0 eng u

035 __ |a (UPRA)CTYXRL7078-B

035 __ |9 CAF1680YL

040 __ |c UPRA |d CtY-BR

043 __ |a n-us---

090 __ |a \Za W679\ |b +840s

100 1_ |a Willis, Nathaniel Parker, |d 1806-1867.

245 10 |a American scenery, or, Land, lake, and river

illustrations of transatlantic nature : |b 246

246 30 |a Land, lake and river illustrations of transatlantic

nature

260 __ |a London : |b George Virtue, |c 1840

Author: Willis, Nathaniel Parker, 1806-1867.

Title: American scenery, or, Land, lake, and river illustrations

of transatlantic nature : uniform with Dr. Beattie's

Switzerland, Scotland, & Waldenses / from drawings by

W.H. Bartlett, engraved in the first style of the art,

by R. Wallis, J. Cousen, Willmore, Brandard, Adlard,

Richardson, &c ; the literary department by N.P. Willis.

American scenery

Land, lake and river illustrations of transatlantic nature

Published: London : George Virtue, 1840

Description: 30 parts : ills., map, port. ; 29 cm.

Location: BEINECKE (Non-Circulating)

Call Number: 2003 +56

Library has: pt.1-pt.30


HTML: Hyper Text Mark-up Language

• HTML was designed to display data, with the focus on how data looks (unlike the MARC example)

• HTML has predefined tags:

• <b> for bold

• <p> for paragraph

• HTML tags relate to layout and appearance of text/data and images

• HTML is permissive, i.e. a browser will still render HTML that includes invalid tags.


HTML

<html>

<p>

The <b>cat</b> sat on the

<i>mat</i>

</p>

<img src="catonmat.jpg"/>

</html>

The cat sat on the mat


XML

<animal type='cat'>

<name>Felix</name>

<colour>white</colour>

<state>seated</state>

<surface>mat</surface>

<attire>Dickie bow</attire>

<mood>Happy</mood>

</animal>


The Difference Between XML and HTML

• The XML language has no predefined tags

• The tags in the Luas ticket example above (like <to> and <price>) are not defined in any XML standard. These tags are "invented" by the author of the XML document.

• HTML works with predefined tags like <p>, <b>, <img>, etc.

• With XML, the author must define both the tags and the document structure.

• XML Separates Data from Presentation


XML Tree

root element <eu>
    element <memberState>
        element <name>, text: Belgium
        element <area>, text: 30,528
        element <population>, text: 11,190,845
        element <headOfstate>, attribute "type"
            element <firstName>, text: Philippe
            element <lastName>, text: Saxe-Coburg-Gotha
        element <capital>
            element <name>, text: Brussels


XML Syntax

• XML documents must contain one root element that is the parent of all other elements

• <root>

<child>

<subchild>.....</subchild>

</child>

</root>


XML Syntax example

<memberstate>

<name>Belgium</name>

<area>30,528</area>

<population>11,190,845</population>

<headOfstate type="Constitutional Monarch">

<lastName>Saxe-Coburg-Gotha</lastName>

<firstName>Philippe</firstName>

</headOfstate>

<capital>

<name>Brussels</name>

<population AdministrativeDivision="Capital Region">1,138,854</population>

</capital>

</memberstate>
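
To see the tree structure in this example, the document can be walked element by element. The following is a rough sketch, assuming Python's standard `xml.etree.ElementTree` module; the helper name `outline` is made up for this illustration. It prints one indented line per element, with any attributes and text.

```python
import xml.etree.ElementTree as ET

# The member state example from the slide.
MEMBERSTATE_XML = """
<memberstate>
  <name>Belgium</name>
  <area>30,528</area>
  <population>11,190,845</population>
  <headOfstate type="Constitutional Monarch">
    <lastName>Saxe-Coburg-Gotha</lastName>
    <firstName>Philippe</firstName>
  </headOfstate>
  <capital>
    <name>Brussels</name>
    <population AdministrativeDivision="Capital Region">1,138,854</population>
  </capital>
</memberstate>
"""


def outline(element, depth=0):
    """Return one indented line per element: tag, attributes, and any text."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag
    if element.attrib:
        line += " " + str(dict(element.attrib))
    if text:
        line += ": " + text
    lines = [line]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(ET.fromstring(MEMBERSTATE_XML))))
```

The indentation in the output mirrors the nesting of the tags: each child sits one level below its parent.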


XML Elements

• An XML element is everything from (including) the element's start tag to (including) the element's end tag.

<population>11,190,845</population>

• An element can contain:

• text

• attributes

• other elements

• or a mix of the above

<capital>

<name>Brussels</name>

<population AdministrativeDivision="Capital Region">1,138,854</population>

</capital>


XML Attributes

• Attributes are designed to contain data related to a specific element.

<headOfstate type="Constitutional Monarch">

<lastName>Saxe-Coburg-Gotha</lastName>

<firstName>Philippe</firstName>

</headOfstate>

--------------------------------------------------------------------------------------

<headOfstate>

<type>Constitutional Monarch</type>

<lastName>Saxe-Coburg-Gotha</lastName>

<firstName>Philippe</firstName>

</headOfstate>
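
The two fragments above appear to carry the same information, once as an attribute and once as a child element. A program reads the two forms differently, as this small sketch shows. It assumes Python's standard `xml.etree.ElementTree` module, and the function name `head_of_state_type` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The type of head of state stored as an attribute...
AS_ATTRIBUTE = """
<headOfstate type="Constitutional Monarch">
  <lastName>Saxe-Coburg-Gotha</lastName>
  <firstName>Philippe</firstName>
</headOfstate>
"""

# ...and the same information stored as a child element.
AS_ELEMENT = """
<headOfstate>
  <type>Constitutional Monarch</type>
  <lastName>Saxe-Coburg-Gotha</lastName>
  <firstName>Philippe</firstName>
</headOfstate>
"""


def head_of_state_type(xml_text):
    """Look for the type as an attribute first, then as a child element."""
    head = ET.fromstring(xml_text)
    return head.get("type") or head.findtext("type")


print(head_of_state_type(AS_ATTRIBUTE))
print(head_of_state_type(AS_ELEMENT))
```

Either form works; what matters is that whoever writes the document and whoever reads it agree on which one is used.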


XML Namespaces

• In XML, element names are defined by the developer. This often results in a conflict when trying to mix XML documents from different XML applications.

• This XML carries HTML table information:

<table>
<tr>
<td>Apples</td><td>Bananas</td>
</tr>
</table>

• This XML carries information about a table (a piece of furniture):

<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>


XML Namespaces

• <h:table>

<h:tr>

<h:td>Apples</h:td>

<h:td>Bananas</h:td>

</h:tr>

</h:table>

<f:table>

<f:name>African Coffee Table</f:name>

<f:width>80</f:width>

<f:length>120</f:length>

</f:table>
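
A prefix such as `h:` or `f:` only works once it has been bound to a namespace name with an `xmlns:` attribute on the element itself or on one of its ancestors. The sketch below, assuming Python's standard `xml.etree.ElementTree` module, puts both tables in one document and then finds each kind separately. The wrapping `<root>` element and the two `http://example.org/...` namespace URIs are placeholders invented for this illustration.

```python
import xml.etree.ElementTree as ET

# The namespace URIs and the <root> wrapper are placeholders for this sketch.
DOC = """
<root xmlns:h="http://example.org/html-tables"
      xmlns:f="http://example.org/furniture">
  <h:table>
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>
"""

# Prefix-to-URI mapping used when searching; the URIs must match the document.
NS = {"h": "http://example.org/html-tables", "f": "http://example.org/furniture"}

root = ET.fromstring(DOC)
cells = [td.text for td in root.findall("h:table/h:tr/h:td", NS)]
furniture_name = root.findtext("f:table/f:name", namespaces=NS)

print(cells)
print(furniture_name)
```

The two `table` elements no longer clash, because the parser treats each name as a combination of namespace and local name.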


Validating XML

• XML documents must have a root element

• XML elements must have a closing tag

• XML tags are case sensitive

• XML elements must be properly nested

• XML attribute values must be quoted

<eu>

….

</eu>

<lastName>Mattarella</Lastname> (incorrect: the start and end tags differ in case)

<lastName>Mattarella</lastName> (correct)

<eu>

<headOfstate type="Non executive President">

<eu>

<country>

<headOfstate type="Non executive President">

<population AdministrativeDivision=Capital Region> (incorrect: the attribute value is not quoted)

<population AdministrativeDivision="Capital Region"> (correct)
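
These syntax rules can be tried out by handing small fragments to a parser and seeing which ones it rejects. This is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the function name `is_well_formed` and the sample labels are invented here, and some fragments are shortened versions of the ones on the slide.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "matching tags": "<lastName>Mattarella</lastName>",
    "case mismatch": "<lastName>Mattarella</Lastname>",
    "missing closing tag": "<eu><country></eu>",
    "badly nested": "<eu><country></eu></country>",
    "unquoted attribute":
        "<population AdministrativeDivision=Capital Region>1,138,854</population>",
    "quoted attribute":
        '<population AdministrativeDivision="Capital Region">1,138,854</population>',
    "two root elements": "<eu></eu><eu></eu>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the fragments that follow all five rules are accepted; unlike an HTML browser, an XML parser stops at the first syntax error.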


Validating XML - DTD

• An XML document with correct syntax is called "Well Formed".

• An XML document validated against a DTD is both "Well Formed" and "Valid".

• An XML parser only knows what is valid if you tell it, e.g. it doesn't know that a country has a head of state but a capital does not.

• Rules are created using a DTD (Document Type Definition), either in a separate file or inside the document's DOCTYPE declaration:

<!DOCTYPE eu [
  <!ELEMENT eu (memberstate*)>
  <!ELEMENT memberstate (name,area,population,headOfstate,capital)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT area (#PCDATA)>
  <!ELEMENT headOfstate (lastName,firstName)>
  <!ELEMENT capital (name,population)>
  <!ELEMENT firstName (#PCDATA)>
  <!ELEMENT lastName (#PCDATA)>
  <!ELEMENT population (#PCDATA)>
  <!ATTLIST headOfstate type CDATA "0">
  <!ATTLIST population AdministrativeDivision CDATA "0">
]>
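
Python's standard `xml.etree.ElementTree` module checks that a document is well formed but does not validate it against a DTD; a validating parser is needed for that. To give a feel for what validation adds, the sketch below hand-codes just two of the DTD's rules (the allowed children of `memberstate` and `capital`) as a Python dictionary and checks a document against them. It is a stand-in written for illustration, not DTD validation, and the names `CONTENT_MODELS` and `find_violations` are invented.

```python
import xml.etree.ElementTree as ET

# Two content models copied from the DTD on the slide. The dictionary is a
# hand-written stand-in for a DTD, not something an XML parser reads.
CONTENT_MODELS = {
    "memberstate": ["name", "area", "population", "headOfstate", "capital"],
    "capital": ["name", "population"],
}


def find_violations(xml_text):
    """Return a message for each element whose children differ from its model."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well formed
    problems = []
    for element in root.iter():
        expected = CONTENT_MODELS.get(element.tag)
        actual = [child.tag for child in element]
        if expected is not None and actual != expected:
            problems.append(
                f"<{element.tag}> has children {actual}, expected {expected}"
            )
    return problems


GOOD_CAPITAL = (
    "<capital><name>Brussels</name><population>1,138,854</population></capital>"
)
# Well formed, but a capital is not supposed to have a head of state.
BAD_CAPITAL = (
    "<capital><name>Brussels</name><population>1,138,854</population>"
    "<headOfstate><lastName>Saxe-Coburg-Gotha</lastName></headOfstate></capital>"
)

print(find_violations(GOOD_CAPITAL))
print(find_violations(BAD_CAPITAL))
```

Both fragments are well formed, so a non-validating parser accepts them; only the rule check notices that the second capital has been given a head of state.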


Three types of error

• Not well formed: missing closing tag, tags not matching, tags not nested correctly

• Not valid: doesn't comply with the DTD rules

• Information is wrong: XML will not spot this in most circumstances, but may spot it if the information doesn't comply with a rule.

• Won't spot:

<lastName>O’Higgins</lastName>

<firstName>Michael D.</firstName>

• Might spot (if expecting alphabetic characters only):

<lastName>O’Higgins</lastName>

<firstName>Michael D.</firstName>
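
The "might spot" case can be sketched as a rule applied to an element's text. The example below assumes Python's standard `re` and `xml.etree.ElementTree` modules and a hypothetical rule that a last name may contain unaccented letters only; the function name `last_name_passes_rule` is invented, and the apostrophe is typed as a plain `'`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule: a last name may contain the letters A-Z and a-z only.
LETTERS_ONLY = re.compile(r"[A-Za-z]+")


def last_name_passes_rule(xml_text):
    """Return True if the element's text consists of letters only."""
    name = ET.fromstring(xml_text).text or ""
    return LETTERS_ONLY.fullmatch(name) is not None


print(last_name_passes_rule("<lastName>O'Higgins</lastName>"))
print(last_name_passes_rule("<lastName>Higgins</lastName>"))
```

The rule flags the apostrophe, not the fact that the name is wrong: a misspelt name made up only of letters would pass, and a correctly spelt name containing an apostrophe would fail.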


XML and XSLT

• XSLT is one of a number of technologies used to process XML.

• In our example we will use XSLT to pick out individual XML elements and use HTML to display them in a web browser.

• In my experience writing XSLT is not easy; it is more difficult than any other programming language I've used.

• The good news: you don't necessarily have to use XSLT to use XML or EAD.
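
As one illustration of processing XML without XSLT, the sketch below picks out individual elements and wraps them in HTML using Python's standard `xml.etree.ElementTree` and `html` modules. It is not the workshop's exercise; the function name `capitals_as_html` is invented, and the input is a cut-down version of the member state example.

```python
import xml.etree.ElementTree as ET
from html import escape

# A cut-down version of the member state example, wrapped in an <eu> root.
EU_XML = """
<eu>
  <memberstate>
    <name>Belgium</name>
    <capital><name>Brussels</name></capital>
  </memberstate>
</eu>
"""


def capitals_as_html(xml_text):
    """Build an HTML list with one item per member state and its capital."""
    root = ET.fromstring(xml_text)
    items = []
    for state in root.findall("memberstate"):
        # escape() keeps characters such as & and < safe inside HTML.
        country = escape(state.findtext("name", default=""))
        capital = escape(state.findtext("capital/name", default=""))
        items.append(f"<li><b>{country}</b>: {capital}</li>")
    return "<ul>" + "".join(items) + "</ul>"


print(capitals_as_html(EU_XML))
```

The data stays in XML; the choice of a bulleted list and bold country names is made entirely by the program.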


Useful links

• W3Schools XML tutorial: http://www.w3schools.com/xml/default.asp

• W3Schools XSLT tutorial: http://www.w3schools.com/xsl/default.asp