The Semantic Web – introduction to the basic technology Week 2 - XML Lee McCluskey


Artform Research Group

Recap

The Semantic Web is the vision (not a current reality) of having an internet with resources that are machine-understandable or accessible to automated processes. Machines should do much more than present the information visually or perform human-consumable information retrieval (IR).

Central idea: we agree on a way of SPECIFYING vocabularies rather than agreeing on one particular vocabulary or language. Then, in communication, processes only need to point to the language (vocabulary) they are using. This is much more flexible than a common language.

XML is like a “machine code” in the SW.

Processes on the SW will need to perform reasoning to fully exploit the SW to do Knowledge Acquisition etc.


WWW

- A tool for people to access information
- Interface to certain (online) databases, and to businesses
- Human interface to some services (info retrieval, weather, train timetables etc)

The WWW is successful largely through the use of layers of internationally accepted standards (TCP/IP, HTML) and now the fact that it is

- Ubiquitous
- Organic + Distributed
- Dynamic + Unbounded


WWW - a standard

- ‘first generation’ - hand-written HTML pages
- ‘second generation’ - dynamic web - pages created by programs to display the results of a process, or the output of a query of an accessed database.

Web pages are used as an interface to networked processes (services) as well as for general information display.


WWW +

Much R&D has been directed at writing programs/services that utilise HTML web info

E.g. the University of California’s travel assistant: a web service that uses other web services (weather, timetables, hotel) to make travel plans in response to a high-level directive

“I need to be in X on days Y using budget Z”

BUT: this is very hard because of the web’s unstructured data. E.g. ISI’s travel assistant has to use a learning program to induce web page ‘wrappers’ before it can reliably extract data.


WWW html example

<html>
<head><title> Lee McCluskey </title></head>
<body bgcolor="#ffffff">
<h1> McCluskey, Thomas Leo </h1>
<br> BSc (Maths), MSc (Maths), PhD (Computer Science), MBCS, C.Eng
<br> Professor of Software Technology
<br>
<br> School of Computing and Engineering,
<br> University of Huddersfield,
<br> Huddersfield,
<br> West Yorkshire,
<br> HD1 3DH,
<br> United Kingdom.

<p>
<b>email:</b> t.l.mccluskey followed by @hud.ac.uk<br>
<b>telephone (direct):</b> (+44) (0) 1484 472247<br>
<b>telephone (internal):</b> 2247<br>
<b>telephone (messages):</b> (+44) (0) 1484 472150<br>
<b>fax:</b> (+44) (0) 1484 421106<br>
<b>room number:</b> CW2/09
</p>
</body>
</html>


Metadata and XML

We can start giving ‘meaning’ to info on the web using META-DATA, e.g. using tags around data to describe its content.

In XML - eXtensible Mark-up Language - tags are not fixed - one can invent new tags to structure the information in a web page.

XML is considered to be the basis for all semantic web languages - the “machine code” of the new generation web


Rough Hierarchy of Languages in the Semantic Web

OWL .. Ontology language

DAML .. gives logic

RDFS .. gives classes

RDF .. gives tuples

XML .. gives content


XML Overview

XML is a subset of SGML (Standard Generalized Markup Language), which was written originally for electronic documents and publications

XML has the advantages of HTML – it is platform-independent and a standardised language

see http://www.w3.org/TR/REC-xml/

But HTML has a FIXED set of tags, and holds no MEANING about the data in its document.


Rough syntax of XML

XML = list of <name attributes> content </name>

XML structures information using TAGS in a composite fashion, e.g.

<someTag> …… </someTag>

<someTag Attribute = "Value"> …… </someTag>

A start tag, its matching end tag and the info between them together are called an “element”; the info between the tags is the element’s content


XML

XML allows the content to be structured so that it is easy for a machine to extract meaningful data from an XML page. It is a meta-language – a language used in the description of other languages.

It can be used to structure data in a database, or as a communication language

It can be formatted using a style sheet language called XSL (like CSS for HTML)


Example

<?xml version="1.0"?>
<email date="30/09/04">
  <to>fred</to>
  <from>sue</from>
  <subject>xml example</subject>
  <message>This is the message</message>
</email>

- All tags have a start and end
- Tags must be correctly nested as a tree syntax
- Tags can have attributes
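As a rough illustration of how a program might read the email example, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The helper name summarize and the dictionary layout are assumptions made for this sketch; they are not part of the slides.

```python
import xml.etree.ElementTree as ET

# The email document from the slide, with straight quotes around the attribute value.
EMAIL_XML = """<?xml version="1.0"?>
<email date="30/09/04">
  <to>fred</to>
  <from>sue</from>
  <subject>xml example</subject>
  <message>This is the message</message>
</email>"""


def summarize(root):
    """Collect the root tag name, its date attribute and the text of each child element."""
    return {
        "tag": root.tag,
        "date": root.get("date"),
        "fields": {child.tag: child.text for child in root},
    }


root = ET.fromstring(EMAIL_XML)
summary = summarize(root)
print(summary)
```

Because every piece of data sits inside a named tag, the program needs no page-specific 'wrapper' to find the sender or the subject.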


Example - better

<?xml version="1.0"?>
<email>
  <to>fred</to>
  <from>sue</from>
  <date>
    <day>30</day><month>9</month><year>2004</year>
  </date>
  <subject>xml example</subject>
  <message>this is the message</message>
</email>
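One way to see why this version is "better": with day, month and year as separate elements, a program can build a date without guessing how a string such as 30/09/04 should be read. The sketch below assumes Python's standard library; the function name email_date is hypothetical.

```python
import datetime
import xml.etree.ElementTree as ET

# The structured email from the slide.
BETTER_XML = """<email>
  <to>fred</to>
  <from>sue</from>
  <date><day>30</day><month>9</month><year>2004</year></date>
  <subject>xml example</subject>
  <message>this is the message</message>
</email>"""


def email_date(email):
    """Build a date object from the <day>, <month> and <year> children of <date>."""
    date = email.find("date")
    return datetime.date(
        int(date.findtext("year")),
        int(date.findtext("month")),
        int(date.findtext("day")),
    )


sent = email_date(ET.fromstring(BETTER_XML))
print(sent.isoformat())
```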


Elements

Logically every element has four key pieces:

- A name
- The attributes of the element
- The namespaces in scope on the element
- The content of the element

The content can be text, comments, more tagged info or Processing Instructions, e.g.

<?xml-stylesheet type="text/xml" href="limited.xsl"?>

This is meta info about the document
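A small sketch of how a program might pick up this kind of meta info, assuming Python's standard xml.dom.minidom. The tiny email document wrapped around the stylesheet line is made up for illustration.

```python
from xml.dom import minidom

# A made-up document that carries the stylesheet processing instruction from the slide.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xml" href="limited.xsl"?>
<email><to>fred</to></email>"""

dom = minidom.parseString(DOC)

# Processing instructions appear as their own node type, separate from elements and text.
pis = [
    node
    for node in dom.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

for pi in pis:
    print(pi.target, "->", pi.data)
```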


DTDs

XML is self-describing: it uses a DTD (Document Type Definition) to formally describe the structure of its contents.

An XML doc is well-formed if its syntax is OK according to the XML standard. It is VALID if additionally it conforms to its DTD.

DTDs are formed so that we can share our document structures with other parties. Knowing our DTD, they can write programs to process our XML documents.
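The well-formedness half of this distinction can be sketched with Python's standard xml.etree.ElementTree, which rejects documents that break the XML syntax rules. Note the assumption: this parser does not validate against a DTD, so checking validity would need a validating parser (for example the third-party lxml library), which is not shown here. The function name is_well_formed and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML, False if the parser reports a syntax error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


good = "<email><to>fred</to><from>sue</from></email>"
bad_nesting = "<email><to>fred<from></to>sue</from></email>"  # tags overlap instead of nesting
unclosed = "<email><to>fred</to>"  # <email> is never closed

for sample in (good, bad_nesting, unclosed):
    print(is_well_formed(sample), sample)
```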


Example with DTD

<?xml version="1.0"?>
<!DOCTYPE email [
  <!ELEMENT email (to,from,subject,message)>
  <!ATTLIST email date CDATA #IMPLIED>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT message (#PCDATA)>
]>
<email date="30/09/04">
  <to>fred</to>
  <from>sue</from>
  <subject>xml example</subject>
  <message>this is the message</message>
</email>


DTD are like grammars..

<!ELEMENT address_book (listing+) >

<!ELEMENT listing (name, address) >

<!ELEMENT name (last_name, first_name) >

<!ELEMENT last_name (#PCDATA) >

<!ELEMENT first_name (#PCDATA) >

<!ELEMENT address (street, city, (state|province), zip) >

<!ELEMENT street (#PCDATA) >

<!ELEMENT city (#PCDATA) >

<!ELEMENT state (#PCDATA) >

<!ELEMENT province (#PCDATA) >

<!ELEMENT zip (#PCDATA) >
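To make the grammar reading concrete, the sketch below holds a small document that follows these rules and checks it with a hand-written mirror of the content models. This is only an illustration in Python, not a real DTD validator: the ALLOWED table, the LEAVES set, the conforms function and the sample names and address are all assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# A made-up instance document that follows the address_book grammar.
ADDRESS_BOOK = """<address_book>
  <listing>
    <name><last_name>Smith</last_name><first_name>Jane</first_name></name>
    <address>
      <street>1 High Street</street>
      <city>Springfield</city>
      <state>IL</state>
      <zip>62701</zip>
    </address>
  </listing>
</address_book>"""

# Hand-written mirror of the DTD: each element maps to its allowed child sequences.
ALLOWED = {
    "listing": [["name", "address"]],
    "name": [["last_name", "first_name"]],
    "address": [
        ["street", "city", "state", "zip"],      # the (state|province) choice,
        ["street", "city", "province", "zip"],   # written out as two sequences
    ],
}
# Elements declared as (#PCDATA): text only, no child elements.
LEAVES = {"last_name", "first_name", "street", "city", "state", "province", "zip"}


def conforms(element):
    """Recursively check an element's children against the mirrored content models."""
    children = [child.tag for child in element]
    if element.tag == "address_book":
        ok = len(children) >= 1 and all(tag == "listing" for tag in children)  # listing+
    elif element.tag in ALLOWED:
        ok = children in ALLOWED[element.tag]
    else:
        ok = element.tag in LEAVES and children == []
    return ok and all(conforms(child) for child in element)


print(conforms(ET.fromstring(ADDRESS_BOOK)))
```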


DOMs

“.. The promise of the Internet is very much tied to interoperability and the value proposition of e-business depends on the ability to truly collaborate with partners and customers in a meaningful and efficient way..”

http://www.4infinitesolutions.com/course%20XML%20DTDs_Schema_DOM.htm


DOMs

Document Object Models (DOMs) give an (abstract) program interface for constructing, querying, accessing, and manipulating XML documents.

Concrete DOMs define methods and properties (instantiated for each programming language) which can be used to access/change XML documents from programs
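As one example of a concrete DOM, Python's standard library ships xml.dom.minidom. The sketch below uses it to query, construct and change a small document; the document and the replacement value "alice" are made up for illustration.

```python
from xml.dom import minidom

dom = minidom.parseString("<email><to>fred</to><from>sue</from></email>")

# Query: find an element by tag name and read its text node.
to_text = dom.getElementsByTagName("to")[0].firstChild.data

# Construct: create a new element with text content and attach it to the root.
subject = dom.createElement("subject")
subject.appendChild(dom.createTextNode("xml example"))
dom.documentElement.appendChild(subject)

# Manipulate: change existing text and set an attribute on the root element.
dom.getElementsByTagName("from")[0].firstChild.data = "alice"
dom.documentElement.setAttribute("date", "30/09/04")

result = dom.documentElement.toxml()
print(to_text)
print(result)
```

The same method names (getElementsByTagName, createElement, appendChild, setAttribute) come from the W3C DOM interface, so they look much the same in other languages' DOM bindings.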


The Uniform Resource Identifier (URI)

A “URI” is fundamental to the SW: it identifies a unique resource. It is a string that uniquely identifies something.

Often (but not always) a URI points to a webpage or an XML document.

In XML, element type names (tags) and attribute names may be qualified with a URI – so that the name can be understood globally.


The Uniform Resource Identifier (URI)

Example: you need to refer to an ELEMENT annotated by <email> in the document

http://scom.hud.ac.uk/scomtlm/namespaces/example

You would set up a “namespace” in your XML document, say

tlm = http://scom.hud.ac.uk/scomtlm/namespaces/example

Then in your document you would use

tlm:email

to denote that this <email> tag is the same as the one in

http://scom.hud.ac.uk/scomtlm/namespaces/example


Namespaces - xmlns examples

<tlm:email xmlns:tlm="http://scom.hud.ac.uk/scomtlm/namespaces/example">

… <tlm:message …. >

</tlm:email>

You can also define a default namespace:

<email xmlns="http://scom.hud.ac.uk/scomtlm/namespaces/example">

</email>
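A short sketch of what namespace qualification means to a parser, assuming Python's standard xml.etree.ElementTree. The <message> child and its text "hi" are made up for illustration; the namespace URI is the one used in the examples above.

```python
import xml.etree.ElementTree as ET

NS = "http://scom.hud.ac.uk/scomtlm/namespaces/example"

# The same element written with a prefix, with a default namespace, and with no namespace.
prefixed = f'<tlm:email xmlns:tlm="{NS}"><tlm:message>hi</tlm:message></tlm:email>'
default = f'<email xmlns="{NS}"><message>hi</message></email>'
plain = "<email><message>hi</message></email>"

a = ET.fromstring(prefixed)
b = ET.fromstring(default)
c = ET.fromstring(plain)

# ElementTree expands qualified names to "{namespace-uri}localname".
print(a.tag)
print(a.tag == b.tag)  # prefix vs default namespace: the same global name
print(a.tag == c.tag)  # an unqualified <email> is a different name

# When searching, the prefix is only a local shorthand; any prefix mapped to NS works.
message = a.find("t:message", {"t": NS})
print(message.text)
```

The prefix itself carries no meaning: only the URI it is bound to decides whether two tags denote the same name.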


Exercises

Read through some XML tutorials from relevant sites on the web, e.g.

- http://www.ddj.com/documents/s=2803/nam1012432263/
- http://www.ddj.com/documents/s=2799/nam1012432259/
- http://www.xmlfiles.com/xml/
- http://www.dcs.napier.ac.uk/~andrew/xml/ (this has some nice tutorial questions and answers!)

Try the following exercises:

1. Write a small XML Bibliography, and then write a DTD for it.
2. Write a small XML Address book, and then write a DTD for it.
3. Cut and paste an XSL style-sheet from one of the example websites and try to use it to present your XML files.

For the week ahead:

Continue to read through the tutorials, and write down some notes on the meaning and different roles of DTD, XSL, DOM and all the other jargon you come across!