Will It All Fit Together?
The need for standards and the technical challenges

Brian Kelly and Paul Miller
UK Web Focus, Interoperability Focus

UKOLN
University of Bath
Bath, BA2 7AY

UKOLN is funded by the Library and Information Commission, the Joint Information Systems Committee (JISC) of the Higher Education Funding Councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based.

http://www.ukoln.ac.uk/

Contents

• Introduction
• Background:
  – The Web
  – Library Information
• Problems
• Solutions
• Deployment Challenges
• Conclusions

About Us

UK Web Focus
• Advises UK HE community on web developments
• JISC-funded
• Represents JISC on W3C

Interoperability Focus
• Advises on issues related to the deployment of ‘interoperable’ services across libraries, museums, archives, etc.
• JISC and LIC funded
• Represents community on various international metadata and standardisation initiatives

About You

How many are in the following groups:
• “Webmasters”
• Library catalogue/system managers
• Others

What do you hope to gain from the session?

…and if we use terms you don’t understand… ask!

Aims of this Session

The aims of this session are:
• To provide an update on web developments
• To illustrate ways in which the web relates to other library-based electronic information
• To outline some of the advantages of adopting a standardised solution to problems
• To look at the ways in which things might move in the near future

Standardisation

Relevant bodies:

Community
• Library groups
• Cultural Heritage
• Government

W3C
• Produces W3C Recommendations
• Managed approach
• Protocols initially developed by W3C members
• Decisions made by W3C, influenced by member & public review
• Examples: PNG, HTML, HTTP

IETF
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols developed by interested individuals
• "Rough consensus and working code"
• Examples: HTTP, URN, whois++

Formal
• Formal international/national standards processes
• ISO, CEN, NISO, ECMA, ANSI, BSI…
• Can be slow-moving and bureaucratic
• Produce robust standards
• Examples: 23950, PNG, HTML, Java

Proprietary
• De facto standards
• Often initially appealing (cf. PowerPoint, PDF)
• May emerge as standards

Background to the Web

The web was initially very successful due to its simplicity.

[Diagram: a client (Netscape, IE, Lynx, ...) asks a server (Apache, IIS, ...) "Give me foo.html from www.bath.ac.uk"; the server replies "Here it is". The exchange is labelled HTML.]

The web is based on three key architectural components:
• Data Format: HTML (HyperText Markup Language)
• Addressing: URLs (Uniform Resource Locators)
• Transport: HTTP (Hypertext Transfer Protocol)
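The request in the diagram can be sketched in a few lines of Python. This is a minimal illustration rather than a working client: it splits the example URL into its parts and builds the text of the request a browser might send. The function name build_get_request and the choice of HTTP/1.0 are assumptions made for this example.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Return the text of a minimal HTTP/1.0 GET request for a URL."""
    parts = urlsplit(url)          # addressing: the URL names host and path
    path = parts.path or "/"       # an empty path means the server root
    # Transport: a request line, a Host header, then a blank line.
    return f"GET {path} HTTP/1.0\r\nHost: {parts.hostname}\r\n\r\n"


# "Give me foo.html from www.bath.ac.uk"
request = build_get_request("http://www.bath.ac.uk/foo.html")
print(request)
```

The server's reply ("Here it is") would carry the third component, the HTML document itself.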

Background to Library Information

• Long tradition of categorising information
  – Card catalogue (local)
  – OPAC (local-ish)
  – WebPAC (potentially global)
• Proven track record on formalising practice
  – AACR (rules for cataloguing)
  – MARC (rules for transfer)
  – Z39.50 (linking and access)

Problems With the Web

Although the web has been successful, there are problems:
• Performance - the web is too slow
• Resource discovery - lack of a metadata architecture
• HTML’s lack of arbitrary structure
• Accessibility - difficulties in accessing information for visually impaired people, people using PDAs, etc.
• Functionality - difficult to deploy interactive applications on the web
• Addressing
• etc.

Solutions (Today)

HTML 4.0, used in conjunction with CSS 2.0 (Cascading Style Sheets) and the DOM, provides an architecturally pure yet functionally rich environment.

HTML 4.0 - W3C-Rec
• Improved forms
• Hooks for stylesheets
• Hooks for scripting languages
• Table enhancements
• Better printing

CSS 2.0 - W3C-Rec
• Support for all HTML formatting
• Positioning of HTML elements
• Multiple media support

DOM - W3C-Rec
• Document Object Model
• Hooks for scripting languages
• Permits changes to HTML & CSS properties and content

Problems
• Changes during CSS development
• Netscape & IE incompatibilities
• Continued use of browsers with known bugs

HTML's Limitations

HTML 4.0 / CSS 2.0 have limitations:
• Difficulties in introducing new elements
  – Time-consuming standardisation process (<ABBREV>)
  – Dictated by browser vendor (<BLINK>, <MARQUEE>)
• Area may be inappropriate for standardisation:
  – Covers specialist area (maths, music, ...)
  – Application-specific (<STUD-NUM>)
• HTML is a display (output) format
• HTML's lack of arbitrary structure limits functionality:
  – Find all memos copied to John Smith
  – How many unique tracks are there on Jackson Browne CDs?

XML

XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc.)
• Agreement achieved quickly - XML 1.0 became a W3C Recommendation in Feb 1998
• Support from industry (SGML vendors, Microsoft, etc.)
• Support in Netscape 5 and IE 5
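To illustrate what arbitrary elements make possible, here is a small sketch of the "find all memos copied to John Smith" query from the previous slide. The <memos>, <memo>, <to>, <cc> and <subject> elements and the sample data are invented for this example; they do not come from any real memo format.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo markup, made up for illustration only.
MEMOS = """
<memos>
  <memo><to>Jane Doe</to><cc>John Smith</cc><subject>Budget</subject></memo>
  <memo><to>John Smith</to><subject>Rota</subject></memo>
  <memo><to>Ann Lee</to><cc>John Smith</cc><cc>Jane Doe</cc><subject>Web site</subject></memo>
</memos>
"""


def memos_copied_to(xml_text: str, name: str) -> list:
    """Return the subjects of all memos that list `name` in a <cc> element."""
    root = ET.fromstring(xml_text)
    return [
        memo.findtext("subject")
        for memo in root.iter("memo")
        if any(cc.text == name for cc in memo.findall("cc"))
    ]


subjects = memos_copied_to(MEMOS, "John Smith")
print(subjects)
```

The second memo is addressed to John Smith but not copied to him, so it is not returned. That distinction is not available in HTML, where both names would simply be text on the page.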

XML Deployment

Ariadne issue 15 has an article on "What Is XML?"

It describes how XML support can be provided:
• Natively by new browsers
• Back-end conversion of XML to HTML
• Client-side conversion of XML to HTML / CSS
• Java rendering of XML

Examples of intermediaries

See http://www.ariadne.ac.uk/issue15/what-is/

Namespaces and Linking

XML Namespaces
What if an XML document contains a <TITLE> for the document and a <TITLE> for the name of a book?

XML Namespaces enable such clashes to be resolved.

The naming conventions are defined at a URL.

The XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents).
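The <TITLE> clash can be sketched as follows. The two namespace URLs, the doc: and book: prefixes and the <REPORT> wrapper element are placeholders invented for this example; real vocabularies would publish their own.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URLs; each identifies one naming convention.
DOC_NS = "http://example.org/ns/document"
BOOK_NS = "http://example.org/ns/book"

XML_TEXT = f"""
<doc:REPORT xmlns:doc="{DOC_NS}" xmlns:book="{BOOK_NS}">
  <doc:TITLE>Reading list</doc:TITLE>
  <book:TITLE>The Lord of the Flies</book:TITLE>
</doc:REPORT>
"""

root = ET.fromstring(XML_TEXT)

# ElementTree spells a qualified name as "{namespace-url}localname",
# so the two TITLE elements can no longer be confused.
document_title = root.findtext(f"{{{DOC_NS}}}TITLE")
book_title = root.findtext(f"{{{BOOK_NS}}}TITLE")

print(document_title)
print(book_title)
```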

Challenges facing library information

• Amazon.co.uk
• Many-MARC
• Integration with other scholarly resources
  – AHDS Gateway
  – SOSIG
  – Web of Science
• Alternative delivery
  – on-line document delivery?

Competition? Obfuscation? Complication!

Addressing (Problems)

URLs (e.g. http://www.bristol-poly.ac.uk/depts/music/) have limitations:
• Lack of long-term persistency
  – Organisation changes name
  – Department shut down or merged
  – Directory structure reorganised
• Inability to support multiple versions of resources (mirroring)

ISBN/ISSN also problematic:
• Not tied to the work
• Nor to the item at hand

Addressing (Solutions)

DOIs (Digital Object Identifiers):
• Proposed by publishing industry as a solution
• Aimed at supporting rights ownership
• Business model needed
• Do two copies of a digital object get separate DOIs?

PURLs (Persistent URLs):
• Provide single level of redirection
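The single level of redirection behind a PURL can be sketched with a lookup table. This is an illustration of the idea only, not the software used by any real PURL resolver; the table contents, the path /dept/music and the replacement URL are assumptions made for the example.

```python
# Hypothetical resolver table: persistent path -> current location.
purl_table = {
    "/dept/music": "http://www.bristol-poly.ac.uk/depts/music/",
}


def resolve(purl_path: str, table: dict) -> tuple:
    """Return (HTTP status, location) for a persistent path."""
    target = table.get(purl_path)
    if target is None:
        return 404, None      # unknown persistent name
    return 302, target        # redirect the client to the current URL


before = resolve("/dept/music", purl_table)

# The organisation changes its name: only the table entry is updated.
# Every published reference to the persistent path keeps working.
purl_table["/dept/music"] = "http://www.new-name.example/music/"
after = resolve("/dept/music", purl_table)

print(before)
print(after)
```

The cost of this design is that someone has to maintain the table whenever a resource moves.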

Joined-up thinking

• Users can be anywhere. They need to search anywhere
• Physical locations at which digital data are stored should not impinge upon access
• Disciplinary boundaries should not be a barrier

Z39.50

• International Standard (ISO 23950)
• Permits remote searching of databases
• Access via Z client or over web
• Relies upon ‘Profiles’
• Used outside the library

See http://www.ariadne.ac.uk/issue21/

Z39.50 Challenges

• Profiles for each discipline
  – Defeats interoperability?
• Bib-1 bloat
• Largely invisible
• Seen as complicated
• Seen as expensive
• Seen as old-fashioned
• Surely no match for XML/RDF/whatever

Z39.50 Futures

• International Interoperability Profile
• Cross-Domain Attribute Set
• Attribute Architecture
• Bib-2
• XER
• DNER/RDNC/NGDF/New Library?

When to use it?

• To provide remote access to a large catalogue of material (an OPAC, a museum collection management system…)
• To facilitate/allow searching of your resources alongside like resources from elsewhere

What is ‘Metadata’?

– meaningless jargon, or
– a fashionable term for what we’ve always done, or
– “a means of turning data into information”, and
– “data about data”, and
– the name of a film director (‘Luc Besson’), and
– the title of a book (‘The Lord of the Flies’).

What is ‘Metadata’?

Metadata exists for almost anything:
• People
• Places
• Objects
• Concepts
• Web pages
• Databases.

What is ‘Metadata’?

Metadata fulfils three main functions:
• Description of resource content
  – “What is it?”
• Description of resource form
  – “How is it constructed?”
• Description of resource use
  – “Can I afford it?”.

Introducing the Dublin Core

• An attempt to improve resource discovery on the Web
  – now adopted more broadly
• Building an interdisciplinary consensus about a core element set for resource discovery
  – simple and intuitive
  – cross-disciplinary
  – international
  – flexible.

Introducing the Dublin Core

• 15 elements of descriptive metadata
• All elements optional
• All elements repeatable
• The whole is extensible
  – offers a starting point for semantically richer descriptions
• Interdisciplinary
  – libraries, museums, archives…
• International
  – available in 20 languages, with more on the way...

Introducing the Dublin Core

• Title
• Creator
• Subject
• Description
• Publisher
• Contributor
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights

http://purl.org/dc/

Implementing the Dublin Core

• Normally thought of as being expressed in HTML
• More recently, also possible in XML/RDF
• Dublin Core ‘view’ onto richer databases
• DC elements in Bib-1
• DC elements form basis of XD attribute set
• DC closely mapped to GILS

See http://www.ukoln.ac.uk/metadata/resources/dc/datamodel/WD-dc-rdf/

See http://purl.org/dc/
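As a sketch of the HTML case, Dublin Core elements are commonly embedded as <meta> tags whose names carry a "DC." prefix. The helper name dc_meta_tags and the sample record below are assumptions made for this example; the record reuses details from the title slide.

```python
from html import escape


def dc_meta_tags(record: dict) -> list:
    """Render a Dublin Core record as HTML <meta> tags.

    Every element is optional and repeatable, so each value is a list.
    """
    tags = []
    for element, values in record.items():
        for value in values:
            content = escape(value, quote=True)   # keep attribute text safe
            tags.append(f'<meta name="DC.{element}" content="{content}">')
    return tags


# Sample record; only three of the fifteen elements are used.
record = {
    "Title": ["Will It All Fit Together?"],
    "Creator": ["Brian Kelly", "Paul Miller"],   # a repeated element
    "Publisher": ["UKOLN"],
}

tags = dc_meta_tags(record)
print("\n".join(tags))
```

The tags would be placed in the <head> of the page they describe.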

RDF

RDF - the metadata framework
• Based on a formal data model (directed labelled graphs)
• Syntax for interchange of data
• Schema model

[Diagram: RDF Data Model. Generic form: Resource, PropertyType, Value. Example: page.html, Cost, £0.05, with ValidUntil 11-May-98. Node and arc labels in the second graph: Property, InstanceOf, PropName, PropObj, Value.]
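The data model can be approximated as a list of (resource, property type, value) statements. This sketch is a simplification: it attaches ValidUntil directly to page.html, whereas the diagram appears to attach it to the Cost statement itself. The helper name values_of is an assumption made for the example.

```python
# Each statement is one arc of the graph: resource --property type--> value.
statements = [
    ("page.html", "Cost", "£0.05"),
    ("page.html", "ValidUntil", "11-May-98"),   # simplified, see note above
]


def values_of(resource: str, property_type: str, graph: list) -> list:
    """Return every value a resource has for one property type."""
    return [
        value
        for subject, prop, value in graph
        if subject == resource and prop == property_type
    ]


cost = values_of("page.html", "Cost", statements)
print(cost)
```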

Authentication

We can’t (can we?) just make all these resources available for free.

Users need to authenticate.

ATHENS / Digital Signatures…

Authenticate once per site?

Authenticate once per query per site?

Complicated by Z39.50 searches: authenticate once per Target queried?!

Ideally, authenticate once when you log on in the morning!

Deployment

How do I deploy “the new stuff” in the real world?

Barriers:
• Browser x doesn’t do CSS, …
• Authoring tools don’t do RDF
• I prefer the web as it is
• I haven’t the time to learn anything new
• This Z39.50 thing is just too hard

Approaches to Deployment

Various interesting new technologies have been outlined.

How can they be deployed in our environment?

Should we:
• Ignore them?
• Accept them fully?
• Accept them partly?

Ignore New Developments

We can choose to ignore new developments, and continue to use, say, HTML 3.2:
• Safe option, with no new training, support or software costs
• Experience in effectiveness, limitations, etc.
• Fails to address current performance problems
• Fails to address accessibility problems
• Fails to provide new functionality
• Service likely to look "old-fashioned" compared with competition

Fully Accept New Developments

We can choose to move wholesale to, say, HTML 4.0 and CSS 2.0:
• Can be exciting to be at the leading edge
• Performance benefits
• Accessibility benefits
• Based on open standards
• Provides motivation for users to upgrade browsers
• Likely to be the solution at some point (cf. Gopher)
• Backwards compatibility problems with old browsers
• Costly to deploy new authoring tools, training, ...
• Likely to be bugs and incompatibilities with new tools and browsers

Implement "Safe" Solutions

An alternative is to use "safe" parts of technologies which are backwards compatible and avoid major browser bugs:
• Attractive-sounding compromise position
• Lose some functionality, but not all
• Can be difficult or expensive to find "safe" options (does .margin-left work on IE on SGI?)
• Tools may not allow safe options to be chosen
• Lack of validation tools for checking conformance with a restricted subset of a specification

Note
See <URL: www.webreview.com/guides/style/insafegrid.htm> for unsafe CSS 2.0 properties

Decision Time

What would you opt for?

Stick with current technologies
Cheap, default option. Continuation of performance and accessibility problems. Unlikely to be a long-term solution.

Deploy new technologies
More expensive option. Functionality, performance and accessibility benefits. Access problems for old browsers.

Use "safe" new technologies
May require home-grown tools and support. Avoids some of the problems of the other solutions.

An Alternative

An alternative approach to deploying new technologies is available:
• Use more intelligent server-side software
• Use "proxies" to address limitations of browser technologies. The term intermediary was used in a paper [1] at the WWW 7 conference to describe this approach
• Protocol solutions, such as Transparent Content Negotiation (TCN) and Composite Capability/Preference Profiles (CC/PP)

[1] "Intermediaries: New Places For Producing and Manipulating Web Content"

Intelligent Server Software

Simple model:
• Server receives request for resource
• Server delivers resource to client

More sophisticated model:
• Server receives request for resource
• Server processes header information from client
• Server delivers resource to client based on client information

Can be implemented using server add-ons such as PHP/FI and MS Active Server Pages, or by use of Content Management systems.
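The more sophisticated model can be sketched as a function that inspects the request headers before choosing what to deliver. The file names (page.xml, page-text.html, page.html) and the selection rules are invented for this example; a real service would define its own variants and tests.

```python
def choose_variant(headers: dict) -> str:
    """Pick a version of a resource based on client header information."""
    accept = headers.get("Accept", "")
    user_agent = headers.get("User-Agent", "")

    if "text/xml" in accept:
        return "page.xml"            # client says it can handle XML
    if user_agent.startswith("Lynx"):
        return "page-text.html"      # text-only browser: plain layout
    return "page.html"               # default for every other client


lynx_choice = choose_variant({"User-Agent": "Lynx/2.8", "Accept": "text/html"})
xml_choice = choose_variant({"Accept": "text/xml, text/html"})
default_choice = choose_variant({})

print(lynx_choice, xml_choice, default_choice)
```

In the simple model the function would ignore its argument and always return the same file.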

Web Conclusions

To conclude:
• New web protocols are still being developed
• Deployment of new technologies can be expensive or time-consuming, but is likely to be needed
• Various deployment models:
  – Don't implement
  – Implement fully
  – Implement via proxy
  – Other solutions
• We can't do it all ourselves
• Experience in developing (wide-area) web applications will help in developing intermediaries

Non-Web Conclusions

• Cross-domain interoperability is a laudable goal
• Technical developments continue in a rapidly shifting environment
• Libraries are not alone
• To make an OPAC more widely available, look at Z39.50
• To raise awareness of library web pages, or to describe particular resources, look at a ‘metadata’ solution like Dublin Core
• We need to move beyond ‘traditional’ users (who know where the library is and what it offers)…