Standards, the Web and eLib Projects

Brian Kelly
UK Web Focus
UKOLN
University of Bath

Email Address: B.Kelly@ukoln.ac.uk
http://www.ukoln.ac.uk/

UKOLN is funded by the British Library Research and Innovation Centre, the Joint Information Systems Committee of the Higher Education Funding Councils, as well as by project funding from the JISC’s Electronic Libraries Programme and the European Union. UKOLN also receives support from the University of Bath where it is based.


Contents
• Introduction
• Web Standards Overview
• Web Standards:
  – Data Formats
  – Transport
  – Addressing
  – Metadata
• Accessibility
• Programming Languages
• Distributed Searching
• Deployment Issues
• Questions

Aims of Talk
• To review key web standards
• To describe standards bodies
• To identify opportunity for involvement
• To briefly address implementation models


UK Web Focus / W3C

UK Web Focus:
• JISC-funded post based at UKOLN (Bath Univ)
• Advises UK HE community on web issues
• Represents JISC on W3C

W3C (World Wide Web Consortium):
• International consortium, with headquarters at MIT, INRIA and Keio University (Japan)
• Coordinates development of web protocols
• Four domains:
  – Architecture
  – Technology & Society
  – User Interface
  – Web Accessibility


Standardisation

W3C
• Produces W3C Recommendations on Web protocols
• Managed approach to developments
• Protocols initially developed by W3C members
• Decisions made by W3C, influenced by member and public review

IETF
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols developed by interested individuals
• "Rough consensus and working code"

ISO
• Produces ISO Standards
• Can be slow moving and bureaucratic
• Produce robust standards

Proprietary
• De facto standards
• Often initially appealing (cf PowerPoint)
• May emerge as standards

Example callouts on the slide:
• PNG, HTML, Z39.50, Java?
• PNG, HTML, HTTP
• HTTP, URN, whois++
• HTML extensions, PDF and Java?

Note: JISC Standards Subcommittee


The Web Vision

Tim Berners-Lee's vision for the Web:
• Automation of information management: if a decision can be made by machine, it should
• All structured data formats should be based on XML
• Migrate HTML to XML
• All logical assertions to map onto RDF model
• All metadata to use RDF


Standards

Need for standards to provide:
• Platform independence
• Application independence
• Avoidance of patented technologies
• Flexibility ("evolvability" - Tim Berners-Lee)
• Architectural integrity
• Long-term access to data

Ideally look at standards first, then find applications which support the standards

Difficult to achieve this ideal!


Web Protocols

Web initially based on three simple protocols:
• Data Formats: HTML (HyperText Markup Language) provides the data format for native documents
• Addressing: URLs (Uniform Resource Locators) provide an addressing mechanism for web resources
• Transport: HTTP (HyperText Transfer Protocol) defines transfer of resources between client and server

[Diagram: Data Format (HTML), Addressing (URL), Transport (HTTP)]
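As a rough illustration of how the three pieces fit together, the sketch below uses only the Python standard library: it splits a URL (addressing), builds the text of an HTTP/1.0 request for it (transport), and pulls link targets out of an HTML fragment (data format). The page path, the sample HTML and the helper names are assumptions made for illustration, and nothing is sent over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


def build_request(url):
    """Return the text of a minimal HTTP/1.0 GET request for the URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Host is optional in HTTP/1.0; it is included here for clarity.
    return "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, parts.netloc)


class LinkCollector(HTMLParser):
    """Collect the HREF values of <A> elements in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links


# Hypothetical page and markup, used only as sample data.
request = build_request("http://www.ukoln.ac.uk/index.html")
sample_html = '<HTML><BODY><A HREF="http://www.w3.org/">W3C</A></BODY></HTML>'
links = extract_links(sample_html)
```

Each helper touches exactly one of the three protocols, which is the separation the slide describes.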


HTML History

HTML 1.0: Unpublished specification. DTD developed by Tim Berners-Lee (CERN).

HTML 2.0: Spec. based on innovations from NCSA (forms and inline images!)

HTML 3.0: Proposed spec. (renamed from HTML+). Very comprehensive. Failed to complete IETF standardisation. Little implementation experience.

Proprietary: Introduction of proprietary HTML elements by Netscape and Microsoft

HTML 3.2: Spec. based on description of mainstream innovations in marketplace

HTML 4.0: Current recommendation


Problems with Extensions

Device Dependency
• Resources are dependent on a particular browser
• Platform dependency

Costs
• Potential costs in re-engineering

Architecture
• Proprietary innovations have been flawed:
  – Merging content and appearance
  – Maintenance of resources
• Accessibility problems:
  – Poor support for access by disabled users

But:
• Experiments are needed


HTML 4.0, CSS 2.0 and DOM

HTML 4.0 used in conjunction with CSS 2.0 (Cascading Style Sheets) and the DOM provides an architecturally pure, yet functionally rich environment

HTML 4.0
• Improved forms
• Hooks for stylesheets
• Hooks for scripting languages
• Table enhancements
• Better printing

CSS 2.0
• Support for all HTML formatting
• Positioning of HTML elements
• Multiple media support

CSS Problems
• Changes during CSS development
• Netscape & IE incompatibilities
• Continued use of browsers with known bugs

DOM
• Document Object Model
• Hooks for scripting languages
• Permits changes to HTML & CSS properties and content
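The DOM's role can be sketched with Python's `xml.dom.minidom`, a minimal implementation of the DOM interface. The fragment, the `id` and the style values are made up for illustration; in a browser the same kind of calls would be made from a scripting language rather than from Python.

```python
from xml.dom.minidom import parseString

# A small, well-formed fragment (sample data, not taken from the talk).
doc = parseString('<p id="intro" style="color: black">Hello</p>')
para = doc.documentElement

# Change a CSS property carried in the STYLE attribute.
para.setAttribute("style", "color: red")

# Change the content of the element's text node.
para.firstChild.data = "Hello, world"

# Add a new child element after the text.
emphasis = doc.createElement("em")
emphasis.appendChild(doc.createTextNode("!"))
para.appendChild(emphasis)

result = para.toxml()
```

The document is altered through the object model rather than by editing markup text, which is what lets scripts update a page after it has loaded.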


HTML Limitations

HTML 4.0 / CSS 2.0 have limitations:
• Difficulties in introducing new elements
  – Time-consuming standardisation process (<ABBREV>)
  – Dictated by browser vendor (<BLINK>, <MARQUEE>)
• Area may be inappropriate for standardisation:
  – Covers specialist area (maths, music, ...)
  – Application-specific (<STUD-NUM>)
• HTML is a display (output) format
• HTML's lack of arbitrary structure limits functionality:
  – Find all memos copied to John Smith
  – How many unique tracks on Jackson Browne CDs
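To make the last point concrete, here is a minimal sketch of how the first query becomes straightforward once documents carry their own structure. The <MEMO>, <TO>, <CC> and <SUBJECT> elements and the sample data are hypothetical, invented for this example, and the markup is written as XML so that Python's standard parser can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo collection using application-specific elements.
MEMOS = """
<MEMOS>
  <MEMO><TO>Ann Jones</TO><CC>John Smith</CC><SUBJECT>Budget</SUBJECT></MEMO>
  <MEMO><TO>John Smith</TO><SUBJECT>Room booking</SUBJECT></MEMO>
  <MEMO><TO>Raj Patel</TO><CC>Ann Jones</CC><CC>John Smith</CC><SUBJECT>Web server</SUBJECT></MEMO>
</MEMOS>
"""


def memos_copied_to(xml_text, name):
    """Return the subjects of all memos that list `name` in a <CC> element."""
    root = ET.fromstring(xml_text)
    return [
        memo.findtext("SUBJECT")
        for memo in root.iter("MEMO")
        if any(cc.text == name for cc in memo.findall("CC"))
    ]


subjects = memos_copied_to(MEMOS, "John Smith")
```

With plain HTML the same memos would be paragraphs or table cells, and a program could only guess which text was a copy recipient.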


XML

XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc)

• Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998

• Support from industry (SGML vendors, Microsoft, etc.)

• Support in Netscape 5 and IE 5


XML Concepts

Well-formed XML resources:
• Make end-tags explicit: <LI>...</LI>
• Make empty elements explicit: <IMG .../>
• Quote attributes: <IMG SRC="logo" HEIGHT="20"/>
• Use consistent upper/lower case

Valid XML resources:
• Need DTD

XML Namespaces:
Mechanism for ensuring unique XML elements:

<?xml:namespace ns="http://foo.org/1998-001" prefix="i"?>
<P>Insert <i:PART>M-471</i:PART></P>
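The well-formedness rules above can be checked mechanically. The sketch below feeds HTML-style and XML-style fragments to Python's standard XML parser and records which ones it accepts; the helper name `is_well_formed` is an assumption for this example. The namespace part uses the `xmlns:` attribute syntax of the later Namespaces in XML Recommendation rather than the draft processing-instruction syntax shown on the slide, because that is what current parsers accept.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the parser accepts the markup as well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


checks = {
    "implicit end-tags": is_well_formed("<UL><LI>one<LI>two</UL>"),
    "explicit end-tags": is_well_formed("<UL><LI>one</LI><LI>two</LI></UL>"),
    "unquoted attributes": is_well_formed("<IMG SRC=logo HEIGHT=20/>"),
    "quoted attributes": is_well_formed('<IMG SRC="logo" HEIGHT="20"/>'),
    "inconsistent case": is_well_formed("<P>text</p>"),
}

# Namespace-qualified element, using the namespace URI from the slide.
NS = "http://foo.org/1998-001"
para = ET.fromstring('<P xmlns:i="{}">Insert <i:PART>M-471</i:PART></P>'.format(NS))
part_number = para.findtext("{" + NS + "}PART")
```

Only the fragments that follow the slide's rules parse; the parser stops at the first violation in the others.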


XML Deployment

Ariadne issue 15 has an article on "What Is XML?"

Describes how XML support can be provided:
• Natively by new browsers
• Back end conversion of XML to HTML
• Client-side conversion of XML to HTML / CSS
• Java rendering of XML

Examples of intermediaries

See http://www.ariadne.ac.uk/issue15/what-is/
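Back-end conversion is the simplest of these options to sketch. The example below assumes a hypothetical <PART-LIST> document (only <PART-NO> appears in the talk) and turns it into an HTML list on the server side, so that a browser with no XML support still receives something it can display. It is an illustration of the idea, not the approach described in the Ariadne article.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML source held on the server.
SOURCE = """<PART-LIST>
  <PART><PART-NO>M-471</PART-NO><NAME>Hinge &amp; bracket</NAME></PART>
  <PART><PART-NO>M-472</PART-NO><NAME>Cover plate</NAME></PART>
</PART-LIST>"""


def parts_to_html(xml_text):
    """Convert the part list into an HTML unordered list."""
    root = ET.fromstring(xml_text)
    items = []
    for part in root.findall("PART"):
        number = escape(part.findtext("PART-NO", default=""))
        name = escape(part.findtext("NAME", default=""))
        items.append("<LI>{}: {}</LI>".format(number, name))
    return "<UL>" + "".join(items) + "</UL>"


html_output = parts_to_html(SOURCE)
```

The structured source stays available for other uses, while the HTML is produced on demand as a display format.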


XLink, XPointer and XSL

XLink will provide sophisticated hyperlinking missing in HTML:
• Links that lead user to multiple destinations
• Bidirectional links
• Links with special behaviors:
  – Expand-in-place / Replace / Create new window
  – Link on load / Link on user action
• Link databases

XPointer will provide access to arbitrary portions of XML resource

XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents)

[Diagram labels: England, France]

<commentary xml:link="extended" inline="false">
  <locator href="smith2.1" role="Essay"/>
  <locator href="jones1.4" role="Rebuttal"/>
  <locator href="robin3.2" role="Comparison"/>
</commentary>
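A link with several destinations is easy to read programmatically. The sketch below parses the extended link from the slide (written in the 1998 draft syntax) and builds a table from each locator's role to its target. The function name `destinations` is an assumption for this example, and the code only reads the link; it does not implement any XLink behaviour.

```python
import xml.etree.ElementTree as ET

# The extended link from the slide, in the draft XLink syntax.
LINK = """<commentary xml:link="extended" inline="false">
  <locator href="smith2.1" role="Essay"/>
  <locator href="jones1.4" role="Rebuttal"/>
  <locator href="robin3.2" role="Comparison"/>
</commentary>"""


def destinations(xml_text):
    """Map each locator's role to its target."""
    link = ET.fromstring(xml_text)
    return {loc.get("role"): loc.get("href") for loc in link.findall("locator")}


targets = destinations(LINK)
```

A user agent could present the three roles as a menu, where an HTML anchor offers only one destination.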


Adobe PDF

Adobe PDF:
• Proprietary format
• Provides control over document appearance (originally lacking in HTML)
• Lack of support for document structure
• Requires proprietary (though free) plugin (Acrobat)
• Proprietary plugin provides richer functionality (e.g. suppress printing)
• Development work on improved hyperlinking
• Becoming more open?

Conclusion
• Acceptable output format?

NOTE: PDF is not a W3C activity


Addressing

URLs (e.g. http://www.bristol-poly.ac.uk/depts/music/) have limitations:

• Lack of long-term persistency
  – Organisation changes name
  – Department scrapped
  – Directory structure reorganised
• Inability to support multiple versions of resources (mirroring)

URNs (Uniform Resource Names):
• Proposed as solution
• Difficult to implement (no W3C activity in this area)


Addressing - Solutions

DOIs (Digital Object Identifiers):
• Proposed by publishing industry as a solution
• Aimed at supporting rights ownership
• Business model needed

PURLs (Persistent URLs):
• Provide single level of redirection

Cache support:
• National caches could provide simple URN support

For further information see:
<URL: http://www.ukoln.ac.uk/metadata/resources/urn/>
<URL: http://hosted.ukoln.ac.uk/biblink/wp2/links.html>
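The "single level of redirection" behind PURLs can be sketched as a lookup table that maps a persistent name to the resource's current location. The PURL path, the replacement URL and the use of a 302 status are assumptions for illustration; a real resolver would sit behind an HTTP server.

```python
# Persistent name -> current location. The starting entry reuses the example
# URL from the Addressing slide; the PURL path itself is hypothetical.
PURL_TABLE = {
    "/net/music-dept": "http://www.bristol-poly.ac.uk/depts/music/",
}


def resolve(purl_path):
    """Return an HTTP status and headers redirecting to the current URL."""
    target = PURL_TABLE.get(purl_path)
    if target is None:
        return 404, {}
    return 302, {"Location": target}


def move_resource(purl_path, new_url):
    """Record a new location; published PURLs keep working unchanged."""
    PURL_TABLE[purl_path] = new_url


before = resolve("/net/music-dept")
# The organisation changes its name and reorganises its directories.
move_resource("/net/music-dept", "http://www.example.ac.uk/music/")
after = resolve("/net/music-dept")
```

Only the table entry changes when the resource moves, so citations that use the persistent name do not break.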


Transport

HTTP/0.9 and HTTP/1.0:
• Made the Web popular
• Design flaws and implementation problems caused poor performance

HTTP/1.1:
• Addresses some of these problems
• 60% server support, client & proxy support beginning
• Performance benefits! (optimised implementation reduces packet traffic by 2/3)
• Is acting as fire-fighter
• Poor usage counting
• Not sufficiently flexible or extensible
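One commonly cited reason for the HTTP/1.1 gains is that a connection can be kept open and reused for several requests, whereas a plain HTTP/1.0 exchange opens a new connection for each resource. The sketch below only builds request text and counts connections under that simplified assumption. The host and paths are invented, and it does not reproduce the 2/3 figure quoted above, which also depends on pipelining, caching and server behaviour.

```python
def build_requests(host, paths, version):
    """Return the request messages a client would send for each path."""
    requests = []
    for path in paths:
        lines = ["GET {} HTTP/{}".format(path, version)]
        if version == "1.1":
            # HTTP/1.1 requires a Host header on every request.
            lines.append("Host: {}".format(host))
        requests.append("\r\n".join(lines) + "\r\n\r\n")
    return requests


def connections_needed(paths, version):
    """Simplified model: HTTP/1.0 without keep-alive opens one connection per
    resource; HTTP/1.1 reuses a single persistent connection."""
    if not paths:
        return 0
    return 1 if version == "1.1" else len(paths)


# A hypothetical page with two embedded resources.
paths = ["/index.html", "/logo.gif", "/style.css"]
old_connections = connections_needed(paths, "1.0")
new_connections = connections_needed(paths, "1.1")
messages = build_requests("www.example.ac.uk", paths, "1.1")
```

Each avoided connection saves the packets spent on TCP set-up and tear-down, which is where part of the traffic reduction comes from.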


HTTP/NG

HTTP/NG:
• Two W3C Working Groups:
  – Web Characterisations: Study Web usage and form requirements. New log format for easier collection & anonymisation
  – Protocol Design: Redesign Web as distributed object application
• Transition to HTTP/NG will be gradual
  – Use of proxies / HTTP/1.1 UPGRADE header
  – Layer HTTP/NG on top of HTTP/1.1 using POST
• Distributed searching as HTTP/NG application?
• W3C Briefing Package due out on 7 July


Metadata

Metadata - the missing architectural component from the initial implementation of the web

[Diagram: Metadata (PICS, TCN, MCF, DSig, DC, ...) alongside Addressing (URL), Data format (HTML) and Transport (HTTP)]

Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management


Privacy

P3P (Platform for Privacy Preferences):
• Example of a metadata application
• Privacy concerns are a current barrier to Web development (esp. in US)
• P3P project developing methods for exchanging the privacy practices of Web sites and the privacy preferences of users
• Documents on architecture and vocabulary available
• P3P1.0 draft spec released on 19 May 1998
• See <URL: http://www.w3.org/P3P/>

Relevant to Jun 98 lis-elib discussion


Digital Signatures

DSig (Digital Signatures initiative):
• Key component for providing trust on the web
• DSig 1.0 is based on PICS
• DSig 2.0 will be based on RDF and will support signed assertions:
  – This page is from the University of Bath
  – This page is a legally-binding list of courses provided by the University
• Potential for use in authentication but:
  – Little activity in this area in W3C
  – Implementation would require expensive infrastructure


RDF

RDF (Resource Description Framework):• Highlight of WWW 7 conference

• Provides a metadata framework ("machine understandable metadata for the web")

• Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping (MCF)

• Applications include:
  – cataloging resources
  – resource discovery
  – electronic commerce
  – intelligent agents
  – digital signatures
  – content rating
  – intellectual property rights
  – privacy

• See <URL: http://www.w3.org/Talks/1998/0417-WWW7-RDF>


RDF Model

RDF:
• Based on a formal data model (directed labelled graphs)
• Syntax for interchange of data
• Schema model

[Diagram: a Resource is linked to a Value by a PropertyType, forming a Property. Example labels: page.html, Cost, £0.05, ValidUntil, 11-May-98]

[Diagram: RDF Data Model. Labels: page.html, £0.05, 11-May-98, Property, Cost, InstanceOf, ValidUntil, Value, PropObj, PropName]
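The data model can be sketched as a list of (resource, property type, value) statements, each one a labelled arc in a directed graph. The representation and helper below are assumptions for illustration only. In particular, attaching ValidUntil directly to page.html is a simplification; the slide's second diagram appears to attach it to the Cost property itself.

```python
from collections import namedtuple

# One arc of the graph: resource --property_type--> value.
Statement = namedtuple("Statement", ["resource", "property_type", "value"])

graph = [
    Statement("page.html", "Cost", "£0.05"),
    # Simplification: ValidUntil is attached to the page, not to the Cost arc.
    Statement("page.html", "ValidUntil", "11-May-98"),
]


def properties_of(statements, resource):
    """Collect every property type and value recorded for a resource."""
    return {s.property_type: s.value for s in statements if s.resource == resource}


page_properties = properties_of(graph, "page.html")
```

Because each statement stands alone, metadata from different sources can be merged by concatenating their statement lists.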


RDF Example

Example of Dublin Core metadata in RDF:

<?xml:namespace ns="http://www.w3.org/TR/WD-rdf/" prefix="rdf"?>
<?xml:namespace ns="http://purl.org/dublin_core/schema/" prefix="dc"?>
<rdf:RDF>
  <rdf:Description rdf:HREF="page.html">
    <dc:Creator>John Smith</dc:Creator>
    <dc:Title>John’s Home Page</dc:Title>
  </rdf:Description>
</rdf:RDF>
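A program can pull the Dublin Core fields out of such a record with a few lines of code. The sketch below assumes the same record rewritten with `xmlns:` attributes, since the draft processing-instruction syntax above is not accepted by current parsers. The namespace URIs are the ones on the slide, and the function name `dublin_core` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/TR/WD-rdf/"
DC_NS = "http://purl.org/dublin_core/schema/"

# The slide's record, with namespaces declared as attributes (an assumption).
DOC = """<rdf:RDF xmlns:rdf="{rdf}" xmlns:dc="{dc}">
  <rdf:Description rdf:HREF="page.html">
    <dc:Creator>John Smith</dc:Creator>
    <dc:Title>John's Home Page</dc:Title>
  </rdf:Description>
</rdf:RDF>""".format(rdf=RDF_NS, dc=DC_NS)


def dublin_core(xml_text):
    """Return {resource: {element name: value}} for the Dublin Core elements."""
    root = ET.fromstring(xml_text)
    records = {}
    for description in root.findall("{" + RDF_NS + "}Description"):
        resource = description.get("{" + RDF_NS + "}HREF")
        fields = {}
        for child in description:
            if child.tag.startswith("{" + DC_NS + "}"):
                # Strip the "{namespace}" part to leave the local name.
                fields[child.tag.split("}")[1]] = child.text
        records[resource] = fields
    return records


metadata = dublin_core(DOC)
```

The namespace URIs, not the `dc:` prefix, are what identify the elements as Dublin Core, so the code matches on the URI.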


Browser Support for RDF

Mozilla (Netscape's source code release) provides support for RDF.

Mozilla supports site maps in RDF, as well as bookmarks and history lists

See Netscape's or HotWired home page for a link to the RDF file.

[Screenshot labels: Trusted 3rd Party Metadata; Embedded Metadata, e.g. sitemaps]

Image from http://purl.oclc.org/net/eric/talks/www7/devday/


RDF Conclusion

• RDF is a general-purpose framework
• RDF provides structured, machine-understandable metadata for the Web
• Metadata vocabularies can be developed without central coordination
• Role for eLib projects in defining schemas?
• RDF Schemas describe the meaning of each property name
• Signed RDF is the basis for trust


Languages

Java
• Powerful platform independent object-oriented system with:
  – Language
  – Java Virtual Machine
  – Chip
  – OS
• Owned by Sun but being standardised by ISO
• Beware Microsoft Java DK
• "This is the year the performance problem is solved"
• See <URL: http://java.sun.com/>

ECMAScript
• Standardised version of JavaScript
• Important role in DHTML, DOM, XSL, ...
• See <URL: http://www.ecma.ch/stand/ecma-262.htm>


WAI

WAI (Web Accessibility Initiative):
• Ensures web specs address accessibility issues
• Based on universal design principles

Authoring:
• Page Author Accessibility Checklist and Guidelines draft at <URL: http://www.w3.org/TR/1998/WD-WAI-PAGEAUTH-0203>

Software:
• WAI Accessibility Guidelines: User Agent draft at <URL: http://www.w3.org/TR/WD-WAI-USERAGENT>

Note:
• JISC DISinHE project at Dundee University. See <URL: http://www.disinhe.ac.uk/>


Distributed Searching

Distributed searching important for the DNER (Distributed National Electronic Resource)

ROADS prototype provides cross-searching using whois++

AHDS prototype provides cross-searching using Z39.50
http://prospero.ahds.ac.uk:8080/ahds_live/


Distributed Searching Issues

Providing access to resources by software rather than by humans raises several issues:

• Loss of visibility of service / value-added web services
• Possible performance problems
• Information overload
• Finding the service

Solutions:
• Giving visibility and pointers in results sets
• Service metadata:
  – Service only available for cross-searching by non AC.UK users outside peak hours
  – Service covers UK Census data
• Need for agreed metadata standards (profiles, rights issues, …)
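The two solutions can be combined in a small sketch: each service carries metadata saying what it covers and when it accepts cross-searches, and every hit in the merged result set names the service it came from. The service names, URLs, records and the `Service` class are all hypothetical. A real gateway would query the services over whois++ or Z39.50 rather than search in-memory lists.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    url: str
    coverage: str
    peak_hours: range  # hours (0-23) when outside users are refused
    records: list = field(default_factory=list)

    def accepts(self, hour, user_domain):
        """AC.UK users are always accepted; others only outside peak hours."""
        if user_domain.lower().endswith(".ac.uk"):
            return True
        return hour not in self.peak_hours

    def search(self, term):
        return [r for r in self.records if term.lower() in r.lower()]


def cross_search(services, term, hour, user_domain):
    """Query every available service and label each hit with its origin."""
    hits = []
    for service in services:
        if not service.accepts(hour, user_domain):
            continue
        for record in service.search(term):
            hits.append({"record": record, "service": service.name, "url": service.url})
    return hits


census = Service("Census service", "http://census.example.ac.uk/", "UK Census data",
                 range(9, 17), ["1991 Census: Bath", "1991 Census: Dundee"])
archive = Service("Archive service", "http://archive.example.ac.uk/", "Historical texts",
                  range(0, 0), ["Bath city records"])

outside_hits = cross_search([census, archive], "bath", 10, "user.example.com")
academic_hits = cross_search([census, archive], "bath", 10, "lib.bath.ac.uk")
```

Keeping the service name and URL on every hit gives the originating service visibility even though the user never visits its own search page.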


Deployment Issues

More sophisticated deployment techniques can be adopted to overcome deficiencies in simple model

Original Model
[Diagram: HTML resource, Web server, browser]
• Web server simply sends file to client
• File contains redundant information (for old browsers) plus client interrogation support

Sophisticated Model
[Diagram: HTML / XML / database resource, Intelligent Web server, Server proxy, Client proxy, browser]
• Example of an intermediary

Intermediaries can provide functionality not available at client:
• DOI support
• XML support
• Format conversion
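A minimal sketch of the intermediary's decision: it inspects what the client says it accepts and either passes the stored XML through or converts it first. The use of the Accept header with the `text/xml` media type, the function names and the trivial converter are all assumptions for illustration, not a description of any particular proxy.

```python
from html import escape


def choose_format(accept_header):
    """Return "xml" if the client lists text/xml in its Accept header."""
    accepted = [item.split(";")[0].strip().lower() for item in accept_header.split(",")]
    return "xml" if "text/xml" in accepted else "html"


def wrap_as_html(xml_text):
    """Stand-in converter: show the XML source inside an HTML page."""
    return "<HTML><BODY><PRE>" + escape(xml_text) + "</PRE></BODY></HTML>"


def intermediary(resource_xml, accept_header, to_html):
    """Return (media type, body) suited to the requesting client."""
    if choose_format(accept_header) == "xml":
        return "text/xml", resource_xml
    return "text/html", to_html(resource_xml)


resource = "<PART-NO>M-471</PART-NO>"
for_new_browser = intermediary(resource, "text/html, text/xml;q=0.9", wrap_as_html)
for_old_browser = intermediary(resource, "text/html", wrap_as_html)
```

Putting the conversion in a proxy means neither the stored resource nor the old browser has to change.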


Conclusions

To conclude:
• Standards are important, especially for national initiatives, such as eLib
• Proprietary solutions are often tempting because:
  – They are available
  – They are often well-marketed and well-supported
  – They may become standardised
  – Solutions based on standards may not be properly supported by applications
• Intermediaries may have a role to play in deploying standards-based solutions
• Opportunity for involvement with standards bodies (e.g. W3C Working Groups)


Question Time

Any questions?