1

Accessing Multiple Resources via Z39.50

Paul Miller

Interoperability Focus

UK Office for Library & Information Networking (UKOLN)

P.Miller@ukoln.ac.uk http://www.ukoln.ac.uk/

UKOLN is funded by the Library and Information Commission, the Joint Information Systems Committee (JISC) of the Higher Education Funding Councils, as well as by project funding from the JISC and the European Union.

UKOLN also receives support from the Universities of Bath and Hull where staff are based.

2

Outline

• What is Z39.50?
• Some gory details
  – Attribute Sets, Profiles, and all…
• Maintenance and development
• What’s wrong with Z39.50?
• The Bath Profile
• The New Attribute Architecture
• How it’s used
• Tools, registries, etc.

See http://www.ariadne.ac.uk/issue21/z3950/

3

What is Z39.50?

• ANSI/NISO Z39.50–1995, Information Retrieval (Z39.50): Application Service Definition and Protocol Specification

• ISO 23950:1998, Information and Documentation — Information Retrieval (Z39.50) — Application Service Definition and Protocol Specification.

See http://lcweb.loc.gov/z3950/agency/1995doce.html

4

What is Z39.50?

“This standard specifies a client/server based protocol for Information Retrieval. It specifies procedures and structures for a client to search a database provided by a server, retrieve database records identified by a search, scan a term list, and sort a result set. Access control, resource control, extended services, and a ‘help’ facility are also supported. The protocol addresses communication between corresponding information retrieval applications, the client and server (which may reside on different computers); it does not address interaction between the client and the end-user.”

(Z39.50–1995, page 0).

See http://lcweb.loc.gov/z3950/agency/1995doce.html

5

Some gory details…

• Z39.50 follows client/server model
• But calls them Origin and Target

Client/origin

Server/target

6

Client/Server architecture

7

Client/Server architecture

8

Some gory details…

• Z39.50–1995 is divided into eleven ‘Facilities’

  – Initialization
  – Search
  – Retrieval
  – Result–set–delete
  – Browse
  – Sort
  – Access Control
  – Accounting
  – Explain
  – Extended Services
  – Termination.

See http://www.ariadne.ac.uk/issue21/z3950/

9

Facilities and Services

• Each Facility comprises at least one Service
• A Service facilitates a particular interaction between Origin and Target
• The three key services are Init, Search, and Present.

See http://www.ariadne.ac.uk/issue21/z3950/

10

Init

• The only Service of the Initialization Facility

• Origin–initiated

• Used to start a ‘Z–association’

• Origin requests a number of parameters under which the searches will be conducted

• Target responds, either accepting offered parameters or proposing others if necessary.

11

Search

• The only Service of the Search Facility

• Origin–initiated

• Used to actually conduct a search

• Origin specifies databases to be searched, attribute combinations, and query

• Target responds, identifying the number of matching results.

12

Present

• Main Service of the Retrieval Facility (along with Segment)

• Origin–initiated
  – although Target can initiate a Segment request if the result set is very large

• Used to return records to the user.

13

Init for dummies

Hello. Do you speak English?

Hello. Yes, I do. Let’s talk.

14

Search for dummies

Cool. Can I have anything you’ve got on a place called “Bristol”?

I’ve got 25 records matching your request, and here’s the first five. As you didn’t specify anything else, I’ve sent them to you in MARC, so I hope that’s OK.

15

Present for dummies

25, eh? Can I have the first ten, please? Oh, and I really don’t like MARC. If you can send Dublin Core that would be great, and if not I’ll settle for some SUTRS.

DC:Creator – blah
DC:Title – blah
…
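
The three "for dummies" exchanges can be sketched as a toy session in Python. This is only an in-memory illustration, not a real Z39.50 implementation: the `ToyTarget` class, its method names, the integer `version` parameter and the sample records are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTarget:
    """In-memory stand-in for a Z39.50 Target (server). Purely illustrative."""
    records: list                        # each record is a dict with a "title" key
    syntaxes: tuple = ("MARC", "SUTRS")  # record syntaxes this Target can return
    result_sets: dict = field(default_factory=dict)

    def init(self, requested_version):
        # Init: accept the Origin's proposal, or counter with what we support.
        supported = 3  # hypothetical highest version this toy Target speaks
        return {"accepted": True, "version": min(requested_version, supported)}

    def search(self, term, result_set="default"):
        # Search: keep the matches in a named result set and report the count.
        hits = [r for r in self.records if term.lower() in r["title"].lower()]
        self.result_sets[result_set] = hits
        return len(hits)

    def present(self, start, count, preferred, result_set="default"):
        # Present: return a slice of the result set, using the first syntax
        # in the Origin's preference list that this Target supports.
        syntax = next((s for s in preferred if s in self.syntaxes), None)
        if syntax is None:
            raise ValueError("no mutually supported record syntax")
        hits = self.result_sets[result_set][start - 1:start - 1 + count]
        return syntax, [r["title"] for r in hits]


target = ToyTarget(records=[{"title": f"Bristol item {n}"} for n in range(1, 26)])
session = target.init(requested_version=3)             # "Do you speak English?"
hit_count = target.search("bristol")                   # "Anything on Bristol?"
syntax, titles = target.present(1, 10, preferred=("Dublin Core", "SUTRS"))
print(session["version"], hit_count, syntax, len(titles))
```

Because this toy Target offers only MARC and SUTRS, the Origin's request for Dublin Core falls back to SUTRS, as in the dialogue.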

16

Now it gets hairy…

• To communicate successfully, Origin and Target need to use the same Attribute Set.
• An Attribute Set like Bib–1 defines six forms of Attribute:
  – Use
  – Relation
  – Truncation
  – Completeness
  – Position
  – Structure.

17

Use Attributes

• Define the ‘access points’ on which a search takes place
  – Title, author, subject, etc.

See http://lcweb.loc.gov/z3950/agency/defns/bib1.html

18

Relation Attributes

• Define the relationship between the search term and values stored in the database/index
  – Less than, greater than, equal to, phonetically matched, etc.

19

Truncation Attributes

• Define which part of the stored value is to be searched on
  – Beginning of any word, end of any word, etc.
  – With right truncation, ‘Smith’ finds ‘Smithsonian’ but not ‘Wordsmith’; with left truncation, the reverse.
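
The ‘Smith’ example can be checked with a small Python sketch. The `matches` helper and its three mode names are hypothetical, chosen only to illustrate the behaviour described on the slide; a real Target applies truncation inside its own index.

```python
def matches(search_term, index_term, truncation):
    """Toy matcher for the truncation behaviours named on the slide."""
    s, t = search_term.lower(), index_term.lower()
    if truncation == "right":   # the stored value may continue to the right
        return t.startswith(s)
    if truncation == "left":    # the stored value may continue to the left
        return t.endswith(s)
    if truncation == "none":    # the stored value must equal the search term
        return t == s
    raise ValueError(f"unknown truncation: {truncation}")


index = ["Smith", "Smithsonian", "Wordsmith"]
right = [t for t in index if matches("Smith", t, "right")]
left = [t for t in index if matches("Smith", t, "left")]
print(right)
print(left)
```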

20

Completeness Attributes

• Define how much of the stored index term must be in the search term
  – ‘Smith’ finds ‘Smith’, but not ‘Smithsonian’ or ‘the Smith’, etc.

21

Position Attributes

• Define where in the index the search term should be located
  – At the start of the field, anywhere, etc.

22

Structure Attributes

• Specify the form to be searched for
  – Word, phrase, date, etc.
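
To show how the six Bib–1 attribute types combine on a single search term, here is a minimal Python sketch that renders a query in the prefix notation (PQF) used by some Z39.50 toolkits. The slides do not mention PQF, so the notation is an assumption of this example, as are the `to_pqf` helper and the friendly value names; the numeric codes are a small subset taken from the Bib–1 definition.

```python
# Bib-1 attribute type numbers for the six types listed on the slides.
ATTRIBUTE_TYPES = {
    "use": 1, "relation": 2, "position": 3,
    "structure": 4, "truncation": 5, "completeness": 6,
}

# A handful of Bib-1 values; the full lists are in the Bib-1 definition.
VALUES = {
    "use": {"title": 4, "subject": 21, "author": 1003},
    "relation": {"less_than": 1, "equal": 3, "greater_than": 5, "phonetic": 100},
    "position": {"first_in_field": 1, "any": 3},
    "structure": {"phrase": 1, "word": 2, "date": 5},
    "truncation": {"right": 1, "left": 2, "none": 100},
    "completeness": {"incomplete_subfield": 1, "complete_field": 3},
}


def to_pqf(term, **attributes):
    """Render one search term plus named Bib-1 attributes as a PQF string."""
    if '"' in term or "\\" in term:
        raise ValueError("this sketch does not handle quotes or backslashes")
    parts = []
    for name in ATTRIBUTE_TYPES:          # emit attributes in type-number order
        if name in attributes:
            value = VALUES[name][attributes[name]]
            parts.append(f"@attr {ATTRIBUTE_TYPES[name]}={value}")
    parts.append(f'"{term}"')
    return " ".join(parts)


query = to_pqf("Smith", use="author", relation="equal",
               position="first_in_field", structure="word",
               truncation="right", completeness="incomplete_subfield")
print(query)
```

Spelling out all six attributes, rather than relying on a Target's defaults, is the approach the Bath Profile slides later describe as "minimisation of defaults".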

23

Record Syntaxes

• Record Syntaxes define the structure in which results are returned to the Origin.
  – This does not mean that Targets need to store data in these formats
• MARC
  – UKMARC, USMARC/MARC21, DANMARC, MARB, UNIMARC…
• SUTRS
  – Simple Unstructured Text Record Syntax
• GRS–1
  – Generic Record Syntax
• XML.

24

Profiles

• Groupings of Attribute Sets, Record Syntaxes, etc. to meet specific needs
• Disciplinary
  – Cultural Heritage (CIMI)
  – Geospatial (GEO)
• Geographic/Cultural/National
  – Texas Profile
  – OPAC Network for Europe (ONE)
  – Conference of European National Librarians (CENL)
• Functional
  – Collections Profile
• Etc.

25

Maintenance and Development

• Z39.50 Maintenance Agency
  – Based at Library of Congress, and officially responsible for upkeep of the standard
• ZIG
  – Z39.50 Implementors Group
  – Informal grouping of vendors, users and implementors who work to progress new areas of the standard
  – Next meeting in Texas in January
  – Likely to be at UKOLN in 2001.

See http://www.loc.gov/z3950/agency/

26

What’s wrong with Z39.50?

• Profiles for each discipline

• Defeats interoperability?

• Vendor interpretation of the standard

• Bib–1 bloat

• Largely invisible to the user

• Seen as complicated, expensive and old–fashioned

• Surely no match for XML/RDF/ whatever.

27

The Bath Profile

• System vendors implement areas of the Z39.50 standard differently

• Regional, National, and disciplinary Profiles have appeared over previous years, many of which have basic functions in common

• Users wish to search across national/regional boundaries, and between vendors.

See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/

28

Learning from the past

• The Bath Profile is heavily influenced by
  – ATS–1
  – CENL
  – DanZIG
  – MODELS
  – ONE
  – Z Texas
  – vCUC

See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/

29

Learning from the past

See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/

30

Doing the work

• ZIP–PIZ–L mailing list, hosted by National Library of Canada

• Meeting face–to–face
  – The UK’s Joint Information Systems Committee (JISC) supported a face–to–face meeting in Bath over the summer

• A draft, being widely circulated for comment.

See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/

31

What we proposed

• Minimisation of ‘defaults’
  – Where possible, every attribute is defined in the Profile (Use, Relation, Position, Structure, Truncation, Completeness)
• Three Functional Areas
  – Basic Bibliographic Search & Retrieval
  – Bibliographic Holdings Search & Retrieval
  – Cross–Domain Search & Retrieval
• Three or more Levels of Conformance in each Area.

See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/

32

What we proposed

• SUTRS and one of UNIMARC or MARC21 for Bibliographic Search results
  – Or all three at Level 1?

• SUTRS and Dublin Core (in XML) for Cross–Domain results

• Other record syntaxes also permitted, but conformant tools must support at least these.
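
The record syntax proposal on this slide can be expressed as two small checks in Python. This is a sketch of the draft rules exactly as stated here, not of any final Profile text; the function names and the `"DC-XML"` label for Dublin Core in XML are hypothetical.

```python
def supports_bibliographic(syntaxes):
    """SUTRS plus at least one of UNIMARC or MARC21, per the slide."""
    offered = set(syntaxes)
    return "SUTRS" in offered and bool(offered & {"UNIMARC", "MARC21"})


def supports_cross_domain(syntaxes):
    """SUTRS plus Dublin Core in XML, per the slide."""
    return {"SUTRS", "DC-XML"} <= set(syntaxes)


library_target = ["SUTRS", "MARC21", "GRS-1"]   # extra syntaxes are permitted
museum_target = ["SUTRS", "DC-XML"]

print(supports_bibliographic(library_target), supports_cross_domain(library_target))
print(supports_bibliographic(museum_target), supports_cross_domain(museum_target))
```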

33

The new Attribute Architecture

• Recognition of existing problems
• Probably 2–3 years away in mainstream implementations?
• Deals with Bib–1 bloat by identifying key attributes of value to multiple applications, and grouping them together
  – Utility Attribute Set (description of records)
  – Cross–Domain Attribute Set (description of resources, and closely related to Dublin Core element set)
  – Bib–2
  – etc.

34

The new Attribute Architecture

New Attribute Type          Relation to Bib–1 Attributes
Access Point                Use
Semantic Qualifier          new
Language                    new
Content Authority           new
Expansion/Interpretation    Truncation and some of Relation
Normalized Weight           new

35

The new Attribute Architecture

New Attribute Type          Relation to Bib–1 Attributes
Hit Count                   new
Comparison                  Most of Relation and part of Completeness
Format/Structure            Structure
Occurrence                  Completeness
Indirection                 new
Functional Qualifier        new

36

Using Z39.50

• Z39.50 widely deployed in the library sector and elsewhere, although often invisibly
  – The Origin can be either a human user or a second Origin computer
  – e.g. Z39.50 portals, summing resources from multiple targets
• Users access Z39.50 Targets using proprietary clients or, increasingly, via web interfaces
  – e.g. WinWillow, ZNavigator, many WOPACs.
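
The portal idea, one Origin summing resources from multiple Targets, can be sketched in Python as a fan-out and merge. Everything here is a stand-in: `make_target` returns a local function rather than a network connection, and the target names and titles are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def make_target(name, titles):
    """Return a hypothetical search function standing in for a remote Target."""
    def search(term):
        return [{"source": name, "title": t}
                for t in titles if term.lower() in t.lower()]
    return search


def portal_search(term, targets):
    """Send the same query to every Target in parallel and merge the hits."""
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        per_target = list(pool.map(lambda search: search(term), targets))
    merged = [hit for hits in per_target for hit in hits]
    # Sort by title, then by source, so the merged list has a stable order.
    return sorted(merged, key=lambda hit: (hit["title"].lower(), hit["source"]))


targets = [
    make_target("library-a", ["Bristol harbour", "Bath abbey"]),
    make_target("museum-b", ["Maps of Bristol", "Bristol harbour"]),
]
results = portal_search("bristol", targets)
for hit in results:
    print(hit["source"], "|", hit["title"])
```

The sketch keeps both copies of a title held by two sources; whether a portal should merge such duplicates is a design decision the slides leave open.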

37

Using Z39.50

© Arts & Humanities Data Service

38

Using Z39.50

© Arts & Humanities Data Service

39

Using Z39.50

© University of California

40

Using Z39.50

© University of California

41

Building the DNER

• Distributed National Electronic Resource
  – Policy aspiration of the Joint Information Systems Committee
  – Intended to provide greater access to JISC’s Current Content Collection
    – RDN
    – AHDS
    – MIMAS
    – EDINA
    – The Data Archive
    – EDUSERVE
    – eLib projects
    – etc.

42

Building the DNER

• Construction of Bath Profile–conformant Z39.50 Targets at data sources

• Construction of various Portals to facilitate access
  – ‘JISC Portal’ ?
  – Data Centre Portals
  – Subject Portals
  – Data Type Portals
  – Institutional Portals
  – Personal Portals ?

43

Building the DNER

• Remaining challenges
  – Authentication hell
    – Move from endless authentication to single authentication
  – Alignment of different data types
    – Ordnance Survey maps at Edinburgh
    – Satellite imagery in Manchester
    – Electronic journal articles in many formats, etc.
    – Census data at the Data Archive
    – Survey data in Manchester
    – Chemical structures in Manchester
  – Collection Level Description.
