A New Kind of Catalog Charley Pennell Principal Cataloger for Metadata North Carolina State University North Carolina Library Association 2007


A New Kind of Catalog

Charley Pennell

Principal Cataloger for Metadata

North Carolina State University

North Carolina Library Association 2007


Where is this talk headed?

Local motivation

National trends

What is Endeca?

Features

Does Endeca work?

Where are we going from here?

Where is everybody else going?


Why a new catalog?

What was wrong with the old one?


A little TRLN catalog primer

TRLN libraries (Duke, NCCU, NCSU, UNC-CH) jointly develop and maintain BIS, 1985-1992

DRA implemented for catalog (UNC & Duke continue Acq/Serials modules), 1991-1993

No integrated keyword/browse capability, 1993-1999

Web2 catalog implemented, 1999-

Sirsi & DRA “merge” in 2002; Taos DOA


A little TRLN catalog primer 2

NCSU & NCCU to Unicorn; Duke to Aleph; UNC-CH to Millennium, 2003-2004

Sirsi/Dynix merger, 2004: vendor focus shifts (even more) toward school/public market

While agreeing to continue to support Web2, S/D increasingly looking to merge all product catalogs into single interface


What was the catalog lacking?

Simplicity: a simple, hopefully uncluttered interface

Interactivity: ways to interact with results to get better results

Forgiveness: just fix my typos and case errors, don’t make me feel stupid!

Response time: always

Real-time sorting: the limit is how many?!!

Relevance ranking: as if!

Web services: use the Web to repurpose data, enable mash-ups, add-ons & improvements


Which interface is ready for immediate use?


So, why DOES everyone think that the catalog stinks?

"Most integrated library systems, as they are currently configured and used, should be removed from public view."

- Roy Tennant, OCLC


The old model


The integrated library system

Historically, the ILS developed as an inventory control system for use by library staff only

First library automation systems (Plessey, CLSI, Geac, Innovative) were designed around circulation or acquisitions functions

Interaction time was calibrated to the slow pace of backroom work where the audience was basically captive

Staff focus on known-item searching, not resource discovery


The catalog as part of the ILS

The first integrated OPACs were veneers on top of existing inventory management systems—patrons & staff competed for system resources! They still do!

First OPACs allowed for browse only; early keyword searching restricted to certain fields (A/T/S) only

Libraries with no IT support were stuck with what their vendor provided and the enhancement process for improvements

Libraries with IT support created their own systems: BIS, NOTIS, Claremont Colleges, Georgetown, PALS, DOBIS/LIBIS


The state of the ILS in 2007

Customer demands for increasing functionality in a marketplace with little $$ to spend have reduced the ILS vendor pool through mergers and buyouts

New functionality (multi-search, ERMS, E-Ref, ILL, etc.) increasingly being met by stand-alone and third party applications

Increasing competition from open source (Koha, Evergreen, Scriblio, LibraryThing) and e-commerce

Q: Is our dogged adherence to MARC the only thing keeping the remaining ILS vendors afloat?


The state of the catalog 2007

Library users’ search expectations have been conditioned by interactions with commercial Websites and Google, with which Libraries can barely afford to compete, but must

Libraries are becoming increasingly virtual as users interact with us online (e-resources, Second Life)

User expectations for online experiences are more interactive, instantaneous, and inviting


Perhaps most importantly…

The information resources represented in the catalog represent a shrinking percentage of what end users need or want

Calhoun’s Aristotelian vs. Copernican views of the catalog


What do users want from the OPAC?

Make subject searching in online catalogs easier using post-Boolean probabilistic searching with automatic spelling correction, term weighting, intelligent stemming, relevance feedback, and output ranking

Streamline users' book selection decisions at the catalog by adding tables of contents and back-of-the-book indexes to cataloging (i.e., metadata) records

Reduce the many failed subject searches by expanding the online catalog with full texts: journal and newspaper articles, encyclopedias, dissertations, government documents, etc.

Increase finding strategies in online catalogs through the library classification

-- Markey, Karen (2007). “The online library catalog: Paradise lost and paradise regained”, D-Lib Magazine, 13(1/2).


“Many researchers express surprise at the brevity (from one to three words) of the queries people submit to online systems. Belkin tells why so few words make up their queries, "Precisely because of the inquirer's lack of knowledge about a problem area, it is impossible to specify what would resolve it." For Belkin, the saving grace is the inquirer's ability to recognize what he or she wants or does not want during the course of the search. Therein lies an important solution to the problem—information systems that report results for easy eyeballing and instantaneous recognition of relevant possibilities.” – Karen Markey


What is an Endeca?


A software company based in Cambridge, MA

A search and information access technology provider for a number of major e-commerce websites

Developers of the Endeca Information Access Platform


Endeca features

Commercial-strength search/sort speeds

Site customizable relevance ranking

Faceted browse

True browsing (LC classification)

Spell-checking

“Did you mean?”

Automatic word stemming


Endeca at NCSU Libraries

Went live in January 2006

Works with a text version of a daily snapshot of Libraries’ MARC & other metadata

Used to improve the discovery portion of the library catalog

Interoperates with ILS for holdings, current availability status

Web2 interface still present for known item & authority searching


Implementation timeline

License / negotiation: Spring 2005

Acquire: Summer 2005

Implementation:
  August 2005: vendor training
  September 2005: finalize requirements
  October 2005 – January 2006: design and development
  January 12, 2006: go-live date

Widen to TRLN partners: Winter 2008


Implementation Team

Implementation Team brought together from IT, DLI, Cataloging, Collections, Reference, Circulation

Worked on indexing, UI, usability testing, etc.

Areas of contention
  Number of initial search boxes (1 or 2)
  Order, grouping of facets
  Placement of classification hierarchies, breadcrumbs
  Use of “search” and “browse” on tabs

Visualization aided by Tito’s wireframes


8th (and Final) Revision: Aggregate holdings information by library.

Reduces complexity of continuing and online resources.

Brief view vs. Full view gives user choice about displaying holdings.


NCSU Endeca features

Facets

Results

Call # browse

Breadcrumbs


Features we started with

Faceted browse

Availability facet

Breadcrumbs

Spell check / Did you mean

Hierarchical subject browse based on LCC

Fuzzy link to live Web2 data

New book browse for titles added in last week only


Features that we’ve added

New book browse based on relative date (last week, last month, last three months)

RSS feeds based on user results

“Search within” results

Send search to TRLN partners

Static unique link to live Web2 data


Relevance ranking

Based on locally customizable algorithm:
  Most relevant: query exactly as entered
  For multi-term searches: phrase match
  Field match
    title match more relevant than notes match
  Other factors:
    number of fields matched
    weighted frequency
    static ordering (publication date, circulation stats)
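
The ranking tiers on this slide can be read as a sort key compared tier by tier. The Python below is a minimal sketch of that idea, not Endeca's relevance module or NCSU's configuration: the field names, the weights, and the record layout are assumptions made for the illustration.

```python
# Illustrative sketch only: field names and weights are assumptions,
# not Endeca's or NCSU's actual configuration.
FIELD_WEIGHTS = {"title": 3, "subject": 2, "notes": 1}


def relevance_key(record, query):
    """Return a sort key; larger tuples rank higher."""
    q = query.lower().strip()
    terms = q.split()
    fields = {name: str(record.get(name, "")).lower() for name in FIELD_WEIGHTS}

    # 1. Most relevant: the query exactly as entered equals a whole field.
    exact = any(text == q for text in fields.values())
    # 2. Multi-term searches: the terms appear together as a phrase.
    phrase = len(terms) > 1 and any(q in text for text in fields.values())
    # 3. Field match: best weight among fields that contain every term.
    matched = [n for n, text in fields.items() if all(t in text for t in terms)]
    best_field = max((FIELD_WEIGHTS[n] for n in matched), default=0)
    # 4. Other factors: number of fields matched, weighted term frequency,
    #    then a static ordering (publication year in this sketch).
    frequency = sum(
        FIELD_WEIGHTS[n] * fields[n].count(t) for n in fields for t in terms
    )
    return (exact, phrase, best_field, len(matched), frequency, record.get("year", 0))


def rank(records, query):
    """Sort records from most to least relevant for the query."""
    return sorted(records, key=lambda r: relevance_key(r, query), reverse=True)
```

Because Python compares tuples left to right, an exact match always outranks a phrase match, which in turn outranks a title-only or notes-only match; the later elements only break ties.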


Faceting at the NCSU Libraries

Follows on what we have learned from the commercial Web search model

Mines metadata already available via MARC record, local class number, ILS item categories, circ status, and date stamping

Required massive clean-up of 6xx subdivisions

Allows both pre- and post-coordinate limits

Uses table mapping to enable drilling down through call number results
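
The table mapping mentioned on this slide can be pictured as a lookup from call number ranges to class labels, with facet counts computed one level at a time. The sketch below is a hypothetical illustration in Python: the four-row table, the label wording, and the function names are assumptions, and a real table would cover the whole LC Classification outline.

```python
import re
from collections import Counter

# Hypothetical excerpt of a mapping table: (class letters, low, high, label).
LCC_TABLE = [
    ("Q", 1, 9999, "Q - Science"),
    ("QA", 1, 9999, "QA - Mathematics"),
    ("QA", 75.5, 76.95, "QA 75.5-76.95 - Computer science"),
    ("Z", 1, 9999, "Z - Bibliography. Library science"),
]

CALL_NUMBER = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)")


def lcc_path(call_number):
    """Return class labels for a call number, broadest first."""
    match = CALL_NUMBER.match(call_number.upper())
    if not match:
        return []
    letters, number = match.group(1), float(match.group(2))
    hits = []
    for cls, low, high, label in LCC_TABLE:
        if cls == letters:
            applies = low <= number <= high
        else:
            # A shorter class (Q) covers its subclasses (QA, QC, ...).
            applies = letters.startswith(cls)
        if applies:
            # Sort by class length, then widest range first.
            hits.append((len(cls), low - high, label))
    return [label for _length, _width, label in sorted(hits)]


def facet_counts(call_numbers, parent=()):
    """Count the next-level labels beneath an already selected path."""
    depth = len(parent)
    counts = Counter()
    for call_number in call_numbers:
        path = lcc_path(call_number)
        if tuple(path[:depth]) == tuple(parent) and len(path) > depth:
            counts[path[depth]] += 1
    return counts
```

Calling `facet_counts(call_numbers)` gives the top-level refinements; passing the selected labels as `parent` gives the refinements one level down, which is the drill-down behaviour the slide describes.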


Facet refinements

Availability

Author

Library

Format

Language

New(ness)

LC Classification

Subject: Topic

Subject: Genre

Subject: Region

Subject: Era


A single facet need not represent data from a single field

Single Unicorn item types (Book, Kit, Manuscript, Map, Data set)

Multiple Unicorn item types (Audio, Microform, Thesis/Dissertation, Software & Multimedia, Videos)

Leader byte 07 (Bib lvl): Journal, Magazine

Library (Online)
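
As a rough sketch of how one Format facet could draw on several sources at once, the Python below combines item types, MARC leader byte 07, and the library code. The item-type codes and the "ONLINE" library code are hypothetical placeholders (real Unicorn item types are site-defined); only the meaning of leader/07 value "s" (serial) comes from the MARC format itself.

```python
# Hypothetical item-type codes; actual Unicorn item types are defined locally.
SINGLE_TYPE_FORMATS = {
    "BOOK": "Book",
    "KIT": "Kit",
    "MANUSCRIPT": "Manuscript",
    "MAP": "Map",
    "DATASET": "Data set",
}
MULTI_TYPE_FORMATS = {
    "Audio": {"AUDIO-CD", "AUDIO-TAPE"},
    "Microform": {"MICROFILM", "MICROFICHE"},
    "Videos": {"VIDEO-DVD", "VIDEO-VHS"},
}


def format_facets(leader, item_types, libraries):
    """Derive Format facet values for one record from three sources."""
    values = set()
    for item_type in item_types:
        # One item type maps to one format value ...
        if item_type in SINGLE_TYPE_FORMATS:
            values.add(SINGLE_TYPE_FORMATS[item_type])
        # ... or several item types collapse into one format value.
        for label, codes in MULTI_TYPE_FORMATS.items():
            if item_type in codes:
                values.add(label)
    # MARC leader position 07 is the bibliographic level; "s" means serial.
    if len(leader) > 7 and leader[7] == "s":
        values.add("Journal, Magazine")
    # The library (location) field supplies the Online value.
    if "ONLINE" in libraries:
        values.add("Online")
    return sorted(values)
```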


Ranking facet results by number of postings makes sense in a short list, but not in a long list
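
One way to act on this observation is to switch the ordering rule with the length of the list. The sketch below is an assumption for illustration, not how the NCSU interface behaves: the function name and the cut-off of 10 values are hypothetical.

```python
def order_facet_values(counts, max_ranked=10):
    """Order facet values by postings when few, alphabetically when many."""
    if len(counts) <= max_ranked:
        # Short list: highest number of postings first, ties by name.
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Long list: alphabetical order is easier to scan.
    return sorted(counts.items(), key=lambda kv: kv[0].lower())
```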


The author facet is less useful in some types of searches …


… than others!


Technical overview

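[Diagram: raw MARC data is exported and reformatted by NCSU into flat text files; within the Information Access Platform, the Data Foundry parses the text files into indices for the MDEX Engine; the NCSU Web Application queries the MDEX Engine over HTTP and is itself reached over HTTP.]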


MARC ingest

MARC converted to flat text file(s) for ingest by Endeca.

Transformation accomplished with MARC4J.

Opportunity to manipulate data on the back-end.
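
To make the ingest step concrete, here is a minimal sketch of flattening one parsed record into delimited text lines. NCSU did this in Java with MARC4J; the Python below is only an illustration. The pipe-delimited `property|value` layout, the property names, and the tag-to-property table are assumptions, not Endeca's required input format.

```python
# Assumed mapping from MARC tags to flat-file property names (illustrative).
PROPERTY_BY_TAG = {"100": "author", "245": "title", "260": "publisher", "650": "subject"}


def clean(value):
    """Back-end clean-up: trim trailing ISBD punctuation, protect the delimiter."""
    return value.strip().rstrip(" /:;,").replace("|", " ")


def flatten_record(record_id, fields):
    """Turn parsed fields, given as (tag, [(subfield code, value), ...]),
    into one 'property|value' line per mapped field."""
    lines = [f"id|{record_id}"]
    for tag, subfields in fields:
        prop = PROPERTY_BY_TAG.get(tag)
        if prop is None:
            continue  # tags outside the mapping are not exported
        text = " ".join(value for _code, value in subfields)
        lines.append(f"{prop}|{clean(text)}")
    return lines
```

The `clean` step stands in for the "opportunity to manipulate data on the back-end": any normalization applied here reaches the index without touching the records in the ILS.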


Transformed data


The end result…

Video


Other Endeca library catalogs

Phoenix Public Library: http://www.phoenixpubliclibrary.org/

McMaster University: http://libcat.mcmaster.ca

Florida Center for Library Automation: http://catalog.fcla.edu/

Individual Florida universities: http://fs.catalog.fcla.edu/, etc.


Does Endeca work?


Problems: authority control

Endeca is a keyword search engine; “browse” can only be effected using sort options

There is no authority control within Endeca itself, rather it relies on AC within ILS

To make use of available metadata, subjects were split along subdivisions. Authors were not

Talks were held with the vendor to explain the potential for drawing on authority x-refs to collocate searches


Problems: subject context

Problems with wrong delimiter values (esp. $v)

Problems maintaining context in atomized LCSH
  One-way relationships
    English language$vDictionaries$xSpanish
  Chronological headings devoid of geographic context
    Cuba$xHistory$yRevolution, 1959
  Phrase headings expressed in multiple subdivisions
    Prisoners$xAbuse of
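
The loss of context is easy to see once a heading is split by subfield code. The following Python sketch atomizes the slide's example headings into the four subject facets; the subfield-to-facet table is an assumption about how such a split might be configured, not NCSU's documented mapping.

```python
# Assumed mapping of 6xx subfield codes to facets (illustrative).
FACET_BY_SUBFIELD = {
    "a": "Subject: Topic",
    "x": "Subject: Topic",
    "v": "Subject: Genre",
    "y": "Subject: Era",
    "z": "Subject: Region",
}


def parse_heading(heading):
    """Split 'Cuba$xHistory$yRevolution, 1959' into (code, value) pairs."""
    first, *rest = heading.split("$")
    pairs = [("a", first)] if first else []
    pairs += [(part[0], part[1:]) for part in rest if part]
    return pairs


def atomize(heading, tag="650"):
    """Distribute the parts of one heading across facets."""
    facets = {}
    for code, value in parse_heading(heading):
        if tag == "651" and code == "a":
            facet = "Subject: Region"  # 651 $a is a geographic name
        else:
            facet = FACET_BY_SUBFIELD.get(code)
        if facet:
            facets.setdefault(facet, []).append(value)
    return facets
```

Atomizing `Cuba$xHistory$yRevolution, 1959` as a 651 leaves "Revolution, 1959" in the Era facet with nothing tying it to Cuba, and `Prisoners$xAbuse of` yields the free-floating topic "Abuse of", which are the two problems named on the slide.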


Problems: subject hierarchies

Chronological hierarchy not built into $y
  “19th century” does not subsume 1800-1809, 1801-1861, 1809-1817, 1815-1861, 1817-1825, Civil War, 1861-1865, etc.
  Geological periods exist as text only (Ordovician, Pleistocene, etc.)

Some chronological headings are expressed as text in 650$a
  Middle Ages
  Nineteen sixties

Geographic hierarchy not consistent between 651 and 650
  $zNorth Carolina$zRaleigh
  $aRaleigh (N.C.)

BT/NT/RT relationships from authority file lacking
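
A chronological hierarchy would have to be computed from the $y text, since the headings themselves do not carry one. The sketch below shows one possible approach in Python, parsing years out of the subdivision and testing containment. Treating "19th century" as 1800-1899 is an assumption chosen to match the examples on the slide, and the function names are hypothetical.

```python
import re

YEARS = re.compile(r"\b(\d{4})(?:-(\d{4}))?")
CENTURY = re.compile(r"^(\d{1,2})(?:st|nd|rd|th) century$", re.IGNORECASE)


def year_span(era):
    """Return (start, end) for a $y value, or None if it contains no years."""
    match = CENTURY.match(era.strip())
    if match:
        n = int(match.group(1))
        return ((n - 1) * 100, n * 100 - 1)  # assumed: 19th century = 1800-1899
    match = YEARS.search(era)
    if not match:
        return None  # text-only periods such as "Ordovician"
    start = int(match.group(1))
    # Open-ended ranges such as "1945-" are treated as a single year here.
    end = int(match.group(2)) if match.group(2) else start
    return (start, end)


def subsumes(broad, narrow):
    """True when the narrow era falls entirely within the broad era."""
    b, n = year_span(broad), year_span(narrow)
    return bool(b and n and b[0] <= n[0] and n[1] <= b[1])
```

Text-only eras return `None` and so never nest under anything, which mirrors the slide's point about geological periods and headings such as "Middle Ages".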


Some potential solutions

Search behavior education

FAST (Faceted Application of Subject Terminology)

Web2 x-refs to redirect searches to Endeca

Combining $z hierarchies

Hierarchy lists
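
Combining $z hierarchies could look something like the following Python sketch, which chains consecutive $z subdivisions into a path and rewrites a qualified 651 $a name into the same shape. The `QUALIFIERS` table, the " > " separator, and both function names are hypothetical; a real implementation would need a full table of jurisdiction abbreviations.

```python
import re

# Hypothetical lookup from heading qualifiers to the broader place name.
QUALIFIERS = {"N.C.": "North Carolina"}

QUALIFIED_NAME = re.compile(r"^(.*?)\s*\(([^)]+)\)$")


def region_paths(subfields):
    """Build cumulative paths from consecutive $z subdivisions."""
    paths, current = [], []
    for code, value in subfields:
        if code == "z":
            current.append(value)
            paths.append(" > ".join(current))
        else:
            current = []  # a non-$z subfield ends the geographic chain
    return paths


def normalize_651(name):
    """Rewrite 'Raleigh (N.C.)' as 'North Carolina > Raleigh' when possible."""
    match = QUALIFIED_NAME.match(name)
    if match and match.group(2) in QUALIFIERS:
        return f"{QUALIFIERS[match.group(2)]} > {match.group(1)}"
    return name
```

With both forms reduced to the same path, a 650 with `$zNorth Carolina$zRaleigh` and a 651 with `$aRaleigh (N.C.)` would land on the same Region facet value.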


What do our users think?


“The new Endeca system is incredible. It would be difficult to exaggerate how much better it is than our old online card catalog (and therefore that of most other universities). I've found myself searching the catalog just for fun, whereas before it was a chore to find what I needed.”

- NCSU Undergrad, Statistics

“The new library catalog search features are a big improvement over the old system. Not only is the search extremely fast, but seemingly it's much more intelligent as well.”

- NCSU faculty, Psychology


Usability testing

Task Difficulty: Old Catalog
  Easy 43%
  Medium 12%
  Hard 22%
  Failed 23%

Task Difficulty: New Catalog
  Easy 59%
  Medium 12%
  Hard 7%
  Failed 22%


Usability testing

Average Task Duration: Old vs New Catalog

[Bar chart: average duration of Tasks 1 through 10 for the old catalog and the new catalog; time axis runs from 00:00.0 to 03:36.0]


Usage statistics

Searches by Field Type: July 06 - Jan 07

[Bar chart: number of searches (axis 0 to 420,000) for each field type: Keyword (default), ISBN, Title, Author, Subject, Multi-Field]


Search and Navigation

Search 67%

Navigation 8%

Search -> Navigation 25%

Newness wearing off?

March ‘06 - May ‘06

July ‘06-January ‘07

Requests by Search Type

Search -> Navigation 29%

Navigation 20%

Search 51%


Navigation by Dimensions

Dimension            Share
Subject: Topic       26%
Availability         2%
LC Classification    21%
Format               10%
New                  10%
Library              10%
Subject: Genre       6%
Subject: Era         2%
Language             3%
Subject: Region      4%
Author               6%

July 06 – Jan 07


Navigation by Dimension (most used)

[Bar chart: requests per dimension (axis 0 to 140,000), listed as on the chart: Availability, Subject: Era, Language, Subject: Region, Author, Subject: Genre, Library, New, Format, LC Classification, Subject: Topic]

July 06 – Jan 07


Navigation by Dimension (order of UI presentation)

Dimension            Requests
Author               32,650
Language             16,009
Subject: Era         12,257
Subject: Region      22,818
Library              54,476
Format               57,667
Subject: Genre       34,096
Subject: Topic       145,589
LC Classification    120,644
Availability         9,286

July 06 – Jan 07


Where are we going from here?


Future directions

Additional hierarchies (geographic names, dates)

Make use of NAF, SAF, particularly cross-reference structure

Massage underlying metadata
  Addition of Date Cataloged – Done!
  Addition of LC Class numbers to e-resources – Done!
  FRBR work numbers/records? – Tested!
  FAST headings?

Accommodation of true browse for all indexes


Future opportunities

Expanding the scope of the implementation to the 10M records in TRLN (Duke, NCCU, NCSU, UNC-Chapel Hill)

Enrich catalog through external web services: book jackets, reviews, TOC, etc. – Amazon, OCLC, LibraryThing, Bowker Syndetics

Build use-case based cross-application shopping cart functionality

Integrate catalog w/other tools through web services: “Free the Data”


Web services…


Mobile device searching


Where is everybody else going?

Catalogs detaching themselves from ILS
  Detached data lends itself to experimentation
  Don’t have to throw out baby with bathwater when better interfaces come out
  Data itself safe and secure in ILS

MARC becoming superfluous; MARC’s granularity NOT!

Social interaction: reviews, folksonomic tags, ratings


Phoenix Public Library on Endeca


III’s new faceted catalog, Encore


ExLibris Primo at Vanderbilt


Athens County, OH—Koha Zoom open source


Georgia PINES—Evergreen open source


Casey Bisson’s Scriblio


Danbury Public powered by LibraryThing


OCLC WorldCat Local at UW


Thanks for listening!

Charley Pennell

Principal Cataloger for Metadata

NCSU Libraries

North Carolina State University

Raleigh, NC 27695-7111

[email protected]

More info at: http://www.lib.ncsu.edu/endeca/