The DeathFlip Project: Killing Off Your Authors Practically Painlessly
Michael Kreyche, Systems Librarian
Amey Park, Database Maintenance Librarian
Kent State University
May 18-20, 2009



Summary

At a Glance

LC Policy Change—Adding Death Dates

Record Maintenance Nightmare

Brainstorming for Solutions

"DeathFlip" Authority Records

Mostly Automated Procedures

Record Creation


The Short Version:

library.kent.edu/deathflip


Typical DeathFlip Record

00239nz 2200097n 4500

008 080125n| acannaab |n aaa ¶

100 1 ‡a Burns, George, ‡d 1896-1996¶

400 1 ‡a Burns, George, ‡d 1896-¶

667 ‡a DeathFlip¶

670 ‡a 20060201¶

910 ‡a n 79065048 ¶


The General Idea

1. Load all the DeathFlip records. Twice.

2. Wait for AACP to run overnight.

3. Check your “Updated bibliographic headings” report.

4. Gloat.

5. Check your “Near matches” report.

6. Do a little cleanup.


FAQ

How many DF records do you have?
– 18,358 so far.

Did you say load them TWICE?
– Yes, once as names and once as subjects.

Should I load them all?
– Yes.

Won’t I get a lot of blind references reported?
– Yes. But it’s OK.

Won’t I get a lot of duplicates reported?
– Yes. But it’s OK. Really.


Interesting equation: b + d = tDF

If:
– Your authority file is up to date and
– You load a DeathFlip file

The following equation holds true for your reports:

Blind references + Duplicate records = Total DeathFlip records in file


20-Year-Old LC Policy

“Don't add dates to existing headings”

Criticized by members of the library community, especially when headings
– are for famous dead people and
– include a birth date but lack a death date

Examples:
• 100 1# $a Warhol, Andy, $d 1928-
• 100 1# $a Dali, Salvador, $d 1904-
• 100 1# $a Nixon, Richard M. $q (Richard Milhous), $d 1913-
• 100 0# $a Diana, $c Princess of Wales, $d 1961-
• 100 0# $a John Paul $b II, $c Pope, $d 1920-


Death Dates Chronology

June 2005: LC Proposal

September 2005: LC Decisions

Fall 2005: LCRI 22.17 revision

2006: Big LC project to add death dates to "selected" headings

2007: KSU finds a solution


LC's Proposed Change, 2005

“Allow the optional addition of dates (birth, death or both) to existing personal name headings at will.... Catalogers would not be required to add dates to existing personal name headings (other than to resolve conflicts) but may exercise judgment and add the date or dates if these are judged to be useful.”

Comments?

http://www.loc.gov/catdir/cpso/pndatesorig.html


Comments

93 “yes” comments: wholeheartedly supporting the proposal as posted

28 “partial yes” comments: ... not the addition of dates “at will” ... reduce the initial impact of BFM*

12 “non-voting” comments

6 “no” comments: totally disapproved ... generally cited the impact of BFM*

* Bibliographic File Maintenance


A Few Specific Comments

Provide notification lists of the names changed in order to expedite BFM in local catalogs

Retain former heading in a 400 field to expedite BFM or machine “flipping” in some local systems

Use “b.” for all beginning dates, thus eliminating open dates altogether


LC's Key Decisions

Allow the optional addition of death dates to established headings that contain birth dates only

Investigate the development of a notification service for changed headings

Investigate changes to the MARC 21 authority format for coding “former headings” in a discrete MARC tag


OCLC Feed and Archive

http://www.oclc.org/rss/feeds/authorityrecords/default.htm


MARC Revision Chronology

December 2006: proposal published (http://www.loc.gov/marc/marbi/2007/2007-02.html)

January 2007: amended/approved by MARC Advisory Committee

May 2007: approved by LC/LAC/BL

October 2007: MARC21 Authorities Update 8

Proposal No. 2007-02: "Incorporating invalid former headings in 4XX fields of the MARC 21 Authority Format"


MARC21 Authorities Update 8

Two Changes
– Added code "h" for 4xx $w, byte 1 (No reference structures): Indicates that the reference is not valid in any reference structure [i.e. name/subject/series].
– Expanded usage of 4xx $i: When subfield $w/1 contains code h (No reference structures), subfield $i may contain the date that a heading became invalid.


Suppressing Display of “Flip” 4xx Fields

Existing coding (seems good enough to me):

100 1 ‡a Burns, George, ‡d 1896-1996¶

400 1 ‡w nnea ‡a Burns, George, ‡d 1896-¶

New coding:

100 1 ‡a Burns, George, ‡d 1896-1996¶

400 1 ‡w nhe ‡a Burns, George, ‡d 1896-¶
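The two ‡w values above differ only in which position carries the "do not show this reference" signal. The following is a minimal Python sketch of how a script might read them, assuming the usual four-position layout of the 4xx ‡w control subfield (special relationship, tracing use restriction, earlier form of heading, reference display) and assuming that omitted trailing positions can be treated as "n". The function names and the "flip only" test are illustrative, not part of any ILS or of the DeathFlip scripts.

```python
# Position names for the 4xx control subfield ‡w, as I read the MARC 21
# Authority format. The names are labels chosen for this sketch.
POSITIONS = (
    "special_relationship",  # ‡w/0
    "tracing_use",           # ‡w/1: "h" = no reference structures (Update 8)
    "earlier_form",          # ‡w/2: "e" = earlier established form of heading
    "reference_display",     # ‡w/3: "a" = reference not displayed
)


def decode_w(value: str) -> dict:
    """Split a ‡w value into its four positions; missing trailing positions become 'n'."""
    padded = value.ljust(len(POSITIONS), "n")
    return dict(zip(POSITIONS, padded))


def is_flip_only(value: str) -> bool:
    """Hypothetical test: is this 4xx a former heading kept only so headings can flip?"""
    w = decode_w(value)
    hidden = w["tracing_use"] == "h" or w["reference_display"] == "a"
    return w["earlier_form"] == "e" and hidden


print(decode_w("nnea"))
print(decode_w("nhe"))
```

Under these assumptions both the existing coding (nnea) and the new coding (nhe) are recognized as flip-only references, which is the point of the slide: either one keeps the old heading out of the public display.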


For the Time Being:

No LC 4xx for “Former Headings”

AACP can't flip the bib headings automatically.

Each new death date requires global changes or manual correction

Very unhappy authority control librarian!


Workflow Implications

Records from Vendor:
– updated authority record only
  • heading shows up on blind reference list
  • use global update when more than a few records
  • manual editing for onesies/twosies
– with a "current" bib record
  • older bibliographic records not reported: split file!!!
  • check all headings on the Death Date list?


Vendor Option (Backstage)

Developed for Wofford College
– Supply brief records to flip headings
– Supply full records to replace brief ones
– 72% increase in cost over existing service

Can we do it ourselves?


DeathFlip Plan A

Create DF records from OCLC feed

Load DF records with custom table
– Match existing records
– Add identifying field

Output a copy of matched records

Overlay matched records with DF records

Wait overnight for AACP

Overlay DF with original records


Thinking the Unthinkable

What if:

We loaded all the DeathFlip records
– Even if it meant two records per heading
– Even if we didn’t have bib headings

Would AACP work?

Yes!


DeathFlip Plan B

Create DF records from OCLC feed

Load All DF records
– Some will "duplicate" real records
– Some will report as blind references

Wait overnight for AACP

Suppress/Delete DF records

Repeat as needed


Typical Basic Workflow

Day 1
– Clear headings reports
– Load DeathFlip records

Day 2
– Generate reports
– Suppress or delete records
– Clear reports (wait till next day if suppressed)

Names/Subjects can be done separately


Last Quarterly Loads (March 2009)

                       Subjects    Names      All
Records loaded            1,115    1,115
Fields updated              247    3,497    3,744
Bib records updated         133    2,493    2,626
Headings                     24      333      577


Timing

Load DF before vendor records
– Flips retro records and the newly returned records.
– The report from loading bslw (Backstage) records shows lots of duplicate records, but these can pretty much be ignored or scanned quickly. Blinds are worth looking at.

Otherwise:
– New records with a death date report as blind because non-current bib records haven't been flipped yet. You have to look at all of them because some really are legit (especially conference and corporate names).


Initial Record Loads (December 2007)

                    Subjects    Names      All
Records loaded         9,949    9,949
Fields updated           461    4,078    4,539
Near matches             617      419    1,036

Headings from February 2006-August 2007


Testing the equation: b + d = tDF

                          Subjects    Names      All
tDF (records loaded)         9,949    9,949
b (blind references)         8,280    4,845
d (duplicate records)        1,554    4,965
b + d                        9,834    9,810
Missing records                115      139      254
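The arithmetic in this table can be re-run for any load. The following is a small Python sketch using the figures from the slide; the function name and the dictionary layout are illustrative only.

```python
# Figures copied from the "Testing the equation" slide (initial load, December 2007).
loads = {
    "subjects": {"tDF": 9949, "b": 8280, "d": 1554},
    "names": {"tDF": 9949, "b": 4845, "d": 4965},
}


def missing_records(tdf: int, blind: int, duplicate: int) -> int:
    """DeathFlip records that appear on neither the blind nor the duplicate report."""
    return tdf - (blind + duplicate)


missing = {
    index: missing_records(row["tDF"], row["b"], row["d"])
    for index, row in loads.items()
}
total_missing = sum(missing.values())

print(missing)        # {'subjects': 115, 'names': 139}
print(total_missing)  # 254
```

A nonzero result is the signal the equation is meant to give: those records were loaded but reported as neither blind nor duplicate, so they are the ones worth tracking down.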


Identifying Missing Authority Records

Tentative process:

1. Make a list of loaded DF records; export record number, 910, 1xx subject, 1xx name to text file

2. Make a list from the blind/duplicate report; export record number, 1xx subject, 1xx name to text file

3. Merge and sort text files; identify non-duplicated record numbers

4. Look up in OCLC by LCCN (910)
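Step 3 amounts to a set difference on record numbers. The following is a rough Python sketch under some assumptions that are not in the slides: both exports are tab-delimited, the record number is the first column in each, and the 910 is the second column of the loaded-records export. The record numbers and the second LCCN in the sample data are made up for illustration.

```python
import csv
import io


def read_rows(text: str) -> list[list[str]]:
    """Parse a tab-delimited export into rows, skipping empty lines."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]


def unreported(loaded_text: str, report_text: str) -> list[tuple[str, str]]:
    """Return (record number, 910 LCCN) for loaded DF records missing from the report."""
    reported = {row[0] for row in read_rows(report_text)}
    return sorted(
        (row[0], row[1]) for row in read_rows(loaded_text) if row[0] not in reported
    )


# Made-up sample exports: record number, 910, 1xx heading.
loaded = "a1000001\tn 79065048\tBurns, George, 1896-1996\n" \
         "a1000002\tn 00000000\tExample, Name, 1900-1990\n"
# Made-up blind/duplicate report: record number, 1xx heading.
report = "a1000001\tBurns, George, 1896-1996\n"

for record_number, lccn in unreported(loaded, report):
    print(record_number, lccn)
```

The LCCNs that come out of this are the ones to look up in OCLC in step 4.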


Timing for duplicate detection

Duplicate detection works best when your database is more up to date than the DeathFlip records.

If you get records from a vendor, load the DeathFlip records after the vendor records.


OCLC's RSS Feed


Processing the Feed—Raw Code


Step 1: Download Feed

Perl script saves current feed
– Covers about the last two months
– Older data migrated to HTML archive

Two stages:
– Initially, harvested archive
– Harvest feed at least every two months

Character encoding changes over time
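The project's own harvester is a Perl script that is not reproduced in these slides. As a rough illustration of the same idea in Python, the sketch below fetches the feed and stores each harvest as raw bytes in a dated file, so that nothing is overwritten and any change in character encoding can be dealt with later. The URL is the one given on the "OCLC Feed and Archive" slide and may no longer be valid; the function names and the file naming scheme are assumptions.

```python
import datetime
import pathlib
import urllib.request

# Feed address as shown on the "OCLC Feed and Archive" slide; it may have moved since.
FEED_URL = "http://www.oclc.org/rss/feeds/authorityrecords/default.htm"


def fetch_feed(url: str = FEED_URL) -> bytes:
    """Download the feed and return the raw bytes without decoding them."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def save_snapshot(raw: bytes, directory: pathlib.Path, day: datetime.date) -> pathlib.Path:
    """Write one harvest to a dated file so earlier harvests are never overwritten."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"feed-{day.isoformat()}.xml"
    target.write_bytes(raw)
    return target
```

A scheduled job could call `save_snapshot(fetch_feed(), pathlib.Path("harvest"), datetime.date.today())` more often than every two months, which leaves some overlap between harvests rather than a gap.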


Step 2: Extract Data

Another Perl script
– Parses feed
– Produces tab-delimited file
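The slides do not show how the feed items are laid out, so the following Python sketch only illustrates the general shape of this step: it copies the standard RSS 2.0 `title`, `pubDate` and `description` of every `item` into one tab-delimited line. Which of those elements actually carries the old and new headings is an assumption left open here, and the sample item is made up.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard RSS 2.0 item elements; which ones hold the headings is not stated in the slides.
COLUMNS = ("title", "pubDate", "description")


def feed_to_tsv(feed: bytes) -> str:
    """Turn every RSS <item> into one tab-delimited line of its COLUMNS."""
    root = ET.fromstring(feed)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for item in root.iter("item"):
        writer.writerow([(item.findtext(tag) or "").strip() for tag in COLUMNS])
    return out.getvalue()


# Made-up sample item, for illustration only.
sample = (
    b'<rss version="2.0"><channel><item>'
    b"<title>Burns, George, 1896-1996</title>"
    b"<pubDate>Wed, 01 Feb 2006 00:00:00 GMT</pubDate>"
    b"<description>sample text</description>"
    b"</item></channel></rss>"
)
print(feed_to_tsv(sample), end="")
```

Passing the raw bytes to the parser lets it honor whatever encoding the feed declares, which matters if the encoding changes between harvests.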


Step 3: Build MARC Record

Yet another Perl script
– Fixes delimiters and other characters
– Restructures data into MARC
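To make the target of this step concrete, here is a small Python sketch that lays out the variable fields of a DeathFlip record in the display form used on the "Typical DeathFlip Record" slide. It is not the project's Perl script and it does not produce binary MARC (no leader, directory or 008); the class and field names are invented for the sketch, and treating the 670 as a date string is an inference from the sample record.

```python
from dataclasses import dataclass


@dataclass
class FlipData:
    new_heading: str            # goes in 100, e.g. "‡a Burns, George, ‡d 1896-1996"
    old_heading: str            # goes in 400, e.g. "‡a Burns, George, ‡d 1896-"
    date: str                   # goes in 670 (YYYYMMDD on the sample record)
    lccn: str                   # goes in 910
    first_indicator: str = "1"  # "0" for forename headings such as "Diana"


def deathflip_fields(data: FlipData) -> list[str]:
    """Return the variable fields of a DeathFlip record as display lines."""
    return [
        f"100 {data.first_indicator} {data.new_heading}",
        f"400 {data.first_indicator} {data.old_heading}",
        "667 ‡a DeathFlip",
        f"670 ‡a {data.date}",
        f"910 ‡a {data.lccn}",
    ]


burns = FlipData(
    new_heading="‡a Burns, George, ‡d 1896-1996",
    old_heading="‡a Burns, George, ‡d 1896-",
    date="20060201",
    lccn="n 79065048",
)
print("\n".join(deathflip_fields(burns)))
```

The 667 note is what makes the records easy to find again when it is time to suppress or delete them, and the 910 keeps the LCCN available for lookups in OCLC.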


Occasional Retrospective Flip?

Unsuppress or reload earlier DF records to fix newly loaded bib records

March 2008 (3 months after first load):
– 17 fields changed

May 2009 (1½ years after first load):
– 122 fields changed for 36 headings

How long is it worthwhile???


Outcomes

Pluses
– Saves a huge amount of work
– Can find missing authority records

Minuses
– Doesn't catch TOC fields (970)
– Doesn't catch headings with $e, $t, etc.
– Sometimes the DF headings get updated


Original:
– Gimbutas, Marija Alseikaitė,|d1921-

DeathFlip:
– Gimbutas, Marija Alseikaitė,|d1921-1994

Latest:
– Gimbutas, Marija,|d1921-1994
– This heading reported as blind.


Go For It!

library.kent.edu/deathflip


Bibliography

Addition of Death Dates to Personal Names. http://www.loc.gov/catdir/cpso/pndates.html. See also:

– http://www.loc.gov/catdir/cpso/pndatesorig.html

– http://www.loc.gov/catdir/cpso/deathdates.pdf

LC Proposal, Addition of Dates to Existing Personal Name Headings. June, 2005. http://www.loc.gov/catdir/cpso/pndatesorig.html

LC Analysis of Comments Received and Corresponding Decisions. September, 2005. http://www.loc.gov/catdir/cpso/deathdates.pdf

Policy on the Implementation of Revised LCRI 22.17, Notice of OCLC RSS Feed for Local Bibliographic File Maintenance. Fall 2005. http://www.loc.gov/catdir/cpso/lcri22_17imp.html

Park, Amey L. Death Dates Added to Some Personal Name Headings. TECHKNOW: A Quarterly Review of Bright Ideas For the Technical Services Division. Ohio Library Council. http://www.library.kent.edu/files/TechKNOW_July_2006.pdf