Authority Control and Bib Enhancement with Marcive
Mark Sandford
William Paterson University
http://goo.gl/S5FyR8
Cheng Library
➔ Voyager 8.2
➔ Approx 435,000 records
➔ Over 100,000 Ebrary records
◆ Authority control on creation, but not ongoing
➔ Thousands of books withdrawn in the past 8 years
➔ Inconsistent handling of problematic headings in Voyager batch processing of authorities
◆ E.g. Name-title entries
Authority Control @ WPU
➔ Initial contract with MARCIVE about 10 years ago
➔ Regular incremental updates
➔ Plan was to do full authority run every 10 years
➔ Had heard of RDA-ification through MARCIVE marketing materials
➔ It was time
➔ Time was running out!
Our choices
➔ Name Authority Files
➔ LCSH Authority Files
➔ MESH Authority Files
➔ Additional processing
◆ Addition of RDA CMC (33X fields)
◆ Kept GMD (Just in case (in case of what?))
◆ 260 to 264 field*
◆ Lexile, Accelerated Reader, Reading Counts (Curriculum Materials)
◆ Convert relator codes to relator terms
◆ Various cleanup/tidying of fields and normalization
* Required update to our VUFind installation
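To make the relator conversion concrete, here is a minimal sketch of turning a relator code (‡4) into a relator term (‡e). It is an illustration only, not MARCIVE's actual processing: the function name, the tuple representation of subfields, and the sample field are assumptions, and the table is a small subset of the MARC relator code list.

```python
# Small illustrative subset of the MARC relator code list.
RELATOR_TERMS = {
    "aut": "author",
    "edt": "editor",
    "ill": "illustrator",
    "trl": "translator",
    "prf": "performer",
}


def relator_code_to_term(subfields):
    """Replace each ('4', code) subfield with an ('e', term) subfield.

    `subfields` is a list of (subfield_code, value) tuples (a hypothetical
    representation). Unknown relator codes are left untouched so that no
    data is lost.
    """
    converted = []
    for code, value in subfields:
        term = RELATOR_TERMS.get(value.strip().lower()) if code == "4" else None
        converted.append(("e", term) if term else (code, value))
    return converted


# Hypothetical 700 field, shown without MARC punctuation for simplicity.
field_700 = [("a", "Sendak, Maurice"), ("4", "ill")]
print(relator_code_to_term(field_700))
# → [('a', 'Sendak, Maurice'), ('e', 'illustrator')]
```

Leaving unknown codes in place is a deliberate choice in this sketch: a code that is missing from the table stays visible for later review instead of being dropped.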
Working with MARCIVE
➔ Library Information Services and Cataloging completed 11-page profile
➔ No custom processing was needed
◆ A la carte options worth a look
➔ Profile sent May 16
➔ Bib file sent to MARCIVE June 5
◆ No existing bibs should be changed as of this date!
➔ June 9 confirmation email from MARCIVE with notification of 2 bad records in our file
➔ June 11 test file received
➔ June 12 approval sent to MARCIVE
➔ June 19 processed MARC files available via FTP
Loading into Voyager
➔ MARCIVE provides batches of 50,000 records
➔ Voyager suggests loads of 10,000 per batch
➔ MarcEdit to the rescue
➔ Already have Voyager 001s in records, so no complex matching needed
➔ Start Loading!
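The split itself was done in MarcEdit. For readers who prefer a script, the same idea can be sketched in a few lines of Python, relying only on the fact that every binary MARC (ISO 2709) record ends with the record terminator byte 0x1D. The function name and the fake data are assumptions for illustration, and a production splitter would more safely use the record length in each leader.

```python
RECORD_TERMINATOR = b"\x1d"  # ISO 2709 end-of-record byte used by binary MARC


def split_marc(data, batch_size=10_000):
    """Split raw binary MARC bytes into chunks of at most batch_size records."""
    records = [
        record + RECORD_TERMINATOR
        for record in data.split(RECORD_TERMINATOR)
        if record.strip()  # skip the empty tail after the last terminator
    ]
    return [
        b"".join(records[i:i + batch_size])
        for i in range(0, len(records), batch_size)
    ]


# Fake "records" standing in for a MARCIVE file; real input would be read
# with open(path, "rb").read().
fake_file = b"".join(b"record%d" % n + RECORD_TERMINATOR for n in range(25))
batches = split_marc(fake_file, batch_size=10)
print([batch.count(RECORD_TERMINATOR) for batch in batches])  # → [10, 10, 5]
```

Applied to a 50,000-record file with the default `batch_size`, this would yield five chunks of 10,000 records each, matching the load size Voyager suggests.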
Keyword Indexing (Or How NOT to import)
➔ Pre Voyager 9.0 (we were on 8.2)
◆ Default import performed keyword indexing
◆ Generates a temporary index file
◆ Each record loaded takes a tiny bit longer to load than previous record
◆ If temporary file gets too large, the system crashes
➔ Required nightly keyword index regeneration
➔ Only 20k to 30k records per day loaded
◆ 1 example log showed 10k records loaded in 2 hours 14 minutes
◆ The NEXT 10k took 4 hours
No Keyword Index (How to import)
➔ Pre Voyager 9.0, Pbulkimport -X (NOKEY: disable keyword index and maintenance)
➔ As of Voyager 9.0, this is the DEFAULT for Pbulkimport
➔ Each batch of 10k took just under 1 hour
➔ 3 loads per day (with keyword indexing) vs 8 per day (without)
➔ Single Keyword Regen at project completion
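The difference adds up quickly over a whole file. The following back-of-the-envelope sketch uses only figures from the slides (10k batches, the two log timings, 3 vs 8 loads per day) plus the 435,866-record total from the MARCIVE summary. It assumes every record was reloaded, which the slides do not state, and the function names are made up for illustration.

```python
import math

TOTAL_BIBS = 435_866   # total bibliographic records in the MARCIVE summary
BATCH_SIZE = 10_000    # records per Voyager load


def seconds_per_record(hours, minutes, records):
    """Average load time per record for one logged batch."""
    return (hours * 3600 + minutes * 60) / records


def days_to_load(total_records, batch_size, loads_per_day):
    """Whole working days needed to load every batch at a given daily rate."""
    batches = math.ceil(total_records / batch_size)
    return math.ceil(batches / loads_per_day)


# With keyword indexing: the first logged batch vs. the next one.
print(seconds_per_record(2, 14, BATCH_SIZE))  # → 0.804
print(seconds_per_record(4, 0, BATCH_SIZE))   # → 1.44

# Assumption: the full file is reloaded, in 44 batches of up to 10k.
print(days_to_load(TOTAL_BIBS, BATCH_SIZE, loads_per_day=3))  # → 15
print(days_to_load(TOTAL_BIBS, BATCH_SIZE, loads_per_day=8))  # → 6
```

Under those assumptions the -X switch turns roughly three weeks of working days into about one, before counting the nightly keyword regenerations it also avoids.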
Authorities Records Loads
➔ Decision was made not to wipe out existing authorities database
➔ Voyager retains heading ids to match on – easy!
➔ Also loaded in batches of 10k
➔ Kept the -X switch, but no idea if it mattered
Result logs
In addition to the bib file, we received the following log files:
➔ Reader notes report (for a small fee)
➔ Unrecognized Headings for Corporate Names
➔ Unrecognized Headings for Geographic Names
➔ Unrecognized Headings for LCSH
➔ Unrecognized Headings for Meeting Names
➔ Unrecognized Headings for MESH
➔ Unrecognized Headings for Personal Names
➔ Unrecognized Headings for Series
Some numbers
435866 Total bibliographic records were input
110949 Bibliographic records were changed
435866 Total bibliographic records were output
2195243 Total headings selected for processing
2028671 Authorized headings verified without change
55081 Authorized headings required modification
791 Verified by quality assurance review
18160 Recognized unauthorized headings replaced
2380 Headings matched multiple authorized forms
7634 Undifferentiated personal names
82526 Unrecognized headings left unchanged
312216 Subject geographic subdivisions processed
650388 Subject general subdivisions processed
262315 Subject form subdivisions processed
383 Non-filing indicators corrected
22 $h subfields corrected in a 245 field
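As a quick sanity check on the summary above, the seven heading outcomes can be added up and expressed as shares of the total. This is a small sketch using only the figures reported on this slide; the dictionary labels are shortened paraphrases of the report lines.

```python
TOTAL_HEADINGS = 2_195_243  # "Total headings selected for processing"

outcomes = {
    "verified without change": 2_028_671,
    "required modification": 55_081,
    "verified by QA review": 791,
    "unauthorized headings replaced": 18_160,
    "matched multiple authorized forms": 2_380,
    "undifferentiated personal names": 7_634,
    "unrecognized, left unchanged": 82_526,
}

# The seven outcome counts account for every heading processed.
assert sum(outcomes.values()) == TOTAL_HEADINGS

for label, count in outcomes.items():
    print(f"{label:<36}{count:>10,}{count / TOTAL_HEADINGS:>9.2%}")

# Share of bibliographic records that MARCIVE changed.
print(f"bibs changed: {110_949 / 435_866:.2%}")
```

The outcome counts sum exactly to the reported total; about 92% of headings were verified without change, and roughly a quarter of the bibliographic records were modified.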
LC adult subject headings processed in 6XX(,0)
807771 Total headings processed
799910 99.00% Authorized headings verified without change
2221 0.29% Authorized headings required modification
101 0.01% Verified by quality assurance review
4254 0.54% Recognized unauthorized headings replaced
35 0.00% Headings matched multiple authorized forms
1250 0.15% Unrecognized headings left unchanged
100 Field Personal Names
309813 Total headings processed
288709 93.19% Authorized headings verified without change
6005 1.94% Authorized headings required modification
32 0.01% Verified by quality assurance review
2698 0.87% Recognized unauthorized headings replaced
704 0.23% Headings matched multiple authorized forms
2638 0.85% Undifferentiated personal names
9027 2.91% Unrecognized headings left unchanged
Cleaning Up
Unmatched headings fall into several categories:
➔ Uncontrolled names from OCLC
➔ Bad MARC data
◆ Example: 651 _0 ‡a Brass quartets (Baritone, horn, trombone, trumpet), Arranged
➔ Tons of Meeting Names never authorized
➔ Corporate Names never authorized
➔ Hideous computer generated MARC records
◆ Example: 700 1_ ‡a Bailyn, Ph.D., Bernard, ‡e Performer ‡u Adams University Professor, Harvard University, author of Voyagers to the West.
➔ Decide what to agonize over
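One possible way to decide what to agonize over is to rank unmatched headings by how often they occur, so that a single authority fix resolves as many bib records as possible. The sketch below is only an illustration of that idea: the layout of the MARCIVE reports is not described in these slides, so it assumes the headings have already been extracted into a list of strings, and the sample headings are invented.

```python
from collections import Counter


def rank_unmatched(headings, top_n=3):
    """Return the top_n most frequent unmatched headings with their counts.

    Whitespace runs are collapsed and trailing periods/commas dropped so
    that trivially different forms of a heading are counted together.
    """
    normalized = (" ".join(h.split()).rstrip(".,") for h in headings)
    return Counter(normalized).most_common(top_n)


# Hypothetical headings, not taken from the actual reports.
sample = [
    "Example Conference on Libraries.",
    "Example Conference on Libraries",
    "Example  Conference on Libraries",
    "Sample Corporation",
    "Sample Corporation.",
    "One-off Meeting Name",
]
print(rank_unmatched(sample))
# → [('Example Conference on Libraries', 3), ('Sample Corporation', 2), ('One-off Meeting Name', 1)]
```

Headings that appear only once, like the last one here, are natural candidates for the "don't agonize" pile.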
Other Data Projects
➔ Using VUFind layer to map Lexile scores to facets
◆ https://chengfind.wpunj.edu/Search/Results?lookfor=butterflies&type=AllFields
➔ Potential to use RDA 33x and 34x fields to improve format facets
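To illustrate what mapping Lexile scores to facets can mean, here is a minimal sketch that buckets a numeric Lexile measure into a range label that a discovery layer could display as a facet value. It is not the VUFind configuration used at Cheng Library: the function name, the 100-point band width, and the label format are all assumptions, and non-numeric codes such as BR are not handled.

```python
def lexile_facet(score, band_width=100):
    """Map a numeric Lexile measure to a range label such as '800L-899L'."""
    if score < 0:
        raise ValueError("Lexile measure must be non-negative in this sketch")
    low = (score // band_width) * band_width
    return f"{low}L-{low + band_width - 1}L"


print(lexile_facet(850))   # → 800L-899L
print(lexile_facet(1000))  # → 1000L-1099L
```

Grouping scores into bands keeps the facet list short; faceting on raw scores would produce one entry per distinct value.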
Questions?
Mark Sandford
Special Formats Cataloger
William Paterson University
[email protected]
http://goo.gl/S5FyR8