Versioning for Authorities: Authority Control IG, Chicago Midwinter 2015

Versioning for Authorities, presentation at Midwinter Chicago 2015


Versioning for Authorities

Authority Control IG, Chicago Midwinter 2015

Topics

Questioning our assumptions about Authority Control

Differences between records and people

NAF ‘Work’ records

Limitations of our current approach

What’s online now?

Other sources of name data?

2/1/15ACIG/MW 2015 2

NAF for Person


Current Approach

Limitations of library authority control:

Focus is on name variants to support unique text strings

Record IDs in NAF, SAF, and VIAF are not name IDs

ORCID, ISNI, etc. are intended to identify the person

Rules don’t support references and ‘outlinks’

Centralized management both a strength & weakness

Doesn’t address needs for more automated solutions


Name Aggregation

VIAF—an aggregation of authority records

Records gathered are used to create services and visualizations

Timelines

Associates, works, publishers, etc.

Versioned in a manner different from id.loc.gov, but with significant limitations

Policy documentation missing


IDs


What changed?

Was it a semantic change?


Usage Questions

If we use URIs for names and subjects, should we also cache the data behind them?

If we cache, we need to worry about change management

What kind of support will we expect from external systems? How do we express those expectations?

What constitutes change in a name ID file?
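The caching question above can be made concrete with a minimal sketch. It assumes a local cache keyed by URI and a fingerprint (hash) of each record taken when it was cached; the class name, field names, and the example.org URI are hypothetical and do not describe any existing authority service.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class CachedEntry:
    uri: str
    data: dict
    fingerprint: str  # hash of the data as it was when cached


def fingerprint(data: dict) -> str:
    """Hash a record so that key order does not affect the result."""
    canonical = json.dumps(data, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class AuthorityCache:
    """Hypothetical local cache of the data behind name/subject URIs."""

    def __init__(self) -> None:
        self._entries: dict[str, CachedEntry] = {}

    def store(self, uri: str, data: dict) -> None:
        self._entries[uri] = CachedEntry(uri, data, fingerprint(data))

    def changed_fields(self, uri: str, fresh: dict) -> list[str]:
        """Return the sorted field names that differ from the cached copy."""
        cached = self._entries[uri]
        if cached.fingerprint == fingerprint(fresh):
            return []  # nothing changed since we cached it
        keys = set(cached.data) | set(fresh)
        return sorted(k for k in keys if cached.data.get(k) != fresh.get(k))


cache = AuthorityCache()
uri = "http://example.org/names/n0001"  # hypothetical URI
cache.store(uri, {"preferred": "Smith, John", "variants": ["Smith, J."]})

fresh = {"preferred": "Smith, John, 1950-", "variants": ["Smith, J."]}
changed = cache.changed_fields(uri, fresh)
print(changed)
```

A fingerprint only says that something changed, not whether the change matters; deciding that is the question the slide leaves open.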


ORCID

“ORCID provides a persistent digital identifier that distinguishes you from every other researcher and, through integration in key research workflows such as manuscript and grant submission, supports automated linkages between you and your professional activities ensuring that your work is recognized.”

Began as a way to disambiguate scientific researchers, now more broadly used

Encourages linking ORCID with other identifiers


ISNI Record


ISNI

Audience: libraries, publishers, databases and rights management organizations

Website and API access

Limited online input

Online enrichment data moderated

Users must request ISNI through organizations charged with maintaining the information


NAF ISNI ORCID

NAF

*Centrally managed and distributed

*Expert input only

*Files available, no persistent deletes

ISNI

*Centrally managed

*Some ‘improvement’ by non-experts

*Must be signed in to add data

ORCID

*Self-registration and content management by acct. owner

*Some ‘private’ data not available via public API

*Member institutions can integrate access and update

Uniform Titles?


Developed for ...?

Managing fixed name/title strings as ‘works’ made more sense in the catalog card days

Does it make sense now?

Will RDA supplant this tradition with ‘work’ entity records?

Some compilations (Bible, historically anonymous titles) may require a hybrid approach


Dealing with change

Many flavors of version control!

Fine granularity at transaction level

Dated URIs (may be links to earlier versions)

Last date only (unspecified changes)

Linked access to old versions

Dated release number (and sometimes diffs)

Most recent raw file availability


Alternatives?

We need to find something which is optimized for automated updating

The model for software versioning and updating is already used by all of us (even if we’re unaware of it)

‘Semantic versioning’ (semver.org) can be used to bring similar version control options to semantic information (elements, vocabularies, etc.)


Semantic Versioning

3-tier numbering system: major.minor.patch

X.X.X

Major: breaks backwards semantic compatibility

Minor: change in semantics of any property of any element

Patch: no change in semantics of any element
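The major.minor.patch scheme can be sketched in a few lines. This is an illustration only: the `Version` class and `bump` function are hypothetical helpers, and the three change labels follow the definitions on this slide rather than any particular vocabulary's policy.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        """Parse a plain 'X.X.X' string (no pre-release or build suffixes)."""
        parts = text.split(".")
        if len(parts) != 3:
            raise ValueError(f"expected major.minor.patch, got {text!r}")
        return cls(*(int(p) for p in parts))

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def bump(version: Version, change: str) -> Version:
    """Return the next version number for a given kind of change."""
    if change == "major":  # breaks backwards semantic compatibility
        return Version(version.major + 1, 0, 0)
    if change == "minor":  # semantics of some property changed
        return Version(version.major, version.minor + 1, 0)
    if change == "patch":  # no change in semantics
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown change type: {change!r}")


current = Version.parse("1.4.2")
print(bump(current, "patch"))  # 1.4.3
print(bump(current, "minor"))  # 1.5.0
print(bump(current, "major"))  # 2.0.0
```

Because the parts are stored as integers, versions compare numerically, so 1.10.0 sorts after 1.9.5 rather than before it as a plain string comparison would.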


Smart Semantics

Smooth interaction between application and vocab

Transparent to users (until major change requires some user decisions)

Distributed version control (Git, etc.)

Vocabulary managers trusted to comply with (simple) semantic versioning policies and practices

And encouraged to provide details of semantic breakage between major versions
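One way an application might act on these version numbers is sketched below. It assumes the vocabulary publishes a plain major.minor.patch string; the function name and the three outcome labels are hypothetical, chosen only to illustrate the "transparent until a major change" idea.

```python
def plan_update(cached: str, available: str) -> str:
    """Decide how to treat a newer vocabulary release.

    Returns one of three hypothetical labels:
      'up-to-date'  - nothing newer is available
      'auto-update' - minor or patch change, apply without asking
      'ask-user'    - major change, semantics may have broken
    """
    old = tuple(int(p) for p in cached.split("."))
    new = tuple(int(p) for p in available.split("."))
    if new <= old:
        return "up-to-date"
    if new[0] > old[0]:
        return "ask-user"
    return "auto-update"


print(plan_update("2.3.1", "2.3.4"))  # auto-update
print(plan_update("2.3.1", "2.4.0"))  # auto-update
print(plan_update("2.3.1", "3.0.0"))  # ask-user
```

The policy only works if vocabulary managers assign version numbers reliably, which is why the slide stresses trust and documentation of breakage between major versions.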


Remaining Questions

How do we make the shift from assuming human one-by-one lookup to the kind of environment we see in the software industry?

Is that lack of capability one of the reasons that the vendors have been holding back?

How much of a problem is it that the ‘new IDs’ (ORCID and ISNI) don’t seem to do semantic versioning? Are they assuming that lookup alone will maintain their ‘share of the market’?


No More Handcrafting!

RIMMF’s automated approaches emphasize using available sources, like NAF

If a name cannot be made unique, does it matter?

Moves the requirement of uniqueness to URI, not string

Does that mean we can stop worrying about undifferentiated names?
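A small sketch of what moving uniqueness from the string to the URI could look like in data. The URIs, names, and notes are invented for illustration; nothing here reflects how RIMMF or NAF actually store records.

```python
from collections import defaultdict

# Hypothetical records keyed by URI; two people share one name string.
records = {
    "http://example.org/person/1": {"name": "Smith, John", "note": "chemist"},
    "http://example.org/person/2": {"name": "Smith, John", "note": "poet"},
    "http://example.org/person/3": {"name": "Doe, Jane", "note": "historian"},
}


def group_by_name(recs: dict) -> dict:
    """Map each name string to the list of URIs that carry it."""
    groups = defaultdict(list)
    for uri, rec in recs.items():
        groups[rec["name"]].append(uri)
    return dict(groups)


# Name strings that are not unique on their own.
shared = {name: uris for name, uris in group_by_name(records).items()
          if len(uris) > 1}
print(shared)

# The URI still identifies exactly one person, even when the string does not.
print(records["http://example.org/person/2"]["note"])  # poet
```

In this arrangement an identical name string is no longer a conflict to resolve by hand; it is just a display label shared by two distinct identifiers.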
