Combine_and_stir (Aleph data + RDF + Python + other things) IGeLU 2015 Developer’s Day Budapest, Hungary 2015-09-05 Laura Akerman

Question:

How can we use linked data with our library metadata to better support student and faculty research?

• Small-scale use cases that focus on a subset of our resources. (Why? Emory Libraries’ capacity to do new things is limited this year by the Alma migration and Hydra implementation.)

• Integration with Primo – including Vivo

• Large-scale integration of campus information, à la LD4L

Background/Inspiration

• Emory University has 3.5 million Aleph records.

• As a metadata specialist for digital projects, I became familiar with some faculty requests to use data from our catalog in new ways.

• IGELU-ELUNA Linked Open Data Interest Group – Use Cases http://exlibrisgroup.org/display/CrossProduct/LOD+Use+cases+and+scenarios

• Emory University’s linked data pilot project, “Connections”: https://scholarblogs.emory.edu/connections/

• Bernardo Gomez’s work: scripts to extract data from Aleph

Bernardo’s Work

• Presented at ELUNA and in a use case call: Exposing Aleph Bibliographic and Authority records as RDF/XML and providing auto-discovery in Primo http://mail.library.emory.edu/Slidy/linked_data_webseminar.html

• He is planning to share the code; in the meantime, you can contact him at [email protected] if you’re interested.

A few details:

• Service to retrieve authority keys for headings in a bib record in Aleph.

• Service to retrieve both bib and authority records, represent them in “MarcEdit” text format, and convert them to RDF/XML (using simple vocabularies for demonstration).

• The results could be stored in a triplestore (Sesame), or fuller representations could be retrieved from OCLC for bib and auth records, or from VIAF for name authorities.
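
The conversion step can be pictured with a small sketch. This is not Bernardo's service: it is a minimal Python 3 illustration that parses MarcEdit-style mnemonic text and emits N-Triples (rather than RDF/XML, to stay within the standard library). The base URI, the field-to-property mapping, and the sample record are all assumptions made for the example.

```python
import re

# Hypothetical base URI and field mapping, chosen only for illustration.
BASE = "http://example.org/bib/"
DCTERMS = "http://purl.org/dc/terms/"
FIELD_MAP = {"100": DCTERMS + "creator", "245": DCTERMS + "title"}

# Made-up sample record in MarcEdit mnemonic text format.
SAMPLE = r"""=LDR  00000nam a2200000 a 4500
=001  000123456
=100  1\$aTwain, Mark,$d1835-1910.
=245  14$aThe adventures of Tom Sawyer /$cMark Twain.
"""


def parse_mnemonic(text):
    """Parse lines such as '=245  14$aTitle' into (tag, subfields) pairs."""
    fields = []
    for line in text.splitlines():
        match = re.match(r"=(\w{3})  (.*)$", line)
        if not match:
            continue
        tag, rest = match.groups()
        if tag == "LDR" or tag < "010":
            # Leader and control fields have no indicators or subfields.
            fields.append((tag, {"": rest}))
            continue
        subfields = {}
        # Skip the two indicator characters, then split on the '$' delimiter.
        for chunk in rest[2:].split("$")[1:]:
            if chunk:
                # Keep only the first occurrence of each subfield code.
                subfields.setdefault(chunk[0], chunk[1:].strip())
        fields.append((tag, subfields))
    return fields


def escape(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(text):
    """Emit one triple per mapped field, using field 001 as the record ID."""
    fields = parse_mnemonic(text)
    record_id = next(sub[""] for tag, sub in fields if tag == "001").strip()
    subject = "<%s%s>" % (BASE, record_id)
    return [
        '%s <%s> "%s" .' % (subject, FIELD_MAP[tag], escape(sub["a"]))
        for tag, sub in fields
        if tag in FIELD_MAP and "a" in sub
    ]


triples = to_ntriples(SAMPLE)
```

A real conversion would link the creator to an authority URI rather than a literal string, which is where the authority keys retrieved by the first service would come in.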

Bernardo’s use cases:

• Link to WorldCat Identities page for first author in Primo bib record

Adding JSON-LD to our “Full View”

My ideas:

• Small applications for specific purposes

• Class projects, faculty and grad student research projects.

• Marry RDF from Aleph (or Primo?) with external linked data.

• Create web interfaces depending on use case – display, possibly search, visualization, map, timeline, etc.

• Write Python scripts/modules to do this – create building blocks that can be modified for more projects, maybe by savvy faculty or students.

• Share with everyone on GitHub

Why Aleph?

• We have Bernardo’s scripts NOW to get the authority IDs out of Aleph, get authority records, and use LCCN etc. to get to VIAF.

• In future, when we move to Alma, we will need to use Alma APIs to get this data. If the Primo JSON-LD API could furnish identifiers (authority IDs), it could be useful here.
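
As a rough sketch of the "use LCCN to get to VIAF" step: the function below builds a VIAF lookup URL from the LCCN found in an authority record. The `sourceID/LC|…` URL pattern and the whitespace handling are assumptions to check against VIAF's own documentation, the LCCN shown is only a sample value, and the code is written for Python 3. It only builds the URL; it makes no network call.

```python
from urllib.parse import quote

# Assumed VIAF lookup-by-source-ID pattern; verify before relying on it.
VIAF_SOURCE_BASE = "https://viaf.org/viaf/sourceID/"


def viaf_lookup_url(lccn):
    """Build a VIAF lookup URL from an LCCN as it appears in an authority record."""
    # Assumption: internal spaces in the LCCN should be dropped.
    normalized = "".join(lccn.split())
    # Percent-encode the '|' separator between source code and identifier.
    return VIAF_SOURCE_BASE + quote("LC|" + normalized, safe="")


url = viaf_lookup_url("n  79021164")
```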

Why Python?

• I wanted to really learn it. (Newbie alert!)

• I thought lots of library programmers might use it too.

• IS THIS TRUE? IS PYTHON A GOOD LANGUAGE FOR SHARING SOMETHING LIKE THIS?

101 USE CASES?

• 1. Hurricane Katrina study class. Actual request some years ago to set up a database of our records for resources about the hurricane, which would be augmented with other resources found on the web by students and annotated.

Catalog record + student contributed data + annotation (+ Wikidata?) + ?

More

• 2. Famous author: Catalog records for works by and about the person, + VIAF + Wikidata + archive data

• 3. Group of musicians, artists, poets… expand from #2. *

• SEE ALSO Networking the Belfast Group, a much more extensive and cool project from Rebecca Koeser and others at Emory involving linked data: http://digitalscholarship.emory.edu/projects/project-networking-belfast.html

More…….

5. Class reading list – catalog records + Wikidata on the authors + ?

6. Unfolding event: RSS feed from Primo (limited to Aleph/Alma records?) + Wikidata successive harvests + Twitter? + ?

YOUR IDEAS?

So far…

• Code to take a file of Aleph numbers and retrieve bibs as XML using Bernardo’s service

• File of bib and auth numbers associating the record IDs for authorities in the bib with the bib ID

• Retrieve auth records as XML from Aleph

• Create Primo permalinks, and store the relationship between a “resource” and the permalink as RDF.

Working on code to harvest bibs and send to Sesame.

Choice: do our own conversion on our bib/auth data, or harvest RDF from OCLC?
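
The first and last steps above (reading a file of Aleph numbers, then recording the resource-to-permalink relationship as RDF) might look roughly like this in Python 3. Both URI patterns are hypothetical placeholders, not Emory's real resource URIs or Primo permalink syntax, and `rdfs:seeAlso` is just one possible property for the link.

```python
import io

# Hypothetical URI patterns; substitute the local resource and permalink forms.
RESOURCE_BASE = "http://example.org/resource/"
PERMALINK_PATTERN = "http://example.org/primo/permalink/{aleph_id}"
SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def read_aleph_numbers(handle):
    """Yield Aleph system numbers, zero-padded to nine digits, skipping blank lines."""
    for line in handle:
        number = line.strip()
        if number and not number.startswith("#"):
            yield number.zfill(9)


def permalink_triples(handle):
    """Return one N-Triples line per Aleph number, linking resource to permalink."""
    return [
        "<%s%s> <%s> <%s> ." % (
            RESOURCE_BASE,
            number,
            SEE_ALSO,
            PERMALINK_PATTERN.format(aleph_id=number),
        )
        for number in read_aleph_numbers(handle)
    ]


# A file object would normally be passed in; StringIO stands in for one here.
triples = permalink_triples(io.StringIO("123456\n\n000987654\n"))
```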

To do:

• Code to send chunks of RDF from bibs and auth in Sesame

• Extract VIAF record ID from auths for persons or organizations. Get the record in RDF.

• Use links in VIAF records to Wikidata to retrieve RDF related to the person or organization

• Select elements to be included in Sesame and send to Sesame.

• Create web display of information about the person or organization of interest, information about the related resources and other persons, selected other data and links.

• Complete the route by coding the extraction of Aleph IDs from an RSS feed or eshelf folder.

• End result / first demo – enhanced info on interesting person or organization (Zsa Zsa Gabor, Mark Twain, Dalai Lama,
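
For the "send chunks of RDF" item, a minimal Python 3 sketch follows. It splits a list of N-Triples lines into chunks and prepares one HTTP POST per chunk against a Sesame repository's statements endpoint. The server address, repository name, chunk size, and the `text/plain` content type for N-Triples are assumptions to confirm against the Sesame version in use. The requests are only built here, not sent.

```python
from urllib.request import Request

# Hypothetical server location and repository ID.
SESAME_BASE = "http://localhost:8080/openrdf-sesame"
REPOSITORY = "aleph-demo"


def chunked(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_requests(triples, size=1000):
    """Prepare one POST request per chunk of N-Triples lines."""
    url = "%s/repositories/%s/statements" % (SESAME_BASE, REPOSITORY)
    prepared = []
    for chunk in chunked(triples, size):
        body = ("\n".join(chunk) + "\n").encode("utf-8")
        prepared.append(
            Request(
                url,
                data=body,
                method="POST",
                # Assumed MIME type for N-Triples; newer servers may expect
                # application/n-triples instead.
                headers={"Content-Type": "text/plain;charset=UTF-8"},
            )
        )
    return prepared


sample = ["<http://example.org/s%d> <http://example.org/p> <http://example.org/o> ." % i
          for i in range(5)]
requests_to_send = build_requests(sample, size=2)
```

Each prepared request could then be passed to `urllib.request.urlopen` once the endpoint is confirmed. Keeping the building and sending steps separate makes the chunking easy to check without a running triplestore.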

Challenges for me:

• Using Pymarc to extract data from MARCXML – does the job but not well documented. Talking to Heidi Frank.

• XML handling modules and making REST API calls – differences between Python versions (I am using 2.7) mean older examples found on the web didn’t work.
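
For comparison, here is a small sketch of pulling values out of MARCXML with only the standard library's `xml.etree.ElementTree`, as an alternative to Pymarc. It is written for Python 3 (not the 2.7 mentioned above), and the sample record is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Made-up sample record.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">000123456</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Twain, Mark,</subfield>
    <subfield code="d">1835-1910.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="4">
    <subfield code="a">The adventures of Tom Sawyer /</subfield>
  </datafield>
</record>"""


def subfield_values(record, tag, code):
    """Return the text of every subfield `code` in every datafield `tag`."""
    path = "marc:datafield[@tag='%s']/marc:subfield[@code='%s']" % (tag, code)
    return [element.text for element in record.findall(path, NS)]


record = ET.fromstring(SAMPLE)
system_number = record.findtext("marc:controlfield[@tag='001']", namespaces=NS)
authors = subfield_values(record, "100", "a")
```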

Questions:

• Are there gotchas in these scenarios, and if so how could we work around…

– Uncertainty about metadata rights?

– Desire to include Primo Central?

– Other data sources lacking “authority records”?

– External data that’s less open than we want?

– Need to establish “by hand” – mapping across vocabularies (could this be shared?)

Your thoughts…

• Would it be worthwhile pursuing the development of tools like this – and would you contribute?

Thank you

• Contact me if you have more ideas/interest:

• Laura Akerman, [email protected]

• Latest working code will be up on GitHub in a week or so as soon as I figure out how to resolve a “conflict”...

https://github.com/lake44me/Link