Introduction to library data processing with the help of Catmandu (http://librecat.org)
Processing Linked Data with Catmandu
Patrick Hochstenbach | UGent
http://librecat.org
Processing Linked Data with Catmandu | http://librecat.org
LUND
GHENT
BIELEFELD
RATIONALE
KAHN-WILENSKI WEBHANDLE
SERVICE PROVIDER
REPOSITORY
REPOSITORY
I am searching for a paper about...
Hypothesis 1: one network with a common schema
Hypothesis 2: object-oriented design
Hypothesis 3: the resource is the message
Hypothesis 1: one network with a common schema
GOOGLE
EUROPEANA
OPENAIRE
CRIS
Videos
Images
Books
Data sets
Hypothesis 2: object-oriented design
Drive
Race
Park
Economy
Compact
Minivan
Convertible
Wheel
Half car
Bicycle
Zeppelin
Hypothesis 3: the resource researcher is the message
DNS
REPOSITORY
CLOUD
Dr. Müller
LIBRECAT/CATMANDU
CATMANDU
PubMed
MARC
MODS
EXCEL
DSPACE
Fedora
SRU
OAI-PMH
DBI
ISI
Twitter
DBI
Atom
EXCEL
RDF
JSON
XML
Solr
ElasticSearch
MongoDB
Fedora
Aleph
Fix
FUNCTIONAL DESIGN
JSON
each
slice
take
group
select
map
reduce
add_field
join_field
lookup
remove_field
marc_map
count
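The functional design named on this slide can be pictured as lazy operations chained over a stream of records. The following is a minimal Python sketch of that idea, not Catmandu's Perl API; the helpers `select` and `take` and the sample records are assumptions made for illustration.

```python
from functools import reduce
from itertools import islice

# Hypothetical sample records; in Catmandu these would come from an importer.
records = [
    {"_id": 1, "title": "War and peace", "year": 1952},
    {"_id": 2, "title": "Anna Karenina", "year": 1954},
    {"_id": 3, "title": "Resurrection", "year": 1950},
]


def select(items, predicate):
    """Lazily keep only the records for which predicate is true."""
    return (item for item in items if predicate(item))


def take(items, n):
    """Lazily yield at most the first n records."""
    return islice(items, n)


# Chain the steps: select -> map -> take. Nothing runs until list() consumes it.
recent = select(records, lambda r: r["year"] >= 1952)
titles = map(lambda r: r["title"].upper(), recent)
first_two = list(take(titles, 2))

# reduce folds the whole stream into a single value, here a record count.
count = reduce(lambda total, _record: total + 1, records, 0)
```

Because every step returns an iterator, a large export can be processed record by record without loading it into memory.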
LIBRECAT
Institutional Repositories
Search Engines
Image Databases
Archival Systems
Data cleaning workbench
Citation Style Processor
CATMANDU
catmandu convert MARC to JSON < records.mrc
catmandu convert OAI --url http://server/OAI to JSON
catmandu convert SRU --url http://server/SRU --query dna to JSON
catmandu convert DBI --query 'SELECT * FROM table' to JSON
catmandu convert MARC to JSON < records.mrc
catmandu convert OAI --url http://server/OAI to XML
catmandu convert SRU --url http://server/SRU --query dna to YAML
catmandu convert ArXiv --query 'all:electron' to CSV
CONVERT
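The convert commands above all follow one pattern: an importer turns some source into a stream of records, and an exporter serializes that stream. A rough Python sketch of the pattern, restricted to CSV and JSON, could look like this; the registries `IMPORTERS` and `EXPORTERS` and the function `convert` are hypothetical names for illustration, not part of Catmandu.

```python
import csv
import io
import json


def import_csv(text):
    """Importer: turn CSV text into a stream of records (dicts)."""
    yield from csv.DictReader(io.StringIO(text))


def import_json(text):
    """Importer: turn a JSON array into a stream of records."""
    yield from json.loads(text)


def export_json(records):
    """Exporter: serialize a stream of records as one JSON array."""
    return json.dumps(list(records), ensure_ascii=False)


def export_csv(records):
    """Exporter: serialize records as CSV, using the first record's keys as header."""
    records = list(records)
    if not records:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


# Hypothetical registries: any importer can be combined with any exporter.
IMPORTERS = {"CSV": import_csv, "JSON": import_json}
EXPORTERS = {"CSV": export_csv, "JSON": export_json}


def convert(source_format, target_format, text):
    """Pipe the records of one format into the serializer of another."""
    return EXPORTERS[target_format](IMPORTERS[source_format](text))


json_text = convert("CSV", "JSON", "id,title\n1,War and peace\n")
csv_text = convert("JSON", "CSV", json_text)
```

Since importers and exporters only share the record stream, supporting a new format means writing one function rather than one converter per format pair.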
CATMANDU
catmandu convert X to Y --fix 'marc_map("245","title")'
catmandu convert X to Y --fix 'prepend("title","abcd-")'
catmandu convert X to Y --fix fixes.txt
fixes.txt:
remove_field("_id");
marc_map("001", "merge.id");
prepend("merge.id", "author:");
add_field("merge.source","author");
copy_field("merge.id","_id");
FIX
set_field add_field move_field copy_field remove_field
upcase downcase capitalize trim substring prepend append
lookup lookup_in_store count cmd
split_field join_field retain_field replace_all collapse expand clone
if_all_match if_any_match if_exists
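To make the effect of a fix script concrete, here is a small Python sketch that imitates four of the fixes from fixes.txt on a plain dictionary. It is an illustration under assumptions, not Catmandu's implementation: the helper `_parent`, the function `apply_fixes`, and the sample record are invented here, and `marc_map("001", "merge.id")` is replaced by an `add_field` call because MARC parsing is outside the scope of the sketch.

```python
import copy


def _parent(record, path, create=False):
    """Return (parent dict, last key) for a dotted path such as 'merge.id'."""
    *parents, last = path.split(".")
    node = record
    for key in parents:
        child = node.get(key)
        if not isinstance(child, dict):
            if not create:
                return None, last
            child = node[key] = {}
        node = child
    return node, last


def add_field(path, value):
    def fix(record):
        node, key = _parent(record, path, create=True)
        node[key] = value
        return record
    return fix


def remove_field(path):
    def fix(record):
        node, key = _parent(record, path)
        if node is not None:
            node.pop(key, None)
        return record
    return fix


def prepend(path, prefix):
    def fix(record):
        node, key = _parent(record, path)
        if node is not None and isinstance(node.get(key), str):
            node[key] = prefix + node[key]
        return record
    return fix


def copy_field(source, target):
    def fix(record):
        src, src_key = _parent(record, source)
        if src is not None and src_key in src:
            dst, dst_key = _parent(record, target, create=True)
            dst[dst_key] = copy.deepcopy(src[src_key])
        return record
    return fix


def apply_fixes(record, fixes):
    """Run the fixes in order, each one transforming the record."""
    for fix in fixes:
        record = fix(record)
    return record


fixes = [
    remove_field("_id"),
    add_field("merge.id", "000000008"),  # stands in for marc_map("001", "merge.id")
    prepend("merge.id", "author:"),
    add_field("merge.source", "author"),
    copy_field("merge.id", "_id"),
]
result = apply_fixes({"_id": "tmp", "title": "War and peace"}, fixes)
```

Each fix is a small function from record to record, so a fix script is simply an ordered list of such functions.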
CATMANDU
catmandu import JSON to MongoDB --opt ... --opt ...
catmandu import MARC to ElasticSearch
catmandu import DC to FedoraCommons
catmandu import CSV to DBI
catmandu export MongoDB to JSON
catmandu export Solr to YAML
catmandu export DBI to CSV
catmandu export FedoraCommons to Template --template test.tt
test.tt: (TemplateToolKit)
[%- FOREACH f IN record %] [% _id %] [% f.shift %][% f.shift %][% f.shift %][% f.join(":") %][%- END %]
IMPORT / EXPORT
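Import and export differ from convert in that a store sits in the middle: records are written to a database once and read back later. A minimal Python sketch of that round trip, using an in-memory SQLite database as a stand-in for a DBI store, is shown below. The table layout (one JSON document per `_id`) and the function names are assumptions for illustration, not Catmandu's actual schema.

```python
import json
import sqlite3


def import_records(conn, records):
    """Store each record as a JSON document keyed by its _id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (id TEXT PRIMARY KEY, data TEXT)"
    )
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO records (id, data) VALUES (?, ?)",
            ((str(r["_id"]), json.dumps(r, ensure_ascii=False)) for r in records),
        )


def export_records(conn):
    """Stream the stored records back out, ordered by id."""
    for (data,) in conn.execute("SELECT data FROM records ORDER BY id"):
        yield json.loads(data)


conn = sqlite3.connect(":memory:")
import_records(conn, [{"_id": "2", "title": "B"}, {"_id": "1", "title": "A"}])
import_records(conn, [{"_id": "2", "title": "B, revised"}])  # same _id replaces
exported = list(export_records(conn))
```

Keying on `_id` makes repeated imports idempotent: loading the same record twice updates it instead of duplicating it.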
CATMANDU
https://metacpan.org/pod/Catmandu
https://github.com/LibreCat/Catmandu
http://librecat.org/tutorial/
http://librecat.org/catmandu/2013/06/21/catmandu-cheat-sheet.html
LIBRECAT http://biblio.ugent.be
LIBRECAT http://pub.uni-bielefeld.de/en
LIBRECAT http://adore.ugent.be
LIBRECAT http://libnew.ugent.be
Architecture
FEDORA
MEDIAMOSA
VLE BIBLIO SCANNING
RECEIVE
ALEPH ABS
CLOUD
DEDUP/MERGE/AUGMENT
BLACKLIGHT
LINKED DATA
PRODUCTION
CATALOG
MARC
245 $$a ... $$b
260 $$a ...
700 $$a ...
JSON/YAML
LINKED DATA
STAGE 1: CATALOG to MARC
CATALOG
MARC
245 $$a ... $$b
260 $$a ...
700 $$a ...
$ catmandu export ALEPH to MARC
STAGE 2: MARC to JSON
MARC
245 $$a ... $$b
260 $$a ...
700 $$a ...
JSON/YAML
STAGE 2: MARC to JSON
MARC
245 $$a ... $$b
260 $$a ...
700 $$a ...
JSON/YAML
Author: Tolstoj, Lev Nikolaevič,
Title: War and peace /
Publication Year: 1952.
Subject: Napoleonic Wars,
STAGE 2: MARC to JSON
MARC
245 $$a ... $$b
260 $$a ...
700 $$a ...
JSON/YAML
Before FIX:
Author: Tolstoj, Lev Nikolaevič,
Title: War and peace /
Year: 1952.
Subject: Napoleonic Wars,
After FIX:
Author: Tolstoj, Lev Nikolaevič
Title: War and peace
Year: 1952
Subject: Napoleonic Wars
FIX
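Stage 2 combines two steps: mapping MARC subfields to named keys, and stripping the trailing cataloguing punctuation. A minimal Python sketch of both steps follows. The in-memory MARC representation, the tag and subfield choices in `MAPPING` (100 $a, 245 $a, 260 $c, 650 $a), and the function names are assumptions for illustration; the slide elides the actual subfield contents.

```python
import re

# Hypothetical MARC record as (tag, {subfield code: value}) pairs.
marc_record = [
    ("100", {"a": "Tolstoj, Lev Nikolaevič,"}),
    ("245", {"a": "War and peace /"}),
    ("260", {"c": "1952."}),
    ("650", {"a": "Napoleonic Wars,"}),
]

# Assumed mapping from (tag, subfield) to a JSON key, in the spirit of marc_map.
MAPPING = {
    ("100", "a"): "author",
    ("245", "a"): "title",
    ("260", "c"): "year",
    ("650", "a"): "subject",
}


def marc_to_dict(record, mapping):
    """Copy every mapped subfield into a flat dictionary."""
    result = {}
    for tag, subfields in record:
        for code, value in subfields.items():
            key = mapping.get((tag, code))
            if key is not None:
                result[key] = value
    return result


def strip_trailing_punctuation(value):
    """Remove trailing whitespace and punctuation such as ' /', ',' or '.'."""
    return re.sub(r"[\s/,.:;]+$", "", value)


raw = marc_to_dict(marc_record, MAPPING)
clean = {key: strip_trailing_punctuation(value) for key, value in raw.items()}
```

The cleanup rule is deliberately blunt: it would also remove the final period of an abbreviation such as "Jr.", so real data may need field-specific rules.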
STAGE 3a: JSON to RDF
JSON/YAML
LINKED DATA
Author: Tolstoj, Lev Nikolaevič
Title: War and peace
Year: 1952
Subject: Napoleonic Wars
?
dc:creator: Tolstoj, Lev Nikolaevič
dc:title: War and peace
dc:date: 1952
dc:subject: Napoleonic Wars
FIX
STAGE 3a: JSON to RDF
JSON/YAML
LINKED DATA
<http://example.org/000000008>
    <http://purl.org/dc/elements/1.1/creator> "Tolstoj, Lev Nikolaevič" ;
    <http://purl.org/dc/elements/1.1/title> "War and peace" ;
    <http://purl.org/dc/elements/1.1/date> "1952" ;
    <http://purl.org/dc/elements/1.1/subject> "Napoleonic Wars" ;
    a <http://www.europeana.eu/schemas/edm/Book> .
FIX
RDF/Turtle
http://demo.librecat.org/
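The step from a cleaned record to Turtle is mostly a renaming of keys to Dublin Core predicates plus serialization. The Python sketch below reproduces the Turtle shown on this slide from a dictionary. The key names (`author`, `title`, `year`, `subject`) and the functions `escape_literal` and `to_turtle` are assumptions for illustration; a production pipeline would use an RDF library rather than string formatting.

```python
DC = "http://purl.org/dc/elements/1.1/"
EDM_BOOK = "http://www.europeana.eu/schemas/edm/Book"

# Assumed record keys mapped to the Dublin Core predicates used on the slide.
FIELD_TO_PREDICATE = {
    "author": DC + "creator",
    "title": DC + "title",
    "year": DC + "date",
    "subject": DC + "subject",
}


def escape_literal(value):
    """Escape backslashes, double quotes and newlines for a Turtle string literal."""
    return (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )


def to_turtle(subject_uri, record):
    """Serialize one record as a Turtle description of subject_uri."""
    lines = [f"<{subject_uri}>"]
    for field, predicate in FIELD_TO_PREDICATE.items():
        if field in record:
            lines.append(f'    <{predicate}> "{escape_literal(record[field])}" ;')
    lines.append(f"    a <{EDM_BOOK}> .")
    return "\n".join(lines)


record = {
    "author": "Tolstoj, Lev Nikolaevič",
    "title": "War and peace",
    "year": "1952",
    "subject": "Napoleonic Wars",
}
turtle = to_turtle("http://example.org/000000008", record)
```

At this stage every object is still a plain string literal, which is why the slide labels the result RDF rather than Linked Data.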
STAGE 3b: RDF to Linked Data
JSON/YAML
LINKED DATA
<http://example.org/000000008>
    <http://purl.org/dc/elements/1.1/creator> "Tolstoj, Lev Nikolaevič" ;
    <http://purl.org/dc/elements/1.1/title> "War and peace" ;
    <http://purl.org/dc/elements/1.1/date> "1952" ;
    <http://purl.org/dc/elements/1.1/subject> "Napoleonic Wars" ;
    a <http://www.europeana.eu/schemas/edm/Book> .

<http://example.org/000000008>
    <http://purl.org/dc/elements/1.1/creator> <http://viaf.org/viaf/96987389> ;
    <http://purl.org/dc/elements/1.1/title> "War and peace" ;
    <http://purl.org/dc/elements/1.1/date> "1952" ;
    <http://purl.org/dc/elements/1.1/subject> <http://dbpedia.org/page/Napoleonic_Wars> ;
    a <http://www.europeana.eu/schemas/edm/Book> .
FIX
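What turns the RDF of stage 3a into Linked Data is replacing string literals with URIs from an authority source, much like the `lookup` fix replaces a value through a table. The sketch below does this with a hard-coded dictionary containing only the two URIs shown on the slide. The table `AUTHORITY` and the functions `link` and `term` are assumptions for illustration; in practice the lookup would query a service such as VIAF.

```python
# Hypothetical authority table: (field, literal value) -> URI, taken from the slide.
AUTHORITY = {
    ("author", "Tolstoj, Lev Nikolaevič"): "http://viaf.org/viaf/96987389",
    ("subject", "Napoleonic Wars"): "http://dbpedia.org/page/Napoleonic_Wars",
}


def link(record, authority):
    """Replace each value that has a known URI; keep the rest as literals."""
    linked = {}
    for field, value in record.items():
        uri = authority.get((field, value))
        linked[field] = {"uri": uri} if uri else {"literal": value}
    return linked


def term(node):
    """Render a linked value as a Turtle term: <uri> or "literal"."""
    if "uri" in node:
        return f"<{node['uri']}>"
    escaped = node["literal"].replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


record = {
    "author": "Tolstoj, Lev Nikolaevič",
    "title": "War and peace",
    "year": "1952",
    "subject": "Napoleonic Wars",
}
linked = link(record, AUTHORITY)
terms = {field: term(node) for field, node in linked.items()}
```

Values without a match stay literals, so the record remains valid RDF even when only some fields can be resolved.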
THANK YOU
Nicolas Steenlant
Nicolas Franck
Snorri Briem
Jörgen Eriksson
Maria Hedberg
Dave Sheroman
Friedrich Summann
Najko Jahn
Vitali Peil
Petra Kohorst
Christian Pietsch
Mathias Lösch
Johan Rolschewski
Jakob Voß
UGENT
LUND
BIELEFELD
GBV
STAATSBIBLIOTHEK ZU BERLIN
Wouter Willaert
INUITS