DATA QUALITY IN THE AGE OF LINKED DATA
Trust me if you dare
http://gallica/ark:/12148/btv1b90519196
GILDAS ILLIEN, BNF
SÉBASTIEN PEYRARD, BNF
data.bnf.fr: one vision, three goals
Be reusable
Be visible
Be legible
https://www.flickr.com/photos/ramonbaile/2274662139
http://commons.wikimedia.org/wiki/File:Carte_M%C3%A9tro_de_Paris.jpg#mediaviewer/File:Carte_M%C3%A9tro_de_Paris.jpg
By humans
By machines
https://www.flickr.com/photos/bdesham/2432400623
DATA.BNF.FR: WHAT IT DOES
data.bnf.fr: a guided tour
DATA.BNF.FR: HOW IT WORKS
Any black Magic?
1xx(creator of)
0070(author)
WORK
PUBLICATION
PERSON
INTERMARC
INTERMARC
INTERMARC
FRBNF11896956
FRBNF11967514
FRBNF37465618
Any black Magic?
dc:creator
WORK
PUBLICATION
PERSON
http://catalogue.bnf.fr/ark:/12148/cb118969563
http://catalogue.bnf.fr/ark:/12148/cb11967514v
http://catalogue.bnf.fr/ark:/12148/cb374656186
dc:creator
RDF triples
RDF triples
RDF triples rdarelationships:workManifested
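The three links on this slide can be written down as plain subject-predicate-object triples. The sketch below is only an illustration: the `ex:` identifiers are hypothetical placeholders rather than real BnF URIs, the helper names are invented here, and the direction of each arrow (work and publication pointing to the person via dc:creator, publication pointing to the work via rdarelationships:workManifested) is an assumption read off the diagram labels.

```python
# Hypothetical placeholder identifiers, not real catalogue.bnf.fr ARKs.
PERSON = "ex:person"
WORK = "ex:work"
PUBLICATION = "ex:publication"

# Assumed arrow directions: the slide only names the predicates.
TRIPLES = [
    (WORK, "dc:creator", PERSON),
    (PUBLICATION, "dc:creator", PERSON),
    (PUBLICATION, "rdarelationships:workManifested", WORK),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(predicate, obj, triples=TRIPLES):
    """Return every subject that points at `obj` through `predicate`."""
    return [s for s, p, o in triples if p == predicate and o == obj]
```

With this layout, `objects(PUBLICATION, "rdarelationships:workManifested")` finds the work behind a publication, and `subjects("dc:creator", PERSON)` lists everything attributed to the person.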
STRUCTURED DATA
VINTAGE LINKED DATA
SINCE 1987
TRUSTED IDENTIFIERS
The true magic behind this is:
CATALOGERS AND CATALOGUE QA TEAM:
- INTELLIGENT DATA
- CONSISTENCY
- LINK CURATION
https://www.flickr.com/photos/bohman/4394901689
Work-manifestation links: machine calculated
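The slides do not spell out how the work-manifestation links are calculated. As a rough sketch of one plausible approach, the code below proposes a link when a publication and a work share the same creator and the same normalized title. The record layout, the function names, and the matching rule itself are all assumptions made for illustration, not the data.bnf.fr algorithm.

```python
import unicodedata


def normalize_title(title):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    spaced = "".join(c if c.isalnum() else " " for c in no_marks.lower())
    return " ".join(spaced.split())


def candidate_works(publication, works):
    """Return ids of works with the same creator and normalized title.

    `publication` and each item of `works` are hypothetical dicts with
    the keys "id", "title" and "creator".
    """
    key = normalize_title(publication["title"])
    return [
        w["id"]
        for w in works
        if w["creator"] == publication["creator"]
        and normalize_title(w["title"]) == key
    ]
```

Because the creator identifier comes from a curated authority link, a simple title comparison like this one can lean on the trusted identifiers rather than on string matching of names.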
DATA.BNF.FR: RETURN ON INVESTMENT?
Why not give the data back to the source catalogue?
MARC catalogue
data.bnf.fr
structured data
enriched data
How to make it happen?
Start easy
evolution vs. revolution
one challenge at a time
Speak the language of the catalogue
channel the skills and tools of the QA experts
leverage the organization around the catalogue
What happened… and what will happen
Tests: improve the algorithm
discussion with experts
inject the link whenever there is a single clear decision
tolerance of a certain level of error
Use the algorithm to suggest
suggest multiple candidates to an expert when undecidable
Next steps: new algorithms
Aggregates (one manifestation, many works)
Create new works out of clusters
Deduplication of sparse records
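The test and suggestion steps above amount to a small decision rule: inject a link automatically when exactly one candidate is convincing, and hand the case to an expert otherwise. The sketch below shows that rule under stated assumptions: the function name, the (work id, score) input format, and the 0.9 threshold standing in for the tolerated level of error are all hypothetical.

```python
def decide(candidates, threshold=0.9):
    """Choose what to do with the candidate works for one publication.

    `candidates` is a list of (work_id, score) pairs, with scores in [0, 1].
    Returns ("inject", work_id), ("suggest", [work_ids...]) or ("skip", None).
    """
    confident = [work_id for work_id, score in candidates if score >= threshold]
    if len(confident) == 1:
        # A single confident candidate: write the link back automatically.
        return ("inject", confident[0])
    if candidates:
        # Undecidable: show every candidate to an expert, best score first.
        ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
        return ("suggest", [work_id for work_id, _ in ranked])
    return ("skip", None)
```

Raising the threshold sends more cases to the experts and lowers the risk of a wrong automatic link; lowering it does the opposite.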
THE MEDIATOR BETWEEN BRAIN AND HANDS MUST BE THE HEART
(temporary) conclusion
to be continued…