DATA QUALITY IN THE AGE OF LINKED DATA

Trust me if you dare

http://gallica/ark:/12148/btv1b90519196

GILDAS ILLIEN, BNF

SÉBASTIEN PEYRARD, BNF

Gildas Illien & Sébastien Peyrard EuropeanaTech 2015




data.bnf.fr: one vision, three goals

Be reusable

Be visible

Be legible

https://www.flickr.com/photos/ramonbaile/2274662139

http://commons.wikimedia.org/wiki/File:Carte_M%C3%A9tro_de_Paris.jpg#mediaviewer/File:Carte_M%C3%A9tro_de_Paris.jpg

By humans

By machines

https://www.flickr.com/photos/bdesham/2432400623


DATA.BNF.FR: WHAT IT DOES


data.bnf.fr: a guided tour


DATA.BNF.FR: HOW IT WORKS


Any black magic?

[Diagram: WORK, PUBLICATION and PERSON records in INTERMARC, linked by 1xx (creator of) and 0070 (author). Record numbers shown: FRBNF11896956, FRBNF11967514, FRBNF37465618.]


Any black magic?

[Diagram: WORK, PUBLICATION and PERSON expressed as RDF triples, linked by dc:creator (twice) and rdarelationships:workManifested. URIs shown: http://catalogue.bnf.fr/ark:/12148/cb118969563, http://catalogue.bnf.fr/ark:/12148/cb11967514v, http://catalogue.bnf.fr/ark:/12148/cb374656186.]
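As a rough illustration of the step from record links to RDF triples, here is a minimal Python sketch. It is not the BnF conversion pipeline. The namespace URIs behind the `dc:` and `rdarelationships:` prefixes, the direction of each link, and the `example.org` identifiers are all assumptions made for the example; the slide only shows the prefixes and three catalogue URIs.

```python
# Namespace URIs are assumptions: the slide shows only the prefixes.
DC = "http://purl.org/dc/terms/"
RDAREL = "http://rdvocab.info/RDARelationshipsWEMI/"

# Hypothetical identifiers standing in for the three records on the slide.
PERSON = "http://example.org/person/1"
WORK = "http://example.org/work/1"
PUBLICATION = "http://example.org/publication/1"


def triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one statement whose three terms are URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def record_links_to_triples(person: str, work: str, publication: str) -> list[str]:
    """Express the creator and work-manifestation links as three triples.

    Link directions (work -> person, publication -> person,
    publication -> work) are assumed for this sketch.
    """
    return [
        triple(work, DC + "creator", person),
        triple(publication, DC + "creator", person),
        triple(publication, RDAREL + "workManifested", work),
    ]


triples = record_links_to_triples(PERSON, WORK, PUBLICATION)
for line in triples:
    print(line)
```

Plain string formatting is enough here because every term is a URI; literals such as titles would need escaping or an RDF library.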


STRUCTURED DATA

VINTAGE LINKED DATA SINCE 1987

TRUSTED IDENTIFIERS

The true magic behind this is:

CATALOGERS AND CATALOGUE QA TEAM:

- INTELLIGENT DATA

- CONSISTENCY

- LINK CURATION

https://www.flickr.com/photos/bohman/4394901689


Work-manifestation links: machine calculated

[Diagram: WORK, PUBLICATION and PERSON expressed as RDF triples, linked by dc:creator (twice) and rdarelationships:workManifested. URIs shown: http://catalogue.bnf.fr/ark:/12148/cb118969563, http://catalogue.bnf.fr/ark:/12148/cb11967514v, http://catalogue.bnf.fr/ark:/12148/cb374656186.]
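The slides do not say how the work-manifestation link is calculated. One plausible approach, sketched here purely as an assumption, is to treat a work as a candidate for a publication when both share the same creator and the same normalized title. The `Record` type, its fields, and the sample identifiers are hypothetical.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified view of a catalogue record."""
    identifier: str
    creator: str
    title: str


def normalize(title: str) -> str:
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_marks.lower())
    return " ".join(cleaned.split())


def candidate_works(publication: Record, works: list[Record]) -> list[Record]:
    """Return works with the same creator and the same normalized title."""
    key = normalize(publication.title)
    return [
        work
        for work in works
        if work.creator == publication.creator and normalize(work.title) == key
    ]


works = [
    Record("work-1", "person-1", "Les Misérables"),
    Record("work-2", "person-1", "Notre-Dame de Paris"),
    Record("work-3", "person-2", "Les Misérables"),
]
publication = Record("pub-1", "person-1", "LES MISERABLES")
matches = candidate_works(publication, works)
```

Matching on the creator identifier rather than the creator's name is what makes curated authority links valuable here: two records agree on a person only if cataloguers linked them to the same authority record.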


DATA.BNF.FR: RETURN ON INVESTMENT?


Why not give the data back to the source catalogue?

MARC catalogue

data.bnf.fr

structured data

enriched data


How to make it happen?

Start easy

evolution vs. revolution

one challenge at a time

Speak the language of the catalogue

channel the skills and tools of the QA experts

leverage the organization around the catalogue


What happened… and what will happen

Tests: improve the algorithm

discussion with experts

injection whenever there is a single decision

tolerance of a certain level of error

Use the algorithm to suggest

suggest multiple candidates to an expert when undecidable

Next steps: new algorithms

Aggregates (one manifestation, many works)

Create new works out of clusters

Deduplication of sparse records
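The inject-or-suggest rule on this slide can be sketched as a small triage function. This is one reading of the slide, not the BnF implementation: the `Action` names and the handling of the no-candidate case are assumptions.

```python
from enum import Enum


class Action(Enum):
    INJECT = "inject"    # one candidate: write the link back to the catalogue
    SUGGEST = "suggest"  # several candidates: hand the choice to an expert
    SKIP = "skip"        # no candidate: nothing to do (assumed behaviour)


def triage(candidates: list[str]) -> tuple[Action, list[str]]:
    """Decide what to do with the candidate works found for one publication."""
    if len(candidates) == 1:
        return Action.INJECT, candidates
    if len(candidates) > 1:
        return Action.SUGGEST, candidates
    return Action.SKIP, []


decision_single = triage(["work-1"])
decision_many = triage(["work-1", "work-3"])
decision_none = triage([])
```

Automatic injection on a single candidate is where the accepted level of error comes in: a lone candidate can still be wrong, so the tolerance has to be agreed with the catalogue experts beforehand.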


THE MEDIATOR BETWEEN BRAIN AND HANDS MUST BE THE HEART (Mittler zwischen Hirn und Händen muss das Herz sein)

(Temporary) conclusion

to be continued…