Biblissima: Medieval Manuscripts and the Semantic Web. Stefanie GEHRKE, Équipex Biblissima, http://biblissima-condorcet.fr. Textual Heritage and Information Technologies, El’Manuscript-2016 - Vilnius, 24th August 2016 (Use of XML and TEI in preparing, processing, and publishing digital resources)

Page 1: Biblissima: Medieval Manuscripts and the Semantic Web

Biblissima: Medieval Manuscripts and the Semantic Web

Stefanie GEHRKE

Équipex Biblissima
http://biblissima-condorcet.fr

Textual Heritage and Information Technologies

El’Manuscript-2016 - Vilnius, 24th August 2016 (Use of XML and TEI in preparing, processing, and publishing digital resources)

Page 2: Biblissima: Medieval Manuscripts and the Semantic Web

http://biblissima-condorcet.fr/en 2013 - 2016 (2019)

Page 3: Biblissima: Medieval Manuscripts and the Semantic Web
Page 4: Biblissima: Medieval Manuscripts and the Semantic Web

Library of Libraries

Page 5: Biblissima: Medieval Manuscripts and the Semantic Web

Biblissima’s Data

• Manuscripts
  – Parts
  – Folios / Pages
  - Editions
• Incunabula
• Illuminations
• Provenance Marks
• Texts
• Inventories
• Sales Catalogues
• Collections
• Places
• Persons
• Organisations
  – Libraries

http://biblissima-condorcet.fr/fr/ressources/ressources-biblissima (Records and Digital Surrogates)

Page 6: Biblissima: Medieval Manuscripts and the Semantic Web

Structured in 40 Databases

• MySQL
• Access
• EAD
• TEI-P5
• MARC-XML + TEI-P5

Page 7: Biblissima: Medieval Manuscripts and the Semantic Web

Single Access-Point

Challenges:

- missing IDs
- authority data only partially used
- different spellings
- different versions of the same shelfmark
- libraries are sometimes also “former owners”
- images in silos
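
The spelling and shelfmark challenges can be illustrated with a minimal Python sketch. It assumes a hypothetical helper, `normalize_shelfmark`, and a hypothetical abbreviation table; neither comes from Biblissima's actual tooling. The two inputs are the shelfmark variants quoted for BnF Français 263 on the sample data slide.

```python
import re
import unicodedata

# Hypothetical table: spellings that are all read as "manuscript".
MS_VARIANTS = {"ms", "mss", "ms.", "mss."}


def normalize_shelfmark(raw: str) -> str:
    """Return a lowercase matching key for a shelfmark string (illustrative rules only)."""
    # Use one Unicode form so that accented letters compare equal.
    text = unicodedata.normalize("NFC", raw).strip()
    key_parts = []
    # Split on whitespace and commas, then unify the "manuscript" abbreviations.
    for token in re.split(r"[\s,]+", text):
        if not token:
            continue
        lowered = token.lower()
        key_parts.append("ms" if lowered in MS_VARIANTS else lowered)
    return " ".join(key_parts)


if __name__ == "__main__":
    for variant in ("BnF, Ms Français 263", "BnF, MSS Français 263"):
        print(variant, "->", normalize_shelfmark(variant))
```

Both variants collapse to the same key, which could serve as a candidate match before manual checking. Real data would likely need a much richer rule set than this table.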

Page 8: Biblissima: Medieval Manuscripts and the Semantic Web

URLs from existing LOD Data Sets

3507 works related to more than 10 000 textual units

254 of 2109 authors could not be aligned with any external reference (564 could not be aligned with data.bnf.fr)
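
As a quick check of what these figures imply, the following Python sketch computes the alignment shares. It assumes that both counts (254 and 564) refer to the same total of 2109 authors, which the slide suggests but does not state explicitly. The function name `share` is illustrative.

```python
TOTAL_AUTHORS = 2109      # from the slide
UNALIGNED_ANY = 254       # no external reference at all
UNALIGNED_DATA_BNF = 564  # no data.bnf.fr reference (assumed: same total)


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    print("aligned with some reference:", share(TOTAL_AUTHORS - UNALIGNED_ANY, TOTAL_AUTHORS), "%")
    print("aligned with data.bnf.fr:", share(TOTAL_AUTHORS - UNALIGNED_DATA_BNF, TOTAL_AUTHORS), "%")
```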

Page 9: Biblissima: Medieval Manuscripts and the Semantic Web

Our Solution

data alignment and data cleaning

Biblissima: person / organisation, place, collection, book, part, folio / page, work / expression => URL Biblissima

Page 10: Biblissima: Medieval Manuscripts and the Semantic Web

XML Pivot Biblissima

● Inspired by EAD, TEI-P5 (Manuscript Description) and FRBRoo

● Export format AND import format

● for the moment a very light DTD

● <RecordList><Database/></RecordList>

● Book | Identifier | Repository | Manifestation | GroupBooks | HasPart | Place | Participant | Work | Text | Language | Collection | HasFeature | Concept | Name

● @role | @id | @id_bbma | @canonical

http://doc.biblissima-condorcet.fr/contribuer-a-biblissima ; github
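
To give a feel for the pivot format, here is a minimal Python sketch that builds a record from the element and attribute names listed above. The slide gives only the names and the `<RecordList><Database/></RecordList>` skeleton, so the nesting, the choice of which element carries which attribute, and all values (`book-1`, `work-1`, `true`) are assumptions for illustration. The pairing of the shelfmark and the work is purely illustrative.

```python
import xml.etree.ElementTree as ET


def build_pivot_record() -> ET.Element:
    """Build a tiny pivot-style record (hypothetical nesting and values)."""
    root = ET.Element("RecordList")
    # The slide shows <Database/> as an empty child of <RecordList>.
    ET.SubElement(root, "Database")

    # Assumption: a <Book> carries @id and groups the descriptive elements.
    book = ET.SubElement(root, "Book", {"id": "book-1"})
    ET.SubElement(book, "Identifier").text = "Français 263"
    ET.SubElement(book, "Repository").text = "BnF"

    # Assumption: a <Work> holds a <Name> and a <Participant> with @role.
    work = ET.SubElement(book, "Work", {"id": "work-1"})
    ET.SubElement(work, "Name", {"canonical": "true"}).text = "Ab urbe condita"
    author = ET.SubElement(work, "Participant", {"role": "author"})
    ET.SubElement(author, "Name").text = "Titus Livius"
    return root


if __name__ == "__main__":
    print(ET.tostring(build_pivot_record(), encoding="unicode"))
```

The actual content model is defined by the project's DTD, which should be consulted before producing real export files.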

Page 11: Biblissima: Medieval Manuscripts and the Semantic Web

Dataflow Biblissima

• Data delivery in EAD, TEI-P5 or XML-Biblissima by partner
• Extraction of authority data (CSV via XSLT)
• Identification of the individuals (links to authority records of BnF, DNB, LoC, VIAF and records in GeoNames, TGN, Wikidata)
• Import of links to authority records (<Concept>) (CSV to XML via XSLT)
• Delivery to technical partner
• Quality control and ingest into CubicWeb (+ merging)
• Publication (Web pages, Download, SPARQL-endpoint)

http://doc.biblissima-condorcet.fr/contribuer-a-biblissima ; github
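
The extraction and re-import steps of this dataflow can be sketched as a round trip. The project performs them with XSLT; the Python standard library is used here only as a stand-in. The sample record, the CSV columns (`id`, `name`, `uri`), the function names and the `example.org` URI are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical pivot fragment; the real structure is defined by the project's DTD.
SAMPLE = (
    '<RecordList><Database/><Book id="b1">'
    '<Participant role="author" id="p1"><Name>Titus Livius</Name></Participant>'
    "</Book></RecordList>"
)


def extract_authorities(root: ET.Element) -> str:
    """Step 1: list every <Participant> in a CSV with an empty 'uri' column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "name", "uri"])
    for person in root.iter("Participant"):
        writer.writerow([person.get("id", ""), person.findtext("Name", default=""), ""])
    return buffer.getvalue()


def import_links(root: ET.Element, csv_text: str) -> ET.Element:
    """Step 2: after identification, write each filled-in URI back as a <Concept>."""
    rows = csv.DictReader(io.StringIO(csv_text))
    links = {row["id"]: row["uri"] for row in rows if row["uri"]}
    # Materialise the list first so the tree is not modified while iterating.
    for person in list(root.iter("Participant")):
        uri = links.get(person.get("id"))
        if uri:
            ET.SubElement(person, "Concept").text = uri
    return root


if __name__ == "__main__":
    tree = ET.fromstring(SAMPLE)
    sheet = extract_authorities(tree)
    # Identification happens outside the code; here one link is filled in by hand.
    sheet = sheet.replace("p1,Titus Livius,", "p1,Titus Livius,https://example.org/authority/p1")
    print(ET.tostring(import_links(tree, sheet), encoding="unicode"))
```

Keeping the identification step in a CSV file means it can be done by domain experts in a spreadsheet, independently of the XML tooling.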

Page 12: Biblissima: Medieval Manuscripts and the Semantic Web

Sample Data (XML Text/Work)

Source : Europeana Regia - BnF, Ms Français 263

ERegia : BnF, MSS Français 263

Page 13: Biblissima: Medieval Manuscripts and the Semantic Web

Search + View “Work”

Titus Livius : Ab urbe condita

Page 14: Biblissima: Medieval Manuscripts and the Semantic Web

View “Text” and “Manuscript”

Page 15: Biblissima: Medieval Manuscripts and the Semantic Web

Why RDF ?

Biblissima

Page 16: Biblissima: Medieval Manuscripts and the Semantic Web

Why RDF ?

• Interlinking
• Enrichment
• Data Sharing

Page 17: Biblissima: Medieval Manuscripts and the Semantic Web

Vocabularies used for our RDF

FRBRoo

Page 18: Biblissima: Medieval Manuscripts and the Semantic Web

RDF for “Work” and “Expression”

Page 19: Biblissima: Medieval Manuscripts and the Semantic Web

More Demos

Page 20: Biblissima: Medieval Manuscripts and the Semantic Web

Manuscripts and Textual Units

Christine de Pisan (1363?-1431?)
- Epistre à la reine Isabeau
- Epistre à Eustache Morel
- Proverbes moraux
- Livre de Prudence

Stefanie Gehrke
Regarding the textual units: Associated work: ??? => replace ??? with the data.bnf.fr prefLabel of the work
Stefanie Gehrke
Text: ??? => replace ??? with author (data.bnf.fr prefLabel) + ': ' + work (data.bnf.fr prefLabel) = '[' language ']'
Stefanie Gehrke
replace Participants: Christine de Pisan (1363?-1431?) => with Author: Christine de Pisan (1363?-1431?)
Page 21: Biblissima: Medieval Manuscripts and the Semantic Web

Historical Collections

Stefanie Gehrke
change the collection URL to coldata + former owner ID (data.bnf.fr id)
Page 22: Biblissima: Medieval Manuscripts and the Semantic Web

Manuscripts and Sales Catalogues

Stefanie Gehrke
RDF
Stefanie Gehrke
regarding the relation between the Arsenal manuscripts and the manuscripts described in the sales catalogue: in RDF: frbroo:F4_Manifestation_Singleton rdf:about="URL 'BnF' " owl:sameAs rdf:resource="URL "Biblissima root + Esprit des Livres manuscript ID
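
The `owl:sameAs` relation described in this note can be sketched as two N-Triples statements. This is a minimal Python illustration, not the project's RDF export. The FRBRoo namespace URI below is an assumption (the Erlangen OWL implementation), the function name is hypothetical, and the `example.org` URLs stand in for the real BnF and Biblissima URLs.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
# Assumption: namespace of the Erlangen FRBRoo implementation; the project may use another.
FRBROO = "http://erlangen-crm.org/efrbroo/"


def triple(subject: str, predicate: str, obj: str) -> str:
    """Serialise one statement whose three terms are all IRIs, in N-Triples syntax."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def same_manuscript_triples(bnf_url: str, biblissima_root: str, edl_id: str) -> list[str]:
    """Type the BnF manuscript and link it to the sales-catalogue manuscript."""
    # Target URL = Biblissima root + Esprit des Livres manuscript ID, as in the note.
    target = biblissima_root.rstrip("/") + "/" + edl_id
    return [
        triple(bnf_url, RDF_TYPE, FRBROO + "F4_Manifestation_Singleton"),
        triple(bnf_url, OWL_SAME_AS, target),
    ]


if __name__ == "__main__":
    for line in same_manuscript_triples(
        "https://example.org/bnf/ms-1", "https://example.org/biblissima/", "42"
    ):
        print(line)
```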
Page 23: Biblissima: Medieval Manuscripts and the Semantic Web

Sales Catalogue

Stefanie Gehrke
to be added on https://extranet.logilab.fr/demo/BIBLISSIMA/inventory/132361 at the top right: "Source: Esprit des Livres"
Page 24: Biblissima: Medieval Manuscripts and the Semantic Web

Intention

• Increase visibility of the partners' databases
• Interconnect the partners' data
• Combine data from libraries and research institutes
• Provide persistent URLs
• Interlink with authority data
• For the general public AND domain experts
• And for machines

Page 25: Biblissima: Medieval Manuscripts and the Semantic Web

Technical and scientific teams BnF, IRHT, EPHE, CESR, CIHAM, ENC, CRAHAM, MRSH

Team Data “pool” Biblissima (structure and content of the application)
Doudou Dieye, IRHT (support data team)

Team Web “pool” Biblissima (front-end and iframe Mirador)

Matthieu Bonicel, coordinator “pool” Biblissima, BnF
Pierre-Yves Buard, Cyril Masset, Marjorie Burghart (technical advisors, Biblissima)

Anne-Marie Turcan-Verkerk, scientific lead of Biblissima, Campus Condorcet

Team Logilab (technical realisation)

Thank you for your attention !

Stefanie Gehrke, Data Coordinator - Coordinator Prototype - Coordinator Semantic Web Publication Biblissima