Knowledge engineering techniques for the creation of a semantic digital edition of Saussure's manuscripts

Gilles Falquet, Luka Nerima, Massimo Brero

Fribourg Workshop – 27.02.2014

Université de Genève - CUI

1.  Storage, visualization, annotation, and transcription of manuscripts by Ferdinand de Saussure

2.  Digital scholarly publishing of manuscripts

A system for the visualization, annotation, and transcription of manuscripts by Ferdinand de Saussure

Ferdinand de Saussure: Swiss linguist (1857 – 1913)

Famous for
•  modern linguistics
•  structuralism
•  Cours de linguistique générale

Very few publications in his lifetime, but 15'000 sheets of paper given to libraries (Harvard, Paris, Geneva)

Aims of the project

A usable tool for researchers

1.  Visualization
2.  Annotation
3.  Transcription

of manuscripts by F. de Saussure

Typical (human) task: reconstructing the reading order

Main concepts

[Diagram of the main concepts: Pictures, Writing surface, Covered surface, Zone, Annotation, Transcription, Transcription element]

Data/Knowledge Model

Represent
•  basic metadata about manuscripts (location, date, image file, ...)
•  (scientific) transcriptions
•  annotations
•  semantic annotations

Available on the semantic web
•  expressed in RDF/S
•  stored in an RDF triple store
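To make the idea of a triple store concrete, here is a minimal in-memory sketch, not the project's actual store. The property and class names (`x:hasImage`, `x:zoneOf`, `x:annotates`, `x:WritingSurface`) are hypothetical and chosen only for illustration. The identifiers follow the project's URI style.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical vocabulary: these predicates and classes are assumptions,
# not the project's real schema.
TRIPLES = [
    Triple("x:ms_fr_03951_10_f028v_029", "rdf:type", "x:WritingSurface"),
    Triple("x:ms_fr_03951_10_f028v_029", "x:hasImage",
           "x:ms_fr_03951_10_f028v_029-DOT-jp2"),
    Triple("x:ms_fr_03951_10_f028v_029_Z_001", "x:zoneOf",
           "x:ms_fr_03951_10_f028v_029"),
    Triple("x:ms_fr_03951_10_f028v_029_Z_001_annot_001", "x:annotates",
           "x:ms_fr_03951_10_f028v_029_Z_001"),
]


def match(triples: list[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> list[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    ]


if __name__ == "__main__":
    # Which zones belong to this writing surface?
    for t in match(TRIPLES, p="x:zoneOf", o="x:ms_fr_03951_10_f028v_029"):
        print(t.subject)
```

A real RDF store answers the same kind of pattern query through SPARQL; the wildcard matching above only illustrates the data shape.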

From classification numbers to URIs

Semantic web => universal identification (URI)
•  library classification number → URI

Example (BGE)
•  Shelfmark: Ms. fr. 3951/10, f. 28
•  File name: ms_fr_03951_10_f028v_029.tif

URI:
•  x:ms_fr
•  x:ms_fr_03951
•  x:ms_fr_03951_10
•  x:ms_fr_03951_10_f028v_029
•  x:ms_fr_03951_10_f028v_029-DOT-jp2
•  x:ms_fr_03951_10_f028v_029_Z_001
•  x:ms_fr_03951_10_f028v_029_Z_001_annot_001
•  x:ms_fr_03951_10_f028v_029_Z_001_Shape_001
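The URI hierarchy above can be derived mechanically from the file name. The following is a minimal sketch that reproduces the listed URIs for the example file. The function names and the regular expression are assumptions inferred from this single example, not the project's code.

```python
import re
from pathlib import Path

PREFIX = "x:"  # placeholder namespace prefix, as written on the slide

# Assumed file-name pattern, inferred from the one example on the slide:
# <collection>_<number>_<part>_<folio>, e.g. ms_fr_03951_10_f028v_029
_NAME = re.compile(r"(ms_[a-z]+)_(\d+)_(\d+)_(f.+)")


def hierarchy_uris(filename: str) -> list[str]:
    """URIs from the collection down to the writing surface."""
    stem = Path(filename).stem
    m = _NAME.fullmatch(stem)
    if m is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    collection, number, part, _folio = m.groups()
    levels = [
        collection,
        f"{collection}_{number}",
        f"{collection}_{number}_{part}",
        stem,
    ]
    return [PREFIX + level for level in levels]


def image_uri(filename: str, extension: str = "jp2") -> str:
    """URI of an image file; the dot is spelled out as -DOT-."""
    return f"{PREFIX}{Path(filename).stem}-DOT-{extension}"


def zone_uri(filename: str, zone_no: int) -> str:
    return f"{PREFIX}{Path(filename).stem}_Z_{zone_no:03d}"


def annotation_uri(zone: str, annotation_no: int) -> str:
    return f"{zone}_annot_{annotation_no:03d}"


def shape_uri(zone: str, shape_no: int) -> str:
    return f"{zone}_Shape_{shape_no:03d}"


if __name__ == "__main__":
    name = "ms_fr_03951_10_f028v_029.tif"
    zone = zone_uri(name, 1)
    for uri in hierarchy_uris(name) + [image_uri(name), zone,
                                       annotation_uri(zone, 1),
                                       shape_uri(zone, 1)]:
        print(uri)
```

Because each URI is a prefix of the URIs below it, the position of a zone or annotation in the collection can be read from its identifier alone.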

Data Model

System / User Interface

Manuscript visualization

IIP Image server

•  Tiles
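Serving a large scan as tiles means the viewer only fetches the small squares that intersect the visible region. The sketch below shows the arithmetic involved. The tile size of 256 pixels and the halving of the resolution at each level are assumptions for illustration, not settings taken from the project's server.

```python
def tile_grid(width: int, height: int, tile_size: int = 256) -> tuple[int, int]:
    """Number of tile columns and rows needed to cover an image."""
    cols = (width + tile_size - 1) // tile_size
    rows = (height + tile_size - 1) // tile_size
    return cols, rows


def pyramid(width: int, height: int, tile_size: int = 256) -> list[tuple[int, int, int, int]]:
    """Resolution levels as (width, height, cols, rows), full size first.

    Each level halves the previous one (rounding up) until the image
    fits in a single tile.
    """
    levels = []
    w, h = width, height
    while True:
        levels.append((w, h, *tile_grid(w, h, tile_size)))
        if w <= tile_size and h <= tile_size:
            break
        w, h = (w + 1) // 2, (h + 1) // 2
    return levels


def tiles_for_viewport(x: int, y: int, view_w: int, view_h: int,
                       tile_size: int = 256) -> list[tuple[int, int]]:
    """(column, row) of every tile that intersects the viewport."""
    first_col, first_row = x // tile_size, y // tile_size
    last_col = (x + view_w - 1) // tile_size
    last_row = (y + view_h - 1) // tile_size
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


if __name__ == "__main__":
    for level in pyramid(1000, 600):
        print(level)
    print(tiles_for_viewport(200, 0, 100, 100))
```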

Creating Annotations (texts or concepts)

Navigation in the corpus

Full text search

System Architecture

•  Web Server / Front end (REST)
•  Back end: storage control (updates, authentication)
•  Image import

Example: Inserting a new annotation

•  Insert request sent to the RDF server
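An insert request of this kind could be expressed as a SPARQL 1.1 Update `INSERT DATA` statement posted to the store. The sketch below only builds the request and does not send it. The namespace, the endpoint URL, and the property names (`x:annotates`, `x:text`, `x:Annotation`) are hypothetical, since the slides do not show the actual request.

```python
import urllib.request

# Hypothetical namespace and endpoint, used for illustration only.
NAMESPACE = "http://example.org/saussure#"
ENDPOINT = "http://localhost:3030/saussure/update"


def escape_literal(text: str) -> str:
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def build_insert(zone_id: str, annotation_no: int, text: str) -> str:
    """Build an INSERT DATA update that attaches a text annotation to a zone."""
    annotation = f"x:{zone_id}_annot_{annotation_no:03d}"
    return (
        f"PREFIX x: <{NAMESPACE}>\n"
        "INSERT DATA {\n"
        f"  {annotation} a x:Annotation ;\n"
        f"    x:annotates x:{zone_id} ;\n"
        f'    x:text "{escape_literal(text)}" .\n'
        "}"
    )


def build_request(endpoint: str, update: str) -> urllib.request.Request:
    """Wrap the update in an HTTP POST (SPARQL 1.1 Protocol, direct update)."""
    return urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


if __name__ == "__main__":
    update = build_insert("ms_fr_03951_10_f028v_029_Z_001", 1, 'mot "signe"')
    print(update)
    request = build_request(ENDPOINT, update)
    print(request.get_method(), request.full_url)
```

Escaping the annotation text before embedding it keeps user input from breaking out of the literal, which matters once annotations are entered through a web front end.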

Usability Testing

Methodology
•  14 users (linguists, librarians, ...)
•  13 tasks (4 scenarios): find a manuscript, create an annotation, ...
•  Measurements: number of completed tasks, time to complete each task, user satisfaction (System Usability Scale (SUS) questionnaire)
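For reference, the standard SUS scoring rule turns ten answers on a 1 to 5 scale into a score between 0 and 100. Odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5. The sketch below implements that rule. The sample answers are made up and are not the study's data.

```python
from statistics import mean


def sus_score(responses: list[int]) -> float:
    """Score one SUS questionnaire (ten answers, each from 1 to 5)."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for item_no, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError(f"item {item_no}: answer must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item_no % 2 == 1 else (5 - answer)
    return total * 2.5


def mean_sus(all_responses: list[list[int]]) -> float:
    """Average SUS score over several participants."""
    return mean(sus_score(r) for r in all_responses)


if __name__ == "__main__":
    # Made-up answers from two hypothetical participants.
    participants = [
        [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
        [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    ]
    for answers in participants:
        print(sus_score(answers))
    print(mean_sus(participants))
```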

Results

[Charts: task completion by task; task completion by user. Values shown: 100%, 85%, 50%]

Satisfaction evaluation

[Charts: SUS scores (by question); SUS scores (by user). Value shown: 68]

Demo Site

fds.unige.ch/iipmooviewer/homepage.php

Digital scholarly publishing of manuscripts

A knowledge representation and management model ... and a system for the digital edition of large corpora of original works

Context and goals

Digital Critical Edition (DCE) – current state
•  based on paper critical editions
•  DCE of Nietzsche, Peirce, Wittgenstein
•  other obstacles: no scientific catalogue

Digital edition of Saussure’s manuscripts project
•  to provide a cooperative edition platform for the next 20 years
•  to use computers as convergence and mediation tools
•  the scientific catalogue and the critical edition will be the outputs

Digital editions as knowledge networks

[Diagram built up over three slides: a network of resources (Manuscripts, Transcriptions, Terminologies, Articles/Monographs, Ontologies), to which Semantic indexes and Alignment are added, and then Inferred relations]

Knowledge modeling challenge

To represent the current state of our knowledge about the manuscripts

Different types of resources
•  direct transcriptions
•  scholarly transcriptions
•  related terminologies, ontologies, dictionaries
•  annotations
•  ...

and resource interconnections
•  semantic indexes
•  text alignments / ontology alignments

Operations

[Diagram: resources (Manuscripts, Transcriptions, Multiword lexical units, Articles/Monographs, Ontologies) and operations (Handwriting recognition, MLU extraction, Semantic indexing, Ontology alignment)]

Alignment operations: finding correspondences between elements of different resources
•  aligning ontologies
•  aligning texts at the sentence or term level

Enrichment operations: creating new resources that describe an existing one
•  add transcriptions to manuscript pictures
•  extract collocations from texts
•  create a semantic index

•  Specific to each type of resource
•  Based on OCR, NLP, AI algorithms

Challenge: define a minimal and expressive set of operations
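As an illustration of one enrichment operation, the sketch below creates a semantic index by matching concept labels against transcription text. It uses naive exact matching on normalized labels, which is a stand-in technique and not the project's method. The concept identifiers and the sample sentences are made up.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, strip accents, and keep only alphanumeric tokens."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(re.findall(r"[a-z0-9]+", stripped))


def build_semantic_index(transcriptions: dict[str, str],
                         concept_labels: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each concept to the transcriptions in which one of its labels occurs."""
    normalized_labels = {
        concept: [n for n in (normalize(label) for label in labels) if n]
        for concept, labels in concept_labels.items()
    }
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in transcriptions.items():
        padded = f" {normalize(text)} "  # padding enforces whole-word matches
        for concept, labels in normalized_labels.items():
            if any(f" {label} " in padded for label in labels):
                index[concept].add(doc_id)
    return dict(index)


if __name__ == "__main__":
    # Made-up sample text, not actual transcriptions from the corpus.
    transcriptions = {
        "t1": "Le signe linguistique unit un concept et une image acoustique.",
        "t2": "La langue est un système de signes.",
    }
    # Hypothetical concept identifiers and labels.
    concepts = {
        "c:Sign": ["signe", "signe linguistique"],
        "c:Langue": ["langue"],
    }
    print(build_semantic_index(transcriptions, concepts))
```

Exact matching misses inflected forms (the plural "signes" in `t2` does not match the label "signe"), which is one reason such operations typically rely on NLP techniques rather than string comparison.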

System/Workbench for linguists/knowledge engineers

Transcription acquisition
•  crowdsourcing

Indexing
•  word spotting, handwriting recognition?

Knowledge network operations
•  NLP techniques for multiword lexical unit extraction
•  terminology extraction
•  semantic indexing
•  resource alignment (existing ontologies, terminologies, ...)
•  define operation workflows
•  define virtual (hyper) document generation
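To give a feel for multiword lexical unit extraction, the sketch below proposes candidate units by counting adjacent word pairs that contain no stopword. This is a deliberately simple frequency heuristic chosen for illustration. The slides do not describe the NLP techniques the project uses, and the stopword list and sample sentences are made up.

```python
import re
from collections import Counter

# Tiny illustrative French stopword list (an assumption, far from complete).
STOPWORDS = {"le", "la", "les", "l", "de", "du", "des", "d",
             "un", "une", "et", "est", "pas"}


def tokenize(text: str) -> list[str]:
    """Lowercase alphabetic tokens; accented letters are kept."""
    return re.findall(r"[^\W\d_]+", text.lower())


def candidate_mlus(texts: list[str], min_count: int = 2) -> list[tuple[str, int]]:
    """Adjacent word pairs without stopwords, seen at least min_count times."""
    counts: Counter = Counter()
    for text in texts:
        tokens = tokenize(text)
        for first, second in zip(tokens, tokens[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                counts[(first, second)] += 1
    return [(" ".join(pair), n)
            for pair, n in counts.most_common()
            if n >= min_count]


if __name__ == "__main__":
    # Made-up sample sentences, not actual transcriptions.
    sample = [
        "Le signe linguistique est arbitraire.",
        "Le signe linguistique unit un concept et une image acoustique.",
        "L'image acoustique n'est pas le son.",
    ]
    for unit, count in candidate_mlus(sample):
        print(unit, count)
```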

Thank you

Questions?