Insights in the World of Thucydides: The Hellespont Project as a research environment for Digital History. A. Thomas (Universität zu Köln; Deutsches Archäologisches Institut, Berlin), F. Mambrini (Deutsches Archäologisches Institut, Berlin), M. Romanello (Deutsches Archäologisches Institut, Berlin; King’s College, London). August 9, 2013.

Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al


DESCRIPTION

Agnes Thomas, Francesco Mambrini & Matteo Romanello (DAI, Berlin), 'Insights in the World of Thucydides: The Hellespont Project as a research environment for Digital History'. Digital Classicist London & Institute of Classical Studies seminar 2013, Friday August 9th.

The Hellespont Project (German Archaeological Institute and Tufts University) aims to integrate two of the largest online collections for the study of Antiquity, the Perseus Digital Library and the Arachne archaeological database, in a dynamic digital research environment. Historians will have access to heterogeneous materials and resources, such as ancient texts, archaeological evidence, historical background, and modern scholarly literature, while the documents related to each historical event identified in the textual evidence will be interconnected through the CIDOC-CRM model.

As a case study, Hellespont focuses on a limited historical period: the 50 years in the history of Athens between the end of the Persian Wars (479 BCE) and the outbreak of the Peloponnesian War (431 BCE). It follows the narrative of the most important written source, chapters 1.89-118 of the Histories of Thucydides, who was a contemporary of some of the events. One point of departure for the project is the annotation of Thucydides' text with multiple layers of linguistic information. Our goal is to create a "digital sourcebook" rich in machine-actionable information, where historians can find references to sources and tools that support linguistic analysis of the original texts.

Documents are bridged using the event-based CIDOC-CRM. We are working with two different concepts of events. In the CIDOC ontology, events encompass all changes of state in cultural systems; they are identified by reference to historical scholarship. In Ancient History, where event reconstruction is mostly based on the interpretation of written sources, this definition is insufficient. We are therefore implementing a data-driven approach, based on the semantic and syntactic strategies through which language expresses change in the external world. We aim to identify such strategies through a fine-grained semantic annotation of the ancient texts.

We are going to present the digitally analysed text of Thucydides, together with different kinds of additional information, in a single Virtual Research Environment (VRE). The interface, which is still being implemented, is based on the same idea as GapVis: a visual interface for reading texts that gives the user multiple views of the same passage. In the presentation we will show the most important parts of the views the user will be able to access in the interface.
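The idea of bridging documents through events can be pictured with a minimal sketch. It assumes a plain list of subject/property/object triples; the class and property names (E5 Event, P7 took place at, P11 had participant, P70 documents) follow CIDOC-CRM naming, while every identifier such as `event:oinophyta` or `article:example-001` is hypothetical and not taken from the project's data.

```python
from collections import namedtuple

# One statement in the style of an RDF triple.
Triple = namedtuple("Triple", ["subject", "prop", "obj"])

# Hypothetical identifiers; only the CIDOC-CRM names after "crm:" are real.
TRIPLES = [
    Triple("event:oinophyta", "rdf:type", "crm:E5_Event"),
    Triple("event:oinophyta", "crm:P7_took_place_at", "place:oinophyta"),
    Triple("event:oinophyta", "crm:P11_had_participant", "actor:myronides"),
    # Heterogeneous documents point at the same event node.
    Triple("text:thuc-1.108", "crm:P70_documents", "event:oinophyta"),
    Triple("article:example-001", "crm:P70_documents", "event:oinophyta"),
]


def documents_for(triples, event):
    """Return every resource stated to document the given event."""
    return [t.subject for t in triples
            if t.prop == "crm:P70_documents" and t.obj == event]


def describe(triples, subject):
    """Collect the properties asserted about one subject."""
    return {t.prop: t.obj for t in triples if t.subject == subject}


if __name__ == "__main__":
    print(documents_for(TRIPLES, "event:oinophyta"))
    print(describe(TRIPLES, "event:oinophyta"))
```

Because both the ancient passage and the modern article point at the same event node, a reader starting from either one can reach the other through the event.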


Page 1: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Introduction / The GapVis Interface / Event annotation / Secondary Literature

Insights in the World of Thucydides
The Hellespont Project as a research environment for Digital History

A. Thomas (a, b), F. Mambrini (b), M. Romanello (b, c)

(a) Universität zu Köln

(b) Deutsches Archäologisches Institut, Berlin

(c) King’s College, London

August 9, 2013

Thomas, Mambrini, Romanello: The Hellespont Project

Page 2: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al


Outline

1 Introduction

2 The GapVis Interface

3 Event annotation
   Manual event annotation
   Linguistic annotation

4 Secondary Literature

Page 3: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

The Hellespont Project
Integrating Arachne and Perseus

October 2010 - September 2013

http://arachne.uni-koeln.de/drupal/?q=de/node/231

Page 4: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Cooperating Institutions and Persons

German Archaeological Institute, Berlin: Ortwin Dally, Reinhard Förtsch, Francesco Mambrini, Matteo Romanello, Wolfgang Schmidle

The Perseus Project: Bridget Almas, Alison Babeu, Lisa Cerrato, Gregory Crane

Cologne Digital Archaeology Laboratory: Carina Berning, Robert Kummer, Alexander Recht, Marcel Riedel, Karen Schwane, Agnes Thomas

Page 5: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

GapVis for Hellespont

Named entities, linguistic information, event annotation, and bibliography connected in one interface:

A case study on Thuc. 1.89-118

Different formats (TEI, CIDOC-CRM, AGDT, PML, ...)

User interface based on GapVis:

http://nrabinowitz.github.io/gapvis

Page 6: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Book Summary

Page 7: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Entity Detail

Page 8: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Arachne Topography

Page 9: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Related Entities

Page 10: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Reading View

Page 12: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Going through secondary literature

Page 13: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Event List

Page 14: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Oinophyta Event

Page 15: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Myronides as a general

Page 17: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Natural language and events

Thuc. 1.102.2

μάλιστα δ᾿ αὐτοὺς ἐπεκαλέσαντο ὅτι τειχομαχεῖν ἐδόκουν δυνατοὶ εἶναι, τοῖς δὲ πολιορκίας μακρᾶς καθεστηκυίας τούτου ἐνδεᾶ ἐφαίνετο: βίᾳ γὰρ ἂν εἷλον τὸ χωρίον.

Translation

[The siege of Ithome proved tedious, and the Lacedaemonians called in, among other allies, the Athenians ...]

[They] invited them especially because [they] considered [them] particularly skilled in siege operations, while, since the siege for them was dragging on, [their] own deficiency in that sort of warfare was clear: for otherwise [they] would have taken the place by force.

Page 21: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

NLP Pipeline

Tokenization

POS-Tagging

Syntactic Parsing

Thematic Roles

Information Structure

Coreference Resolution

Page 22: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

NLP Pipeline

NLP Process Ancient Greek?

Chunking

Lemmatization

POS-tagging

Syntactic parsing

Word-sense disambiguation

Co-reference resolution

Semantic role annotation

Page 23: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Using and Enhancing the available resources
The Ancient Greek Dependency Treebank

AGDT: treebank with word-by-word morphological and dependency-based syntactical description

a step forward: semantic information

Page 24: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Analytical Level
“Surface” syntax

a-a-1999.01.0199_book1-chapter89_3   AuxS

οἱ   Atr
γὰρ   AuxY
Ἀθηναῖοι   Sb
τρόπῳ   Adv
τοιῷδε   Atr
ἦλθον   Pred
ἐπὶ   AuxP
τὰ   Atr
πράγματα   Obj
ἐν   AuxP
οἷς   Adv
ηὐξήθησαν   Atr
.   AuxK
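A dependency analysis of this kind can be pictured as a list of tokens, each pointing to its head. The sketch below is only an illustration: the word forms and relation labels are those of the slide, but the head indices are an assumed reconstruction (the flattened tree no longer shows its edges), and the helper names are hypothetical.

```python
# (id, form, relation, head); head 0 stands for the technical root (AuxS).
# Relation labels come from the slide; head indices are assumed.
TOKENS = [
    (1, "οἱ", "Atr", 3),
    (2, "γὰρ", "AuxY", 6),
    (3, "Ἀθηναῖοι", "Sb", 6),
    (4, "τρόπῳ", "Adv", 6),
    (5, "τοιῷδε", "Atr", 4),
    (6, "ἦλθον", "Pred", 0),
    (7, "ἐπὶ", "AuxP", 6),
    (8, "τὰ", "Atr", 9),
    (9, "πράγματα", "Obj", 7),
    (10, "ἐν", "AuxP", 12),
    (11, "οἷς", "Adv", 10),
    (12, "ηὐξήθησαν", "Atr", 9),
    (13, ".", "AuxK", 0),
]


def children(tokens, head_id):
    """Tokens directly governed by the token with the given id."""
    return [t for t in tokens if t[3] == head_id]


def render_tree(tokens, head_id=0, depth=0):
    """Return the tree as indented lines, one token per line."""
    lines = []
    for tok_id, form, relation, _head in children(tokens, head_id):
        lines.append("  " * depth + f"{form} [{relation}]")
        lines.extend(render_tree(tokens, tok_id, depth + 1))
    return lines


def subjects_of_predicate(tokens):
    """Forms of the Sb dependents attached to the Pred token(s)."""
    pred_ids = {t[0] for t in tokens if t[2] == "Pred"}
    return [t[1] for t in tokens if t[2] == "Sb" and t[3] in pred_ids]


if __name__ == "__main__":
    print("\n".join(render_tree(TOKENS)))
    print(subjects_of_predicate(TOKENS))
```

Even this surface layer already supports simple questions, such as which word is the subject of the main predicate.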

Page 25: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Valency

The verbal node expresses a little drama. As a drama, it implies a process and, most of the times, actors and circumstances

L. Tesnière

Page 26: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Tectogrammatical annotation

t-t_tree-grc-s1-root   root

γάρ1   PREC   atom
Ἀθηναῖος1   ACT   n.denot
ἔρχομαι1   enunc PRED   v
πρᾶγμα1   DIR3 state   n.denot
ὅς1   ACMP circ   n.denot
#PersPron   ACT   n.denot
αὐξάνω1   RSTR   v
τρόπος1   MANN   n.denot
τοιόσδε1   RSTR   adj.pron.def.demon
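On the tectogrammatical layer each node carries a lemma and a functor, so the "little drama" of a verb can be read off as the set of functors attached to it. The following sketch shows one way to do that. The lemmas (written here without their numeric index) and functors are those of the slide; the parent links are an assumed reconstruction, and `TNode` and `frame` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TNode:
    node_id: int
    lemma: str
    functor: str
    parent: Optional[int]  # None for the node directly under the root


# Lemmas and functors follow the slide; parent links are assumed,
# because the flattened tree no longer shows its edges.
NODES = [
    TNode(1, "ἔρχομαι", "PRED", None),
    TNode(2, "γάρ", "PREC", 1),
    TNode(3, "Ἀθηναῖος", "ACT", 1),
    TNode(4, "τρόπος", "MANN", 1),
    TNode(5, "τοιόσδε", "RSTR", 4),
    TNode(6, "πρᾶγμα", "DIR3", 1),
    TNode(7, "αὐξάνω", "RSTR", 6),
    TNode(8, "ὅς", "ACMP", 7),
    TNode(9, "#PersPron", "ACT", 7),
]


def frame(nodes, verb_lemma):
    """Map each functor to the lemma of a direct dependent of the verb."""
    heads = {n.node_id for n in nodes if n.lemma == verb_lemma}
    return {n.functor: n.lemma for n in nodes if n.parent in heads}


if __name__ == "__main__":
    # Who acts, in what manner, and towards what?
    print(frame(NODES, "ἔρχομαι"))
```

Note that the actor of the embedded verb is a reconstructed node (`#PersPron`) with no counterpart on the surface, which is exactly the kind of information the analytical layer alone cannot provide.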

Page 27: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

From treebanks to event data-bases

Page 28: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

What can you do with multi-layer trees?
“Meaningful” relations between NEs

[The Athenians] ... brought the territories of Boeotia and Phocis under their obedience, and withal razed the walls of Tanagra and took of the wealthiest of the Locrians of Opus a hundred hostages, and finished also at the same time their long walls at home (1.108.3)

Page 29: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Maps with semantically relevant relations
E.g. travels by sea

πλέω (sail)

Actor

DIR3 (to)

DIR1 (from)

The Athenians

Other NEs
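The pattern on this slide (a verb of sailing, its actor, and the DIR1/DIR3 participants) amounts to a filter over event records that yields map edges. Below is a minimal sketch, assuming events have already been reduced to a lemma plus a functor-to-entity mapping. The records are toy values with placeholder place names, not data extracted from the annotated text, and `sea_routes` is a hypothetical helper.

```python
# Toy event records; place names are placeholders, not corpus data.
EVENTS = [
    {"lemma": "πλέω",
     "roles": {"ACT": "The Athenians", "DIR1": "Place A", "DIR3": "Place B"}},
    {"lemma": "πλέω",
     "roles": {"ACT": "The Athenians", "DIR3": "Place C"}},
    {"lemma": "ἔρχομαι",
     "roles": {"ACT": "The Athenians", "DIR3": "Place D"}},
]


def sea_routes(events, verb="πλέω"):
    """Return (actor, origin, destination) edges for one verb lemma.

    The origin is None when the event has no DIR1 (from) participant.
    """
    routes = []
    for event in events:
        if event["lemma"] != verb:
            continue
        roles = event["roles"]
        if "ACT" in roles and "DIR3" in roles:
            routes.append((roles["ACT"], roles.get("DIR1"), roles["DIR3"]))
    return routes


if __name__ == "__main__":
    for actor, origin, destination in sea_routes(EVENTS):
        print(actor, ":", origin, "->", destination)
```

Each returned tuple could then be drawn as an arrow between two georeferenced named entities.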

Page 30: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

What can you do with multi-layer trees?
Extraction and analysis of events

What actions do the Athenians perform?

Page 31: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

What can you do with multi-layer trees?
Extraction and analysis of events

What actions do the Spartans perform?

Page 32: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Related Secondary Literature (from JSTOR)

Figure: http://tiny.cc/GapVis-SecLit

Page 33: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Mining JSTOR:
Where is Thuc. “hiding”?

A meaningful subsample

mining citations from all ~171k journal articles, not the best approach

curated bibliography (2009) before project started (CiteULike)

articles in JSTOR related to Thuc. 1.89-118
   343 articles, 62 journals
   journals from bibliography as “seeds”
   samples ~73k articles (out of ~171k)

top-down vs bottom-up bibliographic approach

Pros and Cons

comprehensive coverage; > 2 centuries; multilingual

data not openly licensed

Page 34: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Extracting Citations

Page 35: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

NLP Pipeline

Page 36: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Extracting Citations: Challenges

sentence segmentation
   sentence = sensible unit of context
   both for extraction and data analysis (co-citation)

dirty OCR
   invalid character sequences (e.g. \n)

“inconsistent” use of punctuation
   1, 110-15 ; 1.89.1, 1.90 ; I 1, 102, 1
   solution: reason based on domain knowledge

similar references, surface similarity
   fragments, papyri, inscriptions
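The punctuation problem can be made concrete with a small normaliser. This is a sketch under stated assumptions, not the project's extractor: it handles one reference string at a time, accepts "." or "," as separators, expands abbreviated chapter ranges such as 110-15, and uses the scope of the case study (Thuc. 1.89-118) as a piece of domain knowledge. All function names are hypothetical.

```python
import re

# Thucydides has eight books, so a small lookup covers roman numerals.
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4,
         "V": 5, "VI": 6, "VII": 7, "VIII": 8}

# book, chapter, optional chapter range, optional section;
# "." and "," are both accepted as separators.
REF = re.compile(
    r"\s*(\d+|[IVX]+)\s*[.,]\s*(\d+)(?:\s*-\s*(\d+))?(?:\s*[.,]\s*(\d+))?\s*"
)


def parse_reference(text):
    """Return (book, first_chapter, last_chapter, section) or None."""
    match = REF.fullmatch(text)
    if match is None:
        return None
    book_raw, chapter, chapter_end, section = match.groups()
    book = int(book_raw) if book_raw.isdigit() else ROMAN.get(book_raw)
    if book is None:
        return None
    first = last = int(chapter)
    if chapter_end is not None:
        last = int(chapter_end)
        # "110-15" abbreviates "110-115": borrow the missing leading digits.
        if last < first and len(chapter_end) < len(chapter):
            prefix = chapter[: len(chapter) - len(chapter_end)]
            last = int(prefix + chapter_end)
    return (book, first, last, int(section) if section else None)


def canonical(ref):
    """Render a parsed reference in one consistent style."""
    book, first, last, section = ref
    chapters = str(first) if first == last else f"{first}-{last}"
    tail = f".{section}" if section is not None else ""
    return f"Thuc. {book}.{chapters}{tail}"


def in_scope(ref, book=1, first=89, last=118):
    """True if the reference overlaps the case-study chapters."""
    return ref[0] == book and ref[1] <= last and ref[2] >= first


if __name__ == "__main__":
    for raw in ["1, 110-15", "1.89.1", "1.90", "I, 102, 1"]:
        parsed = parse_reference(raw)
        print(raw, "->", canonical(parsed), in_scope(parsed))
```

A string such as "1.89, 1.90" remains ambiguous for a purely pattern-based reader (a section number or a second reference?), which is why splitting a list of references needs reasoning about plausible book, chapter and section values rather than punctuation alone.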

Page 37: Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al

Thank you!
Our contacts and temporary development server

[email protected]

[email protected]

[email protected]

www.tiny.cc/GapVis-Hellespont
