
Page 1: Integration of complementary archaeological sources

CAA2004, April 13-17, 2004 Doerr, Schaller, Theodoridou

Integration of complementary archaeological sources

Martin Doerr, Maria Theodoridou

ICS-FORTH, Heraklion, Crete, Greece

Kurt Schaller

Magistrat der Stadt Wien, Geschäftsgruppe Kultur und Wissenschaft, Stadtarchäologie, Wien, Austria

Page 2: Integration of complementary archaeological sources

Outline

Problem statement – Working context

Objective

Approach

Technical description

Results

Conclusion, future work

Page 3: Integration of complementary archaeological sources

Project VBI ERAT LVPA: The Internet Tracks of the Roman She-Wolf

Traditional corpora:

• very high quality, difficult to maintain, difficult to search, uncorrelated to complementary resources

New database projects:

• varying quality, overlapping contents, continuously updated, easy to search, uncorrelated with each other

Altogether:

• a conglomerate of highly interrelated archaeological sources

• of overwhelming detail and volume

Ubi-erat-lupa: a European “Culture 2000” project

• an aggregation of complementary scientific databases and corpora describing finds with inscriptions and iconography of the Roman era

• to create a body of unique archaeological knowledge in digital form

Page 4: Integration of complementary archaeological sources

VBI ERAT LVPA Objective

• creation of a global index about a set of semi-autonomous sources for global access to the unified knowledge

• integration of complementary information under a common ontology/schema and identification of common elements in different sources

• development of an integration algorithm that converges to the best state of knowledge under continuous update

• creation of a research tool for formulating queries of archaeological content to detect contextual relationships that cannot be derived from interpreting the sources in isolation

Page 5: Integration of complementary archaeological sources

Approach

• Develop a semantic network based on the CIDOC CRM model to integrate the complementary archaeological sources

• Data relevant to global querying over all contents are extracted, transformed and stored in an RDF repository that is incrementally updated over time

Integration in two phases:

• the source schema is intellectually interpreted in terms of the CIDOC model

• “non-canonical” data are reported to the respective source

• mistakes in the sources are removed, improving the quality of the source

• the actual data are automatically transformed and stored in an RDF repository

• an a posteriori data cleaning process removes as many duplicates as can be (semi-)automatically detected
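The split between a hand-written schema interpretation and an automatic data transformation can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names (`title`, `type`, `location`), the sample record and the node naming are hypothetical, and plain tuples stand in for the RDF repository.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Schema interpretation (done once, by hand): source field -> CRM property
# and target class. The field names on the left are hypothetical.
FIELD_MAPPING: Dict[str, Tuple[str, str]] = {
    "title": ("P102F.has_title", "E35 Title"),
    "type": ("P2F.has_type", "E55 Type"),
    "location": ("P55F.has_current_location", "E53 Place"),
}


def transform(source: str, record: Dict[str, str]) -> Tuple[List[Triple], List[str]]:
    """Automatic step: turn one source record into triples.

    Returns the triples plus a list of problems to report back to the source.
    """
    subject = f"PO:{source}.{record['id']}"
    triples: List[Triple] = [(subject, "rdf:type", "E22 Man-Made Object")]
    problems: List[str] = []
    for field, value in record.items():
        if field == "id":
            continue
        if field not in FIELD_MAPPING:
            # Data that does not fit the mapping is reported, not guessed at.
            problems.append(f"{subject}: unmapped field '{field}'")
            continue
        if not value.strip():
            problems.append(f"{subject}: empty value for '{field}'")
            continue
        prop, target_class = FIELD_MAPPING[field]
        triples.append((subject, prop, f"{target_class}:{value.strip()}"))
    return triples, problems


# Hypothetical record; only the identifier scheme follows the slides.
record = {"id": "5", "title": "Stele des C. Iulius", "type": "Stele",
          "location": "", "remarks": "?"}
triples, problems = transform("LUPA", record)
for triple in triples:
    print(triple)
for problem in problems:
    print("report:", problem)
```

Keeping the mapping as data rather than code means a corrected source can simply be transformed again, which fits an incrementally updated repository.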

Page 6: Integration of complementary archaeological sources

The CIDOC CRM: Top-level Entities relevant for Integration

[Diagram. Entities: E39 Actor, E55 Types, E22 Man-Made Object, E34 Inscription, E31 Document, E41 Appellations, E53 Place, E5 Event. Link labels: "refer to / refine", "refer to / identify", "location", "is documented in", "participate in", "at", "affect or / refer to".]

Page 7: Integration of complementary archaeological sources

The CIDOC CRM - VBI-ERAT-LVPA: Repository Indexing

[Diagram. Labels: Sources (arachne, lupa, AE, CSIR, CIL, OPEL, grouped under stone databases, name databases and epigraphic corpora); Background knowledge / Authorities; Thesauri; CIDOC CRM; CRM entities (Objects, Events, Places); Ontology expansion; extent; Derived knowledge data (RDF).]

Page 8: Integration of complementary archaeological sources

Complementary archaeological sources

Stone databases

• Lupa: 7,000 archaeological records, City of Vienna, Austria

• Arachne: 40,000 archaeological records, Antike Plastik, Cologne

Name databases

• ONOMASTICON PROVINCIARVM EVROPAE LATINARVM (OPEL): information about the number and distribution of Roman names in the European provinces of the empire, City of Vienna, Austria

Epigraphic corpora

• CIL: Corpus Inscriptionum Latinarum

• AE: L'Année Epigraphique

• Inscriptions Clauss/Slaby: University of Frankfurt

Thesauri / Dictionaries

• TGN: Getty Thesaurus of Geographic Names

• Alexandria DL Gazetteer: 5,000,000 current place names (web service)

• Barrington Atlas of the Greek and Roman World, Map-by-Map Directory: provides information about every place or feature in the Atlas

Page 9: Integration of complementary archaeological sources

Mapping stone data bases to CIDOC-CRM

[Diagram: CIDOC CRM graph for the Lupa object PO:LUPA.5]

Nodes:
• E22 Man-Made Object: PO:LUPA.5
• E35 Title: Stele des C. Iulius
• E42 Object Identifier: OID:LUPA.5
• E55 Type: Stele
• E31 Document: CIT: CIL III 13483
• E31 Document: CIL III
• E31 Document: CIT: Vorbeck, Militarinschr (1980) Nr. 182
• E31 Document: Vorbeck
• E5 Event: DiscoveryOf:PO: LUPA.5
• E53 Place: Petronell (Carnuntum)
• E53 Place: Petronell
• E53 Place: Pannonia

Properties: P1F.is identified_by, P2F.has_type, P7F.took place at, P12B.was present at, P55F.has current location, P70B.is documented in, P89F.falls within, P102F.has_title, P106B.forms part of

Page 10: Integration of complementary archaeological sources

Mapping stone data bases to CIDOC-CRM

[Diagram: CIDOC CRM graph for the inscription on the Lupa object PO:LUPA.5]

Nodes:
• E22 Man-Made Object: PO:LUPA.5
• E34 Inscription: INSC:PO:LUPA.5
• E41 Appellation / E31 Document: CIT: CIL III, 13483
• E41 Appellation / E31 Document: CIT: Vorbeck, Militarinschr (1980) Nr. 182
• E31 Document: CIL III
• E31 Document: Vorbeck

Properties: P1F.is identified by, P65F.shows visual item, P70B.is documented in, P106B.forms part of

Literals:
• P150F.shows characters: CIULIUSCFCORNETHESSALMILLEGXVAPOLLIANNXXXISTIPXIIHSECCLVIUSETBASSUSLHP
• P151F.has transcription: C(aius) Iulius C(ai) f(ilius) Corne(lia) Thessal(onica) mil(es) leg(ionis) XV Apolli(naris) ann(orum) XXXI stip(endiorum) XII h(ic) s(itus) e(st) C(aius) Cl(u)vius et Bassus l(ibertus) h(eredes) p(osuerunt)
• P152F.has clear text: Caius Iulius Cai filius Cornelia Thessalonica miles legionis XV Apollinaris annorum XXXI stipendiorum XII hic situs est Caius Cluvius et Bassus libertus heredes posuerunt
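The three literals on this slide (characters, transcription, clear text) are related in a regular way: in the transcription, parentheses enclose letters supplied by the editor to expand an abbreviation. The sketch below derives the other two forms from the transcription. It assumes that parentheses are the only editorial marks present; the slides do not say whether the project derives the forms this way or stores them independently, and the function names are illustrative.

```python
import re

# Matches one parenthesised expansion such as "(aius)" in "C(aius)".
_EXPANSION = re.compile(r"\(([^()]*)\)")


def clear_text(transcription: str) -> str:
    """Resolve abbreviations: keep the expansions, drop the parentheses."""
    return _EXPANSION.sub(r"\1", transcription)


def characters(transcription: str) -> str:
    """Approximate the letters on the stone.

    Drops the expansions, then everything that is not a letter
    (spaces, line-break slashes), and upper-cases the rest.
    """
    visible = _EXPANSION.sub("", transcription)
    return re.sub(r"[^A-Za-z]", "", visible).upper()


# Opening words of the transcription shown on the slide.
sample = "C(aius) Iulius C(ai) f(ilius) Corne(lia) Thessal(onica) mil(es) leg(ionis) XV"
print(clear_text(sample))
print(characters(sample))
```

Comparing the derived character string with the one supplied by another source is one cheap way to notice that two records describe the same inscription, or that their readings differ.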

Page 11: Integration of complementary archaeological sources

Mapping epigraphic corpora to CIDOC-CRM

[Diagram: CIDOC CRM graph for the inscription CIL III, 13483 as recorded in the epigraphic corpora]

Nodes:
• E34 Inscription: INSC:CIL III, 13483
• E41 Appellation / E31 Document: CIT: CIL III, 13483
• E41 Appellation / E31 Document: CIT: AE 1896, 00024
• E41 Appellation / E31 Document: CIT: IG-10-02-01, 01033
• E31 Document: DOC: CIL III
• E31 Document: DOC:AE 1896
• E31 Document: DOC:IG-10-02-01

Properties: P1F.is identified by, P70B.is documented in, P106B.forms part of

Literals:
• P150F.shows characters: CIULIUSCFCORNETHESSALOMILLEGXVAPOLLIANNXXXISTIPXIIHSECCLUVIUSETBASSUSLHP
• P151F.has transcription: C(aius) Iulius / C(ai) f(ilius) Corne(lia) / Thessalo(nica) / mil(es) leg(ionis) XV / Apolli(naris) ann(orum) / XXXI stip(endiorum) XII / h(ic) s(itus) e(st) / C(aius) Cluvius / et Bassus / l(ibertus) h(eredes) p(osuerunt)

Page 12: Integration of complementary archaeological sources

Mapping OPEL to CIDOC-CRM

[Diagram: CIDOC CRM graph for OPEL name forms referring to CIL III 13483]

Nodes:
• E24 Physical Man-Made Stuff: PO:CIL III 13483
• E34 Inscription: INSC:CIL III 13483
• E5 Event: DiscoveryOf:PO: CIL III 13483
• E53 Place: Pannonia
• E31 Document: CIT: CIL III 13483
• E31 Document: CIL III
• E55 Type: cognomina
• E41 Appellation: APPEL: [B]assa
• E41 Appellation: APPEL: Bassu[s]
• E41 Appellation: APPEL: Bassus
• E41 Appellation: APPEL: Bassa
• E41 Appellation: APPEL: BASSVS*

Properties: P2F.has type, P7F.took place at, P12B.was present at, P65B.is shown by, P67B.is referred to by, P70B.is documented in, P106B.forms part of, P139F.has alternative form
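The appellations on this slide are spelling variants of closely related names, linked by "has alternative form". A rough sketch of how such variants could be grouped automatically is given below. The normalization rules are assumptions rather than the project's documented method: square brackets are treated as marks around restored letters, the trailing asterisk as a marker to strip, and V is folded into U because inscriptions write both with the same letter.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def name_key(form: str) -> str:
    """Reduce one attested name form to a crude comparison key."""
    letters = re.sub(r"[^A-Za-z]", "", form)  # drops [ ], * and spaces
    return letters.lower().replace("v", "u")  # BASSVS and Bassus coincide


def group_variants(forms: Iterable[str]) -> Dict[str, List[str]]:
    """Group name forms that share the same comparison key."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for form in forms:
        groups[name_key(form)].append(form)
    return dict(groups)


# The five forms shown on the slide, without the "APPEL:" prefix.
forms = ["[B]assa", "Bassu[s]", "Bassus", "Bassa", "BASSVS*"]
groups = group_variants(forms)
for key, members in groups.items():
    print(key, members)
```

This keeps the feminine and masculine forms in separate groups. Whether they should share one entry is a policy decision the slide does not settle.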

Page 13: Integration of complementary archaeological sources

Integration Into One Resource

[Diagram: the graphs from the stone data bases, epigraphic corpora, name data bases and thesauri/dictionaries combined into one network]

Nodes:
• E22 Man-Made Object: PO:LUPA.5
• E34 Inscription: INSC:PO:LUPA.5
• E5 Event: DiscoveryOf:PO: LUPA.5
• E53 Place: Petronell (Carnuntum)
• E53 Place: Pannonia
• E31 Document: CIL III
• E41 Appellation / E31 Document: CIT: CIL III, 13483
• E41 Appellation / E31 Document: CIT: AE 1896, 00024
• E31 Document: DOC:AE
• E34 Inscription: INSC:CIL III 13483
• E24 Physical Man-Made Stuff: PO:CIL III 13483
• E5 Event: DiscoveryOf:PO: CIL III 13483
• E41 Appellation: APPEL: Bassus
• E41 Appellation: APPEL: BASSVS*

Properties: was present at, took place at, falls within, is identified by, forms part of, is documented in, is referred to by, has alternative form, is shown by, shows visual item

Page 14: Integration of complementary archaeological sources

Identity Problem

Two approaches:

a) avoid mistaking two different items for the same one => use the local id, where uniqueness is guaranteed

b) try to find global names with a high chance of matching

The Lupa solution is a):

• We give a serial number to any new object we insert

• We use the serial number of the source database

Example: P.O: arachne.45305 or P.O: lupa.4501

• We maintain local ids in the global index as valid names and remove detected duplicates continuously

Cost-benefit optimization of over- and under-identification!
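One way to keep every local id valid while still removing duplicates is an alias table in which a merged name points to the name that survives. The sketch below is a minimal illustration of that idea, not the project's implementation; the class and method names are hypothetical, and the two ids from the example are merged purely for demonstration, since the slide does not say they describe the same object.

```python
from typing import Dict


class IdentityIndex:
    """Global index of local identifiers with duplicate merging."""

    def __init__(self) -> None:
        # Each name points to itself, or to the name it was merged into.
        self._parent: Dict[str, str] = {}

    def register(self, source: str, serial: int) -> str:
        """Mint a globally unique name from a source and its local serial."""
        name = f"PO:{source}.{serial}"
        self._parent.setdefault(name, name)
        return name

    def canonical(self, name: str) -> str:
        """Follow merge links until the surviving name is reached."""
        while self._parent[name] != name:
            name = self._parent[name]
        return name

    def merge(self, keep: str, duplicate: str) -> None:
        """Record that `duplicate` denotes the same object as `keep`."""
        keep_root = self.canonical(keep)
        dup_root = self.canonical(duplicate)
        if keep_root != dup_root:
            self._parent[dup_root] = keep_root


index = IdentityIndex()
lupa = index.register("lupa", 4501)
arachne = index.register("arachne", 45305)
print(index.canonical(arachne))  # still distinct: prints its own name
index.merge(lupa, arachne)       # a duplicate was detected later
print(index.canonical(arachne))  # now resolves to the lupa name
```

Starting from local ids errs on the side of under-identification: nothing is merged wrongly at load time, and each later merge is an explicit, recorded step.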

Page 15: Integration of complementary archaeological sources

Reactive Data Cleaning: Initial Data

[Diagram: two E22 Man-Made Object nodes, PO:LUPA.2849 and PO:ARACHNE.80581, each linked by "shows visual item" to the same E34 Inscription INSC:CIL III 10514]

Nodes:
• E34 Inscription: INSC:CIL III 10514
• E22 Man-Made Object: PO:LUPA.2849
• E35 Title: Stele des Nertus Lingauster
• E42 Object Identifier: OID:LUPA.2849
• E22 Man-Made Object: PO:ARACHNE.80581
• E35 Title: Grabstele des Nertus
• E42 Object Identifier: OID:ARACHNE.80581
• E55 Type: Stele

Properties: has title, is identified by, has type, shows visual item

Page 16: Integration of complementary archaeological sources

Reactive Data Cleaning: Result

[Diagram: a single E22 Man-Made Object node, PO:LUPA.2849, linked by "shows visual item" to the E34 Inscription INSC:CIL III 10514 and carrying both titles and both object identifiers]

Nodes:
• E34 Inscription: INSC:CIL III 10514
• E22 Man-Made Object: PO:LUPA.2849
• E35 Title: Stele des Nertus Lingauster
• E35 Title: Grabstele des Nertus
• E42 Object Identifier: OID:LUPA.2849
• E42 Object Identifier: OID:ARACHNE.80581
• E55 Type: Stele

Properties: has title, is identified by, has type, shows visual item
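Read together, the initial-data and result diagrams suggest a simple rule: two object nodes that show the same inscription are treated as one object, and the surviving node inherits the titles and identifiers of both. The sketch below applies that rule to plain triples. The rule itself, the choice to keep the Lupa name, and the attachment of "has type" to the Lupa node are assumptions read off the diagrams, not a documented algorithm.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Initial data as read from the diagram (edge endpoints are an assumption).
initial: Set[Triple] = {
    ("PO:LUPA.2849", "has title", "Stele des Nertus Lingauster"),
    ("PO:LUPA.2849", "is identified by", "OID:LUPA.2849"),
    ("PO:LUPA.2849", "has type", "Stele"),
    ("PO:LUPA.2849", "shows visual item", "INSC:CIL III 10514"),
    ("PO:ARACHNE.80581", "has title", "Grabstele des Nertus"),
    ("PO:ARACHNE.80581", "is identified by", "OID:ARACHNE.80581"),
    ("PO:ARACHNE.80581", "shows visual item", "INSC:CIL III 10514"),
}


def merge_on_shared_inscription(triples: Set[Triple],
                                preferred: str = "PO:LUPA.") -> Set[Triple]:
    """Merge object nodes that show the same inscription (single pass)."""
    carriers: Dict[str, List[str]] = {}
    for subject, prop, obj in triples:
        if prop == "shows visual item":
            carriers.setdefault(obj, []).append(subject)
    rename: Dict[str, str] = {}
    for objects in carriers.values():
        # Keep a name from the preferred source if there is one.
        keep = min(objects, key=lambda name: (not name.startswith(preferred), name))
        for name in objects:
            rename[name] = keep
    # Rewriting into a set drops the triples that became identical.
    return {(rename.get(s, s), p, rename.get(o, o)) for s, p, o in triples}


cleaned = merge_on_shared_inscription(initial)
for triple in sorted(cleaned):
    print(triple)
```

The sketch handles one round of merging only; chains of duplicates across several inscriptions would need repeated passes or an alias table.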

Page 17: Integration of complementary archaeological sources

VBI ERAT LVPA Results

A method and architecture for integration of diverse archaeological corpora on the Roman stone monuments under the CIDOC CRM model.

We developed an efficient way for place name recognition.

We are developing a research tool suitable for formulating queries and drawing conclusions on archaeological data

detection of contextual relationships that cannot be derived from interpreting the sources in isolation

a method of identifying epigraphic references and finds

test bed for the CIDOC CRM model - proved its adequacy

First large scale integration project of multiple complementary resources as a global index to the original sources

Page 18: Integration of complementary archaeological sources

Future work

integrate more data sources

support a mechanism to visualize a source

support an automatic mapping process so that archaeologists will be able to maintain the system by themselves.