CAA2004, April 13-17, 2004 Doerr, Schaller, Theodoridou
Integration of complementary archaeological sources
Martin Doerr, Maria Theodoridou
ICS-FORTH, Heraklion, Crete, Greece
Kurt Schaller
Magistrat der Stadt Wien, Geschäftsgruppe Kultur und Wissenschaft – Stadtarchäologie, Wien, Austria
Outline
Problem statement – Working context
Objective
Approach
Technical description
Results
Conclusion, future work
Project VBI ERAT LVPA The Internet Tracks of the Roman She-Wolf
Traditional corpora:
• very high quality, difficult to maintain, difficult to search, uncorrelated with complementary resources
New Database Projects:
• varying quality, overlapping contents, continuously updated, easy to search, uncorrelated with each other
Altogether:
• A conglomerate of highly interrelated archaeological sources
• of overwhelming detail and volume
Ubi-erat-lupa: A European “Culture 2000” Project
• An aggregation of complementary scientific databases and corpora describing finds with inscriptions and iconography of the Roman era
• to create a body of unique archaeological knowledge in digital form.
VBI ERAT LVPA Objective
• creation of a global index about a set of semi-autonomous sources for global access to the unified knowledge
• integration of complementary information under a common ontology/schema and identification of common elements in different sources
• development of an integration algorithm that converges to the best state of knowledge under continuous update
• creation of a research tool for formulating queries of archaeological content to detect contextual relationships that cannot be derived from interpreting the sources in isolation
Approach
• Develop a semantic network based on the CIDOC CRM model to integrate the complementary archaeological sources
• Data relevant to global querying over all contents are extracted, transformed and stored in an RDF repository that is incrementally updated over time.
Integration in two phases:
• source schema is intellectually interpreted in terms of the CIDOC model
• “non canonical” data reported to respective source
• mistakes in sources removed, quality of source improved
• actual data automatically transformed and stored into an RDF repository
• an a posteriori data cleaning process removes as many duplicates as can be (semi-) automatically detected
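The two phases can be pictured as a declared mapping followed by an automatic transformation. The following is a minimal sketch, not the project's implementation: it assumes a hypothetical source record with the fields `id`, `title`, `type` and `remarks`, a hand-written field-to-property table standing in for the intellectual interpretation of the schema, and plain Python tuples standing in for RDF statements. The property labels are the ones used on the mapping slides of this presentation.

```python
# Phase 1 (intellectual): a curator declares how each source field maps to a
# CRM property and target class. The field names are hypothetical.
FIELD_MAPPING = {
    "title": ("P102F.has_title", "E35 Title"),
    "type": ("P2F.has_type", "E55 Type"),
}


def record_to_triples(source, record):
    """Phase 2 (automatic): turn one source record into (s, p, o) triples."""
    obj = f"PO:{source}.{record['id']}"
    triples = {(obj, "rdf:type", "E22 Man-Made Object")}
    non_canonical = []
    for field, value in record.items():
        if field == "id":
            continue
        if field not in FIELD_MAPPING:
            # Data that does not fit the mapping is collected so that it can
            # be reported back to the respective source.
            non_canonical.append(field)
            continue
        prop, crm_class = FIELD_MAPPING[field]
        triples.add((obj, prop, value))
        triples.add((value, "rdf:type", crm_class))
    return triples, non_canonical


triples, rejected = record_to_triples(
    "LUPA",
    {"id": 5, "title": "Stele des C. Iulius", "type": "Stele", "remarks": "?"},
)
```

Keeping the mapping as data rather than code means that correcting an interpretation only requires editing the table and re-running the transformation, which fits an incrementally updated repository.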
The CIDOC CRM: Top-level Entities relevant for Integration
[Diagram: the top-level entities E39 Actor, E55 Types, E22 Man-Made Object, E34 Inscription, E31 Document, E41 Appellations, E53 Place and E5 Event, connected by the relationships "refer to / refine", "refer to / identify", "location", "at", "is documented in", "participate in" and "affect or / refer to".]
The CIDOC CRM – VBI-ERAT-LVPA Repository Indexing
[Diagram: repository architecture with the following labels.]
Sources:
• Stone databases: arachne, lupa
• Name data bases: OPEL
• Epigraphic corpora: AECSIR, CIL
Background knowledge / Authorities:
• Thesauri extent
• CIDOC CRM
• Ontology expansion
Derived knowledge data (RDF):
• CRM entities: Objects, Events, Places
Complementary archaeological sources
Stone data bases
• Lupa - 7,000 archaeological records, City of Vienna, Austria
• Arachne - 40,000 archaeological records, Antike Plastik, Cologne
Name data bases
• ONOMASTICON PROVINCIARVM EVROPAE LATINARVM (OPEL): information about the amount and distribution of Roman names in the European provinces of the empire, City of Vienna, Austria
Epigraphic corpora
• CIL – Corpus Inscriptionum Latinarum
• AE – L'Année Epigraphique
• Inscriptions Clauss/Slaby – University of Frankfurt
Thesauri / Dictionaries
• TGN – Getty Thesaurus of Geographic Names
• Alexandria DL Gazetteer – 5,000,000 current place names (web service)
• Barrington Atlas of the Greek and Roman World Map-by-Map Directory – provides information about every place or feature in the Atlas
Mapping stone data bases to CIDOC-CRM
[Diagram: CRM graph for the stone monument PO:LUPA.5]
Nodes:
• E22 Man-Made Object: PO:LUPA.5
• E35 Title: Stele des C. Iulius
• E42 Object Identifier: OID:LUPA.5
• E55 Type: Stele
• E31 Document: CIT: CIL III 13483
• E31 Document: CIL III
• E31 Document: CIT: Vorbeck, Militarinschr (1980) Nr. 182
• E31 Document: Vorbeck
• E5 Event: DiscoveryOf:PO: LUPA.5
• E53 Place: Petronell (Carnuntum)
• E53 Place: Petronell
• E53 Place: Pannonia
Properties:
• P1F.is identified_by
• P2F.has_type
• P102F.has_title
• P70B.is documented in
• P106B.forms part of
• P12B.was present at
• P7F.took place at
• P55F.has current location
• P89F.falls within
Mapping stone data bases to CIDOC-CRM
[Diagram: CRM graph for the inscription on PO:LUPA.5]
Nodes:
• E22 Man-Made Object: PO:LUPA.5
• E34 Inscription: INSC:PO:LUPA.5
• E41 Appellation / E31 Document: CIT: CIL III, 13483
• E41 Appellation / E31 Document: CIT: Vorbeck, Militarinschr (1980) Nr. 182
• E31 Document: CIL III
• E31 Document: Vorbeck
Properties:
• P65F.shows visual item
• P1F.is identified by
• P70B.is documented in
• P106B.forms part of
Literals:
• P150F.shows characters: CIULIUSCFCORNETHESSALMILLEGXVAPOLLIANNXXXISTIPXIIHSECCLVIUSETBASSUSLHP
• P151F.has transcription: C(aius) Iulius C(ai) f(ilius) Corne(lia) Thessal(onica) mil(es) leg(ionis) XV Apolli(naris) ann(orum) XXXI stip(endiorum) XII h(ic) s(itus) e(st) C(aius) Cl(u)vius et Bassus l(ibertus) h(eredes) p(osuerunt)
• P152F.has clear text: Caius Iulius Cai filius Cornelia Thessalonica miles legionis XV Apollinaris annorum XXXI stipendiorum XII hic situs est Caius Cluvius et Bassus libertus heredes posuerunt
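The three literals attached to the inscription are closely related: the transcription marks expansions of abbreviations in round brackets, the character string keeps only what is on the stone, and the clear text keeps the expansions without the brackets. The slides do not say how the literals were produced; the sketch below simply assumes that the other two forms can be derived from the transcription. The function names `characters` and `clear_text` are illustrative.

```python
import re


def clear_text(transcription):
    """Resolve abbreviations: keep the expansions, drop brackets and line breaks."""
    text = transcription.replace("(", "").replace(")", "").replace("/", " ")
    return " ".join(text.split())


def characters(transcription):
    """Keep only the letters outside the round brackets, in upper case."""
    without_expansions = re.sub(r"\([^)]*\)", "", transcription)
    return re.sub(r"[^A-Za-z]", "", without_expansions).upper()


transcription = (
    "C(aius) Iulius C(ai) f(ilius) Corne(lia) Thessal(onica) mil(es) "
    "leg(ionis) XV Apolli(naris) ann(orum) XXXI stip(endiorum) XII "
    "h(ic) s(itus) e(st) C(aius) Cl(u)vius et Bassus l(ibertus) "
    "h(eredes) p(osuerunt)"
)
print(characters(transcription))
print(clear_text(transcription))
```

Applied to the transcription on this slide, the two functions reproduce the "shows characters" and "has clear text" literals. A normalized character string of this kind is a convenient key for comparing the same inscription across sources, since it is insensitive to editorial differences in how abbreviations are expanded.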
Mapping epigraphic corpora to CIDOC-CRM
[Diagram: CRM graph for the inscription INSC:CIL III, 13483]
Nodes:
• E34 Inscription: INSC:CIL III, 13483
• E41 Appellation / E31 Document: CIT: CIL III, 13483
• E41 Appellation / E31 Document: CIT: AE 1896, 00024
• E41 Appellation / E31 Document: CIT: IG-10-02-01, 01033
• E31 Document: DOC: CIL III
• E31 Document: DOC:AE 1896
• E31 Document: DOC:IG-10-02-01
Properties:
• P1F.is identified by
• P70B.is documented in
• P106B.forms part of
Literals:
• P150F.shows characters: CIULIUSCFCORNETHESSALOMILLEGXVAPOLLIANNXXXISTIPXIIHSECCLUVIUSETBASSUSLHP
• P151F.has transcription: C(aius) Iulius / C(ai) f(ilius) Corne(lia) / Thessalo(nica) / mil(es) leg(ionis) XV / Apolli(naris) ann(orum) / XXXI stip(endiorum) XII / h(ic) s(itus) e(st) / C(aius) Cluvius / et Bassus / l(ibertus) h(eredes) p(osuerunt)
Mapping OPEL to CIDOC-CRM
[Diagram: CRM graph for the name entries related to CIL III 13483]
Nodes:
• E24 Physical Man-Made Stuff: PO:CIL III 13483
• E34 Inscription: INSC:CIL III 13483
• E5 Event: DiscoveryOf:PO: CIL III 13483
• E53 Place: Pannonia
• E31 Document: CIT: CIL III 13483
• E31 Document: CIL III
• E55 Type: cognomina
• E41 Appellation: APPEL: BASSVS*
• E41 Appellation: APPEL: Bassus
• E41 Appellation: APPEL: Bassa
• E41 Appellation: APPEL: Bassu[s]
• E41 Appellation: APPEL: [B]assa
Properties:
• P2F.has type
• P139F.has alternative form
• P67B.is referred to by
• P65B.is shown by
• P70B.is documented in
• P106B.forms part of
• P12B.was present at
• P7F.took place at
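The appellation nodes on this slide are name variants tied together by "has alternative form". A small sketch of how such variants could be indexed for lookup follows. It rests on two assumptions that the slide does not state: that "BASSVS*" is the lemma grouping the other forms, and that editorial square brackets and letter case can be ignored when comparing names. The names `normalize` and `build_index` are illustrative.

```python
def normalize(name):
    """Drop editorial brackets and case so '[B]assa' and 'Bassa' compare equal."""
    return name.replace("[", "").replace("]", "").lower()


# Lemma and variants as they appear on the slide; treating BASSVS* as the
# lemma that groups the others is an assumption of this sketch.
ALTERNATIVE_FORMS = {
    "BASSVS*": ["[B]assa", "Bassu[s]", "Bassus", "Bassa"],
}


def build_index(alternative_forms):
    """Map every normalized variant to its lemma."""
    index = {}
    for lemma, variants in alternative_forms.items():
        for variant in variants:
            index[normalize(variant)] = lemma
    return index


index = build_index(ALTERNATIVE_FORMS)
print(index.get(normalize("Bassus")))
```

A name read from an inscription can then be resolved to its lemma with a single dictionary lookup, and names that are not in the index return `None` rather than a guessed match.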
Integration Into One Resource
[Diagram: the graphs from the previous slides joined into one, with a legend distinguishing Thesauri/Dictionaries, Stone data bases, Name data bases and Epigraphic corpora]
Nodes:
• E22 Man-Made Object: PO:LUPA.5
• E24 Physical Man-Made Stuff: PO:CIL III 13483
• E34 Inscription: INSC:PO:LUPA.5
• E34 Inscription: INSC:CIL III 13483
• E5 Event: DiscoveryOf:PO: LUPA.5
• E5 Event: DiscoveryOf:PO: CIL III 13483
• E53 Place: Petronell (Carnuntum)
• E53 Place: Pannonia
• E41 Appellation / E31 Document: CIT: CIL III, 13483
• E41 Appellation / E31 Document: CIT: AE 1896, 00024
• E31 Document: CIL III
• E31 Document: DOC:AE
• E41 Appellation: APPEL: Bassus
• E41 Appellation: APPEL: BASSVS*
Properties:
• shows visual item
• is shown by
• is identified by
• is documented in
• forms part of
• is referred to by
• has alternative form
• was present at
• took place at
• falls within
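Once every source is expressed in the same vocabulary, integration can be pictured as a union of statements in which identically named nodes coincide. The sketch below is an illustration under stated assumptions, not the project's algorithm: statements are plain tuples, the node names follow the slides, and the assignment of each statement to a source, as well as the endpoints of each edge, are chosen for illustration only.

```python
from collections import defaultdict

# Statements contributed by each source. Node names follow the slides; which
# source contributes which edge is an illustrative assumption.
SOURCES = {
    "stone": {
        ("PO:LUPA.5", "shows visual item", "INSC:PO:LUPA.5"),
        ("INSC:PO:LUPA.5", "is documented in", "CIT: CIL III, 13483"),
    },
    "epigraphic": {
        ("INSC:CIL III 13483", "is identified by", "CIT: CIL III, 13483"),
        ("CIT: CIL III, 13483", "forms part of", "CIL III"),
    },
    "names": {
        ("APPEL: Bassus", "is referred to by", "INSC:CIL III 13483"),
    },
}


def integrate(sources):
    """Union all statements; identically named nodes merge automatically."""
    merged = set()
    for triples in sources.values():
        merged |= triples
    return merged


def shared_nodes(sources):
    """Nodes mentioned by more than one source: the points where graphs join."""
    seen = defaultdict(set)
    for name, triples in sources.items():
        for subject, _, obj in triples:
            seen[subject].add(name)
            seen[obj].add(name)
    return {node for node, names in seen.items() if len(names) > 1}


graph = integrate(SOURCES)
joins = shared_nodes(SOURCES)
```

In this toy data the citation node and the corpus inscription node are the join points. They are what allows a query to travel from a stone monument in one database to a name entry in another.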
Identity Problem
Two approaches:
a) avoid taking two different items for the same => use local ids, where uniqueness is guaranteed
b) try to find global names with a high chance to match.
The Lupa solution is a):
• We give a serial number to any new object we insert
• We use the serial number of the source database.
Example: P.O: arachne.45305 or P.O: lupa.4501
• We maintain local ids in the global index as valid names and remove detected duplicates continuously.
Cost-benefit optimization of over- and under-identification!
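Approach a) can be sketched as a naming function plus an index that keeps every local id valid after duplicates are merged. This is a minimal illustration: the class `GlobalIndex` and its methods are hypothetical names, and only the identifier format is taken from the example on this slide.

```python
def local_id(source, serial):
    """Build an identifier that is unique as long as the source's serials are."""
    return f"P.O: {source}.{serial}"


class GlobalIndex:
    """Keeps every local id as a valid name, even after duplicates are merged."""

    def __init__(self):
        self._canonical = {}

    def add(self, name):
        # A new name initially stands for itself.
        self._canonical.setdefault(name, name)

    def mark_duplicate(self, duplicate, keep):
        """Record that `duplicate` denotes the same item as `keep`."""
        self.add(duplicate)
        self.add(keep)
        self._canonical[self.resolve(duplicate)] = self.resolve(keep)

    def resolve(self, name):
        # Follow the chain of merges to the surviving identifier.
        while self._canonical[name] != name:
            name = self._canonical[name]
        return name


index = GlobalIndex()
arachne = local_id("arachne", 80581)
lupa = local_id("lupa", 2849)
index.mark_duplicate(arachne, lupa)
print(index.resolve(arachne))
```

Because merged names are redirected rather than deleted, references that still use the old local id keep working, and a wrong merge only costs one redirect to undo.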
Reactive Data Cleaning: Initial Data
[Diagram: two E22 Man-Made Object nodes, PO:LUPA.2849 and PO:ARACHNE.80581, each linked by "shows visual item" to the same E34 Inscription INSC:CIL III 10514]
Nodes:
• E34 Inscription: INSC:CIL III 10514
• E22 Man-Made Object: PO:LUPA.2849
• E35 Title: Stele des Nertus Lingauster
• E42 Object Identifier: OID:LUPA.2849
• E22 Man-Made Object: PO:ARACHNE.80581
• E35 Title: Grabstele des Nertus
• E42 Object Identifier: OID:ARACHNE.80581
• E55 Type: Stele
Properties:
• shows visual item
• has title
• is identified by
• has type
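Reactive Data Cleaning: Result
[Diagram: a single E22 Man-Made Object node, PO:LUPA.2849, now carrying both titles and both object identifiers and linked by "shows visual item" to E34 Inscription INSC:CIL III 10514]
Nodes:
• E34 Inscription: INSC:CIL III 10514
• E22 Man-Made Object: PO:LUPA.2849
• E35 Title: Stele des Nertus Lingauster
• E35 Title: Grabstele des Nertus
• E42 Object Identifier: OID:LUPA.2849
• E42 Object Identifier: OID:ARACHNE.80581
• E55 Type: Stele
Properties:
• shows visual item
• has title
• is identified by
• has type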
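The step from the initial data to the result can be sketched as two operations: find objects that show the same inscription, then rewrite the statements of one onto the other. This is an illustration under assumptions. The slides only say that duplicates are detected (semi-)automatically, so using a shared inscription as the detection rule, and attaching the single "has type" edge to the Lupa object, are choices made for this example.

```python
def find_duplicates(triples, prop="shows visual item"):
    """Group subjects that point to the same object via `prop`."""
    by_target = {}
    for subject, predicate, obj in triples:
        if predicate == prop:
            by_target.setdefault(obj, set()).add(subject)
    return [sorted(group) for group in by_target.values() if len(group) > 1]


def merge(triples, keep, drop):
    """Rewrite every statement about `drop` so that it is about `keep`."""

    def swap(node):
        return keep if node == drop else node

    return {(swap(s), p, swap(o)) for s, p, o in triples}


INITIAL = {
    ("PO:LUPA.2849", "shows visual item", "INSC:CIL III 10514"),
    ("PO:LUPA.2849", "has title", "Stele des Nertus Lingauster"),
    ("PO:LUPA.2849", "is identified by", "OID:LUPA.2849"),
    ("PO:LUPA.2849", "has type", "Stele"),
    ("PO:ARACHNE.80581", "shows visual item", "INSC:CIL III 10514"),
    ("PO:ARACHNE.80581", "has title", "Grabstele des Nertus"),
    ("PO:ARACHNE.80581", "is identified by", "OID:ARACHNE.80581"),
}

candidates = find_duplicates(INITIAL)
CLEANED = merge(INITIAL, keep="PO:LUPA.2849", drop="PO:ARACHNE.80581")
```

After the merge the surviving object carries both titles and both object identifiers, as in the result diagram, and the two "shows visual item" statements collapse into one because the statements are held in a set.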
VBI ERAT LVPA Results
• A method and architecture for integration of diverse archaeological corpora on Roman stone monuments under the CIDOC CRM model
• We developed an efficient way for place name recognition
• We are developing a research tool suitable for formulating queries and drawing conclusions on archaeological data
• detection of contextual relationships that cannot be derived from interpreting the sources in isolation
• a method of identifying epigraphic references and finds
• test bed for the CIDOC CRM model, which proved its adequacy
• First large-scale integration project of multiple complementary resources as a global index to the original sources
Future work
integrate more data sources
support a mechanism to visualize a source
support an automatic mapping process so that archaeologists will be able to maintain the system by themselves.