Semantic Data Integration I6 Core Group Nic Bertrand Herbert Schentz LTER-Europe Conference, Mallorca, Dec. 2008

Semantic data integration proof of concept


Page 1: Semantic data integration proof of concept

Semantic Data Integration

I6 Core Group, Nic Bertrand, Herbert Schentz

LTER-Europe Conference, Mallorca, Dec. 2008

Page 2: Semantic data integration proof of concept

Overview

■ Testing goals
■ Test architecture
■ Results
■ Outlook: applicability for LTER-Europe

Page 3: Semantic data integration proof of concept

Architecture

Goal:

■ Enable seamless access to distributed data
■ Allow local data analysis for all members with their own tools

Distributed Socio-Ecological Data

See all data as if it came from ONE Data Source

Distributed Data Mining with local tools

Portal

Page 4: Semantic data integration proof of concept

Distributed Applications

Longer-term vision:

■ Extend seamless access to distributed services (SOA)
■ Allow local data analysis for all members with their own tools and common services

See all data as if data came from ONE Data Source processed within ONE application

Distributed Socio-ecological Data

Distributed Data Mining with local tools

Page 5: Semantic data integration proof of concept

Distributed Data Mining with local tools

Role of Ontology

Distributed Socio-Ecological Data

SERONTO

SERONTO: basis to discover, retrieve and integrate distributed heterogeneous data

common concepts and structures

Portal

Page 6: Semantic data integration proof of concept

Testing... Why?

■ To validate the use of SERONTO for data integration of ALTER-Net and LTER-Europe
■ Test the feasibility of mapping REAL ecological data to SERONTO
■ Test the querying of the connected database(s) from the semantic concepts in SERONTO

Page 7: Semantic data integration proof of concept

Proof of concept: Acceptance Criteria

• The databases must have different structures and must have been developed independently of SERONTO;

• The databases must feature reference lists (e.g. species lists);

• The database structures must not be altered as a result of the integration work;

• New concepts may be imported into SERONTO as and when required;

• The databases must contain data relevant to Long Term Ecological Research (e.g. vegetation surveys, records of species occurrences, measurement of biotic and abiotic components).

Page 8: Semantic data integration proof of concept

Testing: Connecting 5 databases

■ JOKL: cultural landscapes
■ JODI: vegetation
■ 2835: floodplain
■ ECN Summary Database
■ Pythia: vegetation

More about the databases:

■ Independently developed
■ Not developed for the purpose of data integration
■ Different data models
■ Different languages
■ Similar data types collected in ALTER-Net
■ Some obvious integration points (e.g. vegetation)

SERONTO

Page 9: Semantic data integration proof of concept

Data Integration using SERONTO

Import Ontology

Connect Databases

Query SERONTO

Results

Page 10: Semantic data integration proof of concept

Getting value sets back

SERONTO

parameter_method

parameter method

Value_sets Unit

Scale
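The diagram labels on this slide suggest that each value set returned through SERONTO carries its parameter, method, unit and scale. A minimal Python sketch of such a record follows. The class name `ValueSet`, the helper `group_by_parameter_method`, the reading of "scale" as a measurement scale, and all sample values are assumptions made for illustration; none of them are taken from SERONTO itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ValueSet:
    """Hypothetical record: measured values plus the metadata needed to interpret them."""
    parameter: str   # what was observed
    method: str      # how it was observed
    unit: str        # unit of the values
    scale: str       # assumed here to mean the measurement scale
    values: List[float] = field(default_factory=list)


def group_by_parameter_method(
    value_sets: List[ValueSet],
) -> Dict[Tuple[str, str], List[ValueSet]]:
    """Group value sets so that only comparable (parameter, method) pairs sit together."""
    grouped: Dict[Tuple[str, str], List[ValueSet]] = {}
    for vs in value_sets:
        grouped.setdefault((vs.parameter, vs.method), []).append(vs)
    return grouped


# Made-up sample data, standing in for value sets returned from different databases.
value_sets = [
    ValueSet("air temperature", "dry-bulb thermometer", "degC", "interval", [11.2, 12.8]),
    ValueSet("air temperature", "dry-bulb thermometer", "degC", "interval", [9.4]),
    ValueSet("vegetation cover", "visual estimate", "percent", "ratio", [40.0, 55.0]),
]
grouped = group_by_parameter_method(value_sets)
```

Grouping on the parameter and method pair, rather than on the parameter alone, keeps values obtained with different methods apart until someone decides they are comparable.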

Page 11: Semantic data integration proof of concept

Data Integration Results

➢ Import SERONTO and Units Ontologies into Ontostudio

SERONTO

Page 12: Semantic data integration proof of concept


Data Integration Results

➢ Import diverse ecological databases

■ JOKL: cultural landscapes
■ JODI: vegetation
■ 2835: floodplain
■ ECN Summary Database
■ Pythia: vegetation

Page 13: Semantic data integration proof of concept


Data Integration Results

Extend SERONTO classes using the content of the databases

(SERONTO Core does not contain domain specific concepts)

Map databases to SERONTO (Simple and complex mappings)

Query individual databases directly

Query multiple databases from the SERONTO (Simple and Complex queries)

Map once, reuse data many times: querying does not require knowledge of the structures of the databases

Semantic data integration is possible
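The "map once, query many" idea on this slide can be illustrated with a small Python sketch. It is not the OntoStudio or SERONTO mechanism. It uses two in-memory SQLite databases with deliberately different schemas, and a hand-written mapping table stands in for the ontology mappings. The table and column names, the concept name `SpeciesCover`, and the sample rows are all hypothetical.

```python
import sqlite3

# Two hypothetical source databases, designed independently and in different languages.
db_a = sqlite3.connect(":memory:")
db_a.execute("CREATE TABLE veg_survey (plot TEXT, species TEXT, cover_pct REAL)")
db_a.executemany(
    "INSERT INTO veg_survey VALUES (?, ?, ?)",
    [("A1", "Fagus sylvatica", 40.0), ("A2", "Picea abies", 15.0)],
)

db_b = sqlite3.connect(":memory:")
db_b.execute("CREATE TABLE aufnahme (flaeche TEXT, art TEXT, deckung REAL)")
db_b.executemany(
    "INSERT INTO aufnahme VALUES (?, ?, ?)",
    [("B7", "Fagus sylvatica", 55.0)],
)

# The mapping is written once per database: each entry says how a shared concept
# is obtained from that database, always as (site, species, cover) tuples.
MAPPINGS = {
    "SpeciesCover": [  # hypothetical concept name, not taken from SERONTO
        ("db_a", db_a, "SELECT plot, species, cover_pct FROM veg_survey"),
        ("db_b", db_b, "SELECT flaeche, art, deckung FROM aufnahme"),
    ]
}


def query_concept(concept, species=None):
    """Query every mapped database through the shared concept, hiding local schemas."""
    rows = []
    for source, conn, sql in MAPPINGS[concept]:
        for site, sp, cover in conn.execute(sql):
            if species is None or sp == species:
                rows.append(
                    {"source": source, "site": site, "species": sp, "cover": cover}
                )
    return rows


# The caller names only the concept and a species, never a table or column.
results = query_concept("SpeciesCover", species="Fagus sylvatica")
```

The source databases are left unaltered, in line with the acceptance criteria: only the mapping table knows their structure, so adding another database means adding one mapping entry and leaving existing queries as they are.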

Page 14: Semantic data integration proof of concept

Open Questions

SERONTO Core

domain ontologies ?

<?xml version="1.0" encoding="UTF-8"?>
<flg:flogic xmlns:flg="http://www.wsmo.org/2004/d16/d16.2/v0.1/">
  <!-- Test data to test the WSML F-Logic XML syntax -->
  <!-- The following <rule></rule> encodes this fact (taken from the F-Logic JACM paper, page 7):
       bob[name -> "Bob"; age -> 40; affiliation -> cs1[dname -> "CS"; mngr -> bob; assistents -> {john, sally}]]
       this encoding writes only elementary molecules -->
  <rule>
    <head>
      <molecule>
        <object>
          <constant name="bob"/>
        </object>
        <superclass isaType=":">
          <class>
            <constant name="empl"/>
          </class>
        </superclass>
        <methodSpec arrow="->">
          <name>
            <constant name="name"/>
          </name>
          <result>
            <oid>
              <constant name="&quot;Bob&quot;"/>
            </oid>
          </result>

Portal

Query

Databases

Performance

Page 15: Semantic data integration proof of concept

Possible uses for LTER Europe

Distributed Data Mining with local tools

Distributed Socio-Ecological Data

SERONTO & Domain Ontologies: common concepts and domain knowledge

Portal

Seamless access... Ready for use now