
ESWC'08 - Tutorial 3 - Transitioning Applications to Ontologies - June, 1st - Tenerife

Transitioning Relational Databases to Ontologies

Farid Cerbah
Dassault Aviation

[email protected]


Outline

Problem statement
Previous work
The RDBToOnto tool and the RTAXON method
Improving the process through database optimisation
A case study in aircraft maintenance
Extending RDBToOnto
Conclusion


Problem statement

Relational databases are valuable heterogeneous sources for ontology learning
  Better accuracy can be expected than from text corpora

Ontology learning from relational databases is not a new research issue

Limitations of existing support
  Problem often restricted to finding automated ways to import “tables” into ontologies
  Derivation of ontologies with a flat structure that look like the source databases


Our contribution

RDBToOnto Platform
  Comprehensive software support for learning fine-tuned ontologies
  A framework that eases the development of and experimentation with transitioning methods

RTAXON Method
  Finds taxonomies hidden in the data


A motivating example

[Figure: example database-to-ontology mappings, with two annotations: "Typical mappings covered by several methods" and "Specific to RTAXON"]


Previous work (1)

RDB -> Ontology Transformation

Database Reverse Engineering
  Many transformation rules from this domain are reused for ontology learning
  [Behm et al. 1997], [Ramanathan & Hodges 1997], …

Approaches mostly based on an analysis of the RDB schema
  Data correlations are considered but with the restriction "Data ≡ Key Values"
  Key inclusion may express inheritance

Exploiting null values semantics [Lammari et al. 2007]
  Partitioning of a table on the basis of null values may reveal concept hierarchies
  Involves data from non-key attributes
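
The null-value idea above can be illustrated with a minimal sketch: rows of a single table are grouped by the set of attributes that hold a non-null value, and each distinct group is treated as a candidate subconcept. This is only an illustration of the general idea, not the algorithm of [Lammari et al. 2007]; the class name, the `row` helper and the attribute names (`id`, `salary`, `grade`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch: partition the rows of one table by their null pattern. */
final class NullPatternPartitioner {

    /**
     * Groups row indexes by the sorted set of attributes holding a non-null value.
     * Each distinct signature is only a candidate subconcept of the table's concept.
     */
    static Map<String, List<Integer>> partition(List<Map<String, Object>> rows) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            TreeSet<String> nonNull = new TreeSet<>();
            for (Map.Entry<String, Object> cell : rows.get(i).entrySet()) {
                if (cell.getValue() != null) {
                    nonNull.add(cell.getKey());
                }
            }
            String signature = String.join(",", nonNull);
            groups.computeIfAbsent(signature, k -> new ArrayList<>()).add(i);
        }
        return groups;
    }

    /** Builds one sample row; HashMap is used because it accepts null values. */
    static Map<String, Object> row(Object id, Object salary, Object grade) {
        Map<String, Object> r = new HashMap<>();
        r.put("id", id);
        r.put("salary", salary);
        r.put("grade", grade);
        return r;
    }
}
```

In a hypothetical Person table, rows with a salary but no grade and rows with a grade but no salary fall into two groups, which a user could then review as candidate subclasses.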


Previous work (2)

Mapping languages and tools

D2RQ
  RDB to OWL/RDF mapping
  Ontology-based access to relational databases
  Rewriting SPARQL queries into SQL

Relational.OWL
  A minimal ontology of ‘tables’ and ‘column’ and a processor to populate this ontology with data from relational databases
  Can be used to exchange data between databases

Triplify
  Plugin for web applications
  Converts the result of SQL queries into RDF

KAON Reverse
  Software support to interactively map an RDB schema to a predefined ontology

DataMaster
  Protégé Plugin to import table data into ontologies


RDBToOnto

A user-oriented tool with a full-fledged user interface

Supports an extensive process, from data access to ontology generation

Includes the RTAXON converter

Though automated to a large extent, local constraints can be interactively included to progressively refine the ontologies

Types of local constraints
  Table and column exclusion
  Naming patterns for classes and instances
  Categorisation patterns


The RTAXON method

Major improvement over existing methods
  Further refine the classes derived from the schema with subclasses found in the content of the relations
  Focus on reliable categorisation patterns

Demo

Access Zones (X 516)

A/C   Codes   Description                        Type
F7X   2103    nose cone                          DOOR
F7X   281FL   windshield retainers               PANEL
F7X   300ZZ   umbrella access panel No.1         PANEL
F7X   243DF   servicing compartment floor No.1   FLOOR
F7X   342EZ   rear under pylon fairing           FAIRG

Categorising attribute: Type
Derived hierarchy: Access Zone, with subclasses Door, Panel, Fairing, Floor

Two sources involved in the identification of categorising attributes
  Attribute names: revealed by lexical clues
  Redundancy in attribute extensions: entropy-based approach to find good profiles

Formal definition of RTAXON
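
The two sources of evidence named on this slide can be illustrated with a minimal sketch. The slides do not give the formal definition of RTAXON, so everything below is an assumption made for illustration: the list of lexical clues, the use of normalised Shannon entropy as the redundancy measure, the threshold parameter, and the rule that combines the two signals.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of spotting candidate categorising attributes; not the RTAXON definition. */
final class CategorisingAttributeSketch {

    // Assumed lexical clues; the clue list actually used by RTAXON is not given in the slides.
    static final List<String> CLUES = List.of("type", "category", "class", "kind");

    /** Shannon entropy (in bits) of the value distribution of one column. */
    static double entropy(List<String> values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / values.size();
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    /** Entropy divided by its maximum log2(n); a value near 1 means the values hardly repeat. */
    static double normalisedEntropy(List<String> values) {
        if (values.size() < 2) {
            return 0.0;
        }
        return entropy(values) / (Math.log(values.size()) / Math.log(2));
    }

    /** True when the attribute name contains one of the assumed clues. */
    static boolean hasLexicalClue(String attributeName) {
        String lower = attributeName.toLowerCase(Locale.ROOT);
        return CLUES.stream().anyMatch(lower::contains);
    }

    /** Assumed combination rule: a lexical clue, or more than one value with enough redundancy. */
    static boolean isCandidate(String name, List<String> values, double maxNormalisedEntropy) {
        long distinct = values.stream().distinct().count();
        boolean redundant = distinct > 1 && normalisedEntropy(values) <= maxNormalisedEntropy;
        return hasLexicalClue(name) || redundant;
    }
}
```

On the five sample rows of the Access Zones table, the Type column repeats PANEL and therefore has lower normalised entropy than the Codes column, whose values are all distinct. On the full table the gap would presumably be much larger, since a handful of types is spread over several hundred rows.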


Optimising the source databases

Another key improvement is the inclusion of a database optimisation step
  Many input databases suffer from data duplication problems
  Optimisation -> eliminate data duplication through the processing of inclusion dependencies

Before optimisation (data duplication)

Companies (X 105)

Cage_Code (PKEY)   Name
F0214              Dassault-Aviation
F564               Messier-Dowty
F0086              Parker

WorkPackages (X 82)

WP Number   Company Code   Company Name        WP Title
35          B453           ABS                 …eels, Brakes and Braking
34A         F0214          Dassault-Aviation   Landing Gear Emergency Control System
34          F564           Messier-Dowty       Landing Gears
33          F0086          Parker              Hydraulic Power

Inclusion dependency: WorkPackages[Company Code, Company Name] ⊆ Companies[Cage_Code, Name]

After optimisation

Companies (X 106)

WorkPackages (X 82)

WP Number   Company Code   WP Title
35          B453           …eels, Brakes and Braking
34A         F0214          Landing Gear Emergency Control System
34          F564           Landing Gears
33          F0086          Hydraulic Power

Foreign key relationship: WorkPackages[Company Code] ⊆ Companies[Cage_Code]
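
Checking whether a given inclusion dependency holds on the data can be sketched as follows. This is a minimal illustration, not the RDBToOnto implementation: rows are modelled as plain maps from column name to value, and the class and method names are hypothetical. An empty result means every projected tuple of the left-hand table also occurs in the right-hand table, so the duplicated columns could be replaced by a key reference.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: test one inclusion dependency left[leftCols] within right[rightCols]. */
final class InclusionDependencyCheck {

    /** Projects a row on the given columns, keeping the column order. */
    static List<String> project(Map<String, String> row, List<String> columns) {
        List<String> tuple = new ArrayList<>();
        for (String c : columns) {
            tuple.add(row.get(c));
        }
        return tuple;
    }

    /** Returns the left-hand tuples that have no match on the right; empty means the dependency holds. */
    static List<List<String>> violations(List<Map<String, String>> left, List<String> leftCols,
                                         List<Map<String, String>> right, List<String> rightCols) {
        Set<List<String>> rightTuples = new HashSet<>();
        for (Map<String, String> r : right) {
            rightTuples.add(project(r, rightCols));
        }
        List<List<String>> missing = new ArrayList<>();
        for (Map<String, String> l : left) {
            List<String> tuple = project(l, leftCols);
            if (!rightTuples.contains(tuple)) {
                missing.add(tuple);
            }
        }
        return missing;
    }
}
```

Returning the violating tuples rather than a boolean is a design choice of this sketch: it shows which rows would have to be reconciled before the dependency can be turned into a foreign key.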


Effect of inclusion dependency processing

Inclusion dependencies yield more inter-class relations (i.e. object properties).

[Figures: the generated ontology without ID identification and with ID identification]


Identification of inclusion dependencies

RDBToOnto includes an editor to interactively define inclusion dependencies

Automated identification of inclusion dependencies
  A data mining approach
  Based on LATINO
    See presentation in this tutorial on ontology learning by Miha Grčar (JSI)
  Dependencies discovered by LATINO are exported to RDBToOnto and can be validated in the ID editor
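
To give a feel for what automated identification involves, here is a minimal brute-force sketch that proposes unary inclusion dependencies by comparing the distinct values of every pair of columns. It is not the LATINO approach, which the slides do not detail; the class name and the qualified column names in the usage note are hypothetical. Value inclusion can be coincidental, which is consistent with the validation step mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical brute-force miner for unary inclusion dependency candidates. */
final class UnaryInclusionMiner {

    /**
     * Input: qualified column name -> distinct values of that column.
     * Output: (A, B) pairs such that every value of A also appears in B.
     */
    static List<Map.Entry<String, String>> mine(Map<String, Set<String>> columns) {
        List<Map.Entry<String, String>> found = new ArrayList<>();
        for (Map.Entry<String, Set<String>> a : columns.entrySet()) {
            if (a.getValue().isEmpty()) {
                continue; // an empty column is trivially included in every other column
            }
            for (Map.Entry<String, Set<String>> b : columns.entrySet()) {
                boolean sameColumn = a.getKey().equals(b.getKey());
                if (!sameColumn && b.getValue().containsAll(a.getValue())) {
                    found.add(Map.entry(a.getKey(), b.getKey()));
                }
            }
        }
        return found;
    }
}
```

With columns such as `WorkPackages.CompanyCode` and `Companies.Cage_Code`, the miner would propose the first as included in the second whenever all company codes used in work packages are known companies. The pairwise comparison is quadratic in the number of columns, so this sketch only suits small schemas.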


Mining inclusion dependencies with LATINO


A case study in aircraft maintenance

[Figure: case-study architecture with the components RDBToOnto + LATINO, KCIT (GATE-based annotator), Radiant and OWLIM]


The ontology acquisition process

The legacy data
  LSA database: a heterogeneous relational database that gathers all information related to maintenance activity
    Required logistic resources
    Aircraft parts (Product tree)
    Scheduling data
  Standards: Documents including widely shared conceptual models

The ontology acquisition process
  A multi-step transitioning process that favours modular design


Model Bootstrapping + Ontology Normalisation

[Figure: workflow diagram with the labels Legacy Data, Ontology Learning Tools, Model Bootstrapping, Ontology Normalisation, Reusable Ontologies (MSG-3, SNS/ATA, FOAF), ATA (imports) and OWLIM/HKS Repository]


The defined RDBToOnto conversion project

75 constraints
  Mostly naming patterns and inclusion dependencies

Resulting ontology
  Ontology model: 115 classes, 334 datatypes, 54 object properties
  Population: 49617 class instances, 51449 object property instances

No constraints for categorisation
  The ten hierarchies discovered by RTAXON are relevant
  Good behaviour when faced with categorisation conflicts


The generated class hierarchy


Identified object properties


RDBToOnto extension capabilities

RDBToOnto is a user-oriented tool but it is also a framework
  Written in Java
  OWL as target language (exploiting Jena 2.5 API)

Two types of components can be added
  Database readers to cover more database formats
  Converters to implement new learning methods
  New converters can have their specific global options, local constraints and GUI


Structure of RDBToOnto

DBReader
  Database getDatabase()
  Table ReadData(String name)
  Subtypes: MSAccessReader, DB2Reader

Database

RDBToOntoConverter
  OntModel Convert(Database db)
  OntClass CreateClass(TableDef)
  Subtypes: RTAXON, BasicConverter

can be extended by the users
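
The two extension points shown on this slide can be sketched with simplified stand-ins. These are not the real RDBToOnto types: the actual converter returns a Jena `OntModel`, whereas this self-contained sketch returns plain class names, and the `InMemoryReader` and `OneClassPerTableConverter` classes are hypothetical examples of user-written components. The Development Guide is the authoritative source for the real signatures.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-ins for the types named on the slide; NOT the real RDBToOnto classes. */
final class ExtensionSketch {

    static final class TableDef {
        final String name;
        TableDef(String name) { this.name = name; }
    }

    static final class Database {
        final List<TableDef> tables = new ArrayList<>();
    }

    /** Plays the role of DBReader: gives access to a source database. */
    interface DBReader {
        Database getDatabase();
    }

    /** Plays the role of RDBToOntoConverter; returns class names instead of a Jena OntModel. */
    interface Converter {
        List<String> convert(Database db);
    }

    /** Hypothetical reader for a new "format": table names held in memory. */
    static final class InMemoryReader implements DBReader {
        private final List<String> tableNames;
        InMemoryReader(List<String> tableNames) { this.tableNames = tableNames; }

        @Override
        public Database getDatabase() {
            Database db = new Database();
            for (String n : tableNames) {
                db.tables.add(new TableDef(n));
            }
            return db;
        }
    }

    /** Hypothetical converter: one class per table, named by capitalising the table name. */
    static final class OneClassPerTableConverter implements Converter {
        @Override
        public List<String> convert(Database db) {
            List<String> classes = new ArrayList<>();
            for (TableDef t : db.tables) {
                classes.add(createClass(t));
            }
            return classes;
        }

        String createClass(TableDef t) {
            String n = t.name.trim();
            return n.isEmpty() ? n : Character.toUpperCase(n.charAt(0)) + n.substring(1);
        }
    }
}
```

Because readers and converters only meet through the `Database` object, a new reader can be combined with an existing converter and vice versa, which matches the separation drawn on the slide.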


The neutral database model

[Class diagram: Database, DBSchema, TableDef, Key (PrimaryKey, ForeignKey), Attribute, Table and Column, linked by one-to-many (*) associations; the diagram also shows "String friendlyNames" and "Values"]

Input to any converter
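
A converter receives this model as input and typically walks the schema part of it. The sketch below shows one such traversal under loud assumptions: the diagram's associations did not survive in these notes, so the fields chosen for `TableDef` and `ForeignKey` are guesses, and the `foreignKeyLinks` method is a hypothetical helper rather than part of RDBToOnto.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified rendering of the schema side of the neutral database model. */
final class NeutralModelSketch {

    static final class ForeignKey {
        final List<String> attributes;   // assumed: names of the referencing attributes
        final String referencedTable;    // assumed: name of the table being referenced

        ForeignKey(List<String> attributes, String referencedTable) {
            this.attributes = attributes;
            this.referencedTable = referencedTable;
        }
    }

    static final class TableDef {
        final String name;
        final List<String> attributes = new ArrayList<>();
        final List<String> primaryKey = new ArrayList<>();
        final List<ForeignKey> foreignKeys = new ArrayList<>();

        TableDef(String name) { this.name = name; }
    }

    static final class DBSchema {
        final List<TableDef> tableDefs = new ArrayList<>();
    }

    /** Lists each foreign key as "source -> target", the raw material for inter-class relations. */
    static List<String> foreignKeyLinks(DBSchema schema) {
        List<String> links = new ArrayList<>();
        for (TableDef t : schema.tableDefs) {
            for (ForeignKey fk : t.foreignKeys) {
                links.add(t.name + " -> " + fk.referencedTable);
            }
        }
        return links;
    }
}
```

Keeping the model independent of any database product is what lets every converter work with every reader.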


Conclusion

We presented substantial support for transitioning relational databases to ontologies

RDBToOnto and the RTAXON method have been evaluated on significant databases

RTAXON is just a first step as many extensions can be studied
  Learning two-level hierarchies
  Automatically generating local constraints (e.g. naming patterns)

More resources are available on the TAO project web site, including
  User Guide and demos
  Development Guide
  A fully implemented sample showing how to extend the tool