22/6/27 ISWC2007, Nov. 14. Discovering simple mappings between Relational database schemas and ontologies Wei Hu, Yuzhong Qu {whu, yzqu}@seu.edu.cn Institute of Web Science School of Computer Science and Engineering Southeast University, China

Discovering simple mappings between Relational database schemas and ontologies

Page 1: Discovering simple mappings between  Relational database schemas and ontologies

23/4/21 ISWC2007, Nov. 14.

Discovering simple mappings between Relational database schemas and ontologies

Wei Hu, Yuzhong Qu
{whu, yzqu}@seu.edu.cn

Institute of Web Science
School of Computer Science and Engineering

Southeast University, China

Page 2: Discovering simple mappings between  Relational database schemas and ontologies

Outline

Introduction
Our approach
Evaluation
Related work
Summary and future work

Page 3: Discovering simple mappings between  Relational database schemas and ontologies

Introduction

The popularity of ontologies has grown rapidly since the emergence of the Semantic Web. Swoogle has collected more than 10,000 ontologies so far, and Falcons has indexed more than 2 million classes and properties.

However, most of the world’s data today are still locked in data stores, and are not published as an open Web of inter-referring resources. [Ref.4. Creating a science of the Web. 2006]

About 77.3% of the data on the current Web are stored in relational databases. [Ref.6. SIGMOD Record. 33(3) (2004)]

So, it is necessary to establish interoperability between (Semantic) Web applications using relational databases and ontologies for creating a Web of data.

Page 4: Discovering simple mappings between  Relational database schemas and ontologies

Introduction – By an example

Left part: relations, attributes, primary keys, foreign keys. Right part: classes, properties (data valued or object properties)

Page 5: Discovering simple mappings between  Relational database schemas and ontologies

Introduction (cont’d)

Manually discovering such simple mappings is tedious and impractical at Web scale.

So (semi-)automatic approaches have been proposed. However, they do not adequately consider the characteristics of the relational data model and the ontology model, so the mappings are not accurate enough.

Most of the present approaches cannot construct semantic mappings. The (missed) semantic mappings are useful in various practical applications.

Page 6: Discovering simple mappings between  Relational database schemas and ontologies

Introduction – the contribution

We propose a new approach to discovering simple mappings.

It constructs virtual documents for the entities, and discovers mappings by comparing virtual documents.

It validates mapping consistency, to eliminate certain incorrect mappings.

It explores contextual mappings, which can be transformed directly to view-based mappings with selection conditions and are useful for applications in real-world domains.

[Ref. 5. Putting context into schema matching. VLDB'06]

Page 7: Discovering simple mappings between  Relational database schemas and ontologies

Introduction – Terminology

R denotes a relation, and A denotes an attribute. type(A): the domain name of A; rel(A): the relation which specifies A; pk(R): the attributes that appear in the primary key of R; ref(A): the attributes referenced by A.

C represents a class, and P represents a property. PD denotes a data valued property and PO denotes an object property. d(P): the domain(s) of P; r(P): the range(s) of P.

Page 8: Discovering simple mappings between  Relational database schemas and ontologies

Introduction – Terminology (cont’d)

A mapping m is a 5-tuple < id, u, v, t, f >, where:
id is a unique identifier;
u is an entity in {R} ∪ {A}, and v is an entity in {C} ∪ {P};
t is a relationship, e.g. equivalence or subsumption, holding between u and v;
f is a confidence measure in the [0, 1] range.

Examples:
< 1, writes, hasAuthor, , 1.0 >
< 2, id, hasID, , 1.0 >
< 3, Paper, JournalPaper, , 0.8 >
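The 5-tuple can be held in a small record type. The following is a minimal sketch in Python, not Marson's actual data structure (Marson is written in Java). The class name Mapping and the relationship label "equivalence" used in the example are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """A simple mapping <id, u, v, t, f> (hypothetical representation)."""
    id: int    # unique identifier
    u: str     # a relation or an attribute of the relational schema
    v: str     # a class or a property of the ontology
    t: str     # relationship between u and v, e.g. "equivalence" or "subsumption"
    f: float   # confidence measure in [0, 1]

    def __post_init__(self):
        # Reject confidence values outside the range required by the definition.
        if not 0.0 <= self.f <= 1.0:
            raise ValueError("confidence f must lie in [0, 1]")


# Assumed relationship label, chosen only for illustration.
m = Mapping(1, "writes", "hasAuthor", "equivalence", 1.0)
```

Freezing the record keeps a discovered mapping immutable, so later phases would filter a list of mappings instead of modifying entries in place.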

Page 9: Discovering simple mappings between  Relational database schemas and ontologies

Outline

Introduction
Our approach
Evaluation
Related work
Summary and future work

Page 10: Discovering simple mappings between  Relational database schemas and ontologies

Overview of the approach

Phase 1: Classifying entity types (a preprocessing step). Heuristically classifies entities into different groups and coordinates their different characteristics.

Phase 2: Discovering simple mappings. Constructs virtual documents for entities and calculates confidence measures via the TF/IDF model.

Phase 3: Validating mapping consistency. Uses <relation, class> mappings to validate the consistency of <attribute, property> mappings, and also checks the compatibility between the data types of attributes and data valued properties.

Phase 4: Constructing contextual mappings. <relation, class> mappings + sample instances → contextual mappings.

Page 11: Discovering simple mappings between  Relational database schemas and ontologies

Phase 1: Classifying entity types

Relation: strong entity relation (SER), weak entity relation (WER), regular relationship relation (RRR), specific relationship relation (SRR).

Attribute: foreign key attribute (FKA), non-foreign key attribute (NFKA).

[Ref.9. Data & Knowledge Engineering. 12 (1994)]

Group 1: ({SER} ∪ {WER}) × {C};
Group 2: ({RRR} ∪ {SRR}) × {PO};
Group 3: {FKA} × {PO};
Group 4: {NFKA} × ({PD} ∪ {PO}).

Coordinating different characteristics: reifying n-ary relationships (n > 2); others.
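The four groups restrict which entity pairs are ever compared. The sketch below enumerates type-compatible candidate pairs from entities that already carry a type label. It does not implement the heuristic classification itself, and the function name candidate_pairs, the dictionary layout, and the type labels assigned to the sample entities are assumptions made for illustration.

```python
from itertools import product

# Groups 1-4 from the slide: allowed schema-side types x ontology-side types.
GROUPS = {
    1: ({"SER", "WER"}, {"C"}),
    2: ({"RRR", "SRR"}, {"PO"}),
    3: ({"FKA"}, {"PO"}),
    4: ({"NFKA"}, {"PD", "PO"}),
}


def candidate_pairs(schema_entities, ontology_entities):
    """Yield (group, schema_name, ontology_name) for type-compatible pairs.

    Both arguments map an entity name to its already-assigned type label.
    """
    for group, (left_types, right_types) in GROUPS.items():
        left = [n for n, t in schema_entities.items() if t in left_types]
        right = [n for n, t in ontology_entities.items() if t in right_types]
        for u, v in product(left, right):
            yield group, u, v


# Hypothetical type assignments for the entities used in the earlier examples.
schema = {"Paper": "SER", "writes": "RRR", "id": "NFKA"}
ontology = {"JournalPaper": "C", "hasAuthor": "PO", "hasID": "PD"}
pairs = list(candidate_pairs(schema, ontology))
```

With these inputs, the relation Paper is only compared with classes, while the non-foreign key attribute id is compared with both kinds of property.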

Page 12: Discovering simple mappings between  Relational database schemas and ontologies

Phase 2: Discovering simple mappings

We construct virtual documents for the entities in both the relational schema and the ontology to capture their structural information. A virtual document represents a collection of weighted tokens, which are derived not only from the description of the entity itself, but also from the descriptions of its neighbors. The weights of the tokens indicate their importance, so a virtual document can be viewed as a vector in the TF/IDF model.

Rationale: the semantic information of a relational schema is characterized mainly by its integrity constraints (ICs); an OWL ontology can be mapped to an RDF graph, whose structure also indicates semantic information.
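To make the idea concrete, here is a minimal Python sketch of virtual documents compared in a TF/IDF vector space. It is not the weighting scheme used by Marson: the neighbour weight of 0.5, the smoothed IDF variant, the tokenizer, and all function names are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Split a name into lowercase word tokens (camelCase and underscores)."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", spaced.lower())


def virtual_document(own, neighbours, neighbour_weight=0.5):
    """Weighted token bag: own description plus down-weighted neighbours."""
    doc = Counter()
    for tok in tokens(own):
        doc[tok] += 1.0
    for text in neighbours:
        for tok in tokens(text):
            doc[tok] += neighbour_weight  # assumed weight, not from the slides
    return doc


def tfidf_vectors(docs):
    """Scale each token weight by a smoothed IDF computed over all docs."""
    n = len(docs)
    df = Counter(tok for doc in docs.values() for tok in doc)
    return {
        name: {tok: w * (1.0 + math.log(n / df[tok])) for tok, w in doc.items()}
        for name, doc in docs.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical entities: a relation with its attributes, two classes with properties.
docs = {
    "Paper": virtual_document("Paper", ["title", "id"]),
    "JournalPaper": virtual_document("JournalPaper", ["hasTitle", "hasID"]),
    "Person": virtual_document("Person", ["name"]),
}
vecs = tfidf_vectors(docs)
sim_paper_journal = cosine(vecs["Paper"], vecs["JournalPaper"])
sim_paper_person = cosine(vecs["Paper"], vecs["Person"])
```

In this toy input, Paper and JournalPaper share tokens both in their own names and in their neighbours, so their similarity is positive, while Paper and Person share no tokens at all.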

Page 13: Discovering simple mappings between  Relational database schemas and ontologies

Discovering simple mappings (cont’d.)

Relations and attributes:

Classes and properties:

Page 14: Discovering simple mappings between  Relational database schemas and ontologies

Phase 3: Validating mapping consistency

We use <relation, class> mappings to validate the consistency of <attribute, property> mappings. Attributes cannot stand alone without relations. The restriction construct in an OWL ontology specifies local domain and range constraints on the classes.
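One plausible reading of this check is sketched below in Python: an <attribute, property> mapping is kept only if the relation that owns the attribute is mapped to a class in the property's domain. The exact rule used by Marson may differ (for example, it may also consider subclasses and OWL restrictions), and the function name validate, the argument layout, and the sample data are assumptions.

```python
def validate(attr_mappings, rel_class_mappings, rel_of, domains_of):
    """Keep <attribute, property> pairs consistent with <relation, class> mappings.

    attr_mappings:      list of (attribute, property) candidate pairs
    rel_class_mappings: relation name -> set of classes it is mapped to
    rel_of:             attribute -> relation that specifies it, i.e. rel(A)
    domains_of:         property -> set of domain classes, i.e. d(P)
    """
    kept = []
    for attr, prop in attr_mappings:
        mapped_classes = rel_class_mappings.get(rel_of[attr], set())
        # Consistent only if some mapped class is a domain of the property.
        if mapped_classes & domains_of[prop]:
            kept.append((attr, prop))
    return kept


# Hypothetical inputs for illustration.
rel_of = {"Paper.id": "Paper", "Person.id": "Person"}
rel_class = {"Paper": {"JournalPaper"}, "Person": {"Author"}}
domains_of = {"hasID": {"JournalPaper"}}
candidates = [("Paper.id", "hasID"), ("Person.id", "hasID")]
consistent = validate(candidates, rel_class, rel_of, domains_of)
```

Here the pair with Person.id is dropped because Person is not mapped to any class in the domain of hasID.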

Page 15: Discovering simple mappings between  Relational database schemas and ontologies

Phase 4: Constructing contextual mappings

We focus on a special type of mapping, contextual mappings, which can be directly translated to conditional mappings or view-based mappings.

Page 16: Discovering simple mappings between  Relational database schemas and ontologies

Constructing contextual mappings (cont’d.)

Page 17: Discovering simple mappings between  Relational database schemas and ontologies

Outline (cont’d.)

Introduction
Our approach
Evaluation
Related work
Summary and future work

Page 18: Discovering simple mappings between  Relational database schemas and ontologies

Evaluation – Data sets

Data sets:

http://www.cs.toronto.edu/~yuana/research/maponto/relational/testData.html [Ref.1. MapOnto]

We implemented our approach in Java as a tool called Marson.

Page 19: Discovering simple mappings between  Relational database schemas and ontologies

Evaluation – Experimental methodology

Experiment 1. Discovering simple mappings: Marson vs. Simple, VDoc, Valid, RONTO.

Simple: does not construct virtual documents, does not validate mapping consistency;
VDoc: constructs virtual documents, does not validate mapping consistency;
Valid: does not construct virtual documents, validates mapping consistency;
RONTO: an existing prototype that distinguishes the types of entities and uses I-Sub.

F1-Measure: a combination of precision and recall. Testing various thresholds for each approach, and selecting the best ones.
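The evaluation procedure can be sketched as follows, assuming the standard definition of F1 as the harmonic mean of precision and recall. The function names, the dictionary of scored candidate pairs, and the sample thresholds are illustrative assumptions, not the actual experimental data.

```python
def f1_measure(found, reference):
    """F1 = 2PR / (P + R) for a set of found mappings against a reference set."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    if correct == 0:
        return 0.0
    precision = correct / len(found)
    recall = correct / len(reference)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scored, reference, thresholds):
    """Return the (threshold, F1) pair with the highest F1.

    scored maps a candidate (u, v) pair to its confidence measure; a pair is
    accepted as a mapping when its confidence reaches the threshold.
    """
    results = (
        (th, f1_measure([p for p, s in scored.items() if s >= th], reference))
        for th in thresholds
    )
    return max(results, key=lambda pair: pair[1])


# Hypothetical scores and reference mappings.
scored = {("a", "x"): 0.9, ("b", "y"): 0.6, ("c", "z"): 0.3}
reference = {("a", "x"), ("b", "y")}
best = best_threshold(scored, reference, [0.2, 0.5, 0.8])
```

A low threshold admits the incorrect pair and hurts precision, a high one drops a correct pair and hurts recall, so the middle threshold gives the best F1 in this toy example.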

Experiment 2. Constructing contextual mappings. Collecting instances from the Web for the first three data sets: more than 50 instances for each relation and class.

Comparing with the mappings established by experienced volunteers.

Page 20: Discovering simple mappings between  Relational database schemas and ontologies

Evaluation – Experiment 1

On an Intel Pentium IV 2.8 GHz processor with 512 MB of DDR2 memory, running Windows XP Professional and Java SE 6, Marson takes about 5 seconds to complete all five tests (including the parsing time).

Page 21: Discovering simple mappings between  Relational database schemas and ontologies

Evaluation – Experiment 2

In Case 1, the mapping <academic_staff, Professor (a subclass of Faculty)> is missed. Without background knowledge, the mapping <academic_staff, Faculty> is not found.

Page 22: Discovering simple mappings between  Relational database schemas and ontologies

Evaluation – Experiment 2 (cont’d.)

In Case 2, the mapping <the relation Event, the class Conference> is found. When the value of the attribute type in Event equals “Research Session” or “Industrial Session”, the subsumption relationship between Event and Conference can be converted to an equivalence relationship.
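Because a contextual mapping carries a selection condition, it can be written down as a view over the relation. The Python sketch below generates such a view for Case 2. The function name, the view name ConferenceEvent, and the choice of a plain SQL CREATE VIEW statement are assumptions, and identifiers are inserted unquoted for brevity.

```python
def view_for_contextual_mapping(relation, attribute, values, view_name):
    """Build a CREATE VIEW statement selecting rows that satisfy the context."""
    # Escape single quotes inside the literal values.
    quoted = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
    return (
        f"CREATE VIEW {view_name} AS "
        f"SELECT * FROM {relation} WHERE {attribute} IN ({quoted})"
    )


# Case 2: rows of Event whose type is one of the two session kinds.
sql = view_for_contextual_mapping(
    "Event", "type", ["Research Session", "Industrial Session"], "ConferenceEvent"
)
```

The rows selected by the resulting view are the ones for which Event can be treated as equivalent to the class Conference.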

Page 23: Discovering simple mappings between  Relational database schemas and ontologies

Outline (cont’d.)

Introduction
Our approach
Evaluation
Related work
Summary and future work

Page 24: Discovering simple mappings between  Relational database schemas and ontologies

Related work

The problem is of interest to both the database and Semantic Web communities. At an early stage: visual toolkits that help users specify mappings manually. At present: discovering mappings (semi-)automatically.

For example, COMA and RONTO:
– do not consider the structural differences between the models;
– do not validate the consistency between mappings.

Other research directions: describing system frameworks, e.g., OntoGrate; defining mapping expression languages, e.g., R2O; extending OWL with ICs; inferring complex mappings, e.g., MapOnto.

Page 25: Discovering simple mappings between  Relational database schemas and ontologies

Summary and future work

Summary: an approach to discovering simple mappings; an algorithm to build contextual mappings; experiments to evaluate our approach.

Future work: instance matching; machine learning techniques for mining semantic mappings; others.

Page 26: Discovering simple mappings between  Relational database schemas and ontologies

Thanks for your attention!

Any comments are welcome!

http://iws.seu.edu.cn/

Tools: Marson, Falcon-AO, OntoSum

Services: Falcons (Searching the SW with CSpaces)