Schema Matching and Query Rewriting in Ontology-based Data Integration

Zdeňka Linková, ICS AS CR
Advisor: Július Štuller

Acknowledgement

This work was supported by the project 1ET100300419 of the Program Information Society (of the Thematic Program II of the National Research Program of the Czech Republic) “Intelligent Models, Algorithms, Methods and Tools for the Semantic Web Realization”.

Outline of presentation

  • Introduction
  • Virtual data integration
  • Ontology-based system
  • Matching in the system
  • Mapping in the system
  • Query rewriting
  • Conclusion

Introduction

Today’s world is a world of information
  • Expanding use of web data
  • Need for efficient information processing => the Semantic Web idea (XML, RDF, ontologies)

Many data providers working with distributed data
  • Need for data integration

=> Semantic Web data integration

Virtual data integration

Data stays physically stored in original sources

Data integration provides an integrated view over distributed data

Virtual data integration:
  • Schema matching
  • Schema mapping
  • Query processing

Ontology-based system

Sources:
  • Semantic web data (local and global) ... RDF/XML
  • Available ontologies for the sources ... OWL
  • Task input: sources Si and ontologies Oj

Use of ontologies:
  • Source ontologies and a global ontology for the provided integrated data
  • To perform matching
  • To describe the mapping
  • To rewrite queries

Relationships in the system

Schema matching – the process of searching for schema correspondences

Schema mapping – a description of the correspondences found, i.e. the definition of a relation, rule, formula, etc. (1-1 rules, use of views, LAV and GAV approaches, ...)

Kinds of correspondences considered:
  • Is-a hierarchical relationship
  • Equivalence
  • Disjointness

Matching and mapping in the system

To describe the correspondences found in the mapping, OWL ontologies and their features are used:
  • rdfs:subClassOf for the is-a hierarchical relationship (in either direction)
  • owl:equivalentClass for equivalence
  • owl:disjointWith for disjointness

=> Ontology OI ... the ontology of the integration system
  • Contains the mapping in the system

How is OI obtained?
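
Before turning to how OI is built, it may help to picture what such a mapping ontology could hold. The following is only a minimal sketch: it stores OI as a plain set of triples, and all class names and prefixes (`s1:`, `s2:`, `g:`) are invented for illustration, not taken from the presented system.

```python
# Hypothetical sketch: the integration ontology O_I as a set of
# (subject, predicate, object) triples. All class names are invented.
SUBCLASS_OF = "rdfs:subClassOf"
EQUIVALENT_CLASS = "owl:equivalentClass"
DISJOINT_WITH = "owl:disjointWith"

O_I = {
    ("s1:Student", SUBCLASS_OF, "g:Person"),      # is-a relationship
    ("s2:Human", EQUIVALENT_CLASS, "g:Person"),   # equivalence
    ("s1:Student", DISJOINT_WITH, "s2:Course"),   # disjointness
}


def correspondences(term, ontology):
    """Return every mapping triple that mentions `term` on either side."""
    return {t for t in ontology if term in (t[0], t[2])}


person_mappings = correspondences("g:Person", O_I)
```

Keeping whole triples (rather than just the neighbouring term) preserves the direction of rdfs:subClassOf, which matters when the mapping is later used for rewriting.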

Matching and mapping in the system

Shared ontology case:
  • All data are described in a single (shared) ontology, which also describes the relationships among the data => no need to search elsewhere

General case – shared ontology not available:
  • Local ontologies describe the data in the local sources
  • A shared ontology must be obtained => integration of the local sources’ ontologies
  • The task is transformed into an ontology merging task
  • Tools developed for ontology merging can be employed: Chimaera, PROMPT (Protégé), FCA-MERGE, HCONE (WordNet)

Related work on matching

Various approaches search for schema correspondences at different levels:
  • Instance – data processing, e.g. domain
  • Terms – string processing, use of vocabularies, ...
  • Structure – applying graph methods, ...

Classical approaches in schema matching and mapping:
  • Estimation from available information (data, structure, external information sources, ...)
  • Candidate selection (measures, uncertainty, ...)

Here, the task is solved by merging ontologies:
  • However, ontology merging uses principles similar to those mentioned above => similar principles are applied at a different level
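
As a rough illustration of term-level matching followed by candidate selection, the sketch below scores pairs of term names with Python's standard `difflib.SequenceMatcher` and keeps those above a threshold. The similarity measure, the threshold value and the sample terms are all assumptions chosen for the example; the slides do not prescribe a particular measure.

```python
from difflib import SequenceMatcher


def term_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_matches(terms_a, terms_b, threshold=0.8):
    """Return (term_a, term_b, score) for pairs at or above the threshold,
    best score first. The threshold is an arbitrary illustrative choice."""
    pairs = []
    for a in terms_a:
        for b in terms_b:
            score = term_similarity(a, b)
            if score >= threshold:
                pairs.append((a, b, score))
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical term names from two schemas.
matches = candidate_matches(["Author"], ["author", "Title"])
```

A real matcher would typically combine such a string score with instance-level and structural evidence before selecting candidates.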

Querying the integrated data

Sources Sj contain RDF/XML data
  • Querying uses the SPARQL language

Given a query in the global environment ... QG

However, the data are available only in the local sources, each with its local environment

Task: rewrite the query into the local environment of the local sources using the mapping ... QL

Use of mapping for rewriting

Using the mapping described in the ontology

Traversing the OWL ontology graph along equivalence or hierarchical relations
  • Using the well-known OOP rule: a child can substitute for its parent

For a term t:
  • Generate the set of all possible term rewritings ... R(t)
  • Termination condition: two consecutive traversal steps produce no difference, i.e. no new terms are added
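
The traversal can be sketched as a small fixpoint computation. This is one possible reading of the slide, not the system's actual implementation: the mapping is assumed to be a list of triples, equivalence is followed in both directions, and rdfs:subClassOf is followed only from parent to child (a child may stand in for its parent, not the reverse). All class names are hypothetical.

```python
def term_rewritings(term, triples):
    """Compute R(t): the term itself plus every term reachable through
    equivalence (both directions) or subclass (parent -> child) links."""
    rewritings = {term}
    while True:
        added = set()
        for s, p, o in triples:
            if p == "owl:equivalentClass":
                if s in rewritings:
                    added.add(o)
                if o in rewritings:
                    added.add(s)
            elif p == "rdfs:subClassOf" and o in rewritings:
                added.add(s)  # a child can substitute for its parent
        new_terms = added - rewritings
        if not new_terms:  # no difference between two steps: stop
            return rewritings
        rewritings |= new_terms


# Hypothetical mapping triples.
mapping = [
    ("s1:Student", "rdfs:subClassOf", "g:Person"),
    ("s2:Human", "owl:equivalentClass", "g:Person"),
    ("s2:Employee", "rdfs:subClassOf", "s2:Human"),
    ("g:Person", "rdfs:subClassOf", "g:Agent"),
]
r_person = term_rewritings("g:Person", mapping)
```

Here `g:Agent` is not included in R(g:Person): it is a parent of the term, and a parent cannot substitute for its child.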

Simple query processing

Simple query – only a simple condition on an RDF triple
  • For each term t in the query, generate the set of all possible term rewritings ... R(t)
  • Using R(t) for every term in the query, obtain all possible query rewritings ... QL
  • Run the local queries QL on the local sources to obtain local answers
  • Using reverse rewriting, return the answer in the global environment ... global answer
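
The steps above can be sketched for a single triple pattern. The sketch assumes the sets R(t) are already available as a dictionary, represents a simple query as a `(subject, predicate, object)` tuple with SPARQL-style `?variables`, and uses invented term names; none of these representations come from the slides.

```python
from itertools import product


def rewrite_query(pattern, rewritings):
    """Combine R(t) of every term in the pattern into all query rewritings.
    Terms without an entry (e.g. variables) are kept unchanged."""
    options = [sorted(rewritings.get(part, {part})) for part in pattern]
    return [tuple(combo) for combo in product(*options)]


def reverse_rewrite(answer, local_pattern, global_pattern):
    """Map local terms in an answer triple back to the global terms
    they were rewritten from."""
    back = dict(zip(local_pattern, global_pattern))
    return tuple(back.get(value, value) for value in answer)


# Hypothetical global query Q_G and precomputed R(t).
q_global = ("?x", "rdf:type", "g:Person")
R = {"g:Person": {"g:Person", "s1:Student", "s2:Human"}}

local_queries = rewrite_query(q_global, R)

# A hypothetical local answer returned for the second rewriting.
q_local = ("?x", "rdf:type", "s1:Student")
global_answer = reverse_rewrite(
    ("s1:alice", "rdf:type", "s1:Student"), q_local, q_global
)
```

Each tuple in `local_queries` would then be turned into a SPARQL query and sent to the local sources.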

Simple query rewriting

Optimization:
  • Running all possible query rewritings against every local source is inefficient => use the set of supported terms for each source
  • The supported terms are obtained from the ontology, the source schema, source preprocessing, ...
  • Generate the set of all relevant term/query rewritings for each source
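
One way to picture this optimization is a filter that keeps, for each source, only the rewritings whose terms that source supports. The source names, supported-term sets and triple-pattern representation below are hypothetical, and variables are assumed to start with `?`.

```python
def relevant_rewritings(query_rewritings, supported_terms):
    """For each source, keep only rewritings whose non-variable terms
    are all in that source's set of supported terms."""
    relevant = {}
    for source, terms in supported_terms.items():
        relevant[source] = [
            q for q in query_rewritings
            if all(part.startswith("?") or part in terms for part in q)
        ]
    return relevant


# Hypothetical rewritings and per-source supported terms.
all_rewritings = [
    ("?x", "rdf:type", "g:Person"),
    ("?x", "rdf:type", "s1:Student"),
    ("?x", "rdf:type", "s2:Human"),
]
supported = {
    "S1": {"rdf:type", "s1:Student"},
    "S2": {"rdf:type", "s2:Human"},
}
per_source = relevant_rewritings(all_rewritings, supported)
```

In this example each source receives one query instead of three.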

Complex query processing

Complex query – the condition on the searched RDF triple may also be complex

A complex query is divided into simple queries by splitting the complex condition into simple ones

The answers obtained for the simple queries must be composed into the answer to the original (complex) query
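
The slides do not specify how conditions are split or how answers are composed. The sketch below assumes one plausible scheme: a complex condition is a nested `and`/`or` tree over simple conditions, and answers are composed by set intersection and union respectively. The condition encoding and sample data are invented for illustration.

```python
def split_condition(cond):
    """List the simple conditions contained in a (possibly complex) one."""
    if cond[0] in ("and", "or"):
        return [s for sub in cond[1:] for s in split_condition(sub)]
    return [cond]


def compose(cond, simple_answers):
    """Compose answers of simple conditions into the answer of `cond`:
    'and' -> intersection, 'or' -> union (an assumed composition rule)."""
    if cond[0] == "and":
        return set.intersection(*(compose(c, simple_answers) for c in cond[1:]))
    if cond[0] == "or":
        return set.union(*(compose(c, simple_answers) for c in cond[1:]))
    return set(simple_answers[cond])


# Hypothetical complex condition and answers to its simple parts.
is_person = ("rdf:type", "g:Person")
over_30 = ("g:age", ">30")
in_prague = ("g:city", "Prague")
condition = ("and", is_person, ("or", over_30, in_prague))

answers = {
    is_person: {"alice", "bob", "carol"},
    over_30: {"bob"},
    in_prague: {"alice"},
}
simple_parts = split_condition(condition)
result = compose(condition, answers)
```

Each entry of `simple_parts` would be processed as a simple query, and `compose` then reassembles the partial answers.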

Conclusion

Use of ontologies in virtual data integration:
  • Transforms the data integration task into an ontology merging task
  • Allows the use of formalisms, methods and tools from that other task area
  • Can help efforts to automate the task
  • A standardized structure, instead of project-specific mapping rules, makes it possible to reuse the mapping
  • Makes it possible to express various relations between terms

Future plans: experiments with real data

Thank you for your attention