Data Integration by Bi-Directional Schema Transformation Rules

By Peter McBrien and Alexandria Poulovassilis

Presented by Suman Paladugu


Introduction

A new approach to data integration, called both-as-view (BAV)

BAV is based on the use of reversible sequences of schema transformations

Derive global-as-view (GAV) and local-as-view (LAV) view definitions from BAV schema transformation sequences

Support of BAV in the evolution of both global and local schemas

Implementation of the BAV approach within the AutoMed system

Example Local and Global Schemas

Sg: student(id, name, left#, degree)
    monitors(sno, id)
    staff(sno, sname, dept#)

S1: ug(id, name, left#, degree, sno)
    tutor(sno, sname)

S2: phd(id, name, left#, title)
    supervises(sno, id)
    supervisor(sno, sname, dept)

G1: student(id, name, left, degree) = {x, y, z, w | (x, y, z, w, -) ∈ ug ∧ ¬(x, -, -, -) ∈ phd ∨ (x, y, z, -) ∈ phd ∧ w = 'phd'}

G2: monitors(sno, id) = {x, y | (y, -, -, -, x) ∈ ug ∧ ¬(y, -, -, -) ∈ phd ∨ (x, y) ∈ supervises}

G3: staff(sno, sname, dept) = {x, y, z | (x, y) ∈ tutor ∧ ¬(x, -, -) ∈ supervisor ∨ (x, y, z) ∈ supervisor}
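Read operationally, each GAV rule is a query over the local relations. The following Python sketch evaluates G1 and G2 as set comprehensions over small extents. The sample rows are hypothetical, and two readings are assumptions rather than statements from the slides: the phd test is taken as a negation (a student found in phd is not also taken from ug), and ug is matched on its id and sno positions as given in schema S1.

```python
# Hypothetical sample extents for the local schemas S1 and S2.
ug = {                                   # (id, name, left, degree, sno)
    (1, "Ann", None, "bsc", 10),
    (2, "Bob", 2003, "beng", 11),
}
phd = {(3, "Cy", None, "Schema evolution")}   # (id, name, left, title)
supervises = {(12, 3)}                        # (sno, id)

# Ids that occur in phd, used for the assumed negated membership test.
phd_ids = {row[0] for row in phd}


def student():
    """G1: ug rows whose id is not in phd, plus phd rows with degree 'phd'."""
    from_ug = {
        (sid, name, left, degree)
        for (sid, name, left, degree, _sno) in ug
        if sid not in phd_ids
    }
    from_phd = {(sid, name, left, "phd") for (sid, name, left, _title) in phd}
    return from_ug | from_phd


def monitors():
    """G2: (sno, id) pairs from ug for non-phd students, plus supervises."""
    from_ug = {
        (sno, sid)
        for (sid, _name, _left, _degree, sno) in ug
        if sid not in phd_ids
    }
    return from_ug | supervises
```

With these extents, student() holds one row per person and monitors() pairs each student with a tutor or supervisor number.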

L1: tutor(sno, sname) = {x, y | (x, y, -) ∈ staff ∧ (x, z) ∈ monitors ∧ (z, -, -, w) ∈ student ∧ w ≠ 'phd'}

L2: ug(id, name, left, degree, sno) = {x, y, z, w, v | (x, y, z, w) ∈ student ∧ (v, x) ∈ monitors ∧ w ≠ 'phd'}

L3: phd(id, name, left, title) = {x, y, z, w | (x, y, z, v) ∈ student ∧ v = 'phd' ∧ w = null}

L4: supervises(sno, id) = {x, y | (x, y) ∈ monitors ∧ (y, -, -, z) ∈ student ∧ z = 'phd'}

L5: supervisor(sno, sname, dept) = {x, y, z | (x, y, z) ∈ staff ∧ (x, w) ∈ monitors ∧ (w, -, -, v) ∈ student ∧ v = 'phd'}
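The LAV rules run in the opposite direction: each local relation is defined as a query over the global schema. A minimal Python sketch of L3 and L4 follows. The sample extents are hypothetical, Python's None stands in for null, and joining monitors to student on the student id is an assumption based on the attribute order in Sg.

```python
# Hypothetical sample extents for the global schema Sg.
student = {                      # (id, name, left, degree)
    (1, "Ann", None, "bsc"),
    (3, "Cy", None, "phd"),
}
monitors = {(10, 1), (12, 3)}    # (sno, id)


def phd():
    """L3: students whose degree is 'phd'; the title cannot be recovered, so it is None."""
    return {
        (sid, name, left, None)
        for (sid, name, left, degree) in student
        if degree == "phd"
    }


def supervises():
    """L4: monitors pairs whose student has degree 'phd'."""
    phd_ids = {sid for (sid, _name, _left, degree) in student if degree == "phd"}
    return {(sno, sid) for (sno, sid) in monitors if sid in phd_ids}
```

The None in the title position shows why a LAV rule may describe a local relation only partially: the global schema simply does not carry that attribute.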


Evolution Problems of GAV and LAV

GAV does not readily support the evolution of local schemas

In LAV, changes to a local schema affect only the derivation rules defined for that schema

But LAV has problems when the global schema evolves, since the rules of every local schema are defined in terms of it

BAV Integration

Common data model: the Hypergraph Data Model (HDM)


Schemas are incrementally transformed by applying to them a sequence of primitive transformation steps t1, t2, ..., tn.

Intermediate (and final) schemas may contain constructs of more than one modeling language.

BAV Integration (Contd.)

Each add or del transformation is accompanied by a query specifying the extent of the new/deleted construct in terms of the rest of the constructs in the schema.

This allows automatic translation of data and queries between schemas linked by a transformation pathway e.g. for global query processing

Example: A Simple Relational Model

k1, k2, ..., kn, n ≥ 1, are the primary key attributes of a relation R

a1, a2, ..., am, m ≥ 0, are the non-primary key attributes of R

Primitive Transformations of this Model

addRel(⟨⟨R, k1, ..., kn⟩⟩, q) adds to the schema a new relation R

addAtt(⟨⟨R, a⟩⟩, c, q) adds to the schema a non-primary key attribute a for relation R

delRel(⟨⟨R, k1, ..., kn⟩⟩, q) deletes relation R

delAtt(⟨⟨R, a⟩⟩, c, q) deletes attribute a of relation R

extRel(⟨⟨R, k1, ..., kn⟩⟩, q)

extAtt(⟨⟨R, a⟩⟩, c, q)

conRel(⟨⟨R, k1, ..., kn⟩⟩, q)

conAtt(⟨⟨R, a⟩⟩, c, q)
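The eight primitives above can be modelled as plain records, which makes the "reversible sequence" idea concrete. The sketch below is illustrative only: the Step class, its field names, and reverse_pathway are hypothetical, and the pairing of add with del and ext with con (each keeping the same query) is stated here as an assumption about how a pathway is reversed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed pairing of each primitive with the primitive that undoes it.
REVERSE = {"add": "del", "del": "add", "ext": "con", "con": "ext"}


@dataclass(frozen=True)
class Step:
    op: str                           # "add", "del", "ext" or "con"
    kind: str                         # "Rel" or "Att"
    scheme: Tuple[str, ...]           # e.g. ("tutor", "sname")
    query: str                        # extent query, kept as opaque text
    constraint: Optional[str] = None  # e.g. "notnull", attributes only

    def reverse(self) -> "Step":
        """Return the step that undoes this one, with the same query."""
        return Step(REVERSE[self.op], self.kind, self.scheme,
                    self.query, self.constraint)


def reverse_pathway(steps: List[Step]) -> List[Step]:
    """Reverse every step and apply the results in the opposite order."""
    return [step.reverse() for step in reversed(steps)]
```

Reversing a pathway twice gives back the original list, which is the property that lets one stored pathway serve both directions.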

BAV Integration of S1 and S2 into Sg: 'add' Steps

BAV Integration of S1 and S2 into Sg: 'delete' and 'contract' Steps

Correspondence between BAV and GAV/LAV

The 'add' steps correspond to GAV, since global schema constructs are being defined in terms of local ones

The 'del' and 'con' steps correspond to LAV, since local schema constructs are being defined in terms of global ones

Correspondence between BAV and GAV/LAV

A GAV or LAV definition can be converted into a partial BAV definition

A complete GAV or LAV definition can be derived from a BAV definition

BAV thus combines the benefits of GAV and LAV, in the sense that any reasoning or processing which is possible with the view definitions of GAV or LAV will also be possible with the BAV definition

Deriving BAV from GAV

A GAV definition is derived using some of the information present in a BAV definition:

First, a decomposition rule is applied to each GAV rule: G1 generates steps 1-4, G2 generates step 8, and G3 generates steps 5-7

Second, each construct c of type T in the source schema is removed using a transformation step of the form conT(c, void):

conAtt(⟨⟨tutor, sname⟩⟩, notnull, void)

conAtt(⟨⟨phd, title⟩⟩, notnull, void)

conRel(⟨⟨phd, id⟩⟩, void)

Deriving BAV from LAV

A LAV definition is also derived using some of the information present in a BAV definition:

Rules L1 to L5 generate the reverse of transformation steps 23-9

All the BAV transformation steps generated must be 'extend' rather than 'add' ones

extRel(⟨⟨phd, id⟩⟩, {x | x ∈ ⟨⟨student, id⟩⟩ ∧ (x, 'phd') ∈ ⟨⟨student, degree⟩⟩})

extAtt(⟨⟨tutor, sname⟩⟩, notnull, {x, y | (x, y) ∈ ⟨⟨staff, sname⟩⟩ ∧ x ∈ ⟨⟨tutor, sno⟩⟩})

Deriving GAV from BAV

Take the subset, G, of the add and ext steps in the transformation sequence from S1 ∪ S2 ∪ … to Sg

Take each addRel/extRel step in G, together with all addAtt/extAtt steps for the same relation

Form a join of the schemes ⟨⟨R, a1⟩⟩, ..., ⟨⟨R, am⟩⟩ to restore relation R
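The join in the last step can be sketched in Python as follows. This is an illustration under stated assumptions, not the AutoMed implementation: restore_relation is a hypothetical helper, the key is assumed to be a single attribute, each attribute extent is assumed to be a set of (key, value) pairs, and a key with no entry in some attribute extent is given None for that attribute.

```python
def restore_relation(key_extent, attribute_extents):
    """Join per-attribute extents on the key to rebuild the tuples of R.

    key_extent: set of key values (the extent of the relation's key scheme).
    attribute_extents: list of sets of (key, value) pairs, one set per
        non-key attribute, in the attribute order of R.
    """
    # One key -> value lookup per attribute.
    lookups = [dict(extent) for extent in attribute_extents]
    return {
        (key, *[lookup.get(key) for lookup in lookups])
        for key in key_extent
    }


# Hypothetical extents for tutor(sno, sname).
tutor = restore_relation({10, 11}, [{(10, "Dr A"), (11, "Dr B")}])
```

Here tutor contains one (sno, sname) tuple per key value, assembled from the separate key and attribute extents.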

Example

Deriving LAV from BAV

Take the subset, L, of the del and con steps on constructs of Si in the transformation sequence from S1 ∪ S2 ∪ … to Sg

Construction of the LAV view definitions from L proceeds in a similar fashion to the construction of GAV view definitions

E.g. the steps forming L for schema S1 are 9 – 15 above

Rule L1 can then be derived from 9 – 10 and rule L2 from 11 – 15

BAV Support for Global Schema Evolution

If a global schema S evolves to a new schema S', the evolution is specified as a transformation pathway S → S'

Three possible cases for each step t:

1. If t is an add or del, then S' is semantically equivalent to S.

2. If t is a contract, then there will be information that used to be present in S that is no longer available from S'.

3. If t is an extend, then domain knowledge is required to determine whether the new construct in S' can in fact be completely derived from the local data sources.
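The three cases amount to a dispatch on the kind of each evolution step. A minimal Python sketch follows; the function names and the short operation codes ("add", "del", "con", "ext") are hypothetical and chosen only for illustration.

```python
def classify_global_step(op):
    """Map one evolution step of the global schema to its consequence."""
    if op in ("add", "del"):
        return "equivalent"              # S' is semantically equivalent to S
    if op == "con":
        return "information lost"        # some of S is unavailable from S'
    if op == "ext":
        return "needs domain knowledge"  # derivability must be checked
    raise ValueError(f"unknown transformation: {op!r}")


def steps_needing_review(pathway_ops):
    """Return the positions of steps that need domain knowledge."""
    return [
        index
        for index, op in enumerate(pathway_ops)
        if classify_global_step(op) == "needs domain knowledge"
    ]
```

For an evolution pathway given as a list of operation codes, steps_needing_review points at the extend steps, which are the only ones that call for human input.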

BAV Support for Local Schema Evolution

Slightly more complex than global schema evolution

Suppose that some local schema S evolves to S'. The evolution is again defined as a transformation pathway S → S'

Each transformation step, t , in this pathway is again considered in turn

As with global schema evolution, only if t is an extend is domain knowledge required


The AutoMed Architecture

Conclusions

GAV and LAV views can be derived from a BAV specification

BAV thus combines the benefits of GAV and LAV, in that any reasoning or processing which is possible with GAV or LAV view definitions will also be possible with a BAV specification

A key advantage of BAV is that it readily supports the evolution of both local and global schemas, allowing transformation pathways and schemas to be incrementally modified

Questions?

Thank You