34
Model-independent Schema and Data Translation Paolo Atzeni 14/10/2009

Model-independent Schema and Data Translation

  • Upload
    malo

  • View
    31

  • Download
    1

Embed Size (px)

DESCRIPTION

Model-independent Schema and Data Translation. Paolo Atzeni 14/10/2009. References. P. Atzeni, G. Gianforme, and P. Cappellari. A Universal Metamodel and Its Dictionary . Transactions on Large-Scale Data-and Knowledge-Centered Systems I, 2009 - PowerPoint PPT Presentation

Citation preview

Page 1: Model-independent Schema and Data Translation

Model-independent Schema and Data Translation

Paolo Atzeni

14/10/2009

Page 2: Model-independent Schema and Data Translation

References

• P. Atzeni, G. Gianforme, and P. Cappellari. A Universal Metamodel and Its Dictionary. Transactions on Large-Scale Data-and Knowledge-Centered Systems I, 2009

• P. Atzeni, P. Cappellari, R. Torlone, P. A. Bernstein, and G. Gianforme. Model-independent schema translation. VLDB Journal, 2008

• P. Atzeni, P. Cappellari, and P. A. Bernstein. Model independent schema and data translation. EDBT 2006.

ITD 14/10/2009 2

Page 3: Model-independent Schema and Data Translation

ITD 14/10/2009 3

Schema and data translation

• Schema translation:– given schema S1 in model M1 and model M2– find a schema S2 in M2 that “corresponds” to S1

• Schema and data translation:– given also a database D1 for S1– find also a database D2 for S2 that “contains the same data”

as D1

Page 4: Model-independent Schema and Data Translation

ITD 14/10/2009 4

A wider perspective

• (Generic) Model Management:– A proposal by Bernstein et al (2000 +)– Includes a set of operators on

• schemas and mappings between schemas– The main operators

• Match• Merge• Diff• ModelGen (=schema translation)

Page 5: Model-independent Schema and Data Translation

ITD 14/10/2009 5

A long standing issue

• Translations from a model to another have been studied since the 1970’s

• Whenever a new model is defined, techniques and tools to generate translations are studied

• However, proposals and solutions are usually model specific: – Given an ER schema, find the suitable relational schema that

“implements” it • the original paper (Chen 1976) contains the basics• further discussions by many (e.g. Markowitz and Shoshani

1989)• illustrated in every textbook

– Similarly with • any other conceptual model and any other logical one• XML and relational (or object)

Page 6: Model-independent Schema and Data Translation

ITD 14/10/2009 6

 A simple example

• An object relational database, to be translated in a relational one

• Source: the OR-model

• Target: the relational model

System managed ids

used as references

Page 7: Model-independent Schema and Data Translation

ITD 14/10/2009 7

 Example, 2

• How do we traslate?

– Two possibilities. Does the OR model allow for keys?

– If YES (EmpNo and Name are keys)

Page 8: Model-independent Schema and Data Translation

ITD 14/10/2009 8

 Example, 3

• How do we traslate?

– Two possibilities. Does the OR model allow for keys?

– If NOT

Page 9: Model-independent Schema and Data Translation

ITD 14/10/2009 9

Many different models (plus variants …)

OR

Relational

OO

ER

XSD

Page 10: Model-independent Schema and Data Translation

ITD 14/10/2009 10

Heterogeneity

• We need to handle artifacts and data in various models– Data are defined wrt to schemas– Schemas wrt to models– How can models be defined? We need metamodels

Data

Models

Schemas

= metaschemas

= metadata

Page 11: Model-independent Schema and Data Translation

ITD 14/10/2009 11

A metamodel approach

• The constructs in the various models are rather similar:

– can be classified into a few categories (Hull & King 1986):

• Abstract (entity, class, …)

• Lexical: set of printable values (domain)

• Aggregation: a construction based on (subsets of) cartesian products (relationship, table)

• Function (attribute, property)

• Hierarchies

• …

• We can fix a set of metaconstructs (each with variants):

– abstract, lexical, aggregation, function, ...

– the set can be extended if needed, but this will not be frequent

• A model is defined in terms of the metaconstructs it uses

Page 12: Model-independent Schema and Data Translation

ITD 14/10/2009 12

 The metamodel approach, example

• The ER model:– Abstract (called Entity)– Function from Abstract to Lexical (Attribute)– Aggregation of abstracts (Relationship) – …

• The OR model:– Abstract (Table with ID)– Function from Abstract to Lexical (value-based Attribute)– Function from Abstract to Abstract (reference Attribute)– Aggregation of lexicals (value-based Table)– Component of Aggregation of Lexicals (Column)– …

Page 13: Model-independent Schema and Data Translation

ITD 14/10/2009 13

 The supermodel

• A model that includes all the meta-constructs (in their most general forms)– Each model is subsumed by the supermodel (modulo

construct renaming) – Each schema for any model is also a schema for the

supermodel (modulo construct renaming)

• In the example, a model that generalizes OR and relational

• Each translation from the supermodel SM to a target model M is also a translation from any other model to M:– given n models, we need n translations, not n2 

Page 14: Model-independent Schema and Data Translation

ITD 14/10/2009 14

Generic translations

1. Copy

2. Translation

3. Copy

Supermodel

Source model

Target model

Translation:composition 1,2 & 3

Page 15: Model-independent Schema and Data Translation

ITD 14/10/2009 15

Translations within the supermodel

• We still have too many models:– we have few constructs, but each has several independent

features which give rise to variants• for example, within simple OR model versions,

– Key may be specifiable or not– Generalizations may be allowed or not– Foreign keys may be used or not– Nesting may be used or not

– Combining all these, we get hundreds of models!– The management of a specific translation for each model

would be hopeless

Page 16: Model-independent Schema and Data Translation

ITD 14/10/2009 16

The metamodel approach, translations

• As we saw, the constructs in the various models are similar: – can be classified according to the metaconstructs– translations can be defined on metaconstructs,

• there are standard, known ways to deal with translations of constructs (or variants theoreof)

• Elementary translation steps can be defined in this way– Each translation step handles a supermodel construct (or a

feature thereof) "to be eliminated" or "transformed"• Then, elementary translation steps to be combined• A translation is the concatenation of elementary translation

steps

Page 17: Model-independent Schema and Data Translation

ITD 14/10/2009 17

Many different models

OR

Relational

OO

ER

XSD

Page 18: Model-independent Schema and Data Translation

ITD 14/10/2009 18

Many different models (and variants …)

OR w/ PK, gen, ref, FK

Relational…

OR w/ PK, gen, ref

OR w/ PK, gen, FK

OR

w/ PK, ref, FK

OR w/ gen, ref

OR w/ PK, FK

OR w/ PK, ref

OR w/ ref

Page 19: Model-independent Schema and Data Translation

ITD 14/10/2009 19

A complex translation, example(0,N) (0,N)

Target: simple object model• Eliminate N-ary relationships • Eliminate attributes from relationships • Eliminate many-to-many relationships• Replace relationships with references• Eliminate generalizations

Page 20: Model-independent Schema and Data Translation

ITD 14/10/2009 20

Complex translations

Binary ER w/o gen

N-ary ER w/ gen

Binary ERw/ gen

Relational

Bin ER w/ gen w/o attr on rel

OO w/ gen

N-ary ER w/o gen

Bin ER w/o gen w/o attr on rel

Bin ER w/o gen w/o M:N rel

Elim. N-ary relationships Elim. Relationship attr.sElim. M:N relationshipsReplace relationships with

referencesElim OO generalizations Elim ER generalizations

Bin ER w/ gen w/o M:N rel

OO w/o gen…

Page 21: Model-independent Schema and Data Translation

ITD 14/10/2009 21

 A more complex example

• An object relational database, to be translated in a relational one

– Source: an OR-model

– Target: the relational model

Page 22: Model-independent Schema and Data Translation

ITD 14/10/2009 22

A more complex example, 2

ENG

EMP DEPT

Last Name

School

Dept

Name

Address

ID ID

ID

Dept_ID

Emp_ID

Target: relational model

Eliminate generalizations Add keysReplace refs with FKs

Page 23: Model-independent Schema and Data Translation

A more complex example, 3

ITD 14/10/2009 23

EMP

Last Name

ID

Dept_ID

DEPT

Name

Address

ID

ENG

School

ID

Emp_ID

Target: relational model

Eliminate generalizations Add keysReplace refs with FKsReplace objects with tables

Page 24: Model-independent Schema and Data Translation

ITD 14/10/2009 24

Many different models (and variants …)

OR w/ PK, gen, ref, FK

Relational

OR w/ PK, gen, ref

OR w/ PK, gen, FK

OR

w/ PK, ref, FK

OR w/ gen, ref

OR w/ PK, FK

OR w/ PK, ref

OR w/ ref Eliminate generalizations

Add keysReplace refs with FKsReplace objects with tables

Source

Target

Page 25: Model-independent Schema and Data Translation

ITD 14/10/2009 25

Translations in MIDST (our tool)

• Basic translations are written in a variant of Datalog, with OID invention– We specify them at the schema level– The tool "translates them down" to the data level

(both in off-line and run-time manners, see later)– Some completion or tuning may be needed

Page 26: Model-independent Schema and Data Translation

ITD 14/10/2009 26

A Multi-Level Dictionary

• Handles models, schemas and data• Has both a model specific and a model independent component• Relational implementation, so Datalog rules can be easily

specified

Page 27: Model-independent Schema and Data Translation

ITD 14/10/2009 27

A common dictionary (for an ER design tool)

Entity

OID Schema Name

301 1 Employees

302 1 Departments

201 3 Clerks

202 3 Offices

AttributeOfEntity

OID Schema Name isIdent isNullable Type Entity

401 1 EmpNo T F Int 301

402 1 Name F F Text 301

404 1 Name T F Char 302

405 1 Address F F Text 302

501 3 Code T F Int 201

… … … … … … …

Relationship

OID Schema Name Entity1 Entity2

401 1 Memb 301 302 … …

… … … … … … …

Employees DepartmentsMemb

Name

AddressName

EmpNo

Page 28: Model-independent Schema and Data Translation

ITD 14/10/2009 28

Multi-Level Dictionary

model

schema

modelgeneric

modelspecific

description

modelindependence

Supermodel description

(mSM)

Model descriptions

(mM)

Supermodel schemas

(SM)

Model specific schemas

(M)

Supermodel instances

(i-SM)

Model specific instances

(i-M)data

Page 29: Model-independent Schema and Data Translation

ITD 14/10/2009 29

The supermodel description

MSM-Construct

OID Name IsLex

1 AggregationOfLexicals F

2 ComponentOfAggrOfLex T

3 Abstract F

4 AttributeOfAbstract T

5 BinaryAggregationOfAbstracts F

MSM-Property

OID Name Constr. Type

11 Name 1 String

12 Name 2 String

13 IsKey 2 Boolean

14 IsNullable 2 Boolean

15 Type 2 String

16 Name 3 String

17 Name 4 String

18 IsIdentifier 4 Boolean

19 IsNullable 4 Boolean

20 Type 4 String

21 IsFunct1 5 Boolean

22 IsOptional1 5 Boolean

23 Role1 5 String

24 IsFunct2 5 Boolean

25 IsOptional2 5 Boolean

26 Role2 5 String

MSM-Reference

OID Name Construct Target

30 Aggregation 2 1

31 Abstract 4 3

32 Abstract1 5 3

33 Abstract2 5 3

mSMmMSMMi-SMi-M

Page 30: Model-independent Schema and Data Translation

ITD 14/10/2009 30

Model descriptions

MSM-Construct

OID Name IsLex

1 AggregationOfLexicals F

2 ComponentOfAggrOfLex T

3 Abstract F

4 AttributeOfAbstract T

5 BinaryAggregationOfAbstracts F

MM-Construct

OID Model MSM-Constr Name

1 2 3 ER_Entity

2 2 4 ER_Attribute

3 2 5 ER_Relationship

4 1 1 Rel_Table

5 1 2 Rel_Column

6 3 3 OO_Class

MM-Model

OID Name

1 Relational

2 Entity-Relationship

3 Object

MSM-Property

OID Name Construct Type

… … … …

MSM-Reference

OID Name Construct …

… … … …

MM-Property

… … … …

MM-Reference

… … … …

mSMmMSMMi-SMi-M

Page 31: Model-independent Schema and Data Translation

ITD 14/10/2009 31

Schemas in a model

ER-Entity

OID Schema Name

301 1 Employees

302 1 Departments

201 3 Clerks

202 3 Offices

ER-AttributeOfEntity

OID Schema Name isIdent isNullable Type AbstrOID

401 1 EmpNo T F Int 301

402 1 Name F F Text 301

404 1 Name T F Char 302

405 1 Address F F Text 302

501 3 Code T F Int 201

… … … … … … …

SM-Construct

OID Name IsLex

… … …

3 ER-Entity F

4 ER-AttributeOfEntity T

… … …

ER schemas

mSMmMSMMi-SMi-M

Employees

Departments

EmpNo

Name

Name

Address

Page 32: Model-independent Schema and Data Translation

ITD 14/10/2009 32

Schemas in the supermodel

SM-Abstract

OID Schema Name

301 1 Employees

302 1 Departments

201 3 Clerks

202 3 Offices

SM-AttributeOfAbstract

OID Schema Name isIdent isNullable Type AbstrOID

401 1 EmpNo T F Int 301

402 1 Name F F Text 301

404 1 Name T F Char 302

405 1 Address F F Text 302

501 3 Code T F Int 201

… … … … … … …

MSM-Construct

OID Name IsLex

… … …

3 Abstract F

4 AttributeOfAbstract T

… … …

Supermodel schemas

mSMmMSMMi-SMi-M

Employees

Departments

EmpNo

Name

Name

Address

Page 33: Model-independent Schema and Data Translation

ITD 14/10/2009 33

Multi-Level Repository, generation and use

model

schema

modelgeneric

modelspecific

description

modelindependence

Supermodel description

(mSM)

Model descriptions

(mM)

Supermodel schemas

(SM)

Model specific schemas

(M)

Supermodel instances

(i-SM)

Model specific instances

(i-M)data

Structure fixed, content provided by tool

designers

Structure generated by the tool from the content of mSM

Structure generated by the tool from the content of mSM

Structure fixed, content provided by model

designers out of mSM

Structure generated by the tool from the content of mM

Page 34: Model-independent Schema and Data Translation

ITD 14/10/2009 34

continued next week