37
Multidatabase and Distributed Manipulations Part 1 Witold Litwin

Multidatabase and Distributed Manipulations Part 1

  • Upload
    lan

  • View
    37

  • Download
    0

Embed Size (px)

DESCRIPTION

Multidatabase and Distributed Manipulations Part 1. Witold Litwin. Plan. Introduction Technical problems Origin of the concept Approach : Centralized DB (ANSI-SPARC) Approach : DDB (top-down) Approach : Global Schema (bottom-up) Reference architectures - PowerPoint PPT Presentation

Citation preview

Page 1: Multidatabase and Distributed Manipulations Part 1

1

Multidatabase and Distributed Manipulations

Part 1

Multidatabase and Distributed Manipulations

Part 1

Witold Litwin

Page 2: Multidatabase and Distributed Manipulations Part 1

2

Plan

Introduction Technical problems Origin of the concept

– Approach : Centralized DB (ANSI-SPARC)

– Approach : DDB (top-down)

– Approach : Global Schema (bottom-up)

Reference architectures– Multidatabase architecture

– Federated architecture

Autonomy, semantic heterogeneity , common model

Page 3: Multidatabase and Distributed Manipulations Part 1

3

Multidatabase Model

Single database model ANSI-SPARC : – The real universe is modeled as one DB

The real universe is modeled as multiple dbs – Autonomous

– Semantically heterogeneous

– Manipulated through a multibase language "Multidatabase interoperability". Litwin, W. Abdellatif, A. Multidatabase

systems: An advanced solution for global information sharing. Hurson, A., R., Bright, M., W., Pakzad, S., H., (Ed.). IEEE press, 1993

Page 4: Multidatabase and Distributed Manipulations Part 1

4

Multidatabase Model

Cours & étudiants

Bibliothèque

Employés

Rest.

Mes-amisAutres BDs

surInternet

Paris 9Privé

Teletel

FolioCine

Page 5: Multidatabase and Distributed Manipulations Part 1

5

Problems

Reference architecture Semantic heterogeneity in presence of local

autonomy Common data model Fonctions of MDB langage Transactions Protocols & standards Performance

Page 6: Multidatabase and Distributed Manipulations Part 1

6

Reference Architectures

Multidatabase architecture– Generalization of the ANSI-SPARC

architecture – Federated dbs architecture – Generalization of the federated DB

architecture Others

Page 7: Multidatabase and Distributed Manipulations Part 1

7

ANSI-SPARC Architecture Centralized Integrated BD (1960-70)

CS

PS

ES ES

Internal Level

Ext. Level

Conc. level

Reel Univers

ES - External Schema

CS - Conceptual Schema

PS - Physical or Internal Schema

Page 8: Multidatabase and Distributed Manipulations Part 1

8

Distributed DB Origin of the concept (years 1970)

– WAN development ( 20 kb/s)

– Overload of centralized dbs

Page 9: Multidatabase and Distributed Manipulations Part 1

9

Ddb Idea : distribution of functions other than local

communication ("top-down" approach) Which ones ?

Distr. Execution (OS) File access The DB

Then, what data model for CS ? Network (codasyl) ? Relational

Page 10: Multidatabase and Distributed Manipulations Part 1

10

Relation Fragmentation

ParisLyon

Hotels (H#, Ville, Cat, #Chambres)

(H#, Ville) (H#, Cat, #Chambres)

1 Fragment

Page 11: Multidatabase and Distributed Manipulations Part 1

11

Problems

GS scalability GS utility for a local user Query performance (bad case) Data migration from ldbs

– IMS, IDMS, Socrate Local applications

Page 12: Multidatabase and Distributed Manipulations Part 1

12

Problemes With the "Bottom-up" Approach

Creation of GS Semantic heterogeneity Time for the integration / local

restructuring autonomy In other words: scalability of GS Updates Performance Heterogeneos views

Creation of GS Semantic heterogeneity Time for the integration / local

restructuring autonomy In other words: scalability of GS Updates Performance Heterogeneos views

CS CS CS

GS

ES ES

PS PS PS

GS Approach("bottom-up")

Page 13: Multidatabase and Distributed Manipulations Part 1

13

User may have data in multiple dabs ANSI-SPARC compatibles

One may face multiple css» In general, it will be impossible

to create a GS

User may have data in multiple dabs ANSI-SPARC compatibles

One may face multiple css» In general, it will be impossible

to create a GS

MDB Architecture Absence of (GS)

CS CS CS

GS

ES ES

PS PS PS

GS Approach("bottom-up")

Page 14: Multidatabase and Distributed Manipulations Part 1

14

MBD Architecture Concept of the MDB Language

Language for definition and manipulation of collections of dbs (multidatabases)

» Definition of MDB ESs Perhaps GSs

» Definition of MDB dependancies Semantics, integrity, security, manipulation...

» Formulation of MDB queries (explicitly) Referencing DB names With MDB joins...

Find in DB michelin and in DB gaumont all restaurants '**' and cinemas at the same street

Language for definition and manipulation of collections of dbs (multidatabases)

» Definition of MDB ESs Perhaps GSs

» Definition of MDB dependancies Semantics, integrity, security, manipulation...

» Formulation of MDB queries (explicitly) Referencing DB names With MDB joins...

Find in DB michelin and in DB gaumont all restaurants '**' and cinemas at the same street

Page 15: Multidatabase and Distributed Manipulations Part 1

15

MBD Architecture Multibases

A multidatabase (MDB) is a collection of dbs with MDB language– E.G. MSQL

A collection of dbs without MDB langage is not an MDB, but just a collection of dbs– As a collection of flat files (tables) without a

DB language , SQL for example, is not a DB

Page 16: Multidatabase and Distributed Manipulations Part 1

16

Potential Multibases

Cours & étudiants

Bibliothèque

Employés

Rest.

Mes-amisOther DBs

surInternet

Paris 9Privé

Teletel

FolioCine

Page 17: Multidatabase and Distributed Manipulations Part 1

17

MDB Architecture Concept of Internal Logical Sublayer

The legacy data models can be heterogeneous– Different SQL dialects– Relational, hierarchical, network– OO and object-relational

It is preferable to have one model at MDB layer– One needs a sub-layer for translations

Also local DBA may wish not to show some of the data at MDB layer

Solution: ILS - internal logical schema » Unknown of ANSI-SPARC» Called also gateway

Page 18: Multidatabase and Distributed Manipulations Part 1

18

Multibase Architecture Result of All This : (W. Litwin & Al, Années 1980)

PS

ES

CSDS DS

ILS

CS

PS

CS

PS

CS

PS

ES ES

MDB layers

Internal layer

Ext. MDB layer

CS Conc. MDB layer

Usagers

ESmultibase

Req.MDB

Page 19: Multidatabase and Distributed Manipulations Part 1

19

Federated Architecture (Hambiger & Mcleod, Années 1980)

Every DB should be autonomous In general, there will be no GS

– Global integration is against the autonomy

The dbs used in common should form a federation of autonomous dbs

Every DB in a federation should be provided with three schemes:– ES: export schema

– IS: import schema

– PS: private schema : for all the private data, of ES and of IS included

There should be some federation dictionary (FD)

Every DB should be autonomous In general, there will be no GS

– Global integration is against the autonomy

The dbs used in common should form a federation of autonomous dbs

Every DB in a federation should be provided with three schemes:– ES: export schema

– IS: import schema

– PS: private schema : for all the private data, of ES and of IS included

There should be some federation dictionary (FD)

Page 20: Multidatabase and Distributed Manipulations Part 1

20

Federated Architecture (Hambiger & Mcleod, Années 1980)

PS

ESIS

PSES

IS

PSES

IS

PSES

IS

Fig 3. Federated databases architecture

FD

Page 21: Multidatabase and Distributed Manipulations Part 1

21

Comparison

MDB architecture focuses on the concept of MDB language

Federated architecture focuses on the concept of autonomy– No MDB language

– But there is the notion of autonomy also in MDB arch.

MDB architecture is + decentralized– No equivalent of FD

– Several DSs

Both architectures are popular

Page 22: Multidatabase and Distributed Manipulations Part 1

22

Comparison MDB <-> Fed.MDB Arch. Fed.Arch.

Multidatabase Federation

Autonomy Autonomy

MDB Lang.

¬ GS ¬ GS

CS PS

MDB ES IS

DB ES ES

ILS ES

DSs Fed Dict.

Page 23: Multidatabase and Distributed Manipulations Part 1

23

DB Autonomy(local autonomy)

Capability to control local DB data by the DBA– Naming– Value Types– Data Structures – Physical Structures– Query Execution – Security– Priority to local queries

Page 24: Multidatabase and Distributed Manipulations Part 1

24

Multidatabase Autonomy

Capability to controle multiple DBs by a DBA

Same aspects as for the local autonomy– Naming...

May create a conflict with a DB autonomy

Priority to local autonomy

B1 B2 B3

Page 25: Multidatabase and Distributed Manipulations Part 1

25

Semantic Heterogeneity Differences in representations of the same real

properties Names André Andrew Value Types

– Representations– units of measure cm/s pied/h– precision 1 g 1 Kg

Data Structures One table in 2 NF several tables in 3 NF

Page 26: Multidatabase and Distributed Manipulations Part 1

26

Solutions (partial) Schemas + descriptive Protocols + descriptive Data Dictionaries Thesaurus Automatic Representation Conversions Automatic Unit Conversion Implicit Joints Higher level models and manip. languages

– IDL (Krishnamourthy, Litwin, Kent, ACM-Sigmod 92

Page 27: Multidatabase and Distributed Manipulations Part 1

27

Common Models Ext. Relational

– EDA-SQL– MSQL (research)– ODBC Microsoft SQL

Object-Relational– SQL 3

CCS language for inf. retr. DBs Numerous gateways towards SQL

– IMS SQL– Codasyl SQL

Page 28: Multidatabase and Distributed Manipulations Part 1

28

UniSQL/M

UniSQL/M

SybaseOracleUniSQL

IMS

Page 29: Multidatabase and Distributed Manipulations Part 1

29

Other gateways

UniSQL/M

SybaseOracleUniSQL

IMS

Page 30: Multidatabase and Distributed Manipulations Part 1

30

Yet other gateways

UniSQL/M

SybaseOracleUniSQL

IMS

EDA-SQL

Page 31: Multidatabase and Distributed Manipulations Part 1

31

The future (ODBC, JDBC, UDA…)

ODBC x

Page 32: Multidatabase and Distributed Manipulations Part 1

32

Conclusion

Modern DBMSs are in general MDBMSs– UniSQL/M, Oracle, Sybase, MsAccess, SQL

Server, InterBase, DB2... MDB access requires new functions for the

management of – autonomy– semantic heterogeneity – physical distribution of data

Page 33: Multidatabase and Distributed Manipulations Part 1

33

Conclusion technical solutions are basd on :

– new reference architectures » multidatabase architecture

» federated architecture

– common data models » relational and object-relational

– Gateways in rapid development » Any DBMS towards any other using ODBC, JDBC…

» Relational or XML wrappers

Page 34: Multidatabase and Distributed Manipulations Part 1

34

Conclusion

MDB Languages– MSQL et SQL-x ; x > 2

New transaction models Protocoles et Standards

– ODBC, JDBC…

All this is dealt with more in depth– later on– in the class books & several others

Page 35: Multidatabase and Distributed Manipulations Part 1

35

Exercises & Research Problems All these already in the text Difference between the concepts of a DB, DDB,MDB and FDB. What does it mean «  reference architecture », as ANSI-SPARC for example ? Differences between the architectures: « top-down » « bottom-up », multibase et

federated. Comment on the actual architecture of federated DBs in DB2 V. 6 and later (see DB2

Help or RedBooks at the IBM web site). Comment on the actual architecture of federated DBs in SQL Server 2000 (see SQL

Server 2000 Help or the papers at the MS web site). Design SQL SELECT statements realizing the fragmentation of the Hotels DB. Can you propose then how to deal with the updates ? Comment on the concept of ILS, of gateway and of mediator What multidatabase common model is most used today ? Comment the concept of local autonomy (what, why, how) Provide examples of various types of semantic heterogeneity Prove that usual associability of equijoints does not hold if units of measure of

values to join may have different precisions What are consequences for relational DBMSs ? Propose an extension to SQL for units of measure and the corresponding query

optimization (PH.D Thesis).

Page 36: Multidatabase and Distributed Manipulations Part 1

TheEND

Page 37: Multidatabase and Distributed Manipulations Part 1