28
DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Embed Size (px)

Citation preview

Page 1: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

DDBMS ArchitectureDDBMS Architecture

Session-9Data Management for Decision

Support

Page 2: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

ANSI ArchitectureANSI Architecture

External View

External View

External View

Conceptual View

Internal View

ExternalSchema

Users

ConceptualSchema

InternalSchema

Page 3: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

The Classical DDBMS The Classical DDBMS ArchitectureArchitecture

Site 1

Other sites

Site IndependentSchemas

Global Schema

Fragmentation Schema

Allocation Schema

LOCAL DB 1

Local Mapping Schema

DBMS 1

LOCAL DB 2

Local Mapping Schema

DBMS 2

Site 2

Page 4: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

DDBMS SchemasDDBMS Schemas

Global Schema: a set of global relations as if database were not distributed at all

Fragmentation Schema: global relation is split into “non-overlapping” (logical) fragments. 1:n mapping from relation R to fragments Ri.

Allocation Schema: 1:1 or 1:n (redundant) mapping from fragments to sites. All fragments corresponding to the same relation R at a site j constitute the physical image Rj. A copy of a fragment is denoted by Rj

i. Local Mapping Schema: a mapping from

physical images to physical objects, which are manipulated by local DBMSs.

Page 5: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Global Relations, Fragments and Global Relations, Fragments and Physical ImagesPhysical Images

R

GlobalRelation

R33

R32

R22

R21

R12

R11

R3

R2

R1

R2

(Site2)

R1

(Site 1)

R3

(Site3)

Physical Images

Fragments

Page 6: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Motivation for this Motivation for this Architecture Architecture Separating the concept of data

fragmentation from the concept of data allocation

fragmentation transparency location transparency Explicit control of redundancy Independence from local databases:

allows local mapping transparency

Page 7: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Rules for Data Fragmentation Rules for Data Fragmentation

Completeness All the data of the global relation must be mapped into the fragments

Reconstruction It must always be possible to reconstruct each global relation from its fragments

Disjointedness it is convenient that fragments be disjoint, so that the replication of data can be controlled explicitly at the allocation level

Page 8: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Types of Data FragmentationTypes of Data Fragmentation

Vertical Fragmentation•Projection on relation

(subset of attributes)•Reconstruction by join•Updates require no tuple

migrationHorizontal Fragmentation

•Selection on relation (subset of tuples)

•Reconstruction by union•Updates may requires

tuple migrationMixed Fragmentation•A fragment is a Select-

Project query on relation.

Page 9: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Horizontal Fragmentation Horizontal Fragmentation

Partitioning the tuples of a global relation into subsets Example:

Supplier(SNum, Name, City)

Horizontal Fragmentation can be: Supplier 1 = City = ``SFO'' Supplier

Supplier2 = City != “SFO” Supplier

Reconstruction is possible: Supplier = Supplier1 Supplier2

The set of predicates defining all the fragments must be complete, and mutually exclusive

Page 10: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Derived Horizontal Derived Horizontal Fragmentation Fragmentation The horizontal fragmentation is derived

from the horizontal fragmentation of another relationExample:

Supply (SNum, PNum, DeptNum, Quan)

SNum is a supplier number Supply1 = Supply SNum=SNum Supplier1

Supply2 = Supply SNum=SNum Supplier2

The predicates defining derived horizontal fragments are: Supply.SNum = Supplier.SNum and Supplier. City =

``SFO'' Supply.SNum = Supplier.SNum and Supplier. City !=

``SFO''

is the semi join operation.

Page 11: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Vertical Fragmentation Vertical Fragmentation

The vertical fragmentation of a global relation is the subdivision of its attributes into groups; fragments are obtained by projecting the global relation over each groupExample

EMP (ENum,Name,Sal,Tax,MNum,DNum)

A vertical fragmentation can be EMP1 = ENum, Name, MNum, DNum EMP

EMP2 = ENum, Sal, Tax EMP

Reconstruction:

ENum = ENumEMP = EMP1 EMP2

Page 12: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Distribution Transparency Distribution Transparency

We analyze the different levels of distribution transparency which can be provided by DDBMS for applications. A Simple Application

Supplier(SNum, Name, City)

Horizontally fragmented into: Supplier 1 = City = ``SFO'' Supplier at Site1

Supplier2 = City != “SFO” Supplier at Site2, Site3

Application: Read the supplier number from the terminal and

return the name of the supplier with that number

Page 13: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Level 1 of Distribution Level 1 of Distribution Transparency Transparency

Fragmentation transparency:

The DDBMS interprets the database operation by accessing the databases at different sites in a way which is completely determined by the system

Supplier2

Supplier2

Supplier1

DDBMS

read(terminal, $SNum); Select Name into $Name from Supplier where SNum = $SNum;

write(terminal, $Name). S3

S2

S1

Page 14: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Level 2 of Distribution Level 2 of Distribution Transparency Transparency

Location Transparency

The application is independent from changes in allocation schema, but not from changes to fragmentation schema

Supplier2

Supplier2

Supplier1

DDBMS

read(terminal, $SNum); Select Name into $Name from Supplier1 where SNum = $SNum;

If not FOUND thenSelect Name into $Name from Supplier2 where SNum = $SNum;

write(terminal, $Name).

S3

S2

S1

Page 15: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Level 3 of Distribution Level 3 of Distribution Transparency Transparency

Local Mapping Transparency

The applications have to specify both the fragment names and the sites where they are located. The mapping of database operations specified in applications to those in DBMSs at sites is transparent

Supplier2

Supplier2

Supplier1

DDBMS

read(terminal, $SNum); Select Name into $Name from S1.Supplier1 where SNum = $SNum;

If not FOUND thenSelect Name into $Name from S3.Supplier2 where SNum = $SNum;

write(terminal, $Name).

S3

S2

S1

Page 16: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Level 4 of Distribution Level 4 of Distribution Transparency Transparency No Transparency

read(terminal, $SNum); $SupIMS($Snum,$Name,$Found) at S1;

If not FOUND then$SupCODASYL($Snum,$Name,$Found) at S3;

write(terminal, $Name).

S1

Supplier1

IMS

DDBMS

S3

Supplier2

Codasyl

Page 17: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Distribution Transparency for Distribution Transparency for Updates Updates

EMP1 = ENum,Name,Sal,TaxDNum10 (EMP) EMP2 = ENum,MNum,DNumDNum10 (EMP) EMP3 = ENum,Name,DNumDnum>10 (EMP) EMP4 = ENum,MNum,Sal,TaxDnum>10 (EMP)

EnumName Sal Tax 100 Ann 100 10

EnumMnumDnum100 20 3

EnumMnum Sal Tax 100 20 100 10

EnumName Dnum100 Ann 15

Update Dnum=15 for Employee with Enum=100

EMP1

EMP3 EMP4

EMP2

Difficult• broadcasting

updates to all copies

• migration of tuples because of change of fragment defining attributes

Page 18: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

An Update Application An Update Application

UPDATE Emp SET DNum = 15 WHERE ENum = 100;

Select Name, Tax, Sal into $Name, $Sal, $Tax From EMP 1 Where ENum = 100; Select MNum into $MNum From Emp 2 Where ENum = 100; Insert into EMP 3 (ENum, Name, DNum) (100, $Name, 15); Insert into EMP 4 (ENum, Sal, Tax, MNum) (100, $Sal, $Tax, $MNum); Delete EMP 1 where ENum = 100; Delete EMP 2 where ENum = 100;

With Level 1 Fragmentation Transparency

With Level 2 Location

Transparency only

Page 19: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Levels of Distribution Levels of Distribution TransparencyTransparency Fragmentation Transparency

Just like using global relations. Location Transparency

Need to know fragmentation schema; but no need not know where fragments are located

Applications access fragments (no need to specify sites where fragments are located).

Local Mapping Transparency Need to know both fragmentation and allocation

schema; no need to know what the underlying local DBMSs are.

Applications access fragments explicitly specifying where the fragments are located.

No Transparency Need to know local DBMS query languages, and

write applications using functionality provided by the Local DBMS

Page 20: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

On Distribution TransparencyOn Distribution Transparency

Higher levels of distribution transparency require appropriate DDBMS support, but makes end-application developers work easy.

Less distribution transparency the more the end-application developer needs to know about fragmentation and allocation schemes, and how to maintain database consistency.

Complex issues - query optimization and transaction management

Page 21: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Some Aspects of the Classical Some Aspects of the Classical ArchitectureArchitecture Distributed database technology is an

“add-on” technology, most users already have populated centralized DBMSs. Whereas top down design assumes implementation of new DDBMS from scratch.

Current relational DBMS products provide for some form of location transparency (such as, by using nicknames).

Page 22: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Bottom Up Architecture - Bottom Up Architecture - Present & FuturePresent & FuturePossible ways in which multiple databases may

be put together for sharing by multiple DBMSs, which are characterized according to

Autonomy (A) degree to which individual DBMSs can operate independently.

0 - Tightly coupled - integrated (A0), 1 - Semiautonomous - federated (A1), 2- Total Isolation - multidatabase systems(A2)

Distribution (D) 0- no distribution - single site (D0), 1 - client-server - distribution of DBMS

functionality (D1), 2- full distribution - peer to peer distributed

architecture(D2) Heterogeneity (H)

0 - homogeneous (H0) 1 - heterogeneous (H1)

Page 23: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Autonomy Autonomy

Autonomy refers to the distribution of control, not of data.

Degree of independence of operations of individual DBMSs

Requirements of an autonomous system Local operations of the individual DBMSs

are not affected

Local query processing and optimization unaffected

System consistency or operation should not be compromised

Page 24: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Taxonomy of Distributed Taxonomy of Distributed Databases Databases Composite DBMSs -tight integration

single image of entire database is available to any user can be single or multiple sites can be homogeneous or heterogeneous

Federated DBMSs - semiautonomous DBMSs that can operate independently, but have decided

to make some parts of their local data shareable can be single or multiple sites. they need to be modified to enable them to exchange

information Multidatabase Systems - total isolation

individual systems are stand alone DBMSs, which know neither the existence of other databases or how to communicate with them

no global control over the execution of individual DBMSs.

can be single or multiple sites homogeneous or heterogeneous

Page 25: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

DDBMSs Implementation DDBMSs Implementation AlternativesAlternatives

D

Distributed heterogeneous multidatabasesystem

Logically integratedand homogeneous DBMSs

Distributed heterogeneous DBMS

Distributed homogeneous multidatabase system

Heterogeneous Integrated DBMS

Multidatabase System

Single site homogeneous federated DBMS

Distributed homogeneous federated system

Distributed homogeneous DBMS

Heterogeneous multidatabase system

Single Site heterogeneous federated DBMS

Distributed Heterogeneous federated DBMS

A

H

Page 26: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Components of Client/Server Components of Client/Server SystemSystem

UI Application Program

Client DBMS

Communication softwareOpe

ratin

g Sy

stem

System

Recovery Manager

Transaction Manager

Query Optimizer

Semantic Data Controller

Communication software

Ope

ratin

g

Runtime Support Processor

SQL Queries Result Relation

Database

Page 27: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Components of DDBMSComponents of DDBMS

Local

Qu

ery

Pro

cess

or

Local

Recove

ryM

an

ag

er

Ru

nti

me

Su

pp

ort

Use

r In

terf

ace

Han

dle

r

Sem

an

tic D

ata

Con

troll

er

Glo

bal

Qu

ery

Op

tim

izer

Glo

bal

Execu

tion

M

on

itor

ExternalSchema

GlobalSchema

LocalConceptual

Schema

LocalInternalSchema

GD/Dlog

USER PROCESSOR DATA PROCESSOR

Page 28: DDBMS Architecture DDBMS Architecture Session-9 Data Management for Decision Support

Global DirectoryGlobal Directory

Directory is itself a database that contains meta-data about the actual data stored in the database. It includes the support for fragmentation transparency for the classical DDBMS architecture.

Directory can be local or distributed.

Directory can be replicated and/or partitioned.

Directory issues are very important for large multi-database applications, such as digital libraries.