Yunsheng Liu, College of Software, HUST, 2008.11


Ⅶ. Database Design


7.1 Introduction

7.1.1 Problems in DB Design

1. Generic Objective
Data characteristics, application properties and DBMS features are integrated into a DBS by effectively using alternative DB design approaches, techniques and tools.

2. The Main Tasks
- Determination and representation of requirements
- Translation of the requirements into an effective DB structure
- Implementation of the DB structure on a particular computer system
- Consideration of how the implementation will evolve

3. Design principles
1). Matching the application properties and data characteristics with the DBMS features
2). Cost/performance tradeoff
3). Data integration and structure flexibility
- Data integration: central control of data, sharing data, data availability, data non-redundancy
- Structure flexibility: information changes, process changes, performance changes, system SW/HW changes

7.1.2 Database Design Methodologies

1. Concept
A methodology is a combination of an organizational framework (a series of steps) for successive DB structure development projects and a set of techniques and tools used sequentially within that framework.

2. Classification
- Function-based Methodology
- Data-based Methodology
- Object-Oriented Methodology

3. Criteria
- availability, generality and flexibility, reproducibility

7.1.3 Generic DB Design Process

1. Overview of DB Design
[Figure: Database Design takes as inputs the Application Environment (Infor. Requi., Proc. Requi., Op. Requi.), the System Environment (properties, constraints; DBMS, OS) and Methodologies (Tech., Tools). Its results are the Sys. spec. and Sys. impl. of the DBS: DB struc., Appl. guide, Control procedure.]

2. Technology Flow
[Figure: stages Analysis, Modeling, Design, Implem. Artifacts along the flow: User's Requirements, Organization's Info. Stru., DB Data Model, DB Structure, Physical DB. Supporting inputs: Data Relation Theory, Data Modeling Tech., DBMS Features, Data Organizing/Access.]

3. Considerations
1). Requirements formulation: Information, Function/Process, Operation
2). Constraints in aspects: Performance, System environment, Techniques availability, Organization (political, bureaucratic, etc.)
3). Tech., tools and methods, and evolution

7.1.4 The DB Design Steps

[Figure: the DB design steps with their inputs and outputs.
Steps: Appl. Analy. & Requi. Determi., Conceptual Design, Implem. Design, Physical Design, Implementation.
Inputs: Appli. Envi. Properties, Conceptual Modeling Techniques, DBMS Features, Design Tools and Technologies, OS and HW Support.
Outputs: Requi. Spec., Conceptual (E-R) Model, Logical DB Stru. and Appl. Spec., Stored DB stru. and Access Methods, Running DBS.]

7.2 Requirements Analysis & Formulation

7.2.1 Types of Requirements

1. Data requirements
Data in ISP (Information Structure Perspective)
- describing all the natural and conceptual data and their relationships in the DB, not bound to any particular process or application
- providing flexibility and adaptability
Data in UP (Usage Perspective)
- describing all the data and the relationships used in applications, both current and anticipated future applications
- providing efficiency

2. Functional/Processing requirements

3. Operational requirements
- representing goals related to the running environment and performance
  Consistency and security constraints
  Response time constraints
  Recovery time constraints
  Number of applications supported simultaneously

7.2.2 Tasks and Goals of Requirements Analysis

1. The Main Tasks
[Figure: Appl. Enviro. Analysis, Metadata Collection, Appl. Funct. Analysis, Enterprise Modeling, Requirements Specification]


2. The Overall Goals

- Analyzing info. structures and decision process to understand:

(1). The organization’s mechanisms

(2). The scope of the system to be designed

(3). User views of data (business dept.) and data elements in the views

(4). Relationships among the elements

(5). Characteristics and the primary keys of the data

(6). Data processing functions and properties (business activities or applications)

(7). Relationships between the functions and the data

(8). Operational requirements:
     Integrity, consistency and security constraints
     Query modes
     Response time requests
     …

7.2.3 Application Environment Analysis

The Approaches
- Review the previous developments
- Analyze existing reports, files, documents, displays, etc.
- Questionnaires
- Interviews with all kinds of key users
- Observation of the work on site

7.2.4 Metadata Collection

1. Metadata concept

2. Static metadata
- The information on data structure, i.e. definitions and descriptions of data
  Naming
  Constructions
  Formats
  Types and lengths

3. Dynamic metadata

Two aspects of dynamic metadata
- Operational requirements
  Integrity, consistency and security constraints
  Query modes
  Response time constraints
  DB changes in volume and structure
- Behavior
  Usage modes: interactive or preprogrammed
  Operation types and frequencies
  Interactions among data
  Message switches etc.

7.2.5 Application Analysis

1. Objectives
To determine a variety of applications with their data, and the data usages in the applications.

2. Task analysis
Hierarchically decompose the applications into function modules. Each task should:
- be performed within one area of business
- be authorized as a whole at the same level
- be performed as a whole: no different parts of a task may be performed by different callers or triggering conditions, or within different time slices
- utilize the same set of data uniformly

3. Data Analysis: hierarchically decompose the data according to the decompositions of the applications
- Data element definitions: names, types, lengths, pictures etc. (DD)
- Documenting data structures and formats
- Relationships between data (E-R-A model)

4. Identifying the relationships between the processes and data objects
- Operation types, frequencies and data volumes

5. Integrating: the decomposed data and functions in the different parts are integrated, respectively, into a complete database structure and a systematic application module structure
● Removing the inconsistencies, such as the same name with different meanings and the same meaning with different names
● Removing the redundancies

7.2.6 Enterprise Modeling

1. Describing the organizational model of the enterprise, indicating the system scope

2. Representing a Syst./Info. Flow Chart (SFC)

3. Developing a logical model of the system
   Physical procedures → logical processes
   Human actions → abstract processes
   Particular implementations of an algorithm → the description/representation of the algorithm
   Operational objectives → computations for achieving the objectives

7.2.7 System/Requirements Specification

1. Specifying the logical model of the system with:
   DFD
   DD
   Task IPO chart
   Data usage matrix

2. DFD (Data Flow Diagram)
   Hierarchy

[Figure: A Hierarchical DFD Sample. Level 1: processes p1, P2 and P3 between a source and a printer, with data stores D1 and D2 and data flows a to h. Level 2: P3 expanded into P3.1, P3.2 and P3.3, with internal flows x, y, z and the external flows d, f, g, h to D1, D2 and the printer.]

3. DD (Data Dictionary)
A collection of the information on definitions, structures and usages of data elements in a DB.
The contents, typically:
1). General: name, aliases or synonyms, description
2). Format: type, length, domain, format or picture
3). Structure: parent/subsidiary, location (file, record, ..)
4). Usage: range of values, frequency of use, kind (I/O, global, local)
5). Control: origin, users, authorizations (C, Q, D, I, …), conditions of use (key, consistency constraints, …)

An Example: to define or find a data item, you may need to use more than one entry in the DD.

BATCH_STATUS_FILE = {BATCH_JOB_STATUS}
BATCH_JOB_STATUS = JOB_ID_NUMBER + JOB_STATUS
JOB_ID_NUMBER = JOB_ID + JOB_NUMBER
JOB_ID = Char(2), JOB_NUMBER = Number(4)
:
SALARY = Real(5.2)
MEETING_TIME = YY:MM:DD + HH:MN:SS
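The point that one data item may need several DD entries can be illustrated with a small lookup sketch. The dictionary layout, the kind tags ("repeat", "seq", "type") and the function name expand are assumptions made for this illustration; only the entry names and types come from the example.

```python
# Hypothetical in-memory form of the DD entries from the example.
# "repeat" stands for {...}, "seq" for "+", "type" for an elementary item.
DD = {
    "BATCH_STATUS_FILE": ("repeat", "BATCH_JOB_STATUS"),
    "BATCH_JOB_STATUS": ("seq", ["JOB_ID_NUMBER", "JOB_STATUS"]),
    "JOB_ID_NUMBER": ("seq", ["JOB_ID", "JOB_NUMBER"]),
    "JOB_ID": ("type", "Char(2)"),
    "JOB_NUMBER": ("type", "Number(4)"),
}


def expand(name, dd):
    """Recursively expand a DD entry down to its elementary items."""
    if name not in dd:
        # Not defined in this fragment of the dictionary (e.g. JOB_STATUS).
        return name
    kind, body = dd[name]
    if kind == "type":
        return f"{name}:{body}"
    if kind == "seq":
        return " + ".join(expand(part, dd) for part in body)
    if kind == "repeat":
        return "{" + expand(body, dd) + "}"
    raise ValueError(f"unknown entry kind: {kind}")


print(expand("BATCH_STATUS_FILE", DD))
```

Expanding BATCH_STATUS_FILE walks through three further entries before reaching the elementary items, which is the situation the example describes.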

4. Application spec.: Input-Process-Output (IPO) Chart
[Form: IPO CHART-ID, with header fields SYSTEM, SUBSYSTEM, MODULE, DESIGNER, AUTHORIZER, DATE and body fields Caller, Called by, Input, Output, Process, Local data, Notes.]

5. Task-Data Usage Matrix

[Table: tasks T1, T2, …, Ti, …, Tm (rows) against data objects D1, D2, …, Dj, …, Dn (columns). Sample cell entries per row:]
T1: I5, U, Q12, Not Q3, U2
T2: C,
Ti: All C, I2, D, U6, Q15
Tm:

Legend: C, I, D, U, Q = Create, Insert, Delete, Update, Query
        Q12 = Q performed 12 times per time unit
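The cell notation of the task-data usage matrix (an operation letter with an optional frequency, as in Q12) lends itself to a small parser. The sketch below is illustrative only: the sample cells, the default frequency of 1 for a bare letter, and the names parse_cell and load_per_data are assumptions, not part of the slides.

```python
import re
from collections import Counter


def parse_cell(cell):
    """Parse a cell such as 'I5, U, Q12' into {'I': 5, 'U': 1, 'Q': 12}."""
    ops = Counter()
    for token in cell.split(","):
        token = token.strip()
        if not token:
            continue
        match = re.fullmatch(r"([CIDUQ])(\d*)", token)
        if match is None:
            raise ValueError(f"unrecognised entry: {token!r}")
        # Assumption: a bare letter means one execution per time unit.
        ops[match.group(1)] += int(match.group(2) or 1)
    return dict(ops)


# Hypothetical matrix: (task, data object) -> cell text.
matrix = {
    ("T1", "D1"): "I5, U",
    ("T1", "D2"): "Q12",
    ("Ti", "D1"): "D, U6",
}


def load_per_data(matrix):
    """Sum the operation frequencies of all tasks for each data object."""
    totals = {}
    for (_task, data), cell in matrix.items():
        totals.setdefault(data, Counter()).update(parse_cell(cell))
    return {data: dict(counts) for data, counts in totals.items()}


print(load_per_data(matrix))
```

Totals per data object are the kind of figure that later feeds the volume and usage quantification in implementation design.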

6. Data-Operation Relationship Matrix

      Create     Query   Insert     Update     Delete
D1    V1c, N1c           V1i, N1i
D2
Di    Vic, Nic                      Viu, Niu
Dm

7.3 Conceptual Design Overview

7.3.1 Introduction

1. Concept
- a process to develop a conceptual DB structure, independent of system specifics, by modeling user views and then integrating them

2. The Main Tasks
- analyzing and modeling data based on the data requirements: conceptual data modeling

3. The Objective
- an abstract data representation that is comprehensible to both users and designers: E-R

4. Design Considerations
- Management perspective: management view of data
- Operation and transaction perspective: processing view of data
- Structure perspective: intra- and inter-structures of data
- Event perspective: requirements on time and scheduling of applications

7.3.2 Approaches and Techniques

1. Entity Analysis (top-down design)

1). Modeling the user's views of data as follows:
- Modeling management user views
- Modeling operation/transaction user views
- Modeling hierarchical structures and relationships among data elements
- Modeling application events, the "Wh": when, what (tran./op.), which (data),

2). Modeling entities
- Formulating E/R/A
3). Consolidating the user's views into an integrated conceptual view

2. Attributes Synthesis (bottom-up)
- Classifying items: Identity (E, R), Description
- FD sets

[Figure: two paths starting from Requirements, Entity Modeling and Attributes Synthesis. Boxes: Entities formulation, Entity-Attr. analysis, Graphical representation, Classifying items, Composing entities, Formulating relationships, Relationship creation.]

7.3.3 Conceptual Design Steps

1. Steps
[Figure: Data Requi. Spec. → Modeling User Views → Integrating the Views → Conceptual Model Development → Conceptual Design Review]

2. User views modeling
- Based on individual user perspectives:
  Identify data elements
  Identify data groups
  Form relationships among the data elements, intra-group and inter-group

3. Views integration
- Synthesizing the views into a single global structure
  Remove redundancies
  Reconcile inconsistencies

4. Conceptual model components
- Logical data constructs: intra-structures of entities
- Logical data relationships: inter-structures of entities
- Logical access map: logical access paths of applications

5. Conceptual review: for correctness, consistency, completeness, non-redundancy

7.4 Semantic Data Analysis and Modeling

7.4.1 Introduction

1. What
[Figure: Real world → Semantic Data Models → DB Data Models → Computer world]

2. The design process:
- Semantic modeling
- Relational analysis
- Logical conversion (to target DM)

7.4.2 Data Abstractions

1. Aggregation
- The relationship: Is-Part-of
[Figure: Products with parts P-name, P-type, P-factory; P-factory with parts F-name, F-address, F-phone]

2. Classification
- The relationship: Is-Instance-of
[Figure: instances of Products: (P10001, TV, LG BT-49’), (P21888, T-Shirt, XYZ Playboy)]

3. Generalization
- The relationship: Is-a
[Figure: Products generalizes Machine, Fabric, Electric; a further level shows Computer, TV, Refrigerator]

4. Association
- The relationship: Is-Member-of
[Figure: Products with member groups Waste products, Substandard products, Qualified products, Standard products, Excellent products]

7.4.3 E-R Modeling

1. Representations of objects in the real world
Sometimes an object in the real world may be represented as an entity, a relationship, or even an attribute, depending on the designer's view.
Example:
[Figure: three alternative representations of supply. (a) Entities Suppliers and Items linked by a relationship "supply"; (b) an entity Supply with attributes supplier and item; (c) entity Items with attribute supplier, and entity Suppliers with attribute item.]

2. E/R/A selection
"Noun, verb and adjective" principle
Practical rules:
- Identifier → entity type
- Property → if "A is a property of B", B is an entity type and A is an attribute of B
- Event/action → the subjects and objects in an event or action correspond to entity types, and its behavior relates to a relationship type

3. Relationships Analysis
1). Memberships
- Optional: General
- Dependent: Weak entity
- Conditional: Normalization!
[Figure: example relationships: Employee "employ" Enterprise; Employee "award" Honor; Employee "appoint" manager]
2). Identifying: both of the two E's keys

4. Multi-value relationships:
- How to identify? Create a dependent entity type or give a special Id
[Figure: Teacher "lecture" Course, drawn in two variants]

5. N-ary relationships: replaced with a "relationship entity"
[Figure: the relationship "supply" among Supplier, Project and Part is replaced by an entity Supply linked to Supplier, Project and Part]

7.4.4 Conversion from E-RM to RM

1. General rule: E → relation, R → relation, A → attribute, Id → key

2. Conversions of nondeterministic entity types
- An entity type with an attribute of repeating group
- The repeating group attribute must be converted into a dependent entity for normalization

3. Minimizing the number of relations

- Relations which have a common key should be merged into a single relation
- A relation corresponding to a 1:1 relationship type should be merged into one of the two entity relations
- A relation corresponding to a 1:M relationship type should be merged into the relation corresponding to the entity at the M side of the relationship
  Reason: the key of the relationship relation contains superfluous attributes; after removing them, the relationship relation has the same key as the entity relation at the M side of the relationship
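The 1:M merging rule can be sketched in a few lines. The Relation class, the function merge_one_to_many and the Dept/Employee schemas are hypothetical names chosen for this illustration; the slides do not prescribe any data structure.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    name: str
    key: tuple                      # key attributes
    attrs: list = field(default_factory=list)  # non-key attributes


def merge_one_to_many(one, many, rel_attrs=()):
    """Fold a 1:M relationship into the M-side relation.

    The key of the 1-side entity becomes a foreign key of the M-side
    relation, together with any attributes of the relationship itself.
    The key of the M-side relation is unchanged.
    """
    extra = [a for a in (*one.key, *rel_attrs)
             if a not in many.key and a not in many.attrs]
    return Relation(many.name, many.key, many.attrs + extra)


# Hypothetical example: one department employs many employees.
dept = Relation("Dept", ("dept_no",), ["dept_name"])
emp = Relation("Employee", ("emp_no",), ["emp_name"])
emp_merged = merge_one_to_many(dept, emp, rel_attrs=("since",))
print(emp_merged)
```

No separate relation is created for the relationship: its information travels with each M-side tuple, which is why the number of relations stays minimal.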

7.5 Relational Analysis

7.5.1 Introduction

1. What constitutes a bad DB design?
- Redundancies
- Null values
- Operation anomalies

2. How to avoid a bad DB design? Normalization
   How to normalize? Decompositions

3. Normal Forms
NF2, 1NF, 2NF, 3NF, BCNF

7.5.2 Functional Dependencies

1. FD Concepts

Def.: Let R(A) be a relation and X, Y ⊆ A be two attribute sets of R(A). If for any given X-value in R(A) there exists only one corresponding Y-value in R(A), then we say that X functionally determines Y in R(A), denoted X → Y. X → Y is called a functional dependency (FD) of R(A).

● X: the determinant of the FD

● Y: the dependent of the FD
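The definition can be tested directly against a relation instance. The sketch below is a minimal illustration; the function name fd_holds and the sample rows are assumptions, with attribute names borrowed from the EMPLOYEE#/PROJECT# example.

```python
def fd_holds(rows, X, Y):
    """Return True if every X-value in rows has exactly one Y-value."""
    seen = {}
    for row in rows:
        x_value = tuple(row[a] for a in X)
        y_value = tuple(row[a] for a in Y)
        # setdefault stores the first Y-value seen for this X-value;
        # any later, different Y-value violates the FD.
        if seen.setdefault(x_value, y_value) != y_value:
            return False
    return True


# Hypothetical instance.
rows = [
    {"EMPLOYEE#": "e1", "PROJECT#": "p1", "P-NAME": "alpha", "WORK-TIME": 10},
    {"EMPLOYEE#": "e2", "PROJECT#": "p1", "P-NAME": "alpha", "WORK-TIME": 4},
    {"EMPLOYEE#": "e1", "PROJECT#": "p2", "P-NAME": "beta", "WORK-TIME": 7},
]

print(fd_holds(rows, ["PROJECT#"], ["P-NAME"]))
print(fd_holds(rows, ["EMPLOYEE#"], ["WORK-TIME"]))
```

A single instance can only refute an FD. Whether an FD holds for the schema is a statement about all legal instances and comes from the requirements, not from the data at hand.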

Example
[Figure: FD diagram over the attributes EMPLOYEE#, PROJECT#, WORK-TIME, P-NAME, P-BUDGET]

2. An alternative definition of a key of a relation

Def.: Let R(A) be a relation and X ⊆ A be an attribute set of R(A). X is referred to as a key of R(A) if the following two predicates hold:

● ∀Ai ∈ A ( X → Ai )

● ∄X' ⊂ X ( ∀Ai ∈ A ( X' → Ai ) )

3. Full FD

For R(A), if X → Y and ∀X' ⊂ X ( X' ↛ Y ), then X → Y is called a full FD.

Example:
[Figure: FD diagram over the attributes EMPLOYEE#, PROJECT#, WORK-TIME, P-NAME, P-BUDGET]

4. Natures of FDs: Armstrong and Beeri Axioms

A1. If Y ⊆ X, then X → Y. (Reflexivity)
A2. If X → Y and Z ⊆ W, then XW → YZ. (Augmentation)
A3. If X → Y and Y → Z, then X → Z. (Transitivity)
B1. If X → Y and WY → Z, then WX → Z. (Pseudotransitivity)
B2. If X → Y and W → Z, then WX → YZ. (Union)
B3. If X → YZ, then X → Y and X → Z. (Decomposition)

Example: Let F = {Z → A, B → X, AX → Y, ZB → Y} be an FDS.

Ask: is the FD ZB → Y redundant? Yes! Because:

  Z → A, ∴ ZB → AB (A2);
  B → X and AX → Y, ∴ AB → Y (B1);
  ZB → AB and AB → Y, ∴ ZB → Y (A3)

That is, ZB → Y can be derived from the other FDs in F.
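The same redundancy question can be answered mechanically with the attribute-closure algorithm, a standard technique that is equivalent to applying the axioms but is not the hand derivation shown in the example. The helper names closure, fd and is_redundant are assumptions for this sketch; the FD set F is the one from the example.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already derivable, add the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))


F = [fd("Z", "A"), fd("B", "X"), fd("AX", "Y"), fd("ZB", "Y")]


def is_redundant(target, fds):
    """An FD is redundant if it follows from the remaining FDs."""
    rest = [f for f in fds if f != target]
    return target[1] <= closure(target[0], rest)


print(is_redundant(fd("ZB", "Y"), F))
```

Starting from {Z, B}, the remaining FDs add A, then X, then Y, which mirrors the three derivation steps of the example.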

7.5.3 Normal Forms

1. Definitions: For a relation R(A),

1NF: if ∀Ai ∈ A ( Domain(Ai) is elementary ) holds, then R is said to be in the first normal form (1NF)

2NF: if R is in 1NF and every non-prime attribute is fully functionally dependent on the keys of R, then R is said to be in the second normal form (2NF)

3NF: if R is in 2NF and there does not exist any FD between non-prime attributes, then R is said to be in the third normal form (3NF)

BCNF (R. F. Boyce, E. F. Codd): if ∀(X → Y) ∈ SFDC, the smallest closure of FDs over R, X is a superkey of R, then R is said to be in BCNF

Example: GRADE-REPORT

STUD#       S-NAME  MAJOR  COUR#  C-TITLE  I-NAME  I-ADDR  GRADE
2006030074  李文明  CST    CS200  PL       刘 军   XYZ1    92
2006030074  李文明  CST    CS360  DS       王明华  XYZ2    84
2006030074  李文明  CST    CS420  OS       张继业  XYZ3    96
2006030074  李文明  CST    CS460  DB       王明华  XYZ2    68
2007100125  赵大元  ISYS   CS200  PL       刘 军   XYZ1    83
2007110103  刘蓉润  SOFT   SF420  OS       张继业  XYZ3    70
….

2. Trivial FD: an FD X → Y is trivial if Y ⊆ X.
- The FDs that satisfy Reflexivity are trivial

3. Lemma: BCNF is stronger than 3NF
Proof:
Rewrite BCNF: ∀(X → Y) ∈ SFDC of R, one of the following is true:
  Y ⊆ X, that is, it is a trivial FD, or
  K ⊆ X (K is a key of R) holds
Rewrite 3NF: ∀(X → Y) ∈ FDS of R, one of the following is true:
  Y ⊆ X, that is, it is a trivial FD, or
  K ⊆ X (K is a key of R) holds, or
  K ⊇ Y (K is a key of R) holds
Every FD that satisfies the BCNF condition also satisfies the 3NF condition, so a relation in BCNF is also in 3NF.

4. The relationships between NFs

Non-NF
  → (removing repeating groups) → 1NF
  → (removing partial FD on key) → 2NF
  → (removing FD between non-PA's) → 3NF
  → (removing FD between PA's) → BCNF

7.5.4 Anomalies of DB operations

1. Null values: an operation anomaly
- The sources:
  Omission: currently unknown
  Inapplicability: never existing property
  Exception: inapplicable in a particular case
- The effects:
  Bring about undefined relational operations
  Require special processing software
  Waste space
  Convenient to represent records

R:
Proj#   Part#   Number  Price  Budget
123456  P-0001  56      24     1900
133452  P-1003  26      103    1580
181818  P-1008  200     250    5800

2. Insertion anomalies
Insert( R, 100000, - , - , - , 2000 )
  gives the row: 100000  -  -  -  2000  ??
Insert( R, 133452, P-0001, 42, 17, 2000 )
  gives the row: 133452  P-0001  42  17  2000  ??

3. Deletion anomalies
Delete( R, 181818, P-1008 )
The intention is to cancel the part P-1008. What will happen?

4. Update anomalies
Update( R, 123456, -, -, -, -, budget + 200 )
The intention is to add 200 to the budget of project 123456. How can we do it?

5. Eliminating the operation anomalies: Decompositions
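A small sketch of how decomposition removes the deletion and update anomalies, using the three rows of R. It assumes the column order Proj#, Part#, Number, Price, Budget and the FD Proj# → Budget, neither of which is stated explicitly on the slides. The names PROJECT and USAGE are hypothetical.

```python
# Rows of R as (Proj#, Part#, Number, Price, Budget); column order assumed.
R = [
    (123456, "P-0001", 56, 24, 1900),
    (133452, "P-1003", 26, 103, 1580),
    (181818, "P-1008", 200, 250, 5800),
]

# Deletion anomaly: cancelling part P-1008 of project 181818 removes
# the only row that records the budget of that project.
after_delete = [t for t in R if not (t[0] == 181818 and t[1] == "P-1008")]
projects_left = {t[0] for t in after_delete}

# Decomposition under the assumed FD Proj# -> Budget.
PROJECT = {t[0]: t[4] for t in R}                   # Proj# -> Budget
USAGE = [(t[0], t[1], t[2], t[3]) for t in R]       # Proj#, Part#, Number, Price

# The same deletion now touches USAGE only; the budget survives.
USAGE = [u for u in USAGE if not (u[0] == 181818 and u[1] == "P-1008")]

# The budget update is a single assignment, however many parts a project uses.
PROJECT[123456] += 200

print(projects_left, PROJECT)
```

In the single relation the budget is repeated in every row of a project, so an update has to find and change all of them; after decomposition it is stored once.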

7.5.5 Decomposition Concept

1. Why: Removing redundancy and Normalization

2. What: The process of replacing a relation schema R(A) with a collection of subsets Ri(Xi) of R(A)

Def.: Let R(A) be a relation schema and Xi ⊆ A (i = 1, 2, …, k) be subsets of the attributes of R. If

    R(A) = ∪ { Ri(Xi) | i = 1, 2, …, k }

where it is unnecessary that the Ri's be disjoint, then ρ = (R1, R2, …, Rk) is called a decomposition of R.

3. Problems related to decomposition
- For a relation, do we need to decompose it?
  - considering the NF of the relation schema
- What problems does a decomposition cause?
  - Lossless-join
  - Dependency-preservation
- Queries may require joining the decomposed relations
  - Trade off the number of such queries against the potential impact of not decomposing

7.5.6 Lossless Decomposition

1. Lossless-Join Decomposition (LJ-decomposition)

Def.: If R is a relation schema decomposed into schemas Ri (i = 1, 2, …, k) and F is an FDS over R, the decomposition is said to be lossless-join (with respect to F), or an LJ-decomposition (wrt. F), if for any instance r of R the following holds:

    r = π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rk(r)

2. Examples

Example 1:

Instance r of R      π_{S,P}(r)      π_{P,D}(r)
S   P   D            S   P           P   D
s1  p1  d1           s1  p1          p1  d1
s2  p2  d2           s2  p2          p2  d2
s3  p1  d3           s3  p1          p1  d3

π_{S,P}(r) ⋈ π_{P,D}(r):
S   P   D
s1  p1  d1
s2  p2  d2
s3  p1  d3
s1  p1  d3
s3  p1  d1

The join contains two tuples that are not in r, so this decomposition is not lossless.

Example 2:

S#    Sname  phone#  city
1000  Smith  12345   Wuhan
1200  John   21434   Beijing

=

S#    Sname  Phone#         S#    city
1000  Smith  12345     ⋈    1000  Wuhan
1200  John   21434          1200  Beijing
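Both examples can be replayed with a few lines of code. This is a minimal sketch of projection and natural join over lists of dictionaries; the helper names project, natural_join and as_set are assumptions made for the illustration.

```python
def project(rows, attrs):
    """Projection onto attrs; duplicate tuples are removed."""
    distinct = {tuple((a, row[a]) for a in attrs) for row in rows}
    return [dict(t) for t in distinct]


def natural_join(left, right):
    """Natural join of two lists of dict rows on their shared attributes."""
    out = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                out.append({**l_row, **r_row})
    return out


def as_set(rows):
    """Order-independent form of a relation instance, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


# Example 1: instance r over (S, P, D).
r = [
    {"S": "s1", "P": "p1", "D": "d1"},
    {"S": "s2", "P": "p2", "D": "d2"},
    {"S": "s3", "P": "p1", "D": "d3"},
]
joined = natural_join(project(r, ["S", "P"]), project(r, ["P", "D"]))

# Example 2: the supplier relation.
s = [
    {"S#": 1000, "Sname": "Smith", "phone#": 12345, "city": "Wuhan"},
    {"S#": 1200, "Sname": "John", "phone#": 21434, "city": "Beijing"},
]
rejoined = natural_join(project(s, ["S#", "Sname", "phone#"]),
                        project(s, ["S#", "city"]))

print(len(joined), as_set(rejoined) == as_set(s))
```

In Example 1 the shared attribute P does not determine either side, so the join pairs s1 with d3 and s3 with d1. In Example 2 the shared attribute S# identifies each row, and the join gives back exactly the original tuples.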

3. Theorem: Let ρ = (R1, R2) be a decomposition of R and F be an FDS over R. ρ is an LJ-decomposition wrt. F iff (R1 ∩ R2) → R1 or (R1 ∩ R2) → R2.

4. Rule: For an FD X → Y over R(A), if X ∩ Y = ∅, then the decomposition of R into R(A − Y) and R(X, Y) is lossless-join.

5. The existence of LJ-Decompositions
For any R(A), there always exists an LJ-decomposition of R(A) into 3NF.
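The binary theorem translates into a short test: compute the closure of the common attributes and check whether it covers one of the two schemas. The function names and the FD S# → Sname, phone#, city used in the first call are assumptions for this sketch; the slides give the supplier tables but not their FDs.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def lossless_binary(r1, r2, fds):
    """True iff (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


# Supplier schema, assuming S# is the key.
supplier_fds = [({"S#"}, {"Sname", "phone#", "city"})]
print(lossless_binary({"S#", "Sname", "phone#"}, {"S#", "city"}, supplier_fds))

# (S, P) and (P, D) with no FD from P: the theorem's condition fails.
print(lossless_binary({"S", "P"}, {"P", "D"}, []))
```

The second call corresponds to the decomposition into (S, P) and (P, D) whose join produced extra tuples.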

7.5.7 Dependency-Preserving Decomposition

1. Concepts
- For a relation R(A) and its FDS, a decomposition of R(A) into R1(X) and R2(Y) with respect to FDS is dependency-preserving (DP) if

    (FDS_X ∪ FDS_Y)+ = FDS+

2. For any R(A), there always exists an LJ- and DP-decomposition into 3NF.

3. For any R(A), there always exists an LJ-decomposition into BCNF, but it is not ensured that the decomposition is DP.

7.5.8 Relation Decompositions

Original relation (NF2):
S#    Sname   Dept  C#    Cname  Tname   Tphone    Grade
S001  刘华明  CS    C001  DBD    刘 厚   12345678  98
S001  刘华明  CS    C002  OS     张扬名  12345679  65
S001  刘华明  CS    C003  DS     纪严明  22345678  51
S003  赵一清  CS    C004  DBS    刘 厚   12345678  88
S010  王大国  CS    C005  PL     李卫国  15345676  76

is decomposed into:

(3NF)
S#    Sname   Dept
S001  刘华明  CS
S003  赵一清  CS
S010  王大国  CS

(1NF)
S#    C#    Cname  Tname   Tphone    Grade
S001  C001  DBD    刘 厚   12345678  98
S001  C002  OS     张扬名  12345679  65
S001  C003  DS     纪严明  22345678  51
S003  C004  DBS    刘 厚   12345678  88
S010  C005  PL     李卫国  15345676  76

The 1NF relation (S#, C#, Cname, Tname, Tphone, Grade) is decomposed into:

(3NF)
S#    C#    Grade
S001  C001  98
S001  C002  65
S001  C003  51
S003  C004  88
S010  C005  76

(2NF)
C#    Cname  Tname   Tphone
C001  DBD    刘 厚   12345678
C002  OS     张扬名  12345679
C003  DS     纪严明  22345678
C004  DBS    刘 厚   12345678
C005  PL     李卫国  15345676

The 2NF relation (C#, Cname, Tname, Tphone) is decomposed into:

(3NF)
C#    Cname  Tname
C001  DBD    刘 厚
C002  OS     张扬名
C003  DS     纪严明
C004  DBS    刘 厚
C005  PL     李卫国

(3NF)
Tname   Tphone
刘 厚   12345678
张扬名  12345679
纪严明  22345678
李卫国  15345676

7.6 Implementation Design

7.6.1 The Objectives and Tasks

1. The overall objective
- Mapping the conceptual DB model into a DBMS-processible logical DB structure

2. The main tasks
- Transforming the conceptual DB model into an RDM
- Normalizing the RDM to avoid a bad DB design
- Developing the schema and subschemas in the DDL of the chosen DBMS
- Developing an application program design guidance

3. Components of Implementation Design
[Figure: Implementation design follows Requi. formulation and Conceptual design. Inputs: Conceptual DB model, Access Requi. spec., Operational Requi. spec., Consistency constraints, Volume & Usage quantification, Sys. Envir. Char., DBMS Features. Outputs: DB Schema, Subschemas, Spec. for physical des., Appl. Design guidance, Operational guidance.]

7.6.2 Steps of Implementation Design

[Figure: Steps: Logical conversion, Normalizing, Sub-models design, Applications design, Defining DB, DB Schema analysis. Inputs: Conceptual DB model, Access Requi. spec., Operational Requi. spec., Volume & Usage quantification, Sys. Envir. Char., DBMS Features. Intermediate results: Logical DB Structures, User-view Structures. Outputs: DB Schema, Subschemas, Spec. for physical des., Appl. Design guidance, Operational guidance.]

1. Sub-models Design: transforming the user views into data sub-structures. They are:
- interfaces between the DB and the applications
- consistent with the logical DB structures

2. Applications Design: developing a specification for major DB transactions:
- to support the applications
- to outline processes to access required data in the sub-structures

3. Schema Analysis: analyzing and improving the schema and the subschemas
- Evaluation on quantitative information
  the average number of records of each type
  the frequencies of transactions
  the average number of operations in a transaction
  the average volume of data accessed in an operation
- Performance estimation
  response time
  I/O service time

  Total bytes in the DB
  Total bytes transferred by each application
- Review: evaluating whether the schema
  matches the features of the chosen DBMS
  meets the user's requirements
  has any inconsistencies, omissions, errors or violations of the constraints
  has a reasonable performance

7.7 Physical Design

- The main tasks and steps
  Stored record format design
  Stored record clustering
  Access methods design: selecting suitable AM's
  System performance estimation:
    Query response time
    Update transaction cost
    Report generation cost
    Reorganization frequency and cost
    Memory and secondary storage space cost