74
6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

  • View
    222

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.1

Lecture 6

Data Design

IMS2805 - Systems Design and Implementation

Page 2: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.2

References

HOFFER, J.A., GEORGE, J.F. and VALACICH (2002) 3rd ed., Modern Systems Analysis and Design, Prentice-Hall, New Jersey, Chap 12

HOFFER, J.A., GEORGE, J.F. and VALACICH (2005) 4th ed., Modern Systems Analysis and Design, Prentice-Hall, New Jersey, Chap 10

WHITTEN, J.L., BENTLEY, L.D. and DITTMAN, K.C. (2001) 5th ed., Systems Analysis and Design Methods, Irwin/McGraw-Hill, New York, NY. Chapter 12

Page 3: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.3

Data Design

Logical data design:

Logical data modelling

Physical file and database design

Page 4: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.4

Overview of logical data modelling

Logical data modelling involves:

collecting detailed attributes for each entity and relationship identified

converting ER models to relations normalising the relations merging relations from each user viewpoint converting the normalised and merged relations

to create a data structure diagram

Page 5: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.5

The data structures developed in the logical system design phase need to be converted to physical data structures for the physical design stage.

A knowledge of the relevant data base management system (DBMS) is important for this phase.

The aim is to develop a compromise between the “ideal” (most flexible) structure developed during normalisation and the most efficient (fastest, cheapest) structure preferred for implementation.

Data Design

Page 6: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.6

Information required:

Normalised relations, including volume estimates

Definitions of each attribute

Descriptions of where and when data are used (CRUD)

Expectations/requirements for response time and data integrity

Descriptions of technologies to be used for implementing the files and databases

Physical Data Design

Page 7: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.7

Key decisions: The storage format (data type) for each attribute, e.g.

length,coding scheme, decimal places, range of values

Grouping of attributes into physical records (data structure)

Arranging records in secondary memory for storage, retrieval, update (file organisation)

Selection of media and structures for efficient access

DBMS technologies require selection of options for retrieval and for implementation of relationships between entities

Physical Data Design

Page 8: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.8

The relational database model

The relational database model represents data in the form of tables or relations

Important concepts are: relation primary key foreign key functional dependency normalisation

Page 9: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.9

Normalisation

Normalisation is a process for converting complex data structures into simple, stable data structures in the form of relations.

Data models consisting of normalised relations: are robust and stable have minimum redundancy are flexible are technology-independent (logical)

Page 10: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.10

Normalisation ensures that each attribute is attached to the appropriate relation:

i.e. each attribute is contained in the relation which represents the real world system object or concept that the attribute describes or is a property ofe.g. the attribute Student-name should be in the relation STUDENT which represents the real world object “student” of interest to a student records system

Normalisation

Page 11: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.11

Normalisation

Normalisation was originally developed as part of relational database theory by E.F. Codd (1970)

Normalisation is accomplished in stages, each of which corresponds to a “normal form”

Originally, Codd defined first, second and third normal forms. Third normal form is adequate for most business applications.

Later extensions include Boyce-Codd, 4th, 5th and domain-key normal forms.

Page 12: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.12

Relation

A relation is a named, two-dimensional table of data

Each relation consists of a set of named columns and an arbitrary number of rows

Each column corresponds to an attribute of the relation

Each row corresponds to an instance (or record) that contains values for that instance

Page 13: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.13

Example relation

A relation generally corresponds to some real world object or concept of interest to the system (similar to an entity) e.g.:

Emp# Name Salary Dept

1247

1982

9314

Adams

Smith

Jones

24000

27000

33000

Finance

MIS

Finance

Employee

Employee (Emp#, Name, Salary, Dept)

Page 14: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.14

Properties of relations

Relational tables are tables in which:

data values are atomic (single-valued)

data values in columns are from the same domain

each row in the relation is unique

the sequence of columns is insignificant

the sequence of rows is insignificant

Page 15: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.15

Primary key An attribute or group of attributes which uniquely

identifies a row of a relation Entity integrity (relational data base theory)

requires that each relation has a non-null primary key

Where several possible keys are identified, they are known as candidate keys - choose one to be the primary key

Employee (Emp#, Name, Salary, Dept)

Order-item (Order#, Item#, Qty-ordered)

E.g.

Page 16: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.16

Foreign key

A foreign key is an attribute in one relation that is also a primary key in another relation

The referential integrity constraint (relational database theory) specifies that if an attribute value exists in one relation then it must also exist in a linked relation

A foreign key must satisfy referential integrity

Page 17: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.17

Foreign key example

In the example below, if a given Dept# exists in an Employee relation then that Dept# must exist in the Department relation

Employee (Emp#, Name, Salary, Dept#)

Department (Dept#, Dname, Budget)

foreign key

Page 18: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.18

Functional dependency A functional dependency is a particular

relationship between attributes in a relation For any relation R, attribute B is functionally

dependent on attribute A if each value of A has only ONE value of B associated with it,i.e. if for every valid instance of A, that value of A uniquely determines the value of B

A identifies B

A B

Emp# Emp-name

Emp# Salary

Page 19: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.19

Normalisation to third normal form is accomplished in 3 steps each corresponding to a basic normal form

A normal form is a state of a relation that can be determined by applying simple rules concerning dependencies within that relation

Each step of the normalisation process is applied to a single relation in sequence so that the relation is converted to third normal form

Steps in normalisation

Page 20: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.20

Steps in normalisation

First Normal Form

Second Normal Form

Third Normal Form

Unnormalised table

Remove repeating groups

Remove partial dependencies

Remove transitive dependencies

Page 21: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.21

Well Structured Relations

A well structured relation contains a minimum amount of redundancy and allows users to insert, modify, and delete rows in a table without errors or inconsistencies (known as “anomalies”)

Three types of anomaly are possible: insertion deletion Modification

Third normal form relations are considered to be well structured relations

Page 22: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.22

First normal form (1 NF) A relation is in first normal form if it contains no

repeating data :the value of the data at the intersection of each row and column must be single-valued

Remove any repeating groups of attributes to convert a relation to 1 NF (key of the removed group is a composite key)

Order (Order#, Customer#, (Item#, Desc, Qty))

Order-Item (Order#, Item#, Desc, Qty)

Order (Order#, Customer#)

Page 23: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.23

Second normal form (2 NF)

A relation is in 2 NF if it is in 1 NF and every non-ket attribute is fully functionally dependent on the primary key

A partial dependency exists if one or more non-key attributes are dependent on only part of a composite primary key

Remove any partial dependencies to convert a 1 NFrelation to 2 NF

If the primary key consists of only one attribute or there are no non-key attributes, then a 1 NF relation is automatically in 2 NF

Page 24: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.24

Second normal form (2 NF)

Remove Partial Dependencies a non-key attribute cannot be identified by part

of a composite key

Order (Order#, Item#, Desc, Qty-ordered)

Order-Item (Order#, Item#, Qty-ordered)

Item (Item#, Desc)

Page 25: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.25

Partial dependency anomalies

Order# Item# Item-desc Qty

27

28

28

873

402

873

nut

bolt

nut

2

1

10

Order-Item

30 495 washer 50

UPDATE change item =-desc in many places DELETE data for last item lost when last order

for that item is deleted CREATE cannot add new item until it is

ordered

Page 26: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.26

The solution for these anomalies: 2 NF

Order# Item# Qty

27

28

28

873

402

873

2

1

10

Order

30 495 50

Item# Desc

873

402

nut

bolt

495 washer

delete last order for item, but item remains

add new item at any time

change item description in one place only

Item

Page 27: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.27

If a relation has a composite key, it must be checked for dependencies within the key to determine whether they should be retained e.g.

DEPT (Dept#, Dept-name, (Emp#, Emp-name))

DEPT (Dept#, Dept-name), DEPT-EMP (Dept#, Emp#, Emp-name)

But DEPT-EMP (Dept#, Emp#, Emp-name)

Remove Dept# from the key:EMP (Emp# ,Emp-name, Dept#)

Dependencies within the primary key

Page 28: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.28

Third normal form (3 NF)

A relation is in 3 NF if it is in 2 NF and no transitive dependencies exist

A transitive dependency is a functional dependency between two or more non-key attributes

Remove any transitive dependencies to convert a 2 NF relation to 3 NF

Page 29: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.29

Third normal form

Remove Transitive Dependencies a non-key attribute cannot be identified

by another non-key attribute

Employee (Emp#, Ename, Dept#, Dname)

Employee (Emp#, Ename, Dept#)

Department( Dept#, Dname)

(look for foreign keys and their attributes)

Page 30: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.30

Transitive Dependency Anomalies

Emp# Emp-name Dept# Dname

10

20

25

Smith

Jones

Smith

D5

D7

D7

MIS

Finance

Finance

Employee

30 Black D8 Sales

UPDATE change dept name in many places DELETE data for dept lost when last employee

for that dept is deleted CREATE cannot add new dept until an

employee is allocated to it

Page 31: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.31

The solution for these anomalies: 3 NF

Emp# Ename Dept#

10

20

25

Smith

Jones

Smith

D5

D7

D7

Employee

30 Black D8

Dept# Dname

D5

D7

MIS

Finance

D8 Sales

delete last emp in dept, but dept remains

add new dept at any time

change dept name in one place

Item

Page 32: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.32

Normalisation to 3 NF

A relation is normalised if all attributes are fullyfunctionally dependent on the primary key:

Remove repeating groups

Remove partial dependencies

Remove transitive dependencies

Page 33: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.33

Transforming ER Diagrams into Relations

Transforming an ER diagram into normalised relations, and then merging all the relations into one final, consolidated set of relations can be accomplished in four steps:

1. Represent entities as relations2. Represent relationships as relations3. Normalise each relation4. Merge the relations

Page 34: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.34

Representing Entities as Relations

Each entity in the ER model is transformed into a relation e.g.

becomes

CUSTOMER

Customer (Customer-no, Name, Address, City, State, Postcode, Discount)

Page 35: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.35

Representing Relationships as Relations

1. Binary Relationships (1:N, 1:1)

CUSTOMER ORDER

Customer (Customer-no, Name, Address, City, State, Postcode, Discount)

Order (Order-no, Order-date, Promised-date, Customer-no)

places

Page 36: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.36

Representing Relationships as Relations

1. Binary Relationships (1:N, 1:1)

For 1:M, add the primary key of the entity on the one side of the relationship as a foreign key in the relation that is on the many side

For 1:1 relationship involving entities A and B, choose from

add the primary key of A as a foreign key of Badd the primary key of B as a foreign key of Aboth of the above

Page 37: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.37

2. Binary and Higher Degree Relationships (M:N)

Representing Relationships as Relations

PRODUCT ORDER

Where we wish to know the quantity of each product on each order, i.e. this attribute is an attribute of the relationship requests/requested by:

requests

Order (Order-no, Order-date, Promised-date)Order Line (Order-no, Product-no, Quantity-ordered)Product (Product-no, description, (other attributes))

Page 38: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.38

Representing Relationships as Relations

2. Binary and Higher Degree Relationships (M:N)

For M:N, first create a relation for each for each of the entity types, and then create a relation for the relationship, with a composite primary key formed from the primary keys of the participating entity types.

Page 39: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.39

3. Unary Relationships

Representing Relationships as Relations

EMPLOYEEITEM

Has component manages

Reports to

Is a componentof

(1:N)(M:N)

Employee (Emp-id, Name, Birthdate, Manager-id)

Item (Item-no, Name, Cost)Item-Bill (Item-no, Component-no, Quantity)

Page 40: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.40

Representing Relationships as Relations

4. IS-A Relationship (Class-Subclass or generalisation)

PROPERTY

BEACH PROPERTY

MOUNTAINPROPERTY

Property (Street-address, City-state-postcode, No-rooms, Typical-rent)Beach (Street-address, City-state-postcode , Distance-to-beach)Mountain (Street-address, City-state-postcode , Skiing)

Page 41: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.41

During the normalisation process two or more relations with the same primary key may appear

The set of 3NF relations must not contain any duplicate data

Relations with the same primary key should be merged

Normalisation of relations: merging relations

Page 42: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.42

Normalisation of relations

Synonyms Two or more attributes may have different names

but the same meaning Either adopt one of the names as a standard or

choose a third nameSTUDENT1 (Student-id, Name, Phone-no)

STUDENT2 (VCE-no, Name, Address)

STUDENT (Student-id, Name, Address, Phone-no)

Page 43: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.43

Homonyms Two or more attributes may have the same name

but different meanings To resolve the conflict, new attribute names need

to be created

STUDENT1 (Student-id, Name, Address)STUDENT2 (Student-id , Name, Phone-no, Address)

STUDENT (Student-id, Name, Phone-no, Campus-address, Permanent-address )

Normalisation of relations

Page 44: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.44

Normalisation of relations Transitive dependencies

When two 3NF relations are merged, transitive dependencies may result e.g.

STUDENT1 (Student-id, Major)STUDENT2 (Student-id , Advisor)

STUDENT (Student-id, Major, Advisor)

BUT MAJOR ADVISOR

STUDENT1 (Student-id, Major)MAJOR (Major , Advisor)

Page 45: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.45

Data Structure Diagrams

A set of 3 NF relations may be converted to a simple diagrammatic form to begin physical database design

The conversion is simple:

1. Draw a named rectangle for each relation2. Draw a relationship line between rectangles

linked by foreign keys with a “many” cardinality at the foreign key end of the relationship

Page 46: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.46

Data Structure Diagram example

CUSTOMER (Cust#, Cname,Phone number) SALES ORDER (Sord#, Sord-date, Cust#) SALES ORDER-ITEM (Sord#, Item#, Qty) ITEM (Item#, Item-desc)

CUSTOMER

SALESORDER

ITEM

SALES ORDERLINE

Page 47: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.47

Data Structure Diagram example

Eliminate Redundant Relationships

(Tourcode, ...)

(Tourcode, depdate, ...)

(Booking#, ... , Tourcode, depdate)

redundant

TOUR

DEPARTURE

BOOKING

Page 48: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.48

Logical Data Model+

Designer Added Data+

Volume & Sizing Info+

Constraints

Logical Process Model+

Designer Added Procedures+

Frequency & Response Req.+

Constraints

Logical Database Design+

Logical Transaction Model

Target Platform Specific Design

Relational Database Design

Page 49: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.49

Structure (Stability / Adaptability)Preserve correspondence to normalised data model

Preserve data integrity

Performance (Speed and Resource Utilisation)Transaction response requirements

Minimise secondary storage requirementsMinimise transfers between primary and secondary

storage

Database Design Objectives

Page 50: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.50

Logical Design: Designer added entity types and attribute types Review primary and foreign keys Review derived and aggregate data Define table and column structure Establish row and column ratios Define transaction access maps Review transaction access maps considering

performance Develop composite load map Evaluate design in terms of corporate standards,

response requirements and resource utilisation.

Database Design - Logical Design

Page 51: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.51

Selection of table access paths including indexes and file organisation.

Selection of data types and sizes Selection of other DBMS dependent

variables page size, free space management option, data and index clustering, compression, concurrency control issues, database sizing - % fill for tables, buffer sizes, access control constraints.

Database Design - Physical Design

Page 52: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.52

It is often necessary to extend the logical data model with extra entity types, attribute types and / or relationship types to reflect design and implementation decisions.

Additional data may be required to support: archiving back-up access control restart and recovery audit and integrity requirements system housekeeping functions and the storing of

historical information

Designer Added Data

Page 53: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.53

Adding entities for the purpose of archiving:

Project EmployeeHas Assigned

ArchivedProject

ArchivedEmployee

Last UsedEmpno

Example: Designer Added Data

Page 54: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.54

Relations become tables. Attributes become columns

a number of decisions must be made when mapping attributes to columns.

Domains must be defined and linked to the appropriate table columns. (if domains are supported by the DBMS)

Convert Logical Data Model

Page 55: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.55

Some guidelines regarding primary keys: Primary keys identified in the logical data model may

not always be suitable for implementation. The length of each attribute forming the primary key

should be as short as possible. Avoid long character fields.

Keep the use of composite keys to a minimum. Surrogate keys may be used to reduce these

problems. Sometimes these keys may be hidden from the users.

Compact unique identifiers are particularly suitable as primary and therefore foreign keys.

No component of a primary key can accept null values, and the uniqueness property must be enforced.

Primary Keys

Page 56: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.56

A technique to analyse the data access loads generated by each transaction type.

This allows alternative approaches and their impact on data access loads to be considered.

The loads generated by each transaction type may be aggregated to calculate the total access loads on the database, and this information is useful in identifying heavily loaded structures of the database.

Transaction Analysis

Page 57: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.57

To properly evaluate a design it may be necessary to calculate several sets of performance statistics.

Transaction loads may vary widely from peak to average.

Database size may vary widely during application execution particularly if cumulative data is held.

It is much safer to perform design calculations using peak loads, rather than 'average' loads.

Planned equipment utilisation should never exceed 70%.

Statistics Must be Representative

Page 58: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.58

Define the cardinality of each table, and the ratios between table rows.

Develop transaction access maps for each transaction type showing the sequence in which each table is accessed and the basis of each access.

Use table cardinality and ratio information to determine the number of rows of each table that each transaction type will access.

Transaction Analysis Technique

Page 59: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.59

Aggregate these details to produce total access loads for each table in the database.

The individual loads for each transaction type and the composite load map should be carefully examined for opportunities to improve the efficiency of the system.

Great care needs to be taken to balance flexibility against performance, when considering amendments to transaction access paths.

Transaction Analysis Technique

Page 60: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.60

The following data model fragment will be used for examples. The numbers relate to table cardinality (eg Order table 900) table ratios (eg Order to Orderline ratio 1:10)

NextOrder#

(1)

Order Item

(900) (1000)

(9000)

(1:10) (1:9)

OrderLine

Transaction Analysis Technique

Page 61: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.61

Item (Item#, Description, Price, Quantity-on-hand) 1000 records with 5% growth pa

Order-line (Order#, Item#, Qty-ordered, Qty-shipped, Unit-price) 9000 records with 7% growth pa, Ratios ( 1 Item: 9 Order-lines), (1 Order : 10 Order-

lines) Order ( Order#, Ord-date, etc)

900 records with 7% growth pa, Next Order (Order#)

1 record with no growth Each Item has on average 9 Order-lines Each Order has on average 10 Order-lines

Table Cardinality & Row Ratios

Page 62: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.62

Developed from action diagrams for elementary processes. For each transaction, or at least the most critical, it is necessary to identify: Each table access, The sequence of the access within the transaction, The type of access C. R. U. D.(create, read, update,

delete) The columns and predicate of the search condition for

the CRUD access Whether the predicate values come from other tables

accessed by the same transaction, The columns required as a result of this access, The number of accesses per transaction, The number of accesses per period.

Transaction Access Path Maps

Page 63: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.63

Transaction Acces Path MapTransaction Number 1 Frequency 100 per hour peakTransaction Name: Enter New Order 30 per hour average

Destination Destination Destination Source Source Accesses AccessAccess Access Table Access Columns Table Column Per PerNo Type Name Predicate Required Name Names Tran Period

1 read Next Order Order# 1 1002 update Next Order current row Order# 1 1003 insert Order all screen 1 1004 read Item Item= screen all screen 10 10005 update Item current row Quantity -OH screen 10 10006 insert Order-line screen 10 1000

Total Logical Acceses 33 3300

NextOrder#

(1)

1 & 2 3 4 & 5

6

Order Item

(900) (1000)

(9000)

(1:10) (1:9)

OrderLine

Example: Transaction Access Path Map

Page 64: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.64

Item

1 & 2

(1000)

Transaction Acces Path MapTransaction Number 33 Frequency 300 per day peakTransaction Name: Item Status Enquiry 100 per day average

Destination Destination Destination Source Source Accesses AccessAccess Access Table Access Columns Table Column Per PerNo Type Name Predicate Required Name Names Tran Period

1 read Item Item = screen Item#, Desc, QOH 0.7 2102 read Item Desc= screen Item#, Desc, QOH 0.3 90

Total Logical Acceses 1 300

Assumption: 70% of enquiries are by Item#, 30% by description (Desc)

Example: Transaction Access Path Map

Page 65: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.65

3.

2.

1.

Transaction Acces Path Map

Transaction Number 66 Frequency 20 per hour peak

Transaction Name: Check Order Details 15 per hour average

Destination Destination Destination Source Source Accesses Access

Access Access Table Access Columns Table Column Per Per

No Type Name Predicate Required Name Names Tran Period

1 read Order all 1 20

2 read Order-line all Order Order# 10 200

3 read Item Item# = Item# all Order-line Item# 10 200

Total Logical Acceses 21 420

Order Item

(900) (1000)

(9000)

(1:10) (1:9)

OrderLine

Example: Transaction Access Path Map

Page 66: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.66

Complicate the situation, because they are difficult to quantify and predict in advance.

Some controls on when ad hoc queries are permitted may be necessary.

Some DBMS allow the cost of a query to be calculated and the user may be forewarned or prevented from executing an expensive query.

Data may be extracted from the production database during quiet times and loaded onto another database on a different machine to minimise the impact of ad hoc query.

Ad Hoc Queries

Page 67: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.67

The access path for each transaction should be reviewed for efficiency.

A number of techniques may be used to reduce the cost of transactions. Alteration of aggregation time. Alteration of column derivation time. Data Replication Combining and splitting tables Introducing codes simplify identification

and compress data

Analysis of the Transaction Access Path Map

Page 68: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.68

Summarise the information from each transaction access map to show the total number of accesses by table, access type, table columns used for access and the access predicate.

This will highlight possible keys and indexes, row sequencing requirements and the columns used to join tables.

Ideally the production of the composite load map should be automated to allow for automatic re-calculation of totals following changes to transaction access maps.

Developing the Composite Load Map

Page 69: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.69

Composite Load MapDestination Destination Destination Destination Accesses Source SourceTable Access Column Access Per Table Column Trans AccessName Type Name(s) Predicate Hour Name Name(s) Number NumberItem read Item# "=" 200 Order-line Item# 66 3

"=" 1000 screen 1 4"=" 27 screen 33 1Total 1227

Description "=" 12 screen 33 2Total 12

update item# "=" 1000 item item# 1 5Total 1000

Next Order read Order# 100 1 1Total 100

Next Order update Order# 100 1 2Total 100

Order insert all 100 screen 1 3Total 100

read order# "=" 20 screen 66 1Total 20

Order-line insert all 1000 screen 1 6Total 1000

read order# "=" 200 order order# 66 2Total 200

Example: Composite Load Map

Page 70: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.70

The composite load map is designed to highlight table access types that have a very high frequency.

One logical input-output operation does not necessarily result in one physical input-output operation. Buffering techniques, indexes, and logging complicate the actual number of physical I/O operations.

The transactions contributing to these high levels of access should be closely scrutinised to determine whether the number of I/O accesses could be reduced by altering their data access strategy.

The composite load map also provides a base for determining where entry point access methods such as primary and secondary indexes may provide substantial benefit.

Interpreting the Composite Load Map

Page 71: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.71

Where the composite load map shows high join frequencies between tables, control of table row placement, or the use of table access methods such as hashing techniques or indexes may be used to optimise join operations.

Where a table has a high frequency of sequential access, physical ordering of table rows, or the creation of an index to support the access could be useful.

Tables with a small number of rows could be held in main storage, if this option is supported by the DBMS.

The actual access path taken will depend on the DBMS query optimiser.

Interpreting the Composite Load Map

Page 72: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.72

Given the limitations of some DBMS it may be necessary to compromise the logical data model for performance reasons.

In general this should be hidden from application logic by the use of special access routines that are responsible for mapping the normalised application view to the de-normalised storage view.

Some DBMS specific design techniques may allow a normalised application view to be maintained.

Compromising the Logical Model

Page 73: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.73

Some transactions may implement designer added procedures that are part of the designed business system rather than part of the process model developed during analysis.

The starting point for transaction design for process implementing procedures is the logic defined via an action diagram for the elementary process(es) which the transaction (procedure) is to implement.

If action diagrams have not been developed during analysis then these should be developed in close association with the data model as part of design.

Transaction Design

Page 74: 6.1 Lecture 6 Data Design IMS2805 - Systems Design and Implementation

6.74

Operational, audit and access control issues. Impact of designer added data on procedure logic. Concurrency and locking issues. Failure, fallback, restart and recovery issues. Standardisation of transaction resource

requirements. Frequency of use and transaction response

requirements. The use of derived and redundant data to minimise

access to secondary storage. The most appropriate time to derive aggregate

data.

Physical Issues in Transaction Design