Asst. Prof. Dr. Ali Kadhim

Asst. Prof. Dr. Ali Kadhim

Asst.Prof.Dr. Ali Kadhim2 Information Systems Analysis


Introduction

Systems analysis has evolved dramatically over the last decade. During the

1960s and early 1970s, systems analysis used ad hoc methods to analyze and design

computer-based information systems. Many of these methods used natural language

to describe systems and user requirements. Few formal techniques existed.

All this changed during the 1970s. In that period many new ideas were

introduced to overcome problems associated with ad hoc methods. One of these ideas

was the use of modeling techniques instead of natural language for describing

systems. Another idea was the distinction, which was made between logical and

physical analysis and design. A third idea was the introduction of a structured way of

moving from a description of user needs to a working system.

Since these ideas were first introduced, several supporting modeling techniques

have been developed and used in practice. These techniques apply structured methods

to many aspects of systems analysis and design. One important set of techniques is

known as structured systems analysis which concentrate on modeling data flows,

deriving data structure and identifying system functions from those flows. Another

set of techniques concentrates on data, developing data models and using data

analysis methods to construct these models. Process and flow are then defined around

the data model. Data analysis techniques include relational analysis and a variety of

semantic modeling techniques. Another set of techniques concentrates on functional

analysis to describe how data is used and changed in the system.


Chapter One

Fundamental Concepts

1.1 Systems Concept and Information Systems

For our study, the most basic concept is that of a “System”, which we can

initially define as:

“A collection of elements such that interact with each other as well as with the

environment to reach a predefined goal.”

So all systems have some common properties, they have:

Elements,

Environments,

Interaction between their elements and with the environment, and the most

important of all,

Goals to be fulfilled.

Hence, one may define a system as:

“An organized collection of people, machine, procedures, documents, data, or

any other entities such that they interact with each other as well as with the

environment to reach a predefined goal.”

1.1.1 Information

Very similar to the word “System,” the ward “Information” is also a very much

used and confused word. “Data” is another word that is used with information or

even as its synonym. Yet a third word, “Knowledge,” is closely related to the other

two. In fact fifth-generation computer systems are being referred to as knowledge-

information processing systems. An attempt to clear that confusion around the terms

Data, Information, and Knowledge needs to refer to another term: “Message.”

A message is a group of characters that is stored, processed, and transmitted in

the information system of an organization. In other words; information in an

organization is stored, processed, and transmitted as messages. The content of

messages that flow through the information system of an organization has different

levels of meaning depending on whether the message carry data, information, or

knowledge. We recognize three levels of meaning for messages included in an

information system, as:


Knowledge

Information

Data

Meaning value

Data, are groups of characters recognized as having the lowest level meaning,

they are raw facts and opinions,

Information, has more meaning than data in that it is useful in a present decision

situation,

Knowledge, has the highest level of meaning because it represents information

that can be potentially useful in future decision situation.

There is still difficulty in distinguishing data from information or knowledge,

however, because a certain data element may be information to a user at one time and

knowledge to the same user at a different time or place. We, therefore, use the term

“Information” as a general term, without differentiating the meaning level of the

messages.

1.1.2 Information Systems and Organization

The information system of an organization may be defined as:

“A system that serves to provide information within the organization when and

where it is needed at any managerial level.”

Such a system must take the information received and store, retrieve, transform,

process, and communicate it using the computer system or some other means.

Bryce (1983), defines an information system as:

“A logically interrelated set of business process that accomplish organizational

goals.”

Information in an information system environment has the following common

requirements:

Its recipient in the proper frame of reference must understand it.

It must be relevant to a current need in the decision-making process.

It must have a surprise value, that is, what is already known should not be

presented.

It must lead its users to make a decision.

Information system to yield information with above characteristics, it must have

some attributes, as follows:


Effective processing of information. This refers to proper editing of input data and

efficient utilization of hardware and software.

Effective management of information. Stressed are care in file management and

security and integrity of existing data.

Flexibility. The information system should be flexible enough to handle a variety

of operations.

User satisfaction. Of prime importance are user understanding of satisfaction

with the information system.

Any action in an organization depends on the result of a decision-making

process at the proper managerial level of that organization.

1.2 Subsystems of an Information System

The information system of any organization contains information related to three

basic types of operations, namely, transaction processing, control, and strategic

planning. One could group them into two as operating level and management level

activities as,

Information

Transaction data

Decision by management

(Information + rules) Management – level

activities

Decision by operations

personnel

(Information + rules)

Operating – level

activities

“Operating level and management level

activities”


It is almost conventional now to represent management level activities as a

triangle and to base it on a rectangle as a symbol of operating level activities, as

In strategic planning, the executive or top management of the organization

decides on the objectives of the organization, on the resources to be used to attain

these objectives. The control function has both management and operational

components. In management control, middle-level managers assure those resources

are obtained and used effectively and efficiently to accomplish the organization’s

objectives. In operational control, supervisory management assures that specific tasks

are carried out effectively and efficiently.

Transaction processing means the daily routine business operations of the

organization.

When we consider the relationship between the information system of an

organization and its activities, common practice has been to define two subsystems as

MIS (Management Information System) and OIS (Operations Information System),

as

MIS

OIS

“ Information subsystems of an organization”

Transaction Processing

Management Level Activities

Operating Level Activities

Control

Strategic

Planning Executive management

Middle management

(Management controls)

Supervisory management

(Operational controls)

Operational personnel

“ Information – related activities of organization ”


The OIS is the information subsystem relevant to transaction processing of the

organization, and MIS is the information subsystem relevant to managerial decision

for control and strategic planning purposes.

1.3 EDP/ MIS/ DSS

We have conceptualized the information system of an organization as consisting

of two subsystems: MIS and OIS. Quite often the OIS of an organization is referred

to as EDP (Electronic Data Processing) if computers support the information flow. In

addition, the tendency now is to break down the MIS subsystem of an information

system into MIS and DSS (Decision Support System).

The effective collection, storage, transformation, and communication of

information in an organization is a function of EDP, MIS, and DSS. Understanding

the differences between EDP, MIS, and DSS will, therefore, help managers to cope

with the dynamic information systems they are living with.

We may state that data are of primary interest in an EDP system, which

functions at an operational level within the organization. The internal output of an

EDP system consists mainly of factual reports and summary reports.

In a MIS, information is data, which has processed to become meaningful to a

variety of potential users, primarily middle and upper management. In contrast to the

EDP system, an MIS views not just the transformation of data but how it can be

turned into useful organizational information. The major output of a MIS consists of

standardized reports and interrogative reporting. A MIS is designed from an

organizational perspective rather than from business transaction perspective.

In DSS, the emphasis is on decision making at all levels of management in the

organization. DSS is a tool to help managers in their decision-making activities by

providing them with the necessary information. The information produced by a DSS

consists of interactive/iterative reports and unstructured reports oriented to the

individual manager.

An important distinction between DSS on the one hand and EDP and MIS on the

other has to do with development of the system structure. In EDP and MIS, the

system designers develop the system structure, whereas in DSS the manager not only

provides the input but he/she defines the structure as well.


1.4 Other Aspects of Information

Information is not only a resource but also an assist for an organization,

empowering it to produce changes in its environment. Defining knowledge as high-

level information, one sees the significance of information in the technical

development of countries.

The data communications system of an organization is another topic that is

closely related to the information system of that organization. The transmission of

data / information from one point to another in the organization by any means is data

communications, and it seems that more and more, data communication and

information systems of organizations are becoming inseparable. The convergence of

information processing and data communications of organization defines the eighties

as a “decade of information”.

The fast growing use of databases (DB) and database management systems

(DBMS) is also closely related to data processing and data communication and

information systems, and the proper application of these systems definitely improves

the value of information in organizations. One of the major advantages of dB

applications is data integrity or data correctness. A dB also improves the value of

information by minimizing data redundancy. An another major advantage of dB is

that the same set of data is available for different applications.

Another important factor related to the success of information systems as they

impact decision-making activities is the reliability of the information in the database

system.

The developments in “knowledge-based expert system,” or “knowledge

systems” for short, are expected to lead over time to the construction of extremely

valuable “knowledge bases,” a basic unit of the new discipline of “knowledge

engineering.” This by improving the performance of tasks that require expertise.

Data processing

Data communication Decade of

information

Information

systems

1970 1980 1990

“ Information systems in the 1980s ”


1.5 General Model of a system

A general model of a system is input, process, and output. The system is inside

the boundary; the environment is outside the boundary. In some cases, it is fairly

simple to define what is part of the system and what is not; in other cases, the person

studying the system may arbitrarily define the boundaries.

Each system is composed of subsystems, which in Turn are made up of other

subsystems, each subsystem being delineated by its boundaries. The interconnections

and interactions between the subsystems are termed interfaces. Interfaces occur at the

boundary and take the form of inputs and outputs.

“Computer configuration as system”

“Central Processing Unit as system”

System Output Input

Arithmetic

unit

Control

unit Storage

unit

Interfaces

Input

units

Storage

units

Output

units CPU

Interfaces


A subsystem at the lowest level (input, process, and output) is often not defined

as to the process. This system is termed a black box, since the inputs and outputs are

known but not the actual transformation from one to the other. The major system

concepts of boundary, interfaces, subsystem, and black box are illustrated as:

“The boundary concept”

“Interfaces-interconnection at the boundary”

“Simple case of subsystems operating in serial fashion”

Processes (transformation)

not defined

“A Black-box System”

The boundary

delineates the

system

Outside the

boundary is the

environment

Subsystem

Subsystem

Subsystem

# 1

Subsystem

# 2

Subsystem

# 3

Defined

inputs Defined

outputs


Another major system concepts, feedback for system control, will be described

latter.

1.6 Types of Systems

There are several ways of classifying systems that is deterministic and

probabilistic, or closed and open systems.

1.6.1 Deterministic and Probabilistic Systems

A deterministic system operates in a predictable manner. The interaction among

the parts is known with certainty. If one has a description of the state of the system at

a given point in time plus a description of its operation, the next state of the system

may be given exactly, without error. An example is a correct computer program,

which performs exactly according to a set of instructions.

The probabilistic system can be described in terms of probable behavior, but a certain degree of error is always attached to the prediction of what the system will do. An example of a probabilistic system is a set of instructions given to a human who, for a variety of reasons, may not follow the instructions exactly as given.

1.6.2 Closed and Open Systems

A closed system is defined in physics as a system, which is self-contained. It

does not exchange material, information, or energy with its environment.

“Close System”

In organizations and in information processing, there are systems that are

relatively isolated from the environment but not completely closed in the physics

sense. These will be termed closed systems, meaning relatively closed. A computer

program is a relatively closed system because it accepts only previously defined

inputs, processes them, and provides previously defined outputs. In summary, the

No exchanges

with environment


relatively closed system is one that has only controlled and well defined inputs and

outputs. It is not subject to disturbances from outside the system.

“Relatively Closed System”

Open systems exchange information, material, or energy with the environment,

including random and undefined inputs. Open systems tend to have form and

structure to allow them to adapt to changes in their environment in such a way as to

continue their existence. They are “self-organizing” in the sense that they change

their organization to response to changing conditions.

“Open System”

1.7 Subsystems

The use of subsystems as building blocks is basic to analysis and development

of systems. This requires an understanding of the principles, which dictate how

systems are built from subsystems.

1.7.1 Decomposition

A complex system is difficult to comprehend when considered as a whole.

Therefore, the system is decomposed or factored into subsystems. The boundaries

and interfaces are defined, so that the sum of the subsystems constitutes the entire

system. This process of decomposition (or process of factoring) is continued with

subsystems divided into smaller subsystems until the smallest subsystems are of

manageable size.

Decomposition into subsystems is used both to analyze an existing system and

to design and implement a new system. In both cases, the investigator or designer

Controlled exchange with

environment

insulated from outside

disturbances

Known and

defined inputs Known and

defined outputs

Subject to known

and unknown inputs

and environmental

disturbances

Known

Known

Unknown

Disturbance Output


must decide how to factor, i.e., where to draw boundaries. The decisions will depend

on the objectives of the decomposition and also on individual differences among

designers. In design the boundary need to be clearly specified, interfaces simplified,

and appropriate connections established among the subsystems.

1.7.2 Simplification

The process of factoring could lead to large number of subsystem interfaces to

define. For example, four subsystems which all interact with each other will have six

interconnections; a system with 20 subsystems all interacting will have 190

interconnections. The number of interconnections if all subsystems interact is in

general:

½ n (n-1); where n is the number of subsystems.

Simplification is the process of organizing subsystems so as to reduce the

number of interconnections. One of the common use methods of simplification is

Clusters of subsystems are established which interact with each other, then a single

interface path is defined from the cluster to other subsystems or clusters of

subsystems. An example is a database, which is accessed by many programs, but the

interconnections only through a database management interface.

“Systems connected within cluster, and clusters connected with single interface”

1 2

3 4

A2

A3 A4

B1 B2

B3 B4

A1


1.8 Control in Systems

The basic model of a system as inputs, process, and outputs, did not include

regulation and control of the system. For control purposes, a feedback loop is added

to the basic model.

“Feedback control for a system”

Feedback, which seeks to dampen and reduce fluctuations around the standard,

is termed negative feedback. It is used in feedback control loops. Positive feedback

reinforces the direction in which the system is moving. In other words, positive

feedback causes the system to repeat or amplify an adjustment or action.

1.8.1 Negative Feedback Control

Negative feedback control in a system means keeping the system operating

within certain limits of performance. Control using negative feedback normally

involves four elements:

1. A characteristic or condition to be controlled; The characteristic or condition

must be measurable from some output.

2. A sensor for measuring the characteristic or condition.

3. A control unit which compares the measurements with a standard for that

characteristic or condition.

4. An activating unit, which generates a corrective, input signal to the process.

These elements are diagrammed as:

“Negative Feedback Control Elements”

System Output Input

Sensors

Control

device

Input Processor Output

Sensor

Control

comparison

Standard

Activating unit


1.8.2 Feedback Control Loops

Feedback control loops are frequently classified as closed or open. A closed

control loop is an automated control such as a thermostat or computer-controlled

process. In much the same way that a closed system in insulated from disturbances in

the environment, a closed feedback loop is insulated from disturbances in the control

loop. An open control loop is one with random disturbances, such as those associated

with human control elements, there are variations between the two extremes. A

human-machine system is thus an attempt to use the best characteristics of both open

and closed controls to make the system as closed as possible.

1.8.3 Filtering

Filters are often used for system input and in feedback. A filter is essentially a

system element, which keeps out certain inputs, allowing others to enter the system.

Filters may be used to:

1. Reduce the type of inputs.

2. Reduce the amount of information.

A feedback system causes the system to generate corrective responses, but it

may not be desirable to have the system respond to small variations. A filter is

established to eliminate feedback, which does not reach the threshold requiring

corrective action. Exception reports in industry, which cover only those items, which

require action (all others are assumed to be within control limits) are an example of a

filter.

In summary filters should be used in information systems to reduce unneeded

or irrelevant data being accepted for processing or being output for another (perhaps

human) subsystem.


Chapter Two

Systems Analysis And Design

2.1 Why is System Analysis Necessary?

Systems analysis is an important activity that takes place when we are building

new information systems or changing existing once. But why are special activities

(such as systems analysis) needed to build good information systems? Why don’t

these things happen as a matter of course in an organization?

To answer these questions you should consider what needs to done to build

complex computer-based information systems. Basically, it is necessary to set up the

right procedures to ensure that all the organization’s personnel have all the data

needed for their work. To do this, there are myriad of things to be done:

Equipment must be chosen,

New procedures designed,

Programs must be written to support these procedures on the equipment.

The systems are often made up of many interrelated tasks. Changes to any one of

these tasks or additions of new tasks can affect existing once.

It is therefore necessary to spend considerable time to properly understand the system

and its problems. It is only after developing a good understanding that it becomes

possible to propose useful changes, which should make the system more useful and

should not cause unforeseen effects.

2.2 What is Systems Analysis and Design?

The investigation into system operation and possible changes to the system is

called systems analysis. Once systems analysis is completed, system design

commences. Systems analysts use the understanding of the existing system and its

problems to design and eventually build a better system.

Systems analysis has become more difficult over the last decade or so. Early

systems analysts worked on relatively simple systems, which did one thing or had

one function. Now the systems tend to be complex, with many of the systems using

data from a large number of sources and a larger variety of equipment than was

available with early systems. Consequently, systems analysts need better tools to

assist them in their analytical work. A major objective of this course is to describe

such tools and how analysts would use them to analyze systems.


2.3 System Life Cycle

Computer-based information systems are developed in a sequence of steps.

This sequence is known as the system life cycle or problem-solving cycle or

sometimes the development cycle. The system life cycle is used for several reasons,

these are:

First, it is used to organize the large number of activities needed to build a

system.

Second, the system life cycle helps analysts and designers to solve problems that

crop up during system development.

The system life cycle also assists management by producing reports on project

status and keeping track of resource needs.

For this reason it is now more common to define the life cycle as a sequence of

phases. Each phase is made up of more detailed activities than the last, and each has a

specific goal. However, the detailed activities for each phase are not defined at the

start of the project. The usual way to proceed is to review each phase when it is

completed. The review produces a report on the outcome of the phase and also

defines the goal as well as a detailed plan for the next phase.

A number of different kinds of life cycle are used, with each kind designed for a

different type of problem. One kind of the life cycle may be used to develop a

decision support system: a different kind may be used to develop an operational

transaction system. The most common kind of life cycle is the linear cycle.

2.4 The Linear Cycle

The linear cycle is made up of a sequence of consecutive phases. No phase in

the sequence can commence until the previous phase has been completed. A report is

produced at the end of each linear cycle phase, describing what has been achieved in

the phase and outlining a plan for the next phase. The report also includes any system

descriptions, design decisions and problems encountered. Analysts and designers use

this information at the next phase. Phase reports are also used to keep management

informed of project progress, so that the management can use the reports to change

project direction and to allocate resources to the project.


The linear cycle commences with a phase defining the problem to be solved.

Once the problem is agreed upon, the feasibility of solving it is determined. If the

project is feasible, a detailed analysis of a system is then made. System design and

implementation follow this.

User procedures

Proposed equipment configuration

Programs & dB specification

Problem

Definition

Feasibility

Study

System

Analysis

Broad

Design

Detailed

Design

Develop-

ment

Impleme-

ntation

Post-

Impleme-

ntation

Mainten-

ance

Post-implementation

Terms of reference

Project goal

Project bounds

Resource limit

Conceptual solution

Expected cost & benefits

System model

Detailed system objectives

System design

System

Implementation

Working system

“ The linear cycle ”


2.4.1 Linear Cycle Phases

Phase 1 – Problem definition Many people consider this to be the most important phase. It defines the

problem to be solved and thus sets the direction for the whole project. It also defines

the project bounds, which define what parts of the system can be changed by the

project and what parts are outside its control. The resources to be made available to

the project are also specified in this phase. These three important factors – the project

goal, project bounds, and resource limits – are sometimes called the project’s terms

of reference. Because of their importance, the organization’s management sets them.

Phase output

The output of this phase consists of the terms of reference for the project. The

phase output contains a rough idea of the resource requirements of each of the

subsequent project phase. It will contain tentative start and completion dates for each

phase and the number of persons expected to be involved in each phase.

Phase 2 – Feasibility Study The feasibility study proposes one or more conceptual solutions to the problem

set for the project. The conceptual solutions give an idea of what the new system will

look like. They define what will be done on the computer and what will remain

manual. They also indicate what input the systems and what outputs will be needed

will produce. These solutions must be feasible and a preferred solution must be

accepted.

Three things must be done to establish feasibility:

First, it is necessary to check that the project is technically feasible; does the

organization in fact have the technology and skills necessary to carry out the

project and, if not, how should the required technology and skills be obtained?

Second, operational feasibility must be established; to do this it is necessary to

consult the system user to see if the proposed solution satisfies user objectives

and can be fitted into current system operations.

Third, project economic feasibility needs to be checked. The study must

determine whether the project’s goals can be achieved within the resource limits

allocated to it. It must also determine whether it is worthwhile to proceed with the

project at all or whether the benefits obtained from the new system are not worth

the costs (in which case, the project will be terminated).

Phase output

One output of a feasibility study is a preferred conceptual solution together with

the expected system costs and benefits. The plan in phase 1 has now been expanded

into more detail, showing resource requirements for each phase.


Phase 3 – System Analysis A detailed analysis of the system is made in this phase to familiarize designers

with the operation of the existing system. analysts use many of the commonly used

system analysis techniques such as data flow diagrams and data analysis during their

analysis. They also follow specified procedures in their search for information about

the system. this calls for interviews with system users, questionnaires and other data-

gathering methods. Analysts must spend considerable time examining components,

such as the various forms used in the system, as well as the operation of existing

systems.

Phase output

This phase results in a detailed model of the system. the model describes the

system functions, system data, and system information flows.

Phase 4 – System Design This phase produces a design for the new system. There are many things to be

done here:

Designers must select the equipment needed to implement the system.

They must specify new programs or changes to existing programs.

Specify a new database or changes to the existing database.

Designers must also produce detailed user procedures that describe how users

will use the system.

System design usually proceeds in two steps; Phase 4A – Broad design and

Phase 4B – Detailed design. During broad design, the conceptual solutions proposed

by the feasibility study are looked at in more detail. During broad design, major new

functions are proposed and change to existing functions defined. Important inputs and

outputs are also defined at this point. In addition, broad design outlines which part of

the system is to be automated and which will remain manual.

It is common during broad design to consider a number of alternative solutions,

involving different degrees of automation. One of these alternatives is then chosen as

the preferred broad solution.

It is only when abroad solution is chosen that detailed design starts. During the

detailed design phase, the database and program modules are designed and detailed

user procedures are documented. The interfaces between the system users and

computers are also defined. These interfaces define exactly what the user will be

expected to do to use the system.

Phase output

The output of this phase includes a proposed equipment configuration together

with specifications for the database and computer programs. Detailed user procedures


are also provided. These include any input forms and computer interfaces between

users and the computer. The user manual is also ready at the end of this phase.

Phase 5 – System Construction Like the design phase, this phase is also often broken up into two smaller

phases, phase 5A – Development and Phase 5B – Implementation. The individual

system components are built during the development period. Programs are written

and tested, and user interfaces developed and tried by users. The database is

initialized with data.

During implementation, the components builds during development are put into

operational use. To complete the changeover, users must be trained in system

operation and any existing procedures converted to the new system.

Phase output

At the end of the phase, users are provided with a working system. This includes

the set of working programs and an initialized database. In addition, any system

documentation describing the programs is also completed. All users have by now

been trained and can use the new system.

2.4.2 What happens after project completion?

The system is considered to be working when Phase 5 has been completed.

However, there are still a number of activities that take place after a system is

completed. The two main activities are the post-implementation review and

maintenance.

Post-implementation review The post-implementation review usually takes place about a year after the

system is implemented. It evaluates the new system to see if it has indeed satisfied

the goals set for it. The system is examined to see if the benefit expected of it have

been realized. If they have not, a study is made to see why not. Part of this study is

the project life cycle itself. This evaluation then sets guidelines for decision making

in future projects. In addition, post-evaluation may suggest minor changes to be made

to the system. if the system is performing badly, post evaluation may suggest a total

redesign.

Maintenance Maintenance is necessary to eliminate errors in the working system during its

working life and to tune the system to any variations in its working environment.

There always some errors detected that must be corrected.


If a major change to a system is needed, anew project may have to be set up to

carry out the change. This new project will then proceed through all the above life

cycle phases.


2.5 Staged Design

Problem solving cycle are often called loopy linear. They arise by default

because some thing was overlooked in the original plan. Iteration, or ‘loopiness’, is

seen as costly and to be avoided. It generally requires wasteful rework of earlier

phases.

One reason for project errors and consequent loopiness is that the problem being

solved is too large or too uncertain. If it is too large, then it is advisable to break it up

into smaller stages and build the system a stage at a time.

Staged design can only be used where it is possible to break a problem up into a

number of distinct subsystems. Thus, the first step may be a feasibility study that

determines the size of a project. The feasibility steady may suggest that the project is

very large and that it should be broken up into stages. If staged design is approved,

the next step will be to break the whole project up into a number of stages. Each stage

is then developed using the linear cycle and each subsystem is developed separately.

You will not that each stage start with its own problem definition and feasibility

study. This is necessary to get a detailed evaluation and plan for each stage, and also

to evaluate the feasibility of integrating the system produced in each stage with the

systems developed in previous stages. The stage then continues with systems

analysis, system design, and system implementation.


Problem

Definition

Define

Stages

Feasibility

Study

System

Analysis

System

Implement-

ation

Problem

Definition

System

Analysis

System

Implement-

ation

System

Design

Feasibility

Study

System

Design

Feasibility

Study

Stage 1

Stage 2

“Staged Design”


2.6 Structure Systems Development

The classical system approach to information system development emphasizes

the use of a life cycle and documentation for system development. Because of the

complicated nature of information system, however, accepting and following the

steps of the life cycle has not been enough to develop a successful information,

system. Unfortunately, until recently no one realized that following the life cycle

alone would not yield successful in formations. Now, however, it is understood that

the systems development approach requires some tools and techniques to make it

successful. Such an approach is what we commonly call a “structured approach” or

“structure systems development” or “structured analysis and design of information

systems”. The structured approach adopts some well-defined, standardized

procedures and documentation, or at the least a methodology to be product. The

resulting system will have a well-defined structure.

Through a structured approach, more complex problems of business and other

organizations are solved, and the resulting system is easy to maintain, flexible, more

satisfactory to users, better documented, on time, and within budget.

During system development, the information requirements of the organization

are usually documented in the form of a “functional specification” or “logical design”

and represent an agreement between the users and the developers of the system.

Structured analysis is used to define and describe the system that best satisfies

user requirements, given certain time and budget constraints. The objective of

structured design is to minimize the lifetime cost of an information system by

emphasizing maintainability, because maintenance is the most costly component in a

system’s life cycle. The critical point in design is to match the solution to the

problem.

2.7 Structure Tools For Analysis

At the very least, we require three types of new analysis phase tools:

Something to help us partition our requirement and document that partitioning

before specification. For this we use a Data Flow Diagram (DFD) a network of

interrelated processes.

Some means of keeping track of and evaluating interfaces without becoming

unduly physical. For our interface tool we adopt a set of Data Dictionary

convention, tailored to the analysis phase.

New tools to describe logic and policy, something better than narrative text. For

this there are three possibilities: Structured English, Decision Tables, and

Decision Trees.


2.8 Structured Analysis – A Definition

Now that we have laid all the groundwork, it is easy to give a working definition

of structured analysis, Structured Analysis is the use of these tools:

Data Flow Diagram,

Dictionary convention,

Structured English,

Decision Tables, and

Decision Trees.

To build a new kind of Target Document, the structured specification.

Definition: Target Document – Establishes the goals for the rest of the project. It says

what the project will have to deliver in order to be considered a success. The

target document is the principal product of analysis.

2.9 Models in Structured Analysis and Design

There are a variety of ways to implement structured analysis and structured

design, some of the features all methodologies shear are the use of graphical models,

emphasis on user communication (and hence user involvement), repetition of the

previous phase(s) and step(s), and reviews. In the structured analysis process, the

models should represent the functions of the system rather than the means to

accomplish them; in other words, emphasis is on the logical components of the

system rather than its physical components. Consideration of design and

implementation issues is postponed until agreement has been reached between

designers and users on the function or objectives, of the system. in some structured

design methodologies, a set of evaluative criteria is also provided as part of the

methodology as a kind of checklist for the systems analyst .


Physical Logical

2 1

3 4

Old (or current) system

New system

The models that are used in structured analysis and design and their

interrelation are depicted as,

The first model describes how the current system operates by depicting its physical

components. The second model is a logical abstraction of the current system,

prepared to show what the existing system does. The third model represents the

planned operation of the new system. By adding general physical components to the

logical model of the new system we then arrive at a physical model of the new

system – number 4 in the Figure – that details the actual implementation of structured

analysis and structured design.


Chapter Three

Gathering Information

3.1 Introduction

Analysts do a number of things during systems analysis. They gain an

understanding of how the system works and build a system model. They also find out

the requirements of the system users and use these requirements to set objectives for

the new system. Obviously analysts cannot simply sit down and draw the model of

the system or set user requirements in the privacy of their own offices. They must

spend some time studying the system, talking to its users and obtaining information

about how it works.

This chapter describes how to go about getting this information by looking at

sources of information and showing how to gather information from them.

3.2 A Framework for Gathering Information

Information gathering, especially in a large and complex system, can be an

onerous task. Information must be gathered in organized way to ensure that nothing is

over looked and that all system detail is eventually captured. All users must be

consulted to ensure that every system problem, user requirement and objective is

identified. The search must be in such a way that repetitive work is avoided and that

several different analysts seeking the same information do not confront the same

users.

It is easy to go wrong when information gathering is in large systems that have

many information sources. To avoid problems, it is necessary to develop a search

strategy for gathering information in order to ensure that all of the above criteria are

met.

3.3 Search Strategies

A search strategy is formulated by two important choices. First, sources from

which information is to be obtained are identified together with methods to obtain the

information from each source. Then a search procedure for getting the information is

established. The search procedure defines where to begin the search and how to

continue. It also identifies the sequence in which sources will be searched and what

information is to be gathered at each step.


In addition, the search strategy considers modeling methods that help analysts

make sense of the information obtained from the sources. These modeling methods

are used by analysts to build system models, which help analysts to keep track of

what they have done to complete their work.

3.3.1 Information Sources

There are a variety of sources of information for a system. Each source usually

yields different information and requires a different search method to get that

information. Some common sources and search methods appropriate to them are

lined below.

System user; are usually the first information source investigated by analysts.

From users it is possible to find out the existing system activities and to determine

the user objectives and requirement. The usual search methods here are

interviews or sometimes questionnaires.

Information

sources Search

methods Modeling

methods

Select

search

procedure

Select

source

and

method

“The main elements of a search strategy”


Forms and documents; are useful source of information for system date flows and

transactions. The search method begins with the analyst obtaining a list of such

documents from the system users. Analysts then go through the documents to find

their date elements and date structures.

Computer programs; can be used to determine the details of date structures or

processes. The search methods are often laborious and involve reading the

program or its documentation and sometimes running the program with test date

to see what it does.

Procedure manuals; specify what people do in an organization. Analysts to

determine detailed user activities can use them. This detail becomes important in

detailed system design. The method again calls for a detailed examination of the

techniques of the applicable design methodology.

Reports; this source indicates the kinds of outputs needed by users. It can be used

as a basis for user interviews.

3.3.2 Search Procedures

The search procedure suggests what order is to be used to search information

sources and what methods are to be used as the search proceeds. Thus the search

procedure becomes a plan stating what information is to be obtained from each

source and what sequence is to be used to search the sources.

The search procedure has a number of general requirements. First, it must be

top-down in nature. It must also ensure that all information sources are examined to

ensure that all information about the system is gathered.

The search procedure should specify the organizational level at which interviews

start, the personnel to be interviewed and any other information sources to be used.

They should also include an interview plan and times when other sources are to be

examined.

Search procedures very depending on whether an existing system is being

investigated or whether a totally new system is to be designed.


3.3.3 Search Methods

There are two important search methods, interviewing techniques, and questionnaires.

Interviewing; this is the most common method of gathering information from

users. Interviewing is a continuous process that is used by the analyst to gradually

build a model of the system and to gain understanding of any systems problems.

There are two key factors in successful interviewing. The first is to choose

people to interview. This is important, us the analyst must ensure that all people

within the study boundary are considered. The next important factor is finding the

right way to conduct an individual interview.

The interview plan, it specifies:

The users to be interviewed;

The sequence in which users are interviewed; and

The interview plan for each user.

You should not expect to obtain all of the information required from one user in

the course of one interview. There is usually a series of two, three or even more

interviews with a given user. Usually the analyst begins with an initial interview to

meet the users, this first interview is then followed by a number of fact-gathering

(fact-finding) interviews to gather all the major facts known to the user. Then there

may be one or more interviews to verify these facts and any models developed by the

analyst and to gain additional information to complete the analyst’s study.

Questionnaires; it contain all the questions that a user must answer to provide

information sought by the analyst. The questionnaire is then to the user and the

analyst analyzes replies.

Practical experience suggests that questionnaires are not usually a good

substituted for interviews. Questionnaires, however, are useful when the same

kind of information is sought from a number of users.

Questionnaires are useful for gathering numerical data or getting relatively

simple opinions from a number of people, but they are not very effective for in-

depth searches or identifying system problems or solutions. Interviews tend to be

more successful for this purpose.

Other search methods, search methods for sources such as documents, reports, or

computer programs are less formal. They simply evolve sitting down and going

through these sources, looking for relevant information. Often the search is

guided by the modeling techniques provided by the design methodology. Thus

when we examine programs, reports, procedures or documents, we compare them

to the current model. As we do this, we add any new information to the model

and check for any inconsistencies.


Chapter Four

Starting a Project

4.1 Introduction

The first two linear cycle phases, problem definition and feasibility analysis, are

used to start a project. These phases are important because they set the direction for

the remainder of the project. The first phase defines the project goal. The second

determines a feasible way for achieving this goal. Feasibility analysis usually

considers a number of project alternatives, one of which is chosen as the most

satisfactory solution.

It is therefore necessary to investigate as many alternatives as possible to ensure

that the best project direction is chosen. Furthermore, this direction must be set

without going into detailed design. These alternatives should be evaluated to find the

best way to satisfy the project goal. These alternatives also need to be evaluated in a

broad way without committing too many resources. The analyst needs to find the best

possible solution without actually building the system.

4.2 Setting the Project Goal

Project goals have two important aspects for systems analysts: how they look

and how they are specified. There are many ways of setting project goals.

One important guideline for goal setting is to remember that goals should not be

unrealizable ideals that are subsequently ignored. They must be developed within the

practicalities of the organization. One way to ensure that such practicalities can be

met is to elaborate the project goal into more detailed sub-goals that consider

organizational constraints. These sub-goals are used in later stages to guide detailed

analysis and design.

Another guideline for goal definition is the identification of deficiencies in the

existing system. The project goal is to remove sub-deficiencies. Deficiencies are

usually found through interviews or by examining documents about system

performance. During initial analysis, therefore, interviewers should search for

deficiencies such as:

Missing functions;

Unsatisfactory performance; or

Excessively costly operation.


So far we have suggested that goals are found by examining the internal

operation of a system. Examining a system’s external environment can also set goals.

One important external source is competitor performance. It is always worthwhile to

examine what similar organizations do and see whether some of their methods can be

used in the proposed system. Technical developments outside an organization are

also an important source, especially for systems that use computers. New technical

developments must be evaluated to see if they can improve system operation.

4.3 Generating Broad Alternative solutions

Feasibility analysis begins once the goals are defined and agreed upon. It starts

by generating broad possible solutions, which are used to give an indication of what

the new system should look like. At this stage there is no need to go into the detailed

system operation.

The solution should provide enough information to make reasonable estimates

about project cost and to give users an indication of how the new system will fit into

the organization.

The judgment of what a broad solution is and how it differs from a detailed one,

or what to include is very subjective. However, two things must be kept in mind

when proposing a broad solution. It should give people a good idea of what the new

system will look like and also convince them that it will work. Another objective of

the broad solution is to form an estimate of cost. To do this, a broad solution should

include the major parts of a proposed system.

4.3.1 Components of a Broad Solution

It is generally agreed that a broad solution should include any additional

equipment that will have to be purchased for the project, in order to estimate some of

the direct costs of the project.

The broad solution should also specify what is to be done by the computer and

what will remain manual. From this we can find the amount of data to be stored on

the machine and any transmission costs of getting data to and from machine.

The solution should specify the information that will be made available by the

system, only broad descriptions need be given here.

Usually, to investigate all possibilities, a number of broad solutions are proposed

and evaluated to find the best solution. One guideline here is to start with three

alternatives, namely:


A totally automated system;

A system with minimal automation; and

A system somewhere in between.

One should not, however, become extreme here and propose alternatives that

obviously cannot be supported in the organization. The organization has certain

constraints on the amounts of available finds and personnel skills, and on working

and accepted standards. No proposed solution should obviously exceed such funding

limits or ignore some critical system operation or data need.

4.4 Evaluating the Proposal

Once proposals are generated they are evaluated. Three things are done in the

evaluation:

The technical feasibility of a proposal is evaluated. Is the technology needed for

the proposed system available? If it is available, can it be used in the

organization?

The operational feasibility is looked at. Can the proposed system fit in with the

current operations? Does it provide the right information for the organization’s

personnel? Does this information arrive at the right place in time?

The economic feasibility is to be considered. Is it worthwhile to pursue the

project? Or will the amount of money needed to implement it never be recovered?

4.5 Economic Feasibility

To carry out an economic feasibility study, it is necessary to place actual money

values against any purchases or activities needed to implement the project. It is also

necessary to place money values against any benefits that will come from a new

system created by the project. Such calculations are often described as cost-benefit

analysis.

4.5.1 Cost-benefit Analysis

Cost-benefit analysis usually includes two steps. One is to produce the estimates

of costs and benefits. The other is to determine whether the project is worthwhile

once these costs are ascertained.



4.5.1.1 Producing Costs and Benefits

The goal is to produce a list of what is required to implement the system and a

list of the new system’s benefits.

Cost-benefit analysis is always clouded by both tangible and intangible items.

Tangible items are those to which direct values can be attached (e.g. the purchase of

equipment, time spent by people writing programs or items such as insurance costs or

the cost of borrowing money). Some tangible costs often associated with computer

system development are:

1. Equipment costs for the new system. Various items of computing equipment as

well as items such as accommodation costs and furniture are included here.

2. Personnel costs. These include personnel needed to develop the new system and

those who will subsequently run the system when it is established. Analysts,

designers and programmers will be needed to build the system. Also included are

any costs incurred to train system users?

3. Material costs. These include stationary, manual production and other

documentation costs.

4. Conversion costs. The costs of designing new forms and procedures and of the

possible parallel running of the existing and new systems are included here.

5. Other costs. Sometimes consultants costs are included here together with

management overheads, secretarial support, travel budgets and so on.

Intangible items on the other hand are those whose values cannot be precisely

determined and are the result of subjective judgement. For examples, how much is

saved by completing a project earlier or providing new information to decision-

makers? Considerable argument can take place before agreement is reached on such

intangible costs.

The sum value of costs of items needed to implement the system is known as the

cost of the system. The sum value of the savings made is known as the benefit of the

new system. Once we agree on the costs and benefits we can evaluate whether the

project is economically viable.

The cost estimates are usually used to set the project budget. Often it is

convenient to divide these costs into project phases to give management an idea of

when funds and personnel will be needed. The cost estimates need to be worked out

very carefully. One should avoid any omission from the estimates necessitating

requests for more funds because something was forgotten.


4.5.1.2 Determining Whether a Project is Worthwhile

The costs and benefits are used to determine whether a project is economically

feasible. There are two ways to do this, the payback method or the present value

method.

The Payback Method

The payback method defines the time required to recover the money spent on

the project. Its idea is quite simple. We know how mach a project will cost and the

saving for each year after the project is implemented. The computation to determine

the number of years needed to recover the cost is quite simple. As an example,

consider the Table, which show the cost of implementing a project in the year 0 to be

$111,000. Benefits from the project are obtained over the next five years, with the

actual benefits also shown in the Table. If you add the values in the benefit column,

you will note that the original $111,000 invested is returned some time towards the

end of year 3. Hence the payback period is about 2.9 years.

Year Cost Benefit

0 $110,000

1 - $20,000

2 - $40,000

3 - $60,000

4 - $30,000

5 - $10,000

The present value method

The payback method is not always the best way to determine economic

feasibility. It does not seem to make much sense to put, say, $10,000 into a project

and recover $12,000 in year 5. You might as well put the $10,000 into an investment

account at 10 percent and get $16,115 in that time. Given an investment rate of 10

percent, one would have to recover more than $16,115 in year 5 (assuming no returns

in Year 1 to 4) to make the project worthwhile. This is where the present value

method comes in.

The idea of the present value method is to determine how much money it is

worthwhile investing now in order to receive a given return in some years time. The

answer obviously depends on the interest rate used in the evaluation. Thus, the

present value of $16,115 in year 5 is $10,000 today at 10 percent interest.


To some extent the present value method works backwards. First, the project

benefits are estimated for each year from today. Then, we compute the present value

of these savings. If the project cost exceeds the present value the project is not

worthwhile.

Let as look at the Table to see whether the project is worthwhile at a discount

rate 10 percent. To do this we find the present value of the benefit at each year. The

formula is:

Present value at Yearn = Benefit at Yearn / (1 + r/100)

n

We call 1/ (1 + r/100)

n the discount factor in Table

Year Cost Benefit Present value

of Benefit

Discount factor

(11%)

0 $110,000

1 - $20,000 $18,180 .909

2 - $40,000 $33,040 .826

3 - $60,000 $45,060 .751

4 - $30,000 $13,660 .683

5 - $10,000 $6,210 .621

Total amount $122,980

Thus, for example, the present value of $40,000 benefit at Year 2 is computed as:

Present value at Year2 = Benefit at Year2 / (1 + 10

/100)2 = 33,040

Note that the sum of the present value in the Table is $122,980. As this exceeds

$110,000, the project is worthwhile.

You may wish to compute the present value of the benefits in the Table at 15%

interest.


4.6 Selecting an Alternative

Ideally we could claim that a detailed cost analysis should be made for each

alternative. This is not usually the case. Some alternatives are quickly ruled out on

technical or operational ground.

4.7 Defining a Project Plan

Usually the last step of feasibility analysis is to draw up a project plan. The tasks

to be done have now been established and it is necessary to determine their sequence.

One choice, which needs to be made, is which problem-solving cycle to use. Do we

use a linear cycle, a staged cycle or some other cycle?

Once a decision to use a staged approach is made, we specify the times and

resources needed for each phase of each stage.


Chapter Five

Data Flow Diagrams

5.1 Introduction

The Data Flow Diagram (DFD) is one of the most important tools used by

systems analysts, to model system components, these components are:

The system processes,

The data used by these processes,

Any external entities that interact with the system, and

The information flows in the system.

5.2 Definition

DeMarco (1978) define a data flow diagram as:

“A Data Flow Diagram is network representation of a system. The system may

be automated, manual, or mixed. The Data Flow Diagram portrays the

system in terms of its components indicated.”

CREDITED-

INVOICE-

COPY

BANK-

DEPOSIT

CHECK

PAYMENT

CREDIT

INVOICE

DESPOSIT

FUNDS

RECORD

PAYMENT RECEIVABLES

FILE

ARCHIVE

FILE


5.3 Data Flow Diagram Elements

Data Flow Diagrams are made up of only four basic elements:

1. Data flows; represented by named vectors.

2. Processes; represented by circles or “bubbles”.

3. Files; represented by straight lines.

4. Data sources and sinks; represented by boxes.

We can read this data flow diagram as follows: -

“X’s arrive from the source S and are transformed into Y’s by the process P1 (which

requires access to the file F to do its work). Y’s are subsequently transformed into Z’s

by the process P2.”

5.2.1 Data Flow

Data flows model the passage of data in the system and are represented by

lines joining system components. An arrow indicates the direction of the flow and the

line is labeled by the name of the data flow.

Most data flows move between processes, but they can just as well flow into or

out of files, and to and from destination boxes and source boxes respectively.

Y

F

Z

X

S

P1

P2

CHECK

CREDITED-

INVOICE-

COPY PAYMENT

CREDIT

INVOICE

Pocket of

information

CUSTOMER


For a set of notational conventions, consider the following; they may not be

exactly universal, but they are serviceable nonetheless.

Data flow names are hyphenated.

No two data flows have the same name.

Names are chosen to represent not only the data moves, but also what we know

about the data.

B

ACTION-

REPORT

TROUBLE-

REPORT

A

STOCK-REPORT

EMERGENCY-

REORDER-TICKET

STOCK

CONTROL

PURCH-ASING

REJECT-ACCOUNT-

NUMBER

ACCOUNT-

NUMBER

VALID-ACCOUNT-

NUMBER

MAIN ACCOUNTS

FILE

CHECK

ACCOUNT

NUMBER


Data flows could be converge and diverge.

The data flows moving into and out of files do not require names. All other data

flows must be named.

A few words about what a data flow is not:

A data flow is not a representation of flow of control.

A data flow is not an activator of a process.

A data flow should not split into a number of other data flows.

How shall we represent flow of control and movement of control items on the DFD?

We shall not, we will defer specifying this information as long as possible.

GET-

NEXT-

CARD

CARD

REJECTED-

CARD

ACCEPTED-

CARD

CHECK

CARDS

ACCTG

DEPT.

DAY-OF-

THE-WEEK

TIME-

SHEET

PAYROLL

EMPLOYEE

RECORS

PRODUCE

PAYROLL


5.2.2 The Process

Processes invariably show some amount of work performed on data. Well-

chosen processes, will always have the characteristic of being completely named in

terms of their inputs and outputs, i.e., they are transformations:

A process is a transformation of incoming data flow(s) into outgoing data flow(s).

The most common notational convention is to represent processes by circles

(bubbles) on DFD. Some people use oval bubbles; SADT and certain other structured

analysis conventions use square bubbles.

Regardless of its shape, each bubble needs a descriptive name, the name must

give the user a general idea or we will not have seceded in conveying the big picture

with the DFD. In a completed set of DFDs, each process will be given a unique

number. The numbering convention will depend on how the various diagrams

interrelate.

LOSS

PROFIT SALES-

DATA

COMPUTE

INCOME

MISSPELLED-

WORDS

CORRECTLY-

SPELLED-WORDS WORDS

CHECK

SPELLING

WORD-LIST



5.2.3 The File

A file (or data store) is a temporary repository of data. Processes can enter

data into file or retrieve data from the file. A straight line represents each file with the

file name in close proximity in the DFD.

5.2.4 External Entities: The Source or Sink

External entities are outside the system, but they either supply input data into

the system or use the system output.

External entities that supply data into a system are sometimes called source.

External entities that use system data are sometimes called sinks. By convention,

sources and sinks are represented on DFD by named boxes.

PARTS MASTER

ARRIVALS-

REPORT NEW-PARTS-

RECORD UPDATE

PARTS

MASTER

CONTEXT

BOUNDARY

CLIENT

BANK

COMM.

CORRESP.

BANK

FED.

RSRV.


5.3 Guidelines for drawing DFDs

How do you draw one? As you become adept at it, you will find those different

problems call for different methods, but the following is a good general-purpose

approach:

1. Identify all net input and output data flows. Draw them in around the outside of

your diagram.

2. Work your way from inputs to outputs, backward from outputs to input, or from

the middle out.

3. Label all the interface data flows carefully.

4. Label the bubbles in terms of their inputs and outputs.

5. Ignore initialization and termination.

6. Omit details of trivial error paths (for now).

7. Do not show flow of control or control information.

8. Be prepared to start over.

5.4 Leveling Data Flow Diagrams: (Top-down analysis)

Leveling allows an analyst to start with a top-level function and elaborate it in

terms of its more detailed components. These detailed components are modeled as

lower-level DFDs. We starts with a Context diagram, which was then elaborated in

terms of top-level system functions. Lower-level DFDs. Finally can then elaborate

each of these functions further, the functions are described by process logic.

Apart from supporting a natural top-down problem-solving process, leveling

improves the readability of the data flow diagram. One ought to be able to look at a

DFD and, from it, understand what the system is doing. However, each level of a

DFD is sufficiently small to be clearly understood.

5.4.1 Elements of a leveled DFD set

The leveled set of DFDs is made of a top, a bottom, and middle. The top is a

single diagram called the Context Diagram. The bottom consists of a set of un-

partitioned bubbles, called Functional Primitives. The middle is everything else.


5.4.2 The Context Diagram

The context diagram is a “de-partitioned” version of the top- level, it shows

only the net inputs and outputs. It serves only one purpose, but that is an important

one: to delineate the domain of our study.

5.4.3 Functional Primitives: Bottom-level

Functional primitives are bubbles, which are not further decomposed into

successively lower-level networks. Our decision to call something a primitive implies

that it can not be further decomposed, or that we have arbitrarily decided not to

partition further because our requirement is satisfied. Figuring out which are the

functional primitives of our system and how they interrelate is the goal of our DFD

effort.

5.4.4 The Middle Levels

Middle-level is portrays the breakdown of some or all of an area into a

network of components, which must themselves be broken-down further. In general,

these will be several middle levels, sometimes as many as eight or nine. In very

trivial example, we might have no middle at all.

5.5 Leveling Conventions

Although leveling is a very useful tool, a number of conventions must be

observed to ensure that no information is lost as a DFD is leveled. These conventions

use documentation practices that assist analysts maintain consistency between data

flow diagrams at different levels.

C

B

A

EXT1

EXT2

SYSTEM



5.5.1 Numbering

One important convention is numbering. The top-level diagram is usually

given the number 0. Processes in a top-level DFD are numbered consecutively,

starting with 1 and continuing until all processes have been labeled. As each process

is leveled, its DFD is given the same number as the process. Each process in the

leveled DFD receives a number made up of its diagram number, followed by a period

(i.e.”.”), followed by a number within the leveled DFD.

5.5.2 Balancing

Another important leveling convention is data flow balancing. Fundamentally,

balancing requires that all the data flows entering a process are the same as those

entering its leveled DFD. Similarly, all the data flows leaving the process are the

same as those leaving its leveled DFD.

5.5.3 Local Files

Leveling can also introduce data stores, which are local to the leveled

diagram. Hence, this data store dose not appears in the top-level DFD but only in its

leveled diagrams.


A

L

C Diagram 1

X M L

EXT1 DS1

1.1 1.3

1.2

DS2C Diagram 3

Z

G F DS1

3.1 3.3

3.2

DS2

V

Diagram 0

Y W

V

Z

B X

C

A

DS1

EXT1

EXT2

3

4 2 5

1

Diagram 2

W

V

Q

P

Y

2.1

2.3 2.2

Diagram 2.2

Q

O

V P 2.2.1

2.2.2

Top-Level DFD

B

C

A

EXT1

EXT2

SYSTEM

Context Diagram


5.5.4 External Entities

External entities are never part of a process and analysts should avoid

introducing new external entities at lower data flow diagrams. All external entities

should appear on the context diagram. They should also appear with a DFD if any

process in the DFD has a flow to or from the external entity.

5.6 Limitations and Additional Capabilities of DFD

As stated earlier, a data flow diagram shows:

Partitioning of the systems into subsystems.

Data flows in the system.

Data stores and in-flowing and out-flowing data.

External entities, that is sources and sinks of the system.

On the other hand, a data flow diagram normally does not show:

Composition of data flowing in the system.

Data access requirements of data stores.

Decisions in the system.

Loops in the system.

Calculations.

Quantities for data and/or processes.

5.7 Major Reasons for using DFDs

The major reasons for using DFDs may be summarized as follows:

1. They help systems analysts:

To summarize information about the system,

To understand the key components of the system and to define reusable

functions,

To understand the relationship between subsystems and subassemblies,

To effectively carry through development of the application.

2. Recall the fact that communication between user(s) and system analyst(s) is

vital in the development of an information system.

3. Examining the timing requirements of the various processes in the DFD as a

guide, the system analyst is able to draw a number of automation boundaries

to develop physical system alternatives.


Chapter Six

Structure Charts

6.1 General

a structure chart reveals both the modular structure of a system (i.e., it partitions into

modules) and the hierarchy into which the modules are arranged. In addition, a

structure charts also shows the data and control interfaces among modules. It doses

not, however, show the decision structure of a system except the major decision(s).

6.2 Structure Chart Elements

The elements of a structure charts-namely:

Modules,

Connections, and

Couples.

Each module in a structure chart is represented by a rectangle identified by a

functional module name. The module name consists of a descriptive verb and a

single, non-plural object (Figure 6.1a).

Connections and couples are the other two basic elements of structure charts. A

vector joining two modules represents a connection, and it indicates any reference

from one module to something defined in another module. Usually it is used to call

other module(s) from a module. A short arrow with a circular tail represents a couple.

A couple is used to indicate a data item or control element that moves from one

module to another. An arrow with an open circle shows data communication, and

with a solid circle shows control communication (Figure 6.1b and c).

In addition to the above elements, semicircle and diamond shapes are used to

represent a major loop and a major decision in a module, respectively. These symbols

should be used to represent only the most critical loop(s) and decision(s) in the

system to minimize the crowding of the structure chart (Figure 6.1d and e).

Figure 6.1 – Structure Chart Symbols

(a) Module (b) Connection (e) Decision (c) Couple (d) Loop


6.3 Structure Chart: An Example

The elements of structure chart and their respective symbols are summarized as

Figure 6.2.

A module named “Prepare payroll”

Module A calls module B. although the

arrow is unidirectional, it is understood

that the called module returns control to

the calling module. The connection arrow

is drawn from the edge of one module

rectangle to the edge of the other

Module A calls module B and data

element P passes from A to B; data

element Q passes from B to A.

Control element “Flag” is sent from B to

A.

Module A call module B based on a

major decision in a module A. Module A

also calls module C in a major loop of

module A.

Figure 6.2 – Summary of Structure Chart Symbols

Prepare

payroll

A

B

Flag

P Q

A

B

C

A

B


Figure 6.3 shows an example of a payroll preparation function where employee

pay record retrieval, gross pay computation, net pay computation, and pay check

printing are considered to be the main modules.

Figure 6.3 - Structure Chart example

Total

deduction Deduction details

(tax, social security deduction,

health insurance, etc.)

Gross pay

Employee number

Employee name

Employee pays Pay check

Net pay Gross pay

Employee

pays record

EOF

Get

Employee

Pay record

Compute

Gross pay

Compute

Net pay

Compute

deductions

Print

Pay check

Prepare

Payroll


6.4 Final Remarks About Structure Charts

Structure charts not only illustrate hierarchical modular structures but also

indicate the module functions and the data and control communications between

modules. They also indicate the fact that higher level modules will be called before

lower level ones. Although structure charts do not permit the documentation of

detailed decision information, the sequence of execution is implied in reading from

left to right.

6.5 Summary

A structure chart shows the modular structure of a system. Inherent in the

concept of modular structure is the partitioning of a system into modules or

components, the hierarchy into which the modules are arranged, and the interfaces

among them. The basic symbols of a structure chart are rectangles to name modules;

arrows to represent connections between modules, and couples as a short arrow with

a circular tail to show data and control elements communicated between modules.

Rarely some additional symbols, such as a semicircle or diamond shape, are also used

to show major loop(s) and major decision(s), respectively.



Chapter Seven

DESCRIBING DATA

7.1 Introduction

An important aspect of systems analysis is the analysis of system data. Data

analysis is a more difficult subject than data flow diagramming. A DFD almost looks

like a system. A data model, however, is often more abstract and difficult to relate to

actual system components.

Data modeling is made up of more than one level and different ideas are used at

each level. At the first level, analysts develop a conceptual model of data. This model

represents the major data objects and the relationships between them. The next level

organizes this model into a good shape. Usually this means removing redundancies, a

process often called normalization, which uses ideas from relational theory. This

normalized model is then converted to a physical database.

7.2 Conceptual Modeling

A conceptual model describes the essential features of system data. Just like a

DFD, a conceptual model consists of a number of symbols joined up according to

certain conventions. We will describe conceptual modeling using symbols from a

modeling method known as entity-relationship analysis. This method was first

introduced by Chen in 1976 and is now widely used.

7.3 Entity-Relationship Analysis

It uses three major abstractions to describe data, these are:

Entities, which are distinct things (or data objects) in the enterprise;

Relationships, which are meaningful interactions between the objects; and

Attributes, which are the properties of the entities and relationships.

The idea of E-R modeling is to group similar objects or thing into entity sets.

The most common system entities are distinct physical things in the organization,

thing such as persons, parts, projects and so on. Thus, all persons would fall into one

entity set named PERSONS, all parts into one entity set named PARTS and so on. In

any case, you would not place persons and parts into the one entity set.

After defining entity sets, we look at how entities in the entity sets can interact

with each other and model this interaction by relationship sets. Thus, once we find

out that persons work on projects, we would add a relationship set between the

PERSONS and PROJECTS entity sets and give this relationship an appropriate name,


in this case, WORK. A more concise representation is to use E-R diagram, as shown

in Figure - 7.1.

Figure 7.1- E-R diagram

After the entity and relationship sets are identified, the next step is to determine

the attributes (or properties) of objects in the sets. There are alternative ways of

showing attributes on E-R diagram. You will find such alternatives in practice.

In addition, we usually choose one of the attributes of an entity or relationship

set to be identifier. The values of the identifier identify unique entities in the entity

set. The identifier attributes are underlined in the E-R diagram (Figure 7.2). Choosing

identifiers for relationship sets is somewhat more complex.

PERSON-ID

NAME

ADDRESS

PERSON-ID

PROJECT-ID

TIME-SPENT

PROJECT-ID

START-DATE

BUDGET

Figure 7.2 – Adding attributes and underline identifiers

The relationship identifiers are to use the identifiers of the entities that

participate in the relationship. In most cases it is necessary to know the value of both

of these identifiers to identify a unique relationship.

PERSONS

PROJECTS

WORK

PERSONS

PROJECTS

WOR

K


7.3.1 Modeling Relationship Cardinality

One property often shown on E-R diagrams is relationship cardinality, which

specifies the number of relationships in which an entity can appear. An entity can

appear in:

One (1) relationship;

Any variable number (N) relationships; and

A maximum number of relationships.

PERSON-ID

NAME

ADDRESS

PERSON-ID

PROJECT-ID

TIME-SPENT

PROJECT-ID

START-DATE

BUDGET

Figure 7.3 - “Adding cardinality”

As you see in the Figure 7.3 a person can appear in more than one WORK

relationship and so can a project. The letters N and M shows this on the E-R diagram.

These letters stand for variables and can take any value. If there was a limit to the

number of times an entity can take part in the relationship, then N or M would be

replaced by the actual maximum number. N and M are used to show that entities in

the different sets may participate in a different number of relationships.

The next Figure 7.4 illustrates a 1:N relationship. Here a project has one (1)

manager, whereas a manager can manage any number (N) of projects.

Figure 7.4 – A 1:N Relationship

N

M

PERSONS

PROJECTS

WOR

K

1

N

MANAGERS

PROJECTS

A manager may manage

up to N projects

A project is managed by

at most one manager

MANAGE


7.3.2 Modeling Relationship Participation

E-R diagrams often specify the number of participation of entities in

relationship. The participation of entities in a relationship set can be:

Mandatory,

Optional, or

Conditional.

If all entities in a given set must appear in at least one relationship in a relationship

set, then, their participation in the relationship set is mandatory. If an entity need not

appear in the relationship set, then its participation in the relationship set is optional.

Conditional participation means that entities can only participate in a

relationship if they have some specific attribute values.

In an E-R diagram, relationship participation is illustrates as a small circle on the

link next to the entity set. There is no mark on the link for mandatory participation. A

filled in circle on the link represents conditional participation.

Figure 7.5 – Relationship Participation

7.3.3 Binary and N-ary Relationships

So far all the relationships in our E-R model have only had two participating

entities. For example, relationship set WORK only had two participating entity sets,

PERSONS and PROJECTS. It is also possible to have relationship sets with three or

more participating entity sets. Relationships in such relationship sets are known as N-

ary relationships. The Figure 7.6 – is an E-R diagram of an N-ary relationship set,

EXAMINES. This relationship has three participating entity sets, APPLICATIONS,

EXAMINERS and TESTS. This relationship set models TESTS made by EXAMINERS on

APPLICATIONS. There can be a number of tests made on each application. Each test

PERSON-ID

NAME

ADDRESS

M

N 1

PERSON-ID

PROJECT-ID

TIME-SPENT

CONTROCTOR-NAME

CONTROCTOR-ADDRESS

PERSONS

PROJECTS

WORK USE

EXTERNAL

CONTRACTORS

PRJECT-ID

CONTROCTOR-NAME

PROJECT-ID

START-DATE

TYPE

N


identified by TEST-TYPE can be made by a number of examiners, who are identified

by EXAMINER-NO.

a) N-ary relationship

b) Equivalent binary relationship

Figure 7.6 – Converting an N-ary Relationship to Binary Relationship

TEST-TYPE

TEST-DESCRIPTION

EXAMINER-NO

EXAMINER-NAME

DATE-APPOINTED

APPLICATION-NO

APPLICATION-NAME

APPLICATION-DATE

TEST-RESULT

DATE-COMPLETED

APPLICATION-NO

EXAMINER-NO

TEST-TYPE

TIME-SPENT-BY-EXAMINER

APPLICATIONS

EXAMINERS

TESTS

EXAMINE

N

1

1

N M

N

TEST-NO

TEST-TYPE

TEST-TYPE

TEST-DESCRIPTION

APPLICATION-NO

APPLICATION-NAME

APPLICATION-DATE

EXAMINER-NO

EXAMINER-NAME

DATE-APPOINTED

EXAMINERS

APPLICATIONS

TESTS

EXAMINATIONS

TEST-NO

EXAMINER-NO

TIME-SPENT-BY-EXAMINER

TEST-NO

TEST-RESULT

DATE-COMPLETED TEST-NO

APPLICATION-NO

CONDUCT ON

USING


There are a number of disadvantages in using N-ary relationships. First, you

cannot specify cardinality on the links. If there are N examiners of one application

and 1 examiner for each test, then, the link between EXAMINERS and EXAMINE

would have to be labeled both 1 and N. secondly N-ary relationship sets are often not

in BCNF.

Apart from these specific disadvantages, many people argue that too many

participating entities cloud the nature of the interaction. Perhaps when you have three

interacting entities you can still see what is going on from the E-R diagram.

However, when you get to five, six or more interacting entities, the representation

may not be semantically clear, for this reason, there is some thing to be said for

building models where all relationships have, at most, two interacting entities.

7.3.4 Converting an N-ary Relationship to a Binary Relationship

In practice, it is easier to begin by including N-ary relationships in the E-R

model. Once the initial E-R model is constructed, all the N-ary relationship sets are

replaced by relationship sets that are linked to at most two entity sets. There is a

standard method for this conversion. In the standard method, the N-ary relationship is

replaced by an entity set. Thus, the N-ary relationship set EXAMINE in Figure_7.6(a)

becomes the entity set EXAMINATIONS in Figure_7.6(b). The entity set, which

interacted in relationship set EXMINE, now interact in separate relationships

(CONDUCT, USING and ON) with entity set EXAMINATIONS.

It is common to give a special identifier to the newly created entity set. In

Figure-7.6(b), the name of the new identifier is TEST-NO. some attributes are also

moved during conversion. For example, some attributes of the N-ary relationship now

become attributes of the newly created entity set or attributes of the new binary

relationships. Thus, TEST-RESULT remains an attribute of the newly created entity set.

Other attributes such as TIME-SPENT-BY EXAMINER, however become an attribute of

relationship CONDUCT because they are dependent on the activity of the examiner of

the test and not on the test as a whole.

7.4 Entity Modeling

One popular alternative to E-R modeling is entity modeling. Entity modeling

does not distinguish between entity sets and relationship sets. Instead, everything is

treated as entity sets.

The difference between an E-R model and an entity model is that, the E-R

diagram (Figure 7.7:a) shows a number of entities set, also there are a number of

relationship sets. In an entity diagram (Figure 7.7:b), everything is treated as entity


sets. The links between the sets in the entity model are called associations and are the

same as those in the E-R model.

a) E-R model

b) Entity model

Figure 7.7 – Comparing the E-R and Entity Models

You will also find that cardinality is labeled differently on the entity model.

Cardinality shows how many times an entity in an entity set is associated with entities

in another set.

You will find that data modeling using entity analysis proceeds in the same way

as E-R modeling. We identify the entities and then add attributes to them, although it

is no longer necessary to distinguish between entity sets and relationship sets. Any

group of logically related attributes becomes an entity set.

7.5 The Next Level of Analysis

A conceptual model or an E-R model of a system is the first level of data

analysis. But data modeling is not finished at this stage. To begin system design we

N

1

1 1

1

1 N N

PROJECTS PROJECT-NO

START-DATE

ORDERS

MAKE PROJECT-NO

ORDER-NO

ORDER-NO

ORDER-DATE

FOR

PARTS PARTS-NO

COLOR

PARTS-NO

ORDER-NO

QTY-ORDERED

N

M

N

1

PROJECT-NO

ORDER-NO

PROJECT-NO

START-DATE

PROJECTS

ORDERS

PARTS

ORDER-NO

ORDER-DATE

PARTS-NO

COLOR

PARTS-NO

ORDER-NO

QTY-ORDERED

MAKE FOR


have to get the data model into shape by removing any redundancies from it. To do

this, we use relational theory and replace each set in an E-R model by a table or

relation (Figure 7.8). Thus, each entity set and each relationship set becomes a table

or relation. The name of the set becomes the table name and the attributes of the set

become the table columns.

Figure 7.8 - Replace each set in an E-R model by a table or relation

You should note that entities in an entity set become rows in the relation that

represents the entity set. Instead of tables we can use relational notation which only

shows the relation name and its attributes, as:

PERSONS (PERSON-ID, NAME, ADDRESS);

WORK (PERSON-ID, PROJECT-ID, TIME-SPENT);

PROJECTS (PROJECT-ID, START-DATE, BUDGET);

There are two reasons for converting the E-R model to a set of relations. First,

such a conversion is a convenient step for going from an E-R model to a set of files.

Each table eventually becomes a system file. The other reason is more important.

There is a theory based on sound mathematics that specifies how we should construct

tables to avoid data redundancies. What we are after is a set of ‘normal’ relations. If

we organize the data into a set of such relations, we are guaranteed a good data

design.

PERSONS

PERSON-ID # NAME ADDRESS

PX1 Jackson London

PX2 Mary Liverpool

P25 Joe London

WORK

PERSON-ID # PROJECT-ID # TIME-SPENT

PX2 Proj3 30

PX1 Proj2 15

P25 Proj2 40

PX2 Proj5 30

P25 Proj3 75

PROJECTS

PROJECT-ID # START-DATE BUDGET

Proj3 1/3/98 50

Proj2 15/2/99 30

Proj5 1/11/99 60

ID-PERSON NAME ADDRESS

ID-PERSON ID-PROJECT

TIME-SPENT

ID-PROJECT START-DATE BUDGET

PERSONS

PROJECTS

WORK


Chapter Eight

Normalization

8.1 General

A number of normal forms have been defined for relations. They are commonly

known as:

First normal form (1NF),

Second normal form (2NF),

Third normal form (3NF),

Bayce-Codd normal form (BCNF); sometimes known as the optimal normal

form (ONF),

Forth normal form (4NF), and

Fifth normal form (5NF).

Database designers must ensure that their relations are in the highest normal form.

It should also pointed out that, although much of the work on relations has been

of a mathematical nature, the beauty of the relational model is that the results of this

mathematical work can be explained in terms relevant to practical database design.

There are two reasons for this. First, there is the idea of flat files or relations. Normal

relations must be flat, that is, all the column values must have simple values and

cannot be groups of values. Flat file structures are both easier to access and to change

in order to meet new user requirements. The second important criterion deals with

redundancy – normal relations store each fact once only.

8.2 Normal Form and Non-normal Form Relations

The difference between relations in normal form and those that are non-normal

form is that, the attributes for the relation in non-normal form dose not have simple

values, but in normal form, attributes can only have one single value for one raw,

thus, are said to have simple values.

Relations that are not in normal form can always, be normalized. To do this,

take each row in the unnormalized relation and look at the relation in the non-simple

column, combine each row of the relation in the non-simple column with the values

of other columns to make a row in the normalized relation. Repeat this combination

for every row of the non-simple column until all of these values become normalized.


Relations with only simple attribute values are in the first normal form (1NF) or are

normalized.

a) Non-normal form relation

ORDERS

ORDER-NO ORDER-DATE ORDER-LINE

Ord1 6 June 1994

Ord2 3 May 1994

b) First normal form relation

ORDERS

ORDER-NO ORDER-DATE PART-NO QTY-ORDERED

Ord1 6 June 1994 P1 10

Ord1 6 June 1994 P2 30

Ord2 3 May 1994 P5 10

Ord2 3 May 1994 P6 50

Ord2 3 May 1994 P2 30

Figure 8-1: Non-normal form and first normal form relations

The important advantage of normalized relation is that each attribute has the

same importance. This is useful when we convert the relation to computer software. It

then becomes possible to add an index on any attribute field and retrieve data using

any attribute as key.

Let us look at what characterizes higher relational forms. These forms must

ensure that there is no redundancy in the data. To describe such higher normal forms

we have to define functional dependencies and relation keys.

8.3 Functional Dependencies

Functional dependencies describe some of the rules that hold between attributes

in a system. In particular, they state whether a particular value of another attribute for

that relation. That is a way of saying that if we know the value of X then we can

determine a unique value of Y. A functional dependency is often expressed as: X Y

PARE-NO QTY-ORDERED

P1 10

P2 30

PART-NO QTY-RDERED

P5 10

P6 50

P2 30


We can now say that attribute Y is functionally dependent on attribute X.

Alternatively, we sometimes use the terminology that X determines a unique value of

Y or that X determines Y.

Functional dependencies often involve more than one attribute on the left-hand

side. Thus, to know the value of Z, we must know both the value of X, and the value

of Y. The functional dependency then becomes: X, Y Z

8.3.1 Derived Functional Dependencies

Sometimes, one functional dependency can be derived from other functional

dependencies. For example, X may determine one value for attribute Z and suppose

that each attribute of Z determines to one value of attribute Y. Given a value of X we

once know the value of Z and from the vale of Z we can derive the value of Y.

Mathematically we would say that if:

X Z, and Z Y

Then, X Y

It is up to the designer and analyst to ensure that there are no derived functional

dependencies.

8.4 Relation Keys

A relation key is a set of columns whose values select unique relation rows. The

idea of normal form relations center around functional dependencies and relation

keys. Informally, we would like the key attribute value of a relation to determine

unique vale of the non-key attributes. One important thing to remember about relation

keys is that they must apply for all possible contents of the relation.

8.5 Data Redundancies

Now let us consider a relation that stores some facts more than once. As

explained before, the relation key of some relations is made up of two columns, the

value of non-key attribute is determined by part of the relation key only and not by

the combination of both parts of relation key.

You should not that, in this case the non-key attribute is stored more than once.

When facts are stored more than once, a relation is no longer in the highest normal

form.

Functional dependencies and relation keys can used to define second, third and

optimal normal forms.


8.5.1 Second Normal Form

Relations where subset of key attributes determine non-key attribute values

may be first normal form but cannot be in second normal form. To be in second

normal form, a relation must not have a non-key attribute that is functionally

dependent on part of the relation key.

Relations that are not in second normal form can be decomposed into second

normal form relations. For example, the relation ORDERS can be decomposed into

two relations, relation ORDERS and ORDERS-CONTENTS.

You will note that in these relations non-key fields are determined by the whole

(and not part of) relation key. Relations with this property are said to be in second

normal form.

ORDERS ORDERS-CONTENTES

ORDER-NO # ORDER-DATE ORDER-NO # PART-NO # QTY-ORDERD

Ord1 6 June 1994 Ord1 P1 10

Ord2 3 May 1994 Ord1 P6 30

Ord2 P5 10

Ord2 P6 50

Ord2 P2 30

8.5.2 Third Normal Form

Finally, it is also possible to have non-key fields are functionally dependent

on other non-key fields.

Consider relation VWHICLES that is stores information about cars. The values

of the non-key attributes are determined by the whole relation key and the relation is

in second normal form.

REGISTER-NO # OWNER MODEL MANUFACTURE NO-CYLINDER

Yx-01 George Laser Ford 4

Yj-77 Mary Falcon Ford 6

Yw-30 George Corolla Toyota 4

Yj-37 Mary Laser Ford 4

Yj-83 Andrew Corolla Toyota 4

However, you may note that there are functional dependencies between the non-

key attributes. Relation VEHICLES to be in third normal form it should be

decomposed into two relations as REGISTRATION and VEHICLE1, each relation:

only contains facts about the relation key; and

have no facts between non-key columns.

REGISTER-NO # OWNER MODEL MANUFACTURE

RELATION-KEY

VEHICLES

RELATION-KEY

REGIST

RATIO

N


Yx-01 George Laser Ford

Yj-77 Mary Falcon Ford

Yw-30 George Corolla Toyota

Yj-37 Mary Laser Ford

Yj-83 Andrew Corolla Toyota

MODEL MANUFACTURE NO-CYLINDER

Laser Ford 4

Falcon Ford 6

Corolla Toyota 4

Laser Ford 4

Corolla Toyota 4

8.5.3 Optimal Normal Form

All the relations discussed so far had one relation key. Normal forms also

cover relations that have more than one relation key. The particular problem with

relation that have more than one key is that part of one key may be functionally

dependent on part of another key. For example, look at relation WORK, which has

two overlapping keys. These are called overlapping keys because each key has

PERSON-ID as a component.

PROJECT-ID PERSON-ID MANAGER TIME-SPENT

Proj1 J1 Vicki 30

Proj2 J1 Joe 12

Proj1 J2 Vicki 10

Proj2 J2 Joe 79

Proj3 J2 Nicky 17

Proj2 J3 Joe 3

However, using the strict third normal definition, relation WORK is in third

normal form. What is needed is a higher normal form – one that takes overlapping

keys into account. Such a higher normal form is the Boyce-Codd normal form

(BCNF). A relation R is in BCNF if for every functional dependency (X Y)

between any relation attributes, the attributes on the left-hand side are a relation key.

Relations where this property holds are in BCNF, which is sometimes called

optimal normal form. The relation WORK is not optimal. Here there is a functional

dependency: PROJECT-ID MANAGER, but PROJECT-ID is not a relation

key of WORK.

RELATION-KEY

VEHICLE1

RELATION-KEY 2

RELATION-KEY 1

WORK


Again a non-BCNF relation can decompose into a set of optimal relations. The

result of decomposing relation WORK is shown bellow:

PROJECT-ID MANAGER

Proj1 Vicki

Proj2 Joe

Proj3 Nicky

PROJECT-ID PERSON-ID TIME-SPENT

Proj1 J1 30

Proj2 J1 12

Proj1 J2 10

Proj2 J2 79

Proj3 J2 17

Proj2 J3 3

8.6 Normal Form Relations and Multi-Valued Dependencies

Normal forms up to and including the optimal form were based on functional

dependencies. There are also a number of problems that can arise with multi-valued

dependencies and higher normal forms have been developed to deal with them. A

multi-valued dependency exists in a relation when a value of one column or set of

column, X, determines a set of values in another column, Y, multi-valued

dependencies are often expressed by the notation: X Y

8.6.1 Fourth Normal Form

Consider relation PERSONS, which is in 3NF or optimal form, SKILL and

PROJECT-ID, are independent multi-valued dependencies of PERSON-ID. A

person’s skills are independent of the projects they work on, and the projects a person

works on are independent of their skill.

PERSONS

PERSON-ID SKILL PROJECT-ID

Jill Computing Proj1

Jill French Proj1


Jill French Proj3

Arnold French Proj1

Arnold Economics Proj1

George Computing Proj1


Therefore a person’s skills and person’s projects should be stored in separate

relations as shown in relation KNOWLEDGE and relation ASSGNMENTS bellow, which

is in 4NF.

RELATION-KEY

RELATION-KEY

PROJECTS

WORK1

Relation with dependent

multi-valued facts


KNOWLEDGE ASSGNMENTS

PERSON-ID SKILL PERSON-ID PROJECT-ID

Jill Computing Jill Proj1

Jill French Jill Proj3

Arnold French Arnold Proj1

Arnold Economics George Proj1

George Computing George Proj2

A relation is in 4NF if it does not contain independent multi-valued dependencies.

8.6.2 Fifth Normal Form

Fifth normal form (5NF) can be viewed as an extension of 4NF in the sense

that now the multi-valued dependencies are no longer independent. For example, let

as return to relation PERSONS. Suppose instead of containing information about a

person’s SKILL and PROJECT-ID, the relation also stores information about what

skills that the person uses in a given project. Thus suppose ‘Proj1’ only needs

‘Computing’ and ‘Economics’ skill but not ‘French’. The relation would now be as

shown in relation APPLYING-SKILLS bellow:

APPLYING-SKILLS

PERSON-ID SKILL PROJECT-ID



Jill French Proj3

Arnold French -

Arnold Economics Proj1



Relation APPLYING-SKILLS differs from relation PERSONS in that it does not

contain rows that include both ‘French’ and ‘Proj1’. This relation is in 4NF but still

has undesirable properties in that:

Some facts are stored twice;

There are blank fields (a row with a NULL value).

These undesirable properties arise because APPLYING-SKILLS contains

dependent multi-valued dependencies, that is, the value of SKILL, that is associated

with PROJECT-ID depends on the SKILLS needed by the project. A property of a

relation that is in 4NF but not in 5NF is that it cannot be decomposed into two

relations but must be decomposed into three relations. Thus to remove the

undesirable properties in relation APPLYING-SKILLS, we must decompose it into the

three relations shown bellow.

Relation with dependent

multi-valued facts


KNOWLEDGE ASSIGNMENTS

PERSON-ID SKILL PERSON-ID PROJECT-ID

Jill Computing Jill Proj1

Jill French Jill Proj3

Arnold French Arnold Proj1

Arnold Economics George Proj1

George Computing George Proj2

NEEDS-SKILL

Project-ID SKILL

Proj1 Computing

Proj1 Economics

Proj2 Computing

Proj3 Computing

Proj3 French


Chapter Nine

PROCESS DESCRIPTIONS

9.1 Introduction

So far we outlined methods used to describe two main system components, data

flows and data structure. Here we are going to discuss methods used to describe the

third component, processes.

One of the major reasons for using structured systems analysis is to remove

ambiguities from system descriptions. Such ambiguities frequently arise in natural

language descriptions of processes.

The main techniques proposed for describing processes are:

Structured English;

Decision tables; and

Decision trees.

Structured English put verbal descriptions into a logical structure, which

removes logical ambiguities.

The next two methods (decision tables and decision trees) are preferred where

one of a large number of actions is to be selected. The action selected depends on a

large number of conditions. Structured English is not usually used for this purpose

because the logic structure would become repetitive. The differences between the

three methods are illustrated as:

a) Using structured English

IF credit limit exceeded

THEN

IF customer has bad payment history

THEN refuse credit

ELSE

IF purchase is above $200

THEN refuse credit

ELSE refer to manager

ELSE allow credit


b) Using a decision table

Credit limit exceeded Y Y Y Y N N N N

Customer has bad payment history Y Y N N Y Y N N

Purchase is above $200 Y N Y N Y N Y N

Refuse credit X X X

Refer to manager X

Allow credit X X X X

c) Using a decision tree

Purchase is

above $200 Refuse credit

Good payment

history

Credit limit

exceeded

Purchase is

Below $200 Refer to manager

Bad payment

history

Refuse credit

Credit limit not

exceeded

Allow credit

9.2 Structured English

Structured English syntax is very similar to block structured languages. It

provides the keywords to structure process specification logic while leaving some

freedom when describing the activities and data used in the process. Process

specification logic consists of combination of sequences of one or more imperative

sentences with decision and repetition constructs.

9.2.1 Imperative Sentences

An imperative sentence usually consists of imperative verb followed by the

contents of one or more data stores on which the verb operates. Boolean and

arithmetic operations can be used in imperative statements. For example:

Add PERSONS-SALARY to TATAL-SALARY

9.2.2 Structured English Logic

Structured English uses certain keywords to group imperative sentences and

define decision branches and iteration. These keywords are:

BEGIN REPEAT IF

END UNTIL THEN

CASE WHILE ELSE

OF DO FOR


9.2.3 Grouping Imperative Sentences

A sequence of imperative sentences can be grouped by enclosing then with

BEGIN and END keyword. For example, process 2.3 in (Figure 9-2) can be defined as:

BEGIN

Receive ‘sale report’.

Get SALES record for PART-NO in ‘sale report’.

TOTAL_QTY = TOTAL_QTY + QTY_SOLD.

SALE_VALUE = QTY_SOLD + UNIT_PRICE.

TOTAL_VALUE = TOTAL_VALUE + SALE_VALUE.

Write SALES record.

Send ‘summary advice’.

END

Figure 9.2 – Editing records

This sequence defines what happens when a ‘Sale report’ is received. Process

‘Record sale’ maintains data store SALES which keeps an accumulated total of the

number of parts sold and total moneys received for each part kind.

Remember that Structured English is not a programming language and that

imperative statements are as brief as possible. There are no unnecessary move

statements such as:

MOVE CUSTOMER in ‘sale report’ to CUSTOMER in ‘summary advice’

Which would normally be found in a programming language. It is assumed that the

value of CUSTOMER remains the same as it passes through the process.

Summary advice Sale report

= PART_NO

+ TOTAL_QTY

+ TOTAL_VALUE

= CUSTOMER

+ SALE_VALUE

= CUSTOMER

+ PART_NO

+ QTY_SOLD

+ UNIT_PRICE

SALES

2.3

Record

sale


9.2.4 Decisions

Two types of decisions structure usually appear in Structured English, these

are:

1. Two way choice;

2. Multiple choice.

Two way choice shows a structure which allows a choice between two groups of

imperative sentences. The keyword IF, THEN and ELSE are used in this structure.

IF condition THEN

BEGIN Group A sentences

END

ELSE

BEGIN

Group B sentences END

Multiple choice structure allows a choice between any number of groups of

imperative sentences. The keywords CASE and OF are used in this structure:

Conditio

n

Group A

statement

Group B

statement

…..

Value

test

Group A

statement

Group Z

statement

Group B

statement

Group K

statement


The value of a variable is first computed. The groups of sentences executed depend

on that value. The process description is:

CASE [ name ] OF

A: BEGIN

Group A sentences END

B: BEGIN

Group B sentences END

Z: BEGIN

Group Z sentences END

Here ‘name’ is a variable and ‘A, B, …,Z’ are values that can be taken by ‘name’.

9.4.5 Repetition

Two ways for specifying iterations in Structured English, these are:

1. Use the WHILE … DO structure; here a condition is tested before a set of

sentences is processed.

2. Use the REPEAT … UNTIL structure; here the group of sentences is executed

first and then the condition tested.

WHILE condition DO

BEGIN

Group A sentences

END

Condition

Group A

sentences

Group A

sentences

Condition

REPEAT

BEGIN

Group A sentences

END

UNTIL condition


9.3 Decision Table

Although Structured English can describe most processes, such process

descriptions can become clumsy if we try to specify a process that selects one of a

possible set of actions using a set of complex rules. This particularly applies if it is

necessary to repeat a decision or to do the same process more than once to retain the

logic structure.

A better way of describing such logic is to use decision tables. A decision table

is first divided into two parts – the conditions and the actions. The condition part

states all the conditions that are applied to the data. The actions are the various

actions that can be taken depending on the conditions.

The table is constructed by using columns so that each column corresponds to

one combination of conditions. The entry in the column indicates the existing

condition.

In many decision tables there are only two values for a condition (i.e., True or

False). It is possible for conditions to take more than two values. Decision tables

where conditions take more than two values are sometimes called extended decision

tables.

The actions taken for the combination of conditions in the column are given by

the crosses in the column. If the action line is crossed then that action is taken, given

the set of column conditions.


Chapter Ten

DOCUMENTATION

10.1 Introduction

System documentation is an organized approach needed to keep track of system

models (i.e., work produced during system analysis and design). This organized

approach is needed for a number of reasons:

First, it provides means for referencing model components and thus help to manage

system complexity.

Secondly, it supports maintenance as any work can be passed from one person to

another.

Documentation is both a communication tool and a management tool. It is a

communication tool because it contains a repository of all work done to date and

makes it available to all persons working on a large project.

Documentation is also a management tool. First, it ensures greater efficiency as

all project personnel have access to the latest work and thus less work is repeated.

Second, it s the only project deliverable, especially in the early project phases. A

broad design and evaluation are the only outputs of the feasibility study, and system

models are the only results of systems analysis and broad system design. Thus we can

use documentation to determine project status. We know what documentation must

be provided at the end of each project phase and can measure phase progress by

estimating the proportion of phase documentation that has been completed. Finally,

documentation becomes part of phase output. This output includes the system model

as well as plans for subsequent project phases.

To be useful, documentation must be mandatory and most organizations

adopting formal design methods make documentation mandatory. Mandatory

documentation requires formal techniques to be used in systems analysis and design

and ad hockery is then avoided.

10.2 Documentation

There is more to documentation than just keeping diagrams in easily accessible

locations. Documentation becomes the standard for analysis and design work as all

work produced must satisfy documentation standards. As a standard, documentation

provides the guidelines for what is to be done next. It may include checklists to make

sure that no important task is omitted. It may also suggest sequences for doing things


and specify things to be done before each task. However, you should not see

documentation as a set of things to be done with no particular benefit. It is there to

help you.

To be useful, documentation must be structured and easy to use. As shown in

Figure 10.1, documentation in the first instance can be divided into project reports

and a system description. In most organizations, the system description is kept in a

system dictionary.

Figure 10.1- Documentation structure

Project Reports

Project reports include information required by project management. They are

the deliverables expected at the end of each phase. Project reports include general as

well as phase-specific information. General information includes a summary and

recommendation for the next phase. It also includes a plan for the next phase,

together with proposed resources for that phase.

Phase-specific information depends on the project phase. A project report at the

end of a feasibility study will include different components to that provided at the end

of systems analysis phase. The feasibility report will include expected project costs

and a recommendation to proceed or abandon a project. The report at the end of

systems analysis will describe the existing system and objectives for the new system.

Project reports also include parts of the system description or references to parts

of that description. In some cases, project reports may include high-level data flow

diagrams (DFDs) and a high-level E-R model.

Documentation

System

description Project

report

Phase-specific reports

Selected system documentation

Plans for next phase Summary and recommendation

Resource allocation

System dictionary


System Description

As shown in Figure 10.2, the structure of system dictionary is made up of a

number of components. One important component is the DFD that describes the

system. Another may be the E-R diagram that describes the system data. The process

description component describes system processes and the data components, which is

sometimes called the data dictionary. The user component describes system users

and how they use the system.

Figure 10.2 – System description structure

Process Description

Process description includes an entry for each process in a data flow diagram.

Each process entry includes the DFD number for the process together with the

process description. The process description itself will depend on the process level.

High-level processes are usually described in an informal manner. They may be

simple sentences that describe overall process goal. As shown in Figure 10.3, the

entry includes the process number and name. It also includes the name of the data

flows coming and out of the process and a narrative process description.

Low-level DFD descriptions are the same as those for the high-level process.

Except that in this case, however, is more formal. Instead of narrative, the process

description will now use structured English definitions or any of other process

description methods.

Data description (data dictionary)

System

dictionary

User description

Process

description Data flow

diagrams

System

users

E-R

model

Data

stores

Data

flows


DFD number: 3

Process name: Classify expenditure

Input data flows: Approved request

Output data flows: Recorded request

Data stores used: DEPT-ACCOUNTS, TYPE-ACCOUNTS

Description: Approved reports are classified into one

of a number of types

Method: Users examine paper requests and their type

is entered into the computer

Figure 10.3 – describing a high-level process

All the process entries are organized into their numeric sequence according to

which they are entered into the system dictionary.

Data Dictionary

The components of a data dictionary in structured systems analysis are the data

stores and data flows, also include the E-R model. Most data dictionaries also

describe data elements and data structures separately from the data stores and flows.

This saves repeating the data element descriptions every time they occur in a data

flow or store.

Instead of detailed data element descriptions, the data flow and store

components and the E-R diagram description include cross-references to the

components that they use. Figure 10.4 illustrates typical data dictionary entries and

cross-references between them.

Describing data elements

There is one entry in the data dictionary for each data element. The entry

describes any aliases or alternate names for the data elements. For example, in

Figure 10.4, PRODUCT-NO can be used to mean the same thing as PRODUCT-CODE.

The entry also includes the data element description, which includes the kind of

values that the element can take and the range of these values.

Describing structures

Data structures are combinations of data elements that appear in various parts of

the DFD. Most data dictionaries describe structures by hierarchies. For example,

the data structure described in Figure 10.4, is the INVOICE. The structure INVOICE

is made up of a number of lower-level structures, namely, INVOICE-HEADING,

INVOIVE-LINE and SUPPLIER-DATA. INVOICE-HEADING is made up of two data

elements, SUPPLIER and ORDER-NO.


Data Element: PRODUCT-CODE

Alias: PRODUCT-NO

Description: A five-character code. First two characters are alphabetic to indicate

class. The last two-characters are a number within the class.

Where used: INVOICE

Data structure: INVOICE

INVOICE-HEADING

SUPLIER

ORDER-NO

INVOICE-LINE*

PRODUCT-CODE

QTY

PRICE

SUPPLIER-DATA

Description: Standard format constructed from supplier invoice

Where used:

Data stores:

Data flows: SUPPLIER- INVOICES

Data flow: SUPPLIER-INVOICE

Source:

External entity suppliers

or process:

Destination:

External entity

or process: 1.3.1

Data structure: depends on supplier

Volume: 10/day

Physical description: Papers invoices, format depends on supplier

Data store: INVOICES

Contents: INVOICE + DATE-RECEIVED

+ DATE-PAID

Processes used by: 3.7 STORE INVOICE

7.2.1 RECORD PAYMENT

Physical description: Computer file

Size: Average of 20,000 records

Figure 10.4 – Data dictionary entries for structured system analysis

The illustration in Figure 10.4 uses a hierarchical description of structure. A

hierarchy is described by listing the data elements that make up a structure as a

hierarchy, for example:


ORDER

SUPPLIER-NO

DATE-ORDERED

DATE-REQUIRED

ORDER-NO

Here ORDER is the structure name. The order is made up of four data elements,

SUPPLIER-NO, DATE-ORDERED, DATE-REQUIRED and ORDER-NO. the indentation

of the item names under ORDER implies that they are components of ORDER. Of

course, most data structures are more complex than the ORDER structure shown

above. Item values may be repeated, and there may be optional or alternative

values and structures within structures.

An example showing structures within structures and repetition follows:

ORDER

SUPPLIER-NO

DATE-ORDERED

DATE-REQUIRED

ORDER-NO

ITEM-ORDERED*

ITEM-NO

QTY-ORDERED

Now ORDER contains the structure ITEM-ORDERED. Because an order can

contain many items the structure ITEM-ORDERED can be repeated many times. The

asterisk (*) after the structure name indicates that a structure or a data element can

be repeated.

The notation also allows specification of alternative or optional items in a

structure. This is done as follows:

ORDER

SUPPLIER-NO

SUPPLIER-NAME

DATE-ORDERED

DATE-REQUIRED

ORDER-NO

[ORDER-STATUS]

ITEM-ORDERED*

ITEM-NO

QTY-ORDERED

Here the curly brackets {} indicate alternative structures. Thus an order may

contain SUPLIER-NO or SUPLLIER-NAME but not both. The square brackets [ ]

indicate an optional component. Thus an ORDER may or may not contain the data

element ORDER-STATUS.


There are of course, alternative ways of describing structures. One such

alternative has been described by DeMarco. In this notation, structure components

are described by using the plus sign (+) instead of indentations. Thus ORDER is

now described as:

ORDER = SUPPLIER-NO + DATE-ORDERED + DATE-REQUIRED + ORDER-NO

Repetition is described by placing the repeating structure within curly brackets

{}; alternative structures are defined by square brackets [ ]and optional structures

by round brackets ( ).

Describing data stores and data flows

Data stores and data flows are made up of data structures and elements.

Consequently data store and data flow descriptions contain data structure and data

elements. The data stores and elements of a DFD diagram can be described using

any of the above notations. They also contain many other things. For example, as

shown in Figure 10.4, the data flow description contains the origin and destination

of the data flow. It also contains the physical description of the data flow, in this

case stating that it takes the form of paper invoices.

Data store descriptions also contain physical information such as the medium

used to store the data and where it is located.

Cross referencing

You should note the extensive cross-referencing between all entries in the

system dictionary. There is cross-reference between data element entries and data

structure entries with data element entries referring to the structures that contain the

data element. Similarly, any data element in a data structure must have a data element

entry. The cross-reference from the structure to the data element is by the name of the

data element.

Processes should reference the data stores and flows they use. Note that

references to data stores are necessary because it is possible to have unlabeled flows

between processes and stores. If we cross reference flows only, these unlabeled flows

would not appear in the process entry. Cross-references from processes to data stores

and flows let us know what data elements and structures the process uses.

Data stores include references to the data structures and elements that they use.

Data stores also include references to the processes that the data store. Note that the

reference is to the lowest order process. A cross-reference to process 1.3.1 also means

that processes 1.3 and 1 use the data store.

Data flows reference their source and destination. They also reference the data

elements and data structures that appear in the data flow.


Also note that cross-referencing is through the use of a unique set of names.

Thus there is only one entry for PRODUCT-CODE in the data dictionary. Any structure,

store or flow that includes PRODUCT-CODE also contains the name PRODUCT-CODE

in its description. This name serves as the cross-reference.

Cross-referencing improves system maintenance. Whenever a change is

proposed to one part of the system the repercussions of that change can be quickly

evaluated. Furthermore, all parts of the system related to the change will be amended

during the change, ensuring that no unforeseen errors arise when the change is

implemented.

The Entity-Relationship Diagram and The Data Dictionary

The E-R diagram is developed in parallel with DFD diagram. There are entries in

the data dictionary for each entity set and each relationship set.

There will be cross-references from these entries to the other entries in the data

dictionary. Any time an attribute is identified as part of the E-R diagram, it is added to

the data dictionary. Similarly, any data stores or flows that include a set in the E-R

diagram will include cross references to that set. The set in turn will include a cross-

reference back to the data stores or flows that contain the set.

System Users

System users can be individuals or organizational entries. The system dictionary

will include details of these users and their responsibilities. It will also describe the

data that the users access and the processes they use.

Maintaining A System Dictionary

The system dictionary can be manual or automated. It must include an entry for

each data element, data structure, process description, user, data flow and store. The

simplest way to maintain a system dictionary is to have a specific form for each type

of dictionary entry. Each entry described in Figure 10.3 and 10.4 is entered on to

paper sheet. This sheet is stored in a system manual, which is usually a loose-leaf

folder. New entries can be added to the manual, but change to an entry requires the

page for that entry to be replaced.

10.3 Automated Documentation

Manual system dictionaries are sometimes awkward to maintain. Every change

requires a form to be replaced and if there is more than one copy of the system

dictionary, this form must be replaced in each copy. Sometimes copies may not be

updated and inconsistencies can arise.


Many organizations are now beginning to use computer aids to maintain a

system dictionary. A variety of such aids is possible, with each kind of aid giving a

different level of functionality.

10.4 Computer Aids

Computer aids can go further in assisting analysts. They can guide analysts by

guiding the modeling process and detecting any modeling inconsistencies. Thus the

inconsistencies are automatically brought to the analyst’s attention, whereas in the

manual case it is up to the analyst to detect them.

A computer aids can work in tow ways. One is to rely on the analyst to input all

the data about the model and then check the model for consistency. Another is for the

computer to actively guide analysts in their work. This kind of aid can suggest work

to be done next, identify missing pieces of the model and guide the designer by

asking questions that fill in any model details.


Chapter Eleven

DESIGNING THE NEW SYSTEM

11.1 Introduction

The first three phases of liner cycle describe methods, which is used to define

projects, check project feasibility, and analyze systems. The next step of the cycle is

to design the new system.

In analysis it is possible to come up with a correct model of the existing system.

There is, however, no such thing as a correct design. A good design is very dependent

on a particular system, and what is a good design for one system may be bad for

another. Thus it is not possible to propose a set of standard solutions from which you

can select a good design for many systems.

The designer has to search for a solution using any special knowledge about the

system. Even before a designer begins the design, the problems to be solved or

objectives to be satisfied by a system mast be established. Design is a problem-

solving process that searches out ways to meet the objectives.

This chapter will begin by describing system objectives for a new system. Then

it will outline some methods that can be used to search for a good design.

11.2 Problem Solving and Design

Remember that system design is a problem-solving process. Its goal is to create

a new system that meets a set of objectives and these objectives are the driving force

behind the design process. What is needed to make the design work is a problem-

solving method that will take the objectives and create a system to satisfy them.

The problem-solving idea requires design to proceed in top-down way. Design

usually proceeds in two steps, Phase 4A, a broad design is developed. Then, in Phase

4B, the broad design is expanded in to detail.

Broad design shows the parts of a system to be changed, and indicates the parts

to be automated and those that will stay manual. Following agreement on the broad

design, a detailed design is developed. A number of things are done during detailed

design. The inputs are specified showing the layout of any input screens or forms.

The outputs are also designed showing the layout of reports or screens. Thus the

presentation of deliveries by areas would be designed in this phase, as would inputs

of trial schedules. In addition, the files, structures and programs are also specified,

showing the different record types and program modules needed to make the system


work. The detailed calculation needed to evaluate trial schedules would be specified

at this point.

11.3 System Objectives

The objectives are the driving design factor. Any changes made to the system

should be to satisfy these objectives. It is therefore important to state the objectives in

a way useful to design.

11.3.1 Kinds of Objectives

There are many kinds of objectives, common types are:

functional objectives that state new or amended functional system

requirements. Included here can be:

- new system functions requirements;

- control over system maintenance;

- security and access controls;

- input and output methods.

operational objectives that specify performance standards to be attained by the

new system. These define system accuracy and various timing requirements.

personal and job satisfaction needs. Important objectives here are designing

system that an easy to use and which give users the ability to use their creative

abilities rather than simply responding to computer outputs.

Different design ideas are needed to satisfy each of these objectives. The first

objective can be achieved by adding new processes to the system or amending

existing processes. It may also be necessary for new data to be stored in the system

and for new data flows to be established.

The operational objectives, on the other hand, require different solutions. Often

these call for:

a different physical realization of some of the system processes;

a reorganization of the application of process transformations on a given data

report.

Personal and job satisfaction objectives may call for changes to the user

interface to the computer, new report layouts or changes to the data flows.


11.3.2 Reducing Objectives

Objectives can be high-level, low-level, general or specific. Usually we start

with high-level objectives and then reduce them as we proceed to detailed system

analysis. Often objectives are defined by starting with key performance factors. Such

key performance factors are general statements about how the goal will be met by the

new system.

These key factors are then analyzed. To do this, analysts talk to the users to

define precisely what each key goal entails. Usually it is necessary to ascertain things

like:

the information requirements to satisfy the key goals;

performance targets (such as time, volume, sample size);

comparison to existing systems; and

human factors.

New and specific values can then be set against each key goal. These target

values become the system objectives. Often the process can be repeated through a

number of levels. The objectives are defined, then they are broken down into more

detailed objectives. Specific target values are assigned to these detailed objectives

and so on.

Once we have the objectives, design can commence. The design process

described here uses the problem-solving steps proposed for structured system

analysis.

11.3.3 Problem-Solving With Structured System Techniques

Proponents of structured system analysis have suggested a problem-solving

approach for system design. This approach is illustrated in Figure 11.1. it is made up

of four steps designed to determine:

how the system works now;

what the system does now;

what the new system will do; and

how the new system will work.

Of course, the first two steps are unnecessary if you a building a completely new

system. All the steps, however, apply if you wish to change an existing system.


Step 1

Analyze

physical

system

operation

Step 2

Derive

the logical

model

Step 3

Produce the

new logical

model

Choose new

physical model

Produce

physical

alternative

System

problems

(What the

system does) System

objectives

(What the

system will

do)

New

logical

model

(How the system

will work)

Step 4

Physical alternatives

SY

ST

EM

AN

AL

YS

IS

BR

OA

D S

YS

TE

M D

ES

IGN

Current

physical

model

Current

logical

model

(How the system

works)

Economic

social factor

Chosen

physical

model

To detailed

design


Figure 11.1- Problem-solving steps


The first two steps – System analysis

The first two steps in figure 11.1 is to analyze the physical system operation and

develop the current physical model, and covert the physical model to a logical model

of the current system, that is, what the system does. System problems are also

identified during these first two steps.

The next two steps – System design

Design commences once the logical model of the existing system is available.

Design begins by using identified system problems to develop objectives for the new

system. It then proposes a system that satisfies these objectives.

During step 3, the new logical model is developed. The new logical model

includes any new processes or changes to existing processes necessary to meet

system objectives. The designer must examine as many solutions as possible to make

sure that a good solution is found to meet system objectives.

Physical design starts in step 4. Many activities take place during physical

design. Decisions are made on what processes are to remain manual and which are to

be computerized. Physical devices to store any data are chosen, and methods for

carrying out system functions defined. The interface between system user and

computers in the new system is designed.

Note tat four structured analysis descriptions are developed during the cycle

shown in Figure 11.1. As shown in Figure 11.2 they are:

a current physical model;

a current logical model;

a new logical model; and

a new physical model.

These problem-solving steps are combined with the system life cycle. Top-level

logical and physical models are developed in Phase 4A. These models are elaborated

into detailed physical design during Phase 4B of the system development cycle.

(a) Current physical model

Sales

Estimate

schedule Production

department


(b) Current logical model

(c) New logical model

(d) New physical model

New

sales

Prepare

estimate

input

Projection

program Production

Figure 11.2 – Problem-solving cycle

2

3

6

1

7

4

9

5

8 10

1

2

3

6

7

4 5

9

10

8

Domain of

change


11.3.4 Designing the New Logical Model

Figure 11.3 shows a method for crating a new model from the current logical

model of the existing system. The first step is to see which processes are effected by

the objectives for the new system. These processes are known as the domain of

change. The domain of change looks something like Figure 11.4. It includes all the

processes that are affected by the objectives, together with the interface between

these processes and the remainder of the system. The affected processes may be

adjacent to each other as shown in Figure 11.4(a) or they may be made up of

disconnected or disjointed data flow diagrams, as in Figure 11.4(b).

You should note that designers may see different ways of meeting system

objectives and thus define alternative domains of change. One of these alternatives

must be selected during design.

The domain’s processes, data flows and data stores are designed once a domain

of change is selected. Design cannot be described as a prescriptive process. It is

possible to suggest guidelines, but it is up to designers to use their knowledge of the

system to suggest ways to satisfy system objectives. Two approaches can be

identified. One is to completely redesign the domain of change. The other is to amend

it.

Redesigning the Domain of Change

A technique often used to redesign systems is illustrated in Figure 11.5. It starts

with the domain of change and data flows in and out of this domain. The design then

proceeds in the following steps:

1. Add data stores – Identify the data needed inside the domain of change and

construct data stores for this data.

2. Define processes, data flows and data stores – Take each input into the domain

of change and define processes that the input data flows through before it is

stored or used to produce an output.

3. Add processes that transform data – Define processes that use data in the data

stores and define how the processes use the data.

4. Add processes to create outputs – Look at each output produced and identify

the data stores used to produce the output. Define the processes that the data

goes through before being output.


Of course, it may not be possible to proceed in this way for a large system. A

top-down approach may be preferred instead. Figure 11.6 illustrates how top-level

processes are defined. A list of major system functions is first made and their inputs

and outputs defined. These functions are then linked by data flows to construct a top-

level data flow diagram (DFD) for the domain of change. The method shown in

Figure 11.5 is now applied to each top-level function.

1

Determine

processes

effected by

objectives

2

Create a

domain of

change

3

Develop the interface

between the domain

of change and the rest

of the system

4

Redesign the

domain of

change

Current logical

model

New logical

model

Effected system

processes

Domain of

change with

user interface

New logical

model

Objective (the charter for

change)

Figure 11.3 – steps for developing the new logical model



(a) A single domain of change

(b) A disjoint domain of change

Figure 11.4 – Domain of change


Figure 11.5 – a set of steps used to design the new system

(a) Start

(b) Add data stores

(d) Add processes that

transform data

(e) Add processes to create outputs

(c) Define processes, data flows

and data stores


Figure 11.6 – Top-down design

Identify

major

process

Top-level

DFD

Expand each

process

Into

Lower-level

DFD

Using the method shown

in Figure 11.5


Data Analysis in Design

As data stores are defined, the system data model is amended to include any new

data requirements. Thus new entity sets, relationships and attributes are added to the

E-R diagram. When all the data requirements are completed, the E-R model is

converted to a set of relations, which are then normalized. In summary, data analysis

follows the following steps during design:

1. Update the E-R level model to include any new system items.

2. Add the definitions of any new data elements to the data dictionary.

3. Normalize the model.

4. Specify frequency and usage requirements of the data items.

The newly-designed logical model should then be discussed with system users

to gain their approval.

Amending an Existing System

The alternative to a complete redesign of the domain of change is to make

changes to individual components of the domain of change – or redesign by parts.

A number of different kinds of amendments can be made to an existing system.

Some amendments are:

adding a new system process;

creating a new data flow;

changing the sequence of operations on information;

eliminating redundant or unnecessary processes;

combining two or more processes;

adding new data and changing processes to use this data.

New Logical to New Physical

So far design has not specified what is to be automated. The time to do this is

when the logical design is completed. Physical design proceeds in two steps:

an initial broad level design; followed by

a detailed design

The broad physical design phase is made during Phase 4A of the linear cycle.

Detailed design is completed in Phase 4B.

Broad-level physical design investigates alternate automation possibilities. It

doses this by drawing a boundary to identify the processes to be automated. This

boundary is represented by a dashed line and is often called the man-machine

interface. The man-machine interface can be drawn on high- and low-level data flow


diagram. An example is shown in Figures 11.12 and 11.13. The leveled diagram

shows that the transaction is entered in tow parts. Part 1 is entered first. This is

examined for errors. A request is made for Part 2 of the transaction if no errors are

found in Part 1. Otherwise, a request is made for a correction.

Figure 11.12 – High-level man-machine boundary

CUSTOMERS

Transaction 1

Receive

transaction

2

Enter

transaction

Received

transaction

3

Produce

summary

report

TRANSACTIONS

Summary

report

2.1

Enter

Part 1

2.4

Make

transaction

file

2.3

Enter

Part 2

2.6

Enter

correction

2.2

Examine

Part 1

2.5

Examine

Part 2

Part 1

Part 2

TRANSACTIONS

Request

Part 2

Request

Part 2 correction

Request

correction

for Part 1

Received

transaction

Figure 11.13 – Low-level man-machine boundary


It is common to define more than one man—machine boundary during broad

physical design. Usually one of the options is a ‘total’ option — everything is

automated. Another option is a minimum automation option, which only automates

the processes necessary to realize the most important objectives while leaving a

considerable portion of the system non-automated. Finally, an option that includes no

additional automation but uses the revised logical flows should be considered.

The operational, technical and economic feasibility of the proposed designs is

then evaluated and a preferred alternative may be somewhere in between the three

you started with.


Chapter Twelve

DETAILED SYSTEM DESIGN

12.1 Introduction

Continue with the linear cycle the next step after broad design has been

completed, detailed design starts. The output from broad design is a set of logical

DFDs, together with the man-machine interface. Detailed design uses the processes

and data stores within the automation boundary to specify programs and a database. It

also uses the flows outside the boundary to specify user procedures. Finally, the

interface between the users and computer system is designed.

This chapter concentrates on user procedures and on the interface to the

computer system. Then in the next chapters we describes database design and

program design.

12.2 Designing User Procedures

User procedures define what people must do to make the system work. Two

things are important in user procedure design:

Functional design, which is, making sure the system does what it is supposed to

do. Functional design defines the tasks needed to make the system work.

Job design, specifies what people must do within the tasks.

Functional Design

User procedures specify the tasks needed to capture data and pass it correctly

through the system. The procedures must specify many things for each task. A user

procedure must specify:

how data is obtained,

the medium used to carry the data, and

the format of the data on the medium.

It must also specify the methods used to move the data and how to perform any

calculation on the data.

A number of methods are used to design procedure tasks. A design method must

satisfy a number of requirements. One requirement is to make sure that all the

processes in the DFD are included in the user procedures. The simplest way to satisfy

this requirement is to make each process in the DFD into task. This is assigned to one

person or a group of people.



Describing User Procedures

User procedures define flows of information and the tasks needed to realize

these flows. They also describe physical devices that are used to carry and store the

data. User procedures are often described by flowcharts to represent physical system

operation.

A flowcharting technique uses a finite set of symbols with each symbol

representing a physical system component. One typical set of symbols is illustrated in

Figure 12.1, which shows symbols used to represent computer components. Figure

12.2, on the other hand, illustrates symbols used to represent manual system

components.

Figure 12.1 – Flowcharting symbols for computer components

A computer

print-out A computer

disk

Transmission

line

A computer

tape A punched

card

A computer

program A computer

terminal


Figure 12.3 illustrates a typical flowchart. It illustrates how applications are

collected and input into computer. It combines manual and computer processes.

Once tasks in the user procedures are defined, it is necessary to specify what

people must do to realize to tasks. This process is called job design. A job must be

defined for all tasks in the system.

Keyboard

input Temporar

y storage

such as

input tray

Permanent

storage

such as

input tray

A paper

documen

t

A user

activity

External

entity

Figure 12.2 – Flowcharting symbols for non-computer components

Figure 12.3 – A flowchart

APPLICANT Application

Covert to cards Applications

on cards

INPUT

PROGRAM REPORT

PROGRAM

Received

- Transactions

- File

Applications

summery

report


Job Design

Good job design involves a number of things. One is to ensure that people are

not in the service of computers. This can be done in a number of ways. First, the

interface to the computer should be made as easy and pleasant as possible. The

interface should use dialog that approaches natural language rather than using

computer jargon. Designing jobs with such variety often called job enrichment.

Another important consideration in job design is the skills and skill levels

required for jobs. Skills include dealing with people, various technical skills and

decision or supervision abilities.

Finally, once the skills needed for the job are defined, it is important to precisely

define the duties required of the job and make sure that these duties can be carried out

within the allocated times.

Job descriptions are usually documented in a user manual. This manual is

produced in conjunction with users during system design. It includes an entry for

each task and the jobs necessary to accomplish that task.

Job Enrichment

Job enrichment, now a common term in practice, is used to make jobs more

interesting and rewarding to system users. Job enrichment gives greater initiative and

responsibility to individuals, enabling people in the organization to improve

themselves through involvement in a greater variety of activities. They also get

greater satisfaction from their work because they are contributing to the development

of the organization rather than doing same minor and repetitive task.

An important part of job enrichment is to eliminate jobs made up of routine and

repetitive tasks. Jobs can also be enriched by providing a good interface with the

computer, ensuring that people do not feel they are in the service of computers rather

than the reverse.

12.3 The Computer Interface

The computer interface is an important part of user procedure design. It defines

how users interact with a computer and has an important bearing on users’ acceptance

of a system.

The purpose of the computer interface is quite simple. It is to capture

information about the system from users and make information available to these or

other users. Information capture is called the system input, which stores in database.

The supply of information is called the system output.


The inputs and outputs can be on-line or off-line. In an on-line interface, the user

directly communicates with the computer. One important property of an on-line

interaction is that an output is obtained from the computer vary soon after the input.

This is not the case with off-line input or output. With off-line input,

transactions are called onto forms at the point of capture and then input into the

computer many transactions at a time. These transactions are then processed at same

later time and output is produced from them. This output is passed on to the users.

Off-Line Processing

Off-line interfaces are designed to mach off-line processing requirements. Off-

line processing is often done in what is known as batch mode. A typical batch run is

shown in Figure 12.4.

Figure 12.4 – A batch run

Written

records Conversion to

machine readable

form

Edit run

File

update

Data

capture

Error

report

Output

reports

Input

file

Master

file Report

program


Off-Line Input Interface

The most important input interface in off-line processing is the form that is used

to capture the transactions. Form design is quite important for many reasons:

- forms must be easy to fill in,

- form layout must be clear, and

- sometimes forms allow codes to be used.

Off-Line Output

Printer usually produces off-line output as paper listings. Considerable care must

be taken to present the output in an easily-understood way.

On-Line Processing

On-line processing takes place through an interchange of messages between the

user and computer in a relatively short time. This interchange of messages is called

the user dialog.

User Dialog

Different methods are used in on-line user dialog. The most common methods

are:

- Menus, which is presents the user with a set of actions and requires the user to

select one of those actions. Once an action is selected, another menu is presented.

The next menu depends on the selected action. Note that actions are selected by

typing in a number rather than entering the actual name.

- Commands, in this case the computer asks the user for specific inputs. On getting

the input, the computer may respond with some information or ask the user for

more information. This process continues until all the data has been entered into

the computer or retrieved by the user.

- Templates, it is equivalent to forms on a computer. A form is presented on screen

and the user requested to fill in the form. Usually a number of labeled fields are

provided and users enter data into the blank spaces.


Chapter Thirteen

DATABASE DESIGN

13.1 Introduction

Database design converts the data model developed in logical design to database

definition that is supported by database software known as database management

system (DBMS).

Database design proceeds through a number of steps, illustrated in Figure 13.1.

The first step is DBMS independent, in this step the conceptual E-R or relational

model to a set of record types known as logical record structure (LRS), with each

record made up of a number of fields. The first step also defines how data in these

records is to be used. These access requirements are used later to choose DBMS

structures.

The next database steps convert the LRS to a database definition. These steps use

techniques that depend on the DBMS. DBMS-dependent techniques are needed here

because different DBMS support different kinds of links between their records.

DBMS-dependent design proceeds in two steps. The first step is logical design.

This step defines the DBMS record types and the links between them. The next step is

Figure 13.1 – Database design steps

Phase 4A

Broad

logical

design

Step 1

Conversion

to record

type

Step 3

Physical

design

Step 2

Logical

design

Logical

record

structure

DBMS-

independent

steps

DBMS-

dependent

steps

E-R model of

system data

DBMS logical

structure

Database

definitions


physical design, which chooses a physical organization that supports the methods

used to access the database.

13.2 Conversion to Logical Record Structures

To drive LRS from the conceptual model, there are two methods can be used.

One starts with a relational model of data. The other method starts with an E-R model

and converts it directly to an LRS.

13.2.1 Converting the Relational Model to A LRS

The relational model of system data is converted to the LRS in two steps. Record

types are created in the first step and links between the record types are added in the

second step. In the first step, each relation is converted to a structure record type and

each relation attribute to a record field.

Links between these record types are defined in the next step. To do this, it is

necessary to use foreign keys. A set of attributes, F, is a foreign key in a relation, R,

if:

1. F is not a relation key of R; but

2. F is a relation key of some other relation, say R2.

The next step is to create links between records in a LRS. There is a link from

relation R1 to relation R2, if the relation key of R1 is a foreign key in R2. Consider

the following relations:

PARTS (PART-NO, DESC, PRICE);

USE (PROJECT-NO, PART-NO, QTY-USED);

PROJECTS (PROJECT-NO, START-DATE, BUDGET);

Thus, the three relations will be converted to the LRS shown in Figure 13.2.

Figure 13.2 – Logical Record Structure

Links are directed from one record type to another. Links also have another

property. If you look at the link direction and the label on the link, you will find that

PROJECT-NO

START-DATE

BUDGET

PROJECT-NO

PART-NO

QTY-USED

PART-NO

DESC

PRICE

PROJECTS USES PARTS

PROJECT-NO PART-NO


the link originates on the record type that contains only one record with a given value

of the link label.

13.2.2 Converting an E-R Model to A LRS

The simplest conversion is to make each set of the E-R diagram into a record

type. However, there is one small variation to this where we combine object sets that

have the same relation key to reduce the number of record types. One such

conversion is illustrated in Figure 13.3. You will find that such combinations are

always possible in 1:N relationships. In a 1:N relationship, entities in one of the entity

sets appear in one relationship only.

Figure 13.3 – Converting an E-R diagram to a LRS

Links are then added. Links start on logical record type that represent entities

and terminate on logical records that represent relationships. The same rule applies

where sets have been combined, the link is from the record type, which represents an

entity set, to a record type, which contains the relationship.

PROJECTS

ORDERS

PARTS

MAKE

M

N

N

1

PROJECT-NO

START-DATE

BUDGET

PROJECT-NO

ORDER-NO

ORDER-NO

ORDER-DATE

PART-NO

DESC

PRICE

PROJECT-NO

START-DATE

BUDGET

PROJECT-NO

ORDER-NO

ORDER-DATE

PART-NO

DESC

PRICE

FOR ORDER-NO

PART-NO

QTY-ORDERED

ORDER-NO

PART-NO

QTY-ORDERED

PROJECT-NO

ORDER-NO

PART-NO

PROJECTS

ORDERS

FOR

PARTS



13.3 Completing the Database Specification

The logical record structure is just one part of the database specification. It

serves to define the database structure. For database design, we need additional

information, in particular:

quantitative data, which tells us the volume of information to be stored;

access requirements, which tell as how the database is to be used. Access

requirements are used to choose file keys during database design.

Information about quantitative data and access requirements is gathered during

systems analysis. Quantitative data consists of:

the size of data items; and

the number of assurances of record types.

The size of data items is usually given in a data is usually given in a data dictionary.

Record volumes can be added to the LRS, by the method shown in Figure 13.5.

Figure 13.5 – Add the number of records into LRS

13.3.1 Specifying Access Requirements

Access requirements are initially picked up from user procedure specifications,

which include statements about how users will access data. These statements become

the database access requirements. During database design, access paths are plotted

against logical record types for each access requirement. The access paths show how

data is to be used and describe:

the record types accessed by each access request;

the sequence in which the record types are accessed;

the access keys used to select record types;

the items retrieved from each record; and

the number of records accessed.

PROJECT-NO

START-DATE

BUDGET

Volume: 100

PROJECT-NO

PART-NO

QTY-USED

Volume: 5000

PART-NO

DESC

PRICE

Volume: 2000

PROJECTS USES PARTS

PROJECT-NO PART-NO


There are many ways to draw access paths. One method is illustrated in Figure

13.6. This figure shows three access requirements plotted against three logical record

types. The three access requirements are:

A – find the START-DATE for given project.

B – find the PRICE of parts used on a project with a given PROJECT-NO.

C – find the START-DATE of projects that use a given PART-NO.

There is one access path for each access requirement and each access path can be

made up of or more access steps. Each access step labeled with an access key. The

access step also includes a description of the activity at the step.

Figure 13.6 – Access paths to the logical structure level

Defining the access requirements completes the database specification, which is

now used to start database design. The logical record structure is converted to a

PROJECT-NO

START-DATE

BUDGET

Volume: 100

PROJECTS A/1

C/2

Key: PROJECT-NO

Get: START-DATE

Frequency: 50/DAY

Key: PROJECT-NO

Get: START-DATE

Frequency: 50/DAY

PROJECT-NO

PART-NO

QTY-USED

Volume: 5000

USES

C/1

B/1 Key: PART-NO

Get: PROJECT-NO

Frequency: 20/DAY

Key: PROJECT-NO

Get: PART-NO

Frequency: 150/DAY

PART-NO

DESC

PRICE

Volume: 2000

PARTS

B/2

Key: PART-NO

Get: PRICE

Frequency: 20/DAY


database logical design. Access paths are used to select appropriate physical

structure.

13.4 Conversion to A Set of Files

The first step is to converts each logical record to a file. Thus the LRS in Figure

13.6 would be converted to three files. The logical record field would become the file

fields.

The keys in the access steps are used to choose keys for file indexes. All access

to the file can be direct. Figure 13.7 illustrates the file structure that is obtained from

the LRS of Figure 13.6. The file structure is made up of tree files with indexes. The

fields of records in each file are shown next to the file. You should also note that

links between record types in LRS are not carried over into the file structure because

file software does not support links between records. Database management software,

however, does support such links.

Figure 13.7 – A file structure for the logical structure

Key: PROJECT-NO

Key:

PART-NO

Key:

PART-NO

File: PROJECTS

File:

USES

File:

PARTS

Key: PROJECT-NO

Fields: PROJECT-

NO START-DATE

BUDGET

Fields:

PART-NO DESC

PRICE

Fields: PROJECT-

NO PART-NO

QTY-USED


Chapter Fourteen

Program Design

14.1 Introduction

Program design is the last component of detailed design. The program

specification produced at the end of program design must ensure that programs

satisfy user requirements as defined by user procedures.

Apart from satisfying user requirements, programs must satisfy a number of

other criteria. First, programs must be easy to read and understand. Second, programs

should accommodate system changes that always accrue after a system is built.

Program should be easy to maintain to support such changes. Finally, the third

requirement is that programs should be written to use computer resources in the most

effective manner.

These objectives are achieved by modular program design and structured

programming. Modular program design localizes each well-defined user system

function to one program module. Structured programming uses block-structured

constructs to make programs readable. Note that modularity and structured

programming are consistent with the criteria for good DFD. In a good DFD each

process represents one well-defined function. Furthermore, each process is described

using structured English, which employs constructs similar to structured

programming. These DFD structure to the program specification.

Program design uses the other two parts of detailed design, user design and

database design. Program design uses user procedures to define the computer

interface. It uses the database design to write I/O code, which accesses those database

parts needed by the program.

14.2 Steps In Program Design

The steps in program design and their relationships with other parts of detailed

design are illustrated in Figure 14.1. Program design starts with DFD inside the

automation boundary and uses outputs from database design and user procedure

design to produce working programs.


Figure 14.1 – Program Design

14.2 Transition from DFD to Detailed Program Specifications

Figure 14.2 – describe the program design steps in more detail. It uses Process

3.2 to illustrate these steps. The first step develops detailed process specification for

the process. Detailed process specifications are written in Structured English and

begin to approximate program code. They include all the conditions that can arise in

the process and how they are treated.

The next step the DFD are converted to structure charts and each process in the

DFD become a program module in the structure charts. The structure charts show the

program modules and how they pass information – as parameters –between

themselves.

4

Produce

detailed

process

specification

1

Broad system

design

3

Design

database

2

Design

user

procedures

5

Develop

structured

charts

7

Package

design

6

Produce

detailed

module design

Automated

processes

Conceptual

data model

Manual

processes Access

requirements

User I/O

requirements

Structure

charts

Module

design

DFD with detailed

process specifications


3.2

Check

parts

available

INVENTORY

ORDERS

Parts

required

Order

Negotiate

changes

Request

cancellations

a) DFD

3.2

Get order

FOR each part in order

BEGIN

Check availability

in inventory

IF not available

THEN

BEGIN

Report shortfall

CASE reply

Cancel order:

Report cancellation

Reduce quantity:

Subtract from quantity ordered

Remove parts:

Delete part from order

END

END

END

Store order

CHECK-PARTS-AVALABILITY (order,

status)

FOR each part in order

BEGIN

Read INVENTORY using PART-NO as key

Status =‘OK’

IF QTY-AVALABE < QTY-NEEDED

THEN

BEGIN

Send ‘message –1’ to user

depending on response

CASE reply

Cancel order:

Report cancellation

Status = ‘canceled’

Reduce quantity:

QTY-NEEDED = QTY-NEEDED –

REDUCTION

Status =’reduce’

Remove parts:

Delete part from order

Status =’remove’

END

END

Store order in ORDER-FILE

b) Step 1 – Produce detailed

program specification

d) Step 3 – Produce detailed

module design

e) Step 4 – Package design

Combine

CHECK-PARTS-AVALABILITY

With

MAIN-PROGRAM

Into a single load

module

CHECK-PARTS-

AVALABILITY

MAIN

PROGRAM

Status Order

c) Step 2 – Develop structure

charts


Once we know how program modules call each other, we can convert detailed

process specifications to detailed module designs. As shown by Step 3 in Figure 14.2,

that detailed module designs include procedure calls and parameters passed to a

module. They also include user dialogs, error checks, I/O code and references to the

database. They use database design and user procedure design to choose the correct

database references and user dialog.

Finally, the modules are packaged into load modules. Load modules are groups

of program modules that are loaded into the memory together.

Figure 14.2 – Program design steps

Documents

Asst. Prof. Dr. Ali Kadhim