Data Base Systems

8/12/2019 Data Base Systems


Data base systems

INTRODUCTION

The primary memory of a computer is limited, and hence programs and data are removed from primary memory once their use is over. These programs and data are organised into files for permanent storage on a secondary storage device for reuse. These files are structured in a particular way depending upon the type of access required and the media on which they are stored. If the data requires quick access, it is stored on disk; if it requires only serial processing, it is usually stored on tape.

The file is made up of a number of records. Each record is a group of fields, and each field is made up of some bits of data. Each file is given a name to identify it. The name generally consists of two parts: the first is a single-word name and the second is a three-letter extension that indicates the type of file, for instance .COB, .PRG etc. for program files and .DBF, .DAT etc. for data files. For example, in stock.dat, stock is the first part of the file name and .dat is the extension.

A file holds records of logically similar data. Each record consists of a set of fields for data. Each field holds data of a defined nature: a date field holds only dates, a name field holds only names, and so on. Computer files are organised on physical storage devices like magnetic tape, disk and CD-ROM.

Data and Information

Data is the result of measurements of various attributes of entities such as product, student, inventory item and employee. The measurements may be recorded in alphabetical, numerical, image, voice or other forms. Thus, the raw and unanalysed numbers and facts about entities constitute data. Information, on the other hand, results from data when they are organised or structured in some meaningful way. The processed data have to be placed in a context for them to acquire meaning and relevance. Relevance in turn adds to the value of information in decisions and actions. Data processing requires some infusion of intelligence (meaning, purpose and usefulness) into data to generate information. The application of intelligence may be in the form of principles, knowledge, experience and intuition that convert data into information.

Definition of Information

The term 'information' is a very common word, and it conveys some meaning to the recipient. It is very difficult to define comprehensively. Yet Davis and Olson¹ give a fairly good definition. They define information as "data that has been processed into a form that is meaningful to the recipient and is of real or perceived value in current or prospective actions or decisions".

This implies that information:

- is processed data,
- has a form,
- is meaningful to the recipient,
- has a value, and
- is useful in current or prospective decisions or actions.

Differences between data and information

Though the words 'data' and 'information' are often used interchangeably, there is a clear distinction between the two. Some of the major differences are as follows:

- Data are facts, but information, though based on data, is not fact.
- Though information arises from data, not all data become information. There is a lot of selective filtering of data before they are processed into information.
- Data are the result of routine recording of events and activities taking place. Generation of information is user-driven and is not always automatic.
- Data are independent of users whereas information is user-dependent. Most information reports are designed to meet the anticipated information needs of a user or a group of users. That is, information for one user is very likely to be data for other users.

Field, Record and File

A file is a collection of related records. A record is made up of a number of fields that hold data items. Each field is made up of a number of storage spaces, and each storage space can hold a byte of information. A collection of logically related files forms a database. It usually contains quite a few files holding data, which can be accessed by many users.

Consider a student file in which roll no, name, sex and address are the field names. Each field reserves some space for storage of the respective data. For example, Roll No has 7 bytes of storage, Name has 30 bytes of storage and so on. The Roll No field holds data items 9501101, 9501105 and 9501112 as roll numbers of students. ARUN GOKUL, RAJESH KUMAR etc. are data items in the Name field. Each line of fields relates to one entity: a student. Attributes of the student entity such as roll no, sex and address become the field names.

Data fields hold the basic elements of data. All attributes of an entity taken together form a record. When such related records are put together, that collection is called a file. Record design can be logical or physical. Logical design represents the logical relationship among the data items in the fields. Physical record design means the way data items are physically stored on media like disk and tape.
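The physical record design described above can be sketched with fixed-width fields. This is only an illustration: the 7-byte roll number and 30-byte name follow the text, while the 1-byte sex field, the 40-byte address field, the sample sex and address values and the function names are assumptions made for the example.

```python
import struct

# Field widths: roll no (7) and name (30) come from the text; the widths of
# sex (1) and address (40) are assumed for this sketch.
RECORD_FORMAT = "7s30s1s40s"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # total bytes per record


def pack_student(roll_no, name, sex, address):
    """Encode one student record as a fixed-length byte string."""
    fields = (roll_no, name, sex, address)
    return struct.pack(RECORD_FORMAT, *(f.encode("ascii") for f in fields))


def unpack_student(raw):
    """Decode a fixed-length record back into its four fields."""
    parts = struct.unpack(RECORD_FORMAT, raw)
    # Short values are padded with zero bytes when packed; strip them again.
    return tuple(p.rstrip(b"\x00").decode("ascii") for p in parts)


# The sex and address values below are made-up sample data.
record = pack_student("9501101", "ARUN GOKUL", "M", "CHENNAI")
print(RECORD_SIZE, unpack_student(record))
```

Because every record has the same length, the position of the n-th record in a file can be computed directly from its record number, which is what makes direct access on disk practical.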

File Organisation

File organisation means the way the records are written to a file, and it depends on:

(i) File activity,
(ii) Volatility of information, and
(iii) Storage device

File activity means the proportion of records processed in one run. If only a few records are accessed in a single run, activity is low. If the file activity is low, the file can be stored on a disk device for efficient processing. On the other hand, if a good number of records are accessed in a single run, the file activity is high and such files can be stored on tape so that processing is more efficient and less costly.

File volatility means the proportion of records changed. If records are changed very frequently, the volatility is very high. For high-volatility files such as seat reservation files in a transport firm, the disk medium is more efficient and offers direct access. If only magnetic tapes are available, then files are organised sequentially. Magnetic disks, on the other hand, offer more flexibility as they support both sequential access and direct access.

Other considerations in file organisation are:

(i) Response time (direct access gives a quick response),
(ii) Cost of storage medium,
(iii) Volume of storage, and
(iv) Security of data

Methods of File Organisation

1) Serial file organisation
2) Sequential file organisation
3) Indexed sequential organisation
4) Direct file organisation

1. Serial file organisation

The records in a serial file are stored in no particular order and are generally appended at the end of the file as the data originate. The logical order of records with respect to a key field bears no relation to the order in which the records are physically stored in the file. It is also referred to as a non-keyed sequential file.

2. Sequential file organisation

This file can be created on a magnetic tape or disk. Records are written to the tape or disk one by one, logically ordered on one or more key fields. For example, a student file can be ordered in ascending order of roll no. The records are stored in sorted order. If new records are added or existing records are deleted, a disk file has to be re-sorted. If the file is stored on magnetic tape, a new file has to be created that applies to the existing file all changes made since its creation or last update. This is done to maintain the proper sequence of the records in the file. The advantages of a sequential file are simple organisation and ease of accessing records sequentially.

To minimise the cost of update, the new records are batched in a transaction file, and the master file (that is, the original file, which is relatively permanent) is updated in a single run, leading to the creation of a new master file. This file update is called grandfather-father-son update, as there will be three files at any time.

3. Indexed-sequential file organisation

An index is a combination of the key and storage address of records. This file organisation creates an index file in addition to the data file. The index file holds pairs of key and storage address for the records in the data file. The index file helps in randomly locating records in the data file, as the physical storage location of a record is obtained from the index file. This file organisation supports both sequential access and random access of records in the file.
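The idea of an index of (key, storage address) pairs can be sketched as follows. This is a minimal illustration and not a description of any particular product: the in-memory file, the 24-byte record length, the placeholder student names and all function names are assumptions.

```python
import io

RECORD_SIZE = 24  # assumed fixed record length in bytes


def build_file(students):
    """Write records in key order and build the key -> storage address index."""
    data = io.BytesIO()  # stands in for the data file on disk
    index = {}
    for roll_no, name in sorted(students):
        index[roll_no] = data.tell()  # byte offset where this record starts
        data.write(f"{roll_no}:{name}".ljust(RECORD_SIZE).encode("ascii"))
    return data, index


def read_at(data, address):
    """Read the single record stored at the given address."""
    data.seek(address)
    return data.read(RECORD_SIZE).decode("ascii").rstrip()


def read_random(data, index, roll_no):
    """Random access: look the address up in the index, then go straight there."""
    return read_at(data, index[roll_no])


def read_sequential(data, index):
    """Sequential access: visit every record in key order."""
    return [read_at(data, address) for _, address in sorted(index.items())]


# Roll numbers follow the text; the names are placeholders.
students = [(9501112, "STUDENT C"), (9501101, "STUDENT A"), (9501105, "STUDENT B")]
data, index = build_file(students)
print(read_random(data, index, 9501105))
print(read_sequential(data, index))
```

The same index serves both access modes, which is why this organisation is a common compromise between purely sequential and purely direct files.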

4. Direct file organisation

These files are created on disks or CD-ROMs. In direct file organisation a hashing technique is used to generate the storage address of records in the file. There are quite a number of ways of converting a key (such as roll no for a student file, or product code for an inventory file) to a numeric value. The keys may be numeric, alphabetic or alphanumeric. In the case of alphabetic and alphanumeric keys, a numeric key value has to be generated first. Direct mapping is done by performing some arithmetic manipulation of the key value, called hashing. The hashing function, h(k), generates a value for each key, which is used as the address of the storage location.

A direct file supports direct access to records and minimises their access time. The records need not be sorted before storage, as they must be in an indexed sequential file.
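A hashing function h(k) of the kind described above might look like the sketch below. The text does not name a specific method, so the division-remainder method, the table size of 11, the character-weighting scheme and the handling of two keys that hash to the same address (linear probing) are all assumptions chosen for illustration. The keys and record contents are sample data.

```python
TABLE_SIZE = 11  # assumed number of storage slots (a small prime)


def key_to_number(key):
    """Turn a numeric, alphabetic or alphanumeric key into an integer."""
    return sum(ord(ch) * (31 ** i) for i, ch in enumerate(str(key)))


def h(key):
    """Division-remainder hashing: address = numeric key value mod table size."""
    return key_to_number(key) % TABLE_SIZE


def store(table, key, record):
    """Place a record at h(key); if the slot is taken, try the following slots."""
    slot = h(key)
    for _ in range(TABLE_SIZE):
        if table[slot] is None or table[slot][0] == key:
            table[slot] = (key, record)
            return slot
        slot = (slot + 1) % TABLE_SIZE
    raise RuntimeError("file is full")


def fetch(table, key):
    """Recompute h(key) and read the record directly; None if it is absent."""
    slot = h(key)
    for _ in range(TABLE_SIZE):
        if table[slot] is None:
            return None
        if table[slot][0] == key:
            return table[slot][1]
        slot = (slot + 1) % TABLE_SIZE
    return None


table = [None] * TABLE_SIZE
for key, record in [("9501101", "student A"), ("9501105", "student B"), ("P-17X", "product C")]:
    store(table, key, record)
print(fetch(table, "9501105"), fetch(table, "0000000"))
```

Note that no sorting takes place: each record goes wherever its key hashes to, and retrieval repeats the same calculation.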

Modes of File Access

A computer file can be accessed in three modes: sequential, random and dynamic.

1. Sequential Access

To access a record sequentially, the file has to be read from the beginning, that is record 1, record 2, and so on until the required record is reached. The access time of a single record depends on where in the file the record is stored. That is, the first record in the file takes much less time to access than a record at the end of the file.

2. Random Access

This method takes the same time to access a record wherever the record is physically located in the file. The storage location of the record is obtained by converting the key value of the record into its numeric location address with a hash function. Then the record is located directly.

3. Dynamic Access

This mode combines the sequential and random modes of access. At times, it may be required to start sequential access from a given record only. For example, a file holds 2000 records, and records numbered 1220 to 1250 are to be accessed for processing. In this case, it is better to locate record number 1220 randomly and access the remaining records in sequential mode.
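The dynamic access example above (2000 records, of which records 1220 to 1250 are wanted) can be sketched like this. The in-memory file, the 16-byte record length and the function names are assumptions, and for simplicity the address is computed from the record number rather than by hashing a key.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes


def make_file(count):
    """Build an in-memory file of `count` fixed-length records numbered from 1."""
    data = io.BytesIO()
    for number in range(1, count + 1):
        data.write(f"record {number}".ljust(RECORD_SIZE).encode("ascii"))
    return data


def read_range(data, first, last):
    """Jump straight to record `first`, then read sequentially up to `last`."""
    data.seek((first - 1) * RECORD_SIZE)  # random access: a single seek
    records = []
    for _ in range(first, last + 1):      # sequential access from there on
        records.append(data.read(RECORD_SIZE).decode("ascii").rstrip())
    return records


data = make_file(2000)
batch = read_range(data, 1220, 1250)
print(len(batch), batch[0], batch[-1])
```

Only 31 records are read; the 1219 records before the starting point are skipped by the seek instead of being read one by one.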

File Updating

Updating a file means making the file current by incorporating changes to the records held in it or adding new records to it. If data are very large or are likely to change only occasionally, such data are held in a master file. Master files are relatively permanent and are used for referring to the data therein when required. Data arising out of day-to-day transactions change very often and are therefore held in a temporary file called a transaction file.

The master files have to be made current by incorporating the changes in data. This process is called file updating. There are three ways in which these changes are effected: addition of a record to the master file, deletion of a record from it, and modification of a record held in it.

Methods of Updating Sequential file

Sequential files can be updated in two ways: direct updating and grandfather-father-son updating.

Direct updating

In direct updating, the data are processed online and files are updated directly, that is, no backup files are maintained. Direct update keeps all files updated and enables real-time response. It saves disk space, as transaction files are not opened for temporary storage of data. But it is very difficult to recreate a file if it is corrupted or deleted accidentally. Deletion of records is also not possible. For direct updating, the data must be stored in random access files. Examples of random access storage devices are magnetic disks, magnetic drums and CD-ROMs.

Grandfather-father-son update

In this method two files are used as input, and they result in the creation of a new, updated master file. The two input files are the master file requiring updating and the transaction file containing the transaction data of the period. Both files must be sorted in the same order on the same key before updating starts.

Updating Process

Both the master file and the transaction file are read.

(1) The keys are then compared.
(2) If the master file key is less than the transaction file key, no change is required. The record is copied to the new master file.
(3) If the master file key is equal to the transaction file key, then the record is to be either deleted or modified.
(4) If the master file key is greater than the transaction file key, then the transaction file record is new and is therefore copied to the new master file.
(5) Three generations of files are always maintained. Hence the name grandfather-father-son update.
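The key-comparison steps above can be sketched as a merge of two sorted lists. This is an illustration under stated assumptions: files are represented as Python lists, each transaction carries an action code ("add", "modify" or "delete", an encoding invented for this example), and there is at most one transaction per key.

```python
def update_master(master, transactions):
    """Merge a sorted master file with a sorted transaction file.

    master:       list of (key, data) pairs sorted by key
    transactions: list of (key, action, data) sorted by key, where action is
                  "add", "modify" or "delete" (an assumed encoding)
    Returns the new master file; both input files are left untouched.
    """
    new_master = []
    m, t = 0, 0
    while m < len(master) and t < len(transactions):
        m_key, m_data = master[m]
        t_key, action, t_data = transactions[t]
        if m_key < t_key:
            new_master.append((m_key, m_data))   # no change: copy across
            m += 1
        elif m_key == t_key:
            if action != "delete":               # modify: write the new data
                new_master.append((m_key, t_data))
            # a delete is applied by simply not copying the record
            m += 1
            t += 1
        else:
            new_master.append((t_key, t_data))   # new record from transactions
            t += 1
    new_master.extend(master[m:])                              # unchanged tail
    new_master.extend((k, d) for k, _, d in transactions[t:])  # trailing additions
    return new_master


old_master = [(101, "A"), (103, "C"), (105, "E")]
transactions = [(102, "add", "B"), (103, "modify", "C2"), (105, "delete", None), (107, "add", "G")]
new_master = update_master(old_master, transactions)
print(new_master)
```

Since the old master and the transaction file are never overwritten, the new master can be rebuilt from them if it is lost, which is the point of keeping several generations.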

Indexed File Updating

An indexed file has random access capability, so indexed files allow direct updating. Whenever any change in data takes place, the particular record is randomly accessed and updated. The disadvantage of direct updating is that no backup files are maintained, and it may be difficult to undo the changes made.

Indexed or indexed sequential file organisation keeps, in addition to the data files, an index or table that lists the address of records on disk (namely, track and sector number) according to the contents of the key field. The key chosen must be able to identify a record uniquely. Any record in the file can be read at any time. Updating is easier with indexed files, as only those records requiring modification need be read and modified. An indexed file is highly suitable where quick response is required; for example, airline or railway reservation requires direct updating.

Database System

A database is a set of logically connected data files that have common access methods between them. It stores transaction data. It does not contain any input or output data. The input data may cause a change to operational data but are not part of the database. Similarly, the output data are the reports or query responses from the system. The input data and output data are transient and are not stored in the database.

The database system gives centralised control over the database resources. The advantages of centralised control over the data are¹:

- Redundancy can be reduced,
- Inconsistency can be avoided,
- The data can be shared,
- Standards can be enforced,
- Security restrictions can be applied, and
- Integrity can be maintained.

The concept of information resource management (IRM) calls for treating information as an organisational resource. In the traditional file management system, applications owned their own data, which was not shared with other applications. Each application defined its data, created its file structure and stored the data so that it could be conveniently accessed by its application program. Thus applications like payroll, inventory management etc. owned their own data. Several applications stored the same data item in many files. This caused a lot of duplication in data storage and consequent data inconsistency, as the related files were not updated simultaneously. Often application programs had to be modified to use data files of other applications.

A database is a centrally controlled, integrated collection of logically organised data. The central control ensures data sharing among applications and enforces database security procedures. The data items in the database are logically related, and this helps in integration of the database.

Advantages of Database Systems

The database system approach has the following advantages:

Data independence

The data are logically designed into databases and are independent of applications. Since the data are program-independent, any application can use them without any modification to its code.

Data shareability

A database permits simultaneous multiple access. Thus, multiple users can share the same data.

Data integrity

Access to the database is controlled by the database management system. The system authorises personnel to enter, edit and delete data. It also authorises people to access data for various data processing activities. Since the database stores each data item in only one place and updates it with fresh transaction data automatically, there is little chance of inconsistency in the database.

Data availability

The database is centrally controlled and access to data is permitted through an authorisation scheme. The data resources are therefore available to users in the organisation subject to the authorisation procedure.

Data evolvability

The database is flexible and can store a huge quantity of data. It can evolve to meet data requirements as the number of applications and queries increases.

Components of Database System

The common database components are:

Database files

The database files store the transaction data.

DBMS

It is a set of programs that manages the database. It performs a number of tasks like controlling access to the database, making security checks etc.

Host level language interface system

This system interacts with application programs and interprets their data requests, which are issued in a high-level language.

Natural language interface

The DBMS needs to process queries and data requests issued to it in natural, English-like languages. The natural language interface interprets such queries and requests. It also facilitates managerial interaction with the database for decision support applications.

Application programs

The application programs request data from the database. Data independence permits the applications to use the data for a variety of purposes.

Data Dictionary

The data dictionary contains the schema of the database. It defines each data item in the database and lists its structure, source, the person authorised to modify it etc.

Report generator

The system generates output for users in the form of query responses or reports. It might also produce documents like invoices and process ad-hoc queries and special report requests.

Users of Database Systems

There are three broad classes of users for organisational database systems. They are:

1. Application programmers, who write application programs that manipulate the data in the database,
2. End-users, who access the database by invoking application programs or through a structured query language, and
3. The database administrator, who is responsible for planning, designing, creating and maintaining the database.

Database Management System (DBMS)

A DBMS is a set of system programs that manages the entire database. It controls access to files. It updates files and retrieves data from them on request by applications for processing. The DBMS maintains the database by adding, deleting and modifying records. It permits multiple users to access the same files simultaneously. It acts as an interface between the application programs and the data in the database. If the user wants some data from the database, the DBMS processes the request, locates the data in the database and displays them for the user. In the traditional file management system, the user needs to specify both the data and its storage location. A DBMS requires the database to be stored on direct access storage devices.

A DBMS is general-purpose system software. It works in conjunction with the operating system to create, process, store, retrieve, control and manage data. Its tasks include defining, constructing and manipulating the database for applications. Defining the database involves specifying data types, data structures, storage constraints etc. Constructing the database means storing the data on a storage medium under the control of the DBMS. Database manipulation includes merging databases, generating reports, processing queries etc.

The three main components of a DBMS are the data definition language, the data manipulation language and the data dictionary.

Data Definition Language

The contents of the database are created using the data definition language. It defines relationships between different data elements and serves as an interface for application programs that use the data.

Data Manipulation Language

Data is processed and updated using a language called the data manipulation language. It allows a user to query the database and receive summary or customised reports. The data manipulation language is usually integrated with other programming languages, many of which are 3GLs or 4GLs.

Each database package has its own query language with unique rules and instruction formats. Hence there is no universal query language. The query language is used to access the data for report generation, query processing and other data processing activities.

Structured Query Language (SQL) is a non-procedural language that deals with data, data integrity, data manipulation, data access, data retrieval, data query and data security. Most DBMS packages use some version of SQL, whose primary purpose is to allow users to query a database and generate ad-hoc reports that provide customised information.
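The difference between defining data and manipulating data can be seen in a short SQL session. The sketch below uses Python's built-in sqlite3 module only because it needs no installation; other DBMS packages use their own SQL dialects. The table, column names and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a throwaway in-memory database

# Data definition: describe the structure of the data.
conn.execute(
    "CREATE TABLE students (rollno INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)

# Data manipulation: add, change and query the data itself.
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Asha", "Oracle"), (2, "Ravi", "Java"), (3, "Mina", "Oracle")],
)
conn.execute("UPDATE students SET course = 'Java' WHERE rollno = 3")

# A non-procedural query: it states what is wanted, not how to find it.
report = conn.execute(
    "SELECT course, COUNT(*) FROM students GROUP BY course ORDER BY course"
).fetchall()
print(report)
```

The final SELECT is a small ad-hoc report: the number of students per course, produced without writing any record-by-record processing logic.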

Data Dictionary

The data dictionary is an electronic document that contains the data definition and data use for every data type in the database. It describes the data and its characteristics such as location, size and type. It identifies the origin, use, ownership, methods of access and security of the data. The DBMS uses the data dictionary to store all details of the data, such as data definition, data storage, data use and access privileges.

Database Administrator (DBA)

Organisations that implement database systems set up a function called database administration to supervise the organisational database resources. The database administrator supervises this function. The job of the database administrator is to plan, design, create, modify and maintain the database of the organisation, with special emphasis on security and data integrity. The administrator is not much concerned with the details of the application programs that access the database, but maintains the schema and data dictionary. Any change to a data item, such as its form or its creation, can only be made by the database administrator.

The administrator's specific responsibilities include:

- Guiding the initial design of the database, and later developing and extending it to meet growing organisational requirements.
- Establishing the database and monitoring its use.
- Deciding on the content of the database, and seeing that the relevant data are collected and stored in it.
- Establishing and monitoring database control and security policies and procedures.
- Servicing database users by educating and training them in the use of the database.

Disadvantages of Database

The following are some of the disadvantages of a database:

Higher data processing costs

The database system causes higher data processing costs. This is due to the strict and elaborate procedures for data access, updating and processing.

Increased hardware and software costs

It requires more direct access memory capacity, greater communication capability (including communication software) and additional processing power. This increases the hardware and software costs.

Data insecurity and integrity

Most of the security and integrity problems stem from the fact that many users have access rights to the database. Elaborate security systems have to be implemented to protect the database and to prevent unauthorised access.

Insufficient database expertise

Database technology is complex. Most organisations do not have enough personnel with the necessary expertise to implement and manage database systems.

Database Architecture

The purpose of a database is to facilitate storage of huge volumes of data and their quick retrieval. There are three basic ways of organising data in a database. They are the hierarchical, network and relational structures.

Hierarchical Structure

The relationships between records form a hierarchy. The records or aggregates of data are logically conceived to be stored at different levels of the hierarchy. The structure looks like a tree turned upside down, with its branches pointing downwards. Each entity is linked to only one data item at the next higher level. In a hierarchical database, the relationship between records is one of parent and child. A record can be linked to only one record at the higher level. Data stored in a lower-level node (child record) can be accessed only through the higher-level node (parent record).

Network Structure

This structure can represent more complex logical relationships. It permits multiple relations between data items. One entity can be linked to any number of other types of entities. That is, it allows many-to-many relationships among records. Any data element can be related to any number of other data elements.

Relational Structure

The relational structure is the most recent of these three structures. All data elements stored in the database are conceived to be stored in tables. Different tables are linked using a common type of data item that appears in each of them. The table is called a relation, the columns of the table are called attributes (each drawing its values from a domain) and the rows are called tuples. A tuple contains the values of the data items, called data elements, of one entity.

Data Mining and Data Warehousing

Large organisations have a huge quantity of data in their databases, and it is still growing. Until recently, business computing technologies concentrated on data capture, storage and retrieval. But the need to interpret and find patterns in this data is growing, and computing technologies are now making it possible. Data mining is the focus of the new class of technologies being developed to help businesses find meaning in data lying idle. Data mining helps in drawing inferences from the data and in understanding customers, products and markets better.

Data mining employs a host of techniques. Some are very old, like the statistical techniques including linear programming; others are recently developed and are known as data analysis, machine learning, online analytical processing etc. These techniques help in discovering new patterns in data.

Huge databases have created the need for data warehousing. Data warehousing means organising large amounts of data and making them available company-wide to users. Data warehousing is an integral part of data mining: the quality and quantity of data available for data mining is a function of data warehousing. Data mining helps in identifying the preferences of customer groups and deciding on promotional material to influence their buying habits. The information can be used in product development, product customisation and target marketing. Data mining represents a new trend in the use of information technology. The focus has shifted from data storage and retrieval to data analysis for making inferences.

Relational Database Management System (RDBMS)

A DBMS that is based on the relational model is called an RDBMS. The relational model is the most successful of the three models. Designed by E.F. Codd, the relational model is based on the mathematical theory of sets and relations.

The relational model represents data in the form of a table. A table is a two-dimensional array containing rows and columns. Each row contains data related to an entity such as a student. Each column contains the data related to a single attribute of the entity, such as student name.

One of the reasons behind the success of the relational model is its simplicity. The data is easy to understand and easy to manipulate.

Another important advantage of the relational model over the other two models is that it does not bind data to a fixed relationship between data items. Instead, it allows dynamic relationships between entities using the values of the columns.

Almost all database systems sold in the market nowadays have either a complete or partial implementation of the relational model.

Figure 1 shows how data is represented in the relational model and the terms used to refer to the various components of a table. The following are the terms used in the relational model.

Tuple / Row

A single row in the table is called a tuple. Each row represents the data of a single entity.

Attribute / Column

A column stores an attribute of the entity. For example, if details of students are stored, then student name is an attribute, course is another attribute and so on.

Column Name

Each column in the table is given a name. This name is used to refer to the values in the column.

Table Name

Each table is given a name. This is used to refer to the table. The name describes the content of the table.

The following are two other terms, primary key and foreign key, that are very important in the relational model.

Primary Key

A table contains data related to entities. The STUDENTS table, for instance, contains data related to students. For each student there will be one row in the table. Each student's data in the table must be uniquely identified. In order to identify each entity uniquely, we use a column in the table. The column that is used to uniquely identify entities (students) in the table is called the primary key. In the case of the STUDENTS table (see figure 1) we can use ROLLNO as the primary key, as it is not duplicated.

So a primary key can be defined as a set of columns used to uniquely identify rows of a table.

Some other examples of primary keys are the account number in a bank, the product code of a product and the employee number of an employee.

Composite Primary Key

In some tables a single column cannot be used to uniquely identify entities (rows). In that case we have to use two or more columns to uniquely identify rows of the table. When a primary key contains two or more columns it is called a composite primary key.

In figure 2, we have the PAYMENTS table, which contains the details of payments made by the students. Each row in the table contains the roll number of the student, the payment date and the amount paid. None of the columns can uniquely identify rows on its own. So we have to combine ROLLNO and DP (the payment date) to uniquely identify rows in the table. As the primary key consists of two columns, it is called a composite primary key.

Figure 2: Composite Primary Key
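A composite primary key on the PAYMENTS table could be declared as in the sketch below. It uses Python's built-in sqlite3 module for convenience; the column types, the AMOUNT column name and the sample rows are assumptions, since the figure is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE payments (
           rollno INTEGER,
           dp     TEXT,      -- date of payment
           amount INTEGER,
           PRIMARY KEY (rollno, dp)
       )"""
)
conn.execute("INSERT INTO payments VALUES (1, '2019-01-10', 500)")
conn.execute("INSERT INTO payments VALUES (1, '2019-02-10', 500)")  # same student, new date
conn.execute("INSERT INTO payments VALUES (2, '2019-01-10', 750)")  # same date, other student

# Repeating an existing (rollno, dp) pair violates the composite key.
try:
    conn.execute("INSERT INTO payments VALUES (1, '2019-01-10', 900)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
print(duplicate_rejected, row_count)
```

Either column may repeat on its own; only the combination of roll number and payment date has to be unique.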

Foreign Key

In the relational model, we often store data in different tables and put them together to get complete information. For example, in the PAYMENTS table we have only the ROLLNO of the student. To get the remaining information about the student we have to use the STUDENTS table. The roll number in the PAYMENTS table can be used to obtain the remaining information about the student.

The relationship between the entities student and payment is one-to-many: one student may make payments many times. As we already have the ROLLNO column in the PAYMENTS table, it is possible to join it with the STUDENTS table and get information about the parent entity (student).

The roll number column of the PAYMENTS table is called a foreign key, as it is used to join the PAYMENTS table with the STUDENTS table. So the foreign key is the key on the many side of the relationship.

Figure 3: Foreign Key

The ROLLNO column of the PAYMENTS table must derive its values from the ROLLNO column of the STUDENTS table.
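The link between PAYMENTS and STUDENTS can be sketched as follows, again with Python's built-in sqlite3 module. The column types and sample rows are assumptions. SQLite checks foreign keys only when the foreign_keys pragma is switched on, so the sketch enables it first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.execute("CREATE TABLE students (rollno INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """CREATE TABLE payments (
           rollno INTEGER REFERENCES students (rollno),
           dp     TEXT,
           amount INTEGER,
           PRIMARY KEY (rollno, dp)
       )"""
)
conn.execute("INSERT INTO students VALUES (1, 'Asha')")
conn.execute("INSERT INTO payments VALUES (1, '2019-01-10', 500)")
conn.execute("INSERT INTO payments VALUES (1, '2019-02-10', 500)")

# Join the child rows to their parent row through the foreign key.
rows = conn.execute(
    "SELECT s.name, p.dp, p.amount FROM payments p "
    "JOIN students s ON s.rollno = p.rollno ORDER BY p.dp"
).fetchall()

# A payment for a roll number that is not in STUDENTS is refused.
try:
    conn.execute("INSERT INTO payments VALUES (99, '2019-03-10', 500)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rows, rejected)
```

One student row is matched with two payment rows, which is the one-to-many relationship in practice.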

When a child table contains a row that does not refer to a corresponding parent key, the row is called an orphan record. We must not have orphan records, as they are the result of a lack of data integrity.

Integrity Rules

Data integrity is to be maintained at any cost. If data loses integrity it becomes garbage. So every effort is to be made to ensure that data integrity is maintained. The following are the main integrity rules that are to be followed.

Domain integrity

Data is said to have domain integrity when the value of a column is drawn from its domain. A domain is the collection of potential values. For example, the column date of joining must be a valid date. All valid dates form one domain. If the value of date of joining is an invalid date, then it is said to violate domain integrity.
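A domain check for the date of joining example might be sketched as below. The function name and the YYYY-MM-DD text format are assumptions; a real DBMS would normally enforce the domain through the column's data type or a constraint.

```python
from datetime import date


def in_date_domain(value):
    """Return True if `value` (text in YYYY-MM-DD form) names a real calendar date."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


print(in_date_domain("2019-02-28"), in_date_domain("2019-02-30"))
```

The second value looks like a date but lies outside the domain of valid dates, so storing it would violate domain integrity.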

Entity integrity

This specifies that all values in a primary key must be not null and unique. Each entity that is stored in the table must be uniquely identified. Every table must contain a primary key, and the primary key must be not null and unique.

Referential Integrity

This specifies that a foreign key must either be null or have a value that is derived from the corresponding parent key. For example, if we have a table called BATCHES, then the ROLLNO column of that table will reference the ROLLNO column of the STUDENTS table. All the values of the ROLLNO column of the BATCHES table must be derived from the ROLLNO column of the STUDENTS table, because no student who is not part of the STUDENTS table can join a batch.

Relational Algebra

A set of operators used to perform operations on tables is called relational algebra. Operators in relational algebra take one or more tables as parameters and produce one table as the result.

The following are operators in relational algebra:

Union
Intersect
Difference or minus
Project
Select
Join

Union

This takes two tables and returns all rows that belong to either the first or the second table (or both). See figure 4.

Figure 4: Union, Intersect and Minus



Intersect

This takes two tables and returns all rows that belong to both the first and the second table. See figure 4.

Difference or Minus

This takes two tables and returns all rows that exist in the first table and not in the second table. See figure 4.

Project

Takes a single table and returns a vertical subset of the table, that is, only the chosen columns. See figure 5.

Select

Takes a single table and returns a horizontal subset of the table. That means it returns only those rows that satisfy the condition. See figure 5.

Figure 5: Project, Select and Join

Join

Rows of two tables are combined based on the values of the given column(s). The tables being joined must have a common column. See figure 5.
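The six operators can be sketched in plain Python by modelling a table as a set of rows. This is only an illustration of the definitions above, not how a database engine implements them. The helper names and the sample tables are made up.

```python
def table(columns, rows):
    """Build a table: a set of rows, each row a tuple of (column, value) pairs."""
    return {tuple(zip(columns, row)) for row in rows}


def union(a, b):
    return a | b          # rows in either table (or both)


def intersect(a, b):
    return a & b          # rows in both tables


def minus(a, b):
    return a - b          # rows in the first table but not the second


def project(t, columns):
    """Vertical subset: keep only the named columns."""
    return {tuple((c, v) for c, v in row if c in columns) for row in t}


def select(t, condition):
    """Horizontal subset: keep only the rows that satisfy the condition."""
    return {row for row in t if condition(dict(row))}


def join(a, b, column):
    """Combine rows of a and b that hold the same value in the common column."""
    result = set()
    for row_a in a:
        for row_b in b:
            da, db = dict(row_a), dict(row_b)
            if da[column] == db[column]:
                result.add(tuple({**da, **db}.items()))
    return result


# Hypothetical sample tables.
batch_a = table(("rollno", "name"), [(10, "Asha"), (11, "Ravi")])
batch_b = table(("rollno", "name"), [(11, "Ravi"), (12, "Meena")])
payments = table(("rollno", "amount"), [(10, 500), (10, 700), (11, 500)])

names_only = project(batch_a, ("name",))
after_ten = select(batch_a, lambda row: row["rollno"] > 10)
paid = join(batch_a, payments, "rollno")
```

Every function takes one or two tables and returns a table, so the operators can be chained, for example project(select(batch_a, ...), ("name",)).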

Structured Query Language (SQL)

Almost all relational database management systems use SQL (Structured Query Language) for data manipulation and retrieval. SQL is the standard language for relational database systems. SQL is a non-procedural language, where you need to concentrate on what you want, not on how you get it. To put it another way, you need not be concerned with procedural details.



SQL commands are divided into four categories, depending upon what they do:

DDL (Data Definition Language)
DML (Data Manipulation Language)
DCL (Data Control Language)
Query (Retrieving data)

DDL commands are used to define the data. For example, CREATE TABLE.
DML commands such as INSERT and DELETE are used to manipulate data.
DCL commands are used to control access to data. For example, GRANT.
Query is used to retrieve data using SELECT.

DML and Query are also collectively called DML, and DDL and DCL are collectively called DDL.
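The categories can be seen side by side in a short sketch run through Python's sqlite3 module. The table and its rows are made up. SQLite has no user accounts and does not implement GRANT, so the DCL statement appears only as a comment, and its exact form varies between systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the data.
conn.execute("CREATE TABLE students (rollno INTEGER PRIMARY KEY, name TEXT)")

# DML: manipulate the data.
conn.execute("INSERT INTO students VALUES (10, 'Asha')")
conn.execute("INSERT INTO students VALUES (11, 'Ravi')")
conn.execute("DELETE FROM students WHERE rollno = 11")

# Query: retrieve the data.
names = [
    row[0]
    for row in conn.execute("SELECT name FROM students ORDER BY rollno")
]

# DCL: not available in SQLite. On a multi-user server the statement
# would look something like this (some_user is a hypothetical account):
#   GRANT SELECT ON students TO some_user;
```

Note that each statement says what is wanted (a table, a row, a list of names) and leaves the how to the database, which is what non-procedural means here.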

Data Processing Methods

Stored data is processed in three different ways. Processing data means retrieving data and deriving information from data. Depending upon where it is done and how it is done, there are three methods:

Centralized data processing
De-centralized data processing
Distributed data processing

Centralized data processing

In this method the entire data is stored in one place and processed there itself. A mainframe is the best example of this kind of processing. The entire data is stored and processed on the mainframe. All programs, invoked from clients (dumb terminals), are executed on the mainframe, and the data is also stored on the mainframe.

Figure 6: Centralized data processing

As you can see in figure 6, all terminals are attached to the mainframe. Terminals do not have any processing ability. They take input from users and send output to users.

Decentralized data processing

In this method data is processed at various places. A typical example is each department having its own system for its own data processing needs. See figure 7 for an example of decentralized data processing. Each department stores data related to itself and runs all programs that process its data. The biggest drawback of this type of data processing is that data has to be duplicated. Because common data has to be stored on each machine, this is called redundancy. This redundancy can cause data inconsistency, which means that the copies of the data stored by two departments may not agree with each other.

Data in this mode is duplicated, as there is no means to store common data in one place and access it from all machines.

Figure 7: Decentralized Data Processing

Distributed Data Processing (Client/Server)

In this data processing method, data processing is distributed between client and server. The server takes care of managing data. The client interacts with the user. For example, if you assume a process where we need to draw a graph to show the number of students in a given month for each subject, the following steps will take place:



Figure 8: Distributed data processing

1. First, the client interacts with the user, takes input (month name) from the user and then passes it to the server.
2. The server then queries the database to get the data related to the month that was sent to it, and sends the data back to the client.
3. The client then uses the data retrieved from the database to draw a graph.
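The three steps can be sketched as two cooperating functions. In a real system the client and server would be separate programs on separate machines. Here both run in one Python process, the ENROLMENTS table and its rows are hypothetical, and the "graph" is a plain text bar chart.

```python
import sqlite3

# Server side: owns the database.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE enrolments (month TEXT, subject TEXT, rollno INTEGER);
    INSERT INTO enrolments VALUES
        ('August', 'Oracle', 10), ('August', 'Oracle', 11),
        ('August', 'Java', 12), ('July', 'Java', 13);
    """
)


def server_students_per_subject(month):
    """Step 2: the server queries the database and returns the rows."""
    sql = (
        "SELECT subject, COUNT(*) FROM enrolments "
        "WHERE month = ? GROUP BY subject ORDER BY subject"
    )
    return db.execute(sql, (month,)).fetchall()


def client_draw_graph(month):
    """Steps 1 and 3: pass the month to the server, then draw the result."""
    data = server_students_per_subject(month)
    return [f"{subject:<8}{'#' * count} {count}" for subject, count in data]


for line in client_draw_graph("August"):
    print(line)
```

The server does the counting and the client does the drawing, so the work is shared between the two sides.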

If you look at the above process, the client and server are equally participating in the process. That is the reason this type of data processing is called distributed. The process is evenly distributed between client and server. The client is a program written in one of the front-end tools such as Visual Basic or Delphi. The server is a database management system such as Oracle, SQL Server etc. The language used to send commands from client to server is SQL (see figure 8).

This is also called two-tier client/server architecture.

In this we have only two tiers (layers): one is the server and the other is the client.

The following is an example of 3-tier client/server, where the client interacts with the user on one side and with the application server on the other side. The application server, which processes and validates data, takes the request from the client and sends the request in the language understood by the database server. Application servers are generally object oriented. They expose a set of objects, whose methods are to be invoked by the client to perform the required operation.
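A middle tier of this kind might look roughly like the sketch below. The class StudentService and its methods are hypothetical, and an in-memory SQLite database stands in for the database server. The point is only that the client calls methods while validation and SQL stay in the application tier.

```python
import sqlite3


class StudentService:
    """Hypothetical application-server object; clients call its methods."""

    def __init__(self):
        # An in-memory SQLite database stands in for the database server.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE students (rollno INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add_student(self, rollno, name):
        # Validation happens here, in the middle tier, not in the client.
        if not isinstance(rollno, int) or rollno <= 0:
            raise ValueError("rollno must be a positive integer")
        if not name.strip():
            raise ValueError("name must not be empty")
        # The request is passed on in the language the database understands.
        self._db.execute(
            "INSERT INTO students VALUES (?, ?)", (rollno, name.strip())
        )

    def student_name(self, rollno):
        row = self._db.execute(
            "SELECT name FROM students WHERE rollno = ?", (rollno,)
        ).fetchone()
        return row[0] if row else None


# Client side: only method calls, no SQL.
service = StudentService()
service.add_student(10, "Asha")
print(service.student_name(10))
```

Because the client never sees SQL, the database behind the service could be changed without touching the client.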

The application server takes some burden from the database server and some burden from the client.

Figure 9: 3-tier client-server architecture

In 3-tier client/server architecture, the database server and application server may reside on different machines or on the same machine. Since the advent of web applications we are also seeing more than three tiers, which is called n-tier architecture. For example, the following is the sequence in a typical web application.

1. The client, a web browser, sends a request to the web server.
2. The web server executes the requested page, which may be an ASP or JSP page.
3. The ASP or JSP page accesses the application server.
4. The application server then accesses the database server.

Summary

A DBMS is used to store and manipulate data. A DBMS based on the relational model is an RDBMS. A primary key is used for unique identification of rows and a foreign key to join tables. Relational algebra is a collection of operators used to operate on tables. We will see how to practically use these operators in a later chapter.

SQL is a language commonly used in RDBMS to store and retrieve data. In my opinion, SQL is one of the most important languages if you are dealing with an RDBMS because all data access is done using SQL.

SQL can execute queries against a database



SQL can retrieve data from a database

SQL can insert records in a database

SQL can update records in a database

SQL can delete records from a database

SQL can create new databases

SQL can create new tables in a database

SQL can create stored procedures in a database

SQL can create views in a database

SQL can set permissions on tables, procedures, and views
