
What is a database


Page 1: What is a database

What is a database

A database is any organized collection of data. Some examples of databases you may encounter in your daily life are:

• a telephone book
• T.V. Guide
• airline reservation system
• motor vehicle registration records
• papers in your filing cabinet
• files on your computer hard drive

Page 2: What is a database

Data vs. information: What is the difference?

What is data? Data can be defined in many ways. Information science defines data as unprocessed information.

What is information? Information is data that has been organized and communicated in a coherent and meaningful manner.

Data is converted into information, and information is converted into knowledge.

Knowledge: information evaluated and organized so that it can be used purposefully.

Page 3: What is a database

Why do we need a database?

• Keep records of our:
  – Clients
  – Staff
  – Volunteers
• Keep a record of activities and interventions
• Keep sales records
• Develop reports
• Perform research
• Longitudinal tracking

Page 4: What is a database

What Is a Database System?

A Database Management System (DBMS) is a software system designed to store, manage, and facilitate access to databases.

– A DBMS such as Access, FileMaker, Lotus Notes, Oracle or SQL Server provides the software tools you need to organize data in a flexible manner. It includes tools to add, modify or delete data from the database, ask questions (or queries) about the data stored in the database, and produce reports summarizing selected contents.

A database system is a software system which supports the definition and use of a database.

Page 5: What is a database

Database System Applications

A DBMS contains information about a particular enterprise:

• Collection of interrelated data
• Set of programs to access the data
• An environment that is both convenient and efficient to use

Database Applications:

• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions

Databases touch all aspects of our lives.

Page 6: What is a database

[Figure: Data → Information → Knowledge → Action, with the caption "Is to transform"]

Page 7: What is a database

Purpose of Database Systems

In the early days, database applications were built directly on top of file systems.

Drawbacks of using file systems to store data:

• Data redundancy and inconsistency
  – Multiple file formats, duplication of information in different files
• Difficulty in accessing data
  – Need to write a new program to carry out each new task
• Data isolation: multiple files and formats
• Integrity problems
  – Integrity constraints (e.g. account balance > 0) become “buried” in program code rather than being stated explicitly
  – Hard to add new constraints or change existing ones
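The point about constraints being buried in program code can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module and a hypothetical account table (the table name, column names, and sample rows are not from the slides). The rule balance > 0 is stated once in the schema, so the DBMS rejects a violating row no matter which program tries to insert it.

```python
import sqlite3

# Hypothetical in-memory database; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"  # constraint stated explicitly
)
conn.execute("INSERT INTO account VALUES ('A-101', 500)")


def try_insert(number, balance):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (number, balance))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("A-102", 250)   # satisfies the CHECK constraint
rejected = not try_insert("A-103", -40)  # violates balance > 0
row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(accepted, rejected, row_count)
```

Changing the rule later means editing one line of the schema instead of hunting through every application program.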

Page 8: What is a database

Purpose of Database Systems (Cont.)

Drawbacks of using file systems (cont.):

• Atomicity of updates
  – Failures may leave the database in an inconsistent state with partial updates carried out
  – Example: a transfer of funds from one account to another should either complete or not happen at all
• Concurrent access by multiple users
  – Concurrent access is needed for performance
  – Uncontrolled concurrent accesses can lead to inconsistencies
  – Example: two people reading a balance and updating it at the same time
• Security problems
  – Hard to provide user access to some, but not all, data

Database systems offer solutions to all the above problems.
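The funds-transfer example can be sketched as follows. This is an illustration only, assuming Python's built-in sqlite3 module, a hypothetical account table, and a rule that balances may not go negative (none of these details come from the slides). The two updates either both take effect or neither does.

```python
import sqlite3

# Hypothetical accounts; names, amounts, and the CHECK rule are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer("A", "B", 30)       # both updates are applied
failed = transfer("A", "B", 500)  # debit would overdraw A, so the credit is undone
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

In the failing call the credit to B has already been executed when the debit is rejected; the rollback is what keeps the partial update from surviving.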

Page 9: What is a database

Why Use a DBMS?

• Data independence and efficient access.
• Reduced application development time.
• Data integrity and security.
• Uniform data administration.
• Concurrent access, recovery from crashes.

Page 10: What is a database

Types of Databases

Non-relational databases

Non-relational databases place information in field categories that we create so that information is available for sorting and disseminating the way we need it. The data in a non-relational database, however, is limited to that program and cannot be extracted and applied to a number of other software programs, or other database files within a school or administrative system. The data can only be "copied and pasted." Example: a spreadsheet.

Relational databases

In relational databases, fields can be used in a number of ways (and can be of variable length), provided that they are linked in tables. A relational database is built on a database model that provides for logical connections among files (known as tables) by including identifying data from one table in another table.

Page 11: What is a database

Structure of a DBMS

A typical DBMS has a layered architecture.

The figure does not show the concurrency control and recovery components.

Each database system has its own variations.

Layers, from top to bottom:

• Query Optimization and Execution
• Relational Operators
• Files and Access Methods
• Buffer Management
• Disk Space Management
• DB

These layers must consider concurrency control and recovery.

Page 12: What is a database

An architecture for a database system

Page 13: What is a database

Why Use Models?

Models can be useful when we want to examine or manage part of the real world

The costs of using a model are often considerably lower than the costs of using or experimenting with the real world itself

Examples:

• airplane simulator
• nuclear power plant simulator
• flood warning system
• model of US economy
• model of a heat reservoir
• map

Page 14: What is a database
Page 15: What is a database

Data Model

A data model consists of notations for expressing:

• data structures
• integrity constraints
• operations

Page 16: What is a database

Data Model - Data Structures

All data models have notation for defining:

• attribute types
• entity types
• relationship types

FLIGHT-SCHEDULE

FLIGHT#  AIRLINE       WEEKDAY  PRICE
101      delta         mo       156
545      american      we       110
912      scandinavian  fr       450
242      usair         mo       231

DEPT-AIRPORT

FLIGHT#  AIRPORT-CODE
101      atl
912      cph
545      lax

Page 17: What is a database

Data Model - Constraints

Constraints express rules that cannot be expressed by the data structures alone:

• Static constraints apply to database state
• Dynamic constraints apply to change of database state
• E.g., “All FLIGHT-SCHEDULE entities must have precisely one DEPT-AIRPORT relationship”

FLIGHT-SCHEDULE

FLIGHT#  AIRLINE       WEEKDAY  PRICE
101      delta         mo       156
545      american      we       110
912      scandinavian  fr       450
242      usair         mo       231

DEPT-AIRPORT

FLIGHT#  AIRPORT-CODE
101      atl
912      cph
545      lax
242      bos

Page 18: What is a database

Data Model - Operations

Operations support change and retrieval of data:

insert FLIGHT-SCHEDULE(97, delta, tu, 258);
insert DEPT-AIRPORT(97, atl);

select FLIGHT#, WEEKDAY
from FLIGHT-SCHEDULE
where AIRLINE='delta';

FLIGHT-SCHEDULE

FLIGHT#  AIRLINE       WEEKDAY  PRICE
101      delta         mo       156
545      american      we       110
912      scandinavian  fr       450
242      usair         mo       231
97       delta         tu       258

DEPT-AIRPORT

FLIGHT#  AIRPORT-CODE
101      atl
912      cph
545      lax
242      bos
97       atl
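The insert and select operations on this slide can be run roughly as written. The sketch below assumes Python's built-in sqlite3 module and renames FLIGHT-SCHEDULE, DEPT-AIRPORT, and FLIGHT# to flight_schedule, dept_airport, and flight_no, since the slide's names are not plain SQL identifiers. The rows are the ones shown in the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flight_schedule (
        flight_no INTEGER, airline TEXT, weekday TEXT, price INTEGER);
    CREATE TABLE dept_airport (flight_no INTEGER, airport_code TEXT);

    -- State of the database before the slide's operations.
    INSERT INTO flight_schedule VALUES
        (101, 'delta', 'mo', 156), (545, 'american', 'we', 110),
        (912, 'scandinavian', 'fr', 450), (242, 'usair', 'mo', 231);
    INSERT INTO dept_airport VALUES
        (101, 'atl'), (912, 'cph'), (545, 'lax'), (242, 'bos');
""")

# The two inserts from the slide (change of data).
conn.execute("INSERT INTO flight_schedule VALUES (97, 'delta', 'tu', 258)")
conn.execute("INSERT INTO dept_airport VALUES (97, 'atl')")

# The select from the slide (retrieval of data); ORDER BY added for a stable order.
delta_flights = conn.execute(
    "SELECT flight_no, weekday FROM flight_schedule"
    " WHERE airline = 'delta' ORDER BY flight_no"
).fetchall()
print(delta_flights)
```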

Page 19: What is a database

Keys and Identifiers

Keys (or identifiers) are uniqueness constraints

A key on FLIGHT# in FLIGHT-SCHEDULE will force all FLIGHT# values to be unique in FLIGHT-SCHEDULE

Consider the following keys on DEPT-AIRPORT:

[Figure: four copies of the DEPT-AIRPORT header (FLIGHT#, AIRPORT-CODE), each marking a different key]

DEPT-AIRPORT

FLIGHT#  AIRPORT-CODE
101      atl
912      cph
545      lax
242      bos

FLIGHT-SCHEDULE

FLIGHT#  AIRLINE       WEEKDAY  PRICE
101      delta         mo       156
545      american      we       110
912      scandinavian  fr       450
242      usair         mo       231
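How the choice of key changes what the database accepts can be sketched as follows. The example assumes Python's built-in sqlite3 module and underscore names in place of FLIGHT-SCHEDULE, DEPT-AIRPORT, and FLIGHT#. The second flight_schedule row and the 'bos' row for flight 101 are hypothetical test data. The composite key is only one of the possible keys on DEPT-AIRPORT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Key on flight_no alone: every flight number may appear only once.
conn.execute(
    "CREATE TABLE flight_schedule ("
    " flight_no INTEGER PRIMARY KEY, airline TEXT, weekday TEXT, price INTEGER)"
)
conn.execute("INSERT INTO flight_schedule VALUES (101, 'delta', 'mo', 156)")
try:
    # Hypothetical second row reusing flight number 101.
    conn.execute("INSERT INTO flight_schedule VALUES (101, 'usair', 'tu', 199)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Composite key on (flight_no, airport_code): only the pair must be unique,
# so one flight may be listed with several airports.
conn.execute(
    "CREATE TABLE dept_airport ("
    " flight_no INTEGER, airport_code TEXT,"
    " PRIMARY KEY (flight_no, airport_code))"
)
conn.execute("INSERT INTO dept_airport VALUES (101, 'atl')")
conn.execute("INSERT INTO dept_airport VALUES (101, 'bos')")  # new pair, accepted
try:
    conn.execute("INSERT INTO dept_airport VALUES (101, 'atl')")  # repeated pair
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True

pair_count = conn.execute("SELECT COUNT(*) FROM dept_airport").fetchone()[0]
print(duplicate_rejected, pair_rejected, pair_count)
```

A key on flight_no alone in dept_airport would instead reject the second airport for flight 101.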

Page 20: What is a database

Integrity and Consistency

• Integrity: does the model reflect reality well?
• Consistency: is the model without internal conflicts?

a FLIGHT# in FLIGHT-SCHEDULE cannot be null because it models the existence of an entity in the real world

a FLIGHT# in DEPT-AIRPORT must exist in FLIGHT-SCHEDULE because it doesn’t make sense for a non-existing FLIGHT-SCHEDULE entity to have a DEPT-AIRPORT

DEPT-AIRPORT

FLIGHT#  AIRPORT-CODE
101      atl
912      cph
545      lax
242      bos

FLIGHT-SCHEDULE

FLIGHT#  AIRLINE       WEEKDAY  PRICE
101      delta         mo       156
545      american      we       110
912      scandinavian  fr       450
242      usair         mo       231
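Both rules on this slide can be declared so that the DBMS enforces them. The sketch below assumes Python's built-in sqlite3 module and underscore names in place of the slide's identifiers. Flight numbers are stored as text here, an illustrative choice, because SQLite auto-fills a NULL in an INTEGER PRIMARY KEY column. Flight '999' is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE flight_schedule ("
    " flight_no TEXT PRIMARY KEY NOT NULL,"  # the key cannot be null
    " airline TEXT, weekday TEXT, price INTEGER)"
)
conn.execute(
    "CREATE TABLE dept_airport ("
    " flight_no TEXT NOT NULL REFERENCES flight_schedule (flight_no),"
    " airport_code TEXT)"  # every flight_no must exist in flight_schedule
)
conn.execute("INSERT INTO flight_schedule VALUES ('101', 'delta', 'mo', 156)")
conn.execute("INSERT INTO dept_airport VALUES ('101', 'atl')")


def is_rejected(sql):
    """Return True if the DBMS refuses the statement because of a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# A flight without a flight number.
null_key_rejected = is_rejected(
    "INSERT INTO flight_schedule VALUES (NULL, 'usair', 'mo', 231)"
)
# A departure airport for a flight that does not exist (hypothetical flight 999).
orphan_rejected = is_rejected("INSERT INTO dept_airport VALUES ('999', 'bos')")
print(null_key_rejected, orphan_rejected)
```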

Page 21: What is a database

Data Models

A collection of tools for describing:

• Data
• Data relationships
• Data semantics
• Data constraints

Relational model

• The relational model of data is the most widely used model today.
• Main concept: relation, basically a table with rows and columns.
• Every relation has a schema, which describes the columns, or fields.

Entity-Relationship data model (mainly for database design)

• No database system is based on this model

Object-based data models (Object-oriented and Object-relational)

Semi-structured data model (XML)

Other older models:

• Network model
• Hierarchical model

Page 22: What is a database

Relational Model

[Figure: example of tabular data in the relational model, with the columns labeled "Attributes"]

Commercial systems include: ORACLE, DB2, SYBASE, INFORMIX, INGRES, SQL Server.

Dominates the database market on all platforms

Page 23: What is a database
Page 24: What is a database

Relational Model - Integrity Constraints

• Keys
• Primary Keys
• Entity Integrity
• Referential Integrity

[Figure: the relations reservation(flight#, date, customer#), flight-schedule(flight#), and customer(customer#, customer-name)]

Page 25: What is a database

Relational Model - Operations

• Powerful set-oriented query languages
  – Relational Algebra: procedural; describes how to compute a query; operators like JOIN, SELECT, PROJECT
  – Relational Calculus: declarative; describes the desired result, e.g. SQL, QBE
• insert, delete, and update capabilities

Page 26: What is a database

Relational Model - Operations

tuple calculus example (SQL)

select flight#, date
from reservation R, customer C
where R.customer#=C.customer# and customer-name='LEO';

algebra example (ISBL)

((reservation join customer) where customer-name='LEO') [flight#, date];

The same query as example tables (QBE-style):

customer

customer#  customer-name
_c         LEO

reservation

flight#  date  customer#
.P       .P    _c
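The difference between the procedural algebra form and the declarative SQL form can be sketched side by side. The example assumes Python with its built-in sqlite3 module. The join, select, and project helpers are hypothetical illustrations of the operators, not ISBL. Every row except the customer name LEO is invented test data. Unlike true relational projection, this project does not remove duplicate rows.

```python
import sqlite3

# Invented sample data; only the customer name LEO comes from the slide.
reservation = [
    {"flight_no": 101, "date": "2024-05-01", "customer_no": 1},
    {"flight_no": 545, "date": "2024-05-02", "customer_no": 2},
    {"flight_no": 912, "date": "2024-05-03", "customer_no": 1},
]
customer = [
    {"customer_no": 1, "customer_name": "LEO"},
    {"customer_no": 2, "customer_name": "ANNA"},
]


def join(left, right, attr):
    """Pair up rows that agree on attr (equi-join via nested loops)."""
    return [{**l, **r} for l in left for r in right if l[attr] == r[attr]]


def select(rows, predicate):
    """Keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attrs):
    """Keep only the named attributes of each row."""
    return [tuple(row[a] for a in attrs) for row in rows]


# Procedural: we spell out the order of the steps (join, then select, then project).
algebra_result = project(
    select(join(reservation, customer, "customer_no"),
           lambda row: row["customer_name"] == "LEO"),
    ["flight_no", "date"],
)

# Declarative: we describe the result and let the DBMS choose the steps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservation (flight_no INTEGER, date TEXT, customer_no INTEGER)")
conn.execute("CREATE TABLE customer (customer_no INTEGER, customer_name TEXT)")
conn.executemany(
    "INSERT INTO reservation VALUES (:flight_no, :date, :customer_no)", reservation
)
conn.executemany("INSERT INTO customer VALUES (:customer_no, :customer_name)", customer)
sql_result = conn.execute(
    "SELECT R.flight_no, R.date FROM reservation R, customer C"
    " WHERE R.customer_no = C.customer_no AND C.customer_name = 'LEO'"
    " ORDER BY R.flight_no"
).fetchall()
print(algebra_result == sql_result)
```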

Page 27: What is a database

[Figure: REALITY (structures, processes) is mapped to the MODEL inside the DATABASE SYSTEM by data modeling]

The model represents a perception of structures of reality

The data modeling process is to fix a perception of structures of reality and represent this perception

In the data modeling process we select aspects and we abstract

Page 28: What is a database

[Figure: REALITY (structures, processes) is mapped to the MODEL inside the DATABASE SYSTEM by process modeling]

The use of the model reflects processes of reality

Processes may be represented by programs with embedded database queries and updates

Processes may be represented by ad-hoc database queries and updates at run-time

Page 29: What is a database

Database Design

The purpose of database design is to create a database which

• is a model of structures of reality
• supports queries and updates modeling processes of reality
• runs efficiently

Page 30: What is a database

Instances and Schemas

Instance – the actual content of the database at a particular point in time

[Figure: external schema1, external schema2, and external schema3 sit above the conceptual schema, which sits above the internal schema and the database]

• external schema: use of data
• conceptual schema: meaning of data
• internal schema: storage of data

Page 31: What is a database

Example: University Database

Conceptual schema:

• Students(sid: string, name: string, login: string, age: integer, gpa: real)
• Courses(cid: string, cname: string, credits: integer)
• Enrolled(sid: string, cid: string, grade: string)

External Schema (View):

• Course_info(cid: string, enrollment: integer)

Physical schema:

• Relations stored as unordered files.
• Index on first column of Students.

[Figure: View 1, View 2, and View 3 over the Conceptual Schema, over the Physical Schema, over the DB]
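The three levels of the university example can be written down in one place. This is a sketch assuming Python's built-in sqlite3 module. The slide's string, integer, and real types are mapped to SQLite's TEXT, INTEGER, and REAL. The index name students_sid and the Enrolled rows are hypothetical, and defining Course_info as a count of enrollments per course is an assumption based on its column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the relations and their columns.
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Physical schema: an index on the first column of Students.
    CREATE INDEX students_sid ON Students (sid);

    -- External schema: a view computed from the conceptual schema.
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;

    -- Hypothetical rows so the view has something to show.
    INSERT INTO Enrolled VALUES ('s1', 'c1', 'A'), ('s2', 'c1', 'B'), ('s1', 'c2', 'A');
""")

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
print(course_info)
```

Users of Course_info see only course ids and enrollment counts; they never touch Enrolled directly.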

Page 32: What is a database

Levels of Abstraction

• Views describe how users see the data. Application programs hide details of data types.
• Conceptual schema defines logical structure.
• Physical schema describes how a record (e.g., customer) is stored. Describes the files and indexes used.

[Figure: Users see View 1, View 2, and View 3, over the Conceptual Schema, over the Physical Schema, over the DB]

Page 33: What is a database

Data Independence

Applications are insulated from how data is structured and stored.

• Logical data independence: protection from changes in logical structure of data.
• Physical data independence: protection from changes in physical structure of data.
  – The ability to modify the physical schema without changing the logical schema
  – Applications depend on the logical schema
• In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.

[Figure: View 1, View 2, and View 3 over the Conceptual Schema, over the Physical Schema, over the DB]
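Physical data independence can be demonstrated in a few lines. The sketch assumes Python's built-in sqlite3 module, a cut-down Students table, invented rows, and a hypothetical index name. The application's query text stays the same while the physical schema changes underneath it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("s1", "Ann", 3.9), ("s2", "Bob", 3.1), ("s3", "Cy", 3.6)],  # invented rows
)

# The application only knows the logical schema: table and column names.
QUERY = "SELECT name FROM Students WHERE gpa > 3.5 ORDER BY name"


def run_application():
    """Stand-in for an application program issuing its usual query."""
    return [name for (name,) in conn.execute(QUERY)]


before = run_application()

# Physical change: add an index. No application code or query text changes.
conn.execute("CREATE INDEX students_gpa ON Students (gpa)")

after = run_application()
print(before == after)
```

The index may change how the DBMS finds the rows, but not which rows the application gets back.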

Page 34: What is a database

Selecting a Database Management System

Database management systems (or DBMSs) can be divided into two categories -- desktop databases and server databases.  

Generally speaking, desktop databases are oriented toward single-user applications and reside on standard personal computers (hence the term desktop). 

Server databases contain mechanisms to ensure the reliability and consistency of data and are geared toward multi-user applications.

Page 35: What is a database

Database Users

Users are differentiated by the way they expect to interact with the system:

• Application programmers – interact with system through DML calls
• Sophisticated users – form requests in a database query language
• Specialized users – write specialized database applications that do not fit into the traditional data processing framework
• Naïve users – invoke one of the permanent application programs that have been written previously
  – Examples: people accessing a database over the web, bank tellers, clerical staff

Page 36: What is a database

Database Administrator

• Coordinates all the activities of the database system; has a good understanding of the enterprise’s information resources and needs.
• Database administrator's duties include:
  – Storage structure and access method definition
  – Schema and physical organization modification
  – Granting users authority to access the database
  – Backing up data
  – Monitoring performance and responding to changes
  – Database tuning

Page 37: What is a database

Functionality of a DBMS

The programmer sees SQL, which has two components:

• Data Definition Language - DDL
• Data Manipulation Language - DML
  – query language

Behind the scenes the DBMS has:

• Query engine
• Query optimizer
• Storage management
• Transaction Management (concurrency, recovery)

Page 38: What is a database

Data Definition Language (DDL)

Specification notation for defining the database schema

Example:

    create table account (
        account_number char(10),
        branch_name    char(10),
        balance        integer)

• DDL compiler generates a set of tables stored in a data dictionary
• Data dictionary contains metadata (i.e., data about data)
  – Database schema
  – Data storage and definition language: specifies the storage structure and access methods used
  – Integrity constraints: domain constraints; referential integrity (e.g. branch_name must correspond to a valid branch in the branch table)
  – Authorization
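The data dictionary idea can be seen in a small sketch, assuming Python's built-in sqlite3 module. SQLite is only one example of a DBMS: it keeps its schema metadata in a table named sqlite_master and exposes column details through PRAGMA table_info. Other systems use different catalog tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The DDL statement from the slide.
conn.execute(
    "CREATE TABLE account ("
    " account_number CHAR(10),"
    " branch_name CHAR(10),"
    " balance INTEGER)"
)

# Metadata (data about data): which tables exist.
tables = [
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]
print(tables, columns)
```

The application never wrote these metadata rows itself; the DBMS recorded them as a side effect of the CREATE TABLE statement.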

Page 39: What is a database

Data Manipulation Language (DML)

• Language for accessing and manipulating the data organized by the appropriate data model
  – DML also known as query language
• Two classes of languages
  – Procedural – user specifies what data is required and how to get those data
  – Declarative (nonprocedural) – user specifies what data is required without specifying how to get those data
• SQL is the most widely used query language

Page 40: What is a database

Transactions: ACID Properties

• Key concept is a transaction (Xact): a sequence of database actions (reads/writes).
• DBMS ensures atomicity (all-or-nothing property) even if system crashes in the middle of a Xact.
• Each transaction, executed completely, must take the DB between consistent states or must not run at all.
• DBMS ensures that concurrent transactions appear to run in isolation.
• DBMS ensures durability of committed Xacts even if system crashes.
• Idea: Keep a log (history) of all actions carried out by the DBMS while executing a set of Xacts:
  – Before a change is made to the database, the corresponding log entry is forced to a safe location.
  – After a crash, the effects of partially executed transactions are undone using the log. Effects of committed transactions are redone using the log.

Page 41: What is a database

How the Programmer Sees the DBMS

Start with DDL to create tables:

CREATE TABLE Students (
    Name CHAR(30),
    SSN CHAR(9) PRIMARY KEY NOT NULL,
    Category CHAR(20)
) . . .

Continue with DML to populate tables:

INSERT INTO Students
VALUES ('Charles', '123456789', 'undergraduate')
. . . .

Page 42: What is a database

How the Programmer Sees the DBMS

Tables:

Students:

SSN          Name     Category
123-45-6789  Charles  undergrad
234-56-7890  Dan      grad
…            …

Takes:

SSN          CID
123-45-6789  CSE444
123-45-6789  CSE444
234-56-7890  CSE142
…

Courses:

CID     Name               Quarter
CSE444  Databases          fall
CSE541  Operating systems  winter

“data independence” = separate logical view from physical implementation

Still implemented as files, but behind the scenes can be quite complex

Page 43: What is a database

Transactions

Enroll “Mary Johnson” in “CSE444”:

BEGIN TRANSACTION;

INSERT INTO Takes
    SELECT Students.SSN, Courses.CID
    FROM Students, Courses
    WHERE Students.name = 'Mary Johnson' AND Courses.CID = 'CSE444';

-- More updates here....

IF everything-went-OK THEN COMMIT;
ELSE ROLLBACK

If system crashes, the transaction is still either committed or aborted
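The enrollment transaction can be sketched in runnable form, assuming Python's built-in sqlite3 module. Mary Johnson's SSN and category are hypothetical, and the course is matched on its CID column, as in the Courses table shown earlier. The "everything went OK" test is modeled here as "exactly one row was inserted", which is an assumption about what the slide's pseudocode checks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (SSN TEXT PRIMARY KEY, Name TEXT, Category TEXT);
    CREATE TABLE Courses (CID TEXT PRIMARY KEY, Name TEXT, Quarter TEXT);
    CREATE TABLE Takes (SSN TEXT, CID TEXT);
    -- Hypothetical student row; the SSN is made up for the example.
    INSERT INTO Students VALUES ('345-67-8901', 'Mary Johnson', 'undergrad');
    INSERT INTO Courses VALUES ('CSE444', 'Databases', 'fall');
""")


def enroll(student_name, cid):
    """Enroll a student in a course; commit on success, roll back otherwise."""
    try:
        with conn:  # COMMIT if the block finishes, ROLLBACK if it raises
            cur = conn.execute(
                "INSERT INTO Takes"
                " SELECT Students.SSN, Courses.CID FROM Students, Courses"
                " WHERE Students.Name = ? AND Courses.CID = ?",
                (student_name, cid),
            )
            # More updates would go here.
            if cur.rowcount != 1:
                raise LookupError("student or course not found")
        return True
    except LookupError:
        return False


enrolled = enroll("Mary Johnson", "CSE444")
missing = enroll("Mary Johnson", "CSE999")  # no such course, so nothing is kept
takes = conn.execute("SELECT SSN, CID FROM Takes").fetchall()
print(enrolled, missing, takes)
```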

Page 44: What is a database

Advantages of a DBMS

• Data independence
• Efficient data access
• Data integrity & security
• Data administration
• Concurrent access, crash recovery
• Reduced application development time

So why not use them always?

• Expensive/complicated to set up & maintain
• This cost & complexity must be offset by need
• General-purpose, not suited for special-purpose tasks (e.g. text search!)

Page 45: What is a database

Use a DBMS when this is important

o persistent storage of data
o centralized control of data
o control of redundancy
o control of consistency and integrity
o multiple user support
o sharing of data
o data documentation
o data independence
o control of access and security
o backup and recovery

Do not use a DBMS when

o the initial investment in hardware, software, and training is too high

o the generality a DBMS provides is not needed

o the overhead for security, concurrency control, and recovery is too high

o data and applications are simple and stable

o real-time requirements cannot be met by it

o multiple user access is not needed