Upload
jescie-rodgers
View
33
Download
1
Tags:
Embed Size (px)
DESCRIPTION
What is a database. A database is any organized collection of data. Some examples of databases you may encounter in your daily life are: a telephone book T.V. Guide airline reservation system motor vehicle registration records papers in your filing cabinet - PowerPoint PPT Presentation
Citation preview
What is a database
A database is any organized collection of data. Some examples of databases you may encounter in your daily life are:
a telephone book T.V. Guide airline reservation system motor vehicle registration records papers in your filing cabinet files on your computer hard drive.
Data vs. information: What is the difference?
What is data?Data can be defined in
many ways. Information science defines data as unprocessed information.
What is information? Information is data that
have been organized and communicated in a coherent and meaningful manner.
Data is converted into information, and information is converted into knowledge.
Knowledge; information evaluated and organized so that it can be used purposefully.
Why do we need a database?
Keep records of our:ClientsStaffVolunteers
To keep a record of activities and interventions;
Keep sales records;Develop reports;Perform researchLongitudinal tracking
What Is a Database System?
A Database Management System (DBMS) is a software system designed to store, manage, and facilitate access to databases.
– A database management system (DBMS) such as Access, FileMaker, Lotus Notes, Oracle or SQL Server which provides you with the software tools you need to organize that data in a flexible manner. It includes tools to add, modify or delete data from the database, ask questions (or queries) about the data stored in the database and produce reports summarizing selected contents
A database system is a software system which supports the definition and use of a database.
Database System Applications
DBMS contains information about a particular enterprise Collection of interrelated data Set of programs to access the data An environment that is both convenient and efficient to use
Database Applications: Banking: all transactions Airlines: reservations, schedules Universities: registration, grades Sales: customers, products, purchases Online retailers: order tracking, customized
recommendations Manufacturing: production, inventory, orders, supply chain Human resources: employee records, salaries, tax
deductions Databases touch all aspects of our lives
DataData InformationInformation KnowledgeKnowledge ActionAction
Is to transformIs to transform
Purpose of Database Systems (Cont..) In the early days, database applications were built
directly on top of file systemsDrawbacks of using file systems to store data:
Data redundancy and inconsistency Multiple file formats, duplication of information in
different filesDifficulty in accessing data
Need to write a new program to carry out each new task
Data isolation — multiple files and formats Integrity problems
Integrity constraints (e.g. account balance > 0) become “buried” in program code rather than being stated explicitly
Hard to add new constraints or change existing ones
Purpose of Database Systems (Cont.)
Drawbacks of using file systems (cont.) Atomicity of updates
Failures may leave database in an inconsistent state with partial updates carried out
Example: Transfer of funds from one account to another should either complete or not happen at all
Concurrent access by multiple users Concurrent accessed needed for performance Uncontrolled concurrent accesses can lead to
inconsistencies Example: Two people reading a balance and
updating it at the same timeSecurity problems
Hard to provide user access to some, but not all, dataDatabase systems offer solutions to all the above problems
Why Use a DBMS?
Data independence and efficient access.Reduced application development time.Data integrity and security.Uniform data administration.Concurrent access, recovery from crashes.
Types of Databases
Non-relational databasesNon-relational databases place information in field categories that we create so that information is available for sorting and disseminating the way we need it. The data in a non-relational database, however, is limited to that program and cannot be extracted and applied to a number of other software programs, or other database files within a school or administrative system. The data can only be "copied and pasted.“ Example: a spread sheet
Relational databasesIn relational databases, fields can be used in a number of ways (and can be of variable length), provided that they are linked in tables. It is developed based on a database model that provides for logical connections among files (known as tables) by including identifying data from one table in another table
Structure of a DBMS
A typical DBMS has a layered architecture.
The figure does not show the concurrency control and recovery components.
Each database system has its own variations.
Query Optimizationand Execution
Relational Operators
Files and Access Methods
Buffer Management
Disk Space Management
DB
These layersmust considerconcurrencycontrol andrecovery
An architecture for a database system
Why Use Models?
Models can be useful when we want to examine or manage part of the real world
The costs of using a model are often considerably lower than the costs of using or experimenting with the real world itself
Examples:airplane simulatornuclear power plant simulatorflood warning systemmodel of US economymodel of a heat reservoirmap
Data Model
data structuresintegrity constraintsoperations
A data model consists of notations for expressing:
Data Model - Data Structures
attribute typesentity typesrelationship types
FLIGHT# AIRLINE WEEKDAY PRICE
FLIGHT-SCHEDULE
101 delta mo 156
545 american we 110
912 scandinavian fr 450
242 usair mo 231
DEPT-AIRPORT
FLIGHT# AIRPORT-CODE
101 atl
912 cph
545 lax
All data models have notation for defining:
Data Model - Constraints
Static constraints apply to database stateDynamic constraints apply to change of database stateE.g., “All FLIGHT-SCHEDULE entities must have precisely
one DEPT-AIRPORT relationship
FLIGHT# AIRLINE WEEKDAY PRICE
FLIGHT-SCHEDULE
101 delta mo 156
545 american we 110
912 scandinavian fr 450
242 usair mo 231
Constraints express rules that cannot be expressed by the data structures alone:
DEPT-AIRPORT
FLIGHT# AIRPORT-CODE
101 atl
912 cph
545 lax
242 bos
Data Model - Operations
insert FLIGHT-SCHEDULE(97, delta, tu, 258); insert DEPT-AIRPORT(97, atl);
select FLIGHT#, WEEKDAY
from FLIGHT-SCHEDULE
where AIRLINE=‘delta’;
Operations support change and retrieval of data:
FLIGHT# AIRLINE WEEKDAY PRICE
FLIGHT-SCHEDULE
101 delta mo 156
545 american we 110
912 scandinavian fr 450
242 usair mo 231
97 delta tu 258
DEPT-AIRPORT
FLIGHT# AIRPORT-CODE
101 atl
912 cph
545 lax
242 bos
97 atl
Keys and Identifiers
A key on FLIGHT# in FLIGHT-SCHEDULE will force all FLIGHT#’s to be unique in FLIGHT-SCHEDULE
Consider the following keys on DEPT-AIRPORT:
Keys (or identifiers) are uniqueness constraints
FLIGHT# AIRPORT-CODE FLIGHT# AIRPORT-CODE FLIGHT# AIRPORT-CODEFLIGHT# AIRPORT-CODE
DEPT-AIRPORT
FLIGHT# AIRPORT-CODE
101 atl
912 cph
545 lax
242 bos
FLIGHT# AIRLINE WEEKDAY PRICE
FLIGHT-SCHEDULE
101 delta mo 156
545 american we 110
912 scandinavian fr 450
242 usair mo 231
Integrity and ConsistencyIntegrity: does the model reflect reality well?Consistency: is the model without internal conflicts?
a FLIGHT# in FLIGHT-SCHEDULE cannot be null because it models the existence of an entity in the real world
a FLIGHT# in DEPT-AIRPORT must exist in FLIGHT-SCHEDULE because it doesn’t make sense for a non-existing FLIGHT-SCHEDULE entity to have a DEPT-AIRPORT
DEPT-AIRPORT
FLIGHT# AIRPORT-CODE
101 atl
912 cph
545 lax
242 bos
FLIGHT# AIRLINE WEEKDAY PRICE
FLIGHT-SCHEDULE
101 delta mo 156
545 american we 110
912 scandinavian fr 450
242 usair mo 231
Data Models A collection of tools for describing
Data Data relationships Data semantics Data constraints
Relational model The relational model of data is the most widely used model today.
Main concept: relation, basically a table with rows and columns.
Every relation has a schema, which describes the columns, or fields.
Entity-Relationship data model (mainly for database design) No database system is based on the model Object-based data models (Object-oriented and Object-
relational) Semi structured data model (XML) Other older models:
Network model Hierarchical model
Relational Model
Example of tabular data in the relational modelAttributes
Commercial systems include: ORACLE, DB2, SYBASE, INFORMIX, INGRES, SQL Server.
Dominates the database market on all platforms
Relational Model - Integrity ConstraintsKeysPrimary KeysEntity IntegrityReferential Integrity
reservation
flight# date customer#
flight-schedule
flight#
p
customer
customer# customer name
p
Relational Model - Operations
Powerful set-oriented query languagesRelational Algebra: procedural; describes
how to compute a query; operators like JOIN, SELECT, PROJECT
Relational Calculus: declarative; describes the desired result, e.g. SQL, QBE
insert, delete, and update capabilities
Relational Model - Operations
tuple calculus example (SQL) select flight#, date from reservation R, customer C where R.customer#=C.customer# and customer-name=‘LEO’;algebra example (ISBL) ((reservation join customer) where customer-
name=‘LEO’) [flight#, date];
customercustomer# customer-name
_c LEOdate
reservation
flight# customer#_c.P .P
REALITY
• structures• processes
DATABASE SYSTEM
MODEL
data modeling
The model represents a perception of structures of reality
The data modeling process is to fix a perception of structures of reality and represent this perception
In the data modeling process we select aspects and we abstract
REALITY
• structures• processes
DATABASE SYSTEM
MODEL
process modeling
The use of the model reflects processes of reality Processes may be represented by programs with
embedded database queries and updates Processes may be represented by ad-hoc database
queries and updates at run-time
Database Design
is a model of structures of realitysupports queries and updates modeling processes of
reality runs efficiently
The purpose of database design is to create a database which
Instances and Schemas Instance – the actual content of the database at a particular
point in time
external schema1
external schema1
external schema2
external schema2
external schema3
external schema3
conceptual schema
conceptual schema
internal schema
internal schema
databasedatabase
• external schema:
use of data
• conceptual schema:
meaning of data
• internal schema:
storage of data
Example: University Database
Conceptual schema: Students(sid: string, name: string,
login: string, age: integer, gpa:real)
Courses(cid: string, cname:string, credits:integer)
Enrolled(sid:string, cid:string, grade:string)
External Schema (View): Course_info(cid:string,enrollment:in
teger)Physical schema:
Relations stored as unordered files. Index on first column of Students.
Physical Schema
Conceptual Schema
View 1 View 2 View 3
DB
Levels of Abstraction
Views describe how users see the data. Application programs hide details of data types.
Conceptual schema defines
logical structure
Physical schema describes how a record (e.g., customer) is stored. Describes the files and indexes used.
Physical Schema
Conceptual Schema
View 1 View 2 View 3
DB
Users
Data IndependenceApplications insulated from how
data is structured and stored.Logical data independence:
Protection from changes in logical structure of data.
Physical data independence: Protection from changes in physical structure of data.
The ability to modify the physical schema without changing the logical schema Applications depend on the
logical schema In general, the interfaces
between the various levels and components should be well defined so that changes in some parts do not seriously influence others.
Physical Schema
Conceptual Schema
View 1 View 2 View 3
DB
Selecting a Database Management System
Database management systems (or DBMSs) can be divided into two categories -- desktop databases and server databases.
Generally speaking, desktop databases are oriented toward single-user applications and reside on standard personal computers (hence the term desktop).
Server databases contain mechanisms to ensure the reliability and consistency of data and are geared toward multi-user applications.
Database Users
Users are differentiated by the way they expect to interact with
the systemApplication programmers – interact with system through
DML callsSophisticated users – form requests in a database query
languageSpecialized users – write specialized database applications
that do not fit into the traditional data processing frameworkNaïve users – invoke one of the permanent application
programs that have been written previouslyExamples, people accessing database over the web, bank
tellers, clerical staff
Database Administrator
Coordinates all the activities of the database systemhas a good understanding of the enterprise’s
information resources and needs.Database administrator's duties include:
Storage structure and access method definitionSchema and physical organization modificationGranting users authority to access the databaseBacking up dataMonitoring performance and responding to changes
Database tuning
Functionality of a DBMS
The programmer sees SQL, which has two components:Data Definition Language - DDLData Manipulation Language - DML
query language
Behind the scenes the DBMS has:Query engineQuery optimizerStorage managementTransaction Management (concurrency, recovery)
Data Definition Language (DDL) Specification notation for defining the database schema
Example:create table account ( account_number char(10),
branch_name char(10), balance integer)
DDL compiler generates a set of tables stored in a data dictionary
Data dictionary contains metadata (i.e., data about data) Database schema Data storage and definition language
Specifies the storage structure and access methods used
Integrity constraints Domain constraints Referential integrity (e.g. branch_name must
correspond to a valid branch in the branch table) Authorization
Data Manipulation Language (DML)
Language for accessing and manipulating the data organized by the appropriate data modelDML also known as query language
Two classes of languages Procedural – user specifies what data is required and
how to get those data Declarative (nonprocedural) – user specifies what
data is required without specifying how to get those dataSQL is the most widely used query language
Transactions: ACID Properties Key concept is a transaction: a sequence of database actions
(reads/writes). DBMS ensures atomicity (all-or-nothing property) even if system
crashes in the middle of a Xact. Each transaction, executed completely, must take the DB
between consistent states or must not run at all. DBMS ensures that concurrent transactions appear to run in
isolation. DBMS ensures durability of committed Xacts even if system
crashes. Idea: Keep a log (history) of all actions carried out by the DBMS
while executing a set of Xacts: Before a change is made to the database, the corresponding
log entry is forced to a safe location. After a crash, the effects of partially executed transactions
are undone using the log. Effects of committed transactions are redone using the log.
How the Programmer Sees the DBMS
Start with DDL to create tables:
Continue with DML to populate tables:
CREATE TABLE Students (Name CHAR(30)SSN CHAR(9) PRIMARY KEY NOT NULL,Category CHAR(20)
) . . .
CREATE TABLE Students (Name CHAR(30)SSN CHAR(9) PRIMARY KEY NOT NULL,Category CHAR(20)
) . . .
INSERT INTO StudentsVALUES(‘Charles’, ‘123456789’, ‘undergraduate’). . . .
INSERT INTO StudentsVALUES(‘Charles’, ‘123456789’, ‘undergraduate’). . . .
How the Programmer Sees the DBMS
Tables:
SSN Name Category 123-45-6789 Charles undergrad 234-56-7890 Dan grad … …
SSN CID 123-45-6789 CSE444 123-45-6789 CSE444 234-56-7890 CSE142 …
Students: Takes:
CID Name Quarter CSE444 Databases fall CSE541 Operating systems winter
Courses:
“data independence” = separate logical view from physical implementation
Still implemented as files, but behind the scenes can be quite complex
Transactions
Enroll “Mary Johnson” in “CSE444”:
BEGIN TRANSACTION;
INSERT INTO Takes SELECT Students.SSN, Courses.CID FROM Students, Courses WHERE Students.name = ‘Mary Johnson’ and Courses.name = ‘CSE444’
-- More updates here....
IF everything-went-OK THEN COMMIT;ELSE ROLLBACK
BEGIN TRANSACTION;
INSERT INTO Takes SELECT Students.SSN, Courses.CID FROM Students, Courses WHERE Students.name = ‘Mary Johnson’ and Courses.name = ‘CSE444’
-- More updates here....
IF everything-went-OK THEN COMMIT;ELSE ROLLBACK
If system crashes, the transaction is still either committed or aborted
Advantages of a DBMS
Data independenceEfficient data accessData integrity & securityData administrationConcurrent access, crash recoveryReduced application development timeSo why not use them always?
Expensive/complicated to set up & maintainThis cost & complexity must be offset by needGeneral-purpose, not suited for special-purpose tasks (e.g.
text search!)
Use a DBMS when this is important
persistent storage of data centralized control of
datacontrol of redundancycontrol of consistency
and integritymultiple user supportsharing of datadata documentationdata independencecontrol of access and
securitybackup and recovery
Do not use a DBMS when
o the initial investment in hardware, software, and training is too high
o the generality a DBMS provides is not needed
o the overhead for security, concurrency control, and recovery is too high
o data and applications are simple and stable
o real-time requirements cannot be met by it
o multiple user access is not needed