51
IS 257 – Spring 2004 2004.01.20 - SLIDE 1 Database Management: Introduction Ray R. Larson University of California, Berkeley School of Information Management and Systems SIMS 257: Database Management

2004.01.20 - SLIDE 1IS 257 – Spring 2004 Database Management: Introduction Ray R. Larson University of California, Berkeley School of Information Management

  • View
    221

  • Download
    0

Embed Size (px)

Citation preview

IS 257 – Spring 2004 2004.01.20 - SLIDE 1

Database Management:Introduction

Ray R. Larson

University of California, Berkeley

School of Information Management and Systems

SIMS 257: Database Management

IS 257 – Spring 2004 2004.01.20 - SLIDE 2

Overview

• Announcements– TA – Mayjane Co

• Course Description

• Database Concepts and Terminology

• Database Models

IS 257 – Spring 2004 2004.01.20 - SLIDE 3

Course Overview

• Description of the Course

• Assignments

• Readings

• Grading

• Schedule

• Web site: http://sims.berkeley.edu/courses/is257/s04

IS 257 – Spring 2004 2004.01.20 - SLIDE 4

Course Description

• This course is concerned with the design of the database itself -- not with the design of database system software.– We will discuss DBMS internals only as they

relate to the database and its design and structure

• We will spend a fair amount time on database application design, especially on options for Web application database support -- but this will not be primary focus.

IS 257 – Spring 2004 2004.01.20 - SLIDE 5

Assignments

• Two kinds of assignments– Using a pre-built database for search and

retrieval and database modification queries– Designing, populating, and running queries

against your own personal database• Types of database project

– Individual» Work related» Course only» Projects from around campus that need doing…

– Group» Course related» SIMS Final project

IS 257 – Spring 2004 2004.01.20 - SLIDE 6

Readings

• Textbook is:– Jeffrey A. Hoffer, Mary B. Prescott and Fred

R. McFadden. Modern Database Management (Sixth Edition). Prentice Hall (Pearson Educational) : Upper Saddle River, NJ, 2002.

– ISBN 0-13-061183-2

IS 257 – Spring 2004 2004.01.20 - SLIDE 7

Grading

• Grades will be based on:– Assignments (30%)– Personal/Group Database project (60%)– Class participation (10%)– (No midterm or final)

IS 257 – Spring 2004 2004.01.20 - SLIDE 8

Schedule

• on website:

• http://sims.berkeley.edu/courses/is257/s04/Schedule.html

IS 257 – Spring 2004 2004.01.20 - SLIDE 9

What is a Database?

IS 257 – Spring 2004 2004.01.20 - SLIDE 10

Files and Databases

• File: A collection of records or documents dealing with one organization, person, area or subject. (Rowley)– Manual (paper) files– Computer files

• Database: A collection of similar records with relationships between the records. (Rowley)– bibliographic, statistical, business data, images,

etc.

IS 257 – Spring 2004 2004.01.20 - SLIDE 11

Database

• A Database is a collection of stored operational data used by the application systems of some particular enterprise. (C.J. Date)– Paper “Databases”

• Still contain a large portion of the world’s knowledge

– File-Based Data Processing Systems• Early batch processing of (primarily) business data

– Database Management Systems (DBMS)

IS 257 – Spring 2004 2004.01.20 - SLIDE 12

Terms and Concepts

• Database Management System -- DBMS– Software system used to define, create,

maintain and provide controlled access to the database and repository

IS 257 – Spring 2004 2004.01.20 - SLIDE 13

Terms and Concepts

• Repository– AKA Data Dictionary– The place where all metadata for a particular

database is stored– may also include information on relationships

between files or tables in a particular database

IS 257 – Spring 2004 2004.01.20 - SLIDE 14

Terms and Concepts

• Metadata– Data about data

• In DBMS means all of the characteristics describing the attributes of an entity, E.G.:

– name of attribute– data type of attribute– size of the attribute– format or special characteristics

– Characteristics of files or relations• name, content, notes, etc.

IS 257 – Spring 2004 2004.01.20 - SLIDE 15

Why DBMS?

• History– 50’s and 60’s all applications were custom

built for particular needs– File based– Many similar/duplicative applications dealing

with collections of business data– Early DBMS were extensions of programming

languages– 1970 - E.F. Codd and the Relational Model– 1979 - Ashton-Tate & first Microcomputer

DBMS

IS 257 – Spring 2004 2004.01.20 - SLIDE 16

File Based Systems

Naughty

NiceJust what asked for

CoalEstimation

DeliveryList

Application File

ToysAddresses

Toys

IS 257 – Spring 2004 2004.01.20 - SLIDE 17

From File Systems to DBMS

• Problems with File Processing systems– Inconsistent Data

– Inflexibility

– Limited Data Sharing

– Poor enforcement of standards

– Excessive program maintenance

IS 257 – Spring 2004 2004.01.20 - SLIDE 18

DBMS Benefits

• Minimal Data Redundancy• Consistency of Data• Integration of Data• Sharing of Data• Ease of Application Development• Uniform Security, Privacy, and Integrity

Controls• Data Accessibility and Responsiveness• Data Independence• Reduced Program Maintenance

IS 257 – Spring 2004 2004.01.20 - SLIDE 19

Terms and Concepts

• Data Independence– Physical representation and location of data

and the use of that data are separated• The application doesn’t need to know how or

where the database has stored the data, but just how to ask for it.

• Moving a database from one DBMS to another should not have a material effect on application program

• Recoding, adding fields, etc. in the database should not affect applications

IS 257 – Spring 2004 2004.01.20 - SLIDE 20

Database Environment

CASE Tools

DBMS

UserInterface

ApplicationPrograms

Repository Database

IS 257 – Spring 2004 2004.01.20 - SLIDE 21

Database Components

DBMS===============

Design toolsTable CreationForm CreationQuery CreationReport Creation

Procedural language

compiler (4GL)=============

Run timeForm processorQuery processor

Report WriterLanguage Run time

UserInterface

Applications

ApplicationProgramsDatabase

Database contains:User’s DataMetadataIndexesApplication Metadata

IS 257 – Spring 2004 2004.01.20 - SLIDE 22

Types of Database Systems

• PC Databases

• Centralized Database

• Client/Server Databases

• Distributed Databases

• Database Models

IS 257 – Spring 2004 2004.01.20 - SLIDE 23

PC Databases

E.G.AccessFoxProDbaseEtc.

IS 257 – Spring 2004 2004.01.20 - SLIDE 24

Centralized Databases

Cental Computer

IS 257 – Spring 2004 2004.01.20 - SLIDE 25

Client Server Databases

NetworkClient

Client

Client

DatabaseServer

IS 257 – Spring 2004 2004.01.20 - SLIDE 26

Distributed Databases

computercomputer

computer

Location A

Location CLocation B

HomogeneousDatabases

IS 257 – Spring 2004 2004.01.20 - SLIDE 27

Distributed Databases

Local Network

DatabaseServer

Client

Client

CommServer

Remote Comp.

Remote Comp.

HeterogeneousOr FederatedDatabases

IS 257 – Spring 2004 2004.01.20 - SLIDE 28

Terms and Concepts

• Database Application– An application program (or set of related

programs) that is used to perform a series of database activities:

• Create• Read• Update• Delete• On behalf of database users

IS 257 – Spring 2004 2004.01.20 - SLIDE 29

Range of Database Applications

• PC databases– Usually for individual

• WorkGroup databases– Small group use where everyone has access

to the database over a LAN

• Departmental databases– Larger than a workgroup – but similar

• Enterprises databases– For the entire organization over an intranet (or

sometimes the internet)

IS 257 – Spring 2004 2004.01.20 - SLIDE 30

Terms and Concepts

• Database activities:– Create

• Add new data to the database

– Read• Read current data from the database

– Update• Update or modify current database data

– Delete• Remove current data from the database

IS 257 – Spring 2004 2004.01.20 - SLIDE 31

Terms and Concepts

• Enterprise– Organization

• Entity– Person, Place, Thing, Event, Concept...

• Attributes– Data elements (facts) about some entity– Also sometimes called fields or items or domains

• Data values– instances of a particular attribute for a particular entity

IS 257 – Spring 2004 2004.01.20 - SLIDE 32

Terms and Concepts

• Records– The set of values for all attributes of a

particular entity– AKA “tuples” or “rows” in relational DBMS

• File– Collection of records – AKA “Relation” or “Table” in relational DBMS

IS 257 – Spring 2004 2004.01.20 - SLIDE 33

Terms and Concepts

• Key– an attribute or set of attributes used to identify

or locate records in a file

• Primary Key– an attribute or set of attributes that uniquely

identifies each record in a file

IS 257 – Spring 2004 2004.01.20 - SLIDE 34

Terms and Concepts

• DA– Data adminstrator - person responsible for the

Data Administration function in an organization

– Sometimes may be the CIO -- Chief Information Officer

• DBA– Database Administrator - person responsible

for the Database Administration Function

IS 257 – Spring 2004 2004.01.20 - SLIDE 35

Terms and Concepts

• Data Administration– Responsibility for the overall management of data

resources within an organization

• Database Administration– Responsibility for physical database design and

technical issues in database management

• Data Steward– Responsibility for some subset of the

organization’s data, and all of the interactions (applications, user access, etc.) for that data

IS 257 – Spring 2004 2004.01.20 - SLIDE 36

Terms and Concepts

• Models– (1) Levels or views of the Database

• Conceptual, logical, physical

– (2) DBMS types• Relational, Hierarchic, Network, Object-Oriented,

Object-Relational

IS 257 – Spring 2004 2004.01.20 - SLIDE 37

Models (1)

ConceptualModel

LogicalModel

External Model

Conceptual requirements

Conceptual requirements

Conceptual requirements

Conceptual requirements

Application 1

Application 1

Application 2 Application 3 Application 4

Application 2

Application 3

Application 4

External Model

External Model

External Model

Internal Model

IS 257 – Spring 2004 2004.01.20 - SLIDE 38

Data Models(2): History

• Hierarchical Model (1960’s and 1970’s)– Similar to data structures in programming

languages.

Books(id, title)

Publisher SubjectsAuthors

(first, last)

IS 257 – Spring 2004 2004.01.20 - SLIDE 39

Data Models(2): History

• Network Model (1970’s)– Provides for single entries of data and

navigational “links” through chains of data.

Subjects Books

Authors

Publishers

IS 257 – Spring 2004 2004.01.20 - SLIDE 40

Data Models(2): History

• Relational Model (1980’s)– Provides a conceptually simple model for data

as relations (typically considered “tables”) with all data visible.

Book ID Title pubid Author id1 Introductio 2 12 The history 4 23 New stuff ab 3 34 Another title 2 45 And yet more 1 5

pubid pubname1 Harper2 Addison3 Oxford4 Que

Authorid Author name1 Smith2 Wynar3 Jones4 Duncan5 Applegate

Subid Subject1 cataloging2 history3 stuff

Book ID Subid1 22 13 34 24 3

IS 257 – Spring 2004 2004.01.20 - SLIDE 41

Data Models(2): History

• Object Oriented Data Model (1990’s)– Encapsulates data and operations as

“Objects”

Books(id, title)

Publisher SubjectsAuthors

(first, last)

IS 257 – Spring 2004 2004.01.20 - SLIDE 42

Data Models(2): History

• Object-Relational Model (1990’s)– Combines the well-known properties of the

Relational Model with such OO features as:• User-defined datatypes• User-defined functions• Inheritance and sub-classing

IS 257 – Spring 2004 2004.01.20 - SLIDE 43

Database System Life Cycle

Growth,Change, &

Maintenance6

Operations5

Integration4

Design1

Conversion3

PhysicalCreation

2

IS 257 – Spring 2004 2004.01.20 - SLIDE 44

Design

• Determination of the needs of the organization

• Development of the Conceptual Model of the database– Typically using Entity-Relationship

diagramming techniques

• Construction of a Data Dictionary• Development of the Logical Model

IS 257 – Spring 2004 2004.01.20 - SLIDE 45

Physical Creation

• Development of the Physical Model of the Database– data formats and types– determination of indexes, etc.

• Load a prototype database and test• Determine and implement security, privacy

and access controls• Determine and implement integrity constraints

IS 257 – Spring 2004 2004.01.20 - SLIDE 46

Conversion

• Convert existing data sets and applications to use the new database– May need programs, conversion utilities to

convert old data to new formats.

IS 257 – Spring 2004 2004.01.20 - SLIDE 47

Integration

• Overlaps with Phase 3

• Integration of converted applications and new applications into the new database

IS 257 – Spring 2004 2004.01.20 - SLIDE 48

Operations

• All applications run full-scale

• Privacy, security, access control must be in place.

• Recovery and Backup procedures must be established and used

IS 257 – Spring 2004 2004.01.20 - SLIDE 49

Growth, Change & Maintenance

• Change is a way of life– Applications, data requirements, reports, etc.

will all change as new needs and requirements are found

– The Database and applications and will need to be modified to meet the needs of changes

IS 257 – Spring 2004 2004.01.20 - SLIDE 50

Another View of the Life Cycle

Operations5

Conversion3

PhysicalCreation

2Growth, Change

6

Integration4

Design1

IS 257 – Spring 2004 2004.01.20 - SLIDE 51

Next Time