INTRODUCTION TO DATABASES CS 260 Database Systems

Course Outline

• SQL queries
  • DML Select (SELECT)
  • DDL (CREATE, ALTER, DROP)
  • DML Action (INSERT, UPDATE, DELETE)
• User authentication
• Data modeling and normalization
• Transaction processing
• Stored DBMS programs
• Scaling and performance
• Distributed databases

Why Study Database Systems?

Important
• The back end of many important systems
  • Web-based e-commerce (e.g. amazon.com, imdb.com)
  • Scientific sites (e.g. protein sequences)
  • Real-time systems (e.g. air-traffic control)
  • Business systems (e.g. banking)
• Good job experience
  • Many jobs require database systems experience
  • Oracle DBAs hired at >$80K, often make >$100K
• A lot of data out there
  • Facebook – over 100 PB
  • Amazon – over 90 PB
  • Google processes over 20 PB per day
  • AT&T (323 TB, 2010)
  • World Data Center for Climate (6+ PB, 2010)

Why Study Database Systems?

As it relates to UWEC courses
• CS268
  • Use SQL and JDBC
• CS355 & CS485 (SE I and II)
  • Project(s) might use databases
• Any CS course might use a database!

Overview

• Data management
• Database system architectures
• Database vocabulary & history

Data Management

"Real" computer applications tend to have:
• Large volumes of data
• Different types of data
• Data with complex relationships

“Old” Approach to Data Management

• File-based
• Decentralized
  • Each application has its own data files

[Diagram: Application1 and Application2, each with its own data file (Data File 1, Data File 2)]

Problems with the file-based approach?

• Duplicate data
• Redundant data
  • Becomes inconsistent over time
• Requires custom programs to create, update, and retrieve data from files
• Transaction atomicity
• Concurrent access issues
• Security

DBMS Approach to Data Management

• Data is viewed as a shared organizational resource
• Centralized
  • All applications interact with a single centralized database

[Diagram: Application1 and Application2 both connect to a single Database]

What is a Database?

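• Components
  • Data files
  • Database Management System (DBMS)

[Diagram: Application1 and Application2 connect to the DBMS, which manages the Data Files; the DBMS and Data Files together make up the Database]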

What is the DBMS?

• Set of programs that perform…
  • Basic data handling tasks
    • Insert, Update, Delete, Retrieve
  • Creating atomic transactions
  • Enabling/disabling concurrency
  • Database administration tasks
    • Creating users, creating objects, enforcing security, creating backups, …
• Applications interact only with the DBMS
  • Never interact directly with the data files
• Major players today: Oracle, SQL Server, DB2, MySQL
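
The basic data handling tasks listed above can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module as a stand-in for a full DBMS such as Oracle; the candy_product table, its columns, and the simulated failure are hypothetical choices made for the example.

```python
import sqlite3

# The application talks only to the DBMS (through SQL);
# it never opens the underlying data files itself.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candy_product ("
    " prod_id INTEGER PRIMARY KEY,"
    " prod_desc TEXT NOT NULL,"
    " prod_price REAL NOT NULL)"
)

# Basic data handling: insert, update, delete.
conn.execute("INSERT INTO candy_product VALUES (1, 'Celestial Cashew Crunch', 10.00)")
conn.execute("INSERT INTO candy_product VALUES (2, 'Unbrittle Peanut Paradise', 9.00)")
conn.execute("UPDATE candy_product SET prod_price = 9.25 WHERE prod_id = 2")
conn.execute("DELETE FROM candy_product WHERE prod_id = 1")
conn.commit()

# Atomic transaction: the price change below is rolled back
# because the block fails before it can be committed.
try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE candy_product SET prod_price = 99.00 WHERE prod_id = 2")
        raise RuntimeError("simulated failure in the middle of the transaction")
except RuntimeError:
    pass

# Retrieve: only the committed state is visible.
rows = conn.execute(
    "SELECT prod_id, prod_desc, prod_price FROM candy_product"
).fetchall()
print(rows)
```

The `with conn:` block is what makes the failed update all-or-nothing; without it, the application would have to undo partial changes itself, which is the situation the file-based approach leaves you in.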

Database System Architectures

• Single-tier
• 2-tier
• N-tier

Single-Tier Databases

• Personal computer-based
  • Applications and database run on the same local computer
  • Examples: Access, FileMaker
  • What are the limitations?

[Diagram: Applications and Database on a single Host Computer (microcomputer)]

Single-Tier Databases

• Mainframe-based (host)
  • Applications & database run on the same host computer
  • Mainly used in legacy systems

[Diagram: Terminals connect over a Network to a Host Computer that runs the Applications and the Database]

2-Tier (Client/Server)

• DBMS runs on a server
• Client applications run on the clients
  • Example: Oracle (server) + SQL Developer (client)

[Diagram: Client Workstations running Database Applications connect over a Network to the Database Server, which hosts the Database]

N-Tier Database Systems

• N-tier system places each service type on a different host

[Diagram: Client Workstations (Browser) provide User Services; Middle-tier Server(s) (Web server) provide Business Services; the Database Server hosts the Database]

Sample Database (CANDY)

CANDY_CUSTOMER

CUST_ID  CUST_NAME         CUST_TYPE  CUST_ADDR        CUST_ZIP  CUST_PHONE  USERNAME   PASSWORD
1        Jones, Joe        P          1234 Main St.    91212     434-1231    jonesj     1234
2        Armstrong, Inc.   R          231 Globe Blvd.  91212     434-7664    armstrong  3333
3        Swedish Burgers   R          1889 20th N.E.   91213     434-9090    swedburg   2353
4        Pickled Pickles   R          194 CityView     91289     324-8909    pickpick   5333
5        The Candy Kid     W          2121 Main St.    91212     563-4545    kidcandy   2351
6        Waterman, Al      P          23 Yankee Blvd.  91234                 wateral    8900
7        Bobby Bon Bons    R          12 Nichi Cres.   91212     434-9045    bobbybon   3011
8        Crowsh, Elias     P          7 77th Ave.      91211     434-0007    crowel     1033
9        Montag, Susie     P          981 Montview     91213     456-2091    montags    9633
10       Columberg Sweets  W          239 East Falls   91209     874-9092    columswe   8399

CANDY_PURCHASE

PURCH_ID  PROD_ID  CUST_ID  PURCH_DATE  DELIVERY_DATE  POUNDS  STATUS
1         1        5        28-Oct-04   28-Oct-04      3.5     PAID
2         2        6        28-Oct-04   30-Oct-04      15      PAID
3         1        9        28-Oct-04   28-Oct-04      2       PAID
3         3        9        28-Oct-04   28-Oct-04      3.7     PAID
4         3        2        28-Oct-04                  3.7     PAID
5         1        7        29-Oct-04   29-Oct-04      3.7     NOT PAID
5         2        7        29-Oct-04   29-Oct-04      1.2     NOT PAID
5         3        7        29-Oct-04   29-Oct-04      4.4     NOT PAID
6         2        7        29-Oct-04                  3       PAID
7         2        10       29-Oct-04                  14      NOT PAID
7         5        10       29-Oct-04                  4.8     NOT PAID
8         1        4        29-Oct-04   29-Oct-04      1       PAID
8         5        4        29-Oct-04                  7.6     PAID
9         5        4        29-Oct-04   29-Oct-04      3.5     NOT PAID

CANDY_PRODUCT

PROD_ID  PROD_DESC                    PROD_COST  PROD_PRICE
1        Celestial Cashew Crunch      $7.45      $10.00
2        Unbrittle Peanut Paradise    $5.75      $9.00
3        Mystery Melange              $7.75      $10.50
4        Millionaire’s Macadamia Mix  $12.50     $16.00
5        Nuts Not Nachos              $6.25      $9.50

CANDY_CUST_TYPE

CUST_TYPE_ID  CUST_TYPE_DESC
P             Private
R             Retail
W             Wholesale

Basic Database Vocabulary

• Field: column of similar data values
• Record*: row of related fields
• Table*: set of related rows

PROD_ID  PROD_DESC                    PROD_COST  PROD_PRICE
1        Celestial Cashew Crunch      $7.45      $10.00
2        Unbrittle Peanut Paradise    $5.75      $9.00
3        Mystery Melange              $7.75      $10.50
4        Millionaire’s Macadamia Mix  $12.50     $16.00
5        Nuts Not Nachos              $6.25      $9.50

* Mathematical terms for record and table: tuple and relation
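
To make the vocabulary concrete, here is a small sketch that loads the product rows shown above and pulls out one table, one record, and one field. It assumes Python's built-in sqlite3 module; the lowercase table and column names are illustrative spellings of the ones on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candy_product ("
    " prod_id INTEGER, prod_desc TEXT, prod_cost REAL, prod_price REAL)"
)
products = [
    (1, "Celestial Cashew Crunch", 7.45, 10.00),
    (2, "Unbrittle Peanut Paradise", 5.75, 9.00),
    (3, "Mystery Melange", 7.75, 10.50),
    (4, "Millionaire's Macadamia Mix", 12.50, 16.00),
    (5, "Nuts Not Nachos", 6.25, 9.50),
]
conn.executemany("INSERT INTO candy_product VALUES (?, ?, ?, ?)", products)

# Table (relation): the whole set of related rows.
table = conn.execute("SELECT * FROM candy_product ORDER BY prod_id").fetchall()

# Record (tuple): one row of related fields.
record = table[2]

# Field: one column of similar data values, taken across every record.
field = [
    row[0]
    for row in conn.execute("SELECT prod_desc FROM candy_product ORDER BY prod_id")
]

print(len(table), record, field)
```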

Database History

• All pre-1960s systems used file-based data
• First database: Apollo project
  • Goal: No duplicate data in multiple locations
  • Used a hierarchical structure
  • Created relationships using pointers
    • Pointer: hardware address

Example Hierarchical Database

UNIVERSITY_STUDENT

StudentID  StudentLastName  StudentFirstName  StudentMI  Pointers to Course Data*
5000       Nelson           Amber             S
5001       Hernandez        Joseph            P
5002       Myers            Stephen           R

UNIVERSITY_COURSE

CourseID  CourseName  CourseTitle
100       MIS 290     Intro. to Database Applications
101       MIS 304     Fundamentals of Business Programming
102       MIS 310     Systems Analysis & Design

* Hex number referencing data’s physical location on hard drive

Problems with Hierarchical Databases

• Relationships are all one-way; to go the other way, you must create a new set of pointers
• Pointers are hardware-specific
  • VERY hard to move to new hardware
• Applications must be custom-written
  • Usually in COBOL

Relational Databases

• Circa 1972
  • E.F. Codd
  • “Normalizing” relations
• Goal: No redundant data
• Stores data in a tabular format
• Creates relationships by sharing key fields

Key Fields

• Primary key: field that uniquely identifies a record
  • Often abbreviated “PK”

UNIVERSITY_INSTRUCTOR (the InstructorID values are the primary keys)

InstructorID  InstructorLastName  InstructorFirstName
1             Black               Greg
2             McIntyre            Karen
3             Sarin               Naj
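
A rough sketch of what "uniquely identifies a record" means in practice, assuming Python's built-in sqlite3 module and the instructor rows from the slide. The extra instructor in the rejected insert ('Doe', 'Pat') is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE university_instructor ("
    " instructor_id INTEGER PRIMARY KEY,"
    " last_name TEXT, first_name TEXT)"
)
conn.executemany(
    "INSERT INTO university_instructor VALUES (?, ?, ?)",
    [(1, "Black", "Greg"), (2, "McIntyre", "Karen"), (3, "Sarin", "Naj")],
)

# Looking up by PK returns exactly one record.
matches = conn.execute(
    "SELECT last_name, first_name FROM university_instructor"
    " WHERE instructor_id = ?",
    (2,),
).fetchall()

# Reusing an existing PK value is rejected by the DBMS.
try:
    conn.execute("INSERT INTO university_instructor VALUES (3, 'Doe', 'Pat')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(matches, duplicate_rejected)
```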

Class Discussion

What is the primary key of each table in the CANDY database?

How can you tell if a field is a primary key?

Special Types of Primary Keys

• Composite PK: made by combining 2 or more fields to create a unique identifier
  • Consider the CANDY_PURCHASE table…
• Surrogate PK: ID generated by the DBMS solely as a unique identifier (not done in above example, but likely in CANDY_CUSTOMER and CANDY_PRODUCT)
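
One way to explore the CANDY_PURCHASE question is sketched below. It assumes the composite PK is the pair (PURCH_ID, PROD_ID), which the sample rows suggest but the slides do not state outright, and it uses Python's built-in sqlite3 module with only a few of the purchase rows and columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candy_purchase ("
    " purch_id INTEGER NOT NULL,"
    " prod_id INTEGER NOT NULL,"
    " cust_id INTEGER NOT NULL,"
    " pounds REAL,"
    " PRIMARY KEY (purch_id, prod_id))"  # assumed composite key
)
conn.executemany(
    "INSERT INTO candy_purchase VALUES (?, ?, ?, ?)",
    [(3, 1, 9, 2.0), (3, 3, 9, 3.7), (5, 1, 7, 3.7)],
)

# PURCH_ID alone repeats, so it cannot be the PK by itself...
purch_3_count = conn.execute(
    "SELECT COUNT(*) FROM candy_purchase WHERE purch_id = 3"
).fetchone()[0]

# ...but the same (PURCH_ID, PROD_ID) pair cannot appear twice.
try:
    conn.execute("INSERT INTO candy_purchase VALUES (3, 1, 9, 1.0)")
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True

print(purch_3_count, pair_rejected)
```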

Key Fields (continued)

• Foreign key
  • Field that is a primary key in another table
  • Serves to create a relationship

UNIVERSITY_STUDENT (the AdvisorID values are the foreign keys)

StudentID  StudentLastName  StudentFirstName  StudentMI  AdvisorID
5000       Nelson           Amber             S          1
5001       Hernandez        Joseph            P          1
5002       Myers            Stephen           R          3

UNIVERSITY_INSTRUCTOR (the InstructorID values are the primary keys)

InstructorID  InstructorLastName  InstructorFirstName
1             Black               Greg
2             McIntyre            Karen
3             Sarin               Naj
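
The relationship created by AdvisorID can be followed with a join. The sketch below assumes Python's built-in sqlite3 module; the lowercase table and column names are illustrative, and the `PRAGMA foreign_keys = ON` line is needed because SQLite does not enforce foreign keys unless asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute(
    "CREATE TABLE university_instructor ("
    " instructor_id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT)"
)
conn.execute(
    "CREATE TABLE university_student ("
    " student_id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT, mi TEXT,"
    " advisor_id INTEGER REFERENCES university_instructor (instructor_id))"
)
conn.executemany(
    "INSERT INTO university_instructor VALUES (?, ?, ?)",
    [(1, "Black", "Greg"), (2, "McIntyre", "Karen"), (3, "Sarin", "Naj")],
)
conn.executemany(
    "INSERT INTO university_student VALUES (?, ?, ?, ?, ?)",
    [
        (5000, "Nelson", "Amber", "S", 1),
        (5001, "Hernandez", "Joseph", "P", 1),
        (5002, "Myers", "Stephen", "R", 3),
    ],
)

# Follow the FK from each student to the matching instructor record.
advisors = conn.execute(
    "SELECT s.last_name, i.last_name"
    " FROM university_student s"
    " JOIN university_instructor i ON s.advisor_id = i.instructor_id"
    " ORDER BY s.student_id"
).fetchall()
print(advisors)
```

Each advisor's name is stored once, in UNIVERSITY_INSTRUCTOR; the student rows carry only the key value.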

Alternative to Foreign Keys

• Repeat data values for every record
• Problems
  • Takes extra space
  • Redundant data becomes inconsistent over time

UNIVERSITY_STUDENT

StudentID  StudentLastName  StudentFirstName  StudentMI  AdvisorLastName  AdvisorFirstName
5000       Nelson           Amber             S          Black            Greg
5001       Hernandez        Joseph            P          Black            Gregory
5002       Myers            Stephen           R          Sarin            Naj
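
The inconsistency in the table above (Greg vs. Gregory) is easy to surface with a query. This is a sketch only, assuming Python's built-in sqlite3 module and illustrative lowercase column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE university_student ("
    " student_id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT, mi TEXT,"
    " advisor_last_name TEXT, advisor_first_name TEXT)"
)
conn.executemany(
    "INSERT INTO university_student VALUES (?, ?, ?, ?, ?, ?)",
    [
        (5000, "Nelson", "Amber", "S", "Black", "Greg"),
        (5001, "Hernandez", "Joseph", "P", "Black", "Gregory"),
        (5002, "Myers", "Stephen", "R", "Sarin", "Naj"),
    ],
)

# The same advisor is stored once per student, so nothing stops
# the copies from drifting apart.
spellings = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT advisor_first_name FROM university_student"
        " WHERE advisor_last_name = 'Black'"
        " ORDER BY advisor_first_name"
    )
]
print(spellings)
```

With a foreign key instead, the advisor's name would live in a single instructor record, so there would be only one copy to keep correct.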

Class Discussion

What are the foreign keys in the CANDY database?

Does a table HAVE to have foreign keys?

How can you tell if a field is a foreign key?

Rules for Relational Database Tables

Every record has to have a non-NULL (or “non-empty”) and unique PK value

Every FK value must match an existing PK value in its parent table
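
Both rules can be handed to the DBMS to enforce. The sketch below assumes Python's built-in sqlite3 module, treats CUST_TYPE in CANDY_CUSTOMER as a foreign key to CANDY_CUST_TYPE (an assumption based on the sample data), and uses a hypothetical `rejected` helper plus made-up rows ('Unknown', 'Personal', 'Example Co.') to probe each rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
# NOT NULL is spelled out because SQLite otherwise tolerates NULL
# in a non-INTEGER primary key column.
conn.execute(
    "CREATE TABLE candy_cust_type ("
    " cust_type_id TEXT PRIMARY KEY NOT NULL, cust_type_desc TEXT)"
)
conn.execute(
    "CREATE TABLE candy_customer ("
    " cust_id INTEGER PRIMARY KEY, cust_name TEXT,"
    " cust_type TEXT REFERENCES candy_cust_type (cust_type_id))"
)
conn.executemany(
    "INSERT INTO candy_cust_type VALUES (?, ?)",
    [("P", "Private"), ("R", "Retail"), ("W", "Wholesale")],
)
conn.commit()


def rejected(sql):
    """Hypothetical helper: True if the DBMS refuses the statement."""
    try:
        with conn:  # roll back automatically when the statement fails
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Rule 1: every record needs a non-NULL, unique PK value.
null_pk_rejected = rejected("INSERT INTO candy_cust_type VALUES (NULL, 'Unknown')")
dup_pk_rejected = rejected("INSERT INTO candy_cust_type VALUES ('P', 'Personal')")

# Rule 2: every FK value must exist as a PK value in the parent table.
bad_fk_rejected = rejected("INSERT INTO candy_customer VALUES (11, 'Example Co.', 'X')")
good_fk_rejected = rejected("INSERT INTO candy_customer VALUES (1, 'Jones, Joe', 'P')")

print(null_pk_rejected, dup_pk_rejected, bad_fk_rejected, good_fk_rejected)
```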