Upload
deepanbaalan5112
View
225
Download
0
Embed Size (px)
Citation preview
7/29/2019 Database Design by Nunes
1/50
SAN DIEGO SUPERCOMPUTER CENTER
Introduction toDatabase Design
7/29/2019 Database Design by Nunes
2/50
SAN DIEGO SUPERCOMPUTER CENTER
Database Design Agenda
7/29/2019 Database Design by Nunes
3/50
SAN DIEGO SUPERCOMPUTER CENTER
General Design Considerations
7/29/2019 Database Design by Nunes
4/50
SAN DIEGO SUPERCOMPUTER CENTER
Users
7/29/2019 Database Design by Nunes
5/50
SAN DIEGO SUPERCOMPUTER CENTER
Legacy Systems/Data
7/29/2019 Database Design by Nunes
6/50
SAN DIEGO SUPERCOMPUTER CENTER
Application Requirements
7/29/2019 Database Design by Nunes
7/50
SAN DIEGO SUPERCOMPUTER CENTER
Entity - Relationship Model
7/29/2019 Database Design by Nunes
8/50
SAN DIEGO SUPERCOMPUTER CENTER
Entities
Employee Department
7/29/2019 Database Design by Nunes
9/50
SAN DIEGO SUPERCOMPUTER CENTER
Attributes
Name SSN
Employee Department
Name Budget
7/29/2019 Database Design by Nunes
10/50
SAN DIEGO SUPERCOMPUTER CENTER
Relationships
Name SSN
Employee Department
Name Budget
works in
Start date
7/29/2019 Database Design by Nunes
11/50
SAN DIEGO SUPERCOMPUTER CENTER
Relationship Connectivity
Name SSN
Employee Department
Name Budget
work 1N
Start date
7/29/2019 Database Design by Nunes
12/50
SAN DIEGO SUPERCOMPUTER CENTER
Connectivity
Department Managerhas 11
Department Projecthas N1
Employee Projectworks on NM
one-to-one
one-to-many
many-to-many
7/29/2019 Database Design by Nunes
13/50
SAN DIEGO SUPERCOMPUTER CENTER
ER example
7/29/2019 Database Design by Nunes
14/50
SAN DIEGO SUPERCOMPUTER CENTER
Team Entities & Attributes
Players Sales
Start date End date
StatisticsName
tickets merchandise
Games
opponentdate result
7/29/2019 Database Design by Nunes
15/50
SAN DIEGO SUPERCOMPUTER CENTER
Team Relationships
PlayersGamesN1
play
7/29/2019 Database Design by Nunes
16/50
SAN DIEGO SUPERCOMPUTER CENTER
Team Relationships
Games Salesgenerates
11
7/29/2019 Database Design by Nunes
17/50
SAN DIEGO SUPERCOMPUTER CENTER
Team ER Diagram
Players
Games
Sales
play generates
N 1
1
1
Start date End date Statistics
Name
tickets merchandise
opponentdate result
7/29/2019 Database Design by Nunes
18/50
SAN DIEGO SUPERCOMPUTER CENTER
Logical Design to Physical Design
7/29/2019 Database Design by Nunes
19/50
SAN DIEGO SUPERCOMPUTER CENTER
Entity tables
Name SSN
Employeecreate table employee
(emp_no number,name varchar2(256),ssn number,primary key
(emp_no));
7/29/2019 Database Design by Nunes
20/50
SAN DIEGO SUPERCOMPUTER CENTER
Foreign Keys
create table employee(emp_no number,
dept_no number,name varchar2(256),ssn number,primary key (emp_no),foreign key (dept_no) references department);
Employee
Department
has
1
N
create table department(dept_no number,name varchar2(50),primary key (dept_no));
7/29/2019 Database Design by Nunes
21/50
SAN DIEGO SUPERCOMPUTER CENTER
Foreign Key
Department
Employee
Accounting has 1 employee:Brian Burnett
Human Resources has 2 employees:Nora EdwardsBen Smith
IT has 3 employees:Ajay PatelJohn OLearyJulia Lenin
7/29/2019 Database Design by Nunes
22/50
SAN DIEGO SUPERCOMPUTER CENTER
Many-to-Many tables
create table proj_has_emp(proj_no number,emp_no number,start_date date,primary key (proj_no, emp_no),foreign key (proj_no) references project
foreign key (emp_no) references employee);
Employee
Project
has
N
M
Start date
7/29/2019 Database Design by Nunes
23/50
SAN DIEGO SUPERCOMPUTER CENTER
Many-to-Many tables
Project
Employee
proj_has_emp
Employee Audit has 1 employee:Brian Burnett
Budget has 2 employees:
Julia LeninNora Edwards
Intranet has 3 employees:Julia LeninJohn OLearyAjay Patel
7/29/2019 Database Design by Nunes
24/50
SAN DIEGO SUPERCOMPUTER CENTER
Tutorial
7/29/2019 Database Design by Nunes
25/50
SAN DIEGO SUPERCOMPUTER CENTER
Tutorial
7/29/2019 Database Design by Nunes
26/50
SAN DIEGO SUPERCOMPUTER CENTER
Tutorial
7/29/2019 Database Design by Nunes
27/50
SAN DIEGO SUPERCOMPUTER CENTER
Normalization
7/29/2019 Database Design by Nunes
28/50
SAN DIEGO SUPERCOMPUTER CENTER
First Normal Form (1NF)
No repeating columns within a row.
No multi-valued columns.
Queries become easier.
7/29/2019 Database Design by Nunes
29/50
SAN DIEGO SUPERCOMPUTER CENTER
1NF
Employee (unnormalized)
emp_no name dept_no dept_name skills
1 Kevin Jacobs 201 R&D C
1 Kevin Jacobs 201 R&D Perl
1 Kevin Jacobs 201 R&D Java
2 Barbara Jones 224 IT Linux
2 Barbara Jones 224 IT Mac
3 Jake Rivera 201 R&D DB2
3 Jake Rivera 201 R&D Oracle
3 Jake Rivera 201 R&D Java
Employee (1NF)
7/29/2019 Database Design by Nunes
30/50
SAN DIEGO SUPERCOMPUTER CENTER
Second Normal Form (2NF)
Functional dependence - the property of one or more
attributes that uniquely determines the value of otherattributes.
Any non-dependent attributes are moved into a
smaller (subset) table.
Prevents update, insert, and delete anomalies.
7/29/2019 Database Design by Nunes
31/50
SAN DIEGO SUPERCOMPUTER CENTER
Functional Dependence
emp_no name dept_no dept_name skills
1 Kevin Jacobs 201 R&D C
1 Kevin Jacobs 201 R&D Perl
1 Kevin Jacobs 201 R&D Java
2 Barbara Jones 224 IT Linux
2 Barbara Jones 224 IT Mac
3 Jake Rivera 201 R&D DB23 Jake Rivera 201 R&D Oracle
3 Jake Rivera 201 R&D Java
Employee (1NF)
7/29/2019 Database Design by Nunes
32/50
SAN DIEGO SUPERCOMPUTER CENTER
2NF
emp_no name dept_no dept_name skills
1 Kevin Jacobs 201 R&D C
1 Kevin Jacobs 201 R&D Perl
1 Kevin Jacobs 201 R&D Java
2 Barbara Jones 224 IT Linux
2 Barbara Jones 224 IT Mac
3 Jake Rivera 201 R&D DB2
3 Jake Rivera 201 R&D Oracle
3 Jake Rivera 201 R&D Java
Employee (1NF)
Employee (2NF) Skills (2NF)
7/29/2019 Database Design by Nunes
33/50
SAN DIEGO SUPERCOMPUTER CENTER
Data Integrity
Insert Anomaly - adding null values. eg, inserting a new department does not
require the primary key of emp_no to be added.
Update Anomaly - multiple updates for a single name change, causesperformance degradation. eg, changing IT dept_name to IS
Delete Anomaly - deleting wanted information. eg, deleting the IT department
removes employee Barbara Jones from the database
emp_no name dept_no dept_name skills
1 Kevin Jacobs 201 R&D C
1 Kevin Jacobs 201 R&D Perl
1 Kevin Jacobs 201 R&D Java
2 Barbara Jones 224 IT Linux
2 Barbara Jones 224 IT Mac
3 Jake Rivera 201 R&D DB2
3 Jake Rivera 201 R&D Oracle
3 Jake Rivera 201 R&D Java
Employee (1NF)
7/29/2019 Database Design by Nunes
34/50
SAN DIEGO SUPERCOMPUTER CENTER
Third Normal Form (3NF)
Transitive dependence - two separate entities exist
within one table.
Any transitive dependencies are moved into a smaller
(subset) table.
Prevents update, insert, and delete anomalies.
7/29/2019 Database Design by Nunes
35/50
SAN DIEGO SUPERCOMPUTER CENTER
Transitive Dependence
Employee (2NF)
7/29/2019 Database Design by Nunes
36/50
SAN DIEGO SUPERCOMPUTER CENTER
3NF
Employee (2NF)
Employee (3NF) Department (3NF)
7/29/2019 Database Design by Nunes
37/50
SAN DIEGO SUPERCOMPUTER CENTER
Other Normal Forms
Strengthens 3NF by requiring the keys in the
functional dependencies to be superkeys (a column or
columns that uniquely identify a row)
Eliminate trivial multivalued dependencies.
Eliminate dependencies not determined by keys.
7/29/2019 Database Design by Nunes
38/50
SAN DIEGO SUPERCOMPUTER CENTER
Normalizing our team (1NF)
players
games sales
player_id game_id name start_date end_date aces blocks spikes digs
45 34 Mike Speedy 1/1/00 12 3 20 5
45 35 Mike Speedy 1/1/00 10 2 15 4
45 40 Mike Speedy 1/1/00 7 2 10 3
78 42 Frank Newmon 5/1/05
102 34 Joe Powers 1/1/02 7/1/05 8 6 18 10
102 35 Joe Powers 1/1/02 7/1/05 10 8 24 12
103 42 Tony Tough 1/1/05 15 10 20 14
7/29/2019 Database Design by Nunes
39/50
SAN DIEGO SUPERCOMPUTER CENTER
Normalizing our team (2NF & 3NF)
players
games sales
player_stats
7/29/2019 Database Design by Nunes
40/50
SAN DIEGO SUPERCOMPUTER CENTER
Revisit team ER diagram
games salesgenerates11
tickets merchandise
opponentdate result
player_stats tracked
Recorded
by
1
N
N
aces blocks digs
players
Start date End dateName
1
spikes
7/29/2019 Database Design by Nunes
41/50
SAN DIEGO SUPERCOMPUTER CENTER
Star Schemas
Best for use in decision support tasks such as Data
Warehouses and Data Marts.
Denormalized - allows for faster querying due to lessjoins.
Slow performance for insert, delete, and update
transactions.
Comprised of two types tables: facts and dimensions.
7/29/2019 Database Design by Nunes
42/50
SAN DIEGO SUPERCOMPUTER CENTER
Fact Table
Contains groupings of measures of an event to be
analyzed.
Measure - numeric data
Invoice Facts
units sold
unit amounttotal sale price
7/29/2019 Database Design by Nunes
43/50
SAN DIEGO SUPERCOMPUTER CENTER
Dimension Table
descriptor - non-numeric data
Customer Dimension
cust_dim_keynameaddressphone
Time Dimension
time_dim_keyinvoice datedue datedelivered date
Location Dimension
loc_dim_keystore numberstore addressstore phone
Product Dimension
prod_dim_keyproductpricecost
7/29/2019 Database Design by Nunes
44/50
SAN DIEGO SUPERCOMPUTER CENTER
Star Schema
Customer Dimension
cust_dim_keynameaddressphone
Time Dimension
time_dim_keyinvoice datedue datedelivered date
Location Dimension
loc_dim_keystore numberstore addressstore phone
Product Dimension
prod_dim_keyproductpricecost
Invoice Factscust_dim_keyloc_dim_keytime_dim_keyprod_dim_keyunits soldunit amounttotal sale price
1
1
1
1
N
NN
N
7/29/2019 Database Design by Nunes
45/50
SAN DIEGO SUPERCOMPUTER CENTER
Analyzing the team
Team Facts
datemerchandise
tickets
From this we will use the sales table to create our fact
table.
7/29/2019 Database Design by Nunes
46/50
SAN DIEGO SUPERCOMPUTER CENTER
Team Dimension
Player Dimension
player_dim_keynamestart_dateend_dateacesblocksspikesdigs
Game Dimension
game_dim_keyopponentresult
7/29/2019 Database Design by Nunes
47/50
SAN DIEGO SUPERCOMPUTER CENTER
Team Star Schema
Player Dimension
player_dim_key
namestart_dateend_dateacesblocksspikesdigs
Team Facts
player_dim_keygame_dim_keydatemerchandise
tickets
1
N
Game Dimensiongame_dim_keyopponentresult
1
N
7/29/2019 Database Design by Nunes
48/50
SAN DIEGO SUPERCOMPUTER CENTER
Books and Reference
7/29/2019 Database Design by Nunes
49/50
SAN DIEGO SUPERCOMPUTER CENTER
Continuing Education
UCSD Extension
Data Management Courses
DBA Certificate Program
Database Application Developer Certificate Program
http://www.extension.ucsd.edu/studyarea/index.cfm?vAction=saCourses&vStudyAreaID=15http://www.extension.ucsd.edu/Programs/index.cfm?vAction=certDetail&vCertificateID=130http://www.extension.ucsd.edu/Programs/index.cfm?vAction=certDetail&vCertificateID=129http://www.extension.ucsd.edu/Programs/index.cfm?vAction=certDetail&vCertificateID=129http://www.extension.ucsd.edu/Programs/index.cfm?vAction=certDetail&vCertificateID=130http://www.extension.ucsd.edu/studyarea/index.cfm?vAction=saCourses&vStudyAreaID=157/29/2019 Database Design by Nunes
50/50
Data Central
The Data Services Group provides Data Allocations for
the scientific community.
http://datacentral.sdsc.edu/
Tools and expertise for making data collectionsavailable to the broader scientific community.
Provide disk, tape, and database storage resources.
http://datacentral.sdsc.edu/http://datacentral.sdsc.edu/