Database Design by Nunes

Embed Size (px)

Citation preview

  • 7/29/2019 Database Design by Nunes

    1/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Introduction toDatabase Design

  • 7/29/2019 Database Design by Nunes

    2/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Database Design Agenda

  • 7/29/2019 Database Design by Nunes

    3/50

    SAN DIEGO SUPERCOMPUTER CENTER

    General Design Considerations

  • 7/29/2019 Database Design by Nunes

    4/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Users

  • 7/29/2019 Database Design by Nunes

    5/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Legacy Systems/Data

  • 7/29/2019 Database Design by Nunes

    6/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Application Requirements

  • 7/29/2019 Database Design by Nunes

    7/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Entity - Relationship Model

  • 7/29/2019 Database Design by Nunes

    8/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Entities

    Employee Department

  • 7/29/2019 Database Design by Nunes

    9/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Attributes

    Name SSN

    Employee Department

    Name Budget

  • 7/29/2019 Database Design by Nunes

    10/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Relationships

    Name SSN

    Employee Department

    Name Budget

    works in

    Start date

  • 7/29/2019 Database Design by Nunes

    11/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Relationship Connectivity

    Name SSN

    Employee Department

    Name Budget

    work 1N

    Start date

  • 7/29/2019 Database Design by Nunes

    12/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Connectivity

    Department Managerhas 11

    Department Projecthas N1

    Employee Projectworks on NM

    one-to-one

    one-to-many

    many-to-many

  • 7/29/2019 Database Design by Nunes

    13/50

    SAN DIEGO SUPERCOMPUTER CENTER

    ER example

  • 7/29/2019 Database Design by Nunes

    14/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Team Entities & Attributes

    Players Sales

    Start date End date

    StatisticsName

    tickets merchandise

    Games

    opponentdate result

  • 7/29/2019 Database Design by Nunes

    15/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Team Relationships

    PlayersGamesN1

    play

  • 7/29/2019 Database Design by Nunes

    16/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Team Relationships

    Games Salesgenerates

    11

  • 7/29/2019 Database Design by Nunes

    17/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Team ER Diagram

    Players

    Games

    Sales

    play generates

    N 1

    1

    1

    Start date End date Statistics

    Name

    tickets merchandise

    opponentdate result

  • 7/29/2019 Database Design by Nunes

    18/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Logical Design to Physical Design

  • 7/29/2019 Database Design by Nunes

    19/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Entity tables

    Name SSN

    Employeecreate table employee

    (emp_no number,name varchar2(256),ssn number,primary key

    (emp_no));

  • 7/29/2019 Database Design by Nunes

    20/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Foreign Keys

    create table employee(emp_no number,

    dept_no number,name varchar2(256),ssn number,primary key (emp_no),foreign key (dept_no) references department);

    Employee

    Department

    has

    1

    N

    create table department(dept_no number,name varchar2(50),primary key (dept_no));

  • 7/29/2019 Database Design by Nunes

    21/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Foreign Key

    Department

    Employee

    Accounting has 1 employee:Brian Burnett

    Human Resources has 2 employees:Nora EdwardsBen Smith

    IT has 3 employees:Ajay PatelJohn OLearyJulia Lenin

  • 7/29/2019 Database Design by Nunes

    22/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Many-to-Many tables

    create table proj_has_emp(proj_no number,emp_no number,start_date date,primary key (proj_no, emp_no),foreign key (proj_no) references project

    foreign key (emp_no) references employee);

    Employee

    Project

    has

    N

    M

    Start date

  • 7/29/2019 Database Design by Nunes

    23/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Many-to-Many tables

    Project

    Employee

    proj_has_emp

    Employee Audit has 1 employee:Brian Burnett

    Budget has 2 employees:

    Julia LeninNora Edwards

    Intranet has 3 employees:Julia LeninJohn OLearyAjay Patel

  • 7/29/2019 Database Design by Nunes

    24/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Tutorial

  • 7/29/2019 Database Design by Nunes

    25/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Tutorial

  • 7/29/2019 Database Design by Nunes

    26/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Tutorial

  • 7/29/2019 Database Design by Nunes

    27/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Normalization

  • 7/29/2019 Database Design by Nunes

    28/50

    SAN DIEGO SUPERCOMPUTER CENTER

    First Normal Form (1NF)

    No repeating columns within a row.

    No multi-valued columns.

    Queries become easier.

  • 7/29/2019 Database Design by Nunes

    29/50

    SAN DIEGO SUPERCOMPUTER CENTER

    1NF

    Employee (unnormalized)

    emp_no name dept_no dept_name skills

    1 Kevin Jacobs 201 R&D C

    1 Kevin Jacobs 201 R&D Perl

    1 Kevin Jacobs 201 R&D Java

    2 Barbara Jones 224 IT Linux

    2 Barbara Jones 224 IT Mac

    3 Jake Rivera 201 R&D DB2

    3 Jake Rivera 201 R&D Oracle

    3 Jake Rivera 201 R&D Java

    Employee (1NF)

  • 7/29/2019 Database Design by Nunes

    30/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Second Normal Form (2NF)

    Functional dependence - the property of one or more

    attributes that uniquely determines the value of otherattributes.

    Any non-dependent attributes are moved into a

    smaller (subset) table.

    Prevents update, insert, and delete anomalies.

  • 7/29/2019 Database Design by Nunes

    31/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Functional Dependence

    emp_no name dept_no dept_name skills

    1 Kevin Jacobs 201 R&D C

    1 Kevin Jacobs 201 R&D Perl

    1 Kevin Jacobs 201 R&D Java

    2 Barbara Jones 224 IT Linux

    2 Barbara Jones 224 IT Mac

    3 Jake Rivera 201 R&D DB23 Jake Rivera 201 R&D Oracle

    3 Jake Rivera 201 R&D Java

    Employee (1NF)

  • 7/29/2019 Database Design by Nunes

    32/50

    SAN DIEGO SUPERCOMPUTER CENTER

    2NF

    emp_no name dept_no dept_name skills

    1 Kevin Jacobs 201 R&D C

    1 Kevin Jacobs 201 R&D Perl

    1 Kevin Jacobs 201 R&D Java

    2 Barbara Jones 224 IT Linux

    2 Barbara Jones 224 IT Mac

    3 Jake Rivera 201 R&D DB2

    3 Jake Rivera 201 R&D Oracle

    3 Jake Rivera 201 R&D Java

    Employee (1NF)

    Employee (2NF) Skills (2NF)

  • 7/29/2019 Database Design by Nunes

    33/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Data Integrity

    Insert Anomaly - adding null values. eg, inserting a new department does not

    require the primary key of emp_no to be added.

    Update Anomaly - multiple updates for a single name change, causesperformance degradation. eg, changing IT dept_name to IS

    Delete Anomaly - deleting wanted information. eg, deleting the IT department

    removes employee Barbara Jones from the database

    emp_no name dept_no dept_name skills

    1 Kevin Jacobs 201 R&D C

    1 Kevin Jacobs 201 R&D Perl

    1 Kevin Jacobs 201 R&D Java

    2 Barbara Jones 224 IT Linux

    2 Barbara Jones 224 IT Mac

    3 Jake Rivera 201 R&D DB2

    3 Jake Rivera 201 R&D Oracle

    3 Jake Rivera 201 R&D Java

    Employee (1NF)

  • 7/29/2019 Database Design by Nunes

    34/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Third Normal Form (3NF)

    Transitive dependence - two separate entities exist

    within one table.

    Any transitive dependencies are moved into a smaller

    (subset) table.

    Prevents update, insert, and delete anomalies.

  • 7/29/2019 Database Design by Nunes

    35/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Transitive Dependence

    Employee (2NF)

  • 7/29/2019 Database Design by Nunes

    36/50

    SAN DIEGO SUPERCOMPUTER CENTER

    3NF

    Employee (2NF)

    Employee (3NF) Department (3NF)

  • 7/29/2019 Database Design by Nunes

    37/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Other Normal Forms

    Strengthens 3NF by requiring the keys in the

    functional dependencies to be superkeys (a column or

    columns that uniquely identify a row)

    Eliminate trivial multivalued dependencies.

    Eliminate dependencies not determined by keys.

  • 7/29/2019 Database Design by Nunes

    38/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Normalizing our team (1NF)

    players

    games sales

    player_id game_id name start_date end_date aces blocks spikes digs

    45 34 Mike Speedy 1/1/00 12 3 20 5

    45 35 Mike Speedy 1/1/00 10 2 15 4

    45 40 Mike Speedy 1/1/00 7 2 10 3

    78 42 Frank Newmon 5/1/05

    102 34 Joe Powers 1/1/02 7/1/05 8 6 18 10

    102 35 Joe Powers 1/1/02 7/1/05 10 8 24 12

    103 42 Tony Tough 1/1/05 15 10 20 14

  • 7/29/2019 Database Design by Nunes

    39/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Normalizing our team (2NF & 3NF)

    players

    games sales

    player_stats

  • 7/29/2019 Database Design by Nunes

    40/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Revisit team ER diagram

    games salesgenerates11

    tickets merchandise

    opponentdate result

    player_stats tracked

    Recorded

    by

    1

    N

    N

    aces blocks digs

    players

    Start date End dateName

    1

    spikes

  • 7/29/2019 Database Design by Nunes

    41/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Star Schemas

    Best for use in decision support tasks such as Data

    Warehouses and Data Marts.

    Denormalized - allows for faster querying due to lessjoins.

    Slow performance for insert, delete, and update

    transactions.

    Comprised of two types tables: facts and dimensions.

  • 7/29/2019 Database Design by Nunes

    42/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Fact Table

    Contains groupings of measures of an event to be

    analyzed.

    Measure - numeric data

    Invoice Facts

    units sold

    unit amounttotal sale price

  • 7/29/2019 Database Design by Nunes

    43/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Dimension Table

    descriptor - non-numeric data

    Customer Dimension

    cust_dim_keynameaddressphone

    Time Dimension

    time_dim_keyinvoice datedue datedelivered date

    Location Dimension

    loc_dim_keystore numberstore addressstore phone

    Product Dimension

    prod_dim_keyproductpricecost

  • 7/29/2019 Database Design by Nunes

    44/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Star Schema

    Customer Dimension

    cust_dim_keynameaddressphone

    Time Dimension

    time_dim_keyinvoice datedue datedelivered date

    Location Dimension

    loc_dim_keystore numberstore addressstore phone

    Product Dimension

    prod_dim_keyproductpricecost

    Invoice Factscust_dim_keyloc_dim_keytime_dim_keyprod_dim_keyunits soldunit amounttotal sale price

    1

    1

    1

    1

    N

    NN

    N

  • 7/29/2019 Database Design by Nunes

    45/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Analyzing the team

    Team Facts

    datemerchandise

    tickets

    From this we will use the sales table to create our fact

    table.

  • 7/29/2019 Database Design by Nunes

    46/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Team Dimension

    Player Dimension

    player_dim_keynamestart_dateend_dateacesblocksspikesdigs

    Game Dimension

    game_dim_keyopponentresult

  • 7/29/2019 Database Design by Nunes

    47/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Team Star Schema

    Player Dimension

    player_dim_key

    namestart_dateend_dateacesblocksspikesdigs

    Team Facts

    player_dim_keygame_dim_keydatemerchandise

    tickets

    1

    N

    Game Dimensiongame_dim_keyopponentresult

    1

    N

  • 7/29/2019 Database Design by Nunes

    48/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Books and Reference

  • 7/29/2019 Database Design by Nunes

    49/50

    SAN DIEGO SUPERCOMPUTER CENTER

    Continuing Education

    UCSD Extension

    Data Management Courses

    DBA Certificate Program

    Database Application Developer Certificate Program

    http://www.extension.ucsd.edu/studyarea/index.cfm?vAction=saCourses&vStudyAreaID=15http://www.extension.ucsd.edu/Programs/index.cfm?vAction=certDetail&vCertificateID=130http://www.extension.ucsd.edu/Programs/index.cfm?vAction=certDetail&vCertificateID=129http://www.extension.ucsd.edu/Programs/index.cfm?vAction=certDetail&vCertificateID=129http://www.extension.ucsd.edu/Programs/index.cfm?vAction=certDetail&vCertificateID=130http://www.extension.ucsd.edu/studyarea/index.cfm?vAction=saCourses&vStudyAreaID=15
  • 7/29/2019 Database Design by Nunes

    50/50

    Data Central

    The Data Services Group provides Data Allocations for

    the scientific community.

    http://datacentral.sdsc.edu/

    Tools and expertise for making data collectionsavailable to the broader scientific community.

    Provide disk, tape, and database storage resources.

    http://datacentral.sdsc.edu/http://datacentral.sdsc.edu/