Upload
calvin-johnson
View
218
Download
1
Embed Size (px)
Citation preview
The Relational Model
01/28/2014 – Material from Chapter 4 (Chap2 and Chap3 make an appearance)
Project Proposal - ???’s
Added some clarification.
What questions do you have about the project?
Data sets are available online at
psql –h db.cs.jmu.edu vlds – to play with the data
if already inside psql \c databaseName
Homework debrief
Scripts – If all is working correctly, you can run through psql.
psql –h hostname –f filename
-- comment to end of line (inline comment)
/* C-like comment, possibly multiple lines */ (block comment)
easier to build a text file and cut and paste commands in. Easier to change.
Can also use the –e option in psql to echo queries sent to server to standard output. > to pipe to a file.
Datatypes
http://www.postgresql.org/docs/9.3/static/datatype.html
Of note: varchar (n). Will save a maximum of n characters in the corresponding field.
If all of that space is not needed, then it will use less space.
money
date and time types
Review 2.2.1
acctNo type balance
12345 savings 12000.00
23456 checking 1000.00
34567 savings 25.25
firstName lastName idNo account
Robbie Banks 901-222 12345
Lena Hand 805-333 12345
Lena Hand 805-333 23456
a. attributesb. tuplesc. one tuple from each
relationd. relation schemae. database schemaf. a domain for each
attributeg. an equivalent relationh. a possible key to the
relation
ReviewExercise – 2.2.1
Accounts(balance : float, acctNo : integer, type : string)
firstName lastName idNo account
Robbie Banks 901-222 12345
Lena Hand 805-333 12345
Lena Hand 805-333 23456
Customer(firstName : string, lastName : string,idNo : string, account : integer)
a. attributesb. tuplesc. one tuple from each
relationd. relation schemae. database schemaf. a domain for each
attributeg. an equivalent relationh. a possible key to the
relation
acctNo type balance
12345 savings 12000.00
23456 checking 1000.00
34567 savings 25.25
Model (freeonlinedictionary.com)
A schematic description of a system, theory, or phenomenon that accounts for its known or inferred properties and may be used for further study of its characteristics: a model of generative grammar; a model of an atom; an economic model; a database model.
Gives us a way of understanding the database and the interaction among its objects.
Design vs Implementation
The Relational Model itself
The database is formed from a series of interconnected tables.
Each table has one or more attributes (fields).
Each table has zero or more rows (tuples).
Tables can be connected by keys.
How do we express that model in design?
Various design tools
ER Diagramming (Peter Chen – 1976) (Remember Codd’s paper in 1970)
Databases can be expressed as “entities” who are connected by “relationships”.
From Chen’s paper
In the ER-Diagram Chen Style
Entity
Relation-ship
Attribute
Teacher
Teaches
Classes
Number
Name
Id
Class
Multiplicity
Each connection can have a number associated with it.
1:1 – one to one - each entity in one relation has at most one entity in the other and vice versa. ex: a faculty member and their office.
1:M – one to many – each entity in one relation has many related entities in another. ex: one faculty member teaches many courses but each
course is taught by only one faculty member.
M:M (often shown as M:N) many to many - each entity in one relation may have many related entities in another and vice versa. ex: a supplier supplies many products and each product
is supplied by many suppliers.
In the ER-Diagram Chen Style
Entity
Relation-ship
Attribute
Teacher
Teaches
Classes
Number
Name
Id
Class
1
M
Instances
Each ERD describes a database schema.
If that schema is realized as a database, an instance of the database is a snapshot of the database at a particular period of time.
A database instance
Accounts(balance : float, acctNo : integer, type : string)
firstName lastName idNo account
Robbie Banks 901-222 12345
Lena Hand 805-333 12345
Lena Hand 805-333 23456
Customer(firstName : string, lastName : string,idNo : string, account : integer)
acctNo type balance
12345 savings 12000.00
23456 checking 1000.00
34567 savings 25.25
Keys (Chapter 3.1.2)
“A key for an entity set E is a set K of one or more attributes such that, given any two distinct entities e1 and e2 in E, e1 and e2 cannot have identical values for each of the attributes in the key.” Ullman
A key is a set of attributes that uniquely define a row of a table.
There may be more than one key. One is selected as the primary key.
One additional requirement – No proper subset of the key attributes can themselve be a key.
Keys
Person(id, email, last, first, dob, SSN)
id, email, last-first-dob, SSN may all be considered keys.
id-email, last-first-dob-SSN would not be keys. In each, there is a proper subset of elements that itself is a key.
We would call id-email and last-first-dob-SSN superkeys…a set of attributes that contain a key. Every key is a superkey but not every superkey is a key.
We sometimes use the term candidate keys for the possible keys to the relation. From the candidates, a single key is chosen.
Keys in ERD
Movies Own Studios Runs Presidents
title year
genrelength
Id
name
composite key
Movies Own Studios Runs Presidents
So what does this ERD tell us?
Constraints (Referential Integrity)
The curve arrow is a 1 relationship. The filled arrow is a 0..1 relationship. (I may have a studio that does not currently have a president, but if I have a president, she must run a studio.)
A studio may own 0..* Movies, but a Movie must be owned by a single Studio.
Many-one relationship->optional tupleReferential integrityrequired tuple
Movies Own Studios Runs Presidents
Alternate expression of the same idea
Constraints (Referential Integrity)
The curve arrow is a 1 relationship. The filled arrow is a 0..1 relationship. (I may have a studio that does not currently have a president, but if I have a president, she must run a studio.)
A studio may own 0..* Movies, but a Movie must be owned by a single Studio.
0..1 110..m
Other kinds of relationships - self
Movies sequel of
original
A movie may have many sequels, but a sequel can have only one original.
sequel
Other kinds of relationships - multiway
Movies
Contracts
Studios
Stars
A contract is between a star, the movie that star is in, the producing studio and the star’s contracting studio.
Producing studio, the arrow implies that there is at most one producing studio
Stars’ studios, the arrow impliesa star can have at most one studio
pay
Other diagramming tools
Crow’s Foot
UML – See Chapter 4.7
MoviesPK titlePK yearlength genre
StudiosPresidentsPK Idname
own
minimum cardinality
maximum cardinality
Means, a movie must be owned by one and only one studio. A studio may own no or many movies.
Means a studio may have one or no presidents, but if there is a president then she must preside over exactly one studio.
Some other terms
Weak entity (represented by double lines) An entity that depends upon the existence of
another. Usually it will have as one of its key components the key from another table.
Ex. A student-class table is a weak entity of student and class as it depends on both.
We will see that relationships are often implemented as tables within a db. They become weak entities.
Design considerations
While we will be looking more at design through the semester, a few things of note:
Faithfulness (or fitfulness) – The data should “faithfully” represent the underlying real world model and be “fit” for the application. Ex: Stars and Movies – Implies a many to many
relationship since movies have multiple stars and stars appear in multiple movies.
It also depends on the enterprise in which the data will be used. A school that has no “team” teaching could say one teacher
teaches many courses but a class can only be taught by one teacher vs
JMU which has team teaching so we have a many to many relationship between teachers and courses.
Design
Avoid redundancy – Redundant data is often inconsistent data.
Simplicity – Avoid extra elements or elements that do not support the enterprise.
Choosing the right relationships – Look at the relationships among the entities and determine the best way to represent them. (See diagram 143)
What is right for the type of element? Is it an attribute of an entity or a standalone entity in its own right? Is it an entity or simply a relationship between entities.
Design
Choose the right data elements (Atomicity) Do we make name an attribute or do we break it up
into first middle last. name can be derived from the individual data.
Avoid derived data (Staleness) Data that can be derived from other data perhaps
does not belong in the db. If I have date of birth I can derive age at any time that
I need it. If I have vendor transactions, I can derive the total
amount purchased.
Design – Referential Integrity
Where do we want to enforce the notion of required tuples?
When must we have a corresponding row?
Sometimes we build tables to help us maintain data integrity.