Upload
raymond-oneal
View
219
Download
0
Tags:
Embed Size (px)
Citation preview
Announcements Project-Phase 1 is already out on Oct. 28 and due Nov. 4
Homework 1 is out today Nov. 1 and due Nov. 8
TAs Assigned to Project Teams All phases of a given team will be graded by the same TA The list is on the course website under “Homeworks & Projects” and under blackboard
system under “Project Phases” as well
Homeworks & Project Due Date Reducing the number of HWs and project phases by one so the last due data is Dec.
6
cs3431 2
One Last Example of ERD !!!
cs3431 3
What is Bad???
4
carcar
ColorNameColorName
colorIDcolorID
VINVINMakeMake ModelModel
CustomerCustomer
IDIDDoBDoB NameName
Car-featureCar-feature
FeatureNameFeatureName
containscontains
buysbuys
DateDate
CarMilesCarMiles
AgeAge
PricePrice
LoanLoan
amountamount
numbernumber
BankBank
takestakes
DateDate
= A customer can buy many cars= A customer may take a loan to buy a specific car
From the Previous Example ColorId & ColorName (cause redundancy & inconsistency)
Car can have one feature (wrong cardinality)
Car-feature has one attribute (should not be an entity)
CarMiles should be attached to the car (not to the relationship)
Age should be a derived attribute
A car can be bought by one (or zero) customers (wrong arrow head)
Loan and Car are not linked together (buys should be 3-way) Or create a new entity set “Contract” and link it to the three entity sets
5
6
Relational Model
7
Relational Model Another model for describing your data and application
requirements
Currently the most widely used model Vendors: Oracle, Microsoft, IBM
Recent competitions Object Oriented Model: ObjectStore, Oracle XML Databases : native and extensions
What about ER model ER model is the first step in the design ER model is then translated (mapped) to relational model
8
Relational Model Structure:
Relations (also called Tables) Attributes (also called Columns or Fields)
9
Attributes in Relational Model Attribute Name
Each attribute of a relation has a unique name within the relation
Attribute Domain The set of allowed values for each attribute is called the domain of the attribute E.g., integer, real, date, string {“Male”, “Female”}, etc.
Atomic Attributes Attribute values are required to be atomic
(primitive or derived) Note: multivalued attributes are not atomic Note: composite attributes are not atomic We will see how to handle non-atomic attributes
Special Null Value The special value null is a member of every domain The null value means the values does not exist (unknown)
Relational Schema Schema is the structure of the relation Attributes A1, A2, ..., An of relation R
R=(A1,A2,...,An ) is the relations schema of R
Relational Schema of “customer” is customer = (customer_name, customer_street, customer_city) customer (customer_name, customer_street, customer_city)
10
Relational Instance Instance is the current schema plus the current tuples in the relation
The current instance of “customer” has four tuples (rows or records)
Instances are frequently changing E.g., inserting new tuples, deleting or updating existing tuples
Schemas may also change but not frequently
11
Arity & Cardinality of Relations
Arity (Degree) of a relation Number of attributes of that relation
Cardinality of a relation Number of tuples in that relation
12
Arity = 3Cardinality = 4
13
Relations are Unordered
Order of tuples is not important
Order of columns is not important
Think of a relation as a set (or bag) of tuples not as a list of tuples Most current DBMSs allow bags
sNumber sName
1 Dave
2 Greg
Student
sNumber sName
2 Greg
1 Dave
Student
These two relations are equivalent
Keys of Relations Super Keys
Subset of attributes that uniquely identify each tuple
Candidate Keys Minimal super key
Primary Keys Choose any of the candidate keys Represented by “underline” under the selected attributes
Set vs. Bag If a relation has a primary key, then it is a set not a bag
14
customer (customer_name, customer_street, customer_city)
Assuming the customer name is unique
How do we create these relations?
SQL Language
How do we create these relations?
SQL Language
Relational Query Languages (SQL Standard) Developed by IBM (system R) in 1970s Standard as used by many vendors (portable)
SQL (Structured Query Language)
Standards: SQL-86 SQL-89 (minor revision) SQL-92 (major revision) SQL-99 (major extensions in OO, current standard)
SQL Language
Data Definition Language (DDL) Create tables, specifying the columns and their types of columns,
the primary keys, etc. Drop tables, add/drop columns, add/drop constraints – primary
key, unique, etc.
Data Manipulation Language (DML) Update, Insert, Delete tuples Query the data
16
17
SQL DDL
DDL allows the specification of:
The schema for each relation
The type and domain of values associated with each attribute
Integrity constraints Will see many types of these constraints
The set of indices to be maintained for each relation
Security and authorization information for each relation
The physical storage structure of each relation on disk
18
DDL: Creating TablesCREATE TABLE <tableName> (
<col> <type>,<col> <type>,…<col> <type>,[CONSTRAINT <cName>] PRIMARY KEY (…),[CONSTRAINT <cName>] UNIQUE (…),[CONSTRAINT <cName>] FOREIGN KEY (…)
REFERENCES <tableName> (…));
Domain Types in SQL String:
string: variable size string with no maximum (theoretically ) char(n): Fixed length character string, with user-specified length n varchar(n): Variable length character strings, with user-specified maximum length n
Numbers: Int: Integer Smallint (short): Small integer float(n): Floating point number, with user-specified precision of at least n digits. Decimal(n,d): Double-precision of n decimal digits with decimal point at position d
Bits: Bit(n): String of bits of fixed size n (user-defined) Bit Varying(n): String of variable-size bits of maximum n (user-defined)
Boolean: Boolean type, True or False
Date & Time: Date: Date field with certain format Time: Date plus time with a certain format
19
customer (name: string, DoB: Date, Phone: Char(10), …)customer (name: string, DoB: Date, Phone: Char(10), …)
DDL ---- Creating Relations
Create “Students” relation
CREATE TABLE Students(sid: CHAR(20), name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL);
CREATE TABLE Enrolled(sid: CHAR(20), cid: Varchar(20),
enrollDate: date, grade: CHAR(2))
Observe that type (domain) of each field is specified
Types are enforced by DBMS whenever tuples are added or modified
Create “Enrolled” relation
CREATE TABLE Courses(cid: Varchar(20), name: string, maxCredits : integer,
graduateFlag: boolean);
Create “Courses” relation
21
Dropping & Altering Tables Dropping (Deleting) a table
DROP TABLE <tableName>;
Altering a table to add a column ALTER TABLE <tableName> ADD <col> <type>; The initial value of the column is Null
Altering a table to drop a column ALTER TABLE <tableName> DROP COLUMN <col>; Some DBMSs do not support dropping columns
Integrity Constraints (ICs)
IC: condition that must be True for any instance of the database
ICs are specified when schema is defined ICs are checked when relations are modified
A legal instance of a relation is one that satisfies all specified ICs.
DBMS should not allow illegal instances.
22
23
IC: Key Constraints Two types of key constraints can be defined
Primary Key Unique
The other candidate keys
Each relation may have only one primary key but, possibly many unique keys
Primary key attributes do not accept Null
Create “Students” relation
CREATE TABLE Students(sid: CHAR(20) Primary Key, name: CHAR(20), login: CHAR(10) Unique, age: INTEGER, gpa: REAL);
CREATE TABLE Courses (cid: Varchar(20) Primary Key, name: string, maxCredits : integer, graduateFlag: boolean);
Create “Courses” relation
IC: Key Constraints (Cont’d)
Another way to define Keys
24
Create “Students” relation
CREATE TABLE Students(sid: CHAR(20), name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL,
Primary Key (sid), Unique (login));
CREATE TABLE Courses (cid: Varchar(20), name: string, maxCredits : integer, graduateFlag: boolean, Primary Key (cid));
Create “Courses” relation
CREATE TABLE Enrolled(sid: CHAR(20), cid: Varchar(20), enrollDate: date, grade: CHAR(2),
Primary Key (sid, cid, enrollDate));
Create “Enrolled” relation
Violations of Key Constraints
DBMS prevents this from happening
Example customer (name: string, DoB: Date, Phone: Char(10), address: string);
(“Mike”, “10/25/1988”, “1765432123”, “NY”); Inserted
(“Mike”, “5/10/1990”, “1999887665”, “CA”); Raise an error
(null , “5/10/1990”, “1999887665”, “CA”); Raise an error
25
26
Violation of key constraints
Primary key violation if: There is a row with null values for any attribute of primary key, OR Two rows with same values for all attributes of primary key
Unique key violation if: Two rows have the same non-null values in the unique attributes Null values are not equal to each other
DBMS will prevent this from happening
27
Adding/Dropping Constraints
Adding a constraint (after creating the table) ALTER TABLE <tableName> ADD CONSTRAINT
<cName> <constraint type> …
Dropping a constraint ALTER TABLE <tableName> DROP CONSTRAINT
<cName>
Constraints Limits the Data
28
CREATE TABLE Enrolled(sid: CHAR(20), cid: Varchar(20), enrollDate: date, grade: CHAR(2),
Primary Key (sid, cid));
Create “Enrolled” relation
CREATE TABLE Enrolled(sid: CHAR(20), cid: Varchar(20), enrollDate: date, grade: CHAR(2),
Primary Key (sid, cid, enrollDate));
Create “Enrolled” relation
A student can take a course only once.
A student can take a course many times on different dates.
Other Constraint Types
29
Create “Students” relation
CREATE TABLE Students(sid: CHAR(20), name: CHAR(20) NOT NULL, login: CHAR(10), age: INTEGER, gpa: REAL Default 0.0,
Primary Key (sid), Unique (login));
CREATE TABLE Courses (cid: Varchar(20), name: string, maxCredits : integer, graduateFlag: boolean Default False, Primary Key (cid));
Create “Courses” relation
NOT NULL & DEFAULT constraints
Other Constraint Types (Cont’d)
30
CREATE TABLE Enrolled(sid: CHAR(20), cid: Varchar(20), enrollDate: date, grade: CHAR(2),
Primary Key (sid, cid, enrollDate),Constraint gradeVal check grade in (‘A+’, ‘A-’, ‘B+’, ‘B-’));
Create “Enrolled” relation
Domain constraint
IC: Foreign Key Constraints(Referential Integrity )
31
Create “Students” relation
CREATE TABLE Students(sid: CHAR(20), name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL);
CREATE TABLE Courses (cid: Varchar(20), name: string, maxCredits : integer, graduateFlag: boolean);
Create “Courses” relation
CREATE TABLE Enrolled(sid: CHAR(20), cid: Varchar(20), enrollDate: date, grade: CHAR(2));
Create “Enrolled” relationForeign key
Foreign key
Foreign Keys, Referential Integrity
Foreign key : Set of fields in one relation that "refer" to a tuple in another relation (like a pointer)
Foreign key : FK in referencing relation must match PK of referenced relation. Match = same number of columns, compatible data types (column names
can be different) The relationship is many-to-one from the “referencing” to the “referenced”
sid name login age gpa
53666 Jones jones@cs 18 3.453688 Smith smith@eecs 18 3.253650 Smith smith@math 19 3.8
sid cid grade53666 Carnatic101 C53666 Reggae203 B53650 Topology112 A53666 History105 B
Enrolled (referencing relation)Students (referenced relation)
Foreign Key Primary Key
Foreign Keys in SQL
33
Create “Students” relation
CREATE TABLE Students(sid: CHAR(20) Primary Key, name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL);
CREATE TABLE Courses (cid: Varchar(20) Primary Key, name: string, maxCredits : integer, graduateFlag: boolean);
Create “Courses” relation
CREATE TABLE Enrolled(sid: CHAR(20) Foreign Key References
(Students.sid), cid: Varchar(20), enrollDate: date, grade: CHAR(2),
Constraints fk_cid Foreign Key cid References (Courses.cid));
Create “Enrolled” relation
Alter Table Enrolled Add Constraints fk_cid Foreign Key cid References Courses(cid));
sid & cid have to be already defined as
primary keys
sid & cid have to be already defined as
primary keys
34
Violation of Foreign Key Constraints
Suppose we have: FOREIGN KEY R1 (S1) REFERENCES R2 (S2)
This constraint is violated if: If there is a value in R1.S1 which does not exist in R2.S2
The DBMS prevents this from happening
Enforcing Referential Integrity
Consider Students and Enrolled; sid in Enrolled is a foreign key that references Students.
What if inserting in “Enrolled” values (“4545”, “History105”, “A”) DBMS will reject it because no student with sid “4545”
Enforcing Referential Integrity (Cont’d)
Deletion: What if an Enrolled tuple is deleted? No problem. It is allowed
Enforcing Referential Integrity (Cont’d)
Deletion: What if a Student tuple is deleted? Cascading -- Also delete all Enrolled tuples that refer to it
No Action -- Disallow deletion of a Students tuple since there are dependent tuples
Set Default -- Set sid in Enrolled tuples that refer to it to a default sid
Set Null -- Set sid in Enrolled tuples that refer to it to a special value null, denoting `unknown’ (Not always applicable)
Same rules apply when updating Students.sid
The two most common actions are: Cascading or No Action
Referential Integrity in SQL
SQL/99 supports all 4 options on deletes & updates:
CREATE TABLE Enrolled(sid: CHAR(20) Foreign Key References
(Students.sid), cid: Varchar(20), enrollDate: date, grade: CHAR(2),
Constraints fk_cid Foreign Key cid References (Courses.cid) ON DELETE CASCADE ON UPDATE NO ACTION);
Cyclic Dependencies
What if cyclic dependencies between two tables exists? Table R has a foreign key to Table S (Say R.a references S.b) Table S has a foreign key to Table R (Say S.c references R.d)
At insertion time: When insert into R will be rejected because S.b does not yet exits When insert into S will be rejected because R.d does not yet exits
Solution (Many) – The easiest is: Disable the constraint on one table, say S Insert the tuple into S, then insert into R Enable the constraint on S again
Alter Table S Disable Constraint <name>;
Alter Table S Enable Constraint <name>;
Where do ICs Come From? ICs are based upon semantics of real-world enterprise being
described in database relations.
We can check a database instance to see if an IC is violated ? In typical cases, DBMS should make sure all constraints are enforced
Given an instance, can we tell what are the ICs? No, NEVER ! An IC is a statement about all possible instances! They need to be checked from the schema definition not from a
given instance
41
Relational Model: Summary
Structure Relations (Tables) Attributes (Columns, Fields)
Constraints + Constraint Enforcement Domain, Default, and Not Null Constraint Key Constraint
Primary key, candidate key (unique) Foreign Key Constraint
Inserting, Updating, Deleting Data This is performed using Data Manipulation
Language of SQL (DML)
Insertion Insert into Students values (“1111”, …);
Deletion Delete from Students; Delete from Students Where sid = “1111”;
Update Update Students Set GPA = GPA + 0.4; Update Students Set GPA = GPA + 0.4 Where sid = “1111”;
42