LeongHW, SOC, NUS (UIT2201: 4 Database) Page 1
Copyright © 2007-9 by Leong Hon Wai
Database – Info Storage and Retrieval
• Aim: Understand the basics of
  ◦ Info storage and Retrieval
  ◦ Database Organization
  ◦ DBMS, Query and Query Processing
  ◦ Work some simple exercises
  ◦ Advanced Applications of DB Technology
  ◦ Concurrency Issues (in Database)
• Readings:
  ◦ [SG], Ch 13.3
Last Revised: 28 September 2016.
Outline
• What is a Database and Evolution…
• Organization of Databases
• Foundations of Relational Database
• DBMS and Query Processing
• Advanced Applications of DB Technology
• Concurrency Issue in Database
What is a Database? (1)
• First attempt…
  ◦ A collection of data
• Examples:
  ◦ Course database
  ◦ Employee database
What is a Database? (2)
• First attempt…
  ◦ A collection of data…
• Examples:
  ◦ Course database
  ◦ Employee database
  ◦ Jobs Database
  ◦ LINC Database
  ◦ Inventory Database
  ◦ Recipe Database
  ◦ Hotel Database
  ◦ CIA/FBI Database
What is a Database? (3)
What if we combine some tables?
Employee-Database + CIA/FBI Database
What can we find out? May possibly find “criminals” hidden in the organization?
What is a Database? (4)
What if we combine some tables?
Fridge-Inventory-Database + Recipe Database
What’s for late-night supper? (eg: after the family arrives home at 2am from the airport, all dead tired?)
What is a Database? (5)
A database system is an integrated collection of database tables.
In UIT2201, we only consider Relational Databases (by E.F. Codd, [CACM, 1970]).
Database (with 3 Tables)
STUDENT-INFO (30,000)
Student-ID Name NRIC-ID Address Tel-No Faculty Major
U0601001S Betty Yeo S9998888Q 23A Cinn 11112222 FoS Math
U0902002R Tan Lee Lee S8888777A 57 PGP 22224444 FASS Phil
U1909999P Albert Neo S5556666C 707 KE7 99995555 SOC CS
U2908888P Fish Leong S7778888F 808 CAPT 33332222 Biz Mktg
COURSE-INFO (1,000)
Course-ID Name Day Hour Venue Instructor
CS3230 Algorithms Wed 1400 LT15 S-Halim
MA1231 Disc. Math Tue 1400 LT28 CF-Gauss
UIT2201 CSITR Wed 1000 USP-SR1 LeongHW
VE1001 Vocal-Ex Fri 1700 RWS TanEC
ENROL (100,000)
Student-ID Course-ID
U0601001S MA1231
U0601001S UIT2201
U1909999P CS3230
U2908888P WT1001
Evolution of Databases (1)
From Separate, Independent DB (eg: 1 DB per dept/faculty)
  Bad: incompatibility, inconvenience, error-prone
To Integrated DB System (eg: serve all depts/faculties)
  Compatible, fast
Examples:
  NUS CORS Online Reg: Course DB System
  SG IRAS e-filing: Tax Filing DB System
Evolution of Databases (2)
An Integrated DB System creates new problems, new job roles, different services, etc.
New Problems: ensuring consistency, differentiated services, different views of data, …
New Job Roles: Database Administrator, DB programmer, …
Database (with 3 Tables)
STUDENT-INFO (30,000)
Student-ID Name NRIC-ID Address Tel-No Faculty Major
U0601001S Betty Yeo S9998888Q 23A Cinn 11112222 FoS Math
U0902002R Tan Lee Lee S8888777A 57 PGP 22224444 FASS Phil
U1909999P Albert Neo S5556666C 707 KE7 99995555 SOC CS
U2908888P Fish Leong S7778888F 808 CAPT 33332222 Biz Mktg
COURSE-INFO (1,000)
Course-ID Name Day Hour Venue Instructor
CS3230 Algorithms Wed 1400 LT15 S-Halim
MA1231 Disc. Math Tue 1400 LT28 CF-Gauss
UIT2201 CSITR Wed 1000 USP-SR1 LeongHW
VE1001 Vocal-Ex Fri 1700 RWS TanEC
ENROL (100,000)
Student-ID Course-ID
U0601001S MA1231
U0601001S UIT2201
U1909999P CS3230
U2908888P UIT2201
Figure 13.3: Data Organization Hierarchy
Database Organization (Overview)
Data Organization (A Bottom-Up View)
• Bit
  ◦ A binary digit (0 or 1)
• Byte
  ◦ A group of eight (8) bits
  ◦ Stores the binary rep. of a character / small integer
  ◦ A single unit of addressable memory
• Field
  ◦ A group of bytes used to represent a string
Database Organization (continued)
• Record
  ◦ A collection of related fields
• Data File (Table)
  ◦ Related records are kept in a data file
• Database
  ◦ Related files make up a database
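The bottom-up hierarchy can be walked through in a few lines of Python. This is only a sketch: the ASCII encoding and the choice of Python tuples, lists and dictionaries to stand in for records, files and databases are assumptions, not something the slides prescribe.

```python
# Bit -> byte -> field -> record -> data file (table) -> database.

# Field: a group of bytes (ASCII encoding is an assumption here).
field = "UIT2201".encode("ascii")

# Byte: each byte is a group of eight bits.
bits = [format(byte, "08b") for byte in field]

# Record: a collection of related fields (one COURSE-INFO row).
record = ("UIT2201", "CSITR", "Wed", 1000, "USP-SR1", "LeongHW")

# Data file (table): related records kept together.
course_info = [record]

# Database: related files (tables) kept together.
database = {"COURSE-INFO": course_info}

print(len(field), "bytes; first byte as bits:", bits[0])
```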
Figure 13.4: Records and Fields in a Single File (Table)
Database Files or Database Table
Eg: COURSE-INFO (Table, Record, Fields)
Foundations of Relational DB
• Table (Relation): information about an entity
  ◦ A set of records (eg: COURSE-INFO Table)
• Record (Tuple): data about an instance of the entity
  ◦ A row in the table; A tuple; Eg: (UIT2201, …, Wed, 1000, …)
• Attribute (Field): category of information/data
  ◦ Columns in the table (eg: Course-ID, Name, Day, Hour, …)
• Schema: A set of Attributes
  ◦ COURSE-INFO – {Course-ID, Name, Day, Hour, Venue, Instructor}
• Database: A set of tables (relations)
  ◦ { STUDENT-INFO, COURSE-INFO, ENROL }
Typical Database Operations…
• Insert a new Record
• Delete Records
  ◦ Delete a specific record
  ◦ Delete all records that match the specification X
• Searching for Records (‘select’)
  ◦ Look up all records that match a given spec. X
• Display some attributes (‘projection’)
• Join two or more tables (‘join’)
Relational-DB Operations
Insert (COURSE-INFO, (CS1102, Data-Struct, Thu, 1000, LT14, King) )
Delete (COURSE-INFO, (UIT2201, CSITR, Wed, 1000, USP-SR1, LeongHW) )
Delete (COURSE-INFO, (UIT2201, *, *, *, *, *) )
Lookup (COURSE-INFO, ( *, *, Tue, *, *, *) )
Lookup (COURSE-INFO, ( *, *, Wed, *, LT27, *) )
COURSE-INFO (1,000)
Course-ID Name Day Hour Venue Instructor
CS3230 Algorithms Wed 1400 LT15 S-Halim
MA1231 Disc. Math Tue 1400 LT28 CF-Gauss
UIT2201 CSITR Wed 1000 USP-SR1 LeongHW
VE1001 Vocal-Ex Fri 1700 RWS TanEC
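The Insert, Delete and Lookup operations with `*` wildcards can be sketched in Python as follows. The function names, the list-of-tuples table representation, and the rule that a pattern has one entry per attribute (six for COURSE-INFO) are assumptions made for this illustration.

```python
WILDCARD = "*"

def matches(record, pattern):
    """True if every non-wildcard entry of pattern equals the record's field."""
    return len(record) == len(pattern) and all(
        p == WILDCARD or p == r for r, p in zip(record, pattern)
    )

def insert(table, record):
    """Add one new record to the table."""
    table.append(record)

def delete(table, pattern):
    """Remove, in place, every record that matches the pattern."""
    table[:] = [r for r in table if not matches(r, pattern)]

def lookup(table, pattern):
    """Return all records that match the pattern."""
    return [r for r in table if matches(r, pattern)]

course_info = [
    ("CS3230", "Algorithms", "Wed", "1400", "LT15", "S-Halim"),
    ("MA1231", "Disc. Math", "Tue", "1400", "LT28", "CF-Gauss"),
    ("UIT2201", "CSITR", "Wed", "1000", "USP-SR1", "LeongHW"),
    ("VE1001", "Vocal-Ex", "Fri", "1700", "RWS", "TanEC"),
]

insert(course_info, ("CS1102", "Data-Struct", "Thu", "1000", "LT14", "King"))
tuesday = lookup(course_info, ("*", "*", "Tue", "*", "*", "*"))
delete(course_info, ("UIT2201", "*", "*", "*", "*", "*"))
```

A fully specified pattern deletes one specific record, while a pattern with wildcards deletes every record that matches the specification.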
Where “Relational” DB comes from

Relational-DB       Relational Algebra
Schema              Schema
Table               Relation
Records             Tuples
Relational Ops:     Relational Ops:
  Insert              Insert
  Union               Union
  Select              Select
  Project             Project
  Join                Natural Joins
If interested, read more in https://en.wikipedia.org/wiki/Relational_algebra
Database for use in Tutorials
STUDENT-INFO (30,000)
Student-ID Name NRIC-ID Address Tel-No Faculty Major
U0601001S Betty Yeo S9998888Q 23A Cinn 11112222 FoS Math
U0902002R Tan Lee Lee S8888777A 57 PGP 22224444 FASS Econs
U1909999P Albert Neo S5556666C 707 KE7 99995555 SOC CS
. . . . . . . . . . . . . . . . . . . . .
COURSE-INFO (1,000)
Course-ID Name Day Hour Venue Instructor
CS3230 Algorithms Wed 1000 USP-SR1 S-Halim
MA1231 Disc. Math Tue 1400 LT28 CF-Gauss
UIT2201 CSITR Wed 1000 USP-SR1 LeongHW
. . . . . . . . . . . . . . . . . .
ENROL (100,000)
Student-ID Course-ID
U0601001S MA1231
U0601001S UIT2201
U1909999P CS3230
. . . . . .
Database Management Systems
• DBMS (Database Mgmt Systems)
  ◦ Software system, maintains the files and data
• Relational Database Model (and Design)
  ◦ Database specified via schema (conceptual models)
• Database Query Processing
  ◦ To query the database (to get information)
  ◦ SQL (Structured Query Language)
    ◆ Specialized query language
• Relationships between tables
  ◦ Established via primary keys and foreign keys
Query Processing with SQL
• SQL is a DB Query Language
  ◦ Supported by many of the common DBMS
  ◦ Provides an easy means to query a DBMS
  ◦ Provides an easier means to insert/delete records
  ◦ Quite simple to use / can even learn on your own
• Simplified SQL Queries (used in UIT2201)
  ◦ SELECT <some fields>
    FROM <some tables>
    WHERE <some conditions>;
Much simpler than real SQL, but sufficient for UIT2201.
DB Query Processing with SQL (Simple Examples)
Database for Rugs-for-You
Query Processing (simple, using SQL)
SQL Query:
SELECT ID, LastName, FirstName, PayRate
FROM EMPLOYEES
WHERE (LastName = 'Kay');

Output of SQL Query:
ID   LASTNAME  FIRSTNAME  PAYRATE
116  Kay       Janet      $16.60
171  Kay       John       $17.80

Note: The output of any SQL query is a new table.
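The query can be tried out with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the slides do not say which DBMS is used, the column types are guessed, and the third employee row is made up so that the WHERE clause has something to filter out. SQLite compares text with `=` case-sensitively by default, so the literal is written `'Kay'` to match the stored values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE EMPLOYEES ("
    "ID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT, PayRate REAL)"
)
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?, ?, ?)",
    [
        (116, "Kay", "Janet", 16.60),
        (171, "Kay", "John", 17.80),
        (999, "Lim", "Pat", 15.00),  # made-up row, not from the slides
    ],
)

rows = conn.execute(
    "SELECT ID, LastName, FirstName, PayRate "
    "FROM EMPLOYEES "
    "WHERE (LastName = 'Kay')"
).fetchall()

for row in rows:
    print(row)  # each result row is a tuple; together they form a new table

conn.close()
```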
Query Processing (simple, using SQL)
SELECT ID, LastName, FirstName, HoursWorked
FROM EMPLOYEES
WHERE (HoursWorked > 200);

SELECT *
FROM EMPLOYEES
WHERE (PayRate > 15.00);
Figure 13.8: Three Tables in the Rugs-For-You Database
Primary Keys and Foreign Keys
(Readings: Primary & Foreign Keys, [SG3] Section 13.3)
SQL with Multiple Relations
• To combine two or more tables that share common data (via keys), SQL uses a Join operation.
SELECT ID, LastName, FirstName, PlanType, DateIssued
FROM EMPLOYEES, INSURANCEPOLICIES
WHERE (LastName = 'Takasano') AND (EMPLOYEES.ID = INSURANCEPOLICIES.EmployeeID);
Here EMPLOYEES.ID is a key, and INSURANCEPOLICIES.EmployeeID is a foreign key.
Join Operation of 2 Tables
COURSE-INFO (1,000)
Course-ID Name Day Hour Venue Instructor
CS3230 Algorithms Wed 1400 LT15 S-Halim
MA1231 Disc. Math Tue 1400 LT28 CF-Gauss
UIT2201 CSITR Wed 1000 USP-SR1 LeongHW
VE1001 Vocal-Ex Fri 1700 RWS TanEC
ENROL (100,000)
Student-ID Course-ID
U0601001S MA1231
U0601001S UIT2201
U1909999P CS3230
U2908888P UIT2201
JOIN COURSE-INFO and ENROL WHERE (COURSE-INFO.Course-ID = ENROL.Course-ID)
RESULT-OF-JOIN-OPERATION
Course-ID Name Day Hour Venue Instructor Student-ID
CS3230 Algorithms Wed 1400 LT15 S-Halim U1909999P
MA1231 Disc. Math Tue 1400 LT28 CF-Gauss U0601001S
UIT2201 CSITR Wed 1000 USP-SR1 LeongHW U0601001S
UIT2201 CSITR Wed 1000 USP-SR1 LeongHW U2908888P
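The same join can be reproduced with Python's built-in `sqlite3` module. This is a sketch, not the course's tooling: the underscore names (`COURSE_INFO`, `Course_ID`, `Student_ID`) replace the hyphenated names on the slide because hyphens are awkward in SQL identifiers, the column types are assumed, and the `ORDER BY` is added only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE COURSE_INFO (Course_ID TEXT PRIMARY KEY, Name TEXT, Day TEXT,
                          Hour TEXT, Venue TEXT, Instructor TEXT);
CREATE TABLE ENROL (Student_ID TEXT, Course_ID TEXT);
""")
conn.executemany(
    "INSERT INTO COURSE_INFO VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("CS3230", "Algorithms", "Wed", "1400", "LT15", "S-Halim"),
        ("MA1231", "Disc. Math", "Tue", "1400", "LT28", "CF-Gauss"),
        ("UIT2201", "CSITR", "Wed", "1000", "USP-SR1", "LeongHW"),
        ("VE1001", "Vocal-Ex", "Fri", "1700", "RWS", "TanEC"),
    ],
)
conn.executemany(
    "INSERT INTO ENROL VALUES (?, ?)",
    [
        ("U0601001S", "MA1231"),
        ("U0601001S", "UIT2201"),
        ("U1909999P", "CS3230"),
        ("U2908888P", "UIT2201"),
    ],
)

# Join on the shared Course_ID column, as in the WHERE clause on the slide.
joined = conn.execute("""
    SELECT COURSE_INFO.Course_ID, Name, Day, Hour, Venue, Instructor, Student_ID
    FROM COURSE_INFO, ENROL
    WHERE (COURSE_INFO.Course_ID = ENROL.Course_ID)
    ORDER BY COURSE_INFO.Course_ID, Student_ID
""").fetchall()

for row in joined:
    print(row)

conn.close()
```

VE1001 has no matching ENROL row, so it does not appear in the result.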
More about JOIN operation
• Check out the animation of the Join Operation
  ◦ Running time: O(mn) row operations, where m and n are the numbers of rows in the two tables
• Join is an expensive operation!
• May produce huge resultant tables;
• Exercise great care with JOINs (See examples in Tutorial)
Declarative vs Procedural
SQL is a Declarative Language
• You specify: WHAT you want
  ◦ From which tables
  ◦ What conditions to meet
• You DON’T specify: HOW to do the query
Procedural: HOW to process it
• Uses 3 basic primitives: e-select, e-project, e-join
• Give algorithm steps: an “Algorithm” to process the query
Why declarative SQL?
Why SQL?
• Easy to use: ignore HOW to do the query
• Let the DBMS figure out HOW to do the query EFFICIENTLY
• REDUCES user mistakes
• WORKS WELL most of the time.
Why learn Procedural…
Open the Hood:
• Use 3 primitives (e-select, e-project, e-join)
• Understand HOW it works.
Learn Algorithm:
• To process Complex Queries EFFICIENTLY
Three basic primitives
• Basic Primitive Operation 1 – e-select
  ◦ e-select from <table> where <some condition>;
  ◦ (a row/record selector)
  ◦ includes ALL columns. (ALWAYS)
  ◦ Examples:
    T1 ← e-select from SCHEDULE-DB where (DAY=“Tue”);
    T4 ← e-select from SCHEDULE-DB where (HOUR=1200);
• Basic Primitive Operation 2 – e-project
  ◦ e-project <some fields> from <table>;
  ◦ (a column/field selector)
  ◦ includes ALL rows. (ALWAYS)
  ◦ Examples:
    P1 ← e-project COURSE, DAY from SCHEDULE-DB;
    P6 ← e-project COURSE, HOUR from T1;
  ◦ e-project takes Θ(n) row operations
Basic primitives operations (2)

P1 ← e-project Course, Day from SCHEDULE-DB;
S1 ← e-select from SCHEDULE-DB where (Day=“Tue”);

SCHEDULE-DB
Course   Day  Hour
UIT2201  Tue  1000
UIT2201  Tue  1100
CS1101   Wed  1300
CS1101   Wed  1400

P1
Course   Day
UIT2201  Tue
UIT2201  Tue
CS1101   Wed
CS1101   Wed

S1
Course   Day  Hour
UIT2201  Tue  1000
UIT2201  Tue  1100

In e-project, ALL rows are included
In e-select, ALL columns are included
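The two primitives can be mimicked in Python to reproduce P1 and S1. This is a minimal sketch: the function names `e_select` and `e_project`, the list-of-dictionaries table representation, and passing the condition as a Python function are all assumptions made for illustration.

```python
def e_select(table, condition):
    """Row selector: keep the rows that satisfy condition, with ALL columns."""
    return [row for row in table if condition(row)]

def e_project(fields, table):
    """Column selector: keep only the named fields, for ALL rows."""
    return [{f: row[f] for f in fields} for row in table]

schedule_db = [
    {"Course": "UIT2201", "Day": "Tue", "Hour": 1000},
    {"Course": "UIT2201", "Day": "Tue", "Hour": 1100},
    {"Course": "CS1101", "Day": "Wed", "Hour": 1300},
    {"Course": "CS1101", "Day": "Wed", "Hour": 1400},
]

# P1 <- e-project Course, Day from SCHEDULE-DB;
p1 = e_project(["Course", "Day"], schedule_db)

# S1 <- e-select from SCHEDULE-DB where (Day = "Tue");
s1 = e_select(schedule_db, lambda row: row["Day"] == "Tue")
```

Each function visits every row exactly once, and `e_project` keeps duplicate rows, as P1 on the slide does.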
Basic primitives operation – e-join
• Basic Primitive Operation 3 – e-join
  ◦ e-join from <two tables> where <join-conditions>;
  ◦ Specify join-conditions using primary/foreign keys;
  ◦ ONLY two (2) tables at a time! (basic primitive)
  ◦ Includes all “satisfying” rows and columns
  ◦ Examples:
    B1 ← e-join SCHEDULE-DB and VENUE-DB where (SCHEDULE-DB.Course = VENUE-DB.Course);
    W3 ← e-join P6 and VENUE-DB where (P6.Course = VENUE-DB.Course);
Example of e-join
SCHEDULE-DB
Course   Day  Hour
UIT2201  Tue  10 AM
UIT2201  Tue  11 AM
CS1101   Wed  1 PM
CS1101   Wed  2 PM

VENUE-DB
Course   Room
UIT2201  SR5
CS1101   LT15

B1 ← e-join SCHEDULE-DB and VENUE-DB where (SCHEDULE-DB.Course = VENUE-DB.Course);

B1
Course   Day  Hour   Room
UIT2201  Tue  10 AM  SR5
UIT2201  Tue  11 AM  SR5
CS1101   Wed  1 PM   LT15
CS1101   Wed  2 PM   LT15
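One simple way to carry out an e-join is a nested loop that tests every pair of rows, which is where a cost of roughly n × m row operations comes from. The sketch below assumes that strategy; the function name `e_join` and the list-of-dictionaries tables are illustrative choices, and a real DBMS may use a different join method.

```python
def e_join(left, right, condition):
    """Nested-loop join: test every (left row, right row) pair."""
    result = []
    for l_row in left:
        for r_row in right:
            if condition(l_row, r_row):
                # Merge the two rows; the shared Course column appears once.
                result.append({**l_row, **r_row})
    return result

schedule_db = [
    {"Course": "UIT2201", "Day": "Tue", "Hour": "10 AM"},
    {"Course": "UIT2201", "Day": "Tue", "Hour": "11 AM"},
    {"Course": "CS1101", "Day": "Wed", "Hour": "1 PM"},
    {"Course": "CS1101", "Day": "Wed", "Hour": "2 PM"},
]
venue_db = [
    {"Course": "UIT2201", "Room": "SR5"},
    {"Course": "CS1101", "Room": "LT15"},
]

# B1 <- e-join SCHEDULE-DB and VENUE-DB
#       where (SCHEDULE-DB.Course = VENUE-DB.Course);
b1 = e_join(schedule_db, venue_db, lambda s, v: s["Course"] == v["Course"])
```

With 4 schedule rows and 2 venue rows, the loop performs 4 × 2 = 8 comparisons to produce the 4 rows of B1.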
About the 3 primitives
• Query processing with basic primitives
  ◦ e-project: Θ(n) row operations
  ◦ e-select: Θ(n) row operations
  ◦ e-join: Θ(nm) row operations (EXPENSIVE)
• Primitives are primitives
  ◦ These are most basic operations (like Scratch blocks)
  ◦ CANNOT change their syntax / semantics
  ◦ CANNOT do e-project and also specify conditions
  ◦ CANNOT do e-select and also specify fields
  ◦ CANNOT e-join 3 tables in one operation
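Because each primitive does only one job, a query that needs a condition, chosen fields and two tables is answered by chaining primitives, one step per intermediate table. The sketch below chains T1, P6 and W3 from the earlier slides; the Python function names and the list-of-dictionaries tables are assumptions made for illustration.

```python
def e_select(table, condition):
    """Row selector: all columns, only rows satisfying condition."""
    return [row for row in table if condition(row)]

def e_project(fields, table):
    """Column selector: all rows, only the named fields."""
    return [{f: row[f] for f in fields} for row in table]

def e_join(left, right, condition):
    """Two-table join by testing every pair of rows."""
    return [{**l, **r} for l in left for r in right if condition(l, r)]

schedule_db = [
    {"Course": "UIT2201", "Day": "Tue", "Hour": 1000},
    {"Course": "UIT2201", "Day": "Tue", "Hour": 1100},
    {"Course": "CS1101", "Day": "Wed", "Hour": 1300},
    {"Course": "CS1101", "Day": "Wed", "Hour": 1400},
]
venue_db = [
    {"Course": "UIT2201", "Room": "SR5"},
    {"Course": "CS1101", "Room": "LT15"},
]

# T1 <- e-select from SCHEDULE-DB where (DAY = "Tue");
t1 = e_select(schedule_db, lambda r: r["Day"] == "Tue")
# P6 <- e-project COURSE, HOUR from T1;
p6 = e_project(["Course", "Hour"], t1)
# W3 <- e-join P6 and VENUE-DB where (P6.Course = VENUE-DB.Course);
w3 = e_join(p6, venue_db, lambda p, v: p["Course"] == v["Course"])
```

Selecting before joining means the e-join sees 2 rows instead of 4, so the expensive step does less work.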
Issues when designing schema…
• Try to Avoid Duplication of Data
• Avoid “accidental” deletion of information
• Other Considerations in Databases
  ◦ Read Section 13.3.3 (pp. 604-606)
Why not store everything in one Table?

STUDENT-SCHEDULE-DB
Stud-ID  Name        Phone  Course   Day  Hour   …
1024     Albert Zan  4358   UIT2201  Tue  10 AM  …
1024     Albert Zan  4358   UIT2201  Tue  11 AM  …
1337     Cathy Xin   1388   CS1101   Wed  1 PM   …
1337     Cathy Xin   1388   CS1101   Wed  2 PM   …
2007     Betty Yeo   6177   UIT2201  Tue  10 AM  …
2007     Betty Yeo   6177   UIT2201  Tue  11 AM  …

• Problems:
  ◦ Duplication of data;
  ◦ Deletion Problem;
    ◆ What if Cathy Xin drops CS1101?
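The deletion problem can be seen by actually dropping Cathy Xin's CS1101 rows. The sketch below does this twice: once on the single combined table, and once on a split design with separate student and enrolment tables, in the spirit of STUDENT-INFO and ENROL. The variable names and the exact split are assumptions made for illustration.

```python
student_schedule_db = [
    {"Stud-ID": 1024, "Name": "Albert Zan", "Phone": "4358", "Course": "UIT2201", "Day": "Tue", "Hour": "10 AM"},
    {"Stud-ID": 1024, "Name": "Albert Zan", "Phone": "4358", "Course": "UIT2201", "Day": "Tue", "Hour": "11 AM"},
    {"Stud-ID": 1337, "Name": "Cathy Xin", "Phone": "1388", "Course": "CS1101", "Day": "Wed", "Hour": "1 PM"},
    {"Stud-ID": 1337, "Name": "Cathy Xin", "Phone": "1388", "Course": "CS1101", "Day": "Wed", "Hour": "2 PM"},
    {"Stud-ID": 2007, "Name": "Betty Yeo", "Phone": "6177", "Course": "UIT2201", "Day": "Tue", "Hour": "10 AM"},
    {"Stud-ID": 2007, "Name": "Betty Yeo", "Phone": "6177", "Course": "UIT2201", "Day": "Tue", "Hour": "11 AM"},
]

def phone_of(table, name):
    """Look up a student's phone number in the combined table, if any row remains."""
    phones = {row["Phone"] for row in table if row["Name"] == name}
    return phones.pop() if phones else None

# One big table: dropping the course removes every row that mentions Cathy.
before = phone_of(student_schedule_db, "Cathy Xin")
after_drop = [
    row for row in student_schedule_db
    if not (row["Name"] == "Cathy Xin" and row["Course"] == "CS1101")
]
after = phone_of(after_drop, "Cathy Xin")

# Split design: student details are stored once, enrolments separately.
students = {
    1024: ("Albert Zan", "4358"),
    1337: ("Cathy Xin", "1388"),
    2007: ("Betty Yeo", "6177"),
}
enrol = [(1024, "UIT2201"), (1337, "CS1101"), (2007, "UIT2201")]
enrol_after = [(s, c) for (s, c) in enrol if not (s == 1337 and c == "CS1101")]

print(before, after, students[1337][1])
```

In the combined table the phone number disappears together with the course rows; in the split design only the enrolment row is removed and the student's details survive.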
Outline
• What is a Database and Evolution…
• Organization of Databases
• Foundations of Relational Database
• DBMS and Query Processing
• DB Applications
• Concurrency Issue in Database
Some examples of DB applications…
• Value added Services:
  ◦ Data Mining – frequent patterns
  ◦ Targeted marketing (Database marketing)
  ◦ Credit-card fraud
  ◦ Handphone account churn analysis
• Big Data applications:
  ◦ Data Visualization
  ◦ Data Mining
  ◦ Deep Learning
Thank you!