The Structured Query Language
Zachary G. Ives / Nicholas Taylor
University of Pennsylvania
CIS 550 – Database & Information Systems
September 26, 2007
Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan
2
Administrivia
  Homework 2 handed out today
  Due 10/8
3
Recall Basic SQL
  SELECT [DISTINCT] {T1.attrib, …, T2.attrib}    (select-list)
  FROM {relation} T1, {relation} T2, …           (from-list)
  WHERE {predicates}                             (qualification)
  SELECT *
    All STUDENTs
  AS
    As a “range variable” (tuple variable): optional
    As an attribute rename operator
4
Our Example Data Instance

STUDENT
  sid  name
  1    Jill
  2    Qun
  3    Nitin

Takes
  sid  exp-grade  cid
  1    A          550-0105
  1    A          700-1005
  3    C          501-0105

COURSE
  cid       subj  sem
  550-0105  DB    F05
  700-1005  AI    S05
  501-0105  Arch  F05

PROFESSOR
  fid  name
  1    Ives
  2    Saul
  8    Martin

Teaches
  fid  cid
  1    550-0105
  2    700-1005
  8    501-0105
5
Some Nice Features
  SELECT *
    All STUDENTs
  AS
    As a “range variable” (tuple variable): optional
    As an attribute rename operator
  Example: Which students (names) have taken more than one course from the same professor?
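One possible way to write the example query, as a sketch rather than the lecture's own answer. It assumes the STUDENT, Takes and Teaches tables of the example data instance, and uses range variables (T1, T2, P1, P2) to refer to two Takes tuples and two Teaches tuples at once.

```sql
-- Students (names) who have taken more than one course from the same professor.
-- Assumes STUDENT(sid, name), Takes(sid, exp_grade, cid) and Teaches(fid, cid).
SELECT DISTINCT S.name
FROM STUDENT S, Takes T1, Takes T2, Teaches P1, Teaches P2
WHERE S.sid = T1.sid
  AND S.sid = T2.sid
  AND T1.cid = P1.cid        -- P1 taught the first course
  AND T2.cid = P2.cid        -- P2 taught the second course
  AND P1.fid = P2.fid        -- same professor
  AND T1.cid <> T2.cid;      -- two different courses
```

In the example instance every professor teaches exactly one course, so this query returns no rows there.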
6
Expressions in SQL
  Can do computation over scalars (int, real or string) in the select-list or the qualification
    Show all student IDs decremented by 1
  Strings:
    Fixed (CHAR(x)) or variable length (VARCHAR(x))
    Use single quotes: 'A string'
    Special comparison operator: LIKE
    Not equal: <>
  Typecasting:
    CAST(S.sid AS VARCHAR(255))
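The features listed on this slide can be sketched as follows. The queries assume the STUDENT(sid, name) table of the example instance with an integer sid. The column alias prev_sid and the particular predicates are illustrative choices, not part of the original slides.

```sql
-- All student IDs decremented by 1 (computation in the select-list).
SELECT S.sid - 1 AS prev_sid
FROM STUDENT S;

-- String comparison and "not equal" in the qualification:
-- students whose name starts with 'J' and whose sid is not 2.
SELECT S.sid, S.name
FROM STUDENT S
WHERE S.name LIKE 'J%'
  AND S.sid <> 2;

-- Typecasting the integer sid to a string so that LIKE can be applied to it.
SELECT S.name
FROM STUDENT S
WHERE CAST(S.sid AS VARCHAR(255)) LIKE '1%';
```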
7
Set Operations
  Set operations default to set semantics, not bag semantics:
    (SELECT … FROM … WHERE …)
    {op}
    (SELECT … FROM … WHERE …)
  Where op is one of:
    UNION
    INTERSECT, MINUS/EXCEPT (many DBs don’t support these last ones!)
  Bag semantics: append ALL to the operator (e.g., UNION ALL)
8
Exercise
  Find all students who have taken DB but not AI
    Hint: use EXCEPT
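A possible solution to the exercise, offered as a sketch. It assumes the Takes and COURSE tables of the example instance and returns student IDs. On Oracle the operator is spelled MINUS instead of EXCEPT.

```sql
-- Students (sids) who have taken DB but not AI.
-- Assumes Takes(sid, exp_grade, cid) and COURSE(cid, subj, sem).
(SELECT T.sid
 FROM Takes T, COURSE C
 WHERE T.cid = C.cid AND C.subj = 'DB')
EXCEPT
(SELECT T.sid
 FROM Takes T, COURSE C
 WHERE T.cid = C.cid AND C.subj = 'AI');
```

In the example instance the only student who has taken DB (sid 1) has also taken AI, so the result is empty.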
11
Revised Example Data Instance

STUDENT
  sid  name
  1    Jill
  2    Qun
  3    Nitin
  4    Marty

Takes
  sid  exp-grade  cid
  1    A          550-0105
  1    A          700-1005
  3    A          700-1005
  3    C          501-0105
  4    C          501-0105

COURSE
  cid       subj  sem
  550-0105  DB    F05
  700-1005  AI    S05
  501-0105  Arch  F05
  555-1006  Sys   S06

PROFESSOR
  fid  name
  1    Ives
  2    Saul
  8    Martin

Teaches
  fid  cid
  1    550-0105
  2    700-1005
  8    501-0105
12
Nested Queries in SQL
  Simplest: IN/NOT IN
  Example: Students who have taken subjects that have (at any point) been taught by Martin
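The example can be sketched with an IN subquery. The query assumes the STUDENT, Takes, COURSE, Teaches and PROFESSOR tables of the revised example instance. The inner query computes the subjects Martin has taught, and the outer query finds students who took any offering of those subjects.

```sql
-- Students who have taken subjects that have (at any point) been taught by Martin.
SELECT DISTINCT S.name
FROM STUDENT S, Takes T, COURSE C
WHERE S.sid = T.sid
  AND T.cid = C.cid
  AND C.subj IN (SELECT C2.subj
                 FROM COURSE C2, Teaches TE, PROFESSOR P
                 WHERE C2.cid = TE.cid
                   AND TE.fid = P.fid
                   AND P.name = 'Martin');
```

On the revised instance Martin teaches only the Arch offering 501-0105, so the query returns Nitin and Marty.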
13
Correlated Subqueries
  Most common: EXISTS/NOT EXISTS
  Find all students who have taken DB but not AI
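A sketch of the same "DB but not AI" question using correlated subqueries. Each subquery refers to the outer range variable S, so conceptually it is re-evaluated for every student. The tables are those of the example instance.

```sql
-- Students who have taken DB but not AI, via EXISTS / NOT EXISTS.
SELECT S.sid, S.name
FROM STUDENT S
WHERE EXISTS (SELECT *
              FROM Takes T, COURSE C
              WHERE T.sid = S.sid          -- correlation with the outer query
                AND T.cid = C.cid
                AND C.subj = 'DB')
  AND NOT EXISTS (SELECT *
                  FROM Takes T, COURSE C
                  WHERE T.sid = S.sid
                    AND T.cid = C.cid
                    AND C.subj = 'AI');
```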
14
Universal and Existential Quantification
  Generally used with subqueries: {op} ANY, {op} ALL
  Find the students with the best expected grades
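One way to express "best expected grade" with a quantified comparison, as a sketch. It assumes that the slides' exp-grade column is named exp_grade (a hyphen is not legal in an unquoted identifier) and that grades are letters, so the best grade is the smallest one ('A' < 'C').

```sql
-- Students whose expected grade is at least as good as every expected grade.
SELECT DISTINCT S.sid, S.name
FROM STUDENT S, Takes T
WHERE S.sid = T.sid
  AND T.exp_grade <= ALL (SELECT T2.exp_grade
                          FROM Takes T2);
```

On the revised instance this returns Jill and Nitin, who both expect an A. Note that `= ANY` behaves like IN, and that a NULL grade in the subquery would make the `<= ALL` comparison unknown for every row.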
15
Table Expressions
  Can substitute a subquery for any relation in the FROM clause:
    SELECT S.sid
    FROM (SELECT sid FROM STUDENT WHERE sid = 5) S
    WHERE S.sid = 4
  Notice that we can actually simplify this query!
  What is this equivalent to?
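One way to answer the question, as a sketch: the predicate of the table expression can be merged into the qualification of the outer query.

```sql
-- The table expression keeps only sid = 5, and the outer query then asks for
-- sid = 4, so both predicates end up in a single qualification.
SELECT S.sid
FROM STUDENT S
WHERE S.sid = 5
  AND S.sid = 4;   -- contradictory: no tuple satisfies both, so the result is empty
```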
16
Aggregation
  GROUP BY
    SELECT {group-attribs}, {aggregate-operator}(attrib)
    FROM {relation} T1, {relation} T2, …
    WHERE {predicates}
    GROUP BY {group-list}
  Aggregate operators
    AVG, COUNT, SUM, MAX, MIN
    DISTINCT keyword for AVG, COUNT, SUM
17
Some Examples
  Number of students in each course offering
  Number of different grades expected for each course offering
  Number of (distinct) students taking AI courses
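The three examples might be written as below. This is a sketch that assumes the Takes and COURSE tables of the revised instance, with exp_grade standing in for the slides' exp-grade column. The result column names are illustrative.

```sql
-- Number of students in each course offering.
SELECT T.cid, COUNT(T.sid) AS num_students
FROM Takes T
GROUP BY T.cid;

-- Number of different grades expected for each course offering.
SELECT T.cid, COUNT(DISTINCT T.exp_grade) AS num_grades
FROM Takes T
GROUP BY T.cid;

-- Number of (distinct) students taking AI courses.
SELECT COUNT(DISTINCT T.sid) AS num_ai_students
FROM Takes T, COURSE C
WHERE T.cid = C.cid
  AND C.subj = 'AI';
```

Because the first two queries group over Takes only, an offering nobody takes (555-1006 in the revised instance) does not appear in their results at all.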
18
Data Instance, Again
  (Same tables as the Revised Example Data Instance on slide 11: STUDENT, Takes, COURSE, PROFESSOR, Teaches)
19
What If You Want to Only Show Some Groups?
  The HAVING clause lets you do a selection based on an aggregate (there must be 1 value per group):
    SELECT C.subj, COUNT(S.sid)
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid
    GROUP BY subj
    HAVING COUNT(S.sid) > 5
  Exercise: For each subject taught by at least two professors, list the minimum expected grade
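A possible solution to the exercise, as a sketch under two assumptions: the exp-grade column is named exp_grade, and "minimum" is taken literally as MIN over the letter grades, which is lexicographic and therefore picks 'A' over 'C'. If the worst grade is wanted instead, MAX would be the aggregate to use.

```sql
-- For each subject taught by at least two professors, the minimum expected grade.
SELECT C.subj, MIN(T.exp_grade) AS min_grade
FROM COURSE C, Teaches TE, Takes T
WHERE C.cid = TE.cid
  AND C.cid = T.cid
GROUP BY C.subj
HAVING COUNT(DISTINCT TE.fid) >= 2;   -- DISTINCT: the join repeats each professor once per Takes row
```

In the revised instance each subject has a single professor, so this query returns no rows there.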
20
Aggregation and Table Expressions (aka Derived Relations)
  Sometimes need to compute results over the results of a previous aggregation:
    SELECT R.subj, AVG(R.size)
    FROM (
      SELECT C.cid AS id, C.subj AS subj, COUNT(S.sid) AS size
      FROM STUDENT S, Takes T, COURSE C
      WHERE S.sid = T.sid AND T.cid = C.cid
      GROUP BY C.cid, C.subj
    ) R
    GROUP BY R.subj
21
Thought Exercise…
  Tables are great, but…
    Not everyone is uniform – I may have a cell phone but not a fax
    We may simply be missing certain information
    We may be unsure about values
  How do we handle these things?
22
One Answer: Null Values
  We designate a special “null” value to represent “unknown” or “N/A”

    Name   Home      Fax
    Sam    123-4567  NULL
    Li     234-8972  234-8766
    Maria  789-2312  789-2121

  But a question: what does:
    SELECT * FROM CONTACT WHERE Fax < '789-1111'
  do?
23
Three-State Logic
  Need ways to evaluate boolean expressions and have the result be “unknown” (or T/F)
  Need ways of composing these three-state expressions using AND, OR, NOT:
    T AND U = U    T OR U = T    NOT U = U
    F AND U = F    F OR U = U
    U AND U = U    U OR U = U
  Can also test for null-ness: attr IS NULL, attr IS NOT NULL
  Finally: need rules for arithmetic, aggregation
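The effect of three-state logic on a query can be sketched with the CONTACT table from the null-value slide. The column types in the CREATE TABLE statement are assumptions made for illustration, since the slides do not give a schema.

```sql
-- Hypothetical definition of the CONTACT table (types are assumed).
CREATE TABLE CONTACT (
  Name VARCHAR(20),
  Home VARCHAR(8),
  Fax  VARCHAR(8)
);
INSERT INTO CONTACT VALUES ('Sam',   '123-4567', NULL);
INSERT INTO CONTACT VALUES ('Li',    '234-8972', '234-8766');
INSERT INTO CONTACT VALUES ('Maria', '789-2312', '789-2121');

-- For Sam, Fax < '789-1111' evaluates to unknown. WHERE keeps only rows whose
-- qualification is true, so Sam's row is not returned.
SELECT * FROM CONTACT WHERE Fax < '789-1111';

-- U OR U = U: Sam is missing here too, although the two tests look exhaustive.
SELECT * FROM CONTACT WHERE Fax < '789-1111' OR Fax >= '789-1111';

-- Rows with a null Fax have to be requested explicitly.
SELECT * FROM CONTACT WHERE Fax IS NULL;
```

With ordinary character ordering the first query returns only Li's row: Maria's fax '789-2121' sorts after '789-1111', and Sam's comparison is unknown.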
24
Nulls and Joins
  Sometimes need special variations of joins:
    I want to see all courses and their students
    … But what if there’s a course with no students?
  Outer join:
    Most common is left outer join:
      SELECT C.subj, C.cid, T.sid
      FROM COURSE C LEFT OUTER JOIN Takes T ON C.cid = T.cid
      WHERE …
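As an illustration of why the null-padded tuples matter, the sketch below counts students per course offering, including offerings with no students. It assumes the COURSE and Takes tables of the revised instance. The alias num_students is illustrative.

```sql
-- Every course offering with its number of students, including empty offerings.
-- COUNT(T.sid) skips the NULLs that the outer join pads in, so an offering
-- without students gets a count of 0 rather than disappearing from the result.
SELECT C.cid, C.subj, COUNT(T.sid) AS num_students
FROM COURSE C LEFT OUTER JOIN Takes T ON C.cid = T.cid
GROUP BY C.cid, C.subj;
```

On the revised instance this yields 1 student for 550-0105, 2 for 700-1005, 2 for 501-0105 and 0 for 555-1006. With an ordinary (inner) join the Sys offering 555-1006 would be absent.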
25
Data Instance, Again (!)
  (Same tables as the Revised Example Data Instance on slide 11: STUDENT, Takes, COURSE, PROFESSOR, Teaches)
26
Warning on Outer Join
  Oracle has its own, nonstandard outer join syntax (the standard LEFT OUTER JOIN form is accepted only from Oracle 9i on):
    SELECT C.subj, C.cid, T.sid
    FROM COURSE C, Takes T
    WHERE C.cid = T.cid (+)
27
Beyond Null
  Can have much more complex ideas of incomplete or approximate information
    Probabilistic models (tuple 80% likely to be an answer)
    Naïve tables (can have variables instead of NULLs)
    Conditional tables (tuple IF some condition holds)
  … And what if you want “0 or more”?
    In relational databases, create a new table and foreign key
    But can have semistructured data (like XML)
28
Modifying the Database: Inserting Data
  Inserting a new literal tuple is easy, if wordy:
    INSERT INTO PROFESSOR (fid, name)
    VALUES (4, 'Simpson')
  But we can also insert the results of a query!
    INSERT INTO PROFESSOR (fid, name)
    SELECT sid AS fid, name
    FROM STUDENT
    WHERE sid < 20
29
Deleting Tuples
  Deletion is a fairly simple operation:
    DELETE
    FROM STUDENT S
    WHERE S.sid < 25
30
Updating Tuples
  What kinds of updates might you want to do?
    UPDATE STUDENT S
    SET S.sid = 1 + S.sid, S.name = 'Janet'
    WHERE S.name = 'Jane'
31
Now, How Do I Talk to the DB?
  Generally, apps are in a different (“host”) language with embedded SQL statements
    Static (query fixed): SQLJ, embedded SQL in C
    Dynamic (query generated by program at runtime): ODBC, JDBC, ADO, OLE DB, …
  Predefined mappings between SQL types and host language types
    CHAR, VARCHAR  →  String
    INTEGER        →  int
    DOUBLE         →  double
32
Static SQL using SQLJ
  int sid = 5;
  String name5 = "Jim", name6;
  // Database connection setup omitted
  #sql {
    INSERT INTO STUDENT
    VALUES(:sid, :name5)
  };
  #sql {
    SELECT name INTO :name6
    FROM STUDENT
    WHERE sid = 6
  };
33
JDBC: Dynamic SQL
  import java.sql.*;

  Connection conn = DriverManager.getConnection(…);
  Statement s = conn.createStatement();
  int sid = 5;
  String name = "Jim";
  s.executeUpdate("INSERT INTO STUDENT VALUES(" + sid + ", '" + name + "')");
  // or equivalently
  s.executeUpdate("INSERT INTO STUDENT VALUES(5, 'Jim')");
34
Static vs. Dynamic SQL
  Syntax
    Static is cleaner than Dynamic
    Dynamic doesn’t extend language syntax, so you can use any tool you like
  Execution
    Static must be precompiled
      Can be faster at runtime
      Extra step is needed to deploy application
    Static checks SQL syntax at compilation time, Dynamic at run time
  We’ll focus on JDBC, since it’s easy to use
35
The Impedance Mismatch and Cursors
  SQL is set-oriented – it returns relations
  There’s no relation type in most languages!
  Solution: a cursor that is opened and then read one tuple at a time
    ResultSet rs = stmt.executeQuery("SELECT * FROM STUDENT");
    while (rs.next()) {
      int sid = rs.getInt("sid");
      String name = rs.getString("name");
      System.out.println(sid + ": " + name);
    }
36
JDBC: Prepared Statements (1)
  But query compilation takes a (relatively) long time!
  This example is therefore inefficient.
    int[] students = {1, 2, 4, 7, 9};
    for (int i = 0; i < students.length; ++i) {
      ResultSet rs = stmt.executeQuery("SELECT * " +
        "FROM STUDENT WHERE sid = " + students[i]);
      while (rs.next()) {
        …
      }
    }
37
JDBC: Prepared Statements (2)
  To speed things up, prepare statements and bind arguments to them
  This also means you don’t have to worry about escaping strings, formatting dates, etc.
    Problems with this lead to a lot of security holes (SQL injection)
    Or suppose a user inputs the name “O’Reilly”
    PreparedStatement stmt = conn.prepareStatement("SELECT * " +
      "FROM STUDENT WHERE sid = ? ");
    int[] students = {1, 2, 4, 7, 9};
    for (int i = 0; i < students.length; ++i) {
      stmt.setInt(1, students[i]);
      ResultSet rs = stmt.executeQuery();
      while (rs.next()) {
        …
      }
    }
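To make the escaping problem concrete, here is a sketch of the SQL text that string concatenation would send to the database for the earlier INSERT INTO STUDENT example. The malicious input shown is a made-up illustration.

```sql
-- Concatenating the name O'Reilly: the quote inside the name ends the string
-- literal early, so the resulting statement is malformed.
--   INSERT INTO STUDENT VALUES (5, 'O'Reilly')

-- Standard SQL escapes a single quote inside a literal by doubling it.
INSERT INTO STUDENT VALUES (5, 'O''Reilly');

-- A crafted "name" such as   x'); DELETE FROM STUDENT; --   turns the
-- concatenated text into two statements followed by a comment, which is the
-- essence of SQL injection.
--   INSERT INTO STUDENT VALUES (5, 'x'); DELETE FROM STUDENT; --')
```

With a bound parameter (`?`) the value is passed separately from the SQL text, so neither the escaping nor the injection issue arises. Whether a second statement like the DELETE above actually runs depends on the driver accepting several statements per call.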
38
Database-Backed Web Sites
  We all know traditional static HTML web sites:
  [Diagram: Web-Browser sends an HTTP-Request (GET ...) to the Web-Server, which loads an HTML-File from the File-System and returns it]
39
Common Gateway Interface (CGI)
  Can have the web server invoke code (with parameters) to generate HTML
  [Diagram components: HTTP-Request, Web Server, "File HTML?", Load File, File-System, "Program?", Execute Program, I/O/Network/DB, Output, HTML, HTML-File]
40
CGI: Discussion
  Advantages:
    Standardized: works for every web-server, browser
    Flexible: Any language (C++, Perl, Java, …) can be used
  Disadvantages:
    Statelessness: query-by-query approach
    Inefficient: new process forked for every request
    Security: CGI programmer is responsible for security
    Updates: To update layout, one has to be a programmer
41
DB Access in Java
  [Diagram components: Browser JVM, Java Applet, TCP/UDP over IP, Java-Server-Process, JDBC Driver manager, JDBC-Drivers, Sybase, Oracle, ...]
42
Java Applets: Discussion
  Advantages:
    Can take advantage of client processing
    Platform independent – assuming standard Java
  Disadvantages:
    Requires JVM on client; self-contained
    Inefficient: loading can take a long time ...
    Resource intensive: Client needs to be state of the art
    Restrictive: can only connect to server where applet was loaded from (for security … can be configured)
43
*SP Server Pages and Servlets (IIS, Tomcat, …)
  [Diagram components: HTTP Request, Web Server, "File HTML?", Load File, File-System, "Script/Servlet?", Server Extension, I/O/Network/DB, Output, HTML, HTML File]
  May have a built-in VM (JVM, CLR)
44
One Step Beyond: DB-Driven Web Sites (Strudel, Cocoon, …)
  [Diagram components: HTTP Request, DB-Driven Web Server, Cache, "Script?", Dynamic HTML Generation, Styles, Data, Local Database, Other data sources, HTML, HTML File]
45
Wrapping Up
  We’ve seen how to query in SQL
    Basic foundation is TRC-based
    Subqueries and aggregation add extra power beyond *RC
    Nulls and outer joins add flexibility of representation
    We can update tables
  We’ve also seen that SQL doesn’t precisely match standard host language semantics
    Embedded SQL
    Dynamic SQL
  We’ve seen a hint of data-driven web site architectures