
Advanced SQL

Zachary G. Ives
University of Pennsylvania

CIS 550 – Database & Information Systems

September 29, 2005

Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan


Administrivia

Homework 2 due Tuesday
Further project details will be given out next week


Aggregation

GROUP BY:

SELECT {group-attribs}, {aggregate-operator}(attrib)
FROM {relation} T1, {relation} T2, …
WHERE {predicates}
GROUP BY {group-list}

Aggregate operators:
  AVG, COUNT, SUM, MAX, MIN
  DISTINCT keyword for AVG, COUNT, SUM
  Can do COUNT(*)

After the GROUP BY, only {group-list} or aggregated values are accessible!


Some Examples

Number of students in each course offering

Number of different grades expected for each course offering

Number of (distinct) students taking AI courses
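The three examples above can be written as runnable queries. What follows is a minimal sketch in Python using the standard-library sqlite3 module, which is an assumption standing in for the course DBMS. It loads the Takes and COURSE tables of the example data instance and renames exp-grade to exp_grade, since a hyphen is not valid in an unquoted SQL identifier.

```python
import sqlite3

# Assumption: SQLite stands in for the course DBMS; exp-grade is renamed exp_grade.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COURSE(cid TEXT, subj TEXT, sem TEXT);
    CREATE TABLE Takes(sid INTEGER, exp_grade TEXT, cid TEXT);
    INSERT INTO COURSE VALUES ('550-0105','DB','F05'), ('700-1005','AI','S05'),
                              ('501-0105','Arch','F05'), ('555-1006','Sys','S06');
    INSERT INTO Takes VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'A','700-1005'),
                             (3,'C','501-0105'), (4,'C','501-0105');
""")

# Number of students in each course offering.
# Offerings with no students (555-1006) produce no group at all.
students_per_offering = conn.execute(
    "SELECT cid, COUNT(*) FROM Takes GROUP BY cid ORDER BY cid").fetchall()

# Number of different grades expected for each course offering.
grades_per_offering = conn.execute(
    "SELECT cid, COUNT(DISTINCT exp_grade) FROM Takes GROUP BY cid ORDER BY cid").fetchall()

# Number of (distinct) students taking AI courses.
ai_students = conn.execute(
    "SELECT COUNT(DISTINCT T.sid) FROM Takes T, COURSE C "
    "WHERE T.cid = C.cid AND C.subj = 'AI'").fetchone()[0]

print(students_per_offering)
print(grades_per_offering)
print(ai_students)
```

The first two queries group on cid alone, so only cid and aggregated values appear in the SELECT list, in line with the rule on the Aggregation slide.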


Revised Example Data Instance

STUDENT
sid  name
1    Jill
2    Qun
3    Nitin
4    Marty

Takes
sid  exp-grade  cid
1    A          550-0105
1    A          700-1005
3    A          700-1005
3    C          501-0105
4    C          501-0105

COURSE
cid       subj  sem
550-0105  DB    F05
700-1005  AI    S05
501-0105  Arch  F05
555-1006  Sys   S06

PROFESSOR
fid  name
1    Ives
2    Saul
8    Martin

Teaches
fid  cid
1    550-0105
2    700-1005
8    501-0105


What If You Want to Only Show Some Groups?

The HAVING clause lets you do a selection based on an aggregate (there must be 1 value per group):

SELECT C.subj, MIN(T.exp-grade)
FROM STUDENT S, Takes T, COURSE C
WHERE S.sid = T.sid AND T.cid = C.cid
GROUP BY subj
HAVING COUNT(DISTINCT S.sid) > 5

Note that you can ONLY use aggregate functions or {group-list} attributes in the HAVING clause

Exercise: For each subject taught by at least two professors, list the minimum expected grade
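One possible answer to the exercise is sketched below in Python with sqlite3; it is not necessarily the intended solution. Assumptions: SQLite stands in for the course DBMS, exp-grade is renamed exp_grade, and one hypothetical Teaches row is added afterwards, because in the example instance every subject has exactly one professor and the result would otherwise be empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COURSE(cid TEXT, subj TEXT, sem TEXT);
    CREATE TABLE Takes(sid INTEGER, exp_grade TEXT, cid TEXT);
    CREATE TABLE Teaches(fid INTEGER, cid TEXT);
    INSERT INTO COURSE VALUES ('550-0105','DB','F05'), ('700-1005','AI','S05'),
                              ('501-0105','Arch','F05'), ('555-1006','Sys','S06');
    INSERT INTO Takes VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'A','700-1005'),
                             (3,'C','501-0105'), (4,'C','501-0105');
    INSERT INTO Teaches VALUES (1,'550-0105'), (2,'700-1005'), (8,'501-0105');
""")

# HAVING filters whole groups using an aggregate over the group.
# MIN on letter grades uses string ordering, so 'A' sorts below 'C'.
QUERY = """
    SELECT C.subj, MIN(T.exp_grade)
    FROM COURSE C, Takes T, Teaches P
    WHERE C.cid = T.cid AND C.cid = P.cid
    GROUP BY C.subj
    HAVING COUNT(DISTINCT P.fid) >= 2
    ORDER BY C.subj
"""

before = conn.execute(QUERY).fetchall()

# Hypothetical extra row (not in the slides): professor 2 also teaches the DB course.
conn.execute("INSERT INTO Teaches VALUES (2, '550-0105')")
after = conn.execute(QUERY).fetchall()

print(before)
print(after)
```

COUNT(DISTINCT P.fid) is used rather than COUNT(*) because the three-way join repeats each professor once per enrolled student.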


Aggregation and Table Expressions (aka Derived Relations)

Sometimes we need to compute results over the results of a previous aggregation:

SELECT subj, AVG(size)
FROM (
    SELECT C.cid AS id, C.subj AS subj, COUNT(S.sid) AS size
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid
    GROUP BY C.cid, C.subj
) sizes
GROUP BY subj


Thought Exercise…

Tables are great, but…
  Not everyone is uniform – I may have a cell phone but not a fax
  We may simply be missing certain information
  We may be unsure about values

How do we handle these things?


One Answer: Null Values

We designate a special “null” value to represent “unknown” or “N/A”

But a question: what does:

SELECT * FROM CONTACT WHERE Fax < '789-1111'

do?

CONTACT
Name    Home        Fax
Sam     123-4567    NULL
Li      234-8972    234-8766
Maria   789-2312    789-2121
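One way to explore the question is to run it. The sketch below uses Python's sqlite3 module (an assumption; any SQL DBMS should behave the same way here) on the CONTACT table above, and also runs the negated predicate and an IS NULL test for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CONTACT(Name TEXT, Home TEXT, Fax TEXT);
    INSERT INTO CONTACT VALUES ('Sam',   '123-4567', NULL),
                               ('Li',    '234-8972', '234-8766'),
                               ('Maria', '789-2312', '789-2121');
""")

def names(where):
    """Return the names of the CONTACT rows for which the predicate is true."""
    sql = f"SELECT Name FROM CONTACT WHERE {where} ORDER BY Name"
    return [row[0] for row in conn.execute(sql)]

below = names("Fax < '789-1111'")            # the query from the slide
not_below = names("NOT (Fax < '789-1111')")  # its negation
missing = names("Fax IS NULL")               # explicit test for null-ness

print(below, not_below, missing)
```

Sam appears in neither the query nor its negation: comparing NULL with a value is neither true nor false, and WHERE keeps only rows for which the predicate is true.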


Three-State Logic

Need ways to evaluate boolean expressions and have the result be “unknown” (or T/F)

Need ways of composing these three-state expressions using AND, OR, NOT:

T AND U = U    T OR U = T    NOT U = U
F AND U = F    F OR U = U
U AND U = U    U OR U = U

Can also test for null-ness: attr IS NULL, attr IS NOT NULL

Finally: need rules for arithmetic, aggregation
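The rules above can be checked against a real engine. This is a minimal sketch using Python's sqlite3 module, under the encoding assumption T = 1, F = 0, U = NULL (SQLite has no separate boolean type). The table R at the end is hypothetical and only illustrates how arithmetic and aggregates treat NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def sql_eval(expr):
    """Evaluate a constant SQL expression; SQL NULL comes back as Python None."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]

LABEL = {1: "T", 0: "F", None: "U"}

# Encoding assumption: T = 1, F = 0, U = NULL.
truth = {expr: LABEL[sql_eval(expr)] for expr in [
    "1 AND NULL", "0 AND NULL", "NULL AND NULL",
    "1 OR NULL", "0 OR NULL", "NULL OR NULL",
    "NOT NULL",
]}
for expr, value in truth.items():
    print(f"{expr:14} = {value}")

# Arithmetic: any operand that is NULL makes the result NULL.
arith = sql_eval("1 + NULL")

# Aggregation: COUNT(*) counts rows; COUNT(x), SUM(x), AVG(x) skip NULLs.
conn.executescript("""
    CREATE TABLE R(x INTEGER);   -- hypothetical table for illustration
    INSERT INTO R VALUES (10), (NULL), (20);
""")
aggregates = conn.execute("SELECT COUNT(*), COUNT(x), SUM(x), AVG(x) FROM R").fetchone()
print(arith, aggregates)
```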


Nulls and Joins

Sometimes need special variations of joins:
  I want to see all courses and their students
  … But what if there’s a course with no students?

Outer join:
  Most common is left outer join:

SELECT C.subj, C.cid, T.sid
FROM COURSE C LEFT OUTER JOIN Takes T ON C.cid = T.cid
WHERE …
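To see the difference an outer join makes, the sketch below runs the query above (without the elided WHERE clause) next to the corresponding inner join on the example instance. It uses Python's sqlite3 module as an assumed stand-in for the course DBMS, and only the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COURSE(cid TEXT, subj TEXT, sem TEXT);
    CREATE TABLE Takes(sid INTEGER, cid TEXT);
    INSERT INTO COURSE VALUES ('550-0105','DB','F05'), ('700-1005','AI','S05'),
                              ('501-0105','Arch','F05'), ('555-1006','Sys','S06');
    INSERT INTO Takes VALUES (1,'550-0105'), (1,'700-1005'), (3,'700-1005'),
                             (3,'501-0105'), (4,'501-0105');
""")

inner = conn.execute("""
    SELECT C.subj, C.cid, T.sid
    FROM COURSE C JOIN Takes T ON C.cid = T.cid
    ORDER BY C.cid, T.sid
""").fetchall()

outer = conn.execute("""
    SELECT C.subj, C.cid, T.sid
    FROM COURSE C LEFT OUTER JOIN Takes T ON C.cid = T.cid
    ORDER BY C.cid, T.sid
""").fetchall()

# The outer join keeps the Sys course, padding the missing Takes columns with NULL.
extra = [row for row in outer if row not in inner]
print(extra)
```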




Warning on Outer Join

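Oracle’s traditional syntax differs from standard SQL here (Oracle versions before 9i do not accept LEFT OUTER JOIN). It marks the side that may be padded with NULLs with (+):

SELECT C.subj, C.cid, T.sid
FROM COURSE C, Takes T
WHERE C.cid = T.cid (+)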


Beyond Null

Can have much more complex ideas of incomplete or approximate information
  Probabilistic models (tuple 80% likely to be an answer)
  Naïve tables (can have variables instead of NULLs)
  Conditional tables (tuple IF some condition holds)

… And what if you want “0 or more”?
  In relational databases, create a new table and foreign key
  But can have semistructured data (like XML)


Modifying the Database: Inserting Data

Inserting a new literal tuple is easy, if wordy:

INSERT INTO PROFESSOR(fid, name)
VALUES (4, 'Simpson')

But we can also insert the results of a query!

INSERT INTO PROFESSOR(fid, name)
    SELECT sid AS fid, name
    FROM STUDENT
    WHERE sid < 20
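Both forms of INSERT can be tried on the example instance. This is a minimal sketch with Python's sqlite3 module (an assumed stand-in for the course DBMS). It assumes PROFESSOR has no key constraint on fid; with a primary key on fid, the second statement would be rejected, because student ids 1 and 2 collide with existing professor ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT(sid INTEGER, name TEXT);
    CREATE TABLE PROFESSOR(fid INTEGER, name TEXT);  -- assumption: no key on fid
    INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin'), (4,'Marty');
    INSERT INTO PROFESSOR VALUES (1,'Ives'), (2,'Saul'), (8,'Martin');
""")

# Insert one literal tuple.
conn.execute("INSERT INTO PROFESSOR(fid, name) VALUES (4, 'Simpson')")

# Insert the result of a query: every student with sid < 20 becomes a professor.
conn.execute("""
    INSERT INTO PROFESSOR(fid, name)
        SELECT sid AS fid, name
        FROM STUDENT
        WHERE sid < 20
""")

professors = conn.execute(
    "SELECT fid, name FROM PROFESSOR ORDER BY fid, name").fetchall()
print(professors)
```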


Deleting Tuples

Deletion is a fairly simple operation:

DELETE
FROM STUDENT S
WHERE S.sid < 25


Updating Tuples

What kinds of updates might you want to do?

UPDATE STUDENT S
SET S.sid = 1 + S.sid, S.name = 'Janet'
WHERE S.name = 'Jane'
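The update above and the deletion from the previous slide can be run in sequence. The sketch below uses Python's sqlite3 module, again an assumption. It drops the table alias S, since not every DBMS accepts a qualified column on the left of SET, and it adds a hypothetical student (30, 'Jane') so that the UPDATE has a row to match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT(sid INTEGER, name TEXT);
    INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin'), (4,'Marty'),
                               (30,'Jane');  -- hypothetical row, not in the slides
""")

# Every right-hand side is evaluated against the row's old values.
conn.execute("UPDATE STUDENT SET sid = 1 + sid, name = 'Janet' WHERE name = 'Jane'")
after_update = conn.execute(
    "SELECT sid, name FROM STUDENT WHERE name LIKE 'Jane%'").fetchall()

# Delete every tuple that satisfies the predicate.
conn.execute("DELETE FROM STUDENT WHERE sid < 25")
after_delete = conn.execute("SELECT sid, name FROM STUDENT ORDER BY sid").fetchall()

print(after_update)
print(after_delete)
```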


Now, How Do I Talk to the DB?

Generally, apps are in a different (“host”) language with embedded SQL statements
  Static: SQLJ, embedded SQL in C
  Runtime: ODBC, JDBC, ADO, OLE DB, …

Typically, predefined mappings between host language types and SQL types (e.g., VARCHAR → String or char[])


Embedded SQL in C

EXEC SQL BEGIN DECLARE SECTION;
    int sid;
    char name[20];
EXEC SQL END DECLARE SECTION;
…
EXEC SQL INSERT INTO STUDENT VALUES (:sid, :name);

EXEC SQL SELECT sid, name
    INTO :sid, :name
    FROM STUDENT
    WHERE sid < 20;


The Impedance Mismatch and Cursors

SQL is set-oriented – it returns relations
There’s no relation type in most languages!
Solution: cursor that’s opened, read

DECLARE sinfo CURSOR FOR
    SELECT sid, name
    FROM STUDENT
…
OPEN sinfo;
while (…) {
    FETCH sinfo INTO :sid, :name
    …
}
CLOSE sinfo;
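The same open, fetch, close pattern shows up in most host-language database APIs. As an analogy only, not embedded SQL, the sketch below uses the cursor object of Python's sqlite3 module: the relation stays on the database side and the host program pulls one tuple at a time into ordinary variables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT(sid INTEGER, name TEXT);
    INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin'), (4,'Marty');
""")

seen = []

# Roughly "DECLARE sinfo CURSOR FOR ..." followed by "OPEN sinfo".
sinfo = conn.cursor()
sinfo.execute("SELECT sid, name FROM STUDENT ORDER BY sid")

while True:
    row = sinfo.fetchone()   # roughly "FETCH sinfo INTO :sid, :name"
    if row is None:          # no more tuples
        break
    sid, name = row          # host variables receive one tuple's values
    seen.append((sid, name))

sinfo.close()                # "CLOSE sinfo"
print(seen)
```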


JDBC: Dynamic SQL for Java

Roughly speaking, a Java version of ODBC
You’ll likely use this in the course project
See Chapter 6 of the text for more info

import java.sql.*;

Connection conn = DriverManager.getConnection(…);
PreparedStatement stmt = conn.prepareStatement("SELECT * FROM STUDENT");
…
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
    sid = rs.getInt(1);
    …
}


Database-Backed Web Sites

We all know traditional static HTML web sites:

[Figure: a Web-Browser sends an HTTP-Request (GET ...) to the Web-Server, which loads an HTML-File from the File-System and returns it]


DB-Driven Web Server

DB-Generated Web Sites (Strudel, Cocoon, …)

[Figure: an HTTP Request reaches the Web Server, which returns an HTML File. Labeled components: Local Database, Other data sources, Cache, Data, Styles, Script?, Dynamic HTML Generation, HTML]


Procedural Methods for D-HTML: Common Gateway Interface (CGI)

Can have the web server invoke code (with parameters) to generate HTML

[Figure: the Web Server receives an HTTP-Request and returns an HTML-File. It either loads a file from the File-System (File, HTML?) or executes a program (Program?, Output), which may use I/O, Network, DB]
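The CGI contract itself is small: the server passes request parameters to the program through environment variables (and standard input for POST bodies), and the program writes headers, a blank line, and the HTML to standard output. The sketch below shows the shape of such a program in Python. The function name cgi_response and the parameter name value1 are illustrative assumptions, and for brevity it reads a GET query string rather than a POST body.

```python
import html
import os
import sys
from urllib.parse import parse_qs

def cgi_response(environ):
    """Build a full CGI response (headers, blank line, body) for one request."""
    # For a GET request the server places the form data in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    value = params.get("value1", [""])[0]
    # Escape user input before embedding it in HTML.
    body = f"<html><body><p>You sent: {html.escape(value)}</p></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    # When run by a web server as a CGI program, os.environ carries the request.
    sys.stdout.write(cgi_response(os.environ))
```

Because the server starts a fresh process for each request, nothing in the program survives between requests.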


HTML Forms

<html>
<form action="http://my.com/cgi/my-cgi" method="POST">
  <input type="text" name="value1" />
  <input type="submit" value="Send" />
  <input type="reset" value="Cancel" />
</form>
</html>


CGI Pros and Cons

Advantages:
  Standardized: works for every web-server, browser
  Flexible: Any language (C++, Perl, Java, …) can be used

Disadvantages:
  Statelessness: query-by-query approach
  Inefficient: new process forked for every request
  Security: CGI programmer is responsible for security
  Updates: To update layout, one has to be a programmer

In general, CGI isn’t used very much today


DB Access with Java Applets and Server Processes

[Figure: a Java Applet in the Browser JVM communicates over TCP/UDP/IP with a Java-Server-Process (App. Server, EJB Layer), which reaches Sybase, Oracle, ... through JDBC-Drivers]


Java Applets: Discussion

Advantages:
  Can take advantage of client processing
  Platform independent – assuming standard Java

Disadvantages:
  Requires JVM on client; self-contained
  Inefficient: loading can take a long time ...
  Resource intensive: Client needs to be state of the art
  Restrictive: can only connect to server where applet was loaded from (for security … can be configured)


*SP Server Pages, PHP, and Servlets (IIS, Tomcat, WebSphere, WebLogic, …)

[Figure: the Web Server receives an HTTP Request and returns an HTML File. It either loads a file from the File-System (File, HTML?) or runs a script inside a Server Extension (Script?, Output), which may use I/O, Network, DB]


ASP/JSP/PHP Versus Servlets

The goal: combine direct HTML (or XML) output with program code that’s executed at the server

The code is responsible for generating more HTML, e.g., to output the results of a database table as HTML table elements

How might I do this?
  HTML with embedded code
  Code that prints out HTML
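The second option, code that prints out HTML, can be sketched independently of any particular server technology. The Python example below (the helper name rows_to_html_table is hypothetical, and sqlite3 is an assumed stand-in for the course DBMS) turns the result of a query into HTML table elements.

```python
import html
import sqlite3

def rows_to_html_table(headers, rows):
    """Render column names and result rows as an HTML table, escaping every value."""
    head = "".join(f"<th>{html.escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT(sid INTEGER, name TEXT);
    INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin'), (4,'Marty');
""")

cur = conn.execute("SELECT sid, name FROM STUDENT ORDER BY sid")
headers = [column[0] for column in cur.description]  # column names of the result
page = "<html><body>" + rows_to_html_table(headers, cur.fetchall()) + "</body></html>"
print(page)
```

The first option inverts this: the page is mostly literal HTML, and only the loop over the result rows is written as embedded code.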


ASP/JSP/PHP “Escapes”

<html>
<head><title>Sample</title></head>
<body>
<h1>Sample</h1>
<% myClass.Process(request.getParameter("test")); %>
<%= request.getParameter("value") %>
</body>
</html>


Servlets

class MyClass extends HttpServlet {
    public void doGet(HttpServletRequest req, HttpServletResponse res) … {
        res.getWriter().println("<html><head><title>Test</title></head></html>");
    }
}


Wrapping Up

We’ve seen how to query in SQL (DML)
  Basic foundation is TRC-based
  Subqueries and aggregation add extra power
  Nulls and outer joins add flexibility of representation
  We can update tables

We’ve seen that SQL doesn’t precisely match standard host language semantics
  Embedded SQL
  Dynamic SQL

Data-driven web sites