27
SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Embed Size (px)

Citation preview

Page 1: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

SQL - Part 2

Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Page 2: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

SELECT Statement Overview

SELECT <list of column expressions> FROM <list of tables and join operations> WHERE <list of logical expressions for rows> GROUP BY <list of grouping columns> HAVING <list of logical expressions for groups> ORDER BY <list of sorting specifications>

Expression: combination of columns, constants, operators, and functions

Page 3: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Example Relations Relations:

Emp (eno, ename, bdate, title, salary, supereno, dno)

Proj (pno, pname, budget, dno)

Dept (dno, dname, mgreno)

WorksOn (eno, pno, resp, hours)

Foreign keys: Emp: Emp.supereno to Emp.eno, Emp.dno to Dept.dno Proj: Proj.dno to Dept.dno Dept: Dept.mgreno to Emp.eno WorksOn: WorksOn.eno to Emp.eno, WorksOn.pno to

Proj.pno

Page 4: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Example Relation Instances

Page 5: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

But First Join RevisitedSELECT *FROM emp, deptWHERE emp.dno=dept.dno;

• Natural Join Style:How many columns in the output table?

SELECT * FROM emp NATURAL JOIN dept;

• Cross Product Style:How many columns in the output table?

• Alternative SQL92 styles:How many columns in each output table?

SELECT * FROM emp INNER JOIN dept

USING (dno);

SELECT *FROM emp INNER JOIN dept

ON emp.dno = dept.dno;

Page 6: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

More Join Practice Relational database schema:

Return a list of all department names, the names of the projects of that department, and the name of the manager of each department.

Return the names of all projects and the names of the employees who have worked on each project.

Return the names of all employees who are supervisors.

emp (eno, ename, bdate, title, salary, supereno, dno)proj (pno, pname, budget, dno)dept (dno, dname, mgreno)workson (eno, pno, resp, hours)

Page 7: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Ordering Result Data

The query result returned is not ordered on any attribute by default. We can order the data using the ORDER BY clause:

SELECT ename, salary, bdateFROM empWHERE salary > 30000ORDER BY salary DESC, ename ASC;

'ASC' sorts the data in ascending order, and 'DESC' sorts it in descending order. The default is 'ASC'.

The order of sorted attributes is significant. The first attribute specified is sorted on first, then the second attribute is used to break any ties, etc.

NULL is normally treated as less than all non-null values.

Page 8: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Aggregate Queries and Functions Several queries cannot be answered using the

simple form of the SELECT statement. These queries require a summary calculation to be performed. Examples: What is the maximum employee salary? What is the total number of hours worked on a project? How many employees are there in department 'D1'?

To answer these queries requires the use of aggregate functions. These functions operate on a single column of a table and return a single value.

Page 9: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Aggregate Functions

The five basic aggregate functions are: COUNT - returns the # of values in a column SUM - returns the sum of the values in a column AVG - returns the average of the values in a column MIN - returns the smallest value in a column MAX - returns the largest value in a column

Notes: COUNT, MAX, and MIN apply to all types of fields, whereas

SUM and AVG apply to only numeric fields. Except for COUNT(*) all functions ignore nulls. COUNT(*)

returns the number of rows in the table. Use DISTINCT to eliminate duplicates.

Page 10: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Aggregate Function Example

Return the number of employees and their average salary.

SELECT COUNT(eno) AS numEmp, AVG(salary) AS avgSalary

FROM emp;

Page 11: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Aggregate functions are often most useful when combined with the GROUP BY clause. The GROUP BY clause groups the tuples based on the values of the attributes specified.

When used in combination with aggregation functions, the result is a table where each tuple consists of unique values for the group by attributes and the result of the aggregate functions applied to the tuples of that group.

GROUP BY Clause

Page 12: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

GROUP BY Example

For each employee title, return the number of employees with that title, and the minimum, maximum, and average salary.

SELECT title, COUNT(eno) AS numEmp,MIN(salary) as minSal,MAX(salary) as maxSal, AVG(salary) AS avgSalFROM empGROUP BY title;

Result:

Page 13: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

GROUP BY Clause Rules

There are a few rules for using the GROUP BY clause:

1) A column name cannot appear in the SELECT part of the query unless it is part of an aggregate function or in the list of group by attributes. Note that the reverse is true: a column can be in the

GROUP BY without being in the SELECT part.

2) Any WHERE conditions are applied before the GROUP BY and aggregate functions are calculated.

Page 14: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

HAVING Clause

The HAVING clause is applied AFTER the GROUP BY clause and aggregate functions are calculated.

It is used to filter out entire groups that do not match certain criteria.

Page 15: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

HAVING Example

Return the title and number of employees of that title where the number of employees of the title is at least 2.

SELECT title, COUNT(eno) AS numEmp

FROM emp

GROUP BY title

HAVING COUNT(eno) >= 2;

Result:

Page 16: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

GROUP BY/HAVING Example For employees born after December 1, 1965, return

the average salary by department where the average is > 40,000.

SELECT dname, AVG(salary) AS avgSalFROM emp NATURAL JOIN deptWHERE emp.bdate > DATE ’1965-12-01'GROUP BY dnameHAVING AVG(salary) > 40000;

Step #1: Perform Join and Filter in WHERE clause

Page 17: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

GROUP BY/HAVING Example (2)Step #2: GROUP BY on dname

Step #3: Calculate aggregate functions

Step #4: Filter groups using HAVING clause

Page 18: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

GROUP BY Examples Return the average budget per project:

SELECT AVG(budget)FROM proj;

Return the average # of hours worked on each project:SELECT pno, AVG(hours)FROM worksonGROUP BY pno;

Return the departments that have projects with at least 2 'EE's working on them:

SELECT proj.dno, COUNT(*)FROM proj, workson, empWHERE emp.title = 'EE' and workson.eno=emp.enoand workson.pno = proj.pnoGROUP BY proj.dnoHAVING COUNT(*) >=2;

Page 19: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Multi-Attribute Example Return the employee number, department number

and hours the employee worked per department where the hours is >= 10.

SELECT W.eno, D.dno, SUM(hours)FROM workson AS W, dept AS D, proj AS PWHERE W.pno = P.pno and P.dno = D.dnoGROUP BY W.eno, D.dnoHAVING SUM(hours) >= 10;

Question:1) How would you only return recordsfor departments D2 and D3?

Result:

Page 20: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

GROUP BY Practice Questions Relational database schema:

Emp (eno, ename, bdate, title, salary, supereno, dno)Proj (pno, pname, budget, dno)Dept (dno, dname, mgreno)WorksOn (eno, pno, resp, hours)

Return the highest salary of any employee. For each project, return its name and the total number of hours

employees have worked on it. For each employee, return the total number of hours they have

worked. Calculate the average # of hours spent per project in each

department.

Page 21: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Conceptual Evaluation Process

FROM Tables:Cross Product and

Join Operations

GROUPBY?

Restrictionon WHEREConditions

Sort onGROUP BY

Columns

ComputeAggregatesand ReduceEach Group

to 1 Row

ORDERBY?

SortColumns inORDER BY

1

2

3

4

6

7

Yes

No

Yes

No

Finish

Restrictionon HAVINGConditions

5

ProjectColumns in

SELECT

Page 22: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Conceptual Evaluation Lessons Row operations before group operations

FROM and WHERE before GROUP BY and HAVING

Check row operations first Grouping occurs only one time

Page 23: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Conceptual Evaluation ProblemRelational database schema:

Student(StdSSN, StdFirstName, StdLastName, StdCity, StdState, StdMajor, StdClass, StdGPA, StdZip)

Faculty(FacSSN, FacFirstName, FacLastName, FacCity, FacState, FacDept, FacRank, FacSalary)

Faculty_1(FacSSN, FacSupervisor, FacHireDate, FacZipCode)

Offering(offerNo, CourseNo, OffTerm, OffTerm, OffYear, OffLocation, OffTime, FacSSN,OffDays)

Course(CourseNo, CrsDesc, CrsUnits)

Enrollment(OfferNo, StdSSN, EnrGrade)

Example 15 from your text: List the number of offerings taught in 2006 by faculty rank and department. Exclude combinations of faculty rank and department with less than two offerings taught.

Page 24: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Conceptual Evaluation Problem (cont)• List the number of offerings taught in 2006 by faculty

rank and department. Exclude combinations of faculty rank and department with less than two offerings taught.

SELECT FacRank, FacDept, COUNT(*) AS NumOfferings FROM Faculty, Offering WHERE Offering.FacSSN = Faculty.FacSSN AND OffYear = 2006 GROUP BY FacRank, FacDept HAVING COUNT(*) > 1

Page 25: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Query Formulation Process

Problem Statement

Problem Statement

Database Representation

Database Language Statement

Page 26: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Critical Questions

What tables? Columns in output Conditions to test (including join conditions)

How to combine the tables? Usually join PK to FK More complex ways to combine

Individual rows or groups of rows? Aggregate functions in output Conditions with aggregate functions

Page 27: SQL - Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa

Efficiency Considerations

Little concern for efficiency Intelligent SQL compilers Correct and non redundant solution

No extra tables No unnecessary grouping Use HAVING for group conditions only

Chapter 8 provides additional tips for avoiding inefficient SELECT statements