Advanced SQL Programming
Mark Holm
Centerfield Technology

Goals
• Introduce some useful advanced SQL programming techniques
• Show you how to let the database do more work to reduce programming effort
• Go over some basic techniques and tips to improve performance

Notes
• V4R3 and higher syntax used in examples
• Examples show only a small subset of what can be done!

Agenda
• Joining files - techniques, do’s and don’ts
• Query within a query - Subqueries
• Stacking data - Unions
• Simplifying data with Views
• Referential Integrity and constraints
• Performance, performance, performance

Joining files
• Joins are used to relate data from different tables
• Data can be retrieved with one “open file” rather than many
• Concept is identical to join logical files without an associated permanent object (except if the join is done with an SQL view)

Join types
• Inner Join
  – Used to find related data
• Left Outer (or simply Outer) Join
  – Used to find related data and ‘orphaned’ rows
• Exception Join
  – Used to only find ‘orphaned’ rows
• Cross Join
  – Join all rows to all rows

Sample tables
Employee table
FirstName  LastName  Dept
John       Doe       397
Cindy      Smith     450
Sally      Anderson  250

Department table
Dept  Area
397   Development
550   Marketing
250   Sales

Inner Join
• Method #1 - Using the WHERE Clause
SELECT LastName, Area FROM Employee, Department WHERE Employee.Dept = Department.Dept
• Method #2 - Using the JOIN Clause
SELECT LastName, Area FROM Employee INNER JOIN Department ON Employee.Dept = Department.Dept
NOTE: The JOIN clause method is useful if you need to influence the order in which the tables are joined for performance reasons. This only works on releases prior to V4R4.

Results
Result table
LastName  Area
Doe       Development
Anderson  Sales
• Return list of employees that are in a valid department.
• Employee ‘Smith’ is not returned because she is not in a department listed in the ‘Department’ table.

Left Outer Join
SELECT LastName, Area FROM Employee LEFT OUTER JOIN Department ON Employee.Dept = Department.Dept
• Must use Join Syntax
Results
Result table
LastName  Area
Doe       Development
Smith     -
Anderson  Sales
• Return list of employees even if they are not in a valid department
• Employee ‘Smith’ has a NULL Area because she could not be associated with a valid Dept

Exception Join
SELECT LastName, Area FROM Employee EXCEPTION JOIN Department ON Employee.Dept = Department.Dept
• Must use Join Syntax
Results
Result table
LastName  Area
Smith     -
• Return list of employees only if they are NOT in a valid department
• Employee ‘Smith’ is the only one without a valid department

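The four join types can be tried side by side on the sample tables. The sketch below is only an illustration: it assumes SQLite (through Python's sqlite3 module) as a stand-in for DB2/400, and since SQLite has no EXCEPTION JOIN keyword, that join is emulated with a left outer join that keeps only the unmatched rows.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for DB2/400.
# The rows are the ones shown on the "Sample tables" slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (FirstName TEXT, LastName TEXT, Dept INTEGER);
    CREATE TABLE Department (Dept INTEGER, Area TEXT);
    INSERT INTO Employee VALUES
        ('John', 'Doe', 397), ('Cindy', 'Smith', 450), ('Sally', 'Anderson', 250);
    INSERT INTO Department VALUES
        (397, 'Development'), (550, 'Marketing'), (250, 'Sales');
""")

JOIN_ON = "ON Employee.Dept = Department.Dept"


def rows(sql):
    """Run a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


inner = rows(
    f"SELECT LastName, Area FROM Employee INNER JOIN Department {JOIN_ON} "
    "ORDER BY LastName"
)
left_outer = rows(
    f"SELECT LastName, Area FROM Employee LEFT OUTER JOIN Department {JOIN_ON} "
    "ORDER BY LastName"
)
# SQLite has no EXCEPTION JOIN, so keep only the unmatched rows of a left outer join.
exception = rows(
    f"SELECT LastName, Area FROM Employee LEFT OUTER JOIN Department {JOIN_ON} "
    "WHERE Department.Dept IS NULL"
)
# A cross join pairs every employee with every department: 3 x 3 rows.
cross_count = rows("SELECT COUNT(*) FROM Employee CROSS JOIN Department")[0][0]

print(inner)
print(left_outer)
print(exception)
print(cross_count)
```

Python returns the NULL Area for ‘Smith’ as None.
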
WARNING!
• The order in which tables are listed in the FROM clause is important
• For OUTER and EXCEPTION joins, the database must join the tables in that order.
• The result may be horrible performance… more on this topic later

Observations
• Joins provide one way to bury application logic in the database
• Each join type has a purpose and can be used to not only get the data you want but identify “incomplete” information
• With some exceptions, if joined properly performance should be at least as good as an application

Subqueries
• Subqueries are a powerful way to select only the data you need without separate statements.
• Example: List employees making a higher than average salary

Subquery Example
SELECT FNAME, LNAME FROM EMPLOYEE
  WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE)

SELECT FNAME, LNAME FROM EMPLOYEE
  WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE WHERE LNAME = 'JONES')

Subqueries - types
• Correlated
  – Inner select refers to part of the outer (parent) select (multiple evaluations)
• Non-Correlated
  – Inner select does not relate to outer query (one evaluation)

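The difference between the two subquery types shows up when they answer slightly different questions. The sketch below assumes SQLite via Python's sqlite3 module rather than DB2/400, and the employee table with its dept column and rows is invented for illustration. The non-correlated query compares each salary with the company-wide average; the correlated one compares it with the average of the employee's own department.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for DB2/400; table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (fname TEXT, lname TEXT, dept INTEGER, salary INTEGER);
    INSERT INTO employee VALUES
        ('Ann', 'Jones', 10, 80000),
        ('Bob', 'Jones', 10, 100000),
        ('Cal', 'Smith', 20, 30000),
        ('Dee', 'Brown', 20, 50000);
""")

# Non-correlated: the inner SELECT does not mention the outer row,
# so its result (the overall average, 65000) can be computed once.
above_company_avg = conn.execute("""
    SELECT fname FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee)
    ORDER BY fname
""").fetchall()

# Correlated: the inner SELECT refers to e.dept from the outer row,
# so it is logically re-evaluated for each employee.
above_dept_avg = conn.execute("""
    SELECT fname FROM employee e
    WHERE salary > (SELECT AVG(salary) FROM employee WHERE dept = e.dept)
    ORDER BY fname
""").fetchall()

print(above_company_avg)
print(above_dept_avg)
```
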
Subquery Tips 1
• Subquery optimization (2nd statement will be faster)
  – SELECT name FROM employee WHERE salary > ALL (SELECT salary FROM salscale)
  – SELECT name FROM employee WHERE salary > (SELECT max(salary) FROM salscale)

Subquery Tips 2
• Subquery optimization (2nd statement will be faster)
  – SELECT name FROM employee WHERE salary IN (SELECT salary FROM salscale)
  – SELECT name FROM employee WHERE EXISTS (SELECT salary FROM salscale WHERE employee.salid = salscale.salid)

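Before swapping an IN subquery for EXISTS, it is worth checking that both forms return the same rows. The sketch below assumes SQLite via Python's sqlite3 module as a stand-in for DB2/400, with invented rows. Unlike the slide, both statements here match on salid so that they are directly comparable; the sketch makes no claim about which form is faster on a given system.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for DB2/400; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, salid INTEGER);
    CREATE TABLE salscale (salid INTEGER, salary INTEGER);
    INSERT INTO employee VALUES ('Ann', 1), ('Bob', 2), ('Cal', 9);
    INSERT INTO salscale VALUES (1, 30000), (2, 50000);
""")

# IN form: build the list of valid salid values, then test membership.
in_rows = conn.execute("""
    SELECT name FROM employee
    WHERE salid IN (SELECT salid FROM salscale)
    ORDER BY name
""").fetchall()

# EXISTS form: for each employee, probe salscale for a matching salid.
exists_rows = conn.execute("""
    SELECT name FROM employee
    WHERE EXISTS (SELECT 1 FROM salscale WHERE employee.salid = salscale.salid)
    ORDER BY name
""").fetchall()

print(in_rows)
print(exists_rows)
```
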
UNIONs
• Unions provide a way to append multiple row sets in one statement
• Example: Process all of the orders from January and February

SELECT * FROM JanOrders WHERE SKU = 199976
UNION
SELECT * FROM FebOrders WHERE SKU = 199976

Unions
• Each SELECT statement that is UNIONed together must have the same number of result columns and have compatible types
• Two forms of syntax
  – UNION ALL -- allow duplicate records
  – UNION -- return only distinct rows

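The effect of the two forms is easiest to see when the same row appears in both inputs. This is a minimal sketch, assuming SQLite via Python's sqlite3 module instead of DB2/400; the Customer column and the rows are made up for illustration, while the table names and SKU come from the example above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for DB2/400; columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE JanOrders (Customer TEXT, SKU INTEGER);
    CREATE TABLE FebOrders (Customer TEXT, SKU INTEGER);
    INSERT INTO JanOrders VALUES ('Acme', 199976), ('Bolt', 199976), ('Acme', 100001);
    INSERT INTO FebOrders VALUES ('Acme', 199976), ('Cody', 199976);
""")

QUERY = """
    SELECT * FROM JanOrders WHERE SKU = 199976
    {op}
    SELECT * FROM FebOrders WHERE SKU = 199976
"""

# UNION removes the duplicate ('Acme', 199976) row; UNION ALL keeps both copies.
# Results are sorted in Python because neither statement has an ORDER BY.
distinct_rows = sorted(conn.execute(QUERY.format(op="UNION")).fetchall())
all_rows = sorted(conn.execute(QUERY.format(op="UNION ALL")).fetchall())

print(len(distinct_rows), distinct_rows)
print(len(all_rows), all_rows)
```
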
Views
• Views provide a convenient way to store SQL logic permanently
• Create once and use many times
• Also make the database more understandable to users
• Can put simple business rules into views to ensure consistency

Views
Example: Make it easy for the human resources department to run a report that shows ‘new’ employees.

CREATE VIEW HR/NEWBIES (EMPLOYEE_NAME, DEPARTMENT, HIRE_DATE) AS
  SELECT concat(concat(strip(last_name),','),strip(first_name)),
         department, hire_date
  FROM HR/EMPLOYEE
  WHERE (year(current date)-year(hire_date)) < 2

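Once a view exists, a report only needs to select from it. The sketch below is an approximation rather than the DB2/400 statement: it assumes SQLite via Python's sqlite3 module, uses trim() and || in place of strip() and concat(), and replaces the "hired within two years" test with a fixed cut-off date so the result does not depend on when it is run. The rows are invented.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for DB2/400; rows and cut-off date are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (first_name TEXT, last_name TEXT,
                           department INTEGER, hire_date TEXT);
    INSERT INTO employee VALUES
        ('John  ', 'Doe     ', 397, '1999-06-01'),
        ('Sally ', 'Anderson', 250, '1990-03-15');

    -- The name formatting and the "new employee" rule live in the view, not in each report.
    CREATE VIEW newbies (employee_name, department, hire_date) AS
        SELECT trim(last_name) || ',' || trim(first_name), department, hire_date
        FROM employee
        WHERE hire_date >= '1999-01-01';
""")

# The report query needs no knowledge of how a 'new' employee is defined.
report = conn.execute("SELECT employee_name, department FROM newbies").fetchall()
print(report)
```
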
Performance
• SQL performance is harder to predict and tune than native I/O.
• SQL provides a powerful way to manipulate data but you have little control over HOW it does it.
• Query optimizer takes responsibility for doing it ‘right’.

Performance - diagnosis
• Getting information about how the optimizer processed a query is crucial
• Can be done via one or all of the following:
  – STRDBG: debug messages in job log
  – STRDBMON: optimizer info put in file
  – QAQQINI: can be used to force messages
  – CHGQRYA: messages put out when time limit set to 0

Performance tips
• Create indexes
  – Over columns that significantly limit data in WHERE clause
  – Over columns that join tables together
  – Over columns used in ORDER BY and GROUP BY clauses

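The effect of an index over a selective WHERE column can be observed by asking the optimizer for its plan before and after creating the index. The sketch below assumes SQLite via Python's sqlite3 module; its EXPLAIN QUERY PLAN statement is used here as a rough analogue of the DB2/400 optimizer messages, and the table, rows, and index name idx_emp_dept are invented.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for DB2/400; table and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(f"emp{i}", i % 50, 30000 + i) for i in range(1000)],
)

QUERY = "SELECT name FROM employee WHERE dept = 7"


def plan(sql):
    """Return the optimizer's plan text (last column of EXPLAIN QUERY PLAN)."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(QUERY)  # no index yet, so the whole table is scanned
conn.execute("CREATE INDEX idx_emp_dept ON employee (dept)")
plan_after = plan(QUERY)   # the optimizer can now search the index on dept

print(plan_before)
print(plan_after)
```

The exact plan wording varies between SQLite versions, so only the presence of the index name in the second plan should be relied on.
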
Performance tips
• Create Encoded Vector Indexes (EVI’s)
  – Most useful in heavy query environments with a lot of data (e.g. large data warehouses)
  – Helps queries that process between 20-60% of a table’s data
  – Create over columns with a modest number of distinct values and those with data skew
  – EVI’s bridge the gap between traditional indexes and table scans

Performance tips
• Encourage optimizer to use indexes
  – Use keyed columns in WHERE clause if possible
  – Use ANDed conditions as much as possible
  – OPTIMIZE FOR n ROWS
  – Don’t do things that eliminate index use
      • Data conversion (binary-key = 1.5)
      • LIKE clause w/leading wildcard (NAME LIKE '%JOE')

Performance tips
• Keep statements simple
  – Complex statements are much more difficult to optimize
  – Provide more opportunity for the optimizer to choose a sub-optimal plan of attack

Performance tips
• Enable DB2 to use parallelism
  – Query processed by many tasks (CPU parallelism) or by getting data from many disks at once (I/O parallelism)
  – CPU parallelism requires IBM’s SMP feature and a machine with multiple processors
  – Enabled via the QQRYDEGREE system value, CHGQRYA, or the QAQQINI file

Other useful features
• CASE clause - conditional calculations
• ALIAS - access to multi-member files
• Primary/Foreign keys - referential integrity
• Constraints

CASE
Conditional calculations with CASE

SELECT Warehouse, Description,
  CASE RegionCode
    WHEN 'E' THEN 'East Region'
    WHEN 'S' THEN 'South Region'
    WHEN 'M' THEN 'Midwest Region'
    WHEN 'W' THEN 'West Region'
  END
FROM Locations

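One detail of the statement above is what happens to a code that matches no WHEN branch: with no ELSE, CASE yields NULL. The sketch below illustrates this, assuming SQLite via Python's sqlite3 module in place of DB2/400; the warehouse rows, including the unknown code 'X', are invented.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for DB2/400; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Locations (Warehouse TEXT, Description TEXT, RegionCode TEXT);
    INSERT INTO Locations VALUES
        ('W1', 'Main depot', 'E'),
        ('W2', 'Overflow', 'W'),
        ('W3', 'Annex', 'X');
""")

# Same CASE expression as on the slide; 'X' matches no WHEN and there is no ELSE.
regions = conn.execute("""
    SELECT Warehouse,
           CASE RegionCode
               WHEN 'E' THEN 'East Region'
               WHEN 'S' THEN 'South Region'
               WHEN 'M' THEN 'Midwest Region'
               WHEN 'W' THEN 'West Region'
           END
    FROM Locations
    ORDER BY Warehouse
""").fetchall()

print(regions)
```

Adding an ELSE branch (for example ELSE 'Unknown Region') is one way to avoid the NULL.
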
CASE
Avoiding calculation errors (e.g. division by 0)

SELECT Warehouse, Description,
  CASE NumInStock
    WHEN 0 THEN NULL
    ELSE CaseUnits/NumInStock
  END
FROM Inventory

ALIAS names
• The CREATE ALIAS statement creates an alias on a table, view, or member of a database file.
  – CREATE ALIAS alias-name FOR table (member)
• Example: Create an alias over the second member of a multi-member physical file
  – CREATE ALIAS February FOR MonthSales (February)

Referential Integrity
• Keeps two or more files in sync with each other
• Ensures that child rows have parents
• Can also be used to automatically delete children when parents are deleted

Referential Integrity Rules
• A row inserted into a child table must have a parent row (typically in another table).
• Parent rules
  – A parent row cannot be deleted if there are dependent children (Restrict rule) OR
  – All children are also deleted (Cascade rule) OR
  – All children’s foreign keys are changed (Set Null and Set Default rules)

[Diagram: Parent table (Primary Key) linked to Child table (Foreign Key). Primary key must be unique.]

Referential Integrity syntax
ALTER TABLE Hr/Employee
  ADD CONSTRAINT EmpPK PRIMARY KEY (EmployeeId)

ALTER TABLE Hr/Department
  ADD CONSTRAINT EmpFK FOREIGN KEY (EmployeeId)
  REFERENCES Hr/Employee (EmployeeId)
  ON DELETE CASCADE
  ON UPDATE RESTRICT

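The insert rule and the Cascade delete rule can be watched in action on a small parent/child pair. This sketch assumes SQLite via Python's sqlite3 module rather than DB2/400, so foreign keys are declared inline and must be switched on with a PRAGMA. It also assumes a different layout from the slide: Department is the parent and Employee the child, matching the earlier sample tables.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for DB2/400; the parent/child layout is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Department (Dept INTEGER PRIMARY KEY, Area TEXT);
    CREATE TABLE Employee (
        LastName TEXT,
        Dept INTEGER REFERENCES Department (Dept) ON DELETE CASCADE
    );
    INSERT INTO Department VALUES (397, 'Development'), (250, 'Sales');
    INSERT INTO Employee VALUES ('Doe', 397), ('Anderson', 250);
""")

# Insert rule: a child row without a parent row is rejected.
orphan_rejected = False
try:
    conn.execute("INSERT INTO Employee VALUES ('Smith', 450)")
except sqlite3.IntegrityError:
    orphan_rejected = True

# Cascade rule: deleting the parent row also deletes its children.
conn.execute("DELETE FROM Department WHERE Dept = 397")
remaining = conn.execute("SELECT LastName FROM Employee").fetchall()

print(orphan_rejected)
print(remaining)
```
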
Check Constraints
Rules which limit the allowable values in one or more columns:

CREATE TABLE Employee
  (FirstName CHAR(20),
   LastName CHAR(30),
   Salary DECIMAL(9,2) CHECK (Salary>0 AND Salary<200000))  -- data type assumed; the slide omits it

Check Constraints
• Effectively does data checking at the database level.
• Data checking done with display files or application logic can now be done at the database level.
• Ensures that it is always done and closes “back doors” like DFU, ODBC, third-party utilities….

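A check constraint rejects bad data no matter which program or utility issues the insert. The sketch below shows the idea under the assumption that SQLite (via Python's sqlite3 module) stands in for DB2/400; the column types and the rows are illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for DB2/400; types and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee (
        FirstName TEXT,
        LastName TEXT,
        Salary INTEGER CHECK (Salary > 0 AND Salary < 200000)
    )
""")

# A salary inside the allowed range is accepted.
conn.execute("INSERT INTO Employee VALUES ('John', 'Doe', 50000)")

# A salary outside the range is rejected by the database itself.
bad_row_rejected = False
try:
    conn.execute("INSERT INTO Employee VALUES ('Cindy', 'Smith', -5)")
except sqlite3.IntegrityError:
    bad_row_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
print(bad_row_rejected, row_count)
```
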
Other resources
Database Design and Programming for DB2/400 - book by Paul Conte
SQL for Smarties - book by Joe Celko
SQL Tutorial - www.as400network.com
AS/400 DB2 web site at http://www.as400.ibm.com/db2/db2main.htm
Publications at http://publib.boulder.ibm.com/pubs/html/as400/
Our web site at http://www.centerfieldtechnology.com

Summary
• SQL is a powerful way to access and process data
• Used effectively, it can reduce the time it takes to build applications
• Once tuned, it can perform very close to (and sometimes better than) HLLs alone

Good Luck and
Happy SQLing