AgendaAgenda
AdministriviaAdministrivia Physical Database DesignPhysical Database Design
Database IntegrityDatabase Integrity PerformancePerformance
Regrade Requests Exam SQLRegrade Requests Exam SQL
Create DatabaseCreate Database Enter query(s) as submittedEnter query(s) as submitted Submit to meSubmit to me
Database (electronic)Database (electronic) Graded homework (paper)Graded homework (paper)
Reserve the right to change test data Reserve the right to change test data and reexecute queryand reexecute query
Foreign KeysForeign Keys
Inserts require all FK values be the Inserts require all FK values be the value of a primary key in the reference value of a primary key in the reference tabletable
Update and delete constraints are also Update and delete constraints are also possiblepossible
Referential IntegrityReferential Integrity
ON DELETE CASCADE/RESTRICT/SET ON DELETE CASCADE/RESTRICT/SET NULLNULL
ON UPDATE CASCADE/RESTRICT/SET ON UPDATE CASCADE/RESTRICT/SET NULLNULL
DefaultDefault ON DELETE RESTRICTON DELETE RESTRICT ON UPDATE CASCADEON UPDATE CASCADE
ExampleExample
CREATE TABLE PCAccessCREATE TABLE PCAccess
(PC#(PC# INTEGER,INTEGER,
EmpIDEmpID CHAR(9),CHAR(9),
AccessTypeAccessType CHAR(15),CHAR(15),
PRIMARY KEY (PC#, EmpID),PRIMARY KEY (PC#, EmpID),
FOREIGN KEY (EmpID) REFERENCES FOREIGN KEY (EmpID) REFERENCES (Employee),(Employee),
FOREIGN KEY (PC#) REFERENCES (PC))FOREIGN KEY (PC#) REFERENCES (PC))
Example - PCAccess TableExample - PCAccess Table
PC# EmpID AccessType
1 1 Full
1 2 Restricted
2 1 Full
3 4 Full
3 2 Semi-Restricted
Example #1Example #1
PC# EmpID AccessType1 1 Full1 2 Restricted2 1 Full3 4 Full3 2 Semi-Restricted
INSERT INTO PCAccess (PC#)
VALUES (4)
Example #2Example #2
PC# EmpID AccessType1 1 Full1 2 Restricted2 1 Full3 4 Full3 2 Semi-Restricted
INSERT INTO PCAccess (PC#, EmpID)
VALUES (4,5)
Example #3Example #3
PC# EmpID AccessType1 1 Full1 2 Restricted2 1 Full3 4 Full3 2 Semi-Restricted
UPDATE Employee
SET EmpID = 10
WHERE EmpID = 1
Example #4Example #4
PC# EmpID AccessType1 1 Full1 2 Restricted2 1 Full3 4 Full3 2 Semi-Restricted
UPDATE PCAccess
SET EmpID = 10
WHERE EmpID = 1
Example #5Example #5
PC# EmpID AccessType1 1 Full1 2 Restricted2 1 Full3 4 Full3 2 Semi-Restricted
DELETE FROM Employee
WHERE EmpID = 2
Example #6Example #6
PC# EmpID AccessType1 1 Full1 2 Restricted2 1 Full3 4 Full3 2 Semi-Restricted
DELETE FROM PCAccess
WHERE EmpID = 2
Example #7Example #7
PC# EmpID AccessType1 1 Full1 2 Restricted2 1 Full3 4 Full3 2 Semi-Restricted
DELETE FROM PC
WHERE PC# = 3
CascadingCascading
Chain followed until the endChain followed until the end Especially for deletesEspecially for deletes
If mix of CASCADE, RESTRICT, SET If mix of CASCADE, RESTRICT, SET NULLNULL Will get all or nothingWill get all or nothing
Update & Delete ConstraintsUpdate & Delete Constraints
CREATE TABLE T1CREATE TABLE T1(A(A CHAR(5)CHAR(5) BB CHAR(5)CHAR(5)PRIMARY KEY (A,B)PRIMARY KEY (A,B)FOREIGN KEY (A) REFERENCES (T2)FOREIGN KEY (A) REFERENCES (T2)
ON DELETE RESTRICTON DELETE RESTRICTON UPDATE RESTRICT)ON UPDATE RESTRICT)
CREATE TABLE T2CREATE TABLE T2(C(C CHAR(5)CHAR(5) DD VARCHAR(30) PRIMARY KEY (C))VARCHAR(30) PRIMARY KEY (C))
ConstraintsConstraints
A BX1 Y1X1 Y2X1 A1X1 A4X2 A4X3 Y5X3 Y1
T1 C D
X1 X1 DescX2 X2 DescX3 X3 Desc
T2
Want to update the value of X1 to be X11.
What has to happen?
PerformancePerformance
Requires Knowledge ofRequires Knowledge of DBMSDBMS ApplicationsApplications DataData Users & ExpectationsUsers & Expectations EnvironmentEnvironment
Performance ClassesPerformance Classes
OLTPOLTP On-Line Transaction ProcessingOn-Line Transaction Processing
OLAPOLAP On-Line Analytic ProcessingOn-Line Analytic Processing
Mix of OLTP and OLAPMix of OLTP and OLAP
OLTPOLTP
Throughput DrivenThroughput Driven Throughput - number of transactions Throughput - number of transactions
per unit of timeper unit of time Lots of TransactionsLots of Transactions Mix of Update and QueryMix of Update and Query High ConcurrencyHigh Concurrency
OLAPOLAP
Response Time DrivenResponse Time Driven Response Time - single transactionResponse Time - single transaction
Very Large, Possibly Complex, Very Large, Possibly Complex, TransactionsTransactions
Query Evaluation and OptimizationQuery Evaluation and Optimization
Performance TuningPerformance Tuning
Consider the Mix of OLTP & OLAPConsider the Mix of OLTP & OLAP Interference Between TypesInterference Between Types
Example:Example:Single daily large analytic transaction, rest Single daily large analytic transaction, rest simple transactions, locking could prevent simple transactions, locking could prevent others from running.others from running.
Tuning LevelsTuning Levels
DBMSDBMS HardwareHardware DesignDesign
Interactions Between LevelsInteractions Between Levels
DBMS Parameter TuningDBMS Parameter Tuning
Specific to DBMSSpecific to DBMS Buffers - Buffer PoolBuffers - Buffer Pool Logging - CheckpointsLogging - Checkpoints Lock ManagementLock Management Space Allocation - Log, Data, FreespaceSpace Allocation - Log, Data, Freespace Thread ManagementThread Management Operating System TuningOperating System Tuning
Hardware TuningHardware Tuning
MemoryMemory CPUCPU DiskDisk
RAIDRAID Number of DrivesNumber of Drives PartitioningPartitioning
Architecture -- Parallel Systems?Architecture -- Parallel Systems?
RAIDRAID
Redundant Array of Inexpensive DisksRedundant Array of Inexpensive Disks
Appears as single diskAppears as single disk Physical storage difference - no Physical storage difference - no
database differencesdatabase differences Increase performanceIncrease performance Provide recovery from disk failureProvide recovery from disk failure
Negative Effect of RAIDNegative Effect of RAID
MTBF (mean time between failures)MTBF (mean time between failures) Increase by factor = # of drives usedIncrease by factor = # of drives used
# Drives MTBF1 730 days2 3654 182 ½8 91 ¼
How RAID WorksHow RAID Works
Striping - dividing equally across all disksStriping - dividing equally across all disks
1 2 3 4
1
5
9
2
6
10
3
7
11
4
8
12
Stripe 1
Stripe 2
Stripe 3
Stripe n
RAID LevelsRAID Levels
RAID-0RAID-0 RAID-1RAID-1 RAID-2RAID-2 RAID-3RAID-3 RAID-4RAID-4 RAID-5RAID-5 RAID-6RAID-6
RAID-0RAID-0
All disks store unique dataAll disks store unique data Very fastVery fast No fault tolerance or recoveryNo fault tolerance or recovery
1 2 3 4
1
5
9
2
6
10
3
7
11
4
8
12
RAID-1RAID-1
Fully RedundantFully Redundant Faster Reads/Slower WritesFaster Reads/Slower Writes High fault tolerance -- easy recoveryHigh fault tolerance -- easy recovery
1 2 3 4
1
3
5
2
4
6
1
3
5
2
4
6
RAID-2RAID-2
Each record spans all drivesEach record spans all drives Some disks store ECC (error correction codes)Some disks store ECC (error correction codes) Parity checks allow error detection and correctionParity checks allow error detection and correction
1 2 3 4
1a
2a
3a
1b
2b
3b
ECCECCECC
ECCECCECC
RAID-3RAID-3
Each record spans all drivesEach record spans all drives One disk stores ECCOne disk stores ECC Single-UserSingle-User
1 2 3 4
1a
2a
3a
1b
2b
3b
1c
2c
3c
ECCECCECC
RAID-4RAID-4
Each record stored on a single diskEach record stored on a single disk One drive for ECCOne drive for ECC Multi-user reads; Single-User writesMulti-user reads; Single-User writes
1 2 3 4
1
4
7
2
5
8
3
6
9
ECCECCECC
RAID-5 (Rotating Parity Array)RAID-5 (Rotating Parity Array)
Drive has both data and ECCDrive has both data and ECC ECCs rotate to different driveECCs rotate to different drive Multi-user reads and writesMulti-user reads and writes
1 2 3 4
1ECC
2
4
3
5
ECC
6
ECC
ECC
ECC
6 7 8
11109
12 13 14
RAID-6 P+Q RedundancyRAID-6 P+Q Redundancy
P - “parity”P - “parity” Q - “extra parity”Q - “extra parity” 2 bits of ECC per 4 bits of data2 bits of ECC per 4 bits of data Handles multiple disk failuresHandles multiple disk failures Reed-Solomon codesReed-Solomon codes
Introduction to the Theory of Error-Introduction to the Theory of Error-Correcting CodesCorrecting Codes, Pless (1989), Pless (1989)
Your Mileage May VaryYour Mileage May Vary
“We note that numerous improvements have been proposed to the basic RAID schemes described here. As a result, sometimes there is confusion about the exact definitions of the different RAID levels.”
RAID UsageRAID Usage
1, 3, and 5 outperform others1, 3, and 5 outperform others
RAID-1 - fastest, no storage cost, but not RAID-1 - fastest, no storage cost, but not fault tolerantfault tolerant
RAID-3 - single-user onlyRAID-3 - single-user only RAID-5 - higher speed than single disk, RAID-5 - higher speed than single disk,
fault tolerant, multi-user, but some fault tolerant, multi-user, but some storage cost and slower write timesstorage cost and slower write times
Transaction TuningTransaction Tuning
The DBMS optimizes so why worry?The DBMS optimizes so why worry?
An optimized poorly written transaction An optimized poorly written transaction can always be outperformed by a well-can always be outperformed by a well-written nonoptimized one.written nonoptimized one.
EXPLAIN (DB2)EXPLAIN (DB2) What did the optimizer come up with?What did the optimizer come up with?
Transaction TuningTransaction Tuning
Distributed DatabasesDistributed Databases Client-ServerClient-Server
Network performance becomes an Network performance becomes an additional concernadditional concern
Transaction TuningTransaction Tuning
DBA participation in program reviews DBA participation in program reviews and walkthroughsand walkthroughs
Continuous MonitoringContinuous Monitoring
Transaction Tuning HeuristicsTransaction Tuning Heuristics
Single query instead of multiple queriesSingle query instead of multiple queries ““multiple” includes sub-queriesmultiple” includes sub-queries
Avoid long-running transactionsAvoid long-running transactions Avoid large quantities of updatesAvoid large quantities of updates
Locking and loggingLocking and logging Reduce number of tables joinedReduce number of tables joined
Transaction Tuning HeuristicsTransaction Tuning Heuristics
Reduce sortingReduce sorting Return less data rather than moreReturn less data rather than more
Don’t shift logic from query to programDon’t shift logic from query to program Optimizer is likely to be fasterOptimizer is likely to be faster Less data is returnedLess data is returned
Transaction Tuning ExampleTransaction Tuning Example
Get the names of all managers whose Get the names of all managers whose offices have property listed in Pgh.offices have property listed in Pgh.
SELECT * SELECT *
FROM Property as P, Office as O, Manager as M, FROM Property as P, Office as O, Manager as M, Employee as E Employee as E
WHERE P.OfficeNbr = O.OfficeNbr AND O.OfficeNbr WHERE P.OfficeNbr = O.OfficeNbr AND O.OfficeNbr = M.OfficeNbr AND M.EmpID = E.EmpID AND = M.OfficeNbr AND M.EmpID = E.EmpID AND
PropertyID IN (SELECT PropertyID FROM Property as PropertyID IN (SELECT PropertyID FROM Property as P2 WHERE P2.City = ‘Pgh’ AND P2.OfficeNbr = P2 WHERE P2.City = ‘Pgh’ AND P2.OfficeNbr = M.OfficeNbr)M.OfficeNbr)
Transaction Tuning ExampleTransaction Tuning Example
SELECT SELECT **
FROM Property as P, Office as O, Manager as M, Employee as FROM Property as P, Office as O, Manager as M, Employee as E E
WHERE P.OfficeNbr = O.OfficeNbr AND O.OfficeNbr = WHERE P.OfficeNbr = O.OfficeNbr AND O.OfficeNbr = M.OfficeNbr AND M.EmpID = E.EmpID AND M.OfficeNbr AND M.EmpID = E.EmpID AND
PropertyID IN (SELECT PropertyID FROM Property as P2 PropertyID IN (SELECT PropertyID FROM Property as P2 WHERE P2.City = ‘Pgh’ AND P2.OfficeNbr = M.OfficeNbr)WHERE P2.City = ‘Pgh’ AND P2.OfficeNbr = M.OfficeNbr)
More is selected than is needed.More is selected than is needed.
Transaction Tuning ExampleTransaction Tuning Example
SELECT * SELECT *
FROM FROM Property as P, Office as O, Manager as M, Employee as Property as P, Office as O, Manager as M, Employee as EE
WHERE P.OfficeNbr = O.OfficeNbr AND O.OfficeNbr = WHERE P.OfficeNbr = O.OfficeNbr AND O.OfficeNbr = M.OfficeNbr AND M.EmpID = E.EmpID AND M.OfficeNbr AND M.EmpID = E.EmpID AND
PropertyID IN (SELECT PropertyID FROM Property as P2 PropertyID IN (SELECT PropertyID FROM Property as P2 WHERE P2.City = ‘Pgh’ AND P2.OfficeNbr = M.OfficeNbr)WHERE P2.City = ‘Pgh’ AND P2.OfficeNbr = M.OfficeNbr)
Some joined tables can be eliminated.Some joined tables can be eliminated.
Transaction Tuning ExampleTransaction Tuning Example
SELECT * SELECT *
FROM Property as P, Office as O, Manager as M, Employee as FROM Property as P, Office as O, Manager as M, Employee as E E
WHERE P.OfficeNbr = O.OfficeNbr AND O.OfficeNbr = WHERE P.OfficeNbr = O.OfficeNbr AND O.OfficeNbr = M.OfficeNbr AND M.EmpID = E.EmpID AND M.OfficeNbr AND M.EmpID = E.EmpID AND
PropertyID IN PropertyID IN (SELECT PropertyID FROM Property as P2 (SELECT PropertyID FROM Property as P2 WHERE P2.City = ‘Pgh’ AND P2.OfficeNbr = M.OfficeNbr)WHERE P2.City = ‘Pgh’ AND P2.OfficeNbr = M.OfficeNbr)
Subquery is executed once per office.Subquery is executed once per office.
Transaction Tuning ExampleTransaction Tuning Example
SELECT E.Name SELECT E.Name
FROM Employee as E FROM Employee as E
WHERE E.MgrFlag = 1 AND OfficeNbr IN WHERE E.MgrFlag = 1 AND OfficeNbr IN
(SELECT OfficeNbr FROM Property as P (SELECT OfficeNbr FROM Property as P
WHERE P.City = ‘Pgh’)WHERE P.City = ‘Pgh’)
Version without any joins - 2 single table Version without any joins - 2 single table queries only.queries only.
Transaction Tuning ExampleTransaction Tuning Example
SELECT DISTINCT E.Name SELECT DISTINCT E.Name
FROM Property as P, Employee as E FROM Property as P, Employee as E
WHERE P.OfficeNbr = E.OfficeNbr AND E.MgrFlag = 1 AND WHERE P.OfficeNbr = E.OfficeNbr AND E.MgrFlag = 1 AND
P.City = ‘Pgh’P.City = ‘Pgh’
Single query with join.Single query with join.
Transaction TuningTransaction Tuning
Explain (or similar tool) can help to Explain (or similar tool) can help to identify how each transaction will access identify how each transaction will access the data and what temporary tables will the data and what temporary tables will have to be created to execute the queryhave to be created to execute the query
With multiple options, test themWith multiple options, test them Order of conditions in WHERE can affect Order of conditions in WHERE can affect
the optimization and performancethe optimization and performance I.E., put MgrFlag = 1 firstI.E., put MgrFlag = 1 first
Physical Database TuningPhysical Database Tuning
IndicesIndices Schema TuningSchema Tuning
Retaining NormalizationRetaining Normalization DenormalizationDenormalization
IndicesIndices
UniqueUnique NonuniqueNonunique
Single AttributeSingle Attribute Multiple AttributesMultiple Attributes
(concatenated or (concatenated or composite key)composite key)
Primary KeyPrimary Key Secondary IndexSecondary Index
Additional IndicesAdditional Indices
Index decreases read time but increases Index decreases read time but increases update timeupdate time
Based on queries - even single queryBased on queries - even single query (EXPLAIN)(EXPLAIN)
Indices need reorganizationIndices need reorganization Inserts, Updates, DeletesInserts, Updates, Deletes
Specify freespaceSpecify freespace Reduce frequency of reorganizationsReduce frequency of reorganizations
Schema Tuning - Schema Tuning - Staying NormalStaying Normal Split Tables - Vertical PartitioningSplit Tables - Vertical Partitioning Highly used vs. infrequently used Highly used vs. infrequently used
columnscolumns
Don’t partition if result will be more joinsDon’t partition if result will be more joins
Keys are duplicatedKeys are duplicated
Schema Tuning -Schema Tuning -Staying NormalStaying Normal Variable length fields (VARCHAR, others)Variable length fields (VARCHAR, others) Indeterminant record lengthsIndeterminant record lengths Row locations varyRow locations vary
Vertically partition row into two tables, Vertically partition row into two tables, one with fixed and one with variable one with fixed and one with variable columnscolumns
Schema Tuning -Schema Tuning -Leaving NormalLeaving Normal NormalizationNormalization
Eliminates duplicationEliminates duplication Reduces anomaliesReduces anomalies
Does not result in efficiencyDoes not result in efficiency
Denormalize for performanceDenormalize for performance
Denormalization WarningsDenormalization Warnings
Increases chance of errors or inconsistenciesIncreases chance of errors or inconsistencies May result in reprogramming if business rules May result in reprogramming if business rules
changechange Optimizes based on Optimizes based on currentcurrent transaction mix transaction mix Increases duplication and space requiredIncreases duplication and space required Increases programming complexityIncreases programming complexity
Always normalize first then denormalizeAlways normalize first then denormalize
DenormalizationDenormalization
Partition RowsPartition Rows Combine TablesCombine Tables Combine and PartitionCombine and Partition Replicate DataReplicate Data
Combining OpportunitiesCombining Opportunities
One-to-one (optional)One-to-one (optional) allow nullsallow nulls
Many-to-many (assoc. entity)Many-to-many (assoc. entity) 2 tables instead of 32 tables instead of 3
Reference data (one-to-many)Reference data (one-to-many) ““one” not use elsewhereone” not use elsewhere few of “many”few of “many”
Combining ExamplesCombining Examples
EmployeeEmployee-Spouse (name and SSN -Spouse (name and SSN only)only)
Owner-PctOwnedOwner-PctOwned - - PropertyProperty few owners with multiple propertiesfew owners with multiple properties
PropertyProperty-Type (description)-Type (description) one type per propertyone type per property
PartitioningPartitioning
HorizontalHorizontal By row typeBy row type Separate processing by typeSeparate processing by type Supertype/subtype decisionSupertype/subtype decision
Vertical (already seen)Vertical (already seen) BothBoth
ReplicationReplication
Intentionally repeating dataIntentionally repeating data
Example: Owner-PctOwned-PropertyExample: Owner-PctOwned-Property Owner includes PctOwned & Owner includes PctOwned &
PropertyIDPropertyID Property includes majority OwnerSSN Property includes majority OwnerSSN
and and PctOwnedPctOwned
Performance TuningPerformance Tuning
Not a one-time eventNot a one-time event Monitoring probably more importantMonitoring probably more important Things changeThings change
applications, database (table) sizes, applications, database (table) sizes, data characteristicsdata characteristics
hardware, operating system, DBMShardware, operating system, DBMS