Data Structures, Algorithms and Database Programming Semester 2/ Weeks 13-24 Database Programming Nick Rossiter/Emma-Jane Phillips-Tait


Data Structures, Algorithms and Database Programming

Semester 2/ Weeks 13-24

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


Data Structures, Algorithms and Database Programming

Semester 2/ Week 13

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


Introduction

• Database Programming
  – A program is defined simply as:
    • a sequence of instructions that a computer can interpret and execute
  – So SQL (Structured Query Language)
    • the ISO standard language for relational databases
  – is a programming language


SQL - Classification

• SQL is the basis of all database programming
• As a language SQL is:
  – Non-procedural
    • Specify the target, not the mechanism (what not how)
  – Safe
    • Negations limited by context
  – Set-oriented
    • All operations are on entire sets
  – Relationally complete
    • Has the power of the relational algebra
  – Functionally incomplete
    • Does not have the power of a programming language like Java


SQL – Program Constructions

SELECT id, name, address
FROM student
WHERE name = 'Mary Brown';

• FROM clause specifies tables to be queried (source/range)

• WHERE clause specifies restriction on values to be processed (predicate)

• SELECT clause specifies what is to be retrieved (target)


Some properties of SQL re-visited

• Non-procedural
  – No loops or tests for end of file
• Set-oriented
  – The operation is automatically applied to all the rows in STUDENT
• Relationally complete
  – Restrict shown here (all others are available)
• Functionally incomplete
  – Does not matter here if just want information displayed


SQL Program - Example of Natural Join

SELECT student.id, name, address, year

FROM student, module_choice

WHERE name = 'Mary Brown'

AND module = 'CM503'

AND student.id = module_choice.id;

• Last line does primary key – foreign key match

• “Give details of when a student called Mary Brown took module CM503”


Student

Id * | Name | Address

Module_choice

Module * | Id * | year

Module_choice.Id is foreign key to Student.id; represents a path along which joins are made

* Indicates component of primary key

Data names in SQL are case insensitive; SQL values are case sensitive.


Student

Id * | Name       | Address
127  | Mary Brown | Hexham
296  | John Brown | Morpeth
654  | Mary Brown | Newcastle

Module_choice

Module * | Id * | year
CM503    | 127  | 2003
CM503    | 654  | 2001
cm503    | 127  | 2002

For the above data values, what’s the answer?


Id  | Name       | Address   | Year
127 | Mary Brown | Hexham    | 2003
654 | Mary Brown | Newcastle | 2001

Column name contains only one value (as would a module column)

Why only 2 rows? Why is one ‘127’ match missing?


Rewriting Joins as Intersections

• SQL is not necessarily run in the way you enter it
• You (or the system) could rewrite the join earlier as:

SELECT id, name, address
FROM student
WHERE name = 'Mary Brown' AND id IN
  (SELECT id FROM module_choice
   WHERE module = 'CM503');

• There’s one difference. Why?
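One way to explore the question is to run both forms side by side. The sketch below uses Python's standard sqlite3 module as a stand-in for Oracle (an assumption made only so the example is runnable) with the Student and Module_choice rows from the earlier slide. It suggests one likely answer: both forms find the same students, but only the join can report year, because module_choice columns are not visible outside the subquery.

```python
import sqlite3

# SQLite stands in for Oracle here; table contents copy the earlier slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id TEXT PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE module_choice (module TEXT, id TEXT, year INTEGER,
                                PRIMARY KEY (module, id));
    INSERT INTO student VALUES ('127', 'Mary Brown', 'Hexham'),
                               ('296', 'John Brown', 'Morpeth'),
                               ('654', 'Mary Brown', 'Newcastle');
    INSERT INTO module_choice VALUES ('CM503', '127', 2003),
                                     ('CM503', '654', 2001),
                                     ('cm503', '127', 2002);
""")

# Join form: columns from both tables are available to the target list.
join_rows = conn.execute("""
    SELECT student.id, name, address, year
    FROM student, module_choice
    WHERE name = 'Mary Brown' AND module = 'CM503'
      AND student.id = module_choice.id
    ORDER BY student.id
""").fetchall()

# IN form: the subquery only filters; its columns cannot be selected.
in_rows = conn.execute("""
    SELECT id, name, address
    FROM student
    WHERE name = 'Mary Brown' AND id IN
          (SELECT id FROM module_choice WHERE module = 'CM503')
    ORDER BY id
""").fetchall()

same_students = [row[:3] for row in join_rows] == in_rows

# Asking the IN form for year fails: year is not a column of student.
try:
    conn.execute("""
        SELECT id, name, address, year
        FROM student
        WHERE id IN (SELECT id FROM module_choice WHERE module = 'CM503')
    """)
    year_available = True
except sqlite3.OperationalError:
    year_available = False

print(same_students, year_available)
```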


SQL controls the filing cabinet

• Defines data structures (CREATE TABLE, CREATE VIEW, …)

• Handles updates (INSERT, DELETE, COMMIT, ROLLBACK, …)

• Provides retrieval (SELECT)

• But it is not functionally complete in its interactive form


Functional Incompleteness in SQL

• No control statements such as:
  – Case, Repeat, If, While
• No substitution at run time:
  – e.g. … WHERE id = :idread
  – where idread is a program variable
• You don’t see travel agents typing in SQL statements to search for holiday vacancies
  – although they will be searching a relational database
• There is SQL underneath
  – But its functionality is increased through additional features
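As an illustration of run-time substitution from a host program, here is a minimal sketch in Python with the standard sqlite3 module (a stand-in for Oracle; the function name find_student is hypothetical). The query text stays fixed and the value of the program variable idread is bound in when the statement runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id TEXT PRIMARY KEY, name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("127", "Mary Brown", "Hexham"),
     ("296", "John Brown", "Morpeth"),
     ("654", "Mary Brown", "Newcastle")],
)

def find_student(idread):
    """Run one fixed query, binding the program variable idread at run time."""
    cur = conn.execute(
        "SELECT id, name, address FROM student WHERE id = :idread",
        {"idread": idread},
    )
    return cur.fetchone()  # None when no row matches

print(find_student("296"))
```

The host language supplies the control flow and variables that interactive SQL lacks, while SQL still does the retrieval.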


Getting More out of Basic SQL

• To overcome functional incompleteness:
  – Pre-defined Functions
  – Procedures (e.g. PL/SQL)
  – User-defined Functions
  – Embedded SQL
  – Web-based Servers (e.g. Microsoft/ASP, Oracle/JSP, Oracle/JDBC, MySQL/PHP)


Pre-defined Functions in SQL

• An SQL function:
  – Is a method applied to a particular type
  – Returns a single value
• There are many pre-defined functions:
  – Can be used without any knowledge of how they are implemented
  – All can be used in target (SELECT) and some in predicate (WHERE)
  – Used in areas such as string handling, simple statistics, date manipulation, type casting
• Use pre-defined functions where available to avoid writing your own code


Example of Predefined Function

SELECT student.id, name, address, year
FROM student, module_choice
WHERE name = 'Mary Brown'
AND upper(module) = 'CM503'
AND student.id = module_choice.id;

• Upper(char) takes a value of type char and forces it to upper case

• In the example above it does not update module values in the module_choice table

• So how many rows are retrieved by this join?
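The effect of upper() can be checked with a small sketch. It uses Python's standard sqlite3 module as a stand-in for Oracle (an assumption for runnability; in both systems = on character values is case sensitive by default) and the rows from the earlier Student and Module_choice slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id TEXT PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE module_choice (module TEXT, id TEXT, year INTEGER,
                                PRIMARY KEY (module, id));
    INSERT INTO student VALUES ('127', 'Mary Brown', 'Hexham'),
                               ('296', 'John Brown', 'Morpeth'),
                               ('654', 'Mary Brown', 'Newcastle');
    INSERT INTO module_choice VALUES ('CM503', '127', 2003),
                                     ('CM503', '654', 2001),
                                     ('cm503', '127', 2002);
""")

# The same join, with the module test filled in two different ways.
JOIN = """
    SELECT student.id, name, address, year
    FROM student, module_choice
    WHERE name = 'Mary Brown'
      AND {module_test} = 'CM503'
      AND student.id = module_choice.id
    ORDER BY year
"""

exact = conn.execute(JOIN.format(module_test="module")).fetchall()
folded = conn.execute(JOIN.format(module_test="upper(module)")).fetchall()

# The stored value 'cm503' is untouched; upper() only affects the comparison.
stored = conn.execute("SELECT COUNT(*) FROM module_choice WHERE module = 'cm503'").fetchone()[0]

print(len(exact), len(folded), stored)  # 2 3 1
```

With the exact comparison the row for module 'cm503' is missed; with upper(module) all three module choices for the two Mary Browns are matched.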


Other String-Handling Functions

• Include
  – CONCAT(arg1, arg2)
    • concatenation of the two arguments arg1, arg2
  – TRIM(arg1)
    • removes leading and trailing blanks from arg1
  – LOWER(arg1)
    • translates arg1 to lower case
  – UPPER(arg1)
    • translates arg1 to upper case
  – SUBSTR(arg1,n,m)
    • returns the m characters of arg1 starting at position n, i.e. positions n…(n+m-1)
• arg1, arg2 are char types; n, m are integer types
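A quick way to try these functions is sketched below with Python's standard sqlite3 module as a stand-in for Oracle. One assumption to note: the || operator is used for concatenation instead of CONCAT, because CONCAT is not available in every SQLite version (Oracle accepts both).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each expression applies one string function to a literal value.
row = conn.execute("""
    SELECT 'Mary' || ' ' || 'Brown',      -- concatenation
           trim('  Hexham  '),            -- strip leading/trailing blanks
           lower('CM503'),
           upper('cm503'),
           substr('Newcastle', 4, 6)      -- 6 characters starting at position 4
""").fetchone()

print(row)  # ('Mary Brown', 'Hexham', 'cm503', 'CM503', 'castle')
```

Note that substr counts positions from 1, so position 4 of 'Newcastle' is the letter c.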


Predefined Aggregation Functions

• Functions operating on collections include:
  – AVG(setN)
    • returns average of setN
  – SUM(setN)
    • returns sum of setN
  – COUNT(setR)
    • returns count of setR
  – MAX(setT)
    • returns maximum of setT
• setN is a set of type number, setR is a set of type row, setT is a set of any type
• Sets may be formed as columns of values via:
  – a SELECT command on a table or
  – a GROUP BY on a table
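The two ways of forming sets can be sketched as follows, using Python's standard sqlite3 module as a stand-in for Oracle and the Module_choice rows from the earlier slide. Aggregating over year is only for illustration; it is not something the slides do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE module_choice (module TEXT, id TEXT, year INTEGER,
                                PRIMARY KEY (module, id));
    INSERT INTO module_choice VALUES ('CM503', '127', 2003),
                                     ('CM503', '654', 2001),
                                     ('cm503', '127', 2002);
""")

# Set formed by a plain SELECT: one result row for the whole table.
totals = conn.execute(
    "SELECT COUNT(*), SUM(year), AVG(year), MAX(year) FROM module_choice"
).fetchone()

# Sets formed by GROUP BY: one result row per student id.
per_student = conn.execute("""
    SELECT id, COUNT(*), MAX(year), AVG(year)
    FROM module_choice
    GROUP BY id
    ORDER BY id
""").fetchall()

print(totals)
print(per_student)
```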


Predefined Date Functions

• Include:
  – SYSDATE
    • no arguments, returns current date
  – MONTH(arg3)
    • returns month component of arg3
  – YEAR(arg3)
    • returns year component of arg3
  – MONTHS_BETWEEN(arg3,arg4)
    • the number of months between arg3 and arg4
  – In some versions of Oracle, need to use syntax e.g. {fn MONTH(arg3)}
• arg3, arg4 are date types
• (arg3-arg4) gives number of days between the two dates
• time handling is also available within date type


Data Structures, Algorithms and Database Programming

Semester 2/ Week 14

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


Programming with SQL*Plus

• Predefined functions play an increasing role
  – Sometimes termed built-in functions
• Somewhat unstable from release to release
  – New functions in Oracle 9i
  – Some redefinitions of Oracle 8i functions
  – Not always upwards compatible
  – What works in one system may not work in another without some tweaking


Reference Material

• So programmers must always consult reference material
• Text books and lecture notes are not reference manuals
• All database manuals are available on-line e.g.
  – DB2 notes cited in exercises for week 13 (should be very similar to Oracle as same standard)
  – Oracle 9i notes available from (prefix id by unn/)
  – http://cgweb1.unn.ac.uk/SubjectAreaResources/database/oracle/doc/
• Sound advice is: Read the Manual!


Reading Variable Values into SQL Programs

• Makes programs dynamic
• Number of methods
  – Substitution variables
    • &variable in programs
  – Accept statements
    • User-defined prompt and type for a variable value
  – Script parameters
    • Variables assigned values on executing a script file
• Such reads make programs versatile
  – Can run with values specified by user at run-time


Substitution Variables

select *
from patient
where pid = '&pno';

• when run, prompts user for value for pno
• quotes indicate a char value

select *
from patient
where pid <= '&&pno'
and pid >= '&&pno';

• Double && means only one prompt is made
  – even if same variable occurs more than once in program
• In all input operations quotes are also used for dates but not for numbers.


Accept Statement

• ACCEPT sp_variable type PROMPT 'string';
  • where upper case is literal
  • sp_variable is the SQL*Plus variable being assigned
  • type is the type of the SQL*Plus variable
  • string is the prompt
• Example:

accept pat_no char prompt 'Enter patient id:';
select *
from patient
where pid = '&pat_no';

• Value entered for pat_no is available for rest of session.


Script Parameters

• Can run a script file called S.sql (type sql is required by default) by:

@S

• Say S contains

select *
from patient
where pname = '&1' AND pid = '&2';

• Then can run the script by:

@S 'Fred' '1';

• ‘Fred’ is parameter 1 and ‘1’ is parameter 2
• May need to use file/open in SQL*Plus to set directory


Nulls

• In more serious programming with SQL
  – often need to know whether a variable has been initialised or not.
• An un-initialised variable has a null value
  – unless a default has been supplied
• Cannot search for nulls as '' or ""

select * from patient where pname IS NULL;

• finds rows where patient name is null

select * from patient where pname IS NOT NULL;

• finds rows where patient name is not null
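A minimal sketch of these null tests, using Python's standard sqlite3 module as a stand-in for Oracle. The two-column patient table and its rows are invented for illustration. It shows that comparing with = never finds a null, whereas IS NULL and IS NOT NULL do what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical patient table: patient '2' has no name recorded.
conn.execute("CREATE TABLE patient (pid TEXT PRIMARY KEY, pname TEXT)")
conn.executemany("INSERT INTO patient VALUES (?, ?)", [("1", "Fred"), ("2", None)])

def pids(where):
    """Return the pids selected by the given WHERE condition."""
    rows = conn.execute(f"SELECT pid FROM patient WHERE {where} ORDER BY pid")
    return [pid for (pid,) in rows]

eq_empty = pids("pname = ''")        # an empty string is not how null is found
eq_null = pids("pname = NULL")       # comparison with null is never true
is_null = pids("pname IS NULL")
is_not_null = pids("pname IS NOT NULL")

print(eq_empty, eq_null, is_null, is_not_null)  # [] [] ['2'] ['1']
```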


Spooling

• Useful to record a whole session

• From the File menu in SQL*Plus:
  – Can set a (text) file as the recipient of all output including commands
  – Need
    • SET ECHO ON
  – to have a record of everything


Intermediate Results

• Building up results in stages is often a good idea:
  – Can check intermediate results for correctness
  – May be able to re-use intermediate results in more than one way
• Views may be used for this purpose
  – No data storage costs for a view
  – Updated automatically as data changes (in effect)
    • Reflects latest data position
• Tables are less satisfactory
  – Duplicate data storage
  – Out of date as snapshot of data held
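The difference between a view and a table holding an intermediate result can be sketched as below, using Python's standard sqlite3 module as a stand-in for Oracle. The cut-down patient and visits tables, their rows, and the name pv_snapshot are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (pid TEXT PRIMARY KEY, pname TEXT);
    CREATE TABLE visits (pid TEXT, did TEXT, vdate TEXT);
    INSERT INTO patient VALUES ('1', 'Fred'), ('2', 'Ann');
    INSERT INTO visits VALUES ('1', 'd1', '2002-04-01');

    -- Intermediate result held as a view: only the query is stored.
    CREATE VIEW pv AS
        SELECT patient.pid, pname, did, vdate
        FROM patient, visits
        WHERE patient.pid = visits.pid;

    -- Same intermediate result copied into a table: a snapshot of the data.
    CREATE TABLE pv_snapshot AS SELECT * FROM pv;

    -- The base data then changes.
    INSERT INTO visits VALUES ('2', 'd1', '2002-04-02');
""")

view_count = conn.execute("SELECT COUNT(*) FROM pv").fetchone()[0]
snapshot_count = conn.execute("SELECT COUNT(*) FROM pv_snapshot").fetchone()[0]

print(view_count, snapshot_count)  # 2 1
```

The view reflects the new visit immediately; the table copy still shows the position when it was created.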


Examples of Views

create view pv as
(select patient.*, visits.did, visits.vdate
 from patient, visits
 where patient.pid = visits.pid);

• Natural join of patient and visits, contains:
  – pid, pname, address, dobirth, date_reg, did, vdate

create view dv as
(select doctor.*, visits.pid, visits.vdate
 from doctor, visits
 where doctor.did = visits.did);

• Natural join of doctor and visits, contains:
  – did, dname, date_start, pid, vdate


Operations on Views 1

• Can select and search as if they were tables
• Updates may cause problems (not considered here)
• Examples
  a) select * from pv;
  b) select * from pv where pid = '5';
  c) create view pvv as
     (select pv.*, action, vaccinated
      from pv, vaccinations vc
      where pv.pid = vc.pid
      and pv.vdate = vc.vdate);
• Does natural join of pv and vaccinations over pid and vdate.
• Natural join pvv contains:
  – pid, pname, address, dobirth, date_reg, did, vdate, action, vaccinated
• vc is alias for vaccinations


Operations on Views 2

• View pvv is the natural join of patient, visits and vaccinations

• It can be presented to users as a structure:
  – For ease of searching (just where clause)
  – In which no knowledge of joins is required

select distinct pid, pname
from pvv
where upper(vaccinated) = 'TYPHOID'
and address = 'Heaton'
and vdate < '25-apr-2002';

• Searches for values in all three base tables with joins (that is logical connections) already made in pvv.


Setting the SQL*Plus Environment

• Over 50 variables control the environment in which SQL*Plus runs.

• Can see all their current settings through:

show all;

• Cover appearance of prompts, formats on screen, transaction settings, recovery, escape, compatibility, …

• A potential pitfall for imported applications if different environment assumed

SHOW var;

• shows the current setting for a particular variable


Examples of Environment Variables 1

• Autocommit
  – If on, updates are committed after each update command (insert, delete, update)
  – If off, updates are not committed after each update command
• Time
  – If on, all prompts are preceded by the time, giving time stamping
  – If off, time is not displayed with the prompt


Examples of Environment Variables 2

• Linesize
  – Set to an integer giving width of line display
• SQLPrompt
  – Can vary default prompt from SQL>
• Feedback
  – If on, report on number of rows found
  – If off, give no report
• Echo
  – If on, echo input on screen
  – If off, do not echo input


Setting Environment Variables

• SET variable value;
  – variable is the environment variable
  – value is the new value
• Examples:
  – set autocommit on;
  – set sqlprompt input>;
  – set linesize 100;


Data Structures, Algorithms and Database Programming

Semester 2/ Week 15

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


SQL*Plus Scripting 1

• Plus points:
  – Same SQL language as in interactive mode
    • Can test programs interactively first
    • Includes predefined (built-in) functions
  – Fast development possible
    • Rapid prototyping
    • Get results and feedback quickly
    • No cumbersome environment
  – Variable inputs
    • Parameters, substitution, accept


SQL*Plus Scripting 2

• Plus points (continued)
  – Can have multiple script files
    • Each file created by simple text editor
  – Can have master script file
    • calling others in sequence
  – Or can nest script files more generally
    • scripts can call other scripts (default file type sql)
      – @S6
      – start S6


SQL*Plus Scripting 3

• Problems with scripts:
  – interpreted each time they are run
    • not verified and compiled
    • optimisation of SQL code done each run time
    • poor performance
  – no control environment
    • procedural actions lacking (case, if, while, for, repeat)
    • no error handling
      – resulting in outright failures or ignoring of messages


SQL*Plus Scripting 4

• Problems with scripts (continued):
  – Lack of control by business (via DBA -- DataBase Administrator)
  – How do we permit scripts for usage by particular people?
    • Can anyone write a script to do anything they like?
  – If people write scripts themselves to handle business rules
    • how do we know they’ve implemented the rules in the same way?


Example: Different Representation of Same Rule

• update patient set dobirth = '20-feb-1932' where pid = '3';
• select round(months_between(sysdate,dobirth)/12) as age_mb from patient;
• select trunc((sysdate-dobirth)/365.25) as age_dd from patient;
• select trunc((sysdate-dobirth)/365) as age_ddnl from patient;
• select (extract(year from sysdate)-extract(year from dobirth)) as age_yr from patient;

• Above:
  • Alter dobirth for patient ‘3’
  • Run four queries, each one, according to its user, calculating the age.


Results 14-Feb-2004

pid      | 1  | 2  | 3  | 4
Age_mb   | 34 | 20 | 72 | 22
Age_dd   | 33 | 20 | 71 | 22
Age_ddnl | 33 | 20 | 72 | 22
Age_yr   | 34 | 21 | 72 | 22
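The disagreement for patient 3 can be reproduced outside the database. The sketch below re-implements the four calculations in plain Python for a date of birth of 20-Feb-1932 and a run date of 14-Feb-2004. The months_between helper is only an approximation of Oracle's function (whole months plus a day difference over 31), and the "exact" birthday-based rule is added here for comparison; it is not the function developed later in the course.

```python
from datetime import date

def months_between(later, earlier):
    """Approximate Oracle MONTHS_BETWEEN: whole months plus day difference / 31."""
    whole = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    return whole + (later.day - earlier.day) / 31

def ages(dob, today):
    """Return the age computed by each rule from the slide, plus an exact rule."""
    days = (today - dob).days
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return {
        "age_mb": round(months_between(today, dob) / 12),
        "age_dd": int(days / 365.25),      # trunc of days / 365.25
        "age_ddnl": int(days / 365),       # trunc of days / 365, no leap years
        "age_yr": today.year - dob.year,   # difference of calendar years
        "exact": today.year - dob.year - (0 if had_birthday else 1),
    }

result = ages(date(1932, 2, 20), date(2004, 2, 14))
print(result)
```

Under these assumptions the four rules give 72, 71, 72 and 72, matching the pid 3 column above, while the birthday rule gives 71 because the 72nd birthday is still six days away.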


Comments on Table

• Minor differences such as these
  – often more of a problem than major differences
• If big differences
  – these rapidly become obvious
  – e.g. paying an interest rate ten times more than others
• Age differences here could play havoc with:
  – social security benefits
• Later we will create a function to calculate the age precisely


Production Environment

• Encourages:
  – business rules in one place
    • application of rules then controlled by DBA
  – users need permission to apply rules
    • permission is granted/revoked by DBA
• Discourages:
  – duplicated, potentially inconsistent, rules
  – access by users to anything they like


SQL Procedure

• An important technique
• Part of PL/SQL in Oracle
  – Procedural Language/Structured Query Language
• Part of the SQL standard
  – approximate portability from one system to another
• Techniques are available for:
  – procedural control (case, if, while, …)
  – parameterised input/output
  – security


Oracle PL/SQL

• Not available in Oracle 8i Lite
• Available in Oracle 9i at Northumbria
• Available in Oracle 9i Personal Edition for Windows (XP/NT/2000/98) and Linux.
  – http://otn.oracle.com/software/products/oracle9i/index.html
  – c. 1.4 GB download -- needs Broadband -- 3 CDs
• Useful guide to PL/SQL:
  – http://www-db.stanford.edu/~ullman/fcdb/oracle/or-plsql.html
  – Using Oracle PL/SQL -- Jeffrey Ullman, Stanford University


Procedures are First-class Database Objects

• Procedures are held in database tables
  – under the control of the database system
    • in the data dictionary

select object_type, object_name
from user_objects
where object_type = 'PROCEDURE';

• user_objects is data dictionary table maintained by Oracle
• object_type is attribute of table user_objects holding value ‘PROCEDURE’ (upper case) for procedures
  – other values for object_type include ‘TABLE’, ‘VIEW’
• object_name is user assigned name for object e.g. ‘PATIENT’


Procedures aid Security

• Privileges on Tables:
  • Select
    – query the table with a select statement.
  • Insert
    – add new rows to the table with insert statement.
  • Update
    – update rows in the table with update statement.
  • Delete
    – delete rows from the table with delete statement.
  • References
    – create a constraint that refers to the table.
  • Alter
    – change the table definition with the alter table statement.
  • Index
    – create an index on the table with the create index statement


Privileges on Tables

• SQL statement -- issued by DBA:
  – GRANT select, insert, update, delete ON patient TO cgnr2;
  – ‘no grants to cgnr3 for table access’
• allows user cgnr2 to issue SQL commands:
  • beginning with SELECT, INSERT, UPDATE, DELETE on table patient
• but this user cannot issue SQL commands
  • beginning with REFERENCES, ALTER, INDEX on table patient
• User cgnr3 does not know that table patient exists


Privileges on Procedures

• The SQL statement
  – GRANT execute ON add_patient TO cgnr3;
• allows user cgnr3 to execute the procedure called add_patient
• So user cgnr3 can add patients
  – presumably the task of add_patient
• but cannot do any other activity on the patient table
  – including SELECT
• So procedures give security based on tasks
  – powerful task-based security system


Data Structures, Algorithms and Database Programming

Semester 2/ Week 16

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


SQL Procedure Construction

• Simple example, SQL*Plus Window:

SQL> create or replace procedure add_patient as
  2  begin
  3  insert into patient values('99','Smith','Newcastle','12-mar-1980');
  4  end;
  5  /

Warning: Procedure created with compilation errors

SQL> show errors
Errors for PROCEDURE ADD_PATIENT

LINE/COL ERROR
-------- -----------------------------------------
3/1      PL/SQL: SQL Statement ignored
3/13     PL/SQL: ORA-00947: not enough values


Technique

• Have procedure code in text file managed by simple editor
  – E.g. Notepad

create or replace procedure add_patient as
begin
insert into patient values('99','Smith','Newcastle','12-mar-1980');
end;
/

• Copy and paste code from text file into SQL*Plus window
• Oracle does keep a copy in its data dictionary
• Many users work from text files


Features of Procedure

• CREATE OR REPLACE PROCEDURE add_patient AS
  – Either add or over-write procedure called add_patient
    • Needs care
    • Could over-write existing procedure
    • IS is alternative for AS
• BEGIN and END
  – Start and finish block
• INSERT is standard SQL statement
• / means compile


Error Tracking

• ‘with compilation errors’
  – Problem(s) encountered in compilation
• Look at these through SQL command
  – SHOW ERRORS (SHO ERR abbreviation)
• Diagnostics:
  – Statement at line 3 ignored
    • As not enough values at line 3, column 13 for patient
    • Five columns in patient, four in insert statement
  – So in compilation tables are checked for compatibility with procedure operations
• ORA-00947 is an Oracle return code for ‘not enough values’
• Only execute procedures compiled without errors


Try again

SQL> create or replace procedure add_patient (reg in char) as
  2  begin
  3  insert into patient
     values('99','Smith','Newcastle','12-mar-1980',reg);
  4  end;
  5  /

Procedure created.


Parameters

• Have added 5th variable to values
• Also added a parameter
  – Reg
    • type char (as in SQL types) and in (input, read-only)
  – Other types at this level are number, date
• Message ‘Procedure created’ means:
  – No errors found
  – Procedure can be executed
  – Procedure is held in Oracle’s data dictionary


Data Dictionary entry for procedure

SQL> select object_type, object_name
  2  from user_objects
  3  where object_type = 'PROCEDURE';

OBJECT_TYPE
------------------
OBJECT_NAME
--------------------------------------------------
PROCEDURE
ADD_PATIENT


Executing Procedure

SQL> execute add_patient('14-feb-2002');
PL/SQL procedure successfully completed.

SQL> select * from patient where pid = '99';

PID    PNAME
------ --------------------
ADDRESS
-----------------------------------------------
DOBIRTH   DATE_REG
--------- ---------
99     Smith
Newcastle
12-MAR-80 14-FEB-02


Features of Execution

• ‘14-feb-2002’ is the value for parameter reg
  – declared as char in the procedure; converted implicitly to a date when inserted into column date_reg
• Other values are hard-wired in procedure
• Message ‘… successfully completed’
  – No errors during run
• Subsequent SELECT confirms
  – New data entered for patient with pid = ‘99’


Now run procedure again

SQL> execute add_patient('14-feb-2002');

BEGIN add_patient('14-feb-2002'); END;

*

ERROR at line 1:

ORA-00001: unique constraint (CGNR1.PKP) violated

ORA-06512: at "CGNR1.ADD_PATIENT", line 3

ORA-06512: at line 1


Error – why?

• Attempt to add row with same primary key as last run (‘99’).
• So violation at line 3 of procedure of primary key constraint CGNR1.PKP
  – CGNR1 is id
  – PKP is constraint from CREATE TABLE
    • create table patient (
    • pid char(6) constraint pkp primary key, ….
• ‘ORA-00001: unique constraint violated’
  – Oracle return code and associated message
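The same failure can be provoked outside Oracle. The sketch below is a loose analogue only: it uses Python's standard sqlite3 module, a Python function standing in for the PL/SQL procedure, and text columns for the dates. SQLite reports the violation as an IntegrityError rather than ORA-00001.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patient (
        pid CHAR(6) CONSTRAINT pkp PRIMARY KEY,
        pname TEXT, address TEXT, dobirth TEXT, date_reg TEXT)
""")

def add_patient(reg):
    """Python stand-in for the add_patient procedure with hard-wired values."""
    conn.execute(
        "INSERT INTO patient VALUES ('99', 'Smith', 'Newcastle', '12-mar-1980', ?)",
        (reg,),
    )

add_patient("14-feb-2002")          # first run succeeds

try:
    add_patient("14-feb-2002")      # second run repeats primary key '99'
    second_run = "inserted"
except sqlite3.IntegrityError:
    second_run = "rejected"

row_count = conn.execute("SELECT COUNT(*) FROM patient WHERE pid = '99'").fetchone()[0]
print(second_run, row_count)  # rejected 1
```

The database, not the calling code, enforces the primary key, so the duplicate never reaches the table.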


All values from parameters

SQL> CREATE OR REPLACE PROCEDURE add_patient (pid in char, pname in char, address in char, dobirth in date, regdate in date)
  2  AS
  3  BEGIN
  4  insert into patient values(pid,pname,address,dobirth,regdate);
  5  DBMS_OUTPUT.PUT_LINE ('Insert attempted');
  6  END;
  7  /

Procedure created.


No data hard-wired/output strings

• Usually meaningless to have hard-wired data values
  – Need dynamic input at run-time
  – Note two types: char, date
  – Values may be captured through SQL Forms
• Output strings
  – Varies from system to system
  – In Oracle
    • Use DBMS_OUTPUT.PUT_LINE
    • Needs earlier SQL command:
      – Set serveroutput on
  – Note output here is unconditional and vague


Execution with all values as parameters

SQL> execute add_patient('124','Smith','Edinburgh','13-nov-1980','27-dec-2002');
Insert attempted

PL/SQL procedure successfully completed.

SQL> select * from patient where pid = '124';

PID    PNAME
------ --------------------
ADDRESS
--------------------------------------------------------------------------------
DOBIRTH   DATE_REG
--------- ---------
124    Smith
Edinburgh
13-NOV-80 27-DEC-02


Make columns explicit

SQL> CREATE OR REPLACE PROCEDURE add_patient (pat_id in char, pat_name in char, pat_address in char, pat_dobirth in date, pat_regdate in date)
  2  AS
  3  BEGIN
  4  insert into patient(pid,pname,address,dobirth,date_reg)
     values(pat_id,pat_name,pat_address,pat_dobirth,pat_regdate);
  5  DBMS_OUTPUT.PUT_LINE ('Insert attempted');
  6  END;
  7  /

Procedure created.

• Specifying columns for patient makes procedure immune to any later changes in order of columns in patient


Data Structures, Algorithms and Database Programming

Semester 2/ Week 17

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


Transactions -- Rationale

• Consider two clients booking airline tickets
• There are 2 seats left on a flight
• Client A wants 2 seats:
  – time 12:02 makes initial request
  – 12:06 confirms purchase through booking form
  – 12:08 authorises credit card payment
• Client B wants 2 seats:
  – time 12:03 makes initial request
  – 12:05 confirms purchase through booking form
  – 12:09 authorises credit card payment
• Situation needs careful control


Some Possibilities

• Clients A and B are both told 2 seats are free in initial enquiries
• B confirms purchase before A
  – But A may still proceed
• A attempts credit card debit first
  – If successful A secures tickets at 12:08
• B then attempts credit card debit
  – If successful B secures tickets at 12:09
    • potentially over-writing A’s tickets
    • A has paid for tickets no longer his/hers


Requirements 1

• When client A beats B in the initial enquiry:
  – they should form a queue (serialisability)
  – B must wait for A to finish
• Different kinds of finish for A:
  – successful
    • completes booking form
    • makes credit card debit
    • store results (commit)
      – number of seats available is now zero
    • write transaction log and finish
    • B cannot proceed with purchase as no tickets left


Requirements 2

  – unsuccessful
    • may not complete booking form
    • may not have funds on credit card
    • undo any database changes (rollback) and finish
      – number of seats available is still 2
    • B can now proceed to attempt to purchase the 2 tickets left
• Techniques required emulate business practice
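
The queueing behaviour described in Requirements 1 and 2 could be sketched in PL/SQL roughly as follows. This is only an illustration: the table flight(flight_no, seats_free) and the procedure name book_seats are hypothetical and do not come from the course database, and the credit card step is left out. SELECT ... FOR UPDATE is used so that a second client asking for the same flight waits until the first has committed or rolled back.

```sql
-- Hypothetical table: flight(flight_no char(6) primary key, seats_free number)
CREATE OR REPLACE PROCEDURE book_seats (f_no in char, wanted in number)
AS
  free_now flight.seats_free%TYPE;
BEGIN
  -- Lock the flight row: a concurrent booking for the same flight waits here
  select seats_free into free_now
    from flight
   where flight_no = f_no
     for update;

  IF free_now >= wanted THEN
    update flight
       set seats_free = seats_free - wanted
     where flight_no = f_no;
    COMMIT;      -- store results and release the lock
    DBMS_OUTPUT.PUT_LINE ('Booked ' || wanted || ' seat(s)');
  ELSE
    ROLLBACK;    -- nothing was changed; release the lock
    DBMS_OUTPUT.PUT_LINE ('Not enough seats left');
  END IF;
END;
/
```

With this shape, client B's call blocks at the SELECT until client A finishes, and B then sees the seat count A left behind. An unknown flight number would raise NO_DATA_FOUND, which this sketch does not handle.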


Transactions -- ACID

• A transaction is a unit of operation on a database.
  – typically comprises a collection of individual actions
    • e.g. in SQL INSERT, UPDATE, DELETE, SELECT
• Satisfies ACID requirements:
  – Atomicity
    • Collection of operations is viewed as a single data process
  – Consistency
    • Data integrity is preserved
  – Isolation
    • No interaction between one transaction and another
    • Intermediate results not viewable by others
  – Durability
    • Once completed, effect of transaction is guaranteed to persist


Transactions in SQL

• Logical units of work
• A group of related operations that
  – must be performed successfully
    • before any changes to the database are finalised.
• Variable size:
  – entire run on SQL*Plus
    • e.g. spend 2 hours inserting data
  – single command in SQL*Plus
    • e.g. one insert command
  – one execution of a procedure
    • e.g. one run of add_patient (week 16)


SQL approach may be informal

• No explicit
  – BEGIN transaction, END transaction
• With autocommit OFF
  – SET AUTOCOMMIT OFF
• Implicit BEGIN transaction by:
  – start of SQL*Plus session
• Implicit END transaction by:
  – end of SQL*Plus session


SQL Transaction commands

• Commit;
  – saves current database state
  – releases resources held
  – equivalent to Save and Exit in MS Word
• Rollback;
  – returns database state to that at start of transaction
  – releases resources held
  – equivalent to dismiss/ do not save changes in MS Word
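
A short SQL*Plus session can illustrate the two commands. This is a sketch only: the patient values (pid '125' and so on) are made up for the example, and TO_DATE is used so the dates do not depend on the session's default date format.

```sql
-- SQL*Plus setting: do not commit automatically after each statement
SET AUTOCOMMIT OFF

insert into patient(pid,pname,address,dobirth,date_reg)
values('125','Jones','Durham',
       TO_DATE('01-01-1970','dd-mm-yyyy'),
       TO_DATE('01-02-2003','dd-mm-yyyy'));

ROLLBACK;   -- the row for pid '125' is discarded

insert into patient(pid,pname,address,dobirth,date_reg)
values('125','Jones','Durham',
       TO_DATE('01-01-1970','dd-mm-yyyy'),
       TO_DATE('01-02-2003','dd-mm-yyyy'));

COMMIT;     -- the row for pid '125' is now permanent
```

After the ROLLBACK a query for pid '125' should return no rows; after the COMMIT the row is visible to other sessions.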


Use of Commands

• Commit/Rollback
  – explicitly entered in:
    • SQL*Plus window interactively
    • PL/SQL code including procedures
  – implicitly entered
    • on normal EXIT from Oracle (commit 9i, rollback 8i Lite)
    • on abnormal exit from Oracle e.g. dismiss (rollback)
    • after each update command in SQL*Plus (commit)
      – when autocommit is ON (or IMMEDIATE)
    • after change to data definition, e.g. alter table (commit)
      – whatever autocommit setting


ACID in Oracle 1

• Atomicity
  – all commands in a transaction form a single logical group
• Consistency
  – integrity checks within transaction
• Isolation
  – data modified by transaction not visible to others until end of transaction


ACID in Oracle 2

• Durability
  – On Commit
    • changes are written to the transaction (redo) log file
      – this log file may be held in several locations
    • confirmation of log file writes ends transaction
    • modified data may be written to the database files later
  – If crash (e.g. of disk) after commit
    • restore last save of database file
    • run transaction log on database forward from:
      – point of last save to last transaction that committed


Partial Rollbacks

• Savepoints can be declared in SQL*Plus window or PL/SQL:
  SAVEPOINT label; (label is an identifier naming the savepoint)
• The command ROLLBACK TO label;
  – undoes changes back to the label in the program or window
• Many different savepoints can be declared
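
A partial rollback might look like this in a SQL*Plus window. The savepoint name after_first and the patient values are hypothetical, chosen only for illustration.

```sql
insert into patient(pid,pname,address,dobirth,date_reg)
values('126','Brown','Leeds',
       TO_DATE('05-05-1965','dd-mm-yyyy'),
       TO_DATE('10-03-2003','dd-mm-yyyy'));

SAVEPOINT after_first;

insert into patient(pid,pname,address,dobirth,date_reg)
values('127','Green','York',
       TO_DATE('06-06-1966','dd-mm-yyyy'),
       TO_DATE('11-03-2003','dd-mm-yyyy'));

ROLLBACK TO after_first;  -- undoes only the insert of pid '127'

COMMIT;                   -- makes the insert of pid '126' permanent
```

The transaction stays open after ROLLBACK TO, so work done before the savepoint can still be committed or rolled back as a whole.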


Locks

• Resources are held by locks
• In SQL lock management is done:
  – automatically with COMMIT and ROLLBACK
• Users and programmers can rely on defaults
• However, some knowledge is useful for:
  – tuning in production systems using LOCK command for efficiency
  – understanding problems in running concurrent transactions


Example Lock Table

Task      Table     Row   Lock type
CGNR1-1   Patient   8     W
CGNR1-2   Patient   1     R
CGSA1-1   Patient   7     W
CGSA1-2   Patient   1     R


Locking Modes

• R (read) or shared
  – any number of tasks can read the same data items concurrently
  – CGSA1-2 and CGNR1-2 both read patient 1
• W (write) or exclusive
  – when writing data need exclusive access
  – otherwise values can change while in use by others
  – So CGNR1-1 is only task that can access Patient 8
  – and CGSA1-1 is only task that can access Patient 7
• None -- no entry in table


Locking Granularity

• Can lock at level of:
  – table
  – page (unit of disk storage)
  – row
• Coarse locks:
  – for instance a whole table
  – give small lock tables (not that many tables)
  – much contention for resources (many users queue for table access)
• Fine locks:
  – for instance a single row
  – give large lock tables (many rows included)
  – less contention for resources (few users queue for row access)
• Oracle defaults give fine locking
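
The difference between coarse and fine locking can be tried out by hand in Oracle. The statements below are a sketch of explicit locking on the patient table; in normal use the defaults take these locks automatically.

```sql
-- Coarse: lock the whole patient table until the transaction ends
LOCK TABLE patient IN EXCLUSIVE MODE;

-- Fine: lock only the row(s) selected
select * from patient where pid = '1' for update;

-- Either kind of lock is released by the next COMMIT or ROLLBACK
COMMIT;
```

Running the first statement in one SQL*Plus window and an update of any patient row in a second window should leave the second window waiting until the first commits or rolls back.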


Data Structures, Algorithms and Database Programming

Semester 2/ Week 18

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


Transactions in Procedures

• On the surface – very easy.
• If everything goes well (at end):
  – COMMIT;
• If things go badly (at end):
  – ROLLBACK;
• Problem is controlling bad outcomes:
  – Handling exceptions
  – Giving useful feedback


Week 16 Example – with Commit

SQL> CREATE OR REPLACE PROCEDURE add_patient (pat_id in char, pat_name in char, pat_address in char,
     pat_dobirth in date, pat_regdate in date)
  2  AS
  3  BEGIN
  4  insert into patient(pid,pname,address,dobirth,date_reg)
     values(pat_id,pat_name,pat_address,pat_dobirth,pat_regdate);
  5  DBMS_OUTPUT.PUT_LINE ('Insert attempted');
  6  COMMIT;
  7  END;
  8  /

Procedure created.


Review of Assignment Procedure

• Asked to add a procedure to add vaccination data

• Generate:
  – one successful run
  – three unsuccessful runs

• Here we review the results of each run closely


ADD_VACC procedure

SQL> CREATE OR REPLACE PROCEDURE add_vacc (pat_id in char, vis_vdate in date, vis_act in number, vac_vacc in char)
  2  AS
  3  BEGIN
  4  insert into vaccinations(pid,vdate,action,vaccinated)
     values(pat_id,vis_vdate,vis_act,vac_vacc);
  5  DBMS_OUTPUT.PUT_LINE ('Insert attempted');
  6  END;
  7  /

Procedure created.


Successful Run 1a

SQL> execute add_vacc('2','16-dec-1999',3,'cholera');

PL/SQL procedure successfully completed.

SQL> select * from vaccinations
  2  where pid = '2' and action = 3;

PID    VDATE         ACTION VACCINATED
------ --------- ---------- --------------------
2      06-AUG-91          3 polio
2      16-DEC-99          3 cholera

SQL> commit;

Commit complete.


Successful Run 1b

• No error messages
• Message ‘PL/SQL procedure successfully completed’ is significant. It means:
  – Any exception raised during run has been properly handled
  – Does not necessarily mean data has been added successfully
• COMMIT should have been last line in procedure
• Here user has made decision to commit


Unsuccessful Run 1a

SQL> execute add_vacc('2','16-dec-1999',1,'cholera');

BEGIN add_vacc('2','16-dec-1999',1,'cholera'); END;

*

ERROR at line 1:

ORA-00001: unique constraint (CGNR1.PKVAC) violated

ORA-06512: at "CGNR1.ADD_VACC", line 4

ORA-06512: at line 1


Unsuccessful Run 1b

• Error message returned:
  – ORA-00001 indicates non-unique primary key value
  – Text message ‘unique constraint violated’ spells out nature of problem
  – CGNR1.PKVAC is name of constraint in CREATE TABLE definition for Vaccinations
    • constraint pkvac primary key (pid,vdate,action)
• Note no message about successful completion.
  – Does not necessarily mean unsuccessful addition
  – Means that exception raised in INSERT operation has not been handled within the procedure


Unsuccessful Run 2a

SQL> execute add_vacc('2','17-dec-1999',1,'cholera');

BEGIN add_vacc('2','17-dec-1999',1,'cholera'); END;

*

ERROR at line 1:

ORA-02291: integrity constraint (CGNR1.SYS_C0080698) violated - parent key not found

ORA-06512: at "CGNR1.ADD_VACC", line 4

ORA-06512: at line 1


Unsuccessful Run 2b

• Error message returned:
  – ORA-02291 indicates foreign key entered does not match a primary key value (in visits)
  – Text message ‘parent key not found’ spells out nature of problem
  – foreign key(pid,vdate) REFERENCES visits(pid,vdate);
  – CGNR1.SYS_C0080698 is the system-generated name of the constraint (no name was given in CREATE TABLE)
  – Named constraints give more information
• Again no message about successful completion
  – As exception not handled
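
One way to get a more informative message is to give the foreign key a name of its own. The statement below is a sketch: fkvacvis is a hypothetical name, SYS_C0080698 is the system-generated name from the run above (it will differ on another schema), and RENAME CONSTRAINT is assumed to be available in the Oracle release in use.

```sql
-- Replace the system-generated constraint name with a meaningful one
ALTER TABLE vaccinations
  RENAME CONSTRAINT SYS_C0080698 TO fkvacvis;
```

A later violation would then be expected to report the constraint as CGNR1.FKVACVIS, in the same way that the primary key violation reports CGNR1.PKVAC.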


Attempted unsuccessful run

SQL> execute add_vacc('2','16-dec-1999','4','cholera');

PL/SQL procedure successfully completed.

SQL> select * from vaccinations
  2  where pid = '2' and action = 4;

PID    VDATE         ACTION VACCINATED
------ --------- ---------- --------------------
2      16-DEC-99          4 cholera

The run worked because the char value ‘4’ entered for the numeric attribute action was automatically type cast to a number


Unsuccessful Run 3a

SQL> execute add_vacc('2','16-dec-1999','4',cholera);

BEGIN add_vacc('2','16-dec-1999','4',cholera); END;

*

ERROR at line 1:

ORA-06550: line 1, column 38:

PLS-00201: identifier 'CHOLERA' must be declared

ORA-06550: line 1, column 7:

PL/SQL: Statement ignored


Unsuccessful Run 3b

• Error message returned:
  – PLS-00201 indicates non-declared identifier (ORA-06550 gives the line and column of the error)
  – Parameter value CHOLERA is not in quotes
  – Therefore taken as variable
  – Not declared to system

• Again no message about successful completion
  – As the call failed to compile, the procedure was never run


Exception Handling PL/SQL

• Essential part of any program
• Particularly needed for updates
  – open-ended nature of user inputs
• But also needed for searches
  – e.g. may not find any matching data
• An exception is raised when an operation:
  – fails to perform normally
• A non-handled exception leads to program failure


Exceptions Raised

• With input particularly
  – Cannot specify all Oracle error codes in advance
  – Too many codes to specify
  – Some rule exceptions though can be emphasised
• Need specific exceptions
• And general (catch-all) exceptions


Complete PL/SQL procedure

CREATE OR REPLACE PROCEDURE proc_name (parameters) AS
[DECLARE] local_vars
BEGIN
  executable_code
EXCEPTION
  exception_code
END;
/


Explanation

• Upper case -- literal (as is)
• Lower case (to be substituted)
• [DECLARE] omitted in procedures but part of full definition for PL/SQL
• Executable_code
  – SQL commands, assignments, condition checking, text output, transactions
• Exception_code:
  – event handling, transactions
• proc_name is procedure name
• local_vars are variables declared for use within procedure (standard SQL types + Boolean types)


Example Procedure - part 1

CREATE OR REPLACE PROCEDURE add_patient (pat_id in char, pat_name in char, pat_address in char, pat_dobirth in date, pat_regdate in date) AS
  pid_too_high exception;
  PRAGMA EXCEPTION_INIT(pid_too_high,-20000);
BEGIN
  insert into patient(pid,pname,address,dobirth,date_reg)
  values(pat_id,pat_name,pat_address,pat_dobirth,pat_regdate);
  DBMS_OUTPUT.PUT_LINE ('Insert attempted');
  IF pat_id > '500' THEN
    RAISE pid_too_high;
  END IF;
  COMMIT;


Example Procedure - part 2

EXCEPTION

WHEN pid_too_high THEN

DBMS_OUTPUT.PUT_LINE ('pid too high');

ROLLBACK;

END;

/


Explanation 1

• pid_too_high exception;
  – declares pid_too_high as a user-defined exception
• PRAGMA EXCEPTION_INIT(pid_too_high,-20000);
  – instruction to compiler
  – associates exception pid_too_high with Oracle error number -20000
• IF pat_id > '500' THEN RAISE pid_too_high; END IF;
  – IF .. THEN … END IF construction
  – enforces a business rule that pid <= 500 by
    • raising exception pid_too_high when this state occurs
  – note pat_id is char, so the comparison is made character by character (e.g. '60' > '500')


Explanation 2

• EXCEPTION
  – opens exception handling part of procedure
• WHEN … THEN …;
  – defines actions when a particular exception occurs


Flow of Action 1

• If no exception raised
  – insert is performed
  – commit takes place
  – procedure terminates with ‘successful’ message
• If specific exception for business rule raised
  – insert is performed
  – exception pid_too_high is raised in IF code
  – execution of main code immediately finishes
  – code in EXCEPTION section after WHEN pid_too_high is executed
    • including rollback
  – procedure terminates with ‘successful’ message


Flow of Action 2

• If another exception raised (on insert e.g. primary key violation)
  – insert is not performed
  – exception is raised in procedure
  – execution of main code immediately finishes
  – As no further exception handlers are declared
    • procedure terminates with:
      – error reports
      – no ‘successful’ message
• Need catch-all exception handlers
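
A catch-all handler can be written with WHEN OTHERS, placed after any specific handlers. The sketch below applies it to the vaccination insert; the procedure name add_vacc_safe and the message texts are hypothetical, while DUP_VAL_ON_INDEX (the predefined exception for ORA-00001) and SQLERRM are standard PL/SQL.

```sql
CREATE OR REPLACE PROCEDURE add_vacc_safe (pat_id in char, vis_vdate in date,
                                           vis_act in number, vac_vacc in char)
AS
BEGIN
  insert into vaccinations(pid,vdate,action,vaccinated)
  values(pat_id,vis_vdate,vis_act,vac_vacc);
  COMMIT;
  DBMS_OUTPUT.PUT_LINE ('Insert committed');
EXCEPTION
  -- Specific handler: primary key already present
  WHEN dup_val_on_index THEN
    DBMS_OUTPUT.PUT_LINE ('Vaccination already recorded for this visit and action');
    ROLLBACK;
  -- Catch-all handler: any other error, reported with Oracle's own message
  WHEN OTHERS THEN
    DBMS_OUTPUT.PUT_LINE ('Insert failed: ' || SQLERRM);
    ROLLBACK;
END;
/
```

Because every exception is now handled, each run ends with ‘PL/SQL procedure successfully completed’, so the DBMS_OUTPUT text becomes the only indication of whether the row was actually added.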


Data Structures, Algorithms and Database Programming

Semester 2/ Week 19

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


Selects in Procedures 1

SQL> CREATE OR REPLACE PROCEDURE sel_patient AS
  2  BEGIN
  3  select * from patient;
  4  END;
  5  /

Warning: Procedure created with compilation errors.

SQL> sho err
……
3/1 PLS-00428: an INTO clause is expected in this SELECT statement


Selects in Procedures 2

• A procedure is no substitute for scripting here
  – Cannot put in simple selects
• SELECT is used in procedures to:
  – Fetch one row at a time
• An exception may be generated when we fetch:
  – no rows
  – multiple rows
• A cursor construction is used to handle multiple rows in an orderly fashion


Selects in Procedures 3

• SELECT attribute_list INTO variable_list
  – is the basic format
• Lists can be singular or multiple
• Multiple entries are comma delimited
• Attribute 1 goes into variable 1, 2 into 2, …
  – Variables are declared in Declare section
    • Must be of compatible type to that in CREATE TABLE …
• Retrieval of at most one row is guaranteed
  – When WHERE clause searches only on primary key or alternate key
  – No need for cursor here
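
A two-attribute version of SELECT ... INTO could look like the sketch below. The procedure name sel_patient_addr and the variable names are hypothetical; %TYPE is used so that each variable takes the type of the matching patient column.

```sql
CREATE OR REPLACE PROCEDURE sel_patient_addr (pat_id in char)
AS
  -- Variables typed from the table columns they will receive
  pname_var   patient.pname%TYPE;
  address_var patient.address%TYPE;
BEGIN
  -- pname goes into pname_var, address into address_var
  select pname, address
    into pname_var, address_var
    from patient
   where pid = pat_id;
  DBMS_OUTPUT.PUT_LINE (pname_var || ' ' || address_var);
END;
/
```

The search is on the primary key pid, so no more than one row can come back; a pid that does not exist would still raise an unhandled exception in this sketch.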


SELECT – Single Row –Partial Attributes

SQL> CREATE OR REPLACE PROCEDURE sel_patient
  2  (pat_id in char)
  3  AS
  4  pname_var char(20);
  5  BEGIN
  6  select pname into pname_var
  7  from patient
  8  where pid = pat_id;
  9  DBMS_OUTPUT.PUT_LINE (pname_var);
 10  END;
 11  /

Procedure created.


Explanation

• Input is pid
• Local variable pname_var is declared of same type
  – as in CREATE TABLE for patient
• Attribute value pname is passed into:
  – Variable pname_var
  – For the one row where pid = pat_id (e.g. ‘1’)

• The value of pname_var is then displayed


Run of Single Record Procedure

SQL> set serveroutput on

SQL> execute sel_patient('1');

Fred

PL/SQL procedure successfully completed.

• Above gives patient name ‘Fred’ for patient with id ‘1’


Automatic variable typing

• In declarations

pname_var patient.pname%TYPE;

• Gives pname_var same type as pname in patient

• Good practice:
  – ensures types of table attributes and procedure variables are exactly the same


Exceptions – no rows found

• Exception which needs handling is:
  – when no rows are found
• PL/SQL provides a pre-defined exception:
  – NO_DATA_FOUND
  – Can test with WHEN clause in EXCEPTION part of procedure
• To avoid procedure error at run time:
  – Include this exception handler
  – Or use an equivalent technique (cursor attributes)


Example – Single Row Retrieval with Exception

SQL> CREATE OR REPLACE PROCEDURE sel_patient
  2  (pat_id in char)
  3  AS
  4  pname_var patient.pname%TYPE;
  5  BEGIN
  6  select pname into pname_var
  7  from patient
  8  where pid = pat_id;
  9  DBMS_OUTPUT.PUT_LINE (pname_var);
 10  EXCEPTION
 11  WHEN no_data_found THEN
 12  DBMS_OUTPUT.PUT_LINE ('pid does not exist');
 13  END;
 14  /

Procedure created.


Run with exception

SQL> execute sel_patient('77');
pid does not exist
PL/SQL procedure successfully completed.

• Error message comes from exception
• Exception handled so successful completion
  – Even though nothing useful achieved (no pid ’77’)


Retrieval of Complete Row

• Declare variable (instead of pname_var):
  pat_row patient%ROWTYPE;
• Pat_row is a rowtype
  – Holds one row of patient data
  – Types as in patient table
  – Refer to columns by pat_row.column
    • e.g. pat_row.pname addresses:
      – Column pname in row pat_row
• Use concatenation operator || to join multiple items in output


Revised Procedure with Rowtype

SQL> CREATE OR REPLACE PROCEDURE sel_patient2
  2  (pat_id in char) AS
  3  pat_row patient%ROWTYPE;
  4  BEGIN
  5  select * into pat_row from patient
  6  where pid = pat_id;
  7  DBMS_OUTPUT.PUT_LINE ('Name is:' || pat_row.pname
  8  || 'Address is:' || pat_row.address);
  9  EXCEPTION
 10  WHEN no_data_found THEN
 11  DBMS_OUTPUT.PUT_LINE ('pid does not exist');
 12  END;
 13  /


Execution of Procedure

SQL> execute sel_patient2('1');

Name is:Fred Address is:Newcastle

PL/SQL procedure successfully completed.


Selections of Multiple Rows

• PL/SQL deals with one row at a time

• If SELECT potentially retrieves more than one row:
  – Procedure still compiles
  – Will work with retrieval of 1 row, or of 0 rows if NO_DATA_FOUND is handled
  – Will fail with more than 1 row

• Consider retrieval on patient name


Procedure for Retrieval on Name

SQL> CREATE OR REPLACE PROCEDURE sel_patient3
  2  (pat_name in char) AS
  3  pat_row patient%ROWTYPE;
  4  BEGIN
  5  select * into pat_row from patient
  6  where pname = pat_name;
  7  DBMS_OUTPUT.PUT_LINE ('Id is:' || pat_row.pid
  8  || 'Address is:' || pat_row.address);
  9  EXCEPTION
 10  WHEN no_data_found THEN
 11  DBMS_OUTPUT.PUT_LINE ('pname does not exist');
 12  END;
 13  /


Execution of Procedure

SQL> execute sel_patient3('Fred');
Id is:1 Address is:Newcastle
PL/SQL procedure successfully completed.
*********************************
SQL> execute sel_patient3('smith');
BEGIN sel_patient3('smith'); END;
*
ERROR at line 1:
ORA-01422: exact fetch returns more than requested number of rows
ORA-06512: at "CGNR1.SEL_PATIENT3", line 5
ORA-06512: at line 1

‘Fred’ appears once in patient; ‘smith’ appears twice (in my current data). Can use predefined exception too_many_rows to avoid error.
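
Adding the predefined exception TOO_MANY_ROWS to the handler section might look like this. It is a sketch only: the procedure name sel_patient3b and the message text are hypothetical.

```sql
CREATE OR REPLACE PROCEDURE sel_patient3b (pat_name in char)
AS
  pat_row patient%ROWTYPE;
BEGIN
  select * into pat_row from patient
   where pname = pat_name;
  DBMS_OUTPUT.PUT_LINE ('Id is:' || pat_row.pid
    || 'Address is:' || pat_row.address);
EXCEPTION
  WHEN no_data_found THEN
    DBMS_OUTPUT.PUT_LINE ('pname does not exist');
  -- Raised when the SELECT ... INTO matches more than one row
  WHEN too_many_rows THEN
    DBMS_OUTPUT.PUT_LINE ('more than one patient has this name');
END;
/
```

This avoids the run-time error for ‘smith’ but still shows none of the matching rows, which is the gap a cursor fills.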


Cursors

• Cannot rely on luck with searches which may retrieve multiple rows

• Declare cursor (before BEGIN) as the select statement

• Have in executable part:
  – Open cursor
  – Process set, row by row, until exit
  – Close cursor

• Can have multiple cursors


Cursor Declaration

CURSOR p IS

select * from patient

where pname = pat_name;

• The variable p addresses
  – the set defined by the SELECT statement

• No INTO clause is needed here


Cursor Executable

OPEN p;

LOOP

FETCH p INTO pat_row;

EXIT WHEN p%NOTFOUND;

DBMS_OUTPUT.PUT_LINE ('Id is:' || pat_row.pid

|| 'Address is:' || pat_row.address);

END LOOP;

CLOSE p;


Explanation

• OPEN p
  – Retrieves set of rows satisfying select statement
  – Sets pointer to 1st row
• LOOP
  – Start of instructions for processing a row
• FETCH
  – Transfers data from current row to variables
  – Sets pointer to next row
• EXIT WHEN p%NOTFOUND
  – Exits loop when no row was transferred in last fetch
• END LOOP
  – Ends processing of current row; returns to LOOP
• CLOSE p
  – Closes cursor and releases resources


Processing of Data

• Between FETCH and END LOOP
  – Can do any processing required for application
    • Statistical calculations
    • Re-packaging of data
    • Complex reports
    • Transfers to other tables
    • Integrity checks
    • Amalgamations of data from other cursors
• Exception handling is through cursor attribute %notfound
  – not in SELECT statement


Complete Procedure

CREATE OR REPLACE PROCEDURE sel_patient4 (pat_name in char) AS
  pat_row patient%ROWTYPE;
  CURSOR p IS
    select * from patient
    where pname = pat_name;
BEGIN
  OPEN p;
  LOOP
    FETCH p INTO pat_row;
    EXIT WHEN p%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE ('Id is:' || pat_row.pid
      || 'Address is:' || pat_row.address);
  END LOOP;
  CLOSE p;
END;
/


Execution

SQL> execute sel_patient4('smith');
Id is:42 Address is:grimsby
Id is:43 Address is:grimsby

PL/SQL procedure successfully completed.

SQL> execute sel_patient4('Fred');
Id is:1 Address is:Newcastle

PL/SQL procedure successfully completed.

SQL> execute sel_patient4('Nigel');

PL/SQL procedure successfully completed.
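
The run for ‘Nigel’ completes without printing anything, so the user cannot tell whether the search was made. One possible refinement, sketched below, reads the cursor attribute %ROWCOUNT after the loop; the procedure name sel_patient5 and the message text are hypothetical.

```sql
CREATE OR REPLACE PROCEDURE sel_patient5 (pat_name in char)
AS
  pat_row patient%ROWTYPE;
  CURSOR p IS
    select * from patient
     where pname = pat_name;
BEGIN
  OPEN p;
  LOOP
    FETCH p INTO pat_row;
    EXIT WHEN p%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE ('Id is:' || pat_row.pid
      || 'Address is:' || pat_row.address);
  END LOOP;
  -- %ROWCOUNT is the number of rows fetched so far; read it before CLOSE
  IF p%ROWCOUNT = 0 THEN
    DBMS_OUTPUT.PUT_LINE ('no patient with this name');
  END IF;
  CLOSE p;
END;
/
```

The attribute must be read while the cursor is still open, which is why the test comes before CLOSE p.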


Useful Reference Book

• Oracle 9i: PL/SQL Programming
  – Develop Powerful PL/SQL Applications

• by Scott Urman

• Oracle Press

• McGraw-Hill (2002)


Data Structures, Algorithms and Database Programming

Semester 2/ Week 20

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


PL/SQL Review

• From perspective of assignment 5

• Plus previous exercises


Exercise week 19 - Declare Section

• CREATE OR REPLACE PROCEDURE ...
  – age number;
  – mondiff number;
  – daydiff number;
  – action_too_high exception;
  – vdate_too_early exception;
  – pragma ...
• Local variables for use within procedure
  – age, mondiff, daydiff for age calculations
• Exceptions to be RAISEd during run if business rule broken
• Pragma for compiler instructions (not important to logic)


Executable - Age Calculation

• select * INTO pat_row from patient where pname = pat_name;
  – places a row from the table into the PL/SQL record variable pat_row
  – if no row is found, the system raises an exception (NO_DATA_FOUND)

age := extract(year from sysdate) - extract(year from pat_row.dobirth);
mondiff := extract(month from sysdate) - extract(month from pat_row.dobirth);
daydiff := extract(day from sysdate) - extract(day from pat_row.dobirth);

IF mondiff < 0 THEN age := age - 1; END IF;

IF mondiff = 0 AND daydiff < 0 THEN age := age - 1; END IF;

• Note use of SQL predefined function extract
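
The age calculation on this slide can be checked outside the database. The following is a minimal sketch in Python rather than PL/SQL; the function name `age_on` and the dates used to try it out are illustrative assumptions, not part of the exercise.

```python
from datetime import date

def age_on(dobirth: date, today: date) -> int:
    """Age in whole years, following the same steps as the PL/SQL on the slide."""
    age = today.year - dobirth.year
    mondiff = today.month - dobirth.month
    daydiff = today.day - dobirth.day
    # Birthday month has not been reached yet this year
    if mondiff < 0:
        age -= 1
    # In the birthday month, but before the birthday itself
    if mondiff == 0 and daydiff < 0:
        age -= 1
    return age
```

For instance, a patient born on 20 March 1954 is still 49 on 19 March 2004 and turns 50 the next day, so only the second date would qualify for an `age >= 50` test.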


Executable - Transfer of Data

IF age >= 50 THEN
  DBMS_OUTPUT.PUT_LINE('Inserting pid: ' || pat_row.pid);
  INSERT INTO patover50(pid, pname, address, age)
  VALUES (pat_row.pid, pat_row.pname, pat_row.address, age);
END IF;

• If PL/SQL variable age is 50 or over (age >= 50) then:
  – output message
  – insert into table
  – inserted values are
    • pat_row variables from patient
    • PL/SQL variable age


Assignment 5

• Assignment 5: CM503/CM517
• Set: end week 19 (Friday, 19th March 2004)
• Assessed: in seminars during week 21 (Monday, 29th March - Thursday, 1st April 2004)
• The assignment extends work done in weeks 18 and 19. The solutions for these exercises are on Blackboard.

• The client now wishes to revise the add vaccination procedure (call it say add_vacc2) so that it does the following in total:


Business Rules

• To raise exceptions when rules broken:
1. No more than two vaccinations ... per day.
2. If over 75 in age, no more than one vaccination … per day.
3. The vaccination date >= 1st January 2003.
4. The combination of cholera and typhoid on same day ... is not permitted.
5. A vaccination which is within safe period ... shall not be given.


Exceptions

• Exception:
  – “We have a problem”
  – Fatal error for procedure
    • Need to get out
    • Give useful info to user
  – In PL/SQL never attempt to recover position
• Rollback
  – undo any changes made
  – release resources


Strategy

• Insert data first
• Then look at consistency (against rules)
• So add vaccination data
  – Then look at business rules
• For instance:
  – cannot assess number of actions
  – or see whether both cholera and typhoid given
  – until new data is added
• If new data breaks rules, then raise exception and undo changes (rollback)
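
The insert-first, check-afterwards strategy can be sketched as follows. This is a hedged illustration only: it uses Python's built-in `sqlite3` module in place of Oracle PL/SQL, and the table layout, the exception class `TooManyVaccinations`, the vaccine names and the reading of rule 1 as "at most two vaccinations per patient per day" are all assumptions made for the example.

```python
import sqlite3

class TooManyVaccinations(Exception):
    """Hypothetical business-rule exception: more than two vaccinations in a day."""

def add_vacc(conn, pid, vdate, action, vaccine):
    """Insert the new row first, then test the rule; roll back if it is broken."""
    try:
        conn.execute(
            "INSERT INTO vaccinations(pid, vdate, action, vaccinated) "
            "VALUES (?, ?, ?, ?)",
            (pid, vdate, action, vaccine),
        )
        # The count can only be assessed once the new row is present
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM vaccinations WHERE pid = ? AND vdate = ?",
            (pid, vdate),
        ).fetchone()
        if n > 2:
            raise TooManyVaccinations(f"pid {pid}: {n} vaccinations on {vdate}")
        conn.commit()      # data satisfies the rule, so make it permanent
    except Exception:
        conn.rollback()    # undo the provisional insert
        raise

def demo():
    """Two valid inserts followed by one that breaks the rule; returns the row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vaccinations("
        "pid INTEGER, vdate TEXT, action INTEGER, vaccinated TEXT)"
    )
    conn.commit()
    add_vacc(conn, 1, "2004-03-19", 1, "vaccine_a")
    add_vacc(conn, 1, "2004-03-19", 2, "vaccine_b")
    try:
        add_vacc(conn, 1, "2004-03-19", 3, "vaccine_c")
    except TooManyVaccinations:
        pass               # the third insert was rolled back
    (n,) = conn.execute("SELECT COUNT(*) FROM vaccinations").fetchone()
    return n
```

Calling `demo()` leaves two rows in the table: the third row is visible to the rule check while the transaction is open, and disappears again on rollback.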


Types of Exception

• Business Rules
  – Typically rules not specified in CREATE TABLE
    • may vary slightly from procedure to procedure
  – Determined by inspecting data when provisionally added
  – RAISE exception by code in procedure when rule broken
  – Give error message to user
  – Rollback (Undo changes)
• Table (General) Constraints
  – Specified in CREATE TABLE
  – Exception is raised by system automatically
  – Can handle with WHEN OTHERS …
  – Display error code (SQLCODE) and Rollback


Parameters

• “Values in/out to/from the outside world”
• Typed as:
  – IN (input), OUT (output), IN OUT (both)
  – char, number, date (broad-brush types)
• Input as for add_vacc:
  – patient id, visit date, action number, vaccine given.

• So procedure always runs with 4 input values


Data Transfers

• Common to write verified values to variety of tables (logs, checks, safety)
• For each vaccination given (i.e. each validated treatment):
  a. update the vaccinations table
  b. insert into the table VACC_RECORD (which you should create) the following:
     pid, pname, address, age, vdate, action, vaccinated, expiry_date
  – 1st, 5th, 6th, 7th given as input parameters


Remaining Values?

• 2nd, 3rd from look up in Patient table
• 4th by calculation on dobirth in Patient table
• 8th by calculation on valid_for
  – could use SQL function Add_Months(date,lasting_year*12)
• Need to retrieve patient data for supplied pid
  – Calculate age for pid
• Need to retrieve information on vaccine in valid_for table


Implementing Business Rules

• Need to write SQL statement to determine whether rule holds or fails

• Often declared as CURSOR (e.g. c, d, e, or descriptive name)
  – Could have one cursor per rule
    • but may have multiple rules on one cursor
  – Then OPEN a cursor (after INSERT of new data made):
    • FETCH data
    • Look at cursor attributes


Cursor Attributes

• c%FOUND
  – means last FETCH from cursor c successful
• c%NOTFOUND
  – means last FETCH from cursor c unsuccessful
• c%ROWCOUNT
  – gives running total of number of rows retrieved in FETCHes so far


Testing of Cursor Attributes

• EXIT WHEN c%NOTFOUND;
  – terminates current loop when rows finished
• IF d%FOUND THEN RAISE vacc_already; END IF;
  – if a vaccination within the safe period is found, raise exception
• IF e%ROWCOUNT > 12 THEN RAISE too_many_modules; END IF;
  – on 13th fetch of module row, raise exception
  – control passes immediately to the EXCEPTION part; too_many_modules will be declared of exception type, have a pragma and an exception handler


Basic Procedure Structure

• CREATE … name procedure, parameters .. AS

• [DECLARE] … PL/SQL variables, exception variables, pragmas, cursors

• BEGIN … general messages, INSERTs, deriving data, testing of cursors against rules, commit (if data satisfies rules)

• EXCEPTION … handle business rule problems and general constraint violations, rollback … END


Oracle’s PL/SQL Approach

• Fairly typical for relational databases
• Previous slide shows general structure
• Can use experience here in:
  – other SQL systems (procedure is standard)
  – scripting systems (e.g. PHP/Oracle or PHP/MySQL)
• Useful for placements and final-year projects
• In JDBC and other Java-based embedded approaches:
  – the PL/SQL type system would be replaced by the Java type system


Data Structures, Algorithms and Database Programming

Semester 2/ Week 21

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


Database Design

• After producing logical design with elegant maintainable structures:

• Need to do physical design to make it run fast.
• Performance is often more important in database applications than in more general information system design:
  – Emphasis on number of transactions per second (throughput)


Database Design Methodologies

• Produce default storage schema
• May be adequate for small applications
• For large applications, much further tuning required
• Physical design is the technique
• Concepts: memory (main/disk), target disk architecture, blocks, access methods, indexing, clustering.


Aims of Physical Design

• Fast retrieval – usually taken as <= 5 disk accesses
  – Since disk access is very long compared to other access times, number of disk accesses is often used as indicator of performance
• Fast placement – within 5 disk accesses
  – Insertion of data, may be in middle of file not at end
  – Deleting data, actual removal or tombstone
  – Updating data, including primary key and other data


Retrieval/Placement

• Distinguish between actions involving primary and secondary keys

• Primary key is that determined by normalisation
  – May be single or multiple attributes
  – Only one per table

• Secondary keys
  – Again may be single or multiple attributes
  – Many per table
  – Include attributes other than the primary key

• Complications such as candidate keys are omitted in this part of the course


Access Method

• An access method is the software responsible for storage and retrieval of data on disk

• Handles page I/O between disk and main memory
  – A page is a unit of storage on disk
  – Pages may be blocked so that many are retrieved in a single disk access

• Many different access methods exist
  – Each has a particular technique for relating a primary key value to a page number


Processing of Data

• All processing by software is done in main memory

• Blocks of pages are moved
  – from disk to main memory for retrieval by user
  – from main memory to disk for storage by user

• Access method drives the retrieval/storage process


Cost Model

• Identify cost of each retrieval/storage operation
• Access time to disk to read/write page = D =
  seek time (time to move head to required cylinder)
  + rotational delay (time to rotate disk once the head is over the required track)
  + transfer time (time to transfer data from disk to main memory)
• Typical value for D = 15 milliseconds (msecs) or $15 \times 10^{-3}$ secs


Other times:

• C = average time to process a record in main memory = 100 nanoseconds (nsecs) = $100 \times 10^{-9}$ secs = $10^{-7}$ secs.

• R = number of records/page

• B = number of pages in file

• Note that D > C by roughly $10^{5}$ times


Access Method I: the Heap

• Records (tuples) are held on file:
  – in no particular order
  – with no indexing
  – that is in a ‘heap’ – unix default file type
• Strategy:
  – Insertions usually at end
  – Deletions are marked by tombstones
  – Searching is exhaustive from start to finish until required record found


The Heap – Cost Model 1

• Cost of complete scan: B(D+RC)
  – For each page, one disk access D and processing of R records taking C each.
  – If R=1000, B=1000 (file contains 1,000,000 records)
  – Then cost $= 1000(0.015 + (1000 \times 10^{-7})) = 1000(0.0150 + 0.0001) = 1000(0.0151) = 15.1$ secs

• Cost for finding particular record: B(D+RC)/2 (scan half file on average) = 7.55 secs

• Cost for finding a range of records e.g. student names beginning with sm: B(D+RC) (must search whole file) = 15.1 secs


The Heap – Cost Model 2

• Insertion: 2D + C
  – Fetch last page (D), process record (C), write last page back again (D).
  – Assumes:
    • all insertions at end
    • system can fetch last page in one disk access

• Cost $= (2 \times 0.015) + 10^{-7} \approx 0.030$ secs


The Heap – Cost Model 3

• Deletions: B(D+RC)/2 + C + D
  – Find particular record (scan half file - B(D+RC)/2), process record on page (C), write page back (D).
  – Record will be flagged as deleted (tombstone)
  – Record will still occupy space
  – If reclaim space, need potentially to read/write many more pages

• Cost $= 7.550 + 10^{-7} + 0.015 \approx 7.565$ secs
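
The three heap cost slides can be reproduced with a few lines of arithmetic. This is a minimal sketch in Python using the symbols D, C, R and B from the cost model; the function names are illustrative assumptions.

```python
D = 0.015   # disk access time per page, in seconds
C = 1e-7    # time to process one record in main memory, in seconds

def heap_scan(B, R):
    """Complete scan: every page is read and every record processed."""
    return B * (D + R * C)

def heap_search(B, R):
    """Find one record: on average half the file is scanned."""
    return heap_scan(B, R) / 2

def heap_insert():
    """Fetch last page, add the record in memory, write the page back."""
    return 2 * D + C

def heap_delete(B, R):
    """Find the record, mark it in memory, write the page back."""
    return heap_search(B, R) + C + D
```

With R = 1000 and B = 1000, `heap_scan` gives about 15.1 secs, `heap_search` about 7.55 secs, `heap_insert` about 0.030 secs and `heap_delete` about 7.565 secs, in line with the slide figures.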


Pros and Cons of Heaps

• Pros:
  – Cost effective where many records processed in a single scan (can process 1,000,000 records in 15.1 secs)
  – Simple access method to write and maintain
• Cons:
  – Very expensive for searching for single records in large files (1 record could take 15.1 secs to find)
  – Expensive for operations like sorting as no inherent order


Usage of Heaps

• Where much of the file is to be processed:
  – Batch mode (collect search and update requests into batches)
  – Reports
  – Statistical summaries
  – Program source files

• Files which occupy a small number of pages


Data Structures, Algorithms and Database Programming

Semester 2/ Week 22

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


Hashing

• One of the big two Access Methods
• Very fast potentially
  – One disk access only in ideal situation
• Used in many database and more general information systems:
  – where speed is vital

• Many variants to cope with certain problems


Meaning of Hash

• Definition 3: to cut into many small pieces; mince (often followed by "up").
• Example: He chopped up some garlic.
• Synonyms: dice, mince, hash
• Similar words: chip, cut up, carve, crumble, cube, divide
  – From http://www.wordsmyth.net/live/home.php?content=wotw/2001.0521/wotw_esl


Hash Function

• Takes key value of record to be stored

• Applies some function (often including a chop) delivering an integer

• This integer is a page number on the disk. So
  – input is key
  – output is a page number


Simple Example

• Have:
  – B=10 (ten pages for disk file)
  – R=2,000 (2,000 records/page)
  – Keys {S12, S27, S30, S42}

• Apply function ‘chop’ to keys giving:
  – {12, 27, 30, 42} so that initial letter is discarded


Simple Example

• Then take remainder of dividing chopped key by 10.

• Why divide?
  – Gives integer remainder

• Why 10?
  – Output numbers from 0 … 9
  – 10 possible outputs corresponds with 10 pages for storage

• In this case, numbers returned are:
  – {2, 7, 0, 2}
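
The chop-then-remainder steps can be sketched in a few lines. This is an illustrative Python version only; the names `chop` and `hash_page` and the dictionary used to stand in for the disk file are assumptions for the example.

```python
def chop(key: str) -> int:
    """Discard the initial letter and read the rest of the key as an integer."""
    return int(key[1:])

def hash_page(key: str, num_pages: int = 10) -> int:
    """Page number = remainder of the chopped key divided by the number of pages."""
    return chop(key) % num_pages

keys = ["S12", "S27", "S30", "S42"]
pages = [hash_page(k) for k in keys]

# Place each key on its page; the dict is a stand-in for the pages of the disk file
table = {p: [] for p in range(10)}
for k in keys:
    table[hash_page(k)].append(k)
```

Here `pages` comes out as `[2, 7, 0, 2]`, so S12 and S42 share page 2 while S27 and S30 land on pages 7 and 0.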


Disk File: hash table

Page    Records (only keys shown)
9
8
7       S27
6
5
4
3
2       S12, S42
1
0       S30


Retrieval

• Say user looks for record with key S42
• Apply hash function to key:
  – Discard initial letter, divide by 10, take remainder
  – Gives 2
• Transfer page 2 to buffer in main memory
• Search buffer looking for record with key S42


Cost Model

• Retrieval of a particular record:
  D + 0.5RC (one disk access + time taken to search half a page for the required record)
  $= 0.015 + (0.5 \times 2000 \times 10^{-7}) = 0.0151$ secs (very fast)

• Insertion of a record:
  Fetch page (D) + Modify in main memory (C) + Write back to disk (D)
  $= 0.015 + 10^{-7} + 0.015 \approx 0.0300$ secs

• Deletions same as insertions


Effectiveness

• Looks very good:
  – Searches in one disk access
  – Insertions and deletions in two disk accesses
  – So:
    • Searching faster than heap and sorted
    • Insertions and deletions similar to heap, much faster than sorted


Minor Problem

• Complete scan:
  – Normally do not fill pages, to leave space for new records to be inserted
  – 80% initial loading
  – So records occupy 1.25 times the number of pages needed if densely packed

$1.25B(D+RC) = 1.25 \times 10 \times (0.015 + 2000 \times 10^{-7}) \approx 0.190$ secs (against 0.152 secs if packed densely)
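
The hashing costs can be evaluated in the same way as the heap costs. This is a minimal sketch in Python with illustrative function names; the `load` parameter is an assumption standing for the initial loading fraction (0.8 for 80%).

```python
D = 0.015   # disk access time per page, in seconds
C = 1e-7    # time to process one record in main memory, in seconds

def hash_retrieve(R):
    """One disk access plus a search of half a page on average."""
    return D + 0.5 * R * C

def hash_insert():
    """Fetch page, modify it in memory, write it back (deletion costs the same)."""
    return D + C + D

def hash_scan(B, R, load=0.8):
    """Complete scan when pages are only partly filled: 1/load times B(D+RC)."""
    return (1 / load) * B * (D + R * C)
```

With B = 10 and R = 2000, retrieval is about 0.0151 secs and insertion about 0.030 secs. Evaluating the scan formula as written gives about 0.190 secs at 80% loading, against 0.152 secs when `load` is set to 1.0.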


Larger Problems

• Scan for groups of records say S31-S37 will be very slow

• Each record will be in different page, not in same page.

• So 7 disk accesses instead of 1 with sorted file (once page located holding S31-S37).


Larger Problems

• What happens if page becomes full?

• This could happen if
  – Hash function poorly chosen e.g. all keys end in 0 and hash function is the remainder of a number divided by 10
    • All records go in same page
  – Simply too many records for initial space allocated to file


Overflow Area

• Have extra pages in an overflow area to hold records
  – Mark overflowed pages in main disk area

• Retrieval now may take 2 disk accesses to search expected page and overflow page.

• If have overflow on overflow page, may take 3+ disk accesses for retrieval.

• Insertions may also be slow – collisions on already full pages.

• Performance degrades
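
How overflow pages add disk accesses can be seen in a small simulation. This is a hedged sketch, not a real access method: the class `StaticHashFile`, its chained overflow pages and the tiny page capacity of 2 are assumptions chosen to make the effect visible. The hash function is the chop-and-remainder one from the earlier example.

```python
class StaticHashFile:
    """Toy static hash file with a chain of overflow pages per home page."""

    def __init__(self, num_pages, page_capacity):
        self.num_pages = num_pages
        self.page_capacity = page_capacity
        self.pages = [[] for _ in range(num_pages)]
        self.overflow = {}   # home page number -> list of overflow pages

    def _home(self, key):
        # Discard the initial letter, then take the remainder
        return int(key[1:]) % self.num_pages

    def insert(self, key):
        p = self._home(key)
        if len(self.pages[p]) < self.page_capacity:
            self.pages[p].append(key)
            return
        # Home page is full: use the last overflow page, or start a new one
        chain = self.overflow.setdefault(p, [])
        if not chain or len(chain[-1]) >= self.page_capacity:
            chain.append([])
        chain[-1].append(key)

    def accesses_to_find(self, key):
        """Number of pages read before the key is found (or the chain ends)."""
        p = self._home(key)
        accesses = 1
        if key in self.pages[p]:
            return accesses
        for page in self.overflow.get(p, []):
            accesses += 1
            if key in page:
                return accesses
        return accesses

f = StaticHashFile(num_pages=10, page_capacity=2)
for k in ["S10", "S20", "S30", "S40", "S50"]:   # all keys end in 0
    f.insert(k)
```

All five keys hash to page 0, so S10 is found in one access, S30 needs two (home page plus first overflow page) and S50 needs three.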


At Intervals in Static Hashing

• The Data Base Administrator’s lost weekend

• He/she comes in

• Closes system down to external use

• Runs a utility expanding number of pages and re-hashing all records into the new space


Alternatives to Static Hashing

• Automatic adjustment – Dynamic Hashing

• Extendible Hashing
  – Have index (directory) to pages which adjusts to number of records in the system

• Linear Hashing
  – Have family of hash functions


Pros and Cons of Hashing

• Pros:
  – Very fast for searches on search key (may be 1 disk access)
  – Very fast for insertions and deletions (often 2 disk accesses)
  – No indexes to search/maintain (in most variants)
• Cons:
  – Slightly slower than sorted files for scan of complete file
  – Requires periodic off-line maintenance in static hashing as pages become full and collisions occur
  – Poor for range searches


Usage of Hashing

• Applications involving:
  – Searching (on key) in files of any size for single records – very fast
  – Insertions and deletions of single records

• So typical in On-line Transaction Processing (OLTP)


Data Structures, Algorithms and Database Programming

Semester 2/ Week 23

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


B-trees

• What does the B stand for?
• Balanced, Bushy or Bayer (apparently not clear)
• Balanced means distance from root to leaf node is same for all of tree
• Bushy is a gardening term meaning all strands of similar length
• Bayer is a person who wrote a seminal paper on them


B+-Trees

• There are some variants on B-trees.
• We deal here with B+-trees where all data entries are held in leaf nodes of tree
• The two terms are interchangeable for our purposes.
• B+-trees are dynamic index structures
• The tree automatically adjusts as the data changes


B+-tree

• A B+-tree is a:
  – Multiway tree (fan-out or branching factor >= 2, binary tree has fan-out 2) with
  – Efficient self-balancing operations

• Minimum node occupancy is 50%

• Typical node occupancy can be greater than this but initially keep it around 60%


B+-tree Diagram: index+sequence set

[Figure: B+-tree with d = 2. The index set (internal structure as in root) sits above the sequence set (the data entries).]


Structure

• Index set (tree, other than leaf pages) is sparse
  – Contains key-pointer pairs
  – Not all keys included

• Sequence set (leaf pages) is dense
  – All data entries included
  – Data held is key + non-key data
  – Pages are connected by doubly linked lists
  – Can navigate in either direction
  – Pages are usually in sequence of primary key
  – No index pointers to lower levels are held at this level


Parameters

• d = 2 says that the order of the tree is 2
• Each non-terminal node contains between d and 2d index entries (except root node 1…2d entries)
• Each leaf node contains between d and 2d data entries
• So tree shown can hold 2, 3 or 4 index values in each node
• How many index pointers?
• Format is always one more pointer than value. So 3, 4 or 5 pointers per node.


Capacity of Tree

• d = 2
  – One root node – can hold 2*d = 4 records
  – Next level – 5 pointers from root, each node holds maximum 4 records = 20 records in 5 nodes
  – Next level – 5 pointers from each of the 5 nodes above, each node maximum 4 records = 100 records in leaf nodes

• In practice will not pack as closely
• But d=2, 3-levels – potentially addresses 100 records
• If all held on disk, 3 disk accesses


Capacity of Tree

• High branching factors (fan-out) increase capacity dramatically (see seminar).

• So tree structure with high branching factor can be kept to a small number of levels

• Bushy trees (heavily pruned!) mean fewer disk accesses

• Root node at least will normally be held in main memory


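[Figure: Example of a B+-tree, order (d) = 2, showing a multi-level index above the data entries. Key values visible in the extracted figure: index 20, 40, 7, 13, 26, 33, 52, 65; data entries 1, 2, 3, 4, 7, 9, 13, 14.]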


Search Times - Single Record

• Always go to leaf node (data entries) to retrieve actual data.

• Root node in main memory
• Cost = (T-1)*(D+0.5RC) for single record
  – T = height of tree
  – (T-1) as root node is already in main memory

• If d=2, then R=4 (max); with T=3, cost $= (3-1) \times (0.015 + (0.5 \times 4 \times 10^{-7})) = 2 \times 0.0150002 = 0.0300004$ secs


Search Times - Whole File/Ranges

• Lowest cost is B(D+RC)
  – B is number of pages assuming data is packed to 100% capacity, D is time for disk access, R is number of records/page, C is cost for processing each record in memory

• If Sequence Set is packed at 60%, then cost is: (100/60) * B(D+RC)

• Ranges are held in proximity in Sequence Set -- search for these is fast
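
The two B+-tree search costs can be evaluated as follows. This is a minimal sketch in Python with illustrative function names; the `occupancy` parameter is an assumption standing for the packing fraction of the sequence set, and the values B = 1000, R = 1000 in the scan example are borrowed from the heap example rather than given for B+-trees.

```python
D = 0.015   # disk access time per page, in seconds
C = 1e-7    # time to process one record in main memory, in seconds

def btree_search(T, R):
    """Single record: T-1 disk accesses (root in memory), half a node searched each time."""
    return (T - 1) * (D + 0.5 * R * C)

def btree_scan(B, R, occupancy=0.6):
    """Whole-file scan of the sequence set when pages are only partly full."""
    return (1 / occupancy) * B * (D + R * C)
```

`btree_search(3, 4)` gives 0.0300004 secs, matching the d = 2 example. For the scan, `btree_scan(1000, 1000)` is about 25.2 secs at 60% packing, against 15.1 secs at 100%.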


Insertions

• First add into sequence set
  – Search for node
  – If space, add to node
    • Leave index untouched
  – If no space
    • Try re-distributing records in sibling nodes
      – Sibling node is adjacent node with same parent
    • Or split node and share entries amongst the two nodes
    • Will involve alteration to index
• Insertions tend to push index entries up the tree
• If splits all the way up and root node overflows, then height of tree T increases by one.


Deletions

• First delete from sequence set
  – Search for node
  – Delete record from node
  – If node still has at least d entries
    • Leave index untouched
  – If node has fewer than d entries
    • Try re-distributing records in sibling nodes
      – Sibling node is adjacent node with same parent
    • Or merge sibling nodes
    • Will involve alteration to index
• Deletions tend to push index entries down the tree
• If merges all the way up and root node underflows, then T decreases by one.


Usage of B+-trees

• General Purpose
• Single-record Searching
  – not as fast as hashing but acceptable with usually five or fewer disk accesses
• Processing ranges of key values
  – faster than hashing as records held in order of key in Sequence Set
• Robust as data changes
  – algorithms for inserts/deletes give automatic self-balancing of tree (no re-organisations)


Data Structures, Algorithms and Database Programming

Semester 2/ Week 24

Database Programming

Nick Rossiter/Emma-Jane Phillips-Tait


Revision

• First, remarks on B+-tree Exercises.

• Can think of B+-tree as generalised binary search tree

• Binary search trees are:
  – good memory structures
    • fast tree traversal
  – poor disk structures
    • if every pointer access involves a disk access


Binary Search Tree compared with B+-tree

• d = 1
• then root node has:
  – 1…2 data entries
  – 2…3 pointers
• Leaf node has 1…2 data entries, no pointers
• Intermediate node has 1…2 data entries, 2…3 pointers
• So binary search tree is special case of B+-tree with d=1 and properties in red above.


Maximum Capacity (records) of B+-trees

Order (d)   Fan out     Max records/node   Capacity at 5 levels
1           2…3         2                  2*3*3*3*3 = 162
2           3…5         4                  4*5*5*5*5 = 2500
10          11…21       20                 20*21*21*21*21 = c. $4 \times 10^{6}$
50          51…101      100                100*101*101*101*101 = c. $10^{10}$
200         201…401     400                400*401*401*401*401 = c. $10^{13}$
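
The capacity column follows one pattern: the maximum records per node (2d) multiplied by the maximum fan-out (2d+1) once for each level below the root. A minimal sketch in Python, with the function name `max_capacity` as an illustrative assumption:

```python
def max_capacity(d, levels=5):
    """Maximum records addressable by a B+-tree of order d with the given number of levels."""
    max_records_per_node = 2 * d
    max_fan_out = 2 * d + 1
    return max_records_per_node * max_fan_out ** (levels - 1)

capacities = {d: max_capacity(d) for d in (1, 2, 10, 50, 200)}
```

This gives 162 and 2500 for d = 1 and d = 2, 3,889,620 for d = 10, and roughly 1.04 × 10^10 and 1.03 × 10^13 for d = 50 and d = 200. With `levels=3` and d = 2 it returns 100, the figure used on the earlier capacity slide.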


Cost of Searching B+-trees (order = d)

• Number of disk accesses is number of levels (T) minus 1:
  – all data held in leaves of tree
  – top level (root) is in main memory

• Assume search half of each node on average to find a particular record (0.5RC)

• Cost = (T-1)*(D+0.5RC) where D = 0.015 secs, C = $10^{-7}$ secs, R = a number from d…2*d


Insertions

• Put record into sequence in leaf nodes
• If inserted node has <=2*d records, OK
• If inserted node has >2*d records:
  – first try to redistribute records between inserted node, its parent and immediate sibling
  – otherwise split inserted node into two nodes and pass intermediate record (key) up one level in tree


Balance of Course - 1

• SQL Scripting
  – bridging level 1 and level 2
  – reinforcing SQL knowledge
  – use of variables
  – pre-defined functions
  – Important area for:
    • prototyping applications
    • some production environments (web-based)


Balance of Course - 2

• Database Fundamentals
  – Transactions
  – Concurrency
  – Security
  – Procedures
  – Important for:
    • production multi-user systems


Balance of Course - 3

• SQL Procedures (PL/SQL)
  – Differences from scripting
  – Business rules
  – Constructions
  – Parameterised SQL statements
  – Exception handling
  – Important for:
    • handling business rules across an enterprise


Balance of Course - 4

• Access Methods
  – physical side: placement and retrieval of data
  – choice of algorithms
  – two main types
    • B+-trees
    • Hashing
  – Important for:
    • efficient access to data

Exam Paper

• Database paper counts 40%
  – database assignments 10%
  – Java paper 35%
  – Java assignments 15%

• Database paper is independent of 1st semester work
  – exam will be on 2nd semester material

• No previous database paper in this area

Type of exam

• Closed book
  – no materials can be carried in

• Two hours duration

• Four questions on paper

• Three questions should be attempted

Type of question

• 20 marks in total on each question
  – 8-10 marks typically for testing basic knowledge of a subject (definitions, simple derivations, material from lecture notes)
  – 10-12 marks typically for problem solving, i.e. taking a scenario and providing a solution in terms of code or algorithm.

Recommended Strategy

• Be familiar with the lecture notes
• Be familiar with existing exercises and their solutions (including assignments) on BB
• You should assume that the exam will test your understanding of the lecture notes and exercises
• Problems in the exam environment will generally be simpler than those in the assignment environment.

Problem Solutions

• Under desk-bound exam conditions it is not possible to test program code exhaustively.

• Also, the student does not have feedback from a compiler or run-time system.

• Ideal expectation is that code will:
  – provide basis of implementation
  – with feedback from live system, be readily modified to provide an acceptable deliverable.