
OCL3 Oracle 10g: SQL & PL/SQL
Session #10

Matthew P. Johnson
CISDD, CUNY
January, 2005

Matthew P. Johnson, OCL3, CISDD CUNY, June 2005


Agenda
  Security & web apps
  RegEx support in 10g
  Oracle & XML
  Data warehousing
  More on the PL/SQL labs
  Any more lab?


Review: Why security is hard
  It’s a “negative deliverable”
  It’s an asymmetric threat
  Tolstoy: “Happy families are all alike; every unhappy family is unhappy in its own way.”
  Analogs: “homeland”, jails, debugging, proofreading, Popperian science, fishing, MC algs
  So: fix biggest problems first


DB users have privileges
  SELECT: read access to all columns
  INSERT(col-name): can insert rows with non-default values in this column
  INSERT: can insert rows with non-default values in all columns
  DELETE
  REFERENCES(col-name): can define foreign keys that refer to (or other constraints that mention) this column
  TRIGGER: triggers can reference table
  EXECUTE: can run function/SP


Granting privileges (Oracle)
  One method of setting access levels
  Creator of object automatically gets all privileges to it
  Possible objects: tables, whole databases, stored functions/procedures, etc.
    <DB-name>.* - all tables in DB
  A privileged user can grant privileges to other users or groups

    GRANT privileges ON object TO user <WITH GRANT OPTION>

    GRANT SELECT ON mytable TO someone WITH GRANT OPTION;


Granting and revoking
  Privileged user has privileges
  Privileged-WGO (WITH GRANT OPTION) user can grant them, with or without GO
  Granter can revoke privileges or GO
  Revocation cascades by default
    To prevent, use RESTRICT (at end of cmd)
    If it would cascade, command fails
  Can change owner:

    ALTER TABLE my-tbl OWNER TO new-owner;


Granting and revoking
  What we giveth, we may taketh away
  mjohnson:
    GRANT SELECT, INSERT ON my-table TO george WITH GRANT OPTION;
    (effects?)
  george:
    GRANT SELECT ON my-table TO laura;
    (effects?)
  mjohnson:
    REVOKE SELECT ON my-table FROM laura;
    (effects?)


Role-based authorization
  In SQL-1999, privileges assigned with roles
  For example:
    Student role
    Instructor role
    Admin role
  Each role gets to do same (sorts of) things
  Privileges assigned by assigning role to users

    GRANT SELECT ON my-table TO employee;
    GRANT employee TO billg;


Passwords
  DBMS recognizes your privileges because it recognizes you
    how?
  Storing plaintext passwords in the DB is a bad idea


Hashed or digested passwords
  One-way hash function:
    1. Computing f(x) is easy;
    2. Computing f^-1(y) is hard/impossible;
    3. Finding some x2 s.t. f(x2) = f(x) (“collisions”) is hard/impossible
  Intuitively: seeing f(x) gives little (useful) info on x
    f(x) “looks random”
    PRNGs
  MD5, SHA-1
  RFID for cars: http://www.rfidanalysis.org/
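The three properties above are why a system can store f(password) instead of the password itself. The following is a minimal sketch in Python, not part of the original slides. It assumes a per-user random salt and swaps in PBKDF2-HMAC-SHA256 for the MD5/SHA-1 named on the slide, since collisions have since been found for both. The function names and the iteration count are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen only for illustration

def hash_password(password, salt=None):
    """Return (salt, digest); only these two values would be stored."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def check_password(password, salt, stored_digest):
    """Recompute f(x) for the candidate and compare; f is never inverted."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)

salt, stored = hash_password("secret")
print(check_password("secret", salt, stored))  # True
print(check_password("guess", salt, stored))   # False
```

Login then only needs the easy direction, computing f(x); an attacker who reads the table still faces the hard direction.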


Built-in accounts
  Many DBMSs (and OSs) have built-in demo accounts by default
    In some versions, must “opt out”
  MySQL: root/(blank) (closed on sales)
    http://lists.seifried.org/pipermail/security/2004-February/001782.html
  Oracle: scott/tiger (was open on sales last year)
  SQLServer: sa/(blank/null)
    http://support.microsoft.com/default.aspx?scid=kb;EN-US;313418


Query-related: Injection attacks
  Here’s a situation:
    Prompt for user/pass
    Do lookup:

      SELECT * FROM users WHERE user=u AND password=p;

    If found, user gets in
  test.user table in MySQL
    http://pages.stern.nyu.edu/~mjohnson/dbms/php/loginphp.txt
    http://pages.stern.nyu.edu/~mjohnson/dbms/php/login.php
  Apart from no hashing, is this safe?


Injection attacks
  We expect to get input of something like:
    user: mjohnson
    pass: secret

    SELECT * FROM users WHERE user = u AND password = p;

    SELECT * FROM users WHERE user = 'mjohnson' AND password = 'secret';


Injection attacks – MySQL/Perl/PHP
  Consider another input:
    user: ' OR 1=1 OR user = '
    pass: ' OR 1=1 OR pass = '

    SELECT * FROM users WHERE user = u AND password = p;

    SELECT * FROM users
    WHERE user = ''
    OR 1=1
    OR user = ''
    AND password = ''
    OR 1=1
    OR pass = '';

  http://pages.stern.nyu.edu/~mjohnson/dbms/php/login.php
  http://pages.stern.nyu.edu/~mjohnson/dbms/eg/injection.txt
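To watch the attack work end to end, here is a small sketch using Python's built-in sqlite3 module as a stand-in for the MySQL/PHP login page (an assumption; the course demo itself is not reproduced here). The table contents and the function name unsafe_login are hypothetical, and the injected password refers to the password column so that the query is valid against this toy table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('mjohnson', 'secret')")
conn.execute("INSERT INTO users VALUES ('your-boss', 'hunter2')")

def unsafe_login(u, p):
    # Vulnerable on purpose: user input is pasted straight into the SQL text.
    sql = "SELECT * FROM users WHERE user = '" + u + "' AND password = '" + p + "'"
    return conn.execute(sql).fetchall()

evil_user = "' OR 1=1 OR user = '"
evil_pass = "' OR 1=1 OR password = '"

print(unsafe_login("mjohnson", "secret"))       # the one matching row
print(unsafe_login("mjohnson", "wrong"))        # no rows
print(len(unsafe_login(evil_user, evil_pass)))  # every row in the table
```

Because the WHERE clause now contains OR 1=1, it is true for every row, so the "If found, user gets in" check passes without any valid credentials.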


Injection attacks – MySQL/Perl/PHP
  Consider this one:
    user: your-boss' OR 1=1 #
    pass: abc

    SELECT * FROM users WHERE user = u AND password = p;

    SELECT * FROM users
    WHERE user = 'your-boss'
    OR 1=1 #' AND password = 'abc';

  http://pages.stern.nyu.edu/~mjohnson/dbms/php/login.php


Injection attacks – MySQL/Perl/PHP
  Consider another input:
    user: your-boss
    pass: ' OR 1=1 OR pass = '

    SELECT * FROM users WHERE user = u AND password = p;

    SELECT * FROM users
    WHERE user = 'your-boss'
    AND password = ''
    OR 1=1
    OR pass = '';

  http://pages.stern.nyu.edu/~mjohnson/dbms/php/login.php


Multi-command inj. attacks (other DBs)
  Consider another input:
    user: '; DELETE FROM users WHERE user = 'abc'; SELECT * FROM users WHERE password = '
    pass: abc

    SELECT * FROM users WHERE user = u AND password = p;

    SELECT * FROM users WHERE user = '';
    DELETE FROM users WHERE user = 'abc';
    SELECT * FROM users WHERE password = '' AND password = 'abc';


Multi-command inj. attacks (other DBs)
  Consider another input:
    user: '; DROP TABLE users; SELECT * FROM users WHERE password = '
    pass: abc

    SELECT * FROM users WHERE user = u AND password = p;

    SELECT * FROM users WHERE user = '';
    DROP TABLE users;
    SELECT * FROM users WHERE password = '' AND password = 'abc';


Multi-command inj. attacks (other DBs)
  Consider another input:
    user: '; SHUTDOWN WITH NOWAIT; SELECT * FROM users WHERE password = '
    pass: abc

    SELECT * FROM users WHERE user = u AND password = p;

    SELECT * FROM users WHERE user = '';
    SHUTDOWN WITH NOWAIT;
    SELECT * FROM users WHERE password = '' AND password = 'abc';


Injection attacks – MySQL/Perl/PHP
  Consider another input:
    user: your-boss
    pass: ' OR 1=1 AND user = 'your-boss
  Delete your boss!

    DELETE FROM users WHERE user = u AND pass = p;

    DELETE FROM users
    WHERE user = 'your-boss'
    AND pass = ''
    OR 1=1
    AND user = 'your-boss';

  http://pages.stern.nyu.edu/~mjohnson/dbms/php/users.php
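Why this deletes exactly one user comes down to operator precedence: AND binds tighter than OR. The sketch below replays the attack against an in-memory SQLite table (a stand-in for the MySQL table behind users.php, which is an assumption); the rows and the function name unsafe_delete are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, pass TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("mjohnson", "secret"), ("your-boss", "hunter2")])

def unsafe_delete(u, p):
    # Vulnerable on purpose: builds the DELETE by string concatenation.
    sql = "DELETE FROM users WHERE user = '" + u + "' AND pass = '" + p + "'"
    conn.execute(sql)

# AND binds tighter than OR, so the WHERE clause parses as
#   (user = 'your-boss' AND pass = '') OR (1=1 AND user = 'your-boss')
unsafe_delete("your-boss", "' OR 1=1 AND user = 'your-boss")

remaining = [row[0] for row in conn.execute("SELECT user FROM users")]
print(remaining)  # ['mjohnson']
```

The second parenthesized group is true for the boss's row regardless of the password, so that row is deleted and the others survive.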


Injection attacks – MySQL/Perl/PHP
  Consider another input:
    user: ' OR 1=1 OR user = '
    pass: ' OR 1=1 OR user = '
  Delete everyone!

    DELETE FROM users WHERE user = u AND pass = p;

    DELETE FROM users
    WHERE user = ''
    OR 1=1
    OR user = ''
    AND pass = ''
    OR 1=1
    OR user = '';

  http://pages.stern.nyu.edu/~mjohnson/dbms/php/users.php


Preventing injection attacks
  Ultimate source of problem: quotes
  Soln 1: don’t allow quotes!
    Reject any entered data containing single quotes
    Q: Is this satisfactory?
      Does Amazon need to sell O’Reilly books?
  Soln 2: escape any single quotes
    Replace any ' with a '' or \'
    In Perl, use taint mode – won’t show
    In PHP, turn on magic_quotes_gpc flag in .htaccess
      show both PHP versions
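Soln 2 can be sketched in a few lines. This example is an illustration rather than the course's Perl or PHP code: it doubles every single quote before building the query and runs the result against an in-memory SQLite table. The names escape_single_quotes and escaped_login are hypothetical. Doubling quotes is enough for SQLite; other DBMSs may treat further characters (such as backslash) specially, which is one reason hand-rolled escaping is fragile.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('O''Reilly', 'secret')")

def escape_single_quotes(value):
    """Double each single quote, the SQL-standard way to write ' in a literal."""
    return value.replace("'", "''")

def escaped_login(u, p):
    sql = ("SELECT * FROM users WHERE user = '" + escape_single_quotes(u) +
           "' AND password = '" + escape_single_quotes(p) + "'")
    return conn.execute(sql).fetchall()

print(escaped_login("O'Reilly", "secret"))           # the O'Reilly row
print(escaped_login("' OR 1=1 OR user = '", "abc"))  # no rows
```

Unlike Soln 1, a legitimate name containing a quote still works, and the injected text is compared as data instead of being parsed as SQL.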


Preventing injection attacks
  Soln 3: use prepared, parameter-based queries
    Supported in JDBC, Perl DBI, PHP ext/mysqli
    http://pages.stern.nyu.edu/~mjohnson/dbms/perl/loginsafe.cgi
    http://pages.stern.nyu.edu/~mjohnson/dbms/perl/userssafe.cgi
  Very dangerous: using tainted data to run commands at the Unix command prompt
    Semi-colons, prime char, etc.
    Safest: define set of legal chars, not illegal ones
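A parameter-based query looks like this in Python's sqlite3 module, used here as a stand-in for the JDBC, Perl DBI, and mysqli interfaces named above (the linked loginsafe.cgi is not reproduced). The table contents and the function name safe_login are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("mjohnson", "secret"))

def safe_login(u, p):
    # The ? placeholders are bound by the driver; input is never parsed as SQL.
    sql = "SELECT * FROM users WHERE user = ? AND password = ?"
    return conn.execute(sql, (u, p)).fetchall()

print(safe_login("mjohnson", "secret"))                                  # one row
print(safe_login("' OR 1=1 OR user = '", "' OR 1=1 OR password = '"))    # no rows
```

The SQL text is fixed before any input arrives, so quotes in the input cannot change the structure of the statement.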


Preventing injection attacks
  When to do security checking for quotes, etc.?
  Natural choice: in client-side data validation
  But not enough!
    As we saw earlier: can submit GET and POST params manually
  Must do security checking on server
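A server-side check that defines the set of legal characters, rather than trying to list the illegal ones, could look like the sketch below. The username policy (letters, digits, underscore, hyphen, at most 32 characters) and the function name are assumptions made for illustration; the check would run in the server code that receives the GET or POST parameters.

```python
import re

# Assumed policy: 1 to 32 characters drawn from an explicit whitelist.
LEGAL_USERNAME = re.compile(r"[A-Za-z0-9_-]{1,32}")

def is_legal_username(value):
    """Accept only strings made entirely of whitelisted characters."""
    return LEGAL_USERNAME.fullmatch(value) is not None

print(is_legal_username("mjohnson"))                  # True
print(is_legal_username("your-boss' OR 1=1 #"))       # False
print(is_legal_username("'; DROP TABLE users; --"))   # False
```

A whitelist like this complements parameterized queries; it does not replace them, since legitimate data such as O’Reilly may need characters outside a narrow set.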


More Info
  phpGB MySQL Injection Vulnerability
    http://www.securiteam.com/unixfocus/6X00O1P5PY.html
  “How I hacked PacketStorm”
    http://www.wiretrip.net/rfp/txt/rfp2k01.txt


SQL*Plus settings

    SQL> SET RECSEP OFF
    SQL> COLUMN text FORMAT A60


New topic: Regular Expressions
  In automata theory, Finite Automata are the simplest, weakest kind of computer, Turing Machines the strongest
    Chomsky’s Hierarchy
  FA are equivalent to regular expressions
    Expressions that specify a pattern
    Can check whether a string matches the pattern


RegEx matching
  Use REGEXP_LIKE
  Metachar for any char is .
  First, get employee_comment table:
    http://pages.stern.nyu.edu/~mjohnson/oracle/empcomm.sql
  Now do search:

    SELECT emp_id, text
    FROM employee_comment
    WHERE REGEXP_LIKE(text,'...-....');

  So far, like LIKE


RegEx matching
  Can also pull out the matching text with REGEXP_SUBSTR:

    SELECT emp_id, REGEXP_SUBSTR(text,'...-....') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text,'...-....');

  If want only numbers, can specify a set of chars rather than a dot:

    SELECT emp_id, REGEXP_SUBSTR(text, '[0123456789]..-...[0123456789]') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '[0123456789]..-...[0123456789]');


RegEx matching
  Or can specify a range of chars:

    SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]..-....') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '[0-9]..-....');

  Or, finally, can state how many copies to match:

    SELECT emp_id, REGEXP_SUBSTR(text,'[0-9]{3}-[0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{4}');
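The same pattern can be tried outside the database. This sketch uses Python's re module as a stand-in for Oracle's REGEXP_LIKE and REGEXP_SUBSTR (an assumption: the two engines agree on this simple pattern, but they are not identical in general). The sample comments are made up and are not the contents of empcomm.sql.

```python
import re

# Hypothetical comment texts, standing in for employee_comment.text
comments = [
    "call me at 555-1234 after lunch",
    "no phone on file",
    "cell: 914-222-2222",
]

PHONE = re.compile(r"[0-9]{3}-[0-9]{4}")

def first_phone(text):
    """Rough analogue of REGEXP_SUBSTR: first match, or None if there is none."""
    match = PHONE.search(text)
    return match.group(0) if match else None

matches = [first_phone(c) for c in comments]
print(matches)  # ['555-1234', None, '222-2222']
```

The last result shows the limitation of the seven-digit pattern: the area code is silently dropped.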


RegExp matching
  Other operators:
    * - 0 or more matches
    + - 1 or more matches
    ? - 0 or 1 match
  Also, can OR options together with | op
  Here: some phone nums have area codes, some not, so want to match both:

    SELECT emp_id, REGEXP_SUBSTR(text,'[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{4}');


RegExp matching
  Order of ORed together patterns matters:
    First matching pattern wins

    SELECT emp_id, REGEXP_SUBSTR(text,'[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}');


RegExp matching
  There’s a shared structure between the two, tho
  Area code is just optional
  Can use ? op

    SELECT emp_id, REGEXP_SUBSTR(text,'([0-9]{3}-)?[0-9]{3}-[0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text,'([0-9]{3}-)?[0-9]{3}-[0-9]{4}');


RegExp matching
  Also, different kinds of separators:
    dash, dot, just blank
  Can OR together whole number patterns
  Better: Just use set of choices of each sep.

    SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text,'([0-9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}');


RegExp matching
  One other thing: area codes in parentheses
    Of course, area codes are still optional
    Parentheses must be escaped - \( \)

    SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text,'([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}');
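The final pattern can be checked against a handful of phone formats before running it on the table. As before, Python's re module stands in for Oracle's regex functions (an assumption that holds for this pattern's syntax), and the sample strings are hypothetical.

```python
import re

# Same pattern as in the query above: optional area code (plain or in
# escaped parentheses), then 3 digits, a separator, and 4 digits.
PHONE = re.compile(r"([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}")

def extract_phone(text):
    match = PHONE.search(text)
    return match.group(0) if match else None

samples = ["555-1234", "212.555.1234", "(212) 555-1234", "212 555 1234", "5551234"]
found = [extract_phone(s) for s in samples]
print(found)
```

The last sample has no separator at all, so nothing matches; every other format is returned whole, area code included.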


And now for something completely different: XML
  XML: eXtensible Mark-up Language
  Very popular language for semi-structured data
  Mark-up language: consists of elements composed of tags, like HTML
  Emerging lingua franca of the Internet, Web Services, inter-vendor comm


Unstructured data
  At one end of continuum: unstructured data
    Text files
    Stock market prices
    CIA intelligence intercepts
    Audio recordings
  “Just one damn bit after another” ~ Henry Ford
  No (intentional, formal) patterns to the data
  Difficult to manage/make sense of
    Why we need data-mining


Structured data
  At the other end: structured data
    Tables in RDBMSs
  Data organized into semantic chunks
    entities
  Similar/related entities grouped together
    Relationships, classes
  Entities in same group have same structure
    Same fields/attributes/properties
  Easy to make sense of
    But sometimes too rigid a req.
    Difficult to send: convert to tab-delimited


Semi-structured data
  Not too random
    Data organized into entities
    Similar/related grouped to form other entities
  Not too structured
    Some attributes may be missing
    Size of attributes may vary
    Support of lists/sets
  Juuust Right
    Data is self-describing


Semi-structured data
  Predominant examples:
    HTML: HyperText Mark-up Language
    XML: eXtensible Mark-up Language
  NB: both mark-up languages (use tags)
  Mark-up lends itself to semi-structured data
    Demarcate boundaries for entities
    But freely allow other entities inside


Data model for semi-structured data
  Usually represented as directed graphs
  Graph: set of vertices (nodes) and edges
    Dots connected by lines; not nec. a tree!
  In model,
    Nodes ~ entities or fields/attributes
    Edges ~ attribute-of/sub-entity-of
  Example: publisher publishes >=0 books
    Each book has one title, one year, >=1 authors
    Draw publishers graph


XML is a SSD language
  Standard published by W3C
    Officially announced/recommended in 1998
  XML != HTML
  XML != a replacement for HTML
  Both are mark-up languages
  Big diffs:
    1. XML doesn’t use predefined tags (!)
       But it’s extensible: tags can be added
    2. HTML is about presentation: <I>, <B>, <P>
       XML is about content: <book>, <author>


XML syntax
  Like HTML in many respects but more strict
  All tags must be closed
    Can’t have: this is a line<br>
    Every start tag has an end tag
    Although <br/> style can replace both
  IS case-sensitive
  IS space-sensitive
  XML doc has a unique root element


XML syntax
  Tags must be properly nested
    Not allowed: <b><i>I’m not kidding</b></i>
    Intuition: file folders
  Elements may have quoted attributes
    <Myelm myatt="myval">…</Myelm>
  Comments same as in HTML:
    <!-- Pay no attention… -->
  Draw publishers XML
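These syntax rules are exactly what an XML parser enforces. As a quick check, the sketch below feeds small documents to Python's standard xml.etree.ElementTree parser and reports whether each is well-formed; the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True if the parser accepts the text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<p>this is a line<br/></p>"))           # closed tags: True
print(is_well_formed("<p>this is a line<br></p>"))            # unclosed <br>: False
print(is_well_formed("<b><i>I'm not kidding</b></i>"))        # bad nesting: False
print(is_well_formed('<Myelm myatt="myval">text</myelm>'))    # case mismatch: False
print(is_well_formed("<a/><b/>"))                             # two roots: False
```

An HTML browser would tolerate most of these; an XML parser rejects all but the first.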


Escape chars in XML
  Some chars must be escaped
    Distinguish content from syntax

    <   &lt;
    >   &gt;
    &   &amp;
    "   &quot;
    '   &apos;

    <elm>3 &lt; 5</elm>

    <elm>&quot;Don&apos;t call me &apos;Ishmael&apos;!&quot;</elm>

  Can also declare value to be pure text:

    <aRealTag> <![CDATA[<notAtag>jsdljsd<neitherAmI<“'><>>]]></aRealTag>
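In practice the escaping is usually done by a library. A brief sketch with Python's standard library (xml.sax.saxutils.escape for writing, xml.etree.ElementTree for reading back); the sample string is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "\"Don't call me 'Ishmael'!\" & 3 < 5"

# escape() handles &, < and > by default; quotes are added via the extra mapping.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# Parsing the escaped text gives back the original characters.
elem = ET.fromstring("<elm>" + escaped + "</elm>")
print(elem.text == raw)  # True

# A CDATA section is another way to carry markup-like text unescaped.
cdata = ET.fromstring("<aRealTag><![CDATA[<notAtag>]]></aRealTag>")
print(cdata.text)  # <notAtag>
```

The ampersand is replaced first, which is why the entities produced for the other characters are not escaped a second time.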


XML Namespaces
  Different schemas/DTDs may overlap
    XHTML and MathML share some tags
  Soln: namespaces
    as in Java/C++/C#

    <book xmlns:isbn="www.isbn-org.org/def">
      <title>...</title>
      <number>15</number>
      <isbn:number>...</isbn:number>
    </book>
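To see how a parser keeps the two number elements apart, here is a short sketch with Python's xml.etree.ElementTree. The namespace URI is the one from the slide; the title and ISBN values are placeholders invented for this example, since the slide elides them.

```python
import xml.etree.ElementTree as ET

# Title and ISBN text are hypothetical placeholder values.
doc = """<book xmlns:isbn="www.isbn-org.org/def">
  <title>Some Title</title>
  <number>15</number>
  <isbn:number>0-000-00000-0</isbn:number>
</book>"""

ns = {"isbn": "www.isbn-org.org/def"}  # prefix -> namespace URI
book = ET.fromstring(doc)

plain = book.find("number")            # the un-namespaced element
qualified = book.find("isbn:number", ns)

print(plain.text)       # 15
print(qualified.text)   # 0-000-00000-0
print(qualified.tag)    # {www.isbn-org.org/def}number
```

Internally the parser expands the prefix to the full URI, so the two elements have different names even though both end in number.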


From Relational Data to XML Data

  persons (table):

    Name     SSN  Mailing-address
    Michael  123  NY
    Hilary   456  DC
    Bill     789  Chappaqua

  XML:

    <persons>
      <row><name>Michael</name> <ssn>123</ssn></row>
      <row><name>Hilary</name> <ssn>456</ssn></row>
      <row><name>Bill</name> <ssn>789</ssn></row>
    </persons>

  As a tree:

    persons
      row
        name  “Michael”
        ssn   123
      row
        name  “Hilary”
        ssn   456
      row
        name  “Bill”
        ssn   789
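The table-to-XML mapping above is mechanical: one row element per tuple, one child element per column. A minimal sketch in Python, with the rows hard-coded rather than fetched from a database; the function name rows_to_xml is hypothetical.

```python
import xml.etree.ElementTree as ET

# (name, ssn) pairs as they might come back from a query on the persons table
rows = [("Michael", "123"), ("Hilary", "456"), ("Bill", "789")]

def rows_to_xml(rows):
    """Build <persons><row><name>..</name><ssn>..</ssn></row>...</persons>."""
    persons = ET.Element("persons")
    for name, ssn in rows:
        row = ET.SubElement(persons, "row")
        ET.SubElement(row, "name").text = name
        ET.SubElement(row, "ssn").text = ssn
    return ET.tostring(persons, encoding="unicode")

xml_text = rows_to_xml(rows)
print(xml_text)
```

Building the tree through the library, rather than by pasting strings together, also takes care of escaping any special characters in the data.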


Semi-structured Data Explained
  List-valued attributes
  XML is not 1NF!
  Impossible in (single, BCNF) tables:

    name  phone
    Bill  914-222-2222
          212-333-3333    (two phones! ???)

    <persons>
      <row><name>Hilary</name> <phone>202-222-2222</phone> <phone>914-222-2222</phone></row>
      <row><name>Bill</name> <phone>914-222-2222</phone> <phone>212-333-3333</phone></row>
    </persons>


Object ids and References
  SSD graph might not be trees!
  But XML docs must be
    Would cause much redundancy
  Soln: same concept as pointers in C/C++/J
    Object ids and references
  Graph example:
    Movies: Lost in Translation, Hamlet
    Stars: Bill Murray, Scarlett Johansson

    <movieinfo>
      <movie id="o111">
        <title>Lost in Translation</title>
        <year>2003</year>
        <stars idref="o333 o444"/>
      </movie>
      <movie id="o222">
        <title>Hamlet</title>
        <year>1999</year>
        <stars idref="o333"/>
      </movie>
      <person id="o333">
        <name>Bill Murray</name>
        <movies idref="o111 o222"/>
      </person>
    </movieinfo>
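Following an idref is a lookup by id, much like dereferencing a pointer. The sketch below resolves the references with Python's xml.etree.ElementTree. The document is a compact variant of the movie example; the person ids o333 and o444, and the entry for the second star, are assumptions added so that every idref has a target.

```python
import xml.etree.ElementTree as ET

doc = """<movieinfo>
  <movie id="o111"><title>Lost in Translation</title><stars idref="o333 o444"/></movie>
  <movie id="o222"><title>Hamlet</title><stars idref="o333"/></movie>
  <person id="o333"><name>Bill Murray</name><movies idref="o111 o222"/></person>
  <person id="o444"><name>Scarlett Johansson</name><movies idref="o111"/></person>
</movieinfo>"""

root = ET.fromstring(doc)

# Index every element that carries an id, like a table of object addresses.
by_id = {elem.get("id"): elem for elem in root.iter() if elem.get("id")}

def stars_of(movie_id):
    """Follow the movie's idref list to the referenced person elements."""
    refs = by_id[movie_id].find("stars").get("idref").split()
    return [by_id[ref].find("name").text for ref in refs]

print(stars_of("o111"))  # ['Bill Murray', 'Scarlett Johansson']
print(stars_of("o222"))  # ['Bill Murray']
```

Each person appears once in the document, however many movies refer to them, which is how the graph is stored in a tree without redundancy.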


What do we do with XML?
  Things done with XML:
    Send to partners
    Parse XML received
    Convert to RDBMS rows
    Query for particular data
    Convert to other XML
    Convert to formats other than XML
  Lots of tools/standards for these…


DTDs & understanding XML
  XML is extensible
  Advantage: when creating, we can use any tags we like
  Disadv: when reading, they can use any tags they like
    Using XML docs a priori is very difficult
  Solution: impose some constraints


DTDs
  DTD: Document Type Definition
  You and partners/vertical industry/academic discipline decide on a DTD/schema for your docs
    Specify which entities you may use/must understand
    Specify legal relationships
  DTD specifies the grammar to be used
    DTD = set of rules for creating valid entities
  DTD tells your software what to look for in doc


DTD examples

Well-formed XML v. valid XML

Simple example:
  http://pages.stern.nyu.edu/~mjohnson/dbms/xml/note.xml
  http://pages.stern.nyu.edu/~mjohnson/dbms/xml/badnote.xml
  http://pages.stern.nyu.edu/~mjohnson/dbms/xml/badnote2.xml
  Copy from: http://pages.stern.nyu.edu/~mjohnson/dbms/eg/xml.txt

Partial publisher example rules:
  Root -> publisher
  Publisher -> name, book*, author*
  Book -> title, date, author+
  Author -> firstname, middlename?, lastname


Partial DTD example

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE publisher [
  <!ELEMENT publisher (name, book*, author*)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT book (title, date, author+)>
  <!ELEMENT author (firstname, middlename?, lastname)>
  <!ELEMENT firstname (#PCDATA)>
  <!ELEMENT lastname (#PCDATA)>
  <!ELEMENT middlename (#PCDATA)>
  <!-- declarations for title and date omitted (partial example) -->
]>

DTD is not XML, but can be embedded in or referenced from XML
Replacement for DTDs is XML Schema
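
A DTD content model such as (name, book*, author*) is essentially a regular expression over child element names. As a rough illustration only: Python's standard library does not validate against DTDs, so the hand-rolled checker below is an assumption for teaching purposes, not a real validator. CONTENT_MODELS and is_valid are hypothetical names, and the sample documents are made up.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models, transcribed from the publisher rules.
# Every child tag is followed by one space, so each model is a regex over tag names.
CONTENT_MODELS = {
    "publisher": r"name (book )*(author )*",
    "book": r"title date (author )+",
    "author": r"firstname (middlename )?lastname ",
}

def is_valid(elem):
    """Return True if elem and all of its descendants match their content model."""
    model = CONTENT_MODELS.get(elem.tag)
    if model is not None:
        children = "".join(child.tag + " " for child in elem)
        if re.fullmatch(model, children) is None:
            return False
    return all(is_valid(child) for child in elem)

good = ET.fromstring(
    "<publisher><name>Acme Press</name>"
    "<book><title>T</title><date>2005</date>"
    "<author><firstname>A</firstname><lastname>B</lastname></author></book>"
    "</publisher>"
)
# Well-formed, but the children of publisher are in the wrong order.
bad = ET.fromstring("<publisher><book/><name>Acme Press</name></publisher>")

print(is_valid(good), is_valid(bad))
```

Both sample documents are well-formed; only the first also follows the rules, which is the well-formed v. valid distinction.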


XML Applications/dialects

MathML: Mathematical Markup Language
  http://wwwasdoc.web.cern.ch/wwwasdoc/WWW/publications/ictp99/ictp99N8059.html

VoiceXML: http://newmedia.purchase.edu/~Jeanine/interfaces/rps.xml

CML: Chemical Markup Language

XHTML: HTML retrofitted as an XML application


XML Applications/dialects

VoiceXML:
  http://newmedia.purchase.edu/~Jeanine/interfaces/rps.xml
  AT&T Directory Assistance
  http://phone.yahoo.com/

Image from http://www.voicexml.org/tutorials/intro2.html


More XML Apps

FIXML
  XML equiv. of FIX: Financial Information eXchange

swiftML
  XML equiv. of SWIFT: Society for Worldwide Interbank Financial Telecommunication message format

Apache’s Ant
  Scripting language for Java build management
  http://ant.apache.org/manual/using.html

Many more: http://www-106.ibm.com/developerworks/xml/library/x-stand4/


More XML Applications/Protocols

RSS: Rich Site Summary/Really Simple Syndication
  News sites, blogs…
  http://slate.msn.com/rss/
  http://slashdot.org/index.rss
  Screenshot: http://paulboutin.weblogger.com/pictures/viewer$673
  More info: http://slate.msn.com/id/2096660/

<channel>
  <title>my channel</title>
  <item>
    <title>story 1</title>
    <link>…</link>
  </item>
  <!-- other items -->
</channel>


More XML Applications/Protocols

SOAP: Simple Object Access Protocol
  XML-based messaging format
  Used by Google API: http://www.google.com/apis/
  Amazon API: http://amazon.com/gp/aws/landing.html
  Amazon light: http://kokogiak.com/amazon/
  Other examples: http://www.wired.com/wired/archive/12.03/google.html?pg=10&topic=&topic_set=

SOAP envelope with header and body
  Request sales tax for total

<SOAP:Envelope xmlns:SOAP="urn:schemas-xmlsoap-org:soap.v1">
  <SOAP:Header></SOAP:Header>
  <SOAP:Body>
    <GetSalesTax>
      <SalesTotal>100</SalesTotal>
    </GetSalesTax>
  </SOAP:Body>
</SOAP:Envelope>


More XML Applications/Protocols

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <gs:doGoogleSearch xmlns:gs="urn:GoogleSearch">
      <key>%(key)s</key>
      <start>0</start>
      <maxResults>10</maxResults>
      <filter>true</filter>
      <restrict/>
      <safeSearch>false</safeSearch>
      <lr/>
    </gs:doGoogleSearch>
  </soap:Body>
</soap:Envelope>


New topic: XML in Oracle - purchase-order e.g.

<?xml version="1.0"?>
<purchase_order>
  <customer_name>Alpha Tech</customer_name>
  <po_number>11257</po_number>
  <po_date>2004-01-20</po_date>
  <po_items>
    <item>
      <part_number>AI5-4557</part_number>
      <quantity>20</quantity>
    </item>
    <item>
      <part_number>EI-T5-001</part_number>
      <quantity>12</quantity>
    </item>
  </po_items>
</purchase_order>


Storing XML data

As of 9i, Oracle has the XMLType data type
By default, underlying storage is as CLOB

CREATE TABLE purchase_order(
  po_id number(5) not null,
  customer_po_nbr varchar(20),
  customer_inception_date date,
  order_nbr number(5),
  purchase_order_doc xmltype,
  constraint purchase_order_pk primary key(po_id)
);


Loading XML into Oracle

First, log in as sys:

connect sys/junk as sysdba
create directory xml_data as '/xml';
grant read, write on directory xml_data to scott;

Now scott can import:

connect scott/tiger

declare
  bf1 bfile;
begin
  bf1 := bfilename('XML_DATA', 'purch_ord.xml');
  insert into purchase_order(po_id, purchase_order_doc)
  values(1000, xmltype(bf1, nls_charset_id('we8mswin1252')));
end;
/


Loading XML into Oracle

Not just loading raw text

XMLType data must be well-formed
  Parsable as XML

Try modifying customer_name open tag
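
The same well-formedness check can be tried outside Oracle. This is a minimal Python sketch, assuming the standard xml.etree.ElementTree parser as a stand-in for Oracle's own XML parser; is_well_formed is a hypothetical helper name and the two documents are shortened versions of the purchase order.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<purchase_order><customer_name>Alpha Tech</customer_name></purchase_order>"
# Same document with the customer_name open tag modified, so the tags no longer match.
bad = "<purchase_order><customer_nam>Alpha Tech</customer_name></purchase_order>"

print(is_well_formed(good), is_well_formed(bad))
```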


Accessing XML in Oracle

Now can look at raw XML:

SQL> SELECT purchase_order_doc
     FROM purchase_order;

Can also use XPath to extract particular nodes and values, with extract function:

SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name')
     FROM purchase_order;


XPath in Oracle

Can also extract all nodes of one type, underneath some node, with double-slash //
  All purchase order items:

SQL> SELECT extract(purchase_order_doc, '/purchase_order//item')
     FROM purchase_order;

NB: the result is not a well-formed XML document
  No unique root

Can request just one with bracket op
  Numbering starts at 1, not 0
  Wrong name/number: no error, no results

SQL> SELECT extract(purchase_order_doc, '/purchase_order/po_items/item[2]')
     FROM purchase_order;
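
The XPath behaviors on this slide (double-slash, 1-based brackets, silent empty results) can be tried without a database. This is a rough Python analog, assuming xml.etree.ElementTree, which supports only a subset of XPath and takes paths relative to the root element instead of absolute ones. The document is a shortened purchase order.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<purchase_order><po_items>"
    "<item><part_number>AI5-4557</part_number><quantity>20</quantity></item>"
    "<item><part_number>EI-T5-001</part_number><quantity>12</quantity></item>"
    "</po_items></purchase_order>"
)

# Paths are relative to the root element (purchase_order).
all_items = doc.findall(".//item")           # analog of /purchase_order//item
second = doc.find("po_items/item[2]")        # positions start at 1, not 0
missing = doc.findall("po_items/itm")        # wrong name: empty list, no error
out_of_range = doc.find("po_items/item[3]")  # wrong number: None, no error

print(len(all_items), second.findtext("part_number"), missing, out_of_range)
```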


extract v. extractvalue

extractvalue returns value, not whole node:

SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/customer_name')
     FROM purchase_order;

vs.

SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name')
     FROM purchase_order;

extractvalue applies only to unique (single, leaf) nodes, so this one raises an error:

SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/po_items')
     FROM purchase_order;
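
The node-versus-value distinction looks like this in a rough Python analog (an illustration with xml.etree.ElementTree, not Oracle code): ET.tostring plays the role of extract, and findtext plays the role of extractvalue.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<purchase_order><customer_name>Alpha Tech</customer_name></purchase_order>"
)

node = doc.find("customer_name")
as_node = ET.tostring(node, encoding="unicode")  # whole node, tags included
as_value = doc.findtext("customer_name")         # just the text value

print(as_node)
print(as_value)
```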


existsnode function

Can check whether node/location exists with existsnode function
  Returns 1 or 0

SQL> SELECT po_id FROM purchase_order
     WHERE existsnode(purchase_order_doc, '/purchase_order/customer_name') = 1;

Also applies to bracketed paths:

SQL> SELECT po_id FROM purchase_order
     WHERE existsnode(purchase_order_doc, '/purchase_order/po_items/item[1]') = 1;
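
A small Python sketch of the same 1-or-0 test, assuming xml.etree.ElementTree and paths relative to the root element. exists_node is a hypothetical helper written for this illustration, not an Oracle or Python library function.

```python
import xml.etree.ElementTree as ET

def exists_node(root, path):
    """Return 1 if path (relative to root) matches at least one node, else 0."""
    # Compare against None: an element with no children tests as false.
    return 1 if root.find(path) is not None else 0

doc = ET.fromstring(
    "<purchase_order><customer_name>Alpha Tech</customer_name>"
    "<po_items><item><quantity>20</quantity></item></po_items></purchase_order>"
)

print(exists_node(doc, "customer_name"))     # node present
print(exists_node(doc, "po_items/item[1]"))  # bracketed path
print(exists_node(doc, "po_items/item[2]"))  # there is no second item
```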


Moving data from XML to relations

To move single values from XML to tables, can simply use extractvalue in UPDATE statements:

SQL> UPDATE purchase_order
     SET order_nbr = 7101,
         customer_po_nbr = extractvalue(purchase_order_doc, '/purchase_order/po_number'),
         customer_inception_date = to_date(extractvalue(purchase_order_doc, '/purchase_order/po_date'), 'yyyy-mm-dd');


Moving data from XML to relations

What about moving a set of nodes?
  The two item nodes

SQL> SELECT extract(purchase_order_doc, '/purchase_order//item')
     FROM purchase_order;

Use xmlsequence to get a varray of items
Use TABLE to convert to a relation

SQL> SELECT rownum, item.*
     FROM TABLE(SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item'))
                FROM purchase_order) item;


Moving data from XML to relations

Result is a two-row relation with XMLTypes
Can use extractvalue to extract this data
First, create destination table:

CREATE TABLE LINE_ITEM(
  ORDER_NBR NUMBER(9) NOT NULL,
  PART_NBR VARCHAR2(20) NOT NULL,
  QTY NUMBER(5) NOT NULL,
  FILLED_QTY NUMBER(5),
  CONSTRAINT line_item_pk PRIMARY KEY (ORDER_NBR, PART_NBR)
);


Moving data from XML to relations

Then insert results:

SQL> INSERT INTO line_item(order_nbr, part_nbr, qty)
     SELECT 7109,
            extractvalue(column_value, '/item/part_number'),
            extractvalue(column_value, '/item/quantity')
     FROM TABLE(SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item'))
                FROM purchase_order);
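
The shredding step (one relational row per item node) can be sketched in Python as well. This is only an analog, assuming an in-memory SQLite database in place of Oracle and a cut-down line_item table; 7109 is the constant order number from the slide.

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<purchase_order><po_items>"
    "<item><part_number>AI5-4557</part_number><quantity>20</quantity></item>"
    "<item><part_number>EI-T5-001</part_number><quantity>12</quantity></item>"
    "</po_items></purchase_order>"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE line_item ("
    " order_nbr INTEGER NOT NULL, part_nbr TEXT NOT NULL, qty INTEGER NOT NULL,"
    " PRIMARY KEY (order_nbr, part_nbr))"
)

# One row per item node, with the values pulled out of the child elements.
rows = [
    (7109, item.findtext("part_number"), int(item.findtext("quantity")))
    for item in doc.findall(".//item")
]
conn.executemany("INSERT INTO line_item VALUES (?, ?, ?)", rows)

result = conn.execute(
    "SELECT part_nbr, qty FROM line_item ORDER BY part_nbr"
).fetchall()
print(result)
```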


XML Schemas and Oracle

By default, XML must be well-formed to be read into the XMLType field
XML is valid if it conforms to a schema
To use a schema with Oracle, must first register it:

declare
  bf1 bfile;
begin
  bf1 := bfilename('XML_DATA', 'purch_ord.xsd');
  dbms_xmlschema.registerschema('http://localhost:8080/home/xml/schemas/purch_ord.xsd', bf1);
end;
/


XML Schemas and Oracle

With schema registered, can apply it to an XMLType field

CREATE TABLE purchase_order2 (
  po_id NUMBER(5) NOT NULL,
  customer_po_nbr VARCHAR2(20),
  customer_inception_date DATE,
  order_nbr NUMBER(5),
  purchase_order_doc XMLTYPE,
  CONSTRAINT purchase_order2_pk PRIMARY KEY (po_id)
)
XMLTYPE COLUMN purchase_order_doc
  XMLSCHEMA "http://localhost:8080/home/xml/schemas/purch_ord.xsd"
  ELEMENT "purchase_order";


Importing to schema field

Try to import xml file, get error:

declare
  bf1 bfile;
begin
  bf1 := bfilename('XML_DATA', 'purch_ord.xml');
  insert into purchase_order2(po_id, purchase_order_doc)
  values (2000, XMLTYPE(bf1, nls_charset_id('WE8MSWIN1252')));
end;
/


Importing to schema field

Root node of XML must specify the schema
Change root to the following:

<purchase_order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="http://localhost:8080/home/xml/schemas/purch_ord.xsd">

Now can import
Also fails if extra or missing nodes
  Modify customer_name node
  Add new comments node


Can check to see whether schema is used

Can call isSchemaBased(), getSchemaURL() and isSchemaValid() on XMLType fields:

SQL> select po.purchase_order_doc.isSchemaBased(),
            po.purchase_order_doc.getSchemaURL(),
            po.purchase_order_doc.isSchemaValid()
     from purchase_order2 po;


Updating XMLType data

Can update XMLType data with ordinary UPDATE statements:

SQL> UPDATE purchase_order po
     SET po.purchase_order_doc = XMLTYPE(BFILENAME('XML_DATA', 'purch_ord_alt.xml'), nls_charset_id('WE8MSWIN1252'))
     WHERE po.po_id = 2000;

Replaces whole XMLType object with new one


Updating XMLType data

Can also modify the existing XMLType object
  By writing node values

updateXML() function does search/replace
  But searches for node, not value

SQL> SELECT extract(po.purchase_order_doc, '/purchase_order/customer_name')
     FROM purchase_order po
     WHERE po_id = 1000;

SQL> UPDATE purchase_order po
     SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/customer_name/text()', 'some other company')
     WHERE po.po_id = 1000;
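
The "search for the node, then replace its value" idea can be sketched in Python. This is an analog using xml.etree.ElementTree on an in-memory document, not a model of how updateXML is implemented.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<purchase_order><customer_name>Alpha Tech</customer_name></purchase_order>"
)

# Locate the node by path (not by its current value), then overwrite its text.
node = doc.find("customer_name")
if node is not None:
    node.text = "some other company"

updated = ET.tostring(doc, encoding="unicode")
print(updated)
```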


Updating XMLType data

Can also write whole node, using XMLType:

SQL> UPDATE purchase_order po
     SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/customer_name',
           XMLTYPE('<customer_name>some third company</customer_name>'))
     WHERE po.po_id = 1000;

SQL> SELECT extract(po.purchase_order_doc, '/purchase_order/customer_name')
     FROM purchase_order po
     WHERE po_id = 1000;

Validation/well-formedness is still checked


Updating XMLType data

And can update items in a collection:

SQL> SELECT extract(po.purchase_order_doc, '/purchase_order//item')
     FROM purchase_order po
     WHERE po.po_id = 1000;

SQL> UPDATE purchase_order po
     SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/po_items/item[1]',
           XMLTYPE('<item><part_number>T-1000</part_number><quantity>33</quantity></item>'))
     WHERE po.po_id = 1000;


Converting relational data to XML

Saw how to put XML in a table
Conversely, can convert ordinary relational data to XML
  XMLElement() generates an XML node

First, create supplier table:

CREATE TABLE SUPPLIER(
  SUPPLIER_ID NUMBER(5) NOT NULL,
  NAME VARCHAR2(30) NOT NULL,
  PRIMARY KEY (SUPPLIER_ID)
);
insert into supplier values(1, 'Acme');
insert into supplier values(2, 'Tilton');
insert into supplier values(3, 'Eastern');


Converting relational data to XML

Now can call XMLElement function to wrap values in tags:

SELECT XMLElement("supplier_id", s.supplier_id) || XMLElement("name", s.name) xml_fragment
FROM supplier s;

And can build it up:

SELECT XMLElement("supplier",
         XMLElement("supplier_id", s.supplier_id),
         XMLElement("name", s.name))
FROM supplier s;

Don’t concatenate with || (as in the first query)!
  Turns to strings, escapes < >
  Error in book
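
Why concatenation goes wrong can be seen in a small Python analog (an illustration with xml.etree.ElementTree, not Oracle behavior verbatim). Once a node has been turned into a string it is plain text, so wrapping it in another element escapes the angle brackets, whereas nesting the node keeps it as markup.

```python
import xml.etree.ElementTree as ET

name = ET.Element("name")
name.text = "Acme"

# Nesting the element keeps it as markup.
nested = ET.Element("supplier")
nested.append(name)

# Converting the element to a string first makes it plain text,
# so < and > are escaped when it is placed inside another element.
flattened = ET.Element("supplier")
flattened.text = ET.tostring(name, encoding="unicode")

nested_xml = ET.tostring(nested, encoding="unicode")
flattened_xml = ET.tostring(flattened, encoding="unicode")
print(nested_xml)
print(flattened_xml)
```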


XMLForest()

More simply, can use XMLForest() function:

SELECT XMLElement("supplier", XMLForest(s.supplier_id, s.name))
FROM supplier s;


XMLAgg()

Can use XMLAgg() to put nodes together inside another node:

SELECT XMLElement("supplier_list",
         XMLAgg(XMLElement("supplier",
                  XMLElement("supplier_id", s.supplier_id),
                  XMLElement("name", s.name)))) xml_document
FROM supplier s;
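
The aggregation step, collecting one node per row under a single parent, can be sketched in Python. This is an analog only, assuming an in-memory SQLite copy of the supplier table and xml.etree.ElementTree in place of Oracle's XML functions.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?)",
    [(1, "Acme"), (2, "Tilton"), (3, "Eastern")],
)

# One outer node; each row becomes a nested <supplier> child.
supplier_list = ET.Element("supplier_list")
for supplier_id, name in conn.execute(
    "SELECT supplier_id, name FROM supplier ORDER BY supplier_id"
):
    supplier = ET.SubElement(supplier_list, "supplier")
    ET.SubElement(supplier, "supplier_id").text = str(supplier_id)
    ET.SubElement(supplier, "name").text = name

xml_document = ET.tostring(supplier_list, encoding="unicode")
print(xml_document)
```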


New topic: Data Warehousing

Physical warehouse: stores different kinds of items
  combined from different sources in supply chain
  access items as a combined package
  “Synergy”

DW is the system containing the data from many DBs
OLAP is the system for easily querying the DW
  Online analytical processing
  front-end to DW & stats


Integrating Data

Ad hoc combination of DBs from different sources can be problematic

Data may be spread across many systems
  geographically
  by division
  different systems from before mergers…


Conversion/scrubbing/merging

Lots of issues…

Different types of data
  Varchar(255) v. char(30)

Different values for data
  ‘GREEN’/‘GR’/‘2’

Semantic differences
  Cars v. Automobiles

Missing values
  Handle with nulls or XML


Federated DBs

Situation: n different DBs must work together

One idea: write programs for each DB to talk to every other one
  How many programs required?
  Like ambassadors for each country


Federated DBs

Better idea: introduce another DB
  write programs for it to talk to each other DB

Now how many programs?
  English in business, French in diplomacy

Warehousing
  Refreshed nightly
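
The "how many programs?" questions come down to simple counting. This is a small sketch under one stated assumption: a "program" is a one-way translator, so DB A talking to DB B and B talking to A count as two. The function names are hypothetical.

```python
def pairwise_programs(n):
    """Each of the n DBs gets its own program for each of the other n - 1 DBs."""
    return n * (n - 1)

def hub_programs(n):
    """Only the added central DB needs a program per source DB."""
    return n

for n in (3, 10, 50):
    print(n, pairwise_programs(n), hub_programs(n))
```

If a program were counted once per unordered pair instead, the first count would halve to n(n-1)/2, but the quadratic v. linear contrast stays the same.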


OLTP v. OLAP

DWs usually not updated in real-time
  data is usually not live
  but care about higher-level, longer-term patterns
  For “knowledge workers”/decision-makers

Live data is in system used by OLTP
  online transaction processing
  E.g., airline reservations
  OLTP data loaded into DW periodically, say nightly


Utilizing Data

Situation: each time manager has hunch
  requests custom reports
  directs programmers to write/modify SQL app to produce these results
  on higher or lower levels, for different specifics

Problem: too difficult/expensive/slow
  too great a time lag


EISs

Could just write queries at command-prompt
  But decision makers aren’t (all) SQL programmers

Soln: create an executive information system
  provides friendly front-end to common, important queries
  basically a simple DB front-end
  your project part 5

GROUP BY queries are particularly applicable…


EISs v. OLAP

Okay for fixed set of queries
But what if queries are open-ended?

Q: What’s driving sales in the Northeast?
  What’s the source cause?
  Result from one query influences next query tried

OLAP systems are interactive:
  run query
  analyze results
  think of new query
  repeat


Star Schemas

Popular schema for DW data

One central table surrounded by specific tables

Center: fact table

Extremities: data (dimension) tables

Fields in fact table are foreign keys to data tables

Normalization -> Snowflake Schema
  May not be worthwhile…


Dates and star schemas

OLAP behaves as though you had a Days table, with every possible row
  Dates(day, week, month, year, DID)
  (5, 27, 7, 2000)

Can join on Days like any other table


Dates and star schemas

E.g.: products x salesperson x region x date
  Products sold by salespeople in regions on dates

Regular dim tables:
  Product(PID, name, color)
  Emp(name, SSN, sal)
  Region(name, RID)

Fact table:
  Sales(PID, DID, SSN, RID)

Interpret as a cube (cross product of all dimensions)
  Can have both data and stats
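
A toy star schema can be built and queried in a few lines. This is a sketch under stated assumptions: SQLite in memory instead of Oracle, made-up rows, and the Emp dimension left out for brevity. The point is only that the fact table holds foreign keys and the reports come from joining it to the dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (pid INTEGER PRIMARY KEY, name TEXT, color TEXT);
CREATE TABLE region  (rid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dates   (did INTEGER PRIMARY KEY, day INTEGER, month INTEGER, year INTEGER);
-- Fact table: every column is a foreign key into a dimension table.
CREATE TABLE sales (
    pid INTEGER REFERENCES product(pid),
    did INTEGER REFERENCES dates(did),
    rid INTEGER REFERENCES region(rid)
);
INSERT INTO product VALUES (1, 'widget', 'red'), (2, 'gadget', 'blue');
INSERT INTO region  VALUES (1, 'Northeast'), (2, 'West');
INSERT INTO dates   VALUES (1, 5, 7, 2000), (2, 6, 7, 2000);
INSERT INTO sales   VALUES (1, 1, 1), (1, 2, 1), (2, 1, 2);
""")

# Join the fact table to two dimensions and count sales per region and product.
rows = conn.execute("""
    SELECT r.name, p.name, COUNT(*)
    FROM sales s
    JOIN region r ON r.rid = s.rid
    JOIN product p ON p.pid = s.pid
    GROUP BY r.name, p.name
    ORDER BY r.name, p.name
""").fetchall()
print(rows)
```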


Drill-down & roll-up

Imagine: notice some region’s sales way up
Why? Good salesperson? Some popular product there?

Maybe need to search by month, or month and product, abstract back up to just product…

“slicing & dicing”


OLAP and data warehousing

Could write GROUP BY queries for each

OLAP systems provide simpler, non-SQL interface for this sort of thing

Vendors: MicroStrategy, SAP, etc.

On the other hand: DW-style operators have been added to SQL and some DBMSs…


DW extensions in SQL: ROLLUP (Oracle)

Suppose have orders table (from two years), with region and date info:

SQL> column month format a10
SQL> @mosql2_data
SQL> describe all_orders

Can select total sales:

SELECT sum(o.tot_sales)
FROM all_orders o join region r
ON r.region_id = o.region_id;

Examples derived from Mastering Oracle SQL, 2e (O’Reilly)
Get data here: http://examples.oreilly.com/mastorasql2/mosql2_data.sql


DW extensions in SQL: ROLLUP (Oracle)

Can write GROUP BY queries for year or region or both:

SELECT r.name region, o.year, sum(o.tot_sales)
FROM all_orders o join region r
ON r.region_id = o.region_id
GROUP BY (r.name, o.year);


DW extensions in SQL: ROLLUP (Oracle)

ROLLUP operator

Extension of GROUP BY

Does GROUP BY on several levels, simultaneously

Order matters

Get sales totals for each region/year pair, each region, and the grand total:

SELECT r.name region, o.year, sum(o.tot_sales)
FROM all_orders o join region r
ON r.region_id = o.region_id
GROUP BY ROLLUP (r.name, o.year);
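One way to see what ROLLUP (r.name, o.year) computes is to write out the separate GROUP BY queries it replaces. The sketch below does that in Python with SQLite, which has no ROLLUP of its own; the `sales` table and its rows are made up for illustration and only stand in for the all_orders/region join.

```python
import sqlite3

# Toy stand-in for the all_orders/region join; these rows are invented.
rows = [
    ("East", 2000, 100),
    ("East", 2001, 150),
    ("West", 2000, 200),
    ("West", 2001, 250),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, tot_sales INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# One GROUP BY per ROLLUP level, glued together with UNION ALL.
# NULL marks a column that has been rolled up at that level.
rollup_sql = """
SELECT region, year, SUM(tot_sales) FROM sales GROUP BY region, year
UNION ALL
SELECT region, NULL, SUM(tot_sales) FROM sales GROUP BY region
UNION ALL
SELECT NULL, NULL, SUM(tot_sales) FROM sales
"""

result = conn.execute(rollup_sql).fetchall()
for row in result:
    print(row)
```

The region/year pairs, the per-region subtotals, and the grand total all come back in one result set, which is the shape of output the single ROLLUP query gives.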


DW extensions in SQL: ROLLUP (Oracle)

Change the order of the group fields to get a different sequence of groups

To get totals for each year/region pair, each year, and the grand total, just reverse the group-by order:

SELECT o.year, r.name region, sum(o.tot_sales)
FROM all_orders o join region r
ON r.region_id = o.region_id
GROUP BY ROLLUP (o.year, r.name);


DW extensions in SQL: ROLLUP (Oracle)

Adding more dimensions, like month, is easy (apart from formatting):

SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales)
FROM all_orders o join region r
ON r.region_id = o.region_id
GROUP BY ROLLUP (o.year, o.month, r.name);

NB: summing happens on each level


DW extensions in SQL: ROLLUP (Oracle)

If desired, can combine fields for the sake of grouping:

SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales)
FROM all_orders o join region r
ON r.region_id = o.region_id
GROUP BY ROLLUP ((o.year, o.month), r.name);
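The effect of combining fields can be made concrete by listing the grouping levels each ROLLUP form produces. The following is a minimal Python sketch, not Oracle code; `rollup_levels` is a hypothetical helper, and a tuple in its input stands for a parenthesized column group such as (o.year, o.month).

```python
def rollup_levels(columns):
    """Return the grouping levels ROLLUP produces: every prefix of the
    column list, from the full list down to the empty grand-total level.
    A tuple entry stands for a combined column group, which is kept or
    dropped as a unit."""
    levels = []
    for n in range(len(columns), -1, -1):
        level = []
        for col in columns[:n]:
            if isinstance(col, tuple):
                level.extend(col)   # combined fields travel together
            else:
                level.append(col)
        levels.append(tuple(level))
    return levels


plain = rollup_levels(["year", "month", "region"])
combined = rollup_levels([("year", "month"), "region"])
print(plain)
print(combined)
```

With the fields combined, the year-only subtotal level disappears: year and month are subtotaled together or not at all.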


DW extensions in SQL: CUBE (Oracle)

Another GROUP BY extension: CUBE

Subtotals all possible combinations of group-by fields (powerset)

Syntax: “ROLLUP” → “CUBE”

Order of fields doesn’t matter (apart from row ordering)

To get subtotals for each region/month pair, each region, each month, and the grand total:

SELECT to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales)
FROM all_orders o join region r
ON r.region_id = o.region_id
GROUP BY CUBE (o.month, r.name);
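The "powerset" claim can be checked by enumerating the grouping levels directly. This is a small Python sketch with a hypothetical helper `cube_levels`; it only lists which column combinations get subtotaled and does not run any SQL.

```python
from itertools import combinations


def cube_levels(columns):
    """Return every subset of the group-by columns, largest first.
    CUBE computes one set of subtotals per subset."""
    levels = []
    for n in range(len(columns), -1, -1):
        levels.extend(combinations(columns, n))
    return levels


two = cube_levels(["month", "region"])
three = cube_levels(["year", "month", "region"])
print(two)
print(len(three))

# Reversing the field order yields the same set of levels.
same_levels = (
    {frozenset(level) for level in two}
    == {frozenset(level) for level in cube_levels(["region", "month"])}
)
print(same_levels)
```

For n fields there are 2^n levels, compared with n + 1 for ROLLUP, which is why the number of output rows grows quickly as dimensions are added.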


DW extensions in SQL: CUBE (Oracle)

Again, can easily add more dimensions:

SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales)
FROM all_orders o join region r
ON r.region_id = o.region_id
GROUP BY CUBE (o.year, o.month, r.name);


DW SQL exts: GROUPING SETS (Oracle)

That’s a lot of rows

Instead of a cube of all combinations, maybe we just want the totals for each individual field:

SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales)
FROM all_orders o join region r
ON r.region_id = o.region_id
GROUP BY GROUPING SETS (o.year, o.month, r.name);
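GROUPING SETS (o.year, o.month, r.name) amounts to three independent single-column GROUP BY queries stacked together. The sketch below spells that out in Python with SQLite, which lacks GROUPING SETS; the `sales` table and its four rows are invented for illustration.

```python
import sqlite3

# Invented rows: (year, month, region, tot_sales).
rows = [
    (2000, 1, "East", 100),
    (2000, 2, "West", 200),
    (2001, 1, "East", 150),
    (2001, 2, "West", 250),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (year INTEGER, month INTEGER, region TEXT, tot_sales INTEGER)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

# One GROUP BY per listed field; the other two columns are NULL in each branch.
grouping_sets_sql = """
SELECT year, NULL AS month, NULL AS region, SUM(tot_sales) FROM sales GROUP BY year
UNION ALL
SELECT NULL, month, NULL, SUM(tot_sales) FROM sales GROUP BY month
UNION ALL
SELECT NULL, NULL, region, SUM(tot_sales) FROM sales GROUP BY region
"""

result = conn.execute(grouping_sets_sql).fetchall()
for row in result:
    print(row)
```

Unlike ROLLUP and CUBE, this list of sets contains no pairwise subtotals and no grand-total row, since neither was asked for.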


Next

Final evals

More lab…


That’s all, folks!

Selected solutions to exercises:

sqlzoo ~ “Answers” on sqlzoo.net

PL/SQL ~ http://pages.stern.nyu.edu/~mjohnson/oracle/archive/fall04/plsql/

mpjohnson-at-gmail.com