Matthew P. Johnson, OCL1, CISDD CUNY, F2004
1
OCL1 Oracle 10g:SQL & PL/SQLSession #10
Matthew P. Johnson
CISDD, CUNY
Fall, 2004
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
2
Agenda Web apps & security
Oracle & XML
RegEx support in 10g
More on the PL/SQL labs
Today’s lab
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
3
Review: Why security is hard It’s a “negative deliverable”
It’s an asymmetric threat
Tolstoy: “Happy families are all alike; every unhappy family is unhappy in its own way.” Analogs: “homeland”, jails, debugging, proof-
reading, Popperian science, fishing, MC algs
So: fix biggest problems first
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
4
Injection attacks – DB web apps
Consider another input: user: your-boss pass: ' OR 1=1 OR pass = '
SELECT * FROM usersWHERE user = u AND pass = p;
SELECT * FROM usersWHERE user = u AND pass = p;
SELECT * FROM usersWHERE user = 'your-boss' AND password = '' OR 1=1 OR pass = '';
SELECT * FROM usersWHERE user = 'your-boss' AND password = '' OR 1=1 OR pass = '';
http://pages.stern.nyu.edu/~mjohnson/dbms/perl/login.cgiCopy from: http://pages.stern.nyu.edu/~mjohnson/dbms/perl/injection.txt
SELECT * FROM usersWHERE user = 'your-boss'
AND pass = ''OR 1=1OR pass = '';
SELECT * FROM usersWHERE user = 'your-boss'
AND pass = ''OR 1=1OR pass = '';
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
5
Multi-command injection attacks
Consider another input: user: '; DROP TABLE users; SELECT FROM users WHERE pass = '
pass: abc
SELECT * FROM usersWHERE user = u AND pass = p;
SELECT * FROM usersWHERE user = u AND pass = p;
SELECT * FROM users
WHERE user = ''; DROP TABLE users; SELECT FROM users WHERE password = '' AND password = 'abc';
SELECT * FROM users
WHERE user = ''; DROP TABLE users; SELECT FROM users WHERE password = '' AND password = 'abc';
SELECT * FROM users WHERE user = '';DROP TABLE users;SELECT FROM users WHERE pass = '' AND pass = 'abc';
SELECT * FROM users WHERE user = '';DROP TABLE users;SELECT FROM users WHERE pass = '' AND pass = 'abc';
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
6
Multi-command injection attacks
Consider another input: user: '; SHUTDOWN WITH NOWAIT; SELECT FROM users WHERE pass = '
pass: abc
SELECT * FROM usersWHERE user = u AND pass = p;
SELECT * FROM usersWHERE user = u AND pass = p;
SELECT * FROM users
WHERE user = ''; SHUTDOWN WITH NOWAIT; SELECT FROM users WHERE password = '' AND password = 'abc';
SELECT * FROM users
WHERE user = ''; SHUTDOWN WITH NOWAIT; SELECT FROM users WHERE password = '' AND password = 'abc';
SELECT * FROM users WHERE user = '';SHUTDOWN WITH NOWAIT;SELECT FROM users WHERE pass = '' AND pass = 'abc';
SELECT * FROM users WHERE user = '';SHUTDOWN WITH NOWAIT;SELECT FROM users WHERE pass = '' AND pass = 'abc';
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
7
http://pages.stern.nyu.edu/~mjohnson/dbms/perl/users.cgi
Injection attacks – other inputs
Consider another input: user: ' OR 1=1 OR user = ' pass: ' OR 1=1 OR user = '
Delete everyone!
DELETE FROM usersWHERE user = u AND pass = p;
DELETE FROM usersWHERE user = u AND pass = p;
DELETE FROM users
WHERE user = '' OR 1=1 OR user = '' AND pass = '' OR 1=1 OR user = '';
DELETE FROM users
WHERE user = '' OR 1=1 OR user = '' AND pass = '' OR 1=1 OR user = '';
DELETE FROM usersWHERE user = ''
OR 1=1OR user = ''AND pass = ''OR 1=1OR user = '';
DELETE FROM usersWHERE user = ''
OR 1=1OR user = ''AND pass = ''OR 1=1OR user = '';
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
8
Preventing injection attacks Source of problem (in SQL case): use of
quotes Soln 1: don’t allow quotes!
Reject any entered data containing single quotes Q: Is this satisfactory?
Does Amazon need to sell O’Reilly books?
Soln 2: escape any single quotes Replace any ‘ with a ‘’ or \’ In PHP, turn on magic_quotes_gpc flag in .htaccess show both versions
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
9
Preventing injection attacks When to do security checking for quotes,
etc.? Natural choice: in client-side data validation But not enough!
As saw: can still manually submit GET and POST
Must do security checking on server
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
10
Preventing injection attacks Soln 3: use prepare parameterized queries
Supported in JDBC, Perl DBI, PHP ext/mysqli http://pages.stern.nyu.edu/~mjohnson/dbms/perl/loginsafe.cgi http://pages.stern.nyu.edu/~mjohnson/dbms/perl/userssafe.cgi
Very dangerous: using tainted data to run commands at the Unix command prompt Semi-colons, prime char, etc. Safest: define set if legal chars, not illegal ones
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
11
More Info phpGB MySQL Injection Vulnerability
http://www.securiteam.com/unixfocus/6X00O1P5PY.html
"How I hacked PacketStorm“ http://www.wiretrip.net/rfp/txt/rfp2k01.txt
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
12
And now for something completely different: XML XML: eXtensible Mark-up Language
Very popular language for semi-structured data
Mark-up language: consists of elements composed of tags, like HTML
Emerging lingua franca of the Internet, Web Services, inter-vender comm
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
13
Unstructured data At one end of continuum: unstructured data
Text files Stock market prices CIA intelligence intercepts Audio recordings “Just one damn bit after another”
Henry Ford
No (intentional, formal) patterns to the data Difficult to manage/make sense of
Why we need data-mining
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
14
Structured data At the other end: structured data
Tables in RDBMSs Data organized into semantic chunks
entities Similar/related entities grouped together
Relationships, classes Entities in same group have same structure
Same fields/attributes/properties
Easy to make sense of But sometimes too rigid a req. Difficult to send—convert to tab-delimited
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
15
Semi-structured data Not too random
Data organized into entities Similar/related grouped to form other entities
Not too structured Some attributes may be missing Size of attributes may vary
Support of lists/sets
Juuust Right Data is self-describing
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
16
Semi-structured data Predominant examples:
HTML: HyperText Mark-up Language XML: eXtensible Mark-up Language
NB: both mark-up languages (use tags) Mark-up lends self of semi-structured data
Demarcate boundaries for entities But freely allow other entities inside
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
17
Data model for semi-structured data Usually represented as directed graphs Graph: set of vertices (nodes) and edges
Dots connected by lines; not nec. a tree!
In model, Nodes ~ entities or fields/attributes Edges ~ attribute-of/sub-entity-of
Example: publisher publishes >=0 books Each book has one title, one year, >=1 authors Draw publishers graph
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
18
XML is a SSD language Standard published by W3C
Officially announced/recommended in 1998
XML != HTML XML != a replacement for HTML Both are mark-up languages
Big diffs:1. XML doesn’t use predefined tags (!)
But it’s extensible: tags can be added2. HTML is about presentation: <I>, <B>, <P>
XML is about content: <book>, <author>
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
19
XML syntax Like HTML in many respects but more strict
All tags must be closed Can’t have: this is a line<br> Every start tag has an end tag Although <br/> style can replace both
IS case-sensitive IS space-sensitive
XML doc has a unique root element
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
20
XML syntax Tags must be properly nested
Not allowed <b><i>I’m not kidding</b></i> Intuition: file folders
Elements may have quoted attributes <Myelm myatt=“myval”>…</Myelm>
Comments same as in HTML: <!-- Pay no attention… -->
Draw publishers XML
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
21
Escape chars in XML Some chars must be escaped
Distinguish content from syntax
Can also declare value to be pure text:
> <
< >
& &
" "
' '
<aRealTag> <![CDATA[<notAtag>jsdljsd<neitherAmI<“’><>>]]></aRealTag>
<aRealTag> <![CDATA[<notAtag>jsdljsd<neitherAmI<“’><>>]]></aRealTag>
<elm>3 < 5</elm><elm>3 < 5</elm>
<elm>"Don't call me 'Shirley'!"</elm>
<elm>"Don't call me 'Shirley'!"</elm>
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
22
XML Namespaces Different schemas/DTDs may overlap
XHTML and MathML share some tags Soln: namespaces
as in Java/C++/C#
<book xmlns:isbn=“www.isbn-org.org/def”>
<title> … </title>
<number> 15 </number>
<isbn:number> …. </isbn:number>
</book>
<book xmlns:isbn=“www.isbn-org.org/def”>
<title> … </title>
<number> 15 </number>
<isbn:number> …. </isbn:number>
</book>
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
23
From Relational Data to XML Data
<persons><row> <name>John</name> <phone>
3634</phone></row> <row> <name>Sue</name> <phone> 6343</phone> <row> <name>Dick</name> <phone>
6363</phone></row></persons>
<persons><row> <name>John</name> <phone>
3634</phone></row> <row> <name>Sue</name> <phone> 6343</phone> <row> <name>Dick</name> <phone>
6363</phone></row></persons>
n a m e p h o n e
J o h n 3 6 3 4
S u e 6 3 4 3
D i c k 6 3 6 3
row row row
name name namephone phone phone
“John” 3634 “Sue” “Dick”6343 6363
persons XML: persons
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
24
Semi-structured Data Explained List-valued attributes
XML is not 1NF!
Impossible in (single) tables:
<person> <name> Mary</name> <phone>2345</phone> <phone>3456</phone></person>
<person> <name> Mary</name> <phone>2345</phone> <phone>3456</phone></person>
two phones !
name phone
Mary 2345 3456 ???
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
25
Object ids and References SSD graph might not be trees! But XML docs must be
Would cause much redundancy Soln: same concept as pointers in C/C++/J
Object ids and references
Graph example: Movies: Lost in Translation, Hamlet Stars: Bill Murray, Scarlet Johansson
<movieinfo>
<movie id=“o111”>
<title>Lost in Translation</title>
<year>2003</year>
<stars idref=“o333 o444”/>
</movie>
<movie id=“o222”>
<title>Hamlet</title>
<year>1999</year>
<stars idref=“o333”/>
</movie> <person id=“o456”>
<person id=“o111”>
<name>Bill Murray</name>
<movies idref=“o111 o222”/>
</person>
</movieinfo>
<movieinfo>
<movie id=“o111”>
<title>Lost in Translation</title>
<year>2003</year>
<stars idref=“o333 o444”/>
</movie>
<movie id=“o222”>
<title>Hamlet</title>
<year>1999</year>
<stars idref=“o333”/>
</movie> <person id=“o456”>
<person id=“o111”>
<name>Bill Murray</name>
<movies idref=“o111 o222”/>
</person>
</movieinfo>
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
26
What do we do with XML? Things done with XML:
Send to partners Parse XML received Convert to RDBMS rows Query for particular data Convert to other XML Convert to formats other than XML
Lots of tools/standards for these…
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
27
DTDs & understanding XML XML is extensible Advantage: when creating, we can use any
tags we like Disadv: when reading, they can use any tags
they like Using XML docs a priori is very difficult
Solution: impose some constraints
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
28
DTDs DTD: Document Type Definition
You and partners/vertical industry/academic discipline decide on a DTD/schema for your docs Specify which entities you may use/must understand Specify legal relationships
DTD specifies the grammar to be used DTD = set of rules for creating valid entities
DTD tells your software what to look for in doc
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
29
DTD examples Well-formed XML v. valid XML
Simple example: http://pages.stern.nyu.edu/~mjohnson/dbms/xml/note.xml http://pages.stern.nyu.edu/~mjohnson/dbms/xml/badnote.xml http://pages.stern.nyu.edu/~mjohnson/dbms/xml/badnote2.xml Copy from: http://pages.stern.nyu.edu/~mjohnson/dbms/eg/xml.txt
Partial publisher example rules: Root publisher Publisher name, book*, author* Book title, date, author+ Author firstname, middlename?, lastname
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
30
Partial DTD example (typos!)<?xml version=“1.0” encoding=“UTF-8” ?><!DOCTYPE PUBLISHER [<!ELEMENT PUBLISHER (name, book*, author*)><!ELEMENT name (#PCDATA)><!ELEMENT BOOK (title, date, author+)><!ELEMENT AUTHOR (firstname, middlename?,
lastname><!ELEMENT firstname (#PCDATA)><!ELEMENT lastname (#PCDATA)><!ELEMENT middlename (#PCDATA)>
<?xml version=“1.0” encoding=“UTF-8” ?><!DOCTYPE PUBLISHER [<!ELEMENT PUBLISHER (name, book*, author*)><!ELEMENT name (#PCDATA)><!ELEMENT BOOK (title, date, author+)><!ELEMENT AUTHOR (firstname, middlename?,
lastname><!ELEMENT firstname (#PCDATA)><!ELEMENT lastname (#PCDATA)><!ELEMENT middlename (#PCDATA)>
DTD is not XML, but can be embedded in or ref.ed from XML Replacement for DTDs is XML Schemas
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
31
XML Applications/dialects MathML: Mathematical Markup Language
http://wwwasdoc.web.cern.ch/wwwasdoc/WWW/publications/ictp99/ictp99N8059.html
VoiceXML: http://newmedia.purchase.edu/~Jeanine/interfaces/rps.xml
ChemML: Chemical Markup Language
XHMTL: HTML retrofitted as an XML application
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
32
SQL*Plus settings
SQL> SET RECSEP OFFSQL> COLUMN text FORMAT A60
SQL> SET RECSEP OFFSQL> COLUMN text FORMAT A60
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
33
XML in Oracle - purchase-order example<?xml version="1.0"?>
<purchase_order> <customer_name>Alpha Tech</customer_name> <po_number>11257></po_number> <po_date>2004-01-20</po_date> <po_items> <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>EI-T5-001</part_number> <quantity>12</quantity> </item> </po_items></purchase_order>
<?xml version="1.0"?><purchase_order> <customer_name>Alpha Tech</customer_name> <po_number>11257></po_number> <po_date>2004-01-20</po_date> <po_items> <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>EI-T5-001</part_number> <quantity>12</quantity> </item> </po_items></purchase_order>
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
34
Storing XML data As of 9i, has XMLType data type
By default, underlying storage is as CLOB
CREATE TABLE purchase_order( po_id number(5) not null, customer_po_nbr varchar(20), customer_inception_date date, order_nbr number(5), purchase_order_doc xmltype, constraint purchase_order_pk primary key(po_id));
CREATE TABLE purchase_order( po_id number(5) not null, customer_po_nbr varchar(20), customer_inception_date date, order_nbr number(5), purchase_order_doc xmltype, constraint purchase_order_pk primary key(po_id));
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
35
Loading XML into Oracle First, log in as sys:
Now scott can import:
connect sys/junk as sysdbacreate directory xml_data as '/xml‘;grant read, write on directory xml_data to scott;
connect sys/junk as sysdbacreate directory xml_data as '/xml‘;grant read, write on directory xml_data to scott;
connect scott/tiger
declare bf1 bfile;beginbf1 := bfilename('XML_DATA', 'purch_ord.xml');insert into purchase_order(po_id, purchase_order_doc) values(1000, xmltype(bf1,
nls_charset_id('we8mswin1252')));end;
connect scott/tiger
declare bf1 bfile;beginbf1 := bfilename('XML_DATA', 'purch_ord.xml');insert into purchase_order(po_id, purchase_order_doc) values(1000, xmltype(bf1,
nls_charset_id('we8mswin1252')));end;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
36
Loading XML into Oracle Not just loading raw text
XMLType data must be well-formed Parsable as XML
Try modifying customer_name open tag
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
37
Accessing XML in Oracle Now can look at raw XML:
Can also use XPath to extract particular nodes and values, with extract function:
SQL> SELECT purchase_order_docFROM purchase_order;
SQL> SELECT purchase_order_docFROM purchase_order;
SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name')FROM purchase_order;
SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name')FROM purchase_order;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
38
XPath in Oracle Can also extract all nodes of one type, underneath some
node, with double-slash // All purchase order items
NB: this is not valid XML No unique root Can request just one with bracket op Numbering starts at 1, not 0 Wrong name/number no error, no results
SQL> SELECT extract(purchase_order_doc, '/purchase_order/po_items/item[1]')FROM purchase_order;
SQL> SELECT extract(purchase_order_doc, '/purchase_order/po_items/item[1]')FROM purchase_order;
SQL> SELECT extract(purchase_order_doc, '/purchase_order//item')FROM purchase_order;
SQL> SELECT extract(purchase_order_doc, '/purchase_order//item')FROM purchase_order;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
39
extract v. extractvalue extractvalue returns value, not whole node:
vs.
extractvalue applies only to unique nodes:
SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/customer_name')FROM purchase_order;
SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/customer_name')FROM purchase_order;
SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name')FROM purchase_order;
SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name')FROM purchase_order;
SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/po_items')FROM purchase_order;
SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/po_items')FROM purchase_order;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
40
existsnode function Can check whether node/location exists with
existnode function Returns 1 or 0
Also applies to bracketed paths:
SQL> SELECT po_id FROM purchase_orderWHERE existsnode(purchase_order_doc, '/purchase_order/customer_name') = 1;
SQL> SELECT po_id FROM purchase_orderWHERE existsnode(purchase_order_doc, '/purchase_order/customer_name') = 1;
SQL> SELECT po_id FROM purchase_orderWHERE existsnode(purchase_order_doc, '/purchase_order/po_items/item[1]') = 1;
SQL> SELECT po_id FROM purchase_orderWHERE existsnode(purchase_order_doc, '/purchase_order/po_items/item[1]') = 1;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
41
Moving data from XML to relations To move single values from XML to tables, can
simply use extractvalue in UPDATE statements:
SQL> UPDATE purchase_orderSET order_nbr = 7101,customer_po_nbr = extractvalue(purchase_order_doc, '/purchase_order/po_number'),customer_inception_date =
to_date(extractvalue(purchase_order_doc,'/purchase_order/po_date'), 'yyyy-mm-dd');
SQL> UPDATE purchase_orderSET order_nbr = 7101,customer_po_nbr = extractvalue(purchase_order_doc, '/purchase_order/po_number'),customer_inception_date =
to_date(extractvalue(purchase_order_doc,'/purchase_order/po_date'), 'yyyy-mm-dd');
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
42
Moving data from XML to relations What about moving set of nodes
The two item nodes
Use xmlsequence to get a varray of items Use TABLE to convert to a relation
SQL> SELECT extract(purchase_order_doc, '/purchase_order//item')
FROM purchase_order;
SQL> SELECT extract(purchase_order_doc, '/purchase_order//item')
FROM purchase_order;
SQL> SELECT rownum, item.* FROM TABLE(SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item'))FROM purchase_order) item;
SQL> SELECT rownum, item.* FROM TABLE(SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item'))FROM purchase_order) item;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
43
Moving data from XML to relations Result is a two-row relation with XMLTypes Can use extractvalue to extract this data First, create destination table:
CREATE TABLE LINE_ITEM( ORDER_NBR NUMBER(9) NOT NULL, PART_NBR VARCHAR2(20) NOT NULL, QTY NUMBER(5) NOT NULL, FILLED_QTY NUMBER(5), CONSTRAINT line_item_pk PRIMARY KEY (ORDER_NBR,PART_NBR));
CREATE TABLE LINE_ITEM( ORDER_NBR NUMBER(9) NOT NULL, PART_NBR VARCHAR2(20) NOT NULL, QTY NUMBER(5) NOT NULL, FILLED_QTY NUMBER(5), CONSTRAINT line_item_pk PRIMARY KEY (ORDER_NBR,PART_NBR));
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
44
Moving data from XML to relations Then insert results:
SQL> INSERT INTO line_item(order_nbr,part_nbr,qty)SELECT 7109, extractvalue(column_value, '/item/part_number'),
extractvalue(column_value, '/item/quantity')FROM TABLE(
SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item'))
FROM purchase_order);
SQL> INSERT INTO line_item(order_nbr,part_nbr,qty)SELECT 7109, extractvalue(column_value, '/item/part_number'),
extractvalue(column_value, '/item/quantity')FROM TABLE(
SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item'))
FROM purchase_order);
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
45
XML Schemas and Oracle By default, XML must be well-formed to be read into
the XMLType field XML is valid if it conforms to a schema To use a schema with Oracle, must first register it:
declare bf1 bfile;beginbf1 := bfilename('XML_DATA',
'purch_ord.xsd');dbms_xmlschema.registerschema('http://localhost:8080/home/xml/schemas/purch_ord.xsd', bf1);end;
declare bf1 bfile;beginbf1 := bfilename('XML_DATA',
'purch_ord.xsd');dbms_xmlschema.registerschema('http://localhost:8080/home/xml/schemas/purch_ord.xsd', bf1);end;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
46
XML Schemas and Oracle With schema registered, can apply it to an XMLType field
CREATE TABLE purchase_order2 (po_id NUMBER(5) NOT NULL, customer_po_nbr VARCHAR2(20), customer_inception_date DATE, order_nbr NUMBER(5), purchase_order_doc XMLTYPE, CONSTRAINT purchase_order2_pk PRIMARY KEY (po_id))XMLTYPE COLUMN purchase_order_doc XMLSCHEMA "http://localhost:8080/home/xml/schemas/purch_ord.xsd"
ELEMENT "purchase_order";
CREATE TABLE purchase_order2 (po_id NUMBER(5) NOT NULL, customer_po_nbr VARCHAR2(20), customer_inception_date DATE, order_nbr NUMBER(5), purchase_order_doc XMLTYPE, CONSTRAINT purchase_order2_pk PRIMARY KEY (po_id))XMLTYPE COLUMN purchase_order_doc XMLSCHEMA "http://localhost:8080/home/xml/schemas/purch_ord.xsd"
ELEMENT "purchase_order";
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
47
Importing to schema field Try to import xml file, get error:
declare bf1 bfile;begin bf1 := bfilename('XML_DATA', 'purch_ord.xml'); insert into purchase_order2(po_id, purchase_order_doc) values (2000, XMLTYPE(bf1, nls_charset_id('WE8MSWIN1252')));end;
declare bf1 bfile;begin bf1 := bfilename('XML_DATA', 'purch_ord.xml'); insert into purchase_order2(po_id, purchase_order_doc) values (2000, XMLTYPE(bf1, nls_charset_id('WE8MSWIN1252')));end;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
48
Importing to schema field Root node of XML must specify the schema Change root to the following:
Now can import Also fails if extra or missing nodes
Modify company_name node Add new comments node
<purchase_order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xsi:noNamespaceSchemaLocation="http://localhost:8080/home/xml/schemas/purch_ord.xsd">
<purchase_order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xsi:noNamespaceSchemaLocation="http://localhost:8080/home/xml/schemas/purch_ord.xsd">
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
49
Can check to see whether schema is used Can call isSchemaBased(), getSchemaURL()
and isSchemaValid() on XMLType fields:
SQL> select po.purchase_order_doc.isSchemaBased(),po.purchase_order_doc.getSchemaURL(),po.purchase_order_doc.isSchemaValid()
from purchase_order2 po;
SQL> select po.purchase_order_doc.isSchemaBased(),po.purchase_order_doc.getSchemaURL(),po.purchase_order_doc.isSchemaValid()
from purchase_order2 po;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
50
Updating XMLType data Can update XMLType data with ordinary
UPDATE statements:
Replaces whole XMLType object with new one
SQL> UPDATE purchase_order poSET po.purchase_order_doc = XMLTYPE(BFILENAME('XML_DATA', 'purch_ord_alt.xml'), nls_charset_id('WE8MSWIN1252'))WHERE po.po_id = 2000;
SQL> UPDATE purchase_order poSET po.purchase_order_doc = XMLTYPE(BFILENAME('XML_DATA', 'purch_ord_alt.xml'), nls_charset_id('WE8MSWIN1252'))WHERE po.po_id = 2000;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
51
Updating XMLType data Can also modify the existing XMLType object
By writing node values updateXML() function does search/replace
But searches for node, not value
SQL> SELECT extract(po.purchase_order_doc,'/purchase_order/customer_name') FROM purchase_order poWHERE po_id = 1000;
SQL> UPDATE purchase_order poSET po.purchase_order_doc = updateXML(po.purchase_order_doc,'/purchase_order/customer_name/text()', 'some other company')WHERE po.po_id = 1000;
SQL> SELECT extract(po.purchase_order_doc,'/purchase_order/customer_name') FROM purchase_order poWHERE po_id = 1000;
SQL> UPDATE purchase_order poSET po.purchase_order_doc = updateXML(po.purchase_order_doc,'/purchase_order/customer_name/text()', 'some other company')WHERE po.po_id = 1000;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
52
Updating XMLType data Can also write whole node, using XMLType:
Validation/well-formedness is still checked
SQL> UPDATE purchase_order poSET po.purchase_order_doc =
updateXML(po.purchase_order_doc,'/purchase_order/customer_name',XMLTYPE('<customer_name>some third
company</customer_name>'))WHERE po.po_id = 1000;
SQL> SELECT extract(po.purchase_order_doc,'/purchase_order/customer_name')
FROM purchase_order poWHERE po_id = 1000;
SQL> UPDATE purchase_order poSET po.purchase_order_doc =
updateXML(po.purchase_order_doc,'/purchase_order/customer_name',XMLTYPE('<customer_name>some third
company</customer_name>'))WHERE po.po_id = 1000;
SQL> SELECT extract(po.purchase_order_doc,'/purchase_order/customer_name')
FROM purchase_order poWHERE po_id = 1000;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
53
Updating XMLType data And can update items in a collection:
SQL> SELECT extract(po.purchase_order_doc, '/purchase_order//item')FROM purchase_order poWHERE po.po_id = 1000;
SQL> UPDATE purchase_order poSET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/po_items/item[1]', XMLTYPE('<item><part_number>T-1000</part_number><quantity>33</quantity></item>'))WHERE po.po_id = 1000;
SQL> SELECT extract(po.purchase_order_doc, '/purchase_order//item')FROM purchase_order poWHERE po.po_id = 1000;
SQL> UPDATE purchase_order poSET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/po_items/item[1]', XMLTYPE('<item><part_number>T-1000</part_number><quantity>33</quantity></item>'))WHERE po.po_id = 1000;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
54
Converting relational data to XML Saw how to put XML in a table Conversely, can convert ordinary relational
data to XML XMLElement() generates an XML node
First, create supplier table:CREATE TABLE SUPPLIER( SUPPLIER_ID NUMBER(5) NOT NULL, NAME VARCHAR2(30) NOT NULL, PRIMARY KEY (SUPPLIER_ID));insert into supplier values(1, 'Acme');insert into supplier values(2, 'Tilton');insert into supplier values(3, 'Eastern');
CREATE TABLE SUPPLIER( SUPPLIER_ID NUMBER(5) NOT NULL, NAME VARCHAR2(30) NOT NULL, PRIMARY KEY (SUPPLIER_ID));insert into supplier values(1, 'Acme');insert into supplier values(2, 'Tilton');insert into supplier values(3, 'Eastern');
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
55
Converting relational data to XML Now can call XMLElement function to wrap values in
tags:
And can build it up:
Don’t concatenate! Turns to strings, escapes < > Error in book
SELECT XMLElement("supplier_id", s.supplier_id) ||XMLElement("name", s.name) xml_fragment
FROM supplier s;
SELECT XMLElement("supplier_id", s.supplier_id) ||XMLElement("name", s.name) xml_fragment
FROM supplier s;
SELECT XMLElement("supplier",XMLElement("supplier_id", s.supplier_id), XMLElement("name", s.name))
FROM supplier s;
SELECT XMLElement("supplier",XMLElement("supplier_id", s.supplier_id), XMLElement("name", s.name))
FROM supplier s;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
56
XMLForest() More simply, can use XMLForest() function:
SELECT XMLElement("supplier", XMLForest(s.supplier_id, s.name))FROM supplier s;
SELECT XMLElement("supplier", XMLForest(s.supplier_id, s.name))FROM supplier s;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
57
XMLAgg() Can use XMLAgg() to put nodes together
inside another node:
|| typo again…
SELECT XMLElement("supplier_list", XMLAgg(XMLElement("supplier", XMLElement("supplier_id", s.supplier_id), XMLElement("name", s.name) ))) xml_documentFROM supplier s;
SELECT XMLElement("supplier_list", XMLAgg(XMLElement("supplier", XMLElement("supplier_id", s.supplier_id), XMLElement("name", s.name) ))) xml_documentFROM supplier s;
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
58
New topic: Regular Expressions In automata theory, Finite Automata are the
simplest weakest of computer, Turing Machines the strongest Chomsky’s Hierarchy
FA are equivalent to a regular expression Expressions that specify a pattern Can check whether a string matches the pattern
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
59
RegEx matching Use REGEX_LIKE Metachar for any char is . First, get employee_comment table:
http://pages.stern.nyu.edu/~mjohnson/oracle/empcomm.sql
Now do search:
So far, like LIKE
SELECT emp_id, textFROM employee_commentWHERE REGEXP_LIKE(text,'...-....');
SELECT emp_id, textFROM employee_commentWHERE REGEXP_LIKE(text,'...-....');
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
60
RegEx matching Can also pull out the matching text with
REGEXP_SUBSTR:
If want only numbers, can specify a set of chars rather than a dot:
SELECT emp_id, REGEXP_SUBSTR(text,'...-....') textFROM employee_commentWHERE REGEXP_LIKE(text,'...-....');
SELECT emp_id, REGEXP_SUBSTR(text,'...-....') textFROM employee_commentWHERE REGEXP_LIKE(text,'...-....');
SELECT emp_id, REGEXP_SUBSTR(text, '[0123456789]..-....') textFROM employee_commentWHERE REGEXP_LIKE(text,'...-....');
SELECT emp_id, REGEXP_SUBSTR(text, '[0123456789]..-....') textFROM employee_commentWHERE REGEXP_LIKE(text,'...-....');
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
61
RegEx matching Or can specify a range of chars:
Or, finally, can state how many copies to match:
SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]..-....') textFROM employee_commentWHERE REGEXP_LIKE(text,'...-....');
SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]..-....') textFROM employee_commentWHERE REGEXP_LIKE(text,'...-....');
SELECT emp_id, REGEXP_SUBSTR(text,'[0-9]{3}-[0-9]{4}') text
FROM employee_commentWHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{4}');
SELECT emp_id, REGEXP_SUBSTR(text,'[0-9]{3}-[0-9]{4}') text
FROM employee_commentWHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{4}');
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
62
RegExp matching Other operators:
* - 0 or more matches + - 1 or more matches ? - 0 or 1 match
Also, can OR options together with | op Here: some phone nums have area codes, some
not, so want to match both:
SELECT emp_id, REGEXP_SUBSTR(text,'[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}-
[0-9]{4}') textFROM employee_commentWHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{4}');
SELECT emp_id, REGEXP_SUBSTR(text,'[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}-
[0-9]{4}') textFROM employee_commentWHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{4}');
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
63
RegExp matching Order of ORed together patterns matters:
First matching pattern wins
SELECT emp_id, REGEXP_SUBSTR(text,'[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-
[0-9]{4}') textFROM employee_commentWHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}');
SELECT emp_id, REGEXP_SUBSTR(text,'[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-
[0-9]{4}') textFROM employee_commentWHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}');
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
64
RegExp matching There’s a shared structure between the two,
tho Area code is just optional Can use ? op
SELECT emp_id, REGEXP_SUBSTR(text,'([0-9]{3}-)?[0-9]{3}-[0-9]{4}') text
FROM employee_commentWHERE REGEXP_LIKE(text,'([0-9]{3}-)?[0-9]{3}-[0-9]{4}');
SELECT emp_id, REGEXP_SUBSTR(text,'([0-9]{3}-)?[0-9]{3}-[0-9]{4}') text
FROM employee_commentWHERE REGEXP_LIKE(text,'([0-9]{3}-)?[0-9]{3}-[0-9]{4}');
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
65
RegExp matching Also, different kinds of separators:
dash, dot, just blank Can OR together whole number patterns Better: Just use set of choices of each sep.
SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}') textFROM employee_commentWHERE REGEXP_LIKE(text,'([0-9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}');
SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}') textFROM employee_commentWHERE REGEXP_LIKE(text,'([0-9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}');
Matthew P. Johnson, OCL1, CISDD CUNY, F2004
66
RegExp matching One other thing: area codes in parentheses
Of course, area codes are still optional Parentheses must be escaped - \( \)
SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}') textFROM employee_commentWHERE REGEXP_LIKE(text,'([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}');
SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}') textFROM employee_commentWHERE REGEXP_LIKE(text,'([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}');