1
Storing XML using Relational Model
XML and data management
By: Habiba Skalli
For: Dr. Haddouti
2
Outline of this presentation
– Introduction
– XML Storage Requirements
– XML-Enabled Relational Databases
– Strategies of Storing XML
– Microsoft SQL Server 2000
– Oracle 8i/9i
– IBM DB2
– Sybase Adaptive Server
3
Introduction
The eXtensible Markup Language (XML) has become the standard for B2B systems.
XML documents are conceived as a transitory form of data, because XML was not designed to facilitate efficient data storage or retrieval. (1)
Processing and accessing data in large XML files is time-consuming. (5)
XML data has to be stored consistently and efficiently, and retrieved quickly and accurately. (1)
XML documents can be stored in an RDBMS, an ODBMS, or a native XML database. (6)
4
XML Storage Requirements
XML documents and their storage requirements can generally be thought of in two categories:
– Data centric: an XML document is data centric if it has a well-defined structure and contains updateable data (e.g. invoices, purchase orders).
– Document centric: document-centric data tends to be more unpredictable in size and content (e.g. newspaper content, articles, and advertisements).
Traditional relational databases are typically better at dealing with data-centric requirements. (2)
5
Relational Database
A relational database consists of a set of tables.
Each table is a set of records.
A record (or row) is a set of fields.
All records in a particular table have the same number of fields, with the same field names. (4)
It follows that storing XML in an RDBMS is not straightforward, because the XML and relational approaches are built on different principles: XML is hierarchical and nested, whereas relational tables are flat.
This created the need for tools that can map XML to relational data. (6)
6
What we need
We need an automatic mapping technique that converts XML documents into the relational model effectively.
We need techniques to execute SQL statements and convert their result sets to XML documents. (7)
7
XML Enabled Relational Databases
Various database vendors have developed solutions for automatic conversion of XML into and out of relational databases:
– Microsoft SQL Server 2000.
– Oracle 8i, Oracle 9i.
– IBM DB2.
– Sybase Adaptive Server. (3)
The tools and techniques offered for retrieving and storing XML data vary from vendor to vendor. (2)
8
Strategies of Storing XML
Method 1:
Store the entire document as a single value, such as a character or binary large object (CLOB/BLOB), in the relational database.
This strategy is appropriate for document-centric data.
All leading RDBMS vendors support this method (Microsoft SQL Server, Oracle 8i, IBM DB2).
This method is simple, but it does not allow indexing or searching on the document's elements.
9
Strategies of Storing XML
Method 2:
Store the entire document in the file system, with a pointer to that file stored in the database.
This method is useful if the XML documents are few and infrequently updated.
It is supported by the leading database vendors.
Its drawbacks are limited flexibility in storing and retrieving documents, and weaker security, since the files are stored outside the database.
10
Strategies of Storing XML
Method 3:
Map the structure of the XML document to the database, so that the different elements are stored in relational tables (called side tables).
The content of XML documents stored this way can be searched, updated, and retrieved.
This method is the most popular one and the subject of many books and articles. (2 + 11)
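A minimal sketch of this mapping, assuming the FXTRADE sample document used later in this presentation and only the JDK's built-in DOM parser. The class ShredSketch and its method names are hypothetical, and the generated INSERT text is illustrative; it is not any vendor's mapping tool. Leaf elements become columns, and the nested ACCOUNT element becomes a row in its own table.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class ShredSketch {

    // Parse an XML string and return its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Collect the direct child elements of `parent` that have no child
    // elements of their own as column-name -> value pairs (one row).
    static Map<String, String> leafColumns(Element parent) {
        Map<String, String> row = new LinkedHashMap<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element
                    && ((Element) n).getElementsByTagName("*").getLength() == 0) {
                row.put(n.getNodeName(), n.getTextContent().trim());
            }
        }
        return row;
    }

    // Build a parameterized INSERT for one row; the values would be
    // bound separately (for example through a JDBC PreparedStatement).
    static String insertSql(String table, Map<String, String> row) {
        String columns = String.join(", ", row.keySet());
        String marks = String.join(", ", Collections.nCopies(row.size(), "?"));
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + marks + ")";
    }

    public static void main(String[] args) {
        Element trade = parseRoot(
              "<FXTRADE><CURRENCY1>GBP</CURRENCY1><CURRENCY2>JPY</CURRENCY2>"
            + "<AMOUNT>10000</AMOUNT><SETTLEMENT>20010325</SETTLEMENT>"
            + "<ACCOUNT><BANKCODE>812</BANKCODE><BANKACCT>00365888</BANKACCT></ACCOUNT>"
            + "</FXTRADE>");
        Element account = (Element) trade.getElementsByTagName("ACCOUNT").item(0);
        // One statement per side table.
        System.out.println(insertSql("ACCOUNT", leafColumns(account)));
        System.out.println(insertSql("FXTRADE", leafColumns(trade)));
    }
}
```

The sketch leaves out the key that links an FXTRADE row to its ACCOUNT row; a real mapping has to generate or look up that key when it inserts the two rows.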
11
Microsoft SQL Server 2000
SQL Server 2000 introduced several features for storing XML documents and retrieving XML data:
– SELECT statement result sets can be mapped into XML documents by using the FOR XML clause.
– XML documents can be stored in the database by using OPENXML. (2)
12
<!-- Primitive Types -->
<!ELEMENT CURRENCY1 (#PCDATA)>
<!ATTLIST CURRENCY1 e-dtype NMTOKEN #FIXED "string" e-dsize NMTOKEN #FIXED "3">
<!ELEMENT CURRENCY2 (#PCDATA)>
<!ATTLIST CURRENCY2 e-dtype NMTOKEN #FIXED "string" e-dsize NMTOKEN #FIXED "3">
<!ELEMENT AMOUNT (#PCDATA)>
<!ATTLIST AMOUNT e-dtype NMTOKEN #FIXED "decimal">
<!ELEMENT SETTLEMENT (#PCDATA)>
<!ATTLIST SETTLEMENT e-dtype NMTOKEN #FIXED "date">
<!ELEMENT BANKCODE (#PCDATA)>
<!ATTLIST BANKCODE e-dtype NMTOKEN #FIXED "string">
<!ELEMENT BANKACCT (#PCDATA)>
<!ATTLIST BANKACCT e-dtype NMTOKEN #FIXED "string">
<!-- Derived Types -->
<!ELEMENT ACCOUNT (BANKCODE, BANKACCT)>
<!ELEMENT FXTRADE (CURRENCY1, CURRENCY2, AMOUNT, SETTLEMENT, ACCOUNT)> (3)
13
14
SQL to XML
In general, the SQL statement that generates an XML document has this form:
SELECT select_list
FROM table_source
WHERE search_condition
FOR XML AUTO | RAW | EXPLICIT [, XMLDATA] [, ELEMENTS] [, BINARY BASE64]
15
SQL to XML (FOR XML)
The process of generating an XML document from the database is achieved in two steps:
– Step 1: Create AS aliases for the atomic elements in the desired output XML; the alias defines the parent/child relationships between elements.
FXTRADE                /* LEVEL=1 */
    CURRENCY1          [FXTRADE!1!CURRENCY1]
    CURRENCY2          [FXTRADE!1!CURRENCY2]
    AMOUNT             [FXTRADE!1!AMOUNT]
    SETTLEMENT         [FXTRADE!1!SETTLEMENT]
    ACCOUNT            /* LEVEL=2 */
        BANKCODE       [ACCOUNT!2!BANKCODE]
        BANKACCT       [ACCOUNT!2!BANKACCT]
16
Step 2:
– The output tree structure is defined in SQL. Each level of the tree is defined through a SELECT statement, and the levels are then combined into the tree by means of UNION ALL.
– The level-1 SELECT statement introduces the names of the atomic elements on all levels.
– Each SELECT statement introduces a tree level tag and its parent tag.
– There is a single record in the result set corresponding to the tree root, as defined in the following first SELECT statement:
17
SELECT 1    AS Tag,
       NULL AS Parent,
       NULL AS [FXTRADE!1!CURRENCY1],
       NULL AS [FXTRADE!1!CURRENCY2],
       NULL AS [FXTRADE!1!AMOUNT],
       NULL AS [FXTRADE!1!SETTLEMENT],
       NULL AS [ACCOUNT!2!BANKCODE],
       NULL AS [ACCOUNT!2!BANKACCT]
FROM FXTRADE
UNION ALL
SELECT 2, 1,
       FXTRADE.CURRENCY1,
       FXTRADE.CURRENCY2,
       FXTRADE.AMOUNT,
       FXTRADE.SETTLEMENT,
       ACCOUNT.BANKCODE,
       ACCOUNT.BANKACCT
FROM FXTRADE, ACCOUNT
WHERE FXTRADE.ACCOUNT = ACCOUNT.ID
ORDER BY [ACCOUNT!2!BANKCODE], [ACCOUNT!2!BANKACCT]
FOR XML EXPLICIT, ELEMENTS
18
SQL to XML
FOR XML creates an XML document containing the results of the query:
FOR XML mode [, XMLDATA] [, ELEMENTS]
The keyword EXPLICIT means that you specify the format of the results yourself.
Another mode, AUTO, constructs XML documents by applying the default rules.
If XMLDATA is specified, the schema is returned along with the results.
The keyword ELEMENTS models SQL columns at the element level; if ELEMENTS is omitted, the columns are modeled at the attribute level by default. (3)
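To make the difference between the two column mappings concrete, here is a small sketch in plain Java with no database access. The class ForXmlShapes and its methods are hypothetical; they only imitate the shape of the output (columns as attributes versus columns as child elements) and are not SQL Server's serializer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class ForXmlShapes {

    // Escape the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Attribute-level mapping: each column becomes an attribute of the row element.
    static String asAttributes(String table, Map<String, String> row) {
        StringBuilder sb = new StringBuilder("<").append(table);
        for (Map.Entry<String, String> col : row.entrySet()) {
            sb.append(' ').append(col.getKey())
              .append("=\"").append(escape(col.getValue())).append('"');
        }
        return sb.append("/>").toString();
    }

    // Element-level mapping: each column becomes a child element of the row element.
    static String asElements(String table, Map<String, String> row) {
        StringBuilder sb = new StringBuilder("<").append(table).append('>');
        for (Map.Entry<String, String> col : row.entrySet()) {
            sb.append('<').append(col.getKey()).append('>')
              .append(escape(col.getValue()))
              .append("</").append(col.getKey()).append('>');
        }
        return sb.append("</").append(table).append('>').toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("CURRENCY1", "GBP");
        row.put("CURRENCY2", "JPY");
        row.put("AMOUNT", "10000");
        System.out.println(asAttributes("FXTRADE", row));
        System.out.println(asElements("FXTRADE", row));
    }
}
```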
19
Storing XML in the database
Storing XML documents in the SQL Server database uses OPENXML (for insert and update).
The syntax:
OPENXML(idoc int [in], rowpattern nvarchar [in], [flags byte [in]])
[WITH (SchemaDeclaration | TableName)]
idoc is the document handle of the internal representation of an XML document, created by calling sp_xml_preparedocument.
rowpattern is the XPath pattern used to identify the nodes to be processed as rows.
flags indicates the mapping that should be used between the XML data and the relational rowset. (3 + 10)
20
Storing
The storing is done in three steps:
1. Compiling the XML document into an internal DOM representation to obtain an XML document handle, using the stored procedure sp_xml_preparedocument.
2. Creating a schema that associates the rowset columns with XML elements.
3. Removing the compiled XML document from memory, using the stored procedure sp_xml_removedocument.
21
22
DECLARE @idoc int
DECLARE @doc varchar(1000)
SET @doc = '
<FXTRADE>
    <CURRENCY1>GBP</CURRENCY1>
    <CURRENCY2>JPY</CURRENCY2>
    <AMOUNT>10000</AMOUNT>
    <SETTLEMENT>20010325</SETTLEMENT>
    <ACCOUNT>
        <BANKCODE>812</BANKCODE>
        <BANKACCT>00365888</BANKACCT>
    </ACCOUNT>
</FXTRADE>'
23
-- Create the internal DOM representation of the XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
-- Execute a SELECT statement using the OPENXML rowset provider.
-- The sample document stores its values in elements, so the column
-- patterns address child elements rather than attributes.
SELECT *
FROM OPENXML (@idoc, '/FXTRADE/ACCOUNT', 2)
WITH (
    CURRENCY1   CHAR (3)        '../CURRENCY1',
    CURRENCY2   CHAR (3)        '../CURRENCY2',
    AMOUNT      NUMERIC (18,2)  '../AMOUNT',
    SETTLEMENT  DATETIME        '../SETTLEMENT',
    BANKCODE    VARCHAR (100)   'BANKCODE',
    BANKACCT    VARCHAR (100)   'BANKACCT')
-- Remove the compiled document from memory.
EXEC sp_xml_removedocument @idoc
(3)
24
Oracle 8i/9i
Oracle offers the XML SQL Utility (a set of Java classes) that:
– models XML document elements as a collection of nested tables and allows update and delete.
– generates XML documents from SQL query results or a JDBC ResultSet object. (2 + 3 + 8)
25
Oracle 8i/9i
Mapping of SQL to XML:
CREATE TYPE AccountType AS OBJECT (
    BANKCODE VARCHAR (100),
    BANKACCT VARCHAR (100)
)
CREATE TABLE FXTRADE (
    CURRENCY1  CHAR (3),
    CURRENCY2  CHAR (3),
    AMOUNT     NUMERIC (18,2),
    SETTLEMENT DATE,
    ACCOUNT    AccountType  -- object reference
)
The field ACCOUNT in the table FXTRADE is modeled as an object reference of type AccountType, so the type has to be created before the table.
26
SQL to XML
Using the SQL statement "SELECT * FROM FXTRADE", the following XML is generated:
<?xml version="1.0"?>
<ROWSET>
    <ROW num="1">
        <CURRENCY1>GBP</CURRENCY1>
        <CURRENCY2>JPY</CURRENCY2>
        <AMOUNT>10000</AMOUNT>
        <SETTLEMENT>20010325</SETTLEMENT>
        <ACCOUNT>
            <BANKCODE>812</BANKCODE>
            <BANKACCT>00365888</BANKACCT>
        </ACCOUNT>
    </ROW>
    <!-- additional rows ... -->
</ROWSET>
27
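The same XML document is obtained by the following code:
import java.sql.*;
import oracle.jdbc.driver.*;
import oracle.xml.sql.query.OracleXMLQuery;

// Class to test XML document generation as a String
class testXMLSQL {
    public static void main(String[] args) {
        try {
            // Create the connection
            Connection conn = getConnection("scott", "tiger");
            // Create the query class
            OracleXMLQuery qry = new OracleXMLQuery(conn, "SELECT * FROM FXTRADE");
            // Assumed completion: the slide's listing breaks off here.
            // Get the XML document as a string and print it.
            String xmlString = qry.getXMLString();
            System.out.println(xmlString);
            qry.close();
        } catch (SQLException e) {
            System.out.println(e.toString());
        }
    }
    ... // getConnection is not shown on the slide
}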
28
Storing XML in the database
import java.sql.*;
import oracle.xml.sql.dml.OracleXMLSave;

public class testXMLInsert {
    public static void main(String args[]) throws SQLException {
        Connection conn = getConnection("scott", "tiger");
        OracleXMLSave sav = new OracleXMLSave(conn, "scott.FXTRADE");
        // Assume that the user passes in this document as args[0]
        sav.insertXML(args[0]);
        sav.close();
    }
    ...
}
29
Remarks
It is important to note that XSU (the XML SQL Utility) does not allow the storage of attributes; these have to be transformed into elements first. (3)
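One possible way to perform that transformation before handing a document to XSU is sketched here with the JDK's DOM API. The class AttributesToElements is hypothetical and not part of XSU, and the sketch assumes documents without XML namespaces (an xmlns declaration would be treated like any other attribute). Each attribute is replaced by a child element of the same name that carries the attribute's value.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only; not part of XSU.
class AttributesToElements {

    // Recursively replace every attribute of `e` with a leading child element.
    static void convert(Element e) {
        // Snapshot the existing child elements before new ones are added.
        List<Element> children = new ArrayList<>();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) children.add((Element) n);
        }
        for (Element child : children) convert(child);

        // Snapshot the attributes: the live map shrinks as they are removed.
        NamedNodeMap attrs = e.getAttributes();
        List<Attr> snapshot = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) snapshot.add((Attr) attrs.item(i));

        Node first = e.getFirstChild();
        for (Attr a : snapshot) {
            Element child = e.getOwnerDocument().createElement(a.getName());
            child.setTextContent(a.getValue());
            e.insertBefore(child, first); // appends when `first` is null
            e.removeAttributeNode(a);
        }
    }

    // Parse an XML string, convert it, and return the converted root element.
    static Element parseAndConvert(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            convert(root);
            return root;
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not parse XML", ex);
        }
    }

    // Serialize an element back to text without the XML declaration.
    static String toXml(Element root) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(root), new StreamResult(out));
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("could not serialize XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<FXTRADE AMOUNT=\"10000\"><ACCOUNT BANKCODE=\"812\"/></FXTRADE>";
        System.out.println(toXml(parseAndConvert(xml)));
    }
}
```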
30
IBM DB2
IBM DB2 offers the DB2 XML Extender. It allows the storage of XML documents in two ways:
– XML Column: allows the storage and retrieval of the entire XML document as column data.
– XML Collection: decomposes the XML document into, or composes it from, a collection of relational tables. (3)
31
Document Access Definition
DB2 XML Extender provides a mapping scheme called a Document Access Definition (DAD).
DAD is a file that allows an XML document to be mapped into relational data using either XML Columns or XML Collections.
32
Example
33
Sybase Adaptive Server
Sybase Adaptive Server offers the ResultSetXml Java class for processing XML documents in both directions.
XML from the database:
– The ResultSetXml class has a constructor that takes an SQL query as an argument; the getXmlText method then extracts an XML document from the result set:
jcs.xml.resultset.ResultSetXml rsx =
    new jcs.xml.resultset.ResultSetXml("Select * from FxTrade", <other parameters>);
FileUtil.string2File("FxTradeSet.xml", rsx.getXmlText());
34
Sybase Adaptive Server
Storing XML in the database:
The ResultSetXml class constructor can also take an XML document as an argument. The toSqlScript method then generates a sequence of SQL statements for insert/update into a specified table from the result set:
String xmlString = FileUtil.file2string("FxTradeSet.xml");
jcs.xml.resultset.ResultSetXml rsx =
    new jcs.xml.resultset.ResultSetXml(xmlString);
String sqlString = rsx.toSqlScript("FxTrade", <other parameters>);
(2)
35
Vendor | Mapping rules | Single table / multiple tables | Means of transformation | Symmetrical extraction / storing
-------|---------------|--------------------------------|-------------------------|---------------------------------
Oracle | Implicitly; by constructing an object-relational data model | Multiple | Designated Java classes | Symmetrical, if the XML document and the object-relational model match
IBM | Document Access Definition file | Multiple | Designated stored procedures | Symmetrical
Microsoft | SQL extension; row set function | Multiple for extraction; single for storing | By using the SQL construct FOR XML and the row set OPENXML | Asymmetrical
Sybase | Result Set DTD | Single; query may encompass multiple tables | By using Java classes | Symmetrical
36
References
1. www.eaijournal.com/PDF/StoringXMLChampion.pdf
2. www.acm.org/crossroads/xrds8-4/XML_RDBMS.html
3. http://www.xml.com/pub/a/2001/06/20/databases.html
4. http://www.w3.org/XML/RDB.html
5. http://www.infoloom.com/gcaconfs/WEB/granada99/noe.HTM
6. http://www.hitsw.com/products_services/whitepapers/integrating_xml_rdb/
7. http://www.utdallas.edu/~lkhan/papers/APESXDRD_ProcACM3rdWIDM2001.pdf
8. http://www.xml.com/pub/r/846
9. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_oa-oz_5c89.asp
10. http://msdn.microsoft.com/library/psdk/sql/ts_oa-oz_5c89.htm
11. Db2XMLeXtender.pdf
12. www.microsoft.com/mspress/books/sampchap/5178a.asp