IMS XML Database and XQuery Christopher Holtz IBM August 15 th, 2007 #1231

Embed Size (px)

DESCRIPTION

3 What is XML A Standardized, Simple, and Self-Describing Markup Language for documents containing structured or semi-structured information. IMS XML Database 60 Holtz Christopher Loved It Can’t wait to start using it

Citation preview

IMS XML Database and XQuery Christopher Holtz IBM August 15 th, 2007 #1231 2 Okso whats the big idea behind XML these days anyway?!? 3 What is XML A Standardized, Simple, and Self-Describing Markup Language for documents containing structured or semi-structured information. IMS XML Database 60 Holtz Christopher Loved It Cant wait to start using it 4 Why XML Standard Internet Data Exchange Format Self-Describing Handles encoding (internationalization) Handles byte ordering Can Represent Almost Anything Forces Syntax-Level Interoperability Easily Parsed Confers Longevity Standard! 5 Standards built on XML (to name a few) DTD DOM SAX SOAP SQL/XML VoiceXML WAP WSDL WS-Policy XForms XHTML XInclude XLink XML Base XML Encryption XML Key Management XML Processing XML Schema XML Signature XPath XPointer XQuery XSL and XSLT FIXML FpML IFX MMDL OFX Schema RIXML SWIFTNet XBRL Accounting Advertising Astronomy Building Chemistry Construction Education Food Finance Government Healthcare Insurance Legal Manufacturing News Physics Telecommunications FinancialOther Field Vocabularies Xerces Xalan JAXB JAXP JAXR JAX-RPC JDOM XQJ Open Source and Java 6 The XML Schema Definition Language An XML Schema: defines elements and attributes that can appear in a document defines which elements are child elements defines the order and number of child elements defines whether an element is empty or can include text defines data types for elements and attributes defines default and fixed values for elements and attributes Defines an agreed upon communication contract for exchanging XML documents An XML language for defining the legal building blocks of a valid XML document 7 XML Schema Example Presentation Presenter Comments comment 8 Well formed vs. Valid XML Document Well formed Obeys the XML Syntax Rules must begin with the XML declaration must have one unique root element all start tags must match end-tags XML tags are case sensitive all elements must be closed all elements must be properly nested all attribute values must be quoted XML entities must be used for special characters Valid Conforms to a specific XML Schema 9 Great.so XML is fantastic as a data exchange but why would I want to store my data as XML. 10 Why XML Databases Physical Storage Benefits Very much depends on the type of data and how it is going to be used!! Data coming in as XML and going out as XML Need to convert to / from XML Structure Types / Encoding Conceptual divide between XML and Physical Storage Layout XML Standards (XML Schemas, XQuery, etc) 11 Sounds greatbut I certainly cant migrate my data off of IMS 12 IMS XML-DB Metadata Natural mapping between hierarchic XML data and hierarchic IMS database definitions. PSB DBD IMS DB definition XML Schema mapping XML view of IMS data XML document definition 13 IMS XML Database Introduces a way to view/map native IMS hierarchical data to XML documents Aligns IMS Database (DBD) with XML Schema Allows the retrieval and storage of IMS Records as XML documents with no change to existing IMS databases XML Documents IMS Data title seq pricepublisherchoice author lastfirst seq editor lastfirst seq affiliation xs:date xs:string xs:decimal XML Schema TITLE PUBLISH FIRSTLAST FIRST 0:oo AUTH EDIT BOOK YEAR PRICE LASTAFFIL PCB: BIB21 IMS DBD 14 XML Visualization DABBDC A BD C abcd efgjkl hi a bc d efg j k l hi doc DBD, PCB, Copybooks XML Schema On Disk I/O IMS Hierarchical Model XQuery 1.0 XPath 2.0 Data Model 15 Its the Metadata, Stupid! Physical Metadata Segment Sizes Segment Hierarchy (field relationships 1-to-1, 1-to-n) DBD Defined Fields Application Defined Fields Field Type, Type Length, Byte Ordering, Encoding, etc. Offer Field/Segment Renaming (lift 8 char restriction) Structural Metadata XML layout for fields (field relationships must still match) Element vs. Attribute (names must match) Type Restrictions, Enumerations, etc. Defined in DBD Defined in Copylibs (IMS Java) Defined in XML Schema 16 Its the Metadata, Stupid! Defined in DBD Defined in Copylibs (IMS Java) Defined in XML Schema number lastName firstName payment type date INT CHAR CHAR CHAR CHAR DATE Holtz Christopher 10/21/2003 MC 17 AA B BBCC DD XML Schema/ Metadata Composed XML Decomposed XML Retrieval in IMS 18 Simple XML Structuring Uses Default structure All data is represented as elements A B C D E F G H 19 Complex XML Structuring (V10) Uses Attribute Mapping Uses hard coded values Uses hard coded levels A B C D E F G H 20 Wonderfulso how do I actually use it 21 Using IMS as an XML Database Metadata Generation DLIModel utility Physical Metadata(PSB, DBD, Copybooks) Structural Metadata(XML Schema) Application Programming IMS JDBC retrieveXML / storeXML XQuery support Example 22 IMS Java Metadata classes (HFS) IMS Java Report (HFS) COBOL copybook members Control statements: 1) Choose PSBs/DBDs 2) Choose copybook members 3) Aliases, data types, new fields. DL/I Model Utility PSB DBD package samples.dealership; import com.ibm.ims.db.*; import com.ibm.ims.base.*; public class AUTPSB11DatabaseView extends DLIDatabaseView { // The following DLITypeInfo[] array describes Segment: DEALER in PCB: AUTOLPCB static DLITypeInfo[] AUTOLPCBDEALERArray= { new DLITypeInfo("DealerNo", DLITypeInfo.CHAR, 1, 4, "DLRNO"), new DLITypeInfo("DealerName", DLITypeInfo.CHAR, 5, 30, "DLRNAME"), new DLITypeInfo("DealerCity", DLITypeInfo.CHAR, 35, 10, "CITY"), new DLITypeInfo("DealerZip", DLITypeInfo.CHAR, 45, 10, "ZIP"), new DLITypeInfo("DealerPhone", DLITypeInfo.CHAR, 55, 7, "PHONE") }; static DLISegment AUTOLPCBDEALERSegment= new DLISegment ("DealerSegment","DEALER",AUTOLPCBDEALERArray,61);... // An array of DLISegmentInfo objects follows to describe the view for PCB: AUTOLPCB static DLISegmentInfo[] AUTOLPCBarray = { new DLISegmentInfo(AUTOLPCBDEALERSegment,DLIDatabaseView.ROOT), new DLISegmentInfo(AUTOLPCBMODELSegment,0), new DLISegmentInfo(AUTOLPCBORDERSegment,1), new DLISegmentInfo(AUTOLPCBSALESSegment,1), new DLISegmentInfo(AUTOLPCBSTOCKSegment,1), new DLISegmentInfo(AUTOLPCBSTOCSALESegment,4), new DLISegmentInfo(AUTOLPCBSALESINFSegment,5) };... } DLIModel IMS Java Report ======================== Class: AUTPSB11DatabaseView in package: samples.dealership generated for PSB: AUTPSB11 ================================================== PCB: Dealer ================================================== Segment: DealerSegment Field: DealerNo Type=CHAR Start=1 Length=4 ++ Primary Key Field ++ Field: DealerName Type=CHAR Start=5 Length=30 (Search Field) Field: DealerCity Type=CHAR Start=35 Length=10 (Search Field) Field: DealerZip Type=CHAR Start=45 Length=10 (Search Field) Field: DealerPhone Type=CHAR Start=55 Length=7 (Search Field) ================================================== Segment: ModelSegment Field: ModelKey Type=CHAR Start=3 Length=24 ++ Primary Key Field ++ Field: ModelType Type=CHAR Start=1 Length=2 (Search Field) Field: Make Type=CHAR Start=3 Length=10 (Search Field) Field: Model Type=CHAR Start=13 Length=10 (Search Field) Field: Year Type=CHAR Start=23 Length=4 (Search Field) Field: MSRP Type=CHAR Start=27 Length=5 (Search Field) Field: Count Type=CHAR Start=32 Length=2 (Search Field) ================================================== Segment: OrderSegment Field: OrderNo Type=CHAR Start=1 Length=6 ++ Primary Key Field ++ Field: LastName Type=CHAR Start=7 Length=25 (Search Field) Field: FirstName Type=CHAR Start=32 Length=25 (Search Field) Field: Date Type=CHAR Start=57 Length=10 (Search Field) Field: Time Type=CHAR Start=67 Length=8 (Search Field) ================================================== Segment: SalesSegment Field: SaleNo Type=CHAR Start=49 Length=4 ++ Primary Key Field If you can read this you do not need glasses; however this is just silly writting to represent the control statements that are the input to the utility. XML Schema(s) (HFS) (PDS) (HFS or PDS) XMI 1.2 (HFS) DL/I Model Utility 23 GUI DL/I Model Utility Install eclipse Plug-in Create new DL/I Utility Model Project 24 GUI DL/I Model Utility Select DBDs / PSBs must be FTPed locally Source is Parsed any errors are reported XMI Metamodel generated opened for editing 25 GUI DL/I Model Utility 26 GUI DL/I Model Utility 27 GUI DL/I Model Utility 28 CEETDLI Interface JNI Base AppApp DB Customer Code IMS Java Class Library Assembler Layer Interface to IMS IMS DB Metadata Business Logic Mapping to DL/I APIs IMS Dep. Region Transaction and Message Processing JDBC, JCA interface Java to C interface IMS Java Class Library JDBC/SQLXML-DB IMS Java App DLI Database View XML Shredder, XML Materializer Code 29 retrieveXML() UDF SELECT retrieveXML(A) FROM C WHERE C.fieldA = 35 *Two Rows of XML CLOBs in the ResultSet XML Context 35 30 storeXML() UDF INSERT INTO A (storeXML()) VALUES (?) *Insert Statement must be a Prepared Statement XML Context 31 CEETDLI Interface JNI Base AppApp DB JDBC / SQL XMSXMS IMS Java App DLI Database View MPPMPP BMPBMP IFPIFP COPYLIB PSBs DBDs CEETDLI Interface JNI Base AppApp DB JDBC / SQL IMS Java App DLI Database View CEETDLI Interface JNI Base AppApp DB JDBC / SQL IMS Java App DLI Database View JMP JBP CEETDLI Interface JNI Base AppApp DB JDBC / SQL IMS Java App DLI Database View CEETDLI Interface JNI Base AppApp DB JDBC / SQL IMS Java App DLI Database View Stored Procedure EJB JCICS DBDGEN PSBGEN ACBGEN IMS DB DRA ODBA CEETDLI Interface JNI Base AppApp DB JDBC / SQL IMS Java App DLI Database View Java Virtual Machine IMS JDBC Runtime 32 XQuery support in IMS V10 Further aligns IMS with industry direction XML, SOA, Web Services, etc. More natural fit for hierarchical data querying Enables customers to leverage emerging standard skill set Enhanced product and tooling integration Our IMS XML solution has a 38+ year head start Immediately usable with no migration of existing IMS data Data Content Information as a Service Information as a Service 33 SQL Road to Interoperability through XQuery Relational Engines SQLRelational XQuery Hierarchical (XML) DB2 UDB v9 (~2006) Hierarchical DL/I IMS Engine IMS v9 (2004) (XML) IMS v10 (~2007) (XML View) 34 FOR : iterates through a sequence, bind variable to items LET : binds a variable to a sequence WHERE : eliminates items of the iteration ORDER BY : reorders items of the iteration RETURN : constructs query results XQuery FLWOR Expressions { for $b in /bib/book let $title := $b/title where $b/publisher = "Addison-Wesley order by return { $title } } Advanced Programming in the Unix TCP/IP Illustrated 35 Extended JDBC interface SELECT retrieveXML( A, { for $b in book where $b/publisher = 'Addison-Wesley and > 1991 return { $b/title } } ) FROM C WHERE C.fieldA = 35 35 XML Context 36 Extended JDBC interface SELECT retrieveXML( { for $b in /bib/book where $b/publisher = 'Addison-Wesley and > 1991 return { $b/title } } ) FROM PCB XQuery Context PCB 37 Optimizing XQuery for IMS { for $b in /bib/book where $b/publisher = 'Addison-Wesley and $b/year > 1991 return { $b/title } } Step through every book and evaluate if it meets criteria For each match return this 38 Optimizing XQuery for IMS { for $b in ims:gn( ims:particle('/bib/book'), ims:and( ims:eq(ims:particle('/bib/book/publisher'), 'Addison-Wesley '), 1991) ) return { $b/title } } Optimized using native IMS XQuery functions Move directly to matching book particles using these SSAs and for each match return this We are working on Algorithms to do this For you post-QPP 39 Continually Reducing Development Costs Example Application Development (very simplified) Publishing tracking information for specified package SHIPFROM 0:oo SCAN PACKAGE NUM DATETIMELOCATION SHIPTO San Jose Los Angeles San Diego Desired OutputTracking Database 40 Continually Reducing Development Costs (COBOL Application Programming) Build SSA Template and Establish Parentage over Package Record Iterate through child SCAN segments with GNP calls Keep count and move selected fields into XML template ADD 1TO numScans MOVE SCAN-DATETO scan (numScans) date MOVE SCAN-LOCTO scan (numScans) location CALL AERTDLI USING DLI-GU AIB-MASK PACKAGE-SEG PACKAGE-SSA. IF IPCB-STATUS-CODE = SPACES 01 PACKAGE-SSA 05 FILLERPIC X(9)VALUE PACKAGE (. 05 FILLERPIC X(10)VALUE NUM EQ. 05 PACKAGE-NUMPIC X(12). 05 FILLERPIC XVALUE ). CALL AERTDLI USING DLI-GNP AIB-MASK SCAN-SEG SCAN-SSA. IF IPCB-STATUS-CODE = SPACES 01 SCAN-SSA 05 FILLERPIC X(9)VALUE SCAN . 41 Continually Reducing Development Costs (COBOL Application Programming) Create COBOL XML Template* For XML formatting 01 numScansPIC XMLDOC. 05 FILLERPIC X(13)VALUE . 05 scans OCCURS DEPENDING ON numScans. 10 FILLERPIC X(12)VALUE . 10 locationPIC X(35). 10 FILLERPIC X(7)VALUE . 05 FILLERPIC X(14)VALUE . *XML GENERATE Command could be used 42 Continually Reducing Development Costs (Java SQL) Start XML tag: Build SQL call: Iterate ResultSet: End XML tag: StringBuffer result = new StringBuffer( ); SELECT scan.date, scan.location FROM PCB1.Scan WHERE package.num = while (resultSet.next()) { result.append(); result.append(resultSet.getString(location)); result.append( ); } result.append( ); 43 Continually Reducing Development Costs (Java XQuery) Issue single XQuery call Queries and formats as XML { let $package := /package[num = ] for $scan in $package/scan return { $scan/location/text() } } 44 alphaWorks is an early adopter website for emerging IBM technologies in early stages of R&D. Virtual XML Garden Java-based XQuery IMS is a featured implementation Include DBD, PSB, load job, sample apps, and step by step instructions 45 Why XML Databases Data coming in as XML and going out as XML Need to convert to / from XML Structure Types / Encoding Conceptual divide between XML and Physical Storage Layout XML Standards (XML Schemas, XQuery, etc) Very much depends on the type of data and how it is going to be used!! 46 XML can be a better choice than relational for... Data thats inherently hierarchical or nested in nature Example: Medical data, Bill-of-materials, etc. Data sets with sparsely populated attributes Example: FIXML, FpML, Customer profiles Schema evolution Example: Frequently changing services/products/processes Variable schemas, many schemas Example: Data integration, consolidation of diverse data sources Combining structured & unstructured data Example: CM, Life Sciences, News & Media These are already IMSs Strengths These are not supported (not well) in IMS 47 IMS XML Database Introduces a way to view/map native IMS hierarchical data to XML documents Aligns IMS Database (DBD) with XML Schema Allows the retrieval and storage of IMS Records as XML documents with no change to existing IMS databases Enables query of IMS data using XQuery XML Documents IMS Data title seq pricepublisherchoice author lastfirst seq editor lastfirst seq affiliation xs:date xs:string xs:decimal XML Schema TITLE PUBLISH FIRSTLAST FIRST 0:oo AUTH EDIT BOOK YEAR PRICE LASTAFFIL PCB: BIB21 IMS DBD 48 SoIs IMS a native XML database? Defines a (logical) model for an XML document -- as opposed to the data in that document -- and stores and retrieves documents according to that model. Has an XML document as its fundamental unit of (logical) storage. Is not required to have any particular underlying physical storage model. A native XML database*... *According to XML:DB mailing list Backup Slides if needed 50 Decomposed Storage XML document must be parsed and validated. Data is converted to traditional IMS types COMP-1, COMP-2, etc. EBCDIC CHAR, Picture Strings Stored data is searchable by IMS and transparently accessible by non-XML enabled applications. 51 AA B BBCC DD XML Schema/ Metadata Incoming XML Decomposed XML Storage in IMS 52 Two general types of XML documents Data-centric Highly structured Limited size and strongly typed data elements Order of elements generally insignificant Invoices, purchase orders, etc. Document-centric Loosely structured Unpredictable sizes with mostly character data Order of elements significant Newspaper articles, manuals, etc. 53 Decomposed vs. Intact Storage Decomposed (data-centric storage) XML tags are stripped from XML data Identical as current IMS storage Strict data-centric XML Schema validated data EBCDIC encoding Searching on IMS Search Fields Intact (document-centric storage) Entire XML document is stored (including tags) Relaxed un-validated data Any desired encoding is possible Searching is through XPath specified and generated Secondary Indexed Side Segments 54 Intact Storage No (or little) XML Parsing or Schema validation Storage and Retrieval Performance No (or little) data type conversions Unicode storage Stored documents are no longer searchable by IMS and only accessible to XML-enabled applications XPath side segments 55 AA o o o oo Incoming XML Intact XML Storage in IMS 56 Intact Storage Structure Base Segment 1 byteVersion Number (0x01) 1 byteReserved 2 bytes1 st bit: indicates more segments than just this one 2 nd 16 th bits: indicates this segments length n bytesIntact XML Data Overflow Segment 2 bytesKey field sequence number 2 bytes1 st bit: indicates more segments than just this one 2 nd 16 th bits: indicates this segments length n bytesIntact XML Data XML Intact Data XML Intact Overflow 57 A osss Intact Storage Secondary Indexing XPath expression identifying Side Segments Side segment is converted to traditional data type and copied into segment. Side Segments are secondary indexed with documents root as target. XPath=/Dealer/Model[Year>1995]/Order/LastName XPath=/Dealer/DealerName Example: 58 AA o o o oo Incoming XML s XPath=/A/B/f4 XPath=/A/E/f1 Example: Intact XML Storage in IMS o s s 59 public void processMessage(String dealerName) { obtain connection... String query = SELECT DealerSegment.DealerName, retrieveXML(DealerSegment) AS DealerXMLDoc + FROM Dealer.DealerSegment + WHERE DealerSegment.DealerName = + dealerName + ; Statement statement = connection.createStatement(); ResultSet results = statement.executeQuery(query); process results... close connection... } public void processMessage(String dealerName) { obtain connection... String query = SELECT DealerSegment.DealerName, retrieveXML(DealerSegment) AS DealerXMLDoc + FROM Dealer.DealerSegment + WHERE DealerSegment.DealerName = + dealerName + ; Statement statement = connection.createStatement(); ResultSet results = statement.executeQuery(query); process results... close connection... } Execute Query retrieveXML() call 60 public void processMessage(String dealerName) { obtain connection... execute query... while (results.next()) { Clob xmlDoc = results.getClob(DealerXMLDoc); saveClobToFile(xmlDoc, results.getString(DealerName)); } close connection... } public void processMessage(String dealerName) { obtain connection... execute query... while (results.next()) { Clob xmlDoc = results.getClob(DealerXMLDoc); saveClobToFile(xmlDoc, results.getString(DealerName)); } close connection... } Process Results getClob() call 61 public void saveClobToFile(Clob clob, String fileName) throws IOException { Reader reader = clob.getCharacterStream(); FileWriter writer = new FileWriter(fileName + .xml); char[] line = new char[1024]; int x = reader.read(line,0,1024); while (x != -1) { writer.write(line,0,x); x = reader.read(line,0,1024); } reader.close(); writer.close(); } public void saveClobToFile(Clob clob, String fileName) throws IOException { Reader reader = clob.getCharacterStream(); FileWriter writer = new FileWriter(fileName + .xml); char[] line = new char[1024]; int x = reader.read(line,0,1024); while (x != -1) { writer.write(line,0,x); x = reader.read(line,0,1024); } reader.close(); writer.close(); } Process Results getCharacterStream() or getAsciiStream() 62 Approaches to XMLing legacy data Three traditional approaches Migration to store data in XML Duplicate data, Synchronize frequently (ETL) App-start-up-time on the fly XML construction New Solution - On Demand XML Construction Only materialize required portions of data as XML Including lazy materialization 63 Virtual XML XDM Virtual-XML Processor Plain Ol' XML Adaptor DFDL Adaptor IMS Adaptor ZIP Adaptor RDF Adaptor Relational Adaptor Notes Adaptor is the ability to view and process non-XML data as though it were XML Intelligent XML Processor Abstract XML Data Source adaptor interface