Rethinking the Database
Using XML databases to align with business processes and enhance capabilities
GTC EAST
The New York Digital Government Summit
September 21-24, 2009
Components of XML
• XML - encoding documents electronically
• XSL - transforming and rendering XML documents
• XPath - addressing the parts of an XML document
• XML Database - storing collections of XML data
• XQuery - querying collections of XML data
• SQL/XML - querying XML within SQL
• XForms - interfacing with XML data
• XRX - coming soon to a database near you
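As a small illustration of what "addressing the parts of an XML document" means, the sketch below uses Python's standard xml.etree.ElementTree module, which supports a limited XPath subset. The document and its element names are invented for the example and do not come from the presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are illustrative only.
doc = ET.fromstring(
    "<library>"
    "<book year='2001'><title>Native XML</title></book>"
    "<book year='2009'><title>XQuery Basics</title></book>"
    "</library>"
)

# A child path selects every <title> under every <book>.
titles = [t.text for t in doc.findall("./book/title")]

# An attribute predicate narrows the selection to books from 2009.
recent = [b.findtext("title") for b in doc.findall("./book[@year='2009']")]

print(titles)
print(recent)
```

A full XPath or XQuery engine offers far more (axes, functions, joins); this only shows the basic idea of selecting nodes by path.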
XML for Data Exchange
• Data exchange - XML enables platform-independent data exchange among applications.
• XML serves to "glue together" or mediate a common data layer between two separate and already existing programs.
• Typically, XML-based messaging services (such as SOAP) enable different applications to communicate.
XML for Content Management
• Web content management - usually implemented as a Web application, for creating and managing HTML content.
• Most systems use a database to store content. Content is frequently, but not universally, stored as XML, to facilitate reuse and enable flexible presentation options.
• A presentation layer displays the content to regular Web-site visitors based on a set of templates. The templates are sometimes XSL files.
XML for Syndication: RSS
• RSS is a simple XML format used to syndicate headlines.
• It is used by websites that publish new content regularly and provides a list of headlines with links to their latest content.
• Content such as news feeds, events listings, project updates, and most recently podcasts, video and images can all be distributed by RSS.
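To make the "list of headlines with links" concrete, here is a minimal sketch that reads headlines from an RSS 2.0 document using Python's standard library. The feed content and the example.org URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; titles and URLs are placeholders.
RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.org/</link>
    <description>Latest headlines</description>
    <item><title>First headline</title><link>http://example.org/1</link></item>
    <item><title>Second headline</title><link>http://example.org/2</link></item>
  </channel>
</rss>"""


def headlines(rss_text):
    """Return (title, link) pairs for every <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in headlines(RSS):
    print(title, "->", link)
```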
RSS Reader
Databases: Relational vs. XML
• Relational – no hierarchy or significant order; based on two-dimensional tables. Used for storing and querying data.
• XML – hierarchical and sequential; based on trees in which order matters. Used for exchanging and displaying data.
Relational Data (tables, columns, rows, keys)
XML Data (collections, files)
A Complicated Relationship
Data exchange between XML, applications, and databases is not a simple, one-step event. It involves many processing steps and translations of the data into totally different formats.
• Data arrives in an XML format.
• XML document is mapped to processing objects.
• And also mapped to rows and tables in relational database.
Simplifying the Relationship
• Storing data as XML (in native XML or XML-enabled databases) eliminates the process of translating data back-and-forth into various formats.
• Data is received, stored, and processed as XML.
• Eliminates multiple translation steps (along with their development times and their possibilities for errors).
The Real Difference
"XML is by definition self-describing data. Build the database around that structure not the other way around. The
implementation is far from being that simplistic. This basic concept however – leverage XML’s self-describing and
hierarchical nature to manage it – is the very foundation of an XML database."
(from http://bigmenoncontent.com/2009/05/28/xdb-matters/ )
Two Immediate Benefits
By storing XML data as XML in an XML database:
• Simplified storage
• New query capabilities
New Query Capability
Query
Create a personalized RSS feed based on keyword(s) that an individual wants to track within CTG’s Website.
How?
Since all content on the website is stored in an XML database, it is all available to query. An XForms interface enables visitors to submit their own terms of interest for building a personalized RSS feed.
New Query Capability: Results of Query
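The personalized-feed query can be approximated outside an XML database as follows. This is only a sketch in Python standing in for the XQuery that would run inside the database; the content store, its element names (site, doc, title, body), and the example.org URLs are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for website content stored as XML.
CONTENT = ET.fromstring("""<site>
  <doc url="http://example.org/a"><title>XML in Government</title><body>Using XQuery for reports.</body></doc>
  <doc url="http://example.org/b"><title>Budget Update</title><body>Quarterly figures.</body></doc>
  <doc url="http://example.org/c"><title>Data Sharing</title><body>XML exchange between agencies.</body></doc>
</site>""")


def personalized_feed(site, keywords):
    """Build an RSS 2.0 tree listing every document that mentions a keyword."""
    terms = [k.lower() for k in keywords]
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Personalized feed: " + ", ".join(keywords)
    ET.SubElement(channel, "link").text = "http://example.org/"
    ET.SubElement(channel, "description").text = "Documents matching your terms"
    for doc in site.iter("doc"):
        # Case-insensitive substring match over all text in the document.
        text = " ".join(doc.itertext()).lower()
        if any(term in text for term in terms):
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = doc.findtext("title")
            ET.SubElement(item, "link").text = doc.get("url")
    return rss


feed = personalized_feed(CONTENT, ["xml"])
links = [item.findtext("link") for item in feed.iter("item")]
print(ET.tostring(feed, encoding="unicode"))
```

In the setup the slides describe, the keywords would arrive from the XForms interface and the matching would be done by the database's query engine rather than by a loop in application code.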
New Query Capability
Query
Create a list of CTG publications grouped by main author.
How?
Since all content on the website is stored in an XML database, individual documents can be queried at the node level and combined, grouped and sorted into results that are difficult if not impossible to achieve without XQuery.
New Query Capability: Results of Query
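A rough Python equivalent of the grouping query is sketched below. It is not the XQuery used on the CTG site; the publications structure is invented, and treating the first author element as the "main author" is an assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical publication records; names and titles are placeholders.
PUBS = ET.fromstring("""<publications>
  <publication><title>Report B</title><author>Smith</author><author>Jones</author></publication>
  <publication><title>Report A</title><author>Smith</author></publication>
  <publication><title>Guide C</title><author>Lee</author></publication>
</publications>""")


def by_main_author(pubs):
    """Group titles by main author, sorting both authors and titles."""
    groups = defaultdict(list)
    for pub in pubs.iter("publication"):
        # Assumption: the first <author> child is the main author.
        main = pub.findtext("author")
        groups[main].append(pub.findtext("title"))
    return {author: sorted(titles) for author, titles in sorted(groups.items())}


grouped = by_main_author(PUBS)
for author, titles in grouped.items():
    print(author, titles)
```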
Resources
• XML Databases - The Business Case
http://www.cfoster.net/articles/xmldb-business-case/
• Ronald Bourret, Consulting, writing, and research in XML and databases
http://www.rpbourret.com/xml/
• Introduction to Native XML Databases
http://www.xml.com/pub/a/2001/10/31/nativexmldb.html
• A comparison of XML-enabled and native XML data management techniques
http://xml.sys-con.com/node/104980?page=0,0
• Feature Comparison: EMC Documentum xDB vs. Oracle XML DB & IBM DB2 pureXML
https://community.emc.com/docs/DOC-2999
Contact
• Jim Costello
Center for Technology in Government
[email protected]
Real World Large Scale Deployment
• New York State Taxation and Finance
Facilitating Business Alignment
Sharing the same business-defined business object (XML) eliminates technical abstraction. All processes, functions and reports are “configured” in user’s terms.
The same core Business Object is leveraged by all of the system components:
• Process
• Persistence
• Presentation
Returns Processing
Operational XML – Next Generation DB Design

XML representation of a filing:
<filing>
  <form formid = 'IT201'>
    <wages>134</wages>
    <date>11/23/05</date>
  </form>
  <form formid = 'W2'>
    <wages>278</wages>
    <jointTP>Yes</jointTP>
  </form>
</filing>

Generalized relational representation (form labels shown beside the rows in the figure: IT-201, W2, IT-150):
col1   col2   col3       col4     col5   …   col1000
134    NULL   11/23/05   NULL     NULL   …   NULL
NULL   276    NULL       NULL     Yes    …   NULL
12     NULL   NULL       99.99    NULL   …   NULL
NULL   NULL   NULL       123.23   NULL   …   No

Relational Table by Form
• 3600 tables required
• Difficult to get filing context
• Made rules engine, display difficult
• Much I/O

Generalized Relational Table
• Needed DB to translate fields
• Sparsely populated
• Performance issues
• Rules engine limitations

XML Solution
• Business object based (the audit folder)
• Keeps business context
• Robust rules processing
• Can leverage XML tooling
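The sparsity problem of the generalized table can be illustrated with a short sketch. It flattens the filing XML from this slide into fixed-width rows and counts the empty cells. The five-column layout is a hypothetical stand-in for the roughly 1000 columns mentioned above, and the column names "refund" and "credit" are invented.

```python
import xml.etree.ElementTree as ET

# The filing example from the slide, with straight quotes.
FILING = ET.fromstring("""<filing>
  <form formid="IT201"><wages>134</wages><date>11/23/05</date></form>
  <form formid="W2"><wages>278</wages><jointTP>Yes</jointTP></form>
</filing>""")

# Hypothetical generalized column set; a real table would have far more columns.
COLUMNS = ["wages", "date", "jointTP", "refund", "credit"]


def to_generalized_rows(filing, columns):
    """Flatten each form into one fixed-width row; missing fields become None."""
    rows = []
    for form in filing.findall("form"):
        values = {child.tag: child.text for child in form}
        rows.append([form.get("formid")] + [values.get(c) for c in columns])
    return rows


rows = to_generalized_rows(FILING, COLUMNS)
nulls = sum(cell is None for row in rows for cell in row)
populated = sum(len(form) for form in FILING.findall("form"))
print(rows)
print("NULL cells:", nulls, "populated XML fields:", populated)
```

Even with five columns, the flattened rows carry more empty cells than values, while the XML carries only the fields that were filed and keeps both forms together in one filing.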
Transactional XML Layering
• Exceptions
– Quality or Condition of Data
– Auto Routing
– Customer Service
• History
– What, Why and Who of all Changes
– Auditability
– Exception Resolution
• Received XML
• All relevant data in one place
– Less I/O
– Data Integrity
– Enable Transaction Processing
Development Enablement
Establishing Patterns
External Channels Processing
Web Development challenges
• Develop quicker
• Reuse “segments” of web apps
• Consistent features (print, return to application)
• Consistently defined navigation patterns
• Track users' usage of application
What is a Web Application?
A series of form based UI objects that create a transaction for processing.
Then each of these “forms” can be designed exactly the same way….

<transaction>
  <form formid = "a">
    <field1>134</field1>
    <field2>abc</field2>
  </form>
  <exceptions>
    <context>a</context>
    <errNbr>17</errNbr>
  </exceptions>
</transaction>

A Form:
• has a UI object
• has an object enforcing form rules
• has an XML segment in the transaction XML

In this way pages can be coded separately, following the same pattern, and integrated into a web application.
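One way to picture the pattern is a pair of helpers that every form page calls in the same way: one writes the form's XML segment into the transaction, the other records an exception. The sketch below is an assumption about how such helpers might look, not the department's code. The function names are hypothetical and the "field2 must be numeric" rule is invented so that the example reproduces error number 17 from the slide.

```python
import xml.etree.ElementTree as ET


def add_form(transaction, form_id, fields):
    """Append one form's XML segment to the transaction."""
    form = ET.SubElement(transaction, "form", formid=form_id)
    for name, value in fields.items():
        ET.SubElement(form, name).text = str(value)
    return form


def add_exception(transaction, context, err_nbr):
    """Record an exception against a form, creating <exceptions> if needed."""
    exceptions = transaction.find("exceptions")
    if exceptions is None:
        exceptions = ET.SubElement(transaction, "exceptions")
    ET.SubElement(exceptions, "context").text = context
    ET.SubElement(exceptions, "errNbr").text = str(err_nbr)


txn = ET.Element("transaction")
form_a = add_form(txn, "a", {"field1": 134, "field2": "abc"})

# Hypothetical form rule: field2 must be numeric, otherwise raise error 17.
if not form_a.findtext("field2").isdigit():
    add_exception(txn, "a", 17)

xml_text = ET.tostring(txn, encoding="unicode")
print(xml_text)
```

Because every form goes through the same two helpers, a new page only has to supply its form id, its fields, and its rules.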
Web Navigation Pattern
Web Navigation highlights
• All pages coded exactly the same
• Single XML table allows restart
• Navigation patterns enforced (Wizard, conditional, etc.)
• Process server allows for externalization of navigation
• Common error handling
• Can use previous filing to start new transaction
Web Navigation
Web Service Pattern
Web Service Process highlights
• Same pattern serves all web services
• Leverages same rules as web
• Once web is established no additional coding required
XML within CM
(excerpt of a received XML document, as displayed in the figure)
<Address>
  <USAddress>
    <AddressLine1>PO BOX 228</AddressLine1>
    <City>SCHENECTADY</City>
    <State>NY</State>
    <ZIPCode>123080000</ZIPCode>
  </USAddress>
</Address>
</StateOfIncorporation>
<HdrCode>
  <FederalReturnFiledOther>String</FederalReturnFiledOther>
  <FilerClassificationCode>AA3</FilerClassificationCode>
  <FormType>CT5</FormType>
  <ReturnTypeCode>CT5</ReturnTypeCode>
  <SoftwareDeveloper>
    <DeveloperName>
      <BusinessNameLine1>Sunrise Investments Inc</BusinessNameLine1>
      <BusinessNameLine2>A A</BusinessNameLine2>

Receive XML and show the document in the form (integrated with our EDMS)
• User view is independent of channel!
• All data received is stored in one table
• Form can be used as input and for correction
• PureXML solution uses advanced indexing
Paper Process Pattern
Paper Process highlights
• Same pattern serves all paper processes
• Leverages the same form interfaces
• Leverages same XML database
• Leverages same rules as web
• Has built-in capabilities to have different acceptance rules for paper
XML Indexing
XML indexing highlights
• Allows for a many to one indexing scheme
• Index fields can be added on the fly
• Supports optional index
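The three indexing properties can be illustrated with a toy in-memory index. This is not how DB2 pureXML implements indexes; it is a sketch under stated assumptions. "Many to one" is read here as several element paths feeding one index field, and the class name, field names, and element names are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class XmlIndex:
    """Toy index mapping (field, value) to the ids of matching documents."""

    def __init__(self):
        self.paths = defaultdict(list)   # index field -> element paths feeding it
        self.entries = defaultdict(set)  # (field, value) -> document ids
        self.docs = {}                   # document id -> parsed root element

    def add_field(self, field, path):
        """Register a path for a field, back-filling documents already stored."""
        self.paths[field].append(path)
        for doc_id, root in self.docs.items():
            self._index(doc_id, root, field, path)

    def add_document(self, doc_id, xml_text):
        root = ET.fromstring(xml_text)
        self.docs[doc_id] = root
        for field, paths in self.paths.items():
            for path in paths:
                self._index(doc_id, root, field, path)

    def _index(self, doc_id, root, field, path):
        # Documents lacking the element simply produce no entry (optional index).
        for element in root.findall(path):
            if element.text:
                self.entries[(field, element.text)].add(doc_id)

    def lookup(self, field, value):
        return sorted(self.entries.get((field, value), set()))


idx = XmlIndex()
# Many to one: two different paths feed the same "taxpayer" field.
idx.add_field("taxpayer", "./form/primaryId")
idx.add_field("taxpayer", "./form/spouseId")
idx.add_document("d1", "<filing><form><primaryId>111</primaryId><spouseId>222</spouseId></form></filing>")
idx.add_document("d2", "<filing><form><primaryId>222</primaryId><state>NY</state></form></filing>")
# Added on the fly: a new field is registered after documents were stored.
idx.add_field("state", "./form/state")

print(idx.lookup("taxpayer", "222"))
print(idx.lookup("state", "NY"))
```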
Legacy Integration
Legacy Integration highlights
• Allows for modernizing data capture while keeping legacy processing
• Single XML table stores all received documents and is integrated with CM
• Many mapping services are re-usable
Process Automation Pattern
Process Automation highlights
• Can be integrated with UI for complete inline processing
• Single XML table stores all received documents and is integrated with CM
• Process can be reported on through Monitor
• Operational data system becomes XML
e-MPIRE R3 Returns Processing
[Architecture diagram. Labels shown: Flat File Adapter; WebSphere Process Server and WID; Microflow; SCA async invoke (uses SIBUS); Microflow with Persist and Invoke CICS Tx 1, Tx 2, … Tx X; DB2 V9 pureXML; JCA Connector (AIX); CICS Transaction Gateway 6.1 (z/OS); CICS Transaction 1, 2, … X; WebSphere MQ to back-end systems; transaction boundary, two-phase commit (parallelism not illustrated).]
Annotations on the diagram:
• Leverages source-target mapping in one COBOL step
• Common taxpayer validation routine which reconciles taxpayers and outputs a common DB structure
• Third batch program which processes form rules
Returns Process (BatchProcessModule)
[Process flow diagram, updated July 11, 2007, with numbered steps across three tiers.
Mainframe, reached through the CICS TRANSACTION GATEWAY (CTG): NRD300Z Create/Update Liability Period; NAT127ZS Compute Return File Date; NAT549ZS Interest Free Date Calculation; NAT122ZS Retarget payments; TZ0052Z TI Load/Validate; CPT517Z SSN validate; NRP520Z Insert DCMT-REL; NAT115Z Retrieve INT_TP_Id, PMT data based on DLN; NRD301Z Retrieve Liability Period (for secondary id); NDP010Z Exception Controller; NAT164Z Maintain DCMT Row(s); NAT216Z Pre-payments; NDP300Z Trigger Denial Correspondence; NAT403Z Create/Expire TA Restrictions; NRP850B format and write image file; NDP001Z Workflow Repoll.
Process Server: File Adapter/Mapper (bank files); TILPProcessModule, Validation Subprocess (TI/LP), BPEL; FVProcModule, Filing Validity Subprocess, BPEL; RulesEngine Module, PIT Rules Engine Controller, JAVA; ExceptionProcessModule, BPEL; ReturnAdjustProcessModule, BPEL; Workflow Monitor; Suspense Repair Process.
WebSphere Application Server, reached over RMI – IIOP (Remote Method Invocation – Internet Inter-ORB Protocol): NRP291J Retrieve Return Data / Retrieve Forms for prior composition; NRP341J Filing Validity (Dupe check, Check Holds, IT2/W3 Dupe Check); NRP351J Dependent Control Service; NRP281J Populate Supporting Data for Exceptions; NRP121J Status Updater; NRP222J Return Framework Service; NRP425J Format TA Message; NRP511J Persistence Service.
Queues: TA Queue (MQ); Image Queue (MQ); Image Trigger Queue (MQ); RP Queue (SIBus); Fwrk Event Queue (MQ); CISS/RP Reply Queue (SIBus).
Data store: RTN_FILING (FORMS, ADJSTS, EXCPTNS) (UDB-XML).]
Notes on the diagram:
• Once the suspense data is saved, all previous updates are rolled out, exception is written, and the process is stopped.
• After suspended data is corrected using the Suspense Repair UI, the process is restarted at beginning.
• Mode is set based on 1) Reprocess adjustment 2) CISS adjustment 3) CISS no adjustment.
NY State Tax XML/SOA Processing
• PIT (Personal Income Tax)
– 11M returns processed
– Peak in April: 390,000+ per day
– Up to 14,500 different data elements for the filings (60% Electronic)
– 6M Refunds ($4.9 B), direct deposit up 13.1%, checks down 6%
– Electronic extensions up 160% (439,000)
• Corporate Tax
– IRS ELF Program: 2007 - 32,317; 2008 - 193,977
– Peak month: 100,000 returns in April 2009
– Peak day: 20,000 returns
• Sales Tax
– 1Q2009: 60,000 on the Web
– 1Q2009: 400,000 from partners
• Withholdings Tax
– 50,000 web filings of XML
• STAR Property Tax Rebate Application (Tax Refund)
– 3.5 web applications in a 3 month period
NYS Tax at Direction
• Convert other Subsystems (Domains) to XML
– Simplify conversion
– Map data structures closer to the business
– Leverage the rules engine
• Expand the use of web navigation with integration into operational XML
• Incorporate more XML enabled tools to speed delivery and improve product
• Leverage the XML data in new ways (AJAX, REST, RIA)
Contact
• Jim Lieb
NYS Department of Taxation and Finance
[email protected]