Openadaptor XML Support
Using openadaptor for XML processing
Oleg Dulin, [email protected], http://www.dulinresearch.com/
Motivation
• The components described in this presentation are driven by the need to manipulate XML documents in the Openadaptor pipeline:
  - XML parsing
  - HTML scraping
  - XSL transformations
  - XPath mapping between DOM elements and DataObject attributes
  - Storing XML documents in the relational database
  - XML-based Web Services
DOM Support in DataObjects
• DOMDataObject component
  - Parses an XML String into a DOM tree
  - Serializes a DOM tree into an XML String
  - XML FixedDOType
• It doesn’t make sense to hold on to more than one DOM tree at a time
• Additional attributes should be added to the DOM tree
  - The get and set attribute methods are overridden to “fool” other components into thinking that the XML is really stored as a String
  - There are also methods for getting a reference to the DOM Document object
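The parse and serialize behaviour described above can be sketched with the standard JAXP classes. This is a minimal illustration, not the Openadaptor DOMDataObject itself: the class name `DomHolder` and its methods `setXml`, `getXml` and `getDocument` are hypothetical names chosen for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical stand-in for the DOMDataObject idea; not the Openadaptor class. */
final class DomHolder {
    private Document document; // only one DOM tree is held at a time

    /** Parses the XML text and replaces whatever DOM tree was held before. */
    void setXml(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Serializes the DOM tree back to XML text, so callers see a plain String. */
    String getXml() throws Exception {
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    /** Direct access to the DOM tree for components that understand DOM. */
    Document getDocument() {
        return document;
    }
}
```

A component that only understands Strings would call `getXml()`, while a DOM-aware component would call `getDocument()` and skip the serialization step.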
DOMDataObject to do list
• Add support for setting parser features
FileBufferSource
• In order to parse XML into DOM we needed to either:
  - Write a source that parses an XML document
  - Add support for InputStream in DataObject
  - Write a source that loads an XML document into a String and passes it to the DOMParserPipe (next slide)
• The third option was the easiest, but the second one is under development
DOMParserPipe
• Takes a simple DataObject with a String attribute containing an XML document
• Creates a new DOMDataObject that will contain the parsed XML document as a DOM tree
DOMParserPipe to do list
• Add support for setting parser features
XSLTransformPipe
• Applies XSL transform to a DOM tree stored in DOMDataObject
• Replaces it with the result as another DOMDataObject
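A DOM-to-DOM transform of this kind can be sketched with `javax.xml.transform`. The class `XslDomTransform` and its `parse` and `transform` helpers are hypothetical names for illustration; the real pipe's configuration and API are not shown in these slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical sketch of an XSL step that consumes and produces DOM trees. */
final class XslDomTransform {

    /** Helper: parses XML text into a namespace-aware DOM tree. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // XSLT over DOM expects namespace-aware input
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Applies the stylesheet to the input tree and returns the result as a new tree. */
    static Document transform(Document input, String stylesheet) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
        DOMResult result = new DOMResult(); // the transformer creates the result Document
        transformer.transform(new DOMSource(input), result);
        return (Document) result.getNode();
    }
}
```

Keeping both input and output as DOM avoids re-parsing between consecutive pipes.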
Examples
• Apply XSL transformations to an XML document (next slide)
XSL Transform Example
FileBufferSource (read a document and place it into a String buffer attribute)
DOMParserPipe (parse the document in the String buffer attribute)
XSLTransformPipe (apply an XSL transform to the XML document to extract only the elements that we need)
FileSink (save the resulting XML document into a file)
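The four stages of this example can be approximated end to end in plain Java. The sketch below does not use the Openadaptor components or their configuration; `XslFilePipeline.run` is a hypothetical helper, and each comment names the stage it stands in for. It assumes Java 11 or later for `Files.readString`.

```java
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical end-to-end sketch of the read, parse, transform, save chain. */
final class XslFilePipeline {

    static void run(Path inputFile, Path stylesheetFile, Path outputFile) throws Exception {
        // FileBufferSource stand-in: read the whole document into a String buffer.
        String buffer = Files.readString(inputFile);

        // DOMParserPipe stand-in: parse the buffered text into a DOM tree.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document dom = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(buffer)));

        // XSLTransformPipe stand-in: apply the stylesheet, keeping the result as DOM.
        TransformerFactory transformers = TransformerFactory.newInstance();
        Transformer xsl = transformers.newTransformer(new StreamSource(stylesheetFile.toFile()));
        DOMResult transformed = new DOMResult();
        xsl.transform(new DOMSource(dom), transformed);

        // FileSink stand-in: serialize the resulting tree into the output file.
        Transformer serializer = transformers.newTransformer();
        serializer.transform(new DOMSource(transformed.getNode()),
                new StreamResult(outputFile.toFile()));
    }
}
```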
Work in Progress
• Components in the following slides are currently a work in progress
XPathFilterPipe
• Uses XPath to map DOM element values onto DataObject attributes
• Presently under development
  - A very simple version is currently in the source repository
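The mapping idea can be illustrated with the standard `javax.xml.xpath` API. In this sketch a plain `Map` stands in for the DataObject attributes, and `XPathAttributeMapper` with its `parse` and `map` methods is a hypothetical name; how the actual pipe is configured is not described in these slides.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical sketch: attribute name -> XPath expression -> extracted value. */
final class XPathAttributeMapper {

    /** Helper: parses XML text into a DOM tree. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Evaluates every expression against the document and collects the string results. */
    static Map<String, String> map(Document doc, Map<String, String> attributeToXPath)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : attributeToXPath.entrySet()) {
            // evaluate(String, Object) returns the string value of the first match,
            // or an empty string when nothing matches.
            attributes.put(entry.getKey(), xpath.evaluate(entry.getValue(), doc));
        }
        return attributes;
    }
}
```

For example, mapping `TradeId` to `/trade/id` and `Currency` to `/trade/price/@currency` would pull an element value and an attribute value out of the same document.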
XML2DBSink
• Takes SQL statements from an XML document and executes them
  - The idea is to use XSLTransformPipe to convert an XML document into a series of SQL statements and execute them
  - E.g.: store values from the XML document in the relational database
  - E.g.: trigger database events based on values in the XML document
• Currently under development
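One way to picture this sink: the XSL step emits a document whose elements each carry one SQL statement, and the sink runs them over JDBC. The slides do not define the document format, so the `<statements>`/`<sql>` layout below is an assumption, as are the names `SqlFromXml`, `extractStatements` and `execute`.

```java
import java.io.StringReader;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch; assumes <statements><sql>...</sql></statements> as input. */
final class SqlFromXml {

    /** Helper: parses XML text into a DOM tree. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Collects the text of every <sql> element, skipping blank ones. */
    static List<String> extractStatements(Document doc) {
        NodeList nodes = doc.getElementsByTagName("sql");
        List<String> statements = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            String sql = nodes.item(i).getTextContent().trim();
            if (!sql.isEmpty()) {
                statements.add(sql);
            }
        }
        return statements;
    }

    /** Runs the statements in document order on a caller-supplied JDBC connection. */
    static void execute(Connection connection, List<String> statements) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            for (String sql : statements) {
                statement.execute(sql);
            }
        }
    }
}
```

Because the statements are executed verbatim, a design like this is only safe when the XML and the stylesheet that produces the SQL come from a trusted source.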
HTMLTidyPipe
• Uses JTidy to convert an arbitrary HTML document into a well-formed XML document (XHTML)
  - The resulting document can then be passed to XSLTransformPipe or XPathFilterPipe to extract values out of the HTML document
• A.k.a. screen scraping
• Currently under development
InputStreamDataObject
• The idea is to support passing a handle to an InputStream through the pipeline
  - Avoid reading the entire file into the buffer for DOM parsing
  - SOAP with attachments
  - Support for streaming content in the pipeline
• Currently under development
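The first benefit listed above can be seen in JAXP directly: `DocumentBuilder.parse` accepts an `InputStream`, so no intermediate String buffer is needed. The class name `StreamingDomParser` is hypothetical, and this sketch says nothing about how the stream handle would travel inside a DataObject.

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical sketch: parse straight from a stream instead of a String buffer. */
final class StreamingDomParser {

    /** Reads the XML from the stream and builds the DOM tree; the caller closes the stream. */
    static Document parse(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(in);
    }
}
```

The DOM tree is still built fully in memory; what this avoids is holding the raw text of the document alongside it.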
XML-based Web Services
• SOAPSink
  - The version currently in CVS is outdated
  - Add support for delivering messages to any SOAP service
• SOAPSource
  - SOAP-enable any adaptor
  - Create web services out of practically any legacy application