SDPL 2011 3.3: (XML APIs) JAXP 1
3.3 JAXP: Java API for XML Processing
How can applications use XML processors?
  – In Java: through JAXP
  – An overview of the JAXP interface
    » What does it specify?
    » What can be done with it?
    » How do the JAXP components fit together?
[Partly based on the Sun tutorial “An Overview of the APIs”, from which some graphics are borrowed; Chap. 4 in the online J2EE 1.4 Tutorial]
Some History: JAXP Versions
JAXP 1.1 included in Java JDK 1.4 (2001)
An interface for “plugging-in” and using XML processors in Java applications
  – includes packages
    » org.xml.sax: SAX 2.0
    » org.w3c.dom: DOM Level 2
    » javax.xml.parsers: initialization and use of parsers
    » javax.xml.transform: initialization and use of Transformers (XSLT processors)
Later Versions: 1.2
JAXP 1.2 added property-strings for setting the language and source of a schema used for validation
  – http://java.sun.com/xml/jaxp/properties/schemaLanguage
  – http://java.sun.com/xml/jaxp/properties/schemaSource
  – JAXP 1.3 allows setting the schema with the setSchema(Schema) method of the Factory classes (used to initialize SAXParsers or DOM DocumentBuilders)
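The two ways of attaching a schema can be sketched side by side. This is a minimal illustration, not code from the slides: the class name, the helper methods and the one-element schema are hypothetical, and passing the schema source as an InputStream is one of several value types the property is documented to accept. An ErrorHandler that rethrows is installed because validation errors are otherwise only reported, not thrown.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class SchemaSetupSketch {
    static final String SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";

    // Hypothetical schema used only for this illustration: <n> holds an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='n' type='xs:int'/></xs:schema>";

    // JAXP 1.2 style: property strings set as factory attributes.
    static DocumentBuilder viaProperties() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setValidating(true);
        dbf.setAttribute(SCHEMA_LANGUAGE, XMLConstants.W3C_XML_SCHEMA_NS_URI);
        dbf.setAttribute(SCHEMA_SOURCE,
            new ByteArrayInputStream(XSD.getBytes(StandardCharsets.UTF_8)));
        return strict(dbf.newDocumentBuilder());
    }

    // JAXP 1.3 style: a compiled Schema object handed to the factory.
    static DocumentBuilder viaSetSchema() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setSchema(SchemaFactory
            .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD))));
        return strict(dbf.newDocumentBuilder());
    }

    // Turn reported validation errors into exceptions.
    static DocumentBuilder strict(DocumentBuilder builder) {
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) throws SAXParseException { throw e; }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        return builder;
    }

    // True if the document parses and validates with the given builder.
    static boolean accepts(DocumentBuilder builder, String xml) throws Exception {
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXParseException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(accepts(viaSetSchema(), "<n>42</n>"));
        System.out.println(accepts(viaProperties(), "<n>not a number</n>"));
    }
}
```

In the sketch a fresh builder is created per document in the property-based variant, since the schema stream can be read only once.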
Later Versions: 1.3 & 1.4
JAXP 1.3: major update, included in JDK 1.5 (2005)
  – more flexible validation (decoupled from parsing)
  – DOM Level 3 Core, and Load and Save
  – API for applying XPath to documents
  – mapping between XML Schema and Java data types
JAXP 1.4: maintenance release, included in JDK 1.6
  – includes the Streaming API for XML (StAX)
We'll focus on basic ideas (of JAXP 1.1)
  – touching on validation, and discussing StAX in some detail
JAXP: XML processor plugin (1)
Vendor-independent method for selecting processor implementations at run time
  – principally through system properties
      javax.xml.parsers.SAXParserFactory
      javax.xml.parsers.DocumentBuilderFactory
      javax.xml.transform.TransformerFactory
  – Set on command line (say, to select Xerces (current default) as the DOM implementation):
      $ java -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl
JAXP: XML processor plugin (2)
  – Set during execution (Saxon as the XSLT impl):
      System.setProperty("javax.xml.transform.TransformerFactory",
                         "com.icl.saxon.TransformerFactoryImpl");
By default, reference implementations used
  – Apache Xerces as the XML parser
  – Xalan (JDK 1.4) / XSLTC (JDK 1.6) as the XSLT processor
Supported by a few compliant processors:
  – Parsers: Apache Crimson and Xerces, Aelfred, Oracle XML Parser for Java, libxml2 (via GNU JAXP libxmlj)
  – Transformers: Apache Xalan, Saxon, GNU XSL transformer
(callout on the slide: "highly experimental")
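To see which implementations the plug-in mechanism actually selects on a given installation, the three factories can simply be asked for their class names. A small sketch (the class and method names are hypothetical); the printed names depend on the JDK and on any system properties set, so no particular output is assumed:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

class ListJaxpImplementations {
    // Class names of the factory implementations that JAXP resolves right now.
    static String[] implementationNames() {
        return new String[] {
            SAXParserFactory.newInstance().getClass().getName(),
            DocumentBuilderFactory.newInstance().getClass().getName(),
            TransformerFactory.newInstance().getClass().getName()
        };
    }

    public static void main(String[] args) {
        for (String name : implementationNames()) {
            System.out.println(name);
        }
    }
}
```

Running it once plainly and once with a -D option on the command line shows whether the property took effect, provided the named implementation is on the class path.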
JAXP: Basic Functionality
Parsing using SAX 2.0 or DOM (Level 3)
Transformation using XSLT
  – (more about XSLT later)
Adds functionality missing from SAX 2.0 and DOM Level 2:
  – controlling validation and handling of parse errors
    » error handling can be controlled in SAX, by implementing ErrorHandler methods
  – loading and saving of DOM Document objects
JAXP Parsing API
Included in JAXP package javax.xml.parsers
Used for invoking and using SAX …
    SAXParserFactory spf = SAXParserFactory.newInstance();
and DOM parser implementations:
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
JAXP: Using a SAX parser (1)
[Figure: call chain .newSAXParser() → .getXMLReader() → .parse("f.xml"), applied to the XML file f.xml]
JAXP: Using a SAX parser (2)
We have already seen this:
    SAXParserFactory spf = SAXParserFactory.newInstance();
    try {
        SAXParser saxParser = spf.newSAXParser();
        XMLReader xmlReader = saxParser.getXMLReader();
        ContentHandler handler = new myHdler();
        xmlReader.setContentHandler(handler);
        xmlReader.parse(URIOrInputSrc);
    } catch (Exception e) {
        System.err.println(e.getMessage());
        System.exit(1);
    }
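A self-contained variant of the same call sequence may help when trying it out. It is only a sketch: the handler class ElementCounter is hypothetical (it stands in for a handler like myHdler), and the document is read from a string rather than a file.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// A ContentHandler that counts start tags; DefaultHandler supplies
// empty implementations of all other callbacks.
class ElementCounter extends DefaultHandler {
    int count = 0;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        count++;
    }

    static int countElements(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser saxParser = spf.newSAXParser();
        XMLReader xmlReader = saxParser.getXMLReader();
        ElementCounter handler = new ElementCounter();
        xmlReader.setContentHandler(handler);
        xmlReader.parse(new InputSource(new StringReader(xml)));
        return handler.count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><b/></a>")); // prints 3
    }
}
```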
JAXP: Using a DOM parser (1)
[Figure: .newDocumentBuilder() yields a builder; its .parse("f.xml") reads the file f.xml, and .newDocument() creates an empty document]
JAXP: Using a DOM parser (2)
Parsing a file into a DOM Document:
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
        // to get a new DocumentBuilder:
        DocumentBuilder builder = dbf.newDocumentBuilder();
        // parse() can also throw SAXException and IOException,
        // which the enclosing method must catch or declare
        Document domDoc = builder.parse(fileOrURIetc);
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
        System.exit(1);
    }
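As a runnable counterpart, the following sketch parses a document from a string and inspects the resulting tree. The class name and the sample document are hypothetical; all three checked exceptions of the call sequence are declared explicitly.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomParseSketch {
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = dbf.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document domDoc = parse("<list><item>a</item><item>b</item></list>");
        // Name of the root element and number of <item> elements
        System.out.println(domDoc.getDocumentElement().getTagName());
        System.out.println(domDoc.getElementsByTagName("item").getLength());
    }
}
```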
DOM building in JAXP
[Figure: XML input is read by an XMLReader (SAX Parser) with its ErrorHandler, DTDHandler and EntityResolver; the DocumentBuilder acts as the ContentHandler and produces the DOM Document]
DOM on top of SAX - So what?
JAXP: Controlling parsing (1)
Errors of DOM parsing can be handled
  – by creating a SAX ErrorHandler
    » to implement error, fatalError and warning methods
    and passing it to the DocumentBuilder:
        builder.setErrorHandler(new myErrHandler());
        domDoc = builder.parse(fileName);
Parser properties can be configured:
  – for both SAXParserFactories and DocumentBuilderFactories (before parser/builder creation):
        factory.setValidating(true/false)        (default: false)
        factory.setNamespaceAware(true/false)    (default: false)
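The two points above can be combined in one small sketch: a validating DocumentBuilder whose validation errors are collected by a SAX ErrorHandler instead of being printed. The class name CollectingErrorHandler and the inline DTD are hypothetical and chosen only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class CollectingErrorHandler implements ErrorHandler {
    final List<String> problems = new ArrayList<>();

    public void warning(SAXParseException e) {
        problems.add("warning: " + e.getMessage());
    }

    // Validity errors are recoverable: record them and let parsing continue.
    public void error(SAXParseException e) {
        problems.add("error: " + e.getMessage());
    }

    // Well-formedness errors are fatal: rethrow to stop parsing.
    public void fatalError(SAXParseException e) throws SAXParseException {
        throw e;
    }

    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);   // DTD validation, set before builder creation
        DocumentBuilder builder = factory.newDocumentBuilder();
        CollectingErrorHandler handler = new CollectingErrorHandler();
        builder.setErrorHandler(handler);
        builder.parse(new InputSource(new StringReader(xml)));
        return handler.problems;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]>";
        for (String problem : validate(dtd + "<a><c/></a>")) {
            System.out.println(problem);
        }
    }
}
```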
JAXP: Controlling parsing (2)
Further DocumentBuilderFactory configuration methods to control the form of the resulting DOM Document:
    dbf.setIgnoringComments(true/false)                    (default: false)
    dbf.setIgnoringElementContentWhitespace(true/false)    (default: false)
    dbf.setCoalescing(true/false)                          (default: false)
      • combine CDATA sections with surrounding text?
    dbf.setExpandEntityReferences(true/false)              (default: true)
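The effect of these settings is easiest to see by counting child nodes. In the sketch below (class name and sample document are hypothetical) the same element is parsed twice, once with the defaults and once with comments ignored and CDATA sections coalesced.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DomFormSketch {
    // One comment, text "x", a CDATA section "y", and text "z" inside <a>.
    static final String XML = "<a><!-- note -->x<![CDATA[y]]>z</a>";

    // Number of children of the root element, with or without tidying.
    static int childCount(boolean tidy) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setIgnoringComments(tidy);   // drop comment nodes
        dbf.setCoalescing(tidy);         // merge CDATA sections into adjacent text
        Document doc = dbf.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("default: " + childCount(false));
        System.out.println("tidied:  " + childCount(true));
    }
}
```

With the defaults the root has four children (comment, text, CDATA section, text); with both options enabled a single text node remains.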
DOM vs. Other Java/XML APIs
JDOM (www.jdom.org), DOM4J (www.dom4j.org), JAXB (java.sun.com/xml/jaxb)
The others may be more convenient to use, but …
  “The DOM offers not only the ability to move between languages with minimal relearning, but to move between multiple implementations in a single language – which a specific set of classes such as JDOM can’t support”
    » J. Kesselman, IBM & W3C DOM WG
JAXP Transformation API
Package javax.xml.transform
  – TransformerFactory and Transformer classes; initialization similar to parser factories and parsers
Allows application to apply a Transformer to a Source document to get a Result document
Transformer can be created
  – from an XSLT script
  – without instructions: an identity transformation from a Source to the Result
JAXP: Using Transformers (1)
[Figure: an XSLT script is passed to .newTransformer(…); the Transformer's .transform(.,.) is then applied to a Source]
Transformation Source & Result
Transformation Source object can be
  – (a Document/Element node of) a DOM tree
  – a SAX XMLReader or
  – an input stream
Transformation Result object can be
  – (a node of) a DOM tree
  – a SAX ContentHandler or
  – an output stream
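Two of the possible combinations, sketched with an identity transformer: a character stream turned into a DOM tree, and a DOM tree replayed as SAX events to a ContentHandler. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SourceResultSketch {
    // Stream source -> DOM result: the transformer builds a new Document.
    static Document streamToDom(String xml) throws TransformerException {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        DOMResult result = new DOMResult();
        identity.transform(new StreamSource(new StringReader(xml)), result);
        return (Document) result.getNode();
    }

    // DOM source -> SAX result: the tree is replayed as ContentHandler events.
    static int domToSaxStartEvents(Document doc) throws TransformerException {
        final int[] starts = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                starts[0]++;
            }
        };
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(new DOMSource(doc), new SAXResult(handler));
        return starts[0];
    }

    public static void main(String[] args) throws Exception {
        Document doc = streamToDom("<a><b/></a>");
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(domToSaxStartEvents(doc));
    }
}
```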
Source-Result combinations
[Figure: a Transformer in the middle; on the Source side an XMLReader (SAX Parser), a DOM, or an Input Stream; on the Result side a ContentHandler, a DOM, or an Output Stream]
JAXP Transformation Packages
Classes to create Source and Result objects from DOM, SAX and I/O streams defined in packages
  – javax.xml.transform.dom, javax.xml.transform.sax, and javax.xml.transform.stream
Identity transformation to an output stream is a vendor-neutral way to serialize DOM documents
  – as an alternative to DOM3 Save
Serializing a DOM Document as XML text
By an identity transformation to an output stream:
    TransformerFactory tFactory = TransformerFactory.newInstance();
    // Create an identity transformer:
    Transformer transformer = tFactory.newTransformer();
    DOMSource source = new DOMSource(myDOMdoc);
    StreamResult result = new StreamResult(System.out);
    transformer.transform(source, result);
Controlling the form of the result?
Could specify the requested form of the result by an XSLT script, say, in file saveSpec.xslt:
    <xsl:transform version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output encoding="ISO-8859-1" indent="yes"
          doctype-system="reglist.dtd" />
      <xsl:template match="/">
        <!-- copy the whole document: -->
        <xsl:copy-of select="." />
      </xsl:template>
    </xsl:transform>
Creating an XSLT Transformer
Create a tailored transformer:
    StreamSource saveSpecSrc = new StreamSource(new File("saveSpec.xslt"));
    Transformer transformer = tFactory.newTransformer(saveSpecSrc);
    // and use it to transform a Source to a Result,
    // as before
The Source of transformation instructions could also be given as a DOMSource or SAXSource
Transformation OutputProperties
    Transformer myTr = tFactory.newTransformer();
    // Set identity transformer's output properties:
    myTr.setOutputProperty(OutputKeys.ENCODING, "iso-8859-1");
    myTr.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "reglist.dtd");
    myTr.setOutputProperty(OutputKeys.INDENT, "yes");
    // Then use it as above
Equivalent to the previous "saveSpec.xslt" Transformer
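Put together, serialization with output properties might look as follows. This is a sketch under assumptions: the class name, the in-memory sample document and its element names (reglist, reg) are hypothetical, and the result is written to a StringWriter so that it can be inspected.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SerializeDomSketch {
    // Serialize a DOM Document by an identity transformation.
    static String serialize(Document doc) throws TransformerException {
        Transformer myTr = TransformerFactory.newInstance().newTransformer();
        myTr.setOutputProperty(OutputKeys.ENCODING, "iso-8859-1");
        myTr.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "reglist.dtd");
        myTr.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        myTr.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Hypothetical sample document: <reglist><reg/></reglist>
    static Document sampleDoc() throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance()
                                             .newDocumentBuilder()
                                             .newDocument();
        Element root = doc.createElement("reglist");
        root.appendChild(doc.createElement("reg"));
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.print(serialize(sampleDoc()));
    }
}
```

The exact layout of the output (line breaks, indentation width) is left to the XSLT processor in use.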
Stylesheet Parameters
Can also pass parameters to a transformer created from a script like this:
    <xsl:transform ... >
      <xsl:output method="text" />
      <xsl:param name="In" select="0" />    <!-- default value -->
      <xsl:template match="/">
        <xsl:value-of select="2*$In"/>
      </xsl:template>
    </xsl:transform>
using
    myTrans.setParameter("In", 10)
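A complete run of this parameter example could be sketched as below. The class and method names are hypothetical, the elided attributes of xsl:transform are filled with the standard version and namespace declarations, and a dummy input document is supplied because the script ignores its input.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StylesheetParamSketch {
    static final String SCRIPT =
        "<xsl:transform version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='In' select='0'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='2*$In'/>"
      + "</xsl:template>"
      + "</xsl:transform>";

    // Runs the script; a null argument leaves the parameter at its default.
    static String twice(Integer in) throws TransformerException {
        Transformer myTrans = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(SCRIPT)));
        if (in != null) {
            myTrans.setParameter("In", in);
        }
        StringWriter out = new StringWriter();
        myTrans.transform(new StreamSource(new StringReader("<dummy/>")),
                          new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(twice(10));    // prints 20
        System.out.println(twice(null));  // prints 0
    }
}
```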
JAXP Validation
JAXP 1.3 also introduced a Validation framework
  – based on the familiar Factory pattern, to provide independence of schema language and implementation
    » SchemaFactory → Schema → Validator
  – separates validation from parsing
    » say, to validate an in-memory DOM subtree
  – implementations must support XML Schema
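The factory chain can be sketched as follows: a document is first parsed into a DOM tree without any validation, and the in-memory tree is validated afterwards. The class name and the one-element schema are hypothetical. No ErrorHandler is set on the Validator, in which case validation errors are thrown as SAX exceptions.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ValidateDomSketch {
    // Hypothetical schema used only for this illustration: <n> holds an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='n' type='xs:int'/></xs:schema>";

    static boolean isValid(String xml) throws Exception {
        // 1. Parse without validation.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));

        // 2. SchemaFactory -> Schema -> Validator
        SchemaFactory sf =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();

        // 3. Validate the in-memory tree.
        try {
            validator.validate(new DOMSource(doc));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<n>42</n>"));
        System.out.println(isValid("<n>forty-two</n>"));
    }
}
```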
Validation Example: "Xeditor"
Xeditor, an experimental XML editor
  – to experiment with and demonstrate JAXP-based, on-the-fly, multi-schema validation
  – M. Saesmaa and P. Kilpeläinen: On-the-fly Validation of XML Markup Languages using off-the-shelf Tools. Extreme Markup Languages 2007, Montréal, August 2007
Look & Feel of "Xeditor"
[Screenshot of Xeditor; annotated options:]
  - off
  - WF check, as XML or DTD
  - validate using DTD, or against schema
Different Schemas and Schema Languages
A Validator is created when the user selects a Schema
Event-driven document validation
Modified document passed to the Validator
  – errors caught as SAX parse exceptions
Efficiency of In-Memory Validation
Is brute-force re-validation too inefficient?
No: delays normally unnoticeable
[Chart: times for validating XMLSchema.xsd]
JAXP: Summary
An interface for using XML Processors
  – SAX/DOM parsers, XSLT transformers
  – schema-based validators (since JAXP 1.3)
Supports pluggability of XML processors
Defines means to control parsing, and handling of parse errors (through SAX ErrorHandlers)
Defines means to create and save DOM Documents