SDPL 2003 Notes 3: XML Processor Interfaces 1

3.3 JAXP: Java API for XML Processing

How can applications use XML processors?
– A Java-based answer: through JAXP
– An overview of the JAXP interface
  » What does it specify?
  » What can be done with it?
  » How do the JAXP components fit together?

[Partly based on the tutorial “An Overview of the APIs”, available at http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/overview/3_apis.html, from which some graphics are also borrowed]


JAXP 1.1

An interface for “plugging in” and using XML processors in Java applications
– includes packages
  » org.xml.sax: SAX 2.0 interface
  » org.w3c.dom: DOM Level 2 interface
  » javax.xml.parsers: initialization and use of parsers
  » javax.xml.transform: initialization and use of transformers (XSLT processors)

Included in JDK starting from version 1.4


JAXP: XML processor plugin (1)

Vendor-independent method for selecting the processor implementation at run time
– principally through system properties
    javax.xml.parsers.SAXParserFactory
    javax.xml.parsers.DocumentBuilderFactory
    javax.xml.transform.TransformerFactory
– Set on command line (to use Apache Xerces as the DOM implementation):
    java -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl


JAXP: XML processor plugin (2)

– Set during execution (Saxon as the XSLT implementation):
    System.setProperty("javax.xml.transform.TransformerFactory", "com.icl.saxon.TransformerFactoryImpl");

By default, reference implementations are used
– Apache Crimson/Xerces as the XML parser
– Apache Xalan as the XSLT processor

Currently supported only by a few compliant XML processors:
– Parsers: Apache Crimson and Xerces, Aelfred, Oracle XML Parser for Java
– XSLT transformers: Apache Xalan, Saxon
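To see which implementations the plug-in mechanism actually selects on a given installation, a small sketch like the following can help. The class name ShowJaxpFactories and the helper describe are made up for this illustration; only the three factory classes and their property names come from the slides. The sketch sets no properties itself, so it reports whatever defaults the running JDK provides.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

public class ShowJaxpFactories {

    // Hypothetical helper: reports the system property for one plug-in point
    // and the implementation class that newInstance() actually returned.
    static String describe(String property, Object factory) {
        String configured = System.getProperty(property, "(not set)");
        return property + " = " + configured
                + " -> " + factory.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe("javax.xml.parsers.SAXParserFactory",
                SAXParserFactory.newInstance()));
        System.out.println(describe("javax.xml.parsers.DocumentBuilderFactory",
                DocumentBuilderFactory.newInstance()));
        System.out.println(describe("javax.xml.transform.TransformerFactory",
                TransformerFactory.newInstance()));
    }
}
```

Running it with a -D option as on the previous slide should change the reported class, provided the named implementation is on the class path.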


JAXP: Functionality

Parsing using SAX 2.0 or DOM Level 2
Transformation using XSLT
– (We’ll study XSLT in detail later)
Fixes features left unspecified in SAX 2.0 and DOM 2 interfaces
– control of parser validation and error handling
  » error handling can be controlled in SAX by implementing ErrorHandler methods
– loading and saving of DOM Document objects


JAXP Parsing API

Included in JAXP package javax.xml.parsers
Used for invoking and using SAX

    SAXParserFactory spf = SAXParserFactory.newInstance();

and DOM parser implementations:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();


JAXP: Using a SAX parser (1)

[Figure: .newSAXParser(), .getXMLReader() and .parse("f.xml") applied in turn to read the XML file f.xml]


JAXP: Using a SAX parser (2)

We have already seen this:

    SAXParserFactory spf = SAXParserFactory.newInstance();
    XMLReader xmlReader = null;  // declared outside try so it stays in scope
    try {
        SAXParser saxParser = spf.newSAXParser();
        xmlReader = saxParser.getXMLReader();
    } catch (Exception e) {
        System.err.println(e.getMessage());
        System.exit(1);
    }
    …
    xmlReader.setContentHandler(handler);
    xmlReader.parse(fileName);
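The steps on this slide can be put together into a complete program. The following is a minimal sketch, not the course's own example: the class SaxElementCounter and its CountingHandler are invented here, and the document is parsed from a string rather than from a file so that the sketch is self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxElementCounter {

    // Hypothetical handler for illustration: counts start-element events.
    static class CountingHandler extends DefaultHandler {
        int elements = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            elements++;
        }
    }

    static int countElements(String xml) throws Exception {
        // Factory -> SAXParser -> XMLReader, as on the slide.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser saxParser = spf.newSAXParser();
        XMLReader xmlReader = saxParser.getXMLReader();

        CountingHandler handler = new CountingHandler();
        xmlReader.setContentHandler(handler);
        // Parse from a string instead of a file name.
        xmlReader.parse(new InputSource(new StringReader(xml)));
        return handler.elements;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><b/></a>"));  // prints 3
    }
}
```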


JAXP: Using a DOM parser (1)

[Figure: .newDocumentBuilder(), followed by either .newDocument() or .parse("f.xml") on the file f.xml]


JAXP: Using a DOM parser (2)

Code to parse a file into DOM:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = null;  // declared outside try so it stays in scope
    try {
        // to get a new DocumentBuilder:
        builder = dbf.newDocumentBuilder();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
        System.exit(1);
    }
    Document domDoc = builder.parse(fileName);
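As a self-contained variant of the code above, the following sketch parses a small document from a string and inspects the resulting DOM tree. The class name DomParseSketch and the sample document are assumptions made for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomParseSketch {

    // Parses XML text into a DOM Document using the default JAXP parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = dbf.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document domDoc = parse(
            "<reglist><student>A</student><student>B</student></reglist>");
        // Name of the root element: prints reglist
        System.out.println(domDoc.getDocumentElement().getTagName());
        // Number of student elements: prints 2
        System.out.println(domDoc.getElementsByTagName("student").getLength());
    }
}
```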


DOM building in JAXP

[Figure: XML input is read by an XMLReader (SAX Parser), with ErrorHandler, DTDHandler and EntityResolver attached; a DocumentBuilder acts as the ContentHandler and produces the DOM Document]

DOM on top of SAX - So what?


JAXP: Controlling parsing (1)

Errors of DOM parsing can be handled
– by creating a SAX ErrorHandler, say errHandler
  » to implement error, fatalError and warning methods
  and passing it to the DocumentBuilder (before parsing):

    builder.setErrorHandler(errHandler);

Parser properties can be configured:
– for both SAXParserFactories and DocumentBuilderFactories (before parser creation):

    factory.setValidating(true/false)
    factory.setNamespaceAware(true/false)
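The two mechanisms on this slide can be combined as in the following sketch, which turns on DTD validation and collects the reported problems instead of printing them. The names ValidationErrors and CollectingHandler are invented for this illustration, and the sample documents carry their DTD in the internal subset so that no external file is needed. It assumes the parser in use permits DOCTYPE declarations, which some hardened configurations do not.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class ValidationErrors {

    // Hypothetical handler: records warnings and recoverable errors,
    // and rethrows fatal errors so that parsing stops.
    static class CollectingHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            messages.add("warning: " + e.getMessage());
        }

        @Override
        public void error(SAXParseException e) {
            messages.add("error: " + e.getMessage());
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXParseException {
            throw e;
        }
    }

    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);        // must be set before parser creation
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        CollectingHandler errHandler = new CollectingHandler();
        builder.setErrorHandler(errHandler);  // must be set before parsing
        builder.parse(new InputSource(new StringReader(xml)));
        return errHandler.messages;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]>";
        System.out.println(validate(dtd + "<a><b/></a>"));  // valid document
        System.out.println(validate(dtd + "<a><c/></a>"));  // undeclared element c
    }
}
```

Validity errors are recoverable, so the parser reports them through error and keeps going; well-formedness errors arrive through fatalError.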


JAXP: Controlling parsing (2)

Further DocumentBuilderFactory configuration methods to control the form of the resulting DOM tree:

    setIgnoringComments(true/false)
    setIgnoringElementContentWhitespace(true/false)
    setCoalescing(true/false)
      • combine CDATA sections with surrounding text?
    setExpandEntityReferences(true/false)
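The effect of two of these settings can be observed by counting node types in the resulting tree. The sketch below is only an illustration: the class TreeShapeSketch, its helper and the sample document are assumptions, and only setIgnoringComments and setCoalescing are exercised.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TreeShapeSketch {

    // Parses xml with both settings switched on or off together and counts
    // the children of the root element that have the given node type.
    static int countChildrenOfType(String xml, boolean tidy, short nodeType)
            throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setIgnoringComments(tidy);  // drop comment nodes?
        dbf.setCoalescing(tidy);        // merge CDATA sections into text?
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList children = doc.getDocumentElement().getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == nodeType) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a>x<!-- note --><![CDATA[y]]></a>";
        for (boolean tidy : new boolean[] {false, true}) {
            System.out.println("tidy=" + tidy
                + " comments=" + countChildrenOfType(xml, tidy, Node.COMMENT_NODE)
                + " cdata=" + countChildrenOfType(xml, tidy, Node.CDATA_SECTION_NODE));
        }
    }
}
```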


JAXP Transformation API

earlier known as TrAX
Allows an application to apply a Transformer to a Source document to get a Result document
A Transformer can be created
– from XSLT transformation instructions (to be discussed later)
– without instructions
  » gives an identity transformation, which simply copies the Source to the Result


JAXP: Using Transformers (1)

[Figure: .newTransformer(…) creates a Transformer from XSLT; .transform(.,.) applies it]


JAXP Transformation Packages

javax.xml.transform:
– Classes Transformer and TransformerFactory; initialization similar to parsers and parser factories

Transformation Source object can be
– a DOM tree, a SAX XMLReader or an input stream
Transformation Result object can be
– a DOM tree, a SAX ContentHandler or an output stream
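One Source/Result pairing, from a character stream into a DOM tree, might look like the sketch below. It uses an identity transformer and the StreamSource and DOMResult classes from the javax.xml.transform.stream and javax.xml.transform.dom packages; the class name StreamToDom and the helper toDom are made up for the illustration.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public class StreamToDom {

    // Copies XML text into a new DOM Document with an identity transformation.
    static Document toDom(String xml) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        DOMResult result = new DOMResult();  // the transformer creates the Document
        identity.transform(new StreamSource(new StringReader(xml)), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        Document doc = toDom("<a><b/></a>");
        System.out.println(doc.getDocumentElement().getTagName());  // prints a
    }
}
```

Swapping in a different Source or Result class changes the pairing without touching the transform call itself.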


Source-Result combinations

[Figure: a Transformer takes a Source (XMLReader (SAX Parser), DOM, or Input Stream) and produces a Result (ContentHandler, DOM, or Output Stream)]


JAXP Transformation Packages (2)

Classes to create Source and Result objects from DOM, SAX and I/O streams are defined in packages
– javax.xml.transform.dom, javax.xml.transform.sax, and javax.xml.transform.stream

An identity transformation from a DOM Document to an I/O stream gives a vendor-neutral way to serialize DOM documents
– (the only option in JAXP)


Serializing a DOM Document as XML text

Identity transformation to an output stream:

    TransformerFactory tFactory = TransformerFactory.newInstance();
    // Create an identity transformer:
    Transformer transformer = tFactory.newTransformer();
    DOMSource source = new DOMSource(myDOMdoc);
    StreamResult result = new StreamResult(System.out);
    transformer.transform(source, result);


Controlling the form of the result?

As above, but create a Transformer with XSLT instructions as a StreamSource, say saveSpecSrc:

    <xsl:transform version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output encoding="ISO-8859-1" indent="yes"
          doctype-system="reglist.dtd" />
      <xsl:template match="/"> <!--copy entire doc:-->
        <xsl:copy-of select="." />
      </xsl:template>
    </xsl:transform>

    // Now create a tailored transformer:
    Transformer transformer = tFactory.newTransformer(saveSpecSrc);
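A runnable sketch of this tailored serialization follows. The stylesheet is the one from the slide, held in a string; the class TailoredSerializer, its helper methods and the sample document are assumptions made for the illustration. Output goes to a StringWriter rather than System.out so the result can be inspected, which means the declared encoding appears in the XML declaration but no bytes are actually encoded.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class TailoredSerializer {

    // The save specification from the slide, kept in a string for this sketch.
    static final String SAVE_SPEC =
        "<xsl:transform version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output encoding=\"ISO-8859-1\" indent=\"yes\""
      + " doctype-system=\"reglist.dtd\"/>"
      + "<xsl:template match=\"/\"><xsl:copy-of select=\".\"/></xsl:template>"
      + "</xsl:transform>";

    // Hypothetical helper: builds a namespace-aware DOM from XML text.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder()
                  .parse(new InputSource(new StringReader(xml)));
    }

    // Serializes the DOM with a transformer built from the save specification.
    static String serialize(Document myDOMdoc) throws Exception {
        TransformerFactory tFactory = TransformerFactory.newInstance();
        StreamSource saveSpecSrc = new StreamSource(new StringReader(SAVE_SPEC));
        Transformer transformer = tFactory.newTransformer(saveSpecSrc);

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(myDOMdoc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<reglist><student>A</student></reglist>");
        System.out.println(serialize(doc));
    }
}
```

The exact layout of the output, including indentation, depends on the XSLT processor in use.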


Other Java APIs for XML

JDOM
– a Java-specific variant of W3C DOM
– http://www.jdom.org/
DOM4J (http://www.dom4j.org/)
– roughly similar to JDOM; richer set of features:
– powerful navigation with integrated XPath support
JAXB (Java Architecture for XML Binding)
– compiles DTDs to DTD-specific classes for reading, manipulating and writing valid documents
– http://java.sun.com/xml/jaxb/


JAXP: Summary

An interface for using XML processors
– SAX/DOM parsers, XSLT transformers
Supports pluggability of XML processors
Defines means to control parsing and handling of parse errors (through SAX ErrorHandlers)
Defines means to write out DOM Documents
Included in JDK 1.4