5. Processing XML
5 - 2
Overview
Parsing XML documents
Document Object Model (DOM)
Simple API for XML (SAX)
Class generation
5 - 3
What's the Problem?
<?xml version="1.0"?>
<books>
  <book>
    <title>The XML Handbook</title>
    <author>Goldfarb</author>
    <author>Prescod</author>
    <publisher>Prentice Hall</publisher>
    <pages>655</pages>
    <isbn>0130811521</isbn>
    <price currency="USD">44.95</price>
  </book>
  <book>
    <title>XML Design</title>
    <author>Spencer</author>
    <publisher>Wrox Press</publisher>
    ...
  </book>
</books>
[Figure: XML document -> ? -> Book object]
5 - 4
Parsing XML Documents
[Figure: The parser reads the document together with its DTD / Schema. DOM: the parser builds a document tree for the application. SAX: the application implements DocumentHandler, and the parser calls startDocument, startElement, endElement, ..., endDocument.]
5 - 5
Parser
Project X (Sun Microsystems)
Ælfred (Microstar Software)
XML4J (IBM)
Lark (Tim Bray)
MSXML (Microsoft)
XJ (Data Channel)
Xerces (Apache)
...
5 - 6
The Document Object Model
XML Document Structure
[Figure: the books document drawn as a tree. The root books contains book elements; a book has the children title, author, publisher, pages, isbn, ...; the leaves are text values such as "The XML Handbook", "Goldfarb", "Prescod", "Prentice Hall", and "655".]
5 - 7
The Document Object Model
Provides a standard interface for access to and manipulation of XML structures.
Represents documents in the form of a hierarchy of nodes.
Is platform- and programming-language-neutral.
Is a W3C Recommendation (DOM Level 1, October 1, 1998).
Is implemented by many parsers.
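As a rough illustration of the "hierarchy of nodes" idea, the following sketch parses a shortened version of the books document with Python's standard-library xml.dom.minidom, which follows the W3C DOM naming. The choice of Python and the variable names are assumptions made for this example; the slides themselves do not prescribe a language.

```python
from xml.dom import minidom

# Shortened books document from the slides, written without whitespace
# between elements so that no whitespace-only text nodes appear in the tree.
BOOKS_XML = (
    "<books>"
    "<book><title>The XML Handbook</title>"
    "<author>Goldfarb</author><author>Prescod</author>"
    "<publisher>Prentice Hall</publisher></book>"
    "<book><title>XML Design</title><author>Spencer</author>"
    "<publisher>Wrox Press</publisher></book>"
    "</books>"
)

# The parser turns the text into a tree of nodes.
doc = minidom.parseString(BOOKS_XML)

# Standard DOM access: the root element and all <title> elements.
root_name = doc.documentElement.tagName
titles = [t.firstChild.data for t in doc.getElementsByTagName("title")]

print(root_name)
print(titles)
```

The same property and method names (documentElement, getElementsByTagName, firstChild, data) are available in other DOM implementations, which is the point of a language-neutral interface.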
5 - 8
DOM - Structure Model
[Figure: the books tree from the previous slide, labeled with the DOM interfaces Document, Node, NodeList, and Element.]
5 - 9
The Document Interface
Attribute / Method            Result
doctype                       DocumentType
implementation                DOMImplementation
documentElement               Element
getElementsByTagName(String)  NodeList
createTextNode(String)        Text
createComment(String)         Comment
createElement(String)         Element
createCDATASection(String)    CDATASection
5 - 10
The Node Interface
Attribute / Method                 Result
nodeName                           String
nodeValue                          String
nodeType                           short
parentNode                         Node
childNodes                         NodeList
firstChild                         Node
lastChild                          Node
previousSibling                    Node
nextSibling                        Node
attributes                         NamedNodeMap
insertBefore(Node new, Node ref)   Node
replaceChild(Node new, Node old)   Node
removeChild(Node)                  Node
hasChildNodes()                    boolean
5 - 11
Node Types / Node Names
Result of nodeType / nodeName

Node field                    nodeType  nodeName
ELEMENT_NODE                  1         tagName
ATTRIBUTE_NODE                2         name of attribute
TEXT_NODE                     3         "#text"
CDATA_SECTION_NODE            4         "#cdata-section"
ENTITY_REFERENCE_NODE         5         name of entity referenced
ENTITY_NODE                   6         entity name
PROCESSING_INSTRUCTION_NODE   7         target
COMMENT_NODE                  8         "#comment"
DOCUMENT_NODE                 9         "#document"
DOCUMENT_TYPE_NODE            10        document type name
DOCUMENT_FRAGMENT_NODE        11        "#document-fragment"
NOTATION_NODE                 12        notation name
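The node type codes and node names can be checked against a small document. The sketch below assumes Python's xml.dom.minidom and an invented one-book document with a hypothetical id attribute; it collects (nodeType, nodeName) pairs for a few kinds of node.

```python
from xml.dom import Node, minidom

# Tiny illustrative document; the "id" attribute and the comment are
# made up for this example.
doc = minidom.parseString(
    '<book id="b1"><!-- note --><title>XML Design</title></book>'
)

def describe(node):
    """Return the (nodeType, nodeName) pair of a DOM node."""
    return (node.nodeType, node.nodeName)

book = doc.documentElement
kinds = [describe(doc), describe(book)]
kinds += [describe(child) for child in book.childNodes]
kinds.append(describe(book.getAttributeNode("id")))
kinds.append(describe(book.getElementsByTagName("title")[0].firstChild))

for node_type, node_name in kinds:
    print(node_type, node_name)
```

The numeric codes are also available as named constants, for example Node.ELEMENT_NODE and Node.TEXT_NODE, which reads better than comparing against bare numbers.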
5 - 12
The NodeList Interface
Attribute / Method   Result
length               int
item(int)            Node
5 - 13
The Element Interface
Attribute / Method                         Result
tagName                                    String
getAttribute(String)                       String
setAttribute(String name, String value)    (none)
removeAttribute(String)                    (none)
getAttributeNode(String)                   Attr
setAttributeNode(Attr)                     Attr
removeAttributeNode(Attr)                  Attr
getElementsByTagName(String)               NodeList
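A short sketch of the attribute methods of the Element interface, again assuming Python's xml.dom.minidom. The price element comes from the books document; the "discount" attribute is hypothetical and only serves to show setAttribute and removeAttribute.

```python
from xml.dom import minidom

doc = minidom.parseString('<book><price currency="USD">44.95</price></book>')
price = doc.getElementsByTagName("price").item(0)

tag = price.tagName                        # element name
currency = price.getAttribute("currency")  # attribute value as a string

# Add a (hypothetical) attribute, read it back as an Attr node, remove it.
price.setAttribute("discount", "10")
discount_attr = price.getAttributeNode("discount")
discount_value = discount_attr.value
price.removeAttribute("discount")

# For a missing attribute, getAttribute returns an empty string.
after_removal = price.getAttribute("discount")

print(tag, currency, discount_value, repr(after_removal))
```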
5 - 14
DOM Methods for Navigation
firstChild, lastChild
nextSibling, previousSibling
parentNode
getElementsByTagName
childNodes (length, item())
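The navigation attributes can be tried on one book element. This is a minimal sketch assuming Python's xml.dom.minidom; the XML is written without whitespace between elements, because whitespace would otherwise show up as extra text nodes among the children.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<books><book><title>The XML Handbook</title>"
    "<author>Goldfarb</author><author>Prescod</author></book></books>"
)

book = doc.documentElement.firstChild   # first child of the root
title = book.firstChild                 # <title>
first_author = title.nextSibling        # <author>Goldfarb</author>
last_author = book.lastChild            # <author>Prescod</author>

# Walk the child list through the NodeList interface (length, item()).
children = book.childNodes
child_names = [children.item(i).nodeName for i in range(children.length)]

print(child_names)
print(last_author.previousSibling is first_author)
print(first_author.parentNode is book)
```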
5 - 15
DOM Methods for Manipulation
appendChild, insertBefore, replaceChild, removeChild
createElement, createAttribute, createTextNode
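The manipulation methods can be sketched as follows, assuming Python's xml.dom.minidom. The helper make_element is hypothetical and only combines createElement, createTextNode, and appendChild; replaceChild works analogously and is left out to keep the example short.

```python
from xml.dom import minidom

doc = minidom.parseString("<books><book><title>XML Design</title></book></books>")
book = doc.documentElement.firstChild

def make_element(name, text):
    """Hypothetical helper: create <name>text</name> owned by doc."""
    element = doc.createElement(name)
    element.appendChild(doc.createTextNode(text))
    return element

# Append a publisher, then insert the author in front of it.
publisher = make_element("publisher", "Wrox Press")
book.appendChild(publisher)
author = make_element("author", "Spencer")
book.insertBefore(author, publisher)
with_publisher = doc.documentElement.toxml()

# Remove the publisher again; removeChild returns the removed node.
removed = book.removeChild(publisher)
without_publisher = doc.documentElement.toxml()

print(with_publisher)
print(without_publisher)
```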
5 - 16
Example
[Figure: tree with the root books and two book children; the first book has the authors Goldfarb and Prescod, the second has the author Spencer.]

doc.documentElement.childNodes.item(0).getElementsByTagName("author").item(1).childNodes.item(0).data

doc                               DOM object
.documentElement                  root node
.childNodes                       books
.item(0)                          first book
.getElementsByTagName("author")   authors
.item(1)                          second author
.childNodes                       text subnodes
.item(0)                          first thereof
.data                             text
5 - 17
Script
<HTML>
<HEAD><TITLE>DOM Example</TITLE></HEAD>
<BODY>
<H1>DOM Example</H1>
<SCRIPT LANGUAGE="JavaScript">
  // Runs only in Internet Explorer, which provides the MSXML parser via ActiveX.
  var doc, root, book1, authors, author2;
  doc = new ActiveXObject("Microsoft.XMLDOM");
  doc.async = false;                       // load synchronously
  doc.load("books.xml");
  if (doc.parseError.errorCode != 0) {
    alert(doc.parseError.reason);          // report why parsing failed
  } else {
    root = doc.documentElement;
    document.write("Name of Root node: " + root.nodeName + "<BR>");
    document.write("Type of Root node: " + root.nodeType + "<BR>");
    book1 = root.childNodes.item(0);       // first book
    authors = book1.getElementsByTagName("author");
    document.write("Number of authors: " + authors.length + "<BR>");
    author2 = authors.item(1);             // second author
    document.write("Name of second author: " + author2.childNodes.item(0).data);
  }
</SCRIPT>
</BODY>
</HTML>
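For readers without Internet Explorer, the same steps can be approximated with Python's xml.dom.minidom. This is a sketch under assumptions: the document is passed as a string instead of being loaded from books.xml, the function name load is invented, and an ExpatError exception takes the place of MSXML's parseError object.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

BOOKS_XML = (
    "<books>"
    "<book><title>The XML Handbook</title>"
    "<author>Goldfarb</author><author>Prescod</author></book>"
    "<book><title>XML Design</title><author>Spencer</author></book>"
    "</books>"
)

def load(xml_text):
    """Return the parsed Document, or None after reporting a parse error."""
    try:
        return minidom.parseString(xml_text)
    except ExpatError as err:
        print("Parse error:", err)
        return None

report = []
doc = load(BOOKS_XML)
if doc is not None:
    root = doc.documentElement
    report.append("Name of Root node: " + root.nodeName)
    report.append("Type of Root node: " + str(root.nodeType))
    book1 = root.childNodes.item(0)                  # first book
    authors = book1.getElementsByTagName("author")
    report.append("Number of authors: " + str(authors.length))
    author2 = authors.item(1)                        # second author
    report.append("Name of second author: " + author2.childNodes.item(0).data)

print("\n".join(report))
```

Note that childNodes.item(0) is the first book only because the string contains no whitespace between elements; with an indented file, minidom would return a whitespace text node there.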
5 - 18
SAX - Simple API for XML
[Figure: The parser reads the document and its DTD and reports events to the application: startDocument, startElement, startElement, endElement, endElement, endDocument.]
5 - 19
SAX - Simple API for XML
Event-driven parsing model
"Don't call the DOM, the parser calls you."
Developed by the members of the XML-DEV mailing list
Released on May 11, 1998
Supported by many parsers ...
... but Ælfred is the Saxon king.
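The event-driven model can be sketched with Python's standard-library xml.sax module. Two assumptions apply: Python exposes the SAX 2 ContentHandler interface rather than the SAX 1 DocumentHandler named earlier in these slides, and the class name AuthorHandler is invented for this example.

```python
import xml.sax

class AuthorHandler(xml.sax.ContentHandler):
    """Records the event sequence and collects the text of <author> elements."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.authors = []
        self._text = None   # buffer while inside an <author> element

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append("startElement " + name)
        if name == "author":
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is buffered.
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        self.events.append("endElement " + name)
        if name == "author":
            self.authors.append("".join(self._text))
            self._text = None

    def endDocument(self):
        self.events.append("endDocument")

handler = AuthorHandler()
xml.sax.parseString(
    b"<books><book><author>Goldfarb</author><author>Prescod</author></book></books>",
    handler,
)
print(handler.authors)
print(handler.events)
```

No tree is built here: the application only sees the callbacks, so it has to keep whatever state it needs (such as the text buffer) itself.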
5 - 20
Procedure
DOM
  Create a parser instance
  Parse the whole document
  Process the DOM tree
SAX
  Create a parser instance
  Register event handlers with the parser
  The parser calls the event handlers during parsing
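The two procedures can be placed side by side on one small task, counting author elements. This is a sketch assuming Python's xml.dom.minidom and xml.sax; the class name AuthorCounter is invented, and in the DOM case minidom creates the parser instance internally.

```python
import io
import xml.sax
from xml.dom import minidom

XML = (
    b"<books><book><author>Goldfarb</author><author>Prescod</author></book>"
    b"<book><author>Spencer</author></book></books>"
)

# DOM: parse the whole document, then process the finished tree.
doc = minidom.parseString(XML)
dom_count = doc.getElementsByTagName("author").length

# SAX: create a parser, register a handler, let the parser call it.
class AuthorCounter(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "author":
            self.count += 1

parser = xml.sax.make_parser()        # 1. create a parser instance
counter = AuthorCounter()
parser.setContentHandler(counter)     # 2. register the event handler
parser.parse(io.BytesIO(XML))         # 3. parser calls the handler while parsing
sax_count = counter.count

print(dom_count, sax_count)
```

Both variants give the same count; the difference is that the DOM variant holds the complete tree in memory, while the SAX variant keeps only a counter.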
5 - 21
Namespace Support
<?xml version="1.0"?>
<order xmlns="http://www.net-standard.com/namespaces/order"
       xmlns:bk="http://www.net-standard.com/namespaces/books"
       xmlns:cust="http://www.net-standard.com/namespaces/customer">
  ...
  <bk:book>
    <bk:title>XML Handbook</bk:title>
    <bk:isbn>0130811521</bk:isbn>
  </bk:book>
  ....
</order>
5 - 22
Access to Qualified Elements
Node "book"

Value                                          DOM Level 2 (interface Node)   SAX 2.0 (startElement)
bk:book                                        nodeName                       qName
http://www.net-standard.com/namespaces/books   namespaceURI                   uri
bk                                             prefix
book                                           localName                      localName
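Namespace-aware access to the bk:book element can be sketched in Python with xml.dom.minidom and xml.sax. Several assumptions apply: the order document is shortened, the class name NamespaceHandler is invented, and Python's SAX binding delivers the namespace URI and local name as a pair to startElementNS. The qualified name argument of that callback may be None with Python's default parser, so the sketch does not rely on it.

```python
import io
import xml.sax
from xml.dom import minidom
from xml.sax.handler import feature_namespaces

BOOKS_NS = "http://www.net-standard.com/namespaces/books"
ORDER_XML = (
    '<order xmlns="http://www.net-standard.com/namespaces/order" '
    'xmlns:bk="http://www.net-standard.com/namespaces/books">'
    "<bk:book><bk:title>XML Handbook</bk:title>"
    "<bk:isbn>0130811521</bk:isbn></bk:book>"
    "</order>"
)

# DOM Level 2: namespace information is available on every node.
doc = minidom.parseString(ORDER_XML)
book = doc.getElementsByTagNameNS(BOOKS_NS, "book").item(0)
dom_view = (book.nodeName, book.namespaceURI, book.prefix, book.localName)

# SAX 2: with namespace processing enabled, the parser reports
# (uri, localName) pairs through startElementNS.
class NamespaceHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.seen = []

    def startElementNS(self, name, qname, attrs):
        uri, local_name = name
        self.seen.append((uri, local_name))

parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
handler = NamespaceHandler()
parser.setContentHandler(handler)
parser.parse(io.BytesIO(ORDER_XML.encode("utf-8")))

print(dom_view)
print(handler.seen)
```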
5 - 23
Generation of Data Structures
[Figure: DTD / Schema 'yacht' -> Generation -> Class; XML document -> Processing -> Object]

Class (generated from the DTD / Schema):
01 yacht
  05 name
  05 details
    10 type

XML document:
<?xml version="1.0"?>
<yacht yachtid='147'>
  <name>Mona Lisa</name>
  <image file='yacht147.jpg'/>
  <description>
    Any text describing this yacht 147
  </description>
  <details>
    <type>GULFSTAR 55</type>
    <length>1700</length>
    <width>480</width>
    <draft>170</draft>
    <sailsurface>112</sailsurface>
    <motor>84</motor>
    <headroom>202</headroom>
    <bunks>8</bunks>
  </details>
</yacht>

Object (result of processing):
01 yacht
  05 VENTANA
  05 details
    10 GULFSTAR 55
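What a generated class and its population from a document might look like can be sketched in Python. Everything here is an assumption for illustration: the Yacht and Details classes are written by hand as stand-ins for generator output, only a few fields of the yacht document are mapped, and the helper names text_of and yacht_from_xml are invented.

```python
from dataclasses import dataclass
from xml.dom import minidom

@dataclass
class Details:
    type: str
    length: int
    bunks: int

@dataclass
class Yacht:
    yachtid: str
    name: str
    details: Details

def text_of(parent, tag):
    """Return the stripped text of the first <tag> element below parent."""
    return parent.getElementsByTagName(tag).item(0).firstChild.data.strip()

def yacht_from_xml(xml_text):
    """Build a Yacht object from a yacht document (subset of fields only)."""
    root = minidom.parseString(xml_text).documentElement
    details = root.getElementsByTagName("details").item(0)
    return Yacht(
        yachtid=root.getAttribute("yachtid"),
        name=text_of(root, "name"),
        details=Details(
            type=text_of(details, "type"),
            length=int(text_of(details, "length")),
            bunks=int(text_of(details, "bunks")),
        ),
    )

yacht = yacht_from_xml(
    "<yacht yachtid='147'><name>Mona Lisa</name>"
    "<details><type>GULFSTAR 55</type><length>1700</length>"
    "<bunks>8</bunks></details></yacht>"
)
print(yacht)
```

The application then works with typed fields such as yacht.details.length instead of navigating nodes, which is the benefit class generation aims for.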
5 - 24
Summary
To avoid expensive text processing, applications use an XML parser that creates a DOM tree of a document.
The DOM provides a standardized API to access the content of documents and to manipulate them.
Alternatively or additionally, applications can work event-based using the SAX interface, which is provided by many parsers.