Challenges in handling XML: performance and memory usage
15.11.2001
Sami Poikonen
Republica Oy
Republica Oy is Finland’s leading provider of products and services
based on XML standards.
Founded: 1996
Employees: 70+ (11/2001)
Offices: Helsinki, Jyväskylä
TOC
1. DOM
2. SAX
3. DOM or SAX or something else...
4. Transformations
5. Conclusions
Parsing XML: DOM
• Document Object Model
• standard API for accessing and creating XML data
• tree-based
• programming language independent
• developed by W3C
• whole document is read into memory
• read and write
DomNode book
|
|-->DomNode title
|   |
|   |-->DomNode text
|
|-->DomNode author
    |
    |-->DomNode name
<?xml version="1.0"?>
<book type="pokkari">
  <title>Tuntematon sotilas</title>
  <author>
    <name first="Väinö" last="Linna"/>
  </author>
</book>
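As a minimal sketch of the DOM approach, the book document above can be loaded into an in-memory tree, read, modified and serialized again. The example uses Python's standard xml.dom.minidom module, which is an assumption of this sketch and not a tool named in the slides; the replacement attribute value "hardcover" is made up for illustration.

```python
from xml.dom import minidom

BOOK_XML = (
    '<?xml version="1.0"?>'
    '<book type="pokkari">'
    '<title>Tuntematon sotilas</title>'
    '<author><name first="Väinö" last="Linna"/></author>'
    '</book>'
)

# The whole document is parsed into an in-memory tree before any access.
doc = minidom.parseString(BOOK_XML)
book = doc.documentElement

# Read: the application can move back and forth in the tree freely.
title = book.getElementsByTagName("title")[0].firstChild.data
name = book.getElementsByTagName("name")[0]
author = name.getAttribute("first") + " " + name.getAttribute("last")

# Write: the same tree can be changed and serialized back to XML.
# "hardcover" is a hypothetical value used only for this example.
book.setAttribute("type", "hardcover")
updated_xml = book.toxml()
```

The cost of this convenience is that memory usage grows with the size of the document, because every node exists as an object at the same time.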
Parsing XML: SAX
• Simple API for XML
• API for accessing XML data
• event-based
• programming language independent
• not defined by W3C
• application has to store fragments into memory
• read only
<?xml version="1.0"?>
<poem>
  <line>Roses are red,</line>
  <line>Violets are blue.</line>
  <line>Sugar is sweet,</line>
  <line>and I love you.</line>
</poem>
Start element: poem
Start element: line
End element: line
Start element: line
End element: line
Start element: line
End element: line
Start element: line
End element: line
End element: poem
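The event sequence above can be reproduced with a small handler. This is a sketch using Python's standard xml.sax module (an assumption, the slides do not name an implementation); the class name EventLogger and its attributes are hypothetical. It also shows the point that the application itself has to store whatever fragments it wants to keep.

```python
import xml.sax

POEM_XML = b"""<?xml version="1.0"?>
<poem>
  <line>Roses are red,</line>
  <line>Violets are blue.</line>
  <line>Sugar is sweet,</line>
  <line>and I love you.</line>
</poem>"""


class EventLogger(xml.sax.ContentHandler):
    """Records start/end events and keeps only the text of <line> elements."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.lines = []
        self._buffer = None  # collects text only while inside a <line>

    def startElement(self, name, attrs):
        self.events.append("Start element: " + name)
        if name == "line":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is buffered.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        self.events.append("End element: " + name)
        if name == "line":
            self.lines.append("".join(self._buffer))
            self._buffer = None


handler = EventLogger()
xml.sax.parseString(POEM_XML, handler)
```

After parsing, handler.events holds the ten start/end events in document order and handler.lines holds the four lines of the poem. Nothing else from the document is retained in memory.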
DOM or SAX or something else?
DOM:
• read and write
• need to move back and forth in data
• document is human created
SAX:
• read only
• huge data or streams
• data is machine generated
Best of both worlds? Adaptive parsing!
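The slides do not define adaptive parsing in detail, so the following is only one possible reading of the idea: stream through the document as events and build a small tree only for the fragments that are actually needed. The sketch uses Python's standard xml.dom.pulldom module; the catalog document, its second book and the function name paperback_titles are hypothetical.

```python
from xml.dom import pulldom

# Hypothetical input: a catalog wrapping several <book> elements.
CATALOG_XML = (
    "<catalog>"
    '<book type="pokkari"><title>Tuntematon sotilas</title></book>'
    '<book type="hardcover"><title>Another book</title></book>'
    "</catalog>"
)


def paperback_titles(xml_text):
    """Streams over the catalog and expands only matching <book> elements."""
    titles = []
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "book":
            if node.getAttribute("type") == "pokkari":
                # Build a DOM subtree for this one element only.
                events.expandNode(node)
                title = node.getElementsByTagName("title")[0]
                titles.append(title.firstChild.data)
    return titles


titles = paperback_titles(CATALOG_XML)
```

Books that do not match are passed over as plain events and never become a tree, so memory usage depends on the largest expanded fragment, not on the size of the whole document.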
Transformations
• XSLT: XSL Transformations
• XSLT processors are built to use DOM
• XSLT to Java conversion: still uses DOM
• SAX-based custom-made application for transformations
• Adaptive parsing with data binding?
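A SAX-based custom transformation can be sketched as a filter that rewrites events while they stream from the parser to a writer, so no tree is built at any point. The example below uses XMLFilterBase and XMLGenerator from Python's standard xml.sax.saxutils module; it is an illustration under these assumptions, not the application the slides refer to. The RenameFilter class, the transform function and the "line" to "verse" mapping are hypothetical.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class RenameFilter(XMLFilterBase):
    """Renames elements on the fly as events pass through the filter."""

    def __init__(self, parent, mapping):
        super().__init__(parent)
        self._mapping = mapping

    def startElement(self, name, attrs):
        super().startElement(self._mapping.get(name, name), attrs)

    def endElement(self, name):
        super().endElement(self._mapping.get(name, name))


def transform(xml_bytes, mapping):
    """Runs parser -> filter -> writer and returns the result as text."""
    out = io.StringIO()
    reader = RenameFilter(xml.sax.make_parser(), mapping)
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_bytes))
    return out.getvalue()


result = transform(
    b"<poem><line>Roses are red,</line></poem>",
    {"line": "verse"},  # hypothetical renaming rule
)
```

Such a filter handles only transformations that can be decided from the current event. Anything that needs to look ahead or back in the document requires the application to buffer fragments itself.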
Conclusions
• When building XML applications, you have to think about how you will handle large chunks of data
• Choosing between SAX and DOM is not always trivial
• There are also smarter ways to parse XML
• Adaptive parsing with data binding gives transformations a much-needed performance boost
• It is easy to reach the limits of XSLT processing capabilities
• In some cases, problems with handling XML streams and large files have led people to assume that handling them is almost impossible
Contact Information
Sami Poikonen
Vice President, Solutions
tel. 040 301 1154
sami.poikonen@republica.fi
Republica Oy
Survontie 9
40500 Jyväskylä
http://www.republica.fi/
http://www.x-fetch.com/