Efficient XML Interchange High Performance XML Don McGregor (mcgredo (at) nps.edu) Don Brutzman (brutzman (at) nps.edu)


Efficient XML Interchange

High Performance XML

Don McGregor (mcgredo (at) nps.edu)

Don Brutzman (brutzman (at) nps.edu)


W3C EXI Working Group

Chartered by the World Wide Web Consortium

http://www.w3.org/XML/EXI/

Tasked to:

• Develop a specification for an encoding format that allows efficient interchange of the XML Information Set, and

• Illustrate effective processor implementations of that encoding


XML Virtues

XML has become the default format for the storage and interchange of information

It has many virtues: it is simple, human-readable, and flexible, and it is supported by a huge array of tools for transformation, storage, reading, indexing, etc.

Like any standard, it becomes more valuable as more things use it. This dynamic has driven its use to places not previously considered XML domains


Newer XML Applications

• Network protocols

• High speed web services

• Low power devices (cell phones, sensor networks, etc.)

• Archiving large data sets

• DoD tactical messaging systems


XML Vices

XML is also:

• Verbose and bandwidth intensive

• Text-based; converting to binary is expensive (databinding)

• Power hungry: parsers can consume significant power, and batteries are not on the same improvement curve as CPUs


XML Vices: Bandwidth

Text takes bandwidth

Tags (<aTag></aTag>) are a significant portion of the overall document size

Gzip of XML documents is not always enough; better compactness is achievable, and compression interacts with other issues (such as power consumption)

Size is a major issue in tactical links
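
As a rough illustration of the bandwidth point, the sketch below measures how many bytes of a small document are markup and how large the document is after gzip. The sample document, the class name TagOverhead, and the method names are made up for this example; only standard JDK classes are used. Attribute values inside a tag would be counted as markup here, which is a simplification.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.GZIPOutputStream;

public class TagOverhead {
    // Matches everything from '<' to the next '>' (start, end, and empty-element tags).
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Number of UTF-8 bytes occupied by tags in the given XML text. */
    static int markupBytes(String xml) {
        int total = 0;
        Matcher m = TAG.matcher(xml);
        while (m.find()) {
            total += m.group().getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    /** Size in bytes of the text after gzip compression. */
    static int gzipBytes(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the stream writes the gzip trailer, so measure only after the try block.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        // Hypothetical sample message, invented for this sketch.
        String xml = "<track><id>42</id><lat>36.6</lat><lon>-121.9</lon></track>";
        int total = xml.getBytes(StandardCharsets.UTF_8).length;
        int markup = markupBytes(xml);
        System.out.printf("total=%d bytes, markup=%d bytes (%.0f%%), gzip=%d bytes%n",
                total, markup, 100.0 * markup / total, gzipBytes(xml));
    }
}
```

For very short messages, gzip's fixed header and trailer can outweigh any savings, which is one reason gzip alone is not always enough.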


XML Vices: Databinding

Databinding ties text XML to (usually) programming language objects

<point x="1" y="2" z="3"/>

needs to be tied to something like a Java object:

public class Point { float x, y, z; }

This involves parsing the text, converting to binary, and stuffing it into the object

It’s a lot more efficient to simply send binary (though you need to be careful about endian issues)
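
The two paths described above can be sketched side by side. This is a minimal illustration, not a benchmark: the class name PointBinding and its methods are invented for this example, the binary layout (three 4-byte floats) is an assumption, and the byte order is fixed explicitly to address the endian issue.

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class PointBinding {
    /** The target object from the slide. */
    static class Point { float x, y, z; }

    /** Text path: parse the XML, then convert each attribute string to a float. */
    static Point fromXml(String xml) {
        try {
            Element e = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            Point p = new Point();
            p.x = Float.parseFloat(e.getAttribute("x"));
            p.y = Float.parseFloat(e.getAttribute("y"));
            p.z = Float.parseFloat(e.getAttribute("z"));
            return p;
        } catch (Exception ex) {
            throw new IllegalArgumentException("not a valid point document", ex);
        }
    }

    /** Binary path: write three floats in an agreed byte order (big-endian assumed here). */
    static byte[] toBinary(Point p) {
        return ByteBuffer.allocate(12).order(ByteOrder.BIG_ENDIAN)
                .putFloat(p.x).putFloat(p.y).putFloat(p.z).array();
    }

    /** Binary path: read the floats back; both sides must use the same byte order. */
    static Point fromBinary(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN);
        Point p = new Point();
        p.x = buf.getFloat();
        p.y = buf.getFloat();
        p.z = buf.getFloat();
        return p;
    }

    public static void main(String[] args) {
        String xml = "<point x=\"1\" y=\"2\" z=\"3\"/>";
        Point fromText = fromXml(xml);
        byte[] wire = toBinary(fromText);
        Point fromWire = fromBinary(wire);
        System.out.println("text bytes: " + xml.length() + ", binary bytes: " + wire.length);
        System.out.println("x=" + fromWire.x + " y=" + fromWire.y + " z=" + fromWire.z);
    }
}
```

The text path needs a full parse plus three string-to-float conversions; the binary path is three fixed-width reads.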


XML Vices: Power Consumption

XML is migrating to phones, sensor networks, etc.

Batteries are on a slow improvement curve; a full XML parser can be somewhat expensive in terms of power. Doing gzip to get compactness adds to the power budget


Efficient XML

This has led the W3C to create an Efficient XML Interchange (EXI) working group to agree on a standard, rather than continue to use multiple incompatible formats

EXI is an alternative representation of XML that is more compact, faster to parse, consumes less power, and has better data binding characteristics

It accomplishes this by giving up the text-based, human-readable representation in favor of a binary format

It’s still XML--just a different representation of the same information


EXI

• Replace text tags (<start>, <stop>, etc) with a shorter binary representation

• For schema-described documents, we can use the schema information to handle numeric fields as binary values

• The resulting document compresses better than the original text document

Better data binding than classic XML or gzipped XML.

If you run the EXI through gzip, you’ll also get a smaller file than the original gzip’d XML

Lower power consumption to parse
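
The first bullet, replacing text tags with a shorter binary representation, can be illustrated with a toy lookup table. This is not the EXI encoding, which is defined by the W3C specification and is considerably more involved; it only shows why a repeated tag name need not be sent as text each time. The class name TagCodes, its methods, and the sample tag sequence are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TagCodes {
    /** Assigns each distinct tag name a small integer code, in order of first appearance. */
    static Map<String, Integer> buildTable(List<String> tagNames) {
        Map<String, Integer> table = new LinkedHashMap<>();
        for (String name : tagNames) {
            table.putIfAbsent(name, table.size());
        }
        return table;
    }

    /** Replaces each tag name with its code from the table. */
    static List<Integer> encode(List<String> tagNames, Map<String, Integer> table) {
        List<Integer> codes = new ArrayList<>();
        for (String name : tagNames) {
            codes.add(table.get(name));
        }
        return codes;
    }

    public static void main(String[] args) {
        // Hypothetical stream of start tags from two similar records.
        List<String> tags = Arrays.asList("track", "id", "lat", "lon", "track", "id", "lat", "lon");
        Map<String, Integer> table = buildTable(tags);
        List<Integer> codes = encode(tags, table);

        int textBytes = 0;
        for (String name : tags) {
            textBytes += name.length() + 2; // '<' + name + '>'
        }
        System.out.println("table: " + table);
        System.out.println("codes: " + codes);
        // With fewer than 256 distinct tags, each code fits in a single byte.
        System.out.println("start tags as text: " + textBytes + " bytes, as codes: " + codes.size() + " bytes");
    }
}
```

When both sides share a schema, a table like this can be agreed in advance rather than transmitted, which is part of what makes schema-described documents attractive.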


EXI Status

A public working draft of the specification has been published; it will probably go to “Last Call” in early 2008. Last Call is the point at which it receives most of the outside scrutiny

One commercial implementation (subject to specification changes)

Open source Java implementation being worked on by Sun, Fujitsu, Siemens, NPS

Lower power J2ME implementation is possible down the road


EXI

DoD should adopt a standard, not a product.

EXI is just another format for XML. If you want to go back to text XML, or even move to another binary XML standard, simply convert the EXI back to text XML

[Diagram: the XML Infoset can be represented in either Text XML Format or EXI Format]
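
The idea that one Infoset can have more than one representation can be seen even within text XML. The sketch below does not use EXI at all; it uses only the JDK's DOM API to show that two different byte sequences can carry the same information. DOM node equality is used here as a rough stand-in for Infoset equality, and the class name SameInfoset is made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class SameInfoset {
    /** Parses XML text and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    /** True if both texts describe the same element, attributes, and content. */
    static boolean sameInformation(String a, String b) {
        return parse(a).isEqualNode(parse(b));
    }

    public static void main(String[] args) {
        String first = "<point x=\"1\" y=\"2\" z=\"3\"/>";
        // Different quoting, attribute order, and tag style, but the same information.
        String second = "<point z='3' y='2' x='1'></point>";
        System.out.println("same information: " + sameInformation(first, second));
    }
}
```

An EXI stream is a further representation of the same kind: different bytes, same information.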


Applications: Messaging

Military messaging: conventional XML is 10-20X larger than existing binary message formats

Bandwidth is limited and heavily in demand for other applications

Bespoke message formats are brittle and not easily handled by other applications

EXI can be ~10-30% larger than custom binary formats (depending on application)
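
To put the ratios above in concrete terms, the sketch below applies them to a hypothetical 100-byte custom binary message. The 100-byte figure is an assumption chosen only to make the arithmetic easy to follow, and the class name MessageSizes is invented for this example.

```java
public class MessageSizes {
    /** Size after applying a multiplicative factor, rounded to whole bytes. */
    static long scaled(long baseBytes, double factor) {
        return Math.round(baseBytes * factor);
    }

    public static void main(String[] args) {
        long binary = 100; // hypothetical size of a custom binary message, in bytes

        // Conventional XML: 10-20X larger than the binary format.
        System.out.println("text XML: " + scaled(binary, 10) + " to " + scaled(binary, 20) + " bytes");

        // EXI: roughly 10-30% larger than the binary format.
        System.out.println("EXI:      " + scaled(binary, 1.10) + " to " + scaled(binary, 1.30) + " bytes");
    }
}
```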


Applications: Low Power

Cell phone/PDA integration with military formats



Applications: Chat

Chat/IM is playing a bigger role in the military

XMPP is an XML-based chat protocol adopted as a standard by DoD

Chat messages are XML based, but go over military TCP/IP channels, which have limited bandwidth

Should be possible to replace the existing XML streams with EXI streams


Applications: X3D

X3D is an XML-based standard for 3D scene graphs

File sizes can get large; there’s a lot of information to represent in a 3D scene

We may need to transfer this across networks (as in X3D-Earth)


EXI

EXI keeps the benefits of XML: it is easily transcoded to conventional XML format and retains access to the vast XML toolset

Alleviates some of XML’s problems by giving up human-readable text

We gain compactness, better databinding, fast parsing, and low power consumption