30
Mulberry Technologies, Inc. What is XML and Why Should You Care B. Tommie Usdin Mulberry Technologies Inc. 17 West Jefferson St. Suite 207 Rockville MD 20850 Phone: 301/315-9631 Fax: 301/315-8285 [email protected] http://www.mulberrytech.com July 2000 Version 1.0 © 2000 Mulberry Technologies

What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

MulberryTechnologies, Inc.

What is XML

and

Why Should You Care

B. Tommie UsdinMulberry Technologies Inc.17 West Jefferson St.Suite 207Rockville MD 20850Phone: 301/315-9631Fax: 301/[email protected]://www.mulberrytech.com

July 2000Version 1.0© 2000 Mulberry Technologies

Page 2: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example
Page 3: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. i

What is XML and Why Should You Care

What is XML?What XML Means............................................................................................ 1XML Works Through Tags.............................................................................. 2Elements Can Nest ........................................................................................... 3What XML Isn’t............................................................................................... 3XML Is a Data Format

“Employee Record” Example................................................................... 4View This in a Browser ............................................................................ 5A Familiar Print Application .................................................................... 5Same Data, Different Application............................................................. 6Same Source: Load a Database................................................................. 6Ultimate Purpose of XML: Encode Data Once......................................... 7

Parts of an XML ApplicationLogical Components of an XML Application.................................................. 7Component: XML Document .......................................................................... 8Component: DTD (Document Type Definition)............................................. 8

DTDs Express Rules................................................................................. 9Why Use a DTD?...................................................................................... 9DTDs today, Schemas tomorrow............................................................ 10

How to Specify Formatting (and Behavior)XML Separates Content from Format/Behavior..................................... 10Use an Output Specification to get There............................................... 11Documents and Stylesheets..................................................................... 11

Component: XML TransformsXML and XSL ........................................................................................ 12XSL For XML Transformation (XSLT) ................................................. 13

Where XML can be Used in PublishingXML Files ...................................................................................................... 13XML for Print and Web Publishing............................................................... 14XML Integrates Services on a Web Base....................................................... 14XML Between Application Layers ................................................................ 15A Typical “3-tiered” Model ........................................................................... 16XML Tagged Data for Information Interchange............................................ 17XML Repositories and Databases.................................................................. 17XML for Search and Retrieval....................................................................... 18

Page 4: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

ii MulberryTechnologies, Inc.

Future of XMLXML Versus HTML

XML Will Replace HTML ..................................................................... 18XML Won’t Replace HTML .................................................................. 19

XML and PDFNOT XML Versus PDF........................................................................... 19XML and PDF ........................................................................................ 20

XML and Proprietary FormatsProprietary Formats................................................................................. 20XML feeds Proprietary Formats.............................................................. 21

XML Tools..................................................................................................... 21XML Ubiquitous Under the Covers............................................................... 22XML Tag Sets will Proliferate....................................................................... 22

Where to Get More InformationThe Source for XML and Related Information.............................................. 22General XML Information ............................................................................. 23XML E-Business and EDI Information.......................................................... 24Printed Books on Concepts ............................................................................ 25Other Information Sources............................................................................. 26Still More Information Sources...................................................................... 26

Page 5: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 1

What is XML?

slide 1

The Word “XML” is Used to Mean:

� An open standard (well... a W3C recommendation) that provides:

â A data format

â A data modeling language

� The use of XML-formatted data in an application (like a browser)

� A metalanguage for creating markup languages

� A set of associated recommendations and specifications

(link, style, query, language specification, transformation,APIs, etc.)

Page 6: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 2 MulberryTechnologies, Inc.

slide 2

XML Works Through Tags

Paired tags:

â Enclose data

â Identify/name the data

â Named component called an “element”

<message>Hello World!</message>

start tagmarksbeginning

the dataend tagmarksend

Page 7: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 3

slide 3

Elements Can Nest

paraparapara

body

to

from

sig

memo

to from body sig

para para para

memo

slide 4

XML Isn’t:

� A programming languageDoes not replace C++, Java, perl, Python, ...

� A user interface

� A presentation format

� A formatting or processing system

� A standard set of tags

� A recommended set of tags

Page 8: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 4 MulberryTechnologies, Inc.

XML Is a Data Format

slide 5

“Employee Record” Example

In XML we can separate data content from behavior/presentation.

<employee-record type="dog" empno="9"><name><first>Sarsparilla</first><last>Usdin</last></name><affiliation><title>Deputy in Charge of Chewables</title><company>Mulberry Technologies</company><location>Rockville, MD 20850</location><email-name>sassy</email-name></affiliation><height unit="in">36</height><weight unit="lb">70</weight></employee-record>

Page 9: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 5

slide 6

View This in a Browser

Convert into HTML (today). Or in an XML browser (tomorrow):

slide 7

A Familiar Print Application

MulberryTechnologies, Inc.

Sasparilla Usdin

17 West Jefferson StreetSuite 207Rockville, MD 20850

Phone: 301/315-9631Fax: 301/315-9634

[email protected]

Page 10: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 6 MulberryTechnologies, Inc.

slide 8

Same Data, Different Application

New Employee Announcements

Sasparilla Usdin has recently joined Mulberry Technologies, Inc.’s Rockville staff as Deputy in Charge of Chewables.

Welcome to the team, Sassy!

� XML elements rolled into “form letter”

� Something (perhaps employee-id) linked to photo

slide 9

Same Source: Load a Database

Key: 00095AUSEMPNO: 009001:USDIN002:Sasparilla008:36014:70020:Deputy in Charge of Chewables

Page 11: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 7

slide 10

Ultimate Purpose of XML: Encode Data Once

� Enable semantically complex search

� Produce many products from that markup

� Reuse data (in whole or part) many times

� Interchange data freely

� Enable machine–machine communication

� Let whole communities agree on data content

� Let data live a long time

Parts of an XML Application

slide 11

Logical Components of an XML Application

� XML document (tags and text)

� DTD or Schema (the model)

� Output Specifications (how looks/behaves)

� Transformations (from here to there)

Page 12: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 8 MulberryTechnologies, Inc.

slide 12

Component: XML Document

The tags (markup) and the text (content)

� Two types

â Well-formed

â Valid (has a model)

� Usually created

â using an XML editor (authoring)

â by a program from

• a database

• another type of XML file by transform

• conversion from another format (like Quark or Word)

slide 13

Component: DTD (Document Type Definition)

The modeling mechanism specified by the XML specification

� Models one type/class of information (a “document”)(reference book, bank transfer, journal article, memo, help-topic)

� Is a set of rules describing how documents of that type can bemarked up

� Is written in the formal syntax of XML

Page 13: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 9

slide 14

DTDs Express Rules

for example:

� article = metadata followed by article body, followed by optionalback matter

� paragraph = data characters and may include any of thefollowing: Person Names URLs, and/or Geographic Regions

� Reference book =Front Matter followed by Body followed by Back Matter

� Purchase Order =Order Header followed by List of Order Detail, followed byoptional Order Summary

slide 15

Why Use a DTD?

� DTD is a contract between producers and consumers(Both can validate to see if they got/sent what they expected)

� Formal specification of information types allows consistentdownstream processing

� Support for interoperable families of documents and documenttypes

â To ensure information conforms to model (validationmechanism)

â Parties don’t have to share software or applications

To share information, share the DTD

Page 14: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 10 MulberryTechnologies, Inc.

slide 16

DTDs today, Schemas tomorrow

� Provide data typing

� Model multi-element dependencies, attribute dependencies, realinheritance, or other complex data dependencies

� Perform real data modeling

� Enforce rules about the data content (as opposed to the tagstructure)

Component: Output Specification/Stylesheet

slide 17

XML Separates Content from Format/Behavior

How it looks (16 pt Helvetica Bold) or what it does (start a javascript)

� Is based on the tagging

� Is the same for every tag in the same context

â NOT one tag per one format

â Table title may differ from Figure title from Chapter title

Page 15: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 11

slide 18

Use an Output Specification to get There

(Frequently called “Stylesheet”)

� Says what XML data will look like or how to behave

â On screen or paper

â Or in other media (for example in audible output)

� Defines an appearance or rendition or behavior

â For each element

â In each of its contexts within a document

slide 19

Documents and Stylesheets

� One stylesheet, many documents

â Maintain consistency of format, “look and feel” acrossdocuments

â House style is easy to develop, maintain, apply

� One document, many stylesheets

â Different media types: print, on-line, etc.

â Different derivative documents: selections, summaries,indexes, catalogs....

Page 16: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 12 MulberryTechnologies, Inc.

Component: XML Transforms

slide 20

XML and XSL

BrowserPrint

Print

OtherDevice

Editing Application

Database

In-house Printing

XSLXSL

XSL

XSL

XSL

<XML/>

Page 17: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 13

slide 21

XSL For XML Transformation (XSLT)

Transforms from one set of tags directly into anotherTransform XML into

� HTML for browsers

� Other (XML) tag sets for further processing

� Plain text formats (e.g., loader files for databases)

� Non-XML tag sets

Where Does XML Fit in the PublishingProcess?

slide 22

XML Files

� Are “plain text” underneath

â Use any text editor or any word processor that can handleplain text

â Built on Unicode (represents all major scripts of the world)

� XML could be

â Native file format for a software package

â Something you “import” and “export”

Page 18: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 14 MulberryTechnologies, Inc.

slide 23

XML for Print and Web Publishing

� Many different outputs, one manageable source

â Many media/device types(Web, CD-ROM, handheld PDAs, voice-synthesis)

â Many styles of print/display

� Different hardware, software, OS for input, manipulation, display

� Publish on demand/Customized output

slide 24

XML Integrates Services on a Web Base

� Web-based “data-centric” integration

â Allows loose coupling of applications: robust and open

� Web becomes friendly to other media

â Information syndication, on-demand publication

â Web and print services complementary

â Information into and out of databases

� From existing data systems directly to the web

Page 19: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 15

slide 25

XML Between Application Layers

� In the “Three-tier” system model:

â Presentation/User interface layer

â Processing or “business logic” layer

â Storage or repository layer

� XML used in any of the three tiers, especially in the middle

� XSL is used for any processing

â Within the middle tier, and

â Between tiers

Page 20: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 16 MulberryTechnologies, Inc.

slide 26

A Typical “3-tiered” Model

XML+CSSbrowser

Print

OtherDevice

Editing Application

<XML/>

HTMLbrowser

XSL

XSL

XSL

<XML/><XML/> XSL

Object DB

<XML/>XSL

File system

<XML/>

Relational DB

<XML/>

Partnersystem

<XML/>XSL DOM

XSL

Processing Layer

Presentation Layer

Storage Layer

Page 21: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 17

slide 27

XML Tagged Data for Information Interchange

� By data aggregators (scientific and journal websites,semiconductor industry, aircraft)

� Through the life-cycle of a product (among divisions)

� Direct machine-to-machine transfer

� Between rival proprietary formats

� Inter-process communication (IPC)

� Among business partners (E-Business transactions B2B, B2C,EDI replacing EDIFACT or proprietary formats)

slide 28

XML Repositories and Databases

� XML is used inside:

â OO databases

â Relational databases

â Object-relational and hierarchical databases

� XML is used to communication among

â Databases and applications

â Databases and databases

� XML Repositories

â Store content for use and reuse

â Enable repurposing

Page 22: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 18 MulberryTechnologies, Inc.

slide 29

XML for Search and Retrieval

� Enhance documents with metadata

� Use grammar to increase relevance

limit to or exclude in named element(s)

� Use vocabulary to increase recall

Future of XML

XML Versus HTML

slide 30

XML Will Replace HTML for:

� Users running into limitations of HTML

â Retrieval

â Formatting

â Richness of encoding

� Users with complex data requirements (semi-conductors, airlines)

� Users with complex security requirements

� Users with SGML data

Page 23: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 19

slide 31

XML Won’t Replace HTML for:

� Huge amounts of legacy HTML

� Simple display-only pages

� Unstructured data

� Write-once web pages

XML and PDF

slide 32

NOT XML Versus PDF

� We will continue to need both

� XML source used to produce:

â PDF

â HTML for older browsers

â Search service proprietary format (or specific-DTD tagging)

â eBook format(s)

â CD-ROM internal format

Page 24: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 20 MulberryTechnologies, Inc.

slide 33

XML and PDF

� PDF will still be a popular pre-press/archival form

� XML systems will produce PDF as an output product (as they do now)

� Searchable PDF will make in-roads against XML but be droppedas lesser functionality is recognized.

� It will take pretty pages and snazzy behavior to wean people fromPDF, which is

â easy

â expensive

XML and Proprietary Formats

slide 34

Proprietary Formats

� Optimized for one use (and one set of tools)

� Faster, easier for intended use

� Controlled by vendors or other non-public process

� Often change with version of software or application

Page 25: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 21

slide 35

XML feeds Proprietary Formats

� XML will be transformed into proprietary formats for:

â Print and display

â Database loads and device drivers

â Existing processes and procedures

� Proprietary formats will be transformed into XML

â (Generally harder than XML to proprietary)

â Allows people to use known tools

â Captures existing content

slide 36

XML Tools

(XML got its start from SGML tools:

� XML is SGML minus some features

� XML tools have been available since XML announced)

New XML Tools announced every week

� Authoring applications (word processors, spread sheets, etc.)

� Document management and workflow applications

� Databases with XML smarts (or XML-based databases)

� XML-based search and retrieval

� XML formatting and display

Page 26: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 22 MulberryTechnologies, Inc.

slide 37

XML Ubiquitous Under the Covers

� All major software companies using XML internally

� XML the format of choice for process-to-process interchange

� XML will dominate e-commerce and e-business in a year or two

slide 38

XML Tag Sets will Proliferate

� Discipline based

â CML (Chemistry Markup Language)

â MathML (Math Markup Language)

� Subject or industry domain (financial services, health care, autoand aircraft maintenance, ...)

Groups of users creating shared XML applicationsliterally hundreds of efforts announced and more daily

� SVG (Scaleable Vector Graphics)

Where to Get More Information

slide 39

The Source for XML and Related Information

� Robin Cover’s SGML/XML Web Page:http://www.oasis-open.org/cover/sgml-xml.html

Page 27: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 23

slide 40

General XML Information

� W3C’s XML page: http://www.w3.org/XML/

� XML FAQ (Peter Flynn): http://www.ucc.ie/xml/

� XML.com: http://www.xml.com (industry coverage andtools)

� XMLinfo.org http://www.xmlinfo.org (covers tools anddevelopment)

� XSLinfo.org: http://www.xslinfo.org (covers XSLdevelopment and implementation issues)

Page 28: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 24 MulberryTechnologies, Inc.

slide 41

XML E-Business and EDI Information

� The OASIS and UN/CEFACT entry http://www.ebxml.org

� CommerceOne’s entry http://www.commerceone.com(Common Business Library etc.)

� The Microsoft entry http://www.BizTalk.org

� RosettaNet consortium http://www.rosettanet.org(vocabulary and process issues, industry coverage)

� XML.ORG http://www.xml.org

� Commerce XML (cXML) http://www.cxml.org

� Oracle’s XML materialhttp://www.oracle.com/xml/content.html

� IBM is heavily into this, too.http://www.developer.ibm.com [for Business RulesMarkup Language (BRML) and Trading Partner AgreementMarkup Language (tpaML)

And others too numerous to mention: CommerceNet’seCoFramework, SAIC’s Universal Commerce Language and Protocol(ULCP), XEDI.org, etc.

Page 29: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

MulberryTechnologies, Inc. Page 25

slide 42

Printed Books on Concepts

� SGML: the Billion-Dollar Secret, by Chet Ensign (Prentice-HallPTR, 1997)

â Manager level. Written about SGML (XML’s parentstandard), but almost entirely applicable: excellent on issuesof scalable system development.

� ABCD… SGML , by Liora Alschuler (Thompson ComputerPress, 1995)

â Written about SGML (XML’s parent standard), but change theword “SGML” to “XML” as you read it and it still applies.Talks about work process changes an XML system can bring.

� XML: A Manager’s Guide, by Kevin Dick (Addison-WesleyInformation Technology Series, 2000)

â Manager level. Solid view, but stays at 10,000 feet up.

� The XML Companion (2nd Edition), by Neil Bradley (Addison-Wesley, 2000)

â Very good basic technical introduction.

� Professional XML, by Richard Anderson, Mark Birbeck and tenmore authors. (Wrox Press Ltd.)

â Light technical level. Each author wrote an introduction andthen examples/case study for one technical topic. Introducesthe problems of XML and databases, the XML APIs DOMand SAX, server to server XML (XML-RPC, SOAP, etc.) andmore.

Page 30: What is XML and Why Should You Care - Mulberry …What is XML and Why Should You Care Page 4 Mulberry Technologies, Inc. XML Is a Data Format slide 5 “Employee Record” Example

What is XML and Why Should You Care

Page 26 MulberryTechnologies, Inc.

slide 43

Other Information Sources

� Markup Languages: Theory and Practice (a quarterly journal):http://mitpress.mit.edu/MLANG

� OASIS Home Page (vendor consortium): http://www.oasis-open.org

â XML.ORG: http://xml.org (document model repositoryand support materials)

â OASIS XML Conformance Subcommittee:http://www.oasis-open.org/committees/xmlconf-pub.html

� Graphic Communications Association: http://www.gca.org(sponsors conferences including Extreme Markup Languages,August 2000 and XML 2000, December 2000)

� XML.COM: http://www.xml.com

slide 44

Still More Information Sources

� Basic newsgroup: comp.text.xml (also some oncomp.text.sgml)

� Useful Lists

â XML-L: http:/listserv.heaanet.ie/xml-l.html(for newcomers)

â XML-Developer’s List:http://www.lists.ic.ac.uk/hypermail/xml-dev(heavy technical discussion)

â XSL-List: http://www.mulberrytech.com