XML in Brief
Asuman Dogac
Middle East Technical University, Ankara
[email protected]
Grenoble Ecole de Management, MEDFORIST Workshop
XML
Extensible Markup Language has become the “universal” standard for representing data
XML started out as a standard data exchange format for the Web
Yet, it has quickly become the fundamental instrument in the development of Web-based online information services and electronic commerce applications
Almost all recent electronic commerce standards are based on XML
XML
A subset of SGML (Standard Generalized Markup Language), defined by the World Wide Web Consortium (W3C)
HTML enables a universal method of displaying data; XML provides a universal method of describing data
Provides the ability to describe data in an open, text-based format and deliver it over standard HTTP
XML
At present, many applications on the Web use XML for hosting large amounts of structured and semi-structured data
Representation of information in XML documents has been increasing at an astonishing pace
According to Meta Group, by 2003, about 65% of corporate data will be stored in an XML format
XML: The Unifying Technology
[Figure labels: XML Messaging; Internet; Electronic Data Interchange; Large Enterprise; Mail, Phone, Fax, Email; Small, Medium Enterprise]
Maturity of Web Infrastructure
[Figure labels: Technology: TCP/IP; FTP, E-mail, Gopher; HTML; XML. Standard: Web Pages; Web Services. Innovation: Connectivity; Presentation; Programmability. Browse the Web; Program the Web]
XML helps address the challenge
The data is self-describing
e.g. the meaning of the data is included: identifiers surround every bit of data, indicating what it means
Far more flexible method of representing transmitted information
e.g. batched orders sent together can have different fields and format without breaking apps on each end
Open, standard technologies for moving, processing and validating the data
e.g. the XML parser can automatically parse, validate, and feed the information to an application, instead of every application having to include this functionality
XML: An Example
Data stream in a typical interface…
“Electronic Commerce”, “100”, “Turban”, “25”, “Addison-Wesley”
Same data stream in XML…
<BOOK>
  <TITLE> Electronic Commerce </TITLE>
  <QUANTITY> 100 </QUANTITY>
  <AUTHOR>Turban</AUTHOR>
  <PRICE>25</PRICE>
  <PUBLISHER>Addison-Wesley</PUBLISHER>
</BOOK>
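The contrast between the two streams can be sketched in Python with the standard library's csv and xml.etree.ElementTree modules. This is only an illustration: the positional layout of the flat record (quantity as the second field) is an assumption the receiver would have to know in advance, whereas the XML version names each field.

```python
import csv
import io
import xml.etree.ElementTree as ET

flat = '"Electronic Commerce", "100", "Turban", "25", "Addison-Wesley"'
xml_text = (
    "<BOOK><TITLE> Electronic Commerce </TITLE><QUANTITY> 100 </QUANTITY>"
    "<AUTHOR>Turban</AUTHOR><PRICE>25</PRICE>"
    "<PUBLISHER>Addison-Wesley</PUBLISHER></BOOK>"
)

# Flat stream: the meaning of each value depends only on its position.
fields = next(csv.reader(io.StringIO(flat), skipinitialspace=True))
quantity_flat = int(fields[1])  # assumption: quantity is the second field

# XML stream: each value is looked up by its tag name.
book = ET.fromstring(xml_text)
quantity_xml = int(book.findtext("QUANTITY").strip())
record = {child.tag: child.text.strip() for child in book}

print(quantity_flat, quantity_xml, record["PUBLISHER"])
```

If the sender later reorders or adds fields, the lookup by tag name keeps working, while the positional lookup silently reads the wrong value.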
Markup (or Tagging)
XML uses textual markup to define data
An XML document consists of a collection of tagged elements, each containing a start tag (<tagname>), an end tag (</tagname>), and the content between the two tags
Example: <PONumber> 1234ABCD </PONumber>
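As a small illustration, the three parts of the element above can be inspected with Python's standard xml.etree.ElementTree module. The variable names are chosen here for illustration only.

```python
import xml.etree.ElementTree as ET

element = ET.fromstring("<PONumber> 1234ABCD </PONumber>")

tag_name = element.tag          # the name shared by the start tag and the end tag
content = element.text.strip()  # the text between the two tags, without padding

# Serializing the element reproduces start tag, content, and end tag.
round_trip = ET.tostring(element, encoding="unicode")

print(tag_name, content, round_trip)
```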
Tagging Data in XML
<PONumber> 1234ABCD </PONumber>
Considering the content only, it is not possible to understand what 1234ABCD stands for
The tag name PONumber intuitively tells that the content is a purchase order number
Similarly, an XML element might be tagged as name, gender, birth date, salary, price,…
XML is extensible in the sense that users can create their own vocabularies; the tag names are neither predefined nor limited
Adding Structure to Data
Tagged elements may be nested to any depth to provide structured data, or may be repeated to represent a list of values
A well-formed XML document contains exactly one root element, which constitutes the top level of nesting
In other words, a well-formed XML document represents a tree of elements
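The tree view can be made concrete with a short Python sketch that walks a parsed document. The order document and its tag names are made up for illustration; only xml.etree.ElementTree from the standard library is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one nested element and one repeated element.
doc = """
<order>
  <customer><name>Example Ltd</name></customer>
  <item>pen</item>
  <item>paper</item>
  <item>ink</item>
</order>
"""

def walk(element, depth=0):
    """Yield (depth, tag) pairs in document order."""
    yield depth, element.tag
    for child in element:
        yield from walk(child, depth + 1)

root = ET.fromstring(doc)
outline = list(walk(root))
items = [item.text for item in root.findall("item")]  # the repeated element as a list

for depth, tag in outline:
    print("  " * depth + tag)
```

Nesting shows up as increasing depth, and the repeated item elements come back as an ordinary list of values.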
Giving Meaning and Structure to Data
<PurchaseOrderRequest>
  <PONumber>1234ABCD </PONumber>
  <PurchaseOrderDate>20030601</PurchaseOrderDate>
  <LineItem>
    <ItemEAN_Identification no="9344" />
    <QuantityOrdered> 16 </QuantityOrdered>
    <UnitPrice> 95 </UnitPrice>
  </LineItem>
  <LineItem> … </LineItem>
</PurchaseOrderRequest>
[Figure callouts: Start Tag, An Element, An Attribute, Another Element, End Tag, Data]
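A hedged Python sketch of how an application might read such a purchase order, and of how a parser reacts to markup that is not well-formed. Attribute values must be quoted and every start tag needs a matching end tag. The helper is_well_formed and the computed total are illustrative assumptions, not part of the original example.

```python
import xml.etree.ElementTree as ET

order_xml = """
<PurchaseOrderRequest>
  <PONumber>1234ABCD</PONumber>
  <PurchaseOrderDate>20030601</PurchaseOrderDate>
  <LineItem>
    <ItemEAN_Identification no="9344" />
    <QuantityOrdered>16</QuantityOrdered>
    <UnitPrice>95</UnitPrice>
  </LineItem>
</PurchaseOrderRequest>
"""

def is_well_formed(text):
    """Return True if the parser accepts the text (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

order = ET.fromstring(order_xml)
line = order.find("LineItem")
ean = line.find("ItemEAN_Identification").get("no")  # attribute value
total = int(line.findtext("QuantityOrdered")) * int(line.findtext("UnitPrice"))

unquoted_attribute = "<ItemEAN_Identification no=9344 />"
missing_end_tag = "<UnitPrice> 95 <UnitPrice>"

print(ean, total)
print(is_well_formed(unquoted_attribute), is_well_formed(missing_end_tag))
```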
Giving Structure to Data
PurchaseOrderRequest
  PONumber
  PurchaseOrderDate
  LineItem
    ItemEAN_Identification
    QuantityOrdered
    UnitPrice
Document Type Definition (DTD)
The principal purpose of the DTD is to declare the hierarchy of document elements
A document type definition defines:
The names of the elements
The content model of each element
How often and in which order elements may appear
Whether end-tags can be omitted (an SGML option; XML itself always requires them)
The possible presence of attributes and their default values
The names of the entities
An Example DTD
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE PurchaseOrderRequest [
<!ELEMENT PurchaseOrderRequest (PONumber, PurchaseOrderDate, LineItem+)>
<!ELEMENT LineItem (ItemEAN_Identification, QuantityOrdered, UnitPrice)>
<!ELEMENT ItemEAN_Identification (#PCDATA)>
<!ELEMENT QuantityOrdered (#PCDATA)>
<!ELEMENT UnitPrice (#PCDATA)>
<!-- This is a comment line to state that the other elements are skipped -->
...
]>
DTDs
A DTD specifies the structure of an XML element by specifying the names of its sub-elements and attributes
Sub-element structure is specified using the operators
* : set with zero or more elements
+ : set with one or more elements
? : optional
| : or
All values are assumed to be string values, unless the type is ANY in which case the value can be an arbitrary XML fragment
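To show what a content model expresses, here is a rough Python sketch that checks child-element order and repetition with regular expressions. It is not DTD validation: the expat-based parsers in Python's standard library are non-validating, and the CONTENT_MODELS table and conforms function are hypothetical stand-ins written for this illustration. The DTD operators map directly onto the regex operators *, +, ? and |.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical table: element name -> regex over its child names, each name
# followed by one space. It mirrors content models such as
# (PONumber, PurchaseOrderDate, LineItem+).
CONTENT_MODELS = {
    "PurchaseOrderRequest": r"PONumber PurchaseOrderDate (LineItem )+",
    "LineItem": r"ItemEAN_Identification QuantityOrdered UnitPrice ",
}

def conforms(element):
    """Check an element and its descendants against CONTENT_MODELS."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        # Elements without an entry are treated like (#PCDATA): no children.
        return len(element) == 0
    children = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, children) is None:
        return False
    return all(conforms(child) for child in element)

good = ET.fromstring(
    "<PurchaseOrderRequest><PONumber>1234ABCD</PONumber>"
    "<PurchaseOrderDate>20030601</PurchaseOrderDate>"
    "<LineItem><ItemEAN_Identification/><QuantityOrdered>16</QuantityOrdered>"
    "<UnitPrice>95</UnitPrice></LineItem></PurchaseOrderRequest>"
)
no_line_items = ET.fromstring(
    "<PurchaseOrderRequest><PONumber>1234ABCD</PONumber>"
    "<PurchaseOrderDate>20030601</PurchaseOrderDate></PurchaseOrderRequest>"
)

print(conforms(good), conforms(no_line_items))
```

The second document is well-formed XML but breaks the LineItem+ rule, which is the kind of error a validating parser reports.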
XML Namespaces
Namespaces are a simple and straightforward way to distinguish names used in XML documents, no matter where they come from
The only reason namespaces exist is to give elements and attributes programmer-friendly names that are unique across the whole Internet
Example
<h:html xmlns:xdc="http://www.xml.com/books"
        xmlns:h="http://www.w3.org/HTML/1998/html4">
  <h:head><h:title>Book Review</h:title></h:head>
  <h:body>
    <xdc:bookreview>
      <xdc:title>XML: A Primer</xdc:title>
      <h:table>
        <h:tr align="center">
          <h:td>Author</h:td><h:td>Price</h:td>
          <h:td>Pages</h:td><h:td>Date</h:td>
        </h:tr>
        <h:tr align="left">
          <h:td><xdc:author>Simon St. Laurent</xdc:author></h:td>
          <h:td><xdc:price>31.98</xdc:price></h:td>
          <h:td><xdc:pages>352</xdc:pages></h:td>
          <h:td><xdc:date>1998/01</xdc:date></h:td>
        </h:tr>
      </h:table>
    </xdc:bookreview>
  </h:body>
</h:html>
XML Namespaces
The prefixes are linked to the full names by the attributes on the top element whose names begin with xmlns:
The prefixes are just shorthand placeholders for the full names
Those full names are URIs, typically written as URLs, i.e. Web addresses
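A brief Python sketch of how a parser treats prefixes, using a shortened version of the book review document. xml.etree.ElementTree replaces each prefix with its full name, so the reading application is free to pick its own prefixes; the "html" and "book" prefixes below are arbitrary choices made for this illustration.

```python
import xml.etree.ElementTree as ET

doc = """
<h:html xmlns:xdc="http://www.xml.com/books"
        xmlns:h="http://www.w3.org/HTML/1998/html4">
  <h:head><h:title>Book Review</h:title></h:head>
  <h:body>
    <xdc:bookreview>
      <xdc:title>XML: A Primer</xdc:title>
    </xdc:bookreview>
  </h:body>
</h:html>
"""

root = ET.fromstring(doc)

# The parser stores the full name: {namespace}localname.
root_tag = root.tag

# Prefixes used for lookup need not match the ones in the document.
ns = {
    "html": "http://www.w3.org/HTML/1998/html4",
    "book": "http://www.xml.com/books",
}
page_title = root.findtext("html:head/html:title", namespaces=ns)
book_title = root.findtext(".//book:title", namespaces=ns)

print(root_tag, page_title, book_title)
```

The two title elements never clash, because their full names differ even though their local names are the same.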
Extensibility in XML
Anyone can invent new tags and attach a meaning to those tags
But if all users create their own XML definitions for describing their data, it is not possible to achieve interoperability
For example, one may prefer to use the tag name “POR”, while another prefers using the tag name “PurchaseOrderReq”
In other words, a tagged document is not very useful without some kind of agreement on the tags among inter-operating applications
Many Efforts for Standardized Tags…
HL7 for healthcare
RosettaNet for supply chain integration in the Information Technology and Electronic Components domains
ebXML for eBusiness
Common Business Library (CBL) for electronic catalogs, purchase orders, etc.
…
XML Parsers
A parser takes an XML document and makes its structure and content available to an application through an API
There are two main Application Programming Interfaces (APIs) through which parsers expose documents to applications: the tree-based Document Object Model (DOM) and the event-based Simple API for XML (SAX)
Today, many parsers are both DOM and SAX compliant
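For the SAX side, a minimal Python sketch with the standard xml.sax module: the parser calls back into a handler as it meets each start tag, text run, and end tag, and never builds a tree. The TagCounter class and the sample document are assumptions made for illustration.

```python
import xml.sax

class TagCounter(xml.sax.ContentHandler):
    """Counts start-tag events and collects the text of PONumber elements."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.po_numbers = []
        self._in_po_number = False
        self._buffer = []

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1
        if name == "PONumber":
            self._in_po_number = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is buffered.
        if self._in_po_number:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "PONumber":
            self.po_numbers.append("".join(self._buffer).strip())
            self._in_po_number = False

doc = (b"<PurchaseOrderRequest><PONumber> 1234ABCD </PONumber>"
       b"<LineItem/><LineItem/></PurchaseOrderRequest>")

handler = TagCounter()
xml.sax.parseString(doc, handler)

print(handler.counts, handler.po_numbers)
```

Because nothing is kept except what the handler chooses to store, this style suits large documents that are read once from start to end.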
XML DOM Parser
[Figure labels: Application Code; Initialize Parser; Begin parsing; XML Parser; XML Document; Parsing complete; In-memory DOM: Perform Processing]
A parser validates and makes the data contained in an XML document available to the application
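The DOM flow can be sketched in Python with xml.dom.minidom from the standard library: the whole document is parsed into memory first, and the application then processes the resulting tree. Note that minidom only checks well-formedness and does not validate against a DTD. The sample document and the added Note element are illustrative assumptions.

```python
from xml.dom import minidom

doc = (
    "<PurchaseOrderRequest><PONumber>1234ABCD</PONumber>"
    "<LineItem><QuantityOrdered>16</QuantityOrdered></LineItem>"
    "<LineItem><QuantityOrdered>4</QuantityOrdered></LineItem>"
    "</PurchaseOrderRequest>"
)

# Begin parsing; when this call returns, the complete tree is in memory.
dom = minidom.parseString(doc)
root = dom.documentElement

# Perform processing on the in-memory DOM.
quantities = [
    int(node.firstChild.data)
    for node in root.getElementsByTagName("QuantityOrdered")
]

# Unlike an event stream, the tree can also be modified in place.
note = dom.createElement("Note")
note.appendChild(dom.createTextNode("checked"))
root.appendChild(note)

print(root.tagName, quantities, root.lastChild.tagName)
```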
XSLT Processor
[Figure labels: XML Document; Parser; XSLT Processor; XSL Style Sheet 1; XSL Style Sheet 2; Output from Style Sheet 1; Output from Style Sheet 2]
• Converts an XML document to another form
• An XSL style sheet is a set of transformation instructions for converting a source XML document to a target document
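The idea of one source document yielding several target documents can be illustrated in Python. This sketch is not an XSLT processor (the standard library does not include one): the two hand-written functions, to_table and to_list, are hypothetical stand-ins for two style sheets, and the sample order is made up.

```python
import xml.etree.ElementTree as ET

source = ET.fromstring(
    "<PurchaseOrderRequest>"
    "<LineItem><QuantityOrdered>16</QuantityOrdered><UnitPrice>95</UnitPrice></LineItem>"
    "<LineItem><QuantityOrdered>4</QuantityOrdered><UnitPrice>10</UnitPrice></LineItem>"
    "</PurchaseOrderRequest>"
)

def to_table(order):
    """Stand-in for style sheet 1: one table row per LineItem."""
    table = ET.Element("table")
    for item in order.findall("LineItem"):
        row = ET.SubElement(table, "tr")
        for field in ("QuantityOrdered", "UnitPrice"):
            ET.SubElement(row, "td").text = item.findtext(field)
    return table

def to_list(order):
    """Stand-in for style sheet 2: one list entry per LineItem."""
    ul = ET.Element("ul")
    for item in order.findall("LineItem"):
        quantity = item.findtext("QuantityOrdered")
        price = item.findtext("UnitPrice")
        ET.SubElement(ul, "li").text = f"{quantity} x {price}"
    return ul

html_table = ET.tostring(to_table(source), encoding="unicode")
html_list = ET.tostring(to_list(source), encoding="unicode")

print(html_table)
print(html_list)
```

A real XSLT processor achieves the same separation declaratively: the source stays untouched and each style sheet decides the shape of its own output.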
[Figure: examples labeled table.xsl, bar.xsl, art.xsl]
Why XML?
EDI:
BGM+220+1234ABCD+9'
DTM+137:20030601:102'
LIN+1'
PIA+5+9344:EN+1078341ITEM:VP'
QTY+21:16:EA'
PRI+AAA:95'
LIN+2'
…
XML:
<PurchaseOrderRequest>
  <PONumber>1234ABCD </PONumber>
  <PurchaseOrderDate>20030601</PurchaseOrderDate>
  <FirstLineItem>
    <ItemEAN_Identification no="9344" />
    <QuantityOrdered> 16 </QuantityOrdered>
    <UnitPrice> 95 </UnitPrice>
  </FirstLineItem>
  <SecondLineItem>…
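To see why the EDI form is harder to work with, here is a rough Python sketch that splits the message into its parts. It assumes the default EDIFACT separators (' ends a segment, + separates data elements, : separates components) and ignores escape characters; parse_segments is a hypothetical helper, not a real EDI library.

```python
edi = (
    "BGM+220+1234ABCD+9'DTM+137:20030601:102'"
    "LIN+1'PIA+5+9344:EN+1078341ITEM:VP'QTY+21:16:EA'PRI+AAA:95'"
)

def parse_segments(message):
    """Split a message into (tag, elements) pairs, elements as component lists."""
    segments = []
    for raw in message.split("'"):
        raw = raw.strip()
        if not raw:
            continue
        tag, *elements = raw.split("+")
        segments.append((tag, [element.split(":") for element in elements]))
    return segments

segments = parse_segments(edi)
by_tag = {tag: elements for tag, elements in segments}  # fine here: tags are unique

# Every value is found by position, and the positions must be known in advance.
po_number = by_tag["BGM"][1][0]      # second data element of BGM
quantity = int(by_tag["QTY"][0][1])  # second component of QTY's first element
price = int(by_tag["PRI"][0][1])     # second component of PRI's first element

print(po_number, quantity, price)
```

The same values in the XML version are reached by name (PONumber, QuantityOrdered, UnitPrice), which is what makes the format readable without a separate segment directory.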
XML vs EDI
XML is an open human-readable, text format
EDI documents are typically in a compressed form that only machines can read
XML is designed to require one customised mapping per industry grouping, so most companies will be able to work to one format and use XML
EDI traditionally requires customised mapping of each new trading partner's document format
XML requires a reliable PC with an Internet connection
EDI typically requires dedicated servers that cost from USD 10,000 and up
XML vs EDI
XML documents are typically sent via the Internet - i.e. a relatively low-cost public network
EDI documents are typically sent via private and relatively expensive value-added networks (VANs)
Internet-based XML has low ongoing flat-rate costs, using existing Internet connections and relatively low-cost Web servers
EDI can involve high ongoing transaction-based costs for keeping up the connection to the EDI network and keeping the servers up and running
XML vs EDI
XML appears to have no upper limit in terms of numbers of users
EDI is estimated to be limited to 300,000 companies worldwide and about 20% of their suppliers because of operational costs and complexity
XML is being developed in a world of shared software development populated by many low-cost tools and open source projects.
EDI was traditionally built from the ground up in semi-isolation without being able to share resources with other programs