
XML Unit 01


Rationale

Data interchange is essential to carry out business transactions. However, organizations store data in disparate formats, which makes the process of data interchange complex and time-consuming.

Extensible Markup Language (XML) is a standard, simple way of storing data in a format that can be exchanged across multiple systems in an enterprise. An insight into XML benefits students because it is a standard technology for describing and defining documents.

The course is applicable to students who want to create well-formed XML documents. It introduces students to the fundamentals of XML and enables them to use XML effectively as a markup language to develop Web applications.

Objectives

In this session, you will learn to:

• Identify the need for XML as a standard data interchange format

• Identify the structure of XML documents

Getting Started with XML

Traditionally, preprinted formats were used to exchange information between businesses.

The need for a more effective way of communicating and processing business data led to the emergence of Electronic Data Interchange (EDI).

EDI refers to the process of exchanging documents in a standard format between two computer systems.

EDI has the following limitations:

• Rigid transaction set

• Fixed business rules

• High costs

• Slow pace of standards evolution

Introducing XML

XML is a text-based markup language that enables storage of data in a structured format.

XML is a cross-platform, hardware- and software-independent markup language that enables structured data transfer between heterogeneous systems.

XML is used as a common data interchange format in a number of applications.

Let us understand the usage of XML with the help of the following diagram.

[Figure: XML as the common interchange format between data sources (SQL Server, DB2, Oracle, Access) and a Web application, Web services, a Windows application, and a mobile application.]

Web Architecture Using XML

In a traditional Web architecture, a client sends a request to the server in a predefined format and receives the response accordingly.

The advantage of using XML in the Web architecture is that the structure of the request can be obtained from the server at run time.

XML can encode non-relational data as well as relational data structures.

The following figure depicts the XML Web architecture.

Difference Between SGML, HTML, and XML

Standard Generalized Markup Language (SGML) allows documents to describe their grammar by specifying the tag set used in the document and the structural relationship that these tags represent.

Hypertext Markup Language (HTML) is used for data presentation.

XML is used for data description and definition.

SGML HTML XML

Extensibility

Structure

Validation

Browser Dependency

Cost/Benefit

Low High Low

Poor Good Medium

Yes Yes Yes

Yes No Yes

Yes No Yes

Advantages of XML

Some of the advantages of XML are:

• Domain-specific vocabulary

• Data interchange

• Smart searches

• Granular updates

• User-selected view of data

• Message transformation

Domain-specific vocabulary

In HTML, only the predefined tags can be used.

In XML, you can create new tags based on the requirements of the application.

Various languages such as MathML and WML have been derived from XML.
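To make the idea of a domain-specific vocabulary concrete, here is a minimal sketch that uses Python's standard xml.etree.ElementTree module. The tag names (CATALOGUE, BOOK, TITLE, AUTHOR), the ISBN attribute, and the values are hypothetical; XML predefines none of them, which is the point.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary for a library catalogue. None of these tag
# names is predefined by XML; the application chooses them.
catalogue = ET.Element("CATALOGUE")
book = ET.SubElement(catalogue, "BOOK", {"ISBN": "0-00-000000-0"})
ET.SubElement(book, "TITLE").text = "Learning XML"
ET.SubElement(book, "AUTHOR").text = "A. Writer"

# Serialize the in-memory tree to XML text.
xml_text = ET.tostring(catalogue, encoding="unicode")
print(xml_text)
```

Any application that agrees on the same tag names can read the result back with an ordinary XML parser.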

Data interchange

XML produces files that are unambiguous, easy to generate, and easy to read.

XML provides a structure to store data in textual format, which can then be used as a standard format or protocol for data interchange.

Smart searches

The flexibility to create user-defined tags in XML enables the creation of smart search engines.

You can choose whether to search on text or on a tag, which enables the browser to perform a focused search and return precise information that matches the search query.
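The difference between a text search and a tag-based search can be sketched as follows. The catalogue, its tag names, and the two helper functions are hypothetical and serve only to show why tagging narrows a query.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue: the word "Python" appears once in a title
# and once in an author's name.
DOC = """<CATALOGUE>
  <BOOK><TITLE>Python Basics</TITLE><AUTHOR>Jane Doe</AUTHOR></BOOK>
  <BOOK><TITLE>Snakes of Asia</TITLE><AUTHOR>Monty Python</AUTHOR></BOOK>
</CATALOGUE>"""

root = ET.fromstring(DOC)

def search_text(root, word):
    """Plain text search: match the word anywhere inside a BOOK."""
    return [book.findtext("TITLE") for book in root.iter("BOOK")
            if word in "".join(book.itertext())]

def search_tag(root, tag, word):
    """Tag-based search: match the word only inside the named element."""
    return [book.findtext("TITLE") for book in root.iter("BOOK")
            if word in (book.findtext(tag) or "")]

text_hits = search_text(root, "Python")             # both books match
author_hits = search_tag(root, "AUTHOR", "Python")  # only the second book
print(text_hits, author_hits)
```

The text search returns both books, while the search restricted to AUTHOR returns only the book written by someone named Python.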

Granular updates

Document updates in HTML are slow as the entire document needs to be refreshed from the server.

Document updates in XML are faster as only the changed content needs to be downloaded.

User-selected view of data

In HTML, you need to create separate pages to display the same information in different formats. XML concentrates on the data and not on its presentation.

HTML does not allow conditional formatting of a document, whereas in XML conditional formatting is possible.
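As a rough sketch of how one XML document can feed several views, the following Python code renders the same data as plain text lines and as an HTML list, and applies a simple conditional format. The second STORE record (S102), the low_stock threshold, and both function names are hypothetical; in practice a stylesheet language would usually do this job, and Python stands in for it here.

```python
import xml.etree.ElementTree as ET

# Same data, two presentations. The S102 record is made up.
DOC = """<STOREDATA>
  <STORE STOREID="S101"><PRODUCTNAME>Toys</PRODUCTNAME><QUANTITY>100</QUANTITY></STORE>
  <STORE STOREID="S102"><PRODUCTNAME>Books</PRODUCTNAME><QUANTITY>5</QUANTITY></STORE>
</STOREDATA>"""

root = ET.fromstring(DOC)

def as_text_lines(root):
    """View 1: one plain text line per store."""
    return [f"{s.get('STOREID')}: {s.findtext('PRODUCTNAME')} x {s.findtext('QUANTITY')}"
            for s in root.iter("STORE")]

def as_html_list(root, low_stock=10):
    """View 2: an HTML list; products below low_stock are shown in bold."""
    items = []
    for s in root.iter("STORE"):
        name = s.findtext("PRODUCTNAME")
        qty = int(s.findtext("QUANTITY"))
        label = f"<b>{name}</b>" if qty < low_stock else name  # conditional formatting
        items.append(f"<li>{label} ({qty})</li>")
    return "<ul>" + "".join(items) + "</ul>"

text_view = as_text_lines(root)
html_view = as_html_list(root)
print(text_view)
print(html_view)
```

The XML itself never changes; only the code that presents it does.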

Message transformation

In XML, a message can be stored in the form of a document, object data, or data from a database.

XML design provides flexibility while storing data as it does not impose any restriction on the field size and the order in which the data is stored.

Future of XML

The future uses of XML can be summarized as:

• XML will be widely used in e-commerce.

• XML will have a huge core market in the form of Business to Business (B2B).

• XML will be used for mobile devices due to its ability to easily convert into the appropriate format for any device.

• XML will be used to solve communication problems in EDI and Enterprise Application Integration (EAI) as it provides interoperability between disparate applications.

Introducing W3C

The World Wide Web Consortium (W3C) is responsible for the development of Web specifications that describe communication protocols and technologies for the Web.

Because XML allows a great deal of customization, W3C has laid down these rules that need to be followed by all XML vendors:

• XML must be directly usable over the Internet.

• XML must support a wide variety of applications.

• XML must be compatible with SGML.

• XML should have an absolute minimum number of optional features, ideally zero.

• XML documents must be human legible and clear.

• XML design must be formal and concise.

• XML documents must adhere to a set of constraints called full normalization.

Identifying the Structure of XML Documents

An XML application is considered well designed if it is robust and scalable.

To design a robust and scalable XML application, the following steps need to be performed:

1. Create an information model.

2. Identify the required components of the XML document.

3. Create the XML document.

Information Modeling

An information model is a description of the information used in an organization.

Information modeling helps identify:

• Objects involved in an application

• Properties of the objects

• Relationships among objects

XML provides the following additional capabilities to information modeling:

• Heterogeneity: Each record can contain different data fields.

• Extensibility: New data types can be added whenever required.

• Flexibility: Data fields can vary in size and configuration between instances.
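Heterogeneous records can be illustrated with a short sketch. The PRODUCTS document and its field names are hypothetical; the point is only that each record carries its own set of fields and a standard parser still reads all of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: each PRODUCT carries a different set of fields,
# and the field values differ in length.
DOC = """<PRODUCTS>
  <PRODUCT><NAME>Robot</NAME><PRICE>20</PRICE></PRODUCT>
  <PRODUCT><NAME>Atlas</NAME><PRICE>35</PRICE><PAGES>400</PAGES></PRODUCT>
  <PRODUCT><NAME>Puzzle</NAME></PRODUCT>
</PRODUCTS>"""

root = ET.fromstring(DOC)

# Turn each record into a dictionary of whatever fields it contains.
records = [{child.tag: child.text for child in product}
           for product in root.iter("PRODUCT")]
field_sets = [sorted(record) for record in records]
print(field_sets)
```

No schema change was needed to add PAGES to one record or to omit PRICE from another.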

Types of information models that can be created for an XML application are:

• Static model: Helps define all the objects in an application and the relationships among them.

• Dynamic model: Helps to determine the information flow of an application in the form of messages.

Components of an XML Document

The various components of an XML document used for representing data in a hierarchical order are:

• Processing Instruction (PI)

• Tags

• Elements

• Content

• Attributes

• Entities

• Comments

<?xml version="1.0" encoding="UTF-8"?>
<STOREDATA>
  <!--STOREDATA is the root element-->
  <STORE STOREID="S101">
    <PRODUCTNAME>Toys</PRODUCTNAME>
    <QUANTITY>100</QUANTITY>
    <DISPLAY>The price of this toy is &lt; 200 </DISPLAY>
  </STORE>
</STOREDATA>
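As a rough illustration, the sample document can be loaded with Python's standard xml.etree.ElementTree module to see how a parser exposes its components. The sketch assumes the attribute values are written with straight quotation marks, which XML requires; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<STOREDATA>
  <!--STOREDATA is the root element-->
  <STORE STOREID="S101">
    <PRODUCTNAME>Toys</PRODUCTNAME>
    <QUANTITY>100</QUANTITY>
    <DISPLAY>The price of this toy is &lt; 200 </DISPLAY>
  </STORE>
</STOREDATA>"""

# Parse from bytes so the encoding named in the first line is honoured.
root = ET.fromstring(SAMPLE.encode("utf-8"))

store = root.find("STORE")                   # child element of the root
store_id = store.get("STOREID")              # attribute value
child_tags = [child.tag for child in store]  # elements nested in STORE
display_text = store.findtext("DISPLAY")     # content, with &lt; resolved

print(root.tag, store_id, child_tags)
print(display_text)
```

The default parser discards the comment, so only elements, attributes, and content appear in the resulting tree.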

Processing Instruction (PI)

Provides information on how the XML file should be processed. In the sample document, the first line, <?xml version="1.0" encoding="UTF-8"?>, plays this role. The XML specification calls this particular line the XML declaration; it shares the <?...?> syntax of processing instructions.

Tags

Tags are a means of identifying data. Each pair consists of a start tag and an end tag, for example <QUANTITY> and </QUANTITY> in the sample document.

Root Element

Contains all other elements in the document. In the sample document, STOREDATA is the root element.

Comments

Comments are statements used to explain the XML code, for example <!--STOREDATA is the root element--> in the sample document.

Child Elements

Elements are the basic units used to identify and describe data in XML. In the sample document, STORE is a child element of STOREDATA, and PRODUCTNAME, QUANTITY, and DISPLAY are child elements of STORE.

Attributes

Provide additional information about the elements for which they are declared. In the sample document, STOREID="S101" is an attribute of the STORE element.

Content

Refers to the information represented by the elements of an XML document. An element can contain:

• Character or data content, such as the text Toys in PRODUCTNAME

• Element content, such as the child elements of STORE

• Combination or mixed content, in which text and child elements appear together

Entities

An entity is a set of information that can be used by specifying a single name. In the sample document, &lt; is a reference to a predefined entity that stands for the less-than sign (<).
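The predefined entities can be seen at work in a short sketch that uses the escape and unescape helpers from Python's standard xml.sax.saxutils module. The sample sentence and the DISPLAY wrapper are assumptions chosen to echo the store example.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

raw = "The price of this toy is < 200 & falling"

# escape() replaces the characters that are special in XML content
# with references to the predefined entities (&amp;, &lt;, &gt;).
escaped = escape(raw)

# Once escaped, the text can be placed inside an element safely.
element_text = f"<DISPLAY>{escaped}</DISPLAY>"
parsed = ET.fromstring(element_text).text  # the parser resolves the entities

print(escaped)
print(parsed)
```

Without the escaping step, the bare < and & characters would make the document ill-formed.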

Identifying the Rules for Creating XML Documents

The rules that govern the creation of a well-formed XML document are:

• Every start tag must have an end tag.

• Empty tags must be closed using a forward slash (/).

• All attribute values must be given in quotation marks (single or double).

• Tags must have proper nesting.

• XML tags are case sensitive.
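The well-formedness rules can be checked mechanically, because a conforming parser rejects any document that breaks them. The following sketch relies on Python's standard xml.etree.ElementTree parser; the helper name is_well_formed and the test fragments are hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One fragment per rule; the keys describe what each fragment shows.
fragments = {
    "valid, with empty tag": '<STORE STOREID="S101"><EMPTY/></STORE>',
    "single-quoted attribute": "<STORE STOREID='S101'/>",
    "missing end tag": "<STORE><QUANTITY>100</STORE>",
    "unquoted attribute": "<STORE STOREID=S101></STORE>",
    "improper nesting": "<A><B></A></B>",
    "case mismatch": "<Store></store>",
}

results = {label: is_well_formed(xml) for label, xml in fragments.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'not well-formed'}")
```

Only the first two fragments pass; each of the others violates exactly one rule.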

Demo: Creating an XML Document

Problem Statement:

CyberShoppe, Inc. sells toys and books in the United States. It has three branches in different parts of the country. Currently, the three branches maintain data on their local computer systems. The IT manager at CyberShoppe has identified that a centralized data repository on the products sold through its e-commerce site is required. The data from all branches must be collated and housed in a centralized location. This data must be made available to the Accounts and Sales sections at the individual branches, regardless of the hardware and software platforms being used at the branches.

In addition, the sales personnel require access to the data using palmtops and cellular phones. The product details of CyberShoppe consist of the product name, a brief description, the price, and the available quantity on hand. A product ID uniquely identifies each product.
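One possible shape for the CyberShoppe product document is sketched below. It is an assumption rather than the course's official solution: the element names (PRODUCTDATA, PRODUCT, PRODUCTNAME, DESCRIPTION, PRICE, QUANTITY), the PRODUCTID attribute, the function name, and the two sample products are all invented for illustration. Only the list of fields comes from the problem statement.

```python
import xml.etree.ElementTree as ET

# Made-up product records covering the fields named in the problem
# statement: ID, name, brief description, price, quantity on hand.
PRODUCTS = [
    {"id": "P001", "name": "Toy Robot", "description": "Battery-operated robot",
     "price": "20", "quantity": "50"},
    {"id": "P002", "name": "World Atlas", "description": "Illustrated atlas",
     "price": "35", "quantity": "12"},
]

def build_product_document(products):
    """Build the XML text for a list of product dictionaries."""
    root = ET.Element("PRODUCTDATA")
    for p in products:
        # The product ID identifies the product, so it is stored as an attribute.
        product = ET.SubElement(root, "PRODUCT", {"PRODUCTID": p["id"]})
        ET.SubElement(product, "PRODUCTNAME").text = p["name"]
        ET.SubElement(product, "DESCRIPTION").text = p["description"]
        ET.SubElement(product, "PRICE").text = p["price"]
        ET.SubElement(product, "QUANTITY").text = p["quantity"]
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

document = build_product_document(PRODUCTS)
print(document)
```

Because the output is plain text, any branch can read it regardless of the hardware or software platform it runs on.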

Practice Questions

Consider the following statement:

<?xml version="1.0" encoding="UTF-8"?>

Which component of an XML document does the preceding statement represent?

a. Element

b. Content

c. Entity

d. Processing Instruction

Answer: d. Processing Instruction

Bob is the EDP head of an organization that manufactures and sells hardware parts. The organization has a presence in all the major cities of the United States. At present, all branch offices maintain their data locally. Bob wants to centralize the repository of data in his organization. Data from all the branch offices needs to be collated and stored in a centralized location. Data pertaining to a branch should be available only to that branch office. However, the head office should be able to access all the data.

In addition, Bob wants the sales personnel to be able to access sales data from mobile devices, such as palmtops and mobile phones. This sales information should include a brief description of the product, the price, and the available inventory. Using which of the following markup languages can Bob achieve the preceding goals?

a. HTML

b. XML

c. SGML

d. EDI

Answer: b. XML

Which of the following statements is NOT true about information modeling?

a. Information modeling is used to understand the structure and meaning of information that will be stored in XML documents.

b. Information modeling helps you identify the objects involved in an application, the properties of the objects, and the relationships among them.

c. In an information model, each record can contain different data fields.

d. An information model imposes restrictions on data.

Answer: d. An information model imposes restrictions on data.

Which one of the following statements is true about XML?

a. XML is a text-based markup language that provides predefined tags to store data.

b. XML is a platform-neutral data interchange format.

c. XML requires VAN for data interchange.

d. XML allows you to specify data formatting instructions.

Answer: b. XML is a platform-neutral data interchange format.

Which one of the following is a disadvantage of traditional EDI?

a. It provides fixed transaction sets.

b. It increases the communication lag time between an agency and a customer.

c. It increases data entry errors.

d. It increases the time taken to process orders.

Answer: a. It provides fixed transaction sets.

Summary

In this session, you learned that:

• EDI refers to the process of exchanging documents in a standard format between two computer systems.

• XML is a text-based markup language that enables you to store data in a structured format by using meaningful tags.

• Using XML in Web architecture enables loose coupling between the server application and the client application.

• XML has the following advantages: domain-specific vocabulary, data interchange, smart searches, granular updates, user-selected view of data, and message transformation.

• In the future, XML will be widely used in e-commerce, B2B services, mobile services, and EDI and EAI.

• XML was defined by W3C to ensure that structured data is uniform and independent of vendors and applications.

• In XML, an information model is used to understand the structure and meaning of information that will be stored in XML documents.

• You can create a static model, a dynamic model, or a combination of both for an XML application.

• A static information model helps you define all the objects in an application and the relationships among them.

• In a dynamic model, data flow diagrams and process diagrams are used to determine the flow of information.

• An XML document consists of a Processing Instruction (PI), tags, elements, content, attributes, entities, and comments.