ITEC 2336 - Internet Application Development: XML

What is XML?

  • XML stands for EXtensible Markup Language
  • It is a markup language designed to describe data
  • Tags are not predefined: you define your own tags
  • It uses a Document Type Definition (DTD) or an XML Schema to describe the tag system used to mark up the data
  • XML + (DTD or Schema) is self-describing
  • XML is a W3C Recommendation
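
To make the "you define your own tags" point concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The recipe document and every tag name in it are invented for this illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: these tag names are made up for the example.
# XML itself does not predefine any of them.
RECIPE_XML = """<recipe>
  <title>Pancakes</title>
  <ingredient unit="cup">flour</ingredient>
  <ingredient unit="cup">milk</ingredient>
</recipe>"""

root = ET.fromstring(RECIPE_XML)

# The tags describe the data, so a program can look items up by name.
title = root.find("title").text
ingredients = [(item.get("unit"), item.text) for item in root.findall("ingredient")]

print(root.tag, title, ingredients)
```

The parser accepts the document without knowing anything about "recipe" or "ingredient"; a DTD or schema would only be needed to state which tags are allowed.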

The 10 Primary XML Design Goals

  1. XML must be easily usable over the Internet
  2. XML must support a wide variety of applications
  3. XML must be compatible with SGML
  4. It must be easy to write programs that process XML documents
  5. The number of optional features in XML must be kept small
  6. XML documents should be clear and easily understood
  7. The XML design should be prepared quickly
  8. The design of XML must be exact and concise
  9. XML documents must be easy to create
  10. Keeping an XML document size small is of minimal importance

XML Parsers

  • An XML processor (also called an XML parser) evaluates the document to make sure it conforms to all XML specifications for structure and syntax.
  • There are two categories of XML documents:
      • Well-formed
      • Valid

XML Parsers

  • Microsoft's parser is called MSXML and is built into IE.
      • The .dll file for MSXML can be downloaded and used with other applications
      • See http://support.microsoft.com/kb/269238 to read all about the parser versions (from 1.0 to 6.0)
  • Mozilla used the eXpat XML parser in Firefox
      • eXpat is an XML parser library written in C: http://expat.sourceforge.net/
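
Expat can also be tried without writing any C: Python's standard library module xml.parsers.expat is an interface to it. The sketch below is only an illustration of event-style parsing; the helper name element_names is hypothetical and the sample document is made up.

```python
import xml.parsers.expat


def element_names(xml_text):
    """Return element names in the order their start tags are encountered."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # Expat calls this handler once for every start tag it reads.
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    # The second argument marks this chunk as the final piece of input.
    parser.Parse(xml_text, True)
    return names


print(element_names("<note><to>Dave</to><from>Susan</from></note>"))
```

Expat reports events as it reads the input rather than building a tree, which is why the handler collects the names itself.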

Well-Formed and Valid XML Documents

  • An XML document is well-formed if it contains no syntax errors and fulfills all of the specifications for XML code as defined by the W3C
  • An XML document is valid if it is well-formed and also satisfies the rules laid out in the DTD or schema attached to the document
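
A well-formedness check can be sketched in a few lines of Python. This assumes the standard library's xml.etree.ElementTree, which reports syntax errors but does not validate against a DTD or schema, so it covers only the first of the two definitions above. The function name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser reads xml_text without a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Properly nested and closed: well-formed.
print(is_well_formed("<note><to>Dave</to></note>"))
# The <to> element is never closed: not well-formed.
print(is_well_formed("<note><to>Dave</note>"))
# Two top-level elements: not well-formed, a document has one root.
print(is_well_formed("<note/><note/>"))
```

Checking validity would require a validating parser and the DTD or schema itself, which this sketch deliberately leaves out.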

Valid XML Documents

  • http://www.w3schools.com/Schema/schema_intro.asp
  • There are different types of schemas for XML
      • All are rules that define the elements, attributes, and structure of a particular markup language
  • When you point to a DTD or a schema document, those are the rules for that particular document's markup

Different types of schemas for XML

  • XML Document Type Definition (DTD)
  • W3C XML Schema
  • RELAX NG: http://relaxng.org/
      • A simple schema language for XML based on [RELAX] and [TREX]
  • Schematron: http://www.schematron.com/overview.html
      • Differs in basic concept from other schema languages in that it is not based on grammars but on finding tree patterns in the parsed document

The building blocks of XML documents

  • Seen from a DTD point of view, all XML documents (and XHTML documents) are made up of the following simple building blocks:
      • Elements
      • Tags
      • Attributes
      • Entities
      • PCDATA
      • CDATA

Document Type Definition (DTD)

  • An XML document can only have one Document Type Declaration
  • A DTD provides a list of the elements, tags, attributes and entity references contained in an XML document and describes their relationships to each other.
  • Inline Definition:

        <?xml version="1.0"?>
        <!DOCTYPE documentelement [definition]>

  • External Definition:

        <?xml version="1.0"?>
        <!DOCTYPE documentelement SYSTEM "documentelement.dtd">

XML - Elements

  • Elements are the main building blocks of both XML and XHTML documents.
  • XML elements can contain text, other elements, or be empty
  • For example, elements could be "note" and "message"

XML - Elements and Attributes

  • Element names are case sensitive
  • Elements can be nested, as follows:

        <CD>Kind of Blue
          <TRACK>So What (:22)</TRACK>
          <TRACK>Blue in Green (5:37)</TRACK>
        </CD>
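
The nesting and the case-sensitivity rule can both be observed by loading the CD fragment above into a parser. This is a minimal sketch using Python's xml.etree.ElementTree; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

CD_XML = """<CD>Kind of Blue
  <TRACK>So What (:22)</TRACK>
  <TRACK>Blue in Green (5:37)</TRACK>
</CD>"""

cd = ET.fromstring(CD_XML)

# Text that appears before the first child element belongs to the parent.
album = cd.text.strip()

# The nested TRACK elements are children of CD.
tracks = [track.text for track in cd.findall("TRACK")]

# Names are case sensitive, so "track" does not match "TRACK".
lowercase_matches = cd.findall("track")

print(album, tracks, lowercase_matches)
```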

Elements and Attributes: Adding elements to the Jazz.XML File

[Figure: an XML document (Jazz.XML) with its prolog and document elements labeled]

XML - Tags

  • Tags are used to mark up elements.
  • A starting tag like <element_name> marks up the beginning of an element, and an ending tag like </element_name> marks up the end of an element.
  • Examples:
      • body element marked up with body tags:

            <body>some body text in between</body>

      • message element marked up with message tags:

            <message>some message in between</message>

XML - Entities

  • Variables used to define common text
  • Entity references provide access to character entities that are used for part of the syntax
  • Entities are expanded when a document is parsed by an XML parser
  • XHTML entity reference: "&nbsp;", the "non-breaking-space" entity, is used to insert an extra space in a document
  • The following entities are predefined in XML:

        Entity reference    Character
        &lt;                <
        &gt;                >
        &amp;               &
        &quot;              "
        &apos;              '
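
Entity expansion is easy to observe with a parser. The sketch below uses Python's xml.etree.ElementTree and xml.sax.saxutils.escape from the standard library; the helper name parses and the sample strings are made up for illustration. It also shows that &nbsp; is not one of the five predefined XML entities, so a plain XML parser rejects it unless a DTD declares it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def parses(xml_text):
    """Return True if xml_text can be parsed without a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# The five predefined entity references are expanded during parsing.
parsed = ET.fromstring(
    "<text>a &lt; b &amp;&amp; b &gt; 0 &quot;ok&quot; &apos;ok&apos;</text>"
)
expanded = parsed.text

# Going the other way: escape() replaces &, < and > with entity references.
escaped = escape("a < b && b > 0")

# &nbsp; is defined by XHTML, not by XML itself, so it is undefined here.
nbsp_ok = parses("<p>a&nbsp;b</p>")

print(expanded)
print(escaped)
print(nbsp_ok)
```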

XML - PCDATA

  • PCDATA is text that will be parsed by a parser.
  • Tags inside the text will be treated as markup and entities will be expanded

        <?xml version="1.0"?>
        <!DOCTYPE message [
          <!ELEMENT message (to,from,subject,text)>
          <!ELEMENT to (#PCDATA)>
          <!ELEMENT from (#PCDATA)>
          <!ELEMENT subject (#PCDATA)>
          <!ELEMENT text (#PCDATA)>
        ]>
        <message>
          <to>Dave</to>
          <from>Susan</from>
          <subject>Reminder</subject>
          <text>Don't forget to buy milk on the way home.</text>
        </message>

  • #PCDATA is character data that must be parsed and expanded.
  • If a #PCDATA section contains elements, those elements must also be declared
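
As a rough check of the message example, the document and its inline DTD can be loaded with Python's xml.etree.ElementTree. One assumption to note: this parser reads past the DOCTYPE but does not validate against it, so the sketch only shows that the parsed content matches what the DTD describes.

```python
import xml.etree.ElementTree as ET

MESSAGE_XML = """<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (to,from,subject,text)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT text (#PCDATA)>
]>
<message>
  <to>Dave</to>
  <from>Susan</from>
  <subject>Reminder</subject>
  <text>Don't forget to buy milk on the way home.</text>
</message>"""

message = ET.fromstring(MESSAGE_XML)

# Child elements in document order, matching (to,from,subject,text).
order = [child.tag for child in message]

# Each child holds parsed character data (#PCDATA).
fields = {child.tag: child.text for child in message}

print(order)
print(fields["text"])
```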

XML - CDATA

  • Everything inside a CDATA section is treated as literal character data: the parser does not interpret it as markup, so characters such as < and & need no escaping

        <script>
        <![CDATA[
        function match(a, b) {
          if (a < b && a < 0) {
            return 1;
          }
          else {
            return 0;
          }
        }
        ]]>
        </script>
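
The effect of a CDATA section can be checked by parsing the same text with and without it. This is a minimal sketch using Python's xml.etree.ElementTree; the helper name parses and the shortened script body are illustrative only.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text can be parsed without a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


WITH_CDATA = "<script><![CDATA[if (a < b && a < 0) { return 1 }]]></script>"
WITHOUT_CDATA = "<script>if (a < b && a < 0) { return 1 }</script>"

# Inside CDATA, < and & reach the application unchanged.
code = ET.fromstring(WITH_CDATA).text

# Outside CDATA, the bare < and & are read as markup and break the document.
without_cdata_ok = parses(WITHOUT_CDATA)

print(code)
print(without_cdata_ok)
```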

CDATA Sections

[Figure: the CDATA section in the Jazz.XML file]

The Document Creation Process

[Figure: the document creation process]

Structure of an XML Document

  • XML documents consist of three parts:
      • The prolog
      • The document body
      • The epilog
  • The prolog is optional and provides information about the document itself
  • The document body contains the document's content in a hierarchical tree structure.
  • The epilog is also optional and contains any final comments or processing instructions.
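
The three parts can be made visible by listing the top-level nodes of a parsed document. The sketch below assumes Python's xml.dom.minidom, which keeps comments that appear before and after the root element; the sample document and the LABELS table are invented for this illustration.

```python
from xml.dom import Node, minidom

DOC_XML = """<?xml version="1.0"?>
<!-- prolog: a comment before the document body -->
<note><to>Dave</to></note>
<!-- epilog: a comment after the document body -->
"""

document = minidom.parseString(DOC_XML)

# Readable labels for the node types that can sit at the top level.
LABELS = {
    Node.COMMENT_NODE: "comment",
    Node.ELEMENT_NODE: "element",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
    Node.DOCUMENT_TYPE_NODE: "doctype",
}

top_level = [
    LABELS[node.nodeType]
    for node in document.childNodes
    if node.nodeType in LABELS
]

print(top_level)
print(document.documentElement.tagName)
```

The single element in the middle is the document body; whatever precedes it belongs to the prolog and whatever follows it belongs to the epilog.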

The Structure of an XML Document: Creating the Prolog

  • The prolog consists of four parts in the following order:
      1. XML declaration
      2. Miscellaneous statements or comments
      3. Document type declaration
      4. Miscellaneous statements or comments
  • This order has to be followed or the parser will generate an error message.
  • None of these four parts is required, but it is good form to include them.

The Structure of an XML Document: The XML Declaration

  • The XML declaration is always the first line of code in an XML document. It tells the processor that what follows is written using XML. It can also provide information about how the parser should interpret the code.
  • The complete syntax is:

        <?xml version="version number" encoding="encoding type" standalone="yes | no" ?>

  • A sample declaration might look like this:

        <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
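
To see how a parser reads the declaration, the sample above can be parsed with Python's xml.dom.minidom, which exposes the three values as attributes of the document object. This is a sketch under that assumption; the helper name parses and the empty note element are illustrative only. The last check shows what happens when the declaration is not at the very start of the document.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

DECLARED = b'<?xml version="1.0" encoding="UTF-8" standalone="yes" ?><note/>'

doc = minidom.parseString(DECLARED)

# version, encoding and standalone as read from the declaration.
declaration = (doc.version, doc.encoding, doc.standalone)


def parses(data):
    """Return True if data can be parsed without a syntax error."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


# A line break before the declaration means it is no longer first.
late_declaration_ok = parses(b"\n" + DECLARED)

print(declaration)
print(late_declaration_ok)
```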

Linking to a Style Sheet

  • A style sheet is linked to an XML document to format the document.
  • The XML processor combines the style sheet with the XML document to display a formatted document
  • There are two main style sheet languages used with XML:
      • Cascading Style Sheets (CSS)
      • Extensible Stylesheet Language (XSL)
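
The slides do not show the linking syntax itself. One common way, defined by the W3C, is an xml-stylesheet processing instruction placed in the prolog. The sketch below assumes that mechanism and a hypothetical style sheet file named jazz.css, then reads the instruction back with Python's xml.dom.minidom.

```python
from xml.dom import Node, minidom

# "jazz.css" is a hypothetical file name chosen for this example.
STYLED_XML = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="jazz.css"?>
<CD>Kind of Blue</CD>"""

doc = minidom.parseString(STYLED_XML)

# Processing instructions in the prolog are top-level nodes of the document.
stylesheet_links = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]

print(stylesheet_links)
```

A browser or other processor that understands the instruction would fetch the named style sheet and apply it; the parser here only reports that the link is present.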

HTML AND XML

  • XHTML tags can be added to an XML document
      • Add the namespace to the XML document: http://www.w3.org/TR/REC-html40
  • An XML processor recognizes any tag associated with this namespace as an HTML tag, and a browser treats those tags as if they came from an HTML file
  • Mixing HTML and XML allows you to place an inline image into an XML document or to create hypertext links
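
How a processor ties a tag to the namespace can be sketched with Python's xml.etree.ElementTree, which reports namespaced tags as {namespace}name. The html prefix, the link target and the link text below are assumptions made for this example; only the namespace URI comes from the slide.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/TR/REC-html40"

# The prefix "html" and the example.com address are hypothetical.
MIXED_XML = f"""<CD xmlns:html="{HTML_NS}">Kind of Blue
  <html:a href="http://example.com/kind-of-blue">Album notes</html:a>
</CD>"""

cd = ET.fromstring(MIXED_XML)

# The parser resolves the prefix, so the lookup uses the namespace URI.
link = cd.find(f"{{{HTML_NS}}}a")

print(link.tag)
print(link.get("href"), link.text)
```

Because the tag is identified by the namespace URI rather than by the prefix, any prefix bound to the same URI would be recognized in the same way.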

MIXING XHTML AND XML

[Figure: hyperlink example]
