7/31/2019 XML Syntax 2
Rajesh Math
10/17/2012 SICSR XML - Lecture 1
XML Session 2
XML Syntax Rules Summary
XML documents have exactly one root element.
All elements have a parent element except for the root element.
All elements have a start tag and an end tag (except for empty elements), e.g. <tag>some content in here</tag>
Element names: must begin with a letter or underscore (_), followed by letters, digits, underscores, periods (.) or hyphens (-). Element names cannot start with the string xml in any case combination (names beginning with xml are reserved).
Attributes: elements may have attributes associated with them. Attribute names follow the element naming rules. Attribute values must be enclosed in quotes, either double (") or single (').
Nested elements: elements can be nested within other elements. <a><b>...</b></a> is allowed.
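The rules above can be checked mechanically by handing a document to any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the sample documents and the helper name `is_well_formed` are invented for illustration and are not from the lecture.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, made up for this sketch.
nested = "<library><book><title>XML</title></book></library>"  # properly nested
two_roots = "<book/><book/>"                 # more than one root element
missing_end_tag = "<book><title>XML</book>"  # <title> is never closed
bad_name = "<1book/>"                        # name starts with a digit

for label, sample in [("nested", nested), ("two_roots", two_roots),
                      ("missing_end_tag", missing_end_tag),
                      ("bad_name", bad_name)]:
    print(label, is_well_formed(sample))
```

Only the first document should be accepted; each of the others breaks one of the rules listed on this slide.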
XML Syntax Rules Summary
No overlapping tags: XML elements must be properly nested. <a><b>...</a></b> is not allowed.
The XML declaration is the first line of the document. It identifies the document as an XML document and specifies the XML version being used and the character encoding.
Built-in entity references:
&lt; < left angle bracket
&gt; > right angle bracket
&apos; ' apostrophe
&quot; " double quotation mark
&amp; & ampersand
Attributes: the attribute value must be supplied if the attribute is used, and the value must be quoted.
...
...
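As a small illustration of the built-in entity references, the sketch below lets a parser decode all five and then escapes a string that contains markup characters. It uses Python's standard library; the element names `t` and `code` and the sample string are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The parser replaces the five built-in entity references with characters.
parsed = ET.fromstring("<t>&lt; &gt; &amp; &apos; &quot;</t>")
decoded = parsed.text

# Going the other way: escape() rewrites &, < and > so that arbitrary
# text can be placed safely inside element content.
raw = 'if (x < y && name == "fred")'
escaped = escape(raw)
round_trip = ET.fromstring("<code>" + escaped + "</code>").text

print(decoded)
print(escaped)
print(round_trip == raw)
```

By default `escape` leaves quotation marks alone, which is fine for element content; for attribute values the quote characters need escaping as well.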
XML Is Not Just A Markup Language
When we say "XML", we are really referring to a whole family of technologies:
DTD
HTML is defined by a Document Type Definition (DTD) that specifies the structure and syntax of all valid HTML documents.
In XML we can define our own markup language and the structure of any documents created from it. The rules are defined in a DTD that we design for that particular application.
Schema
Another method for defining the structure and rules for an XML document. Schema gives a tighter definition of the elements and their allowed values as well as the order in which nesting of tags is allowed.
XSL
eXtensible Stylesheet Language: a markup language that allows you to describe a set of rules for translating one XML document to another XML document.
XSLT
XSL Transformations: the part of XSL that defines the transformation rules. Processors that apply them are available as Application Programming Interfaces (APIs) and as utility programs that take an XML document and an XSL file and write the transformed result to a new file.
XML Is Not Just A Markup Language
Parsing
A parser is responsible for disassembling an XML document into its basic objects. The objects are then available for a programmer to extract and process in whatever way they wish.
DOM - Document Object Model
Used by browsers for HTML.
DOM parsers are available as libraries for Java, Perl, C++ and many other languages.
Uses a tree representation of documents, see above.
Very memory and CPU intensive for large documents.
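A minimal DOM sketch, assuming Python's standard `xml.dom.minidom` as the parser; the `library`/`book`/`title` document is invented for illustration. The whole document is loaded into a tree first, which is what makes DOM convenient to navigate and also what makes it costly for large inputs.

```python
from xml.dom import minidom

# Hypothetical sample document, made up for this sketch.
doc_text = (
    '<library>'
    '<book id="b1"><title>XML Basics</title></book>'
    '<book id="b2"><title>More XML</title></book>'
    '</library>'
)

# parseString builds the complete tree in memory before returning.
dom = minidom.parseString(doc_text)
root = dom.documentElement

# Once the tree exists, it can be walked and queried in any order.
titles = [t.firstChild.data for t in dom.getElementsByTagName("title")]
ids = [b.getAttribute("id") for b in root.getElementsByTagName("book")]

print(root.tagName, titles, ids)
```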
SAX: Simple API for XML (but not that simple to use!)
Uses a state engine and event notification to extract objects from the XML document.
SAX parsers are less CPU- and memory-demanding than DOM parsers for large documents.
Sequentially processes the document from start to end.
Useful for extracting single items from the document.
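The event-driven style can be sketched with Python's standard `xml.sax` module. The handler below keeps a tiny piece of state (whether the parser is currently inside a `title` element) and collects title text as events arrive. The class name `TitleHandler` and the sample document are assumptions made for the example.

```python
import xml.sax


class TitleHandler(xml.sax.ContentHandler):
    """Collects the text of every <title> element as events arrive."""

    def __init__(self):
        super().__init__()
        self.in_title = False   # the "state engine": are we inside <title>?
        self.titles = []
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self.in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may be delivered in several chunks, so buffer it.
        if self.in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self.in_title = False


# Hypothetical sample document, made up for this sketch.
doc_bytes = (
    b"<library>"
    b"<book><title>XML Basics</title></book>"
    b"<book><title>More XML</title></book>"
    b"</library>"
)

handler = TitleHandler()
xml.sax.parseString(doc_bytes, handler)
print(handler.titles)
```

No tree is ever built, so memory use stays small, but the handler has to track for itself where in the document it is.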
XML syntax
XML consists of ELEMENTS
Each element is named and contains some content (except for special empty elements), e.g. <name>George</name>
Elements are represented using tags, and each start tag has a corresponding closing tag unless it is an empty-element tag, such as <tag/>.
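A short sketch of elements with and without content, built with Python's standard `xml.etree.ElementTree`. The element names `person`, `name` and `retired` are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Build a small tree: one element with text content, one empty element.
person = ET.Element("person")
name = ET.SubElement(person, "name")
name.text = "George"
ET.SubElement(person, "retired")  # no content, so it is an empty element

# On output, the empty element is written as a single self-closing tag.
serialized = ET.tostring(person, encoding="unicode")
print(serialized)

# Both spellings of an empty element parse to the same thing: no text.
short_form = ET.fromstring("<retired/>")
long_form = ET.fromstring("<retired></retired>")
print(short_form.text, long_form.text)
```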
Attributes, Comments
Attribute values must be present, and must be quoted. For example, in HTML we could get away with an attribute that has no value or an unquoted value. This is not legal in XHTML, where the value must be written out and quoted. Similarly, in XML: .....
NB: An open issue in XML design is whether a particular entity should be modelled/described as a tag/element of its own, or as an attribute of an existing element. The general "rule of thumb" is that elements should be thought of as containers (which are understood to have contents) and attributes are characteristics of the element.
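Both points on this slide can be tried out with a parser. The sketch below, using Python's standard `xml.etree.ElementTree`, first shows that valueless and unquoted attributes are rejected, then models the same piece of data once as an attribute and once as a child element. All element and attribute names (`option`, `book`, `isbn`, `title`) are invented for illustration.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if `text` is accepted as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# HTML-style shortcuts are not well-formed XML.
minimized = "<option selected/>"       # attribute without a value
unquoted = "<option value=1/>"         # value without quotes
quoted = '<option selected="selected" value="1"/>'
print(parses(minimized), parses(unquoted), parses(quoted))

# The same information as an attribute (a characteristic of the element)...
as_attribute = ET.fromstring('<book isbn="12345"><title>XML</title></book>')
# ...or as a child element (something the container holds).
as_element = ET.fromstring('<book><isbn>12345</isbn><title>XML</title></book>')

print(as_attribute.get("isbn"), as_element.findtext("isbn"))
```

Either modelling is legal; the choice mainly affects how the data is queried and whether it can later grow structure of its own.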
CDATA
The content of a CDATA section is not treated as markup. Typically used to include data that will be used by another application, e.g. JavaScript. Syntax is a little messy, and looks like:
<script type="text/javascript"> <![CDATA[ var name="fred"; var x = 3.0; var y = 4.0; if ( x < y ) document.write("x is less than y"); ]]> </script>
Note the strange use of the "square brackets": <![CDATA[ ... ]]>.
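To see that a parser really does leave CDATA content alone, here is a minimal sketch using Python's standard `xml.dom.minidom`. The script text is a shortened, hypothetical variant of the example on this slide.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample, shortened from the slide's script example.
with_cdata = (
    '<script type="text/javascript">'
    '<![CDATA[ if ( x < y ) document.write("x is less than y"); ]]>'
    '</script>'
)
dom = minidom.parseString(with_cdata)
node = dom.documentElement.firstChild

# The section arrives as one CDATA node; the "<" inside it is plain text.
is_cdata = node.nodeType == node.CDATA_SECTION_NODE
print(is_cdata, node.data)

# Without the CDATA wrapper, the bare "<" is read as the start of a tag.
without_cdata = "<script> if ( x < y ) go(); </script>"
try:
    minidom.parseString(without_cdata)
    rejected = False
except ExpatError:
    rejected = True
print(rejected)
```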
In XHTML the above example would be written:
Processing Instructions
An XML file can also contain processing instructions that give commands or information to an application that is processing the XML data. Processing instructions look rather like the lines in the prolog: <?target instructions?> where target is the name of the application and instructions is a string of text which is passed to it, eg:
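As a rough illustration of how an application sees a processing instruction, the sketch below parses a document with Python's standard `xml.dom.minidom` and reads back the target and the instruction string. The target `render-hint` and its contents are hypothetical names invented for this example.

```python
from xml.dom import minidom

# Hypothetical document: "render-hint" is a made-up target name.
doc_text = (
    '<?xml version="1.0"?>'
    '<?render-hint mode="compact"?>'
    '<report/>'
)
dom = minidom.parseString(doc_text)

# Processing instructions appear as their own node type in the tree.
# The XML declaration on the first line is not reported as one.
pis = [
    (n.target, n.data)
    for n in dom.childNodes
    if n.nodeType == n.PROCESSING_INSTRUCTION_NODE
]
print(pis)
```

The parser does not interpret the instruction string; it simply hands it to whichever application recognises the target.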
Namespace
Similar concept to scope rules for variables in programming. For example, in Java, the keyword "this" is used as a prefix to refer to the instance variable to avoid confusion with a local variable of the same name. Another example: in Perl, the keywords "my" and "local" provide fine-grained control over variable scope.
Uniform Resource Identifiers (URIs) are used to uniquely identify a namespace.
URI
Uniform Resource Identifier -- can be a URL or a URN
URL
Uniform Resource Locator
URN
Uniform Resource Name
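A minimal sketch of namespaces keeping two same-named elements apart, using Python's standard `xml.etree.ElementTree`. The prefixes `f` and `h` and both namespace URIs are made up for the example; neither URI needs to resolve to anything.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two different kinds of "table" in one file.
doc_text = """<order xmlns:f="urn:example:furniture"
                     xmlns:h="http://example.com/html">
  <f:table>oak dining table</f:table>
  <h:table>rows and columns</h:table>
</order>"""

root = ET.fromstring(doc_text)

# ElementTree expands each prefix to its URI, written as {uri}localname,
# so the two elements end up with different full names.
tags = [child.tag for child in root]
print(tags)

# Lookups can use the expanded name or a prefix-to-URI mapping.
furniture = root.find("{urn:example:furniture}table").text
html_table = root.find("h:table", {"h": "http://example.com/html"}).text
print(furniture, "|", html_table)
```

The prefix is only a local shorthand; it is the URI that identifies the namespace.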
URN Syntax
The syntax is similar to URL.
All URNs have the following syntax:
<URN> ::= "urn:" <NID> ":" <NSS>
where <NID> is the Namespace Identifier, and <NSS> is the Namespace Specific String.
A namespace can be declared for any XML document type (custom markup language). The namespace is identified using a unique URN or URL.
IMPORTANT: The URI need not physically exist. It is only being used as a means of uniquely identifying a document definition.
A namespace is a conceptual zone in which all names are unique.
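The URN syntax above can be sketched as a small splitting function. This is a simplified illustration, not a full validator: it only separates the three parts and does not check which characters are allowed in each. The function name `split_urn` and the sample URNs are assumptions made for the example.

```python
def split_urn(urn):
    """Split "urn:<NID>:<NSS>" into (NID, NSS); raise ValueError otherwise.

    Simplified sketch: no character-level validation is performed.
    """
    scheme, first_colon, rest = urn.partition(":")
    nid, second_colon, nss = rest.partition(":")
    # The leading "urn" is compared case-insensitively.
    if scheme.lower() != "urn" or not first_colon or not second_colon:
        raise ValueError("not a URN: " + repr(urn))
    if not nid or not nss:
        raise ValueError("missing NID or NSS: " + repr(urn))
    return nid, nss


# Hypothetical sample identifiers.
print(split_urn("urn:isbn:0451450523"))
print(split_urn("urn:example:furniture:tables"))

try:
    split_urn("http://example.com/html")
    url_rejected = False
except ValueError:
    url_rejected = True
print(url_rejected)
```

Only the first two colons are structural; any further colons belong to the Namespace Specific String.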