The Official 2002 XML Marathon April 4, 2002

The Official 2002 XML Marathon

April 4, 2002

Revised Requirements

A photocopy of the original text
A short description (read: single paragraph) discussing your choice of text and any challenges that you encountered.
A completed DTD (internal or external)
A completed valid and well-formed XML document (refer to our previous class for definitions of valid and well-formed XML)

Revised XML Plan

Please keep in mind that XML is all about structure and, to a much lesser extent, presentation. It is about what is ‘under the hood’.
Remember XML is all about simplicity, portability, and readability.
To this effect, you will no longer be required to use pictures, links, or multimedia. You may choose to use pictures, but links and multimedia (for which there is a wealth of different protocols) will not be covered in the time remaining.

The XML Marathon

Covering the following:
– The Official XML Refresher
– Defining and incorporating entities
– Working with empty tags (including use of images)
– Oh, the places you’ll go: Advanced XML and Beyond!

XML Refresher

The Plan:
– Before you do anything, take your project document and break it down, from its overall largest structure (e.g., book) to its simplest or smallest structure (e.g., sub-sub-sub-sub-heading).
– Devise your own XML names for defining these structural elements (remember your time-o-gram) and, if you wish, presentation elements as well (e.g., <emphasis> or <funkyfont>).

Document or DTD

Your next step is up to you, but you have two choices:
– You can immediately start in on creating your DTD using the elements you have discovered, or,
– You can begin marking up your XML document, taking the raw text you have to work with and diving right into it.
– Both methods are perfectly acceptable.

Building XML

When creating your XML document, remember the first steps:
– Declare the XML version (<?xml version="1.0"?>)
– Declare your DTD: either include it as an internal DTD, or reference an external one (we’ll go over these in a minute)
– Begin your XML with your ROOT element (e.g., book).
– Make this your mantra: Version-DTD-Root

Choosing a DTD style

Internal
– Simple and compact
– Applies to only one document

External
– A little more complex (one more line of XML!)
– Can apply to any number of documents
– Harder to work with (navigating between multiple screens, flipping back and forth between docs)

Real-World DTDs

Internal
– A single independent document that has only one purpose or application (e.g., only viewed online)

External
– A document or series of documents that can be re-purposed (e.g., online, print, database, e-mail, etc.)
– Cuts down on duplication and coding time.

Your Project DTD

The choice of which style of DTD to use is completely up to you.

Internal Example:
<?xml version="1.0"?>
<!DOCTYPE book […]>
<book>

External Example:
<?xml version="1.0" standalone="no"?>
<!DOCTYPE book SYSTEM "http://is.dal.ca/~sboon/time-o-gram.dtd">
<book>
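
As a minimal sketch of the Version-DTD-Root order with an internal DTD, the document below carries its own rules inside the DOCTYPE. The child elements booktitle and chapter and the sample text are assumptions made for illustration, not part of any required project structure.

```xml
<?xml version="1.0"?>
<!-- Internal DTD: the rules sit between [ and ] inside the DOCTYPE -->
<!DOCTYPE book [
  <!ELEMENT book (booktitle, chapter)>
  <!ELEMENT booktitle (#PCDATA)>
  <!ELEMENT chapter (#PCDATA)>
]>
<!-- The root element name must match the name given in the DOCTYPE -->
<book>
  <booktitle>A Sample Title</booktitle>
  <chapter>Text of the first chapter.</chapter>
</book>
```

To switch to the external style, the same three ELEMENT lines would move into a separate .dtd file and the DOCTYPE would point to that file with SYSTEM.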

Defining Elements and Attributes

A DTD must define rules for each and every element and attribute that will appear in your XML document. Otherwise, the document will not be valid.
Whenever you change your XML, remember to make the corresponding changes to your DTD, particularly if you are adding elements or attributes to your document.

Also remember..

At any time in your document, you can create comments for yourself or for others. This will be useful for those of you involved in the showcase!
Comments are created by using the <!-- and --> start and end markers, such as:

<!--The following section uses the “class.css” style sheet. You will need to ensure that the “class.css” is in the proper directory.-->

Defining Structure

Defining our root element
– <!ELEMENT book (chapterone, sectionone, sectiontwo, chaptertwo,…, …, etc., etc.)>
– Like a tree hierarchy:

book (root element)
    chapterone
        sectionone
        sectiontwo
        …
    chaptertwo
        sectionone
        sectiontwo
        …

Always remember that XML is case-sensitive!
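
If the sections sit inside the chapters, as in the tree, one way to express that in a DTD is to have each element list only its direct children. This is a sketch under that assumption, reusing the slide's element names and treating each section as plain text.

```xml
<!-- Each declaration names only the direct children of that element -->
<!ELEMENT book (chapterone, chaptertwo)>
<!ELEMENT chapterone (sectionone, sectiontwo)>
<!ELEMENT chaptertwo (sectionone, sectiontwo)>
<!-- The lowest level holds the actual text -->
<!ELEMENT sectionone (#PCDATA)>
<!ELEMENT sectiontwo (#PCDATA)>
```

Because names are case-sensitive, a tag written as <Book> or <ChapterOne> in the document would not match these declarations.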

Defining Text

To define an element to contain text:
– Type <!ELEMENT yourtag where yourtag is the tag you are creating and wish to define.
– Next, type a space and (#PCDATA)>
– This states that the element you define will only contain text
– PCDATA stands for parsed character data and refers to everything except your XML code.

Example:
<!ELEMENT booktitle (#PCDATA)>

Defining Elements… cont’d

So, for example, your DTD will contain structural elements, such as your root element, which describes what other elements are contained within it, as well as textual elements that contain only text:

<!ELEMENT book (chapterone, chaptertwo,…, sectionone, sectiontwo…, etc., etc.)>

….

Defining Constraints

Also remember that it is possible to put constraints on the number of times that a given element can appear in your document (e.g., you don’t want 2 book titles), using the symbols ? (zero or one), + (one or more), and * (zero or more). An element listed with no symbol must appear exactly once.

Example:
<!ELEMENT book (booktitle?, chapter+, chaptertitle*,…, etc., etc.)>
• Here we limit <booktitle> so that it can appear at most once (it may also be left out), require that a book have at least one <chapter>, and allow a book to contain as many <chaptertitle>s as necessary, including none.
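
To see the three symbols at work, here is a small sketch of a complete document that is valid against its own internal DTD. The appendix element and the sample text are hypothetical, chosen only to show an element that is allowed but never used.

```xml
<?xml version="1.0"?>
<!DOCTYPE book [
  <!-- ? = zero or one, + = one or more, * = zero or more -->
  <!ELEMENT book (booktitle?, chapter+, appendix*)>
  <!ELEMENT booktitle (#PCDATA)>
  <!ELEMENT chapter (#PCDATA)>
  <!ELEMENT appendix (#PCDATA)>
]>
<book>
  <!-- No booktitle and no appendix: both are optional -->
  <chapter>First chapter.</chapter>
  <chapter>Second chapter.</chapter>
</book>
```

Removing both chapter elements, or adding a second booktitle, would make the document invalid while leaving it well-formed.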

Attributes

Remember that information contained in attributes tends to be about your XML document, rather than your content. They are primarily metadata.
Users rarely see attributes: they are primarily used by parsers and XML designers.
Attributes are very commonly used with empty elements to point or link to the content of the element.

Attributes… cont’d

To define attributes:
– Type <!ATTLIST yourtag where yourtag is the name of the element in which the attribute will appear.
– Type the name of the attribute.
– Then, either type CDATA (not #PCDATA) for any combination of numbers or text (basically anything), or type (value1 | value2 | etc.) where the listed values are the ONLY values acceptable.
– Finally, you must type one of the following, and then close the tag with a >:
• "value" (in quotation marks) where value will be the default value if none is explicitly set
• #FIXED "value" where value is the default and ONLY value for that attribute (i.e., it is fixed)
• #REQUIRED to specify that the attribute must be given some (not pre-specified) value
• Or, #IMPLIED to specify that there is no default value, and the value may be omitted if desired.

Attribute examples

<!ELEMENT date (#PCDATA)>
<!ATTLIST date year CDATA #IMPLIED>

• This attribute definition says that the date element may contain an optional (#IMPLIED) year attribute that contains any number of characters (CDATA).

<!ELEMENT date (#PCDATA)>
<!ATTLIST date year (1999 | 2000 | 2001 | 2002) #REQUIRED>

• This attribute definition says that the year attribute must be present on every date element (#REQUIRED) and that its value must be one of 1999, 2000, 2001, or 2002. Those are the only choices (from the value list).
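
The four kinds of default can be put side by side in one ATTLIST. This is a sketch only: the log root element and the note, calendar, and zone attributes are hypothetical names added for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE log [
  <!ELEMENT log (date+)>
  <!ELEMENT date (#PCDATA)>
  <!-- One ATTLIST may declare several attributes for the same element -->
  <!ATTLIST date
    year     (1999 | 2000 | 2001 | 2002) #REQUIRED
    note     CDATA                       #IMPLIED
    calendar CDATA                       "gregorian"
    zone     CDATA                       #FIXED "AST">
]>
<log>
  <!-- Only year is mandatory; calendar falls back to its default -->
  <date year="2002">April 4</date>
  <!-- Optional attributes supplied explicitly -->
  <date year="2001" note="draft" calendar="julian">March 1</date>
</log>
```

A date with year="1998", or one with no year at all, would fail validation against this DTD.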

Any Questions?

Defining Entities

Entities are essentially acronyms that you create to stand for a string of text (e.g., slis for School of Library and Information Studies). They allow you to code your document using the acronym instead of the long-form; when the document is parsed, the parser substitutes the long-form for your acronym on the fly.
So a parser encountering &eff; would, based on the Entity’s definition in the DTD, replace it with Electronic Frontier Foundation.

More on Entities

There are two kinds of Entities: general entities and parameter entities. We are going to concentrate on general entities, which load data into the XML document itself, rather than parameter entities, which load data into your DTD.
General entities are also sometimes called ‘shortcuts’.

Creating Entities

To create an Entity:
– In your DTD, type <!ENTITY and add a space
– Next, type in your acronym and another space (e.g., eff). Entity names are case-sensitive, so spell the acronym the same way everywhere.
– Lastly, type the long-form of your text in quotation marks (e.g., "Electronic Frontier Foundation") and finish the tag with a >

Example:
<!ENTITY eff "Electronic Frontier Foundation">

Using Entities

Once you have created your Entities, in order to activate them within your XML document, you need to build a command using the ampersand, your acronym, and the semi-colon.

For example:
– The &eff; has been protecting the digital rights of online user groups since 1990. The &eff; is currently involved in litigation concerning…
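
Putting the declaration and the reference together, a minimal sketch of a complete document might look like this. The article and para element names are assumptions made for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE article [
  <!ELEMENT article (para+)>
  <!ELEMENT para (#PCDATA)>
  <!-- Declare the entity once in the DTD -->
  <!ENTITY eff "Electronic Frontier Foundation">
]>
<article>
  <!-- The parser replaces each &eff; with the long-form text -->
  <para>The &eff; has been protecting digital rights since 1990.</para>
</article>
```

Changing the quoted text in the ENTITY line changes every place that &eff; appears, which is the main convenience of the shortcut.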

External Entities

If you have cause to create large Entities (e.g., whole paragraphs or pages of text), you are better off creating external Entities, which keep the bulk of the text out of your main document and DTD.
This practice also allows you to share Entities with other documents.

Creating External Entities

First, create the Entity itself in an external text file.
Second, if you were using an internal DTD, you will need to add the following attribute to your XML declaration: standalone="no"
Next, type <!ENTITY and a space
Then, your acronym and another space
Then, SYSTEM, a space, and in quotation marks, the location and name of the file (e.g., "docs/entity.txt" or simply "entity.txt")
Close the tag with a >

Using External Entities

Example:
<!ENTITY effpara SYSTEM "docs/effpara.txt">

To use the Entity in your XML document, again use the ampersand (&) and semi-colon (;), for example:
The &eff; had the following to say about the event: &effpara;
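
As a sketch of how an internal and an external Entity can sit in the same document, the example below assumes hypothetical article and para elements and assumes that docs/effpara.txt holds plain text with no tags of its own.

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE article [
  <!ELEMENT article (para+)>
  <!ELEMENT para (#PCDATA)>
  <!-- Internal entity: the text is quoted right here -->
  <!ENTITY eff "Electronic Frontier Foundation">
  <!-- External entity: the text lives in a separate file -->
  <!ENTITY effpara SYSTEM "docs/effpara.txt">
]>
<article>
  <para>The &eff; had the following to say about the event: &effpara;</para>
</article>
```

If the external file contained tags, those tags would also have to be allowed inside para by the DTD for the document to stay valid.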

Entities for Unparsed Content

We can also use Entities to load unparsed content into our XML. Unparsed content can include all sorts of multimedia, including images.
Unparsed Entities can also be used to embed non-text or non-XML content into an XML document.

Creating Unparsed Entities

In your DTD, type <!ENTITY and a space
Next, type your desired acronym (e.g., effpict)
Next, type SYSTEM, a space, and in quotation marks the location and name of the file you want to load (e.g., "img/effpict.jpg").
Next, type NDATA id where id is the word that identifies the notation that will describe the unparsed data (we’ll do this next).
Close the tag with a >

Unparsed Entities… cont’d

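Example:
– <!ENTITY effpict SYSTEM "img/effpict.jpg" NDATA jpg>

Unlike the Entities we have seen so far, an Unparsed Entity is not inserted with the & and ; signs; writing &effpict; in your text is an error.
– Instead, its name (effpict) is given as the value of an attribute that the DTD declares with the type ENTITY.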

Entity Notation

Because an Unparsed Entity can contain anything, it requires that a Notation be created to give some definition to your entity. Usually, the Notation includes a word describing the file type or content of the Entity. It is purely metadata.
To create the Notation, you add another simple tag to your DTD immediately following your Unparsed Entity.

Creating Notations

On a new line in your DTD, type <!NOTATION and a space
Next, type id where id is the name you used after NDATA in the !ENTITY tag (e.g., jpg) and a space
Next, type SYSTEM, a space, and then in quotations the content information you are going to provide (e.g., "jpg/jpeg")
Close the tag using a >

Example:
<!NOTATION jpg SYSTEM "jpg/jpeg">
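
The Notation, the Unparsed Entity, and the element that points to it can be seen together in one sketch. In XML 1.0 an Unparsed Entity is named through an attribute of type ENTITY rather than written into the text; the article and figure elements and the source attribute below are hypothetical names chosen for the example.

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE article [
  <!ELEMENT article (figure)>
  <!ELEMENT figure EMPTY>
  <!-- An attribute of type ENTITY holds the name of an unparsed entity -->
  <!ATTLIST figure source ENTITY #REQUIRED>
  <!-- The notation name must match the word after NDATA -->
  <!NOTATION jpg SYSTEM "jpg/jpeg">
  <!ENTITY effpict SYSTEM "img/effpict.jpg" NDATA jpg>
]>
<article>
  <!-- The value is the entity name alone, with no & or ; -->
  <figure source="effpict"/>
</article>
```

The parser does not open the image itself; it only passes the file location and the notation along to whatever application is displaying the document.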

A Simpler Image

There is, however, a simpler method of including an image in your XML. It involves using an EMPTY Element (not an Entity).
EMPTY elements often express unparsed content (i.e., content that cannot be written out in text) such as multimedia.
To use an EMPTY Element, you must first define a new element in your DTD (e.g., <!ELEMENT picture EMPTY>)

Simpler Image… cont’d

However, just like the <IMG> tag in HTML, the EMPTY element requires an attribute telling it where to find the image (e.g., filename="effpict.jpg")
To create the attribute, name the element and then the attribute, using the format:
– <!ATTLIST picture filename CDATA #REQUIRED>, or
– <!ATTLIST picture filename CDATA #IMPLIED>

Image Example

So, your DTD should include something like the following:

<!ELEMENT picture EMPTY>
<!ATTLIST picture filename CDATA #REQUIRED>

Your corresponding XML should look like:
<picture filename="effpict.jpg"></picture>
or,
<picture filename="effpict.jpg"/>

Closing EMPTY elements

XML allows for EMPTY elements to be closed in the traditional manner, with a separate closing tag that begins with a forward slash (<element></element>), or by including the forward slash as the last character in the opening tag (<element/>). Nothing, not even a space, may appear between the two tags in the first form. The choice to use one or another is left to the designer.

Any Questions?