XML Technologies - 2012 - David Raponi

Review

• Writing XML
  • Style
  • Common errors

Week 1 Review: 1. Style

1. Style issues

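Be consistent. The following mixes lowercase and capitalized tag names:
<title>The Title</title>
<Description>The description</Description>

Pick one convention and stick to it (the same goes for camelCasing vs under_scores).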
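As a minimal sketch of a consistent version, assuming lowercase names with underscores as the chosen convention (the <item> wrapper and the <long_description> element are hypothetical, added only to show the style):

```xml
<!-- Every tag follows one convention: lowercase, words joined by underscores -->
<item>
  <title>The Title</title>
  <description>The description</description>
  <long_description>A longer description</long_description>
</item>
```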

2. Common errors

Be sure to group related items.

Instead of:
<book>The book</book>
<ISBN>123142</ISBN>
<book>Another book</book>
<ISBN>1412314124</ISBN>

group each book's data together:
<book>
  <name>The book</name>
  <ISBN>123142</ISBN>
</book>
<book>
  <name>Another book</name>
  <ISBN>1412314124</ISBN>
</book>

Avoid repeated names. Examine the following:
<band>
  <name></name>
  <member>
    <name>
      <first></first>
      <last></last>
    </name>
  </member>
</band>

The first <name> (inside <band>) is something that holds text.
The second <name> (inside <member>) is something that holds more elements.

Eek! We’re using “name” to define two different structural things.
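One possible way out, sketched with hypothetical tag names (band_name and full_name are not from the original example), is to give each structure its own name so that a given tag always means the same kind of thing:

```xml
<band>
  <!-- holds text only -->
  <band_name></band_name>
  <member>
    <!-- holds more elements -->
    <full_name>
      <first></first>
      <last></last>
    </full_name>
  </member>
</band>
```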

No need to increment elements!
<library>
  <book1></book1>
  <book2></book2>
</library>

“Hey! What’s the name of that book2 you’re reading?”
(That doesn’t make sense.)

Tag names should be thought of as pure nouns. In this case, just <book>, forget the numbers.
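The corrected version might look like this minimal sketch, where every entry uses the same noun and document order alone distinguishes the books:

```xml
<library>
  <!-- Same tag name for every book; no numbering needed -->
  <book></book>
  <book></book>
</library>
```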

XML Technologies

• Week 2: Document Type Definitions (DTD)
  • What is a DTD?
  • How to write a DTD?

Week 2: DTDs > 1. What is a DTD?

1. What is a DTD?

Recall: XML is a language that can make new markup languages.

This involves two steps:
1. Creating your own tags and structure (last week)
2. Defining that structure (DTDs, this week)

Define the blocks

W3Schools: “The purpose of a DTD is to define the legal building blocks of an XML document.”

A successful check of an XML file against a DTD makes it “valid” for that DTD (not just well-formed)

Note: It still doesn’t DO anything. Sorry.

… so why bother?

So why bother?

Tons of data being passed around
+
Tons of keeners coming up with their own way to manipulate the data
=
The Apocalypse

Standards: ensuring predictability

With a DTD, you restrict the data content and how that content is organized.

This is important because if several people are passing along updated versions of an XML file, and each is trying to be clever by adding/changing things, then the file is no longer “predictable”, which would make future processing of that XML file cumbersome.

The more predictable the structure of an XML file, the easier it is to work with it

HTML goes XHTML

Compare:

• <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

• <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

Q: What are some differences?

Week 2: DTDs > 2. How to write a DTD?

2. How to write a DTD?

To write a DTD, ask yourself these questions:
• What elements are there?
• How many times are those elements allowed to appear?
• What is inside those elements?
• What attributes are there?

Declaring an element

General form:
<!ELEMENT name contents>
<!ELEMENT name (further subelements)>

Empty elements (like hr and br tags)
• <!ELEMENT hr EMPTY>

Parsed Character Data (your text)
• <!ELEMENT title (#PCDATA)>

Any Data
• <!ELEMENT title ANY>

Elements with subelements
• <!ELEMENT div (h1, p)>
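To see the four kinds of declaration side by side, here is a small sketch of a document with an internal DTD. The element names (page, body, heading, para, notes) are hypothetical and chosen only for illustration:

```xml
<?xml version="1.0"?>
<!DOCTYPE page [
<!ELEMENT page (title, hr, body, notes)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT hr EMPTY>
<!ELEMENT body (heading, para)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT para (#PCDATA)>
<!ELEMENT notes ANY>
]>
<page>
  <!-- #PCDATA: text only -->
  <title>My page</title>
  <!-- EMPTY: no content at all -->
  <hr/>
  <!-- subelements: exactly one heading, then one para, in that order -->
  <body>
    <heading>Hello</heading>
    <para>Some text.</para>
  </body>
  <!-- ANY: text and any declared element may appear -->
  <notes>Anything <para>declared</para> can go here.</notes>
</page>
```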

Declaring a subelement’s recurrence

How many times does the subelement appear?

Once: elem (do nothing, default)
Once or more: elem+
Zero or more: elem*
Zero or once: elem?
Either / or: (elem1|elem2)

Recurrence examples:
<!ELEMENT book (title)>
<!ELEMENT person (child*)>
<!ELEMENT book (chapter+)>
<!ELEMENT band (record_label?)>
<!ELEMENT gnathostome (fins+|legs+)>

*gnatho-what?! Dude, it’s any animal that has a jaw. Doesn’t everyone know that?
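The recurrence markers can be combined in a single content model. The following is a sketch only: the library, author, publisher, print and ebook elements are hypothetical, and the two <book> entries are chosen to exercise each marker in a different way.

```xml
<?xml version="1.0"?>
<!DOCTYPE library [
<!ELEMENT library (book+)>
<!ELEMENT book (title, author*, chapter+, publisher?, (print|ebook))>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT chapter (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT print EMPTY>
<!ELEMENT ebook EMPTY>
]>
<library>
  <!-- two authors, one chapter, no publisher, print edition -->
  <book>
    <title>The book</title>
    <author>First Author</author>
    <author>Second Author</author>
    <chapter>Chapter one</chapter>
    <print/>
  </book>
  <!-- no authors, two chapters, a publisher, ebook edition -->
  <book>
    <title>Another book</title>
    <chapter>Chapter one</chapter>
    <chapter>Chapter two</chapter>
    <publisher>Some Publisher</publisher>
    <ebook/>
  </book>
</library>
```

A <book> with no <chapter>, or with both <print/> and <ebook/>, would still be well-formed but would no longer validate against this DTD.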

Declaring an attribute

General form:
<!ATTLIST elem-name attr-name type default-value>

Type:
• CDATA (the value is plain character data; it is not parsed as markup)
• Enumerated (this|that|another) – specific values

Value:
• #REQUIRED (the attribute must be present; for CDATA the value itself may still be empty)
• #IMPLIED (attr may/may not be there)
• #FIXED "value" (attr must always have this value)
• "value" (sets a default, but it can be changed)

Enumeration examples:
<!ATTLIST person status (single|married) #REQUIRED>
<!ATTLIST person status (single|married) #IMPLIED>

CDATA examples:
<!ATTLIST person name CDATA #REQUIRED>
<!ATTLIST person status CDATA #IMPLIED>
<!ATTLIST person gender CDATA #FIXED "female">
<!ATTLIST person gender CDATA "female">
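A sketch of how the four default-value options behave in a document. The people wrapper and the species and country attributes are hypothetical, introduced so that each option can appear once on the same element:

```xml
<?xml version="1.0"?>
<!DOCTYPE people [
<!ELEMENT people (person+)>
<!ELEMENT person EMPTY>
<!ATTLIST person name CDATA #REQUIRED>
<!ATTLIST person status (single|married) #IMPLIED>
<!ATTLIST person species CDATA #FIXED "human">
<!ATTLIST person country CDATA "Canada">
]>
<people>
  <!-- status given; species and country fall back to their declared values -->
  <person name="Ann" status="married"/>
  <!-- status omitted (allowed by #IMPLIED); default country overridden -->
  <person name="Bob" country="Italy"/>
</people>
```

Leaving out name, or writing species="cat", would make the document invalid while leaving it well-formed.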

Declaring an attribute: special note!

Spaces
• Recall: Spaces are NOT allowed in tag names
• Recall: Spaces ARE allowed in attribute values
• BUT: not allowed in DTD enumerated values!

Ex: <!ATTLIST name attr (word|two words) #IMPLIED> (this is bad)

In other words: You can have well-formed XML docs with spaces in attributes, but it may not validate with a DTD.

In general: Avoid spaces in attributes! Instead, use _, -, or if you need to.
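As an illustration, assuming a hypothetical third status value that would naturally be written as two words, an underscore keeps the enumeration legal:

```xml
<?xml version="1.0"?>
<!DOCTYPE person [
<!ELEMENT person EMPTY>
<!-- "common_law" rather than "common law": each enumerated value must be a single token -->
<!ATTLIST person status (single|married|common_law) #IMPLIED>
]>
<person status="common_law"/>
```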

Putting it together (program_v1.xml)

Please open up program_v1.xml and look at it as we go through these slides…

Begin with a doctype declaration: <!DOCTYPE root_element [elements]>

<!DOCTYPE program [
…(other elements to follow)
]>

DTD for program_v1.xml

Now list the next element (don’t forget the root itself!) and any subelements it may contain: <!ELEMENT elem (subelem, subelem…)>

<!DOCTYPE program [
<!ELEMENT program (title, semester)>
]>

Add recurrences (how many times the subelements appear):

<!DOCTYPE program [
<!ELEMENT program (title, semester+)>
]>

To move on, state the contents of the elements in the order you listed them. The contents are one of three options:
• Text
• Another element (the cycle continues)
• Both (a “mixed element”)

Updated:

<!DOCTYPE program [
<!ELEMENT program (title, semester+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT semester (course*)>
]>

Now keep cycling through until the elements are all accounted for. Updated:

<!DOCTYPE program [
<!ELEMENT program (title, semester+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT semester (course*)>
<!ELEMENT course (#PCDATA)>
]>

Then finish up with attributes. Updated:

<!DOCTYPE program [
<!ELEMENT program (title, semester+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT semester (course*)>
<!ELEMENT course (#PCDATA)>
<!ATTLIST semester number CDATA #REQUIRED>
]>
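For reference, here is a sketch of a complete file that would validate against the finished DTD. The contents of the real program_v1.xml are not reproduced in these slides, so the title, course names and semester numbers below are placeholder assumptions:

```xml
<?xml version="1.0"?>
<!DOCTYPE program [
<!ELEMENT program (title, semester+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT semester (course*)>
<!ELEMENT course (#PCDATA)>
<!ATTLIST semester number CDATA #REQUIRED>
]>
<!-- Hypothetical content: placeholder values only -->
<program>
  <title>Example Program</title>
  <semester number="1">
    <course>Example Course A</course>
    <course>Example Course B</course>
  </semester>
  <!-- course* allows a semester with no courses yet -->
  <semester number="2"></semester>
</program>
```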

To sum up:

Doctypes are (basically) written as follows:
1. Wrap everything in the <!DOCTYPE…
2. Then list the elements with recurrences
3. List what’s inside those elements
   • Text
   • More elements?
4. Repeat steps 2 and 3 until all elements are accounted for
5. Declare attributes

Attaching a DTD to an XML file

Internal
Just place the whole DTD right after your <?xml version="1.0"?> declaration

External
• Remove the <!DOCTYPE root_elem [ and closing ]> wrapper, keeping only the declarations inside it
• Save your DTD as a .dtd file
• Add the following to your .xml file:

<!DOCTYPE root_elem SYSTEM "location-of-dtd-file">
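A sketch of the external form, reusing the program DTD from the earlier slides. The file name program.dtd and the document content are assumptions for illustration. First, the DTD file, which holds only the declarations:

```xml
<!-- program.dtd (hypothetical file name): declarations only, no DOCTYPE wrapper -->
<!ELEMENT program (title, semester+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT semester (course*)>
<!ELEMENT course (#PCDATA)>
<!ATTLIST semester number CDATA #REQUIRED>
```

Then the XML file, which points at it with SYSTEM (here a relative path, so both files are assumed to sit in the same folder):

```xml
<?xml version="1.0"?>
<!DOCTYPE program SYSTEM "program.dtd">
<program>
  <title>Example Program</title>
  <semester number="1">
    <course>Example Course A</course>
  </semester>
</program>
```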

Other notes

CAPITALIZATION
Note that <!ELEMENT and <!ATTLIST must be in CAPS

Indenting
The convention is to NOT indent your DTD items

CDATA
Character Data: the XML engine will NOT parse the value, because it is just text (no markup), so it simply spits it out to screen.

#PCDATA
Parsed Character Data: the XML engine will try to interpret the contents before spitting them out, to check if there’s actually another node <elem></elem> or special characters.

What you need to know: CDATA for attributes, #PCDATA for elements with just text

Mixed Data
<elem>
  Some text
  <subelem>More text</subelem>
</elem>

If you do this, let me know so I can come over and yell at you. But if you MUST, then you’d declare it like so:

<!ELEMENT elem ANY>
… or …
<!ELEMENT elem (#PCDATA|subelem)*>
<!ELEMENT subelem (#PCDATA)>

XML docs are made to follow DTDs, NOT the other way around:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE band [
<!ELEMENT band (tour)>
<!ELEMENT tour EMPTY>
]>
<band>
  <tour></tour>
</band>

Q: Why is this technically correct, but conceptually wrong?
