76
XML Schema XML Schema Part I Part I Introduction Introduction

XML Schema Part I Introduction XML Schema XML itself does not restrict what elements existing in a document. In a given application, you want to fix

Embed Size (px)

Citation preview

XML SchemaXML Schema

Part IPart I

IntroductionIntroduction

XML Schema XML itself does not restrict what elements

existing in a document.

In a given application, you want to fix a vocabulary -- what elements make sense, what their types are, etc.

Use a Schema to define an XML dialect MusicXML, ChemXML, VoiceXML, ADXML, etc.

Restrict documents to those tags.

Schema can be used to validate a document -- ie to see if it obeys the rules of the dialect.

Schema determine … What sort of elements can appear in the

document.

What elements MUST appear

Which elements can appear as part of another element

What attributes can appear or must appear

What kind of values can/must be in an attribute.

<?xml version="1.0" encoding="UTF-8"?><library> <book id="b0836217462" available="true"> <isbn> 0836217462 </isbn> <title lang="en"> Being a Dog is a Full-Time Job </title> <author id="CMS"> <name> Charles Schulz </name> <born> 1922-11-26 </born> <dead> 2000-02-12 </dead> </author> <character id="PP"> <name> Peppermint Patty </name> <born> 1966-08-22 </born> <qualification> bold,brash, and tomboyish </qualification> </character> <character id="Snoopy"> <name> Snoopy</name> <born>1950-10-04</born> <qualification>extroverted beagle</qualification> </character> <character id="Schroeder"> <name>Schroeder</name> <born>1951-05-30</born> <qualification>brought classical music to the Peanuts Strip</qualification> </character> <character id="Lucy"> <name>Lucy</name> <born>1952-03-03</born> <qualification>bossy, crabby, and selfish</qualification> </character> </book></library>

• We start with sample XML document and reverse engineer a schema as a simple example

First identify the elements:author, book, born, character,dead, isbn, library, name,qualification, title

Next categorize by contentmodelEmpty: contains nothingSimple: only text nodesComplex: only sub-elementsMixed: text nodes + sub-elements

Note: content model independentof comments, attributes, or processing instructions!

Content modelsContent models

Simple content model: name, born, title, dead, isbn, qualification

Complex content model: libarary, character, book, author

Content TypesContent Types

We further distinguish between complex and simple content Types: Simple Type: An element with only text

nodes and no child elements or attributes Complex Type: All other cases

We also say (and require) that all attributes themselves have simple type

Content TypesContent Types

Simple content type: name, born, dead, isbn, qualification

Complex content type: library, character, book, author, title

Building the schemaBuilding the schema

Schema are XML documentsSchema are XML documents

They must contain a schema root element as suchThey must contain a schema root element as such <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"

targetNamespace="http://www.w3schools.com" xmlns="http://www.w3schools.com" elementFormDefault="qualified">

... ... </xs:schema>

We will discuss details in a bit -- note for now that yellow part can be excluded for now.

Flat schema for libraryFlat schema for library

Start by defining all of the simple types (including attributes):

<xs:schema xmlns:xs=http://www.w3.org/2001/XMLSchema> <xs:element name=“name” type=“xs:string”/> <xs:element name=“qualification” type=“xs:string”/> <xs:element name=“born” type=“xs:date”/> <xs:element name=“dead” type=“xs:date”/> <xs:element name=“isbn” type=“xs:string”/> <xs:attribute name=“id” type=“xs:ID”/> <xs:attribute name=“available” type=“xs:boolean”/> <xs:attribute name=“lang” type=“xs:language/> …/…</xs:schema>

Complex types with simple content

Now to complex types:

<title lang=“en”> Being a Dog is …</title>

<xs:element name=“title”> <xs:complexType> <xs:simpleContent> <xs:extension base=“xs:string”> <xs:attribute ref=“lang”/> </xs:extension> </xs:simpleContent> </xs:complexType></xs:element>

“the element named title has a complextype which is a simple content obtainedby extending the predefined datatypexs:string by adding the attribute definedin this schema and having the name lang.”

Complex TypesComplex TypesAll other types are complex types with complex content. For example:

<xs:element name=“library”> <xs:complexType> <xs:sequence> <xs:element ref=“book” maxOccurs=“unbounded”/> </xs:sequence> </xs:complexType></xs:element>

<xs:element name=“author”> <xs:complexType> <xs:sequence> <xs:element ref=“name”/> <xs:element ref=“born”/> <xs:element ref=“dead” minOccurs=0/> </xs:sequence> <xs:attribute ref=“id”/> </xs:complexType></xs:element>

<?xml version="1.0" encoding="UTF-8"?><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="name" type="xs:string"/> <xs:element name="qualification" type="xs:string"/> <xs:element name="born" type="xs:date"> </xs:element> <xs:element name="dead" type="xs:date"> </xs:element> <xs:element name="isbn" type="xs:string"> </xs:element> <xs:attribute name="id" type="xs:ID"> </xs:attribute> <xs:attribute name="available" type="xs:boolean"> </xs:attribute> <xs:attribute name="lang" type="xs:language"> </xs:attribute> <xs:element name="title"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute ref="lang"> </xs:attribute> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name="library"> <xs:complexType> <xs:sequence> <xs:element maxOccurs="unbounded" ref="book"> </xs:element> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="author"> <xs:complexType> <xs:sequence> <xs:element ref="name"> </xs:element> <xs:element ref="born"> </xs:element> <xs:element ref="dead" minOccurs="0"> </xs:element> </xs:sequence> <xs:attribute ref="id"> </xs:attribute> </xs:complexType> </xs:element>

<xs:element name="book"> <xs:complexType> <xs:sequence> <xs:element ref="isbn"> </xs:element> <xs:element ref="title"> </xs:element> <xs:element ref="author" minOccurs="0" maxOccurs="unbounded”/> <xs:element ref="character" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> <xs:attribute ref="available"> </xs:attribute> <xs:attribute ref="id"> </xs:attribute> </xs:complexType> </xs:element> <xs:element name="character"> <xs:complexType> <xs:sequence> <xs:element ref="name"/> <xs:element ref="born"/> <xs:element ref="qualification"/> </xs:sequence> <xs:attribute ref="id"> </xs:attribute> </xs:complexType> </xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="library"> <xs:complexType> <xs:sequence> <xs:element name="book" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:element name="isbn" type="xs:integer"> </xs:element> <xs:element name="title"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute name="lang" type="xs:language" > </xs:attribute> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name="author" minOccurs="0" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:element name="name" type="xs:string"> </xs:element> <xs:element name="born" type="xs:date"> </xs:element> <xs:element name="dead" type="xs:date"> </xs:element> </xs:sequence> <xs:attribute name="id" type="xs:ID"> </xs:attribute> </xs:complexType> </xs:element> <xs:element name="character" minOccurs="0" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:element name="name" type="xs:string"> </xs:element> <xs:element name="born" type="xs:date"> </xs:element> <xs:element name="qualification" type="xs:string" > </xs:element> </xs:sequence> <xs:attribute name="id" type="xs:ID"> </xs:attribute> </xs:complexType> </xs:element> </xs:sequence> <xs:attribute type="xs:ID" name="id"> </xs:attribute> <xs:attribute name="available" type="xs:boolean"> </xs:attribute> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element></xs:schema>

Same schema but with everythingdefined locally!

Dissecting SchemaDissecting Schema

What’s in a Schema? A Schema is an XML document (a DTD is not)

Because it is an XML document, it must have a root element The root element is <schema>

Within the root element, there can be Any number and combination of

Inclusions Imports Re-definitions Annotations

Followed by any number and combinations of Simple and complex data type definitions Element and attribute definitions Model group definitions Annotations

Structure of a SchemaStructure of a Schema

<schema>

<!– any number of the following -->

<include .../>

<import> ... </import>

<redefine> ... </redefine>

<annotation> ... </annotation>

<!– any number of following definitions -->

<simpleType> ... </simpleType>

<complexType> ... </complexType>

<element> ... </element>

<attribute/>

<attributeGroup> ... </attributeGroup>

<group> ... </group>

<annotation> ... </annotation>

</schema>

Simple TypesSimple Types

ElementsElements

What is an element with simple type? A simple element is an XML element that can

contain only text. It cannot contain any other elements or attributes.

Can also add restrictions (facets) to a data type in order to limit its content, and you can require the data to match a defined pattern.

Example Simple ElementExample Simple Element

The syntax for defining a simple element is: <xs:element name="xxx" type="yyy"/>

where xxx is the name of the element and yyy is the data type of the element. Here are some XML elements:

<lastname>Refsnes</lastname> <age>37</age> <dateborn>1968-03-27</dateborn>

And here are the corresponding simple element definitions: <xs:element name="lastname" type="xs:string"/> <xs:element name="age" type="xs:integer"/> <xs:element name="dateborn" type="xs:date"/>

Common XML Schema Data Types

XML Schema has a lot of built-in data types. XML Schema has a lot of built-in data types. Here is a list of the most common types:Here is a list of the most common types: xs:string xs:string xs:decimal xs:decimal xs:integer xs:integer xs:boolean xs:boolean xs:date xs:date xs:time xs:time

Declare Default and Fixed Values for Simple Elements

Simple elements can have a default value OR a fixed value set.

A default value is automatically assigned to the element when no other value is specified. In the following example the default value is "red":

<xs:element name="color" type="xs:string" default="red"/>

A fixed value is also automatically assigned to the element. You cannot specify another value. In the following example the fixed value is "red":

<xs:element name="color" type="xs:string" fixed="red"/>

AttributesAttributes(Another simple type)(Another simple type)

All attributes are declared as simple types.

Only complex elements can have attributes!

What is an Attribute?

Simple elements cannot have Simple elements cannot have attributes. attributes. If an element has attributes, it is If an element has attributes, it is

considered to be of complex type. considered to be of complex type. But the attribute itself is always declared But the attribute itself is always declared

as a simple typeas a simple type. . This means that an element with This means that an element with

attributes always has a complex type attributes always has a complex type definition.definition.

How to Define an Attribute

The syntax for defining an attribute is: The syntax for defining an attribute is: <xs:attribute name="xxx" type="yyy"/><xs:attribute name="xxx" type="yyy"/>

where xxx is the name of the attribute and yyy where xxx is the name of the attribute and yyy is the data type of the attribute. Here are an is the data type of the attribute. Here are an XML element with an attribute:XML element with an attribute:

<lastname lang="EN">Smith</lastname><lastname lang="EN">Smith</lastname>

And here are a corresponding simple attribute And here are a corresponding simple attribute definition:definition:

<xs:attribute name="lang" type="xs:string"/><xs:attribute name="lang" type="xs:string"/>

Declare Default and Fixed Values for Attributes

Attributes can have a default value OR a fixed value Attributes can have a default value OR a fixed value specified.specified.

A default value is automatically assigned to the attribute A default value is automatically assigned to the attribute when no other value is specified. In the following example when no other value is specified. In the following example the default value is "EN":the default value is "EN":

<xs:attribute name="lang" type="xs:string" default="EN"/><xs:attribute name="lang" type="xs:string" default="EN"/> A fixed value is also automatically assigned to the attribute. A fixed value is also automatically assigned to the attribute.

You cannot specify another value. In the following example You cannot specify another value. In the following example the fixed value is "EN":the fixed value is "EN":

<xs:attribute name="lang" type="xs:string" fixed="EN"/><xs:attribute name="lang" type="xs:string" fixed="EN"/>

Creating Optional and Required Attributes

All attributes are optional by default. To explicitly specify that the attribute is optional, use the "use" attribute:

<xs:attribute name="lang" type="xs:string" use="optional"/>

To make an attribute required:

<xs:attribute name="lang" type="xs:string" use="required"/>

Restrictions

As we will see later, simple types can have ranges put on their values

These are known as restrictions

Complex TypesComplex Types

Complex Elements

A complex element is an XML element that contains other elements and/or attributes.

There are four kinds of complex elements: empty elements elements that contain only other elements elements that contain only text elements that contain both other elements and text

Note: Each of these elements may (or must) contain attributes as well!

Examples of Complex XML Elements

A complex XML element, "product", which is empty: <product pid="1345"/>

A complex XML element, "employee", which contains only other elements: <employee> <firstname>John</firstname> <lastname>Smith</lastname> </employee>

A complex XML element, "food", which contains only text: <food type="dessert">Ice cream</food>

A complex XML element, "description", which contains both elements and text: <description> It happened on <date

lang="norwegian">03.03.99</date>..</description>

An Example XML SchemaAn Example XML Schema

<?xml version="1.0" encoding="UTF-8"?><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified"> <xs:element name="rooms"> <xs:complexType> <xs:sequence> <xs:element name="room" minOccurs="0" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:element name="capacity" type="xs:decimal"/> <xs:element name="equiptmentList"/> <xs:element name="features" minOccurs="0"> <xs:complexType> <xs:sequence> <xs:element name="feature" type="xs:string" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> <xs:attribute name="name" type="xs:string" use="required"/> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element></xs:schema>

Referencing XML Schema Referencing XML Schema in XML documentsin XML documents

Sample Schema header

The <schema> element may contain some attributes. A schema declaration often looks something like this: <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"

targetNamespace="http://www.w3schools.com" xmlns="http://www.w3schools.com" elementFormDefault="qualified">

... ... </xs:schema>

Schema headers, cont.

The following fragment:

xmlns:xs=http://www.w3.org/2001/XMLSchema

indicates that the elements and data types used in the schema (schema, element, complexType, sequence, string, boolean, etc.) come from the "http://www.w3.org/2001/XMLSchema" namespace.

It also specifies that the elements and data types that come from the "http://www.w3.org/2001/XMLSchema" namespace should be prefixed with xs: !!

Schema header, cont.

This fragment: targetNamespace=http://www.w3schools.com indicates that the elements defined by this schema (note, to,

from, heading, body.) come from the "http://www.w3schools.com" namespace.

This fragment: xmlns=http://www.w3schools.com indicates that the default namespace is

"http://www.w3schools.com".

This fragment: elementFormDefault="qualified“ indicates that any elements used by the XML instance document

which were declared in this schema must be namespace qualified.

Referencing schema in XML

This XML document has a reference to an XML Schema:

<?xml version="1.0"?> <note xmlns="http://www.w3schools.com"

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd">

<to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

Referencing schema in xml, cont.

The following fragment: xmlns=http://www.w3schools.com specifies the default namespace

declaration.

This declaration tells the schema-validator that all the elements used in this XML document are declared in the "http://www.w3schools.com" namespace.

……

Once you have the XML Schema Instance namespace available:

xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance

you can use the schemaLocation attribute. This attribute has two values. The first value is the namespace to use. The second value is the location of the XML schema to use for that namespace:

xsi:schemaLocation="http://www.w3schools.com note.xsd"

Using ReferencesUsing References

Using References You don't have to have the content of an element

defined in the nested fashion as just shown <xs:element name="rooms"> <xs:complexType> <xs:sequence> <xs:element name="room"> <xs:complexType>

<xs:sequence> <xs:element name="capacity" type="xs:decimal"/>

You can define the element globally and use a reference to it instead<xs:element name="rooms"> <xs:complexType> <xs:sequence>

<xs:element ref="room"/></xs:sequence></xs:complexType></xs:element>

<xs:element name="room"> …</xs:element>

Rooms Schema using References

<?xml version="1.0" encoding="UTF-8"?><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified"> <xs:element name="rooms"> <xs:complexType> <xs:sequence> <xs:element ref="room" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType></xs:element>

<xs:element name="room"> <xs:complexType>

<xs:sequence><xs:element name="capacity" type="xs:decimal"/><xs:element name="equiptmentList"/><xs:element ref="features" minOccurs="0" maxOccurs="1"/>

</xs:sequence> <xs:attribute name="name" type="xs:string" use="required"/>

</xs:complexType></xs:element>

<xs:element name="features"> <xs:complexType>

<xs:sequence><xs:element name="feature" type="xs:string"

maxOccurs="unbounded"/> </xs:sequence>

</xs:complexType></xs:element>

</xs:schema>

Types

Both elements and attributes have types, which are defined in the Schema.

One can reuse types by giving them names.

<xsd:element name="Robot"><xsd:complexType>

<xsd:sequence><xsd:element ref="Sensor_List" minOccurs="0"/><xsd:element ref="Specification_List" minOccurs="0"/><xsd:element ref="Note" minOccurs="0"/>

</xsd:sequence></xsd:complexType>

</xsd:element>

<xsd:element name="Robot” type=“RoboType”><xsd:complexType name="RoboType” >

<xsd:sequence><xsd:element ref="Sensor_List" minOccurs="0"/><xsd:element ref="Specification_List" minOccurs="0"/><xsd:element ref="Note" minOccurs="0"/>

</xsd:sequence></xsd:complexType></xsd:element>

OR

Other XML Schema FeaturesOther XML Schema Features

Foreign key facility (uses Xpath)

Rich datatype facility Build up datatypes by inheritance Don’t need to list all of the attributes

(can say "these attributes plus others"). Restrict strings using regular

expressions

Namespace aware. Can restrict location of an element

based on a namespaces

RestrictionsRestrictions

Datatype Restrictions A DTD can only say that price can be any non-markup text.

Like this translated to Schemas

<xsd:element name="zip" type="xsd:string"/>

But in Schema you can do better:

<xsd:element name="zip" type="xsd:decimal"/>

Or even, make your own restrictions

<xsd:simpleType name="ZipPlus4"> <xsd:restriction base="xsd:string"> <xsd:length value="10"/> <xsd:pattern value="\d{5}-\d{4}"/> </xsd:restriction></xsd:simpleType><xsd:element name="zip" type="ZipPlus4">

Restriction Ranges The restrictions must be "derived" from a base type, so it's object

based

<xs:element name="LifeUniverseAndEverything"><xs:simpleType>

<xs:restriction base="xs:integer"><xs:minInclusive value="42"/><xs:maxInclusive value="42"/>

</xs:restriction></xs:simpleType>

</xs:element> Preceding "derived" from "integer" Has 2 restrictions (called "facets")

The first says that it must be greater than 41 The second says that it must be less than 43

XML file is "42"<LifeUniverseAndEverything

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="LifeUniverseEverything.xsd">42</LifeUniverseAndEverything>

FacetFacet Description Description

enumeratioenumeration n

Defines a list of acceptable valuesDefines a list of acceptable values

fractionDigifractionDigitsts

The maximum number of decimal places allowed. The maximum number of decimal places allowed. >=0>=0

lengthlength The exact number of characters or list items allowed. The exact number of characters or list items allowed. >=0 >=0

maxExclusimaxExclusiveve

The upper bounds for numeric values (the value must The upper bounds for numeric values (the value must be less than the value specified)be less than the value specified)

maxInclusimaxInclusive ve

The upper bounds for numeric values (the value must The upper bounds for numeric values (the value must be less than or equal to the value specified) be less than or equal to the value specified)

maxLengthmaxLength The maximum number of characters or list items The maximum number of characters or list items allowed. >=0allowed. >=0

minExclusiminExclusive ve

The lower bounds for numeric values (the value must The lower bounds for numeric values (the value must be greater than the value specified) be greater than the value specified)

minInclusivminInclusive e

The lower bounds for numeric values (the value must The lower bounds for numeric values (the value must be greater than or equal to the value specified)be greater than or equal to the value specified)

minLength minLength The minimum number of characters or list items The minimum number of characters or list items allowed >=0allowed >=0

pattern pattern The sequence of acceptable characters based on a The sequence of acceptable characters based on a regular expressionregular expression

totalDigits totalDigits The exact number of digits allowed. >0The exact number of digits allowed. >0

whiteSpace whiteSpace Specifies how white space (line feeds, tabs, spaces, Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handledand carriage returns) is handled

Enumeration FacetEnumeration Facet

<xs:element name="FavoriteColor"><xs:element name="FavoriteColor">

<xs:simpleType><xs:simpleType>

<xs:restriction base="xs:string"><xs:restriction base="xs:string">

<xs:enumeration value="red"/><xs:enumeration value="red"/>

<xs:enumeration value="no blue"/><xs:enumeration value="no blue"/>

<xs:enumeration <xs:enumeration value="aarrrrggghh!!"/>value="aarrrrggghh!!"/>

</xs:restriction></xs:restriction>

</xs:simpleType></xs:simpleType>

</xs:element></xs:element>

Patterns (Regular Expressions)Patterns (Regular Expressions) One interesting facet is the pattern, which One interesting facet is the pattern, which

allows restrictions based on a regular allows restrictions based on a regular expressionexpression

This regular expression specifies a normal This regular expression specifies a normal word of one or more characters:word of one or more characters:

<xs:element name="Word"><xs:element name="Word"><xs:simpleType name="WordType"><xs:simpleType name="WordType">

<xs:restriction base="xs:string"><xs:restriction base="xs:string"><xs:pattern value="[a-zA-Z]+"/><xs:pattern value="[a-zA-Z]+"/>

</xs:restriction></xs:restriction></xs:simpleType></xs:simpleType>

</xs:element></xs:element>

Patterns (Regular Expressions)Patterns (Regular Expressions) Individual characters may be repeated a Individual characters may be repeated a

specific number of times in the regular specific number of times in the regular expression.expression.

The following regular expression restricts the The following regular expression restricts the string to exactly 8 alpha-numeric characters:string to exactly 8 alpha-numeric characters:

<xs:element name="password"><xs:element name="password"><xs:simpleType><xs:simpleType>

<xs:restriction base="xs:string"><xs:restriction base="xs:string"><xs:pattern value="[a-zA-Z0-9]{8}"/><xs:pattern value="[a-zA-Z0-9]{8}"/>

</xs:restriction></xs:restriction></xs:simpleType></xs:simpleType>

</xs:element></xs:element>

Whitespace facetWhitespace facet

The "whitespace" facet controls how white space in the element will be processed

There are three possible values to the whitespace facet "preserve" causes the processor to keep all whitespace as-is "replace" causes the processor to replace all whitespace

characters (tabs, carriage returns, line feeds, spaces) with space characters

"collapse" causes the processor to replace all strings of whitespace characters (tabs, carriage returns, line feeds, spaces) with a single space character

<xs:simpleType><xs:restriction base="xs:string">

<xs:whitespace value="replace"/></xs:restriction>

</xs:simpleType>

Both elements and attributes have types, which are defined in the Schema.

One can reuse types by giving them names.

Addr.xsd:<xsd:element name="Address">

<xsd:complexType><xsd:sequence>

<xsd:element name="Street" type="xsd:string"/><xsd:element name="Apartment" type="xsd:string"/><xsd:element name="Zip" type="xsd:string"/>

</xsd:sequence></xsd:complexType>

</xsd:element>

<xsd:complexType name="AddrType"><xsd:sequence>

<xsd:element name="Street" type="xsd:string"/><xsd:element name="Apartment" type="xsd:string"/><xsd:element name="Zip" type="xsd:string"/>

</xsd:sequence></xsd:complexType><xsd:element name=“ShipAddress" type="AddrType"/><xsd:element name=“BillAddress" type="AddrType"/>

TypesTypes

OR

TypesTypes The usage in the XML file is identical:

<?xml version="1.0" encoding="UTF-8"?><PurchaseOrder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Address-WithTypeName.xsd">

<BillAddress> <Street>1108 E. 58th St.</Street>

<Apartment>Ryerson 155</Apartment> <Zip>60637</Zip>

</BillAddress> <ShipAddress>

<Street>1108 E. 58th St.</Street> <Apartment>Ryerson 155</Apartment> <Zip>60637</Zip>

</ShipAddress></PurchaseOrder>

Type ExtensionsType Extensions

A third way of creating a complex type is to extend A third way of creating a complex type is to extend another complex type (like OO inheritance)another complex type (like OO inheritance)

<xs:element name="Employee" type="PersonInfoType"/><xs:element name="Employee" type="PersonInfoType"/><xs:complexType name="PersonNameType"><xs:complexType name="PersonNameType">

<xs:sequence><xs:sequence><xs:element name="FirstName" type="xs:string"/><xs:element name="FirstName" type="xs:string"/><xs:element name="LastName" type="xs:string"/><xs:element name="LastName" type="xs:string"/>

</xs:sequence></xs:sequence></xs:complexType></xs:complexType><xs:complexType name="PersonInfoType"><xs:complexType name="PersonInfoType">

<xs:complexContent><xs:complexContent><xs:extension base="PersonNameType"><xs:extension base="PersonNameType">

<xs:sequence><xs:sequence><xs:element name="Address" type="xs:string"/><xs:element name="Address" type="xs:string"/><xs:element name="City" type="xs:string"/><xs:element name="City" type="xs:string"/><xs:element name="Country" type="xs:string"/><xs:element name="Country" type="xs:string"/>

</xs:sequence></xs:sequence></xs:extension></xs:extension>

</xs:complexContent></xs:complexContent></xs:complexType></xs:complexType>

Type Extensions (use)Type Extensions (use)

To use a type that is an extension of another, it is as though it were all defined in a single type

<Employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="TypeExtension.xsd">

<FirstName>King</FirstName>

<LastName>Arthur</LastName>

<Address>Round Table</Address>

<City>Camelot</City>

<Country>England</Country>

</Employee>

Simple Content in Complex TypeSimple Content in Complex Type If a type contains only simple content (text and

attributes), a <simpleContent> element can be put inside the <complexType>

<simpleContent> must have either a <extension> or a <restriction>

This example is from the (Bridge of Death) Episode Dialog:

<xs:element name="dialog"><xs:complexType> <xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute name="speaker"

type="xs:string" use="required"/> </xs:extension>

</xs:simpleContent></xs:complexType>

</xs:element>

Model GroupsModel Groups

Model Groups are used to define an element that has mixed content (elements and text mixed) element content

Model Groups can be all

the elements specified must all be there, but in any order

choice any of the elements specified may or may not be there

sequence all of the elements specified must appear in the specified

order

"All" Model Group"All" Model Group

The following schema specifies 3 elements and mixed content<xs:element name="BookCover">

<xs:complexType mixed="true"><xs:all minOccurs="0" maxOccurs="1">

<xs:element name="BookTitle" type="xs:string"/><xs:element name="Author" type="xs:string"/><xs:element name="Publisher" type="xs:string"/>

</xs:all></xs:complexType>

</xs:element>

The following XML file is valid in the above schema<BookCover xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="AllModelGroup.xsd">

Title: <BookTitle>The Holy Grail</BookTitle>Published: <Publisher>Moose</Publisher>Author: <Author>Monty Python</Author>

</BookCover>

AttributesAttributes<xs:element name="dialog">

<xs:complexType><xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute name="speaker"

type="xs:string" use="required"/>

</xs:extension></xs:simpleContent>

</xs:complexType></xs:element>

The attribute declarationis part of the type ofthe element.

AttributesAttributes<xsd:element name="cartoon">

<xsd:complexType><xsd:sequence>

<xsd:element ref="character" minOccurs="0" maxOccurs="unbounded"/>

</xsd:sequence><xsd:attribute name="name" type="xsd:string"

use="required"/><xsd:attribute name="genre" type="xsd:string"

use="required"/><xsd:attribute name="syndicated" use="required">

<xsd:simpleType><xsd:restriction base="xsd:NMTOKEN">

<xsd:enumeration value="yes"/><xsd:enumeration value="no"/>

</xsd:restriction></xsd:simpleType>

</xsd:attribute></xsd:complexType>

</xsd:element>

If an attribute type is more complicated than a basic type, then we spell out the type in a type declaration.

Optional and Required AttributesOptional and Required Attributes

All attributes are optional by default. To explicitly specify that the attribute is optional, use the "use" attribute:

<xs:attribute name="speaker" type="xs:string" use="optional"/>

To make an attribute required:

<xs:attribute name="speaker" type="xs:string" use="required"/>

Other XML Schema FeaturesOther XML Schema Features

Foreign key facility (uses Xpath)

Rich datatype facility Build up datatypes by inheritance Don’t need to list all of the attributes (can

say “these attributes plus others). Restrict strings using regular expressions

Namespace aware. Can restrict location of an element based on

a namespaces

XML Schema Status XML Schema Status

Became a W3C recommendation Spring 2001

World domination expected imminently. Supported in Xalan. Supported in XML spy and other

editor/validators.

On the other hand: More complex than DTDs. Ultra verbose.

Validating a SchemaValidating a Schema

By using Xeena or XMLspy or XML Notepad. When publishing hand-written XML

docs, this is the way to go.

By using a Java program that performs validation. When validating on-the-fly, must do it

this way

Some guidelines for Some guidelines for Schema designSchema design

Designing a SchemaDesigning a Schema Analogous to database schema design --- look

for intuitive names

Can start with an E-R diagram, and then convert Attributes to Attributes Subobjects to Subelements Relationships to IDREFS

Normalization? Still makes sense to avoid repetition whenever possible– If you have an Enrolment document, only list Ids of

students, not their names. Store names in a separate document Leave it to tools to connect them

Designing a Schema (cont.)Designing a Schema (cont.)

Difficulties: Many more degrees of freedom than with database schemas: e.g. one can associate information with something by

including it as an attribute or a subelement.

<ADDRESS NAME=“Martin Sheen”, Street=“1222 Alameda Drive” ,City=“Carmel”, State=“CA”, ZIP=“40145”>

<ADDRESS><NAME> Martin Sheen </NAME>…<ZIP> 4145 </ZIP>

</ADDRESS>

ELEMENTS are more extensible – use when there is a possibility that more substructure will be added.

ATTRIBUTES are easier to search on.

““Rules” for Designing a SchemaRules” for Designing a Schema

Never leave structure out. The following is definitely a bad idea: <ADDRESS> Martin Sheen 1222 Alameda Drive, Carmel, CA 40145 </ADDRESS>

Better would be: <ADDRESS firstName=“Martin” lastname=“Sheen” streenNum=“1222” streenName=“Alameda Drive” city=“Carmel” state=“CA” zip=“40145” />

Or:<ADDRESS>

<name><first>Martin</first><last>Sheen</last>

</name><street>

<num>1222</num><name>Alameda Drive</name></street><city>Carmel</city><state>CA</state><zip>40145</zip>

</ADDRESS>

More“Rules” for Designing a SchemaMore“Rules” for Designing a Schema When to use Elements (instead of attributes)

Do not put large text blocks inside an attribute (Bad Idea) <book type=“memoir” content=“Bravely bold Sir Robin rode forth from Camelot.

He was not afraid to die, O brave Sir Robin.He was not at all afraid to be killed in nasty ways,Brave, brave, brave, brave Sir Robin!

He was not in the least bit scared to be mashed into a pulp,

Or to have his eyes gouged out and his elbows broken,To have his kneecaps split and his body burned awayAnd his limbs all hacked and mangled, brave Sir Robin!

His head smashed in and his heart cut out

And his liver removed and his bowels unplugged…”> Elements are more flexible, so use an Element if you

think you might have to add more substructure later on.

More “Rules” for Designing SchemasMore “Rules” for Designing Schemas

More on when to use Elements (instead of Attributes) Use an embedded element when the information

you are recording is a constituent part of the parent element

one's head and one's height are both inherent to a human being,

you can't be a conventionally structured human being without having a head and having a height

One's head is a constituent part and one's height isn't -- you can cut off my head, but not my height

use embedded elements for complex structure validation (obvious)

use embedded elements when you need to show order (attributes are not ordered)

More “Rules” for Designing SchemasMore “Rules” for Designing Schemas

When to use Attributes instead of Elements use an attribute when the information is inherent

to the parent but not a constituent part (height instead of head)

use attributes to stress the one-to-one relationship among pieces of information

to stress that the element represents a tuple of information

dangerous rule, though Leads to the extreme formulation that a <chapter> element

can have a TITLE= attribute And then to the conclusion that it really ought to have a CONTENT= attribute too

Then you find yourself writing the entire document as an empty element with an attribute value as long as the Quest for the Holy Grail

use attributes for simple datatype validation (obviously)

BookingsBookings XML Schema XML Schema

<?xml version="1.0" encoding="UTF-8"?><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified"> <xs:element name="bookings"> <xs:complexType> <xs:sequence>

<xs:element ref="lastUpdated" maxOccurs="1" minOccurs="0"/><xs:element ref="meetingDate" maxOccurs="unbounded"/>

</xs:sequence> </xs:complexType> </xs:element> <xs:element name="year" type="xs:integer"/> <xs:element name="month" type="xs:string"/> <xs:element name="day" type="xs:integer"/>

Note that there are four global types in this document!

BookingsBookings, cont., cont.

<xs:element name="meetingDate"> <xs:complexType> <xs:sequence>

<xs:element ref="year"/><xs:element ref="month"/><xs:element ref="day"/><xs:element ref="meeting" maxOccurs="unbounded" minOccurs="0"/>

</xs:sequence> </xs:complexType> </xs:element>

<xs:element name="lastUpdated"> <xs:complexType> <xs:attribute name="date" type="xs:string"/> <xs:attribute name="time" type="xs:string"/> </xs:complexType></xs:element>

BookingsBookings, cont., cont.

<xs:element name="meeting"> <xs:complexType> <xs:sequence> <xs:element name="meetingName" maxOccurs="1" minOccurs="1" type="xs:string"/>

<xs:element name="roomName" maxOccurs="1" minOccurs="1" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element></xs:schema>

An Example Bookings DocumentAn Example Bookings Document

• Reverse engineer a reasonable schema for the following sample xml file

<?xml version="1.0" encoding="UTF-8"?><bookings xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../schemas/Bookings.xsd">

<meetingDate><year>2003</year><month>April</month><day>1</day><meeting>

<meetingName>Democratic Party</meetingName><roomName>Green Room</roomName>

</meeting><meeting>

<meetingName>Republican Party</meetingName><roomName>Red Room</roomName>

</meeting></meetingDate>

</bookings>