65
XML Schemas Microsoft XML Schemas W3C XML Schemas

XML Schemas Microsoft XML Schemas W3C XML Schemas

  • View
    228

  • Download
    7

Embed Size (px)

Citation preview

Page 1: XML Schemas Microsoft XML Schemas W3C XML Schemas

XML Schemas

Microsoft XML SchemasW3C XML Schemas

Page 2: XML Schemas Microsoft XML Schemas W3C XML Schemas

Objectives of Schemas

To understand what a schema is. To understand the basic differences

between DTDs and schema. To be able to create Microsoft XML

Schema. To become familiar with both Microsoft

XML Schema and W3C XML Schema. To use schema to describe elements and

attributes. To use schema data types.

Page 3: XML Schemas Microsoft XML Schemas W3C XML Schemas

In Chapter 5 (XML Step by Step), we studied Document Type Definitions (DTDs). These describe an XML document's structure. DTDs are inherited from SGML. Many developers in the XML community feel DTDs are not flexible enough to meet today's programming needs.

Page 4: XML Schemas Microsoft XML Schemas W3C XML Schemas

For example, DTDs cannot be manipulated (e.g., searched, transformed into different representation such as HTML, etc.) in the same manner as XML documents can because DTDs are not XML documents.

Page 5: XML Schemas Microsoft XML Schemas W3C XML Schemas

From DTDs to Schemas

We introduce an alternative to DTDs—called schemas—for validating XML documents. Like DTDs, schemas must be used with validating parsers. Schemas are expected to replace DTDs as the primary means of describing document structure.

Page 6: XML Schemas Microsoft XML Schemas W3C XML Schemas

Schema Models Two major schema models exist: W3C

XML Schema and Microsoft XML Schema. Because W3C XML Schema technology is still in the early stages of development, we focus primarily on the well-developed Microsoft XML Schema.

[Note: New schema models (e.g., RELAX—www.xml.gr.jp/relax) are beginning to emerge.]

Page 7: XML Schemas Microsoft XML Schemas W3C XML Schemas

Schemas vs. DTDs We highlight a few major differences

between XML Schema and DTDs. A DTD describes an XML document's structure—not its element content.

For example, <quantity>5</quantity> contains character data.

Page 8: XML Schemas Microsoft XML Schemas W3C XML Schemas

Element quantity can be validated to confirm that it does indeed contain content (e.g., PCDATA), but its content cannot be validated to confirm that it is numeric; DTDs do not provide such a capability. So, unfortunately, markup such as

<quantity>hello</quantity>

is also considered valid.

Page 9: XML Schemas Microsoft XML Schemas W3C XML Schemas

With XML Schema, element quantity's data can indeed be described as numeric. When the preceding markup examples are validated against an XML Schema that specifies element quantity's data must be numeric, 5 conforms and hello fails.

Page 10: XML Schemas Microsoft XML Schemas W3C XML Schemas

An XML document that conforms to a schema document is schema valid and a document that does not conform is invalid.

Page 11: XML Schemas Microsoft XML Schemas W3C XML Schemas

Unlike DTDs, schema do not use the Extended Backus-Naur Form (EBNF) grammar. Instead, schema use XML syntax. Because schemas are XML documents, they can be manipulated (e.g., elements added, elements removed, etc.) like any other XML document. Later, we will discuss how to manipulate XML documents programmatically.

Page 12: XML Schemas Microsoft XML Schemas W3C XML Schemas

We begin our discussion of Microsoft XML Schema. We discuss several key schema elements and attributes, which are used in the examples. We also present our first Microsoft XML Schema document and use Microsoft's XML Validator to check it for validity. XML Validator also validates documents against DTDs as well as schema.

Page 13: XML Schemas Microsoft XML Schemas W3C XML Schemas

Microsoft XML Schema: Describing Elements

To use Microsoft XML Schema, Microsoft's XML parser (msxml) is required; this parser is part of Internet Explorer 5.

Page 14: XML Schemas Microsoft XML Schemas W3C XML Schemas

Elements are the primary building blocks used to create XML documents. In Microsoft XML Schema, element ElementType defines an element. ElementType contains attributes that describe the element's content, data type, name, etc.

Page 15: XML Schemas Microsoft XML Schemas W3C XML Schemas

The application using the XML document containing this markup would need to test if quantity is numeric and take appropriate action if it is not.

Page 16: XML Schemas Microsoft XML Schemas W3C XML Schemas

            

The Fig. 7.1 presents a complete schema. This schema describes the structure for an XML document that marks up messages passed between users. We name the schema intro-schema.xml.

In Fig 7.2, we show, in the browser Internet Explorer, a document that conforms to this schema.

Page 17: XML Schemas Microsoft XML Schemas W3C XML Schemas

1 <?xml version = "1.0"?>2 3 <!-- Fig. 7.1 : intro-schema.xml -->4 <!-- Microsoft XML Schema showing the ElementType -->5 <!-- element and element element -->6 7 <Schema xmlns = "urn:schemas-microsoft-com:xml- data">8 <ElementType name = "message" content = "textOnly" 9 model = "closed">10 <description>Text messages</description>11 </ElementType>12 <ElementType name = "greeting" model = "closed"14 content = "mixed" order = "many">15 <element type = "message"/>16 </ElementType>17 18 <ElementType name = "myMessage“ model = "closed"19 content = "eltOnly" order = "seq">20 21 <element type = "greeting" minOccurs = "0"22 maxOccurs = "1"/>23 <element type = "message" minOccurs = "1"24 maxOccurs = "*"/>25 26 </ElementType>27 </Schema>

Page 18: XML Schemas Microsoft XML Schemas W3C XML Schemas

Line 7 <Schema xmlns = "urn:schemas-microsoft-

com:xml-data">

declares the Microsoft XML Schema root element. Element Schema is the root element for every Microsoft XML Schema document.

Page 19: XML Schemas Microsoft XML Schemas W3C XML Schemas

The xmlns attribute specifies the default namespace for the Schema element and the elements it contains. The attribute value urn:schemas-microsoft-com:xml-data specifies the URI for this namespace.

Page 20: XML Schemas Microsoft XML Schemas W3C XML Schemas

Microsoft Schema documents always use this URI because it is recognized by msxml. Microsoft's XML parser recognizes element Schema and this particular namespace URI and also it validates the schema.

Page 21: XML Schemas Microsoft XML Schemas W3C XML Schemas

Element Schema can contain only elements ElementType—for defining elements, AttributeType—for defining attributes and description—for describing the Schema element. We will discuss each of these elements momentarily.

Page 22: XML Schemas Microsoft XML Schemas W3C XML Schemas

Lines 8-11

<ElementType name = "message" content = "textOnly"   model = "closed">   <description>Text messages</description>

</ElementType>

define element message, which can contain only text, because attribute content is textOnly.

Page 23: XML Schemas Microsoft XML Schemas W3C XML Schemas

Attribute model has the value closed (line 9)—indicating that only elements declared in this schema are permitted in a conforming XML document. Any elements not defined in this schema would invalidate the document.

Page 24: XML Schemas Microsoft XML Schemas W3C XML Schemas

We will elaborate on this when we discuss an XML document that conforms to the schema ( Fig 7.2). Element description contains text that describe this schema. In this particular case (line 10), we indicate in the description element that the message element we define is intended to contain Text messages.

Page 25: XML Schemas Microsoft XML Schemas W3C XML Schemas

Lines 13-16

<ElementType name = "greeting" model = "closed"    content = "mixed" order = "many">    <element type = "message"/></ElementType>

Page 26: XML Schemas Microsoft XML Schemas W3C XML Schemas

define element greeting. Because attribute content has the value mixed, this element can contain both elements and character data. The order attribute specifies the number and order of child elements a greeting element may contain.

Page 27: XML Schemas Microsoft XML Schemas W3C XML Schemas

The value many indicates that any number of message elements and text can be contained in the greeting element in any order. The element element on line 15 indicates message elements (defined on lines 8-11) may be included in a greeting element.

Page 28: XML Schemas Microsoft XML Schemas W3C XML Schemas

Lines 18 and 19

<ElementType name = "myMessage" model = "closed"    content = "eltOnly" order = "seq">

Page 29: XML Schemas Microsoft XML Schemas W3C XML Schemas

define element myMessage. The content attribute's value eltOnly specifies that the myMessage element can only contain elements. Attribute order has the value seq, indicating that myMessage child elements must occur in the sequence defined in the schema.

Page 30: XML Schemas Microsoft XML Schemas W3C XML Schemas

Lines 21-24

<element type = "greeting" minOccurs = "0"   maxOccurs = "1"/><element type = "message" minOccurs = "1"   maxOccurs = "*"/>

Page 31: XML Schemas Microsoft XML Schemas W3C XML Schemas

indicate that element myMessage contains child elements greeting and message. These elements are myMessage child elements, because the element elements that reference them are nested inside element myMessage. Because the element order in element myMessage is set as sequential, the greeting element (if used) must precede all message elements.

Page 32: XML Schemas Microsoft XML Schemas W3C XML Schemas

Attributes minOccurs and maxOccurs specify the minimum and maximum number of times the element may appear in the myMessage element, respectively. The value 1 for the minOccurs attribute (line 23) indicates that element myMessage must contain at least one message element.

Page 33: XML Schemas Microsoft XML Schemas W3C XML Schemas

The value * for the maxOccurs attribute (line 24) indicates that there is no limit on the maximum number of message elements that may appear in myMessage.

Page 34: XML Schemas Microsoft XML Schemas W3C XML Schemas

Figure 7.2 shows an XML document that conforms to the schema shown in Fig 7.1.

We use Microsoft's XML Validator to check the document's conformity. It is available as a free download at

Page 35: XML Schemas Microsoft XML Schemas W3C XML Schemas

msdn.microsoft.com/dowloads/samples/internet/xml/xml_validator/sample.asp

Page 36: XML Schemas Microsoft XML Schemas W3C XML Schemas

1 <?xml version = "1.0"?>2 3 <!-- Fig. 7.3 : Um documento XML intro.xml -->4 <!-- Introduction to Microsoft XML Schema -->5 6 <myMessage xmlns = "x-schema:intro-schema.xml">7 8 <greeting>Welcome to XML Schema!9 <message>This is the first message.</message>10 </greeting>11 12 <message>This is the second message.</message>13 </myMessage>

Page 37: XML Schemas Microsoft XML Schemas W3C XML Schemas

Line 6 <myMessage xmlns = "x-schema:intro-schema.xml">references the schema ( Fig 7.1) through the

namespace declaration. A document using a Microsoft XML Schema uses attribute xmlns to reference its schema through a URI which begins with x-schema followed by a colon (:) and the name of the schema document.

Page 38: XML Schemas Microsoft XML Schemas W3C XML Schemas

Mixed Content Lines 8-10

<greeting>Welcome to XML Schema!    <message>This is the first message.</message></greeting>

use element greeting to mark up text and a message element. Recall that in Fig 7.1, element greeting (lines 13-16) may contain mixed content.

Page 39: XML Schemas Microsoft XML Schemas W3C XML Schemas

Line 12 <message>This is the second message.</message>

marks up text in a message element.

Line 8 in Fig 7.1 specifies that element message can contain only text.

Page 40: XML Schemas Microsoft XML Schemas W3C XML Schemas

In the discussion of Fig 7.1, we mentioned that a closed model allows an XML document to contain only those elements defined in its schema.

For example, the markup <greeting>Welcome to XML Schema!   <message>This is the first message.</message>   <newElement>A new element.</newElement></greeting>

Page 41: XML Schemas Microsoft XML Schemas W3C XML Schemas

uses element newElement, which is not defined in the schema. With a closed model, the document containing newElement is invalid. However, with an open model, the document is valid.

Page 42: XML Schemas Microsoft XML Schemas W3C XML Schemas

Figure 7.3 shows a well-formed document that fails to conform to the schema shown in Fig 7.1, because element message cannot contain child elements.

Page 43: XML Schemas Microsoft XML Schemas W3C XML Schemas

Figure 7.4 lists the available attributes for the ElementType element. Schema authors use these attributes to specify the properties of an element, such as its content, data type, name, etc.

      If the content attribute for an ElementType element has the value eltOnly or mixed content, the ElementType may only contain the elements listed in Fig 7.5

Page 44: XML Schemas Microsoft XML Schemas W3C XML Schemas

Attibute Description

ElementType element

Page 45: XML Schemas Microsoft XML Schemas W3C XML Schemas

ElementType element attributes Content

Describes the element's content. The valid values for this attribute are empty (an empty element), eltOnly (may contain only elements), textOnly (may contain only text) and mixed (may contain both elements and text). The default value for this attribute is mixed.

Page 46: XML Schemas Microsoft XML Schemas W3C XML Schemas

ElementType element attributes Name - The element's name. This is a

required attribute.

Model - Specifies whether elements not defined in the schema are permitted in the element. Valid values are open (the default, which permits the inclusion of elements defined outside the schema) and closed (only elements defined inside the schema are permitted). We use only closed models.

Page 47: XML Schemas Microsoft XML Schemas W3C XML Schemas

ElementType element attributes

dt:type - Defines the element's data type. Data types exist for real numbers, integers, booleans, etc. Namespace prefix dt qualifies data types. We discuss data types in detail in Section 7.5.

Page 48: XML Schemas Microsoft XML Schemas W3C XML Schemas

ElementType element attributes Order Specifies the order in which child

elements must occur. The valid values for this attribute are one

(exactly one child element is permitted), seq (child elements must appear in the order in which they are defined) and many (child elements can appear in any order, any number of times).

Page 49: XML Schemas Microsoft XML Schemas W3C XML Schemas

The default value is many if attribute content is mixed and is seq if attribute content has the value eltOnly.

Page 50: XML Schemas Microsoft XML Schemas W3C XML Schemas

ElementType's child elements

description - Provides a description of the ElementType.

datatype - Specifies the data type for the ElementType element. We will discuss data types.

element - Specifies a child element by name.

Page 51: XML Schemas Microsoft XML Schemas W3C XML Schemas

group - Groups related element elements and defines their order and frequency.

attributeType - Defines an attribute.

attribute - Specifies an AttributeType for an element

Page 52: XML Schemas Microsoft XML Schemas W3C XML Schemas

The element element does not define an element, but rather refers to an element defined by an ElementType. This allows the schema author to define an element once and refer to it from many places inside the schema document. The attributes of the element element are listed in Fig 7.6.

Page 53: XML Schemas Microsoft XML Schemas W3C XML Schemas

As mentioned in Fig 7.5, element group creates groups of element elements. Groups define the order and frequency in which elements appear using the attributes listed in Fig 7.7.

Page 54: XML Schemas Microsoft XML Schemas W3C XML Schemas

Element element attributes

type - A required attribute that specifies a child element's name (i.e., the name defined in the ElementType).

Page 55: XML Schemas Microsoft XML Schemas W3C XML Schemas

Element element attributes

minOccurs - Specifies the minimum number of occurrences an element can have. The valid values are 0 (the element is optional) and 1 (the element must occur one or more times). The default value is 1.

Page 56: XML Schemas Microsoft XML Schemas W3C XML Schemas

Element element attributes maxOccursSpecifies the maximum

number of occurrences an element can have. The valid values are 1 (the element occurs at most once) and * ...

* (the element can occur any number of times). The default value is 1 unless the ElementType's content attribute is mixed

Page 57: XML Schemas Microsoft XML Schemas W3C XML Schemas

Element group's attributes Order

Specifies the order in which the elements occur. The valid values are one (contains exactly one child element from the group), seq (all child elements must appear in the sequential order in which they are listed) and many (the child elements can appear in any order, any number of times).

Page 58: XML Schemas Microsoft XML Schemas W3C XML Schemas

Element group's attributes minOccurs

Specifies the minimum number of occurrences an element can have. The valid values are 0 (the element is optional) and 1 (the element must occur at least once). The default value is 1.

Page 59: XML Schemas Microsoft XML Schemas W3C XML Schemas

Element group's attributes maxOccurs

Specifies the maximum number of occurrences an element can have. The valid values are 1 (the element occurs at most once) and * (the element can occur any number of times). The default value is 1 unless the ElementType's content attribute is mixed.

Page 60: XML Schemas Microsoft XML Schemas W3C XML Schemas

7.4 Microsoft XML Schema: Describing Attributes

Page 61: XML Schemas Microsoft XML Schemas W3C XML Schemas

XML elements can contain attributes that describe elements.

In Microsoft XML Schema, element AttributeType defines attributes. Figure 7.8 lists AttributeType element attributes.

Page 62: XML Schemas Microsoft XML Schemas W3C XML Schemas

Like element ElementType element, element AttributeType may contain description elements and datatype elements.

Page 63: XML Schemas Microsoft XML Schemas W3C XML Schemas

To indicate that an element has an AttributeType, element attribute is used. The attributes of the attribute element are shown in Fig 7.9.

Page 64: XML Schemas Microsoft XML Schemas W3C XML Schemas

1 <?xml version = "1.0"?>2 3 <!-- Fig. 7.10 : contact-schema.xml -->4 <!-- Defining attributes -->5 6 <Schema xmlns = "urn:schemas-microsoft-com:xml-data">7 8 <ElementType name = "contact" content = "eltOnly" order = "seq"9 model = "closed">10 11 <AttributeType name = "owner" required = "yes"/>12 <attribute type = "owner"/>13 14 <element type = "name"/>15 <element type = "address1"/>16 <element type = "address2" minOccurs = "0" maxOccurs = "1"/>17 <element type = "city"/>18 <element type = "state"/>19 <element type = "zip"/>20 <element type = "phone" minOccurs = "0" maxOccurs = "*"/>21 </ElementType>

Page 65: XML Schemas Microsoft XML Schemas W3C XML Schemas

22 23 <ElementType name = "name" content = "textOnly" 24 model = "closed"/>25 26 <ElementType name = "address1" content = "textOnly"27 model = "closed"/>28 29 <ElementType name = "address2" content = "textOnly"30 model = "closed"/>31 32 <ElementType name = "city" content = "textOnly" 33 model = "closed"/>34 35 <ElementType name = "state" content = "textOnly"36 model = "closed"/>37 38 <ElementType name = "zip" content = "textOnly" model = "closed"/>39 40 <ElementType name = "phone" content = "textOnly" model = "closed">41 <AttributeType name = "location" default = "home"/>42 <attribute type = "location"/>43 </ElementType>44 45 </Schema>