Introduction to XML Schema

John Arnett, MSc
Standards Modeller
Information and Statistics Division
NHSScotland
Tel: 0131 551 8073 (x2073)
mailto:John.Arnett@isd.csa.scot.nhs.uk
http://isdscotland.org/xml

Contents

• Introduction
• Document Type Definitions - reminder
• W3C Schema
  – Schema Structures
  – Built-In Types
• Summary
• Find Out More

Introduction

• Schema
  – In general: a diagram, plan or framework
  – In XML: a document that describes an XML document

Introduction

• Purpose
  – Data validation
  – Contract
  – System documentation
  – Processing information

Introduction

• Schema Data Validation
  – Element and attribute structure
  – Element ordering
  – Value constraints
    • Built-in data types
    • Size and pattern constraints
    • Enumerations
  – Uniqueness constraints

Introduction

• Schema Languages
  – Document Type Definitions (DTDs)
  – W3C XML Schema
  – OASIS RELAX NG
  – Schematron

Document Type Definitions

• DTD Benefits

<!ELEMENT Record (FamilyName, GivenName, Sex, DateOfBirth)>
<!ELEMENT FamilyName (#PCDATA)>
<!ELEMENT GivenName (#PCDATA)>
<!ELEMENT Sex (#PCDATA)>
<!ELEMENT DateOfBirth (#PCDATA)>
<!ATTLIST Record recordId CDATA #REQUIRED>

  – Easy to understand and implement
  – Lightweight alternative to schemas
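
As a minimal sketch of how a DTD like this is put to use, an instance document can point at it from a DOCTYPE declaration. The file name record.dtd is an assumption made for illustration; it stands for a file holding the Record declarations shown on this slide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- record.dtd is a hypothetical file containing the Record DTD -->
<!DOCTYPE Record SYSTEM "record.dtd">
<Record recordId="1">
  <FamilyName>Arnett</FamilyName>
  <GivenName>John</GivenName>
  <Sex>M</Sex>
  <!-- #PCDATA carries no data type, so a DTD would accept any text here -->
  <DateOfBirth>1963-06-01</DateOfBirth>
</Record>
```

A validating parser would check element names and order against the DTD, but not whether DateOfBirth actually holds a date.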

Document Type Definitions

• DTD Limitations
  – Use non-XML syntax
  – Only limited support for data typing and namespaces
  – Difficult to extend

W3C Schema

• W3C Recommendation
  – XML Schema Part 0: Primer
    • Introduction (guidance)
  – XML Schema Part 1: Structures
    • defines schema components
  – XML Schema Part 2: Datatypes
    • defines built-in datatypes and their restrictions

W3C Schema Structures

• Most commonly used structures:
  – elements and attributes
  – simpleTypes
  – complexTypes
  – model groups
  – minOccurs and maxOccurs
  – annotation and documentation
  – schema and namespaces

W3C Schema Structures

• element and attribute
  – Basic building blocks of documents

<element name="Record">
  <complexType>
    <sequence>
      <element name="FamilyName" type="string"/>
      <element name="GivenName" type="string"/>
      <element name="Sex" type="token"/>
      <element name="DateOfBirth" type="date"/>
    </sequence>
    <attribute name="recordId" type="integer"/>
  </complexType>
</element>

W3C Schema Structures

• element and attribute
  – valid instances of Record element

<Record recordId="1">
  <FamilyName>Arnett</FamilyName>
  <GivenName>John</GivenName>
  <Sex>M</Sex>
  <DateOfBirth>1963-06-01</DateOfBirth>
</Record>
<Record recordId="2">
  <FamilyName>Smith</FamilyName>
  <GivenName/>
  <Sex>FEMALE</Sex>
  <DateOfBirth>1971-04-11</DateOfBirth>
</Record>

W3C Schema Structures

• element and attribute
  – invalid Record element instance

<Record recordId="1">Mr
  <Surname>Arnett</Surname>
  <GivenName>John</GivenName>
  <Sex>M</Sex>
  <DateOfBirth>06-Jan-63</DateOfBirth>
</Record>

W3C Schema Structures

• simpleType Definitions
  – Define element content
  – Character data only - no nested (child) elements permitted
  – No attributes permitted
  – Always derived from a built-in type (typically by restriction)

W3C Schema Structures

• simpleType definition examples

<simpleType name="TextType">
  <restriction base="string">
    <minLength value="1"/>
    <maxLength value="35"/>
  </restriction>
</simpleType>

<simpleType name="GenderType">
  <restriction base="token">
    <enumeration value="M"/>
    <enumeration value="F"/>
    <enumeration value="NK"/>
  </restriction>
</simpleType>

W3C Schema Structures

• complexType Definitions
  – Define element content
  – Child elements and character data permitted
  – Attributes permitted

W3C Schema Structures

• complexType definition examples

<complexType name="DemographicsStructure">
  <sequence>
    <element name="FamilyName" type="TextType"/>
    <element name="GivenName" type="TextType"/>
    <element name="Sex" type="GenderType"/>
    <element name="DateOfBirth" type="date"/>
  </sequence>
  <attribute name="recordId" type="integer"/>
</complexType>

<element name="Record" type="DemographicsStructure"/>
<element name="Person" type="DemographicsStructure"/>
<element name="Client" type="DemographicsStructure"/>

W3C Schema Structures

• Model groups
  – sequence
    • elements must occur in the order specified
  – choice
    • one of several child elements must be selected
  – all
    • each child element occurs 0 or 1 times, in any order
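
The deck's own examples cover sequence and choice; as a hedged sketch of the remaining group, all, a type might be written as below. The type name ContactStructure and its child elements are hypothetical, and the xsd prefix is assumed to be bound to the W3C XML Schema namespace.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- ContactStructure and its children are illustrative names only -->
  <xsd:complexType name="ContactStructure">
    <xsd:all>
      <!-- Children may appear in any order, each at most once -->
      <xsd:element name="Telephone" type="xsd:string" minOccurs="0"/>
      <xsd:element name="Email" type="xsd:string" minOccurs="0"/>
      <!-- minOccurs defaults to 1, so PostCode is required -->
      <xsd:element name="PostCode" type="xsd:string"/>
    </xsd:all>
  </xsd:complexType>
</xsd:schema>
```

An element of this type could list PostCode before or after Email, which a sequence would not allow.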

W3C Schema Structures

• Model group examples

<complexType name="DemographicsStructure">
  <sequence>
    <element name="FamilyName" type="TextType"/>
    <element name="GivenName" type="TextType"/>
    <element name="Sex" type="GenderType"/>
    <choice>
      <element name="DateOfBirth" type="date"/>
      <element name="Age" type="integer"/>
    </choice>
  </sequence>
  <attribute name="recordId" type="integer"/>
</complexType>

W3C Schema Structures

• Model groups
  – Valid instances of Record element

<Record recordId="1">
  <FamilyName>Arnett</FamilyName>
  <GivenName>John</GivenName>
  <Sex>M</Sex>
  <DateOfBirth>1963-06-01</DateOfBirth>
</Record>
<Record recordId="2">
  <FamilyName>Smith</FamilyName>
  <GivenName>Jane</GivenName>
  <Sex>F</Sex>
  <Age>28</Age>
</Record>

W3C Schema Structures

• minOccurs and maxOccurs
  – control the occurrence of element instances
    • minOccurs="0"
      – occurrence is optional
    • maxOccurs="unbounded"
      – multiple occurrences allowed
  – may be applied to any child element, sequence or choice

W3C Schema Structures

• minOccurs and maxOccurs examples

<complexType name="DemographicsStructure">
  <sequence>
    <element name="FamilyName" type="TextType"/>
    <element name="GivenName" type="TextType" maxOccurs="unbounded"/>
    <element name="Sex" type="GenderType" minOccurs="0"/>
    <choice>
      <element name="DateOfBirth" type="date"/>
      <element name="Age" type="integer"/>
    </choice>
  </sequence>
  <attribute name="recordId" type="integer"/>
</complexType>

W3C Schema Structures

• minOccurs and maxOccurs
  – Valid instances of Record element

<Record recordId="1">
  <FamilyName>Arnett</FamilyName>
  <GivenName>John</GivenName>
  <GivenName>Gordon</GivenName>
  <Sex>M</Sex>
  <DateOfBirth>1963-06-01</DateOfBirth>
</Record>
<Record recordId="2">
  <FamilyName>Smith</FamilyName>
  <GivenName>Jane</GivenName>
  <Age>28</Age>
  <!-- Optional "Sex" element missing -->
</Record>

W3C Schema Structures

• Namespaces
  – W3C namespace http://www.w3.org/2001/XMLSchema
    • element, complexType, sequence, etc
  – targetNamespace
    • Optional
    • User defined
    • One per schema document

W3C Schema Structures

• schema with namespaces

<xsd:schema id="PersonalRecord"
            targetNamespace="http://www.person.rec"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Type definitions, etc with namespace prefixes -->

  <xsd:complexType name="RecordStructure">
    ...
  </xsd:complexType>
  <xsd:simpleType name="TextType">
    ...
  </xsd:simpleType>
  <xsd:simpleType name="GenderType">
    ...
  </xsd:simpleType>

</xsd:schema>
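
To show what a target namespace means for instance documents, here is a rough sketch of a document written against such a schema. Several things are assumptions not stated on the slide: that the schema declares a global Record element of type RecordStructure, that elementFormDefault is left at its default (so child elements are unqualified), and that the schema is stored in a file called PersonalRecord.xsd. The rec prefix is arbitrary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The rec prefix, the global Record element and PersonalRecord.xsd
     are assumptions made for illustration -->
<rec:Record xmlns:rec="http://www.person.rec"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://www.person.rec PersonalRecord.xsd"
            recordId="1">
  <!-- Locally declared children are unqualified by default -->
  <FamilyName>Arnett</FamilyName>
  <GivenName>John</GivenName>
  <Sex>M</Sex>
  <DateOfBirth>1963-06-01</DateOfBirth>
</rec:Record>
```

The xsi:schemaLocation attribute pairs the namespace name with a location hint; a validator may use the hint or ignore it.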

W3C Schema Structures

• annotation and documentation

<xsd:simpleType name="GenderType">
  <xsd:annotation>
    <xsd:documentation>The sex of an individual for administrative purposes.</xsd:documentation>
  </xsd:annotation>
  <xsd:restriction base="xsd:token">
    <xsd:enumeration value="M"/>
    <xsd:enumeration value="F"/>
    <xsd:enumeration value="NK"/>
  </xsd:restriction>
</xsd:simpleType>

W3C Schema Structures

• annotation and documentation

<xsd:simpleType name="GenderType">
  <xsd:restriction base="xsd:token">
    <xsd:enumeration value="M"/>
    <xsd:enumeration value="F"/>
    <xsd:enumeration value="NK">
      <xsd:annotation>
        <xsd:documentation>This is used when the sex cannot be determined for physical reasons, e.g. a newborn baby</xsd:documentation>
      </xsd:annotation>
    </xsd:enumeration>
  </xsd:restriction>
</xsd:simpleType>

Built-in Simple Types

• 44 built-in simple types - most are atomic

• Used directly in schemas or used to create user-defined simple types

Built-in Simple Types

• String-based types
  – string
  – normalizedString
  – token

Built-in Simple Types

• Numeric Types
  – float and double
  – decimal
  – integer

Built-in Simple Types

• Date and Time Types
  – date
  – time
  – dateTime
  – gYear, gMonth, gDay
  – duration
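
As a quick illustration of the lexical forms these types expect, the fragment below shows one sample value per type. The element names are hypothetical, chosen only to label each value; they do not come from the original slides.

```xml
<LexicalForms>
  <!-- date: year, month, day -->
  <Date>1963-06-01</Date>
  <!-- time: hours, minutes, seconds -->
  <Time>14:30:00</Time>
  <!-- dateTime: date and time joined by T; the trailing Z marks UTC -->
  <DateTime>1963-06-01T14:30:00Z</DateTime>
  <!-- gYear: a year on its own -->
  <Year>1963</Year>
  <!-- gMonth: two leading hyphens, then the month -->
  <Month>--06</Month>
  <!-- gDay: three leading hyphens, then the day -->
  <Day>---01</Day>
  <!-- duration: 1 year, 2 months, 3 days and 4 hours -->
  <Duration>P1Y2M3DT4H</Duration>
</LexicalForms>
```

A value such as 06-Jan-63 fits none of these forms, which is why it fails validation against the date type.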

Built-in Simple Types

• Others
  – boolean
  – base64Binary and hexBinary
  – anyURI

Built-in Simple Types

• Facets
  – length
  – minLength
  – maxLength
  – minExclusive
  – minInclusive
  – maxExclusive
  – maxInclusive
  – totalDigits
  – fractionDigits
  – whiteSpace
  – pattern
  – enumeration
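
The deck goes on to show the length, enumeration and pattern facets; the sketch below illustrates the bounds and digit facets on numeric base types. The type names AgeType and AmountType and the chosen limits are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- AgeType and AmountType are illustrative names and limits only -->
  <xsd:simpleType name="AgeType">
    <xsd:restriction base="xsd:integer">
      <!-- Inclusive bounds: 0 and 130 are both allowed -->
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="130"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="AmountType">
    <xsd:restriction base="xsd:decimal">
      <!-- At most 7 digits in total, of which at most 2 after the point -->
      <xsd:totalDigits value="7"/>
      <xsd:fractionDigits value="2"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

Under these definitions 28 would be a valid AgeType value and -1 would not; 12345.67 would be a valid AmountType value and 1.234 would not.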

Built-in Simple Types

• Length facets

<xsd:simpleType name="TextType">
  <xsd:restriction base="xsd:string">
    <xsd:minLength value="1"/>
    <xsd:maxLength value="35"/>
  </xsd:restriction>
</xsd:simpleType>

<xsd:element name="Comment" type="TextType"/>

<Comment>This is a valid value</Comment>
<!-- invalid: empty, so shorter than minLength 1 -->
<Comment/>
<Comment>This is an invalid value because it contains more than 35 characters</Comment>

Built-in Simple Types

• enumeration facet

<xsd:simpleType name="GenderType">
  <xsd:restriction base="xsd:token">
    <xsd:enumeration value="M"/>
    <xsd:enumeration value="F"/>
    <xsd:enumeration value="NK"/>
  </xsd:restriction>
</xsd:simpleType>

<xsd:element name="Sex" type="GenderType"/>

<!-- valid: NK is an enumerated value -->
<Sex>NK</Sex>
<!-- invalid: Male is not an enumerated value -->
<Sex>Male</Sex>

Built-in Simple Types

• pattern facet

<xsd:simpleType name="PostCodeType">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}"/>
  </xsd:restriction>
</xsd:simpleType>
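
To make the pattern concrete, the fragment below lists a few sample values, assuming the pattern has a single space between its two halves and that a PostCode element of type PostCodeType has been declared. The element names and the postcode values are made up for illustration. A pattern must match the whole value, not just part of it.

```xml
<PostCodes>
  <!-- matches: two letters, one digit, space, digit, two letters -->
  <PostCode>EH1 1AA</PostCode>
  <!-- matches: one letter, digit, optional digit, space, digit, two letters -->
  <PostCode>G12 8QQ</PostCode>
  <!-- does not match: lower-case letters fall outside A-Z -->
  <PostCode>eh1 1aa</PostCode>
  <!-- does not match: the space between the two halves is missing -->
  <PostCode>EH11AA</PostCode>
</PostCodes>
```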

Advanced Features

• Multi-document schemas
• Complex type derivation
• Reusable groups
• Element substitution
• Schema redefinition
• Identity constraints
• Schema design
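
As one small taste of these features, the sketch below uses an identity constraint to require that recordId values are unique within a container element. The Records container and the constraint name UniqueRecordId are hypothetical, and the Record content is cut down to keep the example short.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Records is a hypothetical container, not part of the original deck -->
  <xsd:element name="Records">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Record" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="FamilyName" type="xsd:string"/>
            </xsd:sequence>
            <xsd:attribute name="recordId" type="xsd:integer"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
    <!-- No two Record children may carry the same recordId value -->
    <xsd:unique name="UniqueRecordId">
      <xsd:selector xpath="Record"/>
      <xsd:field xpath="@recordId"/>
    </xsd:unique>
  </xsd:element>
</xsd:schema>
```

The selector picks the elements to compare and the field names the value that must differ between them.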

Summary

• Used to validate the structure and values of XML instance documents
• Uses XML syntax
• W3C Recommendation specifies data structures and built-in types
• Supports namespaces
• Has many advanced features, incl. several extensibility mechanisms

Find Out More

• XML Schema Part 0: Primer
  – www.w3.org/TR/xmlschema-0/
• XML Schema Part 1: Structures
  – www.w3.org/TR/xmlschema-1/
• XML Schema Part 2: Datatypes
  – www.w3.org/TR/xmlschema-2/