XML and Validation Tools Schema Schematron




XML

• eXtensible Markup Language (XML)
  – A metamarkup language
  – The basic unit is called an element
  – Fairly similar to HTML


<tag attribute="attribute value">element value</tag>

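Element: the whole construct, from opening tag to closing tag
Opening tag: <tag attribute="attribute value">
Attribute: attribute="attribute value"
Closing tag: </tag>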


Metamarkup?

• What does metamarkup mean?
  – There is no predefined and fixed set of tags for XML
  – XML allows implementers to define their own set of tags to meet their needs


Examples
• Office Open XML (ISO/IEC 29500)
• Geography Markup Language (ISO 19136)


Markup – ESRI ArcGIS 10 XML


<idCitation>
  <resTitle>Title</resTitle>
  <date>
    <createDate>20110906</createDate>
  </date>
</idCitation>


Markup – ISO 19139 XML


<gmd:citation>
  <gmd:CI_Citation>
    <gmd:title>
      <gco:CharacterString>Title</gco:CharacterString>
    </gmd:title>
    <gmd:date>
      <gmd:CI_Date>
        <gmd:date>
          <gco:Date>2011-09-06</gco:Date>
        </gmd:date>
        <gmd:dateType>
          <gmd:CI_DateTypeCode codeList="...#CI_DateTypeCode" codeListValue="creation">creation</gmd:CI_DateTypeCode>
        </gmd:dateType>
      </gmd:CI_Date>
    </gmd:date>
  </gmd:CI_Citation>
</gmd:citation>
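The ISO 19139 fragment carries the same title and creation date as the ArcGIS fragment, only with more nesting and with namespace prefixes. The following is a minimal Python sketch, using only the standard library, of how a program might read those values back. The namespace URIs bound to gmd and gco are not shown on the slide, so the ones below are an assumption, and summarise_citation is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: the slide does not show the namespace declarations.
# These URIs are the ones commonly used for ISO 19139 gmd/gco.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

ISO_CITATION = """<gmd:citation xmlns:gmd="http://www.isotc211.org/2005/gmd"
              xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:CI_Citation>
    <gmd:title>
      <gco:CharacterString>Title</gco:CharacterString>
    </gmd:title>
    <gmd:date>
      <gmd:CI_Date>
        <gmd:date>
          <gco:Date>2011-09-06</gco:Date>
        </gmd:date>
        <gmd:dateType>
          <gmd:CI_DateTypeCode codeList="...#CI_DateTypeCode" codeListValue="creation">creation</gmd:CI_DateTypeCode>
        </gmd:dateType>
      </gmd:CI_Date>
    </gmd:date>
  </gmd:CI_Citation>
</gmd:citation>"""


def summarise_citation(xml_text):
    """Return the title and the (date, date type) pairs of a gmd:citation."""
    root = ET.fromstring(xml_text)
    title = root.findtext(
        "gmd:CI_Citation/gmd:title/gco:CharacterString", namespaces=NS
    )
    dates = []
    for ci_date in root.findall("gmd:CI_Citation/gmd:date/gmd:CI_Date", NS):
        value = ci_date.findtext("gmd:date/gco:Date", namespaces=NS)
        code = ci_date.find("gmd:dateType/gmd:CI_DateTypeCode", NS)
        date_type = code.get("codeListValue") if code is not None else None
        dates.append((value, date_type))
    return {"title": title, "dates": dates}
```

Calling summarise_citation(ISO_CITATION) returns the title together with one date tagged as a creation date, which is the information the ArcGIS fragment encodes in its createDate element.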


Well-Formed

• XML has strict rules, e.g.:
  – There must be one, and only one, root element
  – All elements must have an opening and closing tag
  – Element names are case sensitive:
    • <citation/> is different from <Citation/>
  – XML conforming to the rules is said to be well-formed


Well-Formed


Well-formed:

<idCitation>
  <resTitle>Title</resTitle>
  <date>
    <createDate>20110906</createDate>
  </date>
</idCitation>

Not well-formed:

<idCitation>
  <resTitle>Title</ResTitle>
  <date>
    <createDate>20110906
  </date>
</idCitation>
<idPurp>Summary</idPurp>

– Opening and closing tags are different: <resTitle> is closed with </ResTitle>
– No closing tag: <createDate> is never closed
– Two root elements: <idCitation> and <idPurp>
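Any XML parser can perform this check. The sketch below is one possible way to do it in Python with the standard library; is_well_formed is a hypothetical helper name, and the two strings are the examples from this slide.

```python
import xml.etree.ElementTree as ET

WELL_FORMED = """<idCitation>
  <resTitle>Title</resTitle>
  <date>
    <createDate>20110906</createDate>
  </date>
</idCitation>"""

NOT_WELL_FORMED = """<idCitation>
  <resTitle>Title</ResTitle>
  <date>
    <createDate>20110906
  </date>
</idCitation>
<idPurp>Summary</idPurp>"""


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

The parser stops at the first problem it meets, so for the second example it would report only one of the three errors; the others surface as each earlier one is fixed.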


Structure

• The markup defines data structure:
  – It signifies which elements are associated
  – It can define semantics:

    <date>
      <createDate>20110906</createDate>
    </date>

  – It says nothing about how to display data (there are exceptions to this rule)
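Because the nesting and the element name say what the value is, a program can pick it out without guessing. The following is a small Python sketch under two assumptions: that the createDate text uses the YYYYMMDD layout seen in the example, and that creation_date is merely an illustrative helper name.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime

SNIPPET = """<idCitation>
  <resTitle>Title</resTitle>
  <date>
    <createDate>20110906</createDate>
  </date>
</idCitation>"""


def creation_date(xml_text):
    """Return the createDate nested under date as a datetime.date."""
    root = ET.fromstring(xml_text)
    text = root.findtext("date/createDate")
    if text is None:
        raise ValueError("no date/createDate element found")
    # Assumption: the value is written as YYYYMMDD, as in the slide.
    return datetime.strptime(text.strip(), "%Y%m%d").date()
```

Note that nothing in the markup says how the date should be displayed; formatting it for a reader is left to whatever consumes the parsed value.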


XML is machine readable

• And…
  – Human readable… honestly


Schema and Validation


Schema

• Schemas document the elements that are permitted in an XML application
  – XML that conforms to a schema is said to be schema-valid
  – XML that does not conform to a schema is said to be invalid


XML Schema Definition Language


<xs:complexType name="CI_Citation_Type">
  ...
  <xs:complexContent>
    <xs:extension base="gco:AbstractObject_Type">
      <xs:sequence>
        <xs:element name="title" type="gco:CharacterString_PropertyType"/>
        <xs:element name="alternateTitle" type="gco:CharacterString_PropertyType" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="date" type="gmd:CI_Date_PropertyType" maxOccurs="unbounded"/>
        ...
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
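To show what the xs:sequence is asking for, here is a toy Python sketch of the occurrence rules for the three elements visible in the fragment. It is not an XSD validator: it ignores element types and the parts elided with "...", and a real workflow would use a proper schema processor. In XSD, minOccurs and maxOccurs both default to 1 when omitted. The namespace URI, check_sequence, and citation are assumptions introduced for the illustration.

```python
import xml.etree.ElementTree as ET

# Assumed gmd namespace URI, written in ElementTree's {uri}name notation.
GMD = "{http://www.isotc211.org/2005/gmd}"

# (element name, minOccurs, maxOccurs); None stands for "unbounded".
# Only the three elements shown on the slide are listed.
CITATION_SEQUENCE = [
    (GMD + "title", 1, 1),
    (GMD + "alternateTitle", 0, None),
    (GMD + "date", 1, None),
]


def check_sequence(parent, sequence):
    """Return None if the children fit the ordered sequence, else a message."""
    tags = [child.tag for child in parent]
    pos = 0
    for name, min_occurs, max_occurs in sequence:
        count = 0
        # Consume consecutive children with this name, up to maxOccurs.
        while (pos < len(tags) and tags[pos] == name
               and (max_occurs is None or count < max_occurs)):
            pos += 1
            count += 1
        if count < min_occurs:
            return "expected at least %d %s, found %d" % (min_occurs, name, count)
    if pos < len(tags):
        return "unexpected element %s" % tags[pos]
    return None


def citation(*child_names):
    """Build a bare gmd:CI_Citation with empty children, for demonstration."""
    parent = ET.Element(GMD + "CI_Citation")
    for name in child_names:
        ET.SubElement(parent, GMD + name)
    return parent
```

For example, check_sequence(citation("title", "date"), CITATION_SEQUENCE) returns None, while a citation with no date, or with date placed before title, yields a message.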


Markup – ISO 19139 XML


<gmd:citation>
  <gmd:CI_Citation>
    <gmd:title>
      <gco:CharacterString>Title</gco:CharacterString>
    </gmd:title>
    <gmd:date>
      <gmd:CI_Date>
        <gmd:date>
          <gco:Date>2011-09-06</gco:Date>
        </gmd:date>
        <gmd:dateType>
          <gmd:CI_DateTypeCode codeList="...#CI_DateTypeCode" codeListValue="creation">creation</gmd:CI_DateTypeCode>
        </gmd:dateType>
      </gmd:CI_Date>
    </gmd:date>
  </gmd:CI_Citation>
</gmd:citation>


Schematron

• Schematron is:
  – A schema language for XML
    • Part of the Document Schema Definition Languages (DSDL) family
  – Written in XML
  – An ISO standard: ISO/IEC 19757-3

Find out more at: http://www.schematron.com/


Why use Schematron?

• XSD cannot express some constraints, for example:
  – A choice between attributes
  – A content model that varies with the value of an element or attribute (this sort of constraint is common in the ISO 19115 logical model)
• Implementing profiles (e.g. MEDIN):
  – With Schematron there’s no need to edit the underlying standardised XSD
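The kind of rule meant here pairs a context with an assertion about it. The Python sketch below imitates that pattern with the standard library; it is not Schematron and does not use a Schematron engine. The rule itself is hypothetical and is not taken from ISO 19115 or the MEDIN profile: "a gmd:CI_Date whose date type is creation must give its date as gco:Date". The namespace URIs and all helper names are likewise assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the slides do not show them.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Sample gmd:CI_Date with a configurable date element and date type code.
TEMPLATE = """<gmd:CI_Date xmlns:gmd="http://www.isotc211.org/2005/gmd"
             xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:date>
    <gco:{kind}>{value}</gco:{kind}>
  </gmd:date>
  <gmd:dateType>
    <gmd:CI_DateTypeCode codeList="...#CI_DateTypeCode" codeListValue="{code}">{code}</gmd:CI_DateTypeCode>
  </gmd:dateType>
</gmd:CI_Date>"""


def check_creation_dates(xml_text):
    """Return one message per CI_Date that breaks the hypothetical rule."""
    root = ET.fromstring(xml_text)
    failures = []
    # Context: every gmd:CI_Date in the document (including the root).
    for ci_date in root.iter("{%s}CI_Date" % NS["gmd"]):
        code = ci_date.find("gmd:dateType/gmd:CI_DateTypeCode", NS)
        is_creation = code is not None and code.get("codeListValue") == "creation"
        has_plain_date = ci_date.find("gmd:date/gco:Date", NS) is not None
        # Assertion: the required content depends on an attribute value.
        if is_creation and not has_plain_date:
            failures.append("a creation date must be given as gco:Date")
    return failures
```

The requirement on the content of gmd:date changes with the value of codeListValue, which is the sort of co-occurrence constraint the slide says XSD cannot express. In Schematron the same idea would be written declaratively as a rule with a context and an assert.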


Validation Workflow


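[Flowchart: each check feeds a "Valid?" decision; YES continues to the next check, NO ends the run]

XSD Schema Validation
  1. ISO 19139 Schema Validation. Valid? NO: END FAIL. YES: continue

Schematron Validation
  2. ISO 19139 Table A.1 Constraints Schematron. Valid? NO: END FAIL. YES: continue
  3. MEDIN Profile Schematron. Valid? NO: END FAIL. YES: END PASS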
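The workflow on this slide is a chain of checks that stops at the first failure. A minimal Python sketch of that control flow follows. The stage names come from the slide; run_workflow and the stand-in validators are hypothetical, and real stages would call an XSD processor and a Schematron engine, neither of which is part of the Python standard library.

```python
STAGE_NAMES = [
    "ISO 19139 Schema Validation",
    "ISO 19139 Table A.1 Constraints Schematron",
    "MEDIN Profile Schematron",
]


def run_workflow(xml_text, stages):
    """Run (name, validator) stages in order and stop at the first failure.

    Returns ("END PASS", None) if every validator returns True,
    otherwise ("END FAIL", name_of_the_failing_stage).
    """
    for name, is_valid in stages:
        if not is_valid(xml_text):
            return "END FAIL", name
    return "END PASS", None


def always_valid(xml_text):
    """Stand-in validator that accepts everything (placeholder only)."""
    return True


def always_invalid(xml_text):
    """Stand-in validator that rejects everything (placeholder only)."""
    return False
```

Running the cheap structural check first means the Schematron stages only ever see documents that are already schema-valid, so their rules can assume the expected structure is present.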


Validation Tools


[Screenshot labels: Select profile; XSD Schema; Schematron schemas]