Moving ISO & OGC standards into the Semantic Web
Presented at “Metadata DownUnder”: 11th Open Forum on Metadata Registries, Sydney, NSW, Australia
Laurent Lefort
22 May 2008
Water For a Healthy Country
CSIRO Moving ISO TC 211 & OGC standards into the Semantic Web “Metadata DownUnder”: 11th Open Forum on Metadata Registries Sydney, NSW, Australia
Outline
• Ontologies and water data standards
• Transforming ISO TC 211 and OGC standards into ontologies
  • Work on OWL versions for multiple standards
  • Findings on the transformation methods
  • Findings on the resulting ontologies and on how to build better ontologies
• What is the added value of Semantic Web technologies?
Context: water resources management for Australia
Water Resources Observation Network (WRON) program
• One of CSIRO’s Water for a Healthy Country Flagship themes
• Support to major research alliance between Bureau of Meteorology & CSIRO (WIRADA) to deliver mission-critical R&D
Specific Activity on Water Data standards
[Figure: hydrometric data, geospatial data, usage and entitlement data, models]
Source: Vertessy 2006: Australia’s water resources information imperative and the role of the Water Resources Observation Network (WRON)
Interest in water data standards
Water budget combining 4 sub-domains
• Atmospheric water (& climate)
• Surface water
• Groundwater (& geology)
• Human use of water
Need to manage features and observations
Complex cross-domain interactions
• e.g. transfer between surface water and groundwater
Need for a consistent standard basis (& method)
• Data and metadata
Generations of “standards” & integration complexity
[Figure: surface water & groundwater “standards” arranged by generation and by integration support, on an axis running from standard users to standard developers]
Labels shown: ASCII-based; DB-based; Registries; XML; UML & XML schemas; OWL ontologies; Custom XSL transformations & web services; Distributed systems with the same DB schema; Master data management; Model-driven generation of XML schemas; Reusable XML schema stack; Semantic integration
Standards shown: EPA STORET, EPA WQX, GWML, WOML, WFD schemas, eWater (EU), SANDRE, SANDRE XML, ODM, WaterML (CUAHSI)
Semantic Web technologies for standards development
• RDF (Resource Description Framework) for the “web of data”: annotations and links
• Value: flattened, web-compatible method to manage and link data as a set of triples
• OWL (Web Ontology Language) for the web of (data) models
• Several variants based on description logic with different expressivity / scalability ratios
• Value: reasoning support to build class hierarchy and verify logical consistency
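As a rough illustration of the two points above (data held as triples, and reasoning used to build a class hierarchy), here is a minimal sketch in plain Python. The `ex:` prefix, the class names and the individual are hypothetical, and the closure below only mimics a small part of what an OWL reasoner does.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical triples; the ex: names are illustrative, not from any standard.
TRIPLES: Set[Triple] = {
    ("ex:Aquifer", "rdfs:subClassOf", "ex:GroundwaterFeature"),
    ("ex:GroundwaterFeature", "rdfs:subClassOf", "ex:Feature"),
    ("ex:aquifer42", "rdf:type", "ex:Aquifer"),
}


def superclasses(cls: str, triples: Set[Triple]) -> Set[str]:
    """Return every class reachable from cls by following rdfs:subClassOf."""
    found: Set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def inferred_types(individual: str, triples: Set[Triple]) -> Set[str]:
    """Asserted rdf:type values plus all of their superclasses."""
    asserted = {o for s, p, o in triples if s == individual and p == "rdf:type"}
    result = set(asserted)
    for cls in asserted:
        result |= superclasses(cls, triples)
    return result
```

Calling `inferred_types("ex:aquifer42", TRIPLES)` returns the asserted class together with both of its ancestors, which is the kind of hierarchy inference the slide refers to.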
Building expectation that OWL can be useful
• Past and present efforts to create and use OWL versions of standards
  • Drexel University (HydroSeek)
  • Uni of Muenster (ACE-GIS, SWING, EDINA)
• Discussions at the Water Resources Information Model Workshop (Canberra, Sep 2007)
• Recognition of the ontological value of some standards e.g. OGC Observations and Measurements
  • Finney: Australian Marine Ontology, WALIS Forum 2008
  • Brodaric & Probst: DOLCE Rocks, AAAI Spring Symp. 2008
Reasons to share our experience in building ontologies
• High demand for OWL versions of standards
• Transition and ramping-up period from a manual process to a semi-automated one
  • Recently developed methods (OMG ODM) and tools (TopBraid) to create ontologies from UML models or from XML schemas
• Re-evaluation of current standard development practice
• Push for harmonisation of spatial standards (INSPIRE)
• Development of OGC model-driven approach
• ISO 19150 Ontology group, led by Jean Brodeur
• Can SW help ISO TC 211? Can ISO TC 211 help SW?
What we present today
• Work on OWL versions for multiple standards
  • ISO and OGC standards
  • Standards based on ISO and OGC standards defined for the water domain
• Findings on the transformation method
  • Comparison of ontology generation tools from UML models and from XML schemas
• Findings on the resulting ontologies
  • Tactics to build better ontologies
Standards to transform into OWL
• Focus on water standards describing Features & Observations because of our interest in:
  • Reference datasets (continental scale)
  • Identification of water features and of their topological and hydrological relationships
  • Data exchange language for individual and aggregated observations
Two key building blocks to build features and observations standards
• ISO 19109: Geographic information -- Rules for application schema
  • Defines a method to specify features, known as the General Feature Model (GFM)
• OGC Observations and Measurements (O&M)
  • Refines the GFM method to manage observations
• Supported by common schema generation technologies (UML to XML schemas)
  • To implement UML patterns out of “stereotypes”
  • To create definitions on top of existing schemas
  • Examples of tools: ShapeChange, FullMoon (CSIRO)
Common principles used for standards based on GFM and O&M
• A core model defines the main classes forming the standard
• Through their relation to other specified classes or to generic spatial definitions
• Extra design flexibility is given in three areas
  • Attachment of properties to features
  • Introduction of externally managed code lists
  • Provision for alternative usage (union)
• Specific restrictions on the applicability of the definitions can be added with a constraint language, such as Schematron
Added value of the Observations & Measurements standard
• Two user-managed class hierarchies in GFM-based specs:
  • Feature and FeaturesCollection: a Feature-type is characterized by a specific set of properties
• Up to five user-managed class hierarchies in O&M-based specs
  • Observation, SamplingFeature, PropertyType, Procedure and Result
  • An Observation is an Event whose result is an estimate of the value of some Property of the Feature-of-interest, obtained using a specified Procedure
• Stronger ontological value for O&M
  • More branches and separation of concerns
  • Example: difference between Feature and SamplingFeature
    • Feature for the real-world objects, e.g. an aquifer
    • SamplingFeature to characterise how a measurement is done, e.g. along a borehole
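The definition of an Observation and the Feature / SamplingFeature distinction can be sketched with a few Python dataclasses. This is only an illustration under simplifying assumptions: the field names, the string-typed property and procedure, and the `ultimate_feature` helper are hypothetical choices for this sketch, not the O&M schema itself.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Feature:
    """A real-world object, e.g. an aquifer."""
    name: str


@dataclass(frozen=True)
class SamplingFeature:
    """Describes how a measurement is taken, e.g. along a borehole."""
    name: str
    sampled_feature: Feature


@dataclass(frozen=True)
class Observation:
    """An event whose result estimates a property of the feature of interest."""
    feature_of_interest: Union[Feature, SamplingFeature]
    observed_property: str   # simplified: a plain label instead of a PropertyType
    procedure: str           # simplified: a plain label instead of a Procedure
    result: float

    def ultimate_feature(self) -> Feature:
        """Hypothetical helper: resolve a sampling feature to the real-world feature."""
        foi = self.feature_of_interest
        return foi.sampled_feature if isinstance(foi, SamplingFeature) else foi


aquifer = Feature("aquifer")
borehole = SamplingFeature("borehole", sampled_feature=aquifer)
observation = Observation(borehole, "water level", "manual dip reading", 12.5)
```

Here `observation.ultimate_feature()` returns `aquifer`: the measurement is made along the borehole, but it is ultimately about the aquifer.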
Standards transformed into OWL
• Application standards based on
  • ISO TC 211 General Feature Model
  • OGC Observations & Measurements
• Corresponding ISO/OGC standards from two origins:
  • UML model grouping all the ISO TC 211 standards, from the Harmonized Model Maintenance Group
  • XML schemas from OGC (schemas.opengis.net), including GML, SensorML, …
Selected ontology generation methods
• XSL-based approaches
  • XO (CSIRO-developed) from UML 2.0 or XML schemas to OWL
  • Rhizomik.net xsd2owl.xsl (open source but restricted to non-commercial usage)
• TopBraid Composer (commercial tool)
  • Transformation from UML 2.0 and XML schemas to OWL
  • Enterprise Architect files can be pre-processed with an EA-specific openArchitectureWare plugin
Generated ontologies
[Figure: ontology generation workflow]
• Inputs from XML schemas: OpenGIS schemas (GML, SensorML, SOS, WFS, WCS) and other schemas (GeoSciML, WML, CSML), with pre-processing to regroup all the schemas
• Inputs from EA models: HMMG, HollowWorld, GeoSciML, GWML, WOML, CSML, DHS-GDM, with pre-processing to UML 2.0
• Tools shown: Rhizomik xsd2owl or XO; TopBraid or XO
• Output: generated ontologies, one ontology file per source except for TopBraid
Example 1: om:Observation from XML schemas (TopBraid)
Example 2: om:Observation from UML model (TopBraid)
• Long URIs based on package names
• hasFeatureOfInterest: <http://ogc.uml/Model/Model/Externally-governed-packages/HollowWorld/CommonUsagePackages/ISO-19110/ISO-19115-Metadata/Metadata-entry-set-information/MD_Metadata>[0..1]
Example 3: om:Observation from UML model (XO)
Example 4: om:Observation from XSD (Rhizomik)
Important findings
• XML schemas are easier to transform than UML models
  • As long as the transformation tool is capable of processing tricky xsd:include and xsd:import cases
• Modularity schemes in place for UML or XML schemas are not necessarily directly applicable in OWL
• Suggested alternative is to simply use the XML namespace scheme to group together schemas sharing the same namespace into one or a limited number of modules
• The method to define URIs works better with XML schemas than with UML models
• XSL-based approaches better handle low quality (or incomplete) UML input
• Known problems with UML/XMI files
XML schemas easier to transform than UML models
• UML models
  • High variability in the usage of stereotypes
  • Risk of problems if the UML model is messy or not fully validated
• XML schemas
  • Availability of validation tools even for multi-part schemas
  • Less work to interpret the modelling intent
  • Always available directly or after generation from UML
  • Tighter management of successive versions
• Being able to generate the same output from both types of input for the same standard is critical to strengthen the transformation process
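One way to picture the cross-check between the two transformation chains is a plain set comparison of their outputs. The sketch below assumes each chain's ontology has already been reduced to a set of (subject, predicate, object) triples; the sample triples are hypothetical and do not come from any of the generated ontologies.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


def compare_outputs(from_uml: Set[Triple], from_xsd: Set[Triple]) -> Dict[str, Set[Triple]]:
    """Split two generated triple sets into shared and chain-specific parts."""
    return {
        "common": from_uml & from_xsd,
        "only_uml": from_uml - from_xsd,
        "only_xsd": from_xsd - from_uml,
    }


# Hypothetical outputs of the UML-based and XSD-based chains for one standard.
UML_OUTPUT: Set[Triple] = {
    ("om:Observation", "rdf:type", "owl:Class"),
    ("om:result", "rdfs:domain", "om:Observation"),
    ("om:Observation", "rdfs:subClassOf", "gml:AbstractFeature"),
}
XSD_OUTPUT: Set[Triple] = {
    ("om:Observation", "rdf:type", "owl:Class"),
    ("om:result", "rdfs:domain", "om:Observation"),
    ("om:ObservationType", "rdf:type", "owl:Class"),
}

report = compare_outputs(UML_OUTPUT, XSD_OUTPUT)
```

Non-empty `only_uml` or `only_xsd` sets point at places where the two chains interpret the same standard differently and where one of the transformations may need adjusting.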
Modules (files), namespaces (prefix) and URIs (IDs) in OWL
[Figure: a top module importing two sub-modules]
• Top module: uses owl:imports to bring in the sub-modules
• Sub-module 1: filename gml.owl; namespace gml: http://opengis.net/gml; one URI for each class (and property), e.g. http://opengis.net/gml#AbstractObservation
• Sub-module 2: filename om.owl; namespace om: http://opengis.net/om; one URI for each class (and property), e.g. http://opengis.net/om#Observation
• Difference with XML: the same namespace cannot be used in different modules
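The relation between module files, namespaces and URIs can be sketched as follows. The module table reuses the two file names and namespaces shown on the slide; the helper functions and the `extra.owl` file in the usage note are hypothetical.

```python
from typing import Dict, List

# Module file name -> namespace, as on the slide.
MODULES: Dict[str, str] = {
    "gml.owl": "http://opengis.net/gml",
    "om.owl": "http://opengis.net/om",
}


def class_uri(namespace: str, local_name: str) -> str:
    """Build the URI of a class or property from its namespace and local name."""
    return f"{namespace}#{local_name}"


def namespace_clashes(modules: Dict[str, str]) -> Dict[str, List[str]]:
    """Return the namespaces that are declared by more than one module file."""
    by_namespace: Dict[str, List[str]] = {}
    for filename, namespace in modules.items():
        by_namespace.setdefault(namespace, []).append(filename)
    return {ns: sorted(files) for ns, files in by_namespace.items() if len(files) > 1}
```

For example, `class_uri(MODULES["om.owl"], "Observation")` gives `http://opengis.net/om#Observation`, and `namespace_clashes(MODULES)` is empty. Adding a hypothetical `extra.owl` that also declares `http://opengis.net/om` would be reported as a clash.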
Method to define ontology modules
• The OMG ODM recommendation to replicate UML packages is inapplicable in our view
  • Too many modules and the wrong ones
  • TopBraid’s UML import operation creates 184 OWL files for the O&M model (which includes the ISO TC 211 standards)
• Recommendation
  • For XML schemas, group together schemas sharing the same namespace into one or a limited number of modules
  • Define a method producing the same results for UML models
  • Record the source module or schema as an annotation property for traceability and/or round trip purposes
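The recommendation to group schemas by namespace can be sketched with the Python standard library. The schema file names and their inlined content are hypothetical stand-ins for real XSD files; the list of source files kept for each module plays the role of the traceability annotation.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

XSD = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema files, inlined as strings so the sketch is self-contained.
SCHEMAS: Dict[str, str] = {
    "observation.xsd": f'<schema xmlns="{XSD}" targetNamespace="http://opengis.net/om"/>',
    "procedure.xsd": f'<schema xmlns="{XSD}" targetNamespace="http://opengis.net/om"/>',
    "feature.xsd": f'<schema xmlns="{XSD}" targetNamespace="http://opengis.net/gml"/>',
}


def group_by_namespace(schemas: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each target namespace to the sorted list of schema files declaring it."""
    modules: Dict[str, List[str]] = defaultdict(list)
    for filename, text in schemas.items():
        root = ET.fromstring(text)
        # targetNamespace is an unqualified attribute of the xsd:schema element.
        modules[root.get("targetNamespace", "")].append(filename)
    return {namespace: sorted(files) for namespace, files in modules.items()}


modules = group_by_namespace(SCHEMAS)
```

With these inputs the three schema files collapse into two candidate modules, one per namespace, rather than one module per file or per package.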
Method to define URIs
• Using UML package names to create URIs is not recommended
• See example 2
• Keeping the original XML schema namespace works well in practice
• Maybe two generation options are needed
  • To create separate definitions for different versions of the same source
  • To merge definitions from different versions of the same source
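The two generation options could look like the sketch below. The `mint_uri` function, its parameters and the choice to place the version as a path segment before the fragment are all assumptions made for illustration; the slides do not prescribe a URI layout for versions.

```python
def mint_uri(namespace: str, local_name: str, version: str = "",
             keep_versions: bool = True) -> str:
    """Build a URI from the original XML schema namespace.

    keep_versions=True  -> separate definitions per version of the source
    keep_versions=False -> definitions from all versions merged under one URI
    """
    base = namespace.rstrip("/#")
    if keep_versions and version:
        base = f"{base}/{version}"
    return f"{base}#{local_name}"


separate = mint_uri("http://opengis.net/om", "Observation", version="1.0")
merged = mint_uri("http://opengis.net/om", "Observation", version="1.0",
                  keep_versions=False)
```

With version-specific URIs, two versions of the same class stay distinct and can be compared. With merged URIs, statements from both versions accumulate on a single class.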
Three central issues with the current OMG ODM specification
• Modules derivation from packages
  • Impossible to apply in practice (too many modules)
• Naming conventions to disambiguate property names
• Can lead to an explosion of the number of properties often not required
• Does not discuss the union & substitution group patterns which are widely used in ISO/OGC standards
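The property explosion mentioned in the second issue can be illustrated with a small count. The class and attribute names below are hypothetical, and prefixing each property with its owning class is used here only as a stand-in for a disambiguating naming convention, not as a quotation of the ODM rules.

```python
from typing import Dict, List, Set

# Hypothetical UML classes and their attribute names.
CLASSES: Dict[str, List[str]] = {
    "Observation": ["name", "description", "result"],
    "Procedure": ["name", "description"],
    "Feature": ["name", "description"],
}


def qualified_properties(classes: Dict[str, List[str]]) -> Set[str]:
    """One property per (class, attribute) pair, disambiguated by class name."""
    return {f"{cls}.{attr}" for cls, attrs in classes.items() for attr in attrs}


def shared_properties(classes: Dict[str, List[str]]) -> Set[str]:
    """One property per distinct attribute name, shared across classes."""
    return {attr for attrs in classes.values() for attr in attrs}


qualified = qualified_properties(CLASSES)
shared = shared_properties(CLASSES)
```

In this toy model the disambiguated form yields 7 properties against 3 shared ones, and the gap widens as more classes reuse common attribute names.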
Building better ontologies
• Assumption
  • Ontologies can help standards amateurs to understand them without reading the documentation or learning how they have been created
• This discussion
  • Tactics to capture the semantic essence of ISO/OGC & derived standards
Tactic 1: Stick to the original definitions
• Rendition of ISO standards which mirrors the original UML model
  • Drexel University team
  • ISO and OGC ontologies in OWL-Protégé 2.1 (2004-05)
  • ISO 19103, 19107-12, 19115, OGC Spatial referencing by Coordinates and GML
• Success factor: traceability to the origin of definitions (often overlooked)
Tactic 1: benefits of traceability
• Handle multiple definitions of Observations
  • OM1_Observation: published OGC O&M spec. (part 1) version 1.0
  • OM: draft version of O&M
  • GML: gml:Observation
Tactic 2: Modularise, Winnow, Align with Upper Ontology
• University of Muenster (and EU projects partners)
• ACE-GIS: OWL-Protégé 1.2 (2004), SERES: OWL-Protégé 2.2 (2005), SWING: WSML (2008)
• Spatial representation (19107), Location (19111-19112), O&M (alignment with DOLCE and SWEET)
• Generally based on a costly manual process
• Match what the end user wants
• Weaker traceability to the sources of definitions
Tactic 3: Try to do both
• Replace the manual process by a smarter transformation designed to normalise the ontology skeleton
  • Define the right branches at the top
  • Isolate unambiguous primitives (e.g. units)
  • Use modules/namespace/URIs to position source-specific definitions against common ones
• Specific effort needed to
  • Reduce the number of root classes
  • Create deeper class & property hierarchies
  • Handle ambiguous property definitions
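Two of the goals above, fewer root classes and deeper hierarchies, can be measured on a generated ontology with very little code. The sketch assumes a single-parent class hierarchy without cycles, stored as a child-to-parent mapping; the class names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical class hierarchy: class -> direct superclass (None for a root).
PARENT: Dict[str, Optional[str]] = {
    "Feature": None,
    "SamplingFeature": "Feature",
    "Borehole": "SamplingFeature",
    "Observation": None,
    "Unit": None,
}


def root_classes(parent: Dict[str, Optional[str]]) -> List[str]:
    """Classes with no declared superclass, in alphabetical order."""
    return sorted(cls for cls, sup in parent.items() if sup is None)


def depth(cls: str, parent: Dict[str, Optional[str]]) -> int:
    """Number of subclass steps between cls and its root (assumes no cycles)."""
    steps = 0
    while parent[cls] is not None:
        cls = parent[cls]
        steps += 1
    return steps


def max_depth(parent: Dict[str, Optional[str]]) -> int:
    """Depth of the deepest class in the hierarchy."""
    return max(depth(cls, parent) for cls in parent)
```

Tracking `len(root_classes(...))` and `max_depth(...)` before and after a normalisation step gives a rough indication of whether the skeleton is moving in the intended direction.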
Example of normalised ontology skeleton
Define the right branches at the top
Isolate unambiguous primitives (e.g. units)
Use modules/namespace/URIs to position source-specific definitions against common ones
Conclusions
• Better ontology generation tactics can help to satisfy the demand for OWL versions of (groups of) standards
• Three priority areas have been identified
  • Systematically develop parallel transformation chains from UML and XML schemas to enable cross-checking of outputs
  • Develop more convenient and more robust modularity, namespace and URI schemes
  • Give feedback to ISO/OGC Policy group on the compatibility of their approach with OWL
Inputs for ISO 19150 Ontology group
• Can SW help ISO TC 211 (and OGC)?
  • Modelling and reasoning power of OWL
  • Sub-properties in v. 1.0 and role composition in v. 2.0
  • Top level class hierarchy skeleton: normalised form of ontologies, alignment to upper ontologies
• Can ISO TC 211 (and OGC) help SW?
  • Method to define a standard as a derived product of another one
  • Transposable experience on how to extend or restrict a specification
  • Use cases to inform SW work on ontologies and rules
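Role composition (property chains in OWL 2) lets a new relation be inferred from two existing ones. The sketch below shows the underlying idea as plain relational composition in Python. The individuals and the idea of chaining a feature-of-interest link with a sampled-feature link are hypothetical examples, not axioms taken from any of the ontologies discussed.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def compose(first: Set[Pair], second: Set[Pair]) -> Set[Pair]:
    """Return all (a, c) such that (a, b) is in first and (b, c) is in second."""
    return {(a, c) for a, b in first for b2, c in second if b == b2}


# Hypothetical assertions: an observation made on a borehole that samples an aquifer.
HAS_FEATURE_OF_INTEREST: Set[Pair] = {("obs1", "borehole7")}
SAMPLED_FEATURE: Set[Pair] = {("borehole7", "aquifer42")}

# Inferred relation: which real-world feature each observation is ultimately about.
ULTIMATELY_ABOUT = compose(HAS_FEATURE_OF_INTEREST, SAMPLED_FEATURE)
```

An OWL 2 reasoner would derive the same pairs from a property chain axiom, without the composition being coded by hand.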
Acknowledgements
Thanks to:
• Ross Ackland, WRON Theme Leader, CSIRO
• Simon Cox, Research scientist, CSIRO and OGC
• Amit Parashar, CSIRO and Australian W3C office
And also to:
• TopQuadrant for TopBraid Composer
• Rhizomik.net for xsd2owl.xsl
• Rick Jelliffe et al: XSL pre-processing of XML schemas
CSIRO ICT Centre
Laurent Lefort, Senior Research Engineer (Ontologies)
Phone: +61 2 6216 7046
Email: [email protected]
Web: wron.net.au
Contact Us
Phone: 1300 363 400 or +61 3 9545 2176
Email: [email protected]
Web: www.csiro.au
Thank you
Backup slides
Standards
• GeoSciML (Geoscience Markup Language)
  • GFM-based, first standard to partially leverage O&M
• GWML (Groundwater Markup Language), WOML (Water Observation Markup Language)
  • Two preliminary efforts based on O&M to create groundwater and surface water standards
• CSML (Climate Science Modelling Language)
  • Adapting & completing O&M for Met/Ocean data
• DHS-GDM (Department of Homeland Security Geospatial Data Model)
  • Huge compilation of standards for homeland security applications
Summary of the 4 methods
Transfo. method | URI management | Modules based on | OWL variant | Comments
TopBraid XSD | Namespace | Namespaces, using specific conventions | DL | Best result in general
TopBraid UML | Package (too complex URIs) | Package (too many modules) | DL | Import not always successful
Rhizomik XSD | Unique | Unique | EL+ | Adapted to handle multi-part schemas
XO UML | Unique | Unique | EL+ | Stereotypes (esp. Unions)