Semantic Technologies and Application to Climate Data

M. Benno Blumenthal

IRI/Columbia University

CDW 2011-03-30/04-01

Triplets of
• Subject
• Property (or Predicate)
• Object

URIs identify things, i.e. most of the above

Namespaces are used as a convenient shorthand for the URIs

Inferred triples (RDFS, OWL, rules)

{WOA} dc:title “World Ocean Atlas”

{WOA} iridl:hasPart {Monthly}

{dc:title} rdfs:isDefinedBy {dc:}
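The three example triples can be sketched in plain Python to show how namespace prefixes expand to URIs and how a rule adds inferred triples. This is only an illustration: the dc: and rdfs: namespace URIs are the standard ones, but the iridl: and ex: URIs, the isPartOf property, and the inverse rule are assumptions made for the example, not taken from the IRI ontologies.

```python
# Namespace prefixes are shorthand for full URIs.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "iridl": "http://example.org/iridl#",  # hypothetical namespace URI
    "ex": "http://example.org/data/",      # hypothetical base for {WOA}, {Monthly}
}


def expand(curie):
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, _, local = curie.partition(":")
    return NAMESPACES[prefix] + local


# Asserted triples: (subject, property, object).
# Literals are kept as plain strings here, which a real store would not do.
triples = {
    (expand("ex:WOA"), expand("dc:title"), "World Ocean Atlas"),
    (expand("ex:WOA"), expand("iridl:hasPart"), expand("ex:Monthly")),
    (expand("dc:title"), expand("rdfs:isDefinedBy"), expand("dc:")),
}


def infer_is_part_of(store):
    """Toy rule (an assumption): if A hasPart B, infer B isPartOf A."""
    has_part = expand("iridl:hasPart")
    is_part_of = expand("iridl:isPartOf")
    return {(o, is_part_of, s) for (s, p, o) in store if p == has_part}


inferred = infer_is_part_of(triples)
all_triples = triples | inferred

for triple in sorted(all_triples):
    print(triple)
```

In a real deployment the inference would come from RDFS/OWL semantics or rules in the triple store rather than from a hand-written function.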

RDF: single framework for writing multiple systems

blind monks examining an elephant

John Godfrey Saxe (1816-1887)

Multiple partial representations of objects described by data

Standard Metadata

[Diagram: Users, Datasets, and Tools connected through Standard Metadata]

Standard Metadata Schema/Data Services

[Diagram: Tools, Users, and Datasets connected through a Standard Metadata Schema]

RDF Data Model Exchange

[Diagram, shown in several builds: Tools, Users, and Datasets, each with its own Standard Metadata Schema, linked through RDF]

Models, Crosswalks, and Objects

Structure of the RDF information that we are using to represent data objects in multiple frameworks (see full figure)

Data Server leads to URI

The IRI Data Library is a pure REST interface, so there is a URL for everything: a dataset, a variable, a series of analysis filters applied to variables, an image, a data file.

RDF can thus be used to annotate everything.
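Because every resource has a URL, that URL can serve directly as the subject of RDF triples. The sketch below takes a shortened form of the maproom URL from the sample DIF-CD record, splits it into its path steps and query parameters, and attaches one annotation to it. Treating the dot-prefixed path segments as "steps" and deriving a title from them are assumptions made for illustration only.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened form of the maproom URL used in the sample DIF-CD record.
url = (
    "http://iridl.ldeo.columbia.edu/maproom/.ENSO/.Climate_Impacts/"
    ".ENSO_PRCP_Prob/index.html?map.lon.plotfirst=100&map.lon.plotlast=180"
    "&map.lat.plotfirst=-55&map.lat.plotlast=0"
)

parts = urlsplit(url)

# Assumption: dot-prefixed path segments name successive steps into the collection.
path_steps = [seg.lstrip(".") for seg in parts.path.split("/") if seg.startswith(".")]

# The query string selects the plotted region.
region = {key: values[0] for key, values in parse_qs(parts.query).items()}

# The URL itself is the subject of the annotation triple.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
label = " ".join(path_steps).replace("_", " ")  # illustrative title only
annotations = [(url, DC_TITLE, label)]

print(path_steps)
print(region)
print(annotations[0])
```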

IRI Data Library Overview

[Diagram labels]
• IRI Data Collection: Dataset, Variable, ivar (multidimensional)
• Generalized Data Tools
• Specialized Data Tools
• Data Viewer
• Data Language
• Maproom

URL/URI for data, calculations, figs, etc.

[Diagram labels]
• IRI Data Collection: Dataset, Variable, ivar
• Calculations (“virtual variables”)
• Images, graphics
• Descriptive and navigational pages
• OpenGIS WMS/WCS
• KML
• Data Files: netCDF, binary, images
• Clients: OpenDAP, THREDDS
• Tables
• Servers: OpenDAP, THREDDS
• GRIB, netCDF, images, binary
• Database: tables, queries
• Spreadsheets, shapefiles
• Images w/proj

Dataset Objects

Crosswalk to Faceted Search

Crosswalk to DIF-CD Records

Sample DIF-CD

<DIF>

<Entry_ID>IRIDL_ENSO_Climate_Impacts_ENSO_PRCP_Prob_Australia</Entry_ID>

<Entry_Title>ENSO Climate Impacts ENSO PRCP Prob Australia</Entry_Title>

<Data_Set_Citation>

<Dataset_Title>ENSO Climate Impacts ENSO PRCP Prob Australia</Dataset_Title>

<Online_Resource>

http://iridl.ldeo.columbia.edu/maproom/.ENSO/.Climate_Impacts/.ENSO_PRCP_Prob/index.html?map.lon.plotfirst=100&map.lon.plotlast=180&map.lat.plotfirst=-55&map.lat.plotlast=0&map.lat.units=degree_north&map.lon.units=degree_east

</Online_Resource>

</Data_Set_Citation>

<Parameters>

<Category>EARTH SCIENCE</Category>

<Topic>ATMOSPHERE</Topic>

<Term>PRECIPITATION</Term>

<Variable_Level_1>PRECIPITATION RATE</Variable_Level_1>

</Parameters>
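A crosswalk of this kind can be pictured as a table that maps metadata properties to paths of DIF element names. The sketch below builds part of the sample record that way with Python's standard ElementTree module. The property names on the left of the table (ex:entryId and so on) are hypothetical stand-ins, and this is not the mechanism the talk describes (a Java class reading from a triple store); it only illustrates the mapping idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical crosswalk: metadata property -> path of DIF element names.
CROSSWALK = {
    "ex:entryId": ["Entry_ID"],
    "dc:title": ["Entry_Title"],
    "ex:category": ["Parameters", "Category"],
    "ex:topic": ["Parameters", "Topic"],
    "ex:term": ["Parameters", "Term"],
    "ex:variableLevel1": ["Parameters", "Variable_Level_1"],
}


def to_dif(properties):
    """Build a DIF element tree from a {property: value} mapping."""
    root = ET.Element("DIF")
    for prop, value in properties.items():
        *containers, leaf = CROSSWALK[prop]
        parent = root
        for name in containers:
            found = parent.find(name)
            # Reuse an existing container (e.g. Parameters) if it is already there.
            parent = found if found is not None else ET.SubElement(parent, name)
        ET.SubElement(parent, leaf).text = value
    return root


record = to_dif({
    "ex:entryId": "IRIDL_ENSO_Climate_Impacts_ENSO_PRCP_Prob_Australia",
    "dc:title": "ENSO Climate Impacts ENSO PRCP Prob Australia",
    "ex:category": "EARTH SCIENCE",
    "ex:topic": "ATMOSPHERE",
    "ex:term": "PRECIPITATION",
    "ex:variableLevel1": "PRECIPITATION RATE",
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```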

XML Schema to OWL Translation

Based on existing software, but extended
• Bi-directional: enough information is preserved to generate conforming XML documents (a Java class extracts XML elements from a triple store)
• Structure is in the schema information, not the instance
• A fixed XSLT converts instance files to RDF

XML Schema to OWL Translation

XML Schema
– Instance translation is essentially an alternate RDF/XML representation where only the properties are nested
– A standard XML file has all blank-node entities
– An XML schema with rdf:about/rdf:resource can have URI entities

It makes sense that the instance file does not explicitly type all the elements.
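The "only the properties are nested" reading can be sketched as follows: every XML element is treated as a property, and whatever contains it is an entity, either a blank node or, when rdf:about is present, a URI. This is a Python stand-in written for illustration, not the fixed XSLT mentioned above; the blank-node naming scheme and the use of bare element names as property names are assumptions.

```python
import itertools
import xml.etree.ElementTree as ET

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"


def xml_to_triples(xml_text):
    """Read an XML instance as nested properties and return (root_node, triples)."""
    counter = itertools.count(1)
    triples = []

    def node_for(element):
        # An rdf:about attribute gives the entity a URI; otherwise it is a blank node.
        about = element.get(RDF_ABOUT)
        return about if about is not None else f"_:b{next(counter)}"

    def walk(subject, element):
        for child in element:
            if len(child):
                # Nested properties imply an untyped entity between the two levels.
                obj = node_for(child)
                triples.append((subject, child.tag, obj))
                walk(obj, child)
            else:
                triples.append((subject, child.tag, (child.text or "").strip()))

    root = ET.fromstring(xml_text)
    root_node = node_for(root)
    walk(root_node, root)
    return root_node, triples


sample = """<DIF>
  <Entry_Title>ENSO Climate Impacts ENSO PRCP Prob Australia</Entry_Title>
  <Parameters>
    <Category>EARTH SCIENCE</Category>
    <Topic>ATMOSPHERE</Topic>
  </Parameters>
</DIF>"""

root_node, triples = xml_to_triples(sample)
for triple in triples:
    print(triple)
```

Note that no entity is given an rdf:type here; in this sketch, as in the slides, typing is left to the schema side.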

IRI RDF Architecture

[Architecture diagram; component labels as recovered from the figure]
• Start Point
• Data Servers
• Ontologies
• MMI
• JPL
• Standards Organizations
• bibliography
• RDF/XML-Schema Crawler
• XSLT/GRDDL ingest
• XML Schema to OWL translation
• OWL Semantics
• SWRL Rules
• SeRQL CONSTRUCT
• Location Canonicalizer
• Time Canonicalizer
• Sesame
• Search Queries
• Search Interface

SSWAP

Simple Semantic Web Architecture and Protocol

A way of providing a service that semantically describes its domain and range to advertise it. To invoke it, both domain and range are restricted.

Traditionally we specify a chain of processing steps, and provenance documents that effort. SSWAP instead specifies an object by constraining it: you could constrain its provenance to get it “traditionally”, or constrain some other quality.

Multiplicity of Data Representations

RDF provides a unifying framework to simultaneously hold and deliver dataset metadata according to multiple standards

Models, Crosswalks, and Objects organizes that framework, clarifying the semantic distance spanned

Bidirectional XML Schema to OWL translation enables delivery of inferred metadata to existing XML-based systems

Persistence with inference/transform is the underlying technology

Semantic Service Framework could extend this framework to semantically-informed workflow generation