20081118 Fox OOS meeting 1 Ontologies and Semantic Applications in Earth Sciences Peter Fox (TWC/RPI; formerly HAO/NCAR) Thanks to many. Projects funded by NSF/OCI and NASA/ACCESS/ESTO

Ontologies and Semantic Applications in Earth Sciences


Page 1: Ontologies and Semantic Applications in Earth Sciences

20081118 Fox OOS meeting1

Ontologies and Semantic Applications in Earth Sciences

Peter Fox (TWC/RPI; formerly HAO/NCAR)

Thanks to many.

Projects funded by NSF/OCI and NASA/ACCESS/ESTO

Page 2: Ontologies and Semantic Applications in Earth Sciences

Background

Scientists should be able to access a global, distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available

But… data is obtained by multiple means (models and instruments), using various protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) metadata. It may be inconsistent, incomplete, evolving, and distributed.

And… there exist(ed) significant levels of semantic heterogeneity, large-scale data, complex data types, legacy systems, inflexible and unsustainable implementation technology.

Page 3: Ontologies and Semantic Applications in Earth Sciences

Data-types as service

[Diagram labels: VO App1, VO App2, VO App3; VO layer (VOTable, Simple Image Access Protocol, Simple Spectrum Access Protocol, Simple Time Access Protocol); DB1, DB2, DB3, … DBn]

• Limited interoperability
• Lightweight semantics
• Limited meaning, hard coded
• Limited extensibility
• Under review

Open Geospatial Consortium: Web {Feature, Coverage, Mapping} Service
Sensor Web Enablement: Sensor {Observation, Planning, Analysis} Service
use the same approach

Page 4: Ontologies and Semantic Applications in Earth Sciences

20080602 Fox VSTO et al.

“Knowledge” as service!

[Diagram labels: VO Portal, Web Serv., VO API; Semantic mediation layer - mid-upper-level; Semantic mediation layer - VSTO - low level; DB1, DB2, DB3, … DBn. Annotations: Education, clearinghouses, other services, disciplines, etc.; Metadata, schema, data; Query, access and use of data; Semantic query, hypothesis and inference; Semantic interoperability; Added value (repeated four times); Standard, or not, vocabularies and schema]

Mediation Layer
• Ontology - capturing concepts of Parameters, Instruments, Date/Time, Data Product (and associated classes, properties) and Service Classes
• Maps queries to underlying data
• Generates access requests for metadata, data
• Allows queries, reasoning, analysis, new hypothesis generation, testing, explanation, etc.
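The mediation-layer role described on this slide (ontology concepts, mapping queries to underlying data, generating access requests) can be illustrated with a minimal sketch. It is not the VSTO implementation: the concept names, dataset identifiers, and the `mediate` function are all hypothetical, and a plain dictionary stands in for the ontology.

```python
from dataclasses import dataclass

# Hypothetical mini-ontology: every name below is made up for illustration.
# Each parameter concept is linked to the instruments that can measure it.
MEASURED_BY = {
    "NeutralTemperature": ["FabryPerotInterferometer", "IncoherentScatterRadar"],
    "ElectronDensity": ["IncoherentScatterRadar"],
}

# Hypothetical catalog tying each instrument to the dataset that holds its data.
DATASET_FOR = {
    "FabryPerotInterferometer": "db1/fpi",
    "IncoherentScatterRadar": "db2/isr",
}


@dataclass(frozen=True)
class AccessRequest:
    """A request the mediation layer would hand to an underlying data service."""
    dataset: str
    parameter: str
    start: str
    end: str


def mediate(parameter: str, start: str, end: str) -> list[AccessRequest]:
    """Map a query phrased in ontology terms to concrete access requests."""
    instruments = MEASURED_BY.get(parameter, [])
    return [
        AccessRequest(DATASET_FOR[name], parameter, start, end)
        for name in instruments
        if name in DATASET_FOR
    ]


requests = mediate("NeutralTemperature", "2005-01-26", "2005-01-27")
for request in requests:
    print(request)
```

The caller names only a parameter and a time range; which databases to contact is worked out from the concept links, which is the sense in which the layer offers knowledge rather than data types as a service.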

Page 5: Ontologies and Semantic Applications in Earth Sciences


Semantic Web Methodology and Technology Development Process

• Establish and improve a well-defined methodology vision for Semantic Technology based application development

• Leverage any existing vocabularies

[Diagram labels, development cycle: Use Case; Small Team, mixed skills; Analysis; Adopt Technology Approach; Leverage Technology Infrastructure; Rapid Prototype; Open World: Evolve, Iterate, Redesign, Redeploy; Use Tools; Science/Expert Review & Iteration; Develop model/ontology]

Page 6: Ontologies and Semantic Applications in Earth Sciences

E.g. Science and technical use cases

Find data which represents the state of the neutral atmosphere anywhere above 100 km and toward the Arctic Circle (above 45N) at any time of high geomagnetic activity.
– Extract information from the use-case - encode knowledge
– Translate this into a complete query for data - inference and integration of data from instruments, indices and models

Provide semantically-enabled, smart data query services via a SOAP web service for the Virtual Ionosphere-Thermosphere-Mesosphere Observatory that retrieve data, filtered by constraints on Instrument, Date-Time, and Parameter in any order and with constraints included in any combination.
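The "any order, any combination" requirement in the technical use case above can be sketched as a query function whose constraints are all optional. This is an in-memory illustration only, not the SOAP service: the `Record` type, the `query` function, and the sample records are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Record:
    """Hypothetical catalog entry: one parameter measured by one instrument."""
    instrument: str
    parameter: str
    time: datetime


def query(records, instrument=None, parameter=None, start=None, end=None):
    """Return the records that satisfy every constraint that was supplied.

    A constraint left as None is not applied, so callers can combine
    constraints freely and pass them as keywords in any order.
    """
    matches = []
    for record in records:
        if instrument is not None and record.instrument != instrument:
            continue
        if parameter is not None and record.parameter != parameter:
            continue
        if start is not None and record.time < start:
            continue
        if end is not None and record.time > end:
            continue
        matches.append(record)
    return matches


# Made-up sample records, present only to exercise the function.
RECORDS = [
    Record("InstrumentA", "Temperature", datetime(2005, 1, 26, 18, 0)),
    Record("InstrumentA", "WindSpeed", datetime(2005, 1, 27, 6, 0)),
    Record("InstrumentB", "Temperature", datetime(2005, 1, 27, 12, 0)),
]

by_parameter = query(RECORDS, parameter="Temperature")
combined = query(RECORDS, start=datetime(2005, 1, 27), instrument="InstrumentA")
print(len(by_parameter), len(combined))
```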

Page 7: Ontologies and Semantic Applications in Earth Sciences


VSTO - semantics and ontologies in an operational environment: vsto.hao.ucar.edu, www.vsto.org

Web Service

Existing OPeNDAP Service

Page 8: Ontologies and Semantic Applications in Earth Sciences


Semantic Web Services

Page 9: Ontologies and Semantic Applications in Earth Sciences

Semantic Web Services

OWL document returned using VSTO ontology - can be used either syntactically or semantically

Page 10: Ontologies and Semantic Applications in Earth Sciences

Semantic Web Benefits
• Unified/abstracted query workflow: Parameters, Instruments, Date-Time across widely different disciplines
• Decreased input requirements for query: in one case reducing the number of selections from eight to three
• Semantic query support: by using background ontologies and a reasoner, our application has the opportunity to only expose coherent queries (portal and services)
• Semantic integration: in the past users had to remember (and maintain codes) to account for numerous different ways to combine and plot the data whereas now semantic mediation provides the level of sensible data integration required, and exposed as smart web services
  – understanding of coordinate systems, relationships, data synthesis, transformations, etc.
  – returns independent variables and related parameters
• A broader range of potential users (PhD scientists, students, professional research associates and those from outside the fields)
• VSTO: http://vsto.hao.ucar.edu, http://www.vsto.org
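The "only expose coherent queries" benefit listed above can be illustrated by narrowing the choices offered for one query dimension once another has been selected. In this sketch a lookup table of valid pairs stands in for the background ontologies and reasoner; the instrument and parameter names and both functions are hypothetical.

```python
# Hypothetical (instrument, parameter) pairs for which a query makes sense.
# In a semantic application these would be derived from the ontology.
VALID_PAIRS = {
    ("InstrumentA", "Temperature"),
    ("InstrumentA", "WindSpeed"),
    ("InstrumentB", "Temperature"),
}


def coherent_parameters(instrument=None):
    """Parameters that may still be offered given the chosen instrument."""
    return sorted(
        {param for inst, param in VALID_PAIRS
         if instrument is None or inst == instrument}
    )


def coherent_instruments(parameter=None):
    """Instruments that may still be offered given the chosen parameter."""
    return sorted(
        {inst for inst, param in VALID_PAIRS
         if parameter is None or param == parameter}
    )


print(coherent_parameters("InstrumentB"))
print(coherent_instruments("WindSpeed"))
```

Because each selection prunes the remaining menus, a user cannot assemble a request for a parameter that the chosen instrument never measures.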

Page 11: Ontologies and Semantic Applications in Earth Sciences

Fox RPI: Semantic Data Frameworks May 14, 2008


http://dataportal.ucar.edu/schemas/vsto_all.owl (1.0, 2.0 coming)

Page 12: Ontologies and Semantic Applications in Earth Sciences


Ingest/pipelines: problem definition

• Data is coming in faster, in greater volumes and outstripping our ability to perform adequate quality control

• Data is being used in new ways and we frequently do not have sufficient information on what happened to the data along the processing stages to determine if it is suitable for a use we did not envision

• We often fail to capture, represent and propagate manually generated information that needs to go with the data flows

• Each time we develop a new instrument, we develop a new data ingest procedure and collect different metadata and organize it differently. It is then hard to use with previous projects

• The task of event determination and feature classification is onerous and we don't do it until after we get the data

Page 13: Ontologies and Semantic Applications in Earth Sciences

Page 14: Ontologies and Semantic Applications in Earth Sciences

Use cases

• Who (person or program) added the comments to the science data file for the best vignetted, rectangular polarization brightness image from January 26, 2005, 18:49:09 UT taken by the ACOS Mark IV polarimeter?
• What were the cloud cover and atmospheric seeing conditions during the local morning of January 26, 2005 at MLSO?
• Find all good images on March 21, 2008.
• Why are the quick look images from March 21, 2008, 19:00 UT missing?
• Why does this image look bad?

Page 15: Ontologies and Semantic Applications in Earth Sciences

Page 16: Ontologies and Semantic Applications in Earth Sciences

Page 17: Ontologies and Semantic Applications in Earth Sciences

Provenance

• Origin or source from which something comes, intention for use, who/what generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility

• Knowledge provenance; enrich with ontologies and ontology-aware tools
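One way to picture provenance as defined above is a chain of recorded processing steps that can be walked backwards, which is what a question such as "who (person or program) added the comments" needs. The sketch below is an assumption-laden illustration: it is not PML or any tool named in these slides, and the `ProvenanceStep` type, the `lineage` function, and the step names are hypothetical. It also assumes each product is produced by exactly one recorded step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceStep:
    """One recorded step in the history of a data product (hypothetical)."""
    product: str            # identifier of the product this step produced
    action: str             # what was done
    agent: str              # person or program that did it
    source: Optional[str]   # product it was derived from, None for raw data


def lineage(product, steps):
    """Walk back from a product to its origin, newest step first."""
    by_product = {step.product: step for step in steps}
    chain, seen = [], set()
    current = product
    # The 'seen' set guards against looping forever on a cyclic record.
    while current in by_product and current not in seen:
        seen.add(current)
        step = by_product[current]
        chain.append(step)
        current = step.source
    return chain


# Made-up history, present only to exercise the function.
STEPS = [
    ProvenanceStep("raw_image", "acquire", "acquisition_program", None),
    ProvenanceStep("calibrated_image", "calibrate", "calibration_pipeline", "raw_image"),
    ProvenanceStep("annotated_image", "add_comment", "observer", "calibrated_image"),
]

history = lineage("annotated_image", STEPS)
commenters = [step.agent for step in history if step.action == "add_comment"]
print(commenters)
```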

Page 18: Ontologies and Semantic Applications in Earth Sciences

Page 19: Ontologies and Semantic Applications in Earth Sciences

Page 20: Ontologies and Semantic Applications in Earth Sciences

Quick look browse

Page 21: Ontologies and Semantic Applications in Earth Sciences

Page 22: Ontologies and Semantic Applications in Earth Sciences

Visual browse

Page 23: Ontologies and Semantic Applications in Earth Sciences

Page 24: Ontologies and Semantic Applications in Earth Sciences

Page 25: Ontologies and Semantic Applications in Earth Sciences

Search and structured query

Page 26: Ontologies and Semantic Applications in Earth Sciences


Search

Page 27: Ontologies and Semantic Applications in Earth Sciences

Data Integration Use Case
• Determine the statistical signatures of both volcanic and solar forcings on the height of the tropopause

Page 28: Ontologies and Semantic Applications in Earth Sciences


Detection and attribution relations…

Page 29: Ontologies and Semantic Applications in Earth Sciences


Page 30: Ontologies and Semantic Applications in Earth Sciences

SWEET 2.0

Page 31: Ontologies and Semantic Applications in Earth Sciences

Semantic framework indicating how volcano and atmospheric parameters and databases can immediately be plugged into the semantic data framework to enable data integration.

Page 32: Ontologies and Semantic Applications in Earth Sciences

Faceted Search


Page 33: Ontologies and Semantic Applications in Earth Sciences

Summary
• Level of ontology encoding relates to use, e.g.
  – VSTO:
  – SPCDIS:
  – SESDI: Data integration needs higher level of curation of ontologies and mapping to data
• Languages and tools
  – Rapid prototyping (PHP, Semantic MediaWiki)
  – Clean and simple (RDFS, Perl and SPARQL)
  – Complex and rich (Java, Protégé, Jena, Pellet, ELMO, Maven, Eclipse)

Page 34: Ontologies and Semantic Applications in Earth Sciences

Modified GEON Solution Framework

Level 1: Data Registration at the Discovery Level, e.g. Volcano location and activity
Level 2: Data Registration at the Inventory Level, e.g. list of datasets by types, times, products
Level 3: Data Registration at the Item Detail Level, e.g. access to individual quantities

Ontology based Data Integration

Earth Sciences Virtual Database: A Data Warehouse where Schema heterogeneity problem is Solved; schema based integration

Data Discovery; Data Integration

A.K.Sinha, Virginia Tech, 2006

Page 35: Ontologies and Semantic Applications in Earth Sciences

Spare material


Page 36: Ontologies and Semantic Applications in Earth Sciences

Example 1: Registration of Volcanic Data

SO2 Emission from Kilauea east rift zone - vehicle-based (Source: HVO)

Abbreviations: t/d=metric tonne (1000 kg)/day, SD=standard deviation, WS=wind speed, WD=wind direction east of true north, N=number of traverses

Location Codes:
• U - Above the 180° turn at Holei Pali (upper Chain of Craters Road)
• L - Below Holei Pali (lower Chain of Craters Road)
• UL - Individual traverses were made both above and below the 180° turn at Holei Pali
• H - Highway 11

Page 37: Ontologies and Semantic Applications in Earth Sciences


Registering Volcanic Data (2)

• No explicit lat/long data

• Volcano identified by name

• Volcano ontology framework will link name to location
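The registration step described above, where a dataset names its volcano but carries no coordinates, can be sketched as a lookup from name to location. Here a small dictionary stands in for the volcano ontology framework; the `register` function and `Registered` type are hypothetical, and the Kilauea coordinates are approximate values included only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-in for the volcano ontology: name -> (latitude, longitude) in degrees.
# The coordinates are approximate and serve only as an example entry.
VOLCANO_LOCATION = {
    "kilauea": (19.42, -155.29),
}


@dataclass(frozen=True)
class Registered:
    """A dataset's volcano name together with the location resolved for it."""
    volcano: str
    latitude: Optional[float]
    longitude: Optional[float]


def register(volcano_name: str) -> Registered:
    """Resolve a volcano name to a location; unknown names stay unlocated."""
    key = volcano_name.strip().lower()
    latitude, longitude = VOLCANO_LOCATION.get(key, (None, None))
    return Registered(volcano_name, latitude, longitude)


print(register("Kilauea"))
print(register("Unlisted volcano"))
```

Once a record carries a resolved location, it can be matched against other datasets by place even though the original table never stated a latitude or longitude.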

Page 38: Ontologies and Semantic Applications in Earth Sciences


Registering Atmospheric Data (2)

Page 39: Ontologies and Semantic Applications in Earth Sciences

Building blocks

• Data formats and metadata: IAU standard FITS, with SoHO keyword convention, JPEG, GIF
• Ontologies: OWL-DL and RDF
• The Proof Markup Language (PML) provides an interlingua for capturing the information agents need to understand results and to justify why they should believe the results.
• The Inference Web toolkit provides a suite of tools for manipulating, presenting, summarizing, analyzing, and searching PML, so that end users can understand information and its derivation, thereby facilitating trust in and reuse of information.
• Capturing semantics of data quality, event, and feature detection within suitable community ontology packages (SWEET, VSTO)