84
1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Embed Size (px)

Citation preview

Page 1: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

1

Foundations VI: Foundations VI: Discovery, Access and

Semantic Integration

Deborah McGuinness and Peter Fox

CSCI-6962-01

Week 11, November 10, 2008

Page 2: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Contents• Review of reading, questions, comments

• Semantic Integration using SESDI – Semantically-Enabled Scientific Data Integration as an example

• Semantically-Enabled Search – ex. Noesis

• Integration using Top Level Science ontologies

• Summary

• Next week2

Page 3: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

References• Fox, P.; McGuinness, D.L.; Raskin, R.; Sinha, K. A Volcano Erupts: Semantically

Mediated Integration of Heterogeneous Volcanic and Atmospheric Data. Proceedings of the First Workshop on Cyberinfrastructure: Information Management in eScience, co-located with the ACM Conference on Information

and Knowledge Management, Lisbon, Portugal, November 9, 2007. ftp://ftp.ksl.stanford.edu/pub/KSL_Reports/KSL-07-09.pdf

• Sunil Movva, Rahul Ramachandran, Xiang Li, Phani Cherukuri, Sara Graves. Noesis: A Semantic Search Engine and Resource Aggregator for Atmospheric Science. NSTC2007. http://esto.nasa.gov/conferences/nstc2007/papers/Ramachandran_Rahul_A3P4_NSTC-07-0084.pdf

• Boyan Brodaric and Florian Probst. Enabling Cross-Disciplinary e-Science by Integrating Geoscience Ontologies with DOLCE. Under Review. 2008.

• Yolanda Gil, Ewa Deelman, Mark Ellisman, Thomas Fahringer, Geoffrey Fox, Dennis Gannon, Carole Goble, Miron Livny, Luc Moreau, Jim Myers, "Examining the Challenges of Scientific Workflows," Computer , vol. 40, no. 12, pp. 24-32, December, 2007. http://www.isi.edu/~gil/papers/computer-NSFworkflows07.pdf

3

Page 4: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

4

Semantic Web Methodology and Technology Development Process

• Establish and improve a well-defined methodology vision for Semantic Technology based application development

• Leverage controlled vocabularies, et c.

Use Case

Small Team, mixed skills

Analysis

Adopt Technology Approach

Leverage Technology Infrastructur

e

Rapid PrototypeOpen World:

Evolve, Iterate, Redesign, Redeploy

Use Tools

Science/Expert Review & Iteration

Develop model/

ontology

EvaluationEvaluation

Page 5: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Motivation for Semantic Integration

• In order to solve problems that are inherently multi-disciplinary, researchers often need data from many varied sources.

• Consider problems such as global warming or some problems that you suggested 2 classes ago – e.g.impact of earthquakes on transportation, etc.

Page 6: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Semantically Enabled Scientific Data Integration

• SESDI slides from joint work with Fox, McGuinness, Raskin, Sinha and materials from – McGuinness et al, Geoinformatics 2007– Fox et al, ESTO 2008

Page 7: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Mt. Spurr, AK. 8/18/1992 eruption, USGS

http://www.avo.alaska.edu/image.php?id=319

Page 8: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Eruption cloud movement from Mt.Spurr, AK,1992

USGS

Page 9: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Tropopause

http://aerosols.larc.nasa.gov/volcano2.swf

Page 10: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Atmosphere Use Case• Determine the statistical signatures of both

volcanic and solar forcings on the height of the tropopause From paleoclimate researcher – Caspar Ammann – Climate and Global

Dynamics Division of NCAR - CGD/NCAR

Layperson perspective:

- look for indicators of acid rain in the part of the atmosphere we experience…

(look at measurements of sulfur dioxide in relation to sulfuric acid after volcanic eruptions at the boundary of the troposphere and the stratosphere)

Page 11: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

SESDI Impact: A Better Way to Access DataThe Problem

Scientists often only use data from a single instrument because it is difficult to access, process, and understand data from multiple instruments. A typical data query might be:

“Give me the temperature, pressure, and water vapor from the AIRS instrument from Jan 2005 to Jan 2008”

“Search for MLS/Aura Level 2, SO2 Slant Column Density from 2/1/2007”

A Solution

Using a simple process, SESDI allows data from various sources to be registered in an ontology so that it can be easily accessed and understood. Scientists can use only the ontology components that relate to their data. An SESDI query might look like:

“Show all areas in California where sulfur dioxide (SO2) levels were above normal between Jan 2000 and Jan 2007”

This query will pull data from all available sources registered in the ontology and allow seamless data fusion.

Page 12: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Components to implement• An analysis application

• Cross-domain terms, concepts and relations

• Connections to underlying data (registration)

• Framework to put these together

• Integration connector

Page 13: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Detection and attribution relations…

Page 14: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008
Page 15: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008
Page 16: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008
Page 17: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Data Registration Framework

Level 1:

Data Registration at the Discovery Level,

e.g. Volcanolocation and activity

Level 2:

Data Registration at the Inventory Level, e.g. list of datasets by,types, times, products

Level 3:

Data Registration at the Item Detail

Level, e.g. access toindividual quantities

Ontology basedData Integration

Earth Sciences Virtual DatabaseA Data Warehouse where

Schema heterogeneity problem is Solved; schema based integration

Data Discovery Data Integration

A.K.Sinha, Virginia Tech, 2006

Page 18: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

How to find the data?• Think about it the way the data providers do

Page 19: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

SEDRE: Semantically Enabled Data Registration Engine

A. K. Sinha, A. Rezgui, Virginia Tech

•SEDRE: a system that enables scientists to semantically register data sets for optimal querying and semantic integration

•SEDRE enables mapping of heterogeneous data to concepts in domain ontologies

Page 20: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Semantic Registration in SEDRE: An Overview

Registry Server: Warehouse of registration

records

Volcanic Data

SEDRESEDRE

UserData

Server

Semantic Registration

Registration/Discovery Ontologies

Ontology Server

Registry Database

• SEDRE is a desktop application

• Users download and install SEDRE

• SEDRE accesses domain ontologies

• Users map data attributes (e.g., SO2) to concepts in

ontologies without ‘knowing it’

Page 21: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Example 1: Registration of Volcanic Data

SO2 Emission from Kilauea east rift zone -

vehicle-based (Source: HVO)Abreviations: t/d=metric tonne (1000 kg)/day, SD=standard deviation, WS=wind speed, WD=wind direction east of true north, N=number of traverses

Location Codes:• U - Above the 180° turn at Holei Pali (upper Chain of Craters Road)

• L - Below Holei Pali (lower Chain of Craters Road)

• UL - Individual traverses were made both above and below the 180° turn at Holei Pali

• H - Highway 11

Page 22: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Loading Volcanic Data into SEDRE

Page 23: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Registering Volcanic Data (1)

Page 24: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Registering Volcanic Data (2)

• No explicit lat/long data

• Volcano identified by name

• Volcano ontology framework will link name to location

Page 25: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Example 2: Registration of Atmospheric Data

Satellite data for SO2 emissions

Abbreviation: SCD: Slant Column Density (in Dobson Unit (DU))

Page 26: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Loading Atmospheric Data into SEDRE

Page 27: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Registering Atmospheric Data (1)

Page 28: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Registering Atmospheric Data (2)

Page 29: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

SEDRE+DIA: Overview

Query: Show all areas in California where sulfur dioxide (SO2) levels have been above normal between Jan. 1990

and Dec. 1990

User

Registry Server: Warehouse of registration

records

Map Server

Volcanic Data

SEDRESEDRE

UserData

Server

Semantic Registration

DIADIA

Gazetteer

Geochronology

Geodynamics

Ore DepositsGeochemistry

Petrology Mineralogy Structure Geophysics

Stratigraphy

Paleontology

Hydrology TectonicsAreal Geology

DIA (Discovery, Integration, Analysis)

SEDRE (Semantically Enabled Data Registration)

Registration/Discovery Ontologies

Ontology Server

Registry Database

Data Access

Semantic Discovery

DIA: Web-based System for Data Discovery, Integration and Analysis

(Developed at Virginia Tech through NSF funding)

Page 30: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

SESDI Data Registration Summary Summary

• Semantic data frameworks technologies are changing the landscape of providing data to scientists

• Tools for data registration are soon to be available

• Applications to perform data integration mediated by semantics are available

• Initial results - applied to two volcanoes - led to correlation of SO2 concentration from volcano and in the atmosphere and relation to H2SO4.

Page 31: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Volcano Workshop - before

Page 32: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Volcano concept map after the workshop - some linked concepts are circled

Page 33: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Plate tectonics - before workshop

Page 34: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Plate tectonics - after

Page 35: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Atmosphere (portions from SWEET)

Page 36: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Atmosphere II

Page 37: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Volcano Workshop - before

Page 38: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Volcano concept map after the workshop - some linked concepts are circled

Page 39: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008
Page 40: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Packages for an Ontology for Volcanoes

Data Types

Volcanic System

Climate

Phenomenon Material

Instruments

ImportNASA: Semantic Web for Earth Science

Numerics Ontology

ImportNASA: Semantic Web for Earth Science

Units Ontology

ImportNASA: Semantic Web for Earth Science

Physical Property Ontology

ImportNASA: Semantic Web for Earth Science

Physical Phenomena Ontology

Planetary Material

Planetary Structure

Physical Properties

PlanetaryLocation

Geologic Time

GeoImage

PlanetaryPhenomenon

IMPORT EXISTINGONTOLOGIES

SWEET GEON

Page 41: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

DOLCE ROCKS: Integrating Foundational and Geoscience Ontologies

Preliminary results for the integration of concepts from DOLCE, GeoSciML, and SWEET

Boyan BrodaricNatural Resources CanadaFlorian ProbstUniversity of Muenster

From SSKI Spring Symposium Series on Semantic Scientific Knowledge Integration

Page 42: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Outline

Foundational ontologies

DOLCE

DOLCE + GeoSciML

DOLCE + SWEET

DOLCE + GeoSciML + SWEET

Brodaric / Probst – SSKI 2008

Page 43: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Foundational Ontologies

CONTENTS

General concepts and relations that apply in all domainsphysical object, process, event,…, inheres, participates,…

Rigorously definedformal logic, philosophical principles, highly structured

ExamplesDOLCE, BFO, GFO, SUMO, CYC, (Sowa)

Brodaric / Probst – SSKI 2008

Page 44: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Foundational Ontologies

PURPOSE: help integrate domain ontologies

Geophysics ontology

Marine ontology

Water ontology

Planetary ontology

Geology ontology

Struc ontology

Rock ontology

“…and then there was one…”

Foundational ontology

Brodaric / Probst – SSKI 2008

Page 45: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Foundational Ontologies

PURPOSE: help organize domain ontologies

“…a place for everything, and everything in its place…”

Foundational ontology

shale rock formation

lithification

Brodaric / Probst – SSKI 2008

Page 46: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Problem scenario

Little work done on linking foundational ontologies with geoscience ontologies

Such linkage might benefit various scenarios requiring cross-disciplinary knowledge, e.g.:

water budgets: groundwater (geology) and surface water (hydro)

hazards risk: hazard potential (geology, geophysics) and items at threat (infrastructure, people, environment, economic)

health: toxic substances (geochemistry) and people, wildlife

many others…

Brodaric / Probst – SSKI 2008

Page 47: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

focus of this talk

Project

SWEET GeoSciML

DOLCE

Objectivesevaluate fit between DOLCE, GeoSciML, SWEET

evaluate operational benefits: e.g. data discovery, integration,…

Approachextend DOLCE

do not alter GeoSciML, SWEET

use Protégé / OWL and SeReS

Expected Resultsunified ontology, internally consistent

increased ability to discover and integrate data

Brodaric / Probst – SSKI 2008

Page 48: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

DOLCE 2.1 Lite-Plus, OWL 397

PerdurantEndurant

Quality

Abstract

inheres

inheres

Physical Quality Temporal Qualitycolor age

located-in

located-in

GSA Time-scale

Physical Region Temporal Regionbrown

Munsell-space

Ordovician

rock body

Lithification

event

participates

Physical EndurantPhysical Object

ProcessEventState

Brodaric / Probst – SSKI 2008

Page 49: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

GeoSciML 2.0 beta GML-UML schema of basic geologic entities

focus: Geologic Unit, Earth Material

some classes, many relations align relations

Brodaric / Probst – SSKI 2008

Page 50: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

CompoundMaterial part  : EarthMaterial plays: CompositionPart generic-constituent: Particle participant-in: GeologicProcess [0..*] host-of: Fabric[0..*]

has_quality: Lithology [1..*] has_quality: CompositionCategory [0..*] has_quality: PhysicalQuality [0..*] has_quality: MetamorphicQuality [0..*] has_quality: ConsolidationDegree

Rock UnconsolidatedMaterial

DOLCE + GeoSciML (1)

Physical-Body Amount-Of-Matter

generic-constituent

Physical Quality Physical Region

has-quality q_location

Benefitsfull coverage of GeoSciML fragment

Issuesclassification vs quality vs subtype

complex qualities (non-unary)

reference spaces (e.g. units of meas.)

has-q: ParticleType

has-q: FabricType

has-q: ProcessType

has-q: MineralClass

has-q: ChemicalClass

Particletype

Grain

Fabrictype

Foliated

Processtype

Sedimentary

Mineralclass

QAPFregion

Chemicalclass

TASregion

Rock

Lithology

Rocktype

Shale

GeologicUnit

EarthMaterial

CompoundMaterial

GeologicUnitType Formation X

Brodaric / Probst – SSKI 2008

Page 51: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

DOLCE + GeoSciML (2)

Physical-Body Amount-Of-Matter

generic-constituent

Physical Quality Physical Region

has-quality q_location

Social-Object Concept

classifies

part:Particletype

Grain

req: ProcessType

Sedimentary

req: GrainSizeParam

Density

GrainSize

Rock

DensityRegion

GrainSizeRegion

aphanitic

Lithology

Shale

GeologicUnit

EarthMaterial

CompoundMaterial

Formation X

Benefitsfull coverage of GeoSciML fragment

classifications are not qualities

Issueswhat is / isn’t a concept?

subtypes vs concepts

duplicate qualities & params

Brodaric / Probst – SSKI 2008

Page 52: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

SWEET 1.1 beta

OWL ontology of Earth related entities

focus: Substance, Earthrealm, Sunrealm, Phenomena, Process, Biosphere, Property

many classes, few relations align classes

Brodaric / Probst – SSKI 2008

Page 53: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

DOLCE + SWEETDOLCE = SWEET < SWEET

Physical-body BodyofGround, BodyofWater,…

Material-Artifact Infrastructure, Dam, Product,…

Physical-Object LivingThing, MarineAnimal

Amount-of-Matter Substance

Activity HumanActivity

Physical-Phenomenon Phenomena

Process Process

State StateOfMatter

Quality Quantity, Moisture,…

Physical-Region Basalt,…

Temporal-Region Ordovician,…

Benefitsfull coverage

rich relations

home for orphans

single superclasses

Issuesindividuals (e.g. Planet Earth)

roles (contaminant)

features (SeaFloor)

Brodaric / Probst – SSKI 2008

Page 54: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

DOLCE + SWEET + GeoSciMLDOLCE = SWEET < SWEET =GeoSciML <GeoSciML

Physical-body BodyofGround <RockBody GeologicUnit

Material-Artifact Infrastructure, Dam, Product,…

Physical-Object LivingThing, MarineAnimal

Amount-of-Matter SubstanceMixedSubstanceRock

EarthMaterial CompoundMaterialRock

Activity HumanActivity

Physical-Phenomenon PhenomenaSolidEarthPhen. GeologicEvent

Process ProcessGeologicalProcess

State StateOfMatter

Quality QuantityAge eventAge

Physical-Region Basalt,…

Temporal-Region Ordovician

Brodaric / Probst – SSKI 2008

Page 55: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Status

DOLCE + GeoSciML + SWEET: mappings complete*

Protégé / OWL: encoding in progress

Operational evaluation: future work

Brodaric / Probst – SSKI 2008

Page 56: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Conclusions

Surprisingly good fit amongst ontologiesso far: no show-stopper conflicts, a few difficult conflicts

DOLCE richness benefits geoscience ontologiesgood conceptual foundation helps clear some existing problems

Unresolved issues in modeling science entitiesmodeling classifications, interpretations, theories, models,…

Brodaric / Probst – SSKI 2008

Page 57: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Semantic Search• One thing we can think about is building

domain literate tools that “understand” one or more science areas and use that knowledge (empowered by ontologies) to find or otherwise intelligently manipulate data.

• IWSearch is a simple example that uses its knowledge of PML to refine SWOOGLE and return appropriate search terms. This uses an underlying PML ontology

• NOESIS uses background science ontologies to find data

Page 58: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Motivating Example

Related Results

Unrelated Results

Ramachandran and Movva

Page 59: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

The Problem• Search query is broad and not scoped.

• Only one kind of resource is searched (E.g., Web-documents).

• Semantics associated with the search string are not captured.

Noesis is an attempt to solve these issues– Search only what you want (Search Scoping)– Search everything that you want (Resource

aggregationRamachandran and Movva

Page 60: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Noesis Approach• Noesis provides topic-based searches,

retrieving resources conceptually related to the user’s need.

• Semantic information is captured in the domain ontologies and is used for providing context for the search process.

Ramachandran and Movva

Page 61: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Noesis Approach (cont’d) Using domain ontologies, Noesis provides a

guided refinement of search query producing successful searches and reducing the user’s burden to experiment with different search strings.

• Semantics are captured by providing the option to expand the query terms to include suggested related concepts.

Ramachandran and Movva

Page 62: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Noesis AlgorithmNoesis search engine uses a two step algorithm.

– Query Analysis: Search query is broken down to identify concepts that are defined in the domain.

– Query Expansion: Related concepts are added to the search string for Scoping the search.

Ramachandran and Movva

Page 63: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Ontology• From Machine Learning/Artificial

Intelligence/Intelligent Systems perspective “an ontology is a formal, explicit specification of a shared conceptualization” (Gruber, 1993).

• An ontology captures and encodes domain knowledge of concepts, constraints and the relationships among them, for use in a machine-readable fashion.

Ramachandran and Movva

Page 64: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Noesis OntologiesNoesis uses two classes of ontologies.• Domain ontologies: These are the ontologies

used to describe concepts in a domain and their relationships. Noesis uses a set of core ontologies for describing concepts in Atmospheric Science.

• Application ontologies: Application ontologies describe a domain with respect to a function (E.g., A data archive). They enable flexible querying by bridging the gap between semantic concepts and the application specific vocabulary.

Ramachandran and Movva

Page 65: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Providing Semantics - Specializations

• Adding related concepts to the search term as a way to provide semantic information.

• Ontologies are organized in tree like taxonomies, where the child nodes represent the specializations of a parent node.

• The first type of the related terms are specializations (SP). They are used to expand the search terms thus providing more detailed search. For example: – Search Term: “height– Additional term(SP): “free-board”

Ramachandran and Movva

Page 66: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Providing Semantics - Synonyms• Synonyms are different terms that have the same

meaning. In ontological terms they are the equivalent concepts.

• owl:equivalent-class allows linking two syntactically different terms to one semantic concept (synonyms).

• The second form of related concepts are synonyms (SN). These include acronyms. For Example:

• Search Term: ‘Albedo’• Additional Term(SN): ‘Reflectance’

Ramachandran and Movva

Page 67: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Providing Context• In reality, every concept has a set of other

concepts, that are neither in the same inheritance hierarchy nor equivalent, that tend to co-exist.

• These are called the associated concepts and they are captured in the ontology through the property relationships.

• Associated concepts are the third type of related terms (RT) and they provide the context for the search query. For example

• Search Term: ‘Mesocyclone’• Associated Term (RT): ‘Vorticity’

Ramachandran and Movva

Page 68: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Semantics for Search Scoping

A user can start at a general topic and navigate to specific topics by selecting the related concepts of interest.

Related Terms

Specializations

Specializations and related termsfor the search term (“Cyclone”)

presented to the user

Page 69: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Synonyms for the search term (“Reflectance”) presented to the user

Synonyms

Semantics for Search Scoping

Page 70: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

The Search• The identified concepts from the query string

are used to search the Ontology inference service (OIS) to get the related concepts (SP,SN and RT).

• The obtained terms are used to search other resources like Google, Yahoo etcSearch Query:(SP1.. SPn SN1..SPn RT1..RTn).

Page 71: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

The Search (Cont’d)

Noesis Engine(Query Expansion, Search and Refine results)

Noesis GUI

Inference Engine/Reasoner (Pellet)

Ontologies(CF, Core)

Ontology Inference Service (OIS)

Semantic Support

Syntax Based Search Engines

Yahoo Google …

Open Web Search Engines

DataBase Search Engine …

Hidden Web Search

Page 72: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

DataSet Interoperability• Over Years, data formats have evolved from

different communities, thus making them limited to the community and context.

• Semantically homogeneous data is archived as syntactically heterogeneous data sets. For example:

• DataSet1: Albedo• DataSet2: Reflectance

• Semantic Interoperability between data formats is required to make them usable across communities.

Page 73: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Application Ontologies

• Data archives are often catalogued with content meta data to enable searching. But, users still need to search for syntactically matching keywords.

• Application ontologies describe a domain with respect to function (i.e. Data archive).

• Application ontologies enable flexible querying by bridging the gap between semantic concepts and the keywords.

Page 74: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Application Ontologies (Cont’d)

• An application ontology is added for every new catalog (E.g.: CF-Ontology).

• This ontology annotates the terms used in the catalog with the conceptual meanings.

• The concepts in this ontology are linked with Core ontology (Modified SWEET) through the owl:equivalentClass.

Page 75: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Noesis - Data Search

Noesis Engine(Query Expansion, Search and Refine results)

Noesis GUI

Semantic Support

Resource Catalog Search Engine

Data Search Engines

1. Search Request8. Results for display

6. Search with mapped terms

2. Related concept Request

Inference Engine/Reasoner (Pellet)

Ontologies(CF, Core)

Ontology Inference Service (OIS)

4. Related CF Terms Req

3. Related Concepts

5. CF Terms

7. Data Search Results

Page 76: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Resource AggregationNoesis is a meta search engine

– Meta Search Engines simultaneously search multiple Open Web and Hidden Web resources to provide increase search coverage, efficiency and effectiveness.

– Searches for web-pages, data, education material and Publications.

Page 77: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Resource Aggregation (Contd.)

Noesis uses the scoped search string to fetch resources, through search web services provided by third parties:

• Web: Yahoo, Google, Ask.com• Data: NCDC, NCAR, NOAA, NASA GCMD, LEAD, IPCC Model

Ouput, LDEO• Publications: AMS, Elsevier, Springer, RMS, Blackwell, AGU• Education: DLESE

Page 78: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Resource Aggregation (Contd.)

Collates resources such as web pages, data sets, publications etc.

Page 79: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Example – Step 1 (User Input)

User types a search term (‘Cyclone’)

Page 80: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Example – Step 2 (Query Refinement)

Related Terms

Specializations

User selects the related terms to add to his search query

Page 81: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Example – Web Results (Step 3)Refined query

Querying Web Search Engines

Relevant Results

Page 82: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Example – Publications (Step 3)

Refined query used for publications search

Page 83: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Example – Data Search (Step 3)

Mapped query used for data search

Page 84: 1 Foundations VI: Foundations VI: Discovery, Access and Semantic Integration Deborah McGuinness and Peter Fox CSCI-6962-01 Week 11, November 10, 2008

Next week• This weeks assignment:

– No new reading:– Prepare your class presentations– No physical office hours this week . Office hours

are virtual (through email) this week. Peter and I are both traveling but will be connected this week.

• Next class (week 12 – November 17): – Class presentations

• Questions?84