1 Ontology Enabled Data Discovery and Integration Kai Lin San Diego Supercomputer Center University of California, San Diego A. K. Sinha, Z. Malik, A. Rezgui, A. Dalton Virginia Tech


Ontology Enabled Data Discovery and Integration

Kai Lin

San Diego Supercomputer Center

University of California, San Diego

A. K. Sinha, Z. Malik, A. Rezgui, A. Dalton

Virginia Tech


Motivations

• A better way to discover and understand datasets

Use the knowledge in ontologies to find datasets

• A better way to query datasets

Query through ontologies without knowing the schemas

• A better way to integrate multiple datasets

Integrate multiple datasets on-the-fly if they are mapped to ontologies


What Is an Ontology?

A formal, explicit specification of a shared conceptualization

• formal: machine-readability
• explicit: unambiguous definition of all concepts, attributes and relationships
• shared: commonly accepted understanding
• conceptualization: conceptual model of a domain


Why Represent Domain Knowledge as Ontology

• Separate domain knowledge module from the operational module

• Configurable knowledge module

• Share and reuse domain knowledge

• Analyze domain knowledge


What’s Inside An Ontology?

• Concepts: classes + class hierarchy
– instances

• Properties: often also called “Roles” or “Slots”
– labeled instance-value pairs

• Axioms/Relations:
– relations between classes (disjoint, covers)
– inheritance (multiple? defaults?)
– restrictions on slots (type, cardinality)
– characteristics of slots (symmetric, transitive, …)

• Reasoning tasks:
– Classification: Which classes does an instance belong to?
– Subsumption: Does a class subsume another one?
– Consistency checking: Is there a contradiction in my axioms/instances?


Resource Description Framework (RDF)

XML Schema is not enough for semantics
• only describes grammar, i.e. the syntax of single documents
• cannot express inheritance for concepts
• no means to express complex integrity constraints in an unambiguous way

Resource Description Framework (RDF): an infrastructure for the encoding, exchange and reuse of structured metadata

Statement to encode: The author of 'page.html' is Peter Morris

Two possible XML encodings of the same statement:

<document href="page.html">
  <author>Peter Morris</author>
</document>

<author>
  <firstName>Peter</firstName>
  <lastName>Morris</lastName>
  <documents>
    <uri>page.html</uri>
  </documents>
</author>

What is the “correct” way of expressing it?


RDF Idea

RDF is intended to provide a simple way of making statements about resources

Resources: objects that are uniquely identified by a URI (Uniform Resource Identifier)
• Anything can have a URI:
– an entire Web page
– a whole collection of pages, e.g. an entire Website
– an object that is not directly accessible via the Web, such as a printed book

Property: a specific aspect, characteristic, attribute, or relation used to describe a resource; it has a specific meaning and defines its permitted values
• Lives-In, CarColor, WorkFor, HasA, IncludedIn, hasAuthor, …

Statement: a specific resource together with a named property plus the value of that property for that resource. Each RDF statement can be written down as a triple (Subject, Property, Object) or as a graph

[Figure: a Resource node linked by a property edge to a Value or to another Resource]
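The triple form of a statement can be sketched in a few lines of plain Python. This is only an illustration with tuples, not an RDF library; the class name `Triple` and the helper `objects_of` are made up here, and `hasAuthor` is borrowed from the list of example properties.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: (subject, property, object)."""
    subject: str
    prop: str
    obj: str


# "The author of page.html is Peter Morris" written as a single triple.
statements = [
    Triple("page.html", "hasAuthor", "Peter Morris"),
]


def objects_of(triples, subject, prop):
    """Return every object value recorded for the given subject and property."""
    return [t.obj for t in triples if t.subject == subject and t.prop == prop]


print(objects_of(statements, "page.html", "hasAuthor"))
```

Because every statement has the same three-part shape, there is only one way to record the fact, which is the point of the "correct way" question raised for plain XML.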


An RDF Example

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.polleres.net/page.html">
    <dc:creator>
      <rdf:Description rdf:about="http://www.polleres.net/peter">
        <hasName>Peter Morris</hasName>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>

[Figure: RDF graph. Nodes: http://www.polleres.net/page.html, http://www.polleres.net/peter, "Peter Morris", "April 1, 2004", "English". Edge labels: http://purl.org/dc/elements/1.1/creator, hasName, creationDate, http://purl.org/dc/elements/1.1/language.]


A General RDF Format

<?xml version="1.0"?>
<Resource-A>
  <property-A>
    <Resource-B>
      <property-B>
        <Resource-C>
          <property-C>Value-C</property-C>
        </Resource-C>
      </property-B>
    </Resource-B>
  </property-A>
</Resource-A>

Resource-B is the value of property-A, and Resource-C is the value of property-B.

Convention:
• A capital letter starts a type (class) name
• A lowercase letter starts a property name


RDF Schema (RDFS)

Core classes
• rdfs:Resource
• rdfs:Literal
• rdf:XMLLiteral
• rdfs:Class
• rdf:Property
• rdfs:Datatype
• rdfs:Container

Core properties
• rdf:type
• rdfs:subClassOf
• rdfs:subPropertyOf
• rdfs:domain
• rdfs:range
• rdfs:label
• rdfs:comment

RDFS is a simple ontology language
• RDF: triples for making assertions about resources
• RDFS extends RDF with “schema vocabulary”, e.g.:
– Class, Property
– type, subClassOf, subPropertyOf
– range, domain
• Suited to representing simple assertions, taxonomy + typing


RDFS Example

[Figure: RDFS graph with a legend distinguishing Resource, Class and Property nodes. Classes: Vehicle, LandVehicle, SeaVehicle, HoverVehicle, Company, Number. Properties: producedBy, numberOfEngine. Edge labels: subClassOf, type.]


Limitations of RDFS

• RDFS is too weak to describe resources in sufficient detail:
– No localised range and domain constraints
  • Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants
– No existence/cardinality constraints
  • Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents
– No transitive, inverse or symmetrical properties
  • Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical
– No in/equality
  • Can’t say that a class/instance is the same as some other class/instance, or that some classes/instances are definitely disjoint/different
– No boolean algebra
  • Can’t say that one class is the union, intersection or complement of other classes, etc.


OWL Language - Overview

• Three species of OWL
– OWL DL stays in the Description Logic fragment
– OWL Lite is an “easier to implement” subset of OWL DL
– OWL Full is the union of OWL syntax and RDF

• OWL DL is based on Description Logic
– In fact it is equivalent to the SHOIN(Dn) DL

• OWL DL benefits from many years of DL research
– Well defined semantics
– Formal properties well understood (complexity, decidability)
– Known reasoning algorithms
– Implemented systems (highly optimised)

• OWL Full has all that, plus all the possibilities of RDF/RDFS, which destroy decidability

[Figure: nested layers, Lite inside DL inside Full]


OWL Layers (Lite, DL, Full)

[Figure: nested layers Lite, DL and Full, built on RDF Schema]

• OWL Full
– Allows meta-classes etc.

• OWL DL
– Negation (disjointWith, complementOf)
– unionOf
– Full cardinality
– Enumerated types (oneOf)

• OWL Lite
– (sub)classes, individuals
– (sub)properties, domain, range
– intersection
– (in)equality
– cardinality 0/1
– datatypes
– inverse, transitive, symmetric
– hasValue
– someValuesFrom
– allValuesFrom


Ontology Inconsistency

• You may define classes whose definitions no individual can satisfy. Reasoning engines can find such definitions even in large ontologies.

– Cow ≡ Animal ⊓ Vegetarian
– Sheep ⊑ Animal
– Vegetarian ≡ ∀eats.¬Animal
– MadCow ≡ Cow ⊓ ∃eats.Sheep
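The reasoning behind this example can be written out step by step. The derivation below assumes the axioms are the familiar "mad cow" example with the quantifiers restored, that is, Vegetarian ≡ ∀eats.¬Animal and MadCow ≡ Cow ⊓ ∃eats.Sheep; that reading is an assumption about what the slide showed.

```latex
\begin{align*}
\mathit{MadCow} &\sqsubseteq \mathit{Cow} \sqcap \exists\mathit{eats}.\mathit{Sheep}
  && \text{(definition of MadCow)} \\
\mathit{Cow} &\sqsubseteq \mathit{Vegetarian} \sqsubseteq \forall\mathit{eats}.\lnot\mathit{Animal}
  && \text{(definitions of Cow and Vegetarian)} \\
\exists\mathit{eats}.\mathit{Sheep} &\sqsubseteq \exists\mathit{eats}.\mathit{Animal}
  && \text{(since } \mathit{Sheep} \sqsubseteq \mathit{Animal}\text{)} \\
\mathit{MadCow} &\sqsubseteq \forall\mathit{eats}.\lnot\mathit{Animal} \sqcap \exists\mathit{eats}.\mathit{Animal} \\
\mathit{MadCow} &\sqsubseteq \bot
\end{align*}
```

A mad cow would have to eat at least one animal while eating only non-animals, so the class can have no instances.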


Open/Closed World Assumption

Closed World Assumption
– The facts in the ontology completely describe what is known; anything that is not in the ontology is assumed to be false.

Open World Assumption (used in OWL)
– There may be things that the ontology does not describe; anything that is not in the ontology is unknown.

Example: an ontology says
– There is a train at 14:00
– There is a train at 15:00

Is there a train at 17:00?
– “no” under the Closed World Assumption
– “unknown” under the Open World Assumption
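The train example can be sketched in Python to show how the two assumptions answer the same question differently. The function names are illustrative, and `None` is used here to stand for "unknown".

```python
# Facts stated by the ontology in the train example.
known_trains = {"14:00", "15:00"}


def train_closed_world(time):
    """Closed world: anything that is not stated is taken to be false."""
    return time in known_trains


def train_open_world(time):
    """Open world: anything that is not stated is unknown (None), not false."""
    return True if time in known_trains else None


print(train_closed_world("17:00"))  # False
print(train_open_world("17:00"))    # None
```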


Resource Discovery in GEON

• A Resource Registration System for Data Providers
– Register ontologies (domain knowledge)
– Register datasets with metadata including data access information
– Optionally register datasets to ontologies (which is crucial for data integration and smart search)

• A Search Engine for Data Users
– Metadata based search
– Spatial coverage based search
– Temporal coverage based search
– Concept based search

• Both are available through a public portal on the web


GEON Data Registration System

[Figure: GEON Data Registration System. Labels: Resource Registration System; SRB; Catalog; Metadata (ADN); Excel, GeoTIFF, Shapefile; General Information; Ontology Annotations; Access Control; Subjects, Format, Keywords, Spatial coverages, Temporal coverages, …; Integrated Resources; Log; Resource Metadata; Resource Schemas; GEON Search.]


Database Registration

[Figure: an Original Database (tables and views) and a Published Database (table and view definitions), with the caption "select tables and views to register"; an Application connects through the GEON JDBC Driver and the GEON Mediator.]


Write Protection

• The Mediator only accepts SELECT statements
• It rejects any request other than SELECT

[Figure: an UPDATE B request sent to the Mediator, which sits in front of the Database; table labels A, B, C.]
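A SELECT-only rule of this kind can be sketched as a simple check on the incoming statement. This is not the GEON Mediator's code: the function name `is_read_only` is hypothetical, and the check is deliberately conservative (first keyword only, and any statement containing an inner semicolon is rejected); a real mediator would presumably use a full SQL parser.

```python
def is_read_only(sql: str) -> bool:
    """Accept a single statement whose first keyword is SELECT."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        # More than one statement (or a semicolon in a literal): reject.
        return False
    words = statement.split(None, 1)
    return bool(words) and words[0].upper() == "SELECT"


print(is_read_only("SELECT * FROM A"))     # True
print(is_read_only("UPDATE B SET x = 1"))  # False
```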


Read Protection on Unregistered Tables and Views

An unregistered table or view is invisible to an end user
• The data in the table can’t be viewed with a SELECT statement
• The schema can’t be fetched

[Figure: a SELECT * FROM A request sent to the Mediator, which sits in front of the Database; table labels A, B, C.]


Item Level Ontological Data Registration for Discovery

The search engine uses ontologies to find more results. For example, the fact that Polygon is a subclass of GeometricalObject_2D is used in searching.

[Figure: ontology with classes GeometricalObject_2D, Polygon, Rectangle, Circle and Surface; dataset properties link a dataset to ontology classes through "mentions", "uses" and "has instances".]

Search for GeometricalObject_2D → return datasets associated with Polygon
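Concept based search with subclass expansion can be sketched as follows. Only the Polygon edge is stated in the text; the other subclass edges, the dataset names and their annotations are invented for illustration, and the actual GEON search engine may work differently.

```python
# Subclass edges (child -> parent). Only Polygon -> GeometricalObject_2D is
# stated in the text; the other two edges are assumptions.
SUBCLASS_OF = {
    "Polygon": "GeometricalObject_2D",
    "Circle": "GeometricalObject_2D",
    "Rectangle": "Polygon",
}

# Hypothetical registrations: dataset name -> ontology classes it is registered to.
DATASET_CONCEPTS = {
    "dataset_a": {"Polygon"},
    "dataset_b": {"Circle"},
    "dataset_c": {"RockSample"},
}


def descendants(concept):
    """Return the concept plus every class below it in the hierarchy."""
    found = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def search(concept):
    """Return datasets registered to the concept or to any of its subclasses."""
    wanted = descendants(concept)
    return sorted(name for name, concepts in DATASET_CONCEPTS.items()
                  if concepts & wanted)


print(search("GeometricalObject_2D"))
```

A search for the general class also returns the dataset registered only to Polygon, which a plain keyword match on "GeometricalObject_2D" would miss.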


Data Integration Challenges: Heterogeneities

• Syntactical Heterogeneity

heterogeneous data format

e.g. 02-04-2004 vs. 02/04/04

• Structural Heterogeneity

heterogeneous data models and schemas

e.g. 02-04-2004 is saved as three columns or as one column

• Semantic Heterogeneity

fuzzy metadata, terminology, “hidden” semantics, implicit assumptions

GEON Preferred Solution:
• Datasets are semantically registered first
• Heterogeneities are resolved by registration


Database Integration

Integration at three levels

Level 1: Federation Based Integration
• Users must be familiar with each database

Level 2: View Based Integration
• Intended for users who want to perform integration for others or make integration results reusable

Level 3: Ontology Based Integration
• The easiest way for end users


Level 1: Federation Based Integration

[Figure: backend databases with tables A–G sit behind the Mediator, which exposes the same tables A–G as one federated database.]

SELECT * FROM A, E WHERE ……

• Use SQL to query the federated database
• Structural and semantic heterogeneities must be resolved by the users themselves


Level 2: View Based Integration

[Figure: backend databases with tables A–G sit behind the Mediator, which exposes views V and W defined on top of them.]

SELECT * FROM V, W WHERE ……

• Allows defining views on top of the federated databases
• Allows hiding the original backend schemas
• Integration results can be shared and reused


Level 3: Ontology Based Integration

• Requires ontology annotations for the backend databases
• Uses a simple ontology query language to query the integrated database
• Users don’t need to know the backend schemas and local semantics

[Figure: backend databases with tables A–G sit behind the Mediator, which answers an ontology based query.]


Ontology Enabled Data Integration

• Ontology Enabled Semantic Integration

Challenges for Computer Scientists and Domain Scientists

– Computer Scientists: build an integration system based on the ontological registration of datasets

– Domain Scientists: create domain ontologies

– Data Providers: register datasets to ontologies

[Figure: datasets dataset1 to dataset4 registered to ontologies Ontology1 to Ontology3.]


Ontological Data Registration for Data Integration

• Registering a dataset to an ontology for data integration is a procedure that generates a partial model of the ontology from the dataset itself

[Figure: registration maps a dataset to individuals of the ontology.]

Note: not all the constraints in the ontology are satisfied by the generated individuals


Registering Relational Tables to Ontology Classes

• Associate one or more columns, under an optional SQL condition, with a selected class in the ontology

• Provide a mapping method if no explicit names of individuals should be generated

Example table with columns Latitude and Longitude:

…… | Latitude | …… | Longitude | ……
…… | 23.5 | …… | 47.9 | ……

Location(23.5, 47.9) is the name of an individual of the class Location. The same name indicates the same location.

Example table with columns RockSample, GeologicAge, ……:

GeologicAge values: Jurassic/Triassic, Precambrian, …………

[Figure: the GeologicAge values are mapped to the ontology class GeologicalAge, which includes Precambrian, Cenozoic and Paleozoic.]
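Generating individual names from columns can be sketched as below. The `Location(23.5, 47.9)` naming pattern comes from the slide; the function name `location_name`, the row layout and the second row's values are assumptions made for illustration.

```python
def location_name(latitude, longitude):
    """Name a Location individual after its latitude/longitude pair."""
    return f"Location({latitude}, {longitude})"


# Hypothetical rows of a registered table with Latitude and Longitude columns.
rows = [
    {"Latitude": 23.5, "Longitude": 47.9},
    {"Latitude": 23.5, "Longitude": 47.9},   # same pair, so same individual
    {"Latitude": 10.0, "Longitude": 20.0},
]

# A set keeps one entry per name: the same name indicates the same location.
individuals = {location_name(r["Latitude"], r["Longitude"]) for r in rows}

print(sorted(individuals))
```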


Registering Tables to Ontology Object Properties

• Associate two entities that are already registered to the domain class and the range class of a selected object property in the ontology

Table columns: …… | RockSampleID | …… | PERIOD | ……

Object property hasAge: Rock → GeologicAge


ODAL (Ontological Database Annotation Language)

<odal:NamedIndividuals odal:id="RockSample" odal:database="VTDatabase">
  <odal:Class odal:resource="http://geon.vt.edu#RockSample"/>
  <odal:Table>Samples</odal:Table>
  <odal:Table>RockTexture</odal:Table>
  <odal:Table>RockGeoChemistry</odal:Table>
  <odal:Table>ModalData</odal:Table>
  <odal:Table>MineralChemistry</odal:Table>
  <odal:Table>Images</odal:Table>
  <odal:Column>ssID</odal:Column>
</odal:NamedIndividuals>

The values in the column ssID of the tables Samples, RockTexture, RockGeoChemistry, ModalData, MineralChemistry and Images represent instances of RockSample.

A GUI generates ODAL, which is passed to the ODAL processor.

• Creates a partial model of ontologies from a database
• Independent of any GUI
• Independent of any concrete implementation
• Reusable


ODAL: Import Ontologies

The ontologies used for annotating a database can be imported as follows:

<?xml version="1.0"?>
<odal:ODAL xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:owl="http://www.w3.org/2002/07/owl#"
           xmlns:odal="http://www.sdsc.edu/odal#">
  <odal:Ontology>
    <odal:Imports rdf:resource="http://www.library.org/Book.owl"/>
    <odal:Imports rdf:resource="http://www.writer.org/Writer.owl"/>
  </odal:Ontology>
  ……
</odal:ODAL>


ODAL: Database Connection Declaration

The target databases to be annotated are declared as follows:

<?xml version="1.0"?>
<odal:ODAL xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:owl="http://www.w3.org/2002/07/owl#"
           xmlns:odal="http://www.sdsc.edu/odal#">
  ……
  <odal:Database odal:id="PublicationDatabase">
    <odal:DatabaseProductName>Oracle</odal:DatabaseProductName>
    <odal:DatabaseProductVersion>9.1.21</odal:DatabaseProductVersion>
    <odal:Host>oracle.sdsc.edu</odal:Host>
    <odal:Port>3456</odal:Port>
    <odal:DatabaseName>Publications</odal:DatabaseName>
  </odal:Database>
  ……
</odal:ODAL>


ODAL: Simple Named Individuals

<odal:NamedIndividuals odal:id="BookInTableBookPrice" odal:database="PublicationDatabase">
  <odal:Class odal:resource="http://www.amazon.com/Book.owl#Book"/>
  <odal:Schema>Collections</odal:Schema>
  <odal:Table>book-price</odal:Table>
  <odal:Column>ISBN</odal:Column>
</odal:NamedIndividuals>

Suppose the book ontology contains a class Book and the schema Collections contains a table book-price with a column ISBN.

odal:id gives a name to the declaration and represents the set of individuals generated by the statement.

The statement says that each value in the column ISBN represents a Book individual.


ODAL: Named Individuals from Multiple Columns

<odal:NamedIndividuals odal:id="LocationInTableRockSample">
  <odal:Class odal:resource="http://www.usgs.org/Space.owl#Location"/>
  <odal:Schema>California</odal:Schema>
  <odal:Table>Rock-Sample</odal:Table>
  <odal:Column>Latitude</odal:Column>
  <odal:Column>Longitude</odal:Column>
</odal:NamedIndividuals>

Suppose an ontology contains a class Location and a database contains a table Rock-Sample with two columns, Latitude and Longitude.

The statement says that each pair of latitude and longitude values gives a location.


ODAL: Named Individuals with Conditions

<odal:NamedIndividuals odal:id="MaleEmployeeInTableEmployee">
  <odal:Class odal:resource="http://www.abc.com/Employee.owl#MaleEmployee"/>
  <odal:Table>employee</odal:Table>
  <odal:Column>EmployeeId</odal:Column>
  <odal:Condition><![CDATA[ Gender='M' ]]></odal:Condition>
</odal:NamedIndividuals>

<odal:NamedIndividuals odal:id="FemaleEmployeeInTableEmployee">
  <odal:Class odal:resource="http://www.abc.com/Employee.owl#FemaleEmployee"/>
  <odal:Table>employee</odal:Table>
  <odal:Column>EmployeeId</odal:Column>
  <odal:Condition><![CDATA[ Gender='F' ]]></odal:Condition>
</odal:NamedIndividuals>

A condition in an odal:Condition element should be a boolean expression that is valid in the WHERE clause of a SQL query.
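One way to see why the condition must be a valid WHERE expression is to sketch how an annotation might be turned into a SQL query. The slides do not describe the ODAL processor's actual translation; the function `individuals_query` and the shape of the generated SQL are assumptions, and the namespace URI is the one shown in the ODAL examples.

```python
import xml.etree.ElementTree as ET

ODAL_NS = "http://www.sdsc.edu/odal#"  # namespace URI used in the ODAL examples

ANNOTATION = f"""
<odal:NamedIndividuals xmlns:odal="{ODAL_NS}" odal:id="MaleEmployeeInTableEmployee">
  <odal:Class odal:resource="http://www.abc.com/Employee.owl#MaleEmployee"/>
  <odal:Table>employee</odal:Table>
  <odal:Column>EmployeeId</odal:Column>
  <odal:Condition><![CDATA[ Gender='M' ]]></odal:Condition>
</odal:NamedIndividuals>
"""


def odal_tag(name):
    """Return the fully qualified ElementTree tag for an ODAL element name."""
    return f"{{{ODAL_NS}}}{name}"


def individuals_query(annotation_xml):
    """Build a SQL query that lists the individuals an annotation declares."""
    root = ET.fromstring(annotation_xml.strip())
    table = root.findtext(odal_tag("Table"))
    columns = [col.text for col in root.findall(odal_tag("Column"))]
    condition = root.findtext(odal_tag("Condition"))
    sql = f"SELECT DISTINCT {', '.join(columns)} FROM {table}"
    if condition and condition.strip():
        # The condition is pasted into the WHERE clause unchanged.
        sql += f" WHERE {condition.strip()}"
    return sql


print(individuals_query(ANNOTATION))
```

Because the condition text is inserted verbatim, anything that is not a legal boolean SQL expression would produce an invalid query.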


ODAL: Data Type Property Declaration

<odal:NamedIndividuals odal:id="PersonInTablePerson">
  <odal:Class odal:resource="http://www.foo.org/Person.owl#Person"/>
  <odal:Table>Person</odal:Table>
  <odal:Column>ssn</odal:Column>
</odal:NamedIndividuals>

<odal:OntologyProperty>
  <odal:DatatypeProperty odal:resource="http://www.foo.org/Person.owl#hasAge"/>
  <odal:Table>person</odal:Table>
  <odal:Domain odal:resource="PersonInTablePerson"/>
  <odal:Range odal:resource="age"/>
</odal:OntologyProperty>

[Figure: a table row (…, age = 8, …, SSN = 1234-56-7890, …) alongside the ontology fragment Person, hasAge, double.]


Conditions for Joining from Different Resources

• Usually we don’t join individuals across different resources

Example: one resource has a table Rock with a column RockSampleID containing 10001, and another resource has a column RockID containing 10001. We don’t know whether 10001 represents the same rock in the two resources. By default, we assume it does not.

• A set of datatype properties can be declared as a key for a class in the ontology. Joins across multiple resources are based on keys.

e.g. {hasLatitude, hasLongitude} can be declared as a key of Location

Two locations from different resources are the same if they have the same latitude and longitude
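A key based join across two resources can be sketched as follows. The key {hasLatitude, hasLongitude} comes from the slide; the function `join_on_key`, the record layout and the sample values are hypothetical.

```python
def join_on_key(left, right, key):
    """Pair individuals from two resources whose key properties all match."""
    index = {}
    for item in right:
        index.setdefault(tuple(item[k] for k in key), []).append(item)
    return [(l, r)
            for l in left
            for r in index.get(tuple(l[k] for k in key), [])]


# Declared key of the class Location.
LOCATION_KEY = ("hasLatitude", "hasLongitude")

# Hypothetical Location individuals from two different resources.
resource_1 = [{"id": "r1:10001", "hasLatitude": 23.5, "hasLongitude": 47.9}]
resource_2 = [
    {"id": "r2:10001", "hasLatitude": 23.5, "hasLongitude": 47.9},
    {"id": "r2:10002", "hasLatitude": 1.0, "hasLongitude": 2.0},
]

matches = join_on_key(resource_1, resource_2, LOCATION_KEY)
print([(l["id"], r["id"]) for l, r in matches])
```

The local identifiers are never compared: two individuals are matched only because their declared key values agree.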


SOQL (Simple Ontology Query Language)

Query single or integrated resources
• via ontologies (i.e., high-level logical views)
• independent of any physical representation (i.e. schemas)

[Figure: ontology fragment. RockSample has the properties location (range Location) and hasSiO2 (range ValueWithUnit); Location has lat and long (float); ValueWithUnit has value (float) and unit (string).]

SELECT X.location.*
FROM RockSample X
WHERE X.location.lat > 60
  AND X.location.long > 100
  AND X.hasSiO2.value < 30
  AND X.hasSiO2.unit = 'weightPercentage'

A GUI generates SOQL, which is passed to the SOQL processor.
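The dotted path expressions in the query follow ontology properties from one individual to the next. As a rough in-memory analogy (not how the SOQL processor works, which rewrites queries to SQL), the same selection can be sketched over nested Python dictionaries; the helper `follow` and the sample individuals are invented for illustration.

```python
def follow(individual, path):
    """Follow a dotted property path such as 'location.lat'."""
    value = individual
    for prop in path.split("."):
        value = value[prop]
    return value


# Hypothetical RockSample individuals shaped like the ontology fragment.
rock_samples = [
    {"location": {"lat": 65.0, "long": 120.0},
     "hasSiO2": {"value": 25.0, "unit": "weightPercentage"}},
    {"location": {"lat": 40.0, "long": 110.0},
     "hasSiO2": {"value": 25.0, "unit": "weightPercentage"}},
]

# Same conditions as the SOQL query, evaluated property path by property path.
selected = [
    follow(x, "location")
    for x in rock_samples
    if follow(x, "location.lat") > 60
    and follow(x, "location.long") > 100
    and follow(x, "hasSiO2.value") < 30
    and follow(x, "hasSiO2.unit") == "weightPercentage"
]

print(selected)
```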


The Architecture of GEON Semantic Mediator

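[Figure: architecture of the GEON Semantic Mediator. A Portal or Application (GUI) connects through the Mediator JDBC Driver. SOQL components: SOQL Parser, Semantic Query Rewriter, Ontology Reasoner, SOQL Processor, with OWL and ODAL inputs and an ODAL Processor. The SOQL side produces spatial SQL against federated schemas. SQL components: SQL Parser, Query Planning, Query Optimization, Query Execution, Internal Database. Backends: Oracle, DB2, MySQL, SQL Server, PostgreSQL, PostGIS.]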


Question: find all seismic stations within 1 mile of railroads

SOQL query (GEON SOQL GUI):

SELECT X.code, X.location.*
FROM SeismicStation X, Railroad Y
WHERE distance(X.location, Y.geometry) < 1

SQL produced by the SOQL Processor:

SELECT X2.stationcode, X2.lat, X2.lon
FROM railroads_of_the_united_states X1, stationdatatable X2
WHERE distance(X1.the_geom, MakePoint(X2.lat, X2.lon)) < 1

Queries handled by the Schema Mediator:

– Railroad shapefile: SELECT X1.the_geom FROM railroads X1
– Seismic Stations: SELECT X2.stationcode, X2.lat, X2.lon FROM stationdatatable X2 WHERE bounding box condition
– Join condition: distance(X1.the_geom, MakePoint(X2.lat, X2.lon)) < 1
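The role of the bounding box condition can be sketched as a cheap prefilter applied before the exact distance test. Everything here is a simplification made for illustration: coordinates are treated as planar, the railroad is reduced to a list of vertices, distance is measured to the nearest vertex only, and the function names are hypothetical.

```python
import math


def expanded_bbox(points, margin):
    """Bounding box of the points, grown by margin on every side."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)


def in_bbox(point, bbox):
    """True if the point lies inside the bounding box (edges included)."""
    x, y = point
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def stations_near(stations, rail_points, limit):
    """Return codes of stations closer than limit to some railroad vertex."""
    bbox = expanded_bbox(rail_points, limit)
    # Cheap prefilter: the kind of condition pushed down to the station source.
    candidates = {code: p for code, p in stations.items() if in_bbox(p, bbox)}
    # Exact test on the few remaining candidates.
    return sorted(code for code, p in candidates.items()
                  if any(math.dist(p, r) < limit for r in rail_points))


rail = [(0.0, 0.0), (10.0, 0.0)]
stations = {"A": (0.5, 0.5), "B": (5.0, 0.5), "C": (50.0, 50.0)}
print(stations_near(stations, rail, 1.0))
```

Station C is discarded by the bounding box alone, so the more expensive distance computation runs only on A and B.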


Questions?


How to Connect to GEON Databases

• Download the GEON JDBC Driver
• Use the following code to create a connection

import java.sql.Connection;
import java.sql.DriverManager;

// load the driver
Class.forName("org.geongrid.jdbc.driver.Driver");

// set the mediator URL
String url = "jdbc:geon://geon01.sdsc.edu:2532/GEON-63cb404c-6038-11d9-a69f";

// open the connection
Connection conn = DriverManager.getConnection(url, "geonuser", "geongrid");

URL parts: jdbc:geon is the GEON JDBC protocol; geon01.sdsc.edu:2532 is the host name and port number of the GEON Mediator; GEON-63cb404c-6038-11d9-a69f is the GEON ID.

Note: the original account information is invisible to end users