
Semantic Technology in Oracle Database


Data Interoperability Challenges

• Data locked into schemas, formats, software systems
• Semantic technology seen as a possible solution
• Specialty RDF data management engines are isolated from the data to be integrated

• In addition, there are high training, system administration, and management costs

• Tightly coupling semantics (RDF/OWL) functionality to the data storage infrastructure will facilitate data integration using semantics

[Diagram labels: Semantic Apps, RDF Data Server, RDF/OWL Triples; Business Apps, Enterprise Data Server, Business Data]


Adding advanced RDF services to Oracle Database

• Database features and queries can be enhanced using semantics

• Hybrid queries between enterprise data and semantic data possible

• Databases are part of infrastructure in several categories of applications that use semantics for data integration

• Biosurveillance, Social Networks, Telcos, Utilities, Text, Life Sciences, GeoSpatial

• All database benefits become available for semantic applications

• Scalability: Manage datasets 10X larger than specialized RDF/OWL stores (billions of triples), no scalability boundaries

• Billions of nodes, large graphs, parallel loading, query, indexing
• Security, transaction control, availability, backup and recovery, lifecycle management, etc.
• Can combine multiple datatypes (geospatial, sensor, etc.) with semantic data


Oracle 10g RDF Approach

• Provide an open and persisted RDF data model and analysis platform for semantic applications

• RDF Data Model with inferencing (RDFS and user-defined rules)

• Inferencing based on forward-chaining

• Perform SQL-based access to triples and inferred data
• Combine SQL queries of business data with RDF graphs and ontologies
• Support large graphs (billion+ triples)
• Easily extensible by 3rd party tools/apps


Use Case: Knowledge Mining Solutions

[Diagram: knowledge mining pipeline]
• Sources: Web Resources; News, Email, RSS; Content Mgmt. Systems
• Information Extraction: Categorization, Feature/term Extraction
• Processed Document Collection; RDF/OWL
• Knowledge Mining & Analysis: Text Indexing using Oracle Text; Non-Obvious Relationship Discovery; Pattern Discovery; Text Mining; Faceted Search
• Analyst: Browsing, Presentation, Reporting, Visualization, Query
• SQL/SPARQL Query; Explore
• Domain Specific Knowledge Base; OWL Ontologies; Ontology Engineering Modeling Process


Geospatial Semantic Search

Schemas:
• Persisted RDF/OWL data
• Persisted spatial data
• Persisted business data
• Persisted text data

GeoSemantic Processes:
• Text Extraction
• Semantic Modeling
• Rules/Policy Mgmt.
• Geospatial Analysis
• Map Visualization
• Semantic Search

[Diagram labels: Oracle 10g RDBMS containing RDF Models, Spatial Data, Business Data, Text Data]


Semantic Solutions on the Web: Deploying on a SOA Infrastructure

• Simple Features
• GeoRaster
• Topology
• Networks
• Spatial Data Mining
• Geocoding
• Routing
• Versioning
• DBMS Rules

• J2EE Container
• SOAP Web services
• Orchestration & Workflow
• Security
• Policy-based resource mgmt
• Workload scaling
• Portal
• Wireless & Sensor

Core Software Infrastructure

Semantic-Enabled Tools, Applications & Services

• Business Logic
• Entity Extraction
• Visualization
• Ontology Modeling
• Faceted Search
• Link/Graph Analysis
• Advanced Inference
• Metadata Repository
• Entity Categorization
• Relationship Analysis

[Diagram labels, application domains: National Security; Financial Risk Analysis; Regulatory Compliance; Life Sciences Drug Discovery; Health Science BioSurveillance; Manufacturing Configuration Management]


Semantic Technology Stack

Standards based


Based on Standards

• Our implementation is entirely based on W3C standards (RDF, RDFS, OWL)
• SPARQL support is planned
• We are members of:
  • W3C DAWG (the working group responsible for SPARQL)
  • W3C SWEO Interest Group
  • W3C HCLS Interest Group
  • W3C Multimedia Semantics Incubator Group
  • Soon-to-be-formed W3C OWL 1.1 Working Group


Technical Features

• Database storage model for data represented in RDF
• SQL-based query of RDF data
• Combining RDF queries with relational queries
• Native inferencing engine to infer new relationships from RDF data


Technical Overview

[Diagram: RDF/OWL data and ontologies alongside Enterprise (Relational) data, with three groups of operations]
• STORE: Batch-Load; Incr. Load and DML
• INFER: RDF/S; User-def. rules
• QUERY: Query RDF/OWL data and ontologies; Combining relational queries with RDF/OWL queries


Storage: Highlights

• Stores <subject, predicate, object> triples
• A set of triples forms an RDF/OWL graph (model)
• Optimized storage structure: repeated values are stored only once (uses normalization)
• Scales to very large datasets
• No limits to the amount of data that can be stored
• Current users: 600 million+ triples (UTH)
• Can handle multiple lexical forms of the same value
• Ex: "0010"^^xsd:decimal and "010"^^xsd:decimal
• Maintains fidelity (user-specified lexical form)
• Supports long literal values

[Diagram: example triple with subject John, predicate :employeeOf, object Oracle]
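The normalization and lexical-form points above can be illustrated with a small in-memory sketch. This is not Oracle's storage layout: the `TripleStore` class, its field names, the `:badge` predicate, and the decimal-only canonicalization are all assumptions made for illustration. It shows each distinct value being stored once, triples holding only value IDs, and two lexical forms of the same decimal being recognized as equal while both user-written forms are preserved.

```python
from decimal import Decimal


class TripleStore:
    """Toy normalized triple store: each distinct value is stored once."""

    def __init__(self):
        self.value_ids = {}   # lexical form -> value id
        self.values = []      # value id -> lexical form, exactly as written
        self.canon_ids = {}   # canonical form -> id shared by equal values
        self.canon_of = []    # value id -> canonical id
        self.triples = set()  # (subject id, predicate id, object id)

    @staticmethod
    def _canonical(lexical):
        # Assumption: only xsd:decimal literals are canonicalized here.
        if lexical.endswith("^^xsd:decimal"):
            digits = lexical.split("^^")[0].strip('"')
            return ("xsd:decimal", Decimal(digits))
        return ("other", lexical)

    def _intern(self, lexical):
        # Store a value on first sight; later uses reuse the same id.
        if lexical not in self.value_ids:
            self.value_ids[lexical] = len(self.values)
            self.values.append(lexical)
            canon = self._canonical(lexical)
            canon_id = self.canon_ids.setdefault(canon, len(self.canon_ids))
            self.canon_of.append(canon_id)
        return self.value_ids[lexical]

    def add(self, s, p, o):
        self.triples.add(tuple(self._intern(v) for v in (s, p, o)))

    def same_value(self, a, b):
        return self.canon_of[self.value_ids[a]] == self.canon_of[self.value_ids[b]]


store = TripleStore()
store.add(":John", ":employeeOf", ":Oracle")
store.add(":John", ":badge", '"0010"^^xsd:decimal')  # :badge is hypothetical
store.add(":Matt", ":badge", '"010"^^xsd:decimal')

print(len(store.values))  # 7 distinct values for 9 term positions
print(store.same_value('"0010"^^xsd:decimal', '"010"^^xsd:decimal'))  # True
```

Keeping the lexical form and the canonical form in separate maps is one way to get both behaviors at once: lookups by value treat the two literals as equal, while the stored text is returned unchanged.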


Semantic Data Storage

[Diagram: Application table 1 and Application table 2, each linked to a model in the internal semantic store]
• Application table columns: ID (number), TRIPLE (SDO_RDF_TRIPLE_S), …
• Optional columns for related enterprise data
• Application table links to model in internal semantic store


Query RDF Data

• SPARQL-like graph pattern embedded in SQL query
• Matches RDF/OWL graph patterns with patterns in stored data
• Returns a table of results
• Can use SQL operators/functions to process results
• Avoids staging when combined with queries on relational data
• Scales: millisecond query times for large data sets (10M+ triples)

SELECT …
FROM …, TABLE(SDO_RDF_MATCH invocation) t, …
WHERE …

SDO_RDF_MATCH(
  '(?x rdf:type :Person)',      -- pattern: all persons
  SDO_RDF_Models('family'),     -- RDF data models
  SDO_RDF_Rulebases('RDFS'),    -- rulebases
  SDO_RDF_Aliases(…),           -- aliases
  null                          -- no filter condition
)


Query Example: Family Data

select x, y, name from
TABLE(SDO_RDF_MATCH(
  '(:Tom :hasParent ?x)
   (?x :hasFather ?y)
   (?y :name ?name)',
  SDO_RDF_Models('family'),
  .., .., ..));

Returns the name of Tom’s grandfather

[Diagram: family graph with nodes :Jack, :Tom, :Janice, :John, :Suzie, :Matt and the literal "John D"]

X      Y      NAME
Matt   John   "John D"
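To make the pattern-matching semantics of the query above concrete, here is a minimal Python sketch of matching a list of triple patterns against in-memory triples. It is not how SDO_RDF_MATCH is implemented; the `match` function and the `family` data are assumptions. The edges are chosen to reproduce the result row on the slide, plus one extra `:hasParent` edge that is purely illustrative, since the full family graph is not recoverable from the slide.

```python
def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every triple pattern."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


# Assumed data, consistent with the slide's result row.
family = [
    (":Tom", ":hasParent", ":Matt"),
    (":Matt", ":hasFather", ":John"),
    (":John", ":name", "John D"),
    (":Tom", ":hasParent", ":Suzie"),  # illustrative: no :hasFather edge follows
]

pattern = [
    (":Tom", ":hasParent", "?x"),
    ("?x", ":hasFather", "?y"),
    ("?y", ":name", "?name"),
]

rows = [(b["?x"], b["?y"], b["?name"]) for b in match(pattern, family)]
print(rows)  # [(':Matt', ':John', 'John D')]
```

Each result is one row of variable bindings, which is why the output of a graph pattern can be treated as an ordinary table and processed with SQL operators.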


Combining RDF Queries with Relational Queries

• Find salary and hiredate of Tom’s grandfather(s)

SELECT emp.name, emp.salary, emp.hiredate
FROM emp, TABLE(SDO_RDF_MATCH(
  '(:Tom :hasParent ?y) (?y :hasFather ?x) (?x :name ?name)',
  SDO_RDF_Models('family'), …)) t
WHERE emp.name = t.name;
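The idea of joining graph-pattern results to a relational table in one statement can be tried outside Oracle with a rough analogue. The sketch below uses Python's standard sqlite3 module and plain self-joins on a three-column `triples` table in place of SDO_RDF_MATCH; the table layout, the salary and hire-date values, and the extra employee row are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE triples (s TEXT, p TEXT, o TEXT);
    CREATE TABLE emp (name TEXT, salary INTEGER, hiredate TEXT);
    INSERT INTO triples VALUES
        (':Tom',  ':hasParent', ':Matt'),
        (':Matt', ':hasFather', ':John'),
        (':John', ':name',      'John D');
    -- Assumed relational rows; the figures are placeholders.
    INSERT INTO emp VALUES
        ('John D', 5000, '1985-03-01'),
        ('Ann B',  6000, '1990-07-15');
""")

# Each triple pattern becomes one alias of the triples table; shared
# variables become join conditions. The last join ties ?name to emp.name.
rows = conn.execute("""
    SELECT emp.name, emp.salary, emp.hiredate
    FROM emp
    JOIN triples t3 ON t3.o = emp.name AND t3.p = ':name'
    JOIN triples t2 ON t2.o = t3.s     AND t2.p = ':hasFather'
    JOIN triples t1 ON t1.o = t2.s     AND t1.p = ':hasParent'
    WHERE t1.s = ':Tom'
""").fetchall()

print(rows)  # [('John D', 5000, '1985-03-01')]
conn.close()
```

Because the graph data and the employee table live in the same database, the join runs in one query with no intermediate export step, which is the "avoids staging" point made earlier in the deck.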


Inference: Overview

• Native inferencing in the database for:
  • RDF, RDFS
  • User-defined rules
• Rules are stored in rulebases in the database
• RDF graph is entailed (new triples are inferred) by applying rules in one or more rulebases to one or more models
• Inferencing is based on forward chaining: new triples are inferred and stored ahead of query time
• Minimizes on-the-fly computation and results in fast query times


Inferencing

• RDFS Example:

A rdf:type B, B rdfs:subClassOf C

=> A rdf:type C

Ex: Matt rdf:type Father, Father rdfs:subClassOf Parent

=> Matt rdf:type Parent

• User-defined Rules Example:

A :hasParent B, B :hasParent C

=> A :hasGrandParent C

Ex: Tom :hasParent Matt, Matt :hasParent John

=> Tom :hasGrandParent John
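The two rules above can be run as a small forward-chaining sketch in Python. This is an illustration of the technique only, not Oracle's inferencing engine: the function names and the fixpoint loop are assumptions, and rules are written as plain functions rather than stored in a rulebase.

```python
def rdfs_subclass(triples):
    """A rdf:type B, B rdfs:subClassOf C => A rdf:type C."""
    return {
        (a, "rdf:type", c)
        for (a, p1, b) in triples if p1 == "rdf:type"
        for (b2, p2, c) in triples if p2 == "rdfs:subClassOf" and b2 == b
    }


def grandparent(triples):
    """A :hasParent B, B :hasParent C => A :hasGrandParent C."""
    return {
        (a, ":hasGrandParent", c)
        for (a, p1, b) in triples if p1 == ":hasParent"
        for (b2, p2, c) in triples if p2 == ":hasParent" and b2 == b
    }


def entail(triples, rules):
    """Apply all rules repeatedly until no new triples appear."""
    known = set(triples)
    while True:
        new = set().union(*(rule(known) for rule in rules)) - known
        if not new:
            return known
        known |= new


model = {
    (":Matt", "rdf:type", ":Father"),
    (":Father", "rdfs:subClassOf", ":Parent"),
    (":Tom", ":hasParent", ":Matt"),
    (":Matt", ":hasParent", ":John"),
}

entailed = entail(model, [rdfs_subclass, grandparent])
inferred = entailed - model
print(sorted(inferred))
# [(':Matt', 'rdf:type', ':Parent'), (':Tom', ':hasGrandParent', ':John')]
```

The inferred triples are computed once and kept alongside the asserted ones, so a later query for `:hasGrandParent` is a plain lookup rather than a computation at query time.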


Query Example: Family Data

select y, name from TABLE(SDO_RDF_MATCH(
  '(:Tom :hasGrandParent ?y)
   (?y :name ?name)
   (?y rdf:type :Male)',
  SDO_RDF_Models('family'),
  SDO_RDF_Rulebases('family_rb'),
  .., ..));

Returns the name of Tom’s grandfather

Y      NAME
John   "John D"

[Diagram: family graph with nodes :Jack, :Tom, :Janice, :John, :Suzie, :Matt, the literal "John D", and the class Male]


Data Integration in the Life Sciences

“Find all pieces of information associated with a specific target”

• Data integration of multiple datasets
  • Across multiple representation formats, granularity of representation, and access mechanisms
  • Across in-house and public sets (Gene Ontology, UniProt, NCI thesaurus, etc.)
• Standardized and machine-understandable data format with an open data access model is necessary to enable integration
• Data-warehousing approach represents all data to be integrated in RDF/OWL
• Semantic metadata layer approach links metadata from various sources and maps data access tool to relevant source
• Ability to combine RDF/OWL queries with relational queries is a big benefit
• Lilly and Pfizer are using semantic technology to solve data integration problems


Use Case: SenseLab Overview

Courtesy, SenseLab, Yale University


Relational to Ontological Mapping

[Diagram: ontology classes Drug, Neuron, PathologicalAgent, Receptor, Channel, Agent, NeuronalProperty, PathologicalChange, Compartment, connected by the relations inhibits, involves, has, is_located_in]

Courtesy, SenseLab, Yale University



Semantic Technology Plans for the Next Release


Safe Harbor Statement & Confidentiality

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Plans for the Next Release

• Fast bulk-load RDF/OWL data into the database
  • Several times faster than 10.2.0.2 batch load
• Infer new triples with native OWL inferencing
• Faster query of RDF/OWL data and ontologies
• Ontology-Assisted Query of relational data


Overview

[Diagram: RDF/OWL data and ontologies alongside Enterprise (Relational) data, with three groups of operations]
• STORE: Batch-Load; Bulk-Load; Incr. DML
• INFER: RDF/S; User-def.; OWL subsets
• QUERY: Query RDF/OWL data and ontologies; Ontology-Assisted Query of Enterprise Data


Technical Overview Summary

• Semantic Technology support in the database
• Store RDF/OWL data and ontologies
• Infer new RDF/OWL triples via native inferencing
• Query RDF/OWL data and ontologies
• Ontology-Assisted Query of relational data