

Page 1: SEMTECH2011E - OnTheDay - TQ NIEM Ontologies and Vocabularies - (v5-ARH-sFINAL)


© Copyright 2011 TopQuadrant Inc 1

NIEM Ontologies and Vocabularies

Transforming NIEM to RDF/OWL and Querying NIEM-compliant Instance Data

using SPARQL and SPIN

Ralph Hodgson, CTO, TopQuadrant Gokhan Soydan, Semantic Solution Developer, TopQuadrant

SemTech 2011 East, Thursday, December 1, 2011, 3:00 PM - 3:50 PM Level: Technical – Intermediate

Location: Auditorium

Page 2

What is there to talk about, is there enough time?

Business and Technical Motivations
• Reusable Message Building blocks
• Composable Message Schemas
• Controllable Vocabularies
• Linked Data
• Information Insight

Approaches to model-based Information Exchange using controlled vocabularies
• UML XML Schema UN/CEFACT CCTS OWL XML Schemas OWL and Turtle/JSON-LD

Expressing NIEM as OWL Models and Vocabularies
• XSD to OWL Transformation
• U.S. DOJ Logical Entity Exchange Specification 3.1 (LEXS)
• XML Instance Messages to RDF Conversion

The Power of RDF/OWL and SPARQL
• SPARQL inferencing over LEXS Messages
• Demonstration

Next Possibilities
• NIEM as LOD

Take Away

Page 3

First, let’s remind ourselves why information is exchanged

Sculpture by M. Chava Evans (Baltimore, MD) Sculpture, Studio 33, Torpedo Factory, Alexandria, VA

Page 4

technical motivations …

Sculptures in the National Gallery, East Building, Washington DC, Nov 25, 2011

XML OWL

UML?

OWL as a specification language for information models and controlled vocabularies

Page 5

But life in the XML Ecology isn’t easy

from hierarchies to Graphs

from Graphs to hierarchies

more at http://topquadrantblog.blogspot.com/2011/09/living-in-xml-and-owl-world.html

Page 6

Some breakthroughs: Co-existence of OWL and XSD/XML

1. Make OWL Schemas from NIEM and LEXS XSD Schemas

[Diagram: TopBraid Transformers: Convert XSD to RDF/OWL]

2. Use the OWL Schemas to make RDF from LEXS XML Messages

[Diagram: Semantic XML: Convert XML to RDF/OWL]

Page 7

ReportingHub Semantic Processing

[Diagram: processing steps]
SPARQL Rules (SPIN): Convert XML to RDF/OWL
SPARQL Rules (SPIN): Convert RDF/OWL to XML
SPARQL Web Pages (SWP): Convert RDF/OWL to HTML
SPARQL Web Pages (SWP): Convert HTML to PDF

Page 8

Generating XML Schemas and Controlled Vocabularies from OWL Models

GRDDL XSLT Generator

XSLT Processor

Going from XML to OWL

ref: XML SchemaPlus – http://www.xspl.us

Page 9

Different Reasons to “Connect the Dots”

1) 360 Degrees View: more about the same thing

2) Transitive Connections: what is linked to a thing of interest

3) Information Discovery: find things that share common attributes or relationships

Page 10

Personal Motivations: August 1, 2009 – “Data Independence Day”

www.oegov.org

Page 11

Current practices for “Living in the XML Ecology” raise many challenges:

1. Vocabulary Alignment

2. Governance of “core” models

3. Extensibility and tailoring of models to local needs

4. Resilience to change

Page 12

Some ways XML Message Schemas have been, or are being, made using UML (1 of 5)

1 The Weather Data Model

ref: WXXM 1.1 Primer, 1.1 10 February 2010, https://wiki.ucar.edu/display/NNEWD/WXXM

Take Away

• No URIs
• No inherent aggregation properties
• Special programs
• Complex queries

Page 13

Some ways XML Message Schemas have been, or are being, made using UML (2 of 5)

2 CIM Models in the SmartGrid

ref: EPRI CIM and 61850 Harmonization 2009 Project Report, Nov 17, 2009, http://cimug.ucaiug.org/Meetings/Charlotte2009/Presentations/CIM%20and%2061850%20Harmonization%20102909.pdf

Page 14

Some ways XML Message Schemas have been, or are being, made using UML (3 of 5)

3 Harmonizing Spatial Data – NEN 3610:2011 and GML

ref: http://www.nen.nl/web/Normshop/Norm/NEN-36102011-nl.htm

[Diagram: UML model / UML Application Schema (XMI) and Configuration (XML) are processed by ShapeChange (Java program, Servlet) into a GML Application Schema (XML Schema); further labels: Encoding Rules, Guidelines]

Take Away

• No URIs
• No inherent aggregation properties
• Special programs
• Complex queries

Page 15

Some of the ways XML Message Schemas have been, or are being, made using UML (4 of 5)

4 UN/CEFACT Standards for Message Exchange

Source: 16th UN/CEFACT PLENARY http://www.unece.org/fileadmin/DAM/cefact/cf_plenary/plenary10/UNCEFACT%2016TH%20PLENARY_full_rev5.ppt

Take Away

• No URIs
• No inherent aggregation properties
• Special programs
• Complex queries

Page 16

Some ways XML Message Schemas have been, or are being, made using UML (5 of 5)

5 NIEM Information Exchange Package Documentation

source: “Where have all the Standards Gone?”, Bruce Kelling (Moderator), http://www.ncja.org/Content/NavigationMenu/EducationEvents/2009NationalForum/AllSpeakers.Standards.ppt

Take Away

• No URIs
• Complexity
• Recommended Practices
• Required Practices

Page 17

TopQuadrant has faced the “OWL co-existence with UML and XML” challenges on a number of projects

SmartGrid Semantic Harmonization and Interoperability

NASA Telemetry and Command, Simulation and Data Architecture Models and Vocabularies

The Netherlands MoJ Ontology-Driven Metadata Workbench Message Builder

EPIM Reporting Hub for the Norwegian Oil and Gas Fields

Page 18

The Netherlands MoJ Ontology-Driven Metadata Workbench Message Builder

Business Needs
• Accurate and rapid Information Sharing between Organizations
• Agility in response to Legislation Changes
• Data Quality is guaranteed
• Reduced Costs of Message Schema Development

Technical Benefits
• Direct and flexible Reuse of Data Components
• Full Automation of XML Schema creation
• Semantic Consistency is preserved and confirmed
• Linked Data / traceability
• Version Management

ref: http://www.enterprisedatajournal.com/article/netherlands-ministry-justice-metadata-workbench-composing-xml-message-schemas-owl-models.htm

Page 19

The Netherlands MoJ Ontology-Driven Approach to Message Design using UN/CEFACT

Solution: Ontology-Based Metadata Workbench: Transform Domain Models into UN/CEFACT CCTS compliant representation and allow Business Analysts to assemble business documents for electronic messages from Component Parts.

Page 20

Creation of XML Message Schemas

[Diagram labels: Rich Ontologies; Domains; Contexts; Projects; CCTS Ontologies; Core Component Overlay; Business Component Overlay; Business Document Ontologies; CCTS MetaModel; CCTS Document; SPIN Transformation rules; XSP MetaModel; CCTS XML SchemaPlus; XSLT Script; CCTS XML Schema. Steps: CCTS Document Editor, XSP Generation, XSD Generation]

“Rich” Ontologies are expressive models of domains. These include LKIF and detailed situations of law, legal documents and procedures.

Users create CCTS documents from BIEs and Core Components

CCTS-Compliant XML Schemas are generated from the XSP Document

Acronyms
BIE: Business Information Entity
CCTS: UN/CEFACT Core Component Technical Specifications
LKIF: Legal Knowledge Interchange Format
SPIN: SPARQL Inferencing Notation
XSLT: XSL Transformations (XSLT) Version 2.0
XSP: XML SchemaPlus

Page 21

NASA Constellation Program

CxP 70160 ANX10 Infrastructure Specification

CxP 70160 ANX11 Application Programming Interface Specification

CxP 70160 ANX14 Policy and Security Model

Constellation Program Data Architecture and Interoperability through the use of OWL Ontologies with strategies for co-existence with XML and other data formats.

Page 22

Generating XML Schemas and Controlled Vocabularies from OWL Models

GRDDL XSLT Generator

XSLT Processor

Going from XML to OWL

ref: XML SchemaPlus – http://www.xspl.us

Page 23

ReportingHub Vision

Need: “Reporting to authorities and partners on the NCS in a cost efficient and secure manner”

Outcome: “Improved Information Integration and Exchange”, “Faster and better decisions”

Enablers:
• “A Field Specific Asset Model based on the Common Asset Model – ISO 15926, PCA RDL and NPD Facts”
• “SPARQL as a way to query the data in a triple store and reason about data using appropriate inference engine(s)”
• “Web Services for hiding the complexity of SPARQL Queries”
• “Machine driven creation of new data relationships without restructuring the data model”

1500 named users, and 100 concurrent users

[Diagram: processing steps]
SPARQL Rules (SPIN): Convert XML to RDF/OWL
SPARQL Web Pages (SWP): Convert RDF/OWL to PDF
SPARQL Web Pages (SWP): Convert HTML to PDF

Page 24

ReportingHub Semantic Processing

[Diagram: processing steps]
SPARQL Rules (SPIN): Convert XML to RDF/OWL
SPARQL Rules (SPIN): Convert RDF/OWL to XML
SPARQL Web Pages (SWP): Convert RDF/OWL to HTML
SPARQL Web Pages (SWP): Convert HTML to PDF

Page 25

The NIEM/LEXS Experiment

Page 26

The NIEM/LEXS Experiment

From NIEM/LEXS XSD Schemas and Instance Data

To OWL Models and RDF Triples

NIEM/LEXS RDF/OWL Stack

VAEM, VOAG, VOID, DC

LEXS Rules

LEXS Instances

DTYPE

NIEM Vocabs and Datatypes

NIEM Ontologies

LEXS Ontology

Page 27

What is NIEM?

“National Information Exchange Model, NIEM, is an interagency initiative to provide the foundation and building blocks for national-level interoperable information sharing and data exchange.

The NIEM project was formally announced at the Global Justice XML Data Model (Global JXDM) Executive Briefing on February 28, 2005.

It was initiated as, and continues to be, a joint venture between the U.S. Department of Homeland Security (DHS) and DOJ with outreach to other departments and agencies.

The base technology for NIEM is derived from the Global JXDM.”

source: http://it.ojp.gov/default.aspx?area=implementationAssistance&page=1017&standard=486

Page 28

What are IEP and IEPD? Options for implementing information exchanges

source: US DoJ Implementation Guidance for NIEM-Conformant Exchanges, http://www.hsdl.org/?view&did=487388

An Information Exchange Package (IEP) is an XML representation of the information shared for a specific business purpose.

An Information Exchange Package Documentation (IEPD) is a collection of artifacts (describing the purpose, structure and content of IEPs) that governs an information exchange.

Page 29

What we will show you today

• Generation of OWL Models from XML Schemas

• Auto-conversion of LEXS-based XML messages to RDF

• An experiment with fake (generated) Incidents data to show how multiple messages can be aggregated

• Some SPARQL Queries and SPIN rules at work

Page 30

XSD/XML to OWL Rules (1 of 2)

# | XSD/XML Constructs | OWL Constructs
1 | xsd:simpleType | rdfs:Datatype
2 | xsd:simpleType with xsd:enumeration | Becomes an owl:Class as a subclass of ‘EnumeratedValue’. Instances are created for every enumerated value. An instance of ‘Enumeration’, referring to all the instances, is created, as well as the owl:oneOf enumeration over the instances.
3 | xsd:complexType over xsd:complexContent | owl:Class
4 | xsd:complexType over xsd:simpleContent | owl:Class
5 | xsd:element (global) with complex type | owl:Class and subclass of the class generated from the referenced complex type
6 | xsd:element (global) with simple type | rdfs:Datatype
7 | xsd:element (local to a type) | owl:DatatypeProperty or owl:ObjectProperty depending on the element type. OWL Restrictions are built for the occurrence.
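The first rows of the mapping table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TopBraid importer: the toy schema, the `ex` prefix and the function name `xsd_to_triples` are hypothetical, triples are plain string tuples, and a non-enumerated simple type is recorded as a datatype declaration (`rdfs:Datatype` in standard RDF Schema terms).

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical toy schema; the type names are illustrative only.
SAMPLE_XSD = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="HairColorCode">
    <xsd:restriction base="xsd:token">
      <xsd:enumeration value="GRY"/>
      <xsd:enumeration value="BLK"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="AgeValue">
    <xsd:restriction base="xsd:integer"/>
  </xsd:simpleType>
  <xsd:complexType name="PersonType">
    <xsd:complexContent>
      <xsd:extension base="ObjectType"/>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
"""


def xsd_to_triples(xsd_text, prefix="ex"):
    """Return (subject, predicate, object) string triples for rules 1 to 4."""
    root = ET.fromstring(xsd_text)
    triples = []
    for simple in root.findall(XS + "simpleType"):
        name = f"{prefix}:{simple.get('name')}"
        values = [e.get("value") for e in simple.iter(XS + "enumeration")]
        if values:
            # Rule 2: enumerated simple type -> class plus one instance per value
            triples.append((name, "rdf:type", "owl:Class"))
            triples.append((name, "rdfs:subClassOf", f"{prefix}:EnumeratedValue"))
            for value in values:
                triples.append((f"{name}_{value}", "rdf:type", name))
        else:
            # Rule 1: plain simple type -> datatype declaration
            triples.append((name, "rdf:type", "rdfs:Datatype"))
    for complex_type in root.findall(XS + "complexType"):
        # Rules 3 and 4: complex types become classes
        triples.append((f"{prefix}:{complex_type.get('name')}", "rdf:type", "owl:Class"))
    return triples


triples = xsd_to_triples(SAMPLE_XSD)
```

Rules 5 to 7 (elements, properties and restrictions) are left out to keep the sketch short.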

Page 31

XSD/XML to OWL Rules (2 of 2)

# | XSD/XML Constructs | OWL Constructs
8 | xsd:group | owl:Class and sub-class of ‘A_AbstractElementGroup’
9 | xsd:attributeGroup | owl:Class and sub-class of ‘A_AbstractAttributeGroup’
10 | Anonymous Complex Type | As for Complex Type except a URI is constructed from the parent element and the nested element reference. Also, the class is defined as a subclass of ‘A_Anon’.
11 | Anonymous Simple Type | As for Simple Type except a URI is constructed from the parent element and the nested element reference.
12 | xsd:default on an attribute | Uses ‘dtype:defaultValue’ to attach a value to the OWL restriction representing the associated property.
13 | Substitution Groups | Subclass statements are generated for the members. Instance files resolve their types by consulting the OWL model at import-time.
14 | Annotation attributes on elements | OWL Annotation properties are created and placed directly on the relevant class.
15 | Annotations using xsd:annotation | Become, based on user selection, dc:description, rdfs:comment and/or skos:definition OWL annotations.
16 | xsi:type on an XML element | Overrides the schema type with the specified type.
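Rule 16 is easy to picture with a small sketch. The element-to-type table `SCHEMA_TYPE`, the function `resolve_type` and the sample document are hypothetical stand-ins for what an importer would look up in the generated OWL model; only the override behaviour itself comes from the table.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical element-to-type table standing in for the generated OWL model.
SCHEMA_TYPE = {"Person": "PersonType"}


def resolve_type(element):
    """Rule 16: an xsi:type attribute overrides the schema-declared type."""
    override = element.get(XSI_TYPE)
    if override is not None:
        # Drop any namespace prefix of the QName, e.g. "j:SubjectType" -> "SubjectType"
        return override.split(":")[-1]
    return SCHEMA_TYPE.get(element.tag)


doc = ET.fromstring(
    '<Incident xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<Person/><Person xsi:type="SubjectType"/></Incident>'
)
resolved = [resolve_type(person) for person in doc.findall("Person")]
```

The first `Person` keeps its schema type and the second takes the type named in `xsi:type`.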

Page 32

DEMO of XSD to OWL and XML to OWL Transformations

Page 33

Metrics on the NIEM OWL Model

SELECT ?class ?restrictionCount
WHERE {
    ?class a owl:Class .
    BIND (smf:countResults(
        "SELECT DISTINCT ?property WHERE { ?class rdfs:subClassOf ?restriction . ?restriction a owl:Restriction . ?restriction owl:onProperty ?property }"
    ) AS ?restrictionCount)
}

Page 34

NIEM Person (Proto) OWL Model

Note: to address the reusability required in the MoJ work, NIEM ‘Person’ was re-factored into individual ‘Details’ classes.

Page 35

Refactoring of NIEM Person into an OWL Model with reusable Concepts (person:Details)

Depending on the context of use, concepts describing different details about a person can be selected for the UBL Business Documents and Messages.

Page 36

Refactoring of the NIEM Person into an OWL Model with reusable Concepts (person:AppearanceDetails)

A person’s ‘Appearance Details’ will be needed for criminal investigations.

Page 37

NIEM JXDM Complex Type Example

Page 38

DOJ Logical Entity Exchange Specification (LEXS)

Page 39

What is the DOJ LEXS?

“LEXS provides a flexible, NIEM-based framework used for the creation of NIEM-conformant IEPDs for information sharing, both for publishing information and for system-to-system federated searches.”

source: http://it.ojp.gov/default.aspx?area=implementationAssistance&page=1017&standard=486

LEXS is a family of NIEM-conformant IEPDs that define flexible structures to support a variety of applications.

Any application that participates in OneDOJ, is a part of LEISP, or supports law enforcement information sharing must participate in LEXS exchanges.

If additional structures beyond the base LEXS are required, LEXS should be extended by using NIEM (Option 2).

Page 40

Conversion of LEXS from XML Schema to OWL using the TopBraid XSD to OWL Importer

XML Schemas OWL Models

Page 41

Using SPARQL to count the properties on the LEXS/NIEM OWL Models

SELECT ?class (COUNT(DISTINCT ?p) AS ?properties)
WHERE {
    ?class a owl:Class .
    OPTIONAL {
        ?class rdfs:subClassOf ?r .
        ?r a owl:Restriction .
        ?r owl:onProperty ?p .
    }
}
GROUP BY ?class
ORDER BY DESC(?properties)
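For readers less familiar with SPARQL aggregates, the same count can be sketched over an in-memory list of triples. The toy triples, the `ex:` names and the function `count_properties` are hypothetical; the logic mirrors the query: every class is kept (the OPTIONAL part), restricted properties are counted once per class (DISTINCT), and the result is sorted by descending count.

```python
from collections import defaultdict

# Hypothetical toy triples; class and property names are illustrative only.
TRIPLES = [
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:Vehicle", "rdf:type", "owl:Class"),
    ("ex:Empty", "rdf:type", "owl:Class"),
    ("ex:Person", "rdfs:subClassOf", "_:r1"),
    ("ex:Person", "rdfs:subClassOf", "_:r2"),
    ("ex:Person", "rdfs:subClassOf", "_:r3"),
    ("ex:Vehicle", "rdfs:subClassOf", "_:r4"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("_:r2", "rdf:type", "owl:Restriction"),
    ("_:r3", "rdf:type", "owl:Restriction"),
    ("_:r4", "rdf:type", "owl:Restriction"),
    ("_:r1", "owl:onProperty", "ex:name"),
    ("_:r2", "owl:onProperty", "ex:name"),  # same property twice, counted once
    ("_:r3", "owl:onProperty", "ex:birthDate"),
    ("_:r4", "owl:onProperty", "ex:vin"),
]


def count_properties(triples):
    """Count distinct restricted properties per class, largest count first."""
    classes = {s for s, p, o in triples if p == "rdf:type" and o == "owl:Class"}
    restrictions = {s for s, p, o in triples if p == "rdf:type" and o == "owl:Restriction"}
    on_property = defaultdict(set)
    for s, p, o in triples:
        if p == "owl:onProperty" and s in restrictions:
            on_property[s].add(o)
    properties = {c: set() for c in classes}  # classes without restrictions stay at 0
    for s, p, o in triples:
        if p == "rdfs:subClassOf" and s in classes and o in restrictions:
            properties[s] |= on_property[o]
    return sorted(((c, len(ps)) for c, ps in properties.items()),
                  key=lambda row: (-row[1], row[0]))


ranking = count_properties(TRIPLES)
```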

Page 42

‘digest:EntityAssociationType’ really stands out with 194 Properties

Is this a refactoring opportunity?

Page 43

Some NIEM Controlled Vocabularies FBI

Page 44

FBI Code Lists Example: Hair Color

OWL Model OWL Instances

fbi:HAICST_GRY
    a fbi:HAICodeSimpleType ;
    rdfs:label "GRY"^^xsd:string ;
    dtype:order "5"^^xsd:nonNegativeInteger ;
    dtype:value "GRY"^^xsd:token ;
    skos:definition "Gray or Partially Gray"^^xsd:string ;
    skos:prefLabel "GRY"^^xsd:string .

Grey Hair in Turtle Syntax

Page 45

The digest:EntityActivity Class

OWL Class with properties

Inheritance

Association

A ‘digest:EntityActivity’ is both a ‘digest:Entity’ and a ‘digest:EntityActivityType’

Multiple Inheritance is common

Note that the ‘proto-OWL’ ontology respects the XML Schema’s use of wrapped data types. An optimization can unfold these to direct data types

Association

Page 46

A digest:EntityActivity Instance

<lexsdigest:EntityActivity>
  <lexsdigest:Metadata s:id="MIncident1">
    <nc:ReportedDate><nc:Date>1997-03-12</nc:Date></nc:ReportedDate>
  </lexsdigest:Metadata>
  <nc:Activity s:id="Incident1" s:metadata="MIncident1">
    <nc:ActivityIdentification>
      <nc:IdentificationID>000000000003</nc:IdentificationID>
    </nc:ActivityIdentification>
    <nc:ActivityCategoryText>Incident</nc:ActivityCategoryText>
    <nc:ActivityDate><nc:DateTime>1997-03-12T00:01:00.0Z</nc:DateTime></nc:ActivityDate>
    <nc:ActivityDescriptionText>On 3/12/1997 at 12:01 a.m., Mr. Donald R. Duck (Witness 1) saw a white male break the glass of his neighbor's (Jacob Joe) front door. Mr. Duck placed a 911 call on his cell phone to report the incident. Within minutes, police arrive at the residence (1 NW Brockway Avenue) to find the subject ransacking the house. Detective Bond was the responding and arresting officer. The subject was taken to the Santa Fe Police Department and placed under arrest. An arrest report was filed on 3/12/1997.</nc:ActivityDescriptionText>
  </nc:Activity>
</lexsdigest:EntityActivity>

Class

Instance

Page 47

Transforming LEXS Instance Data to RDF

Automatic Conversion from LEXS XML to RDF

[Diagram: burglary-incident-w-arrest-basic-lexs.xml is converted by Semantic XML (Convert XML to RDF/OWL) into burglary-incident-w-arrest-basic-lexs (RDF)]

TopBraid’s Semantic XML Engine uses sxml:tag annotations on the auto-generated NIEM/LEXS OWL Ontologies to control the transformations.

Page 48

Useful QA Check on the “Semantic XML”1 Triples

SELECT *
WHERE {
    ?subject composite:child ?object .
    FILTER NOT EXISTS { ?object a sxml:Comment }
    FILTER NOT EXISTS {
        ?object a ?type .
        ?type sxml:element "xi:include"
    }
}

“0” is good!

QA Check

1 Semantic XML is a composite pattern model:

?anElement composite:child ?anotherElement

?anElement composite:child ?anAttribute
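One reading of the QA check above is that any generic `composite:child` link left in the imported graph marks an element that the ontology-driven import could not map to a named property, so an empty result is the good case. The sketch below illustrates that reading only. The tag-to-property table stands in for the sxml:tag annotations, and `ex:dateRef`, `ex:reportedDateRef`, `xml_to_triples` and `qa_leftovers` are hypothetical names, not the Semantic XML engine's API.

```python
import xml.etree.ElementTree as ET
from itertools import count

# Hypothetical tag-to-property table standing in for the sxml:tag annotations.
TAG_TO_PROPERTY = {"Date": "ex:dateRef", "ReportedDate": "ex:reportedDateRef"}


def xml_to_triples(xml_text):
    """Turn an XML tree into triples, one blank node per element."""
    ids = count(1)
    triples = []

    def visit(elem):
        node = f"_:n{next(ids)}"
        for child in elem:
            # Known tags get a named property; anything else falls back to composite:child
            prop = TAG_TO_PROPERTY.get(child.tag, "composite:child")
            triples.append((node, prop, visit(child)))
        if elem.text and elem.text.strip():
            triples.append((node, "dtype:value", elem.text.strip()))
        return node

    visit(ET.fromstring(xml_text))
    return triples


def qa_leftovers(triples):
    """Return the generic composite:child links; an empty list is the good case."""
    return [t for t in triples if t[1] == "composite:child"]


good = xml_to_triples(
    "<Metadata><ReportedDate><Date>1997-03-12</Date></ReportedDate></Metadata>"
)
bad = xml_to_triples("<Metadata><Unmapped/></Metadata>")
```

Here `qa_leftovers(good)` is empty, while `qa_leftovers(bad)` reports the one element with no mapping.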

Page 49

Example SPARQL Query for the sample Burglary Incident

Find all people involved in an incident for which we have full names, SSNs, FBI IDs, finger prints and the state of jurisdiction

SELECT ?s ?fn1 ?ssn1 ?fbiID1v ?fpID1v ?fpj1v
WHERE {
    ?s rdf:type digest:EntityPerson .
    ?s digest:personRef ?p1 .
    ?p1 core:personNameRef ?pnR1 .
    ?pnR1 core:personFullNameRef ?pfnR1 .
    ?pfnR1 dtype:value ?fn1 .
    ?p1 core:personSSNIdentificationRef ?pSSNR1 .
    ?pSSNR1 core:identificationIDRef ?pSSN1 .
    ?pSSN1 dtype:value ?ssn1 .
    ?p1 digest:personAugmentationRef ?p1a .
    ?p1a jxdm:personFBIIdentificationRef ?fbiID1 .
    ?fbiID1 core:identificationIDRef ?fbicID1 .
    ?fbicID1 dtype:value ?fbiID1v .
    ?p1a jxdm:personStateFingerprintIdentificationRef ?fp1 .
    ?fp1 core:identificationIDRef ?fpcID1 .
    ?fpcID1 dtype:value ?fpID1v .
    ?fp1 core:identificationJurisdictionRef ?fpj1 .
    ?fpj1 dtype:value ?fpj1v .
}

Page 50

Using Magic Properties and Property Chains to simplify the SPARQL

SELECT ?s ?fn1 ?ssn1 ?fbiID1v ?fpID1v ?fpj1v
WHERE {
    ?s rdf:type digest:EntityPerson .
    ?s digest:personRef ?p1 .
    ?p1 lexs:getFullName ?fn1 .
    ?p1 lexs:getSSN ?ssn1 .
    ?p1 digest:personAugmentationRef ?p1a .
    ?p1a lexs:getFBI-ID ?fbiID1v .
    ?p1a lexs:getFingerprintID ?fpID1v .
    ?p1a lexs:getFingerprintIDState ?fpj1v .
}

Each Magic Property is defined by a Property Chain:

lexs:getFullName
SELECT ?name
WHERE { ?arg1 (core:personNameRef / core:personFullNameRef / dtype:value) ?name . }

lexs:getSSN
SELECT ?ssn
WHERE { ?arg1 (core:personSSNIdentificationRef / core:identificationIDRef / dtype:value) ?ssn . }

lexs:getFBI-ID
SELECT ?id
WHERE { ?arg1 (jxdm:personFBIIdentificationRef / core:identificationIDRef / dtype:value) ?id . }

lexs:getFingerprintID
SELECT ?id
WHERE { ?arg1 (jxdm:personStateFingerprintIdentificationRef / core:identificationIDRef / dtype:value) ?id . }

lexs:getFingerprintIDState
SELECT ?id
WHERE { ?arg1 (jxdm:personStateFingerprintIdentificationRef / core:identificationJurisdictionRef / dtype:value) ?id . }
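The idea behind a magic property backed by a property chain can be sketched without a SPARQL engine: a named shortcut stands for a fixed sequence of predicates that is followed hop by hop. The triples, the SSN value and the helper names `follow_chain` and `magic` below are hypothetical; the two chains are copied from the slide.

```python
# Hypothetical toy graph; node names and literal values are made up.
TRIPLES = [
    ("ex:person1", "core:personSSNIdentificationRef", "_:id1"),
    ("_:id1", "core:identificationIDRef", "_:v1"),
    ("_:v1", "dtype:value", "000-00-0000"),
    ("ex:person1", "core:personNameRef", "_:n1"),
    ("_:n1", "core:personFullNameRef", "_:f1"),
    ("_:f1", "dtype:value", "Donald R. Duck"),
]

# Each magic property name maps to the property chain it abbreviates.
MAGIC = {
    "lexs:getSSN": [
        "core:personSSNIdentificationRef",
        "core:identificationIDRef",
        "dtype:value",
    ],
    "lexs:getFullName": [
        "core:personNameRef",
        "core:personFullNameRef",
        "dtype:value",
    ],
}


def follow_chain(triples, start, chain):
    """Follow a sequence path p1 / p2 / ... from start and return the end nodes."""
    frontier = {start}
    for predicate in chain:
        frontier = {o for s, p, o in triples if p == predicate and s in frontier}
    return frontier


def magic(triples, subject, name):
    """Evaluate a magic property by expanding it into its property chain."""
    return follow_chain(triples, subject, MAGIC[name])


ssn = magic(TRIPLES, "ex:person1", "lexs:getSSN")
full_name = magic(TRIPLES, "ex:person1", "lexs:getFullName")
```

The design point is the same as on the slide: callers see one step per fact, and the three-hop navigation through the wrapped data types lives in one place.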

Page 51

To demonstrate interesting queries over the LEXS model we needed more data

Only one example file was available

Because we cannot use real data, we built a random cloner of the single instance file using fake data

Random values were chosen from enumerated values

Random witnesses, victims and suspects were taken from a database of fake people

Random dates were generated

The resultant dataset can have any number of incidents

Page 52

Generating Random Instance Data using the RDF Instance Graph “Seed”

burglary-incident-w-arrest-basic-lexs (RDF)

Automatic Cloner Using Deep Random Graph Copier

1000 Graph Clones

1 Seed Graph

Page 53

Where to find fake people?

http://www.fakenamegenerator.com/order.php

Using TopBraid, a CSV file of up to 10,000 names was converted to RDF/OWL triples

This was done using SPINMap

RDF/OWL Instances

Page 54

SPINMap was used to transform the Fake People to NIEM/LEXS People

RDF/OWL Instances

Page 55

SPARQLMotion Script for the generation of Random Incidents using Fake People

Initialize script variables to set the count of random incidents and the base URIs of the other graphs

For each random incident graph, this controls the generation of fake instances

Clones the ‘seed’ graph to make each new incident graph

For each type of person (witness, victim, etc.), a random fake name is picked from the ‘Fake Names’ Graph

On completion, the new graph is exported.
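The script steps above can be sketched in plain Python. This is not the SPARQLMotion script itself: the seed triples, the `seed:` and `incN:` prefixes, the `ex:` properties, the name list and the functions `clone_incident` and `generate` are all hypothetical, chosen only to show the clone-and-randomize loop.

```python
import random

# Hypothetical seed graph: one incident with one witness and one arrestee.
SEED = [
    ("seed:incident", "rdf:type", "ex:Incident"),
    ("seed:incident", "ex:witness", "seed:witness1"),
    ("seed:incident", "ex:arrestee", "seed:arrestee1"),
    ("seed:witness1", "ex:fullName", "PLACEHOLDER"),
    ("seed:arrestee1", "ex:fullName", "PLACEHOLDER"),
]

# Hypothetical stand-in for the 'Fake Names' graph.
FAKE_NAMES = ["Ann Example", "Bob Sample", "Cy Test"]


def clone_incident(seed, index, rng):
    """Copy the seed graph under a fresh prefix and pick a random name per person."""

    def rename(term):
        if term.startswith("seed:"):
            return f"inc{index}:" + term[len("seed:"):]
        return term

    clone = []
    for s, p, o in seed:
        if p == "ex:fullName":
            o = rng.choice(FAKE_NAMES)  # one random fake person per role
        clone.append((rename(s), p, rename(o)))
    return clone


def generate(seed, incident_count, rng_seed=0):
    """Generate incident_count clones of the seed graph as one list of triples."""
    rng = random.Random(rng_seed)  # fixed seed so a run can be repeated
    dataset = []
    for index in range(1, incident_count + 1):
        dataset.extend(clone_incident(seed, index, rng))
    return dataset


dataset = generate(SEED, 3)
```

Because names are drawn independently for each clone, the same fake person can turn up in different roles across incidents.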

Page 56

Generation of Random Incidents using Fake People and Randomized Values

[Diagram: two example incident graphs. Incident 1 has 4 Witnesses and 2 Arrestees; Incident n has 2 Witnesses and 1 Arrestee. Each incident also links Victims, a Dispatcher, an Operator and an Officer. Further labels: Incident 1, Incident 10]

Page 57

Using SPIN to classify Person Instances

Page 58

Using SPIN to transform Person data type properties to direct attributes

Witness Class

Victim Class

Operator Class

Officer Class

Arrestee Class

Male Class

Female Class

Dispatcher Class

Person Class

SPIN Rule on Person Class

Sub-class Relationships

Page 59

A Query over the Incidents Data (1 of 2)

Find all people who have been both a witness and an arrestee across all incidents

Masato M. Sai was arrested in incident 4, but he was also a witness, which seems suspicious, especially considering he was an officer in incident 10 and a dispatcher in incident 1.

What’s interesting about Bartholomeus is that he was the dispatcher and got arrested for incident 10! So what’s going on here? Did he conspire with Masato?
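The "both a witness and an arrestee" question amounts to grouping role records by person and keeping those whose role set contains both roles. A minimal sketch follows, assuming role records have already been extracted as (incident, role, person) tuples. The records and person identifiers are hypothetical and are not the demo data.

```python
from collections import defaultdict

# Hypothetical (incident, role, person) records; not the demo dataset.
RECORDS = [
    (1, "dispatcher", "ex:personA"),
    (2, "witness", "ex:personA"),
    (4, "arrestee", "ex:personA"),
    (10, "officer", "ex:personA"),
    (10, "dispatcher", "ex:personB"),
    (10, "arrestee", "ex:personB"),
    (3, "witness", "ex:personC"),
]


def roles_by_person(records):
    """Map each person to {role: set of incident numbers}."""
    roles = defaultdict(lambda: defaultdict(set))
    for incident, role, person in records:
        roles[person][role].add(incident)
    return roles


def witness_and_arrestee(records):
    """People who appear as a witness and as an arrestee in any incidents."""
    roles = roles_by_person(records)
    return sorted(
        person for person, by_role in roles.items()
        if {"witness", "arrestee"} <= by_role.keys()
    )


suspicious = witness_and_arrestee(RECORDS)
```

Only `ex:personA` qualifies here; `ex:personB` was arrested but never a witness.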

Page 60

A Query over the Incidents Data (2 of 2)

Not surprising to confuse the police as suspects if you see this going on

Page 61

So why Integrate Data using RDF/OWL?

“Ontology-Driven Data Refineries”

“Frictionless” Data

Page 62

Possible Next Steps

• Enhance XML Schema to OWL transformations to produce more canonical OWL Models (direct properties)
• Form a community of interest?
• Publish NIEM and LEXS OWL Models and SKOS Vocabularies?
• Demonstrate data integration for LEXS-extended or non-LEXS-based IEPDs using OWL Neutral Models and SKOS vocabularies (the DoJ IEPD Clearinghouse lists over 200 custom IEPDs, and this is growing)
• Provide tooling for generating custom IEPDs using RDF/OWL ontologies with composable message components

Page 63

Concluding Remarks

On balance, in the limited time we had, the presentation attempted to show:

1. automatic generation of OWL Models and Vocabularies from XML Schemas

2. automatic generation of RDF/OWL Graphs from XML-compliant messages

3. OWL as an expressive specification language for information models and vocabularies

4. SPARQL as a powerful way of exploring both data and models and doing transformations

Page 64

NIEM References

Main Site http://www.ise.gov/national-information-exchange-model-niem

Clearing House http://it.ojp.gov/framesets/iepd-clearinghouse-noClose.htm

DoJ http://www.it.ojp.gov/default.aspx?area=implementationAssistance&page=1017&standard=520

HSDL http://www.hsdl.org/?view&did=487388

Other http://www.ibm.com/developerworks/library/x-NIEM4/