Next Generation Semantic Data Environments
(or Linked Data, Semantics, and Standards in Scientific Applications)

Deborah L. McGuinness
Tetherless World Senior Constellation Chair
Professor of Computer and Cognitive Science
Web Science Research Center Director
Rensselaer Polytechnic Institute, Troy, NY

With thanks to the extended RPI Tetherless World Team

OMG Semantics: From Research to Reality: Implementing the Semantic Web
March 20, 2013, Reston, VA


Page 2

Trends: More Data & More Diversity

• More data
  – More open data
  – More authoritative data
  – More interest in and generation of metadata
  – More enthusiast-generated / maintained data
  – More vocabularies, taxonomies, ontologies

• More diversity
  – Broader human participation
    • Trained scientists, citizens, enthusiasts, indigenous, …
  – More locations – mobile as well as global
  – More sensors – human, robots, implants, …
  – Real-time feeds
  – Social sources – Twitter, Facebook, …

Page 3

Increasing Requirements

• Data and data environments should:
  – Support usability – not just by original authors
  – Include (usable) documentation – metadata concerning collection methods, sources, recency, assumptions, …
  – Provide accessibility with transparent access policies
  – Include schema / ontology information – including mapping information used in integration, along with rationales
  – Support queries (with usable and understandable interfaces)
  – Document verification and curation methods, including access to tools
  – Support AND encourage interactions; users should be able to comment, question, contribute, discuss, …

Path moves from Portal -> Virtual Observatory -> Online Community

Next: examples, foundations, and discussion

Page 4

Semantic Environmental and Ecological Monitoring

• Enable/empower citizens & scientists to explore pollution sites, facilities, regulations, and health impacts along with provenance
• Demonstrates semantic monitoring possibilities
• Extend to endangered species and resource mgr issues
• Explanations and provenance available

http://was.tw.rpi.edu/swqp/map.html and http://aquarius.tw.rpi.edu/projects/semantaqua

[Figure: annotated interface screenshot with numbered callouts]

1. Map view of analyzed results
2. Explanation of pollution
3. Possible health effect of contaminant (from EPA)
4. Filtering by facet to select type of data
5. Link for reporting problems
6. Extended with input from USGS, with population counts for birds & fish

Page 6

Reusable Ontologies

• Pollution ontology describes the relationship between a regulation violation (a measurement), a polluted thing, and a polluted site

• Combined with other ontologies (e.g. W3C Geo) users can ask “Tell me all of the polluted things within 1 mile of my location”
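The "within 1 mile of my location" question can be sketched in a few lines. This is a minimal illustration, not the SemantAqua implementation: the records, identifiers, and function names below are hypothetical, and the `lat` / `long` keys only mirror the latitude and longitude properties of the W3C Basic Geo vocabulary. Distance is computed with the haversine formula.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Hypothetical polluted things; coordinates are invented for illustration.
POLLUTED_THINGS = [
    {"id": "site-a", "lat": 42.7300, "long": -73.6800},
    {"id": "site-b", "lat": 42.7350, "long": -73.6850},
    {"id": "site-c", "lat": 43.1000, "long": -73.6800},
]


def polluted_things_near(lat, lon, things, radius_miles=1.0):
    """Return ids of polluted things within radius_miles of (lat, lon)."""
    return [
        t["id"]
        for t in things
        if haversine_miles(lat, lon, t["lat"], t["long"]) <= radius_miles
    ]


if __name__ == "__main__":
    print(polluted_things_near(42.7300, -73.6800, POLLUTED_THINGS))
```

In a deployed system the same filter would more likely be expressed as a query over the combined pollution and geo ontologies; the list comprehension here only stands in for that query.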

Page 7

Ontologies

• Water quality ontology extends pollution to describe water-related pollution

• Further extended by regulation ontologies to provide “regulation violation” inference

• Allows the reasoner to match specific regulations to measurements that violate them
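The matching of regulations to violating measurements can be pictured with a small rule check. This is only a sketch of the idea under stated assumptions: the class names, the regulation names, the contaminant, and every numeric limit are invented placeholders, and a real deployment would rely on an OWL reasoner over the regulation ontologies rather than hand-written Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Regulation:
    name: str
    contaminant: str
    max_value: float  # upper limit, expressed in `unit`
    unit: str


@dataclass(frozen=True)
class Measurement:
    site: str
    contaminant: str
    value: float
    unit: str


def violated_regulations(measurement, regulations):
    """Names of the regulations whose limit this measurement exceeds."""
    return [
        r.name
        for r in regulations
        if r.contaminant == measurement.contaminant
        and r.unit == measurement.unit
        and measurement.value > r.max_value
    ]


# Placeholder regulations: two rule sets with different limits for the
# same (made-up) contaminant.
REGULATIONS = [
    Regulation("rule-strict", "contaminant-x", 10.0, "mg/L"),
    Regulation("rule-lenient", "contaminant-x", 50.0, "mg/L"),
]

if __name__ == "__main__":
    sample = Measurement("site-a", "contaminant-x", 25.0, "mg/L")
    print(violated_regulations(sample, REGULATIONS))
```

Keeping the limits in data rather than in code is the point of the design: swapping in a different regulation set changes which measurements count as violations without touching the matching logic.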

Page 8

Interface

Page 9

Semantic Methodology and Semantic Application Evolution

Originally developed for Virtual Observatories (in solar-terrestrial), now in water quality, sea ice, volcanology, mycology, …

McGuinness, Fox, West, Garcia, Cinquini, Benedict, Middleton. The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research. Proc. 19th Conf. on Innovative Applications of Artificial Intelligence (IAAI-07). http://www.vsto.org

SemantAqua -> SemantEco -> DataOne: modularizing, broadening, provenance, interaction

VSTO -> SESDI -> SPCDIS: modularizing, provenance, broadening, interaction

Page 10

Population Sciences Grid: Interventions, Behaviors, and Policy

Extensible Mashups via Linked Data
Diverse datasets from NIH
Exploring interventions along with correlations with behavior changes – in this case tobacco interventions and smoking prevalence
Accountable Mashups via Provenance
Award-winning paper on multi-dimensional analysis

Page 11

An Example: Hawaii

Changes in cigarette use viewed against policy changes

We link each year's record for a state to that same state across time, adding data for each year.
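Linking a state to itself across years amounts to grouping yearly observations under one stable identifier. The sketch below assumes a simple list-of-rows input; the field names and all numbers are invented placeholders, not figures from the Population Sciences Grid data.

```python
from collections import defaultdict


def link_by_state(rows):
    """Group yearly rows under their state, ordered by year."""
    timeline = defaultdict(dict)
    for row in rows:
        # Everything except the linking keys is kept as that year's data.
        data = {k: v for k, v in row.items() if k not in ("state", "year")}
        timeline[row["state"]][row["year"]] = data
    return {state: dict(sorted(years.items())) for state, years in timeline.items()}


# Placeholder observations; the values are illustrative only.
ROWS = [
    {"state": "Hawaii", "year": 2001, "smoking_pct": 2.0, "policy": "policy-b"},
    {"state": "Hawaii", "year": 2000, "smoking_pct": 1.0, "policy": "policy-a"},
    {"state": "Alaska", "year": 2000, "smoking_pct": 3.0, "policy": "policy-a"},
]

if __name__ == "__main__":
    linked = link_by_state(ROWS)
    for year, data in linked["Hawaii"].items():
        print(year, data)
```

Once each state has a single timeline, a policy change in one year can be read directly against the values recorded in the following years.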

Page 12

Ontology as API: Adding Dimensions

This RDF creates this visual:

[Figure: RDF listing and the resulting chart, annotated with: y axis, x axis, dataset graph]
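The "ontology as API" idea, where RDF statements decide what a chart shows, can be sketched with plain triples. The RDF from the slide is not reproduced here, so every identifier below (`ex:chart1`, `ex:xAxis`, `ex:yAxis`, `ex:dataset`, `ex:series`) is a hypothetical stand-in rather than the vocabulary the project actually used.

```python
# Hypothetical triples describing one chart: (subject, predicate, object).
TRIPLES = [
    ("ex:chart1", "ex:dataset", "ex:smokingPrevalence"),
    ("ex:chart1", "ex:xAxis", "ex:year"),
    ("ex:chart1", "ex:yAxis", "ex:percentSmokers"),
]

# Predicates the (assumed) visualization layer understands.
ROLES = {
    "ex:dataset": "dataset",
    "ex:xAxis": "x",
    "ex:yAxis": "y",
    "ex:series": "series",
}


def chart_spec(triples, chart):
    """Build a plotting spec from the triples that describe `chart`."""
    spec = {}
    for subject, predicate, obj in triples:
        if subject == chart and predicate in ROLES:
            spec[ROLES[predicate]] = obj
    return spec


def add_dimension(triples, chart, predicate, value):
    """Return a new triple list with one more statement about `chart`."""
    return triples + [(chart, predicate, value)]


if __name__ == "__main__":
    print(chart_spec(TRIPLES, "ex:chart1"))
    extended = add_dimension(TRIPLES, "ex:chart1", "ex:series", "ex:state")
    print(chart_spec(extended, "ex:chart1"))
```

Adding a dimension is then a data change, one more statement about the chart, rather than a change to the visualization code.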

Page 13

Social Observatory – First Responder effort (NIST funded)

Social media use is on the rise. Every day, we write:
  294 billion emails
  2 million blog posts
  Over 40 million tweets*

First Responders, including Emergency Medical Personnel, Firefighters, and Police Officers, have active online communities on social media websites.

How can we leverage social media sites …
  … to gather requirements for active First Responders?
  … to identify stakeholders within those First Responder communities?

Finding Topics

Finding Users

Page 14

Web Data “Challenge Response” Enablers

- HHS award-winning platform
- Target questions: “good hospital for my context”
- Prizm, DataCube Explorer, …

Page 15

Open Government Data

TWC – Intl Open Government Data Sets

Page 16

Mobile, Distributed, and Context-Aware Computing

Page 17

Open Data Workflow

First Responder Network

THEMES
Observatories: Science, Open Government, Health and Life Science, Social

Web Science Research Foundations
• Making Data Transparent and Actionable
• Provenance
• Semantic Methodology
• Social Network Analysis
• Semantically-Enabled Visualization
• Web Data "Challenge Response" Enablers

Social Media: Reasoning on Graph Database

Health and Human Services Data Challenge

International Open Government Data Sets

Rensselaer Tetherless World Constellation Web Observatory Foundations & Directions

Multi-Dimensional Data Portals

Semantic eScience Data Portals

Page 18

Foundations: Web Layer Cake

SPARQL to XQuery translator; RDFS materialization (Billion triple winner)

Govt metadata search; Linked Open Govt Data

SPARQL WG, earlier QL – OWL-QL, Classic’ QL, …

OWL 1 & 2 WG: edited main OWL docs, quick reference, OWL profiles (OWL RL)

Earlier languages: DAML, DAML+OIL, Classic

RIF WG; AIR accountability tool

DL, KIF, CL, N3Logic

Inference Web, Proof Markup Language, W3C Provenance Working Group formal model, W3C incubator group, …

Inference Web: IW Trust, AIR + Trust

Visualization APIs; S2S

Govt Data

Ontology repositories (Ontolingua); ontology evolution environment: Chimaera; Semantic eScience ontologies; many other ontologies

Transparent Accountable Datamining Initiative (TAMI)

Page 19

Inference Web: Making Data Transparent and Actionable Using Semantic Technologies

• How and when does it make sense to use smart system results & how do we interact with them?

Knowledge Provenance in Virtual Observatories

Hypothesis Investigation / Policy Advisors

(Mobile) Intelligent Agents

Intelligence Analyst Tools -> Watson

NSF Interops: SONET SSIII – Sea Ice

Cognitive Asst -> CPOF & SIRI

Page 20

Moving to the Next Generation

Some focus areas to move to the next generation:
• Provenance – e.g., not just the sources and dates, but enough to know when to depend on something
• Policy – balance between sharing data, getting credit, and making data accessible to all (or all willing to follow the rules)
• Social aspects – incentives, rewards, evolution, customization
• Distributed, mobile, and context-aware
• Education – scientific method – promote creating testable hypotheses, how to verify / replicate, etc.
• Broadly usable semantic methodology
• Moving to truly integrated communities

Page 21

Discussion

• Semantic foundations are being used in a wide range of areas.
• They are not just for semantic practitioners any more.
• Open as well as commercial software is available.
• Come join us!
• And if you are already there…
  – What do you want from evolving observatory / collaboratory infrastructure?
  – What do you need from provenance and explanation infrastructures?
  – Do you have tools, tool templates, and/or tool requirements?
  – Do you have use cases?
  – Are you using our (or another) semantic methodology?

More info – Deborah McGuinness [email protected]

Page 22

Extra

Page 23

Semantic Web (RPI) 2013

RDFa Innovation

Research

Page 24

What is an Ontology?

[Figure: ontology spectrum, with labels from simplest to most expressive]

Catalog / ID
Terms / glossary
Thesauri “narrower term” relation
Informal is-a
Formal is-a
Formal instance
Frames (properties)
Value Restrs.
Disjointness, Inverse, part-of…
General Logical constraints

Ontologies Come of Age – McGuinness, 2001, and from AAAI Panel 99 – McGuinness, Welty, Uschold, Gruninger, Lehmann
Plus basis of Ontologies Come of Age – McGuinness, 2003

Page 25

Interface

Page 26

Core and Framework Semantics - Multi-tiered interoperability

used by