Introduction to The Semantic Web Rick Bradshaw M.S. Sr. Data Architect Office of the Associate VP Health Sciences IT


Page 1: Introduction to The Semantic Web

Introduction to The Semantic Web

Rick Bradshaw M.S.
Sr. Data Architect

Office of the Associate VP Health Sciences IT

Page 2: Introduction to The Semantic Web

Overview

• Introduce the Semantic Web
• Interactive study of ClinicalTrials.gov semantic web style
  – Take a closer look at RDF
  – Run example SPARQL queries
• Introduce federation
  – Run example SPARQL queries against federated data

Page 3: Introduction to The Semantic Web

Semantic Web Definition

• The Semantic Web facilitates applying machine-readable semantic data/metadata to resources that are distributed across the web/internet
  – Often associated with specific technologies
    • RDF – Resource Description Framework
    • RDFS – RDF Schema
    • OWL – Web Ontology Language
• Web 3.0 (?)

http://en.wikipedia.org/wiki/Semantic_Web

Page 4: Introduction to The Semantic Web

Machine-readable

• A computer can read and “understand” data
  – Ask specific questions and get specific answers
  – Aggregate specific data, perform calculations, organize/order returned data
• Can Google read and “understand” web data?

Page 5: Introduction to The Semantic Web

Example

• Specific Question
  – How many Spinal Muscular Atrophy trials have been conducted at the University of Utah and when were they conducted?
• Specific Answer = ?
• Google’s Answer
  – “spinal muscular atrophy trial university of utah”
  – 14,500 pages
  – Top hit is very relevant in content
• Is it “computable”?

Page 6: Introduction to The Semantic Web

HTML

<h2>Enrolling/Ongoing:&nbsp;</h2>
<p>Clinical and Genetic Studies in Spinal Muscular Atrophy</p>
<p>Metabolic Dysfunction in SMA: impact of nutritional management</p>
<p>Prospective Study of Bone Abnormalities in SMA</p>
<p>STOP SMA: Phenylbutyrate trial in pre-symptomatic infants with SMA</p>
<p><span><span>Pilot newborn screening project for identification and prospective followup of infants with spinal muscular atrophy</span></span></p>
<p><span><span>Atalauren extension study in patients with Duchenne Muscular Dystrophy</span></span>…
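One way to see why this markup is hard to compute over is to parse it. The sketch below uses Python's standard-library `html.parser` on a shortened copy of the slide's HTML; the class name `TextCollector` and the variable `SAMPLE_HTML` are illustrative assumptions. The parser recovers the paragraph text, but only as plain strings: nothing in the markup says which string is a trial, where it ran, or when.

```python
from html.parser import HTMLParser

# Shortened copy of the slide's markup (assumption: two entries are enough to illustrate).
SAMPLE_HTML = (
    "<h2>Enrolling/Ongoing:&nbsp;</h2>"
    "<p>Clinical and Genetic Studies in Spinal Muscular Atrophy</p>"
    "<p><span><span>Pilot newborn screening project for identification and "
    "prospective followup of infants with spinal muscular atrophy</span></span></p>"
)


class TextCollector(HTMLParser):
    """Hypothetical helper that collects the text of each <p> element."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        # Text inside nested <span> elements still belongs to the open paragraph.
        if self.in_paragraph:
            self.paragraphs[-1] += data


collector = TextCollector()
collector.feed(SAMPLE_HTML)
collector.close()

# All we get back is a list of free-text titles: no identifiers, sites, or dates.
titles = collector.paragraphs
for title in titles:
    print(title)
```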

Page 7: Introduction to The Semantic Web

ClinicalTrials.gov RDF/XML

• Semantic Web Data for Clinical Trials
  – (1) http://static.linkedct.org/
  – (2) http://static.linkedct.org/page/trials/NCT00661453

Page 8: Introduction to The Semantic Web

Triples Triples Triples

• Triple Statement – <s><p><o>
  – Subject (s) – the resource
  – Predicate (p) – the relationship
    • Often called the “property” in OWL
  – Object (o) – object of the relationship
• Example
  – (s) trial:NCT00661453
  – (p) linkedct:brief_title
  – (o) “CARNIVAL Type I: Valproic Acid and Carnitine in Infants With SMA Type I”
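As a minimal sketch, the example statement can be held in code as a three-field record. The `Triple` type below is an illustrative assumption, not part of any RDF library; the field is called `obj` only because `object` is a Python built-in.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical container for one subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str  # named "obj" because "object" would shadow the Python built-in


example = Triple(
    subject="trial:NCT00661453",
    predicate="linkedct:brief_title",
    obj="CARNIVAL Type I: Valproic Acid and Carnitine in Infants With SMA Type I",
)

# A triple unpacks positionally as (s, p, o).
s, p, o = example
print(s, p, o, sep="\n")
```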

Page 9: Introduction to The Semantic Web

Abbreviations

• For ease of readability
• trial:NCT00661453
  – “trial:” - abbreviation for namespace “http://static.linkedct.org/resource/trials/”
  – “linkedct:” - abbreviation for namespace “http://static.linkedct.org/resource/linkedct/”
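The abbreviation is a plain string substitution, which the following sketch makes explicit. The two namespace values come from the slide; the `PREFIXES` table and the `expand` helper are hypothetical names introduced for illustration.

```python
# Prefix table taken from the slide.
PREFIXES = {
    "trial": "http://static.linkedct.org/resource/trials/",
    "linkedct": "http://static.linkedct.org/resource/linkedct/",
}


def expand(name: str) -> str:
    """Expand an abbreviated name such as 'trial:NCT00661453' to its full form.

    Names whose prefix is not in the table are returned unchanged.
    """
    prefix, sep, local = name.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return name


full_subject = expand("trial:NCT00661453")
full_predicate = expand("linkedct:brief_title")
print(full_subject)
print(full_predicate)
```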

Page 10: Introduction to The Semantic Web

Triple Notations

• There are many
  – Turtle
  – RDF
  – OWL
  – OBO

Page 11: Introduction to The Semantic Web

Triples Text

Subject             Predicate            Object
trial:NCT00661453   rdf:type             ct:trials
trial:NCT00661453   ct:brief_title       “CARNIVAL…”
trial:NCT00661453   ct:start_date        “April 2008”
trial:NCT00661453   ct:condition         cond:12347
cond:12347          rdf:type             ct:condition
cond:12347          ct:condition_name    “Spinal Muscular…”
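Once data is in triple form, a "specific question" becomes a pattern match. The sketch below loads the rows of this slide into a list and joins them to answer "which condition does each trial study, and when did it start?". It assumes `cond:12347` identifies the condition throughout (the value used on the graph slide); the `match` helper is a hypothetical stand-in for what a SPARQL engine would do, not SPARQL itself.

```python
# The slide's rows as (subject, predicate, object) tuples; literals keep the slide's abbreviations.
TRIPLES = [
    ("trial:NCT00661453", "rdf:type", "ct:trials"),
    ("trial:NCT00661453", "ct:brief_title", "CARNIVAL…"),
    ("trial:NCT00661453", "ct:start_date", "April 2008"),
    ("trial:NCT00661453", "ct:condition", "cond:12347"),
    ("cond:12347", "rdf:type", "ct:condition"),
    ("cond:12347", "ct:condition_name", "Spinal Muscular…"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Join: follow ct:condition from each trial to its condition name,
# then pick up the trial's start date.
results = []
for trial, _, cond in match(TRIPLES, p="ct:condition"):
    for _, _, name in match(TRIPLES, s=cond, p="ct:condition_name"):
        for _, _, start in match(TRIPLES, s=trial, p="ct:start_date"):
            results.append((trial, name, start))

print(results)
```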

Page 12: Introduction to The Semantic Web

Triple Graph

trial:NCT00661453 --rdf:type--> ct:trial
trial:NCT00661453 --ct:brief_title--> “CARNIVAL Type I: Valproic Acid and Carnitine in Infants With Spinal Muscular Atrophy (SMA) Type I”
trial:NCT00661453 --ct:start_date--> “April 2008”
trial:NCT00661453 --ct:condition--> cond:12347
cond:12347 --rdf:type--> ct:condition
cond:12347 --ct:condition_name--> “Spinal Muscular Atrophy Type I”

Page 13: Introduction to The Semantic Web

RDF XML

• (see file under #2)

<rdf:RDF…>
  <rdf:Description rdf:about="http://static.linkedct.org/resource/trials/NCT00481013">
    <linkedct:brief_title>Valproic Acid in Ambulant Adults With Spinal Muscular Atrophy</linkedct:brief_title>
    …
</rdf:RDF>
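To show how the RDF/XML maps back to triples, the sketch below parses a completed version of the fragment with Python's standard-library `xml.etree.ElementTree`. The slide elides the namespace declarations, so the ones used here are assumptions: the standard RDF namespace, and the `linkedct:` namespace given on the Abbreviations slide. Only literal-valued properties are handled; a real RDF parser covers far more of the syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the linkedct prefix resolves as stated on the Abbreviations slide.
LINKEDCT_NS = "http://static.linkedct.org/resource/linkedct/"

RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:linkedct="{LINKEDCT_NS}">
  <rdf:Description rdf:about="http://static.linkedct.org/resource/trials/NCT00481013">
    <linkedct:brief_title>Valproic Acid in Ambulant Adults With Spinal Muscular Atrophy</linkedct:brief_title>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(RDF_XML)
triples = []
for description in root.findall(f"{{{RDF_NS}}}Description"):
    # rdf:about names the subject of every property nested in this element.
    subject = description.get(f"{{{RDF_NS}}}about")
    for prop in description:
        # ElementTree reports tags as "{namespace}local"; drop the braces to get the full name.
        predicate = prop.tag.replace("{", "").replace("}", "")
        triples.append((subject, predicate, prop.text))

for triple in triples:
    print(triple)
```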

Page 14: Introduction to The Semantic Web

Observations

• RDF is a standard supporting consistent data representation
• Rules about standards apply
  – Use existing standards whenever possible

Page 15: Introduction to The Semantic Web

Popular RDF Standards

• Friend of a Friend
  – alias=foaf
  – describe people and links
• Dublin Core
  – alias=dc
  – “metadata” standard
• Simple Knowledge Organization System
  – alias=skos
  – terminology, thesauri, …

Page 16: Introduction to The Semantic Web

Data Federation

• Combine data from more than one data source
• Heterogeneous data
  – Not all data sources use the same standards
    • ds1.firstName
    • ds2.first_name
    • ds3.person_name
• Homogeneous data
  – All data sources use the same standards
    • ds1.firstName
    • ds2.firstName
    • ds3.firstName

Page 17: Introduction to The Semantic Web

Property Alignment Assertions

• ds1:firstName owl:equivalentProperty foaf:firstName
• ds2:first_name owl:equivalentProperty foaf:firstName
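A rough sketch of what these assertions buy: once both source properties are declared equivalent to `foaf:firstName`, one query can cover both sources. The two sample records and the `align` helper are hypothetical, and rewriting predicates is a simplification (`owl:equivalentProperty` is symmetric, and a reasoner would add statements rather than replace them).

```python
# The two alignment assertions from the slide, as source -> shared property.
EQUIVALENT_PROPERTY = {
    "ds1:firstName": "foaf:firstName",
    "ds2:first_name": "foaf:firstName",
}

# Hypothetical records from the two data sources, written as triples.
source_triples = [
    ("ds1:person/1", "ds1:firstName", "Ada"),
    ("ds2:person/7", "ds2:first_name", "Grace"),
]


def align(triples, mapping):
    """Rewrite each predicate to its shared equivalent when one is asserted."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


aligned = align(source_triples, EQUIVALENT_PROPERTY)

# A single pattern now reaches data from both sources.
first_names = sorted(o for _, p, o in aligned if p == "foaf:firstName")
print(first_names)
```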

Page 18: Introduction to The Semantic Web

Class Alignment Assertions

• ds1:Person owl:equivalentClass foaf:Person
• ds2:HumanBeing owl:equivalentClass foaf:Person

Page 19: Introduction to The Semantic Web

Rule-based Assertions

• Use rules to evaluate complicated “if-then” scenarios and assert results
  – SWRL – Semantic Web Rule Language
  – JRL - Jena Rule Language
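To make the "if-then" idea concrete, here is one rule written as ordinary Python rather than in SWRL or Jena rule syntax. The rule itself is a made-up example: if a trial studies a condition whose name mentions Spinal Muscular Atrophy, assert that the trial is an `ex:SMATrial`. The class `ex:SMATrial`, the function name, and the sample facts are all hypothetical.

```python
def sma_trial_rule(triples):
    """IF ?t ct:condition ?c AND ?c ct:condition_name mentions SMA
    THEN assert ?t rdf:type ex:SMATrial (a hypothetical class)."""
    sma_conditions = {
        s for s, p, o in triples
        if p == "ct:condition_name" and "Spinal Muscular Atrophy" in o
    }
    return {
        (s, "rdf:type", "ex:SMATrial")
        for s, p, o in triples
        if p == "ct:condition" and o in sma_conditions
    }


facts = {
    ("trial:NCT00661453", "ct:condition", "cond:12347"),
    ("cond:12347", "ct:condition_name", "Spinal Muscular Atrophy Type I"),
}

# The rule produces new statements; it does not change the facts it read.
inferred = sma_trial_rule(facts)
print(inferred)
```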

Page 20: Introduction to The Semantic Web

Reasoning

• Compute assertions
• Adds new triple statements to the triple graph
• Implications
  – Data of interest must be read from all data sources to compute assertions
  – When data sources are large this can take a long time and adequate computational resources are required
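The sketch below illustrates "adds new triple statements to the triple graph" with a tiny forward-chaining loop over the alignment assertions from the earlier slides. It is a simplified stand-in for a real reasoner: the `infer` function is hypothetical, it applies each equivalence in one direction only, and the instance data is made up. Note that every pass scans the whole graph, which is why large sources make reasoning slow.

```python
OWL_EQ_CLASS = "owl:equivalentClass"
OWL_EQ_PROP = "owl:equivalentProperty"


def infer(graph):
    """Forward-chain two simplified OWL rules until no new triples appear."""
    graph = set(graph)
    while True:
        new = set()
        for a, rel, b in graph:
            if rel == OWL_EQ_CLASS:
                # Every instance of class a is also an instance of class b.
                new |= {(s, "rdf:type", b) for s, p, o in graph
                        if p == "rdf:type" and o == a}
            elif rel == OWL_EQ_PROP:
                # Every statement using property a also holds for property b.
                new |= {(s, b, o) for s, p, o in graph if p == a}
        if new <= graph:  # fixpoint: nothing left to add
            return graph
        graph |= new


asserted = {
    ("ds1:Person", OWL_EQ_CLASS, "foaf:Person"),
    ("ds2:first_name", OWL_EQ_PROP, "foaf:firstName"),
    ("ds1:p1", "rdf:type", "ds1:Person"),       # hypothetical instance data
    ("ds2:p7", "ds2:first_name", "Grace"),      # hypothetical instance data
}

closed = infer(asserted)
added = closed - asserted
print(sorted(added))
```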

Page 21: Introduction to The Semantic Web

Use Case

• Combine clinical trial data with patient data
• SMA trial data from clinicaltrials.gov (linkedct.org) with patient demographics for 5 different trials

Page 22: Introduction to The Semantic Web

Resources

• W3 Schools
  – http://www.w3schools.com/semweb/default.asp
• W3C Web Sites
  – http://www.w3.org/standards/semanticweb/
  – http://www.w3.org/RDF/
  – http://www.w3.org/standards/techs/owl#w3c_all
• Safari Books
  – http://proquest.safaribooksonline.com
  – Semantic Web Programming
  – Semantic Web for the Working Ontologist

Page 23: Introduction to The Semantic Web

Resources

• Jena Java API
• Protégé
• D2R

Page 24: Introduction to The Semantic Web

Entity Relationship Diagram

TRIAL
  TRIAL_ID
  BRIEF_TITLE
  CONDITION_ID
  START_DATE

CONDITION
  CONDITION_ID
  CONDITION_NAME
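Assuming these two tables are the relational view of the trial and condition triples shown earlier, the sketch below turns one row from each table into triples. It is a hand-written illustration of the idea, not D2R's mapping language; the row values reuse the abbreviated examples from the slides, and the function names are hypothetical.

```python
# One sample row per table, using the column names from the diagram.
trial_rows = [
    {"TRIAL_ID": "NCT00661453", "BRIEF_TITLE": "CARNIVAL…",
     "CONDITION_ID": "12347", "START_DATE": "April 2008"},
]
condition_rows = [
    {"CONDITION_ID": "12347", "CONDITION_NAME": "Spinal Muscular…"},
]


def trial_to_triples(row):
    """The primary key becomes the subject; each other column becomes a predicate."""
    subject = f"trial:{row['TRIAL_ID']}"
    return [
        (subject, "rdf:type", "ct:trials"),
        (subject, "ct:brief_title", row["BRIEF_TITLE"]),
        (subject, "ct:start_date", row["START_DATE"]),
        # The foreign key becomes a link to another resource, not a literal.
        (subject, "ct:condition", f"cond:{row['CONDITION_ID']}"),
    ]


def condition_to_triples(row):
    subject = f"cond:{row['CONDITION_ID']}"
    return [
        (subject, "rdf:type", "ct:condition"),
        (subject, "ct:condition_name", row["CONDITION_NAME"]),
    ]


triples = [t for row in trial_rows for t in trial_to_triples(row)]
triples += [t for row in condition_rows for t in condition_to_triples(row)]

for triple in triples:
    print(triple)
```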