Using the Micropublications ontology and the Open Annotation Data Model to represent evidence within a drug-drug interaction knowledge base--LISC2014--2014-10-19

Using the Micropublications Ontology and the Open Annotation Data Model to represent evidence within a drug-drug interaction knowledge base

Jodi Schneider, Paolo Ciccarese, Tim Clark and Richard D. Boyce

Linked Science at ISWC 2014

Riva del Garda, Trentino, Italy

19 October 2014

Goal of this project

Construct & maintain a knowledge base linking to evidence (i.e. data, methods, materials), where:

• Each ASSERTION in the knowledge base has a SUPPORT GRAPH of claims and evidence
• Each SUPPORT GRAPH element (claims, data, methods, materials) is dynamically linked to specific QUOTED ELEMENTS in source documents on the Web

Why? It's time-consuming to find the state of the art in a field!

• What do we know about field F? assertion X?
• What evidence supports assertion X?
• What assumptions are used in research supporting assertion X?

Application domain: medication safety

• Potential drug-drug interactions
  – 2+ drugs, where interaction is known to be possible
• Adverse drug event
  – Harm caused by medication
  – Huge public health issue
    > 1.5 million preventable adverse drug events/year (USA)
• Post-market safety issues

Drug information sources

• Evidence is selected & assessed by editorial boards
  – MICROMEDEX, First DataBank, Q-DIPS
• E.g. MICROMEDEX:
  – "In-house team of 90+ clinically-trained editorial staff" (physicians, clinical pharmacists, nurses, medical librarians)
  – "Content is reviewed for clinical accuracy and relevance."
  – "Critical content areas may undergo an additional review by members of our Editorial Board."
• Potential problems
  – a time-consuming (i.e. expensive), collaborative process
  – maintaining internal and external consistency is non-trivial

Part of a larger effort

• “Addressing gaps in clinically useful evidence on drug-drug interactions”

• 4-year project, U.S. National Library of Medicine R01 grant (PI, Richard Boyce)

• Evidence panel of domain experts (Carol Collins, Lisa Hines, John R Horn, Phil Empey) & informaticists (Tim Clark, Paolo Ciccarese, Jodi Schneider)

• Programmer: Yifan Ning

Build on 3 things

• Drug Interaction Knowledge Base [Boyce2007, Boyce2009]

• Open Annotation Data Model [W3C2013]

• Micropublications Ontology [Clark2014]

Drug Interaction Knowledge Base (DIKB)

– Hand-constructed knowledge base
– Safety issues when 2 drugs are taken together
– Focus is on EVIDENCE

[Boyce2007, Boyce2009]


DIKB supports queries about assertions & evidence:

• Get all assertions that are supported by a U.S. FDA regulatory guidance statement

• Are the evidence use assumptions concordant, unique, and non-ambiguous?

• Which assertions are supported/refuted by just one type of evidence?
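The queries above can be illustrated with a minimal in-memory sketch. This is not the DIKB implementation: the record layout, the placeholder drug names, and both function names are assumptions made only for illustration.

```python
from collections import defaultdict

# Hypothetical evidence records: (assertion, evidence type, stance).
# The drugs and evidence items are placeholders, not DIKB content.
EVIDENCE = [
    ("drug-A inhibits CYP2D6", "in vitro study", "supports"),
    ("drug-A inhibits CYP2D6", "clinical trial", "supports"),
    ("drug-B inhibits CYP3A4", "FDA guidance statement", "supports"),
    ("drug-C is a substrate of CYP2C9", "in vitro study", "refutes"),
    ("drug-C is a substrate of CYP2C9", "in vitro study", "supports"),
]


def assertions_with_single_evidence_type(evidence):
    """Assertions whose evidence (for or against) is all of one type."""
    types_by_assertion = defaultdict(set)
    for assertion, evidence_type, _stance in evidence:
        types_by_assertion[assertion].add(evidence_type)
    return sorted(a for a, types in types_by_assertion.items() if len(types) == 1)


def assertions_supported_by(evidence, evidence_type):
    """Assertions with at least one supporting item of the given type."""
    return sorted(
        {a for a, t, stance in evidence if t == evidence_type and stance == "supports"}
    )


single_type = assertions_with_single_evidence_type(EVIDENCE)
fda_backed = assertions_supported_by(EVIDENCE, "FDA guidance statement")
```

In a real knowledge base these would be queries over the stored graph; the sketch only shows the grouping logic behind each question.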


Evidence Entry Interface (2008)




Limitations of DIKB v1.2

• Cannot link quotes dynamically to source text
  – Document-level citation
  – Quote & section citation preferable
• Level of detail
  – Want more detail on data, methods, materials
• Minimal argumentation model
  – swanco:citesAsSupportingEvidence
  – swanco:citesAsRefutingEvidence


Open Annotation Data Model

http://www.openannotation.org/spec/core/

Micropublications Ontology (MP)

Clark, Ciccarese, Goble (2014) Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications

http://purl.org/mp

Modeling strategy



where:

• Each ASSERTION in the knowledge base has a SUPPORT GRAPH of claims and evidence: MP
• Each SUPPORT GRAPH element (claims, data, methods, materials) is dynamically linked to specific QUOTED ELEMENTS in source documents on the Web: OA

Quotes integrated (MP using OA)

http://purl.org/mp

Clark, Ciccarese, Goble (2014) Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications

Enhancing the DIKB with MP and OA

1. Represent the overall argument of the paper
   – Support & challenge relationships
   – Data, methods, materials
2. Semantic tagging, so drugs & proteins can be queried using knowledge from other sources
3. Make quotes actionable (highlight in orig doc)
4. Handle new competency questions

Quote stored in OA, with link to source

ex:annotation-1: ex:body-1 about ex:target-1

ex:body-1
Predicate        Object
rdf:type         mp:Method
rdf:value        (exact text)

ex:target-1
Predicate        Object
rdf:type         oa:SpecificResource
oa:hasSource     <http://dailymed…>
oa:hasSelector   ex:selector-1

ex:selector-1
Predicate        Object
oa:prefix        (preceding text)
oa:exact         (exact text)
oa:postfix       (following text)
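A selector of this shape makes a quote actionable because the prefix and following text disambiguate where the exact text sits in the source. The sketch below shows one simple way to resolve such a selector against plain text. The class, the function name, and the sample text are all hypothetical. The field called `suffix` corresponds to the predicate shown as oa:postfix on the slide (the published Open Annotation specification names this property oa:suffix).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TextQuoteSelector:
    prefix: str  # text immediately before the quote
    exact: str   # the quoted text itself
    suffix: str  # text immediately after the quote (oa:postfix on the slide)


def locate_quote(source_text: str, selector: TextQuoteSelector) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of the quote in source_text, or None.

    Assumption: the source is plain text and the context matches exactly;
    a production resolver would need to tolerate whitespace and markup changes.
    """
    pattern = selector.prefix + selector.exact + selector.suffix
    hit = source_text.find(pattern)
    if hit == -1:
        return None
    start = hit + len(selector.prefix)
    return start, start + len(selector.exact)


# Hypothetical source passage and selector, for illustration only.
SOURCE = (
    "Methods. Subjects received the study drug once daily for 14 days. "
    "Plasma samples were collected."
)
SELECTOR = TextQuoteSelector(
    prefix="Methods. ",
    exact="Subjects received the study drug once daily for 14 days.",
    suffix=" Plasma",
)

span = locate_quote(SOURCE, SELECTOR)
```

The returned offsets are what a viewer would need in order to highlight the quote in the original document.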

New competency questions to answer

1. Finding assertions and evidence
   • List all assertions that are not supported by evidence
     – By data, by methods, by materials
   • What is the in vitro evidence for assertion X? the in vivo evidence?
     – With provenance: Give me back the original data tables
2. Enabling updates
   • List all evidence that has been flagged as rejected from entry into the knowledge base
     – By data, by methods, by materials

New competency questions to answer

3. Assessing the evidence
   • Which research group conducted the study used for evidence item X?
   • What are the assumptions required for use of this evidence item to support/refute assertion X?
     – Without directly entering them
4. Statistics for analytics/KB maintenance
   • Number of evidence items for and against each assertion type
     – By data, by methods, by materials
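The statistics question in item 4 amounts to grouping support-graph edges by assertion type and relation. A minimal sketch follows, assuming a flat edge list. The edge layout, the assertion-type labels, and the function names are hypothetical, and the real knowledge base would answer this with a query over the MP graph.

```python
from collections import Counter

# Hypothetical support-graph edges:
# (assertion type, kind of evidence element, relation to the assertion).
EDGES = [
    ("inhibits", "data", "supports"),
    ("inhibits", "method", "supports"),
    ("inhibits", "data", "challenges"),
    ("substrate-of", "material", "supports"),
]


def tally_for_and_against(edges):
    """Count supporting and challenging items per assertion type."""
    totals = {}
    for assertion_type, _kind, relation in edges:
        row = totals.setdefault(assertion_type, {"supports": 0, "challenges": 0})
        row[relation] += 1
    return totals


def tally_by_element_kind(edges):
    """Break the same counts down by data, method, or material."""
    return Counter(edges)


totals = tally_for_and_against(EDGES)
by_kind = tally_by_element_kind(EDGES)
```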

Modeling challenges

• To date, MP has not been used to represent both unstructured text claims ("escitalopram does not inhibit CYP2D6") and a logical representation of the same text as a normalized subject-predicate-object statement (a nanopublication of the statement)

• Efficient querying will be needed as the evidence base scales. We are using an iterative design-and-test approach.

Future work

• NLP support: Create a pipeline for extracting potential drug-drug interaction (PDDI) mentions from scientific & clinical literature

• Usability tests: Tools usable by domain experts

• NLP + "crowdsourcing" (distributed annotation)

• Resolving links to paywalled PDFs

Acknowledgements

• Funding
  – ERCIM Alain Bensoussan Fellowship Program under FP7/2007-2013, grant agreement 246016
  – National Library of Medicine (1R01LM011838-01)

• Thanks to the Evidence Panel of Addressing PDDI Evidence Gaps: Carol Collins, Lisa Hines, John R Horn, and Phil Empey

• Thanks to programmer Yifan Ning
