Knowledge Linkages: Augmenting Online Clinical Care Discussions with Published Literature

Sam Stewart, Syed Sibte Raza Abidi

NICHE Research Group

Faculty of Computer Science

Dalhousie University, Halifax, Canada

November 30, 2010

Sam Stewart (Dal) Knowledge Linkages November 30, 2010 1 / 35

Outline

Introduction

Problem Description

Knowledge Linkage Framework

Preliminary Results

Conclusion

Introduction

Introduction

Pediatric pain management is a complex subject.

- Children lack the cognitive ability to properly express their pain, which can lead to incorrect interventions.

Many clinicians lack specialized knowledge or training in pediatric pain management.

- Because of the time and location constraints that clinicians face, traditional educational systems are not a feasible solution.

Web 2.0 technologies provide alternative knowledge-dissemination media through which clinicians can converge and share their knowledge.

Rationale for Knowledge Linkages

Pediatric Pain Mailing List (PPML)

- Brings together over 700 pediatric pain practitioners from around the world to share their clinical experiences and seek advice.

The knowledge shared on the PPML is practice-based rather than evidence-based.

It is important to augment the practice-based (tacit) knowledge on the PPML with explicit knowledge.

The goal of this project is to establish knowledge linkages between discussions on the PPML and published literature on PubMed.

Project Objectives

The objective of knowledge linkage is to reaffirm the practice-related recommendations on the PPML with evidence-based literature from PubMed.

The outcome of the project will allow users to:

- Search through the PPML archives to find topics of interest.

- Retrieve research articles related to those topics from PubMed through a "single-click" evidence retrieval strategy.

Knowledge Linkage Framework

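[Figure: the Knowledge Linkage Framework pipeline. PPML Archives → Message Parsing → Filtered Messages → Threading Algorithm → Threads → Mapping to MeSH → MeSH Threads → Information Retrieval → Papers, with the threads and their papers presented in the Online Discussion Forum.]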

Project Framework

Step 1: Processing the Archives

The archives are stored in simple ASCII text files, organized by month, starting in June 1993 and ending in December 2008.

The messages are processed to extract the sender, date, subject line and content of each message.

The messages are then filtered to remove non-substantive content.
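The parsing and filtering step could look roughly like the sketch below. It assumes each message can be isolated as RFC 822-style text (header lines, a blank line, then the body); the slides do not describe the actual archive layout. The `is_substantive` function and its word threshold are hypothetical stand-ins for the filtering rule, which is not specified here.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def parse_message(raw: str) -> dict:
    """Extract sender, date, subject and content from one raw message."""
    msg = message_from_string(raw)
    date_header = msg.get("Date")
    return {
        "sender": msg.get("From", "").strip(),
        "date": parsedate_to_datetime(date_header) if date_header else None,
        "subject": msg.get("Subject", "").strip(),
        # For a plain (non-multipart) message the payload is the body text.
        "content": msg.get_payload().strip(),
    }


def is_substantive(record: dict, min_words: int = 5) -> bool:
    """Hypothetical filter: keep messages with at least min_words words."""
    return len(record["content"].split()) >= min_words
```

A real filter would probably also need to handle signatures, quoted replies and list administration notices.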

Step 2: Threading

A thread is a series of messages centred on a common subject.

Threads are the embodiment of experiential knowledge on the PPML.

Messages are assigned to threads using their subject lines.
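Subject-line threading can be sketched as follows. The normalization rule (strip reply and forward prefixes, lowercase, collapse whitespace) is an assumption for illustration; the slides say only that subject lines are used.

```python
import re
from collections import defaultdict

# Reply/forward prefixes to strip; this prefix list is an assumption.
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    """Reduce a subject line to a key shared by all replies to it."""
    stripped = _PREFIX.sub("", subject)
    return " ".join(stripped.lower().split())


def assign_threads(messages):
    """Group message dicts (each with a 'subject' key) by normalized subject."""
    threads = defaultdict(list)
    for message in messages:
        threads[normalize_subject(message["subject"])].append(message)
    return dict(threads)
```

Grouping on subject alone will merge unrelated discussions that happen to reuse a title, so a date window may be worth adding.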

Step 3: Mapping to MeSH

The threads are parsed and connected to formal MeSH terms using the Metamap program.

Metamap assigns each mapping a score, which measures the strength of the connection.

Each MeSH-based thread is then used to query PubMed.

Metamap: Mapping free text to MeSH

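Metamap is a program developed by Dr. Alan Aronson at the National Library of Medicine (NLM) that maps biomedical text to the MeSH lexicon.

Each mapping is assigned a score that measures the strength of the mapping:

$$\text{Score} = 1000 \times \frac{\text{Centrality} + \text{Variation} + 2 \times \text{Coverage} + 2 \times \text{Cohesiveness}}{6}$$

The scores provide a baseline measure of how well the mapped MeSH term represents the original term in the thread. (The Appendix gives the same formula with a negative sign, which places the scores in [-1000, 0].)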
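As a quick check of the formula, here is a small sketch that computes the score magnitude from the four components. The function name is illustrative. The component values in the usage note are taken from the sample table in the Appendix, and the integer truncation is an assumption about how the slide values were rounded.

```python
def metamap_score(centrality, variation, coverage, cohesiveness):
    """Score magnitude: weighted mean of the four components, scaled to 1000."""
    return 1000 * (centrality + variation + 2 * coverage + 2 * cohesiveness) / 6
```

For example, `int(metamap_score(1, 4 / 5, 1, 1))` gives 966, and `int(metamap_score(0, 1, 7 / 9, 19 / 27))` gives 660.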

Example

Sample Statement

"The report stated that when music therapy is used, the babies required less pain medication. Does anyone know of any published reports of empirical research demonstrating the effect?"

Source                  MeSH Term                      Score
music therapy           Music Therapy                  1000
the babies              Infant                         966
less pain medication    Pain                           660
less pain medication    Pharmaceutical Preparations    827
published reports       Publishing                     694
empirical research      Empirical Research             1000

Step 4: Literature Search Strategy

Passively links the threads to published medical literature.

Naive approach: retrieve all papers that contain every MeSH term in the thread. If no such papers exist, the algorithm drops the lowest-scoring terms and tries again.
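The naive approach can be sketched as a loop that relaxes a conjunctive query. Here `search` is a hypothetical callable that takes a PubMed query string and returns a list of paper identifiers; it is not a real client API. Scores are assumed to be positive magnitudes (higher is better), and dropping one term per iteration is also an assumption.

```python
def naive_search(terms, search):
    """Run an AND query over all terms, dropping the weakest term until
    something is returned.

    terms:  list of (mesh_term, score) pairs, higher score = stronger mapping
    search: hypothetical callable(query_string) -> list of paper ids
    """
    remaining = sorted(terms, key=lambda pair: pair[1], reverse=True)
    while remaining:
        query = " AND ".join(f'"{name}"[MeSH Terms]' for name, _ in remaining)
        papers = search(query)
        if papers:
            return query, papers
        remaining.pop()  # drop the lowest-scoring term and retry
    return None, []
```

The results come back unranked, and a single wrong but high-scoring term survives every iteration.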

Naive Approach

The naive approach has several problems:

- It doesn't provide any ordering of the resulting papers.

- It doesn't fully utilize the MeSH scores.

- It doesn't take into account the possibility of incorrect mappings.

One of the challenges of mapping free text with Metamap is its inaccuracy.

The presence of a false MeSH term with a high mapping score will prevent the retrieval of useful papers.

Improved Search Strategy

Our improved search strategy makes full use of the Metamap scores.

It also addresses the problem of incorrect mappings.

It is based on the Extended Boolean Information Retrieval (eBIR) algorithm.

- The algorithm is customized for pediatric pain by adding a specialized filter.

Let $(M_i, m_i)$ be MeSH term $i$ and its associated Metamap score.

$$Q = [\text{Infant} \;\mathrm{OR}\; \text{Child} \;\mathrm{OR}\; \text{Adolescent}] \;\mathrm{AND}\; [(M_1, m_1) \;\mathrm{OR}_p\; (M_2, m_2) \;\mathrm{OR}_p\; \cdots \;\mathrm{OR}_p\; (M_n, m_n)] \tag{1}$$
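One way to read equation (1) is as a strict age-group filter followed by a weighted ranking. The sketch below assumes p = 1 (the value chosen in the Appendix), binary term membership per paper, and positive score magnitudes as weights. With p = 1 the weighted OR reduces to the matched share of the total weight. The `papers` structure and function names are hypothetical.

```python
AGE_FILTER = {"Infant", "Child", "Adolescent"}


def weighted_or_score(query, paper_terms):
    """Share of the total query weight carried by terms the paper is indexed with."""
    total = sum(weight for _, weight in query)
    matched = sum(weight for term, weight in query if term in paper_terms)
    return matched / total


def rank_papers(query, papers):
    """query:  list of (mesh_term, weight) pairs
    papers: dict mapping paper id -> set of MeSH terms (hypothetical input)
    Returns (paper id, score) pairs, best first, for papers passing the filter."""
    ranked = []
    for paper_id, paper_terms in papers.items():
        if not (AGE_FILTER & paper_terms):
            continue  # strict Boolean AND with the pediatric filter
        score = weighted_or_score(query, paper_terms)
        if score > 0:
            ranked.append((paper_id, score))
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

Because the ranking is a weighted sum rather than a conjunction, one incorrect mapping lowers a paper's score but does not exclude it.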

Step 5: Discussion Forum

An online forum is being developed that allows practitioners to interact with the PPML discussions and review the research articles linked to a specific discussion thread.

The forum will be navigated with a standard search function or with a search function based on MeSH terms.

The threads will also be organized into a hierarchy based on their MeSH terms.

Discussion Forum

Example Thread

Linked Papers

Results

Example

This first example is the first thread ever transmitted on the PPML.

It is on the subject of music therapy.

The following slides show the discussion, then the list of mapped MeSH terms, then a sample of the papers returned.

Music Therapy I

Sender: 1
Subject: Music Therapy
Date: Mon Jun 28 21:19:36 ADT 1993
Thread: 3, false

The last several days, the local NBC station aired a "medical report" about the use of music therapy. The report was from Miami and included a short report on the use of music therapy in a NICU. The report stated that when music therapy was used, the babies required less pain medication. Does anyone know of any published reports of empirical research demonstrating this effect?

Music Therapy II

Sender: 2
Subject: Music Therapy
Date: Tue Jun 29 08:25:12 ADT 1993
Thread: 3, true

I would suggest that you might contact **** ******** in Pediatrics at Washington University Medical School. Her research is on neonatal pain and she might know where the local station picked up the report. I haven't seen any data on the topic.

Music Therapy III

Sender: 3
Subject: Music Therapy
Date: Tue Jun 29 10:20:41 ADT 1993
Thread: 3, true

I'm not aware of specific studies conducted using music therapy to reduce the need for pain medication (i.e., music therapy to manage pain). However, several cognitive interventions have been used quite effectively to manage pain. Donald Meichenbaum developed a technique in the early 1970s called stress inoculation training which combines aspects of self-instruction training and relaxation training.

Music Therapy MeSH Terms

MeSH                              Score    Papers
Music Therapy                     -4802    1621
Pain                              -4215    237890
Research                          -2000    254813
Pharmaceutical Preparations       -1688    438290
Infant, Newborn                   -1660    422011
Education                         -1654    492705
Teaching                          -1320    52938
Vaccination                       -1320    44092
Intensive Care Units, Neonatal    -1000    6847
Pediatrics                        -1000    34612
Empirical Research                -1000    8897
Relaxation Therapy                -1000    5677
Vision, Ocular                    -966     18516
Behavior                          -966     879624
Infant                            -966     795209
Air                               -966     16342
Awareness                         -861     9711
Schools, Medical                  -861     17436
Biomedical Research               -827     28157
Publishing                        -694     27222
Cognition                         -589     76048

Music Therapy Papers Returned

Bo LK, Callaghan P. Soothing pain-elicited distress in Chinese neonates. Pediatrics:2000,105(4). 10742370.

Cignacco E, Hamers JP, Stoffel L, van Lingen RA, Gessler P, McDougall J, Nelle M. The efficacy of non-pharmacological interventions in the management of procedural pain in preterm and term neonates. A systematic literature review. European journal of pain (London, England):2006,11(2). 16580851.

Kemper KJ, Danhauer SC. Music as therapy. Southern medical journal:2005,98(3). 15813154.

Tagore T. Why music matters in childbirth. Midwifery today with international midwife:2009,(89). 19397157.

Pilot Study

A pilot study was conducted on all messages from 2007 and 2008.

100 threads were reviewed to determine:

1. the accuracy of the message parsing
2. the accuracy of the thread assignment
3. the accuracy of the papers returned

The message parsing was successful on 74% of the messages.

The threading was successful on 92% of the messages.

Recall, Precision, Utility

Precision and relative recall were compared between the modified search strategy, the eBIR model, and a traditional vector space model (VSM).

- Relative recall is used to compare search strategies on unannotated databases, where the full set of relevant papers is unknown.

$$\text{Precision} = \frac{\text{Number of relevant papers returned by the search}}{\text{Total number of papers returned}}$$

$$\text{Recall} = \frac{\text{Number of relevant papers returned by the search}}{\text{Number of relevant papers returned by all searches}}$$
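The two measures can be computed as in the sketch below, which assumes each search returns a list of unique paper identifiers and that relevance judgements are available as a set. The function and variable names are illustrative.

```python
def precision(returned, relevant):
    """Fraction of the returned papers that were judged relevant."""
    if not returned:
        return 0.0
    return len(set(returned) & relevant) / len(returned)


def relative_recall(returned, relevant, all_runs):
    """Relevant papers found by this search, divided by the relevant papers
    found by all of the compared searches pooled together."""
    pool = set().union(*all_runs) & relevant
    if not pool:
        return 0.0
    return len(set(returned) & pool) / len(pool)
```

Relative recall depends on which strategies are pooled, so values are comparable only within one experiment.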

Precision

[Figure: precision (y-axis, 0.00 to 0.20) against the top k papers (x-axis, k = 2 to 14) for the Custom, VSM and eBIR search strategies.]

Recall

[Figure: relative recall (y-axis, 0.0 to 0.8) against the top k papers (x-axis, k = 2 to 14) for the Custom, VSM and eBIR search strategies.]

Precision-Recall

[Figure: precision (y-axis, 0.08 to 0.18) against relative recall (x-axis, 0.2 to 0.8) for the Custom, VSM and eBIR search strategies.]

Search Strategy Results

The precision of the modified algorithm is significantly higher than that of the other two algorithms at k = 15 (p-values of 0.013 and 0.003, respectively).

The recall, however, differs significantly only between the modified and eBIR models (p < 0.0001), and not between the modified and VSM algorithms (p = 0.351).

Ultimately, a search is "good" if it returns at least one pertinent result.

Utility at level k is an indicator of whether the search returns a relevant paper in the first k results.
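Utility at level k is a 0/1 indicator per search, and it can be averaged over threads to give one value per strategy. The sketch below assumes ranked result lists and a set of relevant papers per thread; averaging over threads is an assumption about how the plotted values were produced.

```python
def utility_at_k(ranked, relevant, k):
    """1 if any of the first k results is relevant, otherwise 0."""
    return 1 if any(paper in relevant for paper in ranked[:k]) else 0


def mean_utility(runs, k):
    """Average utility at level k over (ranked_results, relevant_set) pairs."""
    return sum(utility_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)
```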

Utility vs. k

[Figure: utility (y-axis, 0.0 to 0.6) against the top k papers (x-axis, k = 2 to 14) for the Custom, VSM and eBIR search strategies.]

Conclusion

Conclusion

The mapping of experiential to explicit clinical knowledge is critical, given the rapid changes in medical knowledge and its application in specialized domains.

Clinical experiences should be supported by clinical evidence, and this has been achieved through our Knowledge Linkage Framework.

We presented a method of leveraging Web 2.0 techniques by incorporating medical information retrieval strategies to improve the overall medical knowledge base.

Automatic query generation, using clinical terms and contexts, is a unique aspect of the research.

Future Work

The next step is to provide open access to a large number of users and gather their feedback.

More time should be spent investigating the variables within the eBIR algorithm and the modified algorithm:

$$Q = [\text{Infant} \;\mathrm{OR}_{p_1}\; \text{Child} \;\mathrm{OR}_{p_1}\; \text{Adolescent}] \;\mathrm{AND}_{p_2}\; [M_1 \;\mathrm{OR}_{p_3}\; M_2 \;\mathrm{OR}_{p_3}\; \cdots \;\mathrm{OR}_{p_3}\; M_n]$$

Tweaking the Metamap scores, either within the Metamap system or through post-processing, should also be explored.

Acknowledgement

This work is carried out with the aid of a grant from the International Development Research Centre (IDRC), Ottawa, Canada.

Thank you

Appendix

Appendix

Appendix Metamap

Metamap Algorithms

There are three general types of matches:

Simple match: a direct connection between the recognized noun and the UMLS term.

Complex match: a noun phrase can be mapped directly to a combination of UMLS semantic types.

Partial match: part of the noun or noun phrase does not map to UMLS.

The general mapping strategy, for each term that SPECIALIST recognizes, is to:

1. generate all variants of the noun phrase
2. form the candidate set of all UMLS strings that contain one of the variants
3. sort the candidate set by strength of mapping
4. combine candidates for disjoint parts of the noun phrase
5. select the mapping with the best score

Variants

The variants are all component parts of a noun phrase, along with all acronyms, abbreviations and synonyms of those terms, all variants of those variants, and so on.

For the term "ocular", the following figure depicts the generation of the variants.

Metamap Scores

The scores lie in the range [-1000, 0], with lower (more negative) scores being better.

The score is based on four metrics: centrality, variation, coverage and cohesiveness. The final score is calculated as:

$$\text{Score} = -1000 \times \frac{\text{Centrality} + \text{Variation} + 2 \times \text{Coverage} + 2 \times \text{Cohesiveness}}{6}$$

Metamap Scores I

Centrality: a 1/0 indicator of whether the match is to the head of the phrase.

Variation: a measure of the distance of the matched term from the root word. The distance, D, is the sum of the following variation values, and the score is calculated as $4 / (D + 4)$:

spelling: 0

inflectional: 1

synonym/acronym/abbreviation: 2

derivational: 3
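The variation score follows directly from the distance table above, as in this minimal sketch. The dictionary keys and the function name are illustrative.

```python
# Distance contributed by each kind of variation, as listed above.
VARIATION_DISTANCE = {
    "spelling": 0,
    "inflectional": 1,
    "synonym": 2,
    "acronym": 2,
    "abbreviation": 2,
    "derivational": 3,
}


def variation_score(variations):
    """4 / (D + 4), where D sums the distances of the variations applied."""
    distance = sum(VARIATION_DISTANCE[kind] for kind in variations)
    return 4 / (distance + 4)
```

An exact match (no variations) scores 1, and a total distance of 1 gives the 4/5 that appears in the sample table.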

Metamap Scores II

Coverage: how much of both the UMLS string and the phrase are involved in the match. The number of words in each is counted, along with the span of each, i.e., the length of the matching portion, ignoring non-matching words. The score is calculated as

$$\text{Coverage} = \frac{2}{3} \cdot \frac{\text{UMLS span}}{\text{UMLS length}} + \frac{1}{3} \cdot \frac{\text{phrase span}}{\text{phrase length}}$$

Cohesiveness: like coverage, but focusing on connected terms. It calculates the lengths of the connected components (the maximal sequences of connected words in both terms) and again takes a weighted mean, this time of the sums of squares (SS):

$$\text{Cohesiveness} = \frac{2}{3} \cdot \frac{\text{SS of UMLS connected components}}{\text{UMLS length}^2} + \frac{1}{3} \cdot \frac{\text{SS of phrase connected components}}{\text{phrase length}^2}$$
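A small sketch of both measures, taking word counts as inputs; the argument names are illustrative. The values in the usage note are those shown for "less pain medication" mapped to Pain in the sample table, where one word of a three-word phrase matches a one-word UMLS string.

```python
def coverage(umls_span, umls_length, phrase_span, phrase_length):
    """Weighted mean of the matched fraction of the UMLS string and the phrase."""
    return (2 / 3) * (umls_span / umls_length) + (1 / 3) * (phrase_span / phrase_length)


def cohesiveness(umls_components, umls_length, phrase_components, phrase_length):
    """Like coverage, but using squared connected-component lengths.

    Each *_components argument lists the lengths of the connected
    components (maximal runs of matched words)."""
    umls_part = sum(c ** 2 for c in umls_components) / umls_length ** 2
    phrase_part = sum(c ** 2 for c in phrase_components) / phrase_length ** 2
    return (2 / 3) * umls_part + (1 / 3) * phrase_part
```

Here `coverage(1, 1, 1, 3)` evaluates to 7/9 and `cohesiveness([1], 1, [1], 3)` to 19/27.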

Sample Metamap Scores

From PPARCH.199603

Noun                        UMLS                  Cent.   Var.        Cov.                       Coh.
aired                       Air                   1       D=1; 4/5    1                          1
of music therapy            Music Therapy         1       D=0; 1      1                          1
a NICU                      ICU, Neonatal         1       D=0; 1      1                          1
the babies                  Infant                1       D=1; 4/5    1                          1
less pain medication        Pain                  0       D=0; 1      (2/3)(1/1) + (1/3)(1/3)    (2/3)(1/1) + (1/3)(1/9)
less pain medication        Pharm Prep            1       D=0; 1      (2/3)(2/2) + (1/3)(1/3)    (2/3)(2/2) + (1/3)(1/9)
of any published reports    Publishing            0       D=0; 1      (2/3)(1/1) + (1/3)(1/2)    (2/3)(1/1) + (1/3)(1/4)
empirical research          Empirical Research    1       D=0; 1      1                          1

Noun                        UMLS                  Total
aired                       Air                   -1000 × (1 + 4/5 + 2(1) + 2(1))/6 = -966
of music therapy            Music Therapy         -1000
a NICU                      ICU, Neonatal         -1000
the babies                  Infant                -1000 × (1 + 4/5 + 2(1) + 2(1))/6 = -966
less pain medication        Pain                  -1000 × (0 + 1 + 2(7/9) + 2(19/27))/6 = -660
less pain medication        Pharm Prep            -1000 × (1 + 1 + 2(7/9) + 2(19/27))/6 = -827
of any published reports    Publishing            -1000 × (0 + 1 + 2(10/12) + 2(36/48))/6 = -694
empirical research          Empirical Research    -1000

The two mappings for "less pain medication" average to (-660 - 827)/2 = -743.5.

Extended Boolean Information Retrieval (eBIR)

The eBIR system incorporates query weights into the traditional Boolean information retrieval (BIR) model.

Let the set of query terms be $A = \{(A_1, s_1), \ldots, (A_n, s_n)\}$, where $A_i$ is the $i$-th query term and $s_i$ is its associated score.

Let the OR and AND queries be

$$Q_{\mathrm{OR}(p)} = \{(A_1, s_1) \;\mathrm{OR}_p\; \cdots \;\mathrm{OR}_p\; (A_n, s_n)\}$$

$$Q_{\mathrm{AND}(p)} = \{(A_1, s_1) \;\mathrm{AND}_p\; \cdots \;\mathrm{AND}_p\; (A_n, s_n)\}$$

The choice of p affects the influence of high-scoring terms on the returned query scores.
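The slides do not spell out how the weighted operators are scored. The sketch below uses the standard p-norm formulation from the extended Boolean retrieval literature (Salton, Fox and Wu), which is assumed here to be the one intended. Query weights are assumed to be scaled to [0, 1] (for example, Metamap score magnitudes divided by 1000), and `doc` maps each term to its weight in the document (1.0 if the paper is indexed with it, missing or 0.0 otherwise).

```python
def or_p(weights, doc, p):
    """p-norm OR similarity between a weighted query and a document.

    weights: dict term -> query weight in [0, 1]
    doc:     dict term -> document term weight in [0, 1]
    """
    total = sum(w ** p for w in weights.values())
    matched = sum((w * doc.get(term, 0.0)) ** p for term, w in weights.items())
    return (matched / total) ** (1 / p)


def and_p(weights, doc, p):
    """p-norm AND similarity: one minus the weighted distance from a full match."""
    total = sum(w ** p for w in weights.values())
    missing = sum((w * (1 - doc.get(term, 0.0))) ** p for term, w in weights.items())
    return 1 - (missing / total) ** (1 / p)
```

At p = 1 both operators reduce to the same weighted average. As p grows, OR rewards a document more for matching any heavily weighted term, while AND penalizes it more for missing one.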

Modified IR Algorithm

The problem with applying the eBIR algorithm to this project is that it doesn't address the issue of specialized domains.

MeSH keywords such as Pediatrics or Pain could be implicitly representative of all conversations on the list.

Our algorithm modifies the eBIR algorithm by adding a specialized filter.

- The filter is attached to the query with an AND operator.

Modified IR Algorithm

The new query modifies the search query by adding Infant, Child or Adolescent to the set of MeSH terms, as demonstrated in equation (2).

- Let $(M_i, m_i)$ be MeSH term $i$ and its associated Metamap score.

$$Q = [\text{Infant} \;\mathrm{OR}_p\; \text{Child} \;\mathrm{OR}_p\; \text{Adolescent}] \;\mathrm{AND}_p\; [(M_1, m_1) \;\mathrm{OR}_p\; (M_2, m_2) \;\mathrm{OR}_p\; \cdots \;\mathrm{OR}_p\; (M_n, m_n)] \tag{2}$$

Under the eBIR algorithm, the next step would be to apply query weights to the terms in the specialized filter and then find a suitable value for p.

Modified IR Algorithm

To accommodate the filter, the eBIR algorithm was modified by making the AND operator a strict Boolean operator and leaving the query weights on the OR operator.

The decision was also made to set p = 1.

$$Q = [\text{Infant} \;\mathrm{OR}\; \text{Child} \;\mathrm{OR}\; \text{Adolescent}] \;\mathrm{AND}\; [(M_1, m_1) \;\mathrm{OR}_p\; (M_2, m_2) \;\mathrm{OR}_p\; \cdots \;\mathrm{OR}_p\; (M_n, m_n)] \tag{3}$$

The result is a search strategy customized to the pediatric pain domain that makes full use of the Metamap scores to return a pertinent set of papers.
