paradigma-consulting
Motivation & Approach
To make Cochrane content (the evidence) more accessible
Cross-referencing information
Finding relevant passages in documents of 200+ pages
Supporting discovery & search in Cochrane review documents
Build a foundation for apps to be used in “point of care” situations
Extract an ontology based on information contained in the Cochrane Library
“Discover” Cochrane content relevant to a given patient
Using semantic models (IBM’s System T) to extract entities & relations
diseases, diagnoses, treatments, interventions, medication, drugs, symptoms, complications
“… prolonged treatment with vitamin K antagonists reduces the risk of recurrent venous thromboembolism …”
L. Chiticariu, R. Krishnamurthy, Y. Li, F. Reiss, and S. Vaithyanathan, “Domain adaptation of rule-based annotators for named-entity recognition tasks,” in Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010, pp. 1002–1012.
A. Nagesh, G. Ramakrishnan, L. Chiticariu, R. Krishnamurthy, A. Dharkar, and P. Bhattacharyya, “Towards efficient named-entity rule induction for customizability,” in Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012, pp. 128–138.
Semantic models using System T / AQL
Dictionary Learning
• Extract candidate features
• Apply filters
• Post processing
Named Entity Recognition
• Extract basic features
• Combine features
• Annotate and add canonical form
Relation Identification
• Combine modifiers with entities
• Combine relation hints with extended spans
• Normalize
Extract Candidate Features – Context Hints
:treatment <is> <more> effective than :treatment
/(administer|apply)ing/ :treatment
/tak(e|ing)/ :drug
:drug <consumption>
/doses used in/ :disease
/(health )?consequences? of/ :disease
/(medication|therapy|treatment) for/ :disease
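The context hints above can be approximated outside AQL with plain regular expressions. The following is a minimal Python sketch, not the System T implementation: the `HINTS` table, the `CANDIDATE` shape (one to three lowercase words after the hint) and the function name `extract_candidates` are assumptions made for illustration. Only the hints that precede the term are covered.

```python
import re

# Hypothetical hint table: (regex for the context hint, type of the term that follows).
# The hints mirror some of the slide patterns; everything else is an assumption.
HINTS = [
    (r"(?:administer|apply)ing", "treatment"),
    (r"tak(?:e|ing)", "drug"),
    (r"doses used in", "disease"),
    (r"(?:health )?consequences? of", "disease"),
    (r"(?:medication|therapy|treatment) for", "disease"),
]

# Assumed candidate shape: one to three lowercase words directly after the hint.
CANDIDATE = r"((?:[a-z][a-z-]*)(?: [a-z][a-z-]*){0,2})"


def extract_candidates(text):
    """Return (type, candidate term) pairs found after a context hint."""
    found = []
    lowered = text.lower()
    for hint, entity_type in HINTS:
        for match in re.finditer(r"\b" + hint + r" " + CANDIDATE, lowered):
            found.append((entity_type, match.group(1)))
    return found


print(extract_candidates("Patients taking aspirin. Medication for chronic migraine."))
```

The capture is deliberately greedy, so it will also pick up over-long candidates; in the pipeline described here, those would be handled by the later filter and postprocessing steps.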
Dictionary Learning - Postprocessing
Shorten by intrinsic hierarchy
“long-term compression therapy” → “compression therapy”
“probable ischaemic stroke” → “ischaemic stroke”
(use match statistics as heuristic measures)
Filter using statistics
remove “exploded terms” (few original matches, many dictionary matches)
remove “weak terms” (few matches, but many matches for parent)
Type Deduction
Type candidates from source extractor modules
“same list” (entities mentioned within a list)
comparison patterns: “compare with”, “in combination with”
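The shortening and filtering steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the parent of a term is taken to be the term without its leading word, and the thresholds `few`, `many_dictionary` and `many_parent` are hypothetical values, not figures from the slides.

```python
def parent_of(term):
    """Drop the leading word: 'long-term compression therapy' -> 'compression therapy'."""
    _, _, rest = term.partition(" ")
    return rest or None


def postprocess(stats, few=2, many_dictionary=50, many_parent=5):
    """Filter a learned dictionary.

    stats maps term -> (original_matches, dictionary_matches).
    All thresholds are hypothetical defaults chosen for this sketch.
    """
    kept = {}
    for term, (original, dictionary) in stats.items():
        # Exploded term: few original matches, many dictionary matches.
        if original <= few and dictionary >= many_dictionary:
            continue
        # Weak term: few matches, but many matches for the parent,
        # so the shorter parent term represents it.
        parent = parent_of(term)
        if original <= few and parent in stats and stats[parent][0] >= many_parent:
            continue
        kept[term] = (original, dictionary)
    return kept


stats = {
    "long-term compression therapy": (1, 1),
    "compression therapy": (12, 15),
    "risk": (1, 400),
    "probable ischaemic stroke": (1, 1),
    "ischaemic stroke": (8, 9),
}
print(list(postprocess(stats)))
```

The match counts in `stats` are invented for the example; in practice they would come from running the candidate extractors and the resulting dictionary over the corpus.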
Building Blocks for Relation Extraction
Named Entities
(“calcium channel blockers”, “parkinson’s disease”)
Relation Hints
(A was caused by B; A has positive effects on B …)
tag with relation type: CAUSE, PREVENT, INCREASE, …
Modifier Hints
(use of A; risk of A; developing A …)
tag with modifier type: RISK, USE, INFECTION, GROWTH, …
Illustrated cases
List and Bracket Processing
“Other drugs include carbamazepine and newer antiepileptics (lamotrigine, topiramate and zonisamide) and the atypical antipsychotics (clozapine, aripiprazole and ziprasidone)”
Special Cases (for “is a”)
“atypical antipsychotics (clozapine, aripiprazole and ziprasidone)”
“infections, such as malaria and hookworm”
“selenium, vitamin C and other antioxidants”
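The three “is a” special cases above can be sketched with regular expressions. This is a minimal Python illustration, assuming the phrase has already been delimited (it does not find the phrase inside a longer sentence); the pattern names and the function `is_a_pairs` are hypothetical.

```python
import re

# Each pattern covers one special case from the slide.
BRACKET = re.compile(r"^(?P<parent>[^()]+?)\s*\((?P<items>[^()]+)\)$")
SUCH_AS = re.compile(r"^(?P<parent>[^,]+),?\s+such as\s+(?P<items>.+)$")
AND_OTHER = re.compile(r"^(?P<items>.+?)\s+and other\s+(?P<parent>.+)$")


def split_list(items):
    """Split 'a, b and c' into ['a', 'b', 'c']."""
    return [part.strip() for part in re.split(r",\s*|\s+and\s+", items) if part.strip()]


def is_a_pairs(phrase):
    """Return (child, parent) pairs for the first special case that matches."""
    for pattern in (BRACKET, SUCH_AS, AND_OTHER):
        match = pattern.match(phrase)
        if match:
            parent = match.group("parent").strip()
            return [(child, parent) for child in split_list(match.group("items"))]
    return []


print(is_a_pairs("atypical antipsychotics (clozapine, aripiprazole and ziprasidone)"))
print(is_a_pairs("infections, such as malaria and hookworm"))
print(is_a_pairs("selenium, vitamin C and other antioxidants"))
```

Trying the bracket pattern first matters, because bracketed lists usually contain “and” as well.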
Simple direct relations
@PREVENT :entity /(protect|help)s against/ :entity
@PREVENT :entity <can> <adverb>? prevent :entity
@CAUSE :entity <is> <adverb>? followed by :entity
@CAUSE :entity <can> <adverb>? result in :entity
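The four rules above can be mimicked in Python once entities are annotated. In this sketch, entities are assumed to be pre-marked in the text as `[entity]`, and the word lists standing in for `<can>`, `<is>` and `<adverb>` are hypothetical; in System T these would be dictionaries or part-of-speech based extractors.

```python
import re

ENTITY = r"\[([^\]]+)\]"          # assumed markup: entities pre-annotated as [text]
CAN = r"(?:can|may|might|could)"  # hypothetical stand-in for <can>
IS = r"(?:is|are|was|were)"       # hypothetical stand-in for <is>
ADVERB = r"(?:\w+ly )?"           # hypothetical stand-in for <adverb>? (only -ly adverbs)

# (relation type, pattern between the two entities), one entry per slide rule.
RULES = [
    ("PREVENT", r"(?:protect|help)s against"),
    ("PREVENT", CAN + " " + ADVERB + "prevent"),
    ("CAUSE", IS + " " + ADVERB + "followed by"),
    ("CAUSE", CAN + " " + ADVERB + "result in"),
]


def extract_relations(text):
    """Return (entity, relation type, entity) triples for every rule match."""
    relations = []
    for label, middle in RULES:
        pattern = ENTITY + " " + middle + " " + ENTITY
        for match in re.finditer(pattern, text):
            relations.append((match.group(1), label, match.group(2)))
    return relations


text = "[drug A] can effectively prevent [disease B]. [event C] is usually followed by [event D]."
print(extract_relations(text))
```

Adverbs that do not end in “-ly” (for example “often”) are missed by this stand-in, which is one reason a proper part-of-speech tagger is preferable to a suffix heuristic.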
Relation Postprocessing
Combine consecutive modifiers and Named Entities:
“a reduction in the risk of developing A”
REDUCE.RISK.GROWTH
Combine Relation Hints and Extended Entity Spans:
“use of calcium channel blockers was associated with a reduction in the risk of developing parkinson’s disease”
USE A CAUSE REDUCE.RISK.GROWTH B
Simplify (“translate”) to create the final Semantic Relation
A REDUCE RISK B
calcium channel blockers REDUCE RISK parkinson’s disease
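The postprocessing steps above can be sketched as a modifier join followed by a table lookup. This is a minimal Python illustration: the slides do not say how the “translate” step is implemented, so the `TRANSLATE` table is a hypothetical structure that contains only the one mapping shown in the example.

```python
# Hypothetical translation table keyed by
# (subject modifiers, relation hint, object modifiers).
# Only the single example from the slide is filled in.
TRANSLATE = {
    ("USE", "CAUSE", "REDUCE.RISK.GROWTH"): "REDUCE RISK",
}


def combine_modifiers(modifiers):
    """Join consecutive modifier tags: ['REDUCE', 'RISK', 'GROWTH'] -> 'REDUCE.RISK.GROWTH'."""
    return ".".join(modifiers)


def simplify(subject_mods, subject, relation, object_mods, obj):
    """Translate an extended relation into a final semantic relation, or None if unknown."""
    key = (combine_modifiers(subject_mods), relation, combine_modifiers(object_mods))
    final = TRANSLATE.get(key)
    if final is None:
        return None
    return (subject, final, obj)


print(simplify(
    ["USE"], "calcium channel blockers",
    "CAUSE",
    ["REDUCE", "RISK", "GROWTH"], "parkinson's disease",
))
```

Returning `None` for unknown combinations keeps untranslated relations visible, so the table can be extended as new modifier chains are encountered.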
Observations and insights gained …
Rule-based system is adequate for medical reports
Statistical approaches require larger corpora
Grammatical parsers alone are not sufficiently specific
Domain-specific language aids semantic modelling
Problems encountered, responses (POS)
Adjective contamination
Some antiepileptic drugs are marketed specifically for migraine prophylaxis.
Delimiting entities and relations
Drug therapy for migraine falls into two categories.
Patients were likely to reduce the number of their migraine headaches by 50%.
Efforts commensurate with the text corpus
Continuous improvement process inherent in our approach
Building on top of the existing dictionaries and patterns
What’s next?
Improve AQL extraction results
Improve entity normalization and types
(e.g., make better use of entity components: “endothelin receptor antagonist”)
Identify most relevant relations
Extraction of Structured Context
Use ontology for point of care situations
Introduce deep learning technology
User interface (mobile systems of engagement) for “point of care” situations
Combine with patient data to guide the discovery process in Cochrane reviews