Identifying Negation/Uncertainty Attributes for SHARPn NLP. Presentation to SHARPn Summit “Secondary Use”, June 11-12, 2012. Cheryl Clark, PhD, MITRE Corporation.
© 2012 The MITRE Corporation. All rights reserved. Approved for Public Release: 12-2751
Identifying Negation/Uncertainty Attributes for SHARPn NLP
Presentation to SHARPn Summit “Secondary Use” June 11-12, 2012
Cheryl Clark, PhD MITRE Corporation
The Challenge: Text Mentions versus Clinical Facts
■ Negation: event has not occurred or entity does not exist
Example: She had no fever yesterday.
■ Uncertainty: a measure of doubt
Example: The symptoms are not inconsistent with renal failure.
■ Conditional: could exist or occur under certain circumstances
Example: The patient should come back to the ED if any rash occurs.
■ Subject: person the observation is on; experiencer
Example: Mother had lung cancer.
■ Generic: no clear subject/experiencer
Example: E. coli is sensitive to Cipro but enterococcus is not.
HOSPITAL-PEDIATRIC DISCHARGE SUMMARY
NAME – #####
DATE OF ADMISSION – ####
LOCATION – #####
BIRTH DATE - #####
(REASON FOR ADMISSION) SWOLLEN, PAINFUL HANDS. VOMITING. SYMPTOMS OF 18 HOURS DURATION.
(ABSTRACT) PATIENT, 1 YEAR OLD. IS KNOWN TO HAVE SICKLE CELL DISEASES AND 2 EPISODES OF MENINGITIS. DEVELOPED SWOLLEN, PAINFUL AND WARM HANDS. HAD SEVERAL EPISODES OF VOMIINT PRIOR TO ADMISSION. LABORATORY STUDIES DID NOT REVEAL ANEMIA OR SYSTEMIC INFECTION. HYDRATION THERAPY AND BED REST WERE PROVIDED, WITH IMPORVEMENT IN 48 HOURS. WAS DISCHARGED IMPROVED. TO BE FOLLOWED IN HEMATOLOGY CLINIC.
Mentions and the attributes attached to them:
– fever: no
– renal infarction: uncertain
– rash: conditional
– lung cancer: family member (Mother)
– Cipro…: generic
Background: Assertion Analysis Tool, Version 1
■ Inputs: input docs, i2b2 concepts
■ Processing components (from the pipeline diagram)
– Identify sections
– Negation & Uncertainty Cue/Scope Tagger
– Compute scope enclosures by rule
– Identify word classes and ordering
– Extract words, concepts, locations
– Assertion Classifier (Maximum Entropy)
■ Output: i2b2 assertions
■ Independent Evaluation: i2b2/VA 2010 Clinical NLP Challenge Assertion Status Task, F Score = 0.93
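The cue/scope step can be illustrated with a toy rule-based sketch. The cue lists, the longest-cue-wins rule, and the "scope runs from the cue to the next clause boundary" rule are all assumptions made for illustration; the slides do not give the tool's actual lexicon or scope rules.

```python
import re

# Hypothetical cue lexicons; the real tagger's cue inventory is not given in the slides.
NEGATION_CUES = ["no", "not", "denies", "without"]
UNCERTAINTY_CUES = ["possible", "may", "unable to determine", "not inconsistent with"]

# Assumed rule: a cue's scope extends to the next clause boundary.
BOUNDARY = re.compile(r"[.;,]|\b(?:and|but)\b")


def tag_scopes(sentence):
    """Return (cue_type, cue_text, scope_text) triples in order of appearance."""
    lowered = sentence.lower()
    cues = [(c, "negation") for c in NEGATION_CUES]
    cues += [(c, "uncertainty") for c in UNCERTAINTY_CUES]
    cues.sort(key=lambda item: len(item[0]), reverse=True)  # longest cues win
    claimed = []  # character spans already taken by a longer cue
    found = []
    for cue, cue_type in cues:
        for m in re.finditer(r"\b" + re.escape(cue) + r"\b", lowered):
            if any(m.start() < end and start < m.end() for start, end in claimed):
                continue  # e.g. "not" inside "not inconsistent with"
            claimed.append((m.start(), m.end()))
            rest = lowered[m.end():]
            boundary = BOUNDARY.search(rest)
            scope = rest[: boundary.start()] if boundary else rest
            found.append((m.start(), cue_type, cue, scope.strip()))
    return [(cue_type, cue, scope) for _, cue_type, cue, scope in sorted(found)]


print(tag_scopes("She had no fever yesterday."))
print(tag_scopes("The symptoms are not inconsistent with renal failure."))
```

A span-based rule like this yields only the extent of a scope, not its internal structure, which is the limitation the later "System Errors" slides point to.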
Assertion Status Integration within SHARPn Clinical Document Pipeline
■ Pipeline (from the diagram): Input docs … cTAKES analysis engines … Annotations … assertion components … Updated attribute annotations
■ Assertion components
– Identify sections
– Negation & Uncertainty Cue/Scope Tagger
– Compute scope enclosures by rule
– Identify word classes and ordering
– Extract words, concepts, locations
– Assertion Classifier (Maximum Entropy)
■ All annotations are UIMA Common Analysis Structure (CAS)
i2b2 Assertion Categories
■ Assertion classification system designed to meet requirements of 2010 i2b2/VA Challenge Assertion subtask
– Present: default category
Example: Patient had a stroke
– Absent: problem does not exist in the patient
Example: History inconsistent with stroke
– Possible: uncertainty expressed
Example: We are unable to determine whether she has leukemia
– Conditional: patient experiences the problem only under certain conditions
Example: Patient reports shortness of breath upon climbing stairs
– Hypothetical: medical problems the patient may develop (corresponds to SHARPn conditional)
Example: If you experience wheezing or shortness of breath
– Not Patient: problem associated with someone who is not the patient
Example: Family history of prostate cancer
Re-architecting Assertions
■ i2b2 assertion output values (single, multi-way classifier)
– defined for medical problems
– closed set of values: present, absent, possible, hypothetical, not patient, conditional (no SHARPn equivalent)
– mutually exclusive (fixed priority when multiple values apply)
■ SHARPn assertion attributes (multiple classifiers, some binary)
– negation yes/no
– uncertainty yes/no
– conditional yes/no
– subject multi-valued (patient, family, donor, other…)
– …
– apply to various entities, events, relations
– independent
– attributes can have multiple values
– additional attributes may be added
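The contrast between the two representations can be sketched as data structures. This is a minimal illustration only: the class names, field names, and default values below are assumptions, not the SHARPn type system or the MASTIF code.

```python
from dataclasses import dataclass
from enum import Enum


class I2b2Assertion(Enum):
    """Closed, mutually exclusive label set from the i2b2/VA 2010 task."""
    PRESENT = "present"
    ABSENT = "absent"
    POSSIBLE = "possible"
    HYPOTHETICAL = "hypothetical"
    CONDITIONAL = "conditional"
    NOT_PATIENT = "not patient"


@dataclass
class SharpnAttributes:
    """Independent attributes; names and defaults here are illustrative assumptions."""
    negation: bool = False
    uncertainty: bool = False
    conditional: bool = False
    subject: str = "patient"  # multi-valued: patient, family, donor, other...


# One i2b2 label per concept versus several attributes set at once.
single_label = I2b2Assertion.ABSENT
attributes = SharpnAttributes(negation=True, subject="family_member")
print(single_label.value, attributes)
```

Because the attributes are separate fields, a further attribute can be added without changing the meaning of the existing ones, which a closed label set does not allow.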
Assertion Module Refactoring: Phase 1
■ Simple mapping from i2b2 assertion classes to SHARPn attributes
– Uses existing i2b2-trained single classifier model
– Identifies i2b2/SHARPn equivalences
– Maps to SHARPn attribute values
Example: Please call physician if you develop [shortness of breath].
– i2b2 assertion status = “hypothetical”
– SHARPn conditional attribute = “true”
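The Phase 1 mapping amounts to a lookup from one i2b2 label to a set of SHARPn attribute values. The sketch below uses the equivalences named in the slides (absent/negation, possible/uncertainty, hypothetical/conditional); the dictionary layout, the default values, the `non_patient` subject value, and the handling of i2b2 "conditional" are assumptions for illustration.

```python
# Default SHARPn attribute values for a concept with no cues (assumed defaults).
DEFAULTS = {
    "negation": False,
    "uncertainty": False,
    "conditional": False,
    "subject": "patient",
}

# Overrides applied per i2b2 label. "non_patient" is a placeholder value,
# because the i2b2 label does not say who the experiencer is.
I2B2_TO_SHARPN = {
    "present": {},
    "absent": {"negation": True},
    "possible": {"uncertainty": True},
    "hypothetical": {"conditional": True},
    "not patient": {"subject": "non_patient"},
    "conditional": {},  # no SHARPn equivalent; left at defaults here (assumption)
}


def map_assertion(i2b2_label):
    """Map a single i2b2 assertion label to a dict of SHARPn attribute values."""
    overrides = I2B2_TO_SHARPN[i2b2_label.lower()]
    return {**DEFAULTS, **overrides}


print(map_assertion("hypothetical"))
```

Since the input is a single label, this mapping can never set more than one non-default attribute per concept.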
Assertion Module Refactoring: Phase 2
■ Direct assignment of SHARPn attribute values
■ Will use multiple classifiers trained on SHARPn data
– Will identify attribute values directly
■ Benefits
– Aligns with SHARPn concept attributes requirements
– Aligns with SHARPn clinical data annotation
– Enables more accurate meaning representation
Example: He does not smoke, has no hypertension, and has no family history of coronary artery disease.
■ i2b2 2010 Paradigm: choose one of present, absent, possible, hypothetical, conditional, not patient
– negator “no” suggests absent
– “family” suggests not patient
■ SHARPn Attribute Paradigm
– negation = present
– subject = family_member
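The information lost under the single-label paradigm can be shown with a small sketch that collapses independent attribute values into one i2b2 label. The slides state that a fixed priority exists but do not give it, so the priority order below is an assumption, as are the function names.

```python
# Assumed priority order, highest first; the actual i2b2 priority is not stated in the slides.
PRIORITY = ["not patient", "hypothetical", "possible", "absent"]


def applicable_labels(negation, uncertainty, conditional, subject):
    """Every i2b2 label that the attribute values would justify."""
    labels = set()
    if subject != "patient":
        labels.add("not patient")
    if conditional:
        labels.add("hypothetical")
    if uncertainty:
        labels.add("possible")
    if negation:
        labels.add("absent")
    return labels


def collapse(negation=False, uncertainty=False, conditional=False, subject="patient"):
    """Pick the single highest-priority label, defaulting to 'present'."""
    labels = applicable_labels(negation, uncertainty, conditional, subject)
    for label in PRIORITY:
        if label in labels:
            return label
    return "present"


# "no family history of coronary artery disease"
print(applicable_labels(True, False, False, "family_member"))
print(collapse(negation=True, subject="family_member"))
```

Under this assumed ordering the negation is dropped in favor of "not patient", whereas the attribute representation keeps both facts.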
System Errors => Need for Better Linguistic Analysis for Assertions
■ Need for phrasal structure; scope extent not always enough
Example: She had [no chest pain or chest pressure] with this and this was deemed a negative test.
– Slide callouts: negated; not negated
Syntactic Approaches*
■ Insert a signifier node into constituency parse above entity
■ Use tree kernel methods to compare similarity with negated sentences in training data (can be used on other modifiers as well with varying degrees of success)
* Slide courtesy of Tim Miller, Children’s Hospital Boston
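The first step, inserting a signifier node above the entity, can be sketched on a toy parse tree. The nested-list tree encoding, the `ENTITY` label, and the function names are assumptions for illustration; the tree kernel comparison itself is not shown.

```python
def leaves(tree):
    """Return the tokens under a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return [tree]
    tokens = []
    for child in tree[1:]:
        tokens.extend(leaves(child))
    return tokens


def insert_signifier(tree, entity_tokens, label="ENTITY"):
    """Return a new tree with a `label` node placed above the entity subtree.

    Trees are nested lists of the form [label, child, child, ...]. The highest
    subtree whose tokens equal `entity_tokens` is wrapped; the input is not mutated.
    """
    if isinstance(tree, str):
        return tree
    new_children = []
    for child in tree[1:]:
        if leaves(child) == entity_tokens:
            new_children.append([label, child])  # wrap and stop descending
        else:
            new_children.append(insert_signifier(child, entity_tokens, label))
    return [tree[0]] + new_children


# Simplified parse of "She had no fever".
parse = ["S", ["NP", "She"], ["VP", "had", ["NP", "no", ["NN", "fever"]]]]
print(insert_signifier(parse, ["fever"]))
```

Marking the entity inside the tree lets a tree-based similarity measure see where the entity sits relative to a cue such as "no". If the entity tokens occur more than once in a sentence, this sketch wraps every occurrence.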
Tree kernel fragment mining*
■ Use TK model to extract tree fragment features (Pighin & Moschitti 07)
■ Allows interaction with other feature types
■ Faster to find fragments than do whole-tree comparisons
* Slide courtesy of Tim Miller, Children’s Hospital Boston
Next Steps: Assertions for Relations
■ Some assertion attributes apply to relations, too.
– negation
– uncertainty
– conditional
Example: The fundal AVMs are a potential site of bleeding although do not explain the extent of bleeding.
– location relation (“site of”): uncertain, cued by “potential”
– causal relation (“explain”): negated, cued by “not”
Next Steps: Classifier Retraining and Component Evaluation
■ Model Retraining
– Models for individual attributes
– Linguistic features based on parser output
– Training on SHARPn data
– Enhancements to parsers
■ Evaluation
– Accuracy on i2b2 gold annotations vs. accuracy on SHARPn gold annotations
■ i2b2 absent vs. SHARPn negated
■ i2b2 possible vs. SHARPn uncertainty
■ i2b2 hypothetical vs. SHARPn conditional
– Evaluation based on system-generated entity annotations
– Evaluation on CEM concept rather than on individual mentions
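Per-attribute evaluation against gold annotations can be sketched with the standard precision, recall, and F1 definitions. The function name and the boolean-list input format are assumptions; the slides do not say how the reported F score was averaged, so this is not a reproduction of the i2b2 scoring.

```python
def precision_recall_f1(gold, predicted):
    """Score one binary attribute (e.g. negation) over aligned concept lists."""
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g and p)
    false_pos = sum(1 for g, p in pairs if not g and p)
    false_neg = sum(1 for g, p in pairs if g and not p)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold and system values for the negation attribute of four concepts.
gold_negation = [True, True, False, False]
system_negation = [True, False, True, False]
print(precision_recall_f1(gold_negation, system_negation))  # (0.5, 0.5, 0.5)
```

Running the same function once per attribute (negation, uncertainty, conditional) gives figures that can be set beside the corresponding i2b2 categories listed above.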
Thank you!
SHARPn Negation/Uncertainty Team: John Aberdeen, David Carrell, Cheryl Clark, Matt Coarr, Scott Halgrim, Lynette Hirschman, Donna Ihrke, Tim Miller, Guergana Savova, Ben Wellner
Backup Slides
Clarifying Definitions
■ Negation and temporal
Example: The patient had the tumor removed.
– The text span “removed” indicates the tumor was there but does not exist anymore. Originally annotated as negated.
– No longer annotated as negated. Course: degree_of (tumor, CHANGED (span for “removed”))
■ Circumstantial negation (i2b2 calls this conditional)
Example: While smoking, he does not use his nicotine patch
– Annotated as negated
■ Allergens
Example: ALLERGIES: PCN, Sulpha, Zocor, Asendin, Rocephin
– Medications mentioned as allergens originally negated
– Allergen status distinguished from negation: Allergy_indicator_class
System Errors => Need for Better Linguistic Analysis for Assertions
Issue is structure and not simply span extent:
Example: She had no signs of infection on her leg wounds and she did have some mild erythema around her right great toe
– present = should not be negated
– absent = negated
Example: She had [no chest pain or chest pressure] with this and this was deemed a negative test.
– Slide callouts: negated; not negated
MASTIF-Generated SHARPn attributes in cTAKES Output
■ [Screenshot placeholder; callouts: default values, calculated value]
Assertions for Different Concept Types
■ polarity = -1 (negated)
Issues: Differences in training data annotation
■ UMLS CUI-driven annotation (SHARPn)
UMLS contains some concept-internal negation and concept-internal subject:
– Cigarette smoker. Concept: [C0337667] (finding)
– Never smoked. Concept: [C0425293] Never smoked tobacco (finding)
– Non-smoker. Concept: [C0337672] Non-smoker (finding)
– Mother smokes. Concept: [C0424969] (finding)
– Father smokes. Concept: [C0424968] (finding)
– Mother does not smoke. Concept: [C2586137] (finding)
– Father does not smoke. Concept: [C2733448] (finding)
■ i2b2 concept excludes contextual cues; SHARPn concept includes them.
Example: The patient has never smoked.
– i2b2 concept: smoked (negated)
– SHARPn concept: never smoked (not negated)
Issue: Differences in training data annotation
■ No known allergies. Concept: [C0262580] No known allergies
– i2b2: concept = known allergies; type = problem; assertion = absent
– SHARPn: concept = no known allergies; type = disease/disorder (finding in UMLS); assertion = present
■ NKA
– i2b2: concept = nka; type = problem; assertion = absent
Abstract
We describe a methodology for identifying negation and uncertainty in clinical documents and a system that uses that information to assign assertion values to medical problems mentioned in clinical text. This system was among the top performing systems in the assertion subtask of the 2010 i2b2/VA community evaluation Challenges in natural language processing for clinical data, and has subsequently been packaged as a UIMA module called the MITRE Assertion Status Tool for Interpreting Facts (MASTIF), which can be integrated with cTAKES. We describe the process of extending MASTIF, which uses a single multi-way classifier to select among a closed set of mutually exclusive assertion categories, to a system that uses individual, independent classifiers to assign values to independent negation and uncertainty attributes associated with a variety of clinical concepts (e.g., medications, procedures, and relations) as specified by SHARPn requirements. We discuss the benefits that result from this new representation and the challenges associated with generating it automatically. We compare the accuracy of MASTIF on i2b2 data with accuracy on a subset of SHARPn clinical documents, and discuss the contribution of linguistic features to accuracy and generalizability of the system. Finally, we discuss our plans for future development.