© 2012 The MITRE Corporation. All rights reserved. Approved for Public Release: 12-2751 Identifying Negation/Uncertainty Attributes for SHARPn NLP Presentation to SHARPn Summit “Secondary Use” June 11-12, 2012 Cheryl Clark, PhD MITRE Corporation

Identifying Negation/Uncertainty Attributes for SHARPn NLP


Page 1: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP


Identifying Negation/Uncertainty Attributes for SHARPn NLP

Presentation to SHARPn Summit “Secondary Use” June 11-12, 2012

Cheryl Clark, PhD MITRE Corporation

Page 2: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP


The Challenge: Text Mentions versus Clinical Facts

■ Negation: event has not occurred or entity does not exist
  She had no fever yesterday.

■ Uncertainty: a measure of doubt
  The symptoms are not inconsistent with renal failure.

■ Conditional: could exist or occur under certain circumstances
  The patient should come back to the ED if any rash occurs.

■ Subject: person the observation is on; experiencer
  Mother had lung cancer.

■ Generic: no clear subject/experiencer
  E. coli is sensitive to Cipro but enterococcus is not.

HOSPITAL-PEDIATRIC DISCHARGE SUMMARY
NAME – #####
DATE OF ADMISSION – ####
LOCATION – #####
BIRTH DATE - #####

(REASON FOR ADMISSION)
SWOLLEN, PAINFUL HANDS. VOMITING. SYMPTOMS OF 18 HOURS DURATION.

(ABSTRACT)
PATIENT, 1 YEAR OLD. IS KNOWN TO HAVE SICKLE CELL DISEASES AND 2 EPISODES OF MENINGITIS. DEVELOPED SWOLLEN, PAINFUL AND WARM HANDS. HAD SEVERAL EPISODES OF VOMITING PRIOR TO ADMISSION. LABORATORY STUDIES DID NOT REVEAL ANEMIA OR SYSTEMIC INFECTION. HYDRATION THERAPY AND BED REST WERE PROVIDED, WITH IMPROVEMENT IN 48 HOURS. WAS DISCHARGED IMPROVED. TO BE FOLLOWED IN HEMATOLOGY CLINIC.


Mentions and the attributes assigned to them:

  – fever: no (negated)
  – renal infarction: uncertain
  – rash: conditional
  – lung cancer: family member (Mother)
  – Cipro…: generic

Page 3: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Background: Assertion Analysis Tool, Version 1

[Diagram: assertion analysis pipeline. Components shown:]

■ Inputs: input docs, i2b2 concepts
■ Identify sections
■ Extract words, concepts, locations
■ Identify word classes and ordering
■ Negation & Uncertainty Cue/Scope Tagger
■ Compute scope enclosures by rule
■ Assertion Classifier (Maximum Entropy)
■ Output: i2b2 assertions

Independent Evaluation: i2b2/VA 2010 Clinical NLP Challenge Assertion Status Task

F Score = 0.93
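A maximum entropy classifier is a multinomial logistic regression model: each class receives a score that is the sum of the weights of the active features, and a softmax turns the scores into probabilities. The sketch below is only a toy illustration of that final classification step. The feature names and weights are invented here and are not taken from the tool described on the slide.

```python
import math

# ASSUMPTION: toy weights and feature names, invented for illustration.
# A trained model would learn these from annotated clinical text.
WEIGHTS = {
    "present": {"bias": 1.0},
    "absent": {"cue_in_scope=no": 2.5},
    "possible": {"cue_in_scope=possible": 2.5},
}


def maxent_probabilities(features):
    """Return p(label | features) as a softmax over summed feature weights."""
    scores = {
        label: sum(weights.get(f, 0.0) for f in features)
        for label, weights in WEIGHTS.items()
    }
    z = sum(math.exp(s) for s in scores.values())  # normalizing constant
    return {label: math.exp(s) / z for label, s in scores.items()}


def classify(features):
    """Pick the most probable label."""
    probs = maxent_probabilities(features)
    return max(probs, key=probs.get)


# A concept inside the scope of the cue "no" versus one with no cue nearby.
print(classify(["bias", "cue_in_scope=no"]))
print(classify(["bias"]))
```

In this sketch the upstream steps (section identification, cue and scope tagging) would be responsible for producing the feature list.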

Page 4: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Assertion Status Integration within SHARPn Clinical Document Pipeline

[Diagram: assertion components within the cTAKES pipeline]

■ Input docs … cTAKES analysis engines … Annotations
■ Assertion components
  – Identify sections
  – Extract words, concepts, locations
  – Identify word classes and ordering
  – Negation & Uncertainty Cue/Scope Tagger
  – Compute scope enclosures by rule
  – Assertion Classifier (Maximum Entropy)
■ Output: updated attribute annotations
■ All annotations are stored in the UIMA Common Analysis Structure (CAS)

Page 5: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

i2b2 Assertion Categories

■ Assertion classification system designed to meet requirements of 2010 i2b2/VA Challenge Assertion subtask
  – Present: default category
    Patient had a stroke
  – Absent: problem does not exist in the patient
    History inconsistent with stroke
  – Possible: uncertainty expressed
    We are unable to determine whether she has leukemia
  – Conditional: patient experiences the problem only under certain conditions
    Patient reports shortness of breath upon climbing stairs
  – Hypothetical: medical problems the patient may develop (corresponds to SHARPn conditional)
    If you experience wheezing or shortness of breath
  – Not Patient: problem associated with someone who is not the patient
    Family history of prostate cancer

Page 6: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Re-architecting Assertions

■ i2b2 assertion output values
  – defined for medical problems
  – closed set of values: present, absent, possible, hypothetical, not patient, conditional (no SHARPn equivalent)
  – mutually exclusive (fixed priority when multiple values apply)
  – single, multi-way classifier

■ SHARPn assertion attributes
  – negation: yes/no
  – uncertainty: yes/no
  – conditional: yes/no
  – subject: multi-valued (patient, family, donor, other…)
  – …
  – apply to various entities, events, relations
  – independent
  – attributes can have multiple values
  – additional attributes may be added
  – multiple classifiers, some binary
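The contrast between the two representations can be sketched as data types: one closed enumeration for the i2b2 output versus a record of independent fields for SHARPn. The class and field names below are hypothetical, and the default subject of "patient" is an assumption rather than something stated on the slide.

```python
from dataclasses import dataclass
from enum import Enum


class I2b2Assertion(Enum):
    """Closed, mutually exclusive set: exactly one value per medical problem."""
    PRESENT = "present"
    ABSENT = "absent"
    POSSIBLE = "possible"
    HYPOTHETICAL = "hypothetical"
    NOT_PATIENT = "not patient"
    CONDITIONAL = "conditional"


@dataclass
class SharpnAttributes:
    """Independent attributes; any combination may hold at once."""
    negation: bool = False
    uncertainty: bool = False
    conditional: bool = False
    subject: str = "patient"  # ASSUMPTION: default value chosen for this sketch


# The i2b2 view holds a single label; the SHARPn view can record
# negation and a non-patient subject at the same time.
i2b2_view = I2b2Assertion.ABSENT
sharpn_view = SharpnAttributes(negation=True, subject="family")
print(i2b2_view.value, sharpn_view)
```

Adding a new SHARPn attribute in this layout means adding a field, whereas adding an i2b2 category changes the label set of the single multi-way classifier.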

Page 7: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Assertion Module Refactoring: Phase 1

■ Simple mapping from i2b2 assertion classes to SHARPn attributes
  – Uses existing i2b2-trained single classifier model
  – Identifies i2b2/SHARPn equivalences
  – Maps to SHARPn attribute values

Example: Please call physician if you develop [shortness of breath].

  – i2b2 assertion status = “hypothetical”
  – SHARPn conditional attribute = “true”
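A Phase 1 style mapping can be sketched as a lookup table from the single i2b2 label to SHARPn attribute values. The pairings absent/negation, possible/uncertainty and hypothetical/conditional follow the slides. The function and dictionary names are hypothetical, and the subject value used for "not patient" is an assumption, since the slides do not say which subject value Phase 1 assigns.

```python
# Default attribute values for a concept with no special assertion status.
DEFAULTS = {
    "negation": False,
    "uncertainty": False,
    "conditional": False,
    "subject": "patient",
}

# i2b2 label -> SHARPn attributes that differ from the defaults.
I2B2_TO_SHARPN = {
    "present": {},
    "absent": {"negation": True},
    "possible": {"uncertainty": True},
    "hypothetical": {"conditional": True},
    "not patient": {"subject": "other"},  # ASSUMPTION: exact subject value unknown
    "conditional": {},  # no SHARPn equivalent, so the defaults are kept
}


def map_i2b2_to_sharpn(label):
    """Translate one i2b2 assertion label into SHARPn attribute values."""
    return {**DEFAULTS, **I2B2_TO_SHARPN[label]}


# "Please call physician if you develop shortness of breath."
print(map_i2b2_to_sharpn("hypothetical"))
```

Because the input is a single label, this mapping can never produce two non-default attributes at once, which is the limitation that direct attribute assignment is meant to remove.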

Page 8: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Assertion Module Refactoring: Phase 2

■ Direct assignment of SHARPn attribute values
■ Will use multiple classifiers trained on SHARPn data
  – Will identify attribute values directly
■ Benefits
  – Aligns with SHARPn concept attributes requirements
  – Aligns with SHARPn clinical data annotation
  – Enables more accurate meaning representation

Example: He does not smoke, has no hypertension, and has no family history of [coronary artery disease].

  – Cues: negator “no” (absent); “family” (not patient)
  – i2b2 2010 Paradigm: choose one of present, absent, possible, hypothetical, conditional, not patient
  – SHARPn Attribute Paradigm: negation = present, subject = family_member
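The difference between the two paradigms for this example can be sketched as follows. Under a single-label scheme, two applicable categories must be collapsed to one by a fixed priority, while independent attributes keep both facts. The priority order below is an assumption made for illustration (the slides say a fixed priority exists but do not list it), the function names are hypothetical, and booleans stand in for the slide's "negation = present" notation.

```python
# ASSUMPTION: illustrative priority order, highest priority first.
ASSUMED_PRIORITY = [
    "not patient", "hypothetical", "conditional", "possible", "absent", "present",
]


def i2b2_single_label(applicable):
    """Collapse all applicable categories to one label by fixed priority."""
    for label in ASSUMED_PRIORITY:
        if label in applicable:
            return label
    return "present"  # default category


def sharpn_attributes(applicable):
    """Assign each attribute independently, so no information is dropped."""
    return {
        "negation": "absent" in applicable,
        "uncertainty": "possible" in applicable,
        "conditional": "hypothetical" in applicable,
        "subject": "family_member" if "not patient" in applicable else "patient",
    }


# "... has no family history of coronary artery disease":
# both "absent" (cue: no) and "not patient" (cue: family) apply.
applicable = {"absent", "not patient"}
print(i2b2_single_label(applicable))   # one label survives, the negation is lost
print(sharpn_attributes(applicable))   # negation and subject are both recorded
```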

Page 9: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

System Errors => Need for Better Linguistic Analysis for Assertions

■ Need for phrasal structure; scope extent not always enough

She had [no chest pain or chest pressure ] with this and this was deemed a negative test.

[Slide callouts: negated; not negated]

Page 10: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Syntactic Approaches*

■ Insert a signifier node into constituency parse above entity
■ Use tree kernel methods to compare similarity with negated sentences in training data (can be used on other modifiers as well with varying degrees of success)

* Slide courtesy of Tim Miller, Children’s Hospital Boston
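The first step, inserting a signifier node above the entity, can be sketched on a constituency parse stored as nested tuples. This is a minimal illustration under stated assumptions: the parse, the node label "ENTITY" and the helper names are all hypothetical, and the tree kernel comparison itself is not implemented here.

```python
def leaves(tree):
    """Return the words under a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


def insert_signifier(tree, entity_words, label="ENTITY"):
    """Copy the tree, wrapping the lowest constituent that spans exactly
    entity_words in a new node named label."""
    if isinstance(tree, str):
        return tree
    node_label, *children = tree
    rebuilt = (node_label, *[insert_signifier(c, entity_words, label) for c in children])
    spans_entity = leaves(tree) == entity_words
    child_spans_entity = any(
        not isinstance(c, str) and leaves(c) == entity_words for c in children
    )
    if spans_entity and not child_spans_entity:
        return (label, rebuilt)
    return rebuilt


def to_bracket(tree):
    """Render a tree in bracketed form."""
    if isinstance(tree, str):
        return tree
    return "(" + tree[0] + " " + " ".join(to_bracket(c) for c in tree[1:]) + ")"


# Hypothetical parse of "She had no fever"; the entity is "fever".
parse = ("S",
         ("NP", ("PRP", "She")),
         ("VP", ("VBD", "had"),
                ("NP", ("DT", "no"), ("NN", "fever"))))
marked = insert_signifier(parse, ["fever"])
print(to_bracket(marked))
# (S (NP (PRP She)) (VP (VBD had) (NP (DT no) (ENTITY (NN fever)))))
```

With the marker in place, a tree-based similarity measure can tell which constituent is the entity under consideration when the same sentence contains several candidates.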

Page 11: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Tree kernel fragment mining*

■ Use TK model to extract tree fragment features (Pighin & Moschitti 07)
■ Allows interaction with other feature types
■ Faster to find fragments than to do whole-tree comparisons

* Slide courtesy of Tim Miller, Children’s Hospital Boston

Page 12: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Next Steps: Assertions for Relations

■ Some assertion attributes apply to relations, too.
  – negation
  – uncertainty
  – conditional

Example: The [fundal AVMs] are a potential site of [bleeding] although do not explain the extent of [bleeding].

  – location relation (“site of”): uncertain (cue: “potential”)
  – causal relation (“explain”): negated (cue: “not”)

Page 13: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Next Steps: Classifier Retraining and Component Evaluation

■ Model Retraining
  – Models for individual attributes
  – Linguistic features based on parser output
  – Training on SHARPn data
  – Enhancements to parsers

■ Evaluation
  – Accuracy on i2b2 gold annotations vs. accuracy on SHARPn gold annotations
    ■ i2b2 absent vs. SHARPn negated
    ■ i2b2 possible vs. SHARPn uncertainty
    ■ i2b2 hypothetical vs. SHARPn conditional
  – Evaluation based on system-generated entity annotations
  – Evaluation on CEM concept rather than on individual mentions
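Per-attribute evaluation against gold annotations can be sketched with the usual precision, recall and F-score definitions, computed separately for each binary attribute. The gold and system values below are invented toy data for illustration only and are not results from the project; the function name is hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Score one binary attribute (e.g. negation) over aligned mentions."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f1


# ASSUMPTION: toy values, one entry per mention, True = attribute asserted.
gold_negation = [True, False, True, True, False]
system_negation = [True, False, False, True, True]

p, r, f = precision_recall_f1(gold_negation, system_negation)
print(f"negation: P={p:.2f} R={r:.2f} F1={f:.2f}")
```

Running the same function once per attribute (negation, uncertainty, conditional) gives the side-by-side figures needed to compare i2b2-style and SHARPn-style gold annotations.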

Page 14: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Thank you!

SHARPn Negation/Uncertainty Team

John Aberdeen
David Carrell
Cheryl Clark
Matt Coarr
Scott Halgrim
Lynette Hirschman
Donna Ihrke
Tim Miller
Guergana Savova
Ben Wellner

Page 15: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP


Backup Slides

Page 16: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Clarifying Definitions

■ Negation and temporal
  The patient had the tumor removed.
  – The text span “removed” indicates the tumor was there but does not exist anymore.
  – Originally annotated as negated.
  – No longer annotated as negated. Course: degree_of (tumor, CHANGED (span for “removed”))

■ Circumstantial negation (i2b2 calls this conditional)
  While smoking, he does not use his nicotine patch
  – Annotated as negated

■ Allergens
  ALLERGIES: PCN, Sulpha, Zocor, Asendin, Rocephin
  – Medications mentioned as allergens originally negated
  – Allergen status distinguished from negation: Allergy_indicator_class

Page 17: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

System Errors => Need for Better Linguistic Analysis for Assertions

Issue is structure and not simply span extent:

She had no signs of infection on her leg wounds and she did have some mild erythema around her right great toe

  – present = should not be negated
  – absent = negated

She had [no chest pain or chest pressure ] with this and this was deemed a negative test.

[Slide callouts: negated; not negated]
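Why span extent alone falls short can be sketched with two surface rules. The first negates everything after a cue in the sentence; the second ends the scope at a conjunction. Both rules, the cue list and the breaker list are simplifications invented here, and the second test sentence ("She had no fever and chills") is a made-up example, not one from the slides.

```python
NEGATION_CUES = {"no", "not"}          # ASSUMPTION: tiny illustrative cue list
SCOPE_BREAKERS = {"and", "but"}        # ASSUMPTION: tiny illustrative breaker list


def negated_flat(tokens, index):
    """Rule 1: negated if any cue appears earlier in the sentence."""
    return any(t in NEGATION_CUES for t in tokens[:index])


def negated_with_breakers(tokens, index):
    """Rule 2: a cue opens a scope; a conjunction closes it."""
    negated = False
    for t in tokens[:index]:
        if t in NEGATION_CUES:
            negated = True
        elif t in SCOPE_BREAKERS:
            negated = False
    return negated


slide = ("she had no signs of infection on her leg wounds and she did have "
         "some mild erythema around her right great toe").split()
made_up = "she had no fever and chills".split()

# Rule 1 wrongly negates "erythema"; rule 2 gets this sentence right.
print(negated_flat(slide, slide.index("erythema")))
print(negated_with_breakers(slide, slide.index("erythema")))

# Rule 2 then fails on a coordinated noun phrase, where "and" does not end the scope.
print(negated_with_breakers(made_up, made_up.index("chills")))
```

Telling a clause-level "and" apart from a phrase-level "and" requires phrasal structure, which is the motivation given on this slide for using parser output.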

Page 18: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

MASTIF-Generated SHARPn attributes in cTAKES Output

[Screenshot not included. Slide callouts: default values; calculated value]

Page 19: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP


Assertions for Different Concept Types


polarity = -1 negated

Page 20: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Issues: Differences in training data annotation

UMLS CUI-driven annotation (SHARPn)

UMLS contains some concept-internal negation; concept-internal subject

  Cigarette smoker         Concept: [C0337667] (finding)
  Never smoked             Concept: [C0425293] Never smoked tobacco (finding)
  Non-smoker               Concept: [C0337672] Non-smoker (finding)
  Mother smokes            Concept: [C0424969] (finding)
  Father smokes            Concept: [C0424968] (finding)
  Mother does not smoke    Concept: [C2586137] (finding)
  Father does not smoke    Concept: [C2733448] (finding)

i2b2 concept excludes contextual cues; SHARPn concept includes them.

Example: The patient has never smoked.

  – i2b2 concept: smoked (negated)
  – SHARPn concept: never smoked (not negated)

Page 21: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Issue: Differences in training data annotation

No known allergies    Concept: [C0262580] No known allergies

  – i2b2: concept = known allergies; type = problem; assertion = absent
  – SHARPn: concept = no known allergies; type = disease/disorder (finding in UMLS); assertion = present

NKA

  – i2b2: concept = nka; type = problem; assertion = absent

Page 22: Identifying Negation/Uncertainty Attributes for  SHARPn  NLP

Abstract

We describe a methodology for identifying negation and uncertainty in clinical documents and a system that uses that information to assign assertion values to medical problems mentioned in clinical text. This system was among the top performing systems in the assertion subtask of the 2010 i2b2/VA community evaluation Challenges in natural language processing for clinical data, and has subsequently been packaged as a UIMA module called the MITRE Assertion Status Tool for Interpreting Facts (MASTIF), which can be integrated with cTAKES. We describe the process of extending MASTIF, which uses a single multi-way classifier to select among a closed set of mutually exclusive assertion categories, to a system that uses individual, independent classifiers to assign values to independent negation and uncertainty attributes associated with a variety of clinical concepts (e.g., medications, procedures, and relations) as specified by SHARPn requirements. We discuss the benefits that result from this new representation and the challenges associated with generating it automatically. We compare the accuracy of MASTIF on i2b2 data with accuracy on a subset of SHARPn clinical documents, and discuss the contribution of linguistic features to accuracy and generalizability of the system. Finally, we discuss our plans for future development.