Patient Matching A-Z
Wednesday, March 2nd 2016
Adam W. Culbertson, Innovator-in-Residence HHS, HIMSS
Overview
• Overview of Innovator-in-Residence Program
• Background on Patient Matching
• Challenges to Matching
• Evaluation of Matching Algorithms and Metrics

Innovator-in-Residence (IIR) Program
• Brings entrepreneurial individuals into HHS through collaboration with private and not-for-profit organizations
• HIMSS-funded fellow working in collaboration with the HHS CTO's office, the IDEA Lab, and the Office of the National Coordinator for Health IT
• The Patient Identification and Matching Final Report identified patient matching as a critical barrier to interoperability
• Two-year fellowship, August 2014 to August 2016
[Diagram: all pieces needed for interoperability — data Generation, Structure, Storage, Transport, and Merge — framed by governance policy considerations. This is where standards are important.]
Significant Dates in (Patient) Matching
• 1918: Soundex, US Patent 1,261,167
• 1946: Dunn, Record Linkage
• 1959: Newcombe, Kennedy & Axford, Automatic Linkage of Vital Records
• 1969: Fellegi & Sunter, A Theory for Record Linkage
• 2002: Grannis et al., Analysis of Identifier Performance Using a Deterministic Linkage Algorithm
• 2008: Campbell et al., A Comparison of Link Plus, The Link King, and a "Basic" Deterministic Algorithm; RAND Health report, Identity Crisis: An Examination of the Costs and Benefits of a Unique Patient Identifier for the US Health Care System
• 2009: HIMSS, Patient Identity Integrity; Grannis et al., Privacy and Security Solutions for Interoperable Health Information Exchange
• 2011: Winkler, Matching and Record Linkage; HIMSS Patient Identity Integrity Toolkit, Patient Key Performance Indicators
• 2014: Joffe et al., A Benchmark Comparison of Deterministic and Probabilistic Methods for Defining Manual Review Datasets in Duplicate Records Reconciliation; Dusetzina et al., Linking Data for Health Services Research: A Framework and Instructional Guide; Audacious Inquiry and ONC, Patient Identification and Matching Final Report; HIMSS hires Innovator-in-Residence (IIR) focused on patient matching
• 2015: A Framework for Cross-Organizational Patient Identity Management; Kho et al., Design and Implementation of a Privacy Preserving Electronic Health Record Linkage Tool
Patient Matching Definition
• Patient matching: comparing data from multiple sources to identify records that represent the same patient.
• In healthcare, this involves matching varied demographic fields from different health databases to create a unified view of a patient.
Identity Matching / Identity Resolution
[Diagram: a layered stack, from bottom to top]
• Structured and unstructured data sources
• Identity data repository
• Attribute matching: compare name, DOB, COB, address, etc. (see the sketch below)
• Identity matching: measure record similarity; search/retrieval
• Identity resolution: merge/dedupe records
• Identity analysis: link analysis, data mining
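To make attribute matching concrete, here is a minimal sketch in Python. It is a hypothetical example: the records and field names are invented, and the standard library's difflib stands in for a production comparator such as Jaro-Winkler.

```python
from difflib import SequenceMatcher

# Hypothetical demographic attributes being compared.
FIELDS = ("first", "last", "dob")

def similarity(a: dict, b: dict) -> float:
    """Average per-field string similarity between two records."""
    return sum(SequenceMatcher(None, a[f], b[f]).ratio() for f in FIELDS) / len(FIELDS)

rec1 = {"first": "Jonathan", "last": "Smith", "dob": "1980-03-02"}
rec2 = {"first": "Jon",      "last": "Smyth", "dob": "1980-03-02"}
print(round(similarity(rec1, rec2), 2))  # ~0.78: similar enough to send to review
```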
% Availability of Attributes Over Region
[Bar chart: percent availability (0–100%) of each demographic attribute at Sites A, B, and C. Attributes compared: First Name, Middle Name, Last Name, Date of Birth, Birth Year, Gender, Social Security Number, Address (full), Street Address Line 1, City, State, Postal Code, Country Abbreviation, Country Full Name, Phone Number (any), Home Phone Number, Cell Phone Number, Work Phone Number, Email Address, Nickname, Insurance Number (free text), Drivers License Number, Race (OMB), Race (free text), Ethnicity, Language, Occupation, Income, Marital Status, Height (cm), Height (m), Height (in), Height (ft), Weight (lbs), Weight (kg), Blood Type.]
Data Quality
• Data quality is key: garbage in, garbage out
• Data entry errors compound data-matching complexity
– Various algorithmic solutions address these, but none are perfect
• Types of errors:
– Missing or incomplete values
– Inaccurate data
– Fat-finger errors
– Out-of-date information
– Transposed names
– Misspelled names
Data Quality
• Transposition errors
– Mary Sue vs Sue Marie
– Smith, John vs John, Smith
• Names change over time
– Marriage, divorce
• More than one way to spell a name
– Jon, John
• Data entry
– Fat-finger = typo, transposition, etc.
• Phonetic variation (see the Soundex sketch below)
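Phonetic variation is what phonetic encodings such as Soundex (the 1918 patent on the timeline above) try to absorb. A minimal sketch of American Soundex in Python follows; a production system would rely on a vetted library implementation.

```python
def soundex(name: str) -> str:
    """Encode a name as one letter plus three digits (American Soundex sketch)."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    name = "".join(ch for ch in name.lower() if ch.isalpha())
    if not name:
        return ""
    first, digits, prev = name[0].upper(), "", codes.get(name[0], "")
    for ch in name[1:]:
        code = codes.get(ch, "")
        if code and code != prev:   # drop immediate repeats of the same code
            digits += code
        if ch not in "hw":          # h and w do not separate repeated codes
            prev = code
    return (first + digits + "000")[:4]

# 'John' and 'Jon' collide, so phonetic variants can be grouped together:
assert soundex("John") == soundex("Jon") == "J500"
```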
Patient Matching Goal
• The ideal outcome of any matching exercise is correctly answering one question, hundreds or thousands of times: are these two things the same thing?
– Correctly identify all the true positives and true negatives while minimizing the errors: false positives and false negatives
Patient Matching Terminology
• True Positive: the two records represent the same patient
• True Negative: the two records don't represent the same patient
• False Negative: the algorithm misses a record pair that should be matched
• False Positive: the algorithm links two records that don't actually match
Evaluation

EHR A    | EHR B    | Truth (Gold Standard) | Algorithm | Match Type
Jonathan | Jonathan | Match                 | Match     | True Positive (good)
Jonathan | Sally    | Non-Match             | Non-Match | True Negative (good)
Jonathan | Sally    | Non-Match             | Match     | False Positive (bad)
Jonathan | Jon      | Match                 | Non-Match | False Negative (bad)
Evaluation

                     | Truth: Positive | Truth: Negative
Algorithm: Positive  | True Positive   | False Positive
Algorithm: Negative  | False Negative  | True Negative
Evaluation
• Precision = True Positives / (True Positives + False Positives)
• Recall = True Positives / (True Positives + False Negatives)
• Tradeoffs between precision and recall: F measure (see the sketch below)
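A minimal sketch of these metrics in Python, using hypothetical counts; the F measure (defined later in this deck) is included to show how the tradeoff is summarized.

```python
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn) if tp + fn else 0.0

def f_measure(p: float, r: float, beta: float = 1.0) -> float:
    """F = (beta^2 + 1)*P*R / (beta^2*P + R); beta > 1 weights recall more."""
    denom = beta ** 2 * p + r
    return (beta ** 2 + 1) * p * r / denom if denom else 0.0

# Hypothetical matching run: 90 correct links, 10 spurious links, 30 missed links.
p, r = precision(tp=90, fp=10), recall(tp=90, fn=30)
print(f"precision={p:.2f} recall={r:.2f} F1={f_measure(p, r):.2f}")
# precision=0.90 recall=0.75 F1=0.82
```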
Summary
• Patient matching is an old problem
• Need to understand the data attributes available
• Understand their quality
• Follow a systematic approach to evaluation
– Methodology to create ground truth data
– Metrics: precision, recall
Development of Test Data Set
[Flow diagram: Patient Database → Select Potential Matches (aka Adjudication Pool) → Manual Reviewers 1, 2, and 3 → Human-Reviewed Match Decisions (Answer Key == Ground Truth Data Set) → Compare Algorithm and Test Data Set. A majority-vote sketch follows.]
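One simple way to turn the three reviewers' judgments into the answer key is a majority vote. A minimal sketch, assuming Python; real adjudication workflows (such as IMAC, shown later) are more involved.

```python
from collections import Counter

def adjudicate(votes: list[str]) -> str:
    """Majority label across manual reviewers for one candidate pair."""
    label, _ = Counter(votes).most_common(1)[0]
    return label

# Three hypothetical reviewer judgments for the same record pair:
print(adjudicate(["match", "match", "non-match"]))  # match
```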
Development of Ground Truth Sets
• Identify a data set that reflects the real-world use case
• Develop potential duplicates
• Human adjudication review and classification: match or non-match
• Estimate truth
– Pooled methods using multiple matching methods (see the sketch below)
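A minimal sketch of the pooled approach, assuming Python; the two matchers here (exact name, shared last name) are hypothetical stand-ins for real matching methods whose flagged pairs are unioned into the adjudication pool.

```python
from itertools import combinations

records = [
    {"id": 1, "first": "John",  "last": "Smith"},
    {"id": 2, "first": "Jon",   "last": "Smith"},
    {"id": 3, "first": "Sally", "last": "Jones"},
]

def exact_name(a, b):
    return (a["first"], a["last"]) == (b["first"], b["last"])

def same_last(a, b):
    return a["last"] == b["last"]

def adjudication_pool(recs, matchers):
    """Union of pairs flagged by any matcher; each pair then goes to human review."""
    return {(a["id"], b["id"])
            for a, b in combinations(recs, 2)
            if any(m(a, b) for m in matchers)}

print(adjudication_pool(records, [exact_name, same_last]))  # {(1, 2)}
```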
[Diagram: spellings and pronunciations cross-link; one sound maps to many spellings and one spelling to many sounds, e.g. /li/: 'Li', 'Lee', 'Leigh'; /le.ɑ/ or /li.ɑ/: 'Leah'; /lei̯/: 'Lay', 'Laye'; /lai̯/: 'Lie', 'Ligh'.]
Quoi?
Patient Names (Answers)
• Jean Rimbaud (OK, or John…)
• Leigh Cramer
• Alice Slawson
I don't know what your neighbors' names are… but did you get them right? Did you get the *whole* name right?
Identity Matching Adjudication Collector (IMAC) User Interface
One screen of the Adjudication Collector continually presents questions for the adjudicator to answer. Each question is first asked with no dates provided, then asked again with dates shown.
Issues In Establishing Ground Truth
• Different truth for different applications
– Credit check
– Security applications
– Customer support
– De-duplication of mailing lists
• What is the cost of missing a match?
– New record entered into database
– Irritated customer
– Lives are lost
• Criteria for truth must be carefully established and well understood by annotators
– The question posed to annotators must be carefully phrased
Issues In Establishing Ground Truth
• How much time / expertise is available to judge (or discount) false positives?
• Needs to reflect the real-world use case
• Evaluation results are only as good as the truth on which they are based
– And only as appropriate as the evaluation is to the task that will be performed with the operational system
• Absolute recall is impossible to measure without a completely known test set (i.e., "You don't know what you're missing.")
– Estimate with pooled results
Summary for Healthcare Use Case
• The first step in evaluation is to determine why the evaluation is being conducted
• Different truth for different applications
– Security applications vs patient health record
• What is the cost of missing a match?
– Security: lives are lost
– Health: patient safety event, missed medications, allergies, etc., even death… but this is the situation today
• What is the cost of wrongly identifying a match?
– Security: passenger is inconvenienced / delayed
– Health: patient safety event, wrong medication or treatment, liability, death
• Criteria for truth must be carefully established and well understood
– E.g., the question posed to annotators must be carefully phrased
The Trade-off Between False Positive and False Negative Matches
• As the match score threshold is raised, false positives decrease but false negatives increase (increasing precision)
• As the match score threshold is lowered, false negatives decrease but false positives increase (increasing recall)
(See the threshold sweep sketch below.)
Source: Grannis, S. Introduction to Record Linkage. September 27, 2012
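A minimal sketch of the tradeoff, assuming Python and hypothetical scored pairs; raising the cutoff trades false positives for false negatives exactly as described above.

```python
# Each pair is (similarity score from the matcher, whether it is truly the same patient).
pairs = [(0.95, True), (0.90, True), (0.80, False), (0.70, True),
         (0.60, False), (0.40, True), (0.30, False)]

for threshold in (0.50, 0.75, 0.85):
    tp = sum(1 for s, same in pairs if s >= threshold and same)
    fp = sum(1 for s, same in pairs if s >= threshold and not same)
    fn = sum(1 for s, same in pairs if s < threshold and same)
    print(f"threshold={threshold:.2f}  FP={fp}  FN={fn}")
# threshold=0.50  FP=2  FN=1
# threshold=0.75  FP=1  FN=2
# threshold=0.85  FP=0  FN=2
```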
Basic IR Metrics: Precision and Recall
[Worked example]
"Subject": MAHMOUD ABDUL HAMEED, 12/10/1945
"Target list" candidates: MOREY APPLEBAUM, MOHAMMED ABDUL HAMID, MAHMOUD ABD EL HAMEED, MAKMUD ABDUL HAMID, MAHMOUD ABD ALHAMID
The system returns Y records, of which X are true positives; Z 'true' answers exist in the target list. The remaining returns are false positives, and the true answers not returned are false negatives.
• Precision (P) = X/Y; here 2 of the 4 returned records were true matches: P = 2/4
• Recall (R) = X/Z; here 2 of the 3 true matches were found: R = 2/3
Precision and Recall Inversely Related (1)
[Diagram: within the database, the system-returns circle widens to cover more of the 'true' hits]
• Recall increased, but precision fell
• The 'low-hanging fruit' phenomenon: more false hits come in for every true one

Precision and Recall Inversely Related (2)
[Diagram: the system-returns circle shrinks around the 'true' hits]
• Precision increased, but recall fell
• More selective matching
What Makes a Good Evaluation?
• Objective: gives unbiased results
• Replicable: gives the same results for the same inputs
• Diagnostic: can give information about system improvement
• Cost-efficient: does not require extensive resources to repeat
• Understandable: results are meaningful to the appropriate people
• Well-documented: also contextualizes results in terms of the purpose of the evaluation and the task
IMAC – Admin Interface
An administrative screen allows managing IMAC users as well as the questions asked of them, including setting the priority of questions and the number of judges used for each question.
Evaluation: Like IR Tasks
• Metrics
– F-measure: harmonic mean of precision and recall
F = (β² + 1) · P · R / (β² · P + R)
where P = precision = correct system responses / all system responses
R = recall = correct system responses / all correct reference responses
β = beta factor, which provides a means to control the importance of recall over precision
– Additional measures
• False positives: items identified as correct responses that are not correct responses (= 1 − Precision)
• False negatives: correct responses not identified (= 1 − Recall)
• Fallout = non-relevant responses returned / all non-relevant reference responses (related to, but not directly calculable from, precision and recall)

Algorithm Tuning
• Issue: annotation standard for development of ground truth
• Large effects on performance due to algorithm tuning
• Tuning is need-specific
• Setting cut-offs
– Upper thresholds
– Feature selection
– Feature weighting
• Blocking (see the sketch below)
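Blocking limits full pairwise comparison to records that share a cheap key. A minimal sketch, assuming Python; the blocking key here (first letter of last name plus birth year) is an illustrative choice, not a recommendation.

```python
from collections import defaultdict

def block(records):
    """Group records by a blocking key; only records in the same block are compared."""
    blocks = defaultdict(list)
    for rec in records:
        key = (rec["last"][:1].upper(), rec["birth_year"])
        blocks[key].append(rec)
    return blocks

records = [
    {"last": "Smith", "birth_year": 1980},
    {"last": "Smyth", "birth_year": 1980},
    {"last": "Jones", "birth_year": 1975},
]
for key, members in block(records).items():
    print(key, [r["last"] for r in members])
# ('S', 1980) ['Smith', 'Smyth']  <- only this pair gets the full comparison
# ('J', 1975) ['Jones']
```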
Framework for Evaluation: EAGLES 7-Step Recipe / ISLE FEMTI*
1. Define the purpose of the evaluation: why the evaluation is being done
2. Elaborate a task model: what tasks are to be performed with the data
3. Define top-level quality characteristics
4. Produce detailed system requirements
5. Define metrics to measure requirements
6. Define techniques to measure metrics
7. Carry out and interpret the evaluation

Originally developed as an evaluation framework for machine translation, but the authors note that it can serve as a generic evaluation framework.

*Acronyms: EAGLES – European Advisory Group on Language Engineering Standards; ISLE – International Standards for Language Engineering; FEMTI – Framework for the Evaluation of Machine Translation in ISLE (http://www.issco.unige.ch/femti)