Understanding patient experiences from mining primary care data
Centre for Health Informatics
Filippo Galgani
Adam Dunn
Margaret Williamson
Malcolm Gillies
Guy Tsafnat
General Practice EMRs
• Aim: measure quality of care for a range of conditions in a diverse
population using GP EMR data.
• Dataset: longitudinal data (2.5 million Australian patients) including
prescriptions, diagnoses, pathologies, referrals
• Patients’ journey: grouping patients by experience to detect relevant
patterns in the data over time.
Big Data Problems
• Data collected to keep patient history:
– Dealing with missing information
– Inconsistency
– Combination of short text fields (not coded) and numerical
values
• Doctors’ time constraints make data entry inaccurate
• Progress notes not available (privacy issue)
• Patients may visit other practices (thus missing information)
• Events happen irregularly
Continuity of care
Reasons for Prescription
[Chart: values 123571 and 162357; legend: Some Reason Given, Reason Missing]
1974 different reasons for PPI prescriptions
GORD (Gastro-oesophageal Reflux Disease) 50842
Reflux - gastro-oesophageal 13596
Reflux oesophagitis 6285
GOR (Gastro-oesophageal Reflux) 6047
Gastritis 5755
Gastro-oesophageal Reflux 4356
… …
Textual inconsistency: Natural Language Processing
• Normalization of case and punctuation: gord, GORD, gord; → gord
• Stopword Filtering: Gastro-oesophageal Reflux Disease → Gastro-oesophageal Reflux
• Spelling Correction: oesophygitis → oesophagitis
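The first three steps (normalization of case and punctuation, stopword filtering, spelling correction) might be sketched as follows in Python. The stopword list, the vocabulary, and the 0.8 similarity cutoff are assumptions made for illustration, not the authors' actual resources, and the standard library's difflib stands in for whatever spelling corrector was really used.

```python
import difflib
import string

# Hypothetical stopword list and vocabulary, chosen only for illustration.
STOPWORDS = {"disease", "the", "of"}
VOCABULARY = ["gord", "gastro-oesophageal", "reflux", "oesophagitis", "gastritis"]

# Strip all punctuation except the hyphen, which is part of "gastro-oesophageal".
PUNCTUATION = string.punctuation.replace("-", "")


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", PUNCTUATION))
    # Remove stray hyphens that stand alone or hang off a token.
    tokens = [token.strip("-") for token in cleaned.split()]
    return " ".join(token for token in tokens if token)


def remove_stopwords(text):
    """Drop tokens that appear in the (assumed) stopword list."""
    return " ".join(token for token in text.split() if token not in STOPWORDS)


def correct_spelling(text, cutoff=0.8):
    """Replace each token with its closest vocabulary entry, if close enough."""
    corrected = []
    for token in text.split():
        match = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else token)
    return " ".join(corrected)


def clean_reason(text):
    """Run the three steps in order on one free-text reason."""
    return correct_spelling(remove_stopwords(normalize(text)))
```

Under these assumptions, the variants "gord", "GORD" and "gord;" all reduce to the same string, and the misspelling "oesophygitis" is mapped to "oesophagitis".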
Textual inconsistency: Natural Language Processing
• Lemmatization: Oesophagitis ulcerative, Oesophagitis ulcerating → Oesophagitis ulcer
• Acronym Expansion: GORD, GORD (Gastro-oesophageal Reflux Disease) = Gastro-oesophageal Reflux Disease
• Synonyms: Reflux oesophagitis = Gastro-oesophageal Reflux
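Lemmatization, acronym expansion and synonym mapping can be approximated with simple lookup tables. The sketch below is only an illustration: the three tables are hypothetical and contain just the examples from the slides, whereas a real system would presumably rely on a proper lemmatizer and a clinical terminology.

```python
# Hypothetical lookup tables holding only the examples shown on the slides.
LEMMAS = {"ulcerative": "ulcer", "ulcerating": "ulcer"}
ACRONYMS = {
    "gord": "gastro-oesophageal reflux disease",
    "gor": "gastro-oesophageal reflux",
}
SYNONYMS = {"reflux oesophagitis": "gastro-oesophageal reflux"}


def lemmatize(text):
    """Map each inflected token to its base form when the table knows it."""
    return " ".join(LEMMAS.get(token, token) for token in text.split())


def expand_acronyms(text):
    """Replace each known acronym token with its expansion."""
    return " ".join(ACRONYMS.get(token, token) for token in text.split())


def map_synonyms(text):
    """Map a whole phrase to its preferred synonym when one is listed."""
    return SYNONYMS.get(text, text)


def canonicalize(text):
    """Apply lemmatization, acronym expansion, then synonym mapping."""
    return map_synonyms(expand_acronyms(lemmatize(text.lower())))
```

With these tables, "Oesophagitis ulcerative" and "Oesophagitis ulcerating" end up as the same string, which is the point of the lemmatization step.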
Reasons for Prescription (after the NLP pipeline)
GORD (Gastro-oesophageal Reflux Disease) 87217
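To make the effect of the pipeline concrete, the sketch below sums the six reason counts listed on the slide once each raw string has been mapped to a canonical label. The mapping table is an assumption that stands in for the pipeline's output, and the label names "GORD" and "Gastritis" are chosen here for illustration.

```python
from collections import Counter

# Raw counts exactly as listed on the slide (elided rows are not included).
RAW_COUNTS = {
    "GORD (Gastro-oesophageal Reflux Disease)": 50842,
    "Reflux - gastro-oesophageal": 13596,
    "Reflux oesophagitis": 6285,
    "GOR (Gastro-oesophageal Reflux)": 6047,
    "Gastritis": 5755,
    "Gastro-oesophageal Reflux": 4356,
}

# Assumed mapping from raw string to canonical label (hypothetical pipeline output).
CANONICAL = {
    "GORD (Gastro-oesophageal Reflux Disease)": "GORD",
    "Reflux - gastro-oesophageal": "GORD",
    "Reflux oesophagitis": "GORD",
    "GOR (Gastro-oesophageal Reflux)": "GORD",
    "Gastro-oesophageal Reflux": "GORD",
    "Gastritis": "Gastritis",
}


def aggregate(raw_counts, canonical):
    """Sum the counts of all raw reasons that share a canonical label."""
    totals = Counter()
    for reason, count in raw_counts.items():
        totals[canonical.get(reason, reason)] += count
    return totals


totals = aggregate(RAW_COUNTS, CANONICAL)
```

Using only the six listed rows, the GORD group sums to 81126. The slide reports 87217, presumably because further variants among the elided rows are also merged into GORD.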
Reasons for Prescription
?
Missing Information: Machine Learning Approach
Random set of PPI patients annotated by experts with respect to GORD
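The slides do not say which learning algorithm was used. As one possible illustration, the sketch below trains a small Bernoulli naive Bayes classifier on expert-annotated patients and uses it to predict the GORD label where the reason is missing. The feature names and the four training examples are invented toy data, not taken from the study.

```python
import math
from collections import defaultdict


def train(examples):
    """Count labels and per-label feature occurrences.

    examples: list of (feature_set, label) pairs, where label is True for GORD.
    """
    label_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    vocabulary = set()
    for features, label in examples:
        label_counts[label] += 1
        for feature in features:
            feature_counts[label][feature] += 1
            vocabulary.add(feature)
    return label_counts, feature_counts, vocabulary


def predict(model, features):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        for feature in vocabulary:
            p = (feature_counts[label][feature] + 1) / (n + 2)
            score += math.log(p if feature in features else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy annotations: (hypothetical patient features, expert GORD label).
ANNOTATED = [
    ({"long_term_ppi", "antacid_history"}, True),
    ({"long_term_ppi", "endoscopy_referral"}, True),
    ({"nsaid_coprescribed"}, False),
    ({"nsaid_coprescribed", "short_course_ppi"}, False),
]

model = train(ANNOTATED)
```

Calling `predict(model, {"long_term_ppi"})` then returns the label the model considers most likely for a patient whose prescription reason was not recorded.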
Grouping Patients by Journey
Conclusion
• Data mining on GP EMRs is challenging due to the
noisy, messy and sparse nature of the data
• Analyzing journeys is possible, but it requires:
– Temporal reasoning (infer missing events)
– Natural Language Processing (resolve textual
inconsistencies)
– Machine Learning (predict missing information)
– Domain knowledge (for modeling)
Acknowledgment
• This research was funded by the Australian Department of Health
and Ageing through NPS MedicineWise as part of the
MedicineInsight Program.
• I wish to express my gratitude to:
Malcolm Gillies and Margaret Williamson from NPS
Adam Dunn and Guy Tsafnat from UNSW
• Thank you for your attention