
Reuse and Sharing of Electronic Health Record Data

with a focus on Primary Care and Disease Coding

Annet Sollie

Reuse and Sharing of Electronic Health Record Data

PhD thesis, with a summary in Dutch.

No part of this thesis may be reproduced without prior permission of the author.

ISBN: 978-94-6182-757-9

Author: J.W. (Annet) Sollie, copyright © 2016

Cover: Janneke Laarakkers, PlanPuur

Lay-out & Print: Off Page, Amsterdam

VRIJE UNIVERSITEIT

Reuse and Sharing of Electronic Health Record Data

with a focus on Primary Care and Disease Coding

ACADEMIC DISSERTATION

to obtain the degree of Doctor at

the Vrije Universiteit Amsterdam,

by authority of the rector magnificus

prof.dr. V. Subramaniam,

to be defended in public

before the doctoral committee

of the Faculty of Medicine

on Friday 27 January 2017 at 11.45 hrs

in the auditorium of the university,

De Boelelaan 1105

by

Johanna Wilhelmina Sollie

born in Zwolle

supervisors:

prof.dr. M.E. Numans

prof.dr. R.H. Sijmons

co-supervisor:

dr. C.W. Helsper

thesis committee:

prof. dr. H.E. van der Horst, VU Medisch Centrum, Amsterdam

prof. dr. R.A.M.J. Damoiseaux, Universitair Medisch Centrum, Utrecht

prof. dr. M.C. Cornel, VU Medisch Centrum, Amsterdam

prof. dr. S. Brinkkemper, Universiteit Utrecht

dr.ir. R. Cornet, Academisch Medisch Centrum Amsterdam

The printing of this thesis was financially supported by SBOH, employer of GP-trainees in the Netherlands

For my children:

Kristel

Nathalie

Mike

Marilyn

Contents

Chapter 1 General Introduction, aims and outline of the thesis

Quality of Data – Literature Review

Chapter 2 Quality of Data in the Primary Care Electronic Health Record (EHR) System (Published: Huisarts & Wetenschap August 2013)

Quality and Reusability of Primary Care EHR Data – Hands-on identification of bottlenecks

Chapter 3 Reusability of coded data in the primary care Electronic Medical Record: a dynamic cohort study concerning cancer diagnoses (Published: Int. J. Med. Inf. August 2016)

Chapter 4 Do GPs know their cancer patients? Assessing the quality of cancer registration in Dutch Primary Care: a cross sectional validation study (Published: BMJ Open September 2016)

Chapter 5 Primary Care management of women with breast cancer related concerns - a dynamic cohort study using a Network Database (Published: Eur. J. Cancer Care June 2016)

Strategies & Solutions for improving data quality and enabling data re-use and sharing

Chapter 6 A new coding system for metabolic disorders demonstrates gaps in the international disease classifications ICD-10 and SNOMED-CT which can be barriers to genotype-phenotype data sharing (Published: Human Mutation July 2013)

Chapter 7 SORTA: a System for Ontology-based Recoding and Technical Annotation of biomedical phenotype data (Published: Database, the journal of biological databases and curation September 2015)

Chapter 8 Proposed roadmap to stepwise integration of genetics in family medicine and clinical research (Published: Clinical & Translational Medicine, February 2013)

Summary

Chapter 9 Summarizing Discussion

Chapter 10 Summary in Dutch, Acknowledgements, Curriculum Vitae

Chapter 1

General Introduction, aims and outline of the thesis

General Introduction

To introduce the field of study of this thesis, the following case scenario is presented:

Case Scenario

January 2016: a 48-year-old male consults his General Practitioner (GP) for a persistent

mucus-producing cough. He is worried because he has been coughing for more than

8 weeks now and used to be a heavy smoker until he stopped 5 years ago. The cough is

interfering with his sporting activities and his sleep. The patient has no fever, no hoarseness

and does not cough up blood but did lose some weight over the last few months. The

GP performs a physical examination that turns out to be normal. According to the Dutch

College of General Practitioners’ guideline “Acute Cough” the GP decides to send the

patient to the hospital for a chest X-ray.

In her* Electronic Health Record (EHR) system the GP registers the consultation according to

the Subjective Objective Analysis Plan (SOAP)**[1] primary care registration structure as follows:


Journal

Date: 15-01-2016

S: Cough > 8 weeks, productive, no reported fever or haemoptysis. Weight loss of 3 kg, insomnia, and exercise-induced dyspnoea. Previous smoker (25 pack-years, stopped > 5 years).

O: Does not appear ill or in respiratory distress. Respiratory rate 16/min. Chest examination: normal vesicular breath sounds, no added sounds. Hoarseness -. No palpable lymph nodes, throat: normal.

A: Cough DD lung cancer? (ICPC: R84#)

P: CXR and appt. 1 week

The GP also creates a new Episode in the registry [2]. Episodes are used to cluster

consultations that belong together and, as such, provide an overview of diagnoses. Based

on the text in the analysis-line, the system suggests, among other codes, an ICPC-1

(International Classification of Primary Care) [3,4] code R84# (malignancy bronchus/lung)

for this consultation. She accepts this suggestion since no code or data-field is available

to register suspected cancer. Thinking the diagnosis of lung cancer is highly likely in this

patient, the GP uses the code for that disorder to make sure her opinion is recorded

prominently in the patient’s EHR.


Episodes

Start date: 15-01-2016

Description: Cough DD Lung cancer?

ICPC: R84


The chest X-ray turns out to be normal and the GP shares this news with her patient in

a consultation by telephone. That day work is hectic and she forgets to remove the lung

cancer code from the EHR episode and to replace it with the appropriate code for this

consultation: the symptom code for “Cough (ICPC-1 code R05)”. The patient, reassured

and relieved, recovers completely in the following weeks.

However, on a sunny Saturday morning early March, the patient visits an out-of-hours

clinic because he sprained his ankle during the first football match of the season. He is

frightened out of his senses when the GP on duty asks him attentively if he has already started

his treatment for lung cancer.

A month later, his anonymized EHR record is sent to a large research database at

the nearby University where investigators are working on various projects such as early

detection of cancer. In this database he is now registered as a lung cancer patient and

researchers assess his consultations, lab results and radiology investigations in the 2 years

prior to the diagnosis. Little does the patient know that in the coming December he will also be counted as a “cancer case” in the assessment of the GP’s workload by an insurance company, with the aim of determining financial reimbursement and calculating a quality indicator to assess the quality of cancer care provided by the GP.

Obviously, the GP did not make a mistake in her diagnostic workup with this patient. However, she did make a mistake during her ICPC coding routine in the first consultation and again during the second consultation. The first mistake is that she accepted a “diagnosis” code for a hypothesis; the second is that she did not replace the code of the hypothesis in the Episode with a symptom code [2]. Unfortunately, both easy-to-make mistakes are likely to be common, because of time pressure in daily Primary Care practice (10 minutes per consultation, including expanding registration constraints) and because EHR system design and interfaces have not yet fully evolved. These error-prone coding routines do not disturb the medical process in every-day practice. However, problems do arise when medical files containing these errors are used by someone else in the setting of clinical care or when they are reused for other purposes such as quality management or research.
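The two coding mistakes in the case scenario can be illustrated with a small automated check. This is a minimal sketch only, not a method used in this thesis. The field names, the `needs_review` helper and the uncertainty heuristic (a question mark or “DD” in the Analysis text) are hypothetical. The rule that ICPC rubrics numbered 70 to 99 denote diagnoses and 01 to 29 denote symptoms is stated here as an assumption about the ICPC structure.

```python
from dataclasses import dataclass


@dataclass
class JournalEntry:
    """One SOAP journal entry; field names are illustrative, not an EHR schema."""
    date: str
    subjective: str
    objective: str
    analysis: str
    plan: str
    icpc: str  # ICPC-1 code attached to the Analysis line, e.g. "R84"


def is_diagnosis_code(icpc: str) -> bool:
    # Assumption: rubrics 70-99 are diagnoses, rubrics 01-29 are symptoms.
    return 70 <= int(icpc[1:3]) <= 99


def looks_like_hypothesis(analysis: str) -> bool:
    # Hypothetical heuristic: a question mark or "DD" (differential diagnosis)
    # in the free text signals that the GP is still uncertain.
    return "?" in analysis or "DD" in analysis.upper().split()


def needs_review(entry: JournalEntry) -> bool:
    """Flag entries where a diagnosis code is attached to an uncertain analysis."""
    return is_diagnosis_code(entry.icpc) and looks_like_hypothesis(entry.analysis)


first_consultation = JournalEntry(
    date="15-01-2016",
    subjective="Cough > 8 weeks, productive",
    objective="Chest examination normal",
    analysis="Cough DD lung cancer?",
    plan="CXR and appt. 1 week",
    icpc="R84",
)
corrected = JournalEntry(
    date="15-01-2016",
    subjective="Cough > 8 weeks, productive",
    objective="Chest examination normal",
    analysis="Cough",
    plan="CXR and appt. 1 week",
    icpc="R05",
)

print(needs_review(first_consultation))  # True
print(needs_review(corrected))           # False
```

A check of this kind could only prompt the GP to reconsider a code; whether a diagnosis code is justified remains a clinical judgement.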

Reuse of EHR data

Computer scientists and medical informaticists used to shudder at the thought of reusing EHR data for other purposes [5]. In the early nineties, van der Lei warned against the reuse of routine care EHR data for other purposes such as research by launching his first law of informatics [6]: “data shall only be used for the purpose for which they were collected”. In his opinion, medical data is recorded for a specific purpose and that purpose influences what data is recorded and how, thereby limiting its quality and usability for other purposes [7]. This is illustrated in the case scenario described above: a registration that makes perfect sense to GPs in everyday practice can be completely misinterpreted by re-users of the data.


Today (2016), however, reuse of EHR data for purposes other than its traditional role of recording and supporting the healthcare process is becoming commonplace. Anonymized Primary Care EHR records are already being re-used as a data source for quality assessment and for producing indicators of care for insurance companies [8,9], for pro-active indicated prevention in specified risk groups [10], and for research [11,12].

EHR data is potentially valuable to improve care, for instance to identify and subsequently monitor patients within practice populations who have an increased risk of disease, such as cardiovascular events, cancer, cognitive decline and psychiatric disorders [13,14]. These patients can be included in pro-active indicated prevention projects, which are thought to be promising tools in managing the ever-increasing workload of family physicians. Although the results of studies on the effectiveness of indicated prevention are not as promising as some would have expected [15,16], there are examples of these projects running successfully, such as those for the frail elderly with multi-morbidity or complex care needs [17,18].

Policy makers have discovered the possibilities of EHR data collected during routine

care to calculate quality indicators, which are increasingly often being used as a

supposedly accurate basis to assess quality of care [8,9]. For Dutch primary care alone,

over one hundred quality indicators have already been established today and more are

being developed. Because manual assessment of these indicators is a time-consuming

burden for healthcare professionals, policy makers aim for automatic calculation based

on extracted routine care data[9].

There are also substantial benefits of reusing EHR data for research purposes. In

the Netherlands, as in the UK, the US and many other western countries, GPs have

been registering data on their patients electronically for more than 20 years and are

sharing their anonymized patient data with practice-based research network (PBRN)

databases [12,19,20]. Retrospective research possibilities using this voluminous “data

goldmine” that comprises many years of follow-up seem endless. As no patient or

data recruitment is necessary, scientific discoveries can be accelerated while cutting

costs at the same time. Furthermore, studies that would otherwise be difficult

to perform, for instance in the field of rare diseases, or concerning health events

that would otherwise be difficult to capture, could be accomplished with routinely

collected EHR data [21].

Besides reuse of data for purposes of pro-active indicated prevention, for assessment

of quality indicators and for research, EHR data is also increasingly being shared for

the purpose of every-day care. In the Netherlands, data sharing through the National

Switch Point (LSP) project between GPs and out-of-hours clinics or other medical

professionals is being expanded further, despite delays and patient participation

numbers lagging behind.


The problem

Although the potential for re-use of EHR data is huge and examples of re-use in health care, policy making and research already exist, the concerns that prompted Van der Lei to state his first law of informatics many years ago are still valid. Indeed, studies into the reusability of hospital clinical data for research demonstrate that data is often incomplete, incorrect and inaccurate, including episode list errors and inaccurate diagnostic codes [22,23]. In contrast, little is known about EHR data quality and its suitability for reuse and sharing within the domain of primary care. The scarce literature in this field reports on only a few aspects of data quality, mainly ‘completeness’ and ‘correctness’ (table 1), and gives rise to concerns. There are signals that data quality is suboptimal [24–26], but information on the extent of the problem, on relevant determinants of data quality, and on the causes and consequences of suboptimal quality is currently missing. This prompts the question: are we in fact reusing and sharing a data goldmine or quite the opposite?

Table 1. Dimensions of data quality^

Dimension: Description

Completeness: A characteristic of Information Quality measuring the degree to which all required data is known

Correctness: Conforming to an approved or conventional standard, conforming to or agreeing with fact, logic or known truth

Concordance (or: consistency): The condition of adhering together, the ability to be asserted together without contradiction

Plausibility (or: accuracy to surrogate source): A measure of the degree to which data agrees with an original, acknowledged authoritative source of data (in this context: general medical knowledge)

Currency: The quality or state of information of being up-to-date and not outdated

^ Definitions from the IAIDQ (International Association for Information and Data Quality, http://iaidq.org/main/glossary.shtml)
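As a minimal sketch of how two of these dimensions could be quantified when an external reference is available, consider the example below. The operationalization is our assumption and is not part of the IAIDQ definitions: completeness is taken as the share of reference cases that are also coded in the EHR, and correctness as the share of EHR-coded cases that the reference confirms. The patient identifiers and function names are hypothetical.

```python
def completeness(ehr_cases: set, reference_cases: set) -> float:
    """Share of reference cases that are also coded in the EHR (assumed definition)."""
    if not reference_cases:
        raise ValueError("reference set is empty")
    return len(ehr_cases & reference_cases) / len(reference_cases)


def correctness(ehr_cases: set, reference_cases: set) -> float:
    """Share of EHR-coded cases confirmed by the reference (assumed definition)."""
    if not ehr_cases:
        raise ValueError("EHR set is empty")
    return len(ehr_cases & reference_cases) / len(ehr_cases)


# Hypothetical pseudonymized patient identifiers, for illustration only.
ehr = {"p01", "p02", "p03", "p04"}               # coded as cancer in the EHR
reference = {"p02", "p03", "p04", "p05", "p06"}  # cancer cases in the reference

print(completeness(ehr, reference))  # 0.6  (3 of 5 reference cases found)
print(correctness(ehr, reference))   # 0.75 (3 of 4 EHR cases confirmed)
```

In this toy example, patient p01 would be a falsely coded case like the one in the case scenario, while p05 and p06 would be cancer patients missing from the coded EHR data.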

Aims and outline of this thesis

In summary, EHR data is increasingly being shared and reused and we expect this trend

to persist and expand in the coming years. Currently, we do not know enough about the

quality and resulting reusability of this data for other purposes, and not enough about the

suitability for data sharing as a means to improve every-day care. Therefore, there is a need

to quantify and explore this problem within Primary Care and this is the first aim of this thesis.

We decided to explore a number of dimensions of data quality in Primary Care using various

approaches, including literature study. We focused on diagnosis registry as a central item in the EHR, with special attention to disease coding, since re-users rely heavily on coded data. We also wanted to identify bottlenecks in reuse hands-on, that is, by actually doing research with EHR data, not only coded but also free-text. The choice of cancer as the disease under study in the first part of this thesis is not accidental: a reference standard is available through the Netherlands Cancer Registry [27]. Furthermore, patient care for cancer survivors is partly being transferred to Primary Care, which means reliable data should be available there.

In summary, in the first part of this thesis (figure 1) we aim to:

1. Assess data quality in the Primary Care EHR by

- Chapter 2: Studying literature to provide an overview of current knowledge on data quality in Primary Care;

- Chapter 3: Studying the quality of coded cancer diagnosis registration (numerical comparison on population level with external reference for three common cancer types);

- Chapter 4: Assessing diagnosis registry (completeness and correctness of cancer registration through record linkage to an external reference for four common cancers);

- Chapter 5: Doing research using coded as well as free-text data concerning GP management of women with breast cancer related concerns.

In the second part of this thesis we try to contribute to solutions in order to improve data quality and enable reuse and sharing of EHR data. We decided to broaden our horizons by working with rare diseases, as opposed to the common diseases (cancer) of the first part of this thesis, and by seeking collaboration with medical specialists (hospital EHRs) and bio-informaticians. Because we found coding errors to be a major cause of suboptimal data quality, we decided to focus more on disease coding by actually developing a more complete coding system in the field of rare diseases and by participating in the development of a coding tool. In the study described in chapter 5 we discovered a lack of application of available genetics knowledge in Primary Care, partly caused by the current design and limitations of the EHR. Hence we decided to develop a practical roadmap on this subject, including items to improve EHR data quality in Primary Care. In summary, in the second part of this thesis (figure 1) we aim to:

2. Find strategies and solutions to improve quality of EHR data and to contribute to the enabling of reuse and sharing of EHR data.

- Chapter 6: Explore coding pathways by actually developing a disease coding system for (rare) metabolic diseases in cooperation with paediatricians;

- Chapter 7: Participate in the development of a tool for retrospective coding of text and mapping of existing coding systems in cooperation with bio-informaticians;

- Chapter 8: Propose practical improvements to the EHR and to coding systems by developing a “roadmap” to stepwise integration of genetics in family medicine and clinical research.

[Figure 1. Structure of the thesis “Reuse and sharing of Electronic Health Record Data – with a focus on Primary Care and disease coding”. Panels: Quality of Data, Literature Review (Chapter 2); Quality & Reusability of Data, hands-on identification of bottlenecks and areas for improvement (Chapters 3, 4 and 5); Strategies & Solutions to enable reuse & sharing of EHR data (Chapters 6, 7 and 8); Summarizing Discussion, translation of overall results into a checklist for reuse and data sharing and suggestions for improvement of data quality (Chapter 9).]

We conclude this thesis with a summarizing discussion including a translation of results

into a checklist for EHR data reuse and sharing in chapter 9.

Notes

* Whenever in this thesis the GP is referred to as female (“her” or “she”), the reader can replace this by “his” or “he” if so preferred.

** A “SOAP” journal entry consists of four data fields. The first is “Subjective” and is used to register in plain text what the patient describes, such as complaints and the reason for the encounter. The second is “Objective” and includes the GP’s findings: results from clinical examination and measurements, mostly in text format. The third, “Analysis”, is used to register the diagnosis or most important symptom and is coded using the International Classification of Primary Care version 1 (ICPC-1) 2009 coding system. The final field is “Plan”, comprising the GP’s medication prescriptions, referrals to medical specialists and follow-up appointments.

# Code for malignant neoplasm bronchus/lung, from the International Classification of Primary Care version 1 (ICPC-1) 2009 coding system, published by the WHO (http://www.rivm.nl/who-fic/cdromthesaurus/Pagerenglish.pdf)

References

1. Van der Zanden G. Quality Assessment of Medical Health Records using Information Extraction, Master’s thesis. 2010, Universiteit Twente, Enschede.

2. The Dutch College of General Practitioners. Guideline adequate EHR registry. Revised version 2013. Available at: https://www.nhg.org/themas/publicaties/richtlijn-adequate-dossiervorming-met-het-epd.

3. Bentsen BG. International classification of primary care. Scand J Prim Health Care 1986;4:43–50.

4. WHO. International Classification of Primary Care; ICPC-2. Available at: http://www.who.int.proxy.library.uu.nl/classifications/icd/adaptations/icpc2/en/ (last accessed December 2013).

5. Burnum JF. The Misinformation Era: The Fall of the Medical Record. Ann Intern Med 1989;110:482. doi:10.7326/0003-4819-110-6-482

6. van der Lei J. Use and abuse of computer-stored medical records. Methods Inf Med 1991;30:79–80. http://www.ncbi.nlm.nih.gov/pubmed/1857252 (accessed 23 Mar 2015).

7. van der Lei J. Information and communication technology in health care: do we need feedback? Int J Med Inform 2002;66:75–83. doi:10.1016/S1386-5056(02)00039-4

8. Blumenthal D. Meaningful use: an assessment. An interview with David Blumenthal, M.D., National Coordinator for Health Information Technology, Office of the National Coordinator. Interview by Mark Hagland. Healthc Inform 2011;28:40,44.

9. Dentler K, Numans ME, ten Teije A, et al. Formalization and computation of quality measures based on electronic medical records. J Am Med Inform Assoc 2014;21:285–91. doi:10.1136/amiajnl-2013-001921

10. Bleijenberg N, Drubbel I, Ten Dam VH, et al. Proactive and integrated primary care for frail older people: design and methodological challenges of the Utrecht primary care PROactive frailty intervention trial (U-PROFIT). BMC Geriatr 2012;12:16. doi:10.1186/1471-2318-12-16

11. Terry AL, Chevendra V, Thind A, et al. Using your electronic medical record for research: a primer for avoiding pitfalls. Fam Pract 2010;27:121–6. doi:10.1093/fampra/cmp068

12. Dolor RJ, Fagnan LJ. PBRN conference summary and update. Ann Fam Med 2014;12:176–7. doi:10.1370/afm.1631

13. Bodenheimer T. The future of primary care: transforming practice. N Engl J Med 2008;359:2086,2089. doi:10.1056/NEJMp0805631

14. Neuwirth EE, Schmittdiel JA, Tallman K, et al. Understanding panel management: a comparative study of an emerging approach to population care. Perm J 2007;11:12–20.

15. Kruis AL, Boland MRS, Assendelft WJJ, et al. Effectiveness of integrated disease management for primary care chronic obstructive pulmonary disease patients: results of cluster randomised trial. BMJ 2014;349:g5392. doi:10.1136/bmj.g5392

16. Caley M, Chohan P, Hooper J, et al. The impact of NHS Health Checks on the prevalence of disease in general practices: a controlled study. Br J Gen Pract 2014;64:e516–21. doi:10.3399/bjgp14X681013

17. Chen EH, Bodenheimer T. Improving Population Health Through Team-Based Panel Management. Arch Intern Med 2011;171:(2 pages). doi:10.1001/archinternmed.2011.395

18. Loo TS, Davis RB, Lipsitz LA, et al. Electronic Medical Record Reminders and Panel Management to Improve Primary Care of Elderly Patients. Arch Intern Med 2011;171:1552–8. doi:10.1001/archinternmed.2011.394

19. Carey IM, Cook DG, De Wilde S, et al. Implications of the problem orientated medical record (POMR) for research using electronic GP databases: a comparison of the Doctors Independent Network Database (DIN) and the General Practice Research Database (GPRD). BMC Fam Pract 2003;4:14. doi:10.1186/1471-2296-4-14

20. Peterson KA, Lipman PD, Lange CJ, et al. Supporting better science in primary care: a description of practice-based research networks (PBRNs) in 2011. J Am Board Fam Med 2012;25:565–71. doi:10.3122/jabfm.2012.05.120100

21. Muller S. Electronic medical records: the way forward for primary care research? Fam Pract 2014;31:127–9. doi:10.1093/fampra/cmu009

22. Coorevits P, Sundgren M, Klein GO, et al. Electronic health records: new opportunities for clinical research. J Intern Med 2013;274:547–60. doi:10.1111/joim.12119

23. Danciu I, Cowan JD, Basford M, et al. Secondary use of clinical data: the Vanderbilt approach. J Biomed Inform 2014;52:28–35. doi:10.1016/j.jbi.2014.02.003

24. Boggon R, van Staa TP, Chapman M, et al. Cancer recording and mortality in the General Practice Research Database and linked cancer registries. Pharmacoepidemiol Drug Saf 2013;22:168–75. doi:10.1002/pds.3374

25. Nielen MMJ, Ursum J, Schellevis FG, et al. The validity of the diagnosis of inflammatory arthritis in a large population-based primary care database. BMC Fam Pract 2013;14:79. doi:10.1186/1471-2296-14-79

26. Hammad TA, Margulis AV, Ding Y, et al. Determining the predictive value of Read codes to identify congenital cardiac malformations in the UK Clinical Practice Research Datalink. Pharmacoepidemiol Drug Saf 2013;22:1233–8. doi:10.1002/pds.3511

27. Netherlands CCCT. The Netherlands Cancer Registry. http://cijfersoverkanker.nl (accessed 20 Mar 2012).

Quality of Data – Literature Review

Chapter 2

Quality of Data in the Primary Care Electronic Health Record (EHR) System

Annet Sollie

Published (as a shortened version) in Huisarts & Wetenschap, August 2013: Annet Sollie. Hoe is de kwaliteit van data in het HIS? Huisarts & Wetenschap. 2013 Augustus. nr 8:403-403

Introduction

In the light of increasing possibilities for re-use and sharing of data we investigated various

quality aspects of routine healthcare data from Primary Care. To explore this, we carried

out a literature study, as described in this chapter.

What is the question from General Practice?

Most General Practitioners (GPs) in the Netherlands have been registering the routine care data of their enlisted patients in an Electronic Health Record (EHR) system for many years now. However, the use of advanced digital techniques to improve the reuse of this data is still limited, whether for indicated prevention projects (projects aimed at preventing the onset of a disease in an individual with an increased risk for that disease), for harvesting indicators to monitor quality of care, or for research purposes. Our research question is: what is known about the quality of data in the Primary Care EHR?

What is the current policy?

Routine care data are not yet (2012) being used extensively for purposes other than daily care. Besides reasons such as suboptimal adherence to registration guidelines and presumed privacy violations, this is mainly explained by insufficient insight into the quality, and thus the reusability, of these data.

WHAT IS THE RELEVANCE OF THIS ISSUE?

It is important for GPs themselves as well as for other potential re-users to know what

the quality of the data in their EHR is, since we expect routine care data to be reused

more and more for purposes of:

• Indicated prevention: better care at lower cost, for example by actively detecting patients in the primary care EHR with an increased risk of certain diseases, preferably followed by structured proactive care programs aiming at the identification and restoration of care gaps;

• Quality management: increasing use of quality indicators harvested from EHR systems, thus fully depending on data entered during routine care by GPs for providing the correct information;

• Research: reuse of existing, readily available routine care data would enable researchers to avoid building separate databases, provided that advanced coding and anonymization tools are installed;

• Data sharing: automated, not anonymized, digital sharing of data for supporting remote healthcare, for instance through a National EHR.


WHAT DO WE KNOW FROM EXISTING LITERATURE?

Within the field of information technology, “data quality” is assessed based on a number of dimensions, on which unfortunately there is no full agreement. The dimensions mentioned in the literature are completeness, correctness, currency, plausibility and concordance; together they determine whether data are appropriate for a specific use. In this context, completeness and correctness are the most commonly used dimensions[1].

We searched the existing (Dutch and English) literature from 2008 to 2012 for publications addressing the quality of EHR data in the context of primary care (Figure 1) and found eight articles, most of them focusing on completeness and, to a lesser degree, on correctness. We studied the references of these eight studies but did not find any additional publications.

The available literature shows that quality of data varies among GP practices

and among data categories (demographics, vital signs, laboratory, risk, prescribing

information, allergies/intolerances and diagnoses).

Kristianson et al[2] extracted data on 776 diabetes patients from a Swedish EHR and found that demographic data (date of birth and gender) were recorded completely and correctly for most patients (94-100%). They also found that data on medication prescriptions can be trusted and are useful because of the ATC coding system in use, although details such as correct dosages are frequently missing. Information on vital signs (blood pressure, pulse and respiratory rate), weight and Body Mass Index (BMI) was registered incompletely and inconsistently, mostly because of free-text registration. With the exception of data on diagnoses and drug therapy (medication prescription), few data could be used without extensive data management.

Figure 1. Search strings, results and criteria for inclusion and exclusion

Dutch literature (NTVG and H&W), full text, 2008 through 2012
  Search string 1: Kwaliteit AND huisartsinformatiesysteem OR HIS OR huisarts informatie systeem OR huisarts
    Result (search / selected): NTVG 22/0, H&W 7/0
  Search string 2: routine zorg data OR routine zorg gegevens
    Result (search / selected): NTVG 42/0, H&W 77/6
  After removing duplicates & reading: 3

English literature (Pubmed), title & abstract, 2008 through 2012
  Search string 1: (data AND quality) AND ((electronic AND record) OR ehr OR emr) AND (general practice OR gp)
    Result (search / selected): Pubmed 25/8
  Search string 2: (routine AND care) AND (data AND quality) AND (general practice OR gp)
    Result (search / selected): Pubmed 22/4
  After removing duplicates & reading: 5

Inclusion criteria           Exclusion criteria
Research article             Opinion
About quality of data        About quality of care
About routine care data      Data collected for research / questionnaires
Setting: primary care        Setting: hospital / other


Fokkens et al[3] extracted the data of 196 diabetes patients from Dutch EHRs and compared these with data registered in the software application “Diabcare”, a structured registration program used in many practices for diabetes care in addition to the EHR. They found that the registration of blood pressure scored quite high on the dimension of “completeness”. This also applied to results of laboratory investigations. However, data on risk factors (smoking and weight) and results of eye and foot examinations were registered less frequently in the EHR system than in the Diabcare system.

Australian researchers[4] showed 33 patients their own medical file and asked them to check it for completeness and correctness. Demographic data and data on allergies were recorded completely and correctly for 94% and 61% of the patients, respectively. However, 35% of patients found that relevant information was missing from their EHR and 51% found erroneous information in their EHR.

Two Dutch studies[5,6] and one Swiss study[7] that focused on diagnosis coding in the EHR concluded that the quality of diagnosis coding was “encouraging” but could improve and varied among practices. Visscher et al[6] studied ICPC coding of diagnoses in 311 general practices during 2011/2012 and showed that for 86% of the consultations the General Practitioner (GP) assigned a meaningful* ICPC code. Akkerman et al[5] found that the annual incidences and prevalences per 1,000 person-years for several ICPC-coded diseases in the EHR registries of 153 GPs from the regions of Utrecht and Almere (198,000 patients in total) were comparable with incidences and prevalences from the Second Dutch National Survey of General Practice (2-DNSGP), which contains EHR data validated with patient health interviews. The Swiss researchers[7] organized a “hot-coding” week among 24 GPs in 2010 and showed that the mean number of ICPC codes per patient visit rose significantly (from 1.31 to 1.52). This implies that the number of diagnoses per consultation currently observed is an underestimate, suggesting room for improvement.

Another Dutch study[8], by Jabaaij et al (112,000 patient EHRs from 32 practices), shows that 57-99% of Episodes of care in the EHR, which are used to group consultations, are ICPC-coded. Consultations are linked to an Episode in 62-100% of cases and prescriptions are linked to an Episode in 33-99% of cases. Again, this study shows remarkable variability between GP practices in the use of codes and in completeness.

CONCLUSIONS

The currently available literature on data quality in the primary care EHR is relatively scarce. However, it shows that registration of demographic data and laboratory test results is nearly complete and correct, and that registration of medication prescriptions is correct but not always complete and up-to-date (current). Registration of diagnoses using ICPC codes is fairly good but not complete, while the registration of vital parameters, allergies/intolerances, weight, Body Mass Index (BMI) and risk factors is unsatisfactory.

To summarize, studies have focused mainly on assessing the completeness of data in the EHR, and conclude that for demographic data and coded data (medication prescriptions, laboratory results and diagnoses) this completeness is ‘fairly’ good. Little is known about other dimensions of data quality in the primary care EHR or about the possibilities and impossibilities of extraction. There is certainly room for improvement.

What is the most important research question?

Is the quality of primary care routine care data of an acceptable standard in order to make reuse of data for purposes of health and research possible?

* Meaningful ICPC codes: codes in the range 01-29 (complaints), 70-99 (diagnoses) + A44

(vaccination), R44 (influenza vaccine) and X37 (cervical smear). Codes A79 (no disease) and

A99 (generalized other / unspecified disease (s)) are not considered to be “meaningful”.
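The footnote's definition of a meaningful ICPC code can be expressed as a small check. The following is a minimal Python sketch and not the method used in the cited study; the function name `is_meaningful_icpc`, the regular expression, and the treatment of sub-codes such as `X76.1` (judged by their base code) are assumptions made for illustration.

```python
import re

# Assumed code shape: one chapter letter, two digits, optional sub-code (e.g. "X76.1").
_ICPC_PATTERN = re.compile(r"^([A-Z])(\d{2})(?:\.\d+)?$")

# Codes named explicitly in the footnote.
EXTRA_MEANINGFUL = {"A44", "R44", "X37"}
NOT_MEANINGFUL = {"A79", "A99"}


def is_meaningful_icpc(code):
    """Return True if the code counts as 'meaningful' under the footnote's definition."""
    match = _ICPC_PATTERN.match(code.strip().upper())
    if match is None:
        return False  # not a recognisable ICPC code
    base = match.group(1) + match.group(2)
    if base in NOT_MEANINGFUL:
        return False
    if base in EXTRA_MEANINGFUL:
        return True
    number = int(match.group(2))
    # 01-29 are complaints, 70-99 are diagnoses.
    return 1 <= number <= 29 or 70 <= number <= 99


def share_meaningful(codes):
    """Fraction of codes in a list that are meaningful (0.0 for an empty list)."""
    if not codes:
        return 0.0
    return sum(is_meaningful_icpc(c) for c in codes) / len(codes)
```

Applied to the codes of all consultations in a practice, `share_meaningful` would give a figure comparable in spirit to the 86% reported above, under the stated assumptions.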


REFERENCES

1. Weiskopf NG, Weng C. Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research. J Am Med Inform Assoc 2013;20:144–51. doi:10.1136/amiajnl-2011-000681

2. Kristianson KJ, Ljunggren H, Gustafsson LL. Data extraction from a semi-structured electronic medical record system for outpatients: a model to facilitate the access and use of data for quality control and research. Health Informatics J 2009;15:305–19. doi:10.1177/1460458209345889

3. Fokkens AS, Wiegersma PA, Reijneveld SA. A structured registration program can be validly used for quality assessment in general practice. BMC Health Serv Res 2009;9:241. doi:10.1186/1472-6963-9-241

4. Tse J, You W. How accurate is the electronic health record? - a pilot study evaluating information accuracy in a primary care setting. Stud Health Technol Inform 2011;168:158–64.

5. Akkerman A, Verheij T, Veen R, et al. Interactieve medische informatie van het Huisartsen Netwerk Utrecht en de Almere Zorggroep (‘Interactive medical information from the General Practitioners Network Utrecht and the Almere Group of Care’). Huisarts Wet 2008;51:90–5. doi:10.1007/BF03086658

6. Visscher S, ten Veen P, Verheij PR. Kwaliteit van ICPC-codering (‘Quality of ICPC Coding’). Huisarts Wet 2012;10:459.

7. Chmiel C, Bhend H, Senn O, et al. The FIRE project: a milestone for research in primary care in Switzerland. Swiss Med Wkly 2011;140:w13142. doi:10.4414/smw.2011.13142

8. Jabaaij L, Njoo K, Visscher S, Van den Hoogen H, Tiersma W, Levelink H et al. Verbeter uw verslaglegging, gebruik de EPD-scan-h. Huisarts Wet 2009;52:240–6.

Quality and Reusability of Primary Care EHR Data – Hands-on identification of bottlenecks

CHAPTER 3

Reusability of coded data in the primary care Electronic Medical Record: a dynamic cohort study concerning cancer diagnoses

Annet Sollie, General Practitioner-in-Training/PhD Fellow; Rolf H. Sijmons, Clinical Geneticist, Professor of Medical Translational Genetics; Charles Helsper, MD, PhD, Epidemiologist; Mattijs E. Numans, General Practitioner, Professor of Primary Care

Published August 2016 in the International Journal of Medical Informatics as: Annet Sollie, Rolf H. Sijmons, Charles Helsper, Mattijs E. Numans. Reusability of coded data in the primary care Electronic Medical Record: a dynamic cohort study concerning cancer diagnoses.


ABSTRACT

Objectives

To assess quality and reusability of coded cancer diagnoses in routine primary care data. To identify factors that influence data quality and areas for improvement.

Methods

A dynamic cohort study in a Dutch network database containing 250,000 anonymized electronic medical records (EMRs) from 52 general practices was performed. Coded data from 2000 to 2011 for the three most common cancer types (breast, colon and prostate cancer) were compared to the Netherlands Cancer Registry.

Measurements

Data quality is expressed in Standard Incidence Ratios (SIRs): the ratio between the number of coded cases observed in the primary care network database and the expected number of cases based on the Netherlands Cancer Registry. Ratios are expressed as percentages for readability.

Results

The overall SIR was 91.5% (95%CI 88.5 – 94.5) and showed improvement over the years. SIRs differ between cancer types: from 71.5% for colon cancer in males to 103.9% for breast cancer. There are differences in data quality (SIRs 76.2% – 99.7%) depending on the EMR system used, with SIRs up to 232.9% for breast cancer. Frequently observed errors in routine healthcare data can be classified as: lack of integrity checks, inaccurate use and/or lack of codes, and lack of EMR system functionality.

Conclusions

Re-users of coded routine primary care Electronic Medical Record data should be aware that 30% of cancer cases can be missed. Up to 130% of cancer cases found in the EMR data can be false-positive. The type of EMR system and the type of cancer influence the quality of coded diagnosis registry. While data quality can be improved (e.g. through improving system design and by training EMR system users), re-use should only be undertaken by appropriately trained experts.


INTRODUCTION

Reuse of electronic medical record (EMR) data is a hot topic, not only in hospitals[1,2] but

also in primary care[3]. An example is the international trend to calculate quality indicators

automatically based on data collected during routine care. For Dutch primary care alone,

over one hundred quality indicators are established and more are being developed[4].

Because manual assessment of these indicators is a time-consuming burden for healthcare

professionals, policy makers aim for automatic calculation based on extracted, mainly

coded, routine care data [5,6].

Furthermore, risk assessment for prevention projects, followed by structured panel

management procedures as well as chronic disease management to improve proactive

care[7–10], are becoming more and more popular. These are thought to be promising

tools in managing the increasing workload of family physicians, but again they rely

strongly on the analysis of routine care diagnostic data to identify patients who could

be included in preventive care and chronic disease management programmes, such as

the frail elderly[11] or cancer patients.

Also reuse of data for primary care research purposes such as early detection of

cancer is almost becoming commonplace, as is demonstrated by the rapidly evolving

practice-based research networks (PBRNs) in Europe, Canada and the USA. These

networks provide a basic facility for primary care research, and often use anonymized

data uploaded by participating practices to a central database[12–14].

It is important that primary care organizations regard their (routine care) data as a significant and valuable organizational asset. It is equally important that they realize that in the wrong hands (personnel without appropriate expertise and training in handling routine care data), data re-use can actually cause harm. In order to truly value routine

primary healthcare data and to re-use this data reliably, the data should represent the

true situation as closely as possible. Despite the examples of actual reuse mentioned

above, there are serious concerns about the quality and subsequent reusability of EMR

data in primary care[1,2,15–17].

In medical informatics, data quality is assessed using various “dimensions”[15,18].

Although there is no uniform accepted model or method to assess data quality in primary

care, Gray Weiskopf [15] summarized the five common dimensions of data quality based

on extensive literature research: Completeness, Correctness, Concordance, Plausibility

and Currency (figure 1).


Figure 1. Dimensions of data quality

Dimension       Description
Completeness    No missing data; a truth about a patient is present in the EHR
Correctness     An element that is present in the EHR is true
Concordance     There is agreement between elements in the EHR or between the EHR and other data sources
Plausibility    An element in the EHR makes sense in light of other knowledge about what that element is measuring or representing
Currency        An element in the EHR is a relevant representation of the patient state at a given point in time

Little has been published on the quality of data from primary care records. A few studies

(see Discussion) have assessed their completeness and, to a lesser extent, their correctness,

but information on other dimensions is lacking.

When assessing data quality, the coded registration of diagnoses has the highest priority, because diagnoses are central to most analyses. We focus on diagnoses of cancer because cancer is a high-impact diagnosis that we expect to be registered and coded correctly in the EMR for purposes of care. Furthermore, the national Netherlands Cancer Registry (NCR)[19] provides an accessible and supposedly reliable reference standard. With the available reference data and our data infrastructure, we could assess three dimensions of data quality for coded cancer diagnoses: completeness, correctness and concordance with the reference standard. To find focus points for improvement, we also identified factors that influence data quality.

METHODS

Design

We performed a dynamic cohort study in a Dutch network database containing 250,000 anonymized electronic medical records (EMRs) from 52 general practices. We used a four-step study approach, as described in Figure 2, to determine Standardized Incidence Rate Ratios (SIRs) between January 1st 2000 and December 31st 2011.

First, we determined our reference standard: the expected incidence rates based on

the Netherlands Cancer Registry (NCR) [19] and Statistics Netherlands[20].

Second, observed incidence rates of cancer in coded routine primary care data were

determined using the Julius General Practitioners’ Network (JGPN), for patients with

a diagnosis of breast-, prostate-, and/or colon cancer.


Figure 2. Flowchart Methods

Data sources: Statistics Netherlands; Dutch National Cancer Registry; study population in JGPN database

Step 1: Expected cases
  1a Download total population, male/female, 2000 - 2011
  1b Download number of breast, prostate and colon cancer cases in 2000 - 2011
  1c Calculate number of person years in JGPN database for 2000 - 2011
  1d Calculate expected number of cases for each cancer in study population
  Result: Expected cases

Step 2: Observed cases
  2a Select patients with cancer episode ICPC code
  2b Work-up data: manual correction of diagnosis and/or date of diagnosis
  2c Select patients without cancer episode but with cancer consultation ICPC code
  2d Select patients without cancer ICPC code but with cancer medication
  Results: Observed cases, corrected; Observed cases, corr & consult; Observed cases, corr & med

Step 3: Calculate SIR & 95%CI
  Result: SIRs and 95%CIs

Third, we calculated the SIRs for three four-year time slots between January 1st 2000

– December 31st 2011, for each EMR system, for the three cancer types and for overall

cancer diagnoses. Finally, we identified, listed and classified the common errors we

found in the extracted EMR data.

Patients & EMR Data: the JGPN database

The JGPN database[11,21] comprises anonymized routine care data, updated quarterly, from 52 GP practices (120 GPs, 250,000 patients) that share their data with the Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, the Netherlands. This population is considered representative of the Dutch population (table 1)[22]. GPs were not aware of this study at the time of registration, did not receive specific training on coding or registration in the EMR, and the data were not improved in any way. Hence the data can be considered regular routine care data.


Table 1. Representativeness of patients in GPs in the JGPN in 2013

Patients                                  JGPN              Netherlands
Total N                                   231,556           16,779,575
Male sex, n (% male of total)             110,973 (48%)     8,307,339 (50%)
% age < 20 of total (% male of group)     21% (51%)         23% (51%)
% age 20-65 of total (% male of group)    65% (48%)         60% (50%)
% age > 65 of total (% male of group)     14% (43%)         17% (45%)

The JGPN data include consultations and clustered disease episodes with ICPC-coded[23]

diagnoses, ATC (Anatomical Therapeutic Chemical)-coded prescribed medication, coded

laboratory test results, and for some patients coded referrals and letters from medical

specialists.

In the Netherlands GP medical encounters are registered according to the “SOAP-

system”[24]. A SOAP-journal consists of four data fields. The first is “Subjective” and

comprises patient complaints and the reason for the encounter, registered in free

text fields. The second is “Objective” and includes results from clinical examination

and measurements, also mostly in free text format. The third “Analysis” is used to

register the diagnosis or most important symptom and is coded using the International

Classification of Primary Care version 1 (ICPC-1) 2009 coding system[23]. The final

data field is “Plan”, comprising the GP’s medication prescriptions, referrals to medical

specialists and follow-up appointments. The list “Episodes”, also coded using ICPC-1

codes, clusters consultations concerning the same diagnosis for an individual patient.

This list provides an overview of active and non-active diagnoses and complaints with

corresponding start and/or end dates.
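The relation between SOAP entries and Episodes described above can be pictured as a simple data structure. This is only an illustrative Python sketch: the class and field names (`SoapEntry`, `Episode`, `analysis_icpc`, `start_year`, `end_year`) are hypothetical and do not reflect the actual data model of any Dutch EMR system, and treating an Episode without an end date as active is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SoapEntry:
    """One consultation, registered in the four SOAP fields."""
    subjective: str               # complaints and reason for encounter (free text)
    objective: str                # examination findings and measurements (mostly free text)
    analysis_icpc: Optional[str]  # ICPC-1 code of the diagnosis or main symptom
    plan: str                     # prescriptions, referrals, follow-up appointments


@dataclass
class Episode:
    """Clusters consultations about the same diagnosis for one patient."""
    icpc_code: str
    description: str
    start_year: Optional[int] = None
    end_year: Optional[int] = None
    consultations: List[SoapEntry] = field(default_factory=list)

    @property
    def is_active(self) -> bool:
        # Assumption: an episode without an end date is still active.
        return self.end_year is None


# Hypothetical example, loosely based on the codes used in this chapter.
episode = Episode(icpc_code="X76", description="Breast cancer", start_year=2002)
episode.consultations.append(
    SoapEntry(
        subjective="Lump in left breast",
        objective="Palpable mass",
        analysis_icpc="X76",
        plan="Referral to surgeon",
    )
)
```

A structure like this makes the later search steps concrete: a cancer code can appear on the Episode, on an individual consultation, or on neither.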

ICPC diagnosis codes are available for the more common types of cancer, including the three cancers under study. There are no codes available to specify recurrence of cancer, suspected cancer or treatments of cancer. The GP manually enters the ICPC code for the relevant cancer diagnosis during consultation or after receiving confirmation of the diagnosis from secondary care correspondence. The GP decides

when a new Episode is created for the cancer diagnosis and which consultations

are added to this Episode. According to the Dutch College of General Practitioners’

guideline for correct registration, every cancer diagnosis should be registered as an

episode in the EMR and consultations about relevant complaints or treatments should

be attached to it [25].


Reference data

All oncology-related information from hospitals, as triggered by the pathology report for newly found cancers, is sent to the NCR[19]. Specially trained staff members enter relevant data about these diagnoses in the registry. For most hospitals, cancer diagnoses reported in hospital patient discharge files for which no pathological investigation was performed are also included in the NCR as clinical diagnoses. The NCR claims to have been almost complete (>95% of all cancers) for the population of the Netherlands and without false-positive records since 1989. However, cancer cases diagnosed solely in primary care are not entered into the NCR. The registration delay at the NCR is reported to be 3 - 9 months after the pathologist has confirmed the cancer and is claimed to be decreasing. There is some evidence that the NCR data are complete and accurate[26,27].

Reference data on the size of the Dutch population were obtained from Statistics

Netherlands[20], which is responsible for collecting and processing data for the official

national statistics to be used by policymakers and scientific researchers.

Data collection and analysis

Step 1: Calculation of expected number of cases from reference data

The expected incidences in the study population were calculated using steps 1a to 1d,

as demonstrated in Figure 2. First, to determine the size of the reference population,

the total number of males and females between 20 – 90 years on January 1st of each

year from 2000 to 2011 were downloaded from Statistics Netherlands[20] in 5-year age

categories. The mean of two consecutive years was used to determine the size of the

population on July 1st of each year. Next, to determine the incidence rates for cancer, the

absolute national number of the three types of cancer patients under study and aged

20 – 90 years was obtained from the NCR[19] for each of the years 2000 – 2011, also

in 5-year age categories. We included invasive and non-invasive (in situ) breast cancer

(females only), colon cancer (separately for males and females) and prostate cancer

(males). Next, to determine the size of the study population, the number of person-years

in the JGPN database was calculated from the number of registered patients per age

category for each EMR system on July 1st of each year. Finally, the expected incidences

in our study population were calculated per category per EMR system per year using

the following formula:

(absolute number of cancer cases / population) * number of person-years in database.
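The calculation of Step 1 can be illustrated with a short Python sketch. It is not the authors' implementation (the analyses were done in a spreadsheet); the function names and the example numbers are hypothetical and serve only to show how the mid-year population and the formula above combine per stratum.

```python
def midyear_population(jan1_this_year, jan1_next_year):
    """Estimate the July 1st population as the mean of two consecutive January 1st counts."""
    return (jan1_this_year + jan1_next_year) / 2


def expected_cases(national_cases, national_population, person_years):
    """Apply the national incidence rate of one stratum to the person-years in the database."""
    if national_population <= 0:
        raise ValueError("national_population must be positive")
    return national_cases / national_population * person_years


def total_expected(strata):
    """Sum expected cases over strata (for example 5-year age categories)."""
    return sum(
        expected_cases(s["national_cases"], s["national_population"], s["person_years"])
        for s in strata
    )


# Hypothetical numbers for two age categories in one year (not taken from the thesis).
example_strata = [
    {
        "national_cases": 500,
        "national_population": midyear_population(990_000, 1_010_000),
        "person_years": 4_000,
    },
    {
        "national_cases": 1_200,
        "national_population": midyear_population(795_000, 805_000),
        "person_years": 2_000,
    },
]
print(round(total_expected(example_strata), 2))
```

Calculating per stratum and then summing keeps the expected number adjusted for the age and sex composition of the study population, which is the point of comparing against a standardized expectation.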

Step 2: Determining the observed number of cases: selecting patients from study population, data extraction/work-up

To obtain the observed incidences, we used a three-step search procedure to select all cases of breast, colon and prostate cancer from 2000 to 2011 from the JGPN


database (Figure 2). These were counted per EMR system, per 5-year age category and

per year. Practices were using one of the following EMR systems: Medicom, Promedico

or MicroHIS. There are more than 10 EMR systems available for general practices in the

Netherlands, but these three are the most frequently used in the country as well as in

the JGPN-affiliated practices. Because of low numbers of users and patients we decided

not to use data of additional practices in the database using other EMR systems. The

data models of all Dutch Primary Care EMR systems are based on the frequently updated

Reference Model issued by the Dutch College of General Practitioners[28] and they differ

mainly in user interface.

First (step 1), we extracted relevant data for all patients with one or more ICPC episode codes for breast (X76 and X76.1), colon (D75) and/or prostate cancer (Y77 and Y77.1). This included year of birth, gender and cancer episodes with their ICPC code and description. We continued by removing duplicate records and then manually checking the episode description entered by the GP along with the registered ICPC code and date. This manual check was performed by the first author, who is a researcher, ICT expert and GP, and took about 30 hours. If the information in the

episode description was sufficiently clear, corrections were made, mainly in the date

of diagnosis (for instance: Episode date 2011, description “Breast cancer diagnosed

and treated in 2002”, date of diagnosis was changed into 2002). If the ICPC code

clearly did not match the description (for instance code Breast Cancer X76, description

“mother has had breastcancer”), the record was excluded. We counted the number

of errors (corrections and exclusions) per EMR system and per time period. Patients

reported as having recurrent cancer (by the GP, in the text description) were counted

only at the occurrence of the primary cancer, in line with NCR prevalence reports.

Patients with “cancer in medical history”, with or without further specification, were

excluded because the original date of diagnosis was considered to occur before our

observation period.

After this procedure, we concentrated on finding missing episode-diagnoses (step 2)

by selecting all the patients without a coded cancer-related episode but who were

registered with one or more registered consultations with an ICPC code for breast- (X76,

X76.1), colon- (D75), and/or prostate (Y77, Y77.1) cancer. Extracted data included year

of birth, sex, as well as all other encounter SOAP registries, including other ICPC codes.

Finally (step 3), we checked all patients without an episode or consultation coded for a relevant cancer type but who were prescribed specific cancer drugs (figure 3) from 2000 – 2011. This procedure should theoretically result in finding all the patients

diagnosed with cancers in our observation period. Note that we performed the three

step search procedure for the total JGPN population but only present the data for adults

to compare with the reference population (age at diagnosis between 20-90) in the

results section.


Figure 3. List of drug prescriptions used in EMR search of cancer cases

Cancer type        Name of drug                            ATC code*
Breast cancer      Gosereline/zoladex                      L02AE03
                   Tamoxifen/nolvadex/istubal/valodex      L02BA01
                   Anastrozol/arimidex                     L02BG03
                   Letrozol/femara/letroman                L02BG04
                   Exemestaan/aromasin                     L02BG06
                   Trastuzumab/Herceptin/herclon           L01XC03
                   Bevacizumab/Avastin                     L01XC07
Colon cancer       Cetuximab/erbitux                       L01XC06
                   Panitumumab/vectibix                    L01XC08
                   Bevacizumab/Avastin                     L01XC07
Prostate cancer    Gosereline/zoladex                      L02AE03

*ATC = Anatomical Therapeutic Chemical coding system
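The three-step search procedure can be sketched in Python as follows. This is a simplified illustration, not the extraction code used in the study: the patient record layout (`episode_codes`, `consultation_codes`, `atc_codes`) and the function name `find_case` are hypothetical, the ICPC and ATC codes are those listed in the text and in figure 3, and the manual work-up of episode descriptions is not modelled.

```python
# ICPC codes per cancer type, as listed in the text.
CANCER_ICPC = {
    "breast": {"X76", "X76.1"},
    "colon": {"D75"},
    "prostate": {"Y77", "Y77.1"},
}

# ATC codes per cancer type, as listed in figure 3.
CANCER_ATC = {
    "breast": {"L02AE03", "L02BA01", "L02BG03", "L02BG04", "L02BG06", "L01XC03", "L01XC07"},
    "colon": {"L01XC06", "L01XC08", "L01XC07"},
    "prostate": {"L02AE03"},
}


def find_case(patient, cancer):
    """Return the search step that identifies the patient as a case, or None.

    Step 1: a coded cancer episode.
    Step 2: no such episode, but a consultation with a cancer ICPC code.
    Step 3: no cancer ICPC code at all, but a prescription for a listed drug.
    """
    icpc = CANCER_ICPC[cancer]
    if icpc & set(patient.get("episode_codes", [])):
        return "episode"
    if icpc & set(patient.get("consultation_codes", [])):
        return "consultation"
    if CANCER_ATC[cancer] & set(patient.get("atc_codes", [])):
        return "medication"
    return None


# Hypothetical patients.
patients = [
    {"episode_codes": ["X76"]},
    {"episode_codes": ["K86"], "consultation_codes": ["D75"]},
    {"atc_codes": ["L02AE03"]},
]
print([find_case(p, "breast") for p in patients])
```

Because some drugs are listed for more than one cancer type, a hit in the medication step does not by itself identify the cancer type, so such cases would still need manual review.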

Step 3: Calculation of standard incidence ratios

SIRs[29] were calculated as the ratio between the observed and expected number

of cancer cases, for our three cancer types combined and for the different types

separately, differentiated per four-year period and per EMR system. All analyses

were stratified by sex. Because the differences between the observed and expected

number of cases may be due to random fluctuations in disease occurrence, 95%

confidence intervals (CIs) were computed assuming Byar’s approximation[29]. If the

95%CI includes 100%, the difference between the observed and expected cases is

likely to have occurred by chance. All data analysis and calculations were carried out

using Microsoft Excel 2010.
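For readers who want to reproduce this step outside a spreadsheet, the following Python sketch computes an SIR with a 95% confidence interval using the commonly cited form of Byar's approximation. The function names are illustrative, the critical value z = 1.96 is assumed, and because it works from the rounded counts in table 2, its output may differ slightly in the last decimal from the published intervals.

```python
import math


def sir_with_byar_ci(observed, expected, z=1.96):
    """Return (SIR, lower, upper) as fractions, using Byar's approximation.

    The limits for the observed count are computed first and then divided
    by the expected count.
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must be positive")
    sir = observed / expected
    lower_count = observed * (
        1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
    ) ** 3
    o1 = observed + 1
    upper_count = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3
    return sir, lower_count / expected, upper_count / expected


def differs_from_expected(lower, upper):
    """True if the interval excludes 1.0 (that is, 100%)."""
    return not (lower <= 1.0 <= upper)


# Overall counts from table 2: 3594 observed and 3926 expected cases.
sir, lower, upper = sir_with_byar_ci(3594, 3926)
print(f"SIR {sir:.1%} (95%CI {lower:.1%} - {upper:.1%})")
```

With these counts the interval lies entirely below 100%, in line with the significant difference reported for the combined SIR.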

RESuLtS

The combined SIR for breast, colon, and prostate cancer between 2000 and 2011 was

91.5%, (95%CI 88.5 – 94.5). This means there is a significant difference between the

observed number of cases in the EMR and the expected number according to the NCR

(table 2).


Table 2. Quality of diagnosis registry in the EMR

                          Expected cases   Observed cases*   SIR**     95%CI of SIR***
                          n                n                 %         %
Overall cancer            3926             3594              91.5%     88.5 - 94.5

Follow-up period
  2000 - 2003 incl        1010             670               66.3%     61.3 - 71.3
  2004 - 2007 incl        1294             1239              95.7%     90.3 - 100.9
  2008 - 2011 incl        1623             1685              103.8%    98.8 - 108.6

EMR system
  Promedico               2132             2126              99.7%     95.4 - 103.9
  Medicom                 574              537               93.7%     85.4 - 101.1
  MicroHIS                1221             931               76.2%     71.3 - 81.0

Cancer type
  Breast Cancer Female    1685             1750              103.9%    98.9 - 108.5
  Colon Cancer Female     599              476               79.5%     72.2 - 86.3
  Colon Cancer Male       662              473               71.5%     65.0 - 77.8
  Prostate Cancer Male    981              895               91.2%     85.1 - 97.0

* Observed cases after work-up
** SIR is the Standard Incidence Ratio and is the ratio between observed and expected cases, expressed as a percentage; numbers in bold print are statistically significant
*** If the 95%CI includes 100.0, the difference between the observed and expected number of cases is not statistically significant

The SIRs varied over time: from 2000 to 2003 the combined SIR was 66.3% (95%CI 61.3 – 71.3), from 2004 to 2007 it was 95.7% (95%CI 90.3 – 100.9), and from 2008 to 2011 it was 103.8% (95%CI 98.8 – 108.6). For colon cancer in males the SIR was 71.5% (95%CI 65.0 – 77.8), while in females it was 79.5% (95%CI 72.2 – 86.3). The SIR for prostate cancer was 91.2% (95%CI 85.2 – 97.0) and for breast cancer 103.9% (95%CI 98.9 – 108.5). Furthermore, a statistically significant deviation from the expected number of cases was found in the combined SIR for the MicroHIS EMR system: 76.2% (95%CI 71.3 – 81.0). Almost every SIR for the Promedico system was over 100% in recent years, which means that more cases than expected were found in the EMR, possibly indicating false-positive diagnoses (S1 table 3).


S1 table 3. Quality of diagnosis registry stratified to cancer type and EMR system

Columns per row: follow-up period; expected cases (n); observed cases* (n); SIR** (%); 95%CI of SIR# (%); observed cases after Cons*** (n, % extra); SIR** (%); 95%CI of SIR# (%)

Breast Cancer female

Promedico

2000 – 2003 262.1 232 88.5% 76.8 – 99.2 449 (94%) 171.3% 154.1 – 185.1

2004 – 2007 292.7 350 119.6% 106.4 – 131.0 674 (93%) 230.2% 211.0 – 244.9

2008 – 2011 337.0 411 121.9% 109.5 – 132.7 785 (91%) 232.9% 214.9 – 246.8

Medicom

2000 – 2003 37.7 58 153.7% 108.3 – 180.2 59 (2%) 156.3% 110.4 – 182.9

2004 – 2007 80.8 97 120.0% 94.1 – 139.7 99 (2%) 122.5% 96.2 – 142.2

2008 – 2011 141.5 124 87.7% 71.6 – 101.6 291 (135%) 205.7% 178.6 – 224.5

MicroHIS

2000 – 2003 135.8 108 79.5% 64.1 – 93.3 110 (2%) 81.0% 65.4 – 94.9

2004 – 2007 174.1 150 86.2% 71.9 – 98.9 156 (4%) 89.6% 75.0 – 102.5

2008 – 2011 223.5 220 98.4% 84.9 – 110.4 232 (5%) 103.8% 89.9 – 116.0

Colon Cancer female

Promedico

2000 – 2003 92.9 41 44.1% 31.3 – 57.5 55 (34%) 59.2% 43.7 – 73.9

2004 – 2007 109.4 98 89.6% 71.1 – 105.4 98 (0%) 89.6% 71.1 – 105.4

2008 – 2011 129.4 159 122.9% 102.3 – 139.3 182 (15%) 140.7% 118.3 – 157.8

Medicom

2000 – 2003 10.4 6 57.6% 19.4 – 91.8 6 (0%) 57.6% 19.4 – 91.8

2004 – 2007 25.1 17 67.7% 36.8 – 94.0 18 (6%) 71.7% 39.4 – 98.2

2008 – 2011 45.2 33 72.2% 48.0 – 94.4 33 (0%) 72.2% 48.0 – 94.4

MicroHIS

2000 – 2003 42.3 25 59.1% 36.8 – 79.9 25 (0%) 59.1% 36.8 – 79.9

2004 – 2007 61.3 34 55.4% 37.4 – 72.9 34 (0%) 55.4% 37.4 – 72.9

2008 – 2011 82.5 63 76.4% 57.1 – 93.3 66 (5%) 80.0% 60.2 – 97.2

Colon Cancer male

Promedico

2000 – 2003 97.4 39 40.0% 28.2 – 52.6 46 (18%) 47.2% 34.1 – 60.6

2004 – 2007 120.8 104 86.1% 68.9 – 101.0 114 (10%) 94.3% 76.2 – 109.8

2008 – 2011 146.9 138 93.9% 77.6 – 108.1 157 (14%) 106.9% 89.2 – 121.7

Medicom

2000 – 2003 11.1 9 80.9% 31.8 – 114.3 9 (0%) 80.9% 31.8 – 114.3

2004 – 2007 28.4 14 49.4% 25.9 – 72.8 15 (7%) 52.9% 28.3 – 76.7

2008 – 2011 54.2 42 77.5% 53.6 – 97.7 42 (0%) 77.5% 53.6 – 97.7


MicroHIS

2000 – 2003 42.6 21 49.3% 29.6 – 69.1 21 (0%) 49.3% 29.6 – 69.1

2004 – 2007 66.5 32 48.1% 32.3 – 64.2 32 (0%) 48.1% 32.3 – 64.2

2008 – 2011 93.7 74 79.0% 60.5 – 95.2 78 (5%) 83.3% 64.2 – 99.7

Prostate Cancer

Promedico

2000 – 2003 140.2 90 64.2% 50.9 – 76.8 114 (27%) 81.3% 65.9 – 95.0

2004 – 2007 186.3 239 128.3% 110.8 – 142.6 253 (6%) 135.8% 117.8 – 150.4

2008 – 2011 216.3 225 104% 89.8 – 116.4 243 (8%) 112.3% 97.4 – 125.1

Medicom

2000 – 2003 15.2 14 92% 44.0 – 123.3 14 (0%) 92% 44.0 – 123.3

2004 – 2007 43.9 47 106.9% 74.2 – 130.8 47 (0%) 106.9% 74.2 – 130.8

2008 – 2011 79.9 76 95.1% 72.6 – 113.5 78 (3%) 97.6% 74.7 – 116.1

MicroHIS

2000 – 2003 121.9 27 22.2% 14.8 – 31.2 27 (0%) 22.2% 14.8 – 31.2

2004 – 2007 104.8 57 54.4% 40.5 – 67.9 58 (1%) 55.3% 41.3 – 69.0

2008 – 2011 72.0 120 166.6% 132.6 – 188.9 132 (10%) 183.2% 147.1 – 206.1

* Observed cases after work-up
** SIR is the Standard Incidence Ratio and is the ratio between observed and expected cases, expressed as a percentage; numbers in bold print are significant
*** Observed cases after work-up and adding patients without episode ICPC for cancer but with journal consultations coded as cancer
# If the 95%CI includes 100.0, the difference between the observed and expected number of cases is not statistically significant

Note that all data presented is after work-up, which was performed by manual checking

(see methods). This led to records being excluded because of erroneous coding in 6% of

cancer cases found in Promedico, 10% in Medicom and 5% in MicroHIS.

Corrections in date of diagnosis were made in 11% of cancer cases in Promedico,

14% in Medicom and 6% in MicroHIS. There were no trends visible. In total, work-up

was necessary in 17% of the Promedico cancer cases found, 24% of the Medicom cases

and 11% of the MicroHIS cases.

Frequent errors encountered in the EMR during work-up could be classified as follows: lack of integrity checks, inaccurate use or lack of ICPC codes, lack of EMR functionality, and inaccurate registration of the date of diagnosis (S2 table 4).


S2 table 4. List of frequent errors in the EMR registries classified by type

1. Lack of integrity checks in EMR (dimension of quality: Concordance)
   • X76 (malignant neoplasm breast female) given to a male patient
   • X76 given to a very young patient
   • Two or more episodes for the same disease

2. Inaccurate use of ICPC codes (dimension of quality: Correctness)
   • X76 for a patient with a tick bite or D75 for malignant neoplasm bladder (wrong coding)
   • X76 for suspicion of breast cancer (e.g. lump)
   • See also examples under 3

3. Lack of ICPC codes (dimension of quality: Correctness)
   • D75 for anal cancer (no ICPC code available)
   • For genetic risk (cancer or known mutated gene in family, gene carrier, or gene tested negative)
   • Two disease episodes for bilateral breast cancer (no ICPC code available)

4. Lack of EMR functionality (dimension of quality: Completeness)
   • Use of disease registry (ICPC codes) for crucial symptoms & signs (rectal bleeding, increased PSA)
   • Use of disease registry (ICPC codes) for further investigation (mammography, colposcopy) and periodic investigation (e.g. annual preventive screening)
   • Use of disease registry (ICPC codes) for treatment
   • Use of disease registry (ICPC codes) for preventive surgery (mastectomy) and breast reconstruction
   • No possibility to register relapsing cancer
   • No possibility to enter cancer staging (TNM)
   • No functionality to register suspicion of disease or exclusion of disease

5. Date of diagnosis incorrect or missing (dimension of quality: Currency)
   • Registration of “breast cancer 1993” with date 1-1-2010
   • Registration of “colon cancer in medical history” on 1-3-2009

Adding patients without an episode ICPC code for cancer, but with one or more journal consultations with a cancer code, increased the total number of cases found by 0 – 7% for MicroHIS and Medicom (S1 table 3). For Promedico this was up to 27%. For breast cancer, however, over 90% more cases were found in Promedico in all follow-up periods. In Medicom the number of breast cancer cases increased by 135% between 2008 and 2011 when patients with journal consultation cancer codes were counted.
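As a worked illustration of how the "% extra" and the second SIR column in S1 table 3 relate to the case counts, the arithmetic can be sketched as follows. The function names are hypothetical and the example uses the Promedico breast cancer row for 2000 – 2003 (262.1 expected, 232 observed, 449 after adding journal consultation codes).

```python
def percent_extra(cases_before, cases_after):
    """Relative increase in cases, in percent, after adding journal-coded patients."""
    if cases_before <= 0:
        raise ValueError("cases_before must be positive")
    return 100 * (cases_after - cases_before) / cases_before


def sir_percent(observed, expected):
    """Standard incidence ratio as a percentage of the expected number of cases."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return 100 * observed / expected


# Promedico, breast cancer, 2000 - 2003 (values taken from S1 table 3).
extra = percent_extra(232, 449)
sir_after = sir_percent(449, 262.1)
```

Rounded, this reproduces the 94% extra cases and the SIR of 171.3% shown in that row.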

The number of cases found by searching the EMR for cancer medication was low and added only to the breast and prostate cancer numbers. Because the date of prescription was not available for all records, only the total number of cases found from 2000 to 2011 was counted (table 5).
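A medication-based search of this kind can be sketched as a lookup from ATC code to candidate cancer type. This is a minimal sketch and not the extraction procedure actually used: the input format (pairs of patient identifier and ATC code) and the function name are assumptions, while the ATC codes are those listed in figure 3. Drugs listed under more than one cancer type yield more than one candidate type, which would still require manual review.

```python
# ATC codes from figure 3, mapped to the cancer types they were listed under.
ATC_TO_CANCER = {
    "L02AE03": {"breast", "prostate"},  # gosereline
    "L02BA01": {"breast"},              # tamoxifen
    "L02BG03": {"breast"},              # anastrozol
    "L02BG04": {"breast"},              # letrozol
    "L02BG06": {"breast"},              # exemestaan
    "L01XC03": {"breast"},              # trastuzumab
    "L01XC07": {"breast", "colon"},     # bevacizumab
    "L01XC06": {"colon"},               # cetuximab
    "L01XC08": {"colon"},               # panitumumab
}


def medication_only_candidates(prescriptions, coded_patient_ids):
    """Find patients with a cancer drug prescription but no coded cancer episode.

    prescriptions: iterable of (patient_id, atc_code) pairs (hypothetical format)
    coded_patient_ids: set of patients already found through an ICPC episode code
    Returns a dict mapping patient_id to the set of candidate cancer types.
    """
    candidates = {}
    for patient_id, atc_code in prescriptions:
        cancer_types = ATC_TO_CANCER.get(atc_code)
        if cancer_types is None or patient_id in coded_patient_ids:
            continue
        candidates.setdefault(patient_id, set()).update(cancer_types)
    return candidates


# Made-up example data for illustration only.
example_prescriptions = [
    ("p1", "L02BA01"),
    ("p2", "L01XC06"),
    ("p3", "N02BE01"),  # not a cancer drug from figure 3
    ("p4", "L02AE03"),
]
extra_cases = medication_only_candidates(example_prescriptions, {"p2"})
```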


Table 5. Number of cases found by searching EMR for cancer medication prescriptions

                          Expected cases   Observed cases*   Medication cases   Observed cases including
                          n                n                 n (%)              medication cases
Overall cancer            3926.4           3594              164 (4.5%)         3758

Breast Cancer Female      1685.3           1750              105 (6.0%)         1855
  Promedico               891.9            993               33                 1026
  Medicom                 260.0            279               20                 299
  MicroHIS                533.4            478               52                 530

Colon Cancer Female       599.1            476               1 (0.2%)           477
  Promedico               331.7            298               0                  298
  Medicom                 81.2             56                0                  56
  MicroHIS                186.1            122               1                  123

Colon Cancer Male         661.5            473               0 (0%)             473
  Promedico               365.1            281               0                  281
  Medicom                 93.7             65                0                  65
  MicroHIS                202.7            127               0                  127

Prostate Cancer Male      980.6            895               58 (6.4%)          953
  Promedico               542.8            554               16                 570
  Medicom                 139.1            137               26                 163
  MicroHIS                298.7            204               16                 220

DISCUSSION

Principal findings

The overall SIR was 91.5% (95%CI 88.5 – 94.5). Comparability of incidence rates improved

significantly over the years, from a SIR of 66.3% in 2000 – 2003 (95%CI 61.3 – 71.3) to

103.8% in 2008 – 2011 (95%CI 98.8 – 108.6). SIRs differ between cancer types: from 71.5%

(95%CI 65.0 – 77.8) for colon cancer in males to 103.9% (95%CI 98.9 – 108.5) for breast

cancer. There are differences in data quality (SIRs 76.2% – 99.7%) depending on the EMR

system used, with SIRs up to 232.9% for breast cancer in one EMR system in recent years.

The most frequent errors in the EMR can be classified as: lack of integrity checks,

inaccurate use and/or lack of codes, and lack of EMR functionality.

Strengths and limitations

To the best of our knowledge, this is the first study to assess data quality in primary care EMRs, taking into account the type of diagnosis, adaptation time and type of EMR system. The major strengths of our study are the size of the cohort, the extensive EMR data content we had available, which reflected purely routine care and was therefore not improved in any way,

and the use of SIRs to assess data quality. Furthermore, the JGPN database comprises

well-documented information on its enlisted patients. The characteristics of these patients

do not differ from the overall Dutch population (table 1)[22]. Furthermore we used only

coded data (ICPC codes and medication) to select cancer diagnosis from the EMR and

chose not to exploit the free-text part of the EMR. Although this may be viewed as a

limitation, current re-use of EMR data is often based on coded data only, which means

our study provides an insight that could be useful.

A limitation is that we chose to work with coded diagnoses extracted from the EMR which were not validated using information from the EMR other than the episode description. This means false-positive diagnoses could have been included and false-negative diagnoses could have been missed. Furthermore, we assumed our reference standard to be correct (see methods). If this is not the case, this may have biased our results. Last but not least, working with routine care data has limitations that should be taken into account, such as variance in registration between GPs, even though a guideline for adequate registration issued by the Dutch College of General Practitioners[25] has been available for a number of years.

Comparison with the literature

Tse et al.[30] asked 33 Australian patients to check their own medical files for

completeness and accuracy; 35% of them found relevant information missing from

their EMR and 51% found incorrect data. Two Dutch[31,32] studies and one Swiss[33]

study investigated the coding of diagnoses in primary care EMRs; they all concluded

that the quality of coding in general was fairly good but that it varied widely between

general practices. Visscher et al.[32] found a meaningful (i.e. no non-specific) ICPC

code for 86% of consultations at 311 general practices in the Netherlands. Akkerman

et al.[31] concluded that the incidence and prevalence data for 28 diseases (including

colon- and kidney cancers), registered by 135 GPs in the Netherlands, were not

significantly higher or lower than the data from Visscher and Ten Veen[21], which

were also from a GP registry.

A study by De Clercq et al.[34] compared routine care data from over 10,000 Belgian

patients, extracted from GPs’ EMRs and their answers to an electronic questionnaire. They

studied ten healthcare conditions using clinical and biological parameters (cholesterol,

blood pressure, and body mass index), diagnoses (hypertension, diabetes, and past

cardiovascular events), and drug prescriptions (anti-diabetic drugs, aspirin, statins, and

anti-hypertensive drugs). They found a relatively fair agreement (Kappa≥0.40) between

the two data collection methods for seven conditions, but no agreement for the biological

parameters. The prevalence of diagnoses and drug prescriptions was relatively lower in

the EMR data than in that collected by the questionnaire.


Pascoe et al[16] recruited five GP centres in the UK for a retrospective analysis of

primary care medical records on registration of cancer diagnoses compared to their

regional cancer registry. One in five of all primary care patients with cancer was not

identified when a search for all patients with cancer was conducted using codes for

malignancy, while one in five patient records with a code for malignancy that was

confirmed in the cancer registry also lacked the necessary documentation to verify the

cancer type, date of diagnosis, or any other aspect of the cancer. Overall, codes for cancer

in these EMRs had a poor level of completeness (29.4%) and correctness (65.6%) when

the UK Cancer Registry was used as the reference. These results, including the possible

false-positive registry of cancer diagnoses, are in line with our study.

In Canada, the use of EMRs in primary care is relatively low. Terry et al[17] studied

the EMR as a potential source of data for research and concluded that care must be

taken when using the EMR data for research purposes. They concluded that more time

is needed for EMR training and to standardize how data are registered.

In summary, the limited literature on data quality in primary care EMRs shows that the registration of diagnoses is reasonable, but often incomplete when compared to external sources. Prevalences are generally lower in the EMRs and relevant data is sometimes missing. This is in line with our own findings, although we found improvement over time. For recent years, prevalences of cancer diagnoses can be higher than expected based on external reference data, indicating registry of false-positive diagnoses.

Meaning and impact

The quality of coded diagnosis registry in primary care has improved in recent years (2008

and up) but re-users of data have to understand the pitfalls. Quality and usability of

coded cancer diagnoses in routine care data remains substandard. Re-use should only be

performed by people appropriately trained and with expertise in data management and

analysis. It is important that primary care workers that enter data routinely are involved

in this process for adequate interpretation of data.

Work-up of data (manual checking small parts of the EMR and correction in 11 – 24%

of cases) improves the quality of routine care data. Be aware that 30% of cases can be

missed when searching for cancer diagnoses even in recent years and be aware that

there are differences in the quality of registration of diagnoses between cancer types

and types of EMR system used. For the diagnoses of breast- and prostate cancers, a large

number of false-positive cancer cases can be found. Depending on the purpose of reuse,

this means that it is probably necessary to validate the cancer diagnoses for instance

by linking records with an external data source such as the National Cancer Registry or

a hospital database to eliminate false-positive cases.

Searching for extra cases by checking for journal consultation codes or prescribed cancer

medication seems to be ineffective, since relatively few extra cases were found. When extra

cases are being found, they tend to lead to a higher than expected incidence (false-positive).


The differences between types of cancer and types of EMR are remarkable: for colon

cancer, especially in Medicom and MicroHIS, the SIRs are relatively low (72.2% – 79.0%,

both significant), which suggests that GPs fail to register colon cancer cases in these

systems. For prostate cancer, SIRs in recent years are good, but higher than expected

incidence rates (possible false-positives) are found in MicroHIS (166.6%). Likewise there

seems to be over-registration of all cancer types in Promedico in recent years. The most

probable explanation is that GPs register a clinical suspicion of cancer with an ICPC

code for cancer. A less likely explanation is that the NCR is less complete than previously demonstrated[19]. Also, the differences between EMRs suggest that EMR system design directly influences data quality.

Furthermore, MicroHIS has the lowest SIR overall (76.2%, 95%CI 71.3 – 81.0), since

it does not force its users to choose an ICPC code for every consultation and episode

(unlike Promedico and Medicom). It also needs the lowest number of corrections (6%).

This suggests that although obligatory coding for each encounter results in a more

complete registry, it also leads to more registration errors.

The list of frequent errors we encountered in the EMR systems (S2 table 4) reveals

ways to improve data quality. EMR software suppliers should extend their standard

procedures for integrity checks and provide EMR data entry fields for crucial symptoms,

disease recurrence, for common tests and treatments, and for familial or genetic risk

factors that might be the main reason for consulting the doctor but do not necessarily

mean that the definite diagnosis has been confirmed. Certain ICPC codes could be

developed to enable better registration of these items. In addition, GPs and their staff

should be trained to make adequate EMR registrations. Also, EMR software designers

should cooperate with interaction designers to discover ways to improve data quality at

entry by enhanced system design. Furthermore, new technology in the area of natural

language recognition could be incorporated into an EMR to help users by suggesting

structured data entry options[35].
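The kind of integrity check suggested here can be illustrated with a short sketch based on the first row of S2 table 4. It is not taken from any of the EMR systems studied: the record layout, function name and age threshold are hypothetical, and Y77 (malignant neoplasm prostate) is added from the ICPC classification as an assumption, since only X76 and D75 are named in the table.

```python
from collections import Counter

# ICPC cancer codes used in this sketch; Y77 is an assumption not named in S2 table 4.
CANCER_CODES = {"X76", "Y77", "D75"}
SEX_SPECIFIC_CODES = {"X76": "F", "Y77": "M"}
MIN_PLAUSIBLE_AGE = 20  # hypothetical threshold for flagging "very young" patients


def integrity_warnings(patient, episodes):
    """Return a list of (warning_type, icpc_code) tuples for one patient.

    patient: dict with keys 'sex' ('F' or 'M') and 'birth_year'
    episodes: list of dicts with keys 'icpc' and 'year'
    """
    warnings = []
    for episode in episodes:
        code = episode["icpc"]
        required_sex = SEX_SPECIFIC_CODES.get(code)
        if required_sex is not None and patient["sex"] != required_sex:
            warnings.append(("sex_mismatch", code))
        age = episode["year"] - patient["birth_year"]
        if code in CANCER_CODES and age < MIN_PLAUSIBLE_AGE:
            warnings.append(("implausible_age", code))
    # Two or more episodes carrying the same cancer code.
    counts = Counter(ep["icpc"] for ep in episodes if ep["icpc"] in CANCER_CODES)
    for code, n in counts.items():
        if n > 1:
            warnings.append(("duplicate_episode", code))
    return warnings


# Made-up example: a male patient with two X76 episodes.
example_patient = {"sex": "M", "birth_year": 1950}
example_episodes = [{"icpc": "X76", "year": 2009}, {"icpc": "X76", "year": 2010}]
example_warnings = integrity_warnings(example_patient, example_episodes)
```

In practice such checks would more usefully run at data entry, prompting the GP to confirm or correct the code, rather than afterwards during work-up.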

Unanswered questions and future research

Although with this study information on data quality has been produced on a population

level, the next step would be to assess data quality on an individual level. Therefore

the possibilities for an individual patient-based linkage of records between the EMR

and National Cancer Registries, first in an anonymized research setting but eventually

as a dynamic link to everyday practice, are worth investigating. Future research should

furthermore focus on assessing the quality and reusability of data for other parts of the

EMR and on assessing more dimensions of data quality using the dimensions determined

by Weiskopf and Weng[15]. Also, investigating which aspects of the user interfaces in the different EMRs lead to differences in data quality would be a worthwhile exercise. Last but not least, research is needed to determine why GPs seem to miss and over-register cancer cases, since this can provide additional clues to improve data quality.


ACKNOWLEDGEMENTS

We thank the GPs in the Utrecht area for sharing their anonymized EMR data with us

for this study, Julia Velikopolskaia for her assistance in extracting data from the Julius

General Practitioners’ Network Database and Jackie Senior for editing the manuscript.


REFERENCES

1. Danciu I, Cowan JD, Basford M, et al. Secondary use of clinical data: the Vanderbilt approach. J Biomed Inform 2014;52:28–35. doi:10.1016/j.jbi.2014.02.003

2. Coorevits P, Sundgren M, Klein GO, et al. Electronic health records: new opportunities for clinical research. J Intern Med 2013;274:547–60. doi:10.1111/joim.12119

3. Muller S. Electronic medical records: the way forward for primary care research? Fam Pract 2014;31:127–9. doi:10.1093/fampra/cmu009

4. Nivel. Feasibility study indicators primary care and Etalage+ data (Haalbaarheidsstudie indicatoren huisartsenzorg en Etalage+-gegevens). 2011.

5. Dentler K, Numans ME, ten Teije A, et al. Formalization and computation of quality measures based on electronic medical records. J Am Med Inform Assoc 2014;21:285–91. doi:10.1136/amiajnl-2013-001921


7. Chen EH, Bodenheimer T. Improving Population Health Through Team-Based Panel Management. Arch Intern Med (American Med Assoc 2011;171:(2 pages). doi:10.1001/archinternmed.2011.395

8. Bodenheimer T. The future of primary care: Transforming Practice. N Engl J Med 2008;359(20):2086–9.

9. Neuwirth EE, Schmittdiel JA, Tallman K, et al. Understanding panel management: a comparative study of an emerging approach to population care. Perm J 2007;11:12–20.

10. Loo TS, Davis RB, Lipsitz L a., et al. Electronic Medical Record Reminders and Panel Management to Improve Primary Care of Elderly Patients. Arch Intern Med 2011;171:1552–8. doi:10.1001/archinternmed.2011.394

11. Bleijenberg N, Drubbel I, Ten Dam VH, et al. Proactive and integrated primary care for frail older people: design and methodological challenges of the Utrecht primary care PROactive frailty intervention trial (U-PROFIT). BMC Geriatr 2012;12:16. doi:10.1186/1471-2318-12-16

13. Carey IM, Cook DG, De Wilde S, et al. Implications of the problem orientated medical record (POMR) for research using electronic GP databases: a comparison of the Doctors Independent Network Database (DIN) and the General Practice Research Database (GPRD). BMC Fam Pract 2003;4:14. doi:10.1186/1471-2296-4-14

15. Weiskopf NG, Weng C. Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research. J Am Med Inform Assoc 2013;20:144–51. doi:10.1136/amiajnl-2011-000681

16. Pascoe SW, Neal RD, Heywood PL, et al. Identifying patients with a cancer diagnosis using general practice medical records and Cancer Registry data. Fam Pract 2008;25:215–20. doi:10.1093/fampra/cmn023

17. Terry AL, Chevendra V, Thind A, et al. Using your electronic medical record for research: a primer for avoiding pitfalls. Fam Pract 2010;27:121–6. doi:10.1093/fampra/cmp068

18. McGilvray Danette. Executing Data Quality Projects. Elsevier 2008.

19. Netherlands CCCT. The Netherlands Cancer Registry. http://cijfersoverkanker.nl (accessed 20 Mar 2012).

20. Statistics Netherlands. Available at: http://statline.cbs.nl (December 2013, date last accessed).

21. Hoogendoorn M, Szolovits P, Moons LMG, et al. Utilizing uncoded consultation notes from electronic medical records for predictive modeling of colorectal cancer. Artif Intell Med 2016;69:53–61. doi:10.1016/j.artmed.2016.03.003

22. Hak E, Rovers MM, Sachs APE, et al. Is asthma in 2-12 year-old children associated with physician-attended recurrent upper respiratory tract infections? Eur J Epidemiol 2003;18:899–902. http://www.ncbi.nlm.nih.gov/pubmed/14561050 (accessed 4 Aug 2014).

23. Bentsen BG. International classification of primary care. Scand J Prim Health Care 1986;4:43–50.

24. Van der Zanden G. Quality Assessment of Medical Health Records using Information Extraction, Master’s thesis. 2010.

25. Dutch College of General Practitioners. Guideline adequate EHR registry. Revised version 2013. Available at: https://www.nhg.org/themas/publicaties/richtlijn-adequate-dossiervorming-met-het-epd.

26. Schouten LJ, Jager JJ, van den Brandt PA. Quality of cancer registry data: a comparison of data provided by clinicians with those of registration personnel. Br J Cancer 1993;68:974–7. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1968711&tool=pmcentrez&rendertype=abstract (accessed 20 Aug 2015).

27. Van Leersum NJ, Snijders HS, Henneman D, et al. The Dutch surgical colorectal audit. Eur J Surg Oncol 2013;39:1063–70. doi:10.1016/j.ejso.2013.05.008

28. The Dutch College of General Practitioners. Ref. Model EHR Syst. Available at: https://www.nhg.org/themas/publicaties/his-referentiemodel.

29. Breslow NE, Day NE. Statistical Methods in Cancer Research: Volume II - The Design and Analysis of Cohort Studies. Lyon: International Agency for Research on Cancer 1987.

30. Tse J, You W. How accurate is the electronic health record? - a pilot study evaluating information accuracy in a primary care setting. Stud Health Technol Inform 2011;168:158–64.

32. Visscher S, ten Veen P, Verheij PR. Kwaliteit van ICPC-Codering (‘Quality of ICPC Coding’). Huisarts en Wet 2012;10:459–459.

34. De Clercq E, Moreels S, Bossuyt N, et al. Routinely-collected general practice data from the electronic patient record and general practitioner active electronic questioning method: a comparative study. Stud Health Technol Inform 2013;192:510–4.

35. Barrett N, Weber-Jahnke JH, Thai V. Engineering natural language processing solutions for structured information from clinical text: extracting sentinel events from palliative care consult letters. Stud Health Technol Inform 2013;192:594–8. http://www.ncbi.nlm.nih.gov/pubmed/23920625 (accessed 4 Aug 2014).

CHAPTER 4

Do GPs know their cancer patients?

Assessing the quality of cancer registration in Dutch Primary Care: a cross sectional validation study

Annet Sollie, General practitioner in training / PhD Fellow, Jessika Roskam, Medical student, Rolf H. Sijmons, Professor of Medical Genetics, Mattijs E Numans, Professor of General Practice, Charles W. Helsper, MD, PhD, Clinical Epidemiologist.

Published September 2016 in BMJ Open as: Sollie A, Helsper CW, Ader RJ, Ausems MG, van der Woudern JC, Numans ME. Do GPs know their patients with cancer? Assessing the quality of cancer registration in Dutch primary care: a cross-sectional validation study. BMJ Open. 2016 Sep 15;6(9):e012669. doi: 10.1136/bmjopen-2016-012669.


ABSTRACT

Objectives
To assess the quality of cancer registry in Primary Care.

Design and Setting
A cross-sectional validation study using linked data from primary care Electronic Health Records (EHRs) and the Netherlands Cancer Registry (NCR).

Population
290,000 patients, registered with 120 General Practitioners, from 50 practice centres in the Utrecht area, the Netherlands, in January 2013.

Intervention
Linking the Electronic Health Records (EHRs) of all patients in the Julius General Practitioners’ Network (JGPN) database at an individual patient level to the full Netherlands Cancer Registry (NCR) (approx. 1.7 million tumours between 1989 and 2011), to determine the proportion of matching cancer diagnoses. Full-text EHR extraction and manual analysis for non-matching diagnoses.

Main Outcome Measures
Proportions of matching and non-matching breast, lung, colorectal and prostate cancer diagnoses between 2007 and 2011, stratified by age category, cancer type and EHR system. Differences in year of diagnosis between EHR and NCR. Reasons for non-matching diagnoses.

Results
In the Primary Care EHR, 60.6% of cancer cases were registered and coded in accordance with the NCR. Of the EHR diagnoses, 48.9% were potentially false positive (not registered in the NCR). Results differed between EHR systems but not between age categories or cancer types. The year of diagnosis corresponded in 80.6% of matching coded diagnoses. Adding full-text EHR analysis improved results substantially. A national disease registry (the NCR) proved incomplete.

Conclusions
Even though GPs do know their cancer patients, only 60.6% are coded in concordance with the NCR. Re-users of coded EHR data should be aware that 40% of cases can be missed and almost half can be false-positive. The type of EHR system influences registration quality. If full-text manual EHR analysis is used, only 10% of cases will be missed and 20% of cases found will be wrong. EHR data should only be re-used with care.


INTRODUCTION

Ask General Practitioners (GPs) if they know their cancer patients and they will most

likely answer with an outspoken “yes”! Ask them if these patients are registered with

this cancer diagnosis in their Electronic Health Record (EHR) system and the answer will

be “yes, probably”. Most GPs will acknowledge the importance of adequate disease

registry in EHR records, certainly for a serious disease such as cancer, since these records

are used for information exchange between care providers.

Since re-use of EHR records for other purposes such as chronic disease management,

[1] research [2–4] and quality assessment [5,6] is becoming commonplace, not only in

hospitals [7,8] but also in Primary Care, [2–4] correct and complete registry of diagnoses

using coding systems is pivotal [9–11]. Disease registry using coding systems has been

common practice in Primary Care EHRs for almost two decades in western countries. In

several countries including the Netherlands, guidelines have been developed for correct

registration in the EHR that do address adequate disease coding.[12]

Despite these developments, there are indications that (coded) disease registry in

Primary Care is still suboptimal, [13,14] even for important diagnoses such as cancer.

[15] Literature describing the quality of data in Primary Care however is limited. In order

to assess quality and subsequent (re-) usability of EHR data, it is important to quantify

this quality and to determine which variables influence quality of disease registry. We

will assess various aspects of the quality of disease registry in primary care for re-use

purposes. We focus on cancer since supposedly reliable and elaborate information

concerning cancer diagnoses is available from the Netherlands Cancer Registry (NCR),

thereby providing a potential reference standard required for our study.

We aim to answer the following questions:

1. What are the proportions of matching, missing (potentially false-negative) and wrong

(potentially false-positive) cancer diagnoses in the Primary Care EHR using the NCR

as a reference standard?

2. How accurate is the year of diagnosis registered in the EHR for the matching cancer

cases, when compared to the NCR as a reference standard?

3. Do age of the patient, cancer type and EHR system influence the quality of cancer

diagnosis registry?

4. What are the causes of suboptimal cancer diagnosis registry in the EHR and subsequent

opportunities for improvement?


METHODS

Design

Using a cross-sectional validation study, we assessed the proportion of matching, missing

(potentially false negative) and wrong (potentially false positive) breast, lung, colorectal

and prostate cancer diagnoses in primary care EHR data between 2007 and 2011, using

the NCR as a reference standard. We linked the EHR to the NCR at an individual patient

level using a Trusted Third Party (TTP), to obtain an anonymous dataset containing both

the EHR and the NCR data. We defined coded diagnoses representing the same cancer

type in both databases as matching cases. We defined missing (potentially false-negative)

diagnoses as occurring with one or more of the four cancers under study in the NCR, but

not in the EHR in one of the years 2007 to 2011. We defined wrong (potentially false-

positive) diagnoses as registered with one of the four cancer types under study in the EHR,

but not registered with the same diagnosis in the NCR, in one of the years 2007 to 2011.

Data

We used the routine EHR data extracted from practice centres in the Utrecht area, the Netherlands, that are members of the Julius General Practitioners’ Network (JGPN; 120 GPs, 50 practice centres, 290,000 patients). Coded and free-text primary care data from individual patients enlisted with these centres are periodically extracted to the central anonymized EHR JGPN database. Data were included if GPs used one of the three most frequently used EHR systems in the study region: Promedico®, Medicom® and Microhis®. These systems cover 85% of the population registered with participating GPs. The systems vary in design and user interface but are all based on the Reference Model provided by the Dutch College of General Practitioners. The JGPN population is considered representative of the Dutch population [16] and its GPs and GP centres represent the average Dutch GPs and GP centres. GPs were not aware of this study at the time of registry, nor did they receive specific training on coding. Hence the data in the JGPN can be regarded as true “routine care data”.

In the Netherlands GP medical encounters are registered according to the “SOAP-

system”.[17] A SOAP-journal consists of four data fields. The first is “Subjective” (S) and

is used to register in plain text what the patient describes, such as complaints and the

reason for the encounter. The second data field is called “Objective” (O) and includes

the GP’s findings and results from clinical examination and measurements, in plain text.

The third field is “Analysis” (A), which is used to register the (working) diagnosis, most

important symptom or hypothesis as plain text and is coded using the International

Classification of Primary Care version 1 (ICPC-1) coding system.[18] The final field is “Plan”

(P), comprising the GP’s medication prescriptions, diagnostic tests, referrals to medical

specialists and follow-up appointments as plain text. The list “Episodes”, also coded using

ICPC-1, clusters consultations concerning the same diagnosis for an individual patient

with corresponding start and end dates. According to the Dutch College of General

Practitioners’ guideline [12] for correct registration, every cancer diagnosis should be

registered as an episode in the EHR and consultations concerning relevant complaints or

treatments should be added to this Episode. The guideline also states that it is mandatory

for GPs to update the EHR Episode with the final diagnosis.
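The SOAP journal and the Episode list described above can be pictured as a small data model. The sketch below is only an illustration: the class and field names are assumptions and do not reflect the actual schema of any of the EHR systems in this study.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SoapEntry:
    """One GP encounter, split into the four SOAP fields (hypothetical layout)."""
    encounter_date: date
    subjective: str               # S: complaints and reason for encounter, plain text
    objective: str                # O: findings and measurements, plain text
    analysis_text: str            # A: (working) diagnosis, symptom or hypothesis, plain text
    analysis_icpc: Optional[str]  # A: ICPC-1 code, None if no code was entered
    plan: str                     # P: prescriptions, tests, referrals, follow-up, plain text


@dataclass
class Episode:
    """Cluster of encounters concerning the same diagnosis for one patient."""
    icpc: str
    start: date
    end: Optional[date] = None
    entries: List[SoapEntry] = field(default_factory=list)

    def add(self, entry: SoapEntry) -> None:
        self.entries.append(entry)


# A symptom-coded Episode that is recoded once the diagnosis is confirmed.
episode = Episode(icpc="X19", start=date(2009, 3, 2))  # X19: lump in the breast
episode.add(SoapEntry(date(2009, 3, 2), "felt a lump", "palpable mass",
                      "lump in breast", "X19", "referral to surgeon"))
episode.icpc = "X76"  # guideline: update the Episode with the final diagnosis
```

In this picture, an Episode that keeps its symptom code after confirmation is exactly the kind of record that a search on coded cancer diagnoses would miss.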

ICPC diagnosis codes are available for the more common types of cancer, including the

cancers under study. There are no separate codes available for the recurrence of cancer, for

suspected cancer or for treatments of cancer. The GP manually enters the ICPC code for a

‘cancer’ diagnosis during consultation or after receiving secondary care correspondence.

A diagnosis code should only be used in the EHR after confirmation of the diagnosis and

not if a diagnosis is suspected. The GP decides if and when a new Episode is created for

the cancer diagnosis and which consultations are added to this Episode.

Reference Standard

Elaborate information on cancer diagnoses and treatment is available in the NCR.[19]

Specially trained staff members enter relevant data about all Dutch cancer diagnoses

in the NCR database, triggered by hospital pathology reports of newly found cancers.

In addition, cancer diagnoses reported in hospital patient discharge files, for which no

pathological investigation is being performed, are also included in the NCR as clinical

diagnoses for most hospitals. The NCR claims to be almost complete (>95% of all cancers)

for the population of the Netherlands and without false-positive records since 1989.

There is a registration delay reported at the NCR of 3 to 9 months after the pathologist

confirmed the cancer and the delay is claimed to be decreasing. There is some evidence

that the NCR data are complete and accurate.[20,21] Theoretically, cancer

patients missing in the NCR could include those who are diagnosed with cancer in primary

care based on clinical signs and symptoms, but are unable or refuse to go to a hospital.

Therefore, for these patients, no pathology report or hospital admission is registered,

which would be needed to enter the NCR.

Data collection and analysis

Step 1: Identification of Cancer Cases in Primary Care EHR

We identified all breast, lung, colorectal and prostate cancer cases diagnosed at ages 20

to 90 between 2007 and 2011 in the EHR, using a three-step search strategy. First we

searched for patients with one or more cancer Episodes in the database with ICPC codes

X76 (breast cancer), R84 (lung cancer), D75 (colorectal cancer), and/or Y77 (prostate

cancer). Next we selected all cases without a coded cancer related Episode but with one

or more encounters (home visits, correspondence, consultation) coded with X76, R84, D75

and/or Y77. Finally we selected patients without an Episode or medical encounter coded

for the types of cancer under study, but for whom any prescription for cancer-specific

medication was registered during observation time. After identifying these patients, a

subset of data including all the required information was extracted from the EHR: ICPC

code, year of diagnosis, age at diagnosis, year of birth, sex and type of EHR system used.
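The three-step search strategy of Step 1 can be sketched as a simple cascade. The record layout (`episode_codes`, `encounter_codes`, `cancer_medication`) is a hypothetical simplification for illustration; the actual JGPN extraction queries are not described in this chapter.

```python
CANCER_CODES = {"X76", "R84", "D75", "Y77"}  # breast, lung, colorectal, prostate


def base_code(icpc: str) -> str:
    """Strip a subtype suffix, so that 'D75.01' is treated as 'D75'."""
    return icpc.split(".")[0]


def identify_cases(patients):
    """Return {patient id: search step (1, 2 or 3) that identified the patient}."""
    found = {}
    for p in patients:
        if any(base_code(c) in CANCER_CODES for c in p["episode_codes"]):
            found[p["id"]] = 1          # step 1: coded cancer Episode
        elif any(base_code(c) in CANCER_CODES for c in p["encounter_codes"]):
            found[p["id"]] = 2          # step 2: coded encounter, no Episode
        elif p["cancer_medication"]:
            found[p["id"]] = 3          # step 3: cancer-specific prescription only
    return found


patients = [
    {"id": 1, "episode_codes": ["X76.01"], "encounter_codes": [], "cancer_medication": False},
    {"id": 2, "episode_codes": ["T90"], "encounter_codes": ["R84"], "cancer_medication": False},
    {"id": 3, "episode_codes": [], "encounter_codes": [], "cancer_medication": True},
    {"id": 4, "episode_codes": ["K86"], "encounter_codes": [], "cancer_medication": False},
]
cases = identify_cases(patients)
```

Each patient is assigned to the first step that finds them, which mirrors the wording "without a coded cancer related Episode but with ..." in the text.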

Step 2: Linkage of data

Parallel to step 1, the TTP linked the entire JGPN database with the entire NCR database.

The linking was performed after encryption of the data using a mixed algorithm with

deterministic and probabilistic parts based on the following variables: date of birth,

gender, zip code, last name, initials and first name. If date of birth and gender matched

(deterministic part), the probabilistic part of the algorithm started based on the Fellegi-

Sunter [22] model. This means the other variables were compared, yielding scores, which

were totalled and evaluated using weights.
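A minimal sketch of such a mixed algorithm is given below. It is not the TTP's implementation: the m- and u-probabilities, the threshold and the record layout are all assumed values chosen for illustration, and the encryption step is left out. Only the overall shape (a deterministic gate on date of birth and gender, followed by summed Fellegi-Sunter agreement and disagreement weights) follows the description above.

```python
import math

# Assumed (m, u) per field: m = P(agree | same person), u = P(agree | different persons).
FIELD_PROBS = {
    "zip_code": (0.90, 0.01),
    "last_name": (0.95, 0.005),
    "initials": (0.95, 0.05),
    "first_name": (0.90, 0.01),
}
THRESHOLD = 10.0  # assumed cut-off for accepting a link


def link_score(a, b):
    """Return the summed weight, or None if the deterministic part fails."""
    if a["date_of_birth"] != b["date_of_birth"] or a["gender"] != b["gender"]:
        return None
    score = 0.0
    for name, (m, u) in FIELD_PROBS.items():
        if a[name] == b[name]:
            score += math.log2(m / u)              # agreement weight
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return score


def is_link(a, b):
    score = link_score(a, b)
    return score is not None and score >= THRESHOLD


ehr = {"date_of_birth": "1950-04-01", "gender": "F", "zip_code": "3511",
       "last_name": "jansen", "initials": "a", "first_name": "anna"}
ncr_same = dict(ehr)
ncr_other_dob = dict(ehr, date_of_birth="1951-04-01")
ncr_all_differ = dict(ehr, zip_code="9999", last_name="x", initials="z", first_name="y")
```

Fields that rarely agree by chance (a low u) contribute the largest positive weight when they do agree, which is why a matching last name counts for more than matching initials in this sketch.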

The TTP provided a list with pseudonymised patient numbers of all patients that were

successfully linked and added the NCR data. JGPN data management added EHR data to

every patient number on the list.

Step 3: Matching diagnoses

Dutch GPs use the ICPC-1 coding system. The NCR uses the International Classification of

Diseases for Oncology (ICD-O). Diagnoses were counted as matching when their ICPC-1

code and ICD-O code represented the same cancer (Table 1).

Diagnoses were stratified based on cancer type, age category (<50, 50 to 75 and

>75) and EHR system. Differences in the year of diagnosis between EHR and NCR were

determined by subtracting the year of diagnosis at the GP from the year of diagnosis as

registered in the NCR.
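The matching rule, the age stratification and the year deviation of Step 3 can be sketched as follows. The lookup tables cover only the ICPC-1 main codes and the ICD-10 based codes from Table 1 (the ICD-9 codes are left out for brevity), and the function names are illustrative assumptions.

```python
ICPC_TO_CANCER = {"D75": "colorectal", "R84": "lung", "X76": "breast", "Y77": "prostate"}
NCR_TO_CANCER = {"C18": "colorectal", "C19": "colorectal", "C20": "colorectal",
                 "C34": "lung", "C50": "breast", "C61": "prostate"}


def is_match(icpc_code: str, ncr_code: str) -> bool:
    """True when both codes represent the same cancer type."""
    ehr_type = ICPC_TO_CANCER.get(icpc_code.split(".")[0])  # drop subtype suffix
    ncr_type = NCR_TO_CANCER.get(ncr_code[:3])              # three-character category
    return ehr_type is not None and ehr_type == ncr_type


def age_category(age: int) -> str:
    """Age strata used in the analysis: <50, 50 to 75, >75."""
    if age < 50:
        return "<50"
    return "50-75" if age <= 75 else ">75"


def year_deviation(ehr_year: int, ncr_year: int) -> int:
    """NCR year minus GP year, as described in the text."""
    return ncr_year - ehr_year
```

With this sign convention a positive deviation means the GP registered the diagnosis in an earlier calendar year than the NCR, and a negative deviation means a later year.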

Step 4: Non-matching (missing and wrong) diagnoses

Non-matching diagnoses were assessed using the NCR as a reference. We determined

the proportion of missing (potentially false-negative) diagnoses in the EHR and stratified

these per cancer type and per age category. We also determined the proportion of

wrong (potentially false-positive) diagnoses stratified by cancer type, age category and

EHR systems used. In case of repeated entries of a new diagnosis for the same person in

the EHR, we counted one as being correct and the other(s) as false-positive.

We extracted and studied the full EHR of a random sample of 120 of the 1,644 (7.3%) potentially false-positive EHR diagnoses and 120 of the 1,119 (10.7%) potentially false-negative EHR diagnoses to determine reasons for inaccurate and incomplete GP disease registry. To ensure representativeness of our sample, we used the following sampling method: the sample of false-positive cases consisted of 4 x 30 cases (30 per cancer type), equally distributed over the years of interest and the EHR systems used. The sample of false-negative cases consisted of 5 cases per cancer type per year. The EHR system could not be taken into account in the selection of false-negative cases since this could only be determined after extracting the full EHR of the sample.

All data analysis and calculations were carried out using SPSS Statistics version 21.

Patient Involvement

Patients were not involved in the design, development of outcome measures or conduct

of this study. Since this study uses anonymized EHR data from an existing network

database only, no patients were recruited and thanking patients or disseminating results

directly is not applicable.

RESULTS

The linkage by the TTP of the full JGPN database to the full NCR database yielded 12,930

JGPN subjects with a registered cancer at the NCR (data from January 2013), of whom 12,526

could be included in our analysis. The remaining 404 (3.1%) records belonged to 202 patients

who were matched twice. These records were considered incorrectly identified. We had to

remove another 14 (0.1%) records for suspected wrong linkage before starting our analyses.

The extraction of breast, lung, colorectal and prostate cancer diagnoses yielded 3,364

cases from the EHR data and 2,839 from the NCR (Table 2).

Table 1. Cross-linking of ICPC-1 with ICD-O codes used to define matching cancers in the Electronic Health Records and the Netherlands Cancer Registry

Cancer type | ICPC-1 codes used in Electronic Health Record | ICD-9/10-O codes used in Netherlands Cancer Registry
Colorectal cancer | D75 incl. subtypes: D75.01, D75.02, D75.03 | C18#, C19#, C20#, 153*, 154*
Lung cancer | R84 | C34, 162*, 163*
Breast cancer | X76 incl. subtypes: X76.1, X76.01, X76.02, X76.03 | C50#, 174*
Prostate cancer | Y77 | C61#, 158*

ICPC-1: International Classification of Primary Care version 1
ICD-O: International Classification of Diseases for Oncology
* Codes from ICD-9 (since 1978)
# Codes from ICD-10 (since 1990)

Overall, 60.6% (1,720 of 2,839) of cases matched (“sensitivity” of the EHR), which

means that 39.4% (1,119 of 2,839) of cancer cases seem to be missing in the EHR

(potentially false-negative). Furthermore, 1,644 (48.9% of 3,364) of EHR cases were not

found in the NCR and should thus be qualified as potentially false-positive. Consequently,

the “positive predictive value” of a cancer diagnosis in the EHR is 51.1%. The two by

two table illustrates these findings, including a negative predictive value of 99.5% and

a specificity of 99.3 % (Table 3).
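The four measures quoted here follow directly from the counts in Table 3. The short sketch below recomputes them; the function name is an illustrative choice, and the counts are taken from the table as printed.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, PPV, specificity and NPV as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),  # matching cases / all NCR cases
        "ppv": 100 * tp / (tp + fp),          # matching cases / all EHR-coded cases
        "specificity": 100 * tn / (tn + fp),
        "npv": 100 * tn / (tn + fn),
    }


# Counts from Table 3 (EHR versus NCR, 2007 to 2011).
metrics = diagnostic_metrics(tp=1720, fp=1644, fn=1119, tn=237906)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

Specificity and negative predictive value are dominated by the very large number of patients without cancer in either source, which is why they are close to 100% even though sensitivity and positive predictive value are modest.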

We found no substantial differences in proportion of matching, potentially false-

negative and false-positive cases between cancer types and age categories (Table 2).

However, there are differences between EHR systems used; Microhis® has the highest

proportion of matching cases (64.5%, 534 out of 828) and the lowest proportion of

potentially false-positive cases (35.5%, 294 of 828). Promedico® has the lowest proportion

of matching cases (44.7%, 925 out of 2,068) and the highest proportion of potentially

false-positive cases (55.3%, 1,143 of 2,068).
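The chapter does not state how the 95% confidence intervals in Table 2 were calculated. The printed values appear consistent with a normal-approximation (Wald) interval, which would also explain why some small strata show limits below 0% or above 100%. The sketch below rests on that assumption.

```python
import math


def wald_ci(k, n, z=1.96):
    """Normal-approximation 95% CI for the proportion k/n, in percent."""
    p = k / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return 100 * (p - half_width), 100 * (p + half_width)


overall = wald_ci(1720, 2839)  # matching cases, four cancer types combined
small = wald_ci(4, 5)          # matching prostate cancer cases, age < 50
print([round(x, 1) for x in overall], [round(x, 1) for x in small])
```

For strata as small as 4 of 5 cases, an interval that respects the 0 to 100% range (for example a Wilson interval) may be more informative than the Wald interval.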

The year of diagnosis in the EHR is registered in accordance with the NCR for 80.6%

(1,386 out of 1,720) of cancer cases. For 75.5% (252 of 334) of cases with a differing year

of diagnosis, the deviation from the NCR incidence year is less than 2 years (Figure 1).

Figure 1. Deviation in year of registered cancer diagnosis in the Electronic Health Record from reference standard (Netherlands Cancer Registry)

Table 2. Results of record linking between the Electronic Health Records and the Netherlands Cancer Registry

Group | Total cancers JGPN | Total cancers NCR | Number matching | Proportion matching (% of NCR total) | 95% CI | Number false-positives | Proportion false-positives (% of JGPN total) | 95% CI | Number false-negatives | Proportion false-negatives (% of NCR total) | 95% CI
4 types combined | 3364 | 2839 | 1720 | 60.6 | (58.8; 62.4) | 1644 | 48.9 | (47.2; 50.6) | 1119 | 39.4 | (37.6; 41.2)
Age < 50 | 451 | 412 | 246 | 59.7 | (55.0; 64.4) | 205 | 45.5 | (40.9; 50.1) | 166 | 40.3 | (35.5; 45.0)
Age 50-75 | 2109 | 1824 | 1139 | 62.4 | (60.2; 64.7) | 970 | 46.0 | (43.9; 48.1) | 685 | 37.6 | (35.3; 39.8)
Age > 75 | 804 | 603 | 335 | 55.6 | (51.6; 59.5) | 469 | 58.3 | (54.9; 61.7) | 268 | 44.4 | (40.5; 48.4)
EHR system
Promedico | 2068 | x | 925 | 44.7 | | 1143 | 55.3 | (53.1; 57.4) | x | |
Medicom | 468 | x | 261 | 55.8 | | 207 | 44.2 | (39.7; 48.7) | x | |
MicroHis | 828 | x | 534 | 64.5 | | 294 | 35.5 | (32.2; 38.8) | x | |
Cancer type
Breast cancer | 1144 | 1008 | 622 | 61.7 | (58.7; 64.7) | 522 | 45.6 | (42.7; 48.5) | 386 | 38.3 | (35.3; 41.3)
Age < 50 | 290 | 267 | 156 | 58.4 | (52.5; 64.3) | 134 | 46.2 | (40.5; 51.9) | 111 | 41.6 | (35.7; 47.5)
Age 50-75 | 681 | 598 | 381 | 63.7 | (59.7; 67.6) | 300 | 44.1 | (40.3; 47.8) | 217 | 36.3 | (32.4; 40.1)
Age > 75 | 173 | 143 | 85 | 59.4 | (51.4; 67.5) | 88 | 50.9 | (43.4; 58.3) | 58 | 40.6 | (32.5; 48.6)
Prostate cancer | 662 | 547 | 336 | 61.4 | (57.3; 65.5) | 326 | 49.2 | (45.4; 53.1) | 211 | 38.6 | (34.5; 42.7)
Age < 50 | 6 | 5 | 4 | 80.0 | (44.9; 115.1) | 2 | 33.3 | (-4.4; 71.1) | 1 | 20.0 | (-15.1; 55.1)
Age 50-75 | 469 | 425 | 270 | 63.5 | (59.0; 68.1) | 199 | 42.4 | (38.0; 46.9) | 155 | 36.5 | (31.9; 41.0)
Age > 75 | 187 | 117 | 62 | 53.0 | (43.9; 62.0) | 125 | 66.8 | (60.1; 73.6) | 55 | 47.0 | (38.0; 56.1)
Lung cancer | 731 | 600 | 331 | 55.2 | (51.2; 59.1) | 400 | 54.7 | (51.1; 58.3) | 269 | 44.8 | (40.9; 48.8)
Age < 50 | 46 | 36 | 18 | 50.0 | (33.7; 66.3) | 28 | 60.9 | (46.8; 75.0) | 18 | 50.0 | (33.7; 66.3)
Age 50-75 | 492 | 438 | 248 | 56.6 | (52.0; 61.3) | 244 | 49.6 | (45.2; 54.0) | 190 | 43.4 | (38.7; 48.0)
Age > 75 | 193 | 126 | 65 | 51.6 | (42.9; 60.3) | 128 | 66.3 | (59.7; 73.0) | 61 | 48.4 | (39.7; 57.1)
Colorectal cancer | 827 | 684 | 431 | 63.0 | (59.4; 66.6) | 396 | 47.9 | (44.5; 51.3) | 253 | 37.0 | (33.4; 40.6)
Age < 50 | 74 | 58 | 43 | 74.1 | (62.9; 85.4) | 31 | 41.9 | (30.7; 53.1) | 15 | 25.9 | (14.6; 37.1)
Age 50-75 | 502 | 409 | 265 | 64.8 | (60.2; 69.4) | 237 | 47.2 | (42.8; 51.6) | 144 | 35.2 | (30.6; 39.8)
Age > 75 | 251 | 217 | 123 | 56.7 | (50.1; 63.3) | 128 | 51.0 | (44.8; 57.2) | 94 | 43.3 | (36.7; 49.9)

x: not reported per EHR system. For the EHR system rows, the proportion matching is relative to the JGPN total.

Manual analysis of the full EHR text in a random sample of 120 unregistered NCR cases

(potentially false-negative), shows that, even though these cases were not coded with an

Episode or Journal consultation, for 29% (n=35) information about the cancer diagnosis is

present in the EHR plain text, indicating the GP’s awareness of the diagnosis. For another

23% (n=27) the cancer diagnosis is also mentioned, available in plain text, but the coding

is based on cancer related symptoms and not on the final cancer diagnosis (for example

breast cancer coded as “lump in the breast” X19 and not recoded after confirmation of

the diagnosis). In 17% (n=20) of cases the GP registered a coded cancer diagnosis but

added the date of registry (later than 2011 so not included in primary analysis) instead

of the date of diagnosis. For 10 cases (8%) the cancer was not found in our initial search

(step 1 methods) but appears to be registered correctly when extracting the full EHR. This

means that in 77% (29 + 23 + 17 + 8) of the cancer diagnoses qualified as potentially

false-negative in the EHR registries, the cancer diagnosis is actually known to the GP and

can be recognized in the EHR with adequate text-finding strategies.
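A text-finding strategy of this kind could, in its simplest form, look for cancer terms in the free text of patients who lack the corresponding ICPC code. The sketch below is a deliberately naive illustration: the English search terms and the record layout are assumptions, and real Dutch EHR text would need Dutch terms, spelling variants and handling of negations and suspected diagnoses.

```python
import re

# Hypothetical search terms per ICPC-1 cancer code.
TERMS = {
    "X76": re.compile(r"\b(breast|mamma)\s*(cancer|carcinoma)\b", re.IGNORECASE),
    "R84": re.compile(r"\b(lung|bronchus)\s*(cancer|carcinoma)\b", re.IGNORECASE),
    "D75": re.compile(r"\b(colon|rectal|colorectal|rectum)\s*(cancer|carcinoma)\b", re.IGNORECASE),
    "Y77": re.compile(r"\bprostate\s*(cancer|carcinoma)\b", re.IGNORECASE),
}


def uncoded_mentions(patient):
    """Return cancer codes that are mentioned in free text but not coded."""
    coded = {code.split(".")[0] for code in patient["icpc_codes"]}
    text = " ".join(patient["free_text"])
    return sorted(code for code, pattern in TERMS.items()
                  if code not in coded and pattern.search(text))


symptom_coded = {"icpc_codes": ["X19"],
                 "free_text": ["Letter from surgeon: breast carcinoma confirmed."]}
already_coded = {"icpc_codes": ["X76.01"],
                 "free_text": ["Follow-up after breast cancer surgery."]}
flagged = uncoded_mentions(symptom_coded)
```

Hits from such a search would still need manual review, as was done for the sample in this study, because a plain keyword match cannot tell a confirmed diagnosis from a suspected or excluded one.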

For another 8% (n=10) no information could be traced indicating the presence of

cancer in the full text EHR. For 2% (n=2) another valid explanation for the missing coded

cancer diagnosis is present in the EHR text: one patient moved and deregistered from his GP

and for another patient the diagnosis was made after death and the GP did not add this

diagnosis to the EHR. For 13% (n=16) no written journal EHR data linked to the patient

could be retrieved in the JGPN database.

Analysis of the full EHR text of another 120 randomly selected potentially incorrectly

assigned cancer diagnoses shows that for 18% of these seemingly false-positive

diagnoses, clear and reliable indications of the presence of cancer in the EHR are found,

while no diagnosis is present in the NCR. For 49% (n=59) of false-positive diagnoses, no

reason can be traced in the EHR text (Table 4).

Reviewing these numbers, 90% of NCR-confirmed cancer cases can be recognized and found in primary care EHR systems, accounting for two thirds of the EHR-coded cancer cases. An additional 10% of coded cancer cases in primary care EHR systems should be considered correct, but remains unvalidated since the diagnosis did not reach the Netherlands Cancer Registry for various reasons. Approximately 20% of cancer cases found in EHR systems should be considered wrongly coded, false-positive cases.

Table 3. Two-by-two table for matching and non-matching cancer cases registered from 2007 to 2011 in the JGPN and NCR

Cancer status according to JGPN | NCR: cancer present | NCR: cancer absent | Total
Cancer present | 1,720 | 1,644 | 3,364
Cancer absent | 1,119 | 237,906 | 239,025
Total | 2,839 | 239,550 | 242,389*

* Population of the JGPN registered in the included EHR systems.

Table 4. Causes of false-negative and false-positive registration of cancer diagnoses in the Electronic Health Records

False-positive diagnoses

Number | % | Cause | Comments
59 | 49 | No explanation | No logical reason can be traced in the full EHR text for the registration of a cancer code with this patient
15 | 13 | Coding error by GP | Coding error by GP (e.g. “R84”, lung cancer, when lung cancer is suspected by the GP)
46 | 38 | Diagnosis correct in EHR | Subtotal of the five rows below
16 | 13 | | Year of diagnosis in the EHR is 2011, leaving a small chance that the histological confirmation of the diagnosis (NCR) was performed (and registered) in 2012, while the clinical diagnosis was made and registered in the EHR in 2011
11 | 9 | | Year of diagnosis > 10 years before 2007; cancer not available in the NCR*
8 | 7 | | Cancer registered twice at GP
7 | 6 | | Patient has moved or diagnosis was made abroad, thus not in the NCR*
4 | 3 | | No tissue biopsy was performed in agreement with patient and/or family; thus not in the NCR*
120 | 100 | Total |

False-negative diagnoses

Number | % | Cause | Comments
92 | 77 | Information about the cancer is available | Subtotal of the four rows below
35 | 29 | | Information about the cancer is available in plain text in the EHR but the cancer is not coded
27 | 23 | | GP assigned a wrong code or did not update the existing code after diagnosis (e.g. “X19”, lump in breast, instead of “X76”, breast cancer)
20 | 17 | | The cancer is coded in the EHR but the GP assigned a year of diagnosis > 2011
10 | 8 | | Coded cancer found in full EHR but patient was not in initial search due to time lapse between initial search and linkage (> 1 year)
16 | 13 | EHR record cannot be retrieved | Patient EHR cannot be retrieved (probably linkage error)
10 | 8 | No explanation | No EHR text or codes about any cancer
2 | 2 | Various | Remaining causes: patient has moved, diagnosis after death
120 | 100 | Total |

* NCR = Netherlands Cancer Registry

DISCUSSION

Principal Findings

Extracting coded cancer diagnoses from a primary care EHR (JGPN) and linking these

to the Netherlands Cancer Registry demonstrates matching diagnoses in over 60% of

cases. Almost 40% of cancer cases registered in the NCR are missing in the EHR (Table 2).

However, for at least 77% of these false-negative coded diagnoses, un-coded information

indicating the GP’s knowledge of the cancer can be found in the EHR (Table 4). Overall,

GPs seem to know the great majority of their cancer patients, since 90% of the NCR

validated cancers are also described in EHR systems.
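The 90% figure appears to follow from combining the matched proportion with the share of the sampled false-negative cases in which the cancer could still be traced. The sketch below makes that arithmetic explicit, under the assumption that the 92 of 120 traceable cases in the sample (Table 4) can be extrapolated to all 1,119 potentially false-negative cases.

```python
matched = 1720                 # NCR cases with a matching coded EHR diagnosis
missing = 1119                 # NCR cases without a matching coded EHR diagnosis
ncr_total = matched + missing  # 2,839 NCR cases in total

traceable_in_sample = 35 + 27 + 20 + 10  # 92 sampled cases with information in the EHR
sample_size = 120
traceable_fraction = traceable_in_sample / sample_size

# Share of NCR cases that are either coded or traceable in the EHR.
recognisable = (matched + missing * traceable_fraction) / ncr_total
print(f"{100 * recognisable:.1f}% of NCR cases recognisable in the EHR")
```

Because the traceable fraction comes from a sample of 120, the resulting percentage carries sampling uncertainty and is best read as an approximation.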

Almost half of the coded cancer diagnoses in the EHR seem to be false-positive

(Table 2), of which only a minority can be explained by wrongly used diagnostic coding

such as coding symptoms as actual cancer.

There are differences up to 20% in proportions of correct (matching), missing (false-

negative) and wrong (false-positive) cancer diagnoses between EHR systems but not

between age categories or cancer types. The year of diagnosis in the EHR is confirmed

in over 80% of matching cases. Of the incorrectly registered diagnosis years, 76% deviate

by no more than 2 years from the NCR.

Strengths and Weaknesses

To the best of our knowledge, this is the first study to use record linkage to assess quality

of cancer registry in routine primary care data, combined with a search for actual causes

of inadequate registry and resulting opportunities for improvement. The major strengths

of this study are the size of the cohort, the extensive EHR data and the availability of a

reliable reference standard (Netherlands Cancer Registry). Furthermore, the JGPN database

comprises un-manipulated EHR data as available from routine care, hence not improved

or enriched in any way such as in the UK Clinical Practice Research Datalink, formerly

known as General Practice Research Database.[23]

Our study has some restrictions. Since our study was performed in the Netherlands,

the results are indicative for settings similar to ours, which means countries with the GP

in a gatekeeper role, which have adapted to the use of EHRs in primary care resulting

in relatively “mature” EHR systems. We were able to analyse only a sample of false-

positive and false-negative cases. In our study a number of missing (false-negative)

cases (21% (13 + 8) of our sample of 120) could not be traced in the JGPN, and no

explanation regarding wrongfully coded cancer diagnoses could be found in 49% of

false-positive cases.

Since no unique identifiers could be used for linkage, we used the commonly used

alternative method of probabilistic linkage. Consequently, discrepancies in databases could

in part be a consequence of linkage errors, which could have biased our results in either

direction. The primary problem that may occur, is the rare occurrence of matching two

different patients with identical characteristics. This would result in the false assumption

that a cancer diagnosis registered in the NCR is “missing” in the matched patient in the

EHR. This is expected to occur in < 1% of cases. Another linkage error which may occur

is “no match” for a patient who is registered in both databases, but not with the same

characteristics used for linkage. Since we used date of birth, gender, zip code, last name,

initials and first name, these characteristics are unlikely to be registered differently in the

registries. Only in case of typing errors, moving out of the zip code area or changing last

names within the time frame between dates on which the data were extracted from the

different databases, or in case of not registering such a previous change in one of these

databases, such linkage error will occur. We estimate the chance of such an occurrence

to be below 1%.

To calculate the concordance in year of diagnosis, we used the calendar year in which

the diagnosis was registered. Consequently, both in the EHR and the NCR, a diagnosis

registered in January of a calendar year, e.g. 2011, and a diagnosis registered in December

of the same year are considered as, for this example, “registered in 2011”. This means

that a difference in registration of “one year” could be either a single day (registered in

JGPN on the 31st of December 2010 and in NCR on the first of January 2011) or nearly two

years (registered in JGPN on the first of January 2010 and in NCR on the 31st of December

2011). Because we used this rough measure, we only showed the absolute numbers and

refrained from providing statistical testing for concordance.
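The effect of comparing calendar years rather than dates can be made concrete with the two extreme examples from the text. The helper names below are illustrative; registrations on consecutive days around New Year and registrations almost two years apart both count as a difference of one year.

```python
from datetime import date


def year_difference(ehr_date: date, ncr_date: date) -> int:
    """Difference in calendar years, the measure used in this study."""
    return ncr_date.year - ehr_date.year


def day_difference(ehr_date: date, ncr_date: date) -> int:
    """Actual number of days between the two registrations."""
    return (ncr_date - ehr_date).days


shortest = (date(2010, 12, 31), date(2011, 1, 1))
longest = (date(2010, 1, 1), date(2011, 12, 31))

for ehr_date, ncr_date in (shortest, longest):
    print(year_difference(ehr_date, ncr_date), day_difference(ehr_date, ncr_date))
```

A comparison on full dates would remove this ambiguity, but it requires that both sources record the date of diagnosis rather than only the year.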

Comparison with Existing Literature

Two Dutch [24,25] studies and one Swiss [26] study investigating the coding of

diagnoses in primary care EHRs concluded that the quality of coding in general was

fairly good but varied widely between general practices. None of these studies used

record linkage and the largest study (1.1 million patients [24]) assessed only the presence of

‘meaningful’ ICPC diagnostic codes in the EHR. In another Dutch study [13] the diagnosis

inflammatory arthritis in Dutch EHRs could be validated in 71-78% of 219 patients by

comparison with correspondence from a medical specialist.

In a UK CPRD database study,[14] a high (93%) Positive Predictive Value of Read

Codes for congenital cardiac malformations registered between 1996 and 2010 was found.

However, 31% of cases had a different event date, including 10% that differed no more

than 30 days. These results cannot readily be generalized because practices contributing

data to the CPRD are accepted only when they meet standards of data completeness.

[27] Even though the five studies described so far assessed quality of non-cancer disease

registry, they all present outcomes that are in line with our findings.

There are several studies that assess the quality of disease registry in primary care

for cancer. Boggon et al [15] reported a concordance level of 83.3% between CPRD

records from a Diabetes cohort and the UK National Cancer Data Repository. This is

higher than the proportion of matching cases (concordance level) of 60.6% that we found

at first sight, but might be comparable to the 80% we found when using additional

search techniques. Also, the proportions of false-positives (17%, 967 of 5,797) and false-

negatives (6%, 341 of 5,676) are much lower than the proportions we found, but again,

only high-quality data is imported in the CPRD. Boggon et al also found relevant differences

between cancer types and age-categories (less concordance with increasing age), which

we did not.

Pascoe et al [10] recruited five GP centres for a retrospective analysis of EHR records

on registration of cancer diagnoses compared to a regional cancer registry in the UK.

One in five (20%) of all primary care cancer patients was not identified when a search for

all patients with cancer was conducted using codes for malignancy. Also 20% of patient

records with a code for malignancy that was confirmed in the cancer registry lacked the

necessary documentation to verify the cancer documented in the EHR. Overall, codes for

cancer in these EHRs had a poor level of completeness (29.4%) and correctness (65.6%),

if compared to the UK Cancer Registry as the reference standard.

The California Kaiser Permanente study [28] aimed to assess variability in date of

prostate cancer diagnosis between 2000 and 2010 by comparing Cancer Registry,

pathology reports and EHR data. Variability in date of diagnosis was found: from 9.6 years

earlier to 10 years later but the vast majority of deviations was small. These results are

comparable with the results in our study, although our deviations ranged from -10 to

+4 years. A recent study by Kearney et al,[29] validating the completeness and accuracy of

the Northern Ireland Cancer Registry (NICR), found a high level of completeness (99.9%)

within the NICR compared to the GP registries. The authors suggest that these excellent

results could be induced by the introduction of the National Health Service (NHS) unique

identifier in 2008, which enables matching and data-enrichment, but also by financially

rewarding GPs that maintain a high quality up-to-date record of patients with chronic

diseases, including cancer.[29,30]

Meaning and Implications for Research and Practice

Do GPs know their cancer patients? Yes, our data show they do know the vast majority.

Does this mean re-users of data can retrieve all these patients using coded diagnoses?

No, because the proportion of wrong and missing diagnoses is too high. If re-users

have access to full text, they would be able to identify up to 90% of cases reliably using

labor-intensive manual exploration of EHRs or text-mining techniques. Also, 20% of cases

identified will have to be excluded after re-assessment. For some purposes this will be

acceptable, for other purposes it will not be.

Although we have shown that GPs seem to know their own cancer patients, locums
and doctors working at out-of-hours clinics rely heavily on EHR data, including
coded diagnoses. Missing (false-negative) and wrong (false-positive) cancer diagnoses
in the diagnosis list could have adverse effects on clinical practice, including medical
decisions made elsewhere. Also, patients could perceive errors in diagnosis lists as
unprofessional and unreliable. From a research perspective, erroneously including
non-cancer patients (false-positives) and missing real cancer cases (false-negatives) may
introduce bias. If text-mining techniques are used, these results improve substantially,
as was shown in this study as well as in others.[31] However, the possibility of residual
confounding cannot be completely excluded.

A number of causes for suboptimal registry have been demonstrated in our study
that might be used as a starting point for improving data quality at the source, that is,
at data entry. Improvements could be made (1) through education for practicing and
future GPs, (2) by improving the usability of EHR systems and (3) by improving the
usability of coding systems.

First of all, GPs' awareness and coding skills could possibly be improved through
education in order to decrease coding errors and errors in the registered year of diagnosis.
Although we have not found any studies proving that education actually improves data
quality, we do know that financial incentives as well as feedback using data quality reports
do improve data quality.[30,32,33] This shows that improving registration quality is feasible
and can be learned. Furthermore, GPs could evaluate and update working processes at
the GP practice to integrate diagnosis registry after a letter from a hospital or diagnostic
laboratory is received.

Second, EHR systems could be improved by facilitating user-friendly and accurately
coded diagnosis registry. Some systems show fewer false-positive diagnoses and
a higher number of accurate cancer diagnoses. Since we do not expect these differences
to be caused by confounding resulting from different types of GPs choosing certain
types of EHRs, this implies that differences in system design lead to varying data quality.
Adding options which are now missing, e.g. to directly register date of diagnosis,
suspected, recurring and metastasizing disease, treatment, increased markers (e.g.
Prostate Specific Antigen or PSA) and a positive family history, could also improve quality.

Third, improvements could also be realized by adding codes to the coding system
(e.g. for recurring and metastasizing disease), by providing adequate synonyms to improve
findability of codes and by adding relevant crosslinks to facilitate data-sharing between
sources.

Another strategy that might improve clinical practice, linkage of data sources, was used
in our study. Linkage has its benefits, since multiple data sources are often complementary
and taken together have added value [23], as was also demonstrated in our study. For current
GP practice, the effort that has to be put into requesting and performing the actual
linkage process (including patient informed consent) is not worth the benefit. However,


if structural linkage of EHRs to accurate medical data sources (such as the NCR) should

become available, this information could be used to pro-actively alert and inform the GP

to check and adjust recordings in routine care data.

For research purposes and quality assessment we feel that validation of disease cases
and/or improvement of EHR data is necessary before working with the data, particularly
if only coded data are used. Research using EHR data provides access to a very rich data
source, but its interpretation and use should only be performed in cooperation with
experienced clinicians who can judge the meaning of the information in its context.
Linkage could be one of the tools to decrease the number of false-negative coded cancer
diagnoses. For many diseases, however, no reference standard is available. Linkage to
hospital records could also improve data by decreasing false-negative cases, but the
quality of these data is also likely to be suboptimal. Studying the full EHR, which is
time-consuming, might also be supported by automated text-mining techniques, which will
help identify false-positive records. If in the future these techniques can be made
more advanced and more reliable, they might replace the need for manual searching.

Unanswered Questions and Future Research

Future research should, besides correctness and completeness, evaluate other dimensions
of data quality: concordance, plausibility and currency.[9] Also, the quality of other key data
items besides the diagnosis should be studied, for instance risk factors, treatment and
allergies. In cooperation with the NCR, further exploration of the cases where the EHR
seems to provide reliable indications of the presence of cancer, whereas there is no record
in the NCR, is needed in the near future.

Evaluating the user interface of the various EHR systems and determining how these
explain the differences in data quality in this study would be a worthwhile exercise.
Furthermore, in this study we investigated a serious and relatively common disease;
investigating the quality of registry for rarer or less serious diseases may provide
different results, which would be of additional value.

Last but not least, the design, implementation and evaluation of actual interventions

in the GP practice to improve disease registry would provide a necessary next step in

improving EHR data quality. Improving text-mining software and strategies could be

part of this.

Concluding remarks

Yes, GPs do know the vast majority of their cancer patients. However, re-users of coded
EHR data should be aware that they are at risk of missing 40% of cancer
cases and that almost half of the cancer cases found could be wrongfully registered.
Analysing the full-text EHRs improves these numbers: only 10% of cases will be missed and
20% of cases found will be wrong. Particularly in non-clinical circumstances like research,
when high accuracy is needed, Primary Care EHR data should only be re-used with care.


ACKNOWLEDGEMENTS

The authors thank the registration teams of the Comprehensive Cancer Centers for the

collection of data for the Netherlands Cancer Registry and the scientific staff of the

Netherlands Cancer Registry. We thank the GPs in the Utrecht area participating in the

Julius General Practitioners’ Network for sharing their anonymized EHR data with us for

this study and Julia Velikopolskaia for her assistance in extracting data.


REFERENCES

1. Bleijenberg N, Drubbel I, Ten Dam VH, et al. Proactive and integrated primary care for frail older people: design and methodological challenges of the Utrecht primary care PROactive frailty intervention trial (U-PROFIT). BMC Geriatr 2012;12:16. doi:10.1186/1471-2318-12-16

4. Terry AL, Chevendra V, Thind A, et al. Using your electronic medical record for research: a primer for avoiding pitfalls. Fam Pract 2010;27:121–6. doi:10.1093/fampra/cmp068

7. Coorevits P, Sundgren M, Klein GO, et al. Electronic health records: new opportunities for clinical research. J Intern Med 2013;274:547–60. doi:10.1111/joim.12119

8. Danciu I, Cowan JD, Basford M, et al. Secondary use of clinical data: the Vanderbilt approach. J Biomed Inform 2014;52:28–35. doi:10.1016/j.jbi.2014.02.003

9. Weiskopf NG, Weng C. Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research. J Am Med Inform Assoc 2013;20:144–51. doi:10.1136/amiajnl-2011-000681

10. Pascoe SW, Neal RD, Heywood PL, et al. Identifying patients with a cancer diagnosis using general practice medical records and Cancer Registry data. Fam Pract 2008;25:215–20. doi:10.1093/fampra/cmn023

11. Muller S. Electronic medical records: the way forward for primary care research? Fam Pract 2014;31:127–9. doi:10.1093/fampra/cmu009

12. The Dutch College of General Practitioners. Guideline adequate EHR registry. Revised version 2013. Available at: https://www.nhg.org/themas/publicaties/richtlijn-adequate-dossiervorming-met-het-epd.

13. Nielen MMJ, Ursum J, Schellevis FG, et al. The validity of the diagnosis of inflammatory arthritis in a large population-based primary care database. BMC Fam Pract 2013;14:79. doi:10.1186/1471-2296-14-79

14. Hammad TA, Margulis AV, Ding Y, et al. Determining the predictive value of Read codes to identify congenital cardiac malformations in the UK Clinical Practice Research Datalink. Pharmacoepidemiol Drug Saf 2013;22:1233–8. doi:10.1002/pds.3511

15. Boggon R, van Staa TP, Chapman M, et al. Cancer recording and mortality in the General Practice Research Database and linked cancer registries. Pharmacoepidemiol Drug Saf 2013;22:168–75. doi:10.1002/pds.3374

18. Bentsen BG. International classification of primary care. Scand J Prim Health Care 1986;4:43–50.

19. Netherlands Cancer Registry, 2015. www.cijfersoverkanker.nl/?language=en. Netherlands Comprehensive Cancer Registration.


20. Van Leersum NJ, Snijders HS, Henneman D, et al. The Dutch surgical colorectal audit. Eur J Surg Oncol 2013;39:1063–70. doi:10.1016/j.ejso.2013.05.008

21. Schouten LJ, Jager JJ, van den Brandt PA. Quality of cancer registry data: a comparison of data provided by clinicians with those of registration personnel. Br J Cancer 1993;68:974–7. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1968711&tool=pmcentrez&rendertype=abstract (accessed 20 Aug 2015).

22. Li X, Guttmann A, Demongeot J, et al. An empiric weight computation for record linkage using linearly combined fields’ similarity scores. Conf Proc Annu Int Conf IEEE Eng Med Biol Soc 2014;2014:1346–9. doi:10.1109/EMBC.2014.6943848

23. Herrett E, Shah AD, Boggon R, et al. Completeness and diagnostic validity of recording acute myocardial infarction events in primary care, hospital care, disease registry, and national mortality records: cohort study. BMJ 2013;346:f2350. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3898411&tool=pmcentrez&rendertype=abstract (accessed 18 Feb 2015).

24. Visscher S, ten Veen P, Verheij PR. Kwaliteit van ICPC-Codering (‘Quality of ICPC Coding’). Huisarts en Wet j;10:459–459.

27. Clinical Practice Research Datalink, 2015, www.cprd.com.

28. Porter KR, Chao C, Quinn VP, et al. Variability in date of prostate cancer diagnosis: a comparison of cancer registry, pathology report, and electronic health data sources. Ann Epidemiol 2014;24:855–60. doi:10.1016/j.annepidem.2014.09.004

29. Kearney TM, Donnelly C, Kelly JM, et al. Validation of the completeness and accuracy of the Northern Ireland Cancer Registry. Cancer Epidemiol 2015;39:401–4. doi:10.1016/j.canep.2015.02.005

30. Investing in General Practice; The new general medical services contract. 2004. www.legislation.gov.uk/uksi/2004/291/pdfs/uksi_20040291_en.pdf

32. van der Bij S, Khan N, Ten Veen P, et al. Improving the quality of EHR recording in primary care: a data quality feedback tool. J Am Med Inform Assoc 2016;356:2527–34. doi:10.1093/jamia/ocw054

33. Taggart J, Liaw S-T, Yu H. Structured data quality reports to improve EHR data quality. Int J Med Inform 2015;84:1094–8. doi:10.1016/j.ijmedinf.2015.09.008

CHAPTER 5

Primary Care management of women with breast cancer related concerns - a dynamic cohort study using a Network Database

Annet Sollie MD, MSc, Charles W Helsper MD, PhD, Rosanne J.M. Ader MD, Margreet G.E.M. Ausems MD, PhD, Johannes C van der Wouden PhD, Mattijs E Numans MD, PhD

Published in the European Journal of Cancer Care, June 2016, as: Sollie A, Helsper CW, Ader RJ, Ausems MG, van der Wouden JC, Numans ME. Primary care management of women with breast cancer-related concerns-a dynamic cohort study using a network database. Eur J Cancer Care (Engl). 2016 Jun 15. doi: 10.1111/ecc.12526.


ABSTRACT

The aim of this study was to determine the incidence, management and diagnostic

outcomes of breast cancer related concerns presented in primary care.

A dynamic cohort study was performed in the anonymised routine electronic medical
records (EMRs) extracted from 49 General Practices in The Netherlands (163,471
person-years, women aged 18-75). Main outcome measures were: 1) incidence rates for breast
cancer related concerns in Primary Care, 2) proportions of these women with and without
symptoms of the breast referred for further investigation, 3) proportions of referrals
(not) according to the guideline and 4) proportions of women with breast cancer related
concerns diagnosed with breast cancer during follow-up.

Breast cancer related concerns are presented frequently in Primary Care (incidence
rate 25.9 per 1,000 women annually). About half of these women are referred for further
investigation. There is room to improve GP management, mainly for women with an
increased lifetime risk of developing breast cancer. Information concerning family history
of cancer is often missing in the EMR. Since cancer is rarely diagnosed during follow-up,
particularly when symptoms are absent, reduction of unnecessary concerns is plausible
if identification of those without an increased risk is improved.


BACKGROUND

Breast cancer is by far the most frequently diagnosed cancer among women worldwide
and is still the second leading cause of cancer death among women in more developed
regions [1]. This is the most plausible explanation for why many women have concerns about
breast cancer [2]. Breast cancer related concerns can be caused by physical complaints of the
breast, by fear of breast cancer in general or by experiences with breast cancer, e.g. in
the family [3,4]. In the Netherlands, as in other countries with a gatekeeper health care
system such as the UK, women with potential cancer concerns present first to their
General Practitioner (GP) if they decide to consult a health care worker. GP management
should be focussed on identifying, with as little delay as possible, women with an increased
risk of having or developing breast cancer. Women without an increased risk of breast
cancer should be reassured to prevent further anxiety and fear.

Research has shown that delays in cancer diagnosis can arise in primary
care (between first presentation and referral) [5,6]. Furthermore, studies show that genetic
risk assessment with respect to cancer in primary care is still inadequate [7–11]. This means
there are indications that GP management for a number of women presenting with
breast cancer related concerns is not optimal [12–15]. However, this problem has not
recently been quantified. Also, no recent data are available on the frequency with which
GPs are consulted by women with breast cancer related concerns, nor have
breast cancer outcomes among these women recently been related to the initial reason for
encounter at the GP.

Therefore we aim to determine the incidence, management, and diagnostic outcomes

of breast cancer related concerns presented by women in primary care, by answering

the following research questions:

1. What is the incidence of GP consultations primarily focussed on breast cancer related

concerns, for women with and without symptoms of the breast?

2. What proportion of these symptomatic and asymptomatic women are referred by

their GP for further investigation (breast clinic or radiology department) and/or for

cancer genetic counseling?

3. What proportion of women, from subgroups that present with fear of breast cancer

or a positive family history of breast cancer, are identified by their GP as having an

increased lifetime breast cancer risk?

4. What proportion of women identified by their GPs as having an increased lifetime
risk of developing breast cancer (from question 3) are referred for annual screening
or cancer genetic counseling in accordance with the guideline?

5. What proportion of women presenting in primary care with breast cancer related

concerns are diagnosed with breast cancer during follow-up in relation to the initial

reason for the encounter?


METHODS

A flowchart of the methods used in this study is depicted in Figure 1.

Figure 1. Flowchart of the methods used in this study

Study population: women 18-75 years, registered in 2008, 2009 and/or 2010 in the Julius Network Database (JGPN).

Step 1 - Selection: selection of all women with one or more coded consultations concerning the mamma/breast, grouped according to the ICPC code assigned at first consultation:
- Physical complaints: X18, X19, X20, X20.1, X79, X88
- (Known) breast cancer: X76, X76.1
- Fear of breast cancer: X26
- Positive family history: A29.2

Step 2 - Symptoms or not: determine whether the selected women (X26 and A29.2) are symptomatic or asymptomatic.

Step 3 - Determine GP management:
3.1 - Referral rates to breast clinic, mammogram/ultrasound and genetic counseling
3.2 - Determine individual lifetime risk of breast cancer based on the family history documented in the EMR text
3.3 - Calculate the number of adequate, inadequate and missed referrals for annual screening and genetic counseling

Step 4 - Determine breast cancer patients: number of women with a coded diagnosis of breast cancer at final consultation.

Data were obtained from the Julius General Practitioners’ Network (JGPN) Database [16–19].
This database comprises anonymized, coded data that are periodically extracted
from the GPs’ routine Electronic Medical Records (EMR). The data concern approximately
250,000 patients enlisted with 49 sentinel General Practice centres (120 GPs) in the region
of Utrecht in 2010. Since every Dutch citizen is enlisted with a GP and the practices sharing
their data are randomly spread around the city of Utrecht and surrounding villages, the
involved population is considered representative of the Dutch primary care population
[20]. The available data include ICPC-coded consultations and episodes with diagnoses,


ATC-coded prescribed medication, laboratory test results, and (for a large proportion of

the patients) also coded referrals and coded letters from medical specialists.

GP consultations in The Netherlands are registered according to the “SOAP system”
[21]. A SOAP journal consists of four data fields: Subjective (patient complaint, reason
for consultation); Objective (clinical examination); Analysis ((possible) diagnosis); and Plan.
The A line of a SOAP journal is coded using the International Classification of Primary
Care version 1 (ICPC-1) of the 2009 coding system, published by the WHO [22]. The S, O,
and P lines are usually registered using free text only. The Dutch College of General
Practitioners published a Guideline for Adequate Registry [23]. According to this guideline,
the General Practitioner decides how and when to code a diagnosis or symptom in the
A line. For breast cancer this will usually be done after a letter has been received from a
hospital where the cancer has been diagnosed.
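The structure of a SOAP journal entry described above can be pictured as a small record type. This is only a sketch: the class name, field names and example texts are hypothetical and do not reflect the actual layout of the EMR system or of the JGPN extraction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SoapEntry:
    """One consultation record; all field names are hypothetical."""
    patient_id: str
    subjective: str            # S: complaint, reason for consultation (free text)
    objective: str             # O: clinical examination (free text)
    analysis: str              # A: (possible) diagnosis (free text)
    icpc_code: Optional[str]   # ICPC-1 code attached to the A line, if any
    plan: str                  # P: plan (free text)

    def is_coded(self) -> bool:
        """True when the GP attached an ICPC code to the A line."""
        return self.icpc_code is not None


# Invented example entry for illustration only.
entry = SoapEntry(
    patient_id="w1",
    subjective="Worried about breast cancer.",
    objective="No abnormalities on examination.",
    analysis="Fear of breast cancer",
    icpc_code="X26",
    plan="Explanation and reassurance.",
)
print(entry.is_coded(), entry.icpc_code)
```

Keeping the code optional reflects that the GP decides whether and when an A line is coded, so selections on ICPC codes only see the coded subset.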

GPs were not aware of this study at the time of registering their consultations, nor
did they receive specific training on coding.

Study population: selection of patients and extraction of data from Database

Included were all female patients enlisted in participating JGPN practice centres that use
the Dutch EMR system Promedico-ASP®. Relevant parts of the EMR records of all female
patients between ages 18 and 75 years with one or more ICPC consultation codes that
indicate breast cancer related concerns in 2008, 2009, and/or 2010 were extracted from
the database (Table 1).

Table 1. Relevant ICPC codes used for identification of patients from the database

Physical complaints of the breast
  X18      Breast pain female
  X19      Breast lump / mass female
  X20      Nipple symptom / complaint female
  X20.1    Nipple discharge
  X21      Breast symptom / complaint female other
  X79      Benign neoplasm breast female
  X88      Fibrocystic disease breast

Breast cancer
  X76      Malignant neoplasm breast female
  X76.01   Adenocarcinoma breast female

Fear of breast cancer
  X26      Fear of breast cancer female

Positive family history of breast cancer
  A29.2    Breast cancer in family history


Age at the time of the first visit to the GP that was registered with the selected ICPC

code was determined. Data that were extracted include year of birth; all dates and EMR

records assigned to the selected ICPC-codes; ATC prescribed medication; and referrals

to departments of radiology, surgery, or clinical genetics.

The follow-up time for each patient was defined as the time period between the

date of the first registered consultation with one of the selected ICPC codes and that of

closure of the database for this study (31 December, 2010).
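As an illustration of this definition, follow-up can be computed from the date of the first registered consultation. The sketch below assumes a year length of 365.25 days for the conversion to years, which is a choice made here and not taken from the study; the function and constant names are hypothetical.

```python
from datetime import date

# Closure of the database for this study, as stated in the text.
DATABASE_CLOSURE = date(2010, 12, 31)


def follow_up_years(first_consultation, closure=DATABASE_CLOSURE):
    """Follow-up time in years from the first selected consultation to closure."""
    if first_consultation > closure:
        raise ValueError("first consultation lies after closure of the database")
    # 365.25 days per year is an assumption of this sketch.
    return (closure - first_consultation).days / 365.25


print(round(follow_up_years(date(2008, 1, 1)), 2))
```

A woman first seen on the closure date itself contributes zero follow-up, which is why short follow-up times are common for consultations late in 2010.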

For two subgroups, women whose consultations were registered as fear of breast cancer (X26)
or as a positive family history of breast cancer (A29.2), the complete, anonymized
Electronic Medical Record data were extracted.

Data analysis

The data were analysed in four consecutive steps (Figure 1):

Step 1 (research question 1)

All women with one or more registered ICPC coded consultations that indicate concerns
related to breast cancer in one of the study years were identified in the JGPN database
using the 11 ICPC codes for either complaints of the breast, breast cancer, fear of
breast cancer or a positive family history of breast cancer (Table 1). The results were
grouped according to ICPC code at first consultation and to age category (18–50 years
or 50–75 years). Women between 50 and 75 years of age are invited to participate in the
Dutch national breast cancer screening program [24]. These women are advised to consult
their GP in case of an abnormal mammogram; the GP also receives a notification. It was
not possible to extract these notifications from the EMR.

Multiple consultations with the same code were counted as one; for women who

had multiple consultations with different codes, the code at the first consultation was

chosen as the reason for encounter. Corresponding incidence rates were calculated per

1000 person years.
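The counting rule in this step (one record per woman, with the code at the first consultation as the reason for encounter) can be sketched as follows. The tuple layout, the function name and the toy consultations are hypothetical; ties on the same date are resolved by keeping the record seen first, which is an assumption of this sketch.

```python
from collections import Counter
from datetime import date


def reason_for_encounter(consultations):
    """Return {patient_id: ICPC code of that patient's earliest consultation}."""
    first = {}
    for patient_id, visit_date, code in consultations:
        # Keep only the earliest consultation seen so far for each woman.
        if patient_id not in first or visit_date < first[patient_id][0]:
            first[patient_id] = (visit_date, code)
    return {pid: code for pid, (_, code) in first.items()}


# Invented toy data: (patient id, consultation date, ICPC code).
consultations = [
    ("w1", date(2008, 3, 1), "X19"),
    ("w1", date(2008, 4, 12), "X19"),   # same code again: still one woman
    ("w2", date(2010, 1, 20), "X18"),
    ("w2", date(2009, 6, 5), "X26"),    # earlier visit decides the group
    ("w3", date(2010, 9, 9), "A29.2"),
]
first_codes = reason_for_encounter(consultations)
counts = Counter(first_codes.values())
print(dict(counts))
```

Because each woman appears once in `first_codes`, the per-code counts add up to the number of unique women.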


Step 2

For women who were registered with ICPC code X26 (fear of breast cancer) or A29.2
(family history of breast cancer), the EMR texts in 2008, 2009, and 2010 were studied
to determine whether or not they also reported symptoms of the breast.

Step 3 (research questions 2, 3 and 4)

GP management of these women was determined in three sub-steps:

Step 3.1. First, the number of women that were referred to a breast clinic, for a mammogram

or ultrasound, and to a genetics department were counted by checking coded referrals in

the EMR but also by manually checking all A and P lines in the SOAP journal of the selected

women. Multidisciplinary breast clinics and radiology departments use referral forms, but

the actual referral is written in free text. The text registered by the GP in the EMR or a


referral letter was used in the analysis. Referral rates were calculated as percentages of the

total population of women that consulted their GPs with concerns about breast cancer.

Step 3.2. For the subgroups of women with codes X26 (fear of breast cancer) and A29.2
(positive family history), the presence of an increased risk of developing breast cancer
was determined, using the complete EMR, including free text. Increased risk was defined
according to the primary care guideline “Diagnosing Breast Cancer” from the Dutch
College of General Practitioners, summarized in Table 2 [25].

Table 2. Referral policies for screening and genetic counseling, according to the Dutch guideline “Diagnosing breast cancer”

SCREENING - moderately increased lifetime risk (20–30%): mammography requested by GP from age 40 to 50, possibly supplemented by clinical breast examination
- 1 first and 1 second degree relative diagnosed with breast cancer < 50 years old.
- 2 first degree relatives with breast cancer, regardless of age.
- ≥ 3 first or second degree relatives with breast cancer, regardless of age.
- 1 first degree relative with bilateral or multifocal breast cancer, with first tumor diagnosed at age < 50.
- 1 first or second degree relative with ovarian cancer, regardless of age, and 1 first or second degree relative with breast cancer.

GENETIC COUNSELING - strongly increased lifetime risk of developing breast cancer (> 30%): referral to a genetics department
- 1 first degree relative diagnosed with breast cancer < 35 years old.
- ≥ 2 first degree relatives diagnosed with breast cancer, regardless of age.
- ≥ 3 first or second degree relatives with breast cancer, of which at least one tumor diagnosed before age 50.

Step 3.3. Based on the same guideline it was determined whether women at risk should

be referred for either annual screening, or genetic counseling (table 2). These results

were compared with the registered referral practice of the GP based on the EMR text

and it was determined whether or not referral was done according to the guideline. If

the available information in the EMR was insufficient to determine whether or not referral

was indicated, this was also marked. Subsequently these numbers were converted to

proportions of referrals for annual screening or genetic counseling (not) according to

the guideline.
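The comparison in Step 3.3 can be pictured as a small classification followed by a conversion to proportions. In the sketch below the category labels, the function names and the toy cases are hypothetical; whether a referral is indicated is taken as a given input (True, False, or None when the EMR lacks the required family history detail) rather than derived from Table 2.

```python
from collections import Counter


def judge_referral(indicated, referred):
    """Classify one woman.

    indicated: True/False according to the guideline, or None when the EMR
               information is insufficient to decide.
    referred:  True when the GP referred her.
    """
    if indicated is None:
        return "insufficient information"
    if referred:
        return "according to guideline" if indicated else "not according to guideline"
    return "missed referral" if indicated else "no referral indicated"


def proportions(cases):
    """Convert per-woman judgements into proportions per category."""
    counts = Counter(judge_referral(indicated, referred) for indicated, referred in cases)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Invented toy data: (indicated, referred) for eight women.
cases = [
    (True, True), (True, True), (False, True), (True, False),
    (None, True), (None, True), (None, False), (False, False),
]
result = proportions(cases)
for label, share in result.items():
    print(f"{label}: {share:.1%}")
```

Treating missing information as its own category keeps it from being counted silently as either a correct or an incorrect referral.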


Step 4 (research question 5)

The proportions of women with a coded diagnosis of breast cancer (X76 or X76.1) during
follow-up were determined for each subgroup of initially coded consultations.

All statistical analyses were performed using SPSS (version 20); EMR texts were

extracted as .csv files and analysed using Microsoft Excel 2010.


RESULTS

Incidence rates

The total number of women between the ages of 18 and 75 registered in the database was
54,947 in 2008, 52,885 in 2009, and 55,639 in 2010, giving a total of 163,471 person-years.
As demonstrated in Figure 2 and Table 3, we found 4,240 women with one or
more contacts with breast cancer related ICPC codes within these years.

The mean age of the women at first consultation was 42.1 ± 13.7 years, and 72.4%

were aged <50. Mean duration of follow-up in the dataset was 1.6 ± 0.9 years. The

overall incidence rate for women consulting their GP with breast cancer related concerns

was 25.9 per 1,000 women per year (4,240/163,471). This means that a Dutch GP with

an average list size of 2,350 patients[26] and an age distribution comparable to the

whole country, will be consulted by women concerned about breast cancer or having

complaints concerning the breast about 22 times a year.

Of the 4,240 unique women with a breast cancer related ICPC code, 3,619 (85.3%
or 22.1 per 1,000 per year) were coded as presenting with physical symptoms and
signs of the breast(s). Most of them reported pain (983 = 23.2% or 6.0 per 1,000 per
year) or a lump (1,080 = 25.5% or 6.6 per 1,000 per year). Fear of breast cancer was
registered for 340 women (8.0% or 2.1 per 1,000 per year), and breast cancer in the
family history was registered as the first reason for encounter for 281 women (6.6%
or 1.7 per 1,000 per year). Among the women registered with fear of breast cancer at
first consultation, 138 of the 340 (41%) also reported having one or more symptoms.
Among the 281 women initially coded as having a family history of breast cancer, 37
(13%) were symptomatic at the time of coding (Table 4).

In summary, the incidence rates are 22.1 per 1,000 per year for women presenting

with physical signs and symptoms of the breast at the first consultation, and 3.8 women

per 1,000 per year for women presenting with fear of breast cancer or a positive family

history. The overall incidence rate for women consulting their GP with breast cancer

related concerns is 25.9 per 1,000 per year.
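The rates in this summary follow directly from the counts and registered populations reported above. As a check, the sketch below recomputes them, assuming (as the text does) that each woman registered in a study year contributes one person-year; the function and variable names are hypothetical.

```python
def incidence_per_1000(cases, person_years):
    """Incidence rate per 1,000 person-years."""
    return 1000 * cases / person_years


# Women aged 18-75 registered in 2008, 2009 and 2010 (figures from the text).
person_years = 54_947 + 52_885 + 55_639

# Numbers of women by reason for encounter at first consultation (from the text).
first_consultations = {
    "physical signs and symptoms": 3_619,
    "fear or positive family history": 340 + 281,
    "all breast cancer related concerns": 4_240,
}

rates = {
    group: round(incidence_per_1000(n, person_years), 1)
    for group, n in first_consultations.items()
}
for group, rate in rates.items():
    print(f"{group}: {rate} per 1,000 women per year")
```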

Management

As indicated in Tables 4 and 5, the overall referral rate for further investigation (breast
clinic or radiology department) was 53.2% (N = 2,257 out of 4,240 women consulting their
GP). Roughly a fifth of the referred women (N = 495) were referred to a breast clinic,
which is 11.7% of all women consulting their GP (19.7% of those aged 50–75 years, 8.6% of
those aged 18–50 years); 1,762 women (41.6%) were referred for mammography and/or
ultrasound without further diagnostic facilities, equally spread over the age categories.
Only 47 women (1.1%) were referred to a genetics department, of whom the majority were
in the age group 18–50 (Table 5).
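The referral percentages can be recomputed from the counts in Table 5. The sketch below does so per age group and overall; the dictionary layout and names are hypothetical, the counts are those reported in the table.

```python
# Women consulting their GP per age group (Table 5).
women = {"50-75": 1_170, "18-50": 3_070}

# Referred women per destination and age group (Table 5).
referrals = {
    "breast clinic": {"50-75": 231, "18-50": 264},
    "radiology": {"50-75": 430, "18-50": 1_332},
    "genetics": {"50-75": 6, "18-50": 41},
}


def pct(numerator, denominator):
    """Percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


total_women = sum(women.values())
by_age = {
    dest: {age: pct(n, women[age]) for age, n in counts.items()}
    for dest, counts in referrals.items()
}
overall = {dest: pct(sum(counts.values()), total_women) for dest, counts in referrals.items()}

# Further investigation = breast clinic or radiology department.
further = sum(referrals["breast clinic"].values()) + sum(referrals["radiology"].values())
print(by_age)
print(overall, pct(further, total_women))
```

The age-group percentages use the number of women in that age group as denominator, while the totals use all 4,240 women.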

Figure 2. Number of women with relevant ICPC code at first GP consultation; number developing breast cancer

Table 4. Number and proportion of (a)symptomatic women referred for further investigation, for women with code A29.2 or X26 at first consultation

Consulting GP: n (% of total); Referred: n (% of subgroup); BC = breast cancer

                                    A29.2 at first consultation     X26 at first consultation
                                    (family history of BC)          (fear of BC)
                                    Consulting GP   Referred        Consulting GP   Referred
Symptomatic             50-75 yr    8                               28
                        18-50 yr    29                              110
                        Total       37 (13%)        27 (73%)        138 (41%)       72 (52%)
Asymptomatic            50-75 yr    40                              19
                        18-50 yr    163                             88
                        Total       203 (72%)       109 (54%)       107 (31%)       56 (52%)
EMR info insufficient   50-75 yr    8                               28
                        18-50 yr    33                              67
                        Total       41 (15%)        32 (78%)        95 (28%)        57 (60%)
Total                   50-75 yr    56                              75
                        18-50 yr    225                             265
                        Total       281             168 (60%)       340             185 (54%)

Table 3. Subgroups of women by ICPC code at first GP consultation: n (incidence per 1,000 women per year; 163,471 person-years)

ICPC code at first consultation                      18-50 yr     50-75 yr     18-75 yr
A29.2  Breast cancer in family history               225 (1.4)    56 (0.3)     281 (1.7)
X18    Breast pain female                            777 (4.8)    206 (1.3)    983 (6.0)
X19    Breast lump / mass female                     812 (5.0)    268 (1.6)    1,080 (6.6)
X20    Nipple symptom / complaint female             201 (1.2)    32 (0.2)     233 (1.4)
X20.1  Nipple discharge                              67 (0.4)     11 (0.1)     78 (0.5)
X21    Breast symptom / complaint female other       330 (2.0)    169 (1.0)    499 (3.1)
X26    Fear of breast cancer female                  264 (1.6)    76 (0.5)     340 (2.1)
X76    Malignant neoplasm breast female              56 (0.3)     243 (1.5)    299 (1.8)
X76.1  Adenocarcinoma breast female                  7 (0.0)      35 (0.2)     42 (0.3)
X79    Benign neoplasm breast female                 34 (0.2)     27 (0.2)     61 (0.4)
X88    Fibrocystic disease breast                    299 (1.8)    45 (0.3)     344 (2.1)
Total                                                3,070        1,170        4,240


Table 5. Proportion of women referred to a multidisciplinary breast clinic, radiology department and genetics department

| Referrals | Breast clinic, n (% of 4,240 consulting GP) | Radiology (mammogram or ultrasound), n (% of 4,240 consulting GP) | Genetics department, n (% of 4,240 consulting GP) | Total number of women per age group |
|---|---|---|---|---|
| 50-75 yr | 231 (19.7%) | 430 (36.8%) | 6 (0.5%) | 1,170 |
| 18-50 yr | 264 (8.6%) | 1,332 (43.4%) | 41 (1.3%) | 3,070 |
| Total | 495 (11.7%) | 1,762 (41.6%) | 47 (1.1%) | 4,240 |

Of the women registered with fear of breast cancer at first consultation but without

physical symptoms according to the EMR text, more than half (52%) were referred

for further investigation. For the group of women initially coded as having a family

history of breast cancer, 109 (54%) were referred for further investigation, despite being

asymptomatic (Table 4).
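The referral proportions quoted here can be re-derived from the counts in Table 4. The following is a minimal sketch, not part of the original analysis; the names `TABLE4`, `referral_percentage` and `totals_by_code` are illustrative assumptions, and the counts are simply transcribed from the table.

```python
# Counts transcribed from Table 4 as (women consulting GP, women referred).
TABLE4 = {
    ("A29.2", "symptomatic"): (37, 27),
    ("A29.2", "asymptomatic"): (203, 109),
    ("A29.2", "EMR info insufficient"): (41, 32),
    ("X26", "symptomatic"): (138, 72),
    ("X26", "asymptomatic"): (107, 56),
    ("X26", "EMR info insufficient"): (95, 57),
}


def referral_percentage(consulting, referred):
    """Return the share of a subgroup that was referred, in percent."""
    if consulting <= 0:
        raise ValueError("subgroup must contain at least one woman")
    return 100.0 * referred / consulting


def totals_by_code(table):
    """Sum consulting and referred counts over the subgroups of each ICPC code."""
    totals = {}
    for (code, _subgroup), (consulting, referred) in table.items():
        prev_consulting, prev_referred = totals.get(code, (0, 0))
        totals[code] = (prev_consulting + consulting, prev_referred + referred)
    return totals


if __name__ == "__main__":
    for (code, subgroup), (consulting, referred) in TABLE4.items():
        share = referral_percentage(consulting, referred)
        print(f"{code} {subgroup}: {referred}/{consulting} = {share:.0f}%")
    for code, (consulting, referred) in totals_by_code(TABLE4).items():
        share = referral_percentage(consulting, referred)
        print(f"{code} total: {referred}/{consulting} = {share:.0f}%")
```

Rounded to whole percentages, the asymptomatic subgroups give 54% (A29.2) and 52% (X26), in line with the figures reported in the text.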

Identification and management of women with increased lifetime risk

Increased lifetime risk (subgroups X26 and A29.2 at first consultation) was established and registered by GPs in 46 and 128 women in subgroups X26 and A29.2, respectively. For

324 of the 621 EMR files studied (52%), the information regarding family history was

insufficient to determine whether women have an increased lifetime risk and should be

referred for annual screening or genetic counseling (table 6). This includes 89 (32 + 57)

EMR files from patients that were actually referred.

Moderately increased lifetime risk (20–30%). A total of 34 (10%) women out of 340

with an ICPC code X26 (fear of breast cancer) at first consultation and a total of 105

(37%) women out of 281 with an ICPC A29.2 (positive family history of breast cancer)

at first consultation were referred for annual screening and, thus, had been identified by

the GP as having a moderately increased lifetime risk (table 6). Of these referrals, 18–25% were made according to the guideline, 13–15% were not, and for 62–68% the information in the EMR was insufficient to determine whether or not the referral followed the guideline (table 6). Three women who

should have been referred for annual screening were missed.

Strongly increased lifetime risk (>30%). For genetic counseling, 12 (4%) women who

were coded as presenting with fear of breast cancer and 23 (8%) coded with a family

history of breast cancer were referred and, thus, had been identified by the GP as having

a strongly increased lifetime risk. Referral was according to the guideline for 13–33% and not according to the guideline for 26–42%. For 25–61% of referrals it was

impossible to determine whether or not they were done according to the guideline due to

insufficient data in the EMR text (table 6). Seven women who should have been referred

for genetic counseling were missed.


Outcomes

A total of 450 women (10.6% of the N = 4,240 women consulting their GP with breast cancer related concerns) were registered as having breast cancer (ICPC code X76 and/or X76.1 at final consultation in our population during follow-up time). Of these women, 109 (2.6% of N = 4,240) developed breast cancer after first consulting their GP with complaints of the breast or concerns about developing breast cancer (table 3, figure 2). For the remaining 341 women (8.0% of N = 4,240), the GP registered breast cancer as the reason for encounter at first consultation; 278 of them (6.5% of N = 4,240) were within the age category 50–75 years and, thus, had been invited to participate in the national breast cancer screening program. Leaving out the patients in this age category gives a percentage of 4% ((450 − 278)/4,240) developing breast cancer during follow-up time (1.6 ± 0.9 years).

Women consulting their GP with a lump in the breast, coded as benign neoplasm of

the breast (X79) or breast lump (X19), were registered with breast cancer during follow

up in 11.5% (n = 7 of 61) and 5.9% (n=64 of 1,080) of cases, respectively. Of the women

presenting with fear of breast cancer or with a family history of breast cancer, only 0.9%

(N = 3 of 340) and 2.1% (N = 6 of 281) were diagnosed with breast cancer during follow up.
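The outcome percentages above follow from a few counts. The sketch below recomputes them; it assumes that the 4% figure is obtained by excluding from the 450 registered cases the 278 women aged 50–75 years who were already coded with breast cancer at first consultation. The constant and function names are illustrative only.

```python
N_CONSULTING = 4240           # women consulting with breast cancer related concerns
REGISTERED_CANCER = 450       # X76 and/or X76.1 at final consultation
CANCER_AT_FIRST_VISIT = 341   # X76 or X76.1 already coded at first consultation
SCREENING_AGE_AT_FIRST = 278  # of those 341, women aged 50-75 years


def pct(numerator, denominator=N_CONSULTING):
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


# Women who developed breast cancer after their first consultation.
developed_after_first = REGISTERED_CANCER - CANCER_AT_FIRST_VISIT

# Assumption: the 4% figure leaves out only the screening-age women
# who were coded with breast cancer at first consultation.
excluding_screening_age = REGISTERED_CANCER - SCREENING_AGE_AT_FIRST

if __name__ == "__main__":
    print("registered with breast cancer:", pct(REGISTERED_CANCER), "%")
    print("developed after first consultation:", pct(developed_after_first), "%")
    print("excluding screening-age cases:", pct(excluding_screening_age), "%")
    print("benign neoplasm (X79):", pct(7, 61), "%")
    print("breast lump (X19):", pct(64, 1080), "%")
    print("fear of breast cancer (X26):", pct(3, 340), "%")
    print("family history (A29.2):", pct(6, 281), "%")
```

Under that assumption the last adjustment corresponds to 172 of 4,240 women, which rounds to the 4% quoted in the text.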

Table 6. Proportion of referrals for annual screening and genetic counseling in accordance with guideline

| | X26 fear of breast cancer: moderately increased lifetime risk (20-30%), referrals for annual screening, n (%) | X26 fear of breast cancer: strongly increased lifetime risk (>30%), referrals for genetic counseling, n (%) | A29.2 family history of breast cancer: moderately increased lifetime risk (20-30%), referrals for annual screening, n (%) | A29.2 family history of breast cancer: strongly increased lifetime risk (>30%), referrals for genetic counseling, n (%) |
|---|---|---|---|---|
| Referral in accordance with guideline (1) | 6 (18%) | 4 (33%) | 26 (25%) | 3 (13%) |
| incl. BRCA positive family (2) | 0 | 5 | 3 | 2 |
| Referral not according to guideline (3) | 5 (15%) | 3 (42%) | 14 (13%) | 6 (26%) |
| Information not assessable (4) | 23 (68%) | 3 (25%) | 65 (62%) | 14 (61%) |
| Total referrals | 34 | 12 | 105 | 23 |

(1) Based on the EMR data the patient was considered at risk and was referred as instructed according to the Guideline “Diagnosing Breast Cancer” from the Dutch College of General Practitioners (NHG).
(2) Family with proven mutation in BRCA1 or BRCA2 gene.
(3) Based on the EMR data the patient had an increased risk but was not referred according to the guideline, or did not have an increased risk but had a referral.
(4) Information in the EMR was ambiguous, making it impossible to decide whether or not a patient had an increased risk.


DISCUSSION

Summary

This dynamic cohort study in the Julius GP’s Network database was performed to

determine incidence rates, management, and diagnostic outcomes for women in primary

care with concerns related to breast cancer. The overall incidence rate for breast cancer

related concerns is 25.9 per 1,000 women per year, the majority of them presenting

with physical signs and symptoms of the breast (85.3% or 22.1 per 1,000 per year).

The referral rate for further investigation is just over 50%, for symptomatic as well as

for asymptomatic women. About a quarter of referrals for annual screening or genetic

counseling were determined as performed according to the guideline, about a quarter

were not, and the information in the EMR was insufficient for the remainder. Because a

large proportion of cases could not be assessed, it is unclear how many women that should

have been referred for annual screening or genetic counseling because of an increased

lifetime risk were missed. Among the women presenting with complaints of the breast

or concerns about developing breast cancer, 2.6% were diagnosed with breast cancer

during follow-up. Women presenting with a lump in the breast were most frequently

diagnosed with breast cancer during follow up.
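The incidence rates summarised above are counts divided by the observed person-time. As a minimal sketch (the function name is an illustrative assumption; the counts and the 163,471 person years are taken from Table 3), the rates per 1,000 women per year can be recomputed as follows.

```python
PERSON_YEARS = 163_471  # total follow-up time reported with Table 3


def incidence_per_1000(cases, person_years=PERSON_YEARS):
    """Incidence rate expressed per 1,000 women per year."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * cases / person_years


if __name__ == "__main__":
    overall = incidence_per_1000(4240)
    print(f"all breast cancer related concerns: {overall:.1f}")
    # 85.3% of presentations involved physical signs and symptoms.
    print(f"physical signs and symptoms: {0.853 * overall:.1f}")
    print(f"breast lump (X19): {incidence_per_1000(1080):.1f}")
    print(f"fear of breast cancer (X26): {incidence_per_1000(340):.1f}")
```

With 4,240 women and 163,471 person years this reproduces the overall rate of 25.9 per 1,000 women per year, and 85.3% of that rate gives the 22.1 per 1,000 reported for physical signs and symptoms.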

Strengths and limitations

Major strengths of the present study are the size of the cohort and the quality of the data. The JGPN database comprises well-documented information on enlisted patients. Characteristics of these patients did not differ from the overall Dutch population, and the main characteristics of the GPs were comparable with those of all Dutch GPs with respect to age, gender, part-time and full-time work, and practice in urban and rural areas [20]. To

the best of the present authors’ knowledge, this is the first report on the actual referral

practice of GPs for this patient group and the first report where GP referral behaviour is

compared with the national guidelines.

The use of routine care data has limitations that should be kept in mind when

interpreting results. These limitations relate to the patient, the GP, coding, and the EMR.

First, the patient decides whether and when she consults her GP with a breast cancer related

concern. Others might have the same concerns without visiting the doctor. Secondly, GPs

decide what they consider to be the most important symptom presented by their patients

during consultation to be registered in the EMR. This might result in one GP registering

a symptom as pain in the breast, while another registers worry about breast cancer. A

third uncertainty in the present data is the ICPC-1 coding system. Misclassification, lack of

available codes, and differences in classification between years and GPs cannot be ruled

out. This may have resulted in either an overestimation or underestimation of the true

incidence rates. However, we know that in larger populations and among larger groups


of GPs sharing their data, differences in presentation, prioritization, and lack of detail

become less significant [27]. Furthermore, we know that variation in registration rules in

different EMR systems may be a cause of heterogeneity in extracted data. To minimise

the effect of heterogeneity in the present study, we restricted data collection to only one

type of EMR. Also, patient preferences discussed during the consultation for or against

referral are not captured using ICPC codes. Finally, restrictions in registration possibilities

in existing EMR systems should be taken into account (e.g., the limited possibilities in

Dutch EMR systems to register family history information, leading to possible registration

of this information as plain text). In the present study, this information (in plain text)

appeared to be insufficient to determine increased lifetime risk for a large proportion

of cases.

However, the fact that information concerning family history is missing in a large part

of EMR files could also mean that the information is available but not registered by the

GP, that the GP lacks knowledge in this area or the guidelines are not readily available or

unclear, or that the GP has other reasons to deviate from the guidelines. Previous studies

have shown that GPs currently lack knowledge and confidence in this area [28] and that

there is an urgent need for a genetics curriculum for postgraduate and continuing general

practice education in this area [29].

Another limitation might be that our study population also comprised women who participate in the national breast cancer screening program (age category

50–75 years). Within the screening programme, abnormal mammography results are

reported to GPs together with advice to refer to a multidisciplinary breast clinic. Inclusion

of these women has probably resulted in an overestimation of the “true” rate of women

that would consult their GP with complaints of the breast or concerns about developing

breast cancer. However, an average Dutch general practice is faced with only three

positive screening referrals per year, only one of whom will be diagnosed with breast

cancer [30]. Note that these women may also be less inclined to visit their GP with

concerns about breast cancer in between screening mammograms.

Finally, in the present study, the follow-up time was restricted to a maximum of

three years after the first consultation. This is likely to be too short to determine the

total number of women developing breast cancer, leading to an underestimation of

this outcome.

Comparison with existing literature

The incidence rates for breast cancer related concerns found in the present study are

considerably higher than those in older data presented in two UK studies (data from

1995–1996) but slightly lower than in another Dutch study (data from 1985–2003).

However, we included patients that the GP registered as having breast cancer at first

consultation (X76 and X76.1), in contrast to the other studies described here.


The BRIDGE Study Group recorded presentation rates of breast symptoms in 34 general practices in South Wales in 1995–1996. These presentation rates ranged from 1.9 to 14.8 patients per GP per year (median 6.5): 46.4% with breast lump, 28.2% with breast pain, 16.2% with lumpiness, and 5.5% with nipple discharge [31]. This translates to 2.2–17.0 per 1,000 women per year (median 7.4), calculated with an average general practice size of 1,712 in the UK in 1997 (Royal College of General Practitioners 2004) and 51% of the population being female in 2001 [32]. Note that these numbers only include women with physical complaints of the breast (in our study 22.1 per 1,000 women per year). In a study by Newton [33], 257 GPs from Sheffield (UK)

participated and recorded a mean number of 2.05 consultations over a 4-week period

in 1995. If annual figures are extrapolated from these data, they suggest that each GP

sees 15.8 women with new breast problems per year, or 18 per 1,000 women per year.

These numbers include all “breast problems,” including women that present with a

positive family history. This study does not explicitly present women with fear of breast

cancer as a separate group.
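The conversion from presentations per GP to rates per 1,000 women used in this comparison can be written out explicitly. This is a sketch of the arithmetic only, using the practice size of 1,712 and the 51% female share stated above; the function and constant names are illustrative assumptions.

```python
LIST_SIZE = 1712     # average general practice size used in the text
FEMALE_SHARE = 0.51  # proportion of the population that is female


def per_1000_women(patients_per_gp_per_year,
                   list_size=LIST_SIZE,
                   female_share=FEMALE_SHARE):
    """Convert a per-GP presentation rate into a rate per 1,000 women per year."""
    women_per_gp = list_size * female_share
    return 1000.0 * patients_per_gp_per_year / women_per_gp


if __name__ == "__main__":
    # BRIDGE Study Group: range and median of presentations per GP per year.
    for label, rate in [("minimum", 1.9), ("median", 6.5), ("maximum", 14.8)]:
        print(f"BRIDGE {label}: {per_1000_women(rate):.1f} per 1,000 women per year")
    # Newton: 15.8 women with new breast problems per GP per year.
    print(f"Newton: {per_1000_women(15.8):.0f} per 1,000 women per year")
```

Each GP is assumed to serve about 873 women (1,712 × 0.51), so dividing the per-GP rate by that number and multiplying by 1,000 yields the figures quoted for both UK studies.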

The incidence rates found in the present study are slightly lower than the findings of Eberl [34], who studied routine family practice data from Dutch GP practices between

1985 and 2003 on breast symptoms. Breast symptoms were reported in 29.7 consultations

per 1,000 active female patients per year, with breast pain (13 per 1,000 per year) and

breast mass (9 per 1,000 per year) being the most common breast-related complaints.

Note that this excludes 3.6 per 1,000 per year who consult their GP for fear of breast

cancer (compared with 2.0 per 1,000 in the present study). This means that Eberl found an

overall incidence rate of 33.3 per 1,000 women per year, compared to 25.9 per 1,000 per

year in the present study. Incidence rates found in her study were 2.5 per 1,000 women

per year for nipple complaints and 4.3 per 1,000 per year with other breast complaints.

A possible explanation for the differences in incidence rates found could be that these

are a reflection of the incidence trends of breast cancer. Toriola [35] summarizes these for the US as having four distinct patterns: an increase over 45 years (1943–1979), a more rapid increase over 20 years (1980–1999), punctuated by a gradual (2000–2002) and then sharp (2002–2003) decline, and a post-2003 period of stable incidence rates. Another explanation may be that there is a growing awareness in recent years among women concerning breast cancer, which prompts more GP visits. Furthermore, there may

be slight differences in ICPC codes used in the different studies, resulting in differing

inclusion conditions.

Concerning referral patterns, limited literature was available. The BRIDGE Study Group, mentioned above, reported that, in 1995–1996, 55% of all patients were referred [36]. This is in line with the results found in the present study, with an overall referral rate of 54.3%. Newton [33] found that, in 1995, at an initial consultation for breast symptoms,

GPs referred approximately one-third of women to secondary care.


Eberl [34], who studied routine family practice data from Dutch GP practices between 1985 and 2003 on breast symptoms (mean registration time 5.6 years), found that of the women complaining of breast symptoms, 81 (3.2%) had breast cancer diagnosed, which is much lower than the 10% found in the present study. Note that we included patients who were coded as having breast cancer at first consultation, while Eberl probably did not (no information available in the published material). Leaving out these patients means that only 2.6% developed breast cancer ((450 − 341)/4,240); however, these numbers include women with a positive screening advice from the national breast cancer screening program. Leaving out the patients in this age category (278 women aged 50–75 years) gives a percentage of 4% ((450 − 278)/4,240) as developing breast cancer, which closely resembles the Eberl data.

A recent UK CPRD study by Walker [37] revealed that the PPV of breast cancer (diagnosed between 2000 and 2009) with a breast lump presented at the GP was 4.8% in women aged 40–49 years, rising to 48% in women aged >70 years. PPVs were lower in women who also reported breast pain. Hippisley-Cox [38] found, also using UK primary care data (2000–2012), that a breast lump was associated with a 51-fold increased risk

of breast cancer. These studies confirm our finding that GPs should be aware of women

presenting with a lump in the breast because they are most frequently diagnosed with

breast cancer (5.9%, figure 2) during follow up.

CONCLUSIONS

This study demonstrates that breast cancer related concerns are frequently presented in primary care. The referral rate for further investigation is over 50%. There is room to improve GP management, mainly for women with an increased lifetime risk of developing breast cancer. Information regarding the family history is often missing in the EMR. Furthermore,

since only 2.6% of women with breast cancer related concerns were diagnosed with

breast cancer during follow-up time, substantial reduction of unnecessary concerns is

plausible by improving identification of those without an increased risk of breast cancer.

Unanswered Questions and Future Research

Future research aimed at the underlying mechanism of low adherence to guidelines for

referral by GPs in case of an increased lifetime risk, and poor registration of the family

history in the EMR, could assist in the development of effective interventions to improve

referral practice of patients at increased genetic risk.

Strategies to be evaluated include: (1) increasing awareness of the importance of registering an increased lifetime risk; (2) optimizing the availability of up-to-date and easy-to-use (online) referral guidelines, for example by integrating them into the EMR; (3) developing training for GPs in taking, assessing and registering a family history; and (4) enabling the registration of a family history within the context of the EMR.


REFERENCES

1. Ferlay J, Soerjomataram I, Dikshit R, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer 2014;136:E359–86. doi:10.1002/ijc.29210

2. Consedine NS, Magai C, Krivoshekova YS, et al. Fear, anxiety, worry, and breast cancer screening behavior: a critical review. Cancer Epidemiol Biomarkers Prev 2004;13:501–10. http://www.ncbi.nlm.nih.gov/pubmed/15066912 (accessed 13 Dec 2015).

3. Bennett P, Parsons E, Brain K, Hood K. Long-term cohort study of women at intermediate risk of familial breast cancer: experiences of living at risk. 2010.

4. Gibbons A, Groarke A. Can risk and illness perceptions predict breast cancer worry in healthy women? J Health Psychol Published Online First: 23 February 2015. doi:10.1177/1359105315570984

5. Neal RD. Do diagnostic delays in cancer matter? Br J Cancer 2009;101 Suppl :S9–12. doi:10.1038/sj.bjc.6605384

6. Maclean R, Jeffreys M, Ives A, et al. Primary care characteristics and stage of cancer at diagnosis using data from the national cancer registration service, quality outcomes framework and general practice information. BMC Cancer 2015;15:500. doi:10.1186/s12885-015-1497-1

7. Ardern-Jones A, Kenen R, Eeles R. Too much, too soon? Patients and health professionals’ views concerning the impact of genetic testing at the time of breast cancer diagnosis in women under the age of 40. Eur J Cancer Care (Engl) 2005;14:272–81. doi:10.1111/j.1365-2354.2005.00574.x

8. Carroll JC, Rideout AL, Wilson BJ, et al. Genetic education for primary care providers: improving attitudes, knowledge, and confidence. Can Fam Physician 2009;55:e92–9. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2793208&tool=pmcentrez&rendertype=abstract (accessed 17 Jan 2015).

9. McCann S, MacAuley D, Barnett Y. Genetic consultations in primary care: GPs’ responses to three scenarios. Scand J Prim Health Care 2005;23:109–14. doi:10.1080/02813430510015259

10. Nippert I, Harris HJ, Julian-Reynier C, et al. Confidence of primary care physicians in their ability to carry out basic medical genetic tasks-a European survey in five countries-Part 1. J Community Genet 2011;2:1–11. doi:10.1007/s12687-010-0030-0

11. Vasen HFA, Möslein G, Alonso A, et al. Recommendations to improve identification of hereditary and familial colorectal cancer in Europe. Fam Cancer 2010;9:109–15. doi:10.1007/s10689-009-9291-3

12. Burke W, Culver J, Pinsky L, et al. Genetic assessment of breast cancer risk in primary care practice. Am J Med Genet A 2009;149A:349–56. doi:10.1002/ajmg.a.32643

13. Campbell H, Holloway S, Cetnarskyj R, et al. Referrals of women with a family history of breast cancer from primary care to cancer genetics services in South East Scotland. Br J Cancer 2003;89:1650–6. doi:10.1038/sj.bjc.6601348

14. Febbraro T, Robison K, Wilbur JS, et al. Adherence patterns to National Comprehensive Cancer Network (NCCN) guidelines for referral to cancer genetic professionals. Gynecol Oncol 2015;138:109–14. doi:10.1016/j.ygyno.2015.04.029

15. Bell RA, McDermott H, Fancher TL, et al. Impact of a randomized controlled educational trial to improve physician practice behaviors around screening for inherited breast cancer. J Gen Intern Med 2015;30:334–41. doi:10.1007/s11606-014-3113-5


17. Hamoen EH, Reukers DF, Numans ME, et al. Discrepancies between guidelines and clinical practice regarding prostate-specific antigen testing. Fam Pract 2013;30:648–54. doi:10.1093/fampra/cmt045


18. Kasteleyn MJ, Wezendonk A, Vos RC, et al. Repeat prescriptions of guideline-based secondary prevention medication in patients with type 2 diabetes and previous myocardial infarction in Dutch primary care. Fam Pract 2014;31:688–93. doi:10.1093/fampra/cmu042

19. Lacourt TE, Houtveen JH, Smeets HM, et al. Infection load as a predisposing factor for somatoform disorders: evidence from a Dutch general practice registry. Psychosom Med 2013;75:759–64. doi:10.1097/PSY.0b013e3182a3d91f



22. Bentsen BG. International classification of primary care. Scand J Prim Health Care 1986;4:43–50.

23. Dutch College of General Practitioners. Guideline adequate EHR registry. Revised version 2013. Available at: https://www.nhg.org/themas/publicaties/richtlijn-adequate-dossiervorming-met-het-epd.

24. Dutch breast cancer screening programme by the National Institute for Public Health and the Environment. http://rivm.nl/en/Topics/B/Breast_cancer_screening_programme

25. Zonderland HM, Tuut MK, den Heeten GJ, et al. [Revised practice guideline ‘Screening and diagnosis of breast cancer’]. Ned Tijdschr Geneeskd 2008;152:2336–9. http://www.ncbi.nlm.nih.gov/pubmed/19024064 (accessed 17 Jan 2015).

26. Schäfer WLA, van den Berg MJ, Groenewegen PP. De werkbelasting van huisartsen in internationaal perspectief. Huisarts Wet;3:94–101. http://www.henw.org/archief/volledig/id12123-de-werkbelasting-van-huisartsen-in-internationaal-perspectief.html.

27. van Bommel MJ, Numans ME, de Wit NJ, et al. Consultations and referrals for dyspepsia in general practice: a one year database survey. Postgrad Med J 2001;77:514–8. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1742094&tool=pmcentrez&rendertype=abstract (accessed 18 Jan 2015).

28. Watson E, Clements A, Yudkin P, et al. Evaluation of the impact of two educational interventions on GP management of familial breast/ovarian cancer cases: a cluster randomised controlled trial. Br J Gen Pract 2001;51:817–21. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1314127&tool=pmcentrez&rendertype=abstract (accessed 17 Jan 2015).

29. Houwink EJF, Henneman L, Westerneng M, et al. Prioritization of future genetics education for general practitioners: a Delphi study. Genet Med 2012;14:323–9. doi:10.1038/gim.2011.15

30. Verbeek ALM, van Dijck JAAM, Kiemeney LALM, et al. [Responsible cancer screening]. Ned Tijdschr Geneeskd 2011;155:A3934. http://www.ncbi.nlm.nih.gov/pubmed/22085573 (accessed 17 Jan 2015).

31. The BRIDGE Study Group. The presentation and management of breast symptoms in general practice in South Wales. Br J Gen Pract 1999;49:811–2. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1313533&tool=pmcentrez&rendertype=abstract (accessed 17 Jan 2015).

32. Palgrave Macmillan. Social Trends. Available at: http://www.palgrave-journals.com.proxy.library.uu.nl.st/journal/v39/n1/full/st20093a.html. 2009.

33. Newton P, Hannay DR, Laver R. The presentation and management of female breast symptoms in general practice in Sheffield. Fam Pract 1999;16:360–5. http://www.ncbi.nlm.nih.gov/pubmed/10493705 (accessed 17 Jan 2015).

34. Eberl MM, Phillips RL, Lamberts H, et al. Characterizing breast symptoms in family practice. Ann Fam Med 2008;6:528–33. doi:10.1370/afm.905

35. Toriola AT, Colditz GA. Trends in breast cancer incidence and mortality in the United States: implications for prevention. Breast Cancer Res Treat 2013;138:665–73. doi:10.1007/s10549-013-2500-7


36. The BRIDGE Study Group. The presentation and management of breast symptoms in general practice in South Wales. Br J Gen Pract 1999;49:811–2.

37. Walker S, Hyde C, Hamilton W. Risk of breast cancer in symptomatic women in primary care: a case-control study using electronic records. Br J Gen Pract 2014;64:e788–93. doi:10.3399/bjgp14X682873

38. Hippisley-Cox J, Coupland C. Symptoms and risk factors to identify women with suspected cancer in primary care: derivation and validation of an algorithm. Br J Gen Pract 2013;63:e11–21. doi:10.3399/bjgp13X660733

Strategies & Solutions for improving data quality and enabling data reuse and sharing

CHAPTER 6

A new coding system for metabolic disorders demonstrates gaps in the international disease classifications ICD-10 and SNOMED-CT, which can be barriers to genotype-phenotype data sharing

Annet Sollie, Rolf H. Sijmons, Dick Lindhout, Ans T. van der Ploeg, M. Estela Rubio Gozalbo, G. Peter A. Smit, Frans Verheijen, Hans R. Waterham, Sonja van Weely, Frits A. Wijburg, Rudolph Wijburg, and Gepke Visser

Published in Human Mutation, July 2013 as: Sollie A, Sijmons RH, Lindhout D, van der Ploeg AT, Rubio Gozalbo ME, Smit GP, Verheijen F, Waterham HR, van Weely S, Wijburg FA, Wijburg R, Visser G. A new coding system for metabolic disorders demonstrates gaps in the international disease classifications ICD-10 and SNOMED-CT, which can be barriers to genotype-phenotype data sharing. Hum Mutat. 2013 Jul;34(7):967-73. doi: 10.1002/humu.22316.


ABSTRACT

Data sharing is essential for a better understanding of genetic disorders. Good phenotype

coding plays a key role in this process. Unfortunately, the two most widely used coding

systems in medicine, ICD-10 and SNOMED-CT, lack information necessary for the detailed

classification and annotation of rare and genetic disorders. This prevents the optimal

registration of such patients in databases and thus data-sharing efforts. In order to improve

care and to facilitate research for patients with metabolic disorders we developed a new

coding system for metabolic diseases with a dedicated group of clinical specialists. Next,

we compared the resulting codes with those in ICD and SNOMED-CT. No matches were

found in 76% of cases in ICD-10 and in 54% in SNOMED-CT. We conclude that there

are sizable gaps in the SNOMED-CT and ICD coding systems for metabolic disorders.

There may be similar gaps for other classes of rare and genetic disorders. We have

demonstrated that expert groups can help in addressing such coding issues. Our coding

system has been made available to the ICD and SNOMED-CT organizations as well as to

the Orphanet and HPO organizations for further public application, and updates will be

published online (www.ddrmd.nl and www.cineas.org).


INTRODUCTION

Data sharing is essential for a better understanding of rare genetic disorders and the

underlying genetic defects. Good phenotype coding plays a key role in this process and

also in general in processes where phenotype data needs to be entered into clinical

registries, genotype-phenotype databases, and biobanks, and shared between them.

Such initiatives to register, combine and exchange clinical and research data are pivotal

in supporting research and improving health care [Jones et al., 2011; Richesson and

Vehik, 2010].

Rare diseases are life threatening or chronically debilitating diseases with a

prevalence of up to 5 per 10,000 inhabitants in the European Union (EU). It is

estimated that there are at least 5,000 rare diseases, many of them genetic, affecting

6-8% of the total population in the EU, which implies that a minimum of 27 million people in

the EU are affected [European Medicines Agency (EMA), 2007]. In the United States,

the Rare Disease Act of 2002 also defined rare disease according to prevalence,

specifically “any disease or condition that affects less than 200,000 persons in

the United States”, or about 1 in 1,500 people. Although there are no disease-

modifying therapies for most rare diseases, the passing of the 1983 U.S. Orphan

Drug Act (ODA) [Food and Drug Administration] and European legislation in 2000

[European Parliament. Regulation (EC) No. 141/2000 of the European Parliament

and the Council of 16 December 1999 on orphan medicinal products] stimulated

new research lines by creating financial incentives and other supportive measures

for developers of new drugs to treat people with rare diseases [Talele et al., 2010].

It is expected that many more rare diseases will become amenable to treatment

within the next few decades.

The need to improve research and care in the field of rare disorders, which can be

strongly supported by the sharing and combining of data on these rare patients, has

also been recognized by the Council of the European Union. Through their European

Action in the Field of Rare Diseases [Official Journal of the European Union, Council

recommendation of 8 June 2009 on an action in the field of rare disease], signed in 2009,

the EU member states committed themselves to establishing and implementing a national

rare disease action plan and to cooperating at a European level on this health issue. The

European Action stated that member states should “aim to ensure that rare diseases

are adequately coded and traceable in all health information systems, encouraging

an adequate recognition of the disease in the national healthcare and reimbursement

systems based on the ICD.”

Unfortunately, the two most widely used coding systems in medicine, ICD (the WHO’s International Classification of Diseases, www.who.int/classifications/icd) and SNOMED-CT (Systematized Nomenclature of Medicine Clinical Terms, www.ihtsdo.org), lack


essential details for classifying and annotating rare and hereditary disorders. This is a

barrier to the optimal registration of patients with these disorders in databases, and

to much needed data-sharing efforts, such as those in the Human Variome Project

(http://www.humanvariomeproject.org).

Our study addresses this problem for metabolic diseases, a particular hereditary

subgroup of rare disorders. Metabolic diseases, also referred to as inborn errors of

metabolism, are generally monogenic defects resulting in a deficient activity in an

enzyme or a transporter in a pathway of cellular metabolism [Scriver et al., 2001]. The

number of recognized metabolic diseases is continually increasing due to advances

in knowledge and diagnostic laboratory techniques. Most metabolic diseases are

extremely rare (< 1 per 50,000 inhabitants), although all metabolic diseases combined

have an estimated, relatively high birth prevalence of up to 1 per 800 newborns

[Sanderson et al., 2006]. In the Netherlands, we decided to build a registry for

patients with metabolic disorders and also to optimize the codes for national use

in medical and clinical genetics. With these purposes in mind, we developed,

together with a dedicated group of clinical specialists, a clinically oriented annotation

system for metabolic disorders based on two existing national coding systems. To assess

the potential value of adding our annotation system to ICD and SNOMED-CT, we

compared the three systems and identified large gaps in both ICD and SNOMED-CT.

To the best of our knowledge, we are the first to actually quantify these gaps for a

specific field of rare diseases.

MATERIALS & METHODS

Study Overview

We combined and expanded two existing coding systems for metabolic diseases, the

DDRMD (Dutch Diagnosis and Registration of Metabolic Diseases, www.ddrmd.nl)

and a subset of CINEAS (Dutch center for disease code development and distribution

to the clinical genetics community, www.cineas.org) to develop a more detailed and

strongly clinically oriented coding system. The DDRMD was set up by specialists

in metabolic disorders, whereas CINEAS was initiated by clinical geneticists. Both

systems were originally developed independently of each other and born out of the

need to have a more extensive coding system available than the ones offered by

SNOMED and ICD. The primary purpose of each of our original coding systems was

improving patient classification and retrieval. We used the DDRMD as a starting

point for extending the coding system of metabolic diseases, because this system

had already been used for more than ten years by metabolic specialists in clinical

practice. We matched and enriched these systems in a three-step process, exemplified

in a flowchart (Figure 1).


A list of criteria for including codes in the coding system was drawn up for the

matching process (Table 1).

Table 1. List of Criteria for Including Codes in our Coding System

No. | Criteria

1 | The disease has to be a separate clinical entity

2 | It must be likely that the disease is a separate clinical entity; just one case report in the literature is not enough, unless an enzyme deficiency or transport defect was demonstrated

3 | No separate entries for gene defects; a gene can, however, be connected to a disease (no specific mutation is mentioned)

4 | No specific entries for groups of diseases

5 | One enzyme defect leads to only one separate code

To facilitate cross-linking, but also to investigate the extent to which codes were

lacking in ICD and SNOMED-CT, we checked and updated existing mappings to these

two international systems.


Figure 1. Flowchart of building the metabolic disease coding system and quantifying the gaps in ICD-10 and SNOMED-CT


DDRMD (background, origin, objective)

The Dutch Diagnosis and Registration of Metabolic Diseases is a collaborative project of

all the clinical metabolic centers in the Netherlands. It was started in 2001 and over 5,000

patients have been registered so far, with almost 300 different metabolic diseases. The main

reason for initiating the DDRMD was that despite the various diagnosis registration systems

used in hospitals, it was proving difficult to retrieve patients with metabolic diseases from

these registers. Since there was no disease-specific registration for metabolic diseases, it

was impossible to analyze relevant patient data, either for research or for care purposes.

In the DDRMD, patient data are registered by one metabolic specialist per metabolic

center (see Figure 2)

via a secure web server (SSL). In addition, relevant data on newborns referred

because of an abnormal neonatal screening result indicative of metabolic disease are

also included. The data are used to facilitate research on metabolic diseases and to

provide information on the outcome of the Dutch newborn screening procedure for

metabolic diseases.

CINEAS (background, origin, objective)

CINEAS is the Dutch center for disease code development and distribution to the clinical

genetics community. It was initiated by the eight clinical genetics centers responsible for

genetic counseling and diagnostics in the Netherlands in 1992 [Zwamborn-Hanssen et al.,

1997]. It is used in daily practice by the Dutch clinical geneticists and genetic counselors

to assign diseases to patients. Presently, the 55th edition of CINEAS lists more than 5,500

diseases, most of them rare, and the metabolic diseases form a distinct subset (Figure 3).

Figure 2. Data model for DDRMD*

*Dutch Diagnosis & Registration of Metabolic Diseases; bsn = Dutch national identification number; nbs = newborn screening


Figure 3. Data model CINEAS* (only core tables)

*Dutch national disease code development and distribution center for the clinical genetics community

A number of Dutch diagnostic DNA laboratories use the CINEAS system as well, and

recently the Danish genetics centers have decided to adopt CINEAS. Each new edition of

the list contains new disease entries submitted by users, after they have been discussed

and approved by a group of experts. The entire process of submitting and adding new

entries to the database is supported by a website (www.cineas.nl or www.cineas.org),

a paid professional curator and a quickly responding national expert panel, which has

reduced throughput time to an average of two weeks. Local system administrators

upload new editions to their own patient information systems, and the website facilitates

searching of the CINEAS database and is used to publish the new editions. Codes are

never removed from the system, but can be made obsolete and thus no longer assigned

to patients. Entry, modifications and obsoletion of codes including dates are saved in

the Diagnosis History. Crosslinks are provided to OMIM, Online Mendelian Inheritance

in Man, a catalogue of hereditary disorders and their genes (www.omim.org), and to

SNOMED and ICD. Although the codes in CINEAS, including the metabolic codes, are

non-hierarchical (see Discussion), individual codes can easily be found in the system using

the onboard search engine.
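
The rule that codes are never deleted, only made obsolete, with every change logged with its date, can be illustrated with a small data structure. This is a minimal sketch under our own assumptions and not the CINEAS implementation: the names `CodeRegistry` and `HistoryEvent` and all method names are hypothetical, and any dates passed in are the caller's choice. The identifier and disease name in the usage lines are taken from Table 2.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class HistoryEvent:
    """One dated entry in a (hypothetical) diagnosis history log."""
    code: int
    action: str  # "entry", "modification" or "obsoletion"
    on: date


@dataclass
class CodeRegistry:
    """Append-only registry: codes can be retired but are never deleted."""
    names: dict = field(default_factory=dict)    # code -> disease name
    obsolete: set = field(default_factory=set)   # codes no longer assignable
    history: list = field(default_factory=list)  # dated change log

    def add(self, code, name, on):
        if code in self.names:
            raise ValueError(f"code {code} already exists")
        self.names[code] = name
        self.history.append(HistoryEvent(code, "entry", on))

    def modify(self, code, name, on):
        if code not in self.names:
            raise KeyError(code)
        self.names[code] = name
        self.history.append(HistoryEvent(code, "modification", on))

    def make_obsolete(self, code, on):
        if code not in self.names:
            raise KeyError(code)
        self.obsolete.add(code)  # the code itself stays in `names`
        self.history.append(HistoryEvent(code, "obsoletion", on))

    def assignable(self, code):
        """A code can be assigned to a patient only if it exists and is not obsolete."""
        return code in self.names and code not in self.obsolete


registry = CodeRegistry()
registry.add(1573, "adenylosuccinase deficiency", date.today())
print(registry.assignable(1573))
```

Keeping obsolete codes in the registry means that patient records assigned under an older edition can still be resolved to a disease name.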


Existing Coding Systems

The most widely used system in medical practice is the ICD, published by the World

Health Organization (version 9 published in 1977, or version 10 in 1999). It is categorized

by the affected organ system, which makes it difficult to use for diseases in which

more than one organ is affected, as is the case for many rare genetic diseases. The

WHO is working on the revision of ICD-10, with the aim of publishing ICD-11 in 2015

(www.who.int/classifications/icd/revision).

SNOMED-CT is a coding system which has been adopted by many hospital information

systems and standards organizations worldwide as a key coding system. Since 1974,

SNOMED-CT has evolved from a pathology-specific nomenclature into a healthcare

terminology system. There are many studies that have shown the value of SNOMED-CT

in theory, but studies on its use in clinical practice are relatively rare [Cornet and de

Keizer, 2008].

Assessment of gaps for metabolic diseases in ICD and SNOMED-CT

During the final steps of our matching process (Figure 1), for each code in our system we

chose the most appropriate ICD-10 code as a cross-link, using the online WHO browser

(http://apps.who.int/classifications/icd10/browse/2010/en) and searching with disease

names and synonyms. When no specific disease code was available, we chose a group

name or non-specific code based on the etiology, for example “E79.8 Other disorders

of purine and pyrimidine metabolism” or “E88.8 Other specified metabolic disorders”

(Table 2).


Table 2. Example from Our Coding System

Disease | Synonyms | Identifier | OMIM | ICD-10 | SNOMED

adenylosuccinase deficiency | adenylosuccinate lyase | 1573 | 103050 | E79.8 | 73843004

aldolase-B deficiency | hereditary fructose intolerance | 1318 | 229600 | E74.1 | 20052008

alpha-aminoadipic aciduria | 2-amino-/ 2-oxoadipic aciduria 2-aminoadipic 2-oxoadipic | 1599 | 204750 | E88.8 | not available

alpha-aminoadipic semialdehyde dehydrogenase deficiency | pyridoxine dependent epilepsy; AASA; folinic acid responsive convulsions; antiquitine gene | 2286 | 266100 | E88.8 | not available

alpha-N-acetylgalactosaminidase deficiency | NAGA; neuroaxonal dystrophia; Schindler disease | 1539 | 609241 | E88.8 | double codes: 238048001 and 230365004 with 3 subcodes

In addition, we mapped as many entries as possible to the SNOMED-CT International

Edition of January 2011, using CliniClue Explore (http://www.cliniclue.com/) and by

searching SNOMED-CT using disease names and synonyms. We recorded all the

unambiguous mappings and all the possible mappings if more than one SNOMED-CT

code was available. Finally, we calculated the gaps for metabolic diseases in ICD-10 and

SNOMED-CT as the percentage of codes in our coding system without a specific, unambiguous match.
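
A minimal sketch of how such a gap percentage could be computed is given below. It assumes that a "gap" is any code in our system that has no single, unambiguous counterpart in the target system, that is, either no code is available or more than one code applies. The input holds only the SNOMED column of the five example rows in Table 2, so the resulting percentage illustrates the arithmetic and is not a study result. The function names are hypothetical.

```python
from collections import Counter

# SNOMED column of the five example rows in Table 2.
# None = "not available"; more than one code = "double codes".
snomed_mappings = {
    "adenylosuccinase deficiency": ["73843004"],
    "aldolase-B deficiency": ["20052008"],
    "alpha-aminoadipic aciduria": None,
    "alpha-aminoadipic semialdehyde dehydrogenase deficiency": None,
    "alpha-N-acetylgalactosaminidase deficiency": ["238048001", "230365004"],
}


def classify(codes):
    """Label one mapping as unambiguous, not available, or double codes."""
    if not codes:
        return "not available"
    if len(codes) > 1:
        return "double codes"
    return "unambiguous"


def gap_summary(mappings):
    """Return per-category counts and the gap as a percentage of all codes."""
    counts = Counter(classify(codes) for codes in mappings.values())
    total = len(mappings)
    gap = total - counts["unambiguous"]
    return counts, 100.0 * gap / total


counts, gap_pct = gap_summary(snomed_mappings)
print(dict(counts))
print(f"gap in this five-row sample: {gap_pct:.0f}%")
```

The same function can be applied to the ICD-10 column once each mapping has been marked as specific or as a group code, which requires the expert judgment described above.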

RESULTS

We have developed a specific coding system for metabolic diseases, currently containing

almost 300 different disorders. Every item in our system has a unique identifier and

includes a disease name, existing synonyms, and mappings to the OMIM catalogue,

ICD-10 and SNOMED-CT (example in Table 2). For the unique identifiers we used the

existing CINEAS codes. The mappings to the other coding systems can be used for data

exchange with other databases and provide extra search possibilities.

Note that we deviated from our inclusion criteria (Table 1) for the group of mitochondrial

diseases. Apart from the separate respiratory chain disorders, we added two general

codes for diseases caused by mitochondrial DNA variations. This is because this particular

area is evolving rapidly and clear classification is not yet possible in all cases. We expect

to be able to create more specific entries for these diseases in the coming years.

For 214 (76%) of the diseases in our coding system, there was no specific

matching ICD-10 code and only an ICD-10 group name that was too general for our


clinical classification purposes was available (e.g. ICD-10 code E88.8 “other specified

metabolic disorders”).

For 155 (54%) of our codes, it was not possible to map unambiguously to SNOMED-CT,

because for 81 codes (29%), there was no SNOMED-CT code available and for 72 codes

(25%) SNOMED-CT contained double codes. These duplicates were counted mostly

because the disease and enzyme deficiency were given separate codes in SNOMED-CT,

but also because too much detail in SNOMED-CT made it difficult to distinguish between

group codes and subcodes. An example is the disease “alpha-N-acetylgalactosaminidase

deficiency”, for which SNOMED mistakenly provided two codes, one with three subcodes

(Table 2). This shows that, despite the size of the SNOMED system, the unambiguous

detail needed in clinical practice for metabolic diseases is often not available.

We aim to publish the incidence and prevalence of individual metabolic diseases in our

coding system online in the spring of 2013. Our coding system is being continuously

updated and is published on www.ddrmd.nl and www.cineas.org in pdf and xml formats.

CINEAS and DDRMD continue to exist as two separate organizations, each with a different

purpose, now sharing the coding system for metabolic disorders. Requests for additions and

alterations from DDRMD and/or CINEAS users are submitted by email to [email protected]

or [email protected], or by using an online web form on the CINEAS website for

registered organizations. These requests are subsequently discussed in the national

CINEAS online expert panel for approval. The national coordinator of DDRMD is now a

member of the CINEAS expert panel. The coding system has already been updated using

these procedures and now contains 285 diseases.

Continued funding for the classification efforts is provided by CINEAS (Dutch national

disease code development and distribution center for the clinical genetics community).

Our novel coding system has recently been donated to ICD, SNOMED-CT, and also to

the Human Phenotype Ontology (www.human-phenotype-ontology.org), a promising

emerging ontology for phenotypic abnormalities, and to Orphanet (www.orpha.net),

an important reference portal for information on rare diseases and orphan drugs, for

further public application. Continuous updates of our system will be published online.

DISCUSSION

The most widely used classification and coding systems in medical databases are ICD

and SNOMED-CT. Historically, the focus in the development of these systems has been

directed towards classifying common disorders. The development and updating process

for international broad medical coding systems is a highly demanding task and we

acknowledge the important contribution of ICD and SNOMED-CT to the annotation

of common disorders. However, annotation for rare disorders has been left behind.

Collectively, this group is large and growing steadily due to the identification of new


diseases and improved clinician awareness. Our study demonstrates large gaps in both ICD

(76%) and SNOMED-CT (54%) for metabolic disorders. Based on our clinical experience,

we suspect that there may be similar gaps for other types of rare disorders. Such gaps

are a barrier to database- and data-sharing efforts. We have shown that with the help

of dedicated clinicians and code development agencies, the problem of coding gaps for

rare disorders can be successfully addressed.

Developing codes for a rare field of medicine has special challenges. We observed

during the development of our system that existing hierarchical, ‘tree’, classification

structures, such as those used in SNOMED-CT and in ICD-10, were proving inconvenient

for our purpose. Such structures, when well developed for the particular branches, allow

for the selection of patients from groups of disorders rather than those with particular

individual disorders. However, in the rare field of metabolic disorders, these existing tree

structures turned out to be problematic and we dropped our initial hierarchical approach

for several reasons. Firstly, several diseases did not fit into any specific group or category

leading to a risk of misclassification. Secondly, several diseases fitted into more than one

group or category leading to significant risk of double entries for the same disorder.

Furthermore, given the explosion of knowledge in this field of rare genetic diseases,

extensive and continuous expertise is needed to update the accuracy of a specialist tree

structure. Given the aim of our coding system to assign diagnostic end codes to patients

and to obtain incidence and prevalence data from our registry on individual metabolic

diseases, a non-hierarchical design turned out to be functional. With growing knowledge

on underlying molecular pathways, well-fitting metabolic branches of the coding trees are

likely to be developed in the future in the international community and this will support

better data handling on the level of groups of metabolic disorders.

The World Health Organization has signaled the need to improve ICD-10 for use in the

field of rare diseases. A special Topic Advisory Group (http://www.who.int/classifications/

icd/TAGs/en/index.html) has been assigned to the subject of rare diseases to advise the

WHO on the current updating and revision process from ICD-10 to ICD-11 (anticipated

publication in 2015). We recently donated our work to both the ICD and SNOMED-CT

communities to support further code development, and to the Orphanet and HPO

organizations as well. These organizations are also contributing to solving annotation

problems. The Orphanet organization (www.orpha.net) has stressed the need to provide

well-designed codes for rare diseases, especially for the purposes of data sharing and

it puts much effort into this field [Rath et al., 2012]. The Human Phenotype Ontology

(HPO, http://www.human-phenotype-ontology.org/) is another important international

initiative to support the annotation of genetic disorders and we are presently collaborating

with HPO in order to further enrich both coding systems.

We are convinced that the approach we adopted – of code development driven by

particular clinical and epidemiological needs, and support for that development from


experts working in the clinical and medical fields of interest – can contribute to the quality

of annotation for rare diseases, and thus to healthcare for patients with these diseases.

ACKNOWLEDGMENTS

This work was supported by CINEAS (Dutch national disease code development and

distribution center for the clinical genetics community) and DDRMD (Dutch Diagnosis

& Registration of Metabolic Diseases). DDRMD was initially funded by Metakids and later

by Top Institute Pharma, Leiden, the Netherlands, as part of projects T6-208 (2008-2011)

and T6-505 (2012-2013). The public funding organizations were not involved in the

design or conduct of the study reported in this article, nor in the collection, analysis,

and interpretation of the data or preparation, review, or approval of the manuscript.

The corporate sponsors of the DDRMD only reviewed the manuscript for intellectual

property issues.

We thank Jan Smeitink, Franc Jan van Spronsen, Annet M. Bosch, Maaike de Vries,

Monique Williams, Margot F. Mulder and Jolanda Huizer for their contributions, and

Jackie Senior for editing the manuscript.


REFERENCES

1. Official Journal of the European Union. Council recommendation of 8 June 2009 on an action in the field of rare disease.

2. Cornet R, de Keizer N. 2008. Forty years of SNOMED: A literature review. BMC Med Inform Decis Mak 8, Suppl 1:S2.

3. European Medicines Agency (EMA). Orphan drugs and rare diseases at a glance. http://www.ema.europa.eu/docs/en_GB/document_library/Other/2010/01/WC500069805.pdf accessed October 20, 2011. Document ref. EMEA/290072/2007.

4. European Parliament. Regulation (EC) No. 141/2000 of the European Parliament and the Council of 16 December 1999 on Orphan Medicinal Products. Official Journal of the European Union. http://eur-lex.europa.eu.proxy-ub.rug.nl/LexUriServ/LexUriServ.do?uri=OJ: L:2000:018:0001:0005:En:PDF, January 22, 2000.

5. Food and Drug Administration (USA). Orphan Drug Act. Available at: Http://www.fda.gov/RegulatoryInformation/Legislation/FederalFoodDrugandCosmeticActFDCAct/SignificantAmendmentstotheFDCAct/OrphanDrugAct/default.htm accessed October 20, 2011.

6. Jones S, James E, Prasad S. 2011. Disease registries and outcomes research in children: focus on lysosomal storage disorders. Paediatr Drugs 13:33-47.

7. Rath A, Olry A, Dhombres F, Brandt MM, Urbero B, Ayme S. 2012. Representation of rare diseases in health information systems: The Orphanet approach to serve a wide range of end users. Hum Mutat 33:803-808.

8. Richesson R, Vehik K. 2010. Patient registries: Utility, validity and inference. Adv Exp Med Biol 686:87-104.

9. Sanderson S, Green A, Preece MA, Burton H. 2006. The incidence of inherited metabolic disorders in the West Midlands, UK. Arch Dis Child 91:896-899.

10. Scriver CR, Beaudet AL, Sly WS, Valle MD. 2001. The Metabolic and Molecular Bases of Inherited Disease. New York: McGraw-Hill, Medical Publishing Division. page xliiii.

11. Talele SS, Xu K, Pariser AR, Braun MM, Farag-El-Massah S, Phillips MI, Thompson BH, Cote TR. 2010. Therapies for inborn errors of metabolism: What has the orphan drug act delivered? Pediatrics 126:101-106.

12. Zwamborn-Hanssen AM, Bijlsma JB, Hennekam EF, Lindhout D, Beemer FA, Bakker E, Kleijer WJ, de France HF, de Die-Smulders CE, Duran M, van Gennip AH, van Mens JT, Pearson PL, Mantel G, Verhage RE, Geraedts JP. 1997. The Dutch uniform multicenter registration system for genetic disorders and malformation syndromes. Am J Med Genet 70:444-447.

CHAPTER 7

SORTA: a System for Ontology-based Recoding and Technical Annotation of biomedical phenotype data

Chao Pang, Annet Sollie, Anna Sijtsma, Dennis Hendriksen, Bart Charbon, Mark de Haan, Tommy de Boer, Fleur Kelpin, Jonathan Jetten, K. Joeri van der Velde, Nynke Smidt, Rolf Sijmons, Hans Hillege, Morris A. Swertz

Published in Database, the journal of biological databases and curation, September 2015 as: Pang C, Sollie A, Sijtsma A, Hendriksen D, Charbon B, de Haan M, de Boer T, Kelpin F, Jetten J, van der Velde JK, Smidt N, Sijmons R, Hillege H, Swertz MA. SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data. Database (Oxford). 2015 Sep 18;2015. pii: bav089. doi: 10.1093/database/bav089.


ABSTRACT

There is an urgent need to standardize the semantics of biomedical data values, such

as phenotypes, to enable comparative and integrative analyses. However, it is unlikely

that all studies will use the same data collection protocols. As a result, retrospective

standardization is often required, which involves matching of original (unstructured

or locally coded) data to widely used coding or ontology systems such as SNOMED

CT (clinical terms), ICD-10 (International Classification of Disease), and HPO (Human

Phenotype Ontology). This data curation is usually a time-consuming process

performed by a human expert.

To help mechanize this process, we have developed SORTA, a computer-aided system

for rapidly encoding free text or locally coded values to a formal coding system or

ontology. SORTA matches original data values (uploaded in semicolon-delimited format)

to a target coding system (uploaded as an Excel spreadsheet, or in OWL (Web Ontology Language) or

OBO (Open Biomedical Ontologies) format). It then semi-automatically shortlists candidate

codes for each data value using Lucene and n-gram based matching algorithms, and can

also learn from matches chosen by human experts.

We evaluated SORTA’s applicability in two use cases. For the LifeLines biobank, we

used SORTA to recode 90,000 free text values (including 5,211 unique values) about

physical exercise to MET (Metabolic Equivalent of Task) codes. For the CINEAS clinical

symptom coding system, we used SORTA to map to HPO, enriching HPO when necessary

(315 terms matched so far). Out of the shortlists at rank 1, we found a precision/recall

of 0.97/0.98 in LifeLines and of 0.58/0.45 in CINEAS. More importantly, users found the

tool both a major time saver and a quality improvement because SORTA reduced the

chances of human mistakes. Thus, SORTA can dramatically ease data (re)coding tasks

and we believe it will prove useful for many more projects.

Database URL: http://molgenis.org/sorta or as an open source download from

http://www.molgenis.org/wiki/SORTA


INTRODUCTION

Biobank and translational research can benefit from the massive amounts of phenotype

data now being collected by hospitals and via questionnaires. However, heterogeneity

between data sets remains a barrier to integrated analysis. For the BioSHaRE(1) biobank

data integration project, we previously developed BiobankConnect(2), a tool to overcome

heterogeneity in data structure by mapping data elements from the source database

onto a target scheme. Here, we address the need to overcome heterogeneity of data

contents by coding and/or recoding data values, i.e. mapping free text descriptions or

locally coded data values onto a widely used coding system. In this ‘knowledge-based

data access’, data is collected and stored according to local requirements while information

extracted from the data is revealed using standard representations, such as ontologies,

to provide a unified view(3).

The (re)coding process is essential for the performance of three different kinds of

functions:

1. Search and query. The data collected in a research and/or clinical setting can be

described in numerous ways with the same concept often associated with multiple

synonyms, making it difficult to query distributed database systems in a federated

fashion. For example, using standard terminologies, the occurrence of ‘cancer’ written

in different languages can be easily mapped between databases if they have been

annotated with the same ontology term.

2. Reasoning with data. Ontologies are the formal representation of knowledge and

all of the concepts in an ontology have been related to each other using different

relationships, e.g. ‘A is a subclass of B’. Based on these relationships, the computer can be

programmed to reason and infer the knowledge(4). For example, when querying cancer

patients’ records from hospitals, those annotated with ‘Melanoma’ will be retrieved

because ‘Melanoma’ is specifically defined as a descendant of ‘Cancer’ in the ontology.

3. Exchange or pooling of data across systems. Ontologies can also be used to

describe the information model, such as the MGED (Microarray Gene Expression Data)

ontology describing microarray experiments or hospital information coded using the

ICD-10 (International Classification of Diseases) coding system, so that the data can

easily flow across systems that use the same model(4).
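
The reasoning function in point 2 can be illustrated with a toy example. The sketch below assumes a tiny hand-written ‘is a subclass of’ hierarchy in which every term has a single parent; real ontologies allow several parents and are normally loaded from OWL or OBO files. The intermediate term ‘Skin cancer’, the term ‘Asthma’, the patient identifiers and the function names are all hypothetical.

```python
# Toy hierarchy, child -> parent. Only 'Melanoma is a descendant of Cancer'
# comes from the text; the other entries are made up for illustration.
SUBCLASS_OF = {
    "Melanoma": "Skin cancer",
    "Skin cancer": "Cancer",
    "Cancer": "Disease",
    "Asthma": "Disease",
}


def ancestors(term, subclass_of):
    """Walk up the hierarchy and return all ancestors, nearest first."""
    found = []
    while term in subclass_of:
        term = subclass_of[term]
        if term in found:  # guard against accidental cycles
            break
        found.append(term)
    return found


def query(records, concept, subclass_of):
    """Return ids of records annotated with the concept or any descendant of it."""
    return [
        record_id
        for record_id, term in records
        if term == concept or concept in ancestors(term, subclass_of)
    ]


records = [
    ("patient-1", "Melanoma"),
    ("patient-2", "Asthma"),
    ("patient-3", "Cancer"),
]
print(query(records, "Cancer", SUBCLASS_OF))  # ['patient-1', 'patient-3']
```

A plain text search for ‘Cancer’ would miss patient-1. The inferred subclass relation is what makes the record retrievable.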

The data (re)coding task is essentially a matching problem between a list of free text

data values and a coding system, or from one coding system to another. Unfortunately,

as far as we know, there are only a few software tools available that can assist in this

(re)coding process. Researchers still mostly have to evaluate and recode each data value

by hand, matching values to concepts from the terminology to find the most suitable

candidates. Not surprisingly, this is a time-consuming and error-prone task. Based on


our previous success in BioSHaRE, we were inspired to approach this problem using

ontology matching and lexical matching(2). We evaluated how these techniques can aid

and speed-up the (re)coding process in the context of phenotypic data. In particular, we

used our newly developed system, SORTA, to recode 5,210 unique entries for ‘physical

exercise’ in the LifeLines biobank(5) and 315 unique entries for ‘physical symptoms’

(including terms that are similar, but not the same) in the Dutch CINEAS (www.cineas.org)

(6) and HPO (Human Phenotype Ontology) coding systems for metabolic diseases.

Requirements

Several iterations of SORTA-user interviews resulted in the identification of the following

user requirements:

1. Comparable similarity scores, e.g. scores expressed as a percentage, so users can

easily assess how close a suggested match is to their data, and decide on a cut-off

to automatically accept matches.

2. Support import of commonly used ontology formats (OWL/OBO) for specialists and

Excel spread sheets for less technical users.

3. Fast matching algorithm to accommodate large input datasets and coding systems.

4. Online availability so users can recode/code data directly and share with colleagues

without need to download/install the tool.

5. Maximize the sensitivity to find candidate matches and let users decide on which one

of them is the ‘best’ match.

6. Enable complex matching in which not only a text string is provided but also associated

attributes such as labels, synonyms and annotations, e.g. [label: Hearing impairment,

synonyms:(Deafness, Hearing defect)].

Approaches

Two types of matching approaches have been reported in the literature: lexical matching

and semantic matching. Lexical matching is a process that measures the similarity

between two strings(7). Edit-distance(8), n-gram(9) and Levenshtein distance(10) are

examples of string-based algorithms that focus on string constituents and are often

useful for short strings, but they do not scale up for matching large numbers of entity

pairs. Token-based techniques focus on word constituents by treating each string as a

bag of words. An example of these techniques is the vector space model algorithm(11), in

which each word is represented as a dimension in space and a cosine function is used to

calculate the similarity between two string vectors. Lexical matching is usually implemented

in combination with a normalization procedure such as lowering case, removing stop

words (e.g. ‘and’, ‘or’, ‘the’) and defining word stems (e.g. ‘smoking’ → ‘smoke’).
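
The two families of lexical techniques described above can be sketched in a few lines. The example below is an illustration under stated assumptions and not the algorithm used in SORTA: the string-based score is a Dice coefficient over character bigrams (one of several possible n-gram measures), the token-based score is a cosine over word-count vectors, and normalization is limited to lower-casing and removing the three stop words named in the text. Stemming is left out because the Python standard library has no stemmer. All function names are hypothetical.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"and", "or", "the"}  # only the examples named in the text


def normalize(text):
    """Lower-case, split into alphanumeric words and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def bigrams(text):
    """Count the character bigrams of the normalized string."""
    s = " ".join(normalize(text))
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def ngram_similarity(a, b):
    """String-based score: Dice coefficient over bigrams, as a percentage."""
    ga, gb = bigrams(a), bigrams(b)
    total = sum(ga.values()) + sum(gb.values())
    if total == 0:
        return 0.0
    shared = sum((ga & gb).values())  # multiset intersection
    return 100.0 * 2 * shared / total


def cosine_similarity(a, b):
    """Token-based score: cosine between bag-of-words count vectors."""
    va, vb = Counter(normalize(a)), Counter(normalize(b))
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


print(ngram_similarity("The Hearing Impairment", "hearing impairment"))  # 100.0
print(round(cosine_similarity("hearing impairment", "hearing loss"), 2))  # 0.5
```

Expressing the string-based score as a percentage makes scores comparable across data values of different length, which is what allows a single cut-off to be applied.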

Semantic matching techniques search for correspondences based not only on the

textual information associated to a concept (e.g. description) but also on the associative

relationships between concepts (e.g. subclass, ‘is-a’)(7). In these techniques, for example,


‘melanoma’ is a good partial match for the concept called ‘cancer’. Because our goal is to

find the most likely concepts matching data values based on their similarity in description,

lexical-based approaches seem most suitable.

One of the challenges in the (re)coding task is the vast number of data values that need

to be compared, which means that the matcher has to find correspondences between the

Cartesian product of the original data values and the codes in the desired coding system.

High-throughput algorithms are needed to address this challenge and two methods have

been developed to deal with the matching problem on a large scale. The Early Pruning

Matching Technique(12) reduces search space by omitting irrelevant concepts from the

matching process, e.g. the ontology concept [label: hearing impairment, synonyms: (deafness, hearing defect, congenital hearing loss)], which does not contain any words from the search query ‘protruding eye ball’, is eliminated. The Parallel Matching Technique(12) divides the

whole matching task into small jobs and the matcher then runs them in parallel, e.g. 100 data

values are divided into 10 partitions that are matched in parallel with ontologies.
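Both ideas can be sketched in a few lines. This is an illustration under stated assumptions, not the cited implementation: pruning is done with a simple word-to-concept index, partitions are matched with a thread pool, and all function names and concept IDs are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_index(concepts: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each word to the IDs of concepts whose label or synonyms contain it."""
    index = defaultdict(set)
    for concept_id, names in concepts.items():
        for name in names:
            for word in name.lower().split():
                index[word].add(concept_id)
    return index


def candidates(query: str, index: dict[str, set[str]]) -> set[str]:
    """Early pruning: keep only concepts sharing at least one word with the query."""
    found: set[str] = set()
    for word in query.lower().split():
        found |= index.get(word, set())
    return found


def match_in_parallel(values: list[str], index: dict[str, set[str]],
                      partitions: int = 10) -> dict[str, set[str]]:
    """Split the values into partitions and match each partition in its own worker."""
    chunks = [values[i::partitions] for i in range(partitions)]

    def match_chunk(chunk: list[str]) -> dict[str, set[str]]:
        return {value: candidates(value, index) for value in chunk}

    results: dict[str, set[str]] = {}
    # Threads keep the sketch simple; CPU-bound matching in CPython
    # would more likely use separate processes.
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        for partial in pool.map(match_chunk, chunks):
            results.update(partial)
    return results
```

With the example from the text, the concept for hearing impairment is never compared against ‘protruding eye ball’ because the two share no word.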

Existing tools

We found several existing tools that offered partial solutions, see Table 1.

Table 1. Comparison of existing tools with SORTA. ZOOMA and BioPortal Annotator were the closest to our needs.

| Feature | SORTA | BioPortal Annotator | ZOOMA | Shiva | Agreement Maker | LogMap | Peregrine |
|---|---|---|---|---|---|---|---|
| Comparable similarity score | Y | N | N | N | Y | Y | N |
| Import code system in ontology format | Y | Y | Y | Y | Y | Y | Y |
| Import code system in Excel format | Y | N | N | N | N | N | N |
| Uses lexical index to improve performance | Y | Y | Y | N | N | Y | Y |
| Code/recode data directly in the tool | Y | N | N | N | Y | N | N |
| Tool available as online service | Y | Y | Y | N/A | N/A | N/A | N |
| Support partial matches | Y | N | N | Y | Y | Y | N |
| Match complex data values | Y | N | N | Y | Y | Y | N |
| Learns from curated dataset | Y | N | Y | N | N | N | N |

Y represents Yes; N represents No; N/A represents unknown


Mathur and Joshi (13) described an ontology matcher, Shiva, that incorporates four

string-matching algorithms (Levenshtein distance, Q-grams, Smith Waterman and Jaccard),

any of which could be selected by users for particular matching tasks. They used general

resources like WordNet and Online Dictionary to expand the semantics of the entities being

matched. Cruz (14) described a matcher, Agreement Maker, in which lexical and semantic

matchers were applied to ontologies in a sequential order and the results were combined

to obtain the final matches. At the lexical matching stage, Cruz (14) applied several

different kinds of matchers, string-based matchers (e.g. edit distance and Jaro-Winkler) and

an internally revised token-based matcher, then combined the similarity metrics from these

multiple matchers. Moreover the philosophy behind this tool is that users can help make

better matches in a semi-automatic fashion that are not possible in automatic matching

(14). Jiménez-Ruiz and Cuenca Grau (15) described an approach where: I) they used lexical

matching to compute an initial set of matches; II) based on these initial matches, they took

advantage of semantic reasoning methods to discover more matches in the class hierarchy,

and III) they used indexing technology to increase the efficiency of computing the match

correspondences between ontologies. Peregrine (16) is an indexing engine or tagger that

recognizes concepts within human readable text, and if terms match multiple concepts

it tries to disambiguate. BioPortal(17), the leading search portal for ontologies, provides

the BioPortal Annotator that allows users to annotate a list of terms with pre-selected

ontologies. While it was useful for our use cases, it was limited because it only retrieves

perfect matches and terms with slightly different spellings cannot be easily matched

(e.g. ‘hearing impaired’ vs. ‘hearing impairment’)(18). In addition, BioPortal Annotator’s

500-word limit reduces its practical use when annotating thousands of data values.

Finally, ZOOMA(19) enables semi-automatic annotation of biological data with selected

ontologies and was closest to our needs. ZOOMA classifies matches as ‘Automatic’ or

‘Curation required’ based on whether or not there is manually curated knowledge that

supports the suggested matches. ZOOMA does not meet our requirements in that it

does not provide similarity scores for the matches, does not prioritize recall over precision

(i.e. ZOOMA matches are too strict for our needs), and does not handle partial/complex

matches. For example, in ZOOMA, the OMIM (Online Mendelian Inheritance in Man)

term ‘Angular Cheilitis’ could not be partially matched to the HPO term ‘Cheilitis’ and

‘Extra-Adrenal Pheochromocytoma’ could not be matched to the HPO term ‘Extraadrenal

pheochromocytoma’ because of the hyphen character.

METHOD

Based on our evaluation of existing tools, we decided to combine a token-based algorithm,

Lucene(20), with an n-gram-based algorithm. Lucene is a high-performance search engine

that works similarly to the Early Pruning Matching Technique. Lucene only retrieves


concepts relevant to the query, which greatly improves the speed of matching. This

enables us to only recall suitable codes for each value and sort them based on their

match. However, the Lucene matching scores are not comparable across different queries, making them unsuitable for human evaluation. Therefore, we added an n-gram-based

algorithm as a second matcher, which allows us to standardize the similarity scores as

percentages (0-100%) to help users understand the quality of the match and to enable

a uniform cut-off value.

We implemented the following three steps. First, coding systems or ontologies are

uploaded and indexed in Lucene to enable fast searches (once for each ontology). Second,

users create their own coding/recoding project by uploading a list of data values. What

users get back is a shortlist of matching concepts for each value that has been retrieved

from the selected coding system based on their lexical relevance. In addition, the concepts

retrieved are matched with the same data values using the second matcher, the n-gram-based algorithm, to normalize the similarity scores to values from 0-100%. Third, users apply a %-similarity cut-off to automatically accept matches and/or manually curate the remaining codes that are assigned to the source values. Finally, users download the

result for use in their own research. An overview of the strategy is shown in Figure 1.

We provide a detailed summary below.

Users upload coding sources such as ontologies or terminology lists to establish the

knowledge base. Ontologies are the most frequently used source for matching data

values, but some of the standard terminology systems are not yet available in ontology

formats. Therefore, we allow users to not only upload ontologies in OWL and OBO formats, but also import a ‘raw knowledge base’ stored in a simple Excel format which includes system ID, concept ID, and label (see Table 2).

Figure 1. SORTA overview.

This example shows an Excel file with MET (Metabolic Equivalent of Task), a system

developed to standardize physical activity, in which each concept ID includes a list of

different sports representing specific amounts of energy consumption.

Table 2. Example of how to upload a coding system and a coding/recoding target.

| Concept ID | Concept Label | System ID |
|---|---|---|
| 02060 | cardio training | MET |
| 02020 | bodypump | MET |
| 18310 | swimming | MET |
| 15430 | kung fu | MET |
| 15350 | hockey | MET |
| 12150 | running | MET |

The uploaded data is then indexed and stored locally to enable rapid matching.

To match data values efficiently, we used the Lucene search index with the default

snowball stemmer and a standard filter for stemming and removing stop words. A code/

ontology concept is evaluated as being a relevant match for the data value when it or its

corresponding synonyms (if available) contain at least one word from the data value. The

assumption in this strategy is that the more words from the data value a concept’s label or synonyms contain,

the more relevant Lucene will rank it, and therefore the top concepts on the list are most

likely to be the correct match. However, the snowball stemmer could not stem some of

the English words properly, e.g. the stemmed results for ‘placenta’ and ‘placental’ were

‘placenta’ and ‘placent’, respectively. To solve this problem, we enabled fuzzy matching

with 80% similarity and this allowed us to maximize the number of relevant concepts

retrieved by Lucene.
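The relevance rule with fuzzy matching can be sketched as follows. This is an assumption-laden stand-in: Lucene's fuzzy queries are based on edit distance, whereas the sketch uses the similarity ratio from Python's `difflib`, so the two measures are related but not identical. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_equal(a: str, b: str, threshold: float = 0.8) -> bool:
    """True when two words are at least `threshold` similar (difflib ratio)."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def is_relevant(value: str, concept_names: list[str], threshold: float = 0.8) -> bool:
    """True when any label or synonym of the concept contains a word that
    fuzzily matches at least one word of the data value."""
    value_words = value.lower().split()
    for name in concept_names:
        for word in name.lower().split():
            if any(fuzzy_equal(word, v, threshold) for v in value_words):
                return True
    return False
```

Under this rule the differently stemmed forms ‘placenta’ and ‘placent’ still count as the same word, which is the effect the fuzzy setting is meant to achieve.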

Lucene also provides matching scores that are calculated using a cosine similarity

between two weighted vectors (21), which takes the information content of words into

account, e.g. rarer words are weighted more than common ones. However, after our

first user evaluations we decided not to show Lucene scores to users for two reasons.

First, Lucene calculates similarity scores for any indexed document as long as it contains

at least one word from the query. Documents that have more words that match the

query, or contain words that are relatively rare, will get a higher score. Secondly, the

matching results produced by different queries are not comparable because the scales

are different (22) making it impossible to determine the ‘best’ cut-off value above which

the suggested matches can be assumed to be correct.


We therefore decided to provide an additional similarity score that ranges from

0-100% by using an n-gram calculation between the data value and the relevant

concepts retrieved by Lucene. In this n-gram-based algorithm, the similarity score

is calculated for two strings each time. The input string is lowercased and split by

whitespace to create a list of words, which are then stemmed by the default snowball

stemmer. Each stemmed word is prefixed with ‘^’ and suffixed with ‘$’, and the bigram tokens are generated from the result, e.g. ^smoke$ → [^s, sm, mo, ok, ke, e$]. All the bigram tokens are pushed to a list for the corresponding input string, with duplicated tokens allowed. The idea is that the more similar two strings are, the more bigram tokens they share. The similarity score is calculated from the number of shared bigram tokens divided by the sum of the total numbers of bigram tokens of the two input strings, scaled to 0-100%.
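The score described above can be sketched in code under two assumptions that the text does not spell out: the shared count is doubled before division (the Dice coefficient, the scaling under which identical strings score 100%), and shared tokens are counted as a multiset intersection because duplicates are allowed. In symbols, score = 2 × shared / (total of string 1 + total of string 2) × 100. The Snowball stemmer is replaced by an optional `stem` parameter that defaults to no stemming; the function names are hypothetical.

```python
from collections import Counter
from typing import Callable


def bigram_tokens(text: str, stem: Callable[[str], str] = lambda w: w) -> Counter:
    """Multiset of bigram tokens of a string, built word by word."""
    tokens: Counter = Counter()
    for word in text.lower().split():
        marked = "^" + stem(word) + "$"  # mark the start and end of each word
        for i in range(len(marked) - 1):
            tokens[marked[i:i + 2]] += 1
    return tokens


def ngram_similarity(a: str, b: str, stem: Callable[[str], str] = lambda w: w) -> float:
    """Similarity in percent: 2 * shared / (total_a + total_b) * 100 (assumed form)."""
    tokens_a, tokens_b = bigram_tokens(a, stem), bigram_tokens(b, stem)
    total = sum(tokens_a.values()) + sum(tokens_b.values())
    if total == 0:
        return 0.0
    shared = sum((tokens_a & tokens_b).values())  # multiset intersection
    return 2 * shared / total * 100
```

Since tokens never span two words, swapping the word order of a string leaves its token multiset, and therefore the score, unchanged.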


Because we were only interested in the constituents of the strings being compared,

the order of the words in strings does not change the score. We also considered only

using the n-gram calculation, but that would require calculation of all possible pairwise

comparisons between all data values and codes, which would greatly slow down

the process.

Ultimately both algorithms were combined because Lucene is very efficient in retrieving

relevant matches while our users preferred n-gram scores because they are easier to

compare. Combining Lucene with the n-gram-based algorithm is an optimal solution in

which the advantages of both methods complement each other while efficiency, accuracy

and comparability of scores are preserved.

To code the data values, the data can be uploaded as a simple comma-separated values file or copy/pasted into the text area directly in SORTA. The uploaded data is usually a list of simple string values; however, in some cases it can also be complex data values containing information other than a simple label.

For these cases, SORTA allows inclusion of descriptive information such as synonyms and external database identifiers to improve the quality of the matched results, as shown in Table 3.

At minimum, one column of values should be provided: the first column with the

header ‘Name’. Additional optional columns that start with ‘Synonym_’ can contain the

synonyms for input values. Other optional column headers can contain other identifiers,

e.g. in this example OMIM.


Table 3. Example of how to upload data values and coding/recoding source.

| Name (required) | Synonym_1 (optional) | OMIM (optional) |
|---|---|---|
| 2,4-dienoyl-CoA reductase deficiency | DER deficiency | 222745 |
| 3-methylcrotonyl-CoA carboxylase deficiency | 3MCC | 210200 |
| Acid sphingomyelinase deficiency | ASM | 607608 |
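The column convention of Table 3 could be read as in the following sketch. It assumes a comma-separated file with a header row and treats every column that is neither ‘Name’ nor ‘Synonym_…’ as an external identifier; the function name and the returned structure are hypothetical.

```python
import csv
import io


def read_data_values(csv_text: str) -> list[dict]:
    """Parse data values with a required 'Name' column, optional 'Synonym_*'
    columns and optional identifier columns (e.g. 'OMIM')."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    if not headers or headers[0] != "Name":
        raise ValueError("the first column header must be 'Name'")
    rows = []
    for row in reader:
        synonyms = [row[h] for h in headers
                    if h.startswith("Synonym_") and row[h]]
        identifiers = {h: row[h] for h in headers[1:]
                       if not h.startswith("Synonym_") and row[h]}
        rows.append({"name": row["Name"], "synonyms": synonyms,
                     "identifiers": identifiers})
    return rows
```

Values that contain a comma, such as ‘2,4-dienoyl-CoA reductase deficiency’, need to be quoted in the file for this to work.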

For each of the data values, a suggested list of matching concepts is retrieved and

sorted based on similarity. Users can then check the list from the top downwards and

decide which of the concepts should be selected as the final match. However, if the first

concept on the list is associated with a high similarity score, users can also choose not

to look at the list because they can confidently assume that a good match has been

found for that data value. By default, 90% similarity is the cut-off above which the first

concept on the retrieved list is automatically picked as the match for the data value and

stored in the system. Below 90% similarity, users are required to manually check the list

to choose the final match. The cut-off value can be changed according to the needs of

the project, e.g. a low cut-off of 70% can be used if the data value was collected using

free text because typos are inevitably introduced during data collection.
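The cut-off rule can be sketched as a small triage function. It assumes that a score exactly equal to the cut-off is accepted, which the text leaves open, and that candidates arrive as (code, score) pairs; the function name is hypothetical.

```python
from typing import Optional


def triage(candidates: list[tuple[str, float]],
           cutoff: float = 90.0) -> tuple[Optional[str], str]:
    """Return (code, 'auto') when the best candidate reaches the cut-off,
    otherwise (None, 'manual') so that a curator checks the list."""
    if not candidates:
        return None, "manual"
    best_code, best_score = max(candidates, key=lambda c: c[1])
    if best_score >= cutoff:  # assumption: the boundary value is accepted
        return best_code, "auto"
    return None, "manual"
```

Lowering `cutoff` to 70.0, as suggested for free-text data, moves more values into the automatic route at the price of a higher risk of wrong matches.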

RESULTS

We evaluated SORTA in various projects. Here we report two representative matching

scenarios where the original data values were either free text (case 1) or already coded, but

using a local coding system (case 2). In addition, as a benchmark, we generated matches

between HPO, NCIT (National Cancer Institute Thesaurus), OMIM (Online Mendelian

Inheritance in Man) and DO (Disease Ontology) and compared the matches with existing

cross references between these two (case 3).

Case 1: Coding unstructured data in the LifeLines biobank

Background

LifeLines is a large biobank and cohort study started by the University Medical Centre

Groningen, the Netherlands. Since 2006, it has recruited 167,729 participants from the

northern region of the Netherlands(5). LifeLines is involved in the EU BioSHaRE consortium

and one of the joint data analyses being conducted by BioSHaRE is the ‘Healthy Obese

Project’ (HOP) that examines why some obviously obese individuals are still metabolically

healthy(23). One of the variables needed for the HOP analysis is physical activity but,

unfortunately, this information was collected using a Dutch questionnaire containing free

text fields for types of sports. Researchers thus needed to match these to an existing


coding system: the Ainsworth compendium of physical activities(24). In this compendium

each code matches a metabolic equivalent task (MET) intensity level corresponding to

the energy cost of that physical activity and defined as the ratio of the metabolic rate for

performing that activity to the resting metabolic rate. One MET is equal to the metabolic

rate when a person is quietly sitting and can be equivalently expressed as:


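$$1\ \text{MET} \equiv 1\ \frac{\text{kcal}}{\text{kg}\cdot\text{h}} \equiv 4.184\ \frac{\text{kJ}}{\text{kg}\cdot\text{h}}$$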


A list of 800 codes has been created to represent all kinds of daily activities with their

corresponding energy consumption(24). Code 1015, for example, represents ‘general

bicycling’ with a MET value of 7.5. The process of matching the physical activities of

LifeLines data with codes is referred to as coding.
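As a worked illustration of how a coded activity translates into energy expenditure, take 1 MET as 1 kcal per kilogram of body weight per hour and assume a hypothetical participant of 70 kg who cycles for one hour (both figures are illustrative and not taken from LifeLines). With the MET value of 7.5 for code 1015 this gives:

```latex
E = 7.5\ \frac{\text{kcal}}{\text{kg}\cdot\text{h}} \times 70\ \text{kg} \times 1\ \text{h}
  = 525\ \text{kcal}
  = 525 \times 4.184\ \text{kJ}
  \approx 2197\ \text{kJ}
```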

Challenges and motivation

There were two challenges in this task. First, the physical activities were collected in Dutch

and therefore only researchers with a good level of Dutch could perform the coding

task. Second, there were data for more than 90,000 participants and each participant

could report up to four data values related to ‘Sport’ that could be used to calculate the

MET value. In total, there were 80,708 terms (including 5,211 unique terms) that needed

to be coded. We consulted with the researchers and learned that they typically coded

data by hand in an Excel sheet or by syntax in SPSS, and for each entry they needed to

cross-check the coding table and look up the proper code. While this approach is feasible

on a small scale (<10,000 participants), it became clear it would be too much work to

manually code such a massive amount of data. Hence, we used our SORTA coding system.

To train SORTA, we reused a list of human-curated matches between physical

activities described in Dutch and the codes that were created for a previous project.

We used this as the basis to semi-automatically match the new data from LifeLines. An

example of the curated matches is shown in Table 2 and the complete list can be found in the Supplementary material: Lifelines_MET_mappings.xlsx. Moreover, we have enhanced SORTA with an upload function to support multiple ‘Sport’-related columns in one harmonization project. This can be done as long as the column headers comply with the standard naming scheme, where the first column header is ‘Identifier’ and the other column headers start with the string ‘Sport_’, e.g. ‘Sport_1’ and ‘Sport_2’. Figure 2 shows an example of manually coding the physical activity ‘ZWEMMEN’ (Swimming) with MET codes, in which a shortlist of candidates was retrieved by SORTA and the first item of the list was selected as the true match.

Each time the manual curation process produced a new match, this new knowledge

could be added to the knowledge base to be applied to all future data values. This is an

optional action because data values (especially those filled in by participants of the study)

sometimes contain spelling errors that should not be added to the knowledge base.


Evaluation

With the assistance of SORTA, all of the data values have been coded by the researcher

who is responsible for releasing data about physical activity in the LifeLines project.

The coding result containing a list of matches was used as the gold standard for the

following analysis, in which we evaluated two main questions: I) How far could the

previous coding round improve the new matching results? II) What is the best cut-off

value above which the codes selected by SORTA can be confidently assumed to be

correct matches to a value?

SORTA’s goal is to shortlist good codes for the data values so we first evaluated the

rank of the correct manual matches because the higher they rank, the less manual work

the users need to perform. Our user evaluations suggested that as long as the correct

matches were captured in the top 10 codes, the researchers considered the tool useful.

Otherwise, based on their experience, users changed the query in the tool to update

the matching results.

Re-use of manually curated data from the previous coding round improved SORTA’s performance, with recall/precision at rank 1 increasing from 0.59/0.65 to 0.97/0.98 and at rank 10 changing from 0.79/0.14 to 0.98/0.11 (see Figure 3 and Table 4).


Figure 2. Example of coding a physical activity.


In total, 90,000 free text values (of which 5,211 were unique) were recoded to physical exercise using the MET coding system. Table 4 shows recall and precision per position in the SORTA result before coding (using only the MET score descriptions) and after coding (when a human curator had already processed a large set of SORTA recommendations by hand).

At the end of the coding task, about 97% of correct matches were captured at rank 1, so users only needed to look at the first candidate match.

We included an n-gram-based algorithm to provide users with an easily understood metric with which to judge the relevance of the proposed codes on a scale of 0-100%, based on the n-gram match between value and code (or a synonym thereof). Supplementary Table 1 suggests that, in the LifeLines case, 82% similarity is a good cut-off for automatically accepting the recommended code because 100% of the matches produced by the system above this cut-off were judged by the human curator to be correct. Because LifeLines data is constantly being updated (with new participants, and with new questionnaire data from existing participants every 18 months), it would be helpful to recalibrate the cut-off value each time the tool is applied anew.
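One way such a recalibration could be done from curated data is sketched below. It assumes that each curated record holds the similarity score of the top suggestion and the curator's verdict, and it looks for the lowest score at or above which every suggestion was judged correct. The function name and the record layout are hypothetical.

```python
from typing import Optional


def lowest_safe_cutoff(curated: list[tuple[float, bool]]) -> Optional[float]:
    """Lowest score s such that all suggestions scoring >= s were correct.

    Returns None when even the highest-scoring suggestion was wrong
    or when there is no curated data.
    """
    cutoff = None
    # Highest scores first; at equal scores, incorrect records come first
    # so that a tie containing an error is never accepted.
    for score, correct in sorted(curated, key=lambda r: (-r[0], r[1])):
        if not correct:
            break
        cutoff = score
    return cutoff
```

A cut-off found this way only reflects the curated sample, so it would need to be checked again whenever new data arrive.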

Figure 3. Receiver operating characteristic (ROC) curves evaluating performance on LifeLines data.



table 4. Precision and recall for the LifeLines case study.

Rank cut-off

before coding after coding

Recall Precision f-measure Recall Precision f-measure

1 0.59 0.65 0.62 0.97 0.98 0.97

2 0.66 0.39 0.49 0.97 0.50 0.66

3 0.71 0.29 0.41 0.97 0.34 0.50

4 0.74 0.24 0.36 0.97 0.26 0.41

5 0.76 0.21 0.33 0.97 0.21 0.35

6 0.77 0.19 0.30 0.97 0.18 0.30

7 0.78 0.17 0.28 0.97 0.15 0.26

8 0.78 0.16 0.27 0.98 0.14 0.25

9 0.78 0.14 0.24 0.98 0.12 0.21

10 0.79 0.14 0.24 0.98 0.11 0.20

11 0.79 0.13 0.22 0.98 0.10 0.18

12 0.79 0.12 0.21 0.98 0.09 0.16

13 0.79 0.12 0.21 0.98 0.09 0.16

14 0.79 0.12 0.21 0.98 0.08 0.15

15 0.79 0.11 0.19 0.98 0.08 0.15

16 0.79 0.11 0.19 0.98 0.07 0.13

17 0.79 0.11 0.19 0.98 0.07 0.13

18 0.80 0.11 0.19 0.98 0.06 0.11

19 0.80 0.10 0.18 0.98 0.06 0.11

20 0.80 0.10 0.18 0.98 0.06 0.11

30 0.80 0.10 0.18 0.98 0.04 0.08

50 0.80 0.09 0.16 0.98 0.03 0.06
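As an illustration of how figures such as those in Table 4 can be derived, the following Python sketch computes recall, precision and F-measure at a rank cut-off. It assumes that recall is the fraction of input values whose curated code appears within the top k candidates, and that precision is the fraction of all candidates shown within the top k that are correct. These definitions, the function names and the toy data are assumptions made for the example and are not taken from SORTA’s source code.

```python
from typing import Dict, List, Optional, Tuple


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_at_rank(
    candidates: Dict[str, List[str]],
    gold: Dict[str, Optional[str]],
    k: int,
) -> Tuple[float, float, float]:
    """Return (recall, precision, F-measure) at rank cut-off k."""
    hits = 0   # inputs whose curated code is within the top k
    shown = 0  # total number of candidates shown within the top k
    for value, ranked in candidates.items():
        top_k = ranked[:k]
        shown += len(top_k)
        expected = gold.get(value)
        if expected is not None and expected in top_k:
            hits += 1
    matchable = sum(1 for code in gold.values() if code is not None)
    recall = hits / matchable if matchable else 0.0
    precision = hits / shown if shown else 0.0
    return recall, precision, f_measure(precision, recall)


# Toy data: free-text values with ranked candidate codes (hypothetical labels).
candidates = {
    "jogging": ["running", "walking"],
    "swim": ["cycling", "swimming"],
    "yoga": [],
}
gold = {"jogging": "running", "swim": "swimming", "yoga": "stretching"}

for k in (1, 2):
    recall, precision, f1 = metrics_at_rank(candidates, gold, k)
    print(f"rank {k}: recall={recall:.2f} precision={precision:.2f} F={f1:.2f}")
```

The F-measure is the harmonic mean of precision and recall, so the rank 1 values before coding (recall 0.59, precision 0.65) give 2 × 0.59 × 0.65 / (0.59 + 0.65) ≈ 0.62, in agreement with Table 4.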

Case 2: Recoding from CINEAS coding system to HPO ontology

Background

CINEAS is the Dutch centre for disease code development and its distribution to the

clinical genetics community (www.cineas.org)(6). This centre was initiated by the

eight clinical genetics centres responsible for genetic counselling and diagnostics

in the Netherlands in 1992(25). CINEAS codes are used in daily practice by Dutch

clinical geneticists and genetic counsellors to assign diseases and clinical symptoms

to patients. The 63rd edition of CINEAS now lists more than 5,600 diseases and

more than 2,800 clinical symptoms. The challenge was to match and integrate


(or recode) the CINEAS clinical symptom list with HPO in order to use one enriched

standardized coding system for future coding of patients’ symptoms and to obtain

interoperability for CINEAS codes already registered in local systems all over the

country. The metabolic diseases from the CINEAS disease list, which now form an independent project called The Dutch Diagnosis Registration Metabolic Diseases (DDRMD, https://ddrmd.nl/)(25), will be matched with the Orphanet ontology in the future.

Challenge and motivation

The previous strategy of CINEAS curators was to search HPO via BioPortal. However, tracking possible candidate terms meant making written notes or keeping a digital registry on the side; both tracking methods are time-consuming, prone to human error and demand a lot of switching between tools or screens. Therefore, SORTA was brought into the project. Figure 4 shows an example of a data value ‘external auditory canal defect’ and a list of HPO ontology terms as candidate matches.

While none of them is a perfect match for the input term, the top three candidates are the closest matches but are too specific for the input. Scrutiny by experts revealed that ‘Abnormality of auditory canal’ could be a good ‘partial’ match because of its generality.

Figure 4. Example of matching the input value ‘external auditory canal defect’ with HPO ontology terms.


Evaluation

In an evaluation study, the first 315 clinical symptoms out of 2,800 were re-coded by a human expert; 246 of these were matched with HPO terms, while 69 could not be matched. In addition, we performed the same matching task using BioPortal Annotator and ZOOMA because these existing tools seemed most promising (see Table 5).

Table 5. Comparison of SORTA, BioPortal and ZOOMA.

Evaluation based on the CINEAS case study in which 315 clinical symptoms were matched to the Human Phenotype Ontology. The table shows the recall/precision per position in SORTA, BioPortal Annotator and ZOOMA. N.B. both BioPortal Annotator and ZOOMA have a limitation that they can only find exact matches and return a maximum of three candidates.

| Rank cut-off | SORTA Recall | SORTA Precision | SORTA F-measure | BioPortal Recall | BioPortal Precision | BioPortal F-measure | ZOOMA Recall | ZOOMA Precision | ZOOMA F-measure |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.58 | 0.45 | 0.51 | 0.34 | 0.54 | 0.42 | 0.17 | 0.63 | 0.27 |
| 2 | 0.69 | 0.27 | 0.39 | 0.35 | 0.44 | 0.39 | 0.17 | 0.60 | 0.26 |
| 3 | 0.73 | 0.19 | 0.30 | 0.35 | 0.44 | 0.39 | 0.18 | 0.60 | 0.28 |
| 4 | 0.76 | 0.15 | 0.25 | N/A | N/A | N/A | N/A | N/A | N/A |
| 5 | 0.78 | 0.13 | 0.22 | N/A | N/A | N/A | N/A | N/A | N/A |
| 6 | 0.81 | 0.11 | 0.19 | N/A | N/A | N/A | N/A | N/A | N/A |
| 7 | 0.81 | 0.09 | 0.16 | N/A | N/A | N/A | N/A | N/A | N/A |
| 8 | 0.83 | 0.08 | 0.15 | N/A | N/A | N/A | N/A | N/A | N/A |
| 9 | 0.83 | 0.08 | 0.15 | N/A | N/A | N/A | N/A | N/A | N/A |
| 10 | 0.85 | 0.07 | 0.13 | N/A | N/A | N/A | N/A | N/A | N/A |
| 11 | 0.85 | 0.06 | 0.11 | N/A | N/A | N/A | N/A | N/A | N/A |
| 12 | 0.85 | 0.06 | 0.11 | N/A | N/A | N/A | N/A | N/A | N/A |
| 13 | 0.86 | 0.06 | 0.11 | N/A | N/A | N/A | N/A | N/A | N/A |
| 14 | 0.86 | 0.05 | 0.09 | N/A | N/A | N/A | N/A | N/A | N/A |
| 15 | 0.87 | 0.05 | 0.09 | N/A | N/A | N/A | N/A | N/A | N/A |
| 16 | 0.87 | 0.05 | 0.09 | N/A | N/A | N/A | N/A | N/A | N/A |
| 17 | 0.87 | 0.05 | 0.09 | N/A | N/A | N/A | N/A | N/A | N/A |
| 18 | 0.88 | 0.04 | 0.08 | N/A | N/A | N/A | N/A | N/A | N/A |
| 19 | 0.88 | 0.04 | 0.08 | N/A | N/A | N/A | N/A | N/A | N/A |
| 20 | 0.88 | 0.04 | 0.08 | N/A | N/A | N/A | N/A | N/A | N/A |
| 30 | 0.89 | 0.03 | 0.06 | N/A | N/A | N/A | N/A | N/A | N/A |
| 50 | 0.92 | 0.02 | 0.04 | N/A | N/A | N/A | N/A | N/A | N/A |

N/A: not applicable


We further investigated which cut-off value can be confidently used to assume that

the automatic matches are correct by calculating precision and recall for all possible

n-gram cut-offs (0-100%). Supplementary table 2 shows 89% to be a good cut-off

value for future CINEAS matching tasks because above this value all of the suggested

matches are correct with 100% precision.
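The cut-off search described above can be sketched in Python as follows. The example assumes a curated evaluation set given as (similarity score, is correct) pairs; the function name and the toy scores are hypothetical, and the value 89 in the example is chosen only to mirror the cut-off reported for CINEAS.

```python
from typing import List, Optional, Tuple


def lowest_perfect_cutoff(scored: List[Tuple[float, bool]]) -> Optional[float]:
    """Return the lowest score c such that every match scoring >= c is correct.

    Returns None when even the highest-scoring matches contain an error
    (or when the input list is empty).
    """
    best = None
    # Walk through the distinct scores from high to low and stop at the
    # first cut-off that would let an incorrect match through.
    for cutoff in sorted({score for score, _ in scored}, reverse=True):
        if all(correct for score, correct in scored if score >= cutoff):
            best = cutoff
        else:
            break
    return best


# Toy evaluation set: (similarity score in %, curator agreed with the match?)
evaluation = [(100.0, True), (95.0, True), (89.0, True), (85.0, False), (80.0, True)]
print(lowest_perfect_cutoff(evaluation))
```

Matches scoring at or above the returned value could then be accepted automatically, while lower-scoring values remain in the manual curation queue.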

Case 3: Benchmark against existing matches between ontologies

We downloaded 700 existing matches between HPO and DO concepts, 1148 matches

between HPO and NCIT concepts, and 3631 matches between HPO and OMIM concepts from

BioPortal. We used the matching terms from DO, NCIT and OMIM as the input values and HPO

as the target coding system and generated matches using SORTA, BioPortal Annotator and

ZOOMA. Supplementary table 3 shows that all three tools managed to reproduce most of

the existing ontology matches with SORTA slightly outperforming the other two by retrieving

all of the ontology matches. Scrutiny revealed that SORTA was able to find the complex

matches, where data values and ontology terms consist of multiple words, and some of which

are concatenated, e.g. matching ‘propionic acidemia’ from DO with ‘Propionicacidemia’ from

HPO. We also noticed that beyond the 1st rank, precision in SORTA is lower than the other

two (with the highest precision in ZOOMA). In addition, we investigated what proportion of

data values could be automatically matched at different cut-offs. Supplementary table 4

shows that at similarity score cut-off of 90%, SORTA recalled at least 99.6% of the existing

matches with 100% precision across all three matching experiments.

DISCUSSION

In the Results section, we evaluated SORTA in three different use cases. The evaluations showed that SORTA could indeed help human experts perform (re)coding tasks more efficiently, and user evaluations of SORTA were very positive. However, there was much debate among co-authors on the combination of Lucene-based matching with n-gram post-processing. As mentioned in the Method section, Lucene scores were not really informative for users, but the order in which the matching results were sorted by Lucene seemed better thanks to the cosine similarity function that takes information content into account. After applying the n-gram-based algorithm, this order was sometimes changed. To evaluate this issue we performed the same matching tasks using Lucene and Lucene + n-gram. In the case of coding LifeLines data, the performances were quite similar and the inclusion of n-gram did not change the order of the matching results (see Supplementary material: PrecisionRecallLifeLines.xlsx). However, in the case of matching HPO terms, there was a large difference in precision and recall, as shown in Figure 5 and Supplementary material PrecisionRecallCINEAS.xlsx.


Figure 5. Performance comparison for matching HPO terms among three algorithms.

Lucene alone outperformed the combination of the two algorithms. We hypothesize that this may be caused by Lucene’s use of word inverse document frequency (IDF) metrics, which are calculated for each term (t) using the following formula:

$$\mathrm{idf}(t) = 1 + \log\left(\frac{\mathit{numDocs}}{\mathit{docFreq} + 1}\right)$$

where docFreq is the number of documents that contain the term and numDocs is the total number of documents in the index.
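For reference, the IDF in Lucene’s classic TF-IDF similarity (the default in the Lucene 4.x releases cited here) is 1 + ln(numDocs / (docFreq + 1)). A minimal Python sketch, with a hypothetical function name and toy document counts, shows how rarer words receive a higher IDF:

```python
import math


def lucene_classic_idf(doc_freq: int, num_docs: int) -> float:
    """IDF as defined by Lucene's classic TF-IDF similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Toy index of 1,000 ontology terms (hypothetical counts).
num_docs = 1000
common = lucene_classic_idf(doc_freq=499, num_docs=num_docs)  # word in about half the terms
rare = lucene_classic_idf(doc_freq=1, num_docs=num_docs)      # word in a single term
print(f"common word: {common:.2f}, rare word: {rare:.2f}")
```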

We checked the IDFs for all the words from input values for the HPO use case and

Supplementary Figure 1 shows the large difference in the information carried by each word.

This suggested that, to improve the usability of the tool, we should allow users to choose

which algorithm they wish to use to sort the matching results, an option that we will add in

the near future. We also explored if we could simply add information content to the n-gram

scoring mechanism to make the ranks consistent by redistributing the contribution of each of

the query words in the n-gram score based on the IDF. For example, using n-gram the

contribution of the word ‘joint’ in the query string ‘hyperextensibility hand joint’ is about

18.5% because ‘joint’ is 5/27 letters. However, if this word is semantically more important,

results matching this word should have a higher score. We therefore adapted the n-gram

algorithm to calculate the IDF for each of the words separately, calculate the average, and

reallocate the scores to the more important words as follows:




[The three equations defining the reallocated word weights are not recoverable from the extracted text.]

A common word is defined as a word with an IDF lower than IDF_average; an important word is defined as a word with an IDF higher than IDF_average.
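The idea can be illustrated with a short Python sketch. It first reproduces the letter-share calculation for ‘joint’ and then applies one possible reallocation scheme: words with a below-average IDF give up part of their share in proportion to IDF / IDF_average, and the freed share is divided among the words with an above-average IDF in proportion to their IDF. This scheme, the function names and the IDF values are assumptions for illustration only and are not necessarily the formulas used in SORTA.

```python
from typing import Dict


def letter_shares(query: str) -> Dict[str, float]:
    """Share of each word in the query, measured in letters (spaces ignored).

    Assumes the query words are distinct.
    """
    words = query.split()
    total = sum(len(word) for word in words)
    return {word: len(word) / total for word in words}


def reallocate_by_idf(shares: Dict[str, float], idf: Dict[str, float]) -> Dict[str, float]:
    """Shift share from below-average-IDF words to above-average-IDF words."""
    average = sum(idf[word] for word in shares) / len(shares)
    common = [word for word in shares if idf[word] < average]
    important = [word for word in shares if idf[word] > average]

    adjusted = dict(shares)
    freed = 0.0
    for word in common:
        reduced = shares[word] * idf[word] / average
        freed += shares[word] - reduced
        adjusted[word] = reduced

    important_total = sum(idf[word] for word in important)
    for word in important:
        adjusted[word] = shares[word] + freed * idf[word] / important_total
    return adjusted


shares = letter_shares("hyperextensibility hand joint")
print(f"naive share of 'joint': {shares['joint']:.3f}")  # 5 of 27 letters

# Hypothetical IDF values, chosen only to make 'joint' the important word.
idf = {"hyperextensibility": 4.0, "hand": 2.0, "joint": 6.0}
adjusted = reallocate_by_idf(shares, idf)
print(f"adjusted share of 'joint': {adjusted['joint']:.3f}")
```

The shares still sum to one after reallocation, so the adapted score stays on the same 0-100% scale as the naive n-gram score.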

This resulted in an improvement of recall compared to naive n-gram scoring at rank 10 from 0.79 to 0.84 (for details see Supplementary material: comparision_ngram_lucene.xlsx), and the summarized comparison is provided via the receiver operating characteristic (ROC) curve in Figure 5. However, Lucene still outperforms this metric, and we speculate that this can be explained by the fundamental difference between the underlying scoring functions. The n-gram score is more sensitive to the length of input strings than Lucene, and it is quite possible that two strings do not share any words but do share similar bigram tokens, especially when dealing with long strings. Consequently, the n-gram-based algorithm might find more false positives than Lucene. However, in practice, the number of data values to be coded/recoded is quite large, and the benefit of using an n-gram score cut-off value above which all the suggested matches are automatically selected outweighs this drawback.
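A small Python sketch illustrates how two strings can share no whole words and still share most of their bigrams. It uses a Dice coefficient over character bigrams, which is an assumption made for the example; SORTA’s actual n-gram score may be computed differently. The input strings are the DO and HPO labels mentioned in Case 3.

```python
from collections import Counter
from typing import Set


def bigrams(text: str) -> Counter:
    """Multiset of character bigrams of the lower-cased string."""
    lowered = text.lower()
    return Counter(lowered[i:i + 2] for i in range(len(lowered) - 1))


def bigram_dice(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (0.0 to 1.0)."""
    grams_a, grams_b = bigrams(a), bigrams(b)
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        return 0.0
    shared = sum((grams_a & grams_b).values())
    return 2 * shared / total


def shared_words(a: str, b: str) -> Set[str]:
    """Whole words that occur in both strings (case-insensitive)."""
    return set(a.lower().split()) & set(b.lower().split())


do_label = "propionic acidemia"
hpo_label = "Propionicacidemia"
print(shared_words(do_label, hpo_label))           # no whole word in common
print(f"{bigram_dice(do_label, hpo_label):.2f}")   # yet a high bigram overlap
```

Here the overlap is desirable, but the same mechanism can produce high scores for long, unrelated strings that happen to share many bigrams, which is the source of the false positives discussed above.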

Another issue was whether we could make better use of all the knowledge captured in ontologies. We noticed in some matching examples that related terms that come from the same ontological cluster tend to show up together in the matching results. For example, Figure 4 shows that the input term ‘external auditory canal defect’ is not matched to any of the top three candidates because they are too specific. Hence we have to take the more general ontology term ‘Auditory canal abnormality’, which is actually ranked 11th, as the match, even though this term is in fact the parent of the three top candidates. This indicates that if the input value is not matched by any of the candidates with a high similarity score and the candidates contain clusters of ontology terms, the parent ontology term should probably be selected as the best match (which is similar to the way human curators make decisions on such matches). However, translating this knowledge into an automatic adaptation of the matching score is non-trivial and something we plan to work on in the future.
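A possible starting point, sketched below in Python, is to fall back to a shared parent term when no candidate reaches a high similarity score. The threshold, the function name, the candidate terms, their scores and the parent lookup are all hypothetical placeholders (they are not the candidates shown in Figure 4), and this is not an implemented SORTA feature.

```python
from typing import Dict, List, Optional, Tuple


def suggest_parent(
    candidates: List[Tuple[str, float]],
    parents: Dict[str, str],
    high_score: float = 90.0,
    top_n: int = 3,
) -> Optional[str]:
    """Suggest a parent term when the top candidates form one ontology cluster.

    candidates: (term, similarity score in %) pairs, sorted by descending score.
    parents:    lookup from a term to its direct parent term.
    """
    if not candidates or candidates[0][1] >= high_score:
        return None  # nothing to do, or a confident direct match exists
    top_parents = {parents.get(term) for term, _ in candidates[:top_n]}
    if len(top_parents) == 1:
        return top_parents.pop()  # may be None if no parent is known
    return None


# Hypothetical candidates that all hang under the same parent term.
candidates = [
    ("Stenosis of the external auditory canal", 72.0),
    ("Atresia of the external auditory canal", 70.5),
    ("Exostosis of the external auditory canal", 68.0),
]
parents = {term: "Abnormality of auditory canal" for term, _ in candidates}
print(suggest_parent(candidates, parents))
```

A production version would need to consider ancestors beyond the direct parent and to weigh the suggestion against the lexical score, which is part of what makes the problem non-trivial.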


CONCLUSIONS

We developed SORTA as a software system to ease data cleaning and coding/recoding

by automatically shortlisting standard codes for each value using lexical and ontological

matching. User and performance evaluations demonstrated that SORTA provided

significant speed and quality improvements compared to the earlier protocols used by

biomedical researchers to harmonize their data for pooling. With increasing use, we

plan to dynamically update the precision and recall metrics based on all users’ previous

selections so that users can start the matching tasks with confident cut-off values. In

addition, we plan to include additional resources such as WordNet for query expansion to

increase the chance of finding correct matches from ontologies or coding systems. Finally,

we also want to publish mappings as linked data, for example as nanopublications (26)

(http://nanopub.org), so they can be easily reused. SORTA is available as a service running

at http://molgenis.org/sorta. Documentation and source code can be downloaded from

http://www.molgenis.org/wiki/SORTA under open source LGPLv3 license.

ACKNOWLEDGEMENTS

This work was supported by the European Union Seventh Framework Programme

(FP7/2007-2013) grant number 261433 (Biobank Standardisation and Harmonisation for

Research Excellence in the European Union - BioSHaRE-EU) and grant number 284209

(BioMedBridges). It was also supported by BBMRI-NL, a research infrastructure financed by

the Netherlands Organization for Scientific Research (NWO), grant number 184.021.007.

We thank Anthony Brookes of Leicester University who contributed the abbreviation

‘SORTA’, and Kate Mc Intyre and Jackie Senior for editing the manuscript.


REFERENCES

1. BioSHaRE (2011) BioSHaRE project. https://www.bioshare.eu/

2. Pang,C., Hendriksen,D., Dijkstra,M. and Velde,K.J. Van Der (2015) BiobankConnect: software to rapidly connect data elements for pooled analysis across biobanks using ontological and lexical indexing. 10.1136/amiajnl-2013-002577.

3. Poggi,A., Lembo,D., Calvanese,D., De Giacomo,G., Lenzerini,M. and Rosati,R. (2008) Linking data to ontologies. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 4900 LNCS, 133–173.

4. Rubin,D.L., Shah,N.H. and Noy,N.F. (2008) Biomedical ontologies: A functional perspective. Briefings in Bioinformatics, 9, 75–90.

5. Scholtens,S., Smidt,N., Swertz,M.A., Bakker,S.J.L., Dotinga,A., Vonk,J.M., van Dijk,F., van Zon,S.K.R., Wijmenga,C., Wolffenbuttel,B.H.R., et al. (2014) Cohort Profile: LifeLines, a three-generation cohort study and biobank. International Journal of Epidemiology, 10.1093/ije/dyu229.

6. Zwamborn-Hanssen,A.M.N., Bijlsma,J.B., Hennekam,E.F.A.M., Lindhout,D., Beemer,F.A., Bakker,E., Kleijer,W.J., De France,H.F., De Die-Smulders,C.E.M., Duran,M., et al. (1997) The Dutch uniform multicenter registration system for genetic disorders and malformation syndromes. American Journal of Medical Genetics, 70, 444–447.

7. Euzenat,J. and Shvaiko,P. (2013) Ontology Matching Second. Available at: http://www.springer.com/computer/database+management+&+information+retrieval/book/978-3-642-38720-3.

8. Navarro,G. (2001) A guided tour to approximate string matching. ACM Computing Surveys, 33, 31–88.

9. Brown,P.F., DeSouza,P. V., Mercer,R.L., Pietra,V.J. Della and Lai,J.C. (1992) Class-Based n-gram Models of Natural Language. Computational Linguistics, 18, 467–479.

10. Levenshtein,V.I. (1966) Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10, 707–710.

11. Salton,G., Wong,A. and Yang,C.S. (1975) A vector space model for automatic indexing. Communications of the ACM, 18, 613–620.

12. Saruladha,K. (2011) A Comparative Analysis of Ontology and Schema Matching Systems. International Journal of Computer Applications, 34, 14–21.

13. Mathur,I. and Joshi,N. (2014) Shiva ++ : An Enhanced Graph based Ontology Matcher. International Journal of Computer Applications, 92, 30–34.

14. Cruz,I.F. (2009) AgreementMaker : Efficient Matching for Large Real-World Schemas and Ontologies. PVLDB, 2, 1586–1589.

15. Jiménez-Ruiz,E. and Cuenca Grau,B. (2011) LogMap: Logic-based and scalable ontology matching. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7031 LNCS, 273–288.

16. Schuemie,M.J., Jelier,R. and Kors,J. (2007) Peregrine: Lightweight gene name normalization by dictionary lookup. In Proc of the Second BioCreative Challenge Evaluation Workshop.pp. 131–133.

17. Whetzel,P.L., Shah,N.H., Noy,N.F., Dai,B., Dorf,M., Griffith,N., Jonquet,C., Youn,C., Coulet,A., Callendar,C., et al. (2009) BioPortal : Ontologies and Integrated Data Resources at the Click of a Mouse. Nucleic acids research, 37, 170–3.

18. Funk,C., Baumgartner,W., Garcia,B., Roeder,C., Bada,M., Cohen,K.B., Hunter,L.E. and Verspoor,K. (2014) Large-scale biomedical concept recognition: an evaluation of current automatic annotators and their parameters. BMC bioinformatics, 15, 59.

19. Burdett,T., Jupp,S., Malone,J., Williams,E., Keays,M., Parkinson,H., Trust,W. and Campus,G. (2012) Zooma2 - A repository of annotation knowledge and curation API. In Intelligent Systems for Molecular Biology.


20. The Apache Software Foundation (2006) Apache Lucene. Agenda, 2009.

21. Apache Software Foundation (2001) Lucene Similarity Score. https://lucene.apache.org/core/4_6_0/core/overview-summary.html

22. ElasticSearch (2015) ElasticSearch: Lucene’s Practical Scoring Function. https://www.elastic.co/guide/en/elasticsearch/guide/master/practical-scoring-function.html#query-norm

23. Van Vliet-Ostaptchouk,J. V, Nuotio,M.-L., Slagter,S.N., Doiron,D., Fischer,K., Foco,L., Gaye,A., Gögele,M., Heier,M., Hiekkalinna,T., et al. (2014) The prevalence of Metabolic Syndrome and metabolically healthy obesity in Europe: a collaborative analysis of ten large cohort studies. BMC Endocrine Disorders, 14, 1–13.

24. Ainsworth,B.E., Haskell,W.L., Leon,A.S., Jacobs,D.R., Montoye,H.J., Sallis,J.F. and Paffenbarger,R.S. (1993) Compendium of physical activities: Classification of energy costs of human physical activities. In Medicine and Science in Sports and Exercise.Vol. 25, pp. 71–80.

25. Sollie,A., Sijmons,R.H., Lindhout,D., van der Ploeg,A.T., Rubio Gozalbo,M.E., Smit,G.P. a, Verheijen,F., Waterham,H.R., van Weely,S., Wijburg,F. a., et al. (2013) A New Coding System for Metabolic Disorders Demonstrates Gaps in the International Disease Classifications ICD-10 and SNOMED-CT, Which Can Be Barriers to Genotype-Phenotype Data Sharing. Human Mutation, 34, 967–973.

26. Sernadela, Pedro and Horst, Eelke and Thompson, Mark and Lopes, Pedro and Roos, Marco and Oliveira,J. (2014) A Nanopublishing Architecture for Biomedical Data. In 8th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2014).pp. 277–284.

CHAPTER 8

Proposed roadmap to stepwise integration of genetics in family medicine and clinical research

Elisa JF Houwink#, Annet W Sollie#, Mattijs E Numans, Martina C Cornel (# contributed equally; shared first authors)

Published in Clinical and Translational Medicine, February 2013 as: Houwink EJ, Sollie AW, Numans ME, Cornel MC. Proposed roadmap to stepwise integration of genetics in family medicine and clinical research. Clin Transl Med. 2013 Feb 16;2(1):5. doi: 10.1186/2001-1326-2-5.


ABSTRACT

We propose a step-by-step roadmap to integrate genetics in the Electronic Patient Record in Family Medicine and clinical research. This could make urgent operationalization of readily available genetic knowledge feasible in clinical research and consequently improve medical care.

Improving genomic literacy by training and education is needed first. The second

step is the improvement of the possibilities to register the family history in such a way

that queries can identify patients at risk. Adding codes to the ICPC chapters “A21

Personal/family history of malignancy” and “A99 Disease carrier not described further”

is proposed. Multidisciplinary guidelines for referral must be unambiguous. Electronic

patient records need possibilities to add (new) family history information, including links

between individuals who are family members. Automatic alerts should help general

practitioners to recognize patients at risk who satisfy referral criteria. We present a familial

breast cancer case with a BRCA1 mutation as an example.


BACKGROUND

Public health benefits of advancements in understanding the human genome are still to

be realized for common chronic diseases such as cardiovascular disease, diabetes mellitus,

and cancer [1]. International attempts to integrate and operationalize such knowledge

into clinical practice are in the early stages, and as a result, many questions surround the

current state of this translation [1-3]. Most physicians lack genetic knowledge and skills

that might be relevant for decision support in daily practice [4]. Family history taking and

family tree drawing need to be introduced. Oversight of clinical utility of genetic testing

should be supported by e-Health facilities to bypass unfamiliarity with facts on genetic

testing. Shortcomings in registration systems and inadequate implementation of genetics

in existing guidelines are reported and result in inability to register genetic information

in Electronic Patient Records. Privacy and risk of discrimination cause concerns when

registration is considered. Consequently, inadequacy to deliver genetic services is reported

in literature [1]. We present a roadmap (Figure 1) to integrate current genetic knowledge into the Electronic Patient Record and into clinical research in Family Medicine, which would make urgent operationalization of readily available knowledge feasible in daily medical care.

Evidence for necessary change

The clinical relevance of integrating genetics in clinical practice was demonstrated

for several familial diseases such as colorectal cancer and breast cancer. Dove-Edwin

et al. calculated mortality risk reduction up to 80% by identifying and subsequently

screening individuals with an increased familial colorectal cancer (CRC) risk [5]. Cancer

risk management options through genetic testing for BRCA mutations and subsequent

options for preventive surgery after testing positive can empower women and can also

reduce morbidity and mortality [6]. Currently, a large number of patients in whom

screening would be beneficial, are out of sight or being missed by their physicians [7,8].

Barriers to change

Scheuner et al. identified deficiencies in primary care workers’ basic genetic knowledge

and ability to interpret familial patterns [1]. This is in line with our prioritized educational

topics, including knowledge of basic genetic principles, the most common genetic

disorders and family history communication skills [9]. Taylor and Edwards stated primary

care should be encouraged to invest more time and energy in questioning and registering

family history data [10]. However, they also stressed that identified barriers such as time

constraints should be addressed. They identified the need to develop strategies to

overcome difficulties as well as strategies to support accurate record keeping in the

electronic medical record (EMR) [10]. Another identified barrier is the presence of


ambiguous referral guidelines to clinical genetics and other medical specialists for patients with a possibly high risk of familial disease, such as cancer [9]. Computerised decision support might be helpful in familial risk assessment for common cancers (e.g. breast, ovarian and colon cancers); it would enable timely genetic risk assessments and consequently support referrals that are more consistent with guidelines. These results support the implementation of genetics education aimed at enhancing effective referral indications and options.

A roadmap for translation

Figure 1. Proposed roadmap to stepwise integration of genetics in family medicine.

In order to truly bring useful genetic discoveries from the laboratory bench to daily clinical practice, a roadmap is crucial to make urgent translation feasible. First, advances in the genomic literacy of health care providers are indispensable. Secondly, innovative and practical ICT tools to apply this newly acquired knowledge and these skills are needed, such as registration of family history and registry alerts supporting this. We propose a step-by-step roadmap (Figure 1) to effectively integrate genetics in daily family medicine to its full potential:

1. Improve basic knowledge of genetics in clinicians and develop skills and attitude to obtain and interpret a family history through effective education;

For example, training on oncogenetics for GPs was recently developed and evaluated in collaboration with the Dutch College of Family Physicians. Also, a website on genetics targeted at GPs was developed to provide easy access to information on, amongst other topics, genetic diseases, referral guidelines and family history taking (huisartsengenetica.nl, translated “GP and genetics”). Oncogenetic knowledge, skills and attitude were effectively transferred through an accredited online and live interactive training, which could serve internationally as an example for other common topics (e.g. reproductive medicine, familial coronary heart disease and diabetes) and possibly for other medical specialties, provided that it is adapted to the local medical system.

2. Add relevant International Classification of Primary Care (ICPC) codes and other coding strategies for simple registry of family history and develop and support coding skills;

In order to identify and track persons and/or families at risk for hereditary diseases, adequate coding is a starting point. We propose to add a number of codes to enable simple but structured registration of family history. This will enable and support adequate case-finding and decision strategies [8]. In ICPC-2, which is the most frequently used coding system for GPs in Western countries, these codes should be included in Chapter A (General and Unspecified), under A21 “Risk factor for malignancy”. ICPC-2 was developed by the WONCA International Classification Committee and has been accepted by the WHO into its Family of International Classifications. It classifies patient data and clinical activity in the domains of General/Family Practice and primary care, taking into account the frequency distribution of problems seen in these domains. It allows classification of the patient’s reason for encounter (RFE), the problems/diagnoses managed, interventions, and the ordering of these data in an episode of care structure. ICPC-2 has a biaxial structure and consists of 17 chapters, each divided into 7 components (comp.) dealing with symptoms and complaints (comp. 1), diagnostic, screening and preventive procedures (comp. 2), medication, treatment and procedures (comp. 3), test results (comp. 4), administrative (comp. 5), referrals and other reasons for encounter (comp. 6) and diseases (comp. 7) (see http://www.who.int/classifications/icd/adaptations/icpc2/en/index.html). Mapping is available between ICPC and ICD-10, which was developed by the WHO for broad application in healthcare registries. The codes suggested below should also suit other coding systems such as SNOMED, and similar codes should be added for other cancer types.


A21 Personal/family history of malignancy (Existing code)

A21.1 One or more 1st degree family member(s) with breast cancer

A21.2 One or more 2nd degree family member(s) with breast cancer

A21.3 One or more family member(s) with bilateral or multifocal breast cancer

A21.4 Breast cancer in the family in one or more men

A99 Disease carrier not described further (Existing code)

A99.1 BRCA-1 mutation carrier

A99.2 BRCA-2 mutation carrier

A99.3 TP53 mutation carrier

A99.99 Carrier of mutation in other specified gene

3. Improve access to up-to-date and unambiguous referral guidelines;

For example, in the Netherlands multiple referral guidelines for hereditary cancers were developed independently (Oncoline, the Foundation for Detection of Hereditary Tumours (in Dutch: STOET), clinical genetics centres in university hospitals and the Dutch College of Family Physicians (NHG)). However, little usable information is available for General Practitioners, i.e. only for Diagnostics of Breast Cancer and Rectal Bleeding. The guidelines are heterogeneous and difficult to interpret. We propose to improve this by agreeing on national multidisciplinary referral guidelines and providing synchronised online access to up-to-date and easy-to-interpret versions.

4. Provide a service or online app to (self-)register family history, including family relations, that can be coupled with routine healthcare registries and the EMR used in primary care;

The best way to re-use and expand previously recorded family history information, and to view this history from the perspective of a different family member, is by recording parent–child relations and recording each diagnosis directly with the correct family member. This would require functionality to be added to the EMR. In order to overcome privacy issues, an online app or website to register family history is recommended (for example: myfamilyhistory.com or familyhealthware.com).

5. Provide pro-active genetic services integrated in clinical practice and facilitated by ICT (for example family history registry and registry alerts);

For example, the GP or nurse practitioner should be able to (periodically) register family history information directly in the EMR and consult it there. Accurate and up-to-date treatment and referral guidelines should be available, and automatic alerts should pop up when certain combinations of symptoms and familial risk factors indicate referral to a clinical geneticist or other medical specialist.
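The idea of recording parent–child relations once and then viewing the family history from any member's perspective can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Person` structure, the helper names and the toy family are hypothetical and are not part of any existing EMR, and siblings are simplified to "anyone sharing a recorded parent".

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """One family member. All field names here are illustrative assumptions."""
    name: str
    parents: list = field(default_factory=list)    # names of recorded parents
    diagnoses: list = field(default_factory=list)  # diagnoses recorded with this person


def parents_of(family, name):
    return {p for p in family[name].parents if p in family}


def children_of(family, name):
    return {p.name for p in family.values() if name in p.parents}


def siblings_of(family, name):
    # Simplification: anyone sharing at least one recorded parent counts as a sibling.
    mine = set(family[name].parents)
    return {p.name for p in family.values()
            if p.name != name and mine & set(p.parents)}


def first_degree(family, name):
    return parents_of(family, name) | siblings_of(family, name) | children_of(family, name)


def second_degree(family, name):
    relatives = set()
    for parent in parents_of(family, name):      # grandparents, aunts and uncles
        relatives |= parents_of(family, parent) | siblings_of(family, parent)
    for sibling in siblings_of(family, name):    # nieces and nephews
        relatives |= children_of(family, sibling)
    for child in children_of(family, name):      # grandchildren
        relatives |= children_of(family, child)
    return relatives - first_degree(family, name) - {name}


def with_diagnosis(family, names, diagnosis):
    return {n for n in names if diagnosis in family[n].diagnoses}


# Hypothetical toy family: each diagnosis is stored once, with the affected person.
people = [
    Person("grandfather", diagnoses=["cancer, type unknown"]),
    Person("mother", parents=["grandfather"], diagnoses=["breast cancer"]),
    Person("patient", parents=["mother"]),
    Person("sister", parents=["mother"], diagnoses=["breast cancer", "ovarian cancer"]),
    Person("daughter", parents=["patient"]),
]
family = {p.name: p for p in people}

for viewpoint in ("patient", "daughter"):
    first = with_diagnosis(family, first_degree(family, viewpoint), "breast cancer")
    second = with_diagnosis(family, second_degree(family, viewpoint), "breast cancer")
    print(viewpoint, "1st degree:", sorted(first), "2nd degree:", sorted(second))
```

Because each diagnosis is stored with the affected person, the same two breast cancer cases appear as first-degree relatives from the patient's viewpoint and as second-degree relatives from her daughter's viewpoint, without re-entering any data. An alert rule of the kind described in step 5 could, in principle, be built on top of such counts.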


Illustration of the proposed roadmap with a familial breast cancer case in clinical

research and family medicine.

Patient name: Angela B., female, age 35.

Step 1: Angela lives in the city with her husband and two daughters aged 13 and 10. She works as a hairdresser, has been happily married for a decade and the family just bought a new home in the suburbs. She consults the GP on a busy Monday morning with the following complaint: a lump in her left breast, which she noticed during the weekend. The 4 cm irregular swelling is not painful but rather sensitive. The skin over the swelling is a little red and dimpled. Angela has no medical history, but since you followed the oncogenetic training for GPs a few weeks ago you are aware of the possible familial risks of breast cancer and decide to take her family history. Angela’s mother died of breast cancer 10 years ago, at only 50 years of age. Her mother’s father had an unknown cancer and died at age 55. When you ask further, Angela tells you that her sister had bilateral breast cancer at age 30 and died of ovarian cancer at age 33, two years ago. Her two other, younger sisters seem healthy. On her father’s side of the family no one has been diagnosed with cancer yet.

Step 2: If the proposed codes were added, the following could be registered:

Two first-degree family members with breast cancer at an early age: mother (died at age 50) and sister (age 30, died at age 33, bilateral breast cancer): A21.1 and A21.3.

One first-degree family member with ovarian cancer at an early age (sister, age 30, died at age 33).

Step 3: You are alarmed by Angela’s family history and her complaints. After checking the referral guidelines for cancer online, you talk with Angela about referral to the nearest hospital as soon as possible for further diagnostics and possibly necessary surgical treatment. You also inform her of the chance that she might be a carrier of a DNA mutation, which could be further analysed by a clinical geneticist. You promise to call the clinical geneticist and discuss the problem. The clinical geneticist agrees that Angela needs DNA testing based on this positive family history and will invite her this week to start testing quickly, as the result may inform further treatment. You call Angela afterwards and she is grateful that you are taking her case so seriously.

Step 4: Angela is alarmed by the fact that her positive family history for breast and ovarian cancer could mean an added risk for her and her daughters of developing breast or ovarian cancer. She decides to use the online tool to register her family history together with her family members during the upcoming family reunion. Although it was a little awkward at first to ask her family members for their medical history, they agreed to do so anonymously online and to repeat this every 5 years. Angela shows her family tree online to her GP, who registers the relevant information in the Electronic Patient Record, uses it to build a PDF with only the initials and years of birth of family members, and adds this to her record. Not only is she now able to take her family history to her GP; the other family members who used the online tool are able to do so as well. Through this snowball effect the whole family is enabled to operationalize their family history.

Step 5: Five years later Angela’s daughter Stephany, then aged 18, visits the GP with gynaecological problems. She feels a painful swelling. She has started to study law in a different city and her new GP has uploaded her medical and family history into the Electronic Patient Record (EPR). The EPR alerts Stephany’s new GP with a pop-up that Stephany is a carrier of a BRCA2 mutation: the clinical geneticist diagnosed not only Angela with a mutation but unfortunately also her two daughters. Stephany is frequently checked with a physical examination and MRI by a surgeon familiar with familial breast and ovarian cancer, who follows the national guidelines for familial cancer. Now that she has these complaints, you decide to call the surgeon, and after careful deliberation you refer her the same day to the clinic for further diagnostics. Fortunately, no abnormalities are found in the gynaecological examination and vaginal ultrasound.
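The coding in Step 2 of this case can be made concrete with a small Python sketch. It assumes the proposed A21 sub-codes from the roadmap and a simple, hypothetical record layout (`RelativeCancer`); neither the class nor the function exists in any EMR, and only the breast cancer sub-codes are covered because the proposal lists no sub-codes for other cancers.

```python
from dataclasses import dataclass


@dataclass
class RelativeCancer:
    """Hypothetical record of one cancer diagnosis in one relative."""
    relation: str
    degree: int                          # 1 = first degree, 2 = second degree
    sex: str                             # "F" or "M"
    cancer: str                          # e.g. "breast", "ovarian", "unknown"
    bilateral_or_multifocal: bool = False


def proposed_family_history_codes(relatives):
    """Return the proposed A21 sub-codes that apply to a family history."""
    codes = set()
    for r in relatives:
        if r.cancer != "breast":
            continue                     # the proposal only defines breast cancer sub-codes
        if r.degree == 1:
            codes.add("A21.1")           # 1st degree family member with breast cancer
        elif r.degree == 2:
            codes.add("A21.2")           # 2nd degree family member with breast cancer
        if r.bilateral_or_multifocal:
            codes.add("A21.3")           # bilateral or multifocal breast cancer
        if r.sex == "M":
            codes.add("A21.4")           # breast cancer in a male family member
    return sorted(codes)


# Family history as told by Angela in Step 1.
angela_family = [
    RelativeCancer("mother", 1, "F", "breast"),
    RelativeCancer("sister", 1, "F", "breast", bilateral_or_multifocal=True),
    RelativeCancer("sister", 1, "F", "ovarian"),
    RelativeCancer("maternal grandfather", 2, "M", "unknown"),
]

angela_codes = proposed_family_history_codes(angela_family)
print(angela_codes)
```

For Angela's history the sketch yields A21.1 and A21.3, in line with Step 2. The sister's ovarian cancer and the grandfather's unknown cancer produce no code, which illustrates why the roadmap suggests adding comparable codes for other cancer types.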

Extending translational genetic competences

We offered our conceptual framework for stepwise integration of genetics into family medicine and clinical research by adding codes to the ICPC-2 list, taking oncogenetics as an example. Of course this list could be further extended by adding codes for other diseases commonly seen in family medicine, such as diabetes and cardiovascular diseases. Monogenic subtypes in particular (Maturity Onset Diabetes of the Young (MODY), BRCA1/2, familial hypercholesterolemia (FH) and long QT syndrome) are expected to come increasingly to the forefront in primary care. Translational health education research is our guiding principle to improve our translational efforts and ultimately improve (genetic) medical care. Engaging colleagues in health education, clinical and biomedical research and medicine in collaboration will enhance our collective ability to move research from the “data generated from research projects” phase to the “changes in practice and policy” phase, which will then bring us full circle to finally translate genetics into primary care. As genetic discoveries and health education research both advance, they will generate interdisciplinary collaborative endeavors within the broader scope of public health and medicine. The impact of these advances will only become manifest in better decision-making, better advocacy, better health policy and finally improved health if GPs can play a key role in translating potentially life-saving advancements in genetic technologies to patient care. If GPs are to make an effective contribution in this area, not only do their competencies need to be upgraded by offering suitable and effective genetics training, but performance in real practice also needs to be facilitated by operationalizing the integration of genetics in Electronic Patient Records.


References

1. Scheuner MT, Sieverding P, Shekelle PG: Delivery of genomic medicine for common chronic adult diseases: a systematic review. JAMA 2008, 299(11):1320–1334.

2. Kemper AR, Trotter TL, Lloyd-Puryear MA, Kyler P, Feero WG, Howell RR: A blueprint for maternal and child health primary care physician education in medical genetics and genomic medicine: recommendations of the United States secretary for health and human services advisory committee on heritable disorders in newborns and children. Genet Med 2010, 12(2):77–80.

3. Khoury MJ, Gwinn M, Yoon PW, Dowling N, Moore CA, Bradley L: The continuum of translation research in genomic medicine: how can we accelerate the appropriate integration of human genome discoveries into health care and disease prevention? Genet Med 2007, 9(10):665–674.

4. Baars MJ, Scherpbier AJ, Schuwirth LW, Henneman L, Beemer FA, Cobben JM, et al: Deficient knowledge of genetics relevant for daily practice among medical students nearing graduation. Genet Med 2005, 7(5):295–301.

5. Dove-Edwin I, Sasieni P, Adams J, Thomas HJ: Prevention of colorectal cancer by colonoscopic surveillance in individuals with a family history of colorectal cancer: 16 year, prospective, follow-up study. BMJ 2005, 331(7524):1047.

6. Domchek SM, Friebel TM, Singer CF, Evans DG, Lynch HT, Isaacs C, et al: Association of risk-reducing surgery in BRCA1 or BRCA2 mutation carriers with cancer risk and mortality. JAMA 2010, 304(9):967–975.

7. Burke W, Culver J, Pinsky L, Hall S, Reynolds SE, Yasui Y, et al: Genetic assessment of breast cancer risk in primary care practice. Am J Med Genet A 2009, 149A(3):349–356.

8. Rose PW, Watson E, Yudkin P, Emery J, Murphy M, Fuller A, et al: Referral of patients with a family history of breast/ovarian cancer–GPs’ knowledge and expectations. Fam Pract 2001, 18(5):487–490.

9. Houwink EJ, Henneman L, Westerneng M, van Luijk SJ, Cornel MC, Dinant JG, et al: Prioritization of future genetics education for general practitioners: a Delphi study. Genet Med 2012, 14(3):323–329.

10. Taylor MR, Edwards JG, Ku L: Lost in transition: challenges in the expanding field of adult genetics. Am J Med Genet C Semin Med Genet 2006, 142C(4):294–303.

Summary

Chapter 9

Summarizing Discussion

Reflection

When we, as patients, visit our General Practitioner (GP) for a persistent cough, when we are worried about a lump we seem to feel in our breast, or about a little blood in our stool, we assume our GP will record this. We expect he or she will also register measurements and findings, as well as a hypothesis or diagnosis for the cause of our complaints and a plan to reach a diagnosis and treatment, somewhere in the Electronic Health Record (EHR). We also assume the GP will know about our worries when we visit him or her the next time. We are surprised when we visit an out-of-hours clinic or emergency department in the local hospital during the weekend and the doctors do not seem to have access to our EHR or do not know anything about our medical history. We are annoyed when we have to repeat the serious diagnoses we have had in the past to every next doctor we meet. Diagnoses such as asthma, myocardial infarction or even cancer should be available to the locum or the surgeon when our sprained ankle turns out to be broken.

From a patient perspective, despite the anxieties about privacy that are often expressed, reuse and sharing of electronic health record (EHR) data for purposes of care is rational and desirable. In fact, patients nowadays expect high-quality data recording and sharing between doctors. Most patients do not object, and are even happy to contribute, when researchers want to reuse their data (even genomic data) for research, as we know from studies in the field of rare diseases[1]. Many patients also welcome early detection of (genetic) risk of serious disease, as is illustrated by studies but also by the popularity of new e-health tools such as www.yourdiseaserisk.wustl.edu or www.testuwrisico.nl. We do not know how patients feel about the fact that their EHR records are being used to assess quality of care, but we know patients do expect high-quality healthcare[2].

As we have shown in the introduction (chapter 1), data reuse and sharing is highly desirable not only from a patient’s perspective but also from the perspective of the researcher, the quality assessor and the GP. We know that reuse and sharing of EHR data is already becoming commonplace despite serious concerns about data quality and subsequent reusability. We felt there was a need to quantify this problem in Primary Care, with a special focus on diagnosis registry and diagnosis coding and on exploring novel ways of obtaining complete data. Furthermore, we wanted to explore ways to enable EHR data reuse and sharing in a sensible way. We decided to broaden our horizons by working with rare diseases as well as common diseases (cancer), and by seeking cooperation with medical specialists (hospital EHRs) and bio-informaticians. We chose to study disease coding by actually developing a coding system in the field of rare diseases and participating in the development of a coding tool. Because we discovered a lack of application of available genetic knowledge in Primary Care, which is partly due to limitations in the EHR, we decided to develop a roadmap on this subject.


We set ourselves the following aims, which, in our opinion, we have successfully fulfilled.

Aims of the thesis

1. Assess (aspects of) data quality in (parts of) the Primary Care EHR, focussing on diagnosis registry as a central item and diagnosis coding as an important tool;

2. Find strategies and solutions to improve quality of (Primary Care) EHR data and to contribute to the enabling of reuse and sharing of EHR data.

Summary of Results

Part One - Quality of data: literature review & hands on

identification of bottlenecks and areas for improvement

Literature on data quality in primary care is scarce; our review shows that available studies focus mostly on completeness, and some also on correctness, of registry of a small number of data items. Quality of data varies per GP centre and per data category. In general, coded data such as diagnoses, medication prescriptions and laboratory test results are registered fairly accurately and completely, but there is room for improvement. Registry of vital parameters, risk factors and allergies & intolerances is often incomplete and incorrect (chapter 2).

For the studies described in chapters 3, 4 and 5 we used the routine EHR data extracted from practice centres in the Utrecht area, the Netherlands, that are members of the Julius General Practitioners’ Network (JGPN; 120 GPs, 50 practice centres, 290,000 patients). Coded and free-text primary care data from individual patients enlisted with these centres are periodically extracted to the central anonymized EHR. From the two studies we performed to assess the quality (completeness and correctness) of diagnosis registry in the primary care EHR (chapters 3 and 4), we learned that the quality of coded data, as demonstrated for patients with cancer or suspected of having cancer, is suboptimal. GPs do know their cancer patients, but this does not mean that re-users of data can easily find these cancer cases using anonymized, coded EHR data. In both studies we compared cancer cases found in the EHR with the Netherlands Cancer Registry (NCR), a reference standard which is considered reliable. We found that when re-users of data try to select cancer cases using only coded data, on a population level (chapter 3) as well as on an individual level (chapter 4), a large number of cancer cases will be missed (up to 40% false-negatives#) and a large number of cases will be wrongly classified as cancer (up to 50% false-positives#).

We conclude that the quality of coded EHR data has improved over the years and that the type of EHR system used influences data quality. More specifically, we found that in recent years diagnosis registry has become more complete, but as a drawback the number of false-positives has also increased. In our linkage study (chapter 4) we discovered that for 77% of the missing (false-negative) cancer cases, information about the cancer is available elsewhere in the EHR, but only in uncoded plain text. Also, for 38% of the seemingly wrong (false-positive) cases the GP appeared to have correctly registered the cancer diagnosis, including 31% (of 38%) where the diagnosis is not, or not yet, retrievable from the NCR.
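The comparison of coded EHR cases with a reference registry can be illustrated with simple set arithmetic. The sketch below is a toy example, not the method used in chapters 3 and 4: the patient identifiers are invented, and the choice of denominators (false-negatives as a share of reference cases, false-positives as a share of coded EHR cases) is an assumption made here for illustration.

```python
def compare_with_reference(ehr_coded_ids, reference_ids):
    """Compare cases selected from coded EHR data with a reference registry.

    Assumed definitions (for illustration only):
    - false-negative share: reference cases not found in the coded EHR data,
      divided by all reference cases;
    - false-positive share: coded EHR cases not confirmed by the reference,
      divided by all coded EHR cases.
    """
    ehr = set(ehr_coded_ids)
    reference = set(reference_ids)
    false_negatives = reference - ehr
    false_positives = ehr - reference
    return {
        "confirmed": len(ehr & reference),
        "false_negatives": len(false_negatives),
        "false_positives": len(false_positives),
        "false_negative_share": len(false_negatives) / len(reference) if reference else 0.0,
        "false_positive_share": len(false_positives) / len(ehr) if ehr else 0.0,
    }


# Invented identifiers: 10 cases in the reference registry, 8 coded in the EHR.
reference_cases = range(1, 11)              # patients 1..10
ehr_coded_cases = [1, 2, 3, 4, 5, 6, 11, 12]

result = compare_with_reference(ehr_coded_cases, reference_cases)
print(result)
```

In this toy data set four of the ten reference cases are missed and two of the eight coded cases are unconfirmed. As the linkage study shows, an unconfirmed case is not necessarily an error by the GP, so such counts are a starting point for manual review rather than a final verdict.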

In our study described in chapter 5 we reused coded as well as free-text EHR data for a research study to gain experience, and by doing so also assessed GP management of women with breast cancer related concerns. For the period under study, we selected from the EHR all women who presented with physical signs and symptoms of the breast (for instance pain in the breast or a lump) but also women who presented with fear of breast cancer or a family history of breast cancer. We found that concerns relating to breast cancer are presented to an average GP frequently (incidence rate 25.9 per 1,000 women per year), the larger part consisting of women experiencing physical signs and symptoms of the breast (85.3% or 23.2 per 1,000 per year). Symptomatic and asymptomatic women are referred for further investigation equally often (50%), so the GP’s diagnostic workup phase does not seem to be paramount in the decision process. Referral practice for annual screening and genetic counselling is suboptimal, and relevant information concerning family history of cancer is often missing in the EHR. Identification and management of women with an increased risk of breast cancer by GPs can be improved, as can identification and reassurance of women without an increased risk or relevant symptoms. In this study we presented incidence rates based on extracted EHR data, taking into account the limitations of routine care data (see recommendations) but without applying corrections to the results, because information on the data quality of symptom and family history registry in Primary Care is lacking. Furthermore, considering the dimensions of data quality (see table 1 of the introduction), in all three studies we experienced that data in the EHR can be incomplete, incorrectly coded and not up to date (current). We also found examples of lacking concordance and plausibility of data, but these two dimensions were not structurally assessed in our studies.

Part two: Strategies & Solutions for improving data-quality and

enabling reuse and sharing of EHR data

Improving disease coding systems and developing tools that map codes between those systems can help to increase EHR data quality, not just within primary care, but in health care in general. We have studied the quality of coding systems for EHR use in the field of rare diseases, in particular of metabolic disorders. Collectively, this group of diseases is large (>6,000) and growing steadily due to the identification of new diseases or variants of known diseases and improved clinician awareness. We know from experience that the annotation of rare diseases by means of adequate coding systems, and thus the possibility to accurately code patients with these disorders and identify them in EHRs, has lagged behind. This has recently been confirmed by other researchers[3]. Our study, as presented in chapter 6, demonstrates that there are large gaps for metabolic disorders in the widely used existing international coding systems ICD-10 (International Classification of Diseases; 76% missing codes) and SNOMED-CT (Systematized Nomenclature of Medicine Clinical Terms; 54% missing codes). Based on our clinical experience, we suspect that there may be similar gaps for other types of rare disorders. This has also been recognized by the SNOMED and Orphanet organisations, which have recently joined efforts to improve coding for rare diseases. Existing gaps are a barrier to database- and data-sharing efforts, especially for rare disorders, where the disease code is often used as a key to communication. We have shown that with the help of dedicated clinicians and code development agencies, the problem of coding gaps for rare disorders can be successfully addressed and that rich and up-to-date coding systems actually contribute to the quality of annotation for rare diseases, and thus to healthcare for patients with these diseases. Although this study was performed in a hospital setting, patients with a rare disease diagnosis should also be recognizable as such in Primary Care. Furthermore, this study provided insight into the extensive process of developing a high-quality, usable, up-to-date coding system which will actually be adopted by prospective users.

Another barrier to data sharing for various purposes is the need to standardize

semantics of data values such as diagnoses and other phenotypical codes. Ideally coding

systems are aligned before data entry but often retrospective standardization will be

required. In chapter 7 we describe the development of SORTA, a software tool to ease

data (re-)coding and mapping between coding systems. We participated in this study by

using SORTA for a pilot project to map an existing Dutch coding system for phenotype

coding of physical symptoms to the international Human Phenotype Ontology (HPO) and

demonstrated that existing coding systems can be harmonized with significant speed

and quality improvement compared to earlier manual procedures.
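To give a feel for what retrospective, tool-assisted mapping between coding systems involves, the sketch below ranks candidate target terms for each local term by plain string similarity. It is explicitly not SORTA's matching algorithm: it uses Python's standard `difflib` as a stand-in, and the term identifiers (`EX:001` and so on) and labels are placeholders invented for this example, not real HPO identifiers.

```python
from difflib import SequenceMatcher


def normalise(label):
    """Lower-case a label and treat hyphens and repeated spaces as single spaces."""
    return " ".join(label.lower().replace("-", " ").split())


def similarity(a, b):
    """String similarity between two labels, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def suggest_mappings(local_terms, target_terms, top_n=3, min_score=0.5):
    """For each local term, return up to top_n (score, id, label) candidates.

    The candidates are suggestions only; a human curator still has to confirm
    or reject each proposed mapping.
    """
    suggestions = {}
    for local in local_terms:
        scored = [(round(similarity(local, label), 2), term_id, label)
                  for term_id, label in target_terms.items()]
        scored = [s for s in scored if s[0] >= min_score]
        suggestions[local] = sorted(scored, reverse=True)[:top_n]
    return suggestions


# Placeholder target vocabulary (identifiers are invented for this example).
target_terms = {
    "EX:001": "Seizure",
    "EX:002": "Hearing impairment",
    "EX:003": "Short stature",
}
local_terms = ["Seizures", "Hearing-impairment"]

mappings = suggest_mappings(local_terms, target_terms)
for local, candidates in mappings.items():
    print(local, "->", candidates)
```

The design choice worth noting is that the tool proposes ranked candidates with a score instead of assigning codes automatically, so that the speed gain over fully manual mapping does not come at the cost of unchecked errors.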

Coding of disease diagnoses and symptoms is pivotal in EHR performance, but not all

types of relevant medical information, e.g. family history, may be captured well in this manner.

The EHR data structure design should allow for storing basically all relevant information and

matching codes should be available to capture that information. In addition, the quality

of the user interface itself is one of the other factors contributing to EHR performance.

These aspects were studied in the context of delivery of genetic services, which despite

readily available genetic knowledge is reported to be inadequate in Primary Care, among

other medical specialties. We have confirmed this problem in chapter 5 where we found

suboptimal referral practices for annual screening/genetic counseling but also information

missing in the EHR concerning family history of women with breast cancer related concerns.

In chapter 8 we identify existing barriers to implementation of genetic services in Primary

Care such as shortcomings in design and interface of EHR systems to register genetic

information. We propose a step-by-step roadmap including adjustments to the EHR and

to existing coding systems to integrate genetics in General Practice and clinical research.

This roadmap can be used as an example for introducing other complicated additions or

adjustments to the EHR or to coding systems driven by needs in daily medical practice.


Lessons Learned

Summarizing and interpreting the results of the studies performed, we conclude there are a number of lessons to learn from this thesis:

1. Data-quality in primary care currently is suboptimal, even for a key-item such as

the coded diagnosis and for a serious disease such as cancer; relevant information

regarding important risk factors such as family history is either frequently missing,

incorrectly coded or cannot be found easily;

2. Despite suboptimal quality and subsequent reusability with clear limitations, the primary care EHR is a rich and voluminous source of (mostly uncoded) medical data, often comprising many years of follow-up;

3. Because of suboptimal quality, primary care data should only be reused by people

that fully understand the context of routine care data-entry and take explicitly into

account the limitations of this data which can be assessed using the checklist in

appendix 1 of this thesis;

4. There is a need to improve data-quality since reuse and sharing are desirable and

expanding, ideally at the source (at data entry) and supported by adequate coding

options. GPs can and should be facilitated and supported to achieve this in a number

of ways (see recommendations below);

5. Adequate and up-to-date coding systems are pivotal for data reuse and sharing,

not only for common but also for rare diseases and can successfully be developed

using not only coding agencies but also dedicated clinicians and can be facilitated

by software tools;

6. These coding systems are only valuable if they are continually maintained, provide adequate synonyms and relevant crosslinks to other systems and are equipped with a guideline for use and an extensive fool-proof thesaurus;

7. Obligatory coding in EHR systems results in more complete registry but also leads to

(over-) registration errors;

8. Linkage of EHR records to other data-sources can be useful to validate diagnoses but

is currently complex and time-consuming;

9. The Primary Care EHR can be complementary to other data sources, even to a known

reliable reference standard such as the Netherlands Cancer Registry;

10. Concerning reuse of Primary Care EHR data, there are many stakeholders involved, all interested in data reuse but from different perspectives: patients, GPs, the Dutch College of General Practitioners (NHG), EHR suppliers, health inspection/insurance companies (quality assessors), hospitals and out-of-hours clinics, researchers, Academic Practice Based Research Networks, departments of Vocational Training for GPs, coding agencies/organizations, and owners of external data sources such as the NCR.


Recommendations

The lessons learned can be translated into recommendations for the various stakeholders involved.

Patients

Patients should be made aware of anonymous and non-anonymous EHR data reuse which requires their consent.

Patients should be made aware that there is a difference between reuse of data for purposes of care, which cannot be done anonymously, and reuse for purposes such as research, which can be done with anonymized patient data. Also, there are ways to let re-users work with anonymized data and still provide options to contact the patient, if necessary, through a third party. This means there are options to avoid privacy issues.
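One common way to let re-users work without identifying data while keeping a route back to the patient is keyed pseudonymisation handled by a trusted third party. The sketch below is a minimal illustration of that idea and not a description of any system used in this thesis; the class name, key and identifiers are hypothetical, and strictly speaking the result is pseudonymised rather than fully anonymised data.

```python
import hashlib
import hmac


def pseudonym(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a patient identifier with a keyed hash."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


class TrustedThirdParty:
    """Hypothetical third party that holds the key and the re-identification table."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key
        self._lookup = {}                      # pseudonym -> original identifier

    def pseudonymise(self, patient_id: str) -> str:
        pseudo = pseudonym(patient_id, self._key)
        self._lookup[pseudo] = patient_id
        return pseudo

    def reidentify(self, pseudo: str):
        """Only the third party can map a pseudonym back, e.g. to contact a patient."""
        return self._lookup.get(pseudo)


ttp = TrustedThirdParty(secret_key=b"demo-key-not-for-real-use")

# The researcher receives the coded record with a pseudonym instead of an identifier.
research_record = {
    "patient": ttp.pseudonymise("patient-001"),
    "episode_code": "A21",
}
print(research_record)
```

The researcher never sees the key or the lookup table, so the same patient always receives the same pseudonym across extractions, while any contact with the patient has to go through the third party.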

Patients should be stimulated to take responsibility and make sure all important diagnoses and information regarding allergies and intolerances are known and registered in their medical file.

The Primary Care EHR is central for registry of a patient’s medical condition over the years, and the GP in particular has an overview of this data. GPs receive results from laboratory and diagnostic tests and medical reports whenever their patients visit a medical specialist or a paramedical professional. Patients have the right to read and check their own medical files, including their EHR record at the GP, and should do so at least once to suggest possible corrections and additions. They should take responsibility and make sure all important diagnoses are recorded as Episodes in their EHR, as well as information regarding allergies and intolerances. If patients are convinced their GP is keeping adequate records and they trust their GP to keep doing this, it is safe and better to let him/her do the file-keeping. In recent years a number of companies have introduced software that lets patients maintain their own online medical file, beside or instead of the doctor’s file, such as www.zorgdoc.nl, www.patient1.nl or www.healthvault.com. It is however not easy to assemble and interpret all the right information to keep your own medical file, as is illustrated by the premature exit of Google Health in 2011 (http://www.medischcontact.nl/Nieuws/Laatste-nieuws/Nieuwsbericht/125705/Google-health-stopt.htm). However, for patients with chronic or rare diseases, keeping personal records of certain measurements and symptoms or complaints can be very useful, especially if these personal records can be combined with medical files in the future. When the EHR record is complete and correct, only privacy issues could be barriers to data reuse and sharing, for instance when “opting-in” to share data through the LSP[4], and these issues should be individually weighed by every patient.

General Practitioners (GPs)

GPs should improve data quality by investing in updating and coding key items in their EHR files and by optimizing working processes at the GP practice.

Our research shows that GPs do know their patients, but relevant information is missing or hidden within the EHR text because it is uncoded, and this hampers and confounds data reuse and sharing. Data quality in many GP EHRs can be improved. We acknowledge the need to lessen the administrative burden and therefore support, like almost 70% of Dutch GPs, the movement “het roer moet om”[5]. Nevertheless, we feel that data reuse will only expand and that GPs can benefit if they improve their data quality. They should, however, be supported and facilitated much more than they are today (see the recommendations below for the Dutch College of General Practitioners and EHR suppliers). There are several ways to improve data quality from a GP perspective, as we will explain.

GPs should invest in updating key items (such as important diagnoses, allergies & intolerances and risk factors) in their EHR files, starting by making sure there are coded, correctly dated Episodes for every disease the patient has or has had that could have medical consequences or could be important for future medical decisions for the patient and/or his or her family. It is important to add the code for a disease only after the diagnosis has been established and, for suspected disease, to code the main symptom, in line with the Guideline for Adequate EHR Registry (ADEPD)[6]. Also, the date of the Episode should be the date the final diagnosis was made, not the date of data entry.
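Why the Episode date matters can be made concrete with a small sketch. The record layout and all names below (`Episode`, `diagnosis_date`, `entry_date`, `age_on`) are hypothetical and chosen for illustration only; they do not describe any particular EHR system. The code Y77 is assumed here to denote prostate cancer in ICPC, and the dates are made-up example values.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Episode:
    """Hypothetical, simplified Episode record (not an actual EHR schema)."""
    icpc_code: str
    diagnosis_date: date  # date the final diagnosis was made
    entry_date: date      # date the Episode was typed into the EHR


def age_on(birth_date: date, on_date: date) -> int:
    """Return the age in whole years on a given date."""
    years = on_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in that year.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


# Made-up example: diagnosis in 2005, but the Episode was only entered in 2012.
birth = date(1960, 6, 15)
episode = Episode("Y77", diagnosis_date=date(2005, 3, 1), entry_date=date(2012, 9, 10))

age_at_diagnosis = age_on(birth, episode.diagnosis_date)
age_at_entry = age_on(birth, episode.entry_date)
print(age_at_diagnosis, age_at_entry)
```

In this made-up example, dating the Episode by the moment of data entry would overstate the age at diagnosis by eight years, which is exactly the kind of error that matters when familial cancer risk is assessed.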

GPs should evaluate and update working processes at the GP practice so that diagnosis registry is integrated into the handling of letters received from a hospital or diagnostic laboratory. Although most EHR systems do not adequately support registry of a positive family history for genetic disease such as cancer, it is useful to record this information since it can have major consequences (suggestion: create a separate Episode and code it with one of the available ICPC-1 codes for positive family history, A29.01 – A29.07).

We expect data reuse and sharing to expand in the coming years, since this is the key to transitions of care, such as the transfer of follow-up care for cancer patients from hospitals to Primary Care. We also expect, like the NHS in England (https://www.gov.uk/government/publications/personalised-health-and-care-2020), that it will not be long before patient empowerment is taken a step further and patients seek and gain the right to access their full EHR record online, and even to add notes to their EHR (but not to edit medical entries made by the GP). From a GP perspective it is undesirable to reuse and share low-quality data for any purpose, and certainly not with the patient, since this can have many negative consequences (see the discussion in chapter 4), besides the risk of the patient losing confidence in their GP.

We recommend that GPs participate in relevant projects which require data reuse and sharing.

We recommend that GPs who have improved the quality of their data participate in projects, for instance research, by sharing patient data. Sharing patient data for other purposes will further stimulate improvement of its quality. Privacy barriers can be addressed, and suboptimal coding could potentially be remedied by emerging text mining techniques[7]. Participating in a Practice Based Research Network at a local academic centre can be worthwhile for GPs, for instance to obtain benchmarking information (“spiegelinformatie”), which can be valuable for management purposes. Last but not least, many EHR systems offer useful but often unrecognized ICT functions, for instance to build selection queries, which can also be used to assess patient mix and assemble relevant management information.

Dutch College of General Practitioners (NHG)

The Dutch College of General Practitioners (NHG) should on the one hand support and facilitate improvement of data quality, and on the other restrict the unlimited reuse and sharing of EHR data.

This is especially true for uncoded plain text in the EHR, which is more prone to misinterpretation; its reuse should be restricted to those who fully understand the Primary Care context in which the data were registered. The focus should be on improving, if possible (re-)coding, and subsequently sharing and reusing only a limited set of key data items, including the diagnosis. The quality of these data might be improved by implementing tools that assist in recoding text (e.g. text mining and optimized thesauri). This is in line with national developments such as the Continuity of Care record developed in the project Registry at the Source of the NFU (Netherlands Federation of University Medical Centres) (http://www.nfu.nl/thema/registratie-aan-de-bron/). These key items should be chosen as a subset of the seventeen data items that are part of the Continuity of Care record for hospitals, coded with SNOMED.

The Dutch College of General Practitioners should study and test techniques such as voice recognition and text mining to facilitate recording of high-quality data at the source.

Supporting the improvement of data quality can be done in a number of ways, without losing sight of the actual goal of the EHR: supporting the primary process in everyday General Practice. For the GP this means facilitating easy, fast and user-friendly recording of consultations. In most EHRs this is currently suboptimal, and actions taken to improve data quality should not add to this burden but rather enhance functionality. This means that existing and upcoming techniques such as voice recognition and text mining should be studied and tested.

Furthermore, the implementation of the EHR Reference Model by EHR suppliers should be prioritized and stimulated, and amendments should be made to the ADEPD and the Reference Model.

The publication of the EHR Reference Model for EHR suppliers provides a firm basis, but the implementation of the standard needs immediate attention. Many EHR suppliers have not yet implemented recent versions of the model. Furthermore, the Reference Model and ADEPD guidelines should be amended on, among other things, the following items:

1. The Episode date should be the date the diagnosis was made (not the date of the first attached consultation); the system should therefore propose this date, and it should be possible to alter it. Knowing the age at diagnosis is important, for instance, for assessing familial risk of cancer, but also for research purposes;

2. Enable attaching consultations to more than one Episode, which will greatly simplify and speed up registration of consultations with multiple symptoms. Investigate the options and consequences of selecting more than one ICPC code per consultation;

3. Enable the registry of a family history within the context of the EMR, for instance following the roadmap we presented in chapter 8;

4. Develop and add to the Reference Model a list of integrity checks for the EMR (for instance: it should not be possible to register prostate cancer for a female patient);

5. Design ways that support easy registration of suspected disease/differential diagnoses, pathologically proven disease, recurrent disease, etcetera;

6. Enable registry of a suspected or proven rare disease including relevant coding, by making use of existing and supporting connected coding systems, and develop a way to make these patients “visible” to the GP.
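The integrity check mentioned in item 4 could look roughly like the following minimal sketch. It rests on an assumption made here for illustration: that the ICPC chapter letters Y (male genital system) and X (female genital system, including breast) can be treated as sex-specific. The names `SEX_SPECIFIC_CHAPTERS`, `NewEpisode` and `integrity_errors` are hypothetical and are not part of the Reference Model or of any EHR product.

```python
from dataclasses import dataclass
from typing import List

# Assumption for illustration: ICPC chapters treated as sex-specific.
# Y = male genital system, X = female genital system (including breast).
SEX_SPECIFIC_CHAPTERS = {"Y": "M", "X": "F"}


@dataclass
class NewEpisode:
    """Hypothetical Episode about to be saved at data entry."""
    icpc_code: str
    patient_sex: str  # "M" or "F"


def integrity_errors(episode: NewEpisode) -> List[str]:
    """Return a list of integrity problems; an empty list means none were found."""
    errors = []
    chapter = episode.icpc_code[:1].upper()
    expected_sex = SEX_SPECIFIC_CHAPTERS.get(chapter)
    if expected_sex is not None and episode.patient_sex != expected_sex:
        errors.append(
            f"Code {episode.icpc_code} is not plausible for patient sex {episode.patient_sex}"
        )
    return errors


# A prostate cancer code (Y77, assumed) entered for a female patient is flagged.
print(integrity_errors(NewEpisode("Y77", "F")))
print(integrity_errors(NewEpisode("Y77", "M")))
```

A real list of checks would need clinical review before being added to the Reference Model, since a chapter-level rule like this one is deliberately crude.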

Investigate the digital integration of guidelines in the EHR and monitor the further development of the ICPC coding system.

Furthermore, it would be useful to investigate ways to integrate guidelines in the EHR, for instance to support GPs in risk assessment of familial cancer and subsequent referral. Last but not least, the monitoring and further development of the ICPC-1 coding system remains an important issue. The system could be improved by extending ICPC-1 with useful codes (such as those suggested in chapter 8), but also by providing a fool-proof thesaurus that suggests coding options during registration of consultations. This thesaurus should comprise adequate synonyms and relevant crosslinks (e.g. with SNOMED) to facilitate data sharing between sources.

EHR suppliers

Investigate how user-interface and system design can be adjusted to support high-quality data entry at the source.

We demonstrated that the type of EHR system used influences data quality in the EHR. EHR suppliers should investigate (in cooperation with an academic research team) how their user interfaces and system design can be adjusted to actually improve data quality. This thesis provides a number of directions: facilitate user-friendly and accurately coded diagnosis registry without encouraging false-positives, and add options to directly suggest and register the correct date of diagnosis, suspected, recurring and metastasizing disease, treatment, increased markers (e.g. Prostate Specific Antigen, or PSA) and a positive family history within the context of the EHR. Extending integrity checks at data entry would improve data quality as well.

Integrate referral guidelines into the EMR and facilitate feedback through easy-to-use selection queries.

Furthermore, EHR suppliers should consider optimizing the availability of up-to-date and easy-to-use (online) referral guidelines by integrating them into the EMR. Facilitating feedback at practice level by providing easy-to-use selection queries for the GP would also be worthwhile. Last but not least, the feasibility of voice recognition and text mining to facilitate structured data entry and retrieval should be investigated.

Quality Assessors (health inspection/insurance companies)

Stop measuring the quality of registration and find ways to adequately measure quality of care in dialogue with GPs.

Quality Assessors should be aware that data quality in Primary Care is suboptimal, even for a key item such as a cancer diagnosis. The current list of indicators in Primary Care (www.nhg.org/themas/publicaties/download-indicatoren) that can be calculated by retrieving information from the EHR relies heavily on adequate disease coding, for instance because the total number of patients with a certain disorder is used as a denominator. Taking into account the patient mix of a practice is rightly becoming more important, which is another reason to aim for a reliable denominator. Also, patients with a certain disorder such as asthma or diabetes are identified first, and information registered within those patients’ records, such as smoking status or blood pressure measurements, is then counted. This means that incorrect or incomplete diagnosis registry will bias results. Furthermore, we suspect that the quality of data for other items in the EHR, such as risk factors, will also be suboptimal. The “paper tiger” that is being created before our eyes therefore measures, inaccurately, the quality of registration instead of the quality of care, and should be stopped, the sooner the better.
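How an unreliable denominator distorts an indicator can be illustrated with a small calculation. All numbers below are invented for the purpose of the example and are not results from this thesis; the function name `indicator` is likewise hypothetical.

```python
def indicator(numerator: int, denominator: int) -> float:
    """Share of patients with the disorder for whom the item is recorded."""
    return numerator / denominator


# True situation in a hypothetical practice: 100 patients have the disorder,
# and a blood pressure measurement is recorded for 80 of them.
true_patients = 100
true_with_measurement = 80

# What the coded EHR data show: 20 true patients lack the code (false-negatives)
# and 25 patients carry the code without having the disorder (false-positives).
coded_true = 80
coded_true_with_measurement = 64
coded_false = 25
coded_false_with_measurement = 5

true_score = indicator(true_with_measurement, true_patients)
measured_score = indicator(
    coded_true_with_measurement + coded_false_with_measurement,
    coded_true + coded_false,
)
print(f"{true_score:.1%} {measured_score:.1%}")
```

Under these assumed numbers the practice would score about 66% instead of 80%, although the care delivered is identical; the difference is produced entirely by the quality of diagnosis registry.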

Education

Add “adequate health file recording” to the MD curriculum and teach GPs the necessary skills.

Since digital health file recording is standard procedure in General Practice and, more recently, also in hospitals, there is a need for a new subject, “adequate health file recording”, to be added to the MD curriculum at university. Doctors and GPs (to be) should be made more aware of the consequences of recording choices and be taught necessary skills such as correct coding. This subject should also be integrated more widely into courses for practising GPs and other medical professionals.

Researchers & Practice Based Research Networks (PBRNs)

When working with routine care data, validate diagnoses.

Using EHR data means understanding and taking into account the limitations of routine care data. If researchers choose to work with EHR data, the diagnoses should be validated, either by linkage to external sources or by other means. Linkage to other sources could decrease the number of false-negative records, so that more cases can be traced and included. False-positive records can only (partly) be identified by studying the full EHR text, which is time-consuming and may be undesirable considering privacy issues.
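Validation by linkage can be sketched as a comparison of two sets of patient identifiers, using the definitions of false-negatives and false-positives given in the Notes. This is a minimal sketch under stated assumptions: the function name `compare_with_reference` and the pseudonymized identifiers are hypothetical, and a real linkage would require a trusted third party and a proper matching procedure.

```python
from typing import Dict, Iterable, Set


def compare_with_reference(
    ehr_ids: Iterable[str], reference_ids: Iterable[str]
) -> Dict[str, Set[str]]:
    """Classify cases by presence in the EHR and in the reference registry."""
    ehr, reference = set(ehr_ids), set(reference_ids)
    return {
        "confirmed": ehr & reference,        # coded in EHR and in the registry
        "false_negative": reference - ehr,   # in the registry, not coded in EHR
        "false_positive": ehr - reference,   # coded in EHR, not in the registry
    }


# Made-up pseudonymized identifiers.
ehr_cases = {"p01", "p02", "p03", "p07"}
ncr_cases = {"p01", "p02", "p04", "p05", "p06"}

result = compare_with_reference(ehr_cases, ncr_cases)
missed_share = len(result["false_negative"]) / len(ncr_cases)
unconfirmed_share = len(result["false_positive"]) / len(ehr_cases)
print(sorted(result["false_negative"]), sorted(result["false_positive"]))
```

The resulting lists are only a starting point: apparent false-positives in particular still need review of the full record before they can be called errors.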

PBRNs should seize the opportunity to support participating General Practitioners in improving the data quality in their EHRs, for instance by providing benchmarking information.

In this way they could offer practices additional advantages of participating in their networks. With benchmarking and management information, GPs could actually assess their patient mix and find EHR records that need quality improvement.

Future Research crossing boundaries

In this thesis we have been able to successfully assess data quality in Primary Care for certain data items and have also been able to identify strategies and solutions to improve data quality to actually enable reuse and sharing. We realize these are pieces of a large jigsaw that has to be completed in the coming years.

We believe we have only been able to obtain results and devise recommendations

because we have crossed boundaries between academic disciplines: primary care, clinical

genetics, medical informatics, computer science and bio-informatics. Working with

scientists from other disciplines provides new insights and solutions to research questions.

A number of research challenges remain to be studied in the near future, all of them interdisciplinary. First of all, besides completeness and correctness, other dimensions of data quality should be evaluated: concordance, plausibility and currency, not only for diagnosis registry but also for other key data items, for instance risk factors, treatments and allergies.

Secondly, user-interface designers should be involved in these studies: what aspects

of user interfaces in the different EHRs lead to differences in data quality and how can

user interfaces be improved to enhance data quality?

Thirdly, the design, implementation and evaluation of actual interventions in the

GP practice to improve data quality could provide the effective interventions needed to

improve daily practice.

Fourthly, the possibilities of natural language recognition to suggest coding alternatives during data entry should be investigated.

Last but not least, it is necessary to experiment with patient-entered data to assess the usability of these data for various purposes, first of all care. This can be done, for instance, by developing an app or software tool that patients can use to enter family history information.

About our patient

January 2020: a 48-year-old male visits his General Practitioner (GP) for a persistent mucus-producing cough. A few days earlier he made the appointment online and entered the reason for consultation and his complaints in the text box. He also answered a few multiple-choice questions about his complaints, presented by the EHR system and triggered by the reason for encounter. Just before the allotted time the GP reads this information and looks at the patient’s personal health data, which include data from various apps the patient uses, such as exercise apps. She notices that the patient has lost some weight, but also that the training frequency and duration of this running enthusiast have decreased substantially in the last 4 weeks. The GP asks some additional questions and performs a physical examination, which turns out to be normal. She summarizes her findings orally using speech recognition software and along the way selects relevant codes, prompted by the system, for symptoms, signs and differential diagnosis. On her screen a pop-up appears (based on the guideline “Acute cough”) asking her whether a request for a chest X-ray should be sent to the nearest hospital, selected on diagnostic quality, reimbursement by the patient’s health care insurance company and shortest waiting list. She clicks “yes” and schedules an e-consultation for follow-up a week later. The GP has a few minutes to spare and chats with her patient about his wife, children and his new job.

This thesis hopefully contributes to the improvement of EHR data in general and to the

exposure of the true goldmine these data can become, with the ultimate goal to improve

care for patients with common and rare diseases.

Notes

# False-negatives are cancer cases that are present in the NCR but not in the Primary Care EHR.

# False-positives are cases that are registered in the Primary Care EHR as having cancer but are not present in the NCR.

Checklist before EHR data reuse and/or sharing

nr Item Check

Relevance

1 List the data items you need

2 Critically assess the items you just listed on necessity to answer your (research) question. Delete every item that is not absolutely necessary

Data Quality (for each data item)

3 Gather existing information on the quality of data for each item (dimensions: completeness, correctness, concordance, plausibility, currency)

Origin (for each data item)

4 Who entered this data?

5 For what purpose was this data entered?

6 What information is captured with this data?

7 What information is NOT captured with this data?

8 Could entry of this data be biased in any way?

9 Are there other ways to enter the same data in this system?

10 Could another user with the same role, decide to enter this data differently or not at all?

Condition (for each data item)

11 When was the data entered (relative to disease process)?

12 Is there any metadata available?

13 Was the data changed since entry, why and by whom?

14 Was there financial benefit for registering this data at all or in a certain way?

15 List possible errors that could have occurred at data entry

Format (for each data item)

16 What format was used entering the data? If coded: what coding system?

17 If a coding system was used: what version, using which instructions? Check out the alternative codes in the system for registry of this item.

18 Were there any restriction rules in the EHR system for entry of this data?

Assessment

19 Critically assess every data item using the information gathered and determine usefulness for answering (research) question.

NB: privacy /policy issues are not included in this list

References

1. Burstein MD, Robinson JO, Hilsenbeck SG, et al. Pediatric data sharing in genomic research: attitudes and preferences of parents. Pediatrics 2014;133:690–7. doi:10.1542/peds.2013-1592

2. Lateef F. Patient expectations and the paradigm shift of care in emergency medicine. J Emerg Trauma Shock 2011;4:163. doi:10.4103/0974-2700.82199

3. Fung KW, Richesson R, Bodenreider O. Coverage of rare disease names in standard terminologies and implications for patients, providers, and research. AMIA Annu Symp Proc 2014;2014:564–72. http://www.ncbi.nlm.nih.gov/pubmed/25954361 (accessed 16 Jun 2016).

4. VZVZ (Vereniging van Zorgaanbieders voor Zorgcommunicatie). Sharing your medical file and the LSP (Brochure: Uw medische gegevens elektronisch delen?). https://www.vzvz.nl/page/Zorgconsument/Links/Informatie/Informatiemateriaal

5. Het manifest van de bezorgde huisarts. Het roer moet om (free translation: ‘We need a radical change’). www.hetroermoetom.nu; www.hetroergaatom.lhv.nl

6. The Dutch College of General Practitioners. Guideline adequate EHR registry. Revised version 2013. Available at: https://www.nhg.org/themas/publicaties/richtlijn-adequate-dossiervorming-met-het-epd.


Chapter 10

Dutch Summary (Nederlandse Samenvatting)

Acknowledgements (Dankwoord)

Curriculum Vitae

Dutch Summary (Nederlandse Samenvatting)

Reflection

When we visit our General Practitioner (GP) for a cough that will not go away, or when we are worried about a lump in the breast or some blood in the stool, we assume that our GP makes a note of this. We also expect him or her to record the results of the physical examination in the Primary Care Electronic Health Record (EHR; in Dutch also HIS, Huisartsen Informatie Systeem), together with a hypothesis or diagnosis that could explain our complaints and, of course, a plan for diagnostics and/or treatment. We therefore assume that he or she knows our concerns when we return. We are surprised, by contrast, when we visit an out-of-hours GP service or the Emergency Department of our local hospital at the weekend after a minor sports accident and the doctors turn out to be unable to view our record and to know nothing about our medical history. We become irritated when we have to repeat important diagnoses over and over again to every new doctor we see. Diagnoses such as asthma, a previous myocardial infarction or cancer should be visible to the GP on duty at the out-of-hours service and to the emergency physician when our sprained ankle turns out to be broken after all.

From the patient’s perspective, reuse of data from the Primary Care EHR for care purposes is logical and desirable, despite concerns about privacy that are regularly voiced, for instance in the media. Nowadays patients expect medical data to be recorded, to be of good quality and to be shared between doctors. We know from studies in the field of rare diseases that most patients have no objection and are happy to contribute when researchers want to reuse their medical data. This even holds for information about our genome (genetic data). Many patients are positive about early detection of (genetic) risk of serious disease, for which this kind of data is needed, as is shown by various studies and also by the popularity of e-health applications such as www.yourdiseaserisk.wustl.edu or www.testuwrisico.nl. We do not know, however, what patients think of the fact that their medical data are also used to evaluate quality of care, but we do know that patients expect a high quality of care.

As we showed in the introduction (chapter 1), reuse and sharing of medical data are desirable, not only from the perspective of the patient, but also from the perspective of the researcher, the quality assessor and the GP. We know that reuse and sharing of medical data already take place on a large scale, despite concerns about the quality of these data and their resulting reusability. We felt that, on the one hand, there was a need to map and quantify this problem for Primary Care, with a special focus on diagnosis registry and diagnosis coding, and that, on the other hand, there was a need to devise new ways to obtain medical data of good quality. Furthermore, we wanted to find out whether and how medical data from the Primary Care EHR can be reused and shared sensibly. We decided to broaden our horizon by working with both rare and more common diseases (frequently occurring cancers) and by seeking collaboration with medical specialists (hospital Electronic Health Records) and with (bio-)informaticians. We chose to study diagnosis coding by developing a coding system ourselves within the field of rare diseases and by participating in the development of a coding application. Because we discovered that available knowledge of genetics is insufficiently applied in Primary Care, partly owing to limitations of current Primary Care EHR systems, we decided to develop a “roadmap” in this area.

In summary, we set ourselves the following objectives:

Objectives of this PhD research

1. Assess (aspects of) data quality in (parts of) the Primary Care EHR, focusing on diagnosis registry as a central element and on diagnosis coding as an important tool;

2. Find strategies and solutions to improve the quality of data in the Primary Care EHR and to enable reuse and sharing of these data.

Summary of the Results

Part 1 – Data quality: literature study and ‘hands-on’ identification of bottlenecks and areas for improvement

Literature on data quality in Primary Care is scarce; our literature review shows that the available studies mainly address completeness, and some address correctness (accuracy), of a small number of data items. Data quality varies between GP practices and between data categories. In general, data that can be coded, such as diagnoses, medication prescriptions and laboratory test results, are recorded fairly correctly and completely, but there is room for improvement. Vital parameters, risk factors and allergies & intolerances are often recorded incompletely and incorrectly (chapter 2).

For the three studies described in chapters 3, 4 and 5, we used a large database of the Julius General Practitioners’ Network (Julius Huisartsen Netwerk), for which anonymized data are periodically extracted from the medical records of 120 GPs in 50 GP practices in the Utrecht region (290,000 patients). From the two studies we performed to measure the quality of diagnosis registry in the Primary Care EHR (chapters 3 and 4), we learned that the quality of coded data, as demonstrated for patients with cancer or suspected cancer, is suboptimal. GPs know their cancer patients, but this does not mean that re-users of data can easily find these cancer patients using coded, anonymized Primary Care data. In both studies we compared cancer cases found in the Primary Care EHR with the Netherlands Cancer Registry (NCR; in Dutch: Nederlandse Kanker Registratie, NKR) (www.cijfersoverkanker.nl), a reference standard that is regarded as highly reliable. When re-users of data try to find cancer cases using coded data at the population level (chapter 3), and also at the individual level (chapter 4), a large proportion of cancer cases will be missed (up to 40% false-negatives#) and a large proportion of the cancer cases found will be misclassified (up to 50% false-positives#).

We conclude from these studies that the quality of coded data in the Primary Care EHR improves over the years but remains suboptimal even in recent years, and that the type of EHR system influences quality. More specifically, we found that in recent years diagnosis registry has become more complete, with the drawback that the number of false-positives is rising. In our linkage study described in chapter 4, for which we linked patients one-to-one to the NCR, we discovered that for 77% of the missing (i.e. false-negative) cancer cases, information about the cancer is in fact available elsewhere in the patient’s medical record, usually as uncoded plain text. We also found that for 38% of the seemingly incorrect (false-positive) cancer cases the GP had in fact recorded the cancer diagnosis correctly, and that for 31% (of the 38%) the diagnosis is not, or not yet, available in the Netherlands Cancer Registry.

To gain experience, in the study described in chapter 5 we ourselves reused coded data as well as free text from the Primary Care EHR for a study on the management of women who visit the GP with breast cancer-related concerns. For the study period we selected from the EHR all women who presented with physical complaints and symptoms of the breast (for instance breast pain or a lump in the breast), as well as all women who visited the GP with fear of breast cancer or with breast cancer in the family. We discovered that breast cancer-related concerns are frequently presented to the GP (incidence 25.9 per 1,000 women per year); the largest share consists of women with physical complaints and symptoms (85.3%, or 23.2 per 1,000 per year). Approximately half of the women are referred for (mostly imaging) investigation, regardless of whether they have breast complaints or not. Apparently the GP’s working hypothesis does not weigh most heavily in the decision-making process. Referrals for annual screening and genetic counselling appear to be suboptimal, and relevant information regarding the family history of cancer is often missing from the EHR.

The identification and management of women at increased risk of breast cancer can be improved, as can the identification and reassurance of women without increased risk or relevant symptoms.

In the last-mentioned study we presented incidence figures based on routine care data from Primary Care EHRs, taking into account the limitations of these data (see also the recommendations) but without applying corrections to the results, because we had insufficient information about data quality. In all three studies described above we also experienced that Primary Care data can be incomplete, incorrectly coded and not timely. We also found examples of insufficient concordance and plausibility, but we did not assess these dimensions of data quality systematically.

Part 2 – Strategies and solutions to improve the quality of data in the Primary Care EHR and to enable reuse and sharing of these data

Improving diagnosis coding systems and developing (digital) tools to ‘map’ codes between systems can help to improve the quality of data in the medical record, not only in general practice but in healthcare in general. We studied the quality of coding systems for the medical record in the field of rare diseases, and more specifically metabolic diseases. The total group of rare diseases is large (> 6,000 diseases) and continues to grow through the unravelling of new diseases or variants of known disorders, but also through improved awareness among clinicians. We know from experience that the annotation of rare diseases by adequate coding systems, and therefore the possibility of registering patients accurately in medical records, is insufficient; this has recently been confirmed by other researchers. Our study, presented in chapter 6, shows that there are large gaps in coding systems that are widely used worldwide, such as ICD-10 (International Classification of Diseases; 76% missing codes) and SNOMED-CT (Systematized Nomenclature of Medicine Clinical Terms; 54% missing codes), for metabolic diseases. We expect similar gaps for other groups of rare diseases. This problem has recently also been recognised by two organisations that are intensively involved in coding, SNOMED and Orphanet, and they have taken the initiative to jointly improve the coding of rare diseases. Gaps in coding systems are barriers to data sharing, particularly for rare diseases, where the diagnosis code is often used as a key to communication. We have shown that, with the help of experienced clinicians and coding organisations, it is possible to close gaps in coding systems, and that developing rich and up-to-date coding systems can really contribute to the quality of (rare) diagnosis registration and thereby indirectly to the care for patients with a (rare) disorder. Although this study was carried out in a hospital setting, patients with a rare disease should also be recognisable in general practice and therefore in the general practice record. The study in this chapter also gave insight into the extensive process of developing a high-quality, usable and up-to-date coding system.
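To make the notion of "missing codes" concrete, the following is a minimal sketch of how the coverage of a disease list by a coding system could be computed. The disease names, the placeholder code `X00.0` and the function name `missing_code_percentage` are assumptions made purely for illustration; they are not data or methods taken from chapter 6.

```python
from typing import Dict, List, Optional


def missing_code_percentage(
    diseases: List[str], code_lookup: Dict[str, Optional[str]]
) -> float:
    """Return the percentage of diseases that have no code in the lookup."""
    if not diseases:
        raise ValueError("disease list is empty")
    # A disease counts as missing when it is absent from the lookup
    # or when its entry is empty (None or an empty string).
    missing = [d for d in diseases if not code_lookup.get(d)]
    return 100.0 * len(missing) / len(diseases)


# Hypothetical example: four diseases, only one of which has a code.
diseases = ["disease A", "disease B", "disease C", "disease D"]
code_lookup = {"disease A": "X00.0", "disease B": None}

print(missing_code_percentage(diseases, code_lookup))
```

Running the same function against several coding systems for one reference list of diseases would give directly comparable gap percentages.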

Another barrier to data sharing is the need to standardise the semantics of data fields such as diagnoses, but also other phenotypic codes. Ideally, coding systems are aligned first, that is, before data entry, but often retrospective standardisation will be needed. In chapter 7 we describe the development of SORTA, a software tool that facilitates data (re)coding and ‘mapping’ between coding systems. We took part in this study by using SORTA in practice in a pilot project in which an existing Dutch coding system for symptom (or phenotype) coding was ‘mapped’ to an international coding system for physical symptoms (HPO = Human Phenotype Ontology). We showed that existing coding systems can be harmonised fairly quickly and with sufficient improvement in quality, compared with earlier manual procedures.
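As an illustration of what tool-assisted ‘mapping’ between coding systems can look like, here is a minimal sketch that ranks candidate target terms by plain string similarity. It is not SORTA's actual algorithm, which this summary does not describe. The codes `T:0001` to `T:0003` are placeholders rather than real HPO identifiers, and the function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between two labels, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest_matches(
    local_term: str, target_terms: Dict[str, str], top_n: int = 3
) -> List[Tuple[str, str, float]]:
    """Return the top_n (code, label, score) candidates for a local term."""
    scored = [
        (code, label, similarity(local_term, label))
        for code, label in target_terms.items()
    ]
    # Highest similarity first; a human curator makes the final decision.
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_n]


# Hypothetical target coding system with placeholder codes.
target = {
    "T:0001": "Hearing impairment",
    "T:0002": "Short stature",
    "T:0003": "Seizure",
}

for code, label, score in suggest_matches("hearing impairment", target):
    print(code, label, round(score, 2))
```

The design point is that the tool only proposes ranked candidates; accepting or rejecting a match stays with the clinician or coder.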

Coding diseases and symptoms is crucial for a good medical record, but not all relevant medical information can be captured well in this way; family history is an example. The design of a general practice record should nevertheless allow all relevant medical information to be registered, using coding systems wherever possible. Furthermore, the quality of the user interface turns out to be an important factor contributing to the data quality and performance of a general practice information system. We studied these aspects in the context of providing genetic advice, which, despite the available genetic knowledge, turns out to be inadequate in general practice, but also in various other medical specialties. We confirmed this problem in chapter 5, where we found suboptimal referral routines for annual screening and genetic counselling, but also missing information in the medical record on the family history of women with breast cancer related complaints. In chapter 8 we identify obstacles to implementing available genetic knowledge in general practice, such as shortcomings in the design and interface of HIS systems for storing genetically relevant information. We introduce a phased ‘roadmap’, including adaptations to the HIS and existing coding systems, to improve the integration of genetics into general practice and clinical research. This roadmap can serve as an example for introducing other complex additions and changes to the HIS or to coding systems, always starting from a need in daily medical practice.

Lessons learned

When we summarise and interpret the results of the studies performed, we can draw the following lessons:

1. The quality of data in general practice is suboptimal, even for a key item such as the coded diagnosis and for a serious disease such as cancer; relevant information on risk factors such as family history is often missing, insufficiently coded or not easy to find;

2. Despite suboptimal data quality, and consequently reusability with clear restrictions, the general practice medical record is a rich and voluminous source of (mainly uncoded) medical data, often covering many years of follow-up;

3. Because the quality of the data is suboptimal, data from the general practice medical record should only be reused by people who understand the context of the GP, and therefore the context in which these routine care data are entered, and who explicitly take the limitations of these data into account;

4. It is necessary to improve the quality of data in the HIS, because reuse and sharing of these data is desirable in itself and will also increase; ideally this is done at the source (at data entry) and supported by adequate coding systems. GPs can and should be facilitated in this in a number of ways (see recommendations);

5. Adequate and up-to-date coding systems are crucial for data reuse and sharing, not only for common but also for rare diseases, and can be developed successfully in a collaboration between coding organisations and clinicians, facilitated by software tools;

6. These coding systems are valuable when they are maintained continuously and when they are provided with adequate synonyms, relevant cross-links to other systems and clear instructions for use;

7. Mandatory coding of fields in the general practice record leads to more complete registration, but also to over-registration and errors;

8. One-to-one linkage of general practice records to other data sources can be valuable for validating diagnoses, but is currently still a complex and time-consuming process;

9. The general practice medical record can be complementary to other data sources, even to a reliable reference standard such as the Netherlands Cancer Registry (Nederlandse Kanker Registratie, NKR);

10. Many stakeholders are involved when it comes to the reuse and sharing of medical information from GP records: patients, GPs, the Dutch College of General Practitioners (Nederlands Huisartsen Genootschap, NHG), vendors of GP information systems (Huisartsen Informatie Systeem, HIS), health insurers and the healthcare inspectorate (quality assessors), hospitals, out-of-hours GP services, researchers, academic research networks, GP training programmes, coding organisations and owners of external data sources such as the NKR.

Recommendations

The lessons learned can be translated into recommendations for the various stakeholders (summary; see the English text for details):

Patients

• Patients should be made aware of the current anonymous and non-anonymous reuse of their general practice medical record.

• Patients should be encouraged to take responsibility by checking whether all their important diagnoses and information on allergies and intolerances are known to their GP and are also registered there.

GPs

• GPs should improve the quality of their data by investing in updating and coding key items in their records and by optimising the work processes around registration.

• We recommend that GPs who have improved their data quality actively participate in relevant projects to share and reuse their data.

Dutch College of General Practitioners (Nederlands Huisartsen Genootschap, NHG)

• On the one hand, the NHG should pursue and facilitate improvement of the quality of data in the general practice record; on the other hand, it should counter unlimited reuse and sharing of these data.

• The NHG should actively investigate digital techniques that can facilitate high data quality at entry, such as speech recognition, natural language processing and text-mining tools.

• Implementation of newer versions of the HIS reference model by vendors should be encouraged, and several adaptations should be made to the reference model and the ADEPD guideline, because these can help to increase data quality.

HIS vendors

• Investigate how the user interface and system design can be adapted so that high quality of (coded) data at entry is encouraged.

• Integrate guidelines into the HIS so that digital suggestions can be made to the GP, and facilitate feedback to GPs through easy-to-build search queries and selection queries.

Quality assessors

• Stop measuring the quality of registration, and find ways, in dialogue with GPs, to measure the quality of care adequately and without disrupting the care process.

Educators

• Add a course on “adequate record keeping” to the basic curriculum and teach GPs and GP trainees the necessary skills.

Researchers and academic research networks

• When working with routine care data, validate the diagnosis.

• Research networks should actively encourage GPs to improve the quality of data in the general practice record and could support this in concrete terms by feeding back (benchmark) information to participating GPs.

Suggestions for further research

In this thesis we believe we have succeeded in assessing the data quality of certain items in the general practice record, and we have also been able to devise strategies and solutions with which data quality can be improved, so that the reuse and sharing of data can be made possible. We realise that these are only pieces of a large puzzle that will have to be completed in the coming years.

We are convinced that we were able to achieve these results and formulate recommendations because we looked across the boundaries of academic disciplines: general practice, clinical genetics, medical informatics and bioinformatics. It is precisely the collaboration with scientists from other disciplines that brings insight and solutions to research questions.

There are a number of (interdisciplinary) research challenges in this area for the future. First, in addition to completeness and correctness, other dimensions of data quality should be studied: concordance, plausibility and timeliness, not only for diagnosis registration but also for other key items such as risk factors, past treatments, and allergies and intolerances. Second, interaction designers should be involved in these studies: which aspects of the user interfaces of the various HISs lead to the differences in data quality that we observed, and how could user interfaces be adapted so that data quality improves?

Third, the design, implementation and evaluation (by means of studies) of the various interventions mentioned in this chapter, carried out in general practice itself, could contribute to the development of effective interventions to improve data quality.

Fourth, the current possibilities of natural language processing and of offering digital coding suggestions during data entry should be investigated further.

Finally, it is necessary to gain experience with patient-entered data in order to determine the usability of these data for various purposes, in the first place, of course, care itself. This could be done, for example, by developing an app or software tool with which patients can enter relevant medical family information.

About our patient from the introduction

January 2020: a 48-year-old man visits his GP with a persistent, phlegm-producing cough. A few days ago he made an appointment online and entered the reason for the consultation and his complaints. Just before the appointment the GP reads this information and immediately looks at the patient's personal health information, which also collects data from the various apps the patient uses. She notices that the patient has lost some weight, but also that the training frequency and duration of this keen runner have decreased considerably over the past 4 weeks. During the consultation the GP asks a number of additional questions and performs a physical examination, which reveals no abnormalities. She summarises her findings aloud for the patient and for the record, using her speech recognition software. While dictating, she selects relevant codes for symptoms, complaints and the differential diagnosis, most of them suggested by the system. A pop-up appears on her screen, based on the NHG guideline “Acuut Hoesten” (acute cough) integrated in the system, asking whether a request for an X-ray (chest X-ray) should be sent to the nearest hospital with the shortest waiting time and reimbursement by the patient's insurer. She clicks “agree” and makes an appointment with the patient for a consultation the following week to discuss the results.

This thesis will hopefully contribute to the improvement of digital data in medical records in general, and thereby to the true gold mine that these data can be, with the ultimate aim of improving care for patients with common and rare diseases.

Notes

# False negatives are cancer cases that are recorded in the NKR but not in the general practice record.

False positives are cases that are registered in the general practice record but not in the NKR.
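The two definitions in these notes amount to set differences between the cases found in each source. The sketch below illustrates this under the assumption that both sources have already been linked on a shared patient identifier; the identifiers and the function name `compare_registrations` are hypothetical.

```python
from typing import Dict, Set


def compare_registrations(
    gp_cases: Set[str], registry_cases: Set[str]
) -> Dict[str, Set[str]]:
    """Compare GP-recorded cancer cases with the registry as reference."""
    return {
        # Present in both sources.
        "matched": gp_cases & registry_cases,
        # In the registry but not in the GP record.
        "false_negatives": registry_cases - gp_cases,
        # In the GP record but not in the registry.
        "false_positives": gp_cases - registry_cases,
    }


# Hypothetical linked identifiers.
gp_cases = {"p1", "p2", "p4"}
registry_cases = {"p1", "p2", "p3"}

result = compare_registrations(gp_cases, registry_cases)
print(sorted(result["false_negatives"]))
print(sorted(result["false_positives"]))
```

In practice the linkage step is the hard part, as the lessons learned point out; the comparison itself is simple once identifiers agree.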

Acknowledgements

In the end I am typing these acknowledgements on the plane from Marseille to Amsterdam, on my way to a few days in the Netherlands, where I am by now homeless because our house has been rented out! Four weeks ago Adri and I moved with our two youngest children to the south of France to live and work in Provence for a number of years: a (very pleasant) new environment in which to gain new experiences and discover a different way of living.

Over the past years I have regularly thought of the moment when I would finally be able to write the acknowledgements of my thesis: the sign that the biggest project I have ever taken on would be finished. At what moments do you think of that as a PhD candidate? Well, to name a few examples: the moments when you have been slaving away alone behind your laptop for hours and hours with SPSS or Excel and it feels as if you are drowning in the numbers… The times when an article you did your utmost for and are very proud of lands back in your inbox after two days because it has been rejected! The phase in which it seems the project will never come to an end, your husband and children get irritated by your absent-minded behaviour and you ask yourself why you ever started this. But then another e-mail arrives from your promotor, invariably ending with “hang in there!”. And then your co-promotor sits down and gets to work, gives advice, arranges students (thank you Jessika Roskam and Rosanne Ader!) who come and help with the toughest work, and one article after another does get accepted. In short, a project with peaks and valleys, just like our new surroundings in Provence, and just like life itself.

Thank you Mattijs Numans, my promotor, for the opportunity you offered me in 2011 to start this project and for the freedom to shape it myself. Thank you for your inspiration and for your “hang in there” e-mails at just the right moments, sent from Starbucks branches all over the world but preferably from the US. Thank you Rolf Sijmons, also my promotor, for your moral support, your humour, enthusiasm and insights. I greatly appreciate that you simply sat down next to me at the computer and rewrote passages of text that did not flow, together with me. What promotor does that? Charles Helsper, merci for your commitment and your practical help at difficult moments. Thank you for investing time and for calling, even after working hours, just before a deadline, with two little boys hanging on your trouser legs!

Members of the reading committee, professors Damoiseaux, Van der Horst, Cornel and Brinkkemper and doctor Cornet, thank you for studying my thesis and above all for approving it. Dear Sjaak, how special that we meet again here: more than 20 years ago I graduated under your supervision in Computer Science at the University of Twente.

Paranymphs and friends Ilya and Floor, one of you I have known for a long time, the other only for a few years, but you are both very dear to me. You know from your own experience what a PhD trajectory involves; you listened to my grumbling, gave me tips and motivated me when needed. I already feel a little less stressed at the thought of you standing beside me on 27 January 2017!

My friends, of course, all different, from the different activities or roles one has in life. Gerreke, we share experiences, not all of them pleasant, but because of them we understand each other with a single word. Merci for your support and motivation in all kinds of situations. Anja, oh how I miss our Monday evening of exercise, swimming and catching up. Thank you for your cheerfulness and boundless optimism, which lift me up! Christi, sister-in-law and (how special) also a friend, thank you for your interest and sympathy. Margriet, my almost-colleague in France (what a pity), just come this way with Bram and Ellen in a few years and we will set up research among athletes who cycle up the Ventoux! Madelon and Isa, conference and travel companions, thank you for thinking along and for the discussions, but above all for your good company on our trips. You were the colleagues I did not really have, because I wandered between three universities. Emmy and Hugo, even though we see each other less than when we lived around the corner, our warm friendship remains! Richard, an entrepreneur at last, thank you for your friendship, optimism and sympathy, both privately and in business, ever since our student days in Twente. I assume we will simply see each other regularly in France, whether in Morzine or in Mormoiron. Steven and Marion, let us above all keep skiing together, with Floor and our family! I appreciate your sympathy and involvement. Friends whom I have not mentioned here: thank you for your friendship and support.

My colleagues on the editorial board of Huisarts & Wetenschap (H&W): Just Eekhof, Hans Hawkeye van der Wouden, Bèr Pleumeekers, Lidewij Broekhuizen, Sjoerd Hobma, Wim Verstappen, Marianne Dees, Henny Helsloot, Susan Umans, Marissa Scherptong-Engbers, Nadine Rasenberg and Ivo Smeele; I definitely want to mention you here, because everything I learn on the editorial board about science and research has made things so much easier for me. Thank you for the insights, the critical notes and above all a lot of humour.

I also do not want to leave my GP trainers unmentioned, because they were sometimes inconvenienced by my PhD work alongside the training but never made a fuss about it. Bert ter Horst, Mieke van Dillen and Wietze Eizenga, thank you for your flexibility, your motivation and the example you set for me as GPs. Wietze in particular, you understand how research works; thank you for our conversations and for the confidence you expressed: I think you gave me just the push I needed to go ahead with the France plans. I certainly expect you here with Brigitte to cycle up the Ventoux once more on your retro racing bike, but in somewhat cooler weather (or simply a little earlier in the day).

Gepke Visser, metabolic paediatrician, what a slog it was, those metabolic disease codes. It became my first accepted article, chapter 6; thank you for working together, sometimes at the WKZ, sometimes at the coffee table at your home! Dick Lindhout, clinical geneticist and emeritus professor, my love of genetics began with you; thank you for the opportunities you offered me, years ago now! Morris Swertz and Chao Pang, thank you for the collaboration that led to chapter 7.

And here I also want to mention my French colleagues, who have welcomed me so warmly and are guiding me so patiently. I feel more than welcome. Sébastien Adnot, Lies de Vos, Francis and Els van der Velden and Philippe Morvan of the Maison de Santé Bel Air in Carpentras: thank you for your welcome in Carpentras and for your guidance over the past weeks. I am sure we will work together well!

Dad and Mum, thank you for the safe, loving and solid foundation you gave me, Henk and Gerard in life. I know that I sometimes turn right where you would have chosen left, but know that I do not forget that foundation and that I value it. Henk and Gerard, my brothers, we do not see each other very often, but when we do it always feels familiar and warm. Nel, mother-in-law, you feel just as close as a “regular” mother; thank you for always being there for me and my family.

Adri, we are not ones for many fine words and I am not going to change that here, although of course I do have something to say. Deeds, not words, is what characterises you: thank you for all your efforts and for lovingly taking care of things at home when I was busy, and thank you for always leaving me free in my choices and encouraging me to do the things that interest me. I am still fascinated by all the things you can think up and then actually make (and I cannot). How lucky I am that you want to share your life with me.

Kristel, it is really nice to see how you have blossomed into an (almost) midwife, a beautiful, cheerful and spirited woman. How lovely it is that you sometimes call me late in the evening to tell me about an impressive delivery, and that after a shift I can tell you what I have been through this time. You are now even going to do research on Curaçao and you work for a research project at the AMC (yes, research!). Thank you for your down-to-earth view of life, and of this PhD trajectory, and for your infectious laugh.

Nathalie, future fellow doctor (GP?) and citizen of the world. I am lucky that I can write here too: how lovely when your daughter calls to tell you about a lecture with an impressive patient! With you I can always have a discussion; you read the texts I write and correct them. Merci for your open mind, your broad interests, your thinking along and your sharp insights. I am curious where your adventures will lead you in the coming years, but I have every confidence in them!

Mike, tough, smart and sporty “dude”, in some things we are rather alike. Just as you can lose yourself in a book or in Duckstad, so can I. Just as you can do “silly and absent-minded” things, so can I (the little barrier?). I hope you also inherited some good genes from me, but otherwise there are plenty from Dad, I am sure of that. If you do not start a company with Jesse later on (a paintball hall?), perhaps being a researcher is something for you?

Marilyn, you are cheerful, expressive and very sporty, but above all a real sweetheart. I can still remember well that I received (yet another) rejection from a scientific journal and that a little later there was a note on my laptop: “dear Annet, I DO think your article is good!”. You often did not like it when I had to leave again or crawl behind the laptop again. Even though you are only 11 (almost 12), you understood. Thank you, and above all do not stop telling me clearly when I am too busy with other things. I am very curious how you will develop further, who knows, perhaps into a vet.

Annet, November 2016

Curriculum Vitae

Annet Sollie was born on Friday the 13th of March 1970 in Zwolle, but how lucky can

a person be. She is happily married to her soulmate Adri Wisse and together they are

blessed with four wonderful children: Kristel, Nathalie, Mike and Marilyn. Besides this

she got to be a doctor after all. She enjoys using her past education and experience to

“make ICT solutions actually work for the doctor”.

Currently she is working in Carpentras at the Maison de Santé Bel Air in the south

of France as a general practitioner (médecin généraliste). She also works as an editor

for Huisarts & Wetenschap (H&W) and occasionally as a consultant on ICT & Healthcare

projects for her own company Soll-Vite.

Professional skills and interests

Rare diseases in primary care

Phenotype coding & coding systems

Medical Genetics in primary care

Electronic Health Record systems

Working experience

For more information on finished courses, publications, projects and working experience in the ICT sector please visit:

Linkedin: nl.linkedin.com/in/annetsollie

Researchgate: researchgate.net/profile/Annet_Sollie

Twitter: twitter.com/annetsollie

About: about.me/asoll

Education

1982 - 1988 Secondary education (VWO), prof. dr. Greijdanusschool in Zwolle

1988 - 1994 Computer Science, Technical University of Twente, Enschede, specialisation in Business Administration

1999 - 2000 Propaedeutic Psychology, Open University

2001 - 2009 Medical School, University of Utrecht

2012 - 2016 Residency General Practice