The Clinical and Translational Science Awards (CTSA) is a registered trademark of DHHS.

If it is in the EHR it must be true

Using EHR data for research

Keith Marsolo, PhD

Jareen Meinzen-Derr, PhD

Bin Huang, PhD

Outline of Discussion

• Jareen Meinzen-Derr – Epidemiologist

– Introduction to using EHR in research, advantages and methodologic limitations/challenges

• Keith Marsolo – Informaticist

– Overview of data abstraction and challenges, introduction to large network EHR-based registry (PCORnet)

• Bin Huang – Biostatistician

– More in-depth look at the challenges and implications from the analysis perspective along with potential solutions and considerations

Large-scale electronic health record-based research is more challenging than traditional retrospective studies

A Primer

• My target population of interest is all children with autism spectrum disorder (ASD) seen at Cincinnati Children’s

– How do I define ASD?

• ICD9? ICD10?

• Age? ASD diagnostic assessments?

– Where is my population?

• Specific divisions?

• Any clinic? Inpatient vs. outpatient?

• With or without follow-up?

A Primer

MRN   ICD-9   Clinic    Date
0001  299.0   Dev Peds  01/01/2015
0001  299.0   Dev Peds  10/01/2015
0002  299.0   Optho     02/01/2013
0003  315.31  Dev Peds  03/01/2012
0003  299.8   Dev Peds  03/01/2013
0004  348.39  Psych     01/01/2009
0004  348.39  Dev Peds  01/01/2010

Annotations:

• Only record in chart

• 315.31 (expressive language disorder): Do you include previous visit?

• 348.39 (static encephalopathy): Notes state ASD, assessments indicate ASD

• If I include this code in search, I will receive thousands of records for patients who do not have ASD
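To make the cohort-definition problem concrete, the sketch below applies two candidate rules to the example rows. The rule itself (any ICD-9 code starting with "299.", optionally requiring at least two coded visits) and the name `select_cohort` are assumptions made for illustration, not a validated ASD phenotype.

```python
from collections import defaultdict

# Rows transcribed from the example table: (MRN, ICD-9 code, clinic, visit date).
VISITS = [
    ("0001", "299.0", "Dev Peds", "01/01/2015"),
    ("0001", "299.0", "Dev Peds", "10/01/2015"),
    ("0002", "299.0", "Optho", "02/01/2013"),
    ("0003", "315.31", "Dev Peds", "03/01/2012"),
    ("0003", "299.8", "Dev Peds", "03/01/2013"),
    ("0004", "348.39", "Psych", "01/01/2009"),
    ("0004", "348.39", "Dev Peds", "01/01/2010"),
]


def select_cohort(visits, code_prefixes=("299.",), min_visits=1):
    """Return MRNs with at least `min_visits` visits whose code matches a prefix.

    The prefix list is a hypothetical ASD definition used only for this sketch.
    """
    counts = defaultdict(int)
    for mrn, code, _clinic, _date in visits:
        if code.startswith(code_prefixes):
            counts[mrn] += 1
    return sorted(mrn for mrn, n in counts.items() if n >= min_visits)


any_code = select_cohort(VISITS)                 # one qualifying code is enough
two_codes = select_cohort(VISITS, min_visits=2)  # require a repeated code
print(any_code, two_codes)
```

Under these assumptions the looser rule admits a patient whose only record comes from a single visit, the stricter rule drops two of the three patients, and neither rule finds the patient coded only with 348.39. Each choice changes who is in the study.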

Electronic Health Records

• A longitudinal collection of electronic health information for and about persons

• Immediate electronic access to person- and population-level information by authorized, and only authorized, users

• Provision of knowledge and decision support that enhance the quality, safety, and efficacy of patient care

• Support of efficient processes for healthcare delivery

IOM

EHR use in research

• Surge in the use of EHRs (12.2% in 2009 to 75.5% in 2014)

– EHR-based outcomes research studies have increased >6-fold

• Accommodate collection of structured, coded, electronically available data

– Can be used to build longitudinal histories

• Allow access to health records from multiple locations

– Electronic transmission of records

• More efficient/less expensive alternative to clinical trials

• Can be used to populate databases for both clinical and research purposes

Great Opportunities

• Quality improvement purposes

– Facilitate data sharing, decision-making, efficient administrative operations

• Recruiting for prospective studies/clinical trials

• Public health initiatives

– Facilitate surveillance of infectious diseases, disease outbreaks, chronic illnesses

• Replicating results of randomized controlled trials

• Conduct “Big Data” research

– Rich data to study disease progress, health disparities, clinical outcomes, treatment effectiveness, efficacy of public health interventions

How do I Begin?

JUST LIKE YOU WOULD ANY OTHER OBSERVATIONAL RESEARCH STUDY THAT USES SECONDARY DATA

Shifts in primary responsibility

Manual abstraction: Study design protocol → Create data collection tool → Manual data abstraction & entry → Manually verify missing & erroneous data → Data management → Data analysis

Electronic abstraction: Study design protocol → Electronic data abstraction → Data management, verify data → Manual verification → Data analysis

Roles: Clinical researcher, Methodologist

Methodologist should be engaged throughout

How do I Begin?

• What is your research question?

– Is it descriptive or analytic?

– Does it have a clear testable hypothesis?

• What are the appropriate study designs?

• Is the information needed to answer question present, accessible, & reliable in the EHR?

• How will you extract and analyze the information?

– What additional data management and methods needs are there?

Before you begin

• Crucial to develop criteria for identifying patients who have condition to be studied

– Data may need to be searched from problem lists, billing codes, medication lists, physical exam results across any/all possible clinic sources

– Must identify how long a patient has had a problem

– Develop processes for solving issues such as identification of first diagnosis

• Study subjects are patients, not participants

– Part of an “open cohort”: can enter or leave at any time

Have an Awareness

• Known limitations of EHR data must be considered

– In the study design

– In the data collection/abstraction

– In the data analysis

– In the interpretation

• Consequences can include:

– Flawed conclusions

– Altered policy decision or clinical practice

EHRs are designed for clinical care, not research

• Not structured in a way that facilitates research

– Providers decide where to put information

– Information may be entered free-text (not structured or finite list)

– Providers use different terms for same info

– Information not always stored in a way that is readily searchable

– Data not important in clinical care may be missing

Awareness: Poor Data Quality

• Quality variable due to differences in measurement, recording, information systems, and clinical focus

• Serious threat to validity and generalizability of clinical research findings

• Context dependent

– Same elements deemed high quality for one use and poor quality for different use

– Presence of extreme values may be irrelevant when determining a median or a rough estimate of the number of eligible patients for a study

– Same extreme values may have significant undue influence on results of algorithms or analytic methods sensitive to extreme values
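A small numeric sketch of this context dependence follows. The weights are hypothetical values invented for illustration; 355.0 stands in for a data-entry error such as a misplaced decimal point.

```python
import statistics

# Hypothetical weights in kg; the last value mimics a data-entry error.
weights = [31.2, 33.8, 35.5, 36.1, 38.4, 355.0]
clean = weights[:-1]  # same data without the extreme value

median_w = statistics.median(weights)
mean_w = statistics.mean(weights)

print(f"median with outlier: {median_w:.1f}, without: {statistics.median(clean):.1f}")
print(f"mean with outlier:   {mean_w:.1f}, without: {statistics.mean(clean):.1f}")
```

The median barely moves when the extreme value is present, while the mean more than doubles. The same record is harmless for one use and damaging for another.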

Incomplete Data

• Due to fragmentation of healthcare systems

– Patients moving between systems for special referrals or emergency care

• Due to “poor”/inaccurate documentation (on the part of patients and healthcare providers)

– Lack essential information such as treatment outcomes

• Sick patients often have more data

– Non-random missing

• Complete information about patient vs. complete information about patient’s encounter

Examples in the literature

• 30-40% of patients have clinical visits across multiple institutions

• 55% of clinical research studies supplemented with non-EHR sources of data

– 40% supplemented with patient-reported data

• 49% of patients with ICD-9 pancreatic cancer did not have corresponding pathology documentation (incomplete or incorrect)

Bourgeois 2010; Finnell 2011; Thiru 2003; Dean 2009; Botsis 2010

“Sicker” Have More Data

Figure 5. Average number of days with data per patient by ASA class. For both medication orders and laboratory results, all ASA Classes are significantly different except for Classes 1 and 2.

Rusanov 2014

Sicker have more complete data

Figure 4. Complete records by ASA Class where complete records are those having at least seven values in each of the two categories (medication orders and laboratory results).

Data Quality

• Data entry errors

– Reported as high as 26.9% (Goldberg 2008)

– Medication discrepancies common

• Data coding, standardization, extraction

– Free text narrative

– Inconsistent terms, phrases, abbreviations

– Billing purposes

– Diagnostic codes may be recorded for detection or “rule out” purposes

Meredith L et al. 2008

Study Design Still Matters

• Errors can occur in selecting a cohort and characterizing that cohort

• Errors in a small number of cases can have a relatively large effect on outcomes

• Manual review of cases or a sample of cases is invaluable in improving the sample

• May be difficult to find “healthy” patients with sufficient data (comparison cohorts)

• Requires special methodologic approaches to selecting complete patient records from EHR databases while avoiding bias

Hripcsak 2011; Weiskopf 2013

Impact of Data Errors

Hripcsak 2011

Bias Challenges

Selection bias: Subset of individuals studied is not representative of the population of interest

– Selection is not random

• Can distort assessments of measures (e.g. disease prevalence or exposure risk)

• Estimates not as generalizable

– Ex: Including only patients with complete data

– Ex: Generalizing findings from a hospital-based study to all who may have a condition

Bias Challenges

Measurement bias: Errors in measurement and/or data collection

• Instrument calibration

• Data collection variability – depending on the field, clinician judgement plays a role

• Patient’s ability to complete assessment/provide history (recall)

• Use of certain codes/data to measure exposure

• Clinician decides how long to “follow” patient

– Impact calculation of prevalence, incidence, risk ratios

Confounding

• Distortion of the estimated effect of exposure on outcome caused by the presence of an extraneous factor associated with both exposure and outcome

– SES factors, lifestyle choices, age

• Without consideration, estimated effect of treatment may be actually caused by some other factor

Confounding Can Hurt

• EHR study of hospitalized patients >65 years: NSAID use associated with 32% mortality risk reduction

• However, after including additional specific confounders and analytical techniques, NSAID use was associated with a 6% mortality risk increase

– Addressed unmeasured confounding

Sturmer 2005
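How an apparent benefit can reverse once a confounder is accounted for can be sketched with a toy stratified analysis. The counts below are invented for illustration and are not from Sturmer 2005. The technique shown is a Mantel-Haenszel risk ratio over a measured confounder (illness severity), which is simpler than the methods for unmeasured confounding that the study addressed.

```python
# Hypothetical counts, invented for illustration only.
# Each stratum: (treated_deaths, treated_total, untreated_deaths, untreated_total)
strata = {
    "less sick": (45, 900, 4, 100),
    "sicker": (25, 100, 180, 900),
}


def crude_risk_ratio(strata):
    """Risk ratio ignoring the confounder (all strata pooled)."""
    a = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    c = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(strata):
    """Risk ratio adjusted for the stratifying variable (Mantel-Haenszel)."""
    num = den = 0.0
    for a, n1, c, n0 in strata.values():
        total = n1 + n0
        num += a * n0 / total
        den += c * n1 / total
    return num / den


crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude RR = {crude:.2f}, adjusted RR = {adjusted:.2f}")
```

In this toy data the less sick patients are far more likely to be treated, so the pooled comparison makes treatment look protective even though the risk is higher among the treated within each severity stratum.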

Confounding

• Confounding by indication

– Treatment choices influenced by severity or duration of patient’s disease

– Also influences outcome of treatment

– Sicker patients receive different treatments

– Sicker patients have different (worse) outcomes

• Cannot be adjusted for in conventional regression analyses

From EHR to Clinical Evidence

EHR recorded at the point of care → Data Extractions → Data Wrangling → Data Curation → Data Analyses → Causal Inference → Decision Theory → Evidence Based Decision

EHR data – entry to extract

Sources of variability – data entry

*partial list

Sources of variability – ETL

*partial list

Sources of variability – User request

*partial list

Sources of variability - analyst

*partial list

Sources of variability – self-service tools

*partial list

Why is this so complicated?

• Conceptual idea of clinical process does not translate to how data are captured in the EHR

• Many different ways to document same piece of information

– Workflow used to collect data often dictates where those elements are stored in reporting database

– Most researchers lack understanding of these workflows

• Quality of results then depends on how the question is asked and on the skill of the analyst

Example – encounters (CCHMC FY14)

• Annual Report

– Total patient encounters: ~1.2 million

– ED visits: ~100K

– Admissions (including short stay): ~31K

– Outpatient: ~1 million

• EHR

– Total patient encounters: ~3 million

– ED admissions to inpatient: ~145K

– Inpatient: ~28K

– Ambulatory: ~2.8 million

• Encounter != encounter

Just pull data from “ambulatory” encounters…

EEG

EXERCISE

CARDIOLOGY TESTING

PUMP/CGM INITIATION ORDERS

MED TAPER SCHEDULE

GENETIC COUNSELOR

NEONATOLOGY TESTING

CARE CONFERENCE - PATIENT/FAMILY PRESENT

HOME VISIT - PALLIATIVE CARE

ABUSE REPORTING

CARE COORDINATOR

SPECIAL NEEDS SUMMARY

EARLY INTERVENTION

HI NEURODEVELOPMENTAL CLINIC TRACKING

INFUSION ORDERS

ENT CLINIC VISITS

FEES/VOICE

HEPATOBLASTOMA LIVER TRANSPLANT FOLLOW UP

PRE-ADOPTION ENCOUNTER

EB PLANNING

FEES CLINIC

VPI - ENT/SPEECH

INTAKE

HVMC PLANNING

PRE-OP PHYSICAL

PLAN OF CARE

ENT INPATIENT VISIT

HOSPITAL TO HOSPITAL TRANSFER

DEVELOPMENTAL TESTING

BIOETHICS CONSULT

ENDO STIM TESTING

HIM INTERFACE CREATED

SURGICAL SITE INFECTION

DERM PATCH TESTING

INTAKE CONSULT

ADEC INTAKE

CPST-PSY ENCOUNTER

ECONSULT TELEMEDICINE

ROADMAP

HOSPITAL ENCOUNTER

UPDATE

PCP/CLINIC CHANGE

WAIT LIST

CLERICAL ORDERS

MOTHER BABY LINK

LACTATION ENCOUNTER

CANCELED

APPOINTMENT

SURGERY

ANESTHESIA

ANESTHESIA EVENT

UNMERGE

HEALTH MAINTENANCE LETTER

PATIENT EMAIL

E-VISIT

MOBILE ORDER ONLY

QUESTIONNAIRE SERIES SUBMISSION

PATIENT OUTREACH

CONTACT MOVED

NURSE TRIAGE

E-CONSULT

E-CONSULT COMMUNITY ORDER

TELEMEDICINE

EXTERNAL CONTACT

OPHTH EXAM

HOSPICE ADMISSION

HOME HEALTH ADMISSION

HOME CARE VISIT

HOME CARE UPDATE

PATIENT WEB UPDATE

COMMUNITY ORDERS

COMMITTEE REVIEW

POST MORTEM DOCUMENTATION

BILLING ENCOUNTER

HOSPITAL

CONFIDENTIAL

OPH TESTING

EDUCATOR

VOICE CLINIC

TELEPHONE

REGISTRATION

EMPTY

LAB REQUISITION

INITIAL CONSULT

ANTI-COAG VISIT

PROCEDURE VISIT

OFFICE VISIT

CONSENT FORM

SCREENING FORM

EXTERNAL HOSPITAL ADMISSION

LETTER (OUT)

REFILL

IMMUNIZATION

HISTORY

RESEARCH ENCOUNTER

REFERRAL

ORDERS ONLY

RX REFILL AUTHORIZE

MEDS ONLY (WEB)

MEDS VOID (WEB)

RESOLUTE PROFESSIONAL BILLING HOSPITAL PROF FEE

EPISODE CHANGES

ANCILLARY ORDERS

PHARMACY VISIT

BPA

ROUTINE PRENATAL

INITIAL PRENATAL

OPHTH OFFICE VISIT

ABSTRACT

WALK-IN

TREATMENT PLAN

ALLIED HEALTH

NURSE ONLY

SOCIAL WORK

NUTRITION

PHYSICAL THERAPY

OCCUPATIONAL THERAPY

SPEECH THERAPY

RESPIRATORY THERAPY

CASE MANAGEMENT

EDUCATION

SURGICAL H&P

CLINICAL SUPPORT

MEDS ONLY / E - PRESCRIBE

PFT ONLY

TRANSPLANT PRE-EVALUATION

TRANSPLANT EVALUATION

TRANSPLANT FOLLOW-UP

TRANSPLANT RESULTS ENTRY

IMMUNOTHERAPY

ALLERGY TESTING

SPECIMEN COLLECTION

AUTO RELEASE ORDERS

URODYNAMIC TESTING

PRE-NATAL

CONSULT CHECKLIST

BOWEL MANAGEMENT

CARE CONFERENCE

INTAKE/TRIAGE

VNS REPROGRAM/SHUTOFF

CLINICAL NOTE

GENETICS

PASTORAL

THERAPY VISIT

INTAKE - NEW PATIENT

HIM SCANS

PRE-VISIT PLANNING

TRANSCRIBED ORDERS

SCHOOL TEACHER/INTERVENTION

CHILD LIFE

THERAPY PROGRESS SUMMARY

BRONCHOSCOPY REQUEST

HEMONC SOCIAL WORK

AUD CONSULT

OPH CONSULT

ALG CONSULT

UROLOGY COMPLEX INTAKE
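As the list above suggests, an "encounter" count depends on which encounter types are included. The sketch below uses hypothetical rows whose type names are taken from the list. The choice of which types count as an in-person visit (`VISIT_TYPES`) is an assumption made for illustration, not a site definition.

```python
from collections import Counter

# Hypothetical encounter rows: (encounter_id, encounter_type).
encounters = [
    (1, "OFFICE VISIT"),
    (2, "TELEPHONE"),
    (3, "REFILL"),
    (4, "OFFICE VISIT"),
    (5, "ORDERS ONLY"),
    (6, "PATIENT EMAIL"),
    (7, "PROCEDURE VISIT"),
]

# Assumption for this sketch: only these types represent an in-person visit.
VISIT_TYPES = {"OFFICE VISIT", "PROCEDURE VISIT"}

by_type = Counter(enc_type for _, enc_type in encounters)
n_all = len(encounters)
n_visits = sum(n for enc_type, n in by_type.items() if enc_type in VISIT_TYPES)

print(f"all encounter rows: {n_all}, in-person visits: {n_visits}")
```

Two analysts answering "how many encounters?" from the same rows can report different totals unless the included types are written down.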

Give me all data for element X…

EHRs are constantly evolving

• New functionality is released & workflows change over time

– Clinician-entered

– Patient entry via welcome kiosk

– Patient entry via web-based questionnaire

• These workflows are typically additive, not substitutive

– Need to remember this history

– Will otherwise result in gaps in population

Example – Has a HEALTH RELATED QUALITY OF LIFE (QOL) ASSESSMENT been documented?

• Flowsheet RHE PEDS QL #129, Measure RHE PARENT #3757

• Flowsheet RHE PEDS QL #129, Measure RHE PATIENT #1799

• Flowsheet RHE PEDS QL #129, Measure GEN PATIENT #3758

• Flowsheet RHE PEDS QL #129, Measure GEN PARENT #3759

• Questionnaire RHE PEDSQL 13-18 TEEN REPORT #20702, Question: RHE PEDSQL 13-18 CHILD TOTAL SCORE #400411

• Questionnaire RHE PEDSQL 13-18 PARENT REPORT FOR TEENS #20703, Question: RHE PEDSQL 13-18 PARENT TOTAL SCORE #20544

• Questionnaire RHE PEDSQL 2-4 PARENT REPORT FOR TODDLERS #20699, Question: RHE PEDSQL 2-4 PARENT TOTAL SCORE #400415

• Questionnaire RHE PEDSQL 5-7 PARENT REPORT FOR YOUNG CHILDREN #20700, Question: RHE PEDSQL 5-7 PARENT TOTAL SCORE #400421

• Questionnaire RHE PEDSQL 5-7 YOUNG CHILD REPORT #20701, Question: RHE PEDSQL 5-7 CHILD TOTAL SCORE #400427

• Questionnaire RHE PEDSQL 8-12 PARENT REPORT FOR CHILDREN #20706, Question: RHE PEDSQL 8-12 PARENT TOTAL SCORE #400439

• Questionnaire RHE PEDSQL 8-12 CHILD REPORT #20705, Question: RHE PEDSQL 8-12 CHILD TOTAL SCORE #400433

• Questionnaire PEDSQL GENERIC 1-12MOS PARENT REPORT FOR INFANTS #20758, Question: PEDSQL 1-12MOS TOTAL SCORE #400280

• Questionnaire PEDSQL GENERIC 13-18 TEEN REPORT #20745, Question: PEDSQL 13-18C TOTAL SCORE #400163

• Questionnaire PEDSQL GENERIC 13-18 PARENT REPORT FOR TEENS #20686, Question: PEDSQL 13-18P TOTAL SCORE #400158

• Questionnaire PEDSQL GENERIC 13-24MOS PARENT REPORT FOR INFANTS #20759, Question: PEDSQL 13-24MOS TOTAL SCORE #100857

• Questionnaire PEDSQL GENERIC 18-25 YOUNG ADULT REPORT #20684, Question: PEDSQL 18-25C TOTAL SCORE #400183

• Questionnaire PEDSQL GENERIC 2-4 PARENT REPORT FOR TODDLERS #20688, Question: PEDSQL 2-4P TOTAL SCORE #400188

• Questionnaire PEDSQL GENERIC 5-7 PARENT REPORT FOR YOUNG CHILDREN #20689, Question: PEDSQL 5-7P TOTAL SCORE #400153

• Questionnaire PEDSQL GENERIC 5-7 YOUNG CHILD REPORT #20683, Question: PEDSQL 5-7C TOTAL SCORE #400178

• Questionnaire PEDSQL GENERIC 8-12 PARENT REPORT FOR CHILDREN #20687, Question: PEDSQL 8-12P TOTAL SCORE #400173

• Questionnaire PEDSQL GENERIC 8-12 CHILD REPORT #20685, Question: PEDSQL 8-12C TOTAL SCORE #400168
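Because workflows are additive, answering "has a QOL assessment been documented?" means checking every place the score may live. The sketch below uses a subset of the measure and question IDs listed above. The row layout `(patient_id, source, item_id)` and the function name are hypothetical.

```python
# IDs copied from the list above (a subset, for brevity).
FLOWSHEET_MEASURE_IDS = {3757, 1799, 3758, 3759}
QUESTIONNAIRE_QUESTION_IDS = {400411, 20544, 400415, 400421, 400427, 400439, 400433}

# Hypothetical documentation rows: (patient_id, source, item_id).
rows = [
    ("A", "flowsheet", 3757),
    ("B", "questionnaire", 400411),
    ("C", "questionnaire", 400433),
    ("D", "flowsheet", 9999),  # unrelated flowsheet measure
]


def patients_with_qol(rows, include_questionnaires=True):
    """Return patients with a QOL score documented in any included source."""
    found = set()
    for patient_id, source, item_id in rows:
        if source == "flowsheet" and item_id in FLOWSHEET_MEASURE_IDS:
            found.add(patient_id)
        elif (include_questionnaires and source == "questionnaire"
              and item_id in QUESTIONNAIRE_QUESTION_IDS):
            found.add(patient_id)
    return sorted(found)


flowsheet_only = patients_with_qol(rows, include_questionnaires=False)
all_sources = patients_with_qol(rows)
print(flowsheet_only, all_sources)
```

A query that knows only the older flowsheet workflow finds one patient here, while the full search finds three. This is the kind of population gap that forgetting workflow history produces.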

Are there any solutions?

• Engagement with operational reporting groups / data stewards

– Often serve as source of truth for a given area

– Deal with much higher request volume

– However – different priorities, funding models – can be difficult to keep activities aligned

• Quality checks / Data characterization

– Should help identify if there is a problem

– But not necessarily where to look for the solution

– Difficult to communicate/disseminate findings

Assessing Data Quality within PCORnet

Enabling Research at a National Scale

How do you ask a research question at hundreds of institutions and get back results you can trust?

Option 1: Write a description and have everyone create a local implementation to run on their data

Option 2: Create an algorithm that can run against a single, common data model

PCORnet Data Strategy

Standardize data into a common data model

Focus on data quality: data curation

Operate a secure distributed query infrastructure

Develop re-usable tools to query the data

Send questions to the data and only return required information

Learn by doing and repeat

Loading the Common Data Model (easy)

Same data are represented differently at different institutions (e.g., Race)

Common Data Model Value Set

01 = American Indian or Alaska Native

02 = Asian

03 = Black or African American

04 = Native Hawaiian or Other Pacific Islander

05 = White

06 = Multiple Race

07 = Refuse to Answer

NI = No Information

UT = Unknown

OT = Other

In order to be able to trust results of an analysis, we need to have consistent representations

SITE 1

Caucasian

African American

Asian

Multiple Race

Blank

SITE 2

101

201

300

401

500

600

SITE 3

African American

American Indian

Asian American

White

Other

Unknown
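Loading a site's values into the common value set amounts to a per-site lookup table. The sketch below covers Site 1 and Site 3 only; the slide gives no meaning for Site 2's numeric codes, so none is guessed here. Treating Site 1's "Blank" as an empty string mapped to NI, and returning `None` for unmapped values so they get reviewed, are assumptions made for illustration.

```python
# Target codes as listed in the Common Data Model value set above.
CDM_RACE = {
    "01": "American Indian or Alaska Native",
    "02": "Asian",
    "03": "Black or African American",
    "04": "Native Hawaiian or Other Pacific Islander",
    "05": "White",
    "06": "Multiple Race",
    "07": "Refuse to Answer",
    "NI": "No Information",
    "UT": "Unknown",
    "OT": "Other",
}

# Hypothetical per-site mappings; a real mapping needs local review.
SITE_MAPS = {
    "SITE 1": {"Caucasian": "05", "African American": "03", "Asian": "02",
               "Multiple Race": "06", "": "NI"},
    "SITE 3": {"African American": "03", "American Indian": "01",
               "Asian American": "02", "White": "05", "Other": "OT",
               "Unknown": "UT"},
}


def to_cdm_race(site, value):
    """Map a site-specific race value to a CDM code, or None if unmapped."""
    code = SITE_MAPS.get(site, {}).get(value.strip())
    assert code is None or code in CDM_RACE  # mappings must target the value set
    return code


print(to_cdm_race("SITE 1", "Caucasian"), to_cdm_race("SITE 3", "White"))
print(to_cdm_race("SITE 2", "101"))  # no mapping supplied, flagged as None
```

Returning `None` rather than a default such as OT keeps unmapped source values visible instead of silently folding them into a valid category.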

Loading the Common Data Model (less easy)

Same data are represented differently at different institutions (e.g., Type of Encounter)

In order to be able to trust results of an analysis, we need to have consistent representations

Common Data Model

Ambulatory Visit (AV)

Emergency Department (ED)

ED Admit to Inpatient (EI)

Inpatient Hospital (IP)

Non-Acute Inst. Stay (IS)

Other Ambulatory (OA)

Other (OT)

Unknown (UN)

No Information (NI)

SITE 1

Social Work Visit

Allied Health

Office Visit

Nurse Visit

Procedure Visit

Employee Health

Vascular Lab

Sleep Study Visit

Social Work Visit

SITE 2

Office Visit

Specimen

Postpartum Visit

Clinical Support

Initial Prenatal

SITE 3

Home Care Visit

Office Visit

Therapy Visit

Orders Only

Cardiology Testing

Hospital Encounter

Factors that increase complexity

People interpret the CDM specification differently, resulting in variability in how CDM is populated

Different health systems, with different EHRs, implemented at different times

Clinical workflows differ across institutions & impact availability of data

Understanding of EHR / claims data sources differs across institutions – may impact what gets loaded from source systems

All of these issues are present when doing research with EHR data, even within a single center

We have tools/processes to address this!

Data Curation assesses and improves global data quality

Characterize the contents of the PCORnet CDM

Evaluate global data quality and fitness-for-use across a broad research portfolio

For a given study, still need to consider data characterization specific to the aims

Assess data on the intended cohort

Ensure that outcomes / variables of interest are available & complete

Determine whether partners actually have enough data / patients to participate

Requires upfront investment, but can save significant time overall

Data Curation

Step 1: Network partner plans DataMart refresh

Step 2: Network partner responds to the data characterization query package

Step 3: Coordinating Center approves the DataMart

Step 4: Coordinating Center analyzes results and solicits more information as needed

Step 5: Coordinating Center holds Data Characterization and Implementation Forums and updates Implementation Guidance

Cycle 2 Required Data Checks

Category: Data Model Conformance

DC 1.01 Required tables are not present

DC 1.02 Expected tables are not populated

DC 1.03 Required fields are not present

DC 1.04 Fields do not conform to CDM specifications for data type, length, or name.

DC 1.05 Tables have primary key definition errors

DC 1.06 Fields contain values outside of CDM specifications

DC 1.07 Fields have non-permissible missing values

DC 1.08 Tables contain orphan PATIDs (PATIDs not in DEMOGRAPHIC)

DC 1.10 Replication errors between the ENCOUNTER, PROCEDURES and DIAGNOSIS tables

Category: Data Completeness

DC 3.04 Less than 50% of patients with encounters have DIAGNOSIS records

DC 3.05 Less than 50% of patients with encounters have PROCEDURES records
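Two of the required checks can be sketched in a few lines. This is an illustration of the logic behind DC 1.08 and DC 3.04 on hypothetical in-memory rows, not the query package that network partners actually run. The row layouts and function names are assumptions.

```python
# Hypothetical miniature tables.
demographic_patids = {"P1", "P2", "P3", "P4"}
encounter_rows = [("E1", "P1"), ("E2", "P2"), ("E3", "P3"), ("E4", "P9")]  # (ENCOUNTERID, PATID)
diagnosis_rows = [("E1", "P1", "299.0")]  # (ENCOUNTERID, PATID, DX)


def orphan_patids(table_patids, demographic_patids):
    """DC 1.08: PATIDs that appear in a table but not in DEMOGRAPHIC."""
    return sorted(set(table_patids) - set(demographic_patids))


def pct_patients_with_diagnosis(encounter_rows, diagnosis_rows):
    """DC 3.04: percent of patients with encounters who have DIAGNOSIS records."""
    enc_patients = {patid for _, patid in encounter_rows}
    dx_patients = {patid for _, patid, _ in diagnosis_rows}
    if not enc_patients:
        return 0.0
    return 100.0 * len(enc_patients & dx_patients) / len(enc_patients)


orphans = orphan_patids((patid for _, patid in encounter_rows), demographic_patids)
pct_dx = pct_patients_with_diagnosis(encounter_rows, diagnosis_rows)

print(f"DC 1.08 orphan PATIDs: {orphans}")
print(f"DC 3.04 patients with diagnoses: {pct_dx:.0f}% (flag if below 50%)")
```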

Cycle 2 Investigative Data Checks

Category: Data Model Conformance

DC 1.09 Tables have orphan ENCOUNTERIDs for more than 5% of records.

Category: Data Plausibility

DC 2.01 More than 5% of records have future dates.

DC 2.02 More than 10% of records fall into the lowest or highest categories of age, height, weight, diastolic blood pressure, systolic blood pressure, prescribed days supply, or dispensed days supply.

DC 2.03 More than 5% of records have illogical date relationships.

DC 2.04 The average number of encounters per visit is > 2.0 for inpatient (IP), emergency department (ED), or ED to inpatient (EI) encounters.

Category: Data Completeness

DC 3.01 The average number of diagnoses records with known diagnosis types per encounter is below threshold [1.0 for ambulatory (AV), inpatient (IP), emergency department (ED), or ED to inpatient (EI) encounters].

DC 3.02 The average number of procedure records with known procedure types per encounter is below threshold [0.75 for ambulatory (AV) encounters, 0.75 for emergency department (ED) encounters, 1.00 for ED to inpatient (EI) encounters, and 1.00 for inpatient (IP) encounters].

DC 3.03 More than 10% of records have missing or unknown values for the following fields: BIRTH_DATE, SEX, DISCHARGE_DISPOSITION (IP/EI encounters only), DISCHARGE_DATE (IP/EI encounters only), PX_DATE, LOINC, RX_NORM_CUI, RX_ORDER_DATE, RX_DAYS_SUPPLY, or DISPENSE_SUP

DC 3.06 More than 10% of inpatient (IP) or ED to inpatient (EI) encounters with a diagnosis don't have a principal diagnosis
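The plausibility checks are threshold tests on simple percentages. The sketch below illustrates DC 2.01 and DC 2.03 on hypothetical admit/discharge dates. Treating "discharge before admit" as the illogical relationship, and using a fixed `as_of` date, are assumptions made for illustration.

```python
from datetime import date

# Hypothetical (admit_date, discharge_date) pairs; None means not recorded.
stays = [
    (date(2015, 1, 1), date(2015, 1, 3)),
    (date(2015, 2, 1), date(2015, 1, 30)),  # discharge before admit
    (date(2030, 1, 1), date(2030, 1, 2)),   # future date
    (date(2014, 5, 5), None),
]


def pct_future(dates, as_of):
    """DC 2.01: percent of recorded dates that fall after `as_of`."""
    known = [d for d in dates if d is not None]
    if not known:
        return 0.0
    return 100.0 * sum(d > as_of for d in known) / len(known)


def pct_illogical(stays):
    """DC 2.03 (one assumed rule): percent of stays discharged before admission."""
    complete = [(a, d) for a, d in stays if a is not None and d is not None]
    if not complete:
        return 0.0
    return 100.0 * sum(d < a for a, d in complete) / len(complete)


as_of = date(2016, 1, 1)
future_pct = pct_future((a for a, _ in stays), as_of)
illogical_pct = pct_illogical(stays)

print(f"DC 2.01 future admit dates: {future_pct:.1f}% (flag if above 5%)")
print(f"DC 2.03 illogical date pairs: {illogical_pct:.1f}% (flag if above 5%)")
```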

Data partners are asked to investigate and comment on any exceptions in their Annotated Data Dictionary, and to classify these exceptions as follows: feature/limitation of source data; could be improved in the near future; may be improved in the future; or warrants further investigation.

Resources for Network Partners

Empirical Data Characterization Report Excerpt

Study-specific data quality

Antibiotics study overview

Study Aims: To evaluate the comparative effects of different types, timing, and amount of antibiotics prescribed during the first 2 years of life on:

Body mass index and risk of obesity at 5 and 10 years

Growth trajectories from infancy onwards

And how these effects differ according to:

Child sex, race/ethnicity, geography

Use of other medications

Maternal BMI, antibiotics during pregnancy, C-section (analysis at 7 sites)

Conducted study-specific data characterization to assess site eligibility:

Findings for prescriptions

RxNorm considerations

Study-specific data characterization findings

Lower number of children ≤ 2 with an antibiotics prescription

Start minus end date

– Low percent missing (~5%)

– Note: This is very different than global measures (highly missing)

– May be useful: 50th percentile = 10 days

– Huge range (5th percentile = 0 days; 95th percentile = 108 days)

Quantity

– Varying interpretations of quantity (pill, mg, ml, etc.)

– Large range (5th percentile = 11.00; 95th percentile = 225.50)

– Missing in 52% of ABX prescriptions

Refills: not consistently populated (60% missing)

Days supply: only populated in 4% of ABX prescribing records

Study-specific data characterization findings

Initial query only included RxNorm Dose Form and Clinical Drug or Pack

Specific codes that allow identification of all aspects of the prescription (>2000 codes)

Did not include less specific codes: RxNorm Ingredient, Precise Ingredient, or Drug Component

Learned that several network partners had not mapped to the specific codes

Had to ask network partners to map to the specific codes

Assess whether to include ingredient-level records in the analysis

What RXCUI term types are used?

Categorization of term types

Category 1 (Ingredient + Strength + Dose Form)

1. Semantic Clinical Drug (SCD)

2. Semantic Branded Drug (SBD)

3. Generic Pack (GPCK)

4. Brand Name Pack (BPCK)

Category 2 (Ingredient; Ingredient + Strength; Ingredient + Dose Form)

1. Semantic Clinical Drug Form (SCDF)

2. Semantic Branded Drug Form (SBDF)

3. Semantic Clinical Dose Form Group (SCDG)

4. Multiple Ingredients (MIN)

5. Precise Ingredient (PIN)

6. Ingredient (IN)

7. Semantic Branded Drug Component (SBDC)

8. Semantic Clinical Drug Component (SCDC)

Category 3 (Brand Name; Dose Form)

1. Brand Name (BN)

2. Semantic Branded Dose Form Group (SBDG)

3. Dose Form Group (DFG)

4. Dose Form (DF)
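The categorization above can be expressed as a lookup from RxNorm term type to category, which makes it straightforward to profile how specific a DataMart's prescribing codes are. The term-type assignments come from the slide. The sample list of term types and the function name are hypothetical.

```python
from collections import Counter

# Term type -> category, as grouped on the slide.
TTY_CATEGORY = {
    "SCD": 1, "SBD": 1, "GPCK": 1, "BPCK": 1,
    "SCDF": 2, "SBDF": 2, "SCDG": 2, "MIN": 2,
    "PIN": 2, "IN": 2, "SBDC": 2, "SCDC": 2,
    "BN": 3, "SBDG": 3, "DFG": 3, "DF": 3,
}


def category_distribution(term_types):
    """Count prescribing records per category; unknown term types map to None."""
    return Counter(TTY_CATEGORY.get(tty) for tty in term_types)


# Hypothetical term types attached to one DataMart's prescribing records.
sample = ["SCD", "SCD", "SBD", "IN", "IN", "SCDC", "BN", "XYZ"]
dist = category_distribution(sample)

for category in (1, 2, 3, None):
    print(category, dist[category])
```

A DataMart dominated by category 2 or 3 records would be missed by a query restricted to the most specific codes, which is the situation the study team encountered.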

RXCUI Term Types Distribution by Category and DataMart

(Table columns: Network ID, DataMart ID)

From EHR to Clinical Evidence

EHR recorded at the point of care → Data Extractions → Data Wrangling → Data Curation → Data Analyses → Causal Inference → Decision Theory → Evidence Based Decision