February 12, 2013: I. Sim, EHRs and Research; Epi 206: Medical Informatics
Ida Sim, MD, PhD
February 12, 2013
Division of General Internal Medicine, and Center for Clinical and Translational Informatics
UCSF
Electronic Health Records for Clinical Research
Copyright Ida Sim, 2013. All federal and state rights reserved for all original material presented in this course through any medium, including lecture or print.
Summary of Last Class
• Informatics crucial for making sense of complex data, and crucial for promise of translational research
• Key informatics challenges
  – naming data
  – exchanging data
  – reasoning to knowledge, capturing knowledge
• Challenges occur in parallel for clinical care and clinical research
Big Picture Take-Home Points
• Puts care and research together
• Separates data from the transactional systems used to collect that data
• Shows need to capture computable knowledge, not just data
• Clear place for decision support
• Emphasizes user-centered design as glue
Outline
• “Meaningful Use” of EHRs
• EHR Features Affecting Research
  – functionality and adoption
  – naming data
  – getting data out (of APEX)
• Beyond EHRs
• Summary
Promotion of EHR Adoption
• “Stimulus” Act (2009) directed $19 billion to health IT
• $17.2 billion Medicare/Medicaid payments for “meaningful use” of EHRs
  – $44K over 5 years for MDs/clinics/hospitals that achieve meaningful use by 2012-2014
  – $6 billion already paid out
• Medicare fees to be reduced for “non-EHR physician users” starting 2015
• UCSF spent $50 mil+ on UCare; over $100m expected total on Epic
8 Types of EHR Functionality
Meaningful Use
• Stage 1 (2011), basic functions, e.g.,
  – capture vital signs, demographics, active meds, allergies, up-to-date problem lists, smoking status
  – one clinical decision support rule and track compliance
  – computerized provider order entry (CPOE) (>30% of pts)
  – electronic prescribing (of >40% of prescriptions)
  – capability of exchanging key clinical information
  – report clinical quality measures to CMS or states
• Stage 2 (2013), reaction to Stage 1 over-reach?
  – very minor tweaks to above, plus more data to patients
• Stage 3 (2015)
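The two numeric Stage 1 thresholds listed above reduce to simple proportions. A minimal sketch, assuming made-up practice counts and a hypothetical helper named `meets_stage1`; only the >30% CPOE and >40% e-prescribing cutoffs come from the slide:

```python
# Cutoffs taken from the Stage 1 bullets; everything else is illustrative.
CPOE_MIN = 0.30  # share of patients with at least one CPOE order
ERX_MIN = 0.40   # share of prescriptions sent electronically


def meets_stage1(pts_with_cpoe, total_pts, erx_count, total_rx):
    """Return (cpoe_ok, erx_ok) for one practice (hypothetical helper)."""
    cpoe_rate = pts_with_cpoe / total_pts
    erx_rate = erx_count / total_rx
    # Both criteria are written as strict "greater than" thresholds.
    return cpoe_rate > CPOE_MIN, erx_rate > ERX_MIN


# Made-up practice: 350 of 1,000 patients had CPOE orders,
# 380 of 1,000 prescriptions were electronic.
print(meets_stage1(350, 1000, 380, 1000))
```

The real criteria define numerators and denominators more carefully than this; the sketch only shows the shape of the check.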
Certified EHRs
• EHRs need certification for meeting “meaningful use” http://onc-chpl.force.com/ehrcert
  – ambulatory practice
    • 2961 products (was 269 in 2011)
    • (Epic products from 2008, 2009, 2010 listed separately)
  – inpatient
    • 962 products (was 101 in 2011)
    • GE Centricity (aka UCare) certified, but we dropped them in 2011 due to problems with CPOE
• Epic (maker of APEX) is market dominant
  – 33-44% of U.S. population has at least one account in Epic
Rising Office-Based EHR Adoption
CDC 2012 http://www.cdc.gov/nchs/data/databriefs/db111.htm
Outcomes Research Project
• Retrospective cohort study of outpatients
• Compare 5-year rate of congestive heart failure for diabetics treated with a glitazone vs. not
  – find diabetics
  – find whether treated with a glitazone
  – for these patients, find all subsequent cases of congestive heart failure
  – analyze at 5 years
    • adjust for age, sex, severity of diabetes, previous CHF, other meds, etc.
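The study steps on this slide can be sketched on toy data. This is only an illustration under stated assumptions: the records, field names, and the helper `crude_chf_rate` are hypothetical, and the result is a crude, unadjusted proportion that ignores censoring and the confounders the slide says must be adjusted for.

```python
from datetime import date, timedelta

FIVE_YEARS = timedelta(days=5 * 365)  # rough 5-year window, ignoring leap days

# Hand-made records; a real study would derive these from EHR data.
patients = [
    {"id": 1, "glitazone": True,  "index": date(2005, 1, 1), "chf": date(2007, 6, 1)},
    {"id": 2, "glitazone": True,  "index": date(2005, 1, 1), "chf": None},
    {"id": 3, "glitazone": False, "index": date(2005, 1, 1), "chf": date(2011, 1, 1)},
    {"id": 4, "glitazone": False, "index": date(2005, 1, 1), "chf": None},
]


def crude_chf_rate(rows, treated):
    """Share of a treatment group with CHF within 5 years of the index date."""
    group = [r for r in rows if r["glitazone"] == treated]
    events = [
        r for r in group
        if r["chf"] is not None
        and r["index"] < r["chf"] <= r["index"] + FIVE_YEARS
    ]
    return len(events) / len(group)


print(crude_chf_rate(patients, True), crude_chf_rate(patients, False))
```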
Types of Data Needed
• Diabetes diagnosis
  – chart, HgbA1C, meds taken, problem list...
• Glitazone usage
  – orders, pharmacy
• Potential confounders
  – age, sex, severity, other meds, etc.
Community-Based Research
• For generalizability, and because that is where chronic conditions are managed, you want to analyze EHR data from community practices
• Which EHR products should you work with?
• Which practices should you approach for participation?
Which EHRs?
• Should be an ONC-certified EHR that meets (some) Meaningful Use criteria
• Should provide needed functionality for study protocol
  – patient demographics
  – problem list
  – medication list
  – clinical documents and notes
• The more structured and coded the data, the better
Which Practices?
• Adoption curve
  – what % of docs are using the system? where are they on the adoption curve? (takes 6+ months for initial roll-out, 1-2 years for comfortable use)
• Which functionality is being used?
  – most EHR purchasers do not use all available functionality (e.g., guidelines support)
• Is there a physician champion?
  – your best liaison to the practice’s EHR
• Consider a practice-based research network for outpatient/community clinics
How Structured is the Data?
• Structured data does not equal coded data
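A small sketch of the distinction, assuming a hypothetical record layout and helper name `is_coded`: a field can be structured (it has a defined slot) while its value is still free text, and it is coded only when the value comes from a controlled vocabulary.

```python
# Structured but NOT coded: the "problem" field exists, but holds free text.
structured_only = {"problem": "sugar diabetes, type II"}

# Structured AND coded: the value names a vocabulary and a code from it.
# ICD-9-CM 250.00 is used here purely as an illustration.
structured_coded = {"problem": {"system": "ICD-9-CM", "code": "250.00"}}


def is_coded(record):
    """True when the problem entry carries a vocabulary name and a code."""
    value = record["problem"]
    return isinstance(value, dict) and "system" in value and "code" in value


print(is_coded(structured_only), is_coded(structured_coded))
```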
How Coded is the Coded Data?
• Availability of coding does not mean coding is used!
• e.g., Problem List
  – “more than 80% of patients have at least one entry in structured data” (MU Stage 1)
  – to what vocabulary? who does the coding? “gamed”?
Malignant neoplasm of colon, unspecified site
Standardization of Clinical Terms
• A term is a designation of a concept or an object in a specific vocabulary
  – e.g., English blood = German Blut
  – standardization enables predictable, accurate search and retrieval
• “Controlled vocabularies” range from simple lists of terms to rich descriptions of knowledge
  – terminologies: list of terms corresponding to concrete (e.g., heart) and abstract concepts (e.g., hypertension)
  – ontologies: includes concepts, their definitions, various types of relationships among the concepts, and axioms
    • data (e.g., lisinopril), information (e.g., lisinopril IS-A ACEI)
    • knowledge (e.g., ACEIs lower blood pressure)
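The data / information / knowledge progression on this slide can be sketched with a toy ontology. The dictionaries and the function `inferred_effects` are hypothetical and contain only the slide's own examples; real ontologies are far richer.

```python
# Information: an IS-A link between a drug and its class.
IS_A = {"lisinopril": "ACEI"}

# Knowledge: a statement attached to the class, not to each drug.
EFFECTS = {"ACEI": "lowers blood pressure"}


def inferred_effects(concept):
    """Walk up the IS-A chain and collect effects stated for any ancestor."""
    effects = []
    while concept is not None:
        if concept in EFFECTS:
            effects.append(EFFECTS[concept])
        concept = IS_A.get(concept)  # None once the chain ends
    return effects


print(inferred_effects("lisinopril"))
```

Because the effect is stored once at the class level, any drug linked to ACEI inherits it without a separate entry.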
Notable Clinical Vocabularies
Terminology Features (e.g., ICD-9)
• Coverage
  – is the idea (e.g., SNP) included?
• Granularity / specificity
  – do you need left heart failure? subendocardial myocardial infarction?
• Synonymy
  – cervical: does this mean related to the neck or the cervix?
• Relationships between terms
  – lisinopril IS-A ACE-inhibitor; see
• Atomic concepts vs. “post-coordinated” concepts
  – left heart failure vs. left + heart failure
• Usability
  – can you find the “right” code? (SNOMED CT has > 357,000 concepts)
Terminology Features (cont.)
• Unambiguousness
  – each concept clearly defined (e.g., immunocompromised)
• Non-redundancy
  – each concept has only one corresponding code
• Consistency
  – each code has only one meaning in all situations
• Concept permanence
  – meaning never changes, even with new versions
• Versioning
  – new terms (e.g., SNP), defunct terms (e.g., dropsy), corrected concepts (e.g., rabies not a psychiatric disorder)
ICD-9 Concept Coverage
• How well would ICD-9 do in capturing a medical chart?
• Inpatient and outpatient charts from 4 medical centers abstracted into 3061 concepts [Chute, 96]
  – diagnoses, modifiers, findings, treatments and procedures, other
• Matching: 0=no match, 1=partial, 2=complete
  – 1.60 for diagnoses
  – 0.77 overall
  – ICD-9 augmented with CPT: overall 0.82
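The reported figures are averages of per-concept match scores on the 0/1/2 scale. A minimal sketch of that averaging, using made-up scores and a hypothetical helper `mean_match`; the study's per-concept data are not on the slide, so these numbers are not meant to reproduce its results:

```python
from statistics import mean

# Invented per-concept scores: 0 = no match, 1 = partial, 2 = complete.
scores = {
    "diagnoses": [2, 2, 1, 2, 1],
    "findings": [0, 1, 0, 1, 0],
}


def mean_match(scores_by_category):
    """Return (mean score per category, mean score over all concepts)."""
    per_category = {cat: mean(vals) for cat, vals in scores_by_category.items()}
    overall = mean(v for vals in scores_by_category.values() for v in vals)
    return per_category, overall


per_category, overall = mean_match(scores)
print(per_category, overall)
```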
ICD-9 Coding Accuracy
• VBAC uterine rupture rate
  – 665.0 and 665.1 ICD-9 discharge codes used in study (NEJM 2001;345:3-8)
  – letter to editor: in 9 years of Massachusetts data
    • 716 patients with 665.0 and 665.1 discharged
    • reviewed 709 charts
    • 363 (51.2%) had actual uterine rupture
      – others had incidental extensions of C-section incision, or were incorrectly coded or typed
    • 674.1 (dehiscence of the uterine wound) used to code another 197 ruptures (or 35% of confirmed cases of uterine rupture)
    • i.e., sensitivity 65%, positive predictive value 51.2%
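The two percentages can be recomputed from the counts on this slide. In this sketch the function name `coding_accuracy` is hypothetical. Note that 363 of 709 reviewed charts is the share of coded cases that were true ruptures, which is a positive predictive value; specificity would require a count of true negatives, which the slide does not give.

```python
def coding_accuracy(true_pos, coded_total, missed):
    """Sensitivity and positive predictive value of a diagnosis code.

    true_pos    -- coded charts with a confirmed rupture (363)
    coded_total -- coded charts that were reviewed (709)
    missed      -- confirmed ruptures filed under another code (197)
    """
    sensitivity = true_pos / (true_pos + missed)
    ppv = true_pos / coded_total
    return sensitivity, ppv


sens, ppv = coding_accuracy(363, 709, 197)
print(round(sens, 3), round(ppv, 3))
```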
SNOMED-CT “Ontology”
• To “help structure and computerize the medical record, reducing the variability in the way data is captured, encoded and used for clinical care of patients and medical research”
  – 311,000 unique health care concepts
  – 800,000 descriptions
  – over 1.36 million relationships between concepts, e.g.,
    • Diabetes Mellitus IS_A disorder of glucose regulation
    • Finger PART_OF hand
SNOMED-CT Structure
• Formally constructed vocabulary/knowledge map
  – 18 high-level hierarchies
    • e.g. finding, organism, substance, body structure, event, social context
  – each concept can be described by many attributes
    • e.g., finding site = lung, associated-morphology = inflammation
  – encodes “knowledge”
    • pneumonia is an infection of the lung by an organism
  – can “post-coordinate” terms to increase expressive power
    • pneumonia: finding-site=lung; finding-site=lower lobe; laterality=right; causative agent=pneumococcus
• http://nciterms.nci.nih.gov/ncitbrowser/pages/vocabulary.jsf or http://vtsl.vetmed.vt.edu/
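Post-coordination amounts to combining a focus concept with attribute=value refinements. A minimal sketch, assuming a hypothetical helper `post_coordinate` and plain-text labels; it mimics the notation on the slide and is not the official SNOMED CT expression syntax, which uses concept identifiers.

```python
def post_coordinate(focus, refinements):
    """Render a focus concept plus (attribute, value) pairs as one string."""
    parts = [f"{attr}={value}" for attr, value in refinements]
    return f"{focus}: " + "; ".join(parts)


expression = post_coordinate(
    "pneumonia",
    [
        ("finding-site", "lower lobe"),
        ("laterality", "right"),
        ("causative agent", "pneumococcus"),
    ],
)
print(expression)
```

A list of pairs is used rather than a dictionary because the slide's example repeats an attribute (finding-site appears twice).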
SNOMED-CT Status
• Best semantic coverage of all existing vocabs
• de facto standard for EHR clinical vocabulary
  – owned by newly created International Health Terminology Standards Development Organisation (Danish, with 9 founding countries)
  – site-licensed (i.e., free) in U.S., as a founding country
Coding Barriers
• Poor inter-coder reliability
  – 3 docs, 5 ophthalmology cases, 242 concepts, 2 SNOMED-CT browsers [Chiang M, 2006]
    • reliability between coders (exact term match): 44% and 53%
    • reliability within same coder: 45% over 2 browsers
• Automatic coding into ICD-9, etc.
  – precision (true pos) 0.88, recall (sens) 0.9 [Goldstein, 2007]
  – experts precision 0.6 to 0.9, recall 0.7 - 0.9
  – still a major Natural Language Processing (NLP) research challenge in general, let alone with typical clinical notes
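Exact-term-match reliability between two coders is the share of concepts to which both assigned the same code. A minimal sketch, assuming invented placeholder codes and a hypothetical helper `exact_match_agreement`; it does not reproduce the cited study's method in detail.

```python
def exact_match_agreement(codes_a, codes_b):
    """Fraction of concepts where two coders chose exactly the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same list of concepts")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Placeholder codes for four concepts, not real SNOMED CT identifiers.
coder_1 = ["C1", "C2", "C3", "C4"]
coder_2 = ["C1", "C9", "C3", "C8"]
print(exact_match_agreement(coder_1, coder_2))
```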
ICD-9 Going Away…
• UCSF moving to ICD-10 by 2014. First webinar March 4
• Example
  – W5803XA Crushed by alligator, initial encounter
  – W5803XD Crushed by alligator, subsequent encounter
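The two example codes differ only in their seventh character, which records the encounter type. A minimal sketch, assuming the A/D/S extension used for injury and external-cause codes (some categories, such as fractures, use additional letters); the helper name `encounter_type` is hypothetical.

```python
# Seventh-character extensions for injury / external-cause codes (assumed scope).
ENCOUNTER = {
    "A": "initial encounter",
    "D": "subsequent encounter",
    "S": "sequela",
}


def encounter_type(code):
    """Return the encounter type for a 7-character code, else None."""
    code = code.replace(".", "")  # accept W58.03XA as well as W5803XA
    if len(code) != 7:
        return None
    return ENCOUNTER.get(code[6])


print(encounter_type("W5803XA"), "|", encounter_type("W5803XD"))
```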
AMA, 2010 http://www.ama-assn.org/ama1/pub/upload/mm/399/icd10-icd9-differences-fact-sheet.pdf
EHR for Research Summary
• Variable adoption of EHRs limits benefit to clinical research
• Not automatically going to help clinical research
  – if all unstructured free text, won’t help much at all
    • the more structured it is (i.e., more defined fields), the better
  – if just coded sporadically in ICD-9
    • problem with gamed codes, poor semantic coverage
    • ICD-10 transition will be very challenging
  – very, very few EHRs coded in SNOMED
    • some clinical concepts still not well covered
    • SNOMED is essentially unusable by front-line clinicians
    • general automated coding still some time away, but may be an option for constrained domains (e.g., path, radiology reports)
Getting Data Out
• Cohort identification
  – how many potentially eligible patients at UCSF?
• Data extraction
  – extract particular data items for particular patients?
  – cannot “go to APEX” to pull out data for outcomes research
    • APEX built for treating one patient at a time
    • backend database (Clarity) is a relational database, but data schema is proprietary
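Cohort identification is essentially a counting query over a relational store. The sketch below uses an in-memory SQLite database with a hypothetical two-table schema (`diagnoses`, `med_orders`) and invented rows; it is not the Clarity schema, which the slide notes is proprietary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE diagnoses (patient_id INTEGER, icd9 TEXT);
CREATE TABLE med_orders (patient_id INTEGER, drug TEXT);
INSERT INTO diagnoses VALUES (1, '250.00'), (2, '250.02'), (3, '401.9');
INSERT INTO med_orders VALUES
    (1, 'pioglitazone'), (2, 'metformin'), (3, 'rosiglitazone');
""")


def count_cohort(conn, on_glitazone):
    """Count diabetics (ICD-9 250.xx) with or without a glitazone order."""
    sql = """
        SELECT COUNT(DISTINCT d.patient_id)
        FROM diagnoses d
        WHERE d.icd9 LIKE '250.%'
          AND EXISTS (
                SELECT 1 FROM med_orders m
                WHERE m.patient_id = d.patient_id
                  AND m.drug IN ('pioglitazone', 'rosiglitazone')
              ) = ?
    """
    # EXISTS evaluates to 0 or 1 in SQLite, so it can be compared to a flag.
    return conn.execute(sql, (int(on_glitazone),)).fetchone()[0]


print(count_cohort(conn, True), count_cohort(conn, False))
```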
March 6, 2012: I. Sim Research InformaticsEpi 206 — Medical Informatics
Data from APEX to IDR
[Figure: diagram; surviving labels are the Integrated Data Repository, feeds labeled ADT, Chem, APEX, XRay, PBM, Claims, and boxes labeled MICU, Finance, Research, QA, Internet]
• autofeed nightly, data stored securely with backup
At UCSF: IDR & My Research – Big Picture
[Figure: architecture diagram; only its labels survive]
• Source Systems: UCare, PICIS, Cancer Registry, Misys, IDXrad, Apollo, Worx, CTMS, STOR, MAR, Flowcast, TSI, CoPath, Kaiser, VA, ED, Epic, Axium, Siemens Radiology, Transplant, LPPI EMR, SFGH
• Pipeline: Replica; Extract, Transfer and Load (proxy process); Audit DB / IDR Data Warehouse; Security
• End User Tools: Cognos BI (Data Warehousing, Business Intelligence), Cohort Selection Tool (i2b2), SAS, STATA, SPSS, Atlas.ti, Enterprise Architect, REDCap, Alfresco; Terminal Servers, SGD Webtop
• Legend: Red = Currently Integrated; Will be replaced by Epic; Will have interfaces to bring data into Epic
EHR vs. IDR Queries
• EHR Queries
  – What was Mr. Smith’s last potassium?
  – Does he have an old CXR for comparison?
  – What antihypertensives has he been on before?
  – What did the neurology consult say about his epilepsy?
• IDR Queries
  – What proportion of diabetics with AMI admissions were discharged on beta-blockers?
  – What was the average Medicine length of stay in 2010 compared to 2005?
  – What is the trend in use of head CTs in patients with migraine?
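The contrast between the two query styles shows up in their shape: an EHR query looks up one patient, while an IDR query aggregates over a population. A minimal sketch with an in-memory SQLite database; the tables, columns, rows, and function names are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE labs (patient_id INTEGER, test TEXT, value REAL, taken_on TEXT);
CREATE TABLE admissions (
    patient_id INTEGER, diabetic INTEGER, ami INTEGER, bb_at_discharge INTEGER
);
INSERT INTO labs VALUES
    (1, 'potassium', 4.1, '2012-01-05'),
    (1, 'potassium', 3.6, '2012-02-01'),
    (2, 'potassium', 5.0, '2012-02-03');
INSERT INTO admissions VALUES (1, 1, 1, 1), (2, 1, 1, 0), (3, 0, 1, 1), (4, 1, 1, 1);
""")


def last_potassium(conn, patient_id):
    """EHR-style query: one patient's most recent potassium value."""
    row = conn.execute(
        "SELECT value FROM labs WHERE patient_id = ? AND test = 'potassium' "
        "ORDER BY taken_on DESC LIMIT 1",
        (patient_id,),
    ).fetchone()
    return row[0] if row else None


def beta_blocker_proportion(conn):
    """IDR-style query: share of diabetic AMI admissions discharged on a beta-blocker."""
    return conn.execute(
        "SELECT AVG(bb_at_discharge) FROM admissions WHERE diabetic = 1 AND ami = 1"
    ).fetchone()[0]


print(last_potassium(conn, 1), beta_blocker_proportion(conn))
```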
EHR/Data Repository Comparison
• Enterprise viewpoint more appropriate for QI and research
• Data repository cleans and aggregates data from multiple sources
UCSF IDR
• First version: all UCare data from July 1, 2005 (UCare roll-out) to mid-2011
  – 2.875 million records (not all unique)
  – 5 million encounter records; manual refresh
  – included inpatient data, Dentistry, some billing data; has not yet received STOR/VA/Kaiser/THREDS data
• APEX data to IDR as of Spring 2012
APeX-IDR Information Flow
[Figure: data flow diagram; labels listed in the apparent order of flow]
• Epic Production Server (Chronicles Caché): patient care
• Shadow Server, Clarity (Microsoft SQL Server): operational and financial reports, government-mandated reporting
• Staging Server (Microsoft SQL Server)
• IDR (HIPAA Limited Data Set*)
• MyResearch (Data Marts / Ontologies): research
• Network zones: Medical Center Network (Requires CHR Approval for Access); MyResearch Network (No CHR Approval Required)
* “HIPAA Limited Data Set” = No PHI Except Dates of Service
IDR Demo
• Go to MyAccess (myaccess.ucsf.edu)
• Go to MyResearch
• Launch Cohort Selection Tool
  – might need to sign up for account first
IDR User Interface: UCare Cohort Selection Tool
Requesting Data Extraction
• https://redcap.ucsfopenresearch.org/surveys/?s=SMB9LX
  – demographic data
  – diagnostic codes (ICD-9)
    • admit, discharge, outpatient distinction?
  – procedural codes (CPT)
  – lab tests
• No medications yet
• CHR approval needed for getting identifiable information
Limitations
• Content limitations
  – “meds, orders, results” on track for March 2013
• Search option limitations
  – ICD-9 terms are cumbersome; ICD-10 is coming
  – no labs on search interface
  – very little user support for the interface
  – no free text or NLP (natural language processing) search
• Other limitations
  – Diagnoses include primary, secondary, admit, discharge
  – Queries are for entire time period since start of IDR
  – Data is whatever comes out of APEX, errors exist (e.g. 97 married children under 10)
    • Beware!
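Errors like the married children under 10 can be surfaced with simple plausibility checks before analysis. A minimal sketch, assuming a hypothetical row layout with `age` and `marital_status` fields and a hypothetical helper `implausible_marital_status`:

```python
def implausible_marital_status(rows, min_age=10):
    """Return rows recorded as married with an age below min_age."""
    return [
        r for r in rows
        if r["marital_status"] == "married" and r["age"] < min_age
    ]


# Invented extract rows for illustration.
rows = [
    {"id": 1, "age": 7, "marital_status": "married"},
    {"id": 2, "age": 45, "marital_status": "married"},
    {"id": 3, "age": 8, "marital_status": "single"},
]
print([r["id"] for r in implausible_marital_status(rows)])
```

Flagged rows still need human review: the age, the marital status, or both may be wrong in the source system.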
Current Plans on “IDR”
• A clinical data warehouse needed to serve both research and operational needs
  – build a new virtual warehouse
• Rebrand the IDR effort… Enterprise Data Warehouse
  – “IDR is associated with a project that has not met the requirements of the community; Trust in the product must be built.” (External Advisory Group, 2012)
• Re-do system architecture
  – costs: ~$7-12 million/yr for next 3 years
• Drive work using case studies
  – arthroplasty and ACOs, OB/neonatal database, etc.
UC Rex
• “IDR” of all 5 UC campuses
  – go live was Dec 31, 2012
  – med and lab data loaded
  – some pilot projects underway
Summary EHRs for Research
• EHR does not always = easier clinical research
  – “Frankly, one of the biggest attractions to LastWord (aka UCare) is going to be a boon to clinical research. Information will be accessible in a much more uniform and complete way.” ex-SOM Dean Haile Debas, UCSF Daybreak, 2001
• Coding is critical
  – standardized, coded data trumps free text
    • especially important for research
    • but most controlled vocabularies have insufficient clinical coverage and are difficult to use
  – automated methods possible in restricted or custom situations
• Data warehouses are only as good as (and sometimes worse than) the original data sources
Beyond the EHR -- discussion
• Patient care activities (e.g., orders, referrals, results review)
• Charting
• Billing
• Improving team communication
• Meeting regulatory requirements
• Clinical decision making
• Increasing revenue
• Clinical research
• Reducing practice variation
• Controlling clinician behavior
• Collecting “big data” to improve care practices
• Involving the patient in collaborative care
• Other
For Patients
• http://healthdesignchallenge.com/
ccti.ucsf.edu
Context is Automatically Known
Data Query -- Labs, Notes, Xrays
Data Capture – Medications
Data Capture – Physical Exam
Assessment and Plan
Finishing Up
Advanced Medical “Home”
24/7/
Current State of EHRs
• HITECH driving adoption of yesterday’s fundamentally mis-conceived technology
  – lots of activity, churn, money, effort spent to meet Meaningful Use
  – level of data exchange being mandated is unlikely to improve care quality, decrease cost
• ACO era starting to align incentives
  – to drive and reward use of data for care, not billing
  – to magnify role of patient and teams
  – to diminish role of hospitals
  – to upend business roles, business models
Take-Home Points
• Major barriers still exist to EHR adoption
• EHR does not always = easier clinical research
• Coding is critical
  – standardized, coded data trumps free text
    • especially important for research
    • but most controlled vocabularies have insufficient clinical coverage and are difficult to use
  – automated methods possible in restricted or custom situations
• All signs point to coming disruptive change…
Next Class
• Clinical decision support systems
• Informatics for clinical research
• Disruptive change…