Journal of Ambient Intelligence and Smart Environments 0 (2010) 1–0. IOS Press

Using OWL to Facilitate Quality Reporting from Electronic Healthcare Records

Chimezie Ogbuji (a), E-mail: [email protected]; Chris Deaton (b), E-mail: [email protected]

(a) Cleveland Clinic Foundation, JJ4 9500 Euclid Avenue, Cleveland, OH 44195, USA
(b) Cycorp, Inc., 7718 Wood Hollow Drive Suite 250, Austin, TX 78731-1601, USA

Abstract. The Institute of Medicine's patient safety and quality chasm reports noted that today's healthcare systems can be harmful and routinely fail to deliver potential benefits. National groups such as Leapfrog and the National Coalition for Quality also note that not all healthcare quality is equal, there is room for improvement, and quality improvements can produce long-term cost savings. Hospitals use indicators such as risk-adjusted mortality, complications, and morbidity rates to monitor quality. Semantic Web technologies such as OWL are well poised to help address the various terminology and informatics challenges associated with the definition and derivation of variables for quality reporting from clinical data in an Electronic Health Record (EHR). They are re-usable, precise in meaning, and easily support distributed, web-based processing of discrete clinical data. This paper discusses the use of OWL in a currently deployed and certified reporting pipeline in the Clinical Investigations department of the Cleveland Clinic's Heart and Vascular Institute. This pipeline is used to generate staff reports for quality assurance as well as external quality reports to the Society of Thoracic Surgeons' (STS) Adult Cardiac Surgery Database on an ongoing basis.

Keywords: OWL, RDF, medical records, reporting

1. Introduction

The Institute of Medicine's (IOM) patient quality chasm report [8] noted that healthcare delivery systems do not provide consistent, high quality medical care to all people. In tandem with the advancement of medical science and technology is the growing complexity of health care. In the face of this, the nation's health care delivery system has fallen short in its ability to translate knowledge into practice and to apply new technology safely and appropriately. The IOM's Committee on the Quality of Health Care in America called for all purchasers to reexamine their payment policies to remove barriers that impede quality improvement and build in stronger incentives for quality enhancement.

The term Electronic Health Record (EHR) has often been used to refer to digital health record content and its associated processes [2]. EHR content can be structured; examples of this are medications, laboratory results, or International Classification of Diseases (ICD) codes. EHR content can also be unstructured or narrative free text. Structured EHR data support consistent retrieval, reporting, and data aggregation for research. Much earlier, in 1991, the IOM contributed [12] a related term with a more precise definition: the Computer-Based Patient Record (CPR). They defined it as:

an electronic patient record that resides in a system specifically designed to support users by providing accessibility to complete and accurate data, alerts, reminders, clinical decision support systems, links to medical knowledge, and other aids.

This definition emphasizes data that are well-organized and discrete¹, in addition to being in a digital form, and for the sake of compatibility with the literature, the term EHR as used in this paper should be understood to refer to this IOM definition.

It is inevitable that routinely collected, discrete data residing in an EHR will be used increasingly for research purposes [2][10], as well as for supporting Healthcare Information Technology (HIT) and clinical informatics.

¹ The term discrete data is used to refer to data that are comprised of a finite set of values for variables. In the context of medical data, this is often referred to as coded data.

1876-1364/10/$17.00 © 2010 – IOS Press and the authors. All rights reserved

2 C. Ogbuji and C. Deaton / Quality Reporting from EHRs with OWL

Whereas HIT is a term used to describe the application of computers and technology in health care settings, informatics is the discipline focused on the acquisition, storage, and use of information in a specific setting or domain [19]. It is distinguished from computer science by its rooting in a domain; in the case of medical informatics or clinical informatics, the primary domain is the health care setting. It is more about information than technology. HIT has achieved a new, national prominence in the United States with its inclusion in the American Recovery and Reinvestment Act (ARRA) of 2009, the federal economic stimulus package signed into law by President Barack Obama on February 17, 2009 [19]. A major reason for this is the promise of HIT for improving the quality and safety of health care while reducing costs.

One of the major goals of the ARRA is to stimulate the meaningful use of HIT in order to lay the groundwork for advanced electronic health information systems [23]. The term meaningful use describes [7] the use of an EHR that

includes electronically capturing health information in a coded format, using that information to track key clinical conditions, communicating that information in order to help coordinate care, and initiating the reporting of clinical quality measures and public health information.

Beyond the installation of EHRs, the goal of this legislative initiative is to incentivize the effective use of EHRs that are certified as having certain capabilities. On July 13, 2010, the Department of Health and Human Services (HHS) specified the final rule that implements the provisions of the ARRA; it includes requirements for electronic reporting of clinical quality measures beginning in 2012 for Medicare and Medicaid [11]. The application of OWL described in this paper demonstrates value for the requirements regarding reporting of quality measures.

Quality process measures examine the consistent provision of evidence-based diagnostic or therapeutic interventions (such as whether a patient with a particular disorder received a medication of some kind in a timely fashion) [21]. Clinical outcome measures focus on how patients fare over time, typically by the recording of an endpoint such as whether or not they are alive. Creating such quality measures involves the definition of an eligible population (referred to as the denominator) and a reliable identification of qualifying medical interventions within a specified time frame (referred to as the numerator) [29].
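The denominator/numerator construction above can be sketched in a few lines of Python. This is a hypothetical, minimal illustration; the patient records, the "CAD" diagnosis code, and the 48-hour window are all invented for the example:

```python
from datetime import date, timedelta

# Hypothetical patient records: each has a diagnosis set, an admission
# date, and the date of a qualifying intervention (or None).
patients = [
    {"id": "p1", "diagnoses": {"CAD"}, "admitted": date(2010, 3, 1),
     "intervention": date(2010, 3, 1)},
    {"id": "p2", "diagnoses": {"CAD"}, "admitted": date(2010, 3, 5),
     "intervention": date(2010, 3, 9)},
    {"id": "p3", "diagnoses": {"asthma"}, "admitted": date(2010, 3, 2),
     "intervention": None},
]

# Denominator: the eligible population (patients with the target disorder).
denominator = [p for p in patients if "CAD" in p["diagnoses"]]

# Numerator: eligible patients who received the qualifying intervention
# within the specified time frame (here, 48 hours of admission).
window = timedelta(hours=48)
numerator = [p for p in denominator
             if p["intervention"] is not None
             and p["intervention"] - p["admitted"] <= window]

rate = len(numerator) / len(denominator)
```

A real measure definition would, of course, draw both populations from coded EHR data rather than hand-built records; the point is only the shape of the computation.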

There is consensus about the need for transparency and the rights of patients to have greater knowledge about the care they receive. A critical component of any improvement activity is an accurate, reliable, standardized, and cost-effective means for measuring current performance and for setting desired performance goals [32]. The primary response of providers and policymakers alike to address the lack of quality in today's healthcare information systems has been performance measurement and public reporting of quality of care data [21]. From a recent report, nearly every hospital in the nation is reporting performance on at least 24 process measures across 4 clinical conditions, 3 outcomes measures, and 9 different measures of patient experience [21]. EHR-based quality measurement and reporting has the potential to reduce the administrative challenges of current quality measurement and reporting activities [1].

To be useful, a quality measure should be precisely defined, and the primary metric of this precision is how well the original purpose for which the data were entered matches their use in reporting quality measures [32]. Relying on administrative data from billing systems to deduce clinical context violates this principle and often produces misleading results that may lead to misinformed policy. Administrative data, also called claims data, are byproducts of administering and reimbursing health-care services [2]. They have important limitations that discrete data derived from an EHR can potentially overcome. The most important limitation is the lack of clinical detail. EHR-derived data are typically more comprehensive, especially with respect to clinical details and risk factors, and have been shown to be superior to administrative data sets in predicting disease prognosis and outcomes.

In their study [15], Fowles et al. explored the experiences of 5 provider organizations in developing, testing, and implementing quality-of-care indicators based on data collected from their EHR systems. They report that the success of the case studies in implementing EHR-based quality measures demonstrated that such measures are worth pursuing, despite the challenges of ensuring the validity and reliability of data, efficient workflow, and staff support associated with the use of an EHR in this way.

There are many medical informatics challenges that stand in the way of realizing the objective of using EHR data for the routine and comprehensive measurement of the quality of care [18]:


– guidelines are not typically specified in a way that translates easily to quality measurement or computer implementation

– structured/coded data needed for quality measures are not standardized and are subject to local variations in EMR implementations and clinical practice

– much of the data required for quality measurement reside in free-text notes documenting clinical encounters.

Some degree of customization is often required to account for the sublanguage and documentation practices of specific medical subdomains [9]. This can be a costly and resource-intensive endeavor, and both the cost and technical complexity can increase considerably when attempting to extract EHR data from multiple institutions.

Unfortunately, current methods of quality measurement and reporting remain highly challenging [1]. Not all stand-alone EHR systems are capable of collecting all the data needed for reporting without significant manual intervention. As a result, much of the data needed must be obtained from other sources, such as diagnostic laboratories. The traditional approach to reporting is one where data from clinical sources are manually abstracted into certified, third-party vendor repositories from where they are submitted to external agencies [21].

This redundant abstraction is a common, general problem in the management of clinical data and is often referred to as double data entry [26]. In this case, the redundancy persists despite the opportunity to submit reports that are automatically derived from existing EHR data, such that the data can be reused for multiple reporting purposes. In addition to this problem, traditional approaches are often subject to changing and disparate definitions of variables. The tools involved are typically based on relational database technology and are comprised of special purpose modules tailored to the data definitions of one or two specific quality agencies.

Hazlehurst et al. designed [18] a quality measurement system as a pipeline of transformation steps taken on encounter-level EHR data with the goal of capturing all of the clinical events required to assess care for specific clinical domains. They stated that comprehensive and routine quality of care assessment requires state-of-the-art EHR implementations as well as a scalable technology platform in order to be reliable, maintainable, and automated. Jung et al. developed [22] a secure, web report delivery system built on standard relational database reporting technology that is integrated with a longitudinal medical record system and uses data aggregated into a data warehouse from various sources.

This paper describes an approach to using a standard ontology language (OWL) for the purpose of describing and deriving clinical quality measures. It will also describe the advantages and disadvantages of such an approach to solving this important problem.

2. Approach

2.1. Semantic web technologies

The Semantic Web combines a highly distributable addressing and naming mechanism (Uniform Resource Identifiers: URIs) with a formal knowledge representation (RDF and OWL), a mechanism for rendering document dialects in this knowledge representation (GRDDL), and a common query language (SPARQL). It is an extension of the current World Wide Web in which information is given well-defined meaning in order to facilitate better cooperation between people and computers.

2.2. Architecture

In the healthcare setting, the term interoperability is often used to describe communication between different information systems using standard data representations [23]. There has been progress in addressing syntactic or data interoperability via the use of data exchange standards such as Health Level Seven (HL7) for exchange between laboratory systems and a common data repository using XML [26][23]. However, there is less progress in addressing the issue of semantic interoperability. Semantic interoperability enables the more complex use of knowledge for clinical decision-making. The complexity of mapping relationships to standardized vocabularies is a formidable challenge. Data exchange standards such as HL7-RIM fail to provide a common benchmark for how data are to be formulated by their senders and how they are interpreted by their recipients, and they suffer from ambiguity in their meaning [31]. This lack of semantic interoperability limits the ability of tools to infer meaning when working with incomplete clinical data [10].

Many of these collective informatics challenges are addressed through the use of ontologies to define the meaning of data elements in a patient record system


[4]. The design of a knowledge representation – the study of what knowledge to represent and how to represent it for a machine to act intelligently – involves making a set of decisions about how and what to see of the world (i.e., a conceptualization). A conceptualization is a way of conceiving the world and deciding what to model in a knowledge representation, and the artifacts of such conceptualizations are often referred to as an ontology. Ontology development is a way to reduce the complexity of the world and in this way can effectively address the semantic interoperability challenges to the meaningful use of EHR data.

The emergence of second-generation knowledge-based systems provides a more explicit and maintainable use of ontologies for encoding and applying (clinical) knowledge. In addition to explicitly representing patient record content, ontologies can also be used for concept classification, automatically relating EHR data with the activities in the primary care process [4][5].

Proprietary representation formats are a significant barrier to both syntactic and semantic interoperability [23]. In order to demonstrate meaningful use, HIT must adopt standards that address the challenges of semantic interoperability. Proprietary representation formats also increase the amount of time and money wasted in the manual transcription of data between computer systems. Ruttenberg et al. demonstrated [28] the value of the information environment the Semantic Web can support and illustrated the range of semantic web technologies that have applications in areas of biomedicine.

The requirements for report generation in the system of which our application is a component are that an informatician be able to:

– identify a reporting cohort as an RDF dataset of named patient graphs

– incorporate content from external data sources within the healthcare information system

– identify a set of operations that qualify for a particular report (the numerator)

– derive a corresponding entailed RDF graph for each input patient graph via a set of ontologies and rules represented in a standard knowledge representation language for the semantic web that captures the logic used to derive the report-specific variables

– generate a resulting report in an appropriate format

The system needs to be automated, declarative (involving definitions of what is to be computed, but not necessarily how [25]), reusable, expedient, and scalable.

Below are some definitions important to the discussion of the system described in this paper. The phrase report-entailed refers to the result of logical entailment that follows from axioms capturing the criteria in a report variable definition.

Definition (The Denominator dataset). The original set of patient record RDF graphs that comprise the eligible population for a particular report.

Definition (Report-entailed numerator graph). The entailed RDF graph that follows from the formally specified derivation logic for quality measures and the statements in a graph from the denominator dataset.

Definition (Report-entailed numerator dataset). The RDF dataset formed from the set of report-entailed numerator graphs for each patient RDF graph in the denominator dataset.

Definition (Reporting expert system²). The component that provides the logical reasoning capability for deriving the report-entailed numerator graph for a particular report using semantic web knowledge representation standards such as OWL, Rule Interchange Format (RIF), and RDFS.
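The relationships among these definitions can be sketched in Python, modeling an RDF graph as a set of (subject, predicate, object) triples and an RDF dataset as a mapping from graph name to graph. The graph names, predicates, and the single derivation rule are invented for illustration:

```python
# A minimal sketch of the definitions above. `entail` computes the
# closure of a graph under a set of derivation rules, where each rule
# maps a graph to a set of newly derived triples.

def entail(graph, rules):
    entailed = set(graph)
    changed = True
    while changed:
        new = set()
        for rule in rules:
            new |= rule(entailed)
        changed = not new <= entailed
        entailed |= new
    return entailed

# Denominator dataset: named patient record graphs (the eligible
# population for the report).
denominator_dataset = {
    "patient:1": {("patient:1", "hasProcedure", "CABG")},
    "patient:2": {("patient:2", "hasProcedure", "AVR")},
}

# One illustrative derivation rule: every CABG procedure qualifies
# the patient for the (hypothetical) report.
def cabg_qualifies(graph):
    return {(s, "qualifiesFor", "STSReport")
            for (s, p, o) in graph
            if p == "hasProcedure" and o == "CABG"}

# Report-entailed numerator dataset: one report-entailed numerator
# graph per patient graph in the denominator dataset.
numerator_dataset = {name: entail(g, [cabg_qualifies])
                     for name, g in denominator_dataset.items()}
```

In the deployed system the rules are of course OWL/RIF axioms evaluated by the reporting expert system, not Python functions; the sketch only fixes the data-flow between the defined artifacts.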

2.3. Managing expressive power restrictions

Doyle and Patil argued [14] that worst case time complexity, logical soundness, and completeness are not the right measures for evaluating the provisions of a representation service. Restricting classification to purely terminological information significantly reduces its utility in practical applications. They demonstrate examples of common classes of problems that require more expressive power in the underlying knowledge representation:

– Recursive definitions, transitive relations, and ordered sets (such as connectedness, reachability, part-whole, before-after, and cause-effect)

– Functions over ordered sets (such as functions or relations over ordered sets of integers or natural numbers)

² An expert system is a type of application program that makes decisions or solves problems in a particular field by using knowledge and analytical rules defined by experts in the field.


Fig. 1. Semantic web-based architecture for reporting quality measures

Hastings et al. proposed [17] an approach using OWL, description graphs, and rules to implement a structure-enriched knowledge base for chemicals that performs classifications based on the chemical structures and rules. The main strength of their approach is the direct encoding of complex structures at the class level in the ontology, and the encoding of rules for determining properties such as being cyclic, which cannot be expressed as OWL axioms.

This distinction between classes of reasoning problems (into ontology and rule-based axioms) is similar to the approach described in this paper and works well with a reasoning strategy that relies on the W3C's recent translation³ of OWL 2 RL into RIF notation, such that a conformant RIF Core consumer can be used for the logical reasoning needs of a reporting expert system while off-the-shelf ontology editors can be used to manage the terminological definitions.
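The rule-based character of OWL 2 RL can be illustrated with two of its entailment rules (cax-sco and scm-sco in the W3C rule tables) implemented as a naive forward chainer. The vocabulary terms are abbreviated and the class hierarchy is invented for the example:

```python
# A naive forward-chaining sketch of two OWL 2 RL rules:
#   cax-sco: (?x rdf:type ?c1), (?c1 rdfs:subClassOf ?c2) -> (?x rdf:type ?c2)
#   scm-sco: (?c1 subClassOf ?c2), (?c2 subClassOf ?c3) -> (?c1 subClassOf ?c3)
TYPE, SUB = "rdf:type", "rdfs:subClassOf"

def owl_rl_closure(triples):
    g = set(triples)
    while True:
        derived = set()
        subs = [(s, o) for s, p, o in g if p == SUB]
        for x, p, c1 in g:
            if p == TYPE:
                derived |= {(x, TYPE, c2) for s, c2 in subs if s == c1}
        for c1, c2 in subs:
            derived |= {(c1, SUB, c3) for s, c3 in subs if s == c2}
        if derived <= g:            # fixed point reached
            return g
        g |= derived

# Invented example: a mitral valve repair is a valve procedure,
# which is a cardiac procedure.
g = owl_rl_closure({
    (":op1", TYPE, ":MitralValveRepair"),
    (":MitralValveRepair", SUB, ":ValveProcedure"),
    (":ValveProcedure", SUB, ":CardiacProcedure"),
})
```

A production system would compile such rules into a Rete network rather than re-scanning the graph on every iteration, but the fixed-point semantics are the same.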

2.4. Materialized and Virtual Approaches to Entailment

One of the primary challenges in this application of OWL for the purpose of quality reporting is the decision of whether to calculate the report-entailed graph (or a subset of the entailments that follow from quality

³ http://www.w3.org/TR/rif-owl-rl/

measure axioms) a priori or to perform the inference at query time. This is a classic problem in deductive database theory that contrasts bottom-up⁴ with top-down evaluation.

In data warehouse architecture, this contrast is manifested in the distinction between virtual views and materialized views. In the former, queries against relations defined by views are re-written, decomposed, and evaluated against the data warehouse. In the latter, such queries can be directly evaluated against the database, which includes the derived entailments.

An important metric in this distinction is the multiplicative effect of materializing entailments such as instances of partial orders⁵. When done exhaustively, the number of instances of such relations added is N·(N−1)/2, where N is the number of things that stand in the ordering. This was the most significant factor in determining which approach to use with which predicate and will be discussed later in section 4. Another critical notion is that of "freshness". This is a measure of how quickly updates to source databases are reflected in the view.

Broadly speaking, the virtual approach may be better if the information sources are changing frequently (i.e., if freshness is an important requirement), whereas the materialized approach may be better if the information sources change infrequently and very fast query response time is needed [20]. In this application, a primarily materialized approach is used, as report-entailed numerator graphs are derived a priori and stored in a triple store for subsequent querying. As will be discussed later, this balance between responsive queries, freshness of data, and the multiplicative effect of certain entailments is a major challenge to the application.
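The multiplicative effect of exhaustive materialization can be demonstrated concretely: closing a chain of N totally ordered events under a transitive ordering predicate yields N·(N−1)/2 materialized instances. A small sketch (the `before`-style ordering and the event chain are hypothetical):

```python
# Exhaustively materialize a transitive relation and count the result.
def transitive_closure(pairs):
    closed = set(pairs)
    while True:
        # Compose every pair (a, b) with every pair (b, d).
        derived = {(a, d) for a, b in closed for c, d in closed if b == c}
        if derived <= closed:       # fixed point reached
            return closed
        closed |= derived

N = 6
# A chain of N events ordered by a hypothetical `before` predicate:
# only N - 1 base facts are asserted.
chain = {(i, i + 1) for i in range(N - 1)}
materialized = transitive_closure(chain)

# Exhaustive materialization stores N * (N - 1) / 2 instances.
assert len(materialized) == N * (N - 1) // 2
```

For N = 6 the 5 asserted facts blow up to 15 stored instances, which is why the choice between materialized and virtual evaluation was made per predicate.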

2.5. Higher-order logical reasoning

As shown in Figure 1, once the report-entailed numerator dataset is loaded into an RDF triple store, Cyc is used to derive the final report through a series of SPARQL queries over the report-entailed triple store, using some additional, higher-order reasoning to generate the queries and process the answers from them.

⁴ In the literature, the terms bottom-up and forward chaining are often used interchangeably and refer to an inference technique that begins with data and uses inference rules to derive more data until a goal is reached or there are no additional derivations, reaching a fixed point / finite grounding / conflict set.

⁵ A relation that is irreflexive, transitive, and asymmetric is a partial order.


Cyc is a large and highly efficient expert system with a domain that spans common objects and actions [24]. It is comprised of a large universal schema (an ontology) of general concepts and a powerful higher-order logical reasoning system.

In a manner that appeals to the arguments of Doyle and Patil, its authors gave up completeness of inference in return for expressiveness and efficiency. Its language (CycL) involves first-order predicate calculus, set theory, meta-level assertions, context, and modal operators. Its logical reasoning system includes special-purpose reasoning algorithms and heuristics that identify situations in which algorithmic shortcuts can be used.

Structured Knowledge Source Integration (SKSI) is a component of the Cyc architecture that provides a means for Cyc to seamlessly access external structured data. The Cyc system includes specialized CycL vocabulary, inference modules, and supporting connection management code [30]. The SKSI component has recently been augmented to support integration with systems that implement the SPARQL protocol, and this is the means by which Cyc is used to derive the artifacts from the report-entailed triple store.

During the generation of a report, Cyc is leveraged for reasoning that surpasses the expressive power of SPARQL with OWL or RIF entailment [16]. Examples of this are:

– the ability to filter based on duration: selecting medical events that occurred within a certain period of time prior to or following an index procedure in the numerator

– selecting the highest values from a set of both numerical and non-numerical values (selecting the highest preoperative regurgitation value within a particular period)

Another common use of Cyc in this application is to facilitate closed world negation. As discussed by Alan Rector [27], large, open world description logic A-boxes are rarely appropriate for medical data because most large fact bases in biomedicine come from closed world databases. There is little research and few off-the-shelf solutions for systems combining OWL T-boxes with standard closed world databases, and so Cyc is leveraged for this purpose.

Derivation logic involving closed world negation is implemented by taking advantage of the combination of OPTIONAL, !bound(), and FILTER typically used in SPARQL for this purpose. A representative example of closed world negation is the need to identify isolated procedures, i.e., procedures that were the only component of an operation or were not paired with another major concomitant procedure. Such use of negation strictly follows the Closed World Assumption (CWA), since the semantics of the exclusion are interpreted to assume the instances of the predicates involved in the definition are complete, and so the procedure is classified as being isolated if there is no indication of another component of the operation in the record.

The general approach is to calculate the positive entailments using a more restricted knowledge representation and logical reasoning system (step 3 in Figure 1) and then rely on SPARQL for the closed world negation semantics evaluated against the entailed dataset.
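The isolated-procedure test can be sketched in Python in the style of the SPARQL OPTIONAL / FILTER(!bound(...)) idiom. The predicate names and operation records below are invented for illustration:

```python
# Closed world negation: a procedure is "isolated" if its operation has
# no other recorded component procedure. The equivalent SPARQL idiom is
# roughly:
#
#   SELECT ?proc WHERE {
#     ?op :hasComponent ?proc .
#     OPTIONAL { ?op :hasComponent ?other . FILTER(?other != ?proc) }
#     FILTER(!bound(?other))
#   }

operations = {
    "op1": ["CABG"],                  # a single-component operation
    "op2": ["CABG", "MitralRepair"],  # CABG with a concomitant procedure
}

def isolated_procedures(ops):
    # Closed World Assumption: the absence of a second recorded
    # component is treated as proof that none occurred.
    return {proc
            for components in ops.values()
            for proc in components
            if not any(other != proc for other in components)}

isolated = isolated_procedures(operations)
```

Note how the classification depends entirely on the completeness assumption: the CABG in op2 is excluded only because another component happens to be on record.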

2.6. Parallel and distributed reasoning

Urbani et al. attempt to address [33] the problem of reasoning over a large amount of data using the MapReduce distributed system along with RDFS and OWL reasoning algorithms. Their hypothesis is that the reasoning process can be efficiently translated into one or more MapReduce jobs in order to overcome the limitation of physical hardware constraints.

The results of Urbani et al. show that RDFS reasoning in their implementation can compute RDFS closures over 1 billion triples in less than an hour. The results from their OWL implementation were not as effective.

Weaver and Hendler define a partitioning scheme that can be used to perform parallel (finite) RDFS inferencing over A-box partitions, demonstrating linear scaling of the inferencing time [34]. In their experiments, they use a large memory cluster of up to 128 processors, producing over 650 million triples from a dataset of 345 million triples in only 8 minutes (4 of which are spent performing inference).

Both approaches demonstrate the value of using parallel computation for large scale RDF reasoning; however, the advantages of a distributed system can be exploited only if the input can be partitioned so that individual nodes can work without communicating with each other [33]. In the case of the reporting expert system, the process of calculating the report-entailed numerator graph depends only on the RDF statements in a single patient graph. This framework is well suited to take advantage of a high degree of parallelism with little cost of overhead or relational database management system maintenance. This is how Google processes data on the order of petabytes on a network of a few thousand computers.


The computational problem of calculating a report-entailed numerator dataset can be characterized in terms of a framework of partitioned and distributed computation:

Definition (The distributed, numerator reporting problem). The problem involving the distributed, parallel use of N reporting expert systems to derive a large, report-entailed numerator dataset comprised of M graphs. Let R denote an instance of a reporting expert system.

Intuitively, the primary bottleneck in the distributed and parallel evaluation of a report-entailed numerator dataset is the number of numerator graphs that are not allocated to a node in the computing cluster at any time and the overhead of managing a queue needed in such a situation. As the number of nodes in the cluster approaches the size of the dataset, the total time spent deriving a report-entailed numerator dataset converges to the longest time spent calculating any single report-entailed numerator graph.
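Because each report-entailed numerator graph depends only on one patient graph, the whole dataset can be derived with an embarrassingly parallel map. The following sketch uses a thread pool in place of a cluster of N reporting expert systems, and a trivial stand-in for the derivation itself; graph names and predicates are invented:

```python
from concurrent.futures import ThreadPoolExecutor

def derive_numerator_graph(item):
    """Stand-in for one reporting expert system R applied to a single
    patient graph: returns the graph plus its 'entailed' statements."""
    name, graph = item
    entailed = graph | {(s, "derivedFor", "report") for (s, p, o) in graph}
    return name, entailed

# A hypothetical denominator dataset of M = 100 named patient graphs.
denominator_dataset = {
    f"patient:{i}": {(f"patient:{i}", "hasProcedure", "CABG")}
    for i in range(100)
}

# N workers over M graphs; no inter-worker communication is required,
# which is exactly the property that makes the problem partitionable.
with ThreadPoolExecutor(max_workers=8) as pool:
    numerator_dataset = dict(pool.map(derive_numerator_graph,
                                      denominator_dataset.items()))
```

With as many workers as graphs, total time collapses to the cost of the slowest single derivation, as the paragraph above observes.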

2.7. Implementation

The reporting expert system comprises an entailment pipeline with two main parts: the first uses a fast, in-memory production rule system⁶ to calculate the report-entailed numerator graph, and the second performs more expressive variable derivations via procedural post-processing of answers to SPARQL queries dispatched against the entailed graph using a native programming language (Python).

The production rule system uses OWL 2 RL axioms to build an efficient Rete⁷ decision network. In his Ph.D. dissertation, Doorenbos contributed [13] methods for scaling up the number of rules in production rule systems.

An important feature of Rete is that of state-saving [13]. After each change to the working memory, the state (results) of the matching process is saved in the memory nodes. After subsequent changes to the working memory, many or most of the results are usually unchanged, so Rete avoids a great deal of re-computation by keeping these results around in between successive changes to the working memory. In this way, a system that uses such a decision network can support incremental inference: new RDF statements can be added as a result of derivations involving a more expressive logic, and the entailments that follow can be calculated via a path through the network that deals only with conditions involving the newly added statements.

⁶ A production rule system is a computer program, typically used to provide some form of artificial intelligence, which consists primarily of a set of rules about behavior.

⁷ The term Rete refers to the class of production rule reasoning algorithms that sacrifice memory for computational expedience. It is usually pronounced either REET or REE-tee, from the Latin word for network.

This feature is important to the integration of the second part of the reporting expert system, since new statements can be incrementally added to the decision network, updating the working memory. These updates can incrementally trigger subsequent entailments, maintaining the semantics of the system.
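This incremental behavior can be sketched in miniature. The class names and the single subclass rule below are hypothetical stand-ins for the OWL 2 RL ruleset, and the propagation logic only mimics the observable effect of state-saving (adding one statement triggers just the entailments that follow from it), not Rete's join networks:

```python
# Working memory of RDF-style (subject, predicate, object) triples.
WM = set()

# A hypothetical subclass table standing in for OWL 2 RL axioms
# (e.g., rdfs:subClassOf entailment).
SUBCLASS = {
    "DeathEvent": "MedicalEvent",
    "PostOpInHospEvent": "MedicalEvent",
}

def add(triple):
    """Add a triple and propagate only the entailments it triggers,
    leaving previously computed results untouched."""
    if triple in WM:
        return
    WM.add(triple)
    s, p, o = triple
    if p == "rdf:type" and o in SUBCLASS:     # match against the new fact only
        add((s, "rdf:type", SUBCLASS[o]))     # recursive, incremental firing

add(("event1", "rdf:type", "DeathEvent"))
print(("event1", "rdf:type", "MedicalEvent") in WM)  # True
```

Statements derived procedurally by the second part of the pipeline can be fed back through `add` in the same way, without recomputing earlier matches.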

The reporting expert system uses an open source semantic web logical reasoning system⁸ written in Python, which leverages these methods for efficiency. OWL 2 RL is an OWL 2 profile (a trimmed-down version of OWL 2 that trades some expressive power for efficiency of reasoning).

RIF Core is a recent W3C standard that corresponds to the language of definite Horn rules without function symbols (often called Datalog) and has a standard first-order semantics⁹. RIF Core is primarily used for those entailments involving semantics that are either not expressible in OWL or more intuitively expressed in RIF Core (such as the combination of property chain axioms with binary, partial orders).

Figure 2 shows an example of report-entailment beginning with two definitions in OWL 2 of an aortic aneurysm repair and a thoracic aorta procedure. Below this is an example in Turtle of RDF statements from a patient record as recorded in the registry and after the report-entailed classifications have been added to the graph. Note the use of General Concept Inclusion (GCI) axioms to define the report classes, which will be justified later.

Derivations that involve logic even more expressive than either OWL 2 RL or RIF Core are performed programmatically via post-processing of SPARQL query results against the report-entailed dataset (see steps 4 and 5 in Figure 1).

Once the report-entailed dataset is loaded into an RDF triple store, the use of Cyc to orchestrate the answering of top-level queries through the dispatch of SPARQL queries against this triple store is where the virtual or top-down approach is used in the application. Logical specifications of report variables that reference terms in the triple store (described in the OWL ontology) are added to Cyc. Then, starting with these specifications, Cyc resolves them against the references and bottoms out in the evaluation of a number of SPARQL queries whose answers are (in many cases) subject to additional inference and post-processing before being returned as top-level answers.

⁸ http://code.google.com/p/fuxi/
⁹ http://www.w3.org/TR/rif-core/

Fig. 2. Sample report-entailment

3. Case study

3.1. Overview

SemanticDB™ is a patient registry implemented on a content repository in the Cleveland Clinic's Heart and Vascular Institute¹⁰. It is used to manage a cohort of patients who all have a qualifying procedure in common, for the purpose of outcomes research. The Agency for Healthcare Research and Quality (AHRQ) defines a patient registry as

an organized system that uses observational study methods to collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by a particular disease, condition, or exposure, and that serves one or more predetermined scientific, clinical, or policy purposes.

¹⁰ http://www.w3.org/2001/sw/sweo/public/UseCases/ClevelandClinic/

Once a qualifying procedure has been identified for addition into the population, content from the clinic-wide EHR for that patient is transcribed (from possibly unstructured sections) into discrete data in the registry.

The reporting system described above was recently certified for data submissions to the Society of Thoracic Surgeons' (STS) Adult Cardiac Surgery National Database. The STS is a not-for-profit organization representing almost 6,000 surgeons, researchers, and allied health professionals worldwide who are dedicated to ensuring the best possible outcomes for surgeries of the heart, lung, and esophagus, as well as other surgical procedures within the chest. The reporting system is also used for internal quality reports within the Heart and Vascular Institute.

The reporting expert system is initially given a set of RDF documents comprised of patient records from the SemanticDB registry that have operations within the time period of the target report (a rolling one-year period). These patient record graphs are augmented with structured data loaded from other primary sources of data within the clinic, such as the echocardiography and laboratory system databases. Data from these sources are mapped into the local vocabulary used for patient record content.

The reporting semantics are captured in two OWL documents and a third document with 3 rules (step 1 in Figure 1). The first OWL document is a formal axiomatization of the core set of variables defined in the data specification document¹¹ for the STS Adult Cardiac Surgery Database v2.61. It comprises 236 classes and 16 object properties. The second OWL document is a taxonomy of surgical procedure components comprising 285 classes.

The patient records in the registry conform to an existing schema captured in an OWL document. The instances of these (primitive) terms in the schema are the recorded clinical facts from which quality measures are derived. The quality measure classes in the STS ontology are defined using these primitive terms.

Figure 3 is another example of this. The DeathEvent, PostOpPreHospitalDischarge, PostOpWithin24Hours, and IntraOp classes are all registry primitives. PostOpInHospEvent is an intermediate class whose instances are those events that follow a qualifying operation (see section 3.3) within the same hospital stay. PostOpHospitalDeath is an STS class that corresponds to the outcome measure of those patients who died after a qualifying operation during the hospital stay (i.e., prior to being discharged).

¹¹ http://www.sts.org/sections/stsnationaldatabase/datamanagers

Fig. 3. PostOpHospitalDeath logic

A General Concept Inclusion (GCI) axiom is used here to avoid defining the STS variable in terms of necessary and sufficient criteria (since it could be defined in terms of the primitives of another patient registry with a different schema altogether). So, as an alternative, any PostOpInHospEvent that is a DeathEvent and where the circumstances are that the death occurred after an operation but prior to discharge, within 24 hours of an operation, or during an operation is necessarily a PostOpHospitalDeath.

The report-entailed numerator dataset is derived on an SGI Altix 350 with 8 symmetric multiprocessors (Intel Itanium processors) and 92 GB of physical memory. A software module was written that maintains a work queue with 8 slots, where each slot corresponds to a process running the reporting expert system locally. Initially, the OWL documents are converted into RIF and merged with the three rules, and the resulting ruleset is used to construct an optimal Rete decision network that is repeatedly fed RDF graphs (step 2 in Figure 1). After a report-entailed numerator graph has been computed, the network is reset, emptying all the memory nodes and relinquishing the memory in the process. The queue is maintained for maximal utilization by continually feeding patient graphs into the 8 reporting expert systems.

During the most recent report at the time of this writing, the average time spent calculating the positive entailments (step 3 in Figure 1) was 86 seconds. The average time spent identifying qualifying operations via procedural post-processing of SPARQL queries over each partially entailed numerator graph was 5 seconds. Finally, the average total time spent procedurally deriving RDF statements via post-processing of SPARQL queries over each partially entailed numerator graph was 16 seconds.

The report-entailed numerator dataset is then loaded into a triple store (step 6 in Figure 1), and a Cyc report generator is used to derive the final report through a series of SPARQL queries managed via the SKSI module against the report-entailed triple store.

Fig. 4. Sample reporting rules

3.2. Representational challenges: rules

As described previously, there were a number of axioms that were more easily expressed as rules than in OWL. The second rule in Figure 3 is an example that derives the hasHospitalization predicate. This predicate holds between a medical event in the patient record and the subsuming, parent event that is the entire hospital episode (from admission to discharge).
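The effect of that rule can be sketched procedurally. The record layout below is hypothetical; the point is that the derivation hinges on a temporal comparison (the external predicate) that OWL alone cannot express:

```python
from datetime import date

# Hypothetical hospitalization episodes and medical events for one patient.
hospitalizations = [
    ("hosp1", date(2009, 3, 1), date(2009, 3, 14)),   # (id, admit, discharge)
]
events = [
    ("op1", date(2009, 3, 2)),       # operation during hosp1
    ("visit1", date(2009, 6, 20)),   # outpatient visit outside any episode
]

# Derive hasHospitalization: link each event to the episode whose
# admit-to-discharge interval contains it. The date comparison below
# plays the role of the rule's external predicate (built-in).
has_hospitalization = {
    (ev, h)
    for ev, when in events
    for h, admit, discharge in hospitalizations
    if admit <= when <= discharge
}

print(has_hospitalization)  # {('op1', 'hosp1')}
```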

The use of an external predicate (or built-in) in this fashion precludes the possibility of using OWL to capture this derivation logic.

3.3. Representational challenges: duration reasoning

Some of the logic for the terms that summarize recorded information in the patient graph involves classes whose definitions cannot be completely specified declaratively with either of the semantic web knowledge representation standards, and so their instances must be derived procedurally and introduced into the report-entailed graph. An example of this is the set of qualifying operations that comprise the quality measure numerator. In the case of STS submissions, these are the first completed operations performed (primarily) by a particular set of staff physicians that have more than just the insertion or removal of an Extracorporeal Membrane Oxygenator (ECMO) or a counter-pulsation device that provides temporary cardiac assistance (a balloon pump) as a component.

Often, simple temporal ordering within a hospital episode, such as that used in the logic for identifying qualifying operations, can be answered by a single SPARQL query that uses the OPTIONAL/FILTER/!BOUND combination to exclude the possibility of a later event that matches the criteria. However, this will not work if the criteria involve a certain proximity to a reference point within the episode.
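The exclusion idiom can be mimicked in plain Python (a hypothetical in-memory stand-in for the SPARQL pattern, with invented operation records): an event qualifies when no later matching event can be bound.

```python
from datetime import date

# Hypothetical completed operations within one hospital episode.
operations = [
    ("op1", date(2009, 3, 2)),
    ("op2", date(2009, 3, 5)),
    ("op3", date(2009, 3, 9)),
]

def unfollowed(ops):
    """Select each operation for which no later matching operation
    exists, the same exclusion that OPTIONAL/FILTER/!BOUND encodes."""
    return [op for op, when in ops
            if not any(other > when for _, other in ops)]

print(unfollowed(operations))  # ['op3']
```

Criteria that instead depend on proximity to a reference point within the episode need more than this single exclusion step.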

The CRenFail variable is an example of this:

Indicate whether the patient had acute or worsening renal failure resulting in one or more of the following:

– Increase of serum creatinine to > 2.0, and 2x the most recent preoperative creatinine level.

– A new requirement for dialysis postoperatively.

In the current implementation, the closest creatinine laboratory measure within 6 months is considered to be the most recent preoperative creatinine level. The combination of the requirement that the serum creatinine level be twice that of the most recent preoperative level, along with the semantics of the term preoperative, was the primary reason that the instances of this class are computed by procedural post-processing of a SPARQL query against a (partially) report-entailed graph that includes instantiations of the hasHospitalization predicate, in order to reason about temporal relationships within a hospital episode.
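A sketch of such a procedural derivation under simplifying assumptions (the function name, data layout, and the 183-day approximation of the 6-month window are illustrative, not the deployed code):

```python
from datetime import date, timedelta

SIX_MONTHS = timedelta(days=183)   # rough stand-in for the 6-month window

def c_ren_fail(op_date, creatinine_labs, postop_dialysis=False):
    """Hypothetical CRenFail check: postoperative serum creatinine > 2.0
    and at least twice the most recent preoperative level (the closest
    lab within 6 months before the operation), or a new postoperative
    requirement for dialysis."""
    if postop_dialysis:
        return True
    pre = [(when, val) for when, val in creatinine_labs
           if op_date - SIX_MONTHS <= when < op_date]
    post = [val for when, val in creatinine_labs if when > op_date]
    if not pre or not post:
        return False
    _, baseline = max(pre)           # latest (closest) preoperative lab
    return any(val > 2.0 and val >= 2 * baseline for val in post)

labs = [(date(2009, 1, 10), 1.0),   # preoperative baseline
        (date(2009, 3, 4), 2.4)]    # postoperative rise
print(c_ren_fail(date(2009, 3, 2), labs))  # True
```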

4. Concluding thoughts

Beyond the issues and workarounds discussed regarding expressiveness constraints, there were other challenges in practice. Since the current reporting expert system relies primarily on a bottom-up inference strategy, there is a tradeoff of space for query complexity (at least for the terms that can be expressed in the semantic web knowledge representation languages). By deriving summary quality measures a priori, the queries needed to retrieve them are less complex than they would be in a top-down inference strategy that relied primarily on query-time inference and post-processing of answers. However, as a result, the report-entailed numerator dataset can be significantly large for a reasonably sized denominator. The most recent report-entailed numerator dataset comprised 6.52 million RDF statements in a cohort of about 4,000 patients. Prior to deriving quality measure terms, the dataset comprised 5.95 million RDF statements.

Initially, a predicate called startsAfterStartingOf was introduced via a rule-based definition (the first rule in Figure 3). It held between two medical events within the same hospital episode where the first began after the beginning of the second. As the number of blood test events incorporated from the laboratory system database increased, the instances of this predicate became a significant bottleneck in the bottom-up derivation of report-entailed numerator graphs. In some cases, over 47,000 such predicates were being added. The problem was resolved by moving the definition of this predicate into the Cyc ontology, which eliminated the bottleneck without significantly slowing down the SKSI-generated SPARQL queries that now incorporate this constraint.

As a result of the limited size of the computing grid, a large amount of time is spent calculating the report-entailed numerator dataset for each report. On average, this step takes about 6 hours to complete. Typically, data warehouses that leverage an integrated approach to views include a mediator with an incremental update processor (IUP) as a component, which propagates incremental updates into the materialized views [20]. From one day to the next, only a small fraction of the patient records in SemanticDB are modified or added, so there is an opportunity to update the report-entailed numerator dataset incrementally that the current application does not take advantage of.

The questions of whether the derivation of the numerator dataset can be done incrementally between data submissions, whether a different reasoning strategy would be more appropriate, and what the impact of expanding the capacity of the computing cluster would be remain open and are beyond the scope of this paper.

This paper discusses how a profile of OWL with expressive restrictions is being used as the colloquial, declarative knowledge representation for capturing the semantics of summary quality measures and their component variables. It also discusses the limitations of leveraging OWL in this way and how some of them have been overcome through the use of the emerging rule interchange language of the semantic web. These standard languages are used by the reporting expert system as the atomic component of a distributed computing grid that derives report-entailed datasets for reporting quality measures from EHRs. Since its certification by the STS in early 2009, the system described here has been used to produce 3 data submissions to the STS and about 9 internal staff reports, demonstrating the ability to use EHR data in a meaningful way.


References

[1] American College of Physicians, EHR-Based Quality Measurement and Reporting: Critical for Meaningful Use and Health Care Improvement, American College of Physicians' Newsroom (2010).

[2] A. Atreja, J.P. Achkar, A.K. Jain, C.M. Harris, and B.A. Lashner, Using Technology to Promote Gastrointestinal Outcomes Research: A Case for Electronic Health Records, The American Journal of Gastroenterology 103 (2008), 2171–2178.

[3] F. Bancilhon, D. Maier, Y. Sagiv, and J.D. Ullman, Magic sets and other strange ways to implement logic programs (extended abstract), Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems (1985), 1–15.

[4] E. Bayegan, Ø. Nytrø, and A. Grimsmo, Ontologies for knowledge representation in a computer-based patient record, Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence (2002), 114–121.

[5] E. Bayegan, Knowledge Representation for Relevance Ranking of Patient-Record Content in Primary-Care Situations, Ph.D. Dissertation, Norwegian University of Science and Technology, 2002.

[6] C. Beeri and R. Ramakrishnan, On the power of magic, The Journal of Logic Programming 10 (1991), 255–299.

[7] D. Blumenthal, Launching HITECH, The New England Journal of Medicine 362 (2010), 382–385.

[8] J.M. Corrigan, M.S. Donaldson, L.T. Kohn, S.K. Maguire, and K.C. Pike, Crossing the Quality Chasm: A New Health System for the 21st Century, Washington, DC: The Institute of Medicine (2001).

[9] L.W. D'Avolio and A.A.T. Bui, The Clinical Outcomes Assessment Toolkit: A Framework to Support Automated Clinical Records-based Outcomes Assessment and Performance Measurement Research, Journal of the American Medical Informatics Association 15 (2008), 333–340.

[10] S. De Lusignan and C. Van Weel, The use of routinely collected computer data for research in primary care: opportunities and challenges, Family Practice 23 (2006), 253–263.

[11] CMS EHR Incentive Program Final Rule, July 3rd, Department of Health and Human Services.

[12] R.S. Dick and E.B. Steen, The Computer-Based Patient Record: An Essential Technology for Health Care, National Academy Press, 1991.

[13] R.B. Doorenbos, Production Matching for Large Learning Systems, Ph.D. Dissertation, Carnegie Mellon University, Pittsburgh, 2001.

[14] J. Doyle and R. Patil, Two theses of knowledge representation, Artificial Intelligence 48 (1991), 261–297.

[15] J.B. Fowles, E.A. Kind, S. Awwad, J.P. Weiner, and K.S. Chan, Performance measures using electronic health records: five case studies, New York: The Commonwealth Fund (2008).

[16] B. Glimm and C. Ogbuji, eds., SPARQL 1.1 Entailment Regimes, W3C Working Draft (work in progress), June 2010. Latest version available at http://www.w3.org/TR/sparql11-entailment/

[17] J. Hastings, M. Dumontier, D. Hull, M. Horridge, C. Steinbeck, U. Sattler, R. Stevens, T. Hörne, and K. Britz, Representing Chemicals using OWL, Description Graphs and Rules, Proceedings of the 7th International Workshop on OWL: Experiences and Directions (2010).

[18] B. Hazlehurst, M.A. McBurnie, R. Mularski, J. Puro, and S. Chauvie, Automating Quality Measurement: A System for Scalable, Comprehensive, and Routine Care Quality Assessment, AMIA Annual Symposium Proceedings (2009), 229–233.

[19] W. Hersh, A stimulus to define informatics and health information technology, BMC Medical Informatics and Decision Making 9 (2009), 24.

[20] R. Hull and G. Zhou, A Framework for Supporting Data Integration Using the Materialized and Virtual Approaches, Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data (1996), 481–492.

[21] A.K. Jha, The impact of public reporting of quality of care: Two decades of U.S. experience, paper prepared for the workshop of the Committee on Value-Added Methodology for Instructional Improvement, Program Evaluation, and Educational Accountability, National Research Council (2008).

[22] E. Jung, Q. Li, A. Mangalampalli, J. Greim, M.S. Eskin, D. Housman, J. Isikoff, A.H. Abend, B. Middleton, and J.S. Einbinder, Report Central: quality reporting tool in an electronic health record, AMIA Annual Symposium Proceedings (2006), 971.

[23] B. Kadry, I.C. Sanderson, and A. Macario, Challenges that limit meaningful use of health information technology, Current Opinion in Anesthesiology 23 (2010), 184–192.

[24] D.B. Lenat, CYC: A large-scale investment in knowledge infrastructure, Communications of the ACM 38 (1995), 33–38.

[25] J. Lloyd, Practical Advantages of Declarative Programming, Proceedings of the Joint Conference on Declarative Programming (1994).

[26] C. Ogbuji, Clinical Data Acquisition, Storage and Management, in: Encyclopedia of Database Systems, L. Liu and M.T. Özsu, eds., Springer US, 2009, pp. 344–348.

[27] A. Rector, Knowledge driven software and "fractal tailoring": Ontologies in development environments for clinical systems, Formal Ontology in Information Systems (FOIS 2010) (2010), 17–30.

[28] A. Ruttenberg, T. Clark, W. Bug, M. Samwald, O. Bodenreider, H. Chen, D. Doherty, K. Forsberg, Y. Gao, V. Kashyap, J. Kinoshita, J. Luciano, M.S. Marshall, C. Ogbuji, J. Rees, S. Stephens, G.T. Wong, E. Wu, D. Zaccagnini, T. Hongsermeier, E. Neumann, I. Herman, and K. Cheung, Advancing translational research with the Semantic Web, BMC Bioinformatics 8 (2007), S2.

[29] T.D. Sequist, T. Cullen, and J.Z. Ayanian, Information technology as a tool to improve the quality of American Indian health care, American Journal of Public Health 95 (2005), 2173–2179.

[30] N. Siegel, K. Goolsbey, R. Kahlert, and G. Matthews, The Cyc® System: Notes on Architecture, Cycorp, Inc. (2004).

[31] B. Smith and W. Ceusters, HL7 RIM: An incoherent standard, Studies in Health Technology and Informatics 124 (2006), 133–138.

[32] P.C. Tang, M. Ralston, M. Arrigotti, L. Qureshi, and J. Graham, Comparison of Methodologies for Calculating Quality Measures Based on Administrative Data versus Clinical Data from an Electronic Health Record System: Implications for Performance Measures, Journal of the American Medical Informatics Association 14 (2007), 10–15.

[33] J. Urbani, E. Oren, and F. van Harmelen, RDFS/OWL reasoning using the MapReduce framework, Master's thesis, Vrije Universiteit, Faculty of Sciences, 2009.


[34] J. Weaver and J. Hendler, Parallel Materialization of the Finite RDFS Closure for Hundreds of Millions of Triples, The Semantic Web – ISWC 2009 (2009), 682–697.