

Proceedings of Machine Learning for Healthcare 2016 JMLR W&C Track Volume 56

Identifiable Phenotyping using Constrained Non–Negative Matrix Factorization

Shalmali Joshi [email protected]
Electrical and Computer Engineering
The University of Texas at Austin
Austin, TX, USA

Suriya Gunasekar [email protected]
Electrical and Computer Engineering
The University of Texas at Austin
Austin, TX, USA

David Sontag [email protected]
Computer Science
New York University
NYC, NY, USA

Joydeep Ghosh [email protected]
Electrical and Computer Engineering
The University of Texas at Austin
Austin, TX, USA

Abstract

This work proposes a new algorithm for automated and simultaneous phenotyping of multiple co–occurring medical conditions, also referred to as comorbidities, using clinical notes from electronic health records (EHRs). A latent factor estimation technique, non-negative matrix factorization (NMF), is augmented with domain constraints from weak supervision to obtain sparse latent factors that are grounded to a fixed set of chronic conditions. The proposed grounding mechanism ensures a one-to-one identifiable and interpretable mapping between the latent factors and the target comorbidities. Qualitative assessment of the empirical results by clinical experts shows that the proposed model learns clinically interpretable phenotypes, which are also shown to have competitive performance on a 30-day mortality prediction task. The proposed method can be readily adapted to any non-negative EHR data across various healthcare institutions.

1. Introduction

Reliably querying for patients with specific medical conditions across multiple organizations facilitates many large scale healthcare applications such as cohort selection, multi-site clinical trials, epidemiology studies, etc. (Richesson et al., 2013; Hripcsak and Albers, 2013; Pathak et al., 2013). However, raw EHR data collected across diverse populations and multiple care-givers can be extremely high dimensional, unstructured, heterogeneous, and noisy. Manually querying such data is a formidable challenge for healthcare professionals.

© 2016.


EHR driven phenotypes are concise representations of medical concepts composed of clinical features, conditions, and other observable traits facilitating accurate querying of individuals from EHRs (NIH Health Care Systems Research Collaboratory, 2014). Efforts like eMerge Network¹ and PheKB² are well known examples of EHR driven phenotyping. Traditionally used rule–based composing methods for phenotyping require substantial time and expert knowledge and have little scope for exploratory analyses. This motivates automated EHR driven phenotyping using machine learning with limited expert intervention.

We propose a weakly supervised model for jointly phenotyping 30 co–occurring conditions (comorbidities) observed in intensive care unit (ICU) patients. Comorbidities are a set of co-occurring conditions in a patient at the time of admission that are not directly related to the primary diagnosis for hospitalization (Elixhauser et al., 1998). Phenotypes for the 30 comorbidities listed in Table 1 are derived using text-based features from clinical notes in the publicly accessible MIMIC-III EHR database (Saeed et al., 2011). We present a novel constrained non–negative matrix factorization (CNMF) for the EHR matrix that aligns the factors with target comorbidities, yielding sparse, interpretable, and identifiable phenotypes.

The following aspects of our model distinguish our work from prior efforts:

1. Identifiability: A key shortcoming of standard unsupervised latent factor models such as NMF (Lee and Seung, 2001) and Latent Dirichlet Allocation (LDA) (Blei et al., 2003) for phenotyping is that the estimated latent factors are interchangeable and unidentifiable as phenotypes for specific conditions of interest. We tackle identifiability by incorporating weak (noisy) but inexpensive supervision as constraints in our framework. Specifically, we obtain weak supervision for the target conditions in Table 1 using the Elixhauser Comorbidity Index (ECI) (Elixhauser et al., 1998) computed solely from patient administrative data (without human intervention). We then ground the latent factors to have a one-to-one mapping with conditions of interest by incorporating the comorbidities predicted by ECI as support constraints on the patient loadings along the latent factors.

2. Simultaneous modeling of comorbidities: ICU patients studied in this paper are frequently afflicted with multiple co–occurring conditions besides the primary cause for admission. In the proposed NMF model, phenotypes for such co–occurring conditions are jointly modeled to capture the resulting correlations.

3. Interpretability: For wider applicability of EHR driven phenotyping for advanced clinical decision making, it is desirable that these phenotype definitions be clinically interpretable and represented as a concise set of rules. We consider the sparsity in the representations as a proxy for interpretability and explicitly encourage conciseness of phenotypes using tuneable sparsity–inducing soft constraints.

We evaluate the effectiveness of the proposed method towards interpretability, clinical relevance, and prediction performance on EHR data from MIMIC-III. Although we focus on ICU patients using clinical notes, the proposed model and algorithm are general and can be applied to any non-negative EHR data from any population group.

1. http://emerge.mc.vanderbilt.edu/
2. http://phekb.org/


Table 1: Target comorbidities

| Congestive Heart Failure | Cardiac Arrhythmias | Valvular Disease | Pulmonary Circulation Disorder | Peripheral Vascular Disorder |
| Hypertension | Paralysis | Other Neurological Disorders | Chronic Pulmonary Diseases | Diabetes Uncomplicated |
| Diabetes Complicated | Hypothyroidism | Renal Failure | Liver Disease | Peptic Ulcer (excluding bleeding) |
| AIDS | Lymphoma | Metastatic Cancer | Solid Tumor (without metastasis) | Rheumatoid Arthritis |
| Coagulopathy | Obesity | Weight loss | Fluid Electrolyte Disorder | Blood Loss Anemia |
| Deficiency Anemia | Alcohol abuse | Drug abuse | Psychoses | Depression |

2. Data Extraction

The MIMIC-III dataset consists of de-identified EHRs for ∼38,000 adult ICU patients at the Beth Israel Deaconess Medical Center, Boston, Massachusetts from 2001–2012. For all ICU stays within each admission, clinical notes including nursing progress reports, physician notes, discharge summaries, ECG, etc. are available. We analyze patients who have stayed in the ICU for at least 48 hours (∼17,000 patients). We derive phenotypes using clinical notes collected within the first 48 hours of patients’ ICU stay to evaluate the quality of phenotypes when limited patient data is available. Further, we evaluate the phenotypes on a 30-day mortality prediction problem. To avoid obvious indicators of mortality and comorbidities, apart from restricting to 48 hour data, we exclude discharge summaries as they explicitly mention patient outcomes (including mortality).

1. Clinically relevant bag-of-words features: Aggregated clinical notes from all sources are represented as a single bag-of-words feature vector. To enhance clinical relevance, we create a custom vocabulary containing clinical terms from two sources: (a) the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT), and (b) the level-0 terms provided by the Unified Medical Language System (UMLS), consolidated into a standard vocabulary format using Metamorphosys, an application provided by UMLS for custom vocabulary creation.³ To extract clinical terms from the raw text, the notes were tagged for chunking using a conditional random field tagger⁴. The tags are looked up against the custom vocabulary (generated from Metamorphosys) to obtain the bag-of-words representation. Our final vocabulary has ∼3600 clinical terms.

2. Computable weak diagnosis: We incorporate domain constraints from weak supervision to ground the latent factors to have a one-to-one mapping with the conditions of interest. In the model described in Section 3, this is enforced by constraining the non-zero entries of the patient loadings along the latent factors using a weak diagnosis for comorbidities. The weak diagnoses of target comorbidities in Table 1 are obtained using ECI⁵, computed solely from patient administrative data without human annotation. We refer to this index as a weak diagnosis because it is not a physician’s exact diagnosis and is subject to noise and misspecification. Note that ECI ignores diagnosis codes related to the primary diagnosis at admission. Thus, ECI models the presence and absence of conditions other than the primary reason for admission (comorbidities). The phenotype candidates from the proposed model can be considered as concise representations of such comorbidities.

3. See https://www.nlm.nih.gov/healthit/snomedct/ and https://www.nlm.nih.gov/research/umls/

4. https://taku910.github.io/crfpp/

5. https://git.io/v6e7q


| Notation | Description |
|---|---|
| $[m]$ for integer $m$ | Set of indices $[m] = \{1, 2, \ldots, m\}$. |
| $\Delta^{d-1}$ | Simplex in dimension $d$, $\Delta^{d-1} = \{x \in \mathbb{R}^d_+ : \sum_i x_i = 1\}$. |
| $x^{(j)}$ | Column $j$ of a matrix $X$. |
| $\operatorname{supp}(x)$ | Support of a vector $x$, $\operatorname{supp}(x) = \{i : x_i \neq 0\}$. |
| Observations | |
| $N$, $d$ | Number of patients (∼17000) and features (∼3600), respectively. |
| $X \in \mathbb{R}^{d \times N}_+$ | EHR matrix from MIMIC III: clinically relevant bag-of-words features from notes in first 48 hours of ICU stay for $N$ patients. |
| $k = 1, 2, \ldots, K$ | Indices for $K = 30$ comorbidities in Table 1. |
| $C_j \subseteq [K]$ for $j \in [N]$ | Set of comorbidities patient $j$ is diagnosed with using ECI. |
| Factor matrices | |
| $W \in [0, 1]^{K \times N}$ | Estimate of patients’ risk for the $K$ conditions. |
| $A \in \mathbb{R}^{d \times K}_+$, $b \in \mathbb{R}^d_+$ | Estimate of phenotype factor matrix and feature bias vector. |

Table 2: Notation used in the paper

3. Identifiable High–Throughput Phenotyping

The notation used in the paper is enumerated in Table 2. In summary, for each patient $j \in [N]$, (a) the bag-of-words features from clinical notes are represented as column $x^{(j)}$ of the EHR matrix $X \in \mathbb{R}^{d \times N}_+$, and (b) the list of comorbidities diagnosed using ECI is denoted as $C_j \subseteq [K]$. Let an unknown $W^* \in [0, 1]^{K \times N}$ represent the risk of $N$ patients for $K$ comorbidities of interest; each entry $w^*_{kj}$ lies in the interval $[0, 1]$, with 0 and 1 indicating no-risk and maximum-risk, respectively, of patient $j$ being afflicted with condition $k$. If $C^*_j \subseteq [K]$ denotes an accurate diagnosis for patient $j$, then $w^{*(j)}$ satisfies $\operatorname{supp}(w^{*(j)}) \subseteq C^*_j$.

Definition 1 (EHR driven phenotype) EHR driven phenotypes for $K$ co–occurring conditions are a set of vectors $\{a^{*(k)} \in \mathbb{R}^d_+ : k \in [K]\}$, such that for a patient $j$ afflicted with conditions $C^*_j \subseteq [K]$,

$$\mathbb{E}\left[x^{(j)} \mid w^{*(j)}\right] = \sum_{k \in C^*_j} w^*_{kj}\, a^{*(k)} + b^*, \tag{1}$$

where $b^*$ is a bias representing the feature component observed independent of the $K$ target conditions. $A^* \in \mathbb{R}^{d \times K}$ with $a^{*(k)}$ as columns is referred to as the phenotype factor matrix.

Note that we explicitly model a feature bias $b^*$ to capture frequently occurring terms that are not discriminative of the target conditions, e.g., temperature, pain, etc.

Cost Function The bag-of-words features are represented as counts in the EHR matrix $X$. We consider a factorized approximation of $X$ parametrized by matrices $A \in \mathbb{R}^{d \times K}_+$, $W \in \mathbb{R}^{K \times N}_+$ and $b \in \mathbb{R}^d_+$ as $Y = AW + b\mathbf{1}^\top$, where $\mathbf{1}$ denotes a vector of all ones of appropriate dimension. The approximation error of the estimate is measured using the I–divergence defined as follows:

$$D(X, Y) = \sum_{ij} \left( y_{ij} - x_{ij} - x_{ij} \log \frac{y_{ij}}{x_{ij}} \right). \tag{2}$$

Minimizing the I–divergence is equivalent to maximum likelihood estimation under a Poisson distributional assumption on individual entries of the EHR matrix parameterized by $Y = AW + b\mathbf{1}^\top$ (Banerjee et al., 2005).
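As a small numerical illustration of the cost in (2), the sketch below computes the I–divergence with NumPy and checks on synthetic counts that it differs from the Poisson negative log-likelihood only by a term that does not depend on Y. The function names, the synthetic data and the `eps` guard are assumptions of this sketch, not part of the original method; the convention 0 log 0 = 0 is used.

```python
import numpy as np

def i_divergence(X, Y, eps=1e-12):
    """I-divergence D(X, Y) = sum(y - x - x*log(y/x)), with 0*log(0) taken as 0."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # -x*log(y/x) = x*log(x/y); entries with x == 0 contribute nothing here
    log_term = np.where(X > 0, X * np.log((X + eps) / (Y + eps)), 0.0)
    return float(np.sum(Y - X + log_term))

def poisson_nll(X, Y, eps=1e-12):
    """Poisson negative log-likelihood of counts X under means Y, without the log(x!) constant."""
    return float(np.sum(Y - X * np.log(Y + eps)))

# Synthetic example (hypothetical sizes): Y = A W + b 1^T with random non-negative factors
rng = np.random.default_rng(0)
A, W, b = rng.random((6, 3)), rng.random((3, 5)), rng.random(6)
Y1 = A @ W + b[:, None]
X = rng.poisson(Y1).astype(float)
Y2 = Y1 + 0.5

# The two objectives change by the same amount when Y changes
gap_div = i_divergence(X, Y1) - i_divergence(X, Y2)
gap_nll = poisson_nll(X, Y1) - poisson_nll(X, Y2)
```

Because the two gaps agree, any minimizer over Y of one objective is a minimizer of the other.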


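Algorithm 1 Phenotyping using constrained NMF.
Input: $X$, $\{C_j : j \in [N]\}$ and parameter $\lambda$. Initialization: $A^{(0)}$, $b^{(0)}$.

while not converged do
    $W^{(t)} \leftarrow \operatorname*{argmin}_{W} D(X, A^{(t-1)} W + b^{(t-1)} \mathbf{1}^\top)$ s.t. $W \in [0, 1]^{K \times N}$, $\operatorname{supp}(w^{(j)}) = C_j \;\forall j$
    $A^{(t)}, b^{(t)} \leftarrow \operatorname*{argmin}_{A \ge 0,\, b \ge 0} D(X, A W^{(t)} + b \mathbf{1}^\top)$ s.t. $a^{(k)} \in \lambda \Delta^{d-1} \;\forall k$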

Phenotypes For the $K$ comorbidities, columns of $A$, $\{a^{(k)}\}_{k \in [K]}$, are proposed as candidate phenotypes derived from the EHR $X$, i.e. approximations to $\{a^{*(k)}\}_{k \in [K]}$.

Constraints The following constraints are incorporated in learning A and W.

1. Support Constraints: The non-negative rank–K factorization of X is ‘grounded’ to K target comorbidities by constraining the support of risk w(j) corresponding to patient j using weak diagnosis Cj from ECI as an approximation of the conditions in Definition 1.

2. Sparsity Constraints: Scaled simplex constraints are imposed on the columns of A with a tuneable parameter λ > 0 to encourage sparsity of phenotypes. Restricting the patient loadings matrix as W ∈ [0, 1]K×N not only allows the loadings to be interpreted as the patients’ risk, but also makes the simplex constraints effective in a bilinear optimization: if W were unbounded, a smaller scale on the columns of A could be offset by larger entries in W.
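To make the support constraint concrete, the following sketch builds a boolean mask from per-patient condition sets and applies it together with the box constraint on W. The helper names and the toy sets are hypothetical, conditions are 0-indexed, and the sketch enforces the relaxed form supp(w(j)) ⊆ Cj (entries inside the support may still be clipped to zero), which is an assumption of this illustration.

```python
import numpy as np

def support_mask(C, K):
    """Boolean K x N mask with mask[k, j] = True iff condition k is in C[j]."""
    mask = np.zeros((K, len(C)), dtype=bool)
    for j, conditions in enumerate(C):
        idx = np.fromiter(conditions, dtype=int)  # empty sets give an empty index array
        mask[idx, j] = True
    return mask

def project_loadings(W, mask):
    """Clip W to [0, 1] and zero every entry outside the allowed support."""
    return np.where(mask, np.clip(W, 0.0, 1.0), 0.0)

# Hypothetical weak diagnoses for N = 3 patients and K = 4 conditions
C = [{0, 2}, {1}, set()]
mask = support_mask(C, K=4)
W = project_loadings(np.array([[1.5, 0.2, 0.3],
                               [0.4, -0.1, 0.9],
                               [0.7, 0.6, 0.1],
                               [0.2, 0.8, 0.5]]), mask)
```

In this toy case the third patient has no weak diagnoses, so the whole column of W is zero and only the bias b would explain that patient's features.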

Simultaneous phenotyping of comorbidities using constrained NMF is posed as follows:

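$$
\begin{aligned}
A, W, b = \operatorname*{argmin}_{A \ge 0,\; W \ge 0,\; b \ge 0} \quad & D(X, AW + b\mathbf{1}^\top) \\
\text{s.t.} \quad & \operatorname{supp}(w^{(j)}) = C_j \;\; \forall j \in [N], \quad W \in [0, 1]^{K \times N}, \\
& a^{(k)} \in \lambda \Delta^{d-1} \;\; \forall k \in [K].
\end{aligned}
\tag{3}
$$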

The optimization in (3) is convex in either factor with the other factor fixed. It is solved using alternating minimization with projected gradient descent (Parikh and Boyd, 2014; Lin, 2007). The complete procedure is given in Algorithm 1. The proposed model in general can incorporate any weak diagnosis of medical conditions. In this paper, we note that, since we use ECI, the results are not representative of the primary diagnoses at admission.
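A minimal sketch of alternating projected gradient descent for (3) is given below, under several simplifying assumptions that are not taken from the paper: each block takes a single backtracking gradient step per outer iteration rather than solving its subproblem to convergence, A and b are updated jointly through the stacked matrix Z = [A | b], the support constraint is relaxed to supp(w(j)) ⊆ Cj, and a small `EPS` keeps the model strictly positive. All function names, sizes and the synthetic data are hypothetical.

```python
import numpy as np

EPS = 1e-9  # keeps Y strictly positive; an assumption of this sketch

def i_div(X, Y):
    """I-divergence D(X, Y) from (2), with 0*log(0) taken as 0; Y must be positive."""
    m = X > 0
    return float(np.sum(Y - X) + np.sum(X[m] * np.log(X[m] / Y[m])))

def project_scaled_simplex(v, lam):
    """Euclidean projection of v onto {x >= 0, sum(x) = lam} (sort-based method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - lam
    ks = np.arange(1, v.size + 1)
    rho = ks[u - css / ks > 0][-1]
    return np.maximum(v - css[rho - 1] / rho, 0.0)

def pg_step(x, grad, project, objective, step=1.0, shrink=0.5, max_tries=20):
    """One projected gradient step with backtracking; x is kept if no decrease is found."""
    f0 = objective(x)
    for _ in range(max_tries):
        x_new = project(x - step * grad)
        if objective(x_new) <= f0:
            return x_new
        step *= shrink
    return x

def cnmf(X, mask, lam, n_iter=30, seed=0):
    """Alternate projected gradient steps on W and on Z = [A | b]."""
    d, N = X.shape
    K = mask.shape[0]
    ones = np.ones((1, N))

    def model(Z, W):  # Y = A W + b 1^T
        return Z @ np.vstack([W, ones]) + EPS

    def proj_W(W):  # box [0, 1] and zeros outside the weak-diagnosis support
        return np.clip(W, 0.0, 1.0) * mask

    def proj_Z(Z):  # columns of A on the scaled simplex, b >= 0
        A = np.apply_along_axis(project_scaled_simplex, 0, Z[:, :K], lam)
        return np.hstack([A, np.maximum(Z[:, K:], 0.0)])

    rng = np.random.default_rng(seed)
    Z = proj_Z(rng.random((d, K + 1)))
    W = proj_W(np.full((K, N), 0.5))
    history = [i_div(X, model(Z, W))]
    for _ in range(n_iter):
        R = 1.0 - X / model(Z, W)  # derivative of D with respect to Y
        W = pg_step(W, Z[:, :K].T @ R, proj_W, lambda V: i_div(X, model(Z, V)))
        R = 1.0 - X / model(Z, W)
        Z = pg_step(Z, R @ np.vstack([W, ones]).T, proj_Z,
                    lambda V: i_div(X, model(V, W)))
        history.append(i_div(X, model(Z, W)))
    return Z[:, :K], W, Z[:, K], history

# Synthetic counts and a random support mask (hypothetical sizes)
rng = np.random.default_rng(1)
d, N, K = 12, 30, 3
mask = rng.random((K, N)) < 0.5
X = rng.poisson(2.0, size=(d, N)).astype(float)
A, W, b, history = cnmf(X, mask, lam=0.4)
```

The backtracking rule only accepts steps that do not increase the objective, so `history` is non-increasing by construction; this is a convenience of the sketch and says nothing about the convergence behaviour of the authors' implementation.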

4. Empirical Evaluation

The estimated phenotypes are evaluated on various metrics. We denote the model learned using Algorithm 1 with a given parameter λ > 0 as λ–CNMF. The following baselines are used for comparison:

1. Labeled LDA (LLDA): LLDA (Ramage et al., 2009) is the supervised counterpart of LDA, a probabilistic model to estimate topic distribution of a corpus. It assumes that word counts of documents arise from multinomial distributions. It incorporates supervision on topics contained in a document and can be naturally adapted for phenotyping from bag-of-words clinical features, where the topic–word distributions form candidate phenotypes. While LLDA assumes that the topic loadings of a document lie on the probability simplex $\Delta^{K-1}$, λ–CNMF allows each patient–condition loading $w_{kj}$ to lie in [0, 1]. In interpreting the patient loading as a disease risk, the latter allows patients to have varying levels of disease prevalence. Also, LLDA can induce sparsity only indirectly via a hyperparameter β of the informative prior on the topic–word distributions. While this does not guarantee sparse estimates, we obtain reasonable sparsity on LLDA estimates. We use the Gibbs sampling code from MALLET (McCallum, 2002) for inference. For a fair comparison to CNMF, which uses an extra bias factor, we allow LLDA to model an extra topic shared by all documents in the corpus.

2. NMF with support constraints (NMF+support): This NMF model incorporates non–negativity and support constraints from weak supervision but not the sparsity inducing constraints on the phenotype matrix. This allows us to study the effect of sparsity inducing constraints on interpretability. On the other hand, imposing sparsity without our grounding technique does not yield identifiable topics and hence is not studied as a baseline.

3. Multi-label Classification (MLC): This baseline treats weak supervision (from ECI) as accurate labels in a fully supervised model. A sparsity inducing ℓ1-regularized logistic regression classifier is learned for each condition independently. The learned weight vector for each condition k determines the importance of clinical terms towards discriminating patients with condition k and is treated as the candidate phenotype for condition k.

The weak supervision does not account for the primary diagnosis for admission in the ICU population as the ECI ignores primary diagnoses at admission (Elixhauser et al., 1998). However, the learning algorithm can be easily modified to account for the primary diagnoses, if required, by using a modified form of supervision or absorbing the effects in an additional additive term appended to the model. Nevertheless, the proposed model generates highly interpretable phenotypes for comorbidities. Finally, to mitigate the effect of local minima, whenever applicable, the algorithm for each model was run with 5 random initializations and the results providing the lowest divergence were chosen for comparison.

4.1 Interpretability–accuracy trade–off

Sparsity of the latent factors is used as a proxy for interpretability of phenotypes. Sparsity is measured as the median of the number of non–zero entries in columns of the phenotype matrix A (lower is better). The λ parameter in λ–CNMF controls the sparsity by imposing scaled simplex constraints on A. CNMF was trained on multiple λ in the range of 0.1 to 1. Stronger sparsity-inducing constraints result in a worse fit to the cost function. This trade–off is indeed observed in all models (see Appendix A for details). For all models, we pick estimates with lowest median sparsity while ensuring that the phenotype candidate for every condition is represented by at least 5 non-zero clinical terms.
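The sparsity measure and the selection rule described above can be sketched as follows. The function names, the tolerance argument and the toy matrices are assumptions made for illustration; the toy matrices are far smaller than the phenotype matrices in the paper, so the minimum-term threshold is varied in the usage lines rather than fixed at 5.

```python
import numpy as np

def median_sparsity(A, tol=0.0):
    """Return (median, per-column counts) of entries with magnitude above tol."""
    nnz_per_column = np.count_nonzero(np.abs(A) > tol, axis=0)
    return float(np.median(nnz_per_column)), nnz_per_column

def select_model(candidates, min_terms=5):
    """Among {lam: A}, pick the lam with the lowest median sparsity such that
    every column keeps at least min_terms non-zero entries; None if none qualifies."""
    feasible = {}
    for lam, A in candidates.items():
        med, nnz = median_sparsity(A)
        if nnz.min() >= min_terms:
            feasible[lam] = med
    return min(feasible, key=feasible.get) if feasible else None

# Toy phenotype matrices (hypothetical values): 4 terms, 3 conditions
A_sparse = np.array([[0.3, 0.0, 0.1],
                     [0.0, 0.0, 0.2],
                     [0.1, 0.4, 0.1],
                     [0.0, 0.0, 0.0]])
A_dense = np.ones((4, 3))
med, nnz = median_sparsity(A_sparse)
candidates = {0.4: A_sparse, 1.0: A_dense}
```

With these toy inputs, `select_model(candidates, min_terms=1)` prefers the sparser candidate, while `min_terms=2` rules it out because one of its columns keeps a single term.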

4.2 Clinical relevance of phenotypes

We requested two clinicians to evaluate the candidate phenotypes based on the top 15 terms learned by each model. The ratings were requested on a scale of 1 (poor) to 4 (excellent). The experts were asked to rate based on whether the terms are relevant towards the corresponding condition and whether the terms are jointly discriminative of the condition. Figure 1 shows the summary of qualitative ratings obtained for all models. For each model, we show two columns (corresponding to two experts). The stacked bars show the histogram of the ratings for the models. Nearly 50% of the phenotypes learned from our model were rated ‘good’ or better by both annotators. In contrast, NMF with support constraints but without sparsity inducing constraints hardly learns clinically relevant phenotypes. The proposed model 0.4–CNMF also received a significantly higher number of ‘excellent’ and ‘good’ ratings from both experts. Although LLDA and MLC estimate sparse phenotypes, they are not on par with λ–CNMF. Table 3 shows a summary of relative rankings for all models. Each cell entry shows the number of times the model along the corresponding row was rated strictly better than that along the column. 0.4–CNMF is better than all three baselines. The supervised baseline MLC outperforms LLDA even though LLDA learns comorbidities jointly, suggesting that the simplex constraint imposed by LLDA may be restrictive.

[Figure 1: stacked bar chart. x-axis: Model (0.4-CNMF, LLDA, MLC, NMF+support); y-axis: Number of Phenotypes (0 to 30); legend: Excellent, Good, Fair, Poor.]

Figure 1: Qualitative Ratings from Annotation: The two bars represent the ratings provided by the two annotators. Each bar is a histogram of the scores for the 30 comorbidities sorted by scores.

| | 0.4–CNMF | LLDA | MLC | NMF+support |
|---|---|---|---|---|
| 0.4–CNMF | 0 | 28 | 20 | 44 |
| LLDA | 7 | 0 | 12 | 35 |
| MLC | 6 | 21 | 0 | 42 |
| NMF+support | 1 | 0 | 1 | 0 |

Table 3: Relative Rankings Matrix: Each row of the table is the number of times the model along the row was rated strictly better than the model along the column by clinical experts, e.g., column 3 in row 2 implies that LLDA was rated better than MLC 12 times over all conditions by all experts.
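The relative rankings in Table 3 are pairwise counts of strict wins pooled over experts and conditions. A minimal sketch of that tally is shown below; the function name, the array layout and the toy ratings are all hypothetical and are not the study's data.

```python
import numpy as np

def relative_rankings(ratings):
    """ratings has shape (models, experts, conditions).
    Returns M with M[r, c] = number of (expert, condition) pairs in which
    model r was rated strictly higher than model c."""
    R = np.asarray(ratings)
    n_models = R.shape[0]
    M = np.zeros((n_models, n_models), dtype=int)
    for r in range(n_models):
        for c in range(n_models):
            M[r, c] = int(np.sum(R[r] > R[c]))
    return M

# Toy ratings on the 1 (poor) to 4 (excellent) scale: 3 models, 2 experts, 2 conditions
ratings = [[[4, 3], [3, 3]],
           [[2, 3], [3, 1]],
           [[1, 1], [1, 2]]]
M = relative_rankings(ratings)
```

Ties count for neither model, so M[r, c] + M[c, r] can be smaller than the number of experts times the number of conditions.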

Figure 2 is an example of a phenotype (top 15 terms) learned by all models for psychoses. For this condition, the proposed model was rated “excellent” and strictly better than both LLDA and MLC by both annotators while LLDA and MLC ratings were tied. However, the phenotype for Hypertension (in Figure 3) learned by 0.4–CNMF has more terms related to ‘Renal Failure’ or ‘End Stage Renal Disease’ rather than hypertension. One of our annotators pointed out that “Candidate 1 is a fairly good description of renal disease, which is an end organ complication of hypertension”, where the anonymized Candidate 1 refers to 0.4–CNMF. Exploratory analysis suggests that hypertension and renal failure are the most commonly co-occurring set of conditions. Over 93% of patients that have hypertension (according to ECI) also suffer from Renal Failure. Thus, our model is unable to distinguish between highly co-occurring conditions. Other baselines were also rated poorly for hypertension, while LLDA was rated only slightly better. More examples of phenotypes are provided in Appendix B.


0.4-CNMF: schizophrenia, bipolar_disorder, overdose, schizoaffective_disorder, paranoia, psychosis, lithium_toxicity, poisoning, personality, serotonin_syndrome, paranoid_schizophrenia, mental_retardation, suicide, psychiatric_disease, suicide_attempt

LLDA: altered_mental_status, fever, agitated, schizophrenia, agitation, stress_ulcer, overdose, bipolar_disorder, delirium, mental_status, aspiration, depression, hyponatremia, unresponsive, leukocytosis

MLC: bipolar_disorder, schizophrenia, flat_affect, overdose, schizoaffective_disorder, hematomas, psychosis, lvh, metastatic_prostate_cancer, diastolic_dysfunction, agitated, lethargy, suicidal_ideation, ileus, acquired_immunodeficiency_syndrome

NMF+support: pain, pneumothorax, agitated, edema, atelectasis, anxiety, confused, aspiration, opacity, pleural_effusion, agitation, trauma, schizophrenia, stress_ulcer, bipolar_disorder

Figure 2: Phenotypes learned for ‘Psychoses’ (words are listed in order of importance)

0.4-CNMF: esrd, cri, ckd, chronic_renal_insufficiency, chronic_renal_failure, end_stage_renal_disease, acute_on_chronic_renal_failure, chronic_kidney_disease, cns_lymphoma, jaw_pain, amyloidosis, skin_impairment, glomerulonephritis, hyperparathyroidism, holosystolic_murmur

LLDA: chf, htn, hypertension, chest_pain, cad, crackles, sob, cp, pulmonary_edema, ischemia, stress_ulcer, heart_failure, gib, dyspnea, nausea

MLC: cri, av_fistula, chronic_renal_insufficiency, ckd, left_ventricular_hypertrophy, renal_insufficiency, esrd, chronic_renal_failure, acute_on_chronic_renal_failure, sinus_rhythm, cardiomegaly, left_atrial_abnormality, jaw_pain, htn, renal_failure

NMF+support: htn, pain, intraventricular_hemorrhage, pulmonary_edema, hypoxia, hydrocephalus, hypotension, cough, acute_renal_failure, sob, confused, stenosis, herniation, bleed, hemorrhage

Figure 3: Phenotypes learned for ‘Hypertension’

4.3 Mortality prediction

To quantitatively evaluate the utility of the learned phenotypes, we consider the 30-day mortality prediction task. We divide the EHR into 5 cross-validation folds of 80% training and 20% test patients. As this is an imbalanced class problem, the training–test splits are stratified by mortality labels. For each split, all models were applied on the training data to obtain phenotype candidates A and feature biases b. For each model, the patient loadings W along the respective phenotype space (A, b) are used as features to train a logistic regression classifier for mortality prediction. For CNMF and NMF+support, these are obtained as $W_{\text{train/test}} = \operatorname*{argmin}_{W \in [0,1]^{K \times N}} D(X_{\text{train/test}}, AW + b\mathbf{1}^\top)$ for fixed (A, b), with the arguments of D ordered as in (2). For LLDA, these are obtained using Gibbs sampling with fixed topic–word distributions. For MLC, the predicted class probabilities of the comorbidities are used as features. Additionally, we train a logistic regression classifier using the full EHR matrix as features.

We clarify the following points on the methodology: (1) A is learned on the patients in the training dataset only, hence there is no information leak from test patients into training. (2) Test patients’ comorbidities from ECI are not used as support constraints on their loadings. (3) Regularized logistic regression classifiers are used to learn models for mortality prediction. The regularization parameters are chosen via grid-search.
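The stratified 80%/20% splits can be sketched with NumPy alone, as below. The function name, the round-robin assignment and the toy label vector are assumptions of this illustration; the comments mark where the training-only steps from points (1) and (2) would sit.

```python
import numpy as np

def stratified_folds(labels, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose test sets keep roughly the class balance."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(labels.size, dtype=int)
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        fold_of[idx] = np.arange(idx.size) % n_folds  # deal each class round-robin
    for f in range(n_folds):
        yield np.flatnonzero(fold_of != f), np.flatnonzero(fold_of == f)

# Toy mortality labels (hypothetical): 20 positives among 100 patients
y = np.array([1] * 20 + [0] * 80)
folds = list(stratified_folds(y))
for train_idx, test_idx in folds:
    # 1. learn (A, b) from the training columns only
    # 2. compute loadings W for the test columns with (A, b) fixed and
    #    without using the test patients' weak diagnoses as support
    # 3. fit the classifier on training loadings, evaluate on test loadings
    pass
```

Each test fold here holds 20 patients with 4 positives, which mirrors the 20% prevalence of the toy labels.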

The performance of the above baselines trained with ℓ2-regularized logistic regression over 5-fold cross-validation is reported in Table 4, rows 1–5. The classifier trained on the full EHR unsurprisingly outperforms all baselines as it uses richer high dimensional information. All phenotyping baselines except NMF+support show comparable performance on mortality prediction which, despite using only 30 features, is only slightly worse than the predictive performance of the full EHR with ∼3500 features.


| Model | AUROC | Sensitivity | Specificity |
|---|---|---|---|
| 1. 0.4–CNMF | 0.63 (0.02) | 0.59 (0.04) | 0.62 (0.03) |
| 2. NMF+support | 0.52 (0.02) | 0.56 (0.13) | 0.51 (0.14) |
| 3. LLDA | 0.64 (0.02) | 0.62 (0.03) | 0.61 (0.05) |
| 4. MLC | 0.66 (0.01) | 0.62 (0.06) | 0.62 (0.05) |
| 5. Full EHR | 0.72 (0.02) | 0.69 (0.02) | 0.63 (0.04) |
| 6. CNMF+Full EHR (ℓ1, C = 0.1) | 0.72 (0.02) | 0.61 (0.09) | 0.71 (0.07) |

Table 4: 30 day mortality prediction: 5–fold cross-validation performance of logistic regression classifiers. Classifiers for 0.4–CNMF and competing baselines (NMF+support, LLDA, MLC) were trained on the 30 dimensional phenotype loadings as features. Full EHR denotes the baseline classifier (ℓ1-regularized logistic regression) using full ∼3500 dimensional EHR as features. CNMF+Full EHR denotes the performance of the ℓ1-regularized classifier learned on Full EHR augmented with CNMF features (hyperparameter was manually tuned to match performance of the Full EHR model).
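For readers less familiar with the three metrics in Table 4, the sketch below computes them from labels and scores using their standard definitions. The function names, the toy values and the 0.5 decision threshold are assumptions of this illustration; the paper does not state how the operating point for sensitivity and specificity was chosen.

```python
import numpy as np

def auroc(y_true, scores):
    """Probability that a random positive scores above a random negative (ties count 1/2)."""
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(scores, dtype=float)
    diff = s[y][:, None] - s[~y][None, :]  # all positive-negative score differences
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def sensitivity_specificity(y_true, y_pred):
    """True positive rate and true negative rate of binary predictions."""
    y = np.asarray(y_true).astype(bool)
    p = np.asarray(y_pred).astype(bool)
    sensitivity = np.sum(p & y) / np.sum(y)
    specificity = np.sum(~p & ~y) / np.sum(~y)
    return float(sensitivity), float(specificity)

# Toy example with made-up scores
y = np.array([1, 1, 0, 0, 0])
s = np.array([0.9, 0.4, 0.5, 0.3, 0.1])
auc = auroc(y, s)
sens, spec = sensitivity_specificity(y, s >= 0.5)  # threshold is an assumption
```

The pairwise form of AUROC is quadratic in the number of patients, which is fine for a sketch but would usually be replaced by a rank-based computation on larger cohorts.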

Augmented features for mortality prediction (CNMF+Full EHR) Unsurprisingly, Table 4 suggests that the high dimensional EHR data has additional information towards mortality prediction which is lacking in the 30 dimensional features generated via phenotyping. To evaluate whether this additional information can be captured by CNMF if augmented with a small number of raw EHR features, we train a mortality prediction classifier using ℓ1-regularized logistic regression on CNMF features/loadings combined with raw bag–of–words features, with parameters tuned to match the performance of the full EHR model. The results are reported in the final row of Table 4.

In exploring the weights learned by the classifier for all features, we observe that only 8.3% of the features corresponding to raw EHR based bag-of-words features have non–zero weights. This suggests that comorbidities capture a significant amount of predictive information on mortality and achieve comparable performance to the full EHR model with a small number of additional terms. See Figure 35 in the Appendix, which shows the weights learned by the classifier for all features. Figure 4 shows comorbidities and EHR terms with top magnitude weights learned by the CNMF+full EHR classifier. For example, it is interesting to note that the top weighted EHR term, dnr or ‘Do Not Resuscitate’, is not indicative of any comorbidity but is predictive of patient mortality.

5. Discussion and Related Work

Supervised learning methods like Carroll et al. (2011); Kawaler et al. (2012); Chen et al. (2013) or deep learning methods (Lipton et al., 2015; Kale et al., 2015; Henao et al., 2015) for EHR driven phenotyping require expert supervision. Although unsupervised methods like NMF (Anderson et al., 2014) and non–negative tensor factorization (Kolda and Bader, 2009; Harshman, 1970) are inexpensive alternatives (Ho et al., 2014a,b,c; Luo et al., 2015), they pose challenges with respect to identifiability, interpretability and computation, limiting their scalability.

Most closely related to our paper is the work of Halpern et al. (2016b), a semi-supervised algorithm for learning the joint distribution on conditions, requiring only that a domain expert specify one or more ‘anchor’ features for each condition (no other labeled data). An ‘anchor’ for a condition is a set of clinical features that when present are highly indicative of the target condition, but whose absence is not a strong label for absence of the target condition (Halpern et al., 2014, 2016a). For example, the presence of insulin medication is highly indicative of diabetes, but the converse is not true. Joshi et al. (2015) use a similar supervision approach for comorbidities prediction. Whereas the conditions in Halpern et al. (2016b) are binary valued, in our work they are real-valued between 0 and 1. Furthermore, we assume that the support of the conditions is known in the training data.

Figure 4: Top magnitude weights on (a) EHR and (b) CNMF features in CNMF+Full EHR classifier

Our approach achieves identifiability using support constraints to ground the latent factors and interpretability using sparsity constraints. The phenotypes learned are clinically interpretable and predictive of mortality when augmented with a sparse set of raw bag-of-words features on an unseen patient population. The model outperforms baselines in terms of clinical relevance according to experts and is significantly better than the model which includes supervision but no sparsity constraints. The proposed method can be easily extended to other non–negative data to obtain more comprehensive phenotypes. However, it was observed that the algorithm does not discriminate between frequently co–occurring conditions, e.g. renal failure and hypertension. Further, the weak supervision (using ECI) does not account for the primary diagnoses of admission. Additional model flexibility to account for a primary condition in explaining the observations could potentially improve performance. Addressing the above limitations, along with quantitative evaluation of risk for disease prediction and understanding conditions for uniqueness of phenotyping solutions, are interesting areas of follow-up work.


Acknowledgements

We thank Dr. Saul Blecker and Dr. Stephanie Kreml for their qualitative evaluation of the computational phenotypes. SJ, SG and JG were supported by NSF: SCH #1418511. DS was supported by NSF CAREER award #1350965. We also thank Yacine Jernite for sharing code used in preprocessing clinical notes.

References

A. Anderson, P. K. Douglas, W. T. Kerr, V. S. Haynes, A. L. Yuille, J. Xie, Y. N. Wu, J. A. Brown, and M. S. Cohen. Non-negative matrix factorization of multimodal MRI, fMRI and phenotypic data reveals differential changes in default mode subnetworks in ADHD. Neuroimage, 2014.

A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh. Clustering with Bregman divergences. Journal of Machine Learning Research, 2005.

D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 2003.

R. J. Carroll, A. E. Eyler, and J. C. Denny. Naive electronic health record phenotype identification for rheumatoid arthritis. In AMIA Annual Symposium, 2011.

Y. Chen, R. J. Carroll, E. Hinz, A. Shah, A. E. Eyler, J. C. Denny, and H. Xu. Applying active learning to high-throughput phenotyping algorithms for electronic health records data. Journal of the American Medical Informatics Association, 2013.

A. Elixhauser, C. Steiner, D. R. Harris, and R. M. Coffey. Comorbidity measures for use with administrative data. Medical Care, 1998.

Y. Halpern, Y. Choi, S. Horng, and D. Sontag. Using anchors to estimate clinical state without labeled data. In AMIA Annual Symposium, 2014.

Y. Halpern, S. Horng, Y. Choi, and D. Sontag. Electronic medical record phenotyping using the anchor and learn framework. Journal of the American Medical Informatics Association, 2016a.

Y. Halpern, S. Horng, and D. Sontag. Clinical tagging with joint probabilistic models. Conference on Machine Learning for Health Care, 2016b.

R. A. Harshman. Foundations of the PARAFAC procedure: Models and conditions for an explanatory multi-modal factor analysis. UCLA Working Papers in Phonetics, 1970.

R. Henao, J. T. Lu, J. E. Lucas, J. Ferranti, and L. Carin. Electronic health record analysis via deep Poisson factor models. Journal of Machine Learning Research, 2015.

J. C. Ho, J. Ghosh, S. R. Steinhubl, W. F. Stewart, J. C. Denny, B. A. Malin, and J. Sun. Limestone: High-throughput candidate phenotype generation via tensor factorization. Journal of Biomedical Informatics, 2014a.

J. C. Ho, J. Ghosh, and J. Sun. Extracting phenotypes from patient claim records using nonnegative tensor factorization. In International Conference on Brain Informatics and Health, 2014b.

J. C. Ho, J. Ghosh, and J. Sun. Marble: High-throughput phenotyping from electronic health records via sparse nonnegative tensor factorization. In SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014c.


G. Hripcsak and D. J. Albers. Next-generation phenotyping of electronic health records. Journal of the American Medical Informatics Association, 2013.

S. Joshi, O. Koyejo, and J. Ghosh. Simultaneous prognosis of multiple chronic conditions from heterogeneous EHR data. In International Conference on Healthcare Informatics, 2015.

D. C. Kale, Z. Che, M. T. Bahadori, W. Li, Y. Liu, and R. Wetzel. Causal Phenotype Discovery via Deep Networks. In AMIA Annual Symposium, 2015.

E. Kawaler, A. Cobian, P. Peissig, D. Cross, S. Yale, and M. Craven. Learning to predict post-hospitalization VTE risk from EHR data. In AMIA Annual Symposium, 2012.

T. G. Kolda and B. W. Bader. Tensor decompositions and applications. SIAM Review, 2009.

D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems, 2001.

C. J. Lin. Projected gradient methods for nonnegative matrix factorization. Neural Computation, 2007.

Z. C. Lipton, D. C. Kale, C. Elkan, and R. Wetzel. Learning to diagnose with LSTM recurrent neural networks. arXiv preprint arXiv:1511.03677, 2015.

Y. Luo, Y. Xin, E. Hochberg, R. Joshi, O. Uzuner, and P. Szolovits. Subgraph augmented non-negative tensor factorization (SANTF) for modeling clinical narrative text. Journal of the American Medical Informatics Association, 2015.

A. K. McCallum. Mallet: A machine learning for language toolkit, 2002. URL http://mallet.cs.umass.edu.

NIH Health Care Systems Research Collaboratory. Rethinking Clinical Trials: A Living Textbook of Pragmatic Clinical Trials, 2014.

N. Parikh and S. P. Boyd. Proximal algorithms. Foundations and Trends in Optimization, 2014.

J. Pathak, A. N. Kho, and J. C. Denny. Electronic health records-driven phenotyping: challenges, recent advances, and perspectives. Journal of the American Medical Informatics Association, 2013.

D. Ramage, D. Hall, R. Nallapati, and C. D. Manning. Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora. In Empirical Methods in Natural Language Processing, 2009.

R. L. Richesson, W. E. Hammond, M. Nahm, D. Wixted, G. E. Simon, J. G. Robinson, A. E. Bauck, D. Cifelli, M. M. Smerek, J. Dickerson, et al. Electronic health records based phenotyping in next-generation clinical trials: a perspective from the NIH health care systems collaboratory. Journal of the American Medical Informatics Association, 2013.

M. Saeed, M. Villarroel, A. T. Reisner, G. Clifford, L. W. Lehman, G. Moody, T. Heldt, T. H. Kyaw, B. Moody, and R. G. Mark. Multiparameter intelligent monitoring in intensive care II (MIMIC-II): a public-access intensive care unit database. Critical Care Medicine, 2011.


Appendix A. Phenotype Sparsity

As suggested in Section 4.1, there is an inherent trade-off between fit to the cost function and desired sparsity. The trade-off is made explicit for λ–CNMF in Figure 5. The sparsity of LLDA is controlled by tuning the hyperparameter (β) of the word-topic multinomial parameters (Blei et al., 2003), and that of MLC via the ℓ1 regularization parameter η. A smaller value of β ensures that the word-topic probabilities are sparse; as the value of β is increased, sparsity decreases (i.e., the number of non-zero elements increases). For logistic regression (used by MLC), sparsity increases as the ℓ1 regularization parameter increases. Figure 6a demonstrates the sparsity of the estimated phenotypes for LLDA and Figure 6b shows that of logistic regression. We choose phenotypes obtained at β = 1 × 10^{-8} and η = 100 for qualitative annotation. These parameters were chosen to achieve the lowest median number of non-zero entries while ensuring that, for each chronic condition, the corresponding phenotype candidate is represented by at least 5 non-zero clinical terms. Our fourth baseline (NMF + support) did not estimate sparse phenotypes and does not have a tunable sparsity parameter (its phenotypes were nevertheless annotated for qualitative evaluation). The proposed model provides the sparsest phenotypes among all compared models.
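The link between a larger ℓ1 regularization parameter and fewer non-zero weights can be illustrated with the soft-thresholding operator, the proximal operator of the ℓ1 norm (Parikh and Boyd, 2014). This is only a sketch of the mechanism: the weight vector below is hypothetical, and nothing here is claimed about the solver actually used for MLC.

```python
import numpy as np


def soft_threshold(v, eta):
    """Proximal operator of eta * ||v||_1: shrink each entry toward zero by eta."""
    return np.sign(v) * np.maximum(np.abs(v) - eta, 0.0)


# Hypothetical unregularized weights for seven clinical terms.
weights = np.array([2.5, -1.2, 0.4, -0.05, 0.9, 0.0, -3.1])

etas = (0.1, 0.5, 1.0, 3.0)
# Number of non-zero weights that survive each threshold.
nnz_by_eta = [int(np.count_nonzero(soft_threshold(weights, eta))) for eta in etas]
```

Every entry whose magnitude is at most η is set exactly to zero, so the count of non-zero weights can only stay the same or drop as η grows.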

[Figure 5, panel (a): phenotype sparsity (y-axis, 0 to 400) versus sparsity inducing parameter λ (x-axis). Annotated median (third–quartile) values for λ = 0.1, 0.2, ..., 1, in plot order: 16.50 (60.00), 35.50 (100.00), 47.50 (127.00), 64.00 (151.00), 73.00 (172.00), 99.00 (200.00), 120.50 (218.00), 139.00 (246.00), 147.50 (255.00), 169.00 (277.00).]

[Figure 5, panel (b): divergence D(X, AW) (y-axis, 6500000 to 6620000) versus sparsity inducing parameter λ (x-axis, 0.1 to 1).]

Figure 5: Sparsity–Accuracy Trade–off. Sparsity of the model is measured as the median of the number of non-zero entries in columns of the phenotype matrix A. (a) shows box plots of the median sparsity across the 30 chronic conditions for varying λ values. The median and third–quartile values are explicitly noted on the plots. (b) shows the divergence function value of the estimate from Algorithm 1 plotted against the λ parameter.
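The sparsity measure in this caption can be computed directly from a phenotype matrix. The sketch below assumes A is stored as a dense array with one column per condition; the function name, the tolerance argument, and the toy matrix are illustrative choices.

```python
import numpy as np


def phenotype_sparsity(A, tol=0.0):
    """Return per-condition non-zero counts, their median, and third quartile.

    A has one row per clinical term and one column per condition.
    """
    nnz_per_condition = np.count_nonzero(np.abs(A) > tol, axis=0)
    median_nnz = float(np.median(nnz_per_condition))
    q3_nnz = float(np.percentile(nnz_per_condition, 75))
    return nnz_per_condition, median_nnz, q3_nnz


# Toy phenotype matrix (hypothetical): 4 terms, 3 conditions.
A = np.array([[0.5, 0.0, 0.2],
              [0.0, 0.3, 0.1],
              [0.0, 0.7, 0.4],
              [0.0, 0.0, 0.3]])
nnz, median_nnz, q3_nnz = phenotype_sparsity(A)
```

Under this measure a lower number means a sparser phenotype, which is why the values grow with λ in panel (a).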

Appendix B. Sample Phenotypes for Baseline Models

Figures 7–34 show the top 15 terms learned for all target chronic conditions for the proposed model and baselines. The sparsity level chosen is based on the criterion described in Section 4.1. For all conditions, the terms are listed in decreasing order of importance as learned by the models.
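A ranking of this kind can be read off a phenotype matrix by sorting each column. The sketch below assumes that importance is the magnitude of the non-negative weight in A and that zero-weight terms are not listed; both are assumptions made for illustration, and the vocabulary and weights are hypothetical.

```python
import numpy as np


def top_terms(A, vocab, k=15):
    """For each condition (column of A), list up to k terms by decreasing weight.

    Terms with zero weight are dropped, so a sparse phenotype may yield
    fewer than k terms.
    """
    result = []
    for j in range(A.shape[1]):
        column = A[:, j]
        order = np.argsort(-column, kind="stable")[:k]
        result.append([vocab[i] for i in order if column[i] > 0])
    return result


# Toy example (hypothetical weights): 4 terms, 2 conditions.
vocab = ["cirrhosis", "ascites", "pain", "varices"]
A = np.array([[0.6, 0.0],
              [0.1, 0.0],
              [0.0, 0.9],
              [0.3, 0.1]])
phenotypes = top_terms(A, vocab, k=3)
```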


[Figure 6, panel (a) LLDA: phenotype sparsity (y-axis, 0 to 4000) versus topic-word hyperparameter β (x-axis, not linear scale). Values annotated on the plot for β = 1e-10, 1e-08, 1e-06, 0.0001, 0.01, 1.0, 10.0, in plot order: 421.00 (781.00), 416.50 (759.00), 430.00 (779.00), 457.00 (803.00), 675.00 (1026.00), 3609.00 (3609.00), 3609.00 (3609.00).]

[Figure 6, panel (b) MLC: phenotype sparsity (y-axis, 0 to 2000) versus regularization parameter η (x-axis, not linear scale). Values annotated on the plot for η = 1.0, 2.0, 10.0, 20.0, 100.0, 200.0, 1000.0, in plot order: 1282.50 (1475.00), 987.00 (1169.00), 392.00 (520.00), 249.50 (332.00), 66.00 (105.00), 30.50 (56.00), 6.00 (13.00).]

Figure 6: Phenotype sparsity for baseline models

0.4-CNMF: cirrhosis, varices, portal_hypertension, hepatitis_c, esophageal_varices, cirrhosis_of_liver, gastric_varices, alcoholic_cirrhosis, hep_c, hepatocellular_carcinoma, spontaneous_bacterial_peritonitis, portal_hypertensive_gastropathy, ascites, end_stage_liver_disease, primary_biliary_cirrhosis

LLDA: cirrhosis, ascites, bleeding, gi_bleed, gib, hepatic_encephalopathy, varices, encephalopathy, gastrointestinal_bleed, altered_mental_status, abdominal_pain, liver_failure, hypotension, renal_failure, esophageal_varices

MLC: cirrhosis, hepatitis_c, ascites, liver_failure, hep_c, cryptogenic_cirrhosis, fatty_liver, etoh_abuse, autoimmune_hepatitis, colitis, dyspnea, withdrawal, volume_overload, endocarditis, end_stage_liver_disease

NMF+support: pain, pneumothorax, atelectasis, pleural_effusion, ascites, bleeding, edema, pneumonia, liver_failure, cough, afebrile, hypotension, chf, bm, free_fluid

Figure 7: Learned Phenotypes for Liver Disease

0.4-CNMF: pancreatic_cancer, ovarian_cancer, metastatic_ovarian_cancer, pelvic_mass, glioblastoma, brain_tumor, nsclc, neoplasm, abdominal_mass, hepatocellular_carcinoma, bladder_ca, chemoradiation, mets, bronchopleural_fistula, partial_obstruction

LLDA: bleeding, pain, pericardial_effusion, mass, hypotension, stress_ulcer, dvt, abdominal_pain, edema, hemoptysis, malignant_neoplasm, cancer, sob, chest_pain, pe

MLC: hepatocellular_carcinoma, thyroid_ca, brain_tumor, glioblastoma, end_stage_liver_disease, calcifications, prostate_cancer, cancer, bladder_ca, ovarian_cancer, pancreatic_cancer, incisional_pain, colon_cancer, lung_cancer, tumor

NMF+support: bleeding, pain, nausea, dvt, chest_pain, edema, gi_bleed, cad, gib, vomiting, diverticulosis, hypotension, stress_ulcer, abdominal_pain, bleed

Figure 8: Learned Phenotypes for Solid Tumor


0.4-CNMF: metastatic, metastatic_melanoma, metastatic_disease, metastatic_prostate_cancer, metastatic_renal_cell_carcinoma, melanoma, metastases, metastasis, mets, pancreatic_cancer, lung_mass, metastatic_colon_cancer, metastatic_cancer, metastatic_renal_cell_cancer, ovarian_ca

LLDA: pain, mass, hypotension, malignant_neoplasm, metastatic, stress_ulcer, tumor, sob, cancer, metastatic_disease, nausea, dyspnea, pe, pleural_effusion, respiratory_failure

MLC: metastatic, metastatic_disease, lung_cancer, metastatic_melanoma, tumor, metastasis, metastatic_renal_cell_carcinoma, metastases, mets, metastatic_prostate_cancer, ovarian_ca, lung_mass, pancreatic_cancer, lung_nodules, hypovolemia

NMF+support: pain, edema, mass, fever, pneumothorax, respiratory_failure, dvt, atelectasis, pleural_effusion, hypoxia, stress_ulcer, sob, cough, pneumonia, crackles

Figure 9: Learned Phenotypes for Metastatic Cancer

0.4-CNMF: copd, asthma, chronic_obstructive_pulmonary_disease, emphysema, bronchitis, asbestosis, copd_exacerbation, obstructive_lung_disease, personality_disorders, pulmonary_infarct

LLDA: copd, respiratory_failure, asthma, pneumonia, sob, emphysema, pna, dyspnea, stress_ulcer, chf, htn, hypotension, respiratory_distress, cad, cough

MLC: copd, asthma, emphysema, wheezes, bronchiectasis, asbestosis, aaa, wheezing, lung_cancer, respiratory_failure, resp_status, colon_ca, hives, lesion, pneumothorax

NMF+support: pain, edema, copd, chest_pain, pneumothorax, sob, stress_ulcer, cad, chf, nausea, cough, hypotension, pneumonia, asthma, atelectasis

Figure 10: Learned Phenotypes for Chronic Pulmonary Disorder

0.4-CNMF: etoh_abuse, alcohol_abuse, alcohol_withdrawal, alcoholic_cirrhosis, alcoholism, delirium_tremens, alcoholic_hepatitis, withdrawal_symptoms, neuroleptic_malignant_syndrome, pancreatic_necrosis, dts, hepatorenal_failure, thiamine_deficiency, dt, alcoholic_cardiomyopathy

LLDA: pancreatitis, etoh_abuse, agitation, agitated, seizures, seizure, alcohol_withdrawal, pain, alcohol_abuse, stress_ulcer, edema, withdrawal, fall, altered_mental_status, htn

MLC: etoh_abuse, alcoholic_cirrhosis, tremors, alcohol_abuse, alcoholism, withdrawal, cirrhosis, alcoholic_hepatitis, fracture, malaise, upper_gi_bleed, obstructive_sleep_apnea, agitated, liver_failure, pancreatitis

NMF+support: pain, edema, pneumothorax, hemorrhage, agitation, stress_ulcer, agitated, cough, fall, fever, stroke, seizure, subarachnoid_hemorrhage, hematoma, fracture

Figure 11: Learned Phenotypes for Alcohol Abuse


0.4-CNMF: dm, dm2, diabetes_mellitus, niddm, type_2_diabetes, type_ii_diabetes, diabetes, diabetes_type_ii, type_2_diabetes_mellitus, convulsive_status_epilepticus, diabetes_type_2, diabetes_mellitus_type_2, skin_ulcers, hypercoagulable, chest_pains

LLDA: pain, dm, htn, edema, cad, stress_ulcer, diabetes_mellitus, chest_pain, hypertension, dm2, chf, diabetes, hypotension, sob, bleeding

MLC: niddm, dm2, diabetes, dm, obese, sinus_rhythm, coronary_artery_disease, facial_droop, cardiomegaly, pseudocyst, pulm_edema, tachypnea, hyperglycemia, delirium, necrotizing_pancreatitis

NMF+support: pain, pneumothorax, edema, atelectasis, pleural_effusion, dm, stroke, cough, htn, stress_ulcer, nausea, bleeding, chest_pain, sob, hematoma

Figure 12: Learned Phenotypes for Diabetes Uncomplicated

0.4-CNMF: dm, hypoglycemia, retinopathy, gastroparesis, neuropathy, diabetes_mellitus, esrd, foot_infection, end_stage_renal_disease, hypoglycemic, cerebritis, diabetic_neuropathy, foot_ulcer, nephropathy, mastoiditis

LLDA: dm, htn, hypoglycemia, diabetes_mellitus, cad, pain, hyperkalemia, diabetes, dm2, hypertension, stress_ulcer, hyperglycemia, chest_pain, wound, anemia

MLC: neuropathy, retinopathy, peripheral_neuropathy, av_fistula, dm, hypoglycemia, osteomyelitis, gastroparesis, cardiomegaly, diabetes, cri, congestive_heart_failure, sinus_rhythm, esrd, dm2

NMF+support: pain, dm, chest_pain, edema, cerebritis, pneumothorax, htn, atelectasis, mastoiditis, hypertension, stress_ulcer, pleural_effusion, hematoma, cad, seizure

Figure 13: Learned Phenotypes for Diabetes Complicated

0.4-CNMF: pvd, peripheral_vascular_disease, aaa, aortic_aneurysm, rupture, claudication, induration, heel_ulcer, type_a_aortic_dissection, leg_ulcer, carotid_artery_stenosis, dural_tear, endoleak, vascular_disease, eschar

LLDA: pain, pvd, edema, cad, aaa, htn, hematoma, nausea, peripheral_vascular_disease, ischemia, atelectasis, coronary_artery_disease, stress_ulcer, afib, hypotension

MLC: pvd, peripheral_vascular_disease, pseudoaneurysm, aaa, coronary_artery_disease, carotid_stenosis, aortic_dissection, ptx, cardiomegaly, aortic_aneurysm, renal_artery_stenosis, mesenteric_ischemia, complaints, vegetation, calcifications

NMF+support: pain, pneumothorax, atelectasis, edema, nausea, pleural_effusion, hematoma, bleeding, afib, htn, sob, cough, acute_pain, cad, chronic_pain

Figure 14: Learned Phenotypes for Peripheral Vascular Disorder


0.4-CNMF: esrd, chronic_kidney_disease, chronic_renal_failure, ckd, end_stage_renal_disease, acute_on_chronic_renal_failure, thrill, cri, atrophic_kidneys, crf, pulmonary_artery_hypertension, diverticular_disease, non_reactive

LLDA: hypotension, esrd, renal_failure, sepsis, chronic_renal_failure, cad, chronic_kidney_disease, hypotensive, acute_renal_failure, afib, arf, infection, end_stage_renal_disease, chf, atrial_fibrillation

MLC: cri, av_fistula, esrd, ckd, chronic_renal_insufficiency, acute_on_chronic_renal_failure, chronic_renal_failure, renal_insufficiency, left_ventricular_hypertrophy, gout, cardiomegaly, sinus_rhythm, jaw_pain, hydronephrosis, renal_failure

NMF+support: pain, cp, nausea, esrd, chest_pain, cad, chronic_pain, hypertension, emesis, gib, sob, acute_pain, bleeding, obese, stress_ulcer

Figure 15: Learned Phenotypes for Renal Failure

0.4-CNMF: seizure, seizure_disorder, status_epilepticus, mental_retardation, seizures, restless_leg_syndrome, epilepsy, multiple_sclerosis, tonic_clonic_seizure, cns_infection, trigeminal_neuralgia, parkinsons_disease, grand_mal_seizure, generalized_seizure, facial_twitching

LLDA: seizure, seizures, aspiration, altered_mental_status, fever, unresponsive, stress_ulcer, infection, pneumonia, hypotension, agitated, status_epilepticus, seizure_disorder, mental_status, dementia

MLC: seizure_disorder, restless_leg_syndrome, ms, seizure, dementia, hemothorax, retropulsion, multiple_sclerosis, epilepsy, hydrocephalus, lethargic, hypoxemia, overdose, shortness_of_breath, infarction

NMF+support: pain, seizure, edema, atelectasis, fever, pneumothorax, cough, seizures, htn, pneumonia, stress_ulcer, hypotension, confused, agitated, hemorrhage

Figure 16: Learned Phenotypes for Other Neurological Disorders

0.4-CNMF: afib, atrial_fibrillation, rvr, af, chronic_atrial_fibrillation, crush_injury, babesiosis, non_reactive

LLDA: afib, atrial_fibrillation, af, pain, stress_ulcer, htn, stroke, edema, bleeding, hypotension, cva, gi_bleed, altered_mental_status, aspiration, bleed

MLC: rapid_ventricular_response, afib, cardiomegaly, atrial_fibrillation, acute_cholecystitis, calcifications, acute_coronary_syndrome, subdural_hematoma, acute_on_chronic_renal_failure, ischemic_heart_disease, stroke, atrial_flutter, tachycardia, hip_fracture, narrowing

NMF+support: pain, afib, edema, hemorrhage, atelectasis, atrial_fibrillation, stroke, pneumothorax, htn, cough, stress_ulcer, pleural_effusion, intracranial_hemorrhage, sob, nausea

Figure 17: Learned Phenotypes for Cardiac Arrhythmias


0.4-CNMF: polysubstance_abuse, substance_abuse, cocaine_abuse, overdose, addiction, poisoning, rhabdomyolysis, assault, heroin_abuse, hep_c, multiple_stab_wounds, bile_leak, bipolar_disorder, esophageal_injury, hep

LLDA: pain, stress_ulcer, polysubstance_abuse, agitated, asthma, chronic_pain, pneumonia, anxiety, fever, agitation, substance_abuse, respiratory_distress, aspiration, overdose, infection

MLC: substance_abuse, polysubstance_abuse, overdose, chest_pressure, cocaine_abuse, withdrawal, skin_warm, fracture, epidural_abscess, tamponade, hep_c, chronic_renal_failure, hepatitis_c, hiv, trauma

NMF+support: pain, edema, pneumothorax, headache, aneurysm, cough, subarachnoid_hemorrhage, hemorrhage, dyspnea, sob, hiv, fracture, afebrile, atelectasis, stress_ulcer

Figure 18: Learned Phenotypes for Drug Abuse

0.4-CNMF: hemiparesis, stroke, paraplegia, cerebral_palsy, decubitus_ulcers, ischemic_attack, lower_extremity_weakness, quadriplegia, expressive_aphasia, right_hemiplegia, cerebral_infarction, quadraplegia, contractures, thalamic_hemorrhage, mca_infarct

LLDA: stroke, edema, cva, hemorrhage, seizure, weakness, intracranial_hemorrhage, infarct, movement, aspiration, stress_ulcer, cerebral_infarction, infarction, htn, mass

MLC: movement, hemiparesis, paraplegia, cerebral_palsy, cva, infarction, quadriplegia, brain, expressive_aphasia, pneumocephalus, lower_extremity_weakness, intracranial_hemorrhage, constipation, ischemic_attack, lung_collapse

NMF+support: pain, hemorrhage, edema, seizure, seizures, mass, stroke, subarachnoid_hemorrhage, aneurysm, aspiration, stress_ulcer, atelectasis, nausea, subdural_hematoma, headache

Figure 19: Learned Phenotypes for Paralysis

0.4-CNMF: hiv, bacterial_meningitis, epidural_hematoma, cryptogenic_cirrhosis, occipital_fracture, orthostasis, human_immunodeficiency_virus, aids, acquired_immunodeficiency_syndrome, temporal_bone_fracture, syncope, hiv_positive, memory_loss, acute_liver_failure, conjunctiva

LLDA: hiv, aids, pneumonia, hypotension, fever, syncope, fall, edema, respiratory_distress, bleeding, epidural_hematoma, bradycardia, aspiration, cough, human_immunodeficiency_virus

MLC: hiv, scalp_laceration, nsr, posturing, varix, sinus_tachycardia, necrosis, loose_stool, subcutaneous_air, afebrile, lower_gi_bleed, abd, ascites, lung_cancer, aneurysm

NMF+support: pain, pneumothorax, subarachnoid_hemorrhage, ascites, hiv, appendicitis, afebrile, chf, nausea, bm, aneurysm, opacities, sepsis, abdominal_distension, abdominal_distention

Figure 20: Learned Phenotypes for AIDS


0.4-CNMF: hypotension, lactic_acidosis, hyperkalemia, hypernatremia, respiratory_failure, renal_failure, hyponatremia, hyperpotassemia, acute_renal_failure, hyposmolality, leukopenia, arf, rhabdomyolysis, chronic_low_back_pain, viral_gastroenteritis

LLDA: hypotension, respiratory_failure, sepsis, acute_renal_failure, stress_ulcer, altered_mental_status, arf, renal_failure, infection, ards, pneumonia, fever, aspiration, hypotensive, nausea

MLC: metabolic_acidosis, hydronephrosis, hypernatremia, hyperkalemia, hyponatremia, opacities, acidosis, respiratory_acidosis, opacification, complications, obstruction, lactic_acidosis, dehydration, chronic_pain, hypovolemia

NMF+support: pain, edema, pneumothorax, hypotension, stress_ulcer, nausea, aspiration, atelectasis, cough, pleural_effusion, hematoma, bleeding, pneumonia, htn, subarachnoid_hemorrhage

Figure 21: Learned Phenotypes for Fluid Electrolyte Disorders

0.4-CNMF: rheumatoid_arthritis, lupus, scleroderma, polymyalgia_rheumatica, hip_fracture, absent_bowel_sounds, ankylosing_spondylitis, imi, myelodysplastic_syndrome, exertional_dyspnea, eye_pain, interstitial_lung_disease, amyloid_angiopathy, femoral_neck_fracture, liver_hematoma

LLDA: pain, fever, hypotension, infection, sepsis, chronic_pain, rheumatoid_arthritis, cad, chf, afebrile, pna, hip_fracture, stress_ulcer, hypotensive, crackles

MLC: rheumatoid_arthritis, lupus, polymyalgia_rheumatica, ankylosing_spondylitis, interstitial_lung_disease, svt, chronic_renal_insufficiency, scleroderma, diverticulitis, reflux, feeling_weak, primary_biliary_cirrhosis, occlusion, exertional_dyspnea, tamponade

NMF+support: fever, cad, pain, pna, chf, sob, coronary_artery_disease, bleeding, mi, cp, crackles, pulmonary_edema, edema, dementia, ischemic_heart_disease

Figure 22: Learned Phenotypes for Rheumatoid Arthritis

0.4-CNMF: multiple_myeloma, myeloma, lymphoma, hodgkins_lymphoma, achalasia, amyloidosis, remission, hemochromatosis, foot_pain, barotrauma, neutropenic_fever, mm, shingles, fungemia, hypoxic_brain_injury

LLDA: lymphoma, multiple_myeloma, fever, hypotension, fevers, pneumonia, sob, myeloma, hypercalcemia, hypoxia, chest_pain, anemia, pna, renal_failure, stress_ulcer

MLC: lymphoma, hodgkins_lymphoma, multiple_myeloma, myeloma, esophagitis, opacities, edematous, remission, sah, orthopnea, discomfort, hypercalcemia, febrile_neutropenia, subcutaneous_emphysema, infection

NMF+support: lesion, pain, afib, dementia, edema, atrial_fibrillation, proptosis, periorbital_swelling, infection, htn, seizure, pneumothorax, abscess, laceration, subdural_hematoma

Figure 23: Learned Phenotypes for Lymphoma


0.4-CNMF: thrombocytopenia, hit, coagulopathy, hepatic_encephalopathy, hepatorenal_syndrome, cirrhosis_of_liver, schistocytes, low_fibrinogen, splenic_sequestration, fulminant_hepatic_failure, hepatic_dysfunction, polysubstance_abuse, liver_cirrhosis, dic, kidney_failure

LLDA: sepsis, thrombocytopenia, hypotension, bleeding, fever, acute_renal_failure, ascites, renal_failure, arf, infection, stress_ulcer, coagulopathy, fevers, ards, cirrhosis

MLC: thrombocytopenia, hit, coagulopathy, liver_failure, ascites, edematous, generalized_edema, fatigue, cirrhosis, splenomegaly, transaminitis, pulmonary_edema, pulmonary_hypertension, hepatitis_c, sinus_tachycardia

NMF+support: pain, pneumothorax, hypotension, edema, pleural_effusion, bleeding, atelectasis, fever, hypotensive, stress_ulcer, fevers, cough, sepsis, hemorrhage, hematoma

Figure 24: Learned Phenotypes for Coagulopathy

0.4-CNMF: morbid_obesity, obesity, osa, tracheobronchomalacia, obesity_hypoventilation_syndrome, obstructive_sleep_apnea, bronchomalacia, tracheomalacia, pannus, obese, pancreatic_pseudocyst, venous_stasis_ulcers, eeg, daytime_somnolence, group_a_strep

LLDA: obese, pain, obesity, respiratory_failure, edema, morbid_obesity, wound, htn, stress_ulcer, hypotension, osa, fever, sob, anxiety, dyspnea

MLC: obesity, obese, morbid_obesity, cardiomegaly, hypoxemia, myalgias, respiratory_arrest, respiratory_status, pulmonary_embolism, tamponade, hypoxic, osa, pulmonary_edema, sleep_apnea, diaphoresis

NMF+support: pain, edema, cad, htn, stress_ulcer, fever, pericardial_effusion, hypotension, bleeding, pleural_effusion, hyperlipidemia, pneumothorax, afib, obese, morbid_obesity

Figure 25: Learned Phenotypes for Obesity

0.4-CNMF: hip_fracture, pulmonary_hypertension, polycythemia, femoral_neck_fracture, pulmonary_infarct, mediastinal_mass, pseudocyst, mucositis, stasis, pulmonary_embolism, chest_tightness, pe, pca_infarct, acute_pulmonary_embolism, myeloma

LLDA: pe, dyspnea, pain, hypoxia, pneumonia, dvt, pulmonary_embolism, pulmonary_hypertension, shortness_of_breath, fever, stress_ulcer, sob, cough, respiratory_failure, sinus_tachycardia

MLC: ischemic_heart_disease, pulmonary_hypertension, cardiomegaly, chest_tightness, pulmonary_embolism, pe, hip_fracture, osa, dvt, substance_abuse, diaphoresis, peripheral_neuropathy, systolic_hypertension, infectious_process, hypovolemia

NMF+support: pain, hemoptysis, pneumothorax, mass, seizure, atelectasis, pe, pleural_effusion, bleeding, edema, pulmonary_embolus, pulmonary_embolism, dvt, seizures, stress_ulcer

Figure 26: Learned Phenotypes for Pulmonary Circulation Disorder


0.4-CNMF: aortic_stenosis, gout_flare, acute_on_chronic_renal_failure, diverticulum, valvular_heart_disease, alcoholic_hepatitis, thoracic_aortic_aneurysm, vegetation, leg_ulcers, septic_arthritis, guaiac_positive_stools, systolic_ejection_murmur, hearing_loss, gurgling, benign_prostatic_hypertrophy

LLDA: pain, bleeding, hypotension, aortic_stenosis, gi_bleed, htn, gib, hematoma, cad, anemia, bleed, ischemia, stress_ulcer, hypotensive, melena

MLC: cardiomegaly, aortic_stenosis, tr, diverticulitis, wound_infection, subdural_hematoma, mitral_regurgitation, aortic_dissection, bm, afebrile, systolic_murmur, sleep_apnea, atrial_fibrillation, left_ventricular_hypertrophy, pna

NMF+support: pain, hemorrhage, pneumothorax, htn, atelectasis, edema, cough, stroke, subarachnoid_hemorrhage, bleed, hematoma, nausea, afebrile, bm, subdural_hematoma

Figure 27: Learned Phenotypes for Valvular Disease

0.4-CNMF: celiac_disease, mrsa_bacteremia, kyphosis, cyst, ulcerations, convulsive_status_epilepticus, cmv, intussusception, hemochromatosis, gastric_ulcer, colitis, rigid, intestinal_obstruction, kidney_stones, vegetations

LLDA: pain, colitis, gi_bleed, kyphosis, bleeding, fever, chronic_pain, endocarditis, gastrointestinal_bleed, cyst, falls, mrsa_bacteremia, htn, osteoporosis, diarrhea

MLC: engorgement, cough_nonproductive, hemorrhagic_stroke, metastatic_renal_cell_carcinoma, pancreatic_necrosis, discomfort, ischemic_bowel, infiltrate, foot_pain, effusion, hypoglycemia, breakdown, dilatation, calcification, tremors

NMF+support: pain, discomfort, afib, tremors, anxious, bm, afebrile, productive_cough, cough_nonproductive, incision, incisional_pain, complaints, ls, sr, mrsa_bacteremia

Figure 28: Learned Phenotypes for Peptic Ulcer

0.4-CNMF: chf, diastolic_heart_failure, hypotension, pancolitis, mrsa_pneumonia, cad, jaw_pain, black_tarry_stools, chronic_respiratory_failure, facial_flushing, femoral_fracture, subglottic_stenosis, gout_flare, chronic_inflammation, tumor_lysis_syndrome

LLDA: chf, pneumonia, pulmonary_edema, pleural_effusion, sepsis, pna, hypoxia, sob, crackles, respiratory_failure, atelectasis, cad, aspiration, congestive_heart_failure, fever

MLC: cardiomegaly, congestive_heart_failure, chf, pulmonary_edema, calcifications, hip_fracture, obstruction, dnr, rheumatoid_arthritis, cad, bm, crackles, afebrile, pleural_effusion, obese

NMF+support: pain, pneumothorax, edema, atelectasis, pleural_effusion, sepsis, cough, pneumonia, chf, pulmonary_edema, sob, crackles, afebrile, bleeding, subarachnoid_hemorrhage

Figure 29: Learned Phenotypes for Congestive Heart Failure


0.4-CNMF: hypothyroidism, hypothyroid, sick_sinus_syndrome, thyroid_ca, respiratory_infection, essential_tremor, pancreatic_duct, first_degree_heart_block, straining, insulin_dependent_diabetes, aplastic_anemia, acute_delirium, pulm_hypertension, stimulus, block

LLDA: pain, hypothyroidism, hypotension, stress_ulcer, edema, pneumonia, hypothyroid, bleeding, nausea, htn, sob, chronic_pain, anemia, acute_pain, pericardial_effusion

MLC: hypothyroidism, hypothyroid, endometrial_ca, infection, hypoglycemia, hypoxia, hip_fracture, cardiomegaly, aortic_stenosis, encephalopathy, atelectasis, hypovolemic, meningioma, pleural_effusions, respiratory_distress

NMF+support: pain, pneumothorax, edema, atelectasis, hypothyroidism, stress_ulcer, nausea, hypotension, htn, sob, pleural_effusion, bleeding, afebrile, cough, hypertension

Figure 30: Learned Phenotypes for Hypothyroidism

0.4-CNMF: malnutrition, ulcerative_colitis, failure_to_thrive, hepatic_cirrhosis, hydrothorax, pancreatic_pseudocyst, volvulus, esophageal_varices, gastroparesis, bloody_diarrhea, hemochromatosis, necrotizing_fascitis, malnourished, diverticulum, gastric_cancer

LLDA: respiratory_failure, pneumonia, wound, ascites, aspiration, bleeding, pleural_effusion, fever, stress_ulcer, hypoxia, sepsis, pna, dvt, atelectasis, malnutrition

MLC: malnutrition, weight_loss, poor_dentition, failure_to_thrive, calcifications, anasarca, ulcerative_colitis, pneumocephalus, volvulus, neutropenic_fever, upper_gastrointestinal_bleed, glaucoma, subdural_hematoma, lesion, epidural_abscess

NMF+support: pain, edema, hemorrhage, stroke, fever, pneumothorax, subdural_hemorrhage, stress_ulcer, facial_fractures, cough, atelectasis, pneumonia, fracture, intracranial_hemorrhage, necrotizing_fascitis

Figure 31: Learned Phenotypes for Weight loss

0.4-CNMF: hypotension, pain, anemia_of_chronic_disease, pyelonephritis, end_stage_renal_disease, iron_deficiency_anemia, hypercalcemia, anemia, chronic_anemia, esrd, pancolitis, babesiosis, microcytic_anemia, guaiac_stools, dry_gangrene

LLDA: pain, fever, hypotension, pneumonia, anemia, sepsis, sob, stress_ulcer, nausea, cough, infection, edema, fevers, chest_pain, pna

MLC: anemia, iron_deficiency_anemia, sinus_rhythm, esrd, chronic_renal_failure, hydronephrosis, mitral_regurgitation, endocarditis, hip_fracture, vomiting, pulmonary_edema, shortness_of_breath, pyelonephritis, gerd, uti

NMF+support: pain, pneumothorax, edema, nausea, sob, fever, pleural_effusion, stress_ulcer, atelectasis, hypotension, cough, pneumonia, afebrile, chest_pain, anemia

Figure 32: Learned Phenotypes for Deficiency Anemias


0.4-CNMF: cryptogenic_cirrhosis, squamous_cell_carcinoma, heel_ulcer, diverticular_disease, lactate_levels, anastomotic_leak, dark_stools, gangrenous_cholecystitis, gastropathy, bowel_perforation, portal_hypertensive_gastropathy, syncopal_episodes, angioedema, neutropenic_fever, irritable_bowel_syndrome

LLDA: pain, bleeding, gi_bleed, gib, anemia, stress_ulcer, hives, hypotension, gastrointestinal_bleed, abdominal_pain, chest_pain, melena, wound, chf, diarrhea

MLC: fulminant_hepatic_failure, tired, hocm, hit, restless, lower_gi_bleed, effusions, calcifications, peripheral_neuropathy, blood_loss, unresponsiveness, sinus_tachycardia, bacteremia, upper_gi_bleed, duodenal_perforation

NMF+support: pain, bleeding, chf, pneumothorax, atelectasis, pleural_effusion, edema, hematoma, pna, afebrile, sob, pulmonary_edema, pneumonia, cough, confused

Figure 33: Learned Phenotypes for Blood Loss Anemia

0.4-CNMF: depression, overdose, serotonin_syndrome, od, fibromyalgia, clonus, blurred_vision, elevated_ammonia, type_1_diabetes, crohns_disease, fulminant_hepatic_failure, liver_injury, toxic_ingestion, vp_shunt, bronchopleural_fistula

LLDA: pain, depression, stress_ulcer, anxiety, nausea, chest_pain, hypotension, aspiration, fever, sob, chronic_pain, bleeding, abdominal_pain, vomiting, htn

MLC: depression, systolic_dysfunction, overdose, chronic_pain, osteoarthritis, ha, blurred_vision, chest_pressure, cerebral_edema, back_pain, lightheaded, pulmonary_edema, obesity, osa, hypothyroidism

NMF+support: pain, hypotension, bleeding, sob, edema, depression, stress_ulcer, nausea, bleed, pneumothorax, atelectasis, aspiration, hematoma, pleural_effusion, anxiety

Figure 34: Learned Phenotypes for Depression


Appendix C. Augmented Mortality Prediction

Figure 35 shows weights learned by the classifier for all features. The weights shaded red correspond to phenotypes and are relatively high compared to raw notes based features (shaded blue), indicating that comorbidities capture a significant amount of predictive information on mortality and achieve comparable performance to the full EHR model when augmented with additional raw clinical terms.

[Figure 35 plot: classifier weight (y-axis) versus feature index (x-axis, 0 to 4000).]

Figure 35: Weights learned by the CNMF+Full EHR classifier for all features. The weights shaded red correspond to phenotypes.
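One simple way to quantify the comparison in this figure is to average the absolute classifier weight within each feature group. The sketch below is an illustration only: the function name, the weight values, and the split into phenotype and raw features are hypothetical, and the comparison is meaningful only if the two feature groups are on comparable scales.

```python
import numpy as np


def mean_abs_weight_by_group(weights, is_phenotype):
    """Mean absolute weight for phenotype features and for raw note features."""
    w = np.abs(np.asarray(weights, dtype=float))
    mask = np.asarray(is_phenotype, dtype=bool)
    return float(w[mask].mean()), float(w[~mask].mean())


# Hypothetical weights: two phenotype features followed by four raw terms.
weights = [1.2, -0.8, 0.05, -0.02, 0.1, 0.0]
is_phenotype = [True, True, False, False, False, False]
phenotype_mean, raw_mean = mean_abs_weight_by_group(weights, is_phenotype)
```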
