



A Review on Development of Deep Learning Architectures for Performing Various Analytical Tasks on Electronic Health Record Data

K. Kamala Devi1, J. Rajasekar2, J. Senthil Kumar3

1,2Department of Computer Science and Engineering
3Department of Electronics and Communication Engineering
1,2,3Mepco Schlenk Engineering College, Sivakasi, Tamil Nadu, India

August 5, 2018

Abstract

This article presents a systematic review of deep learning models for electronic health record (EHR) data and illustrates various deep learning architectures for analyzing different data sources and their target applications. Ongoing research efforts are highlighted, and open challenges in building deep learning models of EHRs are identified. The review covers numerous articles, examining analytics on disease recognition and classification, sequential forecasting of medical events, concept embedding, EHR data privacy, and data augmentation. We then investigate how deep architectures can be applied to those tasks. Special challenges arising from health data and from modeling EHR data, along with their potential solutions, are also reviewed. Finally, the performance evaluations conducted for each analytic task are summarized.

Keywords: Electronic Health Records, Deep Learning, Neural Networks, Analytical Tasks, Systematic Review.

1 Introduction

Electronic health records (EHRs) are currently maintained by different healthcare institutions, gathering data from large numbers of patients. Producing precise analytic models from EHR data is extremely challenging because of data and label availability, data quality, and the heterogeneous nature of the data. Traditional health analytics modeling often depends on labor-intensive effort, and the resulting models often generalize poorly across datasets or institutions. Deep learning has shifted the analytic modeling paradigm from expert-driven features to data-driven features. Over the past few years, a growing body of work has demonstrated the success of feature construction using deep learning methods. Interest in deep learning for healthcare has grown for two reasons. First, for healthcare investigators, deep learning representations yield better performance on many tasks than traditional machine learning approaches and require less manual feature engineering. Second, large and complex datasets now exist in healthcare that allow training of complex deep learning models. At the same time, EHR data host many interesting modeling challenges for deep learning research.

This review recapitulates the current development of deep learning models for EHR data and recommends future research directions. A methodical review of deep learning models using EHR data from several sources was performed, using a combined search including the terms deep learning, neural networks, and health. For the relevance assessment, the following criteria were used. Since the focus is on deep learning models that use EHR data, works that do not use deep learning approaches or did not use EHR data were excluded. A small number of articles related to medical imaging or genetic data were included if such data were used in combination with EHR data. For example, deep learning for image classification in healthcare such as [4], [5] and forecasting the impact of gene expression mutations such as [6] and [7] are beyond the scope of this review; readers interested in those topics may refer to the surveys [8]-[10]. The topic relevance review based on titles and abstracts yielded 180 unique articles. In the third step, the full texts of the remaining articles were screened using the same inclusion criteria to confirm final relevancy, culminating in 99 articles that are included in this survey. The article selection method is illustrated and defined in Figure 1.

Figure 1. Illustration of the literature review process on deep learning and EHR data.

For each of the chosen articles, three aspects were evaluated: the category of the venue; the use of EHR data; and the target task, model, and performance. For the use of EHR data, we considered the sample size, the number of clinical events, the presence of labels, the use of longitudinal or temporal information, and the handling of data quality. Target tasks were separated into the following categories: disease detection, sequential prediction of clinical events, concept embedding, data augmentation, and EHR data privacy. Finally, the type of deep learning model used in each article and the corresponding performance were recorded. The modeling challenges and solutions from the reviewed articles were summarized into four categories, and possible solutions were listed from the existing works. Similarly, several open challenges that could become promising directions for future research were identified.


2 Analytics Tasks on EHR data

2.1 Disease classification

The goal of developing a deep learning model for disease classification is to map the input EHR data to the output disease target through multiple layers of neural networks. Several of the surveyed articles used disease-specific datasets. Examples include the Pooled Resource Open-Access Amyotrophic Lateral Sclerosis (ALS) Clinical Trials data used in [11] and the Parkinson's Progression Markers Initiative data used in [12]. Some studies include data from multiple modalities and support both binary classification [13,14] and multi-class classification [15]. Besides disease-specific multimodal data, some studies used multivariate time series data. For instance, [16] applied convolutional neural networks to multivariate electroencephalogram (EEG) signals for automated classification of normal and seizure subjects.
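As a rough illustration of this mapping (not drawn from any of the reviewed systems), the sketch below defines a small feed-forward classifier from a fixed-length EHR feature vector to a disease label; the feature dimension, hidden size, and synthetic batch are assumptions for illustration only.

```python
# Minimal sketch (illustrative, not from the reviewed papers): a feed-forward
# network mapping a fixed-length EHR feature vector (e.g., counts of diagnosis
# codes and binned lab values) to a binary disease label.
import torch
import torch.nn as nn

class DiseaseClassifier(nn.Module):
    def __init__(self, n_features=500, n_hidden=128, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, n_hidden),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(n_hidden, n_classes),
        )

    def forward(self, x):
        return self.net(x)              # raw logits; apply softmax for probabilities

model = DiseaseClassifier()
x = torch.randn(32, 500)                # a batch of 32 synthetic patient vectors
labels = torch.randint(0, 2, (32,))     # synthetic disease labels
loss = nn.CrossEntropyLoss()(model(x), labels)
loss.backward()                         # one supervised training step (optimizer omitted)
```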

2.2 Sequential prediction of clinical events

When modeling longitudinal EHR data, neural networks are used to establish relationships between historical observations and future events. In such cases, one can build predictive models of upcoming events based on a patient's history. Among the reviewed articles, some were directed at predicting the future onset of a new disease condition, such as heart failure onset prediction using an RNN on longitudinal patient data from Sutter Health [24]. In [26], three deep learning models were tried: one built on recurrent neural networks, one on an attention-based time-aware neural network model, and one on a neural network with boosted time-based decision stumps. The authors showed that numerous medical events can be precisely predicted using deep learning methods without site-specific data harmonization.
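A minimal sketch of this idea follows, assuming each visit is summarized as a multi-hot vector over a hypothetical code vocabulary; it is illustrative only and does not reproduce the specific models of [24] or [26].

```python
# Illustrative sketch: a GRU reads a patient's visit history (each visit is a
# multi-hot vector of medical codes) and outputs a risk logit for a future
# event such as heart failure onset. Vocabulary size and dimensions are assumed.
import torch
import torch.nn as nn

class OnsetPredictor(nn.Module):
    def __init__(self, n_codes=2000, emb_dim=128, hidden_dim=128):
        super().__init__()
        self.embed = nn.Linear(n_codes, emb_dim)      # multi-hot visit -> dense visit vector
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 1)           # risk logit

    def forward(self, visits):                        # visits: (batch, n_visits, n_codes)
        h, _ = self.rnn(torch.relu(self.embed(visits)))
        return self.out(h[:, -1])                     # use the last hidden state

model = OnsetPredictor()
visits = torch.bernoulli(torch.full((8, 10, 2000), 0.01))   # 8 synthetic patients, 10 visits each
risk_logits = model(visits)                                  # shape (8, 1)
```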

2.3 Concept embedding

Clinical phenotyping can be viewed as a special case of concept embedding in which various EHR data elements are mapped to the phenotype of interest. General concept embedding also offers feature representations of those phenotypes, such as med2vec [30].


For concept embedding tasks, deep learning models are trained in an unsupervised setting without target labels. To guarantee good generalization power, these tasks often leverage massive EHR databases. For example, the combined EHRs of about 700,000 patients from the Mount Sinai data warehouse [31] were used to extract patient representations. Other types of concept embedding take only free text as input [32], extracting pre-defined medical concepts from discharge summaries in the MIMIC-III data and using them to forecast patient phenotypes. However, deep learning models do not always outperform traditional models: [33] compared deep models with shallow models on classification tasks over clinical notes and showed that shallow models can remain competitive when the training sample size is small.
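The following sketch (with an assumed vocabulary of 5,000 codes and 64-dimensional embeddings) illustrates the basic idea of embedding discrete medical codes in a common space and pooling them into a visit-level representation; it is a simplification, not the med2vec model itself.

```python
# Sketch: map discrete medical codes to a shared vector space and mean-pool
# the codes of one encounter into a visit-level representation that a
# downstream phenotype model could consume. Code IDs are hypothetical.
import torch
import torch.nn as nn

n_codes, emb_dim = 5000, 64
code_embedding = nn.Embedding(n_codes, emb_dim)

visit_codes = torch.tensor([12, 402, 88, 1954])          # codes recorded at one visit
visit_vector = code_embedding(visit_codes).mean(dim=0)   # (emb_dim,) visit representation

# a patient representation can then be a sequence (or average) of visit vectors
print(visit_vector.shape)   # torch.Size([64])
```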

2.4 Data augmentation

Data augmentation includes various data synthesis and generation techniques that create either more training data to avoid overfitting or more labeled data to decrease the cost of label acquisition [34],[35], or even generate adverse drug response trajectories to inform potential risks [36]. In that work, patients' total cholesterol measurements were collected and augmented using a generative adversarial network (GAN); the generated records were evaluated on the task of predicting drug-induced laboratory test trajectories and demonstrated good performance. In [35], a GAN was used to generate static patient records of discrete events such as diagnosis counts. The synthetic data achieved performance comparable to real data in many experiments, including distribution statistics, predictive modeling tasks, and review by medical experts.


Figure 2. Transformation of EHR data from longitudinal form to perform different analytical tasks employing deep learning models.

2.5 EHR data privacy

De-identification is a crucial task in preserving the privacy of patient EHR data. A robust RNN-based de-identification system was built in [37] and evaluated on the i2b2 2014 and MIMIC de-identification datasets, showing better performance with RNNs than existing systems. Later, in [38], a hybrid RNN model was developed for de-identification of clinical notes, in which a bidirectional LSTM was used for character-level representation to capture the morphological information of words.
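A minimal sketch of an RNN-based de-identification tagger is given below; the token IDs, tag set, and dimensions are assumptions for illustration, and the character-level component used in [38] is omitted.

```python
# Sketch: a bidirectional LSTM assigns each token in a clinical note a PHI tag
# (e.g., O, NAME, DATE), assuming tokens are already mapped to integer IDs.
import torch
import torch.nn as nn

class DeidTagger(nn.Module):
    def __init__(self, vocab_size=30000, emb_dim=100, hidden_dim=128, n_tags=9):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.tag = nn.Linear(2 * hidden_dim, n_tags)   # forward + backward states

    def forward(self, token_ids):                      # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))
        return self.tag(h)                             # per-token tag logits

tagger = DeidTagger()
tokens = torch.randint(0, 30000, (4, 50))              # 4 synthetic notes, 50 tokens each
tag_logits = tagger(tokens)                            # (4, 50, 9)
```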

3 Deep learning architectures for analytics tasks on EHR data

Computational models with multiple processing layers learn representations of data at multiple levels of abstraction [3]. This has dramatically improved machine learning performance in many fields, such as computer vision [39], natural language processing [40], and speech recognition [41], and has also shown strong performance in healthcare and medical domains, for example using deep neural networks to detect referable diabetic retinopathy. Various deep learning architectures beyond fully connected neural networks were used to tackle different challenges, as elaborated below. Figure 2 illustrates commonly used deep architectures, and Table 1 shows the architecture distribution over all tasks.

3.1 Recurrent neural networks (RNNs)

RNNs are an extension of feedforward neural networks to model sequential data, such as time series [45], event sequences [24], and natural language text [50]. In particular, the recurrent structure of an RNN can capture the complex temporal dynamics in longitudinal EHR data, making RNNs the preferred architecture for several EHR modeling tasks, including sequential clinical event prediction [24,27,43,48-51,55,56], disease classification [14,21,42-47], and computational phenotyping [12,15,64]. The hidden states of the RNN act as its memory, since the current state of the hidden layer depends on the previous hidden state and the input at the current time. This also enables the RNN to handle variable-length sequence input.
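The following sketch shows one common way RNN implementations handle variable-length visit sequences, namely padding and packing; the sizes are illustrative only.

```python
# Sketch: variable-length patient sequences are padded to a common length and
# packed so the recurrence ignores the padding positions.
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

rnn = nn.GRU(input_size=32, hidden_size=64, batch_first=True)

lengths = torch.tensor([10, 7, 3])        # true number of visits per patient (sorted descending)
padded = torch.randn(3, 10, 32)           # padded batch: (patients, max_visits, features)

packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=True)
packed_out, h_n = rnn(packed)
output, _ = pad_packed_sequence(packed_out, batch_first=True)

print(output.shape, h_n.shape)            # (3, 10, 64) and (1, 3, 64)
```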

Table 1. Distribution of Deep Learning Architectures over variousanalytic tasks


3.2 Convolutional Neural Networks (CNNs)

In image, speech, and video analysis, CNNs exploit the local properties of the data, using convolutional and pooling layers to progressively extract abstract patterns. For example, CNNs greatly improved the performance of automatic classification of skin lesions from image data. CNNs work as follows: the convolutional layers convolve multiple local filters with their input data (raw data or outputs of previous layers) and produce translation-invariant local features; pooling layers then progressively reduce the size of the output to avoid overfitting. Both convolution and pooling are performed locally, so that (in image analysis) the representation of one local feature does not influence other regions. As temporal information in EHRs is often informative, modeling it with CNNs requires considering how to capture temporality. For example, in [13],[75], an additional convolutional operation was conducted over the temporal dimension. Besides modeling images and event sequences, CNNs have been used to label clinical text [19,21].
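As a simple sketch of temporal convolution over an EHR matrix (channels are clinical variables, length is time steps), the following block is illustrative only; the variable count, window length, and filter sizes are assumptions.

```python
# Sketch: 1D convolution over the temporal dimension of an EHR matrix, with
# pooling to progressively reduce the output before a classification layer.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv1d(in_channels=20, out_channels=32, kernel_size=3, padding=1),  # local temporal filters
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2),                                           # downsample over time
    nn.Conv1d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveMaxPool1d(1),                                               # one value per filter
    nn.Flatten(),
    nn.Linear(64, 2),                                                      # binary risk output
)

x = torch.randn(16, 20, 48)   # 16 patients, 20 clinical variables, 48 time steps
logits = cnn(x)               # (16, 2)
```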

3.3 Autoencoders (AEs)

AEs are unsupervised dimensionality-reduction models based on nonlinear transformations. For medical concept embedding (e.g., embedding different medical codes in a common space), AEs are a preferred family of models [11,31,64,79,80,83-85]. The composition of the encoder and decoder is called the reconstruction function. In a typical implementation, an autoencoder minimizes the reconstruction loss, allowing AEs to focus on capturing the important properties of the data while reducing its dimension. In [31], AEs were used to model EHRs in an unsupervised manner to capture stable structures and regular patterns in the data. Sparse AEs (SAE) and denoising AEs (DAE) are two AE variants. For the SAE, the reconstruction loss is regularized via a sparsity penalty on the internal code representation, so that the model learns a sparse representation; for the DAE, the input is partially corrupted and the model learns to reconstruct the clean input.
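A brief sketch combining the two variants (denoising corruption of the input plus a sparsity penalty on the code) is shown below; the input dimension and penalty weight are assumptions, and this is not the exact model of [31].

```python
# Sketch: a denoising autoencoder with an L1 sparsity penalty on the code.
import torch
import torch.nn as nn

class SparseDenoisingAE(nn.Module):
    def __init__(self, n_features=500, code_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, code_dim), nn.Sigmoid())
        self.decoder = nn.Linear(code_dim, n_features)

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code

model = SparseDenoisingAE()
x = torch.rand(32, 500)                                # raw EHR feature vectors
noisy = x * torch.bernoulli(torch.full_like(x, 0.9))   # randomly mask ~10% of inputs (denoising)
recon, code = model(noisy)
loss = nn.MSELoss()(recon, x) + 1e-3 * code.abs().mean()   # reconstruction + sparsity penalty
loss.backward()
```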

3.4 Unsupervised embedding

Several other unsupervised learning methods besides AEs have been applied to EHR concept representation. Word2vec variants have been applied to learn representations for medical codes [30]. In particular, word2vec has been extended to create a two-level representation for medical codes and clinical visits jointly [30]. Word2vec has two variants: the continuous bag-of-words (CBOW) model, which predicts target codes given their surrounding context, and the skip-gram model, which predicts the surrounding context given the target. The goal of these models is to embed terminologies from different domains into the same space to discover relations among them. In addition, the restricted Boltzmann machine (RBM) has been used for latent concept embedding [32],[96]. It takes a generative approach to model the underlying data-generation process of the input, which can also provide latent representations for EHR data.
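A minimal skip-gram-style sketch for code embedding follows; the code IDs and pairs are hypothetical, and negative sampling as well as the joint visit-level objective of the two-level model [30] are omitted.

```python
# Sketch: codes that co-occur within a visit form (target, context) pairs, and
# the model learns to predict the context code from the target code.
import torch
import torch.nn as nn

n_codes, emb_dim = 1000, 64
target_emb = nn.Embedding(n_codes, emb_dim)
context_proj = nn.Linear(emb_dim, n_codes)

# hypothetical co-occurring code pairs extracted from visits
targets = torch.tensor([12, 12, 57, 57, 903])
contexts = torch.tensor([57, 903, 12, 903, 12])

logits = context_proj(target_emb(targets))      # (5, n_codes)
loss = nn.CrossEntropyLoss()(logits, contexts)
loss.backward()                                 # after training, target_emb.weight holds code vectors
```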

3.5 Generative adversarial network (GAN)

A GAN is an approach for data generation via a game-theoretic process. The main idea is to train two neural networks: a generator and a discriminator. The generator takes random noise as input and generates samples, while the discriminator takes both real samples and the generated samples as input and tries to distinguish between the two. The two networks are trained alternately, with the expectation that the competition will drive the generator to produce more realistic samples and the discriminator to achieve greater distinguishing power. Recently, GANs have been used in the healthcare domain for generating continuous medical time series [99] and discrete codes [33],[35].
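The alternating training described above can be sketched as follows; the record dimension, noise dimension, and single training step are illustrative and not a faithful reproduction of the cited EHR GANs.

```python
# Toy sketch of one GAN training step: the generator maps noise to a synthetic
# patient record vector; the discriminator scores real versus generated records.
import torch
import torch.nn as nn

noise_dim, record_dim = 32, 200
G = nn.Sequential(nn.Linear(noise_dim, 128), nn.ReLU(), nn.Linear(128, record_dim), nn.Sigmoid())
D = nn.Sequential(nn.Linear(record_dim, 128), nn.ReLU(), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.rand(64, record_dim)                    # placeholder for real records

# discriminator step: distinguish real records from generated ones
fake = G(torch.randn(64, noise_dim)).detach()
d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# generator step: try to fool the discriminator
fake = G(torch.randn(64, noise_dim))
g_loss = bce(D(fake), torch.ones(64, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```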

4 Special Challenges and Possible Solutions

Special challenges arise from EHR data (e.g., temporality, irregularity, multiple modalities, lack of labels) and from model characteristics (e.g., interpretability). In this section, we elaborate on those challenges and describe possible solutions from the surveyed articles.

Temporality and irregularity. Longitudinal EHR data describe the trajectories of patients' health conditions over time. The short-term dependencies among medical events in EHRs were considered as local context for a patient's history, while the long-term effects provided global context [30].

4.1 Multi-modality

EHR data encompass multiple data modalities, including numeric values such as lab tests, free-text clinical notes, continuous monitoring data such as electrocardiography (ECG) and electroencephalography (EEG), medical images, and discrete codes for diagnoses, medications, and procedures. Researchers have confirmed that finding patterns across multimodal data can increase the accuracy of diagnosis and prediction and the overall performance of the learning system. However, multimodal learning is challenging because of the heterogeneity of the data. Existing work often took a multitask learning approach to jointly learn from data across multiple modalities [63].
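One common pattern, sketched below under assumed dimensions, is to give each modality its own encoder, concatenate the encodings into a shared representation, and attach separate heads for different tasks; the joint loss is typically a (weighted) sum of the task losses. This is an illustrative sketch, not the specific architecture of [63].

```python
# Sketch: modality-specific encoders feed a shared representation with
# separate task heads (multitask, multimodal learning).
import torch
import torch.nn as nn

class MultimodalMultitask(nn.Module):
    def __init__(self, code_dim=500, signal_dim=64, hidden=128):
        super().__init__()
        self.code_enc = nn.Sequential(nn.Linear(code_dim, hidden), nn.ReLU())
        self.signal_enc = nn.Sequential(nn.Linear(signal_dim, hidden), nn.ReLU())
        self.mortality_head = nn.Linear(2 * hidden, 1)     # task 1
        self.readmission_head = nn.Linear(2 * hidden, 1)   # task 2

    def forward(self, codes, signals):
        shared = torch.cat([self.code_enc(codes), self.signal_enc(signals)], dim=1)
        return self.mortality_head(shared), self.readmission_head(shared)

model = MultimodalMultitask()
out1, out2 = model(torch.rand(8, 500), torch.rand(8, 64))   # joint loss = sum of task losses
```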

4.2 Lack of labels

In our setting, labels refer to the gold-standard targets of interest, such as true states of clinical outcomes or true disease phenotypes. Gold-standard labels are often not consistently captured in EHR data and are thus typically unavailable in large numbers for training models. Identifying effective ways to label EHR records is one of the biggest obstacles to deep learning on EHR data. Label acquisition requires domain knowledge, often involving highly trained domain experts. In practice, a silver standard is often adopted: in most of the surveyed articles that took a supervised learning approach, patient labels were derived from the occurrence of codes, such as diagnosis, procedure, and medication codes. Other than manually crafting labels, transfer learning can offer alternative approaches.
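As a toy illustration of such a silver-standard rule (the code set and records below are hypothetical and do not constitute a validated phenotype definition):

```python
# Sketch: label a patient positive if any of a set of diagnosis codes appears
# in their record (silver-standard labelling from code occurrences).
import pandas as pd

diagnoses = pd.DataFrame({
    "patient_id": [1, 1, 2, 3, 3],
    "icd9_code":  ["428.0", "250.00", "401.9", "428.22", "414.01"],
})
heart_failure_codes = {"428.0", "428.22"}   # illustrative code set only

labels = (
    diagnoses.assign(hf=diagnoses["icd9_code"].isin(heart_failure_codes))
    .groupby("patient_id")["hf"].any()
    .astype(int)
)
print(labels.to_dict())   # {1: 1, 2: 0, 3: 1}
```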

4.3 Transfer learning

Some articles attempt to label EHR data implicitly. For example, [28] used an LSTM to model sequences of diagnostic codes, a proxy problem for disease progression, and showed that the learned knowledge could be transferred to new datasets for the same task. An autoencoder-variant architecture was applied to perform transfer learning from generic EHR data to predict a specific target, such as inferring prescriptions from diagnostic codes.

4.4 Interpretability

Although deep learning models can produce accurate predictions, they are often treated as black-box models that lack interpretability and transparency about their inner workings. This is an important problem because clinicians are often unwilling to accept machine recommendations without clarity about the underlying reasoning. Recently, there have been some efforts to explain black-box deep models.

4.5 Evaluation of analytics tasks

For supervised models, evaluation was often done directly on the learning task via quantitative metrics, such as accuracy and AUC. For unsupervised models, evaluation was often done indirectly using separate prediction tasks [30,31]. Popular evaluation metrics for binary prediction or classification include the AUC, the area under the precision-recall curve (PR-AUC), and the F1 score. For multiclass prediction or classification, micro-F1 and macro-F1 scores are popular choices. In addition, some works use the mean squared error for performance evaluation.
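These metrics can be computed directly with scikit-learn, as in the toy example below; the predictions are synthetic, average precision is used here as the usual estimate of PR-AUC, and for multiclass tasks f1_score with average="micro" or "macro" gives the micro- and macro-F1 scores mentioned above.

```python
# Sketch: computing the common evaluation metrics on toy binary predictions.
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score, f1_score, mean_squared_error

y_true = np.array([0, 1, 1, 0, 1, 0])
y_score = np.array([0.2, 0.8, 0.6, 0.4, 0.9, 0.1])     # predicted probabilities
y_pred = (y_score >= 0.5).astype(int)

print("AUC:   ", roc_auc_score(y_true, y_score))
print("PR-AUC:", average_precision_score(y_true, y_score))   # average precision over the PR curve
print("F1:    ", f1_score(y_true, y_pred))
print("MSE:   ", mean_squared_error(y_true, y_score))
```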

5 CONCLUSION

In this article, an overview of current deep learning models for EHR data is provided. Results from the reviewed articles have shown that, compared with other machine learning approaches, deep learning models excel at modeling raw data, diminishing the need for preprocessing and feature engineering, and markedly improving performance in several analytical tasks. Notably, deep learning models are well suited to identifying diseases or forecasting clinical events or outcomes given time series data such as EEG or other ICU biosignals, or imaging data. However, although deep learning techniques have shown promising results in many analytics tasks, several open challenges remain. First, despite various attempts, there is still a significant need to improve the quality of generated data and labels. For data augmentation, current challenges include the limited variety of generated data; the fact that data generation is often conducted under supervision, making the generated data biased toward the prediction task; and the need for more accurate quantitative measures to evaluate the utility and privacy preservation of the generated data. Challenges also arise for transfer learning of data and labels from the fact that deep models often do not explicitly capture uncertainties. This makes the models less robust in handling changes in the underlying data distribution. Thus, there is a risk in deploying models where shifts in the real EHR data could invalidate the models' future predictions; this could be a significant risk, especially in the healthcare setting. General methods have attempted to address these challenges, including better calibration of uncertainties and adversarial learning that relaxes the shared-label-space assumption. However, this remains an open area for deep learning on EHR data.

References

[1] Rajkomar A, Oren E, Chen K, et al. Scalable and accurate deep learning for electronic health records. arXiv:1801.07860 [cs.CY]. 2018.

[2] Richesson RL, Sun J, Pathak J, Kho AN, Denny JC. Clinical phenotyping in selected national networks: demonstrating the need for high-throughput, portable, and computational methods. Artif Intell Med 2016; 71: 57-61.

[3] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015; 521 (7553): 436-44.

[4] Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016; 316 (22): 2402-10.

[5] Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017; 542 (7639): 115-8.


[6] Leung MKK, Xiong HY, Lee LJ, Frey BJ. Deep learning of the tissue-regulated splicing code. Bioinformatics 2014; 30 (12): i121-9.

[7] Xiong HY, Alipanahi B, Lee LJ, et al. RNA splicing. The human splicing code reveals new insights into the genetic determinants of disease. Science 2015; 347 (6218): 1254806.

[8] Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal 2017; 42: 60-88.

[9] Angermueller C, Parnamaa T, Parts L, Stegle O. Deep learning for computational biology. Mol Syst Biol 2016; 12 (7): 878.

[10] Ching T, Himmelstein DS, Beaulieu-Jones BK, et al. Opportunities and obstacles for deep learning in biology and medicine. bioRxiv 2017; doi: 10.1101/142760.

[11] Beaulieu-Jones BK, Greene CS; Pooled Resource Open-Access ALS Clinical Trials Consortium. Semi-supervised learning of the electronic health record for phenotype stratification. J Biomed Inform 2016; 64: 168-78.

[12] Baytas IM, Xiao C, Zhang X, Wang F, Jain AK, Zhou J. Patient subtyping via time-aware LSTM networks. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Halifax, NS, Canada: ACM; 2017: 65-74.

[13] Cheng Y, Wang F, Zhang P, Hu J. Risk prediction with electronic health records: a deep learning approach. In: Proceedings of the 2016 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, Miami, Florida, USA; 2016: 432-40.

[14] Kam HJ, Kim HY. Learning representations for the early detection of sepsis with deep neural networks. Comput Biol Med 2017; 89: 248-55.

[15] Che C, Xiao C, Liang J, Jin B, Zho J, Wang F. An RNN architecture with dynamic temporal matching for personalized predictions of Parkinson's disease. In: Proceedings of the 2017 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, Houston, Texas, USA; 2017: 198-206.

[16] Acharya UR, Oh SL, Hagiwara Y, Tan JH, Adeli H. Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Comput Biol Med 2017; doi: 10.1016/j.compbiomed.2017.09.017.

[17] Johnson AEW, Pollard TJ, Shen L, et al. MIMIC-III, a freely accessible critical care database. Sci Data 2016; 3: 160035.

[18] Vani A, Jernite Y, Sontag D. Grounded recurrent neural networks. arXiv [Stat.ML] 2017. http://arxiv.org/abs/1705.08557

[19] Mullenbach J, Wiegreffe S, Duke J, Sun J, Eisenstein J. Explainable prediction of medical codes from clinical text. arXiv [Cs.CL] 2018. http://arxiv.org/abs/1802.05695

[20] Shi H, Xie P, Hu Z, Zhang M, Xing EP. Towards automated ICD coding using deep learning. arXiv [Cs.CL] 2017. http://arxiv.org/abs/1711.04075

[21] Baumel T, Nassour-Kassis J, Cohen R, Elhadad M, Elhadad N. Multi-label classification of patient notes: a case study on ICD code assignment. arXiv [Cs.CL] 2017. http://arxiv.org/abs/1709.09587

[22] Yoon H-J, Ramanathan A, Tourassi G. Multi-task deep neural networks for automated extraction of primary site and laterality information from cancer pathology reports. In: Angelov P, Manolopoulos Y, Iliadis L, Roy A, Vellasco M, eds. Advances in Big Data. Cham: Springer; 2016: 195-204.

[23] Qiu J, Yoon H-J, Fearn PA, Tourassi GD. Deep learning for automated extraction of primary sites from cancer pathology reports. IEEE J Biomed Health Inform 2017; doi: 10.1109/JBHI.2017.2700722.

[24] Choi E, Schuetz A, Stewart WF, Sun J. Using recurrent neural network models for early detection of heart failure onset. J Am Med Inform Assoc 2017; 24: 361-70.


[25] Futoma J, Morris J, Lucas J. A comparison of models for predicting early hospital readmissions. J Biomed Inform 2015; 56: 229-38.

[26] Rajkomar A, Oren E, Chen K, et al. Scalable and accurate deep learning for electronic health records. arXiv [Cs.CY] 2018. http://arxiv.org/abs/1801.07860. Choi E, Bahadori MT, Schuetz A, Stewart WF, Sun J. Doctor AI: predicting clinical events via recurrent neural networks. JMLR Workshop Conf Proc, Los Angeles, CA, USA: PMLR; 2016; 56: 301-18.

[27] Bajor JM, Lasko TA. Predicting medications from diagnostic codes with recurrent neural networks. 2016. https://openreview.net/pdf?id=rJEgeXFex

[28] Zhang Y, Chen R, Tang J, Stewart WF, Sun J. LEAP: learning to prescribe effective and safe treatment combinations for multimorbidity. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY: ACM; 2017: 1315-24.

[29] Choi E, Bahadori MT, Searles E, et al. Multi-layer representation learning for medical concepts. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY: ACM; 2016: 1495-1504.

[30] Miotto R, Li L, Kidd BA, Dudley JT. Deep patient: an unsupervised representation to predict the future of patients from the electronic health records. Sci Rep 2016; 6 (1): 26094.

[31] Gehrmann S, Dernoncourt F, Li Y, et al. Comparing rule-based and deep learning models for patient phenotyping. arXiv [Cs.CL] 2017. http://arxiv.org/abs/1703.08705

[32] Turner CA, Jacobs AD, Marques CK, et al. Word2Vec inversion and traditional text classifiers for phenotyping lupus. BMC Med Inform Decis Mak 2017; 17 (1): 126.

[33] Che Z, Cheng Y, Zhai S, Sun Z, Liu Y. Boosting deep learning risk prediction with generative adversarial networks for electronic health records. In: 2017 IEEE International Conference on Data Mining (ICDM). New Orleans, LA, USA: IEEE; 2017: 787-92.

[34] Choi E, Biswal S, Malin B, Duke J, Stewart WF, Sun J. Generating multi-label discrete electronic health records using generative adversarial networks. arXiv preprint arXiv:1703.06490. 2017. https://arxiv.org/abs/1703.06490

[35] Yahi A, Vanguri R, Elhadad N, Tatonetti NP. Generative adversarial networks for electronic health records: a framework for exploring and evaluating methods for predicting drug-induced laboratory test trajectories. arXiv [Cs.LG] 2017. http://arxiv.org/abs/1712.00164

[36] Dernoncourt F, Lee JY, Uzuner O, Szolovits P. De-identification of patient notes with recurrent neural networks. J Am Med Inform Assoc 2017; 24 (3): 596-606.

[37] Liu Z, Tang B, Wang X, Chen Q. De-identification of clinical notes via recurrent neural network and conditional random field. J Biomed Inform 2017; 75: S34-42.

[38] Tompson JJ, Jain A, LeCun Y, Bregler C. Joint training of a convolutional network and a graphical model for human pose estimation. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. Advances in Neural Information Processing Systems 27. Curran Associates, Inc.; 2014: 1799-1807.

[39] Sutskever I, Vinyals O, Le QV. Sequence to sequence learning with neural networks. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. Advances in Neural Information Processing Systems 27. Curran Associates, Inc.; 2014: 3104-12.

[40] Hinton G, Deng L, Yu D, et al. Deep neural networks for acoustic modelling in speech recognition: the shared views of four research groups. IEEE Signal Process Mag 2012; 29 (6): 82-97.

[41] Choi E, Bahadori MT, Sun J, Kulas J, Schuetz A, Stewart W. RETAIN: an interpretable predictive model for healthcare using reverse time attention mechanism. In: Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R, eds. Advances in Neural Information Processing Systems 29. Curran Associates, Inc.; 2016: 3504-12.

[42] Choi E, Bahadori MT, Song L, Stewart WF, Sun J. GRAM: graph-based attention model for healthcare representation learning. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY: ACM; 2017: 787-95.

[43] Ayyar S. Tagging patient notes with ICD-9 codes; 2017. https://web.stanford.edu/class/cs224n/reports/2744196.pdf

[44] Lipton ZC, Kale DC, Elkan C, Wetzel R. Learning to diagnose with LSTM recurrent neural networks. arXiv [Cs.LG] 2015. http://arxiv.org/abs/1511.03677

[45] Ma F, Chitta R, Zhou J, You Q, Sun T, Gao J. Dipole: diagnosis prediction in healthcare via attention-based bidirectional recurrent neural networks. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY: ACM; 2017: 1903-11.

[46] Goodwin TR, Harabagiu SM. Deep learning from EEG reports for inferring underspecified information. AMIA Jt Summits Transl Sci Proc 2017; 2017: 112-21.

[47] Nguyen P, Tran T, Venkatesh S. Finding algebraic structure of care in time: a deep learning approach. arXiv [Cs.LG] 2017. http://arxiv.org/abs/1711.07980

[48] Jagannatha AN, Yu H. Bidirectional RNN for medical event detection in electronic health records. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: Association for Computational Linguistics; 2016: 473-482.

[49] Jagannatha AN, Yu H. Structured prediction models for RNN based sequence labeling in clinical text. Proc Conf Empir Methods Nat Lang Process 2016; 2016: 856-65.


[50] Velickovic P, Karazija L, Lane ND, et al. Cross-modal recurrent models for human weight objective prediction from multimodal time-series data. arXiv preprint arXiv:1709.08073. 2017. https://arxiv.org/abs/1709.08073

[51] Thodoroff P, Pineau J, Lim A. Learning robust features using deep learning for automatic seizure detection. In: Machine Learning for Healthcare Conference. Children's Hospital LA, Los Angeles, CA, USA: PMLR; 2016: 178-190.

[52] Luo Y. Recurrent neural networks for classifying relations in clinical notes. J Biomed Inform 2017; 72: 85-95.

[53] Zhang S, Xie P, Wang D, Xing EP. Medical diagnosis from laboratory tests by combining generative and discriminative learning. arXiv [Cs.AI] 2017. http://arxiv.org/abs/1711.04329

[54] Pham T, Tran T, Phung D, Venkatesh S. DeepCare: a deep dynamic memory model for predictive medicine. In: Advances in Knowledge Discovery and Data Mining. Cham: Springer International Publishing; 2016: 30-41.

[55] Pham T, Tran T, Phung D, Venkatesh S. Predicting healthcare trajectories from medical records: a deep learning approach. J Biomed Inform 2017; 69: 218-29.

[56] Esteban C, Staeck O, Baier S, Yang Y, Tresp V. Predicting clinical events by combining static and dynamic information using recurrent neural networks. In: 2016 IEEE International Conference on Healthcare Informatics (ICHI). Chicago, IL, USA: IEEE; 2016: 93-101.

[57] Suresh H, Hunt N, Johnson A, Celi LA, Szolovits P, Ghassemi M. Clinical intervention prediction and understanding using deep networks. arXiv [Cs.LG] 2017. http://arxiv.org/abs/1705.08498

[58] Futoma J, Hariharan S, Sendak M, et al. An improved multi-output Gaussian process RNN with real-time validation for early sepsis detection. arXiv [Stat.ML] 2017. http://arxiv.org/abs/1708.05894


[59] Futoma J, Hariharan S, Heller K. Learning to detect sepsis with a multitask Gaussian process RNN classifier. arXiv [Stat.ML] 2017. http://arxiv.org/abs/1706.04152

[60] Yang Y, Fasching PA, Tresp V. Modeling progression free survival in breast cancer with tensorized recurrent neural networks and accelerated failure time models. In: Machine Learning for Healthcare Conference. Boston, Massachusetts: PMLR; 2017: 164-176.

[61] Liu Y, Logan B, Liu N, Xu Z, Tang J, Wang Y. Deep reinforcement learning for dynamic treatment regimes on medical registry data. In: 2017 IEEE International Conference on Healthcare Informatics (ICHI). Park City, UT, USA: IEEE; 2017: 380-385.

[62] Razavian N, Marcus J, Sontag D. Multi-task prediction of disease onsets from longitudinal laboratory tests. In: Machine Learning for Healthcare Conference. Los Angeles, CA, USA: PMLR; 2016: 73-100.

[63] Suresh H, Szolovits P, Ghassemi M. The use of autoencoders for discovering patient phenotypes. arXiv [Cs.LG] 2017. http://arxiv.org/abs/1703.07004

[64] Che C, Xiao C, Liang J, Jin B, Zho J, Wang F. An RNN architecture with dynamic temporal matching for personalized predictions of Parkinson's disease. In: Proceedings of the 2017 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics; 2017: 198-206.

[65] Dubois S, Romano N, Kale DC, Shah N, Jung K. Learning effective representations from clinical notes. arXiv [Stat.ML] 2017. http://arxiv.org/abs/1705.07025

[66] Jia Y, Zhou C, Motani M. Spatio-temporal autoencoder for feature learning in patient data with missing observations. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE; 2017: 886-90.

[67] Lipton ZC, Kale DC, Wetzel R. Modeling missing data in clinical time series with RNNs. Machine Learning for Healthcare. 2016.


[68] Potes C, Parvaneh S, Rahman A, Conroy B. Ensemble of feature-based and deep learning-based classifiers for detection of abnormal heart sounds. In: 2016 Computing in Cardiology Conference (CinC). Computing in Cardiology; 2016. doi: 10.22489/CinC.2016.182-399.

[70] Zhang X, Henao R, Gan Z, Li Y, Carin L. Multi-label learning from medical plain text with convolutional residual models. arXiv [Stat.ML] 2018. http://arxiv.org/abs/1801.05062

[71] Razavian N, Sontag D. Temporal convolutional neural networks for diagnosis from lab tests. arXiv [Cs.LG] 2015. http://arxiv.org/abs/1511.07938

[72] Hao Y, Khoo HM, von Ellenrieder N, Zazubovits N, Gotman J. DeepIED: an epileptic discharge detector for EEG-fMRI based on deep learning. Neuroimage Clin.

[73] Yang Y, Xie P, Gao X, et al. Predicting discharge medications at admission time based on deep learning. arXiv [Cs.CL] 2017. http://arxiv.org/abs/1711.01386

[74] Nguyen P, Tran T, Wickramasinghe N, Venkatesh S. Deepr: a convolutional net for medical records. IEEE J Biomed Health Inform 2017; 21 (1): 22-30.

[75] Zhu Z, Yin C, Qian B, Cheng Y, Wei J, Wang F. Measuring patient similarities via a deep architecture with medical concept embedding. In: 2016 IEEE 16th International Conference on Data Mining (ICDM). Barcelona, Spain: IEEE; 2016: 749-758.

[76] Che Z, Cheng Y, Sun Z, Liu Y. Exploiting convolutional neural network for risk prediction with medical feature embedding. arXiv [Cs.LG] 2017. http://arxiv.org/abs/1701.07474

[77] Luo Y, Cheng Y, Uzuner O, Szolovits P, Starren J. Segment convolutional neural networks (Seg-CNNs) for classifying relations in clinical notes. J Am Med Inform Assoc 2018; 25 (1): 93-8.


[78] Grnarova P, Schmidt F, Hyland SL, Eickhoff C. Neural document embeddings for intensive care patient mortality prediction. arXiv [Cs.CL] 2016. http://arxiv.org/abs/1612.00467

[79] Suo Q, Xue H, Gao J, Zhang A. Risk factor analysis based on deep learning models. In: Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. Seattle, WA, USA: ACM; 2016: 394-403.

[80] Yuan Y, Xun G, Jia K, Zhang A. A multi-view deep learning method for epileptic seizure detection using short-time Fourier transform. In: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. ACM; 2017: 213-22.

[81] Wang Z, Li L, Glicksberg BS, Israel A, Dudley JT, Ma'ayan A. Predicting age by mining electronic medical records with deep learning characterizes differences between chronological and physiological age. J Biomed Inform 2017; 76: 59-68.

[82] Huang Z, Dong W, Duan H, Liu J. A regularized deep learning approach for clinical risk prediction of acute coronary syndrome using electronic health records. IEEE Trans Biomed Eng 2017; doi: 10.1109/TBME.2017.2731158.

[83] Che Z, Kale D, Li W, Bahadori MT, Liu Y. Deep computational phenotyping. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM; 2015. http://dl.acm.org/citation.cfm?id=2783365

[84] Lasko TA, Denny JC, Levy MA. Computational phenotype discovery using unsupervised feature learning over noisy, sparse, and irregular clinical data. PLoS One 2013; 8 (6): e66341.

[85] Lv X, Guan Y, Yang J, Wu J. Clinical relation extraction with deep learning. Int J Hybrid Inform Technol 2016; 9 (7): 237-48.

[86] Jacobson O, Dalianis H. Applying deep learning on electronic health records in Swedish to predict healthcare-associated infections. ACL 2016; 2016: 191.


[87] Ulloa Cerna AE, Wehner G, Hartzel DN, Haggerty C, Fornwalt B. Abstract 16708: data driven phenotyping of patients with heart failure using a deep-learning cluster representation of echocardiographic and electronic health record data. Circulation 2017; 136: A16708.

[88] Bianchi FM, Mikalsen K, Jenssen R. Learning compressed representations of blood samples time series with missing data. arXiv [Cs.NE] 2017. http://arxiv.org/abs/1710.07547

[89] Yuan Y, Xun G, Suo Q, Jia K, Zhang A. Wave2vec: learning deep representations for biosignals. In: 2017 IEEE International Conference on Data Mining (ICDM). IEEE; 2017: 1159-64.

[90] Hwang U, Choi S, Yoon S. Disease prediction from electronic health records using generative adversarial networks. arXiv [Cs.LG] 2017. http://arxiv.org/abs/1711.04126

[91] Beaulieu-Jones BK, Moore JH. Missing data imputation in the electronic health record using deeply learned autoencoders. Pac Symp Biocomput 2017; 22: 207-18.

[92] Che Z, Purushotham S, Khemani R, Liu Y. Interpretable deep models for ICU outcome prediction. AMIA Annu Symp Proc 2016; 2016: 371-80.

[93] Liang Z, Zhang G, Huang JX, Hu QV. Deep learning for healthcare decision making with EMRs. In: 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Belfast, UK: IEEE; 2014: 556-9.

[94] Henriksson A, Kvist M, Dalianis H, Duneld M. Identifying adverse drug event information in clinical notes with distributional semantic representations of context. J Biomed Inform 2015; 57: 333-49.

[95] Du H, Ghassemi MM, Feng M. The effects of deep network topology on mortality prediction. In: 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). 2016: 2602-5.


[96] Tran T, Nguyen TD, Phung D, Venkatesh S. Learning vector representation of medical objects via EMR-driven nonnegative restricted Boltzmann machines (eNRBM). J Biomed Inform 2015; 54: 96-105.

[97] Glicksberg BS, Miotto R, Johnson KW, Shameer K, Li L, Chen R. Automated disease cohort selection using word embeddings from electronic health records. Pac Symp Biocomput 2018; 23: 145-56.

[98] Prakash A, Zhao S, Hasan SA, et al. Condensed memory networks for clinical diagnostic inferencing. AAAI; 2017: 3274-80.

[99] Esteban C, Hyland SL, Ratsch G. Real-valued (medical) time series generation with recurrent conditional GANs. arXiv [Stat.ML] 2017. http://arxiv.org/abs/1706.02633
