Research Article Network-Based Inference Methods for Drug ...downloads.hindawi.com/journals/cmmm/2015/130620.pdf · future drug repositioning. It is expected that these methods will

Research ArticleNetwork-Based Inference Methods for Drug Repositioning

Hailin Chen1 Heng Zhang2 Zuping Zhang3 Yiqin Cao1 and Wenliang Tang1

1School of Software East China Jiaotong University Nanchang 330013 China2School of Information Engineering East China Jiaotong University Nanchang 330013 China3School of Information Science and Engineering Central South University Changsha 410083 China

Correspondence should be addressed to Hailin Chen chl ecjtuhotmailcom

Received 8 October 2014 Revised 18 March 2015 Accepted 24 March 2015

Academic Editor David A Winkler

Copyright copy 2015 Hailin Chen et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Mining potential drug-disease associations can speed updrug repositioning for pharmaceutical companies Previous computationalstrategies focused onprior biological information for association inferenceHowever such informationmaynot be comprehensivelyavailable andmay contain errors Different fromprevious research two inferencemethodsProbS andHeatS were introduced in thispaper to predict direct drug-disease associations based only on the basic network topologymeasure Bipartite network topologywasused to prioritize the potentially indicated diseases for a drug Experimental results showed that both methods can receive reliableprediction performance and achieve AUC values of 09192 and 09079 respectively Case studies on real drugs indicated that someof the strongly predicted associations were confirmed by results in the Comparative Toxicogenomics Database (CTD) Finally acomprehensive prediction of drug-disease associations enables us to suggest many new drug indications for further studies

1 Introduction

Drug discovery is a costly and time-consuming processPrevious research reported that it takes around 15 years and$800 million to $1 billion for pharmaceutical companies todevelop a new drug and bring it to market [1 2] Althoughsuch huge amount of time and money has been put into thisindustry only a relatively small number (sim20) of new drugsknown as newmolecular entities (NMEs) are approved byUSFood andDrugAdministration (FDA) each year (httpwwwfdagovDrugsDevelopmentApprovalProcessHowDrug-sareDevelopedandApprovedDrugandBiologicApprovalRe-portsdefaulthtm) Even so there is a declining trend inannual drug introductions and the most significant cause ofthis productivity decline has been the decrease in develop-ment survival rates especially the attrition rate in Phase IItrials [3] It was estimated that Phase II survival in drugdevelopment has dropped from near 50 to less than 30[4 5]

As the concept of ldquoone drug multiple targetsrdquo [6] isnow widely recognized drug repositioning or repurposing(ie finding a new use for an existing drug) [7] has beenproposed to provide a solution to the above problems faced

by pharmaceutical companies For example Rituxan wasoriginally indicated for non-Hodgkinrsquos Lymphoma and itwas later approved for chronic lymphocytic leukemia andrheumatoid arthritis Since drug repositioning can benefitnot only pharmaceutical companies but also patients variousefforts including traditionally blind screening methods ofchemical libraries against specific cell lines [8] or cellularorganisms [9 10] and serial testing of animal models [11]have been made to search new indications for existing drugsMeanwhile to reduce cost and time of in vivo and invitro experiments many computational methods have beenpublished for drug repositioning [12ndash22]These methods canbe classified as ldquodrug basedrdquo or ldquodisease basedrdquo and theymainly take similarity measures (chemical similarity molec-ular activity similarity or side effect similarity) moleculardocking or shared molecular pathology to reveal potentialrepurposing opportunities [23] New drug targets mainlyproteins genes or pathways are predicted by these methodsfor further drug repositioning Applications and limitationscoexist in these methods and comprehensive summaries ofrecent advancement of computational drug repositioning areavailable in the two reviews [23 24]

Hindawi Publishing CorporationComputational and Mathematical Methods in MedicineVolume 2015 Article ID 130620 7 pageshttpdxdoiorg1011552015130620

2 Computational and Mathematical Methods in Medicine

More recently several computational strategies have beenproposed to predict direct drug-disease associations for drugrepositioning The Comparative Toxicogenomics Database(CTD httpctdbaseorg) [25] is a comprehensive andpublicly available database which inferred chemical-diseaseassociations by integrating chemical-gene interactions andgene-disease relationships received manually from publishedliterature Chiang and Butte [26] implemented a network-based guilt-by-association method (GBA) for drug-diseaseassociation prediction and novel drug use suggestions weremade based on shared treatment profile from disease pairsThis repositioning strategy is limited by the complex relation-ships between drugs and diseases asmany drugs are indicatedas palliative treatments for diseases like cancers [23] Basedon the observation that similar drugs are indicated forsimilar diseases Gottlieb et al [27] developed a novel algo-rithm PREDICT to infer potential drug-disease associationsand predict new drug indications Multiple drug-drug anddisease-disease similaritymeasures were integrated into theirmethod and excellent prediction accuracy can be obtainedon cross validation However negative samples were neededin their model to implement the prediction procedureExperimentally verified negative drug-disease associationsare not available due to lack of research value A comodule(important gene modules shared by both drugs and diseases)method was proposed by Zhao and Li [28] to understandhow drugs and diseases are associated in the molecularlevel Applying the method to simulations and real datademonstrated that it was able to identify new drug-diseaseassociations and highlight their molecular basis Huang et al[29] designed a network propagation model and exploitedexisting chemical genomic and disease phenotype data toinfer drug-proteingene-disease phenotype relationships inwhich genes with similar functional modules were related tonot only drugs but also the disease phenotypeDaminelli et al[30] integrated structural and chemical data to build a drug-target-disease network and mined the network for networkmotifs of bi-cliques where every drug is linked to every targetand disease Links from drugs to diseases were predicted bycompleting the incomplete bicliques Ye et al [31] collectedknown drug target information and disease pathway profileto develop a method for evaluating the relationships betweendrugs and a specific disease Emig et al [32] integrated diseasegene expression signatures drug targets disease informationand molecular interaction network to prioritize drug targetsfor drug repositioning

Computational inference methods are important waysto choose the most promising drug-disease associations forfurther drug repositioning Current computational methodsneed biologically relevant information including negativeassociation results similarity measures or gene expressionprofiles to assist drug-disease association prediction Suchadditional information may not be easily available or maycontain errors [23] Different from previous research inthis paper two inference methods ProbS and HeatS wereintroduced to predict direct drug-disease associations basedonly on the basic network topology measure We formulatedthe problem as recommending preferable diseases for drugsNetwork topology was used to prioritize the potentially

targeted diseases for drugs We tested the two methods onan experimentally verified dataset with leave-one-out crossvalidation for performance evaluation Experimental resultsshowed that both methods can receive reliable predictionperformance and achieve AUC values of 09192 and 09079respectively The method ProbS with a better performancewas selected for potential drug-disease association predic-tion Case studies demonstrated that some of the stronglypredicted associations are confirmed by the publicly availabledatabaseCTD [25] which indicated the practical applicationsof the method ProbS in a real environment Finally plenty ofpredicted drug-disease associationswere publicly released forfuture drug repositioning It is expected that these methodswill provide help to facilitate further research on drugrepositioning

2 Materials and Methods

21 Dataset Experimentally confirmed drug-disease associ-ations were downloaded from the supplementary materialof [27] At the time of the paper [27] was written Got-tlieb et al collected 1933 associations between 593 drugstaken from DrugBank [34] and 313 diseases listed in theOnline Mendelian Inheritance in Man (OMIM) database[35] This set of known drug-disease associations is regardedas the ldquogold standardrdquo data and is used for evaluating theperformance of our introduced methods in the followingcross validation experiments as well as training data in thecomprehensive association prediction

We denote the drug set as 119863 = 1198891 119889

2 119889

119898 and the

disease set as 119875 = 1199011 119901

2 119901

119899 The drug-disease associ-

ations can be described as a bipartite DP graph 119866(119863 119875 119864)where 119864 = 119890

119894119895 119889

119894isin 119863 119901

119895isin 119875 An edge is drawn between

the drug 119889119894and the disease 119901

119895if there exists an association

between them The DP bipartite network can be presentedby an 119899 times 119898 adjacent matrix 119886

119894119895 where 119886

119894119895= 1 if 119901

119894and

119889

119895is linked while all other unknown drug-disease pairs are

labeled as 0 to indicate that their associations need to bepredicted

22 Method Description The methods ProbS and HeatS weapplied in this paper are based on the recommendationtechniques developed by Zhou et al [33 36] We formulatedour problem as recommending diseases for a given drug bymining data on the drug-disease bipartite network properties

Given the drug-disease bipartite network defined aboveProbSworks by assigning diseases an initial resource denotedby the vector 119891 and 119891(119901

119894) is the initial resource allocated to

the 119894th disease in the disease set 119875 The initial resource of anassociated disease is set to be 1 in our study otherwise it is setto be 0 In the first step all the resource in the disease set 119875flows to the drug set119863 according to

119891 (119889

119897) =

119899

sum

119894=1

119886

119894119897119891 (119901

119894)

119896 (119901

119894)

(1)

Computational and Mathematical Methods in Medicine 3

1 13 3136

518

16

2536

56

56

0

0

1

(a)

12 1318

23

12

5623

1

1

0

0

1

(b)

Figure 1 The flowchart of the two methods ProbS (a) and HeatS (b) The cylinder objects and the ellipse objects mean drugs and diseasesrespectively This figure is inspired by [33]

Table 1 Statistics of the validated drug-disease association network

Number ofdrugs

Number ofdiseases

Number ofdrug-diseaseassociations

Average degree ofdrugs

Average degree ofdiseases Sparsity

593 313 1933 326 618 00104

where 119896(119901119894) is the degree of node 119901

119894 Subsequently all the

resource in the drug set119863 returns similarly back to the diseaseset 119875 The final resource allocated to 119901

119894is calculated as

119891

1015840(119901

119894) =

119898

sum

119897=1

119886

119894119897119891 (119889

119897)

119896 (119889

119897)

=

119898

sum

119897=1

119886

119894119897

119896 (119889

119897)

119899

sum

119895=1

119886

119895119897119891 (119901

119895)

119896 (119901

119895)

(2)

The final recommendation list of candidate diseases is sortedaccording to the results of (2) in a descending order Thedisease(s) with the highest score(s) is chosen as the newindication(s) for the corresponding drug

While for the method HeatS [36] the final resourceallocated to 119901


119891

1015840(119901

119894) =

119898

sum

119897=1

119886

119894119897

119896 (119901

119894)

119899

sum

119895=1

119886

119895119897119891 (119901

119895)

119896 (119889

119897)

(3)

Comparison of the main pipeline of the two methods isillustrated in Figure 1

3 Results

31 Construction and Characteristics of the Drug-DiseaseAssociation Network In this study we first focused on theverified drug-disease associations There were 1933 experi-mentally confirmed drug-disease associations including 593drugs and 313 diseases on our benchmark dataset We usedthe 1933 associations to generate a bipartite graph (Figure 2)In the bipartite graph the heterogeneous nodes correspond toeither drugs or diseases and edges correspond to associationsbetween them An edge is placed between a drug node and adisease node if the drug is known to have an association withthe disease It can be observed that drugs tend to bind knowndiseases forming highly interconnected subnetworks

Figure 3 gives the degree distributions of drugs anddiseases in the drug-disease network We found that more

than 70 (422593) of the drugs were associated with at leasttwo diseases and approximately 66 (206313) diseases wereassociated with two or more drugs The average number ofassociated diseases for each drug is 326 and the averagenumber of associated drugs for each disease is 618 Moredetails are available in Figure 3 and Table 1

32 Leave-One-Out Cross Validation for Performance Eval-uation To assess the comparative performance of bothmethods in predicting new indications for existing drugswe perform leave-one-out cross validation experiments onthe benchmark dataset For a given drug 119889 each knownassociated disease was left out once in turn as test diseasewhose initial resource was set to be 0 and the candidatedisease set consisted of all the diseases which had no evidenceto show their associations with the drug 119889 The entireassociations were prioritized according to the scores derivedfrom the two methods

We calculated the sensitivity and specificity for eachthreshold Sensitivity refers to the percentage of the asso-ciations whose ranking is higher than a given thresholdnamely the ratio of the successfully predicted experimentallyverified drug-disease associations to the total experimentallyverified drug-disease associations Specificity refers to thepercentage of associations that are below the thresholdReceiver-operating characteristics (ROC) curveswere plottedby varying the threshold and the values of area under curves(AUC) were calculated When the two methods ProbS andHeatS were tested on the 1933 experimentally verified drug-disease associations in the framework of leave-one-out crossvalidation reliable AUC values of 09192 and 09079 werereceived respectively To be instructive we also providedAUC values for each drug received by leave-one-out crossvalidation (see Supplementary Material S1 available onlineat httpdxdoiorg1011552015130620 for themethodProbSand S2 for HeatS)


Figure 2 Drug-disease association network Red rectangles and yellow rectangles indicate drugs and diseases respectively The bipartitenetwork is generated by using 1933 experimentally verified associations between drugs and diseases This network is prepared by Cytoscape(httpwwwcytoscapeorg)

We also calculated the positions of preferably drug-disease associations ranked on the recommendation list forthe two methods For instance if there are 100 candidatediseases for a drug119889 and an associated disease119898 is ranked 5thin the candidate diseases we say the position of the disease119898is 5100 denoted by ⟨119875⟩ = 005 A good algorithm is expectedto give a small ⟨119875⟩Therefore we used the average value of theposition ⟨119875⟩ over all diseases in the candidate set to comparethe algorithmic accuracy The average values of the position⟨119875⟩ for ProbS andHeatSwere 00319 and 00434 respectively

The reliable performance suggested that both the twomethods can recover the confirmeddrug-disease associationsand therefore has the potential to predict new drug-diseaseassociations for drug repositioning

33 Comparison with Other Methods Recently several com-putational models which were based on different data fea-tures have been proposed for drug-disease association pre-diction Some information like protein-protein interactionsadopted in [29] has a high rate of false-positive and false-negative results which will influence experimental resultsThe most recent study related with our work is the computa-tionalmodel proposed byGottlieb et al [27] whichwas basedon the observation that similar drugs are indicated for similardiseases Excellent prediction performance can be receivedby this method However one limitation of this model is thatnegative samples were needed for association prediction Ourmethod is based on the experimentally verified drug-diseaseassociations and does not make use of any prior biologicalknowledge


0

20

40

60

80coun

t 100

120

140

160

180

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

Drugs

degree

(a)

0

20

40

60

80

100

120

1 5 9 131721252933374145495357616569737781

Diseases

degree

coun

t(b)

Figure 3 Degree distributions of drugs and diseases in the drug-disease network (a) shows the histograms of the degree distributions ofdrugs (b) shows the histograms of the degree distributions of diseases

Table 2 The newly confirmed drug-disease associations in the top 10 predicted results of the drug Felodipine

Drug Disease Rank StatusFelodipine Hypoparathyroidism sensorineural deafness and renal disease 1 ConfirmedFelodipine Hypertension essential 2 ConfirmedFelodipine Hydrops-ectopic calcification-moth-eaten skeletal dysplasia 3Felodipine Enteropathy familial with villous edema and immunoglobulin G2 deficiency 4Felodipine Preeclampsiaeclampsia 1 Pee1 5Felodipine Insensitivity to pain with hyperplastic myelinopathy 6Felodipine Glaucoma 1 open angle A Glc1A 7Felodipine Acanthosis nigricans with muscle cramps and acral enlargement 8Felodipine Atrial fibrillation familial 3 Atfb3 9 ConfirmedFelodipine Prostatic hyperplasia benign Bph 10 Confirmed

34 Case Studies To further explain the inference power ofthe method ProbS on drug-disease association prediction weconducted case studies on three drugs (Felodipine Aspirinand Tamoxifen) The whole candidate diseases were rankedaccording to the method We manually checked the top10 predicted associations and confirmed that 4 6 and 6associations (Tables 2ndash4) are now annotated in the CTD(httpctdbaseorg) [25] database We take these as a strongevidence to support the practical application of the methodProbS Note that the predicted associations that are notyet reported may also exist in reality and they provideopportunities for drug repositioning

35 Comprehensive Prediction of Drug-Disease AssociationsAfter comprehensively confirming the accuracy of bothmethods we chose ProbS which showed a better perfor-mance to further predict novel drug-disease associationsfor future drug repositioning In this scenario we trained

ProbS with all the known drug-disease associations on thebenchmark dataset For all the 593 drugs we ranked andselected the top 10 predicted diseases for drug repositioning(SupplementaryMaterial S3) We believe that these predictedassociations would benefit drug research

4 Discussion

Searching novel drug-disease associations is a critical stepin drug repositioning Computational methods for drug-disease association prediction have received numerous inter-ests and current computational strategies depend on priorbiological information including gene expression profilesprotein-protein interactions (PPI) or chemical structuresSuch information may not be widely available and some maycontain errors

In this paper two network topology-based inferencemethods were introduced to predict potential drug-disease


Table 3 The newly confirmed drug-disease associations in the top 10 predicted results of the drug Aspirin

Drug Disease Rank StatusAspirin Mitochondrial myopathy encephalopathy lactic acidosis and stroke-like 1 ConfirmedAspirin Exostoses with anetodermia and brachydactyly Type E 2Aspirin Atrial fibrillation familial 1 Atfb1 3 ConfirmedAspirin Atrial fibrillation familial 3 Atfb3 4 ConfirmedAspirin Neuropathy hereditary sensory and autonomic Type I with cough and gastroesophageal reflux 5Aspirin Exostoses of heel 6Aspirin Motor neuropathy peripheral with dysautonomia 7Aspirin Migraine familial typical susceptibility to 2 8 ConfirmedAspirin Myasthenia gravisMg 9 ConfirmedAspirin Coronary artery disease autosomal dominant 1 Adcad1 10 Confirmed

Table 4 The newly confirmed drug-disease associations in the top 10 predicted results of the drug Tamoxifen

Drug Disease Rank StatusTamoxifen Mismatch repair cancer syndrome 1Tamoxifen Chorioretinal dystrophy spinocerebellar ataxia and hypogonadotropic 2Tamoxifen Prostate cancer 3 ConfirmedTamoxifen Acroosteolysis with osteoporosis and changes in skull and mandible 4Tamoxifen Hypogonadism male 5 ConfirmedTamoxifen Gastric cancer 6 ConfirmedTamoxifen Renal cell carcinoma nonpapillary Rcc 7 ConfirmedTamoxifen Kaposi sarcoma 8 ConfirmedTamoxifen Uterine anomalies 9Tamoxifen Osteoporosis 10 Confirmed

associations The essential difference between them is theresource-allocation strategies they applied Even though bothmethods do not use any prior biological knowledge experi-mental results showed that reliable prediction performancecan be achieved by the two methods Moreover both the twomethods are easy to be implemented and they have a low timecomplexity of (o(1198992119898))

Despite the encouraging results produced by the twomethods some limitations should be noted First as onlynetwork topology is applied both the two methods cannotwork for drugs without any known drug-disease associationsTo predict drug-disease associations for novel drugs reliablesimilarity measure needs to be taken into considerationMeanwhile the predicted results may be biased as our meth-ods depend heavily on existing drug-disease associations forpredictions and our current knowledge about drug-diseaseassociations is far from completeTherefore the performanceof the methods could be improved by integrating moreverified drug-disease associations

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

The authors are grateful to Dr Jianguo Liu at Univer-sity of Shanghai for Science and Technology for his help

This research was supported by the National Natural Sci-ence Foundation of China (Grants 60970095 61365008and 61165007) (httpwwwnsfcgovcn) the Nature ScienceFoundation of Jiangxi Province (no 20132BAB211036) Grant20123BBE50093 from Jiangxi Province Jiangxi Province Sci-ence and Technology Department (no 20132BBF60083) andthe Science and Technology Supported Project of JiangxiProvince Education Department (no GJJ13340)

References

[1] J A DiMasi ldquoNew drug development in the United States from1963 to 1999rdquo Clinical Pharmacology and Therapeutics vol 69no 5 pp 286ndash296 2001

[2] C P Adams and V Van Brantner ldquoEstimating the cost of newdrug development Is it really $802 million dollarsrdquo HealthAffairs vol 25 no 2 pp 420ndash428 2006

[3] B Booth and R Zemmel ldquoProspects for productivityrdquo NatureReviews Drug Discovery vol 3 no 5 pp 451ndash456 2004

[4] L Brothers The Fruits of Genomics Lehman Brothers NewYork NY USA 2001

[5] R Edmunds P C Ma and C P Tanio ldquoSplicing a cost squeezeinto the genomics revolutionrdquo McKinsey Quarterly vol 2 pp15ndash18 2001

[6] I Vogt and J Mestres ldquoDrug-target networksrdquo MolecularInformatics vol 29 no 1-2 pp 10ndash14 2010

[7] T T Ashburn and K B Thor ldquoDrug repositioning identifyingand developing new uses for existing drugsrdquo Nature ReviewsDrug Discovery vol 3 no 8 pp 673ndash683 2004


[8] J N Weinstein T G Myers P M OrsquoConnor et al ldquoAninformation-intensive approach to the molecular pharmacol-ogy of cancerrdquo Science vol 275 no 5298 pp 343ndash349 1997

[9] T Hughes B Andrews and C Boone ldquoOld drugs new tricksusing genetically sensitized yeast to reveal drug targetsrdquo Cellvol 116 no 1 pp 5ndash7 2004

[10] P Y Lum C D Armour S B Stepaniants et al ldquoDiscoveringmodes of action for therapeutic compounds using a genome-wide screen of yeast heterozygotesrdquo Cell vol 116 no 1 pp 121ndash137 2004

[11] K E Kinnamon and W E Rothe ldquoBiological screening inthe US Army antimalarial drug development programrdquo TheAmerican Journal of Tropical Medicine and Hygiene vol 24 no2 pp 174ndash178 1975

[12] Y YamanishiM Araki A GutteridgeWHonda andM Kane-hisa ldquoPrediction of drug-target interaction networks from theintegration of chemical and genomic spacesrdquo Bioinformaticsvol 24 no 13 pp i232ndashi240 2008

[13] Y Yamanishi M Kotera M Kanehisa and S Goto ldquoDrug-target interaction prediction from chemical genomic and phar-macological data in an integrated frameworkrdquo Bioinformaticsvol 26 no 12 pp i246ndashi254 2010

[14] Z Xia L-Y Wu X Zhou and S T C Wong ldquoSemi-superviseddrug-protein interaction prediction from heterogeneous bio-logical spacesrdquo BMC Systems Biology vol 4 supplement 2article S6 2010

[15] M Gonen ldquoPredicting drug-target interactions from chemicaland genomic kernels using Bayesian matrix factorizationrdquoBioinformatics vol 28 no 18 pp 2304ndash2310 2012

[16] X Chen M-X Liu and G-Y Yan ldquoDrug-target interactionprediction by random walk on the heterogeneous networkrdquoMolecular BioSystems vol 8 no 7 pp 1970ndash1978 2012

[17] H Chen and Z Zhang ldquoA semi-supervised method for drug-target interaction prediction with consistency in networksrdquoPLoS ONE vol 8 no 5 Article ID e62975 2013

[18] D Fourches E Muratov and A Tropsha ldquoTrust but verifyon the importance of chemical structure curation in chemin-formatics and QSAR modeling researchrdquo Journal of ChemicalInformation and Modeling vol 50 no 7 pp 1189ndash1204 2010

[19] F Iorio R Bosotti E Scacheri et al ldquoDiscovery of drugmode ofaction and drug repositioning from transcriptional responsesrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 33 pp 14621ndash14626 2010

[20] R L Chang L Xie P E Bourne and B Oslash Palsson ldquoDrug off-target effects predicted using structural analysis in the context ofa metabolic network modelrdquo PLoS Computational Biology vol6 no 9 Article ID e1000938 2010

[21] S Suthram J T Dudley A P Chiang R Chen T J Hastieand A J Butte ldquoNetwork-based elucidation of human diseasesimilarities reveals common functional modules enriched forpluripotent drug targetsrdquo PLoS Computational Biology vol 6no 2 Article ID e1000662 2010

[22] J T Dudley E Schadt M Sirota A J Butte and E AshleyldquoDrug discovery in a multidimensional world systems pat-terns and networksrdquo Journal of Cardiovascular TranslationalResearch vol 3 no 5 pp 438ndash447 2010

[23] J T Dudley T Deshpande and A J Butte ldquoExploiting drug-disease relationships for computational drug repositioningrdquoBriefings in Bioinformatics vol 12 no 4 Article ID bbr013 pp303ndash311 2011

[24] ZWu YWang and L Chen ldquoNetwork-based drug reposition-ingrdquoMolecular BioSystems vol 9 no 6 pp 1268ndash1281 2013

[25] A P Davis C G Murphy R Johnson et al ldquoThe comparativetoxicogenomics database update 2013rdquo Nucleic Acids Researchvol 41 no 1 pp D1104ndashD1114 2013

[26] A P Chiang and A J Butte ldquoSystematic evaluation of drug-disease relationships to identify leads for novel drug usesrdquoClinical Pharmacology andTherapeutics vol 86 no 5 pp 507ndash510 2009

[27] A Gottlieb G Y Stein E Ruppin and R Sharan ldquoPREDICT amethod for inferring novel drug indications with application topersonalized medicinerdquoMolecular Systems Biology vol 7 no 1article 496 2011

[28] S Zhao and S Li ldquoA co-module approach for elucidatingdrug-disease associations and revealing their molecular basisrdquoBioinformatics vol 28 no 7 pp 955ndash961 2012

[29] Y-F Huang H-Y Yeh and V-W Soo ldquoInferring drug-diseaseassociations from integration of chemical genomic and pheno-type data using network propagationrdquo BMCMedical Genomicsvol 6 supplement 3 article S4 2013

[30] S Daminelli V J HauptM Reimann andM Schroeder ldquoDrugrepositioning through incomplete bi-cliques in an integrateddrug-target-disease networkrdquo Integrative Biology vol 4 no 7pp 778ndash788 2012

[31] H Ye L L Yang ZW Cao K L Tang and Y X Li ldquoA pathwayprofile-based method for drug repositioningrdquo Chinese ScienceBulletin vol 57 no 17 pp 2106ndash2112 2012

[32] D Emig A Ivliev O Pustovalova et al ldquoDrug target pre-diction and repositioning using an integrated network-basedapproachrdquo PLoS ONE vol 8 no 4 Article ID e60618 2013

[33] T Zhou J Ren M Medo and Y-C Zhang ldquoBipartite networkprojection and personal recommendationrdquo Physical Review Evol 76 no 4 Article ID 046115 2007

[34] D S Wishart C Knox A C Guo et al ldquoDrugBank a knowl-edgebase for drugs drug actions and drug targetsrdquoNucleic AcidsResearch vol 36 supplement 1 pp D901ndashD906 2008

[35] A Hamosh A F Scott J Amberger C Bocchini D Valleand V A McKusick ldquoOnlined Mendelian Inheritance in Man(OMIM) a knowledgebase of human genes and genetic disor-dersrdquo Nucleic Acids Research vol 30 no 1 pp 52ndash55 2002

[36] T Zhoua Z Kuscsik J-G Liu M Medo J R Wakeling andY-C Zhang ldquoSolving the apparent diversity-accuracy dilemmaof recommender systemsrdquo Proceedings of the National Academyof Sciences of the United States of America vol 107 no 10 pp4511ndash4515 2010

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014


MEDIATORSINFLAMMATION

of


Behavioural Neurology

EndocrinologyInternational Journal of



Disease Markers


BioMed Research International

OncologyJournal of



Oxidative Medicine and Cellular Longevity


PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of



Computational and Mathematical Methods in Medicine

OphthalmologyJournal of


Diabetes ResearchJournal of



Research and TreatmentAIDS


Gastroenterology Research and Practice


Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom


More recently several computational strategies have beenproposed to predict direct drug-disease associations for drugrepositioning The Comparative Toxicogenomics Database(CTD httpctdbaseorg) [25] is a comprehensive andpublicly available database which inferred chemical-diseaseassociations by integrating chemical-gene interactions andgene-disease relationships received manually from publishedliterature Chiang and Butte [26] implemented a network-based guilt-by-association method (GBA) for drug-diseaseassociation prediction and novel drug use suggestions weremade based on shared treatment profile from disease pairsThis repositioning strategy is limited by the complex relation-ships between drugs and diseases asmany drugs are indicatedas palliative treatments for diseases like cancers [23] Basedon the observation that similar drugs are indicated forsimilar diseases Gottlieb et al [27] developed a novel algo-rithm PREDICT to infer potential drug-disease associationsand predict new drug indications Multiple drug-drug anddisease-disease similaritymeasures were integrated into theirmethod and excellent prediction accuracy can be obtainedon cross validation However negative samples were neededin their model to implement the prediction procedureExperimentally verified negative drug-disease associationsare not available due to lack of research value A comodule(important gene modules shared by both drugs and diseases)method was proposed by Zhao and Li [28] to understandhow drugs and diseases are associated in the molecularlevel Applying the method to simulations and real datademonstrated that it was able to identify new drug-diseaseassociations and highlight their molecular basis Huang et al[29] designed a network propagation model and exploitedexisting chemical genomic and disease phenotype data toinfer drug-proteingene-disease phenotype relationships inwhich genes with similar functional modules were related tonot only drugs but also the disease phenotypeDaminelli et al[30] integrated structural and chemical data to build a drug-target-disease network and mined the network for networkmotifs of bi-cliques where every drug is linked to every targetand disease Links from drugs to diseases were predicted bycompleting the incomplete bicliques Ye et al [31] collectedknown drug target information and disease pathway profileto develop a method for evaluating the relationships betweendrugs and a specific disease Emig et al [32] integrated diseasegene expression signatures drug targets disease informationand molecular interaction network to prioritize drug targetsfor drug repositioning

Computational inference methods are important waysto choose the most promising drug-disease associations forfurther drug repositioning Current computational methodsneed biologically relevant information including negativeassociation results similarity measures or gene expressionprofiles to assist drug-disease association prediction Suchadditional information may not be easily available or maycontain errors [23] Different from previous research inthis paper two inference methods ProbS and HeatS wereintroduced to predict direct drug-disease associations basedonly on the basic network topology measure We formulatedthe problem as recommending preferable diseases for drugsNetwork topology was used to prioritize the potentially

targeted diseases for drugs We tested the two methods onan experimentally verified dataset with leave-one-out crossvalidation for performance evaluation Experimental resultsshowed that both methods can receive reliable predictionperformance and achieve AUC values of 09192 and 09079respectively The method ProbS with a better performancewas selected for potential drug-disease association predic-tion Case studies demonstrated that some of the stronglypredicted associations are confirmed by the publicly availabledatabaseCTD [25] which indicated the practical applicationsof the method ProbS in a real environment Finally plenty ofpredicted drug-disease associationswere publicly released forfuture drug repositioning It is expected that these methodswill provide help to facilitate further research on drugrepositioning

2 Materials and Methods

21 Dataset Experimentally confirmed drug-disease associ-ations were downloaded from the supplementary materialof [27] At the time of the paper [27] was written Got-tlieb et al collected 1933 associations between 593 drugstaken from DrugBank [34] and 313 diseases listed in theOnline Mendelian Inheritance in Man (OMIM) database[35] This set of known drug-disease associations is regardedas the ldquogold standardrdquo data and is used for evaluating theperformance of our introduced methods in the followingcross validation experiments as well as training data in thecomprehensive association prediction

We denote the drug set as 119863 = 1198891 119889

2 119889

119898 and the

disease set as 119875 = 1199011 119901

2 119901

119899 The drug-disease associ-

ations can be described as a bipartite DP graph 119866(119863 119875 119864)where 119864 = 119890

119894119895 119889

119894isin 119863 119901

119895isin 119875 An edge is drawn between

the drug 119889119894and the disease 119901

119895if there exists an association

between them The DP bipartite network can be presentedby an 119899 times 119898 adjacent matrix 119886

119894119895 where 119886

119894119895= 1 if 119901

119894and

119889

119895is linked while all other unknown drug-disease pairs are

labeled as 0 to indicate that their associations need to bepredicted

22 Method Description The methods ProbS and HeatS weapplied in this paper are based on the recommendationtechniques developed by Zhou et al [33 36] We formulatedour problem as recommending diseases for a given drug bymining data on the drug-disease bipartite network properties

Given the drug-disease bipartite network defined aboveProbSworks by assigning diseases an initial resource denotedby the vector 119891 and 119891(119901

119894) is the initial resource allocated to

the 119894th disease in the disease set 119875 The initial resource of anassociated disease is set to be 1 in our study otherwise it is setto be 0 In the first step all the resource in the disease set 119875flows to the drug set119863 according to

119891 (119889

119897) =

119899

sum

119894=1

119886

119894119897119891 (119901

119894)

119896 (119901

119894)

(1)


1 13 3136

518

16

2536

56

56

0

0

1

(a)

12 1318

23

12

5623

1

1

0

0

1

(b)



Number ofdrugs

Number ofdiseases




593 313 1933 326 618 00104





119891

1015840(119901

119894) =

119898

sum

119897=1

119886

119894119897119891 (119889

119897)

119896 (119889

119897)

=

119898

sum

119897=1

119886

119894119897

119896 (119889

119897)

119899

sum

119895=1

119886

119895119897119891 (119901

119895)

119896 (119901

119895)

(2)




119891

1015840(119901

119894) =

119898

sum

119897=1

119886

119894119897

119896 (119901

119894)

119899

sum

119895=1

119886

119895119897119891 (119901

119895)

119896 (119889

119897)

(3)


3 Results












0

20

40

60

80coun

t 100

120

140

160

180

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

Drugs

degree

(a)

0

20

40

60

80

100

120

1 5 9 131721252933374145495357616569737781

Diseases

degree

coun

t(b)







4 Discussion












Acknowledgments



References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of

















1 13 3136

518

16

2536

56

56

0

0

1

(a)

12 1318

23

12

5623

1

1

0

0

1

(b)



Number ofdrugs

Number ofdiseases




593 313 1933 326 618 00104





119891

1015840(119901

119894) =

119898

sum

119897=1

119886

119894119897119891 (119889

119897)

119896 (119889

119897)

=

119898

sum

119897=1

119886

119894119897

119896 (119889

119897)

119899

sum

119895=1

119886

119895119897119891 (119901

119895)

119896 (119901

119895)

(2)




119891

1015840(119901

119894) =

119898

sum

119897=1

119886

119894119897

119896 (119901

119894)

119899

sum

119895=1

119886

119895119897119891 (119901

119895)

119896 (119889

119897)

(3)


3 Results












0

20

40

60

80coun

t 100

120

140

160

180

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

Drugs

degree

(a)

0

20

40

60

80

100

120

1 5 9 131721252933374145495357616569737781

Diseases

degree

coun

t(b)







4 Discussion












Acknowledgments



References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of






















0

20

40

60

80coun

t 100

120

140

160

180

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

Drugs

degree

(a)

0

20

40

60

80

100

120

1 5 9 131721252933374145495357616569737781

Diseases

degree

coun

t(b)







4 Discussion












Acknowledgments



References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of

















0

20

40

60

80coun

t 100

120

140

160

180

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

Drugs

degree

(a)

0

20

40

60

80

100

120

1 5 9 131721252933374145495357616569737781

Diseases

degree

coun

t(b)







4 Discussion












Acknowledgments



References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of

























Acknowledgments



References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of



















































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of





















of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of
















Documents

Research Article Network-Based Inference Methods for Drug ...downloads.hindawi.com/journals/cmmm/2015/130620.pdf · future drug repositioning. It is expected that these methods will