A white paper on Collaborative Drug Discovery: The Rising Importance of Rare And Neglected Diseases

Sean Ekins, November 2010


Collaborative Drug Discovery: The Rising Importance of Rare And Neglected Diseases

Sean Ekins, Ph.D., D.Sc.

Image generated with www.Tagxedo.com



Transformation of the Pharmaceutical Industry

Over the past decade the pharmaceutical industry has undergone a major transformation, brought about by massive patent expirations of blockbuster drugs, increased generic penetration and concerns about rapidly escalating healthcare costs. The industry has had to adapt by acquiring or partnering to bring in innovation or products, diversifying into generics and consumer products, moving into emerging markets and targeting complementary businesses such as animal health [1].

Overall, big pharma has focused on the key chronic diseases of the western hemisphere. Yet from a global perspective there are still neglected diseases, common in the developing world, that can in most cases be readily treated with available drugs. Resistance is emerging, however, so new drugs need to be developed.

Neglected infectious diseases such as tuberculosis (TB) and malaria kill over two million people annually [2], while estimates suggest that over 2 billion individuals are infected with Mycobacterium tuberculosis (Mtb) alone [3]. These statistics represent enormous economic and healthcare challenges for the countries and governments affected. There are also thousands of diseases that occur in small patient populations and are not addressed by any treatments [4]; these are classed as rare or orphan diseases. Neglected and rare diseases have traditionally not been the focus of big pharma, and biotech and academia have been primarily involved in their drug discovery.

This situation is changing, primarily because pharmaceutical companies see these rare or neglected diseases as a way to bring in more revenue as well as improve public relations. Developing treatments for rare or orphan diseases brings additional benefits for an industry struggling to bring new treatments to market for more common diseases, compared with the hundreds of millions of dollars spent licensing drugs for other diseases [5]. Within a very short time we have seen GSK [6] and Pfizer [7] make some relatively small investments in rare diseases, and several big pharmas and the WHO work together to invest $150M in neglected diseases [8]. These are likely only the tip of the iceberg, and more substantial deals will follow to solidify the trend.

We are also witnessing shifts in how pharmaceutical research might be accelerated or made more efficient, including decentralizing research and engaging with external research communities through crowdsourcing. Overall there is a trend towards collaboration [9-13]. In parallel there is renewed interest in neglected disease research (on malaria, tuberculosis (TB), kinetoplastids, etc. [14]) due to the significant influence of the US National Institutes of Health (NIH), foundations such as the Bill and Melinda Gates Foundation, the European Commission and increasing investment from pharmaceutical companies and others [14,15].

The dividing line between diseases that are rare or neglected may be very fuzzy. Traditionally rare diseases have small patient populations, though there is no global agreement on what this size is; in the US a rare disease is one that affects fewer than 200,000 people. Clearly such a ‘small’ market size makes these diseases less marketable compared with cancers, cardiovascular disease, diabetes, etc., for which the number of patients treated annually is in the millions. For example, addictions to cocaine, methamphetamine and cannabis are major public health issues. Statistics suggest there are over 1 million users of methamphetamine annually in the US. According to Dr. Phil Skolnick, Director of the Division of Pharmacotherapies and Medical Consequences of Drug Abuse, National Institute on Drug Abuse, the development of treatments has been impacted by big pharma mergers [16]. These resulted in a loss of research programs which, along with other companies dropping CNS programs, reduced enthusiasm for working in this area and the low probability of success in treating CNS disorders, makes the research environment difficult and suggests the need for new research approaches.

In the neglected diseases space we are seeing academics and companies look at repurposing compounds that are already approved for other indications, a strategy being applied elsewhere [17,18]. The benefits are that the targets are known to be druggable and the materials are available, which makes the work cheaper and faster. A transformation is also occurring on the academic side, where the NIH is requiring more collaborative research and proposals that reward the complete drug discovery paradigm. Dr. Michael Pollastri at Northeastern University has suggested a distributed model for neglected disease research in which different groups from other institutions contribute their specific expertise [19]. Such research networks may not be unique to neglected diseases and could be applied to more common diseases.

What all of these initiatives will need is cost-effective, secure software for selectively sharing chemical structures and data between collaborators, who are likely to be chemists and biologists by training [20].

Go forth and screen

Drug discovery in the pharmaceutical industry has for over 20 years relied on the “brute force” industrialization of the process rather than the “trial and error” serendipity which produced many drugs in the past. This has reached a pinnacle in the high throughput screening (HTS) methods that are in use across the industry both for finding hits against targets and for counter screening. Ricardo Macarrón, PhD, VP of Sample Management Technologies at GlaxoSmithKline, has suggested recently that HTS is now producing drugs and a healthy return on investment, yielding 20-70% of leads for targets at GSK [21]. HTS is now a key component of the drug discovery process at GSK and elsewhere. Many drugs recently approved by the FDA (mainly cancer or HIV treatments) came out of HTS hits in the early 1990s. This suggests a sobering lesson for those working on neglected and rare diseases: even if hits are found by HTS and its many variants today (or for that matter by any technology), a drug may not emerge for over a decade due to the lengthy clinical trials and regulatory approval process, unless something dramatic changes to shorten this process. Recently GSK released malaria HTS screening data which is hosted in the CDD database (see later).

Even if an HTS campaign is run for a target or against a disease, there is no guarantee of finding a hit that can be optimized [22]. In vitro screens may not be very predictive due to an incomplete understanding of disease biology, and if the target is a microorganism its replication status may be unknown. Dr. Clifton Barry, Senior Investigator of the Tuberculosis Section of the National Institute of Allergy and Infectious Diseases, and collaborators have explored the limits of target vulnerability in Mycobacterium tuberculosis (Mtb) using quantitative HTS (qHTS), in which a compound library is screened under different conditions [23]. These conditions produce pan-active as well as condition-selective hits. A pairwise comparison showed that 90% of the hits could be found with glucose, cholesterol or low pH screening conditions. The sharing of such large Mtb HTS screening datasets between collaborators has been facilitated by CDD in a grant funded by the Bill and Melinda Gates Foundation.

One common observation when looking at hits and approved drugs for neglected diseases is that, to the experienced chemist, many of the molecules appear ugly. As beauty is in the eye of the beholder it is hard to define ‘ugly’, but incorporating rules for chemical reactivity or structural alerts [24-28] can help. These filters pick up a range of undesirable chemical substructures such as thiol traps and redox-active compounds, epoxides, anhydrides and Michael acceptors. Reactivity can be defined as the ability to covalently modify a cysteine moiety in a surrogate protein [26-28]. Older rules such as the Lipinski rule of 5 [29] have been more widely used. For example, nearly 90% of FDA approved drugs pass this rule (Figure 1). However, the number of Lipinski violations a compound has also correlates with an increased failure rate under various pharmaceutical filtering methods for reactive groups (Figure 2) [30]. This suggests that some undesirable or ugly molecules may carry additional risks such as promiscuity or toxicity [31].

Dr. Richard Elliott, Senior Program Officer at the Bill & Melinda Gates Foundation, thinks that the ugly compounds seen for neglected diseases may reflect the need to cross multiple cell walls and to carry activatable warheads that can act on multiple targets or via non-specific mechanisms [32]. Such compounds may therefore still become effective drugs, but a variety of computational, in vitro and in vivo tools will be required to understand the risk. He also thinks we need new chemistry to explore more chemical diversity.

Figure 1. Percent of FDA approved drugs (N = 2804) and Lipinski rule of five violations (≥ 2 = failure) [30]. x-axis: number of Lipinski violations; y-axis: % of FDA drugs. Values for 0, 1, 2, 3 and 4 violations: 75.2%, 13.5%, 5.7%, 5.5% and 0.1%.


Figure 2. A plot of the percentage of SMARTS filter failures for compounds with different numbers of Lipinski violations [30]. x-axis: number of Lipinski violations (0 to 4); y-axis: % SMARTS filter failures in FDA approved drugs (0 to 100); series: % Abbott Alarm, % Pfizer Blake, % Glaxo filter, % Accelrys.

When so many research groups are screening similar or overlapping chemical libraries using HTS methods, there has to be a balance between accepting less desirable looking molecules and rejecting problematic ones. Dr. Jonathan Baell from the Walter and Eliza Hall Institute, Melbourne, showed that many classes of compounds can be active against many targets [33]. Such frequent hitters can interfere with assays because they are colored, redox-active, chelating or protein reactive [34]. This can be a major problem for the many academic screening groups that have little experience with frequent hitters and may subsequently publish them as hits. The research community needs ways to alert these groups to such compounds [34].

Filtering

We have recently seen several large HTS datasets of compounds for TB and malaria become publicly available. For example, GSK released >13,500 in vitro screening hits against malaria (Plasmodium falciparum), along with their associated cytotoxicity data (in HepG2 cells), from an initial screen of over 2 million compounds [35]. Three databases initially hosted the data: ChEMBL from the European Bioinformatics Institute-European Molecular Biology Laboratory (EBI-EMBL, http://www.ebi.ac.uk/chembl/), PubChem (http://pubchem.ncbi.nlm.nih.gov/) and CDD [20]. Others followed suit, including ChemSpider from the Royal Society of Chemistry (www.chemspider.com).

We have also undertaken an evaluation of this and other datasets using a simple descriptor analysis as well as readily available substructure alerts or “filters” [36-38]. For example, ~57% and ~76% of the GSK malaria screening hit molecules fail the Pfizer and Abbott filters, respectively [26] (Figure 3). We have also recently used the same rules to filter sets of compounds with activity against tuberculosis [39,40], with 81-92% failing the Abbott filters [38] (Figure 4), which may be related to mechanism of action. A detailed analysis of our calculated molecular descriptors for the GSK malaria hits [35] shows that most are normally distributed, apart from the skewed Lipinski violations data and the bimodal molecular weight. Interestingly, 3,269 (24.3%) of the compounds fail more than one of the Lipinski rule of 5 criteria (MW ≤ 500, logP ≤ 5, HBD ≤ 5, HBA ≤ 10) [29] using the descriptors calculated in the CDD database. The GSK screening hits are generally large and very hydrophobic, as is also noted in their publication [35]; although the authors suggested this may be important for reaching intracellular targets, they did not discuss the limitations of such compounds. We have also suggested these compounds may not be ‘lead-like’ [41,42] and are closest to ‘natural product lead-like’ [43]. As a group, these antimalarial hits also differ markedly in their mean molecular properties from compounds that have shown activity against TB, which are generally of lower molecular weight and less hydrophobic, with lower pKa and fewer rotatable bonds (RBN) [44].
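The rule of 5 check described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the four descriptors have already been calculated elsewhere, and the class name, function names and example descriptor values are hypothetical rather than taken from the CDD database. A compound is treated as a failure when it has more than one violation, which matches the convention used in Figure 1.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Pre-calculated descriptors for one compound (hypothetical container)."""
    mol_weight: float  # molecular weight (MW)
    logp: float        # calculated logP
    hbd: int           # hydrogen bond donors
    hba: int           # hydrogen bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count violations of MW <= 500, logP <= 5, HBD <= 5, HBA <= 10."""
    checks = [d.mol_weight > 500, d.logp > 5, d.hbd > 5, d.hba > 10]
    return sum(checks)


def fails_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """A compound fails when it has more than `max_violations` violations."""
    return lipinski_violations(d) > max_violations


def percent_failing(compounds: list) -> float:
    """Percentage of compounds in a dataset that fail the rule of 5."""
    if not compounds:
        return 0.0
    failures = sum(fails_rule_of_five(c) for c in compounds)
    return 100.0 * failures / len(compounds)


if __name__ == "__main__":
    # Hypothetical descriptor values, for illustration only.
    hits = [
        Descriptors(350.4, 2.1, 2, 5),
        Descriptors(612.8, 6.3, 1, 8),
        Descriptors(540.0, 4.2, 3, 9),
    ]
    for h in hits:
        print(lipinski_violations(h), fails_rule_of_five(h))
    print(f"{percent_failing(hits):.1f}% fail")
```

Keeping the violation count separate from the pass/fail decision makes it possible to tabulate compounds by number of violations as well as by overall failure.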

Figure 3. Percent failure of SMARTS filters (http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter) for different antimalarial datasets.

Figure 4. Percent failure of SMARTS filters (http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter) for different TB datasets.



The GSK antimalarial hits dataset [35] also stood out from the other antimalarial screening datasets in terms of physicochemical properties, as the mean molecular weight, logP and number of rotatable bonds were much higher than in the St. Jude [45] and Novartis [46] datasets of antimalarial compounds. The GSK, St. Jude and Novartis datasets also have very high failure rates with the Abbott alerts [26,28] (75-85%) and Pfizer Lint filters (40-57%) (Figure 3). A set of 14 widely used FDA approved antimalarial drugs has properties much closer to those of the St. Jude and Novartis hits. These drugs had fewer failures with the Abbott filters than the GSK, Novartis and St. Jude antimalarial datasets.

Many companies exclude compounds that have reactive groups prior to screening, and the availability and use of such computational filters is common. This is not, however, the case in academia. Our analysis suggests that hits from some of these HTS datasets may represent a more difficult starting point for lead optimization.

By creating a collaborative database, CDD TB, we have been able to compare, on a very large scale, actives and inactives against Mtb in a dataset containing over 200,000 compounds [44]. The mean molecular weight (357 ± 85), logP (3.6 ± 1.4) and rule of 5 alerts (0.2 ± 0.5) were statistically significantly (based on t-test) higher in the most active compounds, while the mean PSA (83.5 ± 34.3) was slightly lower compared to the inactive compounds for the single point screening data [44]. Our most recent analysis for TB used a dataset consisting of another 102,633 molecules screened by the same laboratory against Mtb [38]. We were able to analyze the molecular properties, differentiate the actives from the inactives and show that the actives had statistically significantly (based on t-test) higher values for the mean logP (4.0 ± 1.0) and rule of 5 alerts (0.2 ± 0.4), while also having lower HBD count (1.0 ± 0.8), atom count (41.9 ± 9.4) and lower PSA (70.3 ± 29.5) than the inactives [38]. Overall, comparing these two datasets the mean values are remarkably similar.
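A comparison of this kind can be sketched from summary statistics alone. The text does not say which form of t-test was used, so the sketch below assumes Welch's unequal-variance version, and it uses a normal approximation for the p-value that is only reasonable for large groups. The function names are hypothetical, and the group sizes and the values for the inactive group in the example are placeholders for illustration, not figures from the cited studies.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def two_sided_p_normal_approx(t):
    """Two-sided p-value using the normal distribution (large samples only)."""
    return math.erfc(abs(t) / math.sqrt(2))


if __name__ == "__main__":
    # Actives: mean logP 4.0, SD 1.0 (from the text); group size is assumed.
    # Inactives: mean, SD and group size are placeholders, not reported values.
    t, df = welch_t(4.0, 1.0, 1500, 3.5, 1.2, 100000)
    print(f"t = {t:.2f}, df = {df:.0f}, p = {two_sided_p_normal_approx(t):.3g}")
```

With datasets of this size even small differences in means can reach statistical significance, so the size of the difference is worth reporting alongside the p-value.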

Figure 5. Integrating the CDD TB database into various TB screening paradigms [30].



A more recent analysis of TB screening data (<300 compounds) from Novartis available in CDD suggests that we can also differentiate compounds active under aerobic and anaerobic conditions based on their statistically different mean molecular properties [30]. These analyses may help scientists to focus on compounds with properties that may lead to an increased probability of bioactivity against this or other neglected diseases. In addition, these large datasets can be used to create computational machine learning models that can identify molecules active against infectious diseases such as TB [30,38,44], and databases like CDD may have a role in both target-based and phenotypic screening (Figure 5) [30].

Engaging Big Pharma and Helping the Community

We have recently asked the question “are there technologies that we could bring together in pharmaceutical research that may seem rather simplistic yet if combined could lead to new insights?” From a cheminformatics perspective we suggested secure sharing of chemical information [47] and collaborations between groups as one such technology for the future. Computational chemistry software companies have generally catered to the computational modeling community and have not done well in translating their tools for bench biologists and chemists, so it will be important that tools such as CDD can cross scientific boundaries and do not require an expert user. We think the future of drug discovery will be different from what it is now: collaborative networks will be key, and frequently used software tools for sharing data and analysis should have a low barrier to entry, similar to using Google, Facebook and Twitter. Mobile computing devices also present a new frontier (and business opportunity), with constraints on how much can be shown on very small screens, which might drive cheminformatics software developers to reconsider how they expose their tools to new users [48] in the pharmaceutical industry or academia. Use of such tools may also be driven by the academic scientific community if they are found to be of value.

Biomedical research is moving quickly towards a collaborative network of chemists and biologists, but they commonly find themselves overwhelmed by the volume of information available (especially if they are in industry). Yet today a major limitation is the availability of absorption, distribution, metabolism, excretion and toxicity (ADME/Tox) data [49-51] for drugs and molecules evaluated as drug candidates. We would argue that ADME/Tox data is also precompetitive data and should be made freely available on the web as a resource for all scientists. Generating this data is very costly, and in many cases data is reproduced by different groups when comparing their own proprietary compounds with a competitor compound. Why not share this data? It would certainly enable the industry to quickly understand ADME/Tox liabilities of different classes of compounds targeting a specific indication and enable the generation of computer models for these properties. We have proposed that the scientific community should tackle the lack of public databases that contain preclinical ADME/Tox or pharmacokinetic data [52]. This would greatly assist those in the neglected disease space, where such data is rarely generated. For example, scientists could expose their ADME/Tox data in CDD.

In parallel to this, pharmaceutical companies increasingly evaluate lead compounds for drug-like properties (such as ADME/Tox) very early on in the discovery process using computational prediction methods utilizing experimental data from in vitro or physicochemical property assays [53].



Well-validated ligand-based in silico approaches are important and exist in the large pharmaceutical companies because these organizations have large, diverse proprietary datasets, the financial resources for expensive commercial software and access to in-house computational, medicinal chemistry and high-throughput screening expertise. All these enablers are wholly or partly lacking in academia, small biotechnology companies and non-profit neglected disease foundations. In collaboration with Pfizer we have demonstrated how ligand-based computational models could be more readily shared between researchers and organizations if they were generated with open source molecular descriptors (e.g. the Chemistry Development Kit, CDK) and modeling algorithms, as this would remove the requirement for proprietary commercial software [54]. We initially evaluated open source descriptors and model building algorithms using a training set of approximately 50,000 molecules and a test set of approximately 25,000 molecules with human liver microsomal metabolic stability data. A C5.0 decision tree model built with CDK descriptors together with a set of SMARTS keys had good statistics (Kappa = 0.43, sensitivity = 0.57, specificity = 0.91, positive predictive value (PPV) = 0.64), equivalent to those of models built with commercial MOE2D descriptors and the same set of SMARTS keys (Kappa = 0.43, sensitivity = 0.58, specificity = 0.91, PPV = 0.63). Extending the dataset to ~193,000 molecules and generating a continuous model using Cubist software with a combination of CDK and SMARTS keys or MOE2D and SMARTS keys confirmed this observation. The same combination of descriptor set and modeling method was applied to other ADME datasets with similar model testing statistics.
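The model statistics quoted above all derive from a two-class confusion matrix. The sketch below shows the standard definitions of sensitivity, specificity, PPV and Cohen's kappa. The function name is hypothetical and the counts in the example are invented for illustration; they are not the confusion matrix from the cited study.

```python
def classification_stats(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute test-set statistics from confusion-matrix counts.

    tp/fn: positives predicted correctly/incorrectly
    tn/fp: negatives predicted correctly/incorrectly
    """
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # fraction of true positives recovered
    specificity = tn / (tn + fp)   # fraction of true negatives recovered
    ppv = tp / (tp + fp)           # fraction of positive calls that are right
    observed = (tp + tn) / n       # observed agreement
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Invented counts, for illustration only.
    stats = classification_stats(tp=40, fp=20, tn=30, fn=10)
    for name, value in stats.items():
        print(f"{name}: {value:.2f}")
```

Applying the same function to predictions from two descriptor sets on the same test set gives directly comparable statistics, which is the kind of side-by-side comparison described above.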

In summary, open source tools demonstrated comparable predictive results to commercial software with attendant cost savings (Figure 6). The results of this study may provide an important starting point for a validated universal framework for enabling the sharing of ADME/Tox models and facilitating their use for making predictions by third parties, without the requirement of sharing sensitive molecule structure data.

Figure 6. Generating and sharing computational models.



The beneficiaries of such open ADME/Tox models would be those in academia and foundations, in particular those working on rare or neglected diseases. In addition, pharmaceutical companies could avoid duplicative testing and cover more chemical space. This open models approach could result in improved predictions and greater applicability of such models for groups that have compounds of interest but no idea of their ADME properties, and could ultimately predict likely issues before they become major hurdles to a project. Our work suggests a new approach to sharing ADME/Tox models built using widely available open descriptors and algorithms. CDD will certainly be at the forefront of model sharing in the future in order to benefit all groups doing drug discovery research.

Why collaboration matters

In the long history of human kind (and animal kind, too) those who have learned to collaborate and improvise most effectively have prevailed. - Charles Darwin

It is also clear that the “new drug discovery” will put a renewed emphasis on collaboration and that research on neglected and rare diseases will require this for success to connect disparate researchers around the globe and create virtual drug discovery teams. Currently available computational database tools for drug discovery, and chemistry in particular are not collaborative and are of limited application for drug development [55]. Therefore at CDD we emphasize collaboration as what differentiates us from other companies and technologies currently available.

We recently asked people through an online forum what collaboration meant to them. We had responses like “collaboration, to me, means that folks from disparate disciplines or skills work together towards the same end-goal. … A collaboration means free and open data sharing, transparent goals and intentions, and a relationship that allows open (frank) and constructive discussion” and “the internet is the perfect place to share (certain) data and many of the new technologies and format available at the Web (REST, SOAP etc.) are perfect to use data collaboratively”. In recent months CDD has been putting the finishing touches to “Projects”, soon-to-be-released functionality that will enhance the capability to share research data securely using CDD (Figure 7).

This will enable users of the CDD Vault to organize their data within a vault into projects and to invite individual vault members to access specific projects, allowing for more flexible data sharing and management both within a group and across groups. Users will be able to share data more selectively, so that collaborators view only the data relevant to their projects without compromising the security of data meant to be hidden. The result is no more juggling of several systems for managing data between different groups, and no more inviting collaborators into private networks and compromising other data.
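The project-based sharing described above can be pictured with a small data model. The sketch below is purely illustrative and is not the CDD Vault implementation or its API: the class names, method names, project names and user names are all hypothetical. It assumes a simple rule in which a user sees their own records plus any records marked as shared within projects they have been invited to.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    owner: str
    project: str
    name: str
    shared: bool = False  # False: visible to the owner only


@dataclass
class Vault:
    members: dict = field(default_factory=dict)  # project -> set of user names
    records: list = field(default_factory=list)

    def invite(self, project: str, user: str) -> None:
        """Give a vault member access to one specific project."""
        self.members.setdefault(project, set()).add(user)

    def add_record(self, owner: str, project: str, name: str,
                   shared: bool = False) -> Record:
        """Store a record in a project the owner has been invited to."""
        if owner not in self.members.get(project, set()):
            raise PermissionError(f"{owner} is not a member of {project}")
        record = Record(owner, project, name, shared)
        self.records.append(record)
        return record

    def visible_to(self, user: str) -> list:
        """Own records plus shared records from the user's projects."""
        return [
            r.name
            for r in self.records
            if r.owner == user
            or (r.shared and user in self.members.get(r.project, set()))
        ]


if __name__ == "__main__":
    vault = Vault()
    vault.invite("malaria", "alice")
    vault.invite("malaria", "bob")
    vault.invite("tb", "carol")
    vault.add_record("alice", "malaria", "hit-list", shared=True)
    vault.add_record("alice", "malaria", "draft-sar", shared=False)
    print(vault.visible_to("bob"))    # shared malaria data only
    print(vault.visible_to("carol"))  # nothing from the malaria project
```

Scoping visibility to the project rather than the whole vault is what lets one database serve several collaborations without exposing unrelated data.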

Imagine a future in which your molecules, data and computational models could all be selectively shared in a single database. This is just a glimpse of some of our long range projects, which could be of immense value to the rare and neglected disease communities and may also have wider implications for research productivity on more common diseases in the pharmaceutical industry.


Figure 7. An example of how “projects” could be used for a community project to efficiently manage many projects and recreate a “virtual pharma” environment. Roles shown in the figure:

Administrator: can load data for any project and see shared data
User, project 1: can read shared and own data, cannot share
User, project 2: can share own data but has no read access
User, project 3: not sharing currently, has read access
User, project 4: can share and read data

If you would like to hear more about these new features or any of the other exciting projects and collaborative science happening at CDD please contact us (Tel: 215-687-1320; E-mail: [email protected]).

Acknowledgments.

I would like to sincerely thank all the speakers at the 4th Annual CDD Community Meeting, who made their slides available for this analysis (Slides available here - http://collaborativedrug.com/blog/blog/2010/10/26/cdd-hosts-inspiring-4th-annual-ucsf-community-meeting/). I would also like to acknowledge Dr. Antony Williams and Dr. Joel Freundlich for their valuable discussions and collaborations. For more information, contact: Barry A. Bunin, Ph.D., President, Collaborative Drug Discovery, Inc.

[email protected]



References

1. Burrill GS (2010) Looking forward to see ahead. 4th Annual CDD Community Meeting. San Francisco.

2. Fidock DA (2010) Drug discovery: Priming the antimalarial pipeline. Nature 465: 297-298.

3. Balganesh TS, Alzari PM, Cole ST (2008) Rising standards for tuberculosis drug development. Trends Pharmacol Sci 29: 576-581.

4. http://rarediseases.info.nih.gov/Resources/Rare_Diseases_Information.aspx (2010).

5. http://www.crdnetwork.org/blog/big-pharma-moves-from-blockbusters-to-niche-busters/.

6. http://cenblog.org/the-haystack/2010/10/gsk-highlights-rare-diseases-approach/.

7. http://www.xconomy.com/boston/2010/09/01/pfizer-gobbles-foldrx-in-big-pharmas-latest-rare-disease-play-in-boston-area/.

8. http://thebigredbiotechblog.typepad.com/the-big-red-biotech-blog/2010/10/big-pharma-and-governments-put-up-150-m-to-fight-neglected-diseases.html.

9. Bingham A, Ekins S (2009) Competitive Collaboration in the Pharmaceutical and Biotechnology Industry. Drug Discov Today 14: 1079-1081.

10. Hunter AJ (2008) The Innovative Medicines Initiative: a pre-competitive initiative to enhance the biomedical science base of Europe to expedite the development of new medicines for patients. Drug Discov Today 13: 371-373.

11. Barnes MR, Harland L, Foord SM, Hall MD, Dix I, et al. (2009) Lowering industry firewalls: pre-competitive informatics initiatives in drug discovery. Nat Rev Drug Discov 8: 701-708.

12. Bailey DS, Zanders ED (2008) Drug discovery in the era of Facebook--new tools for scientific networking. Drug Discov Today 13: 863-868.

13. Ekins S, Williams AJ (2010) Reaching out to collaborators: crowdsourcing for pharmaceutical research. Pharm Res 27: 393-395.

14. Moran M, Guzman J, Ropars AL, McDonald A, Jameson N, et al. (2009) Neglected disease research and development: how much are we really spending? PLoS Med 6: e30.

15. Morel CM, Acharya T, Broun D, Dangi A, Elias C, et al. (2005) Health innovation networks to help developing countries address neglected diseases. Science 309: 401-404.

16. Skolnick P (2010) Discovering and developing drugs to treat addictions: from models to medicines. 4th Annual CDD Community Meeting. San Francisco.



17. O'Connor KA, Roth BL (2005) Finding new tricks for old drugs: an efficient route for public-sector drug discovery. Nat Rev Drug Discov 4: 1005-1014.

18. Chong CR, Sullivan DJ, Jr. (2007) New uses for old drugs. Nature 448: 645-646.

19. Pollastri M (2010) Distributed drug discovery for neglected tropical diseases. 4th Annual CDD Community Meeting. San Francisco.

20. Hohman M, Gregory K, Chibale K, Smith PJ, Ekins S, et al. (2009) Novel web-based tools combining chemistry informatics, biology and social networks for drug discovery. Drug Discov Today 14: 261-270.

21. Macarrón R (2010) Contributions of HTS to drug discovery: a historical perspective. 4th Annual CDD Community Meeting. San Francisco.

22. Payne DJ, Gwynn MN, Holmes DJ, Pompliano DL (2007) Drugs for bad bugs: confronting the challenges of antibacterial discovery. Nat Rev Drug Discov 6: 29-40.

23. Barry CE (2010) Probing metabolic variation in Mycobacterium tuberculosis using qHTS; a systems approach to uncovering novel targets. 4th Annual CDD Community Meeting. San Francisco.

24. Williams AJ, Tkachenko V, Lipinski C, Tropsha A, Ekins S (2009) Free Online Resources Enabling Crowdsourced Drug Discovery. Drug Discovery World 10, Winter: 33-38.

25. Hann M, Hudson B, Lewell X, Lifely R, Miller L, et al. (1999) Strategic pooling of compounds for high-throughput screening. J Chem Inf Comput Sci 39: 897-902.

26. Huth JR, Mendoza R, Olejniczak ET, Johnson RW, Cothron DA, et al. (2005) ALARM NMR: a rapid and robust experimental method to detect reactive false positives in biochemical screens. J Am Chem Soc 127: 217-224.

27. Huth JR, Song D, Mendoza RR, Black-Schaefer CL, Mack JC, et al. (2007) Toxicological evaluation of thiol-reactive compounds identified using a La assay to detect reactive molecules by nuclear magnetic resonance. Chem Res Toxicol 20: 1752-1759.

28. Metz JT, Huth JR, Hajduk PJ (2007) Enhancement of chemical rules for predicting compound reactivity towards protein thiol groups. J Comput Aided Mol Des 21: 139-144.

29. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ (1997) Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Del Rev 23: 3-25.

30. Ekins S, Freundlich JS (2010) Validating new tuberculosis computational models with public whole cell screening aerobic activity datasets. Submitted.

31. Ekins S, Williams AJ, Xu JJ (2010) A Predictive Ligand-Based Bayesian Model for Human Drug Induced Liver Injury. Drug Metab Dispos.



32. Elliott R (2010) Drugs for 3rd world diseases: The good, the bad, and the ugly. 4th Annual CDD Community Meeting. San Francisco.

33. Baell J (2010) Disingenuous tool compounds: observations on screening-based research and some concerning trends in the literature. 4th Annual CDD Community Meeting. San Francisco.

34. Baell JB, Holloway GA (2010) New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays. J Med Chem 53: 2719-2740.

35. Gamo F-J, Sanz LM, Vidal J, de Cozar C, Alvarez E, et al. (2010) Thousands of chemical starting points for antimalarial lead identification. Nature 465: 305-310.

36. Ekins S, Williams AJ (2010) Meta-analysis of molecular property patterns and filtering of public datasets of antimalarial “hits” and drugs. MedChemComm In press.

37. Ekins S, Williams AJ (2010) When Pharmaceutical Companies Publish Large Datasets: An Abundance Of Riches Or Fool’s Gold? Drug Discov Today 15: 812-815.

38. Ekins S, Kaneko T, Lipinski CA, Bradford J, Dole K, et al. (2010) Analysis and hit filtering of a very large library of compounds screened against Mycobacterium tuberculosis. Mol BioSyst 6: 2316-2324.

39. Maddry JA, Ananthan S, Goldman RC, Hobrath JV, Kwong CD, et al. (2009) Antituberculosis activity of the molecular libraries screening center network library. Tuberculosis (Edinb) 89: 354-363.

40. Ananthan S, Faaleolea ER, Goldman RC, Hobrath JV, Kwong CD, et al. (2009) High-throughput screening for inhibitors of Mycobacterium tuberculosis H37Rv. Tuberculosis (Edinb) 89: 334-353.

41. Oprea TI (2002) Current trends in lead discovery: are we looking for the appropriate properties? J Comput Aided Mol Des 16: 325-334.

42. Oprea TI, Davis AM, Teague SJ, Leeson PD (2001) Is there a difference between leads and drugs? A historical perspective. J Chem Inf Comput Sci 41: 1308-1315.

43. Rosen J, Gottfries J, Muresan S, Backlund A, Oprea TI (2009) Novel Chemical Space Exploration via Natural Products. J Med Chem 52: 1953-1962.

44. Ekins S, Bradford J, Dole K, Spektor A, Gregory K, et al. (2010) A Collaborative Database And Computational Models For Tuberculosis Drug Discovery. Mol BioSyst 6: 840-851.

45. Guiguemde WA, Shelat AA, Bouck D, Duffy S, Crowther GJ, et al. (2010) Chemical genetics of Plasmodium falciparum. Nature 465: 311-315.

46. Gagaring K, Borboa R, Francek C, Chen Z, Buenviaje J, et al. Novartis-GNF Malaria Box. ChEMBL-NTD (www.ebi.ac.uk/chemblntd)



47. Kaiser D, Zdrazil B, Ecker GF (2005) Similarity-based descriptors (SIBAR)--a tool for safe exchange of chemical information? J Comput Aided Mol Des 19: 687-692.

48. Williams AJ (2010) Mobile chemistry - chemistry in your hands and in your face. Chemistry World May

49. Ekins S, Ring BJ, Grace J, McRobie-Belle DJ, Wrighton SA (2000) Present and future in vitro approaches for drug metabolism. J Pharmacol Toxicol Methods 44: 313-324.

50. Ekins S, Waller CL, Swaan PW, Cruciani G, Wrighton SA, et al. (2000) Progress in predicting human ADME parameters in silico. J Pharmacol Toxicol Methods 44: 251-272.

51. Ekins S, Swaan PW (2004) Computational models for enzymes, transporters, channels and receptors relevant to ADME/TOX. Rev Comp Chem 20: 333-415.

52. Ekins S, Williams AJ (2010) Precompetitive Preclinical ADME/Tox Data: Set It Free on The Web to Facilitate Computational Model Building to Assist Drug Development. Lab on a Chip 10: 13-22.

53. Ekins S, Ring BJ, Grace J, McRobie-Belle DJ, Wrighton SA (2000) Present and future in vitro approaches for drug metabolism. J Pharmacol Toxicol Methods 44: 313-324.

54. Gupta RR, Gifford EM, Liston T, Waller CL, Bunin B, et al. (2010) Using open source computational tools for predicting human metabolic stability and additional ADME/TOX properties. Drug Metab Dispos 38: 2083-2090.

55. Ekins S, Hohman M, Bunin BA (2010) Pioneering use of the cloud for development of the collaborative drug discovery (CDD) database. In: Ekins S, Hupcey MAZ, Williams AJ, editors. Collaborative Computational Technologies for Biomedical Research. Hoboken: Wiley and Sons.
