
Calculation Procedures for Molecular Lipophilicity: a Comparative Study


Quant. Struct.-Act. Relat. 15, 403-409 (1996)

Raimund Mannhold¹* and Karl Dross²

¹Heinrich-Heine-Universität, Institut für Lasermedizin, Arbeitsgruppe Molekulare Wirkstoff-Forschung, Universitätsstraße 1, D-40225 Düsseldorf, Germany

²Heinrich-Heine-Universität, C. und O. Vogt Institut für Hirnforschung, Universitätsstraße 1, D-40225 Düsseldorf, Germany

Dedicated to Prof. Dr. Dr. E. Mutschler on the occasion of his 65th birthday

Abstract

The predictive power of 14 calculation procedures for molecular lipophilicity is checked by comparison with reliable experimental log P values from the literature. The database of 138 test compounds comprises 90 simple organic structures and 48 chemically heterogeneous drug molecules (β-blockers, class I antiarrhythmics and neuroleptics).

The present investigations lead us to conclude that the predictive power of the calculation procedures is significantly better for simple organic molecules than for chemically heterogeneous drug structures. The calculation procedures can be arranged in three groups with significantly differing predictive power: fragmental > atom-based > conformation-dependent approaches.

Key words: lipophilicity; log P calculation procedures; fragmental, atom-based and conformation-dependent methods

Abbreviations

ASCLOGP: conformation-dependent log P, based on approximate surface calculation
CHEMICALC-2: calculated log P values, based on atomic contributions
CLOGP: calculated log P values, based on fragmental contributions
HINT: conformation-dependent log P, based on atomic contributions and lipophilicity potentials
KLOGP: calculated log P values, based on fragmental contributions
KOWWIN: calculated log P values, based on atom/fragment contributions
MOLCAD: calculated log P values, based on atomic contributions
PROLOGP-atomics: calculated log P values, based on atomic contributions
PROLOGP-cdr: calculated log P values, based on fragmental contributions
PROLOGP-comb: calculated log P values, based on atomic and fragmental contributions
SANALOGP-ER: calculated log P values, based on fragmental contributions
Σf-SYBYL: calculated log P values, based on fragmental contributions
SMILOGP: calculated log P values, based on atomic contributions
Tsar 2.2: calculated log P values, based on atomic contributions

* To receive all correspondence

© VCH Verlagsgesellschaft mbH, D-69469 Weinheim

1 Introduction

Drug absorption, plasma protein binding, hydrophobic drug-receptor interactions and, in part, the pharmacokinetic behaviour and toxicological properties of drug molecules, as well as formulation aspects like solubility, are examples where lipophilicity is considered a prime physico-chemical descriptor. An emerging field of application of lipophilicity is combinatorial chemistry. In the design of compound libraries, experimental or computed lipophilicity data can be used as estimates of oral drug absorption, an important part of bioavailability. The widespread application of lipophilicity to biophysical processes involving xenobiotics, in particular as a screening tool, easily explains the need for both valid and quick procedures to quantify molecular lipophilicity. Based on large sets of experimental data, various computational approaches have been designed to estimate log P values. For larger data sets, calculative approaches are superior to experimental procedures; for compound proposals they represent the only possibility.

Calculative approaches are either atom-based or use fragments. More recently, attention has been paid to conformational effects. Routine application of calculative approaches, however, demands a continuous check of their validity by comparison with experimental data. We have undertaken a benchmark study of 14 commonly used calculation methods including atom-based, conformation-dependent and fragmental approaches.

2 Methods

2.1 Calculation Approaches for log P

Calculations used in this study were performed and kindly provided by the corresponding colleagues as detailed in the acknowledgements.

0931-8771/96/0510-0403 $10.00+.25/0


Of the 14 calculation approaches (Table 1), 6 are based on fragmental systems. Three of them refer to the “reductionistic” hydrophobic fragmental constant approach developed by Rekker [1-4]. Σf-SYBYL and SANALOGP-ER use fragment values from the revised system [5], while fragment values of the original Rekker system [4] underlie the PROLOGP 5.1-cdr approach. For the “constructionistic” fragmental system of Hansch and Leo [6-8], the version CLOGP 4.34 was used. As a further example of fragmental procedures, KLOGP of Klopman [9] is included, in which the fragments are computer-identified by an automatic procedure (CASE). The AFC approach of Meylan and Howard [10] (KOWWIN) is based on atom/fragment contributions. Most remaining approaches are atom-based or conformation-dependent. The atom-based method of Ghose and Crippen [11-14] represents the basis for MOLCAD [15], Tsar 2.2 and PROLOGP 5.1-atomics. The program CHEMICALC-2 uses the Suzuki/Kudo procedure [16], and SMILOGP was developed by the group of Dubost [17]. Conformation-dependent approaches include HINT from Abraham and Kellogg [18] and ASCLOGP [19, 20]. PROLOGP 5.1-comb combines both fragmental [4] and atom-based approaches [14]. This program requires the user to weight the contributions of atom-based and fragmental parameters on the basis of regression analysis. The weighting used here is:

log P_comb = 0.73 log P_atomics + 0.26 log P_cdr
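The weighting above can be written as a small function. This is a minimal sketch rather than the PROLOGP implementation: the function name `combined_log_p` and the example inputs are hypothetical, and only the two coefficients are taken from the equation.

```python
def combined_log_p(log_p_atomics: float, log_p_cdr: float,
                   w_atomics: float = 0.73, w_cdr: float = 0.26) -> float:
    """Weighted sum of an atom-based and a fragmental log P estimate.

    The default weights are the regression-derived coefficients quoted
    in the text; everything else here is illustrative.
    """
    return w_atomics * log_p_atomics + w_cdr * log_p_cdr


# Hypothetical inputs, not values from the paper.
example = combined_log_p(log_p_atomics=2.0, log_p_cdr=2.2)
print(f"log P_comb = {example:.3f}")
```

The two coefficients sum to 0.99 rather than 1, which is consistent with their origin in a regression fit rather than in a normalized mixing rule.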

2.2 Experimental Data

Experimental data for comparison with calculated values represent octanol/water partition coefficients. Almost exclusively, log P* values from the recent tabulation of Hansch et al. [21] are used. Where log P* is missing, experimental data are taken from Mannhold et al. [22, 23] as well as Taylor and Cruickshank [24].

Table 1

3 Results

Comparative evaluation of the validity of the calculation procedures included in this study was performed as follows:

First, the individual differences between experimental log P (log P_exp) and calculated log P (log P_calc) were evaluated according to the following criteria, each category being reported as a percentage of the entire set: differences smaller than ±0.5 were rated acceptable, differences greater than ±0.5 but smaller than ±1.0 disputable, and differences exceeding ±1.0 unacceptable. The percentage of missing calculations is also given. In addition, overestimations (log P_calc > log P_exp) and underestimations (log P_calc < log P_exp) were counted for each procedure. Finally, the mean square deviations (m.s.d.) were calculated.
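The per-compound criteria can be sketched in a few lines. The following is an illustration under stated assumptions, not the authors' procedure: the deviation is taken as calculated minus experimental (so positive values are overestimations), deviations of exactly 0.5 or 1.0 are assigned to the lower category because the text leaves these edge cases open, m.s.d. is taken as the mean of the squared deviations over the compounds that could be calculated, and the function name and data are hypothetical.

```python
from typing import Optional, Sequence


def summarize_deviations(log_p_exp: Sequence[float],
                         log_p_calc: Sequence[Optional[float]]) -> dict:
    """Classify deviations (calc - exp); None marks a missing calculation."""
    n_total = len(log_p_exp)
    diffs = [c - e for e, c in zip(log_p_exp, log_p_calc) if c is not None]

    def pct(count: int) -> float:
        # Percentages refer to the entire set, including missing values.
        return 100.0 * count / n_total

    return {
        "acceptable_pct": pct(sum(abs(d) <= 0.5 for d in diffs)),
        "disputable_pct": pct(sum(0.5 < abs(d) <= 1.0 for d in diffs)),
        "unacceptable_pct": pct(sum(abs(d) > 1.0 for d in diffs)),
        "missing_pct": pct(n_total - len(diffs)),
        "over": sum(d > 0 for d in diffs),
        "under": sum(d < 0 for d in diffs),
        "msd": sum(d * d for d in diffs) / len(diffs) if diffs else float("nan"),
    }


# Made-up values for illustration only.
exp_values = [2.13, 1.87, 3.30, 0.64]
calc_values = [2.10, 1.17, 4.50, None]
summary = summarize_deviations(exp_values, calc_values)
for key, value in summary.items():
    print(key, value)
```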

Secondly, regression analysis was applied to correlate log P_exp and log P_calc, and the corresponding statistical criteria (r, s, Fisher value) were used to compare the quality of the different calculation procedures. Regression was forced through the origin, allowing a clear-cut evaluation of the deviation of the slope from the theoretically expected value of 1.0.
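A regression forced through the origin has a closed-form slope, a = Σxy / Σx². The sketch below shows one common way to compute the slope, the residual standard deviation and an F value for such a fit. It rests on assumptions the text does not settle: calculated values are used as x and experimental values as y, n - 1 degrees of freedom are used for the residuals, and F is the ratio of the regression sum of squares to the residual mean square. The correlation coefficient is left out because its definition for no-intercept fits varies between statistics packages. The function name and data are hypothetical.

```python
import math
from typing import Sequence


def regression_through_origin(x: Sequence[float], y: Sequence[float]) -> dict:
    """Least-squares fit of y = a * x without an intercept term."""
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    slope = sxy / sxx

    # Residual and regression sums of squares for the no-intercept model.
    sse = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    ssr = sum((slope * xi) ** 2 for xi in x)

    dof = n - 1  # one fitted parameter (the slope)
    s = math.sqrt(sse / dof)
    f_value = ssr / (sse / dof) if sse > 0 else math.inf
    return {"slope": slope, "s": s, "F": f_value}


# Made-up values for illustration only.
calc = [1.0, 2.0, 3.0]
exp = [1.1, 1.9, 3.2]
print(regression_through_origin(calc, exp))
```

A slope close to 1.0 together with a small s indicates that the calculated values track the experimental ones without systematic scaling.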

3.1 The Entire Database

Table 2 lists the experimental log P values together with the differences between experimental and calculated values for each calculation procedure, for the entire database of 138 compounds. Table 3 summarizes the analysis of the data obtained with this database.

In general, the analysis indicates a superiority of the fragmental methods over atom-based and conformation-dependent approaches. For fragmental methods, the share of acceptable calculations ranges from 77% (PROLOGP-cdr) to 91% (KOWWIN). For the other approaches, significantly lower percentages are found, ranging from 50% for SMILOGP to 77% for PROLOGP-atomics. Unacceptable calculations amount to less than 6% for fragmental approaches, while most of the non-fragmental procedures exhibit more than 10% of unacceptable calculations. Regarding over- and underestimations, fragmental and conformation-dependent methods exhibit a more or less balanced pattern. Atom-based approaches significantly underestimate log P. The mean square deviations (m.s.d.) underline the higher validity of fragmental methods (m.s.d. values < 0.200) versus non-fragmental methods (m.s.d. values > 0.250).

An important practical aspect is whether the programs can handle any kind of structure. In the present study, Σf-SYBYL, KOWWIN, MOLCAD, Tsar 2.2 and CHEMICALC-2 were able to calculate the entire set, underlining the fact that fragmental methods sometimes have calculation problems due to missing fragmental values. An enlarged set of 177 compounds (detailed data in [25]) further supports this statement. In that database, fragmental methods fail to some extent (1-5%), while KOWWIN and the above-mentioned atom-based approaches continue to calculate the entire database.

Regression analysis substantiates the results derived from the individual comparison of the calculation procedures. Correlation coef-


Table 2. Experimental log P values (log P*) as well as their differences to log P, obtained with 14 calculation programs, are listed for the entire database of 138 compounds.

Columns: log P*, Σf-SYBYL, SANALOGP-ER, PROLOGP-cdr, CLOGP 4.34, KLOGP, KOWWIN, PROLOGP-comb, MOLCAD, Tsar 2.2, PROLOGP-atomics, CHEMICALC, SMILOGP, HINT, ASCLOGP

Compounds 1-34 (BA = benzoic acid): 1 Acetic acid; 2 Propionic acid; 3 Butyric acid; 4 Benzoic acid; 5 2-Me-BA; 6 3-Me-BA; 7 4-Me-BA; 8 3-OMe-BA; 9 4-OMe-BA; 10 3-F-BA; 11 4-F-BA; 12 3-Cl-BA; 13 4-Cl-BA; 14 3-Br-BA; 15 4-Br-BA; 16 3-I-BA; 17 4-I-BA; 18 4-Bu-BA; 19 2-OH-BA; 20 4-OH-BA; 21 2,4-Di-OH-BA; 22 1-Naphthalenecarboxylic acid; 23 Phenylacetic acid; 24 3-Me-Phenylacetic acid; 25 3-F-Phenylacetic acid; 26 4-F-Phenylacetic acid; 27 4-Cl-Phenylacetic acid; 28 4-Br-Phenylacetic acid; 29 3-Phenylpropionic acid; 30 4-Phenylbutyric acid; 31 Benzophenone; 32 Benzamide; 33 2-OH-Benzamide; 34 4-OH-Benzamide

Compounds 35-69: 35 Phenol; 36 4-Cl-Phenol; 37 4-Br-Phenol; 38 1-Naphthol; 39 2-Naphthol; 40 Benzyl alcohol; 41 4-Me-Benzyl alcohol; 42 4-Cl-Benzyl alcohol; 43 Imidazole; 44 2-Me-Imidazole; 45 2-Phenyl-Imidazole; 46 Benzimidazole; 47 5,6-Di-Me-Benzimidazole; 48 Aniline; 49 4-NO2-Aniline; 50 4-Cl-Aniline; 51 4-Br-Aniline; 52 2-Naphthylamine; 53 2-Amino-Biphenyl; 54 2-Amino-Fluorene; 55 2-Amino-7-Br-Fluorene; 56 1-Amino-Anthracene; 57 3-Amino-Fluoranthene; 58 1-Amino-Pyrene; 59 Acridine; 60 4-Nitrotoluene; 61 4-Cl-Nitrobenzene; 62 4-Br-Nitrobenzene; 63 1-Nitro-Naphthalene; 64 Benzene; 65 Penta-Me-Benzene; 66 Toluene; 67 Biphenyl; 68 Bibenzyl; 69 Naphthalene

Row 1 (acetic acid): -0.17 -0.05 -0.05 -0.08 -0.06 -0.05 0.26 - 0 , l T T 0.00 -0.23 -0.15 0.10 -0.05 -0.10

0.33 -0.03 -0.03 0.79 0.03 0.03 1.87 -0.03 -0.04

2.37 -0.01 -0.02 2.27 0.09 0.08 2.02 -0.11 -0.12 1.96 -0.05 -0.06 2.15 -0.07 -0.08 2.07 0.01 0.00 268 -0.11 -0.12 2.65 -0.08 -0.09 2.87 -0.10 -0.11 2.86 -0.m -0.10 3.13 -0.05 -0.05 3.02 0.06 0.06 3.97 -0.06 -0.06 2.26 -0.98 -0.98 1.58 -0.30 -0.30 1.63 -0.91 -0.91 3.10 0.02 0.02 1.41 0.07 0.07 1.95 0.05 0.04 1.65 0.07 0.07 1.55 0.17 0.17 2.12 0.09 0.08 2.31 0.10 0.09 1.84 0.16 0.15 2.42 0.10 0.09 3.18 -0.13 -0.14 0.64 0.13 0 12

2.46 -0.10 -0.11

1.28 -1.07 -1.07

-0.06 0.00 -0.12 -0.19 -0.10 0.00 -0.20 -0.14 -0.19 -0.11 -0.19 -0.16 -0.17 -0.16 -0.12 -0.01 -0.15 -1.04 -0.36 -0.93 -0.08 0.00 -0.03 -0.03 0.07 0.03 0.04 0.08 0.02 0.11 0.09 -1.07

-0.03 -0.15 0.25 0.03 -0.21 0.28 0.02 -0.16 0.00 0.00 -0.37 -0.38 0.01 -0.28 0.05 0.11 -0.18 0.15 0.00 -0.34 -0.06 0.06 -0.28 0.00 -0.02 -0.25 -0.08 0.06 -0.17 0.00 0.02 -0.32 -0.16 0.05 -0.29 -0.13 -0.02 -0.32 -0.11

-0.02 -0.28 -0.09 0.09 -0.17 0.02 0.00 -0.67 -0.08 -0.07 -0.12 -0.02 -0.02 -0.01 -0.19 -0.01 0.37 0.13 6.04 -0.28 -0.05 0.00 0.07 0.02 -0.04 -0.08 0.03 -0.09 0.03 -0.02 0.01 0.13 0.08 0.01 0.02 -0.05

-0.01 -0.31 -0.10

-0.03 0.02 0.01 -0.13 0:05 0.45 -0.18 -0.13 0.36 0.00 -0.33 -0.03 0.02 -0.07 0.10 0.00 -0.03 -0.25

0.00 0.05 -0.30 -0.26 -0.10 -0.06 -0.23 -0.22 -0.16 -0.14 -0.41 -0.40 -0.44 -0.45 -0.45 -0.36 -0.20 -0.48 -0.35 -0.20 -0.31 0.07 0.07 0.08 0.08 0.08 0.05 0.09 0.02 0.33 0.18 -0.25

.~

0.13 0.13 0.02 0.07 0.07 0.07 -0.12 -0.12 -0.37 -0.25 -0.25 -0.28 -0.16 -0.16 -0.10 -0.06 -0.06 -0.07 -0.53 -0.53 -0.24 -0.47 -0.47 -0.25 -0.21 -0.26 -0.15 -0.19 -0.18 -0.15 -0.42 -0.42 -0.49 -0.39 -0.39 -0.49 -0.33 -0.33 -0.54 -0.32 -0.32 -0.56 -0.13 -0.13 -0.57 -0.02 -0.02 -0.49 -0.57 -0.57 -0.22 -0.80 -0.80 -0.27 -0.12 -0.12 -0.34 -0.45 -0.45 0.06 -0.35 -0.35 -0.40 0.27 0.27 0.09 0.19 0.20 0.10 0.17 0.17 0.12 0.27 0.27 0.08 0.08 0.08 0.10 0.16 0.16 0.06 -0.09 0.23 0.10 0.05 0.05 0.02 0.09 0.09 0.40 0.24 0.24 0.20 -0.68 -0.69 0.05

-0.16 0.06 -0.01 -0.08 -0.09 0.06 0.07 -0.32 -0.08 -0.92 -0.29 -0.54 -0.25 -1.09 -0.22 -1.04 -0.16 -1.00 -0.13 -0.52 -0.06 -0.90 -0.03 -0.42 -0.42 -0.94 -0.30 -1.05 -0.36 -0.88 -0.33 -0.98 -0.11 -1.06 -0.43 -0.76 -0.03 -0.98 -0.35 -0.68 -0.25 -1.11 -0.39 -0.93 -0.22 -1.08 -0.36 -0.89 -0.17 -1.03 -0.43 -0.96 -0.16 -1.02 -0.42 -0.94 -0.05 -0.88 -0.43 -0.94 0.06 -0.77 -0.32 -0.83 -0.16 -1.23 -0.11 -1.01 -0.25 -1.70 -1.15 -1.14 -0.60 -1 02 -0.67 -0.80 -0.43 -1.45 -1.12 -1.07 -0.08 -1.03 -0.36 -0.89 0.00 -0.46 -0.08 0.01 -0.12 -0.13 -0.10 0.00 0.01 -0.10 -0.32 -0.15 0.11 0.00 -0.22 -0.05 -0.06 -0.09 -0.22 -0.26 0.01 -0.01 -0.26 -0.28 -0.06 0.03 0.01 0.17 -0.10 -0.10 -0.03 0.09 -0.01 -1.11 1.06 0.29 0.00 -0.33 -0.37 -0.28 -0.42 -1.36 -1.72 0.57

3.33 -0.12 -0.12 -0.12 0.00 0.10 -0.07 0.15 0.27 0.27 0.25 -0.50 -0.41 -1.51 0.39

log P Sf SANA PROLCGP CLOGP KLOGP KOW PROLOGP MOL- Tsar PROLOGP CHEMC SMC HINT ASC SYBYL ER cdr 434 WIN comb CAD 2.2 atorria CALC LOGP LOG I

1.46 0.09 Th- 0.04 0.02 0.08 0.05 006 0.30 0.30 0.07 -0.07 0.02 0.00 0.31 2.39 -0.11 -0.11 -0.15 2.59 -0.11 -0.11 -0.14 2.84 0.00 -0.01 -0.07 2.70 0.14 0.13 0.07 1.10 -0.13 -0.13 -0.23 1.58 -0.09 -0.09 -0.19 1.96 -0.25 -0.26 -0.35 4.08 0.24 0.24 0.03 0.24 0.44 0.44 0.23 1.88 0.20 0.19 0.02 1.32 0.13 0.13 0.15 2.35 0.13 0.13 0.16 0.90 0.10 0.10 0.09 1.39 -0.59 -0.64 -1.15 1.88 -0.15 -0.15 -0.15 2.26 -0.33 -0.33 -0.32 2.28 0.01 0.00 -0.02 2.84 0.08 0.07 0.09 3.14 -0.11 -0.12 -0.05 3.92 0.03 0.03 0.12 3.69 -0.11 -0.12 -0.16 4.20 -0.18 -0.19 -0.07 4.31 0.36 -0.08 0.40 3.40 -0.09 -0.09 0.05 2.37 0.05 0.01 -0.09 2.39 0.24 0.20 0.11 2.55 0.27 0.24 0.16 3.19 0.00 -0.04 -0.15 2.13 -0.02 -0.03 -0.11 4.56 0.14 0.13 0.06 2.73 -0.10 -0.11 -0.19 4.01 0.01 0.01 -0.04 4.79 0.05 0.04 -0.07

0.10 0.05 -0.19 -0.05 0.00 0.02 -0.14 0.05 0.00 0.19 0.25 0.16 0.02 -0.13 0.03 -0.20 -0.19 -0.04 -0.44 -0.30 -0.43 -0.48 -0.59 0.01 0.04 0.24 0.23 -0.10 0.01 -0.12 -0.09 0.02 -0.20

-0.20 -0.23 -0.15 -0.11 -0.11 0.14 -0.21 -0.19 -0.19 -0.04 -0.04 -0.20 -0.19 -0.15 -0.10 -0.08 -0.08 -0.12 -0.05 -0.01 0.04 0.06 0.08 0.02 0.02 -0.02 0.06 0.41 0.41 0.17 -0.07 0.04 0.02 0.40 0.40 0.09 -0.18 -0.24 -0.07 0.07 0.07 0.03 0.28 0.14 0.19 -0.12 -0.17 0.24

0.01 -0.06 -0.06 0.12 0.07 -0.08 -0.01 -0.09 0.24 -0.45 -0.40 0.27 -0.27 -0.02 0.21 -0.55 -0.50 0.22 0.11 0.18 0.14 0.36 0.36 0.16 0.00 0.08 -0.87 -0.17 -0.17 -0.76

0.35 0.37 -0.19 0.44 0.39 -0.34

-0.22 -0.15 -0.11 -0.10 -0.10 -0.10 -0.40 -0.29 -0.33 -0.M -0.20 -0.34 -0.16 -0.03 -0.02 -0.01 -0.01 -0.02 -0.14 0.00 0.14 0.11 0.11 0.16 -0.24 -0.04 0.29 -0.16 -0.16 0.41 -0.17 0.07 0.50 -0.15 -0.15 0.63

-0.43 -0.18 -0.08 -0.61 -0.61 -0.08

-0.13 -0.08 0.08 -0.33 0.14 0.06 -0.31 -0.01 0.05 0.10 0.10 0.10 -0.06 0.07 0.06 0.13 0.13 0.05 -0.03 0.15 0.06 0.24 0.24 0.03 -0.41 -0.20 -0.20 -0.19 -0.19 -0.21 0.29 -0.14 -0.19 -0.08 -0.08 -0.22

-0.46 -0.26 -0.21 -0.42 -0.42 -0.23

-0.54 -0.29 -0.10 -0.72 -0.72 -0.29

-0.22 0.17 -0.31 -0.18 -0.18 -0.45 0.06 -0.19 -0.23 -0.22 -0.24 -0.24 0.10 -0.25 -0.20 -0.28 -0.28 -0.26 0.12 -0.05 -0.15 -0.26 -0.27 -0.18

-0.35 -0.29 -0.22 -0.29 -0.22 -0.27 -0.25 -0 24 -0.22 -0.11 -0.10 -0.08 -0.17 0.02 -0.30 -0.23 -0.05 -0.24 -0.38 -0.22 -0.57 0.w) -059 0.48

0.00 -065 0.29 0.17 -0.37 0.67 -0.02 -0.57 0.84 0.01 0.15 0.00

-0.32 -0.21 0.11

-0.18 -0.11 0.16 0.00 -0.35 0.31 -0.39 -0.63 0.15

-0.30 -0.03 0.82

-0.76 -0.65 -1.32

-0.44 -0.32 -0.12

-0.26 -0.52 0.23 -0.39 -0.40 -0.09 -0.55 -1.02 -0.13 -0.55 -0.52 -0.26 0.14 -0.09 0.04 -0.02 -0.40 -0.79 0.19 -0.21 -0.76 0.29 -0.11 -0.77 -0.07 -0.51 -1.11 0.07 -0.26 0.00 -0.26 -0.62 0.39 -0.11 -0.45 0.06 0.13 -0.71 -0.01 -0.04 -0.58 0.09

-0.2c -0.24 0.05 0.18 0.41 0.43 -0.04 -0.4c

0.34 -0.44

0.10

-0.64 -0.85 -0.35 0.96 -0.42 -0.62

0.08

-0.x

-0.38

0.03 -0.18 -0.26 0.01 0.65 0.53 0.53 -0.58 0.22 -0.05 0.1 1 0.10 0.06

3 30 0.10 0 09 0.00 002 0 23 -0 13 -0.14 -0 25 -0.25 -0.19 0.10 -0.31 -0.01 0 14


Table 2 (cont.)

Columns: log P*, Σf-SYBYL, SANALOGP-ER, PROLOGP-cdr, CLOGP 4.34, KLOGP, KOWWIN, PROLOGP-comb, MOLCAD, Tsar 2.2, PROLOGP-atomics, CHEMICALC, SMILOGP, HINT, ASCLOGP

Compounds 70-104: 70 2-Me-Naphthalene; 71 2,6-Di-Me-Naphthalene; 72 Anthracene; 73 2-Et-Anthracene; 74 9-Ph-Anthracene; 75 Fluoranthene; 76 Pyrene; 77 2-Me-Phenanthrene; 78 Cl-Benzene; 79 1,2-Di-Cl-Benzene; 80 1,3-Di-Cl-Benzene; 81 1,4-Di-Cl-Benzene; 82 1,2,3-Tri-Cl-Benzene; 83 1,2,4-Tri-Cl-Benzene; 84 1,3,5-Tri-Cl-Benzene; 85 1,2,3,4-Tetra-Cl-Benzene; 86 1,2,3,5-Tetra-Cl-Benzene; 87 1,2,4,5-Tetra-Cl-Benzene; 88 Penta-Cl-Benzene; 89 Hexa-Cl-Benzene; 90 1,4-Di-Br-Benzene; 91 Aprindine; 92 Carocainide; 93 Disopyramide; 94 Ethmozine; 95 Flecainide; 96 Indecainide; 97 Lidocaine; 98 Lorcainide; 99 Mexiletine; 100 Nicainoprol; 101 Procainamide; 102 Propafenone; 103 Quinacainol; 104 Quinidine

Compounds 105-138: 105 Acebutolol; 106 Alprenolol; 107 Atenolol; 108 Bunitrolol; 109 Bupranolol; 110 Metipranolol; 111 Metoprolol; 112 Oxprenolol; 113 Penbutolol; 114 Pindolol; 115 Propranolol; 116 Sotalol; 117 Alimemazine; 118 Chlorpromazine; 119 Fluphenazine; 120 Levomepromazine; 121 Perazine; 122 Perphenazine; 123 Promazine; 124 Promethazine; 125 Sulforidazine; 126 Thiethylperazine; 127 Thioridazine; 128 Trifluoperazine; 129 Trifluopromazine; 130 Alizapride; 131 Alpiropride; 132 Amisulpride; 133 Bromopride; 134 Metoclopramide; 135 Sulpiride; 136 Sultopride; 137 Tiapride; 138 Veralipride

3.86 0.05 0.05 -0.05 -0.04 0.05 -0.14 -0.14 $34 -0.34 -0.17 -0.04 -0.48 009 0.09 SYBYL ER cE--.-- 4.34 WIN comb CAD- -~~--atomics CALC LOG P _-

4.31 0.12 0.12 4.45 0.23 0.23

6.01 0.59 0.58

4.88 0.90 0.45 5.15 0.05 0.04

3.43 0.13 0.13 3.53 0.03 0.03 3.44 0.12 0.12 4.14 0.15 0.15 4.02 0.27 0.27 4.19 0.10 0.10 4.64 0.38 0.38 4.68 0.36 0.36 4.60 0.42 0.42 5.18 0.57 0.57 5.73 0.75 0.75 3.79 0.18 0.18 4.86 0.87 1.07 1.38 1.49 1.26 2.58 0.01 0.43

3.78 0.27 -0.61

2.26 1.15 1.14 4.85 -0.14 0.72 2.15 1.00 0.99 1.63 0.47 0.46 0.88 0.25 0.23 4.63 -0.47 3.63 0.53 0.74

5.85 -0.13 -0.14

5.16 0.18 -0.05

2.89 -0.05 -0.05

2.98 -1.90

3.11 -0.11 0.10

0.02 0.12 -0.24 0.51 0.01 0.87 0.23 -0.13 0.07 -0.03 0.06 0.10 0.22 0.05 0.34 0.32 0.38 0.54 0.73 0.13 0.79 1.46 0.27

1.20 0.15 0.87 -0.16 0.99 0.27 0.25

0.79

0.00 0.04 -0.33 0.37 -0.20 0.08 -0.16 -0.03 0.02 0.04 0.13 -0.10 0.14 0.09 -0.01 0.09 0.15 0.05 -0.03

-0.39 0.71 -1.25

0.57 -0.65 -0.28 -0.35 0.42 -0.25 0.20

-0.12

0.08

-0.01 0.19 -0.42 0.32 0.01 0.29 -0.13 -0.56 -0.44 -0.54 -0.45 -0.50 -0.38 -0.55 -0.35 -0.37 -0.31 -0.24 -0.13 -0.44 0.56 0.67 1.70

-0.44 1 .oo 0.41 -0.04 -0.02 0.62 0.75

0.52

-0.05 -0.03 -0.10 -0.07 -0.47 -0.37 0.10 0.23 -0.23 -0.23 0.05 0.23 -0.26 -0.13 -0.25 -0.31 -0.15 -0.30 -0.25 -0.25 -0.16 -0.18 -0.21 -0.45 -0.09 -0.14 -0.26 -0.14 -0.07 -0.35 -0.09 -0.20 -0.03 -0.14 0.04 -0.31 0.13 -0.45 -0.02 -0.21 1.04 0.38 0.63 0.26 0.38 0.85 -1.01 0.17 0.03 0.41 0.72 -0.60 0.56

0.46 0.31 0.06 0.09 0.09 0.04

1.19 0.83

-0.17 -0.12

-1.26

-0 33 -0.33 -0.40 -0.40 -0 94 -0.93 -0.27 -0.27 -0.79 -0.79 -0.51 -0.51 -0.63 -0.63 -0.32 -0.32 -0.35 -0.35 -0.45 -0 45 -0.36 -0.36 -0.54 -0.54

-0.59 -0.59 -0.52 -0.52

-0.48 -0.48 -0.54 -0.54 -0.58 -0.57 -0.16 -0.16 0.25 0.25

1.41 0.95 -0.43 -0.91 -0.80 -0.80 0.25 0.25 0.12 0.12 -0.61 -0.61 0.20 0.20 1.17 0.24 0.13 0.13

0.59 0.83

-0.42 -0.42

-0.54 -0.54

0.37 -1.01

-1.23 -1.23

-0.05 -0.14 -0.42 0.14 -0.31 -0.01 -0.26 -0.38 -0.43 -0.33 -0.27 -0.65 -0.27 -0.21 -0.60 -0.39 -0.33 -0.62 -0.89 -0.33 0.23 -0.17 1.06

-0.40 0.92 0.44 -0.11 0.07 0.02 -0.04

0.84

-0.08 0.15 -0.32 0.51 0.18 0.17 -0.14 -0.04 -0.05 -0.03 0.06 -0.44 0.01 -0.05 -0.62 -0.31 -0.04 -0.84 -1.60 0.23 0.24 0.02 -0.04 -2.38 0.51 0.08 -1.35 -1.14 0.22 -1.10 -0.23 -0.77 0.56

-0.49 0.18 0.16 -0.34 0.00 0.10 -0.87 -0.20 -0.48 -0.47 0.31 0.05 -1.17 -0.25 -0.23 -0.28 0.02 0.02 -0.63 -0.04 -0.13 -0.40 -0.05 -0.14 -0.33 0.00 -0.41 -0.43 -0.10 -0.39 -0.34 -0.01 -0.29 -0.42 -0.12 -081 -0.30 0.00 -0.56 -0.47 -0.17 -0.62 -0.30 -0.03 -0.95 -0.32 -0.05 -0.95

-0.22 0.02 -1.07 -0.26 0.01 -0.75

-0.15 0.06 -1 37 -0.15 -0.06 -0.30

1.69 1.97 -3 01

0.88 -0.76 1.27

-0.96 -2.23 1.05 -0.16 -1.45 0.84 -0.13 1.22 -0.36

1.65 1.62 -0.32 0.86 0.19 -1.02 2.28 1.43 -0.73 0.45 2.00

0.97 0.10 0.70 2.88-0.12 0.09 0.27 0.05 0.29 0.41 -0.25 -0.03 -026 -0.43 -0.74 -068 0.41 -028

log P Sf SANA PROLOGP CLOGP KLOGP KOW PROLOGP MOL- Tsar PROLOGP CHEMI- SMC HINT ASC WIN comb CAD 2.2 atomics CALC LOGP LOG F

1.71 0.01 0.22 0.37 -0.08 -0.05 -0.52 0.04 -0.73 -0.73 -0.08 -0.25 -1.08 0.55 2.05 -.- SYBYL ER c b 4.34

3.10 0.23 0.22 0.44 0.16 0.33 0.31 0.55 1.91 0.21 -0.02 0.57 2.80 0.93 0.70 1.32 2.86 0.62 0.60 0.66 1.88 0.31 0.30 0.44 2.10 0.78 0.77 0.99 4.15 0.52 0.72 0.94 1.75 0.43 0.42 0.76 2.98 0.49 0.48 0.64

4.71 0.18 0.11 -0.22

4.36 0.28 0.42 -0.02 4.68 0.28 0.21 -0.12 3.61 0.48 0.85 0.16 4.20 0.14 0.29 -0.27

0.59 -0.26 -0.05 0.04

5.19 -0.09 -0.16 -0.48

4.55 -0.17 -0.25 -0.58 4.81 -0.22 -0.51 -0.26 4.45 -0.70 -0.55 -0.63 5.41 -0.18 0.18 -0.53 5.90 0.02 0.17 -0.13 5.03 0.08 0.45 -0.11 5.19 0.20 0.13 -0.07 1.79 -0.60 0.10 0.05 1.69 -1.84 -1.63 -1.65 1.10 -1.05 -0.84 -0.58 2.83 -0.71 -0.72 -0.68

0.42 0.13 0.33 0.26 1.06 0.10 0.30 0.50

2.62 -0.70 -0.71 -0.88

0.90 -0.15 -0.16 -0.02

-0.45 -0.27 -0.17 0.37 -0.11 -0.68 -0.30 -0.1 1 -0.08 -0.23 -0.36 -0.12 0.02 1.55 -0.03 1.61 1.38 -0.27 -0.1 6 0.32 0.90 0.52 1.46 0.33 0.95 -0.17 0.63 -0.76

0.69 0.87 0.04

-0.75

-0.45 -0.29 -0.18 0.78 -0.19 0.28 -0.15 -0.49 0.15 0.32 0.27 0.70 0.05 0.00 -0.02 0.33 -0.19 0.09 0.14 -0.27 0.26 -0.85 0.05 -0.06 0.15 -0.27 0.25 -0.23 -0.38 0.02 0.55 -0.22 -0.04 0.06 0.27 -0.74

0.35' -0.23 -0.90 0.07 0.38 -0.77 0.57 0.58 -0.66 0.09 -0.38 -1.18 -0.08 0.01 -0.87 -0.44 -0.32 -0.95 -0.14 -0.22 -0.88 0.03 -0.17 -1.64

-0.07 0.01 -0.64

-0.21 0.55 -0.77 0.23 0.08 -0.77 0.35 0.33 -0.21 0.02 0.01 -0.69 -0.64 -1.14 -1.59 0.11 0.01 -0.85 -0.38 -0.89 -0.92 -0.36 -0.93 -0.87 0.51 0.23 0.25 0.62 0.27 0.13 0.85 -0.37 -0.29

-0.26 -0.26 0.40 0.40 0.00 0.00 0.06 0.06 0.04 0.04 -0.09 -0.09 0.18 0.18 -0.79 -0.79 -0.01 -0.18

0.61 0.13 -0.91 -0.91 -1.37 -1.37 -0.82 -0.82 -1.13 -1.13 -0.51 -0.51 -1.02 -1.02 -1.25 -1.25 -1.15 -1.15 -1.21 -1.21

-1.72 -1.71 -1.04 -1.04 -1.01 -1.01 -0.13 -0.76 -1.35 -1.72 -0.80 -0.80 -1.28 -1.28 -1.35 -1.35

-0.18 -0.18

-1.87 -1.87

0.06 -0.31 0.15 0.02 -0.22 -0.22

-0.41 0.19 0.00 0.47 -0.27 -0.04 0.00 -0.42 0.07 -0.20 -0.07 -0.92 -0.70 -1.22 -1.01 -0.95 -1.51 -0.98 -1.20 -0.97 -204 -1.01 -1.01 -0.26 -0.96 -1.57 -0.94 -1.01 -0.94 0.24 -0.01 -0.40

-1.16 -0.90 -0.72 -0.14

-0.14 -0.81 -0.56 -0.91 -0.35 -0.40 -0.30 -0.79 -0.98 -0.69 -0.83 -0.74 -0.83 -0.35 -1.01 0.06 -1.30 -0.01 -1.34 -0.68

-0.51

-1.17 0.22 -0.64 -1.56 -1.54 -0.88 -1.31 0.02 -1.60 -1.79 -0.47 -1.36 -2.59 -1.30 0.14 -1.05 -2.00 -0.94 0.36 -0.69 -1.28 -1.91 -0.86 -1.44 0.11 -1.66 0.08 -1.72 0.47 -0.60 0.47 -0.59 0.64 -1.04

0.25 0.49 -0.64 0.07 0.10 0.14 0.71 1.04 2.12 0.65 0.08 0.63 0.91 0.55 0.71 0.37 0.68 090 0.46 0.16 -1.01 0.12 0.22 0.37 -0.21 0.12 0.59 0.01 0.30 -0.11 1.63 0.76 0.94 -0.03 -0.16 025 0.04 -0.10 -0.02 -0.76 1.00 -0.23 0.22 0.13 0.61 0.57 -0.40 0.36

-0.66 1.60 1.51 1.16 0.07 2.27 0.13 1.80 -0.32 0.07 1.71 0.25 1.06 0.56

-0.06 1.46

1.47 -0.74 -0.54 -0.57 -0.79 -0.34 -090 -0.63 -0.84 -1.21 -0.65 -2.14 -1 63 -0.62 -0.90


Table 3. In the first part of the table the percentage of acceptable (log P* - log P_calc < ±0.5), disputable (log P* - log P_calc > ±0.5 and < ±1.0) and unacceptable calculations (log P* - log P_calc > ±1.0), the number of over- (> log P*) and underestimations (< log P*) and the mean square deviations (m.s.d.) are given. The second part lists the slope (a), its confidence interval (± c.i.), the standard deviation (s), the correlation coefficient (r) and the Fisher value (F) for regressions, forced through the origin, between experimental and calculated log P values.

Table 4. The percentage of acceptable (log P* - log P_calc < ±0.5), disputable (log P* - log P_calc > ±0.5 and < ±1.0) and unacceptable calculations (log P* - log P_calc > ±1.0), the number of over- (> log P*) and underestimations (< log P*), the mean square deviations (m.s.d.) and the Fisher value (F) for regressions between experimental and calculated log P, forced through the origin, are given.

ficients range from 0.957 to 0.974 for comparing experimental and calculated data from fragmental methods; for atom-based and conformation-dependent procedures regression analysis yields coefficients from 0.873 to 0.947. Corresponding differences are found for the Fisher values, which clearly separate the quality of the fragmental methods (F > 1400) from that of the others (F < 1200).

Because users of log P calculation procedures have different interests (simple compounds in ecotoxicology, drugs in medicinal chemistry), we have separately analysed a subset of 90 simple organic compounds and a subset of 48 chemically rather heterogeneous drug molecules.

3.2 Simple Organic Compounds

Table 4 summarizes the analysis of the data obtained with this subset, substantiating the superiority of the fragmental methods over atom-based and conformation-dependent approaches. However, differences between the various calculation procedures are less pronounced for the simple compounds than for the entire database. For the set of simple organic compounds, all programs were able to calculate the entire set. With fragmental methods, more than 90% of the calculations are acceptable. For the other approaches, significantly lower percentages are found (only PROLOGP-atomics and CHEMICALC-2 surmount the 90% level). Over- and underestimations are evenly distributed for the Rekker- and Hansch/Leo-based programs, while all other approaches exhibit a significant trend to underestimate log P. The most prominent example is SMILOGP, which underestimates 82 out of 90 test cases. The underestimation of log P by atom-based procedures is significantly more pronounced for log P > 4. The mean square deviations (m.s.d.) again substantiate the higher validity of the fragmental methods (m.s.d. < 0.1).

Data from regression analysis well correspond to the results derived from individual comparison of the calculation procedures. Fisher- values clearly indicate a higher statistical validity of the fragmental methods with the exception of PROLOGP-cdr.

The homologous nature of several compounds within this subset allows some comparative considerations on the treatment of gen- eral structures and fragments such as benzene or methylene by the different calculation procedures.

Benzene (experimental log P 2.13) is satisfactorily calculated by Σf-SYBYL, SANALOGP-ER, CLOGP and HINT. PROLOGP-cdr, KOWWIN, MOLCAD, Tsar 2.2 and CHEMICALC-2 exhibit moderate differences to log Pexp, while pronounced differences are obtained with KLOGP (+0.29!), PROLOGP-comb, PROLOGP-atomics, SMILOGP and ASCLOGP.

Methylene represents one of the fragments most frequently occurring in chemical structures and should be calculated with adequate accuracy. The Rekker-based approaches calculate a mean value of 0.52 for methylene irrespective of an aliphatic or aromatic attachment. KOWWIN, MOLCAD, Tsar 2.2 and HINT show higher values for aromatically than for aliphatically attached CH2, while the opposite holds for CLOGP, KLOGP, CHEMICALC-2 and SMILOGP. PROLOGP-atomics and ASCLOGP exhibit no fixed values for methylene.
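One way to read off the value a program effectively assigns to methylene is to take calculated log P values along a homologous series and divide each difference by the number of CH2 groups added. The following is a minimal sketch under that assumption; the function names and the input format are hypothetical and are not taken from any of the tested programs.

```python
def methylene_increments(series):
    """Per-CH2 increments between successive members of one homologous series.

    `series` maps the number of CH2 groups (int) to a calculated log P (float).
    This input format is an assumption made for illustration.
    """
    points = sorted(series.items())
    if len(points) < 2:
        raise ValueError("need at least two homologs")
    return [
        (logp_b - logp_a) / (n_b - n_a)
        for (n_a, logp_a), (n_b, logp_b) in zip(points, points[1:])
    ]


def mean_methylene_increment(series):
    """Average of the per-CH2 increments over the whole series."""
    increments = methylene_increments(series)
    return sum(increments) / len(increments)
```

A program with a fixed methylene value returns the same increment for every pair of neighbours; varying increments correspond to the "no fixed values" behaviour noted for PROLOGP-atomics and ASCLOGP.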

All non-fragmental methods as well as KLOGP underestimate benzoic acid and its derivatives; this might be due to a miscalculation of the lipophilic contribution of an aromatically attached carboxyl group.

Programs based on identical or at least similar calculation methods show very similar validity: significant differences between Σf-SYBYL and SANALOGP-ER, both based on the revised version of the Rekker approach, are only found for pyrene and aminopyrene, indicating a better treatment of polycondensation in the latter approach.


CHEMICALC-2 and PROLOGP-atomics best fit experimental log P among non-fragmental procedures. MOLCAD and Tsar 2.2 use the original Ghose/Crippen approach, while PROLOGP-atomics is based on an extension and recalculation of Ghose/Crippen parameters. Correspondingly, MOLCAD and Tsar 2.2 show almost identical results. Recalculation of the Ghose/Crippen approach (PROLOGP-atomics) results in a slight improvement of predictive quality. However, for benzene as well as benzoic acid and some of its analogs MOLCAD and Tsar 2.2 better fit the experimental data.

3.3 Drug Molecules

Comparative analysis of the data of this subset is given in Table 5. The quality of the calculations for the drug molecules parallels the results obtained with the simple organic structures. However, a general reduction in the predictive power is observed. Again, the validity of the fragmental methods is higher than that of the non-fragmental procedures. Acceptable calculations are obtained with the fragmental methods in at least 50% of the test compounds (PROLOGP-cdr) and approximately 73% in the best case (KOWWIN). For atom-based and conformation-dependent procedures acceptable calculations are found in roughly 40%; outliers here are PROLOGP-atomics (52%), SMILOGP and CHEMICALC-2 (27%).

The percentage of unacceptable calculations amounts to 2% for KLOGP and about 10% in the case of CLOGP, Σf-SYBYL and KOWWIN. On the other hand, the non-fragmental procedures unacceptably calculate log P of the drug molecules to a significantly larger extent, ranging from 21% for PROLOGP-atomics to 38% for CHEMICALC-2.

Inspection of under- and overestimations substantiates the trend of fragmental methods to overestimate and the significant preference of atom-based procedures to underestimate log P.

A comparison of the mean square deviations (m.s.d.) further supports the superiority of the fragmental methods: values between 0.25 (KLOGP) and 0.45 (Σf-SYBYL and CLOGP) were calculated, while corresponding values increase from 0.57 (PROLOGP-atomics) up to even 1.07 (ASCLOGP) in the case of non-fragmental approaches.

Table 5. The percentage of acceptable (log P* - log Pcalc < ±0.5), disputable (log P* - log Pcalc > ±0.5 and < ±1.0) and unacceptable calculations (log P* - log Pcalc > ±1.0), the number of over- (> log P*) and underestimations (< log P*), the mean square deviations (m.s.d.) and the Fisher value (F) for regressions between experimental and calculated log P, forced through the origin, are given.
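The per-program statistics reported for the subsets (percentages of acceptable, disputable and unacceptable calculations, counts of over- and underestimations, and the m.s.d.) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the names `DeviationSummary` and `summarize_deviations` are hypothetical, deviations of exactly 0.5 or 1.0 are assigned to the lower class (the text does not say how ties are handled), and the m.s.d. is taken as the arithmetic mean of the squared deviations.

```python
from dataclasses import dataclass


@dataclass
class DeviationSummary:
    pct_acceptable: float    # |deviation| <= 0.5 (tie handling is an assumption)
    pct_disputable: float    # 0.5 < |deviation| <= 1.0
    pct_unacceptable: float  # |deviation| > 1.0
    n_over: int              # calculated log P above experimental log P*
    n_under: int             # calculated log P below experimental log P*
    msd: float               # mean of squared deviations (assumed definition)


def summarize_deviations(exp_logp, calc_logp):
    """Compare experimental log P* with calculated log P for one program."""
    if len(exp_logp) != len(calc_logp) or not exp_logp:
        raise ValueError("need two non-empty lists of equal length")
    n = len(exp_logp)
    deviations = [calc - exp for exp, calc in zip(exp_logp, calc_logp)]
    n_acceptable = sum(abs(d) <= 0.5 for d in deviations)
    n_disputable = sum(0.5 < abs(d) <= 1.0 for d in deviations)
    n_unacceptable = n - n_acceptable - n_disputable
    return DeviationSummary(
        pct_acceptable=100.0 * n_acceptable / n,
        pct_disputable=100.0 * n_disputable / n,
        pct_unacceptable=100.0 * n_unacceptable / n,
        n_over=sum(d > 0 for d in deviations),
        n_under=sum(d < 0 for d in deviations),
        msd=sum(d * d for d in deviations) / n,
    )
```

Compounds that a program could not calculate would have to be dropped from both lists before the call, which is why the percentage of calculated structures is reported separately.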

Inspecting the percentage of drug molecules calculated by the different programs shows that Σf-SYBYL, KOWWIN, MOLCAD, Tsar 2.2 and CHEMICALC-2 were able to calculate the entire test set; the percentage of calculated structures is significantly reduced in the case of SMILOGP (80%). Regressions forced through the origin demonstrate the strongly reduced predictive power of all programs for structurally more complex drug molecules as compared to simple organic structures (for details compare Fisher values in Tables 4 and 5).
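A regression forced through the origin and its Fisher value can be sketched as below. The text does not state which variable is regressed on which, nor the exact F formula used, so this sketch assumes experimental log P is regressed on calculated log P, uses uncentred sums of squares, and takes 1 and n - 1 degrees of freedom, as is usual for a one-parameter model without intercept. The function name is hypothetical.

```python
def regression_through_origin(calc_logp, exp_logp):
    """Fit exp = slope * calc (no intercept) and return (slope, F).

    F = (SS_regression / 1) / (SS_residual / (n - 1)), with uncentred
    sums of squares; this choice of formula is an assumption.
    """
    n = len(calc_logp)
    if n != len(exp_logp) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    sxx = sum(x * x for x in calc_logp)
    if sxx == 0:
        raise ValueError("calculated values are all zero")
    sxy = sum(x * y for x, y in zip(calc_logp, exp_logp))
    slope = sxy / sxx
    ss_reg = sum((slope * x) ** 2 for x in calc_logp)
    ss_res = sum((y - slope * x) ** 2 for x, y in zip(calc_logp, exp_logp))
    if ss_res == 0:
        return slope, float("inf")  # perfect fit
    return slope, ss_reg / (ss_res / (n - 1))
```

Because F depends on the number of compounds, values from subsets of different size (90 simple compounds versus 48 drugs) are not directly comparable in magnitude; the comparison between programs within one subset is the more informative one.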

Comparing programs based on identical or at least similar calculation methods yields the following: revision of the Rekker system (Σf-SYBYL and SANALOGP-ER) resulted in an improved validity as compared to the original version (PROLOGP-cdr). Σf-SYBYL and SANALOGP-ER are of almost identical quality; nevertheless, in 27 out of 46 test cases the differences between log P* and log P, calculated with the above programs, vary significantly, indicating some principal differences in the computerization approaches.

Amongst atom-based procedures the Ghose/Crippen-related programs best fit the experimental data. As already found for the simple organic compounds, PROLOGP-atomics exhibits a somewhat higher validity than MOLCAD and Tsar 2.2, as proven by both the m.s.d. value and the Fisher value.

4 Discussion

The central role of lipophilicity for various aspects of research in medicinal chemistry is undisputed [4, 5, 21, 26, 27]. Correspondingly, the availability of precise approaches to quantify lipophilicity with experimental and calculative methods attracts profound interest. Calculation of lipophilicity was first approached by fragmental methods, which split molecules into adequate fragments and apply correction rules coupled with the molecular connectivity. In contrast, atom-based procedures avoid correction factors and define huge numbers of atom types; lipophilicity is quantified by mere summation of atom-type values. Both fragmental and atom-based procedures suffer from the drawback of treating the impact of regioisomerism and molecular flexibility on lipophilicity in an unsatisfactory manner. Thus, the youngest generation of calculation approaches attempts to reflect conformational aspects and to provide the user with hydrophobic fields for application in 3D QSAR [28-34].
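The contrast between the two additive schemes can be written compactly. The notation below is introduced here for illustration only and is not taken from any of the cited methods: a_i is the number of occurrences of fragment i with fragmental value f_i, b_j is the number of times correction rule j applies with value c_j, and n_k is the number of atoms of type k with contribution t_k.

```latex
% Fragmental scheme: fragment values plus connectivity-dependent corrections
\log P_{\mathrm{frag}} = \sum_{i} a_i\, f_i \;+\; \sum_{j} b_j\, c_j

% Atom-based scheme: plain summation over atom types, no correction terms
\log P_{\mathrm{atom}} = \sum_{k} n_k\, t_k
```

Neither sum contains a term that depends on the three-dimensional conformation, which is the limitation the conformation-dependent procedures set out to address.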

We have evaluated 14 commercially available fragmental, atom-based or conformation-dependent calculation approaches in a comparative test of their predictive power. There are, of course, several criteria for a comparative evaluation of programs. Price, platform (PC, Unix) and the support of the program or the preferential interest of the user deserve mention here. For the user in the medicinal chemistry and agrochemical field, the predictive power of calculations for drug molecules is by far more relevant than that for simple organic compounds, which may be of more interest for users working on ecotoxicological studies. Accordingly, we have included in our database both simple organic compounds and chemically diverse drug molecules.

The entire database comprises 138 test compounds. It is self-evident that an enlargement of the database would significantly increase the statistical robustness of the analysis. On the other hand, one should not expect unequivocal clarity even from expanded data sets. Müller [35] compared 1217 compounds with KOWWIN and CLOGP, resulting in a superiority of CLOGP; Meylan and Howard (personal communication) find just the opposite with a database of 12 000 compounds. In addition, comprehensive databases (say 1000 molecules or more) would not allow an individual comparison of calculations for the various structures included in the test set, as done in this study.

Owing to the rather limited number of compounds and some limitation in the molecular diversity of our database, one should not overestimate this comparative analysis. Nevertheless, both the entire database and the subsets (simple organic compounds and drug molecules) yield almost identical results, indicating the validity of the conclusions drawn here.

Taken together, our analysis demonstrates a significantly higher quality of the fragmental methods as compared to atom-based and conformation-dependent approaches. The predictive power of the calculation procedures is significantly better for simple organic molecules than for chemically heterogeneous drug structures.

Acknowledgements

The authors would like to thank the following colleagues for their participation in the comparison of log P calculation methods:

Z. Bencz and F. Csizmadia (PROLOGP), M. Bohl (Σf-SYBYL), C. Cook and G.E. Kellogg (HINT), J.P. Dubost (SMILOGP), M. Kansy (ASCLOGP), G. Klopman (KLOGP), W. Meylan (KOWWIN), D. Petelin (SANALOGP), E.E. Polymeropoulos (MOLCAD), A. ter Laak (CHEMICALC-2) and H. van de Waterbeemd (CLOGP and Tsar 2.2).

The intensive discussions on the manuscript with Han van de Waterbeemd are especially acknowledged.

5 References

[1] Nys, G.G. and Rekker, R.F., Chim. Ther. 8, 521-535 (1973).
[2] Nys, G.G. and Rekker, R.F., Eur. J. Med. Chem. 9, 361-375 (1974).
[3] Rekker, R.F. and de Kort, H.M., Eur. J. Med. Chem. 14, 479-488 (1979).
[4] Rekker, R.F., The Hydrophobic Fragmental Constant, Pharmacochemistry Library, Vol. 1, Elsevier, Amsterdam, 1977.
[5] Rekker, R.F. and Mannhold, R., Calculation of Drug Lipophilicity, VCH, Weinheim, 1992.
[6] Leo, A., Jow, P.Y.C., Silipo, C. and Hansch, C., J. Med. Chem. 18, 865-868 (1975).
[7] Hansch, C. and Leo, A.J., Substituent Constants for Correlation Analysis in Chemistry and Biology, John Wiley, New York, 1979.
[8] Chou, J. and Jurs, P.C., J. Chem. Inform. Comput. Sci. 19, 172-178 (1979).
[9] Klopman, G., Li, J.-Y., Wang, S. and Dimayuga, M., J. Chem. Inf. Comput. Sci. 34, 752-781 (1994).
[10] Meylan, W. and Howard, P., J. Pharm. Sci. 84, 83-92 (1995).
[11] Ghose, A.K. and Crippen, G.M., J. Comp. Chem. 7, 565-577 (1986).
[12] Ghose, A.K. and Crippen, G.M., J. Chem. Inf. Comput. Sci. 27, 21-35 (1987).
[13] Ghose, A.K., Pritchett, A. and Crippen, G.M., J. Comp. Chem. 9, 80-90 (1988).
[14] Viswanadhan, V.N., Ghose, A.K., Revankar, G.R. and Robins, R.K., J. Chem. Inf. Comput. Sci. 29, 163-172 (1989).
[15] Brickmann, J. and Waldherr-Teschner, M., Informationstechnik 33, 83-90 (1991).
[16] Suzuki, T. and Kudo, Y., J. Comp.-Aid. Mol. Design 4, 155-198 (1990).
[17] Convard, T., Dubost, J.-P., Le Solleu, H. and Kummer, E., Quant. Struct.-Act. Relat. 13, 34-37 (1994).
[18] Kellogg, G.E. and Abraham, D., J. Comp.-Aid. Mol. Design 5, 545-552 (1991).
[19] Ulmschneider, M., Analytical Model for the Calculation of van der Waals and Solvent Accessible Surface Areas. Contribution to the Calculation of Free Enthalpies of Hydration and Octanol/Water Partition Coefficients. Ph.D. Thesis, University of Haute-Alsace, Mulhouse, France, 1993.
[20] Van de Waterbeemd, H., Karajiannis, H., Kansy, M., Obrecht, D., Müller, K. and Lehmann, Chr., Conformation-lipophilicity relationships of peptides and peptide mimetics. In: Trends in QSAR and Molecular Modeling 94. Sanz, F. (Ed.). Prous, Barcelona, in press (1995).
[21] Hansch, C., Leo, A. and Hoekman, D., Exploring QSAR. Hydrophobic, Electronic, and Steric Constants. American Chemical Society, Washington, DC, 1995.
[22] Mannhold, R., Dross, K. and Rekker, R.F., Quant. Struct.-Act. Relat. 9, 21-28 (1990).
[23] Mannhold, R., Rekker, R.F., Sonntag, C., ter Laak, A.M., Dross, K. and Polymeropoulos, E.E., J. Pharm. Sci. 84, 1410-1419 (1995).
[24] Taylor, P.J. and Cruickshank, J.M., J. Pharm. Pharmacol. 37, 143-144 (1985).
[25] Sonntag, Ch., Experimentelle und kalkulative Verfahren zur Bestimmung der Lipophilie und weiterer physikalisch-chemischer Eigenschaften von Arzneistoffen und einfachen organischen Molekülen. Ein Beitrag zu LSER- und QSAR-Studien. Ph.D. Thesis, Heinrich-Heine-Universität Düsseldorf, 1995.
[26] Lipophilicity in Drug Action and Toxicology (eds. V. Pliska, B. Testa and H. van de Waterbeemd), in: Methods and Principles in Medicinal Chemistry (eds. R. Mannhold, H. Kubinyi and H. Timmerman), VCH Publishers, Weinheim, 1996.
[27] Leo, A., Chem. Reviews 93, 1281-1306 (1993).
[28] Pixner, P., Heiden, W., Merx, H., Moeckel, G., Möller, A. and Brickmann, J., J. Chem. Inf. Comput. Sci. 34, 1309-1319 (1994).
[29] Heiden, W., Moeckel, G. and Brickmann, J., J. Comp.-Aid. Mol. Design 7, 503-514 (1993).
[30] Gaillard, P., Carrupt, P.-A., Testa, B. and Boudon, A., J. Comp.-Aid. Mol. Design 8, 83-96 (1994).
[31] Richards, N.G.J. and Williams, P.B., Chem. Design Automat. News 9, 1-24 (1994).
[32] Kim, K.H., J. Comp.-Aid. Mol. Design 9, 308-318 (1995).
[33] Kellogg, G.E., Joshi, G.S. and Abraham, D.J., Med. Chem. Res. 1, 444-453 (1992).
[34] Abraham, D. and Kellogg, G.E., J. Comp.-Aid. Mol. Design 8, 41-49 (1994).
[35] Müller, M., Comparative evaluation of commercial log Pow estimation software. QSAR 96 - Past, present and future. VIIth Int. Workshop on QSARs in Environmental Sciences. Helsingør, Denmark, 1996.

Received on April 12th, 1996; accepted on July 24th, 1996.