QSAR Modeling of Peripheral Versus Central Benzodiazepine Receptor Binding Affinity of 2-Phenylimidazo[1,2-a]pyridineacetamides using Optimal Descriptors Calculated with SMILES

Kunal Roy a*, Andrey Toropov b and Ivan Raska, Jr. c

a Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India, E-mail: [email protected]

b Uzbek Academy of Sciences, Institute of Geology and Geophysics, Khodzhibaev Street 49, Tashkent 700041, Uzbekistan

c 3rd Medical Department, 1st Faculty of Medicine, Charles University in Prague, U nemocnice 1, 12808 Prague 2, Czech Republic

Keywords: QSAR, SMILES, optimal descriptor, peripheral benzodiazepine receptor

Received: June 14, 2006; Accepted: September 07, 2006

DOI: 10.1002/qsar.200630072

Abstract

The optimization of correlation weights scheme has been applied to model Peripheral Benzodiazepine Receptor (PBR) binding affinity (ovary and cortex) and PBR binding selectivity (peripheral versus central benzodiazepine receptor) of 2-phenylimidazo[1,2-a]pyridineacetamides. In the present study, optimal descriptors based on Simplified Molecular Input Line Entry System (SMILES) notation have been used for modeling. The optimized descriptor, formulated from the training set data, generated statistically acceptable relations (for PBR cortex, $q^2 = 0.717$, $r^2 = 0.756$; for PBR ovary, $q^2 = 0.836$, $r^2 = 0.852$; for PBR cortex selectivity, $q^2 = 0.732$, $r^2 = 0.784$; for PBR ovary selectivity, $q^2 = 0.828$, $r^2 = 0.845$). When the relations of PBR binding affinity or selectivity with the optimized molecular descriptor, formulated from the training set data, were used to calculate the corresponding response parameters of the test set, the $r^2_{\mathrm{Pred}}$ values were found to be satisfactory (for PBR cortex, 0.692; for PBR ovary, 0.682; for PBR ovary selectivity, 0.772) except in the case of PBR cortex selectivity ($r^2_{\mathrm{Pred}} = 0.321$). The results indicate promising potential of the optimization of correlation weights based on SMILES notation in modeling studies.

1 Introduction

The muscle relaxant, anticonvulsant, anxiolytic, and sedative actions of Benzodiazepines (BZs) are mediated primarily via the Central Benzodiazepine Receptors (CBRs) located in the central nervous system [1]. The CBRs are part of a macromolecular complex that also contains a γ-Aminobutyric Acid (GABA) receptor site and a chloride ion channel. BZs also bind to other receptors, located mainly in peripheral tissues and glial cells in the brain, called Peripheral Benzodiazepine Receptors (PBRs) [1].

PBR is a five-transmembrane-domain mitochondrial protein involved in the regulation of cholesterol transport from the outer to the inner mitochondrial membrane, the rate-determining step in steroid hormone biosynthesis [2–6]. PBRs are composed of at least three subunits: the binding site for isoquinolines, with a molecular mass of 18 kDa; the voltage-dependent anion channel, with a molecular mass of 32 kDa, which binds BZs; and the adenine nucleotide carrier, with a molecular mass of 30 kDa, which also binds BZs. Although isoquinolines can bind to the 18-kDa subunit alone, PBR-specific BZs require the interaction of all three subunits for binding [1]. Consistent with its localization in the Mitochondrial Permeability Transition Pore (MPTP), PBR is involved in the regulation of apoptosis, regulation of cell proliferation, stimulation of steroidogenesis, immunomodulation, porphyrin transport, heme biosynthesis, anion transport, and regulation of mitochondrial functions [4]. Numerous results suggest that the use of specific PBR ligands to modulate PBR activity may have therapeutic applications and might be of significant clinical benefit in the management of a large spectrum of indications, including cancer and autoimmune, infectious, and neurodegenerative diseases [1, 4].

© 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim. QSAR Comb. Sci. 26, 2007, No. 4, 460–468

Because PBRs appear to be involved in a large variety of physical diseases, mental disorders, and responses to stress, clinical benefit may be attainable by increasing the pharmacological knowledge surrounding these receptors and the structure–activity relationships of PBR ligands.

A QSAR study based on molecular-orbital-derived parameters of 1-(2-chlorophenyl)-N-methyl-N-(1-methylpropyl)-3-isoquinolinecarboxamide derivatives was reported by Cappelli et al. [7]. 3-D-QSAR studies were reported for the PBR ligands pyrrolo[2,1-d][1,5]benzothiazepine derivatives [8], pyrrolo[2,1-d][1,5]benzoxazepine derivatives [8], quinoline and isoquinoline derivatives [8], isoquinolinecarboxamides [9], and 2-phenylpyrazolo[1,5-a]pyrimidin-3-yl-acetamides [10]. Recently, QSAR studies based on physicochemical and topological descriptors have been reported for 2-phenylimidazo[1,2-a]pyridine derivatives [11–13] and 2-phenylpyrazolo[1,5-a]pyrimidin-3-yl-acetamides [14]. Against this background, the present paper attempts to model PBR versus CBR binding affinity of 2-phenylimidazo[1,2-a]pyridineacetamides using optimal descriptors calculated with the Simplified Molecular Input Line Entry System (SMILES).

2 Materials and Methods

2.1 SMILES

SMILES is a line notation (a typographical method using printable characters) for entering and representing molecules [15, 16]. SMILES contains the same information as might be found in an extended connection table. The primary reason why SMILES is more useful than a connection table is that it is a linguistic construct rather than a computer data structure. Properties of the SMILES notation such as uniqueness, compactness, human understandability, machine readability, and universal nature open many doors to the chemical information programmer. Examples of uses for SMILES include keys for database access, a mechanism for researchers to exchange chemical information, an entry system for chemical data, and part of languages for Artificial Intelligence or Expert Systems in chemistry.

2.2 Optimization of Correlation Weights

The large number of descriptors available for modeling studies poses the practical problem of selecting appropriate ones. Though many such descriptors are highly intercorrelated, a large amount of chemical information can be decoded by using an appropriate combination of useful descriptors. One has to take care that the chosen descriptors extract the maximum amount of chemical information and, at the same time, that the descriptors used in a multiple regression equation are not intercorrelated among themselves. The concept of flexible topological descriptors, originally introduced by Randić [17–19], is a major breakthrough in this regard, as the difficulties of multiple regression are not present in such an approach. Flexible descriptors do not have a definite predetermined formalism that can be applied to any set of compounds for the modeling of biological activity or physicochemical property. Instead, the formalism of such descriptors is defined by an optimization procedure that seeks the best relation for a particular dataset. Several descriptors have been proposed along this line and their utility has also been explored [20–29]. Among these, an interesting sort of flexible descriptor is based on the optimized correlation weights of the local graph invariants [27–29]. This scheme has been successfully applied to model different sets of biological activity and physicochemical property data [30–44].

2.3 Dataset

In the present QSAR study, PBR and CBR binding affinity data (pKi) reported by Trapani et al. [12] were used as the model dataset. The affinity data (pKi) of the 2-phenylimidazo[1,2-a]pyridineacetamide series for PBR, or the PBR/CBR selectivity data (ΔpKi), were used as the response variable in the QSAR analyses. There are six regions of structural variation in the compounds: the R1 and R2 positions (showing diverse substitution patterns), the R3 position (showing limited structural variation), and the X, Y, and Z positions (showing the presence or absence of a chloro substituent) (Table 1). The observed PBR and CBR binding affinity values are given in Table 5. The dataset was divided into a training set and a test set such that all SMILES fragments occur in the training set and, for all test set compounds, the binding affinity data are available for all endpoints. The size of the test set was kept on the order of 15% of the total number of data points.
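The fragment-coverage condition on the split can be checked mechanically. The following is a minimal sketch, assuming each compound has already been split into a list of SMILES fragments; the function name `uncovered_fragments` and the toy fragment lists are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Set


def uncovered_fragments(train: Iterable[List[str]],
                        test: Iterable[List[str]]) -> Set[str]:
    """Return the test-set fragments that never occur in the training set.

    An empty result means every test-set fragment has a correlation
    weight that was optimized on the training set.
    """
    seen: Set[str] = set()
    for fragments in train:
        seen.update(fragments)
    missing: Set[str] = set()
    for fragments in test:
        missing.update(f for f in fragments if f not in seen)
    return missing


# Toy fragment lists (illustrative only, not the paper's compounds).
toy_train = [["C", "c", "Cl", "("], ["N", "=O", "C"]]
print(uncovered_fragments(toy_train, [["C", "N", "Cl"]]))  # set()
print(uncovered_fragments(toy_train, [["C", "Br"]]))       # {'Br'}
```

A split would be acceptable under the stated criterion only when the returned set is empty.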

2.4 Optimal Descriptors Based on SMILES

SMILES for the compounds were obtained from the site http://cactus.nci.nih.gov/services/translate/. Optimal descriptors based on SMILES notation can be used for modeling purposes [42]. In the present paper, the optimal descriptor was calculated as

$$\mathrm{DCW}(\mathrm{SMILES}) = \prod_{k} \mathrm{CW}(SF_k) \tag{1}$$

where $SF_k$ is a SMILES fragment and $\mathrm{CW}(SF_k)$ is the correlation weight of $SF_k$. The weights are combined multiplicatively: the worked example in Table 3 (42 fragments, DCW = 1.00137) corresponds to the product of the listed weights.

$SF_k$ entries are obtained from the SMILES according to the following hierarchy: (i) fragments of four characters (if any, for instance [O-], [N+], [Sb], etc.); (ii) fragments of three characters (C=C, C#C, C#N, etc.); (iii) fragments of two characters (Cl, =O, Br, etc.); and (iv) all others.

It is to be noted that the characters "(" and ")" indicate the same phenomenon (branching of the atom skeleton). It is therefore logical, when building the $SF_k$, to replace ")" by "(".
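The fragment hierarchy and the descriptor calculation can be sketched in a few lines. This is an illustrative reading of the rules above, not the authors' PASCAL program: the helper names `smiles_fragments` and `dcw` are hypothetical, the multi-character fragment lists contain only the examples named in the text plus the C=O entry of Table 2, and the weights are the pKi(PBR_CORTEX) column of Table 2. Under these assumptions the sketch reproduces the 42 fragments and the DCW value of Table 3 for compound 1.

```python
import math

# Multi-character fragments, checked longest first (assumed, illustrative lists).
FRAGMENTS_BY_LENGTH = (
    ("[O-]", "[N+]", "[Sb]"),      # four characters
    ("C=C", "C#C", "C#N", "C=O"),  # three characters
    ("Cl", "=O", "Br"),            # two characters
)

# Correlation weights for pKi(PBR_CORTEX), copied from Table 2.
CW_CORTEX = {
    "n": 0.9950365, "c": 0.9998772, "O": 0.9940314, "N": 0.9939637,
    "Cl": 1.0103491, "C": 0.9991819, "C=O": 0.9937590, "C=C": 0.9971809,
    "=O": 1.0105853, "5": 1.0026058, "4": 0.9988051, "3": 1.0054255,
    "2": 1.0142803, "1": 1.0014338, "(": 0.9955460,
}


def smiles_fragments(smiles):
    """Split a SMILES string into fragments SF_k (hypothetical helper)."""
    text = smiles.replace(")", "(")  # both brackets mark branching
    fragments = []
    pos = 0
    while pos < len(text):
        match = text[pos]  # fallback: a single character
        for group in FRAGMENTS_BY_LENGTH:
            found = next((f for f in group if text.startswith(f, pos)), None)
            if found is not None:
                match = found
                break
        fragments.append(match)
        pos += len(match)
    return fragments


def dcw(smiles, weights):
    """Product of the correlation weights of all fragments, as in Table 3."""
    return math.prod(weights[f] for f in smiles_fragments(smiles))


compound_1 = "CCCCN(CCCC)C(=O)Cc2c(c1ccc(Cl)cc1)nc3ccccn23"
print(len(smiles_fragments(compound_1)))     # 42, as in Table 3
print(round(dcw(compound_1, CW_CORTEX), 5))  # 1.00137, as in Table 3
```

The greedy longest-match loop is one simple way to honor the four-, three-, two-, one-character hierarchy; how the original program resolves overlapping candidates is not stated in the text.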


Table 1. Structural features and SMILES of 2-phenylimidazo[1,2-a]pyridineacetamides.

| Compound no. | X | Y | Z | R1 | R2 | R3 | SMILES |
|---|---|---|---|---|---|---|---|
| 1 | H | H | Cl | n-C4H9 | n-C4H9 | H | CCCCN(CCCC)C(=O)Cc2c(c1ccc(Cl)cc1)nc3ccccn23 |
| 2 | H | Cl | Cl | n-C4H9 | n-C4H9 | H | CCCCN(CCCC)C(=O)Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cccn23 |
| 3 | Cl | Cl | Cl | n-C4H9 | n-C4H9 | H | CCCCN(CCCC)C(=O)Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23 |
| 4 | Cl | Cl | Cl | n-C6H13 | n-C6H13 | H | CCCCCCN(CCCCCC)C(=O)Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23 |
| 5 | Cl | H | Cl | n-C4H9 | n-C4H9 | H | CCCCN(CCCC)C(=O)Cc2c(c1ccc(Cl)cc1)nc3ccc(Cl)cn23 |
| 6 | Cl | H | Cl | n-C6H13 | n-C6H13 | H | CCCCCCN(CCCCCC)C(=O)Cc2c(c1ccc(Cl)cc1)nc3ccc(Cl)cn23 |
| 7 | Cl | Cl | H | n-C4H9 | C6H5 | H | CCCCN(C(=O)Cc2c(c1ccccc1)nc3c(Cl)cc(Cl)cn23)c4ccccc4 |
| 8 | Cl | Cl | Cl | n-C4H9 | C6H5 | H | CCCCN(C(=O)Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23)c4ccccc4 |
| 9 | Cl | H | Cl | n-C4H9 | C6H5 | H | CCCCN(C(=O)Cc2c(c1ccc(Cl)cc1)nc3ccc(Cl)cn23)c4ccccc4 |
| 10 | Cl | Cl | H | n-C4H9 | CH2C6H5 | H | CCCCN(Cc1ccccc1)C(=O)Cc3c(c2ccccc2)nc4c(Cl)cc(Cl)cn34 |
| 11 | Cl | Cl | Cl | tert-C4H9 | CH2C6H5 | H | CC(C)(C)N(Cc1ccccc1)C(=O)Cc3c(c2ccc(Cl)cc2)nc4c(Cl)cc(Cl)cn34 |
| 12 | Cl | Cl | Cl | n-C3H7 | 4-NO2-CH2C6H5 | H | CCCN(Cc1ccc(N(=O)=O)cc1)C(=O)Cc3c(c2ccc(Cl)cc2)nc4c(Cl)cc(Cl)cn34 |
| 13 | Cl | Cl | Cl | C6H5 | H | H | O=C(Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23)Nc4ccccc4 |
| 14 | Cl | Cl | Cl | CH2CH=CH2 | CH2CH=CH2 | H | C=CCN(CC=C)C(=O)Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23 |
| 15 | Cl | Cl | Cl | -(CH2)4- | (ring with R1) | H | O=C(Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23)N4CCCC4 |
| 16 | Cl | Cl | H | -(CH2)4- | (ring with R1) | H | O=C(Cc2c(c1ccccc1)nc3c(Cl)cc(Cl)cn23)N4CCCC4 |
| 17 | Cl | Cl | H | -(CH2)5- | (ring with R1) | H | O=C(Cc2c(c1ccccc1)nc3c(Cl)cc(Cl)cn23)N4CCCCC4 |
| 18 | Cl | Cl | Cl | -(CH2)5- | (ring with R1) | H | O=C(Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23)N4CCCCC4 |
| 19 | Cl | H | Cl | -CH2CH(COOC2H5)(CH2)3- | (ring with R1) | H | CCOC(=O)C4CCCN(C(=O)Cc2c(c1ccc(Cl)cc1)nc3ccc(Cl)cn23)C4 |
| 20 | Cl | Cl | Cl | -CH2CH(COOC2H5)(CH2)3- | (ring with R1) | H | CCOC(=O)C4CCCN(C(=O)Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23)C4 |
| 21 | Cl | Cl | Cl | -(CH2)2N(CH2C6H5)(CH2)2- | (ring with R1) | H | O=C(Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23)N5CCN(Cc4ccccc4)CC5 |
| 22 | Cl | Cl | H | – | – | – | O=C(Cc2c(c1ccccc1)nc3c(Cl)cc(Cl)cn23)N5CCc4ccccc4C5 |
| 23 | Cl | Cl | Cl | – | – | – | O=C(Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23)N5CCc4ccccc4C5 |
| 24 | Cl | Cl | H | 2-Pyridylethyl | CH3 | H | CC(c1ccccn1)N(C)C(=O)Cc3c(c2ccccc2)nc4c(Cl)cc(Cl)cn34 |
| 25 | Cl | Cl | Cl | 2-Pyridylethyl | CH3 | H | CC(c1ccccn1)N(C)C(=O)Cc3c(c2ccc(Cl)cc2)nc4c(Cl)cc(Cl)cn34 |
| 26 | Cl | Cl | H | 4-Pyridyl | H | H | O=C(Cc2c(c1ccccc1)nc3c(Cl)cc(Cl)cn23)Nc4ccncc4 |
| 27 | Cl | Cl | Cl | n-C4H9 | H | H | CCCCNC(=O)Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23 |
| 28 | Cl | Cl | Cl | C6H11 | H | H | O=C(Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23)NC4CCCCC4 |
| 29 | Cl | Cl | H | C6H11 | H | H | O=C(Cc2c(c1ccccc1)nc3c(Cl)cc(Cl)cn23)NC4CCCCC4 |
| 30 | Cl | Cl | Cl | CH2C6H5 | H | H | O=C(Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23)NCc4ccccc4 |
| 31 | Cl | Cl | Cl | n-C3H7 | n-C3H7 | CH3 | CCCN(CCC)C(=O)C(C)c2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23 |
| 32 | Cl | Cl | Cl | C6H11 | CH3 | CH3 | CC(C(=O)N(C)C1CCCCC1)c3c(c2ccc(Cl)cc2)nc4c(Cl)cc(Cl)cn34 |
| 33 | Cl | Cl | Cl | CH2C6H5 | CH3 | CH3 | CC(C(=O)N(C)Cc1ccccc1)c3c(c2ccc(Cl)cc2)nc4c(Cl)cc(Cl)cn34 |
| 34 | Cl | Cl | Cl | n-C4H9 | CH3 | H | CCCCN(C)C(=O)Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23 |
| 35 | Cl | Cl | H | n-C4H9 | CH3 | H | CCCCN(C)C(=O)Cc2c(c1ccccc1)nc3c(Cl)cc(Cl)cn23 |
| 36 | Cl | Cl | Cl | C6H5 | CH3 | H | CN(C(=O)Cc2c(c1ccc(Cl)cc1)nc3c(Cl)cc(Cl)cn23)c4ccccc4 |
| 37 | Cl | Cl | Cl | CH2C6H5 | CH3 | H | CN(Cc1ccccc1)C(=O)Cc3c(c2ccc(Cl)cc2)nc4c(Cl)cc(Cl)cn34 |

In compounds 15–21, R1 and R2 together form the ring-closing chain listed under R1.

The descriptor (DCW) defined in Eq. (1) is obtained from the optimal correlation weights of the different SMILES fragments, which are found by a Monte-Carlo optimization procedure. The aim of this optimization is to make the correlation coefficient between the property/activity of the training set and the descriptor (DCW) as large as possible. The predictability of the model should be validated using a test set. The starting value of each correlation weight was 1, and the Monte-Carlo iterative optimization procedure [25–27] was used to determine the best values of the correlation weights CW(SFk), i.e., those giving the largest possible correlation coefficient between the response values of the training set and the molecular descriptor (DCW). Based on the optimized correlation weights, the molecular descriptor was finally defined, and it was then used to derive the relations for the PBR binding affinity or selectivity values of both the training and test sets using the least squares method of regression

$$\mathrm{p}K_i = a + b \cdot \mathrm{DCW}(\mathrm{SMILES}) \tag{2}$$
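The optimization loop described above can be sketched as a simple random search. This is a minimal illustration under stated assumptions, not the authors' PASCAL program [45]: the acceptance rule (keep a perturbation only if the correlation coefficient increases), the step size, the iteration count, and all function names are assumptions, and the descriptor is taken as the product of fragment weights, consistent with the worked example in Table 3.

```python
import math
import random


def pearson_r(x, y):
    """Correlation coefficient; returns 0.0 when either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def descriptor(fragments, weights):
    """DCW as the product of the weights of a compound's fragments."""
    return math.prod(weights[f] for f in fragments)


def optimize_weights(train_fragments, y, n_iter=2000, step=0.01, seed=0):
    """Random-search sketch: start all weights at 1, keep improving moves."""
    rng = random.Random(seed)
    weights = {f: 1.0 for frags in train_fragments for f in frags}
    keys = sorted(weights)
    best_r = pearson_r([descriptor(fr, weights) for fr in train_fragments], y)
    for _ in range(n_iter):
        key = rng.choice(keys)
        old = weights[key]
        weights[key] = old * (1.0 + rng.uniform(-step, step))
        r = pearson_r([descriptor(fr, weights) for fr in train_fragments], y)
        if r > best_r:
            best_r = r          # accept the perturbed weight
        else:
            weights[key] = old  # reject and restore
    return weights, best_r


def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b * x, as in Eq. (2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return my - b * mx, b


# Toy data (hypothetical): the response grows with the number of "A" fragments.
toy_fragments = [["A"], ["A", "A"], ["A", "A", "A"]]
toy_y = [1.0, 2.0, 3.0]
toy_weights, toy_r = optimize_weights(toy_fragments, toy_y)
a, b = fit_line([descriptor(fr, toy_weights) for fr in toy_fragments], toy_y)
```

Because the search only ever accepts moves that raise the correlation on the training set, the resulting model must still be checked on a test set, as the text requires.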

The optimization of correlation weights was done using a PASCAL program developed by one of the authors (AAT) [45]. Least squares linear regression analyses were done using the MINITAB software [46]. The statistical quality of the equations [47] was judged by parameters such as $r_a^2$ (adjusted $r^2$, i.e., explained variance), $r$ (correlation coefficient), $F$ (variance ratio) with df (degrees of freedom), and $s$ (standard error of estimate). The significance of the regression coefficients was judged by the corresponding standard errors and the t-test. Predicted Residual Sum of Squares (PRESS) statistics were calculated for the training set by the "Leave-One-Out" (LOO) technique [48], and the corresponding $q^2$ (cross-validated $r^2$, or predicted variance) values were reported. The predictive capacity of each model was assessed by applying it to the test set, and the value of $r^2_{\mathrm{Pred}}$ was reported.

Definitions of some of the statistical terms are given below.

Coefficient of determination $r^2$: This is the most commonly used term to describe the goodness of fit of data to a regression model. This statistic is defined by the following equation:

$$r^2 = 1 - \frac{\sum (Y_{\mathrm{Calc}} - Y)^2}{\sum (Y - \bar{Y})^2} \tag{3}$$

In Eq. (3), $Y_{\mathrm{Calc}}$ and $Y$ indicate calculated and observed activity values, respectively, and $\bar{Y}$ indicates the mean activity value.

Explained variance $r_a^2$: The explained variance of the training set without validation may be defined as follows

$$r_a^2 = \frac{(n - 1)\,r^2 - p}{n - p - 1} \tag{4}$$

In Eq. (4), $r^2$ is the coefficient of determination, $p$ is the number of predictor variables, and $n$ is the number of compounds.

Variance ratio ($F$): It gives an indication of the stability of the regression coefficients

$$F = \frac{\sum (Y_{\mathrm{Calc}} - \bar{Y})^2 / p}{\sum (Y_{\mathrm{Calc}} - Y)^2 / (n - p - 1)}, \qquad \mathrm{df} = p,\; n - p - 1 \tag{5}$$

In Eq. (5), df is the degrees of freedom.

Standard error of estimate ($s$): This is defined as

$$s = \sqrt{\frac{\sum (Y_{\mathrm{Calc}} - Y)^2}{n - p - 1}} \tag{6}$$

Cross-validated $r^2$ ($q^2$): It measures the predictive $r^2$ (LOO), i.e., the part of the variance explained in the validation data

$$q^2 = 1 - \frac{\sum (Y_{\mathrm{Pred}} - Y)^2}{\sum (Y - \bar{Y})^2} \tag{7}$$

In Eq. (7), $Y_{\mathrm{Pred}}$ and $Y$ indicate predicted and observed activity values, respectively, and $\bar{Y}$ indicates the mean activity value.

PRESS: This is the predicted residual sum of squares, i.e., the sum of squared differences between predicted and observed values.

$$\mathrm{PRESS} = \sum (Y_{\mathrm{Pred}} - Y)^2 \tag{8}$$

$r^2_{\mathrm{Pred}}$: The predictive $r^2$ is based only on molecules present in the test set and is defined as

$$r^2_{\mathrm{Pred}} = 1 - \frac{\sum \left(Y_{\mathrm{Pred(Test)}} - Y_{\mathrm{(Test)}}\right)^2}{\sum \left(Y_{\mathrm{(Test)}} - \bar{Y}_{\mathrm{training}}\right)^2} \tag{9}$$

In Eq. (9), $Y_{\mathrm{Pred(Test)}}$ and $Y_{\mathrm{(Test)}}$ indicate predicted and observed activity values, respectively, of the test set compounds, and $\bar{Y}_{\mathrm{training}}$ indicates the mean activity value of the training set. $r^2_{\mathrm{Test}}$ is the squared correlation coefficient ($r^2$) between the observed and predicted data of the test set.

Table 2. Correlation weights for different SMILES fragments obtained from Monte-Carlo optimization aimed to maximize R(Y, DCW) (R is the correlation coefficient, Y is the response variable, and DCW is the optimal descriptor).

| SMILES fragment | pKi(PBR_CORTEX) | pKi(PBR_OVARY) | ΔpKi(PBR_CORTEX-CBR) | ΔpKi(PBR_OVARY-CBR) |
|---|---|---|---|---|
| n | 0.9950365 | 0.9968634 | 0.9970717 | 0.9960953 |
| c | 0.9998772 | 1.0001041 | 0.9999016 | 0.9999643 |
| O | 0.9940314 | 0.9953737 | 0.9964309 | 0.9942827 |
| N | 0.9939637 | 0.9955660 | 0.9960640 | 0.9940735 |
| Cl | 1.0103491 | 1.0049125 | 1.0068791 | 1.0054656 |
| C | 0.9991819 | 0.9995661 | 0.9994635 | 0.9992867 |
| C=O | 0.9937590 | 0.9975620 | 0.9966945 | 0.9976437 |
| C=C | 0.9971809 | 0.9993360 | 0.9981368 | 0.9986585 |
| =O | 1.0105853 | 1.0055849 | 1.0067035 | 1.0068941 |
| 5 | 1.0026058 | 1.0004610 | 1.0016301 | 1.0009198 |
| 4 | 0.9988051 | 0.9992692 | 0.9991288 | 0.9989332 |
| 3 | 1.0054255 | 1.0055195 | 1.0022071 | 1.0128276 |
| 2 | 1.0142803 | 1.0071304 | 1.0123293 | 1.0108348 |
| 1 | 1.0014338 | 1.0005562 | 1.0083617 | 1.0057760 |
| ( | 0.9955460 | 0.9978616 | 0.9972535 | 0.9977324 |
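The statistical terms of Eqs. (3) to (9) can be computed as in the following sketch for a single-descriptor model. It is not the MINITAB workflow used in the paper; the function names and the three-point toy dataset are hypothetical, and the leave-one-out loop simply refits the line with one compound removed at a time.

```python
import math


def mean(values):
    return sum(values) / len(values)


def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b * x."""
    mx, my = mean(x), mean(y)
    b = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return my - b * mx, b


def r_squared(y_obs, y_calc):                     # Eq. (3)
    m = mean(y_obs)
    sse = sum((c - o) ** 2 for o, c in zip(y_obs, y_calc))
    return 1.0 - sse / sum((o - m) ** 2 for o in y_obs)


def adjusted_r_squared(r2, n, p=1):               # Eq. (4)
    return ((n - 1) * r2 - p) / (n - p - 1)


def variance_ratio(y_obs, y_calc, p=1):           # Eq. (5)
    n, m = len(y_obs), mean(y_obs)
    ssr = sum((c - m) ** 2 for c in y_calc)
    sse = sum((c - o) ** 2 for o, c in zip(y_obs, y_calc))
    return (ssr / p) / (sse / (n - p - 1))


def standard_error(y_obs, y_calc, p=1):           # Eq. (6)
    sse = sum((c - o) ** 2 for o, c in zip(y_obs, y_calc))
    return math.sqrt(sse / (len(y_obs) - p - 1))


def press_loo(x, y):                              # Eq. (8), leave-one-out
    total = 0.0
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (a + b * x[i] - y[i]) ** 2
    return total


def q_squared(x, y):                              # Eq. (7)
    m = mean(y)
    return 1.0 - press_loo(x, y) / sum((o - m) ** 2 for o in y)


def r_squared_pred(y_test, y_pred, y_train_mean):  # Eq. (9)
    num = sum((p_ - o) ** 2 for o, p_ in zip(y_test, y_pred))
    return 1.0 - num / sum((o - y_train_mean) ** 2 for o in y_test)


# Toy data (hypothetical), small enough to check by hand.
toy_x = [1.0, 2.0, 3.0]
toy_y = [1.0, 3.0, 2.0]
toy_a, toy_b = fit_line(toy_x, toy_y)
toy_calc = [toy_a + toy_b * v for v in toy_x]
print(r_squared(toy_y, toy_calc))   # 0.25
print(press_loo(toy_x, toy_y))      # 20.25
```

Note that $q^2$ and $r^2_{\mathrm{Pred}}$ can be negative for a poorly predictive model (the toy data give $q^2 = -9.125$), whereas $r^2$ of a least-squares fit with intercept cannot.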

3 Results and Discussion

Among the response parameters (Table 5), the CBR binding affinity was not modeled because of its limited variance. The optimization of correlation weights was repeated using different probes, and in all cases compound 27 appeared as an outlier (its residual was more than twice the standard error of estimate of the corresponding equation). Compound 27 was therefore excluded from the final models. Though various probes were tried, only the best ones are reported here. Numerical values of the correlation weights of SMILES fragments for the four response parameters are presented in Table 2. An example of the calculation of DCW(SMILES) for modeling pKi(PBR_CORTEX) is shown in Table 3. The values of the optimal descriptor (DCW) for the four response parameters are shown in Table 4.

In the case of the PBR binding affinity (cortex), the following relation was obtained

$$\mathrm{p}K_i(\mathrm{PBR\ CORTEX}) = -204.2\,(\pm 22.282) + 211.584\,(\pm 22.300)\,\mathrm{DCW} \tag{10}$$

$n_{\mathrm{training}} = 31$; $q^2 = 0.717$; PRESS = 14.7; $r^2 = 0.756$; $r_a^2 = 0.748$; $F = 90.0$ (df 1, 29); $s = 0.661$; $n_{\mathrm{Test}} = 5$; $r^2_{\mathrm{Pred}} = 0.692$; $r^2_{\mathrm{Test}} = 0.856$

Equation (10) could explain 74.8% of the variance of the PBR (cortex) binding affinity and could predict 71.7% of the variance of the training set. Only one predictor variable is used here, though it has been derived in a flexible manner to obtain the optimal value giving the highest correlation coefficient with the response variable. In the case of the test set, the predictive $r^2$ value is found to be 69.2%, while the simple $r^2$ between the observed and predicted values is found to be 85.6%. The calculated PBR (cortex) binding affinity values according to Eq. (10) are given in Table 5.
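As a consistency check, the adjusted $r^2$ and the variance ratio reported with Eq. (10) can be recovered from $r^2$, $n$, and $p$ alone. The first line applies Eq. (4); the second uses the identity $F = \frac{r^2}{1 - r^2}\cdot\frac{n - p - 1}{p}$, which is not stated in the text but follows from Eqs. (3) and (5) for a least-squares fit with an intercept. With $n = 31$, $p = 1$, and $r^2 = 0.756$:

```latex
\begin{align*}
r_a^2 &= \frac{(n-1)\,r^2 - p}{n - p - 1}
       = \frac{30 \times 0.756 - 1}{29}
       = \frac{21.68}{29} \approx 0.748 \\[4pt]
F &= \frac{r^2}{1 - r^2}\cdot\frac{n - p - 1}{p}
   = \frac{0.756}{0.244} \times 29 \approx 89.9
\end{align*}
```

Both values agree with the reported $r_a^2 = 0.748$ and $F = 90.0$ to within the rounding of $r^2$.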

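In the case of the PBR binding affinity (ovary), the following relation was obtained

$$\mathrm{p}K_i(\mathrm{PBR\ OVARY}) = -432.414\,(\pm 35.254) + 436.945\,(\pm 35.084)\,\mathrm{DCW} \tag{11}$$

$n_{\mathrm{training}} = 29$; $q^2 = 0.836$; PRESS = 9.5; $r^2 = 0.852$; $r_a^2 = 0.846$; $F = 155.1$ (df 1, 27); $s = 0.563$; $n_{\mathrm{Test}} = 5$; $r^2_{\mathrm{Pred}} = 0.682$; $r^2_{\mathrm{Test}} = 0.800$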

Eq. (11) could explain 84.6% of the variance of the PBR (ovary) binding affinity and could predict 83.6% of the variance of the training set. Only one predictor variable is used here, though it has been derived in a flexible manner to obtain the optimal value giving the highest correlation coefficient with the response variable. In the case of the test set, the predictive $r^2$ value is found to be 68.2%, while the simple $r^2$ between the observed and predicted values is found to be 80.0%. The calculated PBR (ovary) binding affinity values according to Eq. (11) are given in Table 5.

For the selectivity of PBR (cortex) over CBR, the following relation was obtained

$$\Delta\mathrm{p}K_i(\mathrm{PBR\ CORTEX} - \mathrm{CBR}) = -335.731\,(\pm 33.887) + 331.158\,(\pm 33.211)\,\mathrm{DCW} \tag{12}$$

$n_{\mathrm{training}} = 31$; $q^2 = 0.732$; PRESS = 13.1; $r^2 = 0.774$; $r_a^2 = 0.766$; $F = 99.4$ (df 1, 29); $s = 0.616$; $n_{\mathrm{Test}} = 5$; $r^2_{\mathrm{Pred}} = 0.321$; $r^2_{\mathrm{Test}} = 0.610$

Eq. (12) could explain and predict 76.6 and 73.2%, respectively, of the variance of the selectivity (PBR cortex over CBR) for the training set. In the case of the test set, the predictive $r^2$ value is poor, but the $r^2_{\mathrm{Test}}$ value is acceptable. The calculated values of the selectivity for PBR (cortex) over CBR according to Eq. (12) are given in Table 5.

For the selectivity of PBR (ovary) over CBR, the following relation was obtained

$$\Delta\mathrm{p}K_i(\mathrm{PBR\ OVARY} - \mathrm{CBR}) = -364.126\,(\pm 30.176) + 354.797\,(\pm 29.274)\,\mathrm{DCW} \tag{13}$$

$n_{\mathrm{training}} = 29$; $q^2 = 0.828$; PRESS = 9.5; $r^2 = 0.845$; $r_a^2 = 0.839$; $F = 146.9$ (df 1, 27); $s = 0.565$; $n_{\mathrm{Test}} = 5$; $r^2_{\mathrm{Pred}} = 0.772$; $r^2_{\mathrm{Test}} = 0.700$

Eq. (13) could explain and predict 83.9 and 82.8%, respectively, of the variance of the PBR (ovary) selectivity. In the case of the test set, both the predictive $r^2$ and the $r^2_{\mathrm{Test}}$ values are encouraging. The calculated values of the selectivity for PBR (ovary) over CBR according to Eq. (13) are given in Table 5.

Table 3. Example of calculation of the DCW(SMILES) for modeling pKi(PBR_CORTEX) for SMILES "CCCCN(CCCC)C(=O)Cc2c(c1ccc(Cl)cc1)nc3ccccn23".

| No. | SFk | CW(SFk) |
|---|---|---|
| 1 | C | 0.99918 |
| 2 | C | 0.99918 |
| 3 | C | 0.99918 |
| 4 | C | 0.99918 |
| 5 | N | 0.99396 |
| 6 | ( | 0.99555 |
| 7 | C | 0.99918 |
| 8 | C | 0.99918 |
| 9 | C | 0.99918 |
| 10 | C | 0.99918 |
| 11 | ( | 0.99555 |
| 12 | C | 0.99918 |
| 13 | ( | 0.99555 |
| 14 | =O | 1.01059 |
| 15 | ( | 0.99555 |
| 16 | C | 0.99918 |
| 17 | c | 0.99988 |
| 18 | 2 | 1.01428 |
| 19 | c | 0.99988 |
| 20 | ( | 0.99555 |
| 21 | c | 0.99988 |
| 22 | 1 | 1.00143 |
| 23 | c | 0.99988 |
| 24 | c | 0.99988 |
| 25 | c | 0.99988 |
| 26 | ( | 0.99555 |
| 27 | Cl | 1.01035 |
| 28 | ( | 0.99555 |
| 29 | c | 0.99988 |
| 30 | c | 0.99988 |
| 31 | 1 | 1.00143 |
| 32 | ( | 0.99555 |
| 33 | n | 0.99504 |
| 34 | c | 0.99988 |
| 35 | 3 | 1.00543 |
| 36 | c | 0.99988 |
| 37 | c | 0.99988 |
| 38 | c | 0.99988 |
| 39 | c | 0.99988 |
| 40 | n | 0.99504 |
| 41 | 2 | 1.01428 |
| 42 | 3 | 1.00543 |

DCW(CCCCN(CCCC)C(=O)Cc2c(c1ccc(Cl)cc1)nc3ccccn23) = 1.00137.


Table 4. Optimal descriptors [DCW (SMILES)] for different response variables.

| Sl. no. | pKi(PBR_CORTEX) | pKi(PBR_OVARY) | ΔpKi(PBR_CORTEX-CBR) | ΔpKi(PBR_OVARY-CBR) |
|---|---|---|---|---|
| Training set | | | | |
| 2 | 1.00274 | 1.00660 | 1.02225 | 1.03279 |
| 3 | 1.00412 | 1.00723 | 1.02364 | 1.03373 |
| 4 | 1.00084 | 1.00548 | 1.02144 | 1.03079 |
| 6 | 0.99947 | 1.00486 | 1.02006 | 1.02985 |
| 7 | 1.00289 | 1.00751 | 1.02206 | 1.03331 |
| 8 | 1.00426 | 1.00813 | 1.02344 | 1.03425 |
| 9 | 1.00289 | 1.00751 | 1.02206 | 1.03331 |
| 10 | 1.00207 | 1.00707 | 1.02151 | 1.03258 |
| 11 | 0.98568 | 0.99910 | 1.01170 | 1.02417 |
| 12 | 1.00140 | 1.00625 | 1.02183 | 1.03293 |
| 13 | 1.00048 | 1.00656 | 1.02160 | 1.03310 |
| 14 | 1.00338 | 1.00851 | 1.02311 | 1.03538 |
| 16 | 0.99659 | 1.00357 | 1.01863 | 1.02943 |
| 17 | 0.99577 | 1.00313 | 1.01808 | 1.02870 |
| 18 | 0.99713 | 1.00375 | 1.01946 | 1.02964 |
| 20 | 0.99734 | 1.00239 | 1.01941 | 1.02802 |
| 21 | 0.98670 | 0.99657 | 1.01257 | 1.02055 |
| 22 | 1.00187 | 1.00556 | 1.02189 | 1.03184 |
| 23 | 1.00324 | 1.00618 | 1.02328 | 1.03278 |
| 25 | 0.99133 | 1.00101 | 1.01549 | 1.02632 |
| 26 | 0.99428 | 1.00268 | 1.01732 | 1.02816 |
| 28 | 0.99632 | 1.00332 | 1.01891 | 1.02890 |
| 29 | 0.99496 | 1.00270 | 1.01753 | 1.02797 |
| 30 | 0.99967 | – | 1.02105 | – |
| 31 | 0.99601 | 1.00336 | 1.01857 | 1.02978 |
| 32 | 0.99282 | 1.00146 | 1.01625 | 1.02686 |
| 33 | 0.99615 | 1.00426 | 1.01838 | 1.03030 |
| 34 | 1.00659 | 1.00854 | 1.02529 | 1.03595 |
| 35 | 1.00521 | 1.00791 | 1.02390 | 1.03500 |
| 36 | 1.00673 | 1.00944 | 1.02509 | 1.03647 |
| 37 | 1.00591 | – | 1.02454 | – |
| Test set | | | | |
| 1 | 1.00137 | 1.00598 | 1.02087 | 1.03185 |
| 5 | 1.00274 | 1.00660 | 1.02225 | 1.03279 |
| 15 | 0.99795 | 1.00419 | 1.02001 | 1.03037 |
| 19 | 0.99597 | 1.00177 | 1.01803 | 1.02709 |
| 24 | 0.98997 | 1.00039 | 1.01412 | 1.02538 |
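Given the DCW values in Table 4, each model reduces to a one-line calculation. The sketch below applies the intercepts and slopes of Eqs. (10) to (13) to the four DCW values of test compound 1 and compares the results with the calculated values listed for that compound in Table 5; the dictionary keys and the `predict` helper are hypothetical names. The agreement is to within about 0.01, since the tabulated DCW values and coefficients are rounded.

```python
# (intercept a, slope b) taken from Eqs. (10)-(13).
MODELS = {
    "pKi_cortex": (-204.2, 211.584),
    "pKi_ovary": (-432.414, 436.945),
    "dpKi_cortex_minus_cbr": (-335.731, 331.158),
    "dpKi_ovary_minus_cbr": (-364.126, 354.797),
}

# DCW values of test compound 1, taken from Table 4.
DCW_COMPOUND_1 = {
    "pKi_cortex": 1.00137,
    "pKi_ovary": 1.00598,
    "dpKi_cortex_minus_cbr": 1.02087,
    "dpKi_ovary_minus_cbr": 1.03185,
}

# Calculated values reported for compound 1 in Table 5.
TABLE_5_COMPOUND_1 = {
    "pKi_cortex": 7.67477,
    "pKi_ovary": 7.14353,
    "dpKi_cortex_minus_cbr": 2.33641,
    "dpKi_ovary_minus_cbr": 1.97082,
}


def predict(endpoint, dcw_value):
    """Apply the single-descriptor linear model for the given endpoint."""
    a, b = MODELS[endpoint]
    return a + b * dcw_value


for endpoint, dcw_value in DCW_COMPOUND_1.items():
    calc = predict(endpoint, dcw_value)
    print(endpoint, round(calc, 3), TABLE_5_COMPOUND_1[endpoint])
```

The large slopes (several hundred pKi units per unit DCW) mean that a change in the fifth decimal of DCW already shifts the prediction by a few thousandths, so DCW should be carried at full precision when reproducing the tables.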

QSAR Modeling of Peripheral Versus Central Benzodiazepine Receptor Binding Affinity

The equations reported above can be used for predic-tion of PBR/CBR binding affinity/selectivity of 2-phenyli-midazo[1,2-a]pyridineacetamides having SMILES frag-ments appearing in Table 2. The same dataset was mod-eled previously [13] using topological and physicochemicalparameters. The statistical quality of the QSAR relations(as evidenced from predicted variance q2 and explainedvariance r2a values) obtained in the present paper is betterthan the relations obtained previously [13]. Table 6 showsa comparison of the equations obtained with the flexibledescriptor with those using topological and physicochemi-

cal parameters. Moreover, all equations reported in thepresent paper are based on a single (flexible) descriptor,while the previously reported equations [13] involve multi-ple descriptors in an equation. In spite of this, equationsreported presently have higher prediction statistics thanthose of the earlier reported equations.The scheme of calculation of flexible descriptors based

on correlation weights does not require complex calcula-tion of diverse descriptors and multivariate statistical anal-ysis for proper selection of descriptors. As the equationsinvolve only one flexible descriptor in each case, there is

466 G 2007 WILEY-VCH Verlag GmbH&Co. KGaA, Weinheim www.qcs.wiley-vch.de QSAR Comb. Sci. 26, 2007, No. 4, 460 – 468

Table 5. Observed and calculated PBR cortex and PBR ovary and selectivity data of 2-phenylimidazo[1,2-a]pyridineacetamides.

Compound pKi(PBR_CORTEX) pKi(PBR_OVARY) pKI (CBR) DpKi(PBR_CORTEX�CBR) DpKi(PBR_OVARY�CBR)

Obs.a Cal.b Obs.a Cal.c Obs.a Obs.a Cal.d Obs.a Cal.e

Training set2 8.104 7.96486 7.646 7.41573 5.000 3.104 2.79495 2.646 2.304233 8.284 8.25534 7.731 7.68810 5.000 3.284 3.25411 2.731 2.637944 6.424 7.56086 5.584 6.92479 5.000 1.424 2.52699 0.584 1.592696 8.292 7.27132 7.312 6.65290 5.481 2.811 2.06881 1.837 1.259937 7.939 7.99577 8.337 7.81104 5.000 2.939 2.73162 3.337 2.489418 7.876 8.28629 8.319 8.08365 5.000 2.876 3.19070 3.319 2.823299 8.824 7.99577 7.936 7.81104 6.133 2.691 2.73162 1.803 2.4894110 7.616 7.82215 7.047 7.62003 5.000 2.616 2.54998 2.047 2.2279211 5.464 4.35455 4.740 4.13856 5.000 0.464 �0.69852 �0.260 �0.7528912 7.566 7.68126 7.157 7.26165 5.000 2.566 2.65456 2.157 2.3524913 7.701 7.48659 7.599 7.39748 5.000 2.701 2.57817 2.599 2.4120614 8.092 8.09910 8.252 8.24979 5.000 3.092 3.08121 3.252 3.2239116 5.907 6.66146 5.029 6.08902 5.000 0.907 1.59441 0.029 1.1131117 6.804 6.48893 5.837 5.89876 5.000 1.804 1.41337 0.837 0.8526118 8.301 6.77740 7.036 6.17019 5.000 3.301 1.87066 2.036 1.1850020 6.845 6.82042 5.597 5.57587 5.000 1.845 1.85380 0.597 0.6124421 4.682 4.57039 3.139 3.03307 5.000 �0.318 �0.41036 �1.861 �2.0370922 7.412 7.77927 6.636 6.95823 5.000 2.412 2.67721 1.636 1.9680123 8.313 8.06950 7.449 7.23031 5.000 3.313 3.13622 2.449 2.3014225 6.046 5.54933 5.294 4.97040 5.000 1.046 0.55715 0.294 0.0076026 5.677 6.17357 5.387 5.70110 5.000 0.677 1.16342 0.387 0.6617627 6.409 – 6.538 – 5.000 1.409 – 1.538 –28 6.640 6.60477 5.977 5.97990 5.000 1.640 1.68948 0.977 0.9244429 5.878 6.31655 5.884 5.70859 5.000 0.878 1.23243 0.884 0.5922930 6.772 7.31338 – – 5.000 1.772 2.39660 – –31 5.920 6.53917 5.572 5.99819 5.000 0.920 1.57503 0.572 1.2372032 5.288 5.86376 5.513 5.16773 5.000 0.288 0.80665 0.513 0.1981033 5.005 6.56987 4.937 6.39222 5.000 0.005 1.51193 �0.063 1.4218434 9.347 8.77770 8.481 8.26145 5.000 4.347 3.80048 3.481 3.4238435 8.456 8.48650 8.638 7.98873 5.000 3.456 3.34057 3.638 3.0894136 9.481 8.80873 8.770 8.65751 5.000 4.481 3.73696 3.770 3.6095937 8.623 8.63444 – – 5.000 3.623 3.55478 – 
–Test set1 8.230 7.67477 7.980 7.14353 7.040 1.190 2.33641 0.940 1.970825 8.485 7.96486 8.172 7.41573 7.051 1.434 2.79495 1.121 2.3042315 6.668 6.95016 6.074 6.36056 5.000 1.668 2.05194 1.025 1.4457419 7.454 6.53190 6.520 5.30481 6.725 0.729 1.39653 �0.205 0.2805724 5.663 5.26255 5.296 4.69971 5.000 0.663 0.10164 0.296 �0.32372

a Taken from Ref. [12]. b From Eq. (10). c From Eq. (11). d From Eq. (12). e From Eq. (13).
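The external predictivity of a model can be checked directly from the test-set rows of the table. The sketch below assumes the commonly used definition r2_pred = 1 − Σ(y_obs − y_calc)² / Σ(y_obs − ȳ_training)², which is not restated in this part of the paper. The function and variable names are illustrative. The data are read from the first two numeric columns of the table and are taken here to be the observed and calculated PBR cortex pKi values; compound 27 is left out of the training mean on the assumption that it was excluded from the n = 31 cortex model.

```python
# Observed PBR cortex pKi of the 31 training compounds (compound 27 omitted,
# an assumption based on its missing calculated value in the table).
TRAIN_OBS_CORTEX = [
    8.104, 8.284, 6.424, 8.292, 7.939, 7.876, 8.824, 7.616, 5.464, 7.566,
    7.701, 8.092, 5.907, 6.804, 8.301, 6.845, 4.682, 7.412, 8.313, 6.046,
    5.677, 6.640, 5.878, 6.772, 5.920, 5.288, 5.005, 9.347, 8.456, 9.481,
    8.623,
]

# Test-set compounds 1, 5, 15, 19 and 24: observed and calculated pKi.
TEST_OBS_CORTEX = [8.230, 8.485, 6.668, 7.454, 5.663]
TEST_CALC_CORTEX = [7.67477, 7.96486, 6.95016, 6.53190, 5.26255]


def predictive_r2(y_obs_test, y_calc_test, y_obs_train):
    """Return 1 - SS_residual(test) / SS(test about the training mean)."""
    train_mean = sum(y_obs_train) / len(y_obs_train)
    ss_res = sum((o - c) ** 2 for o, c in zip(y_obs_test, y_calc_test))
    ss_ref = sum((o - train_mean) ** 2 for o in y_obs_test)
    return 1.0 - ss_res / ss_ref


r2_pred_cortex = predictive_r2(TEST_OBS_CORTEX, TEST_CALC_CORTEX, TRAIN_OBS_CORTEX)
print(f"r2_pred (PBR cortex) = {r2_pred_cortex:.3f}")
```

Under these assumptions the result comes out at roughly 0.69, in line with the value of 0.692 quoted for the PBR cortex model. The same function can be applied to the other three column pairs.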


no collinearity problem among predictor variables. Moreover, considering uniqueness, human understandability, machine readability, and universal nature of SMILES notation, SMILES-based optimization of correlation weight scheme offers an attractive method for chemometric modeling of biological activity or physicochemical property data.

4 Conclusions

The present analysis shows that the optimization of correlation weights scheme can generate statistically acceptable models for PBR binding affinity and selectivity. Moreover, the scheme requires neither the calculation of a diverse set of descriptors nor the statistical analysis otherwise needed to select descriptors and check their intercorrelation. Furthermore, because each “elementary” molecular fragment is assigned its own “personal” numerical local descriptor, one can identify the vertices that increase or decrease the property under analysis. The scheme therefore merits further assessment in QSPR/QSAR modeling of other physicochemical properties and biological activity data to establish its suitability in modeling studies. The present study also shows the successful use of SMILES-based descriptors in the optimization of correlation weights scheme, which warrants extensive evaluation.
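The idea that each SMILES fragment carries its own correlation weight, and that the sign of the weight indicates whether the fragment raises or lowers the modeled property, can be illustrated with a minimal sketch. It assumes a purely additive descriptor (the sum of the weights of all SMILES tokens) and a deliberately simple tokenizer; neither is claimed to reproduce the exact descriptor definition or the PASCAL program used in this work. The weight values, the example SMILES string, and all function names are hypothetical.

```python
import re

# Hypothetical correlation weights, for illustration only (not from the paper).
HYPOTHETICAL_WEIGHTS = {
    "c": 0.30, "C": 0.50, "N": -0.20, "n": 0.10, "O": -0.40,
    "Cl": 0.80, "=": 0.20, "(": 0.05, ")": 0.05, "1": 0.00, "2": 0.00,
}

# Simplistic tokenizer: two-letter halogens, bracket atoms, then single characters.
TOKEN_PATTERN = re.compile(r"Cl|Br|\[[^\]]+\]|.")


def tokenize(smiles):
    """Split a SMILES string into elementary tokens."""
    return TOKEN_PATTERN.findall(smiles)


def additive_descriptor(smiles, weights, default=0.0):
    """Sum the correlation weights of all tokens (assumed additive form)."""
    return sum(weights.get(token, default) for token in tokenize(smiles))


def split_by_sign(smiles, weights):
    """Group the tokens present in a molecule by the sign of their weight."""
    present = set(tokenize(smiles)) & set(weights)
    increasing = sorted(t for t in present if weights[t] > 0)
    decreasing = sorted(t for t in present if weights[t] < 0)
    return increasing, decreasing


example = "CC(=O)Nc1ccc(Cl)cc1"  # arbitrary example molecule
print(tokenize(example))
print(additive_descriptor(example, HYPOTHETICAL_WEIGHTS))
print(split_by_sign(example, HYPOTHETICAL_WEIGHTS))
```

In an actual optimization the weights would be adjusted on the training set until the correlation between the descriptor and the response is maximal; the lookup and summation step shown here would stay the same.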

References

[1] M. Gavish, I. Bachman, R. Shoukrun, Y. Katz, L. Veenman, G. Weisinger, A. Weizman, Pharmacol. Rev. 1999, 51, 629 – 650.
[2] J. J. Lacapere, V. Papadopoulos, Steroids 2003, 68, 569 – 85.
[3] V. Papadopoulos, Endocr. Res. 2004, 30, 677 – 684.
[4] S. Galiegue, N. Tinel, P. Casellas, Curr. Med. Chem. 2003, 10, 1563 – 1572.
[5] V. Papadopoulos, Ann. Pharm. Fr. 2003, 61, 30 – 50.
[6] V. Papadopoulos, H. Amri, H. Li, Z. Yao, R. C. Brown, B. Vidic, M. Culty, Therapie 2001, 56, 549 – 556.
[7] A. Cappelli, M. Anzini, S. Vomero, P. G. De Benedetti, M. C. Menziani, G. Giorgi, C. Manzoni, J. Med. Chem. 1997, 40, 2910 – 2921.
[8] N. Cinone, H.-D. Höltje, A. Carotti, J. Comput. Aided Mol. Des. 2000, 14, 753 – 768.
[9] M. Anzini, A. Cappelli, S. Vomero, M. Seeber, M. C. Menziani, T. Langer, B. Hagen, C. Manzoni, J. J. Bourguignon, J. Med. Chem. 2001, 44, 1134 – 1150.
[10] S. Selleri, P. Gratteri, C. Costagli, C. Bonaccini, A. Costanzo, F. Melani, G. Guerrini, G. Ciciani, B. Costa, F. Spinetti, C. Martini, F. Bruni, Bioorg. Med. Chem. 2005, 13, 4821 – 4834.
[11] K. Roy, A. U. De, C. Sengupta, Indian J. Biochem. Biophys. 2003, 40, 203 – 208.
[12] G. Trapani, V. Laquintana, N. Denora, A. Trapani, A. Lopedota, A. Latrofa, M. Franco, M. Serra, M. Giuseppina Pisu, I. Floris, E. Sanna, G. Biggio, G. Liso, J. Med. Chem. 2005, 48, 292 – 305.
[13] M. K. Dalai, J. T. Leonard, K. Roy, Indian J. Biochem. Biophys. 2006, 43, 105 – 118.
[14] M. K. Dalai, J. T. Leonard, K. Roy, Indian J. Chem. 2006, 45B, in press.
[15] D. Weininger, J. Chem. Inf. Comput. Sci. 1988, 28, 31 – 36.
[16] D. Weininger, A. Weininger, J. L. Weininger, J. Chem. Inf. Comput. Sci. 1989, 29, 97 – 101.
[17] M. Randic, J. Comput. Chem. 1991, 12, 970 – 980.
[18] M. Randic, Chemom. Intell. Lab. Syst. 1991, 10, 213 – 227.
[19] M. Randic, J. Chem. Inf. Comput. Sci. 1991, 31, 311 – 320.
[20] M. Randic, J. Chem. Inf. Comput. Sci. 1992, 32, 686 – 692.
[21] E. Estrada, J. Chem. Inf. Comput. Sci. 1995, 35, 1022 – 1025.
[22] D. Amic, D. Beslo, D. Lucic, S. Nikolic, N. Trinajstic, J. Chem. Inf. Comput. Sci. 1998, 38, 819 – 822.
[23] M. Randic, S. C. Basak, J. Chem. Inf. Comput. Sci. 1999, 39, 261 – 266.
[24] D. K. Sinha, S. C. Basak, R. K. Mohanty, I. N. Basumallick, Some Aspects in Mathematical Chemistry, Visva-Bharati University Press, Santiniketan 1999.
[25] A. A. Toropov, A. P. Toropova, Russ. J. Coord. Chem. 1998, 24, 81 – 85.
[26] A. A. Toropov, A. P. Toropova, N. L. Voropaeva, I. N. Ruban, S. Sh. Rashidova, J. Coord. Chem. 1998, 24, 525 – 529.
[27] A. A. Toropov, N. L. Voropaeva, I. N. Ruban, S. Sh. Rashidova, Polym. Sci. Ser. A 1999, 41, 975 – 985.
[28] G. Krenkel, E. A. Castro, A. A. Toropov, J. Mol. Struct. (Theochem) 2001, 542, 107 – 113.
[29] A. Mercader, E. A. Castro, A. A. Toropov, J. Mol. Model. 2001, 7, 1 – 5.
[30] A. Mercader, E. A. Castro, A. A. Toropov, Chem. Phys. Lett. 2000, 330, 612 – 623.
[31] G. Krenkel, E. A. Castro, A. A. Toropov, J. Mol. Sci. 2001, 2, 57 – 65, http://www.mdpi.org/ijms.
[32] D. J. G. Marino, P. J. Perruzo, E. A. Castro, A. A. Toropov, Internet Electron. J. Mol. Des. 2002, 1, 115 – 133, http://www.biochempress.com.

QSAR Comb. Sci. 26, 2007, No. 4, 460 – 468 www.qcs.wiley-vch.de © 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

Table 6. Comparison of statistical quality of equations reported in this paper with those of Ref. [13].

                       Equations reported in this manuscript                            Equations reported in Ref. [13] (a)
Response parameter     n_training  q2     PRESS  r2_a   r2     n_test  r2_pred  r2_test   n   Q2     PRESS  R2_a   R2
pKi (PBR_CORTEX)       31          0.717  14.7   0.748  0.756  5       0.692    0.856     36  0.663  19.2   0.720  0.776
pKi (PBR_OVARY)        29          0.836   9.5   0.846  0.852  5       0.682    0.800     34  0.698  18.8   0.775  0.823
ΔpKi (PBR_CORTEX−CBR)  31          0.732  13.1   0.766  0.774  5       0.321    0.670     36  0.463  28.8   0.587  0.670
ΔpKi (PBR_OVARY−CBR)   29          0.828   9.5   0.839  0.845  5       0.772    0.700     34  0.513  29.5   0.599  0.684

(a) External validation statistics not reported.
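The q2 and PRESS columns of Table 6 are linked through leave-one-out cross-validation. The sketch below shows that link for a one-descriptor linear model, assuming the common convention q2 = 1 − PRESS / Σ(y − ȳ)² with ȳ taken over all training compounds. It refits only the regression line for each left-out compound; whether the correlation weights were also re-optimized in each cycle is not stated here, so that part is an assumption. The function names and the toy data are illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_press_q2(x, y):
    """Leave-one-out PRESS and q2 for a single-descriptor linear model."""
    press = 0.0
    for i in range(len(x)):
        # Refit the line without compound i, then predict compound i.
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (intercept + slope * x[i])) ** 2
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return press, 1.0 - press / ss_total


# Toy descriptor/response values, not taken from the paper.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [1.1, 1.9, 3.2, 3.8, 5.1]
press, q2 = loo_press_q2(descriptor, response)
print(f"PRESS = {press:.3f}, q2 = {q2:.3f}")
```

Because PRESS is a sum of squared prediction errors over the training compounds, a lower PRESS at a comparable number of compounds corresponds to a higher q2, which is the pattern seen when the two halves of Table 6 are compared.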


[33] P. Duchowicz, E. A. Castro, A. A. Toropov, Comput. Chem. 2002, 26, 327 – 332.
[34] A. A. Toropov, P. Duchowicz, E. A. Castro, Int. J. Mol. Sci. 2003, 4, 272 – 283, http://www.mdpi.org/ijms.
[35] P. J. Perruzo, D. J. G. Marino, E. A. Castro, A. A. Toropov, Internet Electron. J. Mol. Des. 2003, 2, 334 – 347, http://www.biochempress.com.
[36] A. A. Toropov, T. W. Schultz, J. Chem. Inf. Comput. Sci. 2003, 43, 560 – 567.
[37] A. A. Toropov, K. Roy, J. Chem. Inf. Comput. Sci. 2004, 44, 179 – 186.
[38] A. A. Toropov, E. Benfenati, J. Mol. Struct. (Theochem) 2004, 676, 165 – 169.
[39] A. A. Toropov, E. Benfenati, J. Mol. Struct. (Theochem) 2004, 679, 225 – 228.
[40] K. Roy, A. A. Toropov, J. Mol. Model. 2005, 11, 89 – 96.
[41] I. Raska, Jr., A. Toropov, Bioorg. Med. Chem. 2005, 13, 6830 – 6835.
[42] A. A. Toropov, A. P. Toropova, D. V. Mukhamedzhanova, I. Gutman, Indian J. Chem. 2005, 44A, 1545 – 1552.
[43] A. A. Toropov, E. Benfenati, Bioorg. Med. Chem. 2006, 14, 3923 – 3928.
[44] A. A. Toropov, E. Benfenati, Bioorg. Med. Chem. 2006, 14, 2779 – 2788.
[45] The program for optimization of correlation weights was developed in PASCAL by A. A. Toropov.
[46] MINITAB is a statistical software of Minitab Inc., USA.
[47] G. W. Snedecor, W. G. Cochran, Statistical Methods, Oxford & IBH Publishing Co. Pvt. Ltd., New Delhi 1967, pp. 381 – 418.
[48] S. Wold, L. Eriksson, Validation Tools, in: H. van de Waterbeemd (Ed.), Chemometric Methods in Molecular Design, VCH, Weinheim, 1995, pp. 309 – 318.

