
QSTR with Extended Topochemical Atom Indices. 3. Toxicity of Nitrobenzenes to Tetrahymena pyriformis*

Kunal Roy** and Gopinath Ghosh

Drug Theoretics and Cheminformatics Lab, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032 (India); Email: [email protected], Tel: +91-33-2414 6676, Fax: +91-33-2414 6677

Full Paper

The experimental determination of toxicological properties of commercial chemicals is a costly and time-consuming process, so there is a need to develop mathematical predictive tools to quantify such properties theoretically. Against this background, we have modeled the toxicity of nitrobenzene derivatives to Tetrahymena pyriformis using extended topochemical atom (ETA) indices recently introduced by us (Roy and Ghosh, 2003). We have also modeled the toxicity data using other topological descriptors (Balaban J, kappa shape indices, connectivity indices, Wiener index) and two physicochemical variables (AlogP98, MolRef) and compared the ETA models with the non-ETA ones. Principal component factor analysis was used as the data-preprocessing step to reduce the dimensionality of the data matrix and to identify the important variables that are devoid of collinearities. Multiple linear regression analyses show that the best non-ETA model involves $^{2}\kappa$ and AlogP98 as predictor variables, and the quality of the relation is as follows: $n = 42$, $Q^2 = 0.70$, $R_a^2 = 0.75$, $R^2 = 0.76$, $R = 0.87$, $F = 61.0$ (df 2, 39), $s = 0.36$. On the other hand, the best ETA model has the following quality: $n = 42$, $Q^2 = 0.88$, $R_a^2 = 0.91$, $R^2 = 0.92$, $R = 0.96$, $F = 101.4$ (df 4, 37), $s = 0.22$. The ETA relations showed positive contributions of molecular bulk (size), halogen and additional nitro substitutions in the nitrobenzene ring, and negative contributions of substituents like methyl and hydroxymethyl groups, to the toxicity. An attempt to use non-ETA descriptors along with the ETA ones slightly improves the quality in comparison to the best ETA model. Interestingly, the ETA model developed by us for the nitrobenzene toxicity is comparable to the previously reported models on the same data set (Estrada et al., 2001; Cronin et al., 1998). Thus, it appears that the ETA descriptors have significant potential in QSAR/QSPR/QSTR studies, which warrants extensive evaluation.

Key words: QSAR, QSTR, Extended topochemical atom index, ETA, TAU, VEM

* A preliminary form of this paper was presented at the 40th Annual Convention of Chemists (Indian Chemical Society), Bundelkhand University, Jhansi, India (23–27 December 2003).

** Author for correspondence

QSAR Comb. Sci. 2004, 23. DOI: 10.1002/qsar.200330864. © 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

1 Introduction

Quantitative Structure-Toxicity Relationship (QSTR) studies aim at developing statistically acceptable quantitative relations between the structure or molecular properties of organic compounds (drugs, insecticides, industrial chemicals and pollutants) and their toxicity data, with the objective of predictive risk assessment and environmental hazard control [1]. With the ever-increasing production of new chemicals, and considering the need to optimize resources for the risk assessment of existing chemicals in use, the regulatory agencies are recommending Quantitative Structure-Activity Relationships (QSARs) as essential tools for ecotoxicological risk prediction [2]. As the demand for different types of regulatory testing has increased and the cost of experimental testing has risen, QSAR studies have emerged as an important tool to predict the physical or chemical properties, environmental fate, and ecological and health effects of organic chemicals [3]. The use of quantitative (and qualitative) structure-activity relationships (QSARs and SARs) by regulatory agencies and authorities to predict acute toxicity, mutagenicity, carcinogenicity and other health effects has recently been reviewed by Cronin et al. [4] and Comber et al. [5].

Different QSAR models, descriptors and statistical methods have been applied to model toxicity data of diverse chemicals by different groups of workers. Recently, Rose and Hall [6] published a paper on E-state modeling of fish toxicity independent of 3D structure information. In this paper they showed the utility of the E-state index in toxicity modeling with direct physicochemical significance. Mazzatorta et al. have modeled the toxicity of 562 organic chemicals using neural and fuzzy-neural networks [7]. Recently, Huuskonen has modeled the toxicity of organic chemicals using the E-state index [8]. Physicochemical descriptors have been used by Kulkarni et al. [9] to design eco-friendly molecules with lower toxicity. QSARs were developed for the prediction of aqueous toxicities for Poecilia reticulata using the CODESSA treatment by Katritzky et al. [10]. A QSAR was built with weighted holistic invariant molecular (WHIM) indices as well as physicochemical parameters by Di Marzio et al. [11]. Principal component analysis was used as a classification tool for the toxicity data of dangerous chemicals by Vighi et al. [12]. Basak et al. [13] have used H-QSAR for predicting the toxicity of chemicals. Devillers [14] has derived a general model for predicting the acute toxicity of pesticides. K-nearest neighbourhood and distance-based optimality were used by Schultz et al. [15] for the selection of data sets for QSTR development. Optimization of correlation weights of local graph invariants was used by Toropov et al. [16] for the prediction of aquatic toxicity. Klopman et al. [17] have reported modeling of acute toxicity data against Vibrio fischeri using the Multiple Computer-Assisted Structure Evaluation (M-CASE) program. QSARs were developed for the toxicity of 43 aromatic compounds to Photobacterium phosphoreum and Daphnia magna using partition coefficient, linear solvation energy relationship, molecular connectivity index and group contributions [18]. Seward et al. [19] have reported toxicity modeling of aliphatic carboxylic acids and salts to Tetrahymena pyriformis using physicochemical and quantum mechanical parameters. Argese et al. [20] have reported QSARs for the toxicity of chlorophenols to mammalian submitochondrial particles using physicochemical and structural parameters. Cronin et al. [21] performed QSAR studies of comparative toxicity in aquatic organisms. Comparative molecular field analysis (CoMFA) was used by Liu et al. to model toxicity data of 56 phenylsulfonyl carboxylates on Vibrio fischeri [22] and Photobacterium phosphoreum [23]. Gao et al. [24] have used an artificial neural network (ANN) for the prediction of biotoxicity of substituted benzenes. Cui et al. have used Holographic QSAR (HQSAR) for the prediction of toxicity of benzene derivatives [25]. Recently, mutagenicity data of various aromatic and heteroaromatic amines have been modeled by Gramatica et al. [26] using linear multivariate regression and genetic algorithm variable-set selection.

Recently, we have modeled the toxicity of substituted phenols against Tetrahymena pyriformis [27] and fish toxicity data of 92 diverse aromatic compounds against Poecilia reticulata [28] to explore the suitability of the newly developed extended topochemical atom (ETA) indices in modeling studies. In both cases encouraging results were obtained. In the present work, we have modeled the toxicity of nitroaromatic compounds against Tetrahymena pyriformis using ETA parameters by the multiple regression technique, and the best relation obtained has been compared with relations based on selected topological and physicochemical descriptors and also with previously reported models [29, 30].

2 Materials and Methods

Recently, extended topochemical atom (ETA) indices were developed in our laboratory [27, 28] in the valence electron mobile (VEM) environment as an extension of the TAU concept [31–40], which was originally developed in the late eighties. Definitions of some basic parameters used in the ETA scheme are given below.

The core count of a non-hydrogen vertex, $\alpha$, is defined as [27]:

$$\alpha = \frac{Z - Z^{v}}{Z^{v}} \cdot \frac{1}{PN - 1} \tag{1}$$

In Eq. 1, $Z$ and $Z^{v}$ are the atomic number and the valence electron number, and PN stands for the period number. The hydrogen atom being considered as the reference, $\alpha$ for hydrogen is taken to be zero. Another term, $\epsilon$, as a measure of electronegativity, has been defined [27] in the following manner:

$$\epsilon = -\alpha + 0.3\,Z^{v} \tag{2}$$

It is interesting to note that the $\alpha$ values of different atoms (which are commonly found in organic compounds) have a high correlation ($r = 0.946$) [27] with the (uncorrected) van der Waals volume [41], while $\epsilon$ has a good correlation ($r = 0.937$) with Pauling's electronegativity scale [27].

The VEM count $\beta$ of the ETA scheme is defined as:

$$\beta = \sum x\sigma + \sum y\pi + \delta \tag{3}$$

In the above equation, $\delta$ is a correction factor of value 0.5 per atom with a lone pair of electrons capable of resonance with an aromatic ring (e.g., nitrogen of aniline, oxygen of phenol, etc.). For calculation of the VEM count, the contribution of a sigma bond ($x$) between two atoms of similar electronegativity ($\Delta\epsilon \le 0.3$) is considered to be 0.5, and for a sigma bond between two atoms of different electronegativity ($\Delta\epsilon > 0.3$) it is considered to be 0.75. In the case of pi bonds, the contributions ($y$) depend on the type of pi bond: (i) for a pi bond between two atoms of similar electronegativity ($\Delta\epsilon \le 0.3$), $y$ is taken to be 1; (ii) for a pi bond between two atoms of different electronegativity ($\Delta\epsilon > 0.3$) or for a conjugated (non-aromatic) pi system, $y$ is considered to be 1.5; (iii) for an aromatic pi system, $y$ is taken as 2.

The VEM vertex count $\gamma_i$ of the $i$th vertex in a molecular graph is defined as:

$$\gamma_i = \frac{\alpha_i}{\beta_i} \tag{4}$$
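Eqs. 1–4 can be traced for a single molecule with a short script. The following is a minimal sketch in Python, not the authors' GW-BASIC code; the `Element` class, the helper names and the way each vertex is described (heavy-atom sigma neighbours, a list of pi contributions $y$, and $\delta$) are assumptions made for illustration. The bond assignments for 4-chloronitrobenzene (both N–O bonds treated as pi bonds with $y = 1.5$) are inferred from the $\beta$ values listed in Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    z: int        # atomic number Z
    zv: int       # valence electron number Z^v
    period: int   # period number PN


ELEMENTS = {
    "C": Element(6, 4, 2),
    "N": Element(7, 5, 2),
    "O": Element(8, 6, 2),
    "Cl": Element(17, 7, 3),
}


def alpha(symbol):
    """Core count of a non-hydrogen vertex (Eq. 1)."""
    e = ELEMENTS[symbol]
    return (e.z - e.zv) / e.zv / (e.period - 1)


def epsilon(symbol):
    """Electronegativity measure (Eq. 2)."""
    return -alpha(symbol) + 0.3 * ELEMENTS[symbol].zv


def sigma_contribution(a, b):
    """x in Eq. 3: 0.5 if the epsilon difference is <= 0.3, otherwise 0.75."""
    return 0.5 if abs(epsilon(a) - epsilon(b)) <= 0.3 else 0.75


def beta(symbol, neighbours, pi=(), delta=0.0):
    """VEM count (Eq. 3): sigma terms + pi terms + resonance correction."""
    sigma = sum(sigma_contribution(symbol, n) for n in neighbours)
    return sigma + sum(pi) + delta


def gamma(symbol, neighbours, pi=(), delta=0.0):
    """VEM vertex count (Eq. 4)."""
    return alpha(symbol) / beta(symbol, neighbours, pi, delta)


# Hypothetical vertex descriptions for 4-chloronitrobenzene:
# (element, sigma-bonded heavy-atom neighbours, pi contributions y, delta)
VERTICES = {
    "C1 (C-NO2)": ("C", ("C", "C", "N"), (2.0,), 0.0),
    "C2 (ring CH)": ("C", ("C", "C"), (2.0,), 0.0),
    "C4 (C-Cl)": ("C", ("C", "C", "Cl"), (2.0,), 0.0),
    "N7": ("N", ("C", "O", "O"), (1.5, 1.5), 0.0),
    "O8": ("O", ("N",), (1.5,), 0.0),
    "Cl10": ("Cl", ("C",), (), 0.5),
}

for name, vertex in VERTICES.items():
    print(f"{name:13s} alpha={alpha(vertex[0]):.3f} "
          f"beta={beta(*vertex):.2f} gamma={gamma(*vertex):.3f}")
```

Under these assumptions the script gives $\beta$ values of 3.75, 3, 3.75, 5.25, 2.25 and 1.25, which agree with the corresponding entries of Table 1.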


In Eq. 4, $\alpha_i$ stands for the $\alpha$ value of the $i$th vertex and $\beta_i$ stands for the VEM count considering all bonds connected to the atom and the lone pair of electrons (if any).

Finally, the composite index $\eta$ is defined in the following manner:

$$\eta = \sum_{i<j} \left[ \frac{\gamma_i \gamma_j}{r_{ij}^{2}} \right]^{0.5} \tag{5}$$

In Eq. 5, both bonded and non-bonded interactions have been considered; $r_{ij}$ stands for the topological distance between the $i$th atom and the $j$th atom. When all heteroatoms and multiple bonds in the molecular graph are replaced by carbon and single bonds respectively, the corresponding molecular graph may be considered as the reference alkane, and the corresponding composite index value is designated as $\eta_R$. Considering functionality as the presence of heteroatoms (atoms other than carbon or hydrogen) and multiple bonds, the functionality index $\eta_F$ may be calculated as $\eta_R - \eta$. To avoid dependence of functionality on vertex count or bulk, we have defined [27] another term $\eta'_F$ as $\eta_F / N_V$, where $N_V$ is the vertex count. One can also determine the contribution of a particular position, vertex or substructure to functionality in the following manner:

$$[\eta]_i = \sum_{j \ne i} \left[ \frac{\gamma_i \gamma_j}{r_{ij}^{2}} \right]^{0.5} \tag{6}$$

In Eq. 6, $[\eta]_i$ stands for the contribution of the $i$th vertex to $\eta$. Similarly, the contribution of the $i$th vertex $[\eta_R]_i$ to $\eta_R$ can be computed. The contribution of the $i$th vertex $[\eta_F]_i$ to functionality may be defined as $[\eta_R]_i - [\eta]_i$. To avoid dependence of this value on $N_V$, a related term $[\eta'_F]_i$ was defined [27] as $[\eta_F]_i / N_V$.

Considering only bonded interactions ($r_{ij} = 1$), the corresponding composite index is written as $\eta^{local}$:

$$\eta^{local} = \sum_{i<j,\; r_{ij}=1} \left( \gamma_i \gamma_j \right)^{0.5} \tag{7}$$
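Eqs. 5–7 only need the $\gamma$ values and the topological distances of the hydrogen-suppressed graph. The sketch below is an illustrative Python version (not the KRETA programs used in this work): distances are obtained by breadth-first search, and the vertex numbering of 4-chloronitrobenzene (1 = carbon bearing the nitro group, 4 = carbon bearing chlorine, 7 = N, 8 and 9 = O, 10 = Cl) is an assumption chosen to match the numbering of Table 1.

```python
from collections import deque
from math import sqrt

# gamma_i = alpha_i / beta_i for the ten non-hydrogen vertices (Eq. 4)
GAMMA = {
    1: 0.5 / 3.75, 2: 0.5 / 3.0, 3: 0.5 / 3.0, 4: 0.5 / 3.75,
    5: 0.5 / 3.0, 6: 0.5 / 3.0,
    7: 0.4 / 5.25,                 # nitro nitrogen
    8: (1 / 3) / 2.25, 9: (1 / 3) / 2.25,   # nitro oxygens
    10: (10 / 14) / 1.25,          # chlorine
}
# Assumed connectivity of the hydrogen-suppressed graph
BONDS = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),
         (1, 7), (7, 8), (7, 9), (4, 10)]


def topological_distances(bonds):
    """All-pairs shortest path lengths (in bonds) by breadth-first search."""
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    distances = {}
    for source in adjacency:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in seen:
                    seen[w] = seen[v] + 1
                    queue.append(w)
        distances[source] = seen
    return distances


def eta(gamma, dist):
    """Composite index (Eq. 5): sum over all vertex pairs i < j."""
    return sum(sqrt(gamma[i] * gamma[j]) / dist[i][j]
               for i in gamma for j in gamma if i < j)


def vertex_contribution(i, gamma, dist):
    """Contribution of vertex i to eta (Eq. 6)."""
    return sum(sqrt(gamma[i] * gamma[j]) / dist[i][j]
               for j in gamma if j != i)


def eta_local(gamma, bonds):
    """Composite index restricted to bonded pairs (Eq. 7)."""
    return sum(sqrt(gamma[a] * gamma[b]) for a, b in bonds)


DIST = topological_distances(BONDS)
print(f"eta       = {eta(GAMMA, DIST):.3f}")
print(f"[eta]_1   = {vertex_contribution(1, GAMMA, DIST):.3f}")
print(f"eta_local = {eta_local(GAMMA, BONDS):.3f}")
```

With this input the three printed values come out close to the Table 1 entries for 4-chloronitrobenzene ($\eta = 3.711$, $[\eta]_1 = 0.802$, $\eta^{local} = 1.519$); small differences in the third decimal place arise from rounding of the tabulated $\gamma$ values.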

In a similar way, $\eta_R^{local}$ for the corresponding reference alkane may also be calculated. The local functionality contribution (without considering global topology), $\eta_F^{local}$, may be calculated as $\eta_R^{local} - \eta^{local}$. The branching index $\eta_B$ can be calculated as $\eta_N^{local} - \eta_R^{local} + 0.086\,N_R$, where $N_R$ stands for the number of rings in the molecular graph of the reference alkane. The $N_R$ term in the branching index expression represents a correction factor for cyclicity. $\eta_N^{local}$ indicates the $\eta$ value of the corresponding normal alkane (the straight-chain compound of the same vertex count obtained from the reference alkane), which may be conveniently calculated as (when $N_V \ge 3$):

$$\eta_N^{local} = 1.414 + (N_V - 3) \times 0.5 \tag{8}$$

Table 1. Calculations of ETA parameters: example of 4-chloronitrobenzene

Per-vertex parameters:

| Parameter | Molecular graph | Vertex 1 | Vertex 2 | Vertex 3 | Vertex 4 | Vertex 7 | Vertex 8 | Vertex 10 |
|---|---|---|---|---|---|---|---|---|
| $\alpha_i$ | 4-Chloronitrobenzene | 0.5 | 0.5 | 0.5 | 0.5 | 0.4 | 0.33 | 0.715 |
| $\alpha_i$ | Reference alkane | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| $\beta_i$ | 4-Chloronitrobenzene | 3.75 | 3 | 3 | 3.75 | 5.25 | 2.25 | 1.25 |
| $\beta_i$ | Reference alkane | 1.5 | 1 | 1 | 1.5 | 1.5 | 0.5 | 0.5 |
| $\gamma_i$ | 4-Chloronitrobenzene | 0.133 | 0.167 | 0.167 | 0.133 | 0.076 | 0.147 | 0.572 |
| $\gamma_i$ | Reference alkane | 0.33 | 0.5 | 0.5 | 0.33 | 0.33 | 1 | 1 |
| $[\eta]_i$ | 4-Chloronitrobenzene | 0.802 | 0.794 | 0.800 | 0.849 | 0.567 | 0.508 | 0.999 |
| $[\eta_R]_i$ | Reference alkane | 2.389 | 2.440 | 2.372 | 2.226 | 2.366 | 2.473 | 2.349 |
| $[\eta'_F]_i$ | 4-Chloronitrobenzene | 0.159 | 0.165 | 0.157 | 0.138 | 0.180 | 0.196 | 0.135 |

Whole-molecule indices:

| Index | 4-Chloronitrobenzene | Reference alkane |
|---|---|---|
| $\eta$ | 3.711 | – |
| $\eta_R$ | – | 11.949 |
| $\eta'_F$ | 0.824 | – |
| $\eta^{local}$ | 1.519 | – |
| $\eta_R^{local}$ | – | 4.696 |
| $\eta_F^{local}$ | 3.177 | – |
| $\eta_N^{local}$ | – | 4.914 |
| $\eta'_B$* | 0.022 | – |
| $\sum\alpha$ | 4.775 | 5.000 |

Atoms 5, 6 and 9 are equivalent to atoms 3, 2 and 8 respectively. * Without ring correction.
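The derived whole-molecule indices follow from the primary ones by simple arithmetic. As a hedged check, the Python sketch below recomputes them from the Table 1 values for 4-chloronitrobenzene ($N_V = 10$); the function names are illustrative and not taken from the original programs.

```python
def eta_normal_local(n_vertices):
    """eta_N^local of the normal alkane with the same vertex count (Eq. 8)."""
    if n_vertices < 3:
        raise ValueError("Eq. 8 is stated for N_V >= 3")
    return 1.414 + (n_vertices - 3) * 0.5


def branching_index(eta_local_ref, n_vertices, n_rings=0):
    """eta_B = eta_N^local - eta_R^local + 0.086 * N_R."""
    return eta_normal_local(n_vertices) - eta_local_ref + 0.086 * n_rings


# Primary indices of 4-chloronitrobenzene as listed in Table 1
N_V = 10
ETA, ETA_R = 3.711, 11.949
ETA_LOCAL, ETA_LOCAL_R = 1.519, 4.696

eta_f = ETA_R - ETA                    # functionality index eta_F
eta_f_prime = eta_f / N_V              # size-normalised functionality
eta_f_local = ETA_LOCAL_R - ETA_LOCAL  # local functionality
eta_b_no_ring = branching_index(ETA_LOCAL_R, N_V)          # as in Table 1
eta_b_ring = branching_index(ETA_LOCAL_R, N_V, n_rings=1)  # with ring term

print(f"eta'_F      = {eta_f_prime:.3f}")
print(f"eta_F^local = {eta_f_local:.3f}")
print(f"eta_N^local = {eta_normal_local(N_V):.3f}")
print(f"eta_B / N_V = {eta_b_no_ring / N_V:.3f} (without ring correction)")
```

Dividing the uncorrected branching index by $N_V$ gives the size-normalised value listed as $\eta'_B$ in Table 1; including the ring term for the single ring raises $\eta_B$ by 0.086.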

To calculate the branching contribution relative to the molecular size, another term $\eta'_B$ is defined as $\eta_B / N_V$. The calculation of the different indices is illustrated for 4-chloronitrobenzene in Table 1.

In the present communication, the utility of ETA parameters has been demonstrated through a QSTR study taking the toxicity (pC) of a set of 42 nitroaromatic compounds (taken from Ref. [29]) as the model data set (Table 2). The ETA descriptors calculated for the nitrobenzene derivatives are defined in Table 3. Factor analysis has been performed as the data-preprocessing step for identification of the important descriptors for the subsequent multiple regression analysis [42, 43]. For this purpose, the data matrix consisting of the descriptors has been subjected to principal component factor analysis using STATISTICA software [44]. The principal objectives of factor analysis are to display multidimensional data in a space of lower dimensionality with minimal loss of information and to extract the basic features behind the data, with the ultimate goal of interpretation and/or prediction. The factors were extracted by the principal component method and then rotated by VARIMAX rotation to obtain Thurstone's simple structure. Only factors describing at least 5% of the total variance were considered. The analyses were carried out based on the following postulates: (a) only variables with non-zero loadings in such factors where the biological activity also has a non-zero loading are important in explaining the variance of the activity; (b) only variables with non-zero loadings in different factors may be combined in regression equations; (c) the factor pattern indicates whether the biological activity can be explained in a satisfactory manner in the parameter space; if not, a different set of variables is to be chosen.

Table 2. Experimental, calculated and predicted values of toxicity (pC) of nitrobenzenes to Tetrahymena pyriformis

| Sl. No. | Compound | Obs.¹ | Calc.² | Pred.² | Calc.³ | Pred.³ |
|---|---|---|---|---|---|---|
| 1 | 2,6-Dimethylnitrobenzene | 0.30 | 0.18 | 0.17 | 0.32 | 0.33 |
| 2 | 2,3-Dimethylnitrobenzene | 0.56 | 0.18 | 0.14 | 0.34 | 0.27 |
| 3 | 2-Methyl-3-chloronitrobenzene | 0.68 | 0.60 | 0.60 | 0.64 | 0.63 |
| 4 | 2-Methylnitrobenzene | 0.05 | 0.18 | 0.19 | 0.20 | 0.22 |
| 5 | 2-Chloronitrobenzene | 0.68 | 0.67 | 0.66 | 0.56 | 0.55 |
| 6 | 2-Methyl-5-chloronitrobenzene | 0.82 | 0.60 | 0.59 | 0.51 | 0.49 |
| 7 | 2,4,5-Trichloronitrobenzene | 1.53 | 1.39 | 1.38 | 1.41 | 1.40 |
| 8 | 2,5-Dichloronitrobenzene | 1.13 | 1.08 | 1.07 | 0.99 | 0.98 |
| 9 | 6-Chloro-1,3-dinitrobenzene | 1.98 | 1.68 | 1.66 | 1.65 | 1.62 |
| 10 | Nitrobenzene | 0.14 | 0.18 | 0.18 | 0.14 | 0.14 |
| 11 | 3-Methylnitrobenzene | 0.05 | 0.18 | 0.19 | 0.22 | 0.24 |
| 12 | 1,3-Dinitrobenzene | 0.89 | 1.28 | 1.34 | 1.24 | 1.30 |
| 13 | 3,4-Dichloronitrobenzene | 1.16 | 1.05 | 1.05 | 0.99 | 0.98 |
| 14 | 4-Methylnitrobenzene | 0.17 | 0.18 | 0.18 | 0.23 | 0.23 |
| 15 | 1,4-Dinitrobenzene | 1.30 | 1.26 | 1.26 | 1.23 | 1.22 |
| 16 | 4-Chloronitrobenzene | 0.43 | 0.62 | 0.63 | 0.56 | 0.58 |
| 17 | 2,3,5,6-Tetrachloronitrobenzene | 1.82 | 1.87 | 1.88 | 1.83 | 1.83 |
| 18 | 6-Methyl-1,3-dinitrobenzene | 0.87 | 1.23 | 1.28 | 1.30 | 1.38 |
| 19 | 3-Chloronitrobenzene | 0.73 | 0.64 | 0.63 | 0.56 | 0.55 |
| 20 | 1,2-Dinitrobenzene | 1.25 | 1.36 | 1.38 | 1.28 | 1.28 |
| 21 | 2-Bromonitrobenzene | 0.75 | 1.00 | 1.04 | 0.93 | 0.94 |
| 22 | 6-Bromo-1,3-dinitrobenzene | 2.31 | 2.00 | 1.93 | 2.01 | 1.98 |
| 23 | 3-Bromonitrobenzene | 1.03 | 0.94 | 0.92 | 0.93 | 0.92 |
| 24 | 4-Bromonitrobenzene | 0.38 | 0.90 | 0.97 | 0.93 | 0.96 |
| 25 | 2,4,6-Trimethylnitrobenzene | 0.86 | 1.02 | 1.03 | 1.03 | 1.04 |
| 26 | 5-Methyl-1,2-dinitrobenzene | 1.52 | 1.63 | 1.64 | 1.55 | 1.55 |
| 27 | 2,4-Dichloronitrobenzene | 0.99 | 1.07 | 1.07 | 0.99 | 0.99 |
| 28 | 3,5-Dichloronitrobenzene | 1.13 | 1.06 | 1.06 | 0.99 | 0.98 |
| 29 | 6-Iodo-1,3-dinitrobenzene | 2.12 | 1.86 | 1.82 | 2.19 | 2.19 |
| 30 | 2,3,4,5-Tetrachloronitrobenzene | 1.78 | 1.85 | 1.86 | 1.83 | 1.84 |
| 31 | 2,3-Dichloronitrobenzene | 1.07 | 1.10 | 1.10 | 0.99 | 0.98 |
| 32 | 2,5-Dibromonitrobenzene | 1.37 | 1.46 | 1.52 | 1.72 | 1.76 |
| 33 | 1,2-Dichloro-4,5-dinitrobenzene | 2.21 | 2.12 | 2.11 | 2.07 | 2.06 |
| 34 | 3-Methyl-4-bromonitrobenzene | 1.16 | 0.76 | 0.72 | 1.04 | 1.03 |
| 35 | 2,3,4-Trichloronitrobenzene | 1.51 | 1.48 | 1.47 | 1.41 | 1.40 |
| 36 | 2,4,6-Trichloronitrobenzene | 1.43 | 1.48 | 1.48 | 1.41 | 1.41 |
| 37 | 4,6-Dichloro-1,2-dinitrobenzene | 2.42 | 2.14 | 2.11 | 2.09 | 2.06 |
| 38 | 3,5-Dinitrobenzyl alcohol | 0.53 | 0.76 | 1.00 | 0.78 | 1.04 |
| 39 | 3,4-Dinitrobenzyl alcohol | 1.09 | 0.86 | 0.63 | 0.84 | 0.59 |
| 40 | 2,4,6-Trichloro-1,3-dinitrobenzene | 2.19 | 2.49 | 2.53 | 2.48 | 2.51 |
| 41 | 2,3,5,6-Tetrachloro-1,4-dinitrobenzene | 2.74 | 2.84 | 2.87 | 2.88 | 2.92 |
| 42 | 2,4,6-Trichloro-1,3-dinitrobenzene | 2.59 | 2.49 | 2.48 | 2.48 | 2.46 |

¹ Observed values are taken from Ref. [29]. Obs. = Observed, Calc. = Calculated, Pred. = Predicted. ² From Eq. 9. ³ From Eq. 10.

The calculations of $\eta$, $\eta_R$, $\eta_F$, $\eta_B$ and the contributions of different vertices to $\eta_F$ were done, using the distance matrix and VEM vertex counts as inputs, by the GW-BASIC programs KRETA1 and KRETA2 developed by one of the authors [45]. We have also modeled the toxicity data using other selected topological and physicochemical variables and compared the ETA models with the non-ETA ones. The values of the topological descriptors and physicochemical variables for the compounds have been generated by the QSAR+ and Descriptor+ modules of the Cerius 2 version 4.6 software [46]. The topological indices calculated are Balaban J, connectivity indices ($^{0}\chi$, $^{1}\chi$, $^{2}\chi$, $^{3}\chi_p$, $^{3}\chi_c$, $^{0}\chi^{v}$, $^{1}\chi^{v}$, $^{2}\chi^{v}$, $^{3}\chi^{v}_p$, $^{3}\chi^{v}_c$), kappa shape indices ($^{1}\kappa$, $^{2}\kappa$, $^{3}\kappa$, $^{1}\kappa_\alpha$, $^{2}\kappa_\alpha$, $^{3}\kappa_\alpha$) and the Wiener index (W). Among the physicochemical variables, molar refractivity (MolRef) and hydrophobicity (AlogP98) were considered.

The regression analyses were carried out using the program RRR98 [45]. The statistical quality of the equations [47] was judged by parameters like explained variance ($R_a^2$, i.e., adjusted $R^2$), correlation coefficient ($r$ or $R$), standard error of estimate ($s$) and variance ratio ($F$) at specified degrees of freedom (df). PRESS (leave-one-out) statistics [48, 49] were calculated using the programs KRPRES1 and KRPRES2 [45], and leave-one-out cross-validation $R^2$ ($Q^2$), predicted residual sum of squares (PRESS), standard deviation based on PRESS ($S_{PRESS}$), standard deviation of error of prediction (SDEP) and average absolute predicted residual ($Pres_{av}$) were reported. Finally, "leave-many-out" cross-validation was applied to the final equations. All the accepted equations have regression constants and $F$ ratios significant at the 95% and 99% levels respectively, if not stated otherwise. A compound was considered an outlier if its residual was more than twice the standard error of estimate for a particular equation.
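The quality parameters listed above are linked by standard formulas. The sketch below is in Python (not the RRR98 or KRPRES programs) and assumes the usual textbook definitions, which the text does not spell out: adjusted $R^2$ and $F$ from $R^2$, $n$ and the number of predictors $p$; SDEP $= \sqrt{PRESS/n}$; $S_{PRESS} = \sqrt{PRESS/(n-p-1)}$; and $Q^2 = 1 - PRESS/\sum(y - \bar{y})^2$. The illustrative inputs are the values reported for Eq. 9 in Section 3 ($n = 42$, $p = 4$, $R^2 = 0.916$, PRESS $= 2.514$).

```python
from math import sqrt


def adjusted_r2(r2, n, p):
    """Explained variance R_a^2 for n compounds and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def f_ratio(r2, n, p):
    """Variance ratio F with (p, n - p - 1) degrees of freedom."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))


def sdep(press, n):
    """Standard deviation of error of prediction."""
    return sqrt(press / n)


def s_press(press, n, p):
    """Standard deviation based on PRESS."""
    return sqrt(press / (n - p - 1))


def q2(observed, loo_predicted):
    """Leave-one-out cross-validated R^2 from observed and predicted values."""
    mean = sum(observed) / len(observed)
    press = sum((o - q) ** 2 for o, q in zip(observed, loo_predicted))
    ss_total = sum((o - mean) ** 2 for o in observed)
    return 1 - press / ss_total


n, p, r2, press = 42, 4, 0.916, 2.514
print(f"R_a^2   = {adjusted_r2(r2, n, p):.3f}")
print(f"F       = {f_ratio(r2, n, p):.1f} (df {p}, {n - p - 1})")
print(f"SDEP    = {sdep(press, n):.3f}")
print(f"S_PRESS = {s_press(press, n, p):.3f}")
```

With these inputs the adjusted $R^2$, SDEP and $S_{PRESS}$ agree with the reported 0.907, 0.245 and 0.261 to three decimals, which supports the assumed definitions; $F$ comes out slightly below the reported 101.1 because $R^2$ is entered rounded to three decimals.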

3 Results and Discussion

Table 4 shows the results of factor analysis of the data matrix composed of ETA descriptors. It is observed that five factors could explain 98.1% of the variance of the data matrix. The factor loading pattern after VARIMAX rotation shows that the nitrobenzene toxicity is highly loaded with factor 1, which is in turn highly loaded in $\sum\alpha$, $\eta_B$, $\eta'_B$, $\eta_R$ and $[\eta'_F]_{Cl}$. The toxicity is moderately loaded with factor 2 (highly loaded in $\eta_F^{local}$ and $[\eta'_F]_{NO_2}$) and factor 4 (highly loaded in $[\eta'_F]_{CH_3}$). Further, the toxicity shows low loading in factor 3 (highly loaded in $[\eta'_F]_{Br/I}$) and factor 5 (highly loaded in $[\eta'_F]_{CH_2OH}$).

Table 3. Definitions of different ETA parameters used in exploring QSAR of nitrobenzene toxicity

| Variable | Definition |
|---|---|
| $\sum\alpha$ | Sum of $\alpha$ values of all non-hydrogen vertices of a molecule |
| $\eta$ | The composite ETA index |
| $\eta_R$ | The composite index for the reference alkane |
| $[\eta'_F]_{CH_3}$ | Functionality for the methyl group |
| $[\eta'_F]_{CH_2OH}$ | Functionality for the hydroxymethyl group |
| $[\eta'_F]_{Cl}$ | Functionality for the chlorine atom |
| $[\eta'_F]_{NO_2}$ | Functionality for the nitro group |
| $[\eta'_F]_{Br/I}$ | Functionality for the bromine or iodine atom |
| $\eta_F^{local}$ | Local functionality |
| $\eta'_B$ | $\eta_B / N_V$ |

Table 4. Factor loadings of the variables (ETA) after VARIMAX rotation

| Variable | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Communality |
|---|---|---|---|---|---|---|
| pC | 0.774* | 0.444 | −0.111 | 0.319 | 0.199 | 0.949 |
| $\sum\alpha$ | 0.919* | 0.224 | −0.214 | 0.192 | −0.059 | 0.981 |
| $\eta_F^{local}$ | 0.491 | 0.806* | 0.177 | 0.162 | −0.207 | 0.992 |
| $\eta_R$ | 0.785* | 0.563 | 0.121 | 0.066 | −0.179 | 0.983 |
| $[\eta'_F]_{Cl}$ | 0.758* | −0.249 | 0.441 | 0.359 | 0.163 | 0.987 |
| $[\eta'_F]_{NO_2}$ | 0.189 | 0.969* | −0.007 | 0.081 | −0.124 | 0.996 |
| $[\eta'_F]_{CH_3}$ | −0.165 | −0.153 | 0.078 | −0.964* | 0.055 | 0.988 |
| $[\eta'_F]_{Br/I}$ | −0.032 | −0.103 | −0.988* | 0.089 | 0.048 | 0.998 |
| $[\eta'_F]_{CH_2OH}$ | −0.037 | 0.182 | 0.033 | 0.041 | −0.979* | 0.996 |
| $\eta_B$ | 0.918* | 0.354 | 0.116 | 0.040 | −0.035 | 0.985 |
| $\eta'_B$ | 0.950* | 0.151 | 0.102 | −0.002 | 0.056 | 0.938 |
| % variance | 0.426 | 0.221 | 0.119 | 0.113 | 0.103 | 0.981 |

Based on the results of the factor analyses, a number of equations were generated taking the descriptors showing high loading in different factors, and the best two equations are noted below:

$$pC = 3.260(\pm 0.448)\,[\eta'_F]_{Cl} + 8.927(\pm 2.181)\,[\eta'_F]_{Br/I} + 2.066(\pm 0.289)\,[\eta'_F]_{NO_2} - 1.767(\pm 1.450)\,[\eta'_F]_{CH_2OH} + 0.180(\pm 0.115) \tag{9}$$

$n = 42$, $Q^2 = 0.880$, $R_a^2 = 0.907$, $R^2 = 0.916$, $R = 0.957$, $F = 101.1$ (df 4, 37), $s = 0.218$, AVRES $= 0.162$, SDEP $= 0.245$, $S_{PRESS} = 0.261$, PRESS $= 2.514$, $Pres_{av} = 0.191$

$$pC = 0.591(\pm 0.093)\,\sum\alpha + 0.891(\pm 0.319)\,[\eta'_F]_{NO_2} - 2.063(\pm 1.379)\,[\eta'_F]_{CH_3} - 3.800(\pm 1.416)\,[\eta'_F]_{CH_2OH} - 2.256(\pm 0.540) \tag{10}$$

$n = 42$, $Q^2 = 0.879$, $R_a^2 = 0.906$, $R^2 = 0.916$, $R = 0.957$, $F = 100.3$ (df 4, 37), $s = 0.218$, AVRES $= 0.162$, SDEP $= 0.246$, $S_{PRESS} = 0.262$, PRESS $= 2.532$, $Pres_{av} = 0.191$
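To show how Eq. 10 is applied, the following Python sketch computes $\sum\alpha$ from heavy-atom counts via Eq. 1, evaluates Eq. 10, and applies the outlier criterion from Section 2 (residual larger than twice $s$). It rests on two assumptions: the signs of the coefficients (positive for $\sum\alpha$ and the nitro term, negative for the methyl term, the hydroxymethyl term and the intercept), and zero values of the three functionality descriptors for mono-nitro compounds without methyl or hydroxymethyl groups. Both are consistent with the Calc. values of Table 2 but are not stated explicitly; the function names are illustrative.

```python
# (Z, Z^v, period number) for the elements needed here
ATOMS = {"C": (6, 4, 2), "N": (7, 5, 2), "O": (8, 6, 2),
         "Cl": (17, 7, 3), "Br": (35, 7, 4)}


def alpha(symbol):
    """Core count of a non-hydrogen atom (Eq. 1)."""
    z, zv, period = ATOMS[symbol]
    return (z - zv) / zv / (period - 1)


def sum_alpha(heavy_atom_counts):
    """Sum of alpha over all non-hydrogen vertices."""
    return sum(alpha(s) * n for s, n in heavy_atom_counts.items())


def pc_eq10(sum_a, f_no2=0.0, f_ch3=0.0, f_ch2oh=0.0):
    """Eq. 10 with the assumed signs; f_* are the [eta'_F] descriptors."""
    return (0.591 * sum_a + 0.891 * f_no2
            - 2.063 * f_ch3 - 3.800 * f_ch2oh - 2.256)


def is_outlier(observed, calculated, s):
    """Outlier rule: absolute residual above twice the standard error."""
    return abs(observed - calculated) > 2 * s


S_EQ10 = 0.218
# name: (heavy-atom counts, observed pC from Table 2)
COMPOUNDS = {
    "Nitrobenzene": ({"C": 6, "N": 1, "O": 2}, 0.14),
    "4-Chloronitrobenzene": ({"C": 6, "N": 1, "O": 2, "Cl": 1}, 0.43),
    "4-Bromonitrobenzene": ({"C": 6, "N": 1, "O": 2, "Br": 1}, 0.38),
}

for name, (counts, obs) in COMPOUNDS.items():
    calc = pc_eq10(sum_alpha(counts))
    print(f"{name:22s} calc={calc:.2f} obs={obs:.2f} "
          f"outlier={is_outlier(obs, calc, S_EQ10)}")
```

Under these assumptions the calculated values fall within about 0.01 of the Calc. column for Eq. 10 in Table 2 (0.14, 0.56 and 0.93), and only 4-bromonitrobenzene exceeds the $2s$ threshold.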

The 95% confidence intervals of the regression coefficients are shown within parentheses. Eqs. 9 and 10 show that, using four predictor variables, 88% predicted variance and 91% explained variance can be achieved. The standard errors and predicted standard errors of these two equations are 0.22 and 0.25 respectively. As a statistical check, it may be noted that the $Q^2$ values of Eqs. 9 and 10 are higher than the recommended value of 0.5, and the difference between the $Q^2$ and $R^2$ values in each case is less than the highest allowed value of 0.3 [3]. The intercorrelation ($r$) among the descriptors used in Eqs. 9 and 10 is given in Table 5. 4-Bromonitrobenzene (24) acts as an outlier (but was not deleted) for both equations. Though Eq. 9 shows slightly higher $Q^2$ and $R_a^2$ values than Eq. 10, the latter is considered the best equation, because the variable $[\eta'_F]_{Cl}$ shows considerable loading (Table 4) in factor 3, in which the variable $[\eta'_F]_{Br/I}$ is highly loaded, and both of these variables have been used in Eq. 9. To further check the predictive capacity of Eqs. 9 and 10, leave-10%-out and leave-25%-out cross-validation were applied, and the results presented in Table 6 suggest that the equations are of acceptable quality.

The positive coefficient of $\sum\alpha$ in Eq. 10 indicates that the nitrobenzene toxicity increases with a rise in molecular bulk. The positive coefficients of $[\eta'_F]_{Cl}$, $[\eta'_F]_{Br/I}$ and $[\eta'_F]_{NO_2}$ in Eq. 9 indicate that the toxicity increases with the presence of –Cl, –Br, –I and (additional) –NO2 groups. Further, the negative coefficients of $[\eta'_F]_{CH_3}$ and $[\eta'_F]_{CH_2OH}$ in Eq. 10 imply a decrease of the toxicity in the presence of –CH3 and –CH2OH groups.

The calculated and predicted (leave-one-out) toxicity values according to Eqs. 9 and 10 are given in Table 2.

Table 5. Intercorrelation (r) among important predictor variables

| | $[\eta'_F]_{Cl}$ | $[\eta'_F]_{NO_2}$ | $[\eta'_F]_{CH_3}$ | $[\eta'_F]_{Br/I}$ | $[\eta'_F]_{CH_2OH}$ | $\sum\alpha$ | AlogP98 | $^{2}\kappa$ | $^{3}\chi^{v}_C$ |
|---|---|---|---|---|---|---|---|---|---|
| $[\eta'_F]_{Cl}$ | 1 | 0.096 | 0.379 | 0.392 | 0.204 | 0.618 | 0.850 | 0.195 | 0.696 |
| $[\eta'_F]_{NO_2}$ | | 1 | 0.271 | 0.097 | 0.292 | 0.411 | 0.352 | 0.895 | 0.010 |
| $[\eta'_F]_{CH_3}$ | | | 1 | 0.139 | 0.110 | 0.382 | 0.096 | 0.295 | 0.244 |
| $[\eta'_F]_{Br/I}$ | | | | 1 | 0.096 | 0.174 | 0.035 | 0.169 | 0.249 |
| $[\eta'_F]_{CH_2OH}$ | | | | | 1 | 0.062 | 0.476 | 0.479 | 0.157 |
| $\sum\alpha$ | | | | | | 1 | 0.600 | 0.667 | 0.886 |
| AlogP98 | | | | | | | 1 | 0.085 | 0.833 |
| $^{2}\kappa$ | | | | | | | | 1 | 0.290 |
| $^{3}\chi^{v}_C$ | | | | | | | | | 1 |

Table 6. Results of leave-many-out cross-validation applied on Eqs. (9) and (10). Model equation: $pC = \sum \beta_i x_i + \alpha$

| Key Eq. no. | Type of cross-validation | Number of cycles | Average regression coefficients (standard deviations) | $Q^2$ (average Pres) |
|---|---|---|---|---|
| (9) | Leave-10%-out | 10ᵃ | $3.260\,(0.045)\,[\eta'_F]_{Cl} + 8.908\,(0.239)\,[\eta'_F]_{Br/I} + 2.065\,(0.048)\,[\eta'_F]_{NO_2} - 1.764\,(0.368)\,[\eta'_F]_{CH_2OH} + 0.180\,(0.008)$ | 0.889 (0.184) |
| (9) | Leave-25%-out | 4ᵇ | $3.257\,(0.075)\,[\eta'_F]_{Cl} + 9.025\,(0.723)\,[\eta'_F]_{Br/I} + 2.053\,(0.134)\,[\eta'_F]_{NO_2} - 1.732\,(0.866)\,[\eta'_F]_{CH_2OH} + 0.180\,(0.036)$ | 0.864 (0.200) |
| (10) | Leave-10%-out | 10ᵃ | $0.591\,(0.012)\,\sum\alpha + 0.892\,(0.050)\,[\eta'_F]_{NO_2} - 2.052\,(0.299)\,[\eta'_F]_{CH_3} - 3.798\,(0.439)\,[\eta'_F]_{CH_2OH} - 2.259\,(0.076)$ | 0.880 (0.192) |
| (10) | Leave-25%-out | 4ᵇ | $0.587\,(0.015)\,\sum\alpha + 0.889\,(0.105)\,[\eta'_F]_{NO_2} - 2.130\,(0.385)\,[\eta'_F]_{CH_3} - 3.783\,(0.868)\,[\eta'_F]_{CH_2OH} - 2.234\,(0.100)$ | 0.874 (0.195) |

$Q^2$ denotes cross-validated $R^2$. Average Pres means the average of the absolute values of the predicted residuals.
ᵃ Compounds were deleted in 10 cycles in the following manner: (1, 11, 21, 31, 41), (2, 12, 22, 32, 42), ..., (10, 20, 30, 40).
ᵇ Compounds were deleted in 4 cycles in the following manner: (1, 5, 9, ..., 41), (2, 6, 10, ..., 42), ..., (4, 8, 12, ..., 40).
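The deletion patterns used for the leave-many-out cycles follow a simple systematic rule: in a run with $k$ cycles, cycle $c$ removes every $k$-th compound starting from compound $c$. A minimal Python sketch of this partitioning is given below; the function name is illustrative, and refitting the model on the retained compounds in each cycle is not shown.

```python
def systematic_folds(n_compounds, n_cycles):
    """Serial numbers (1-based) of the compounds deleted in each cycle."""
    return [list(range(start, n_compounds + 1, n_cycles))
            for start in range(1, n_cycles + 1)]


leave_10_percent_out = systematic_folds(42, 10)
leave_25_percent_out = systematic_folds(42, 4)

for cycle, deleted in enumerate(leave_10_percent_out, start=1):
    print(f"cycle {cycle:2d}: {deleted}")
```

Every compound is deleted exactly once per run, so the predicted residuals collected over all cycles cover the whole data set of 42 compounds.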

Next, an attempt was made to compare the ETA relations with non-ETA ones. Table 7 shows the factor analysis of the data matrix composed of the toxicity values and selected topological descriptors. Three factors could explain 96.5% of the variance of the data matrix. The nitrobenzene toxicity was found to be highly loaded with factor 1 (which is in turn highly loaded in Balaban J, $^{1}\kappa$, $^{2}\kappa$, $^{1}\kappa_\alpha$, $^{0}\chi$, $^{1}\chi$, $^{2}\chi$, $^{3}\chi_p$, $^{3}\chi_c$ and Wiener) and factor 2 (which is in turn highly loaded in $^{0}\chi^{v}$, $^{1}\chi^{v}$, $^{2}\chi^{v}$, $^{3}\chi^{v}_p$ and $^{3}\chi^{v}_c$). The best relation formulated from these descriptors was the following:

$$pC = 0.639(\pm 0.211)\,{}^{2}\kappa + 2.568(\pm 0.776)\,{}^{3}\chi^{v}_c - 2.482(\pm 0.814) \tag{11}$$

$n = 42$, $Q^2 = 0.686$, $R_a^2 = 0.736$, $R^2 = 0.749$, $R = 0.865$, $F = 58.1$ (df 2, 39), $s = 0.367$, AVRES $= 0.273$, SDEP $= 0.395$, $S_{PRESS} = 0.410$, PRESS $= 6.555$, $Pres_{av} = 0.298$

Eq. 11 is a two-variable relation predicting 68.6% andexplaining 73.6% of the variance of the toxicity. Intercorre-lation (r) among the two descriptors used in Eq. 11 has beengiven in Table 5. In order to explore the possibility ofimproving quality of Eq. 11 using physicochemical param-eters (AlogP98 andMolRef), factor analysis of data matrixcomposed of important topological descriptors along withAlogP98, MolRef and the toxicity values were performedand the factor loadings are shown in Table 8. Only twofactors could explain 95.2% of the variance of the datamatrix. The best formulated equation using these descrip-tors is as follows:

$$pC = 0.899(\pm 0.199)\,{}^{2}\kappa + 0.483(\pm 0.141)\,\mathrm{AlogP98} - 3.696(\pm 0.912) \tag{12}$$

n = 42, $Q^{2}$ = 0.702, $R_{a}^{2}$ = 0.745, $R^{2}$ = 0.758, R = 0.871, F = 61.0 (df 2, 39), s = 0.360, AVRES = 0.277, SDEP = 0.385, SPRESS = 0.399, PRESS = 6.223, Presav = 0.303

Eq. 12 involves AlogP98 and $^{2}\kappa$ as predictor variables and predicts 70.2% of the variance of the toxicity (74.5% explained variance). Clearly, Eq. 10, formulated from ETA descriptors, is better in statistical quality than Eqs. 11 and 12, which involve other topological and/or physicochemical parameters. The intercorrelation (r) between the two descriptors used in Eq. 12 is given in Table 5.

Next, an attempt was made to use non-ETA descriptors along with ETA ones to further improve the ETA models. Table 9 shows the factor analysis results for the data matrix composed of ETA and non-ETA descriptors. Five factors could explain 98.3% of the variance. The best equation considering both ETA and non-ETA descriptors was the following:


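Table 7. Factor loadings of the variables (topological) after VARIMAX rotation

| Variables | Factor 1 | Factor 2 | Factor 3 | Communality |
|---|---|---|---|---|
| pC | 0.567 | 0.647 | 0.233 | 0.795 |
| Balaban J | 0.758* | 0.538 | -0.268 | 0.936 |
| $^{1}\kappa$ | 0.891* | 0.398 | 0.216 | 0.999 |
| $^{2}\kappa$ | 0.918* | 0.127 | 0.347 | 0.980 |
| $^{3}\kappa$ | 0.634 | -0.212 | 0.718* | 0.963 |
| $^{1}\kappa_{\alpha}$ | 0.742* | 0.640 | 0.171 | 0.990 |
| $^{2}\kappa_{\alpha}$ | 0.685 | 0.653 | 0.267 | 0.967 |
| $^{3}\kappa_{\alpha}$ | 0.430 | 0.584 | 0.677 | 0.985 |
| $^{0}\chi$ | 0.879* | 0.433 | 0.196 | 0.999 |
| $^{1}\chi$ | 0.904* | 0.360 | 0.229 | 0.999 |
| $^{2}\chi$ | 0.857* | 0.450 | 0.233 | 0.992 |
| $^{3}\chi_{P}$ | 0.819* | 0.528 | 0.072 | 0.955 |
| $^{3}\chi_{C}$ | 0.757* | 0.576 | 0.216 | 0.952 |
| $^{0}\chi^{v}$ | 0.439 | 0.891* | 0.083 | 0.994 |
| $^{1}\chi^{v}$ | 0.413 | 0.900* | 0.090 | 0.989 |
| $^{2}\chi^{v}$ | 0.272 | 0.953* | 0.080 | 0.989 |
| $^{3}\chi^{v}_{P}$ | 0.331 | 0.878* | -0.088 | 0.889 |
| $^{3}\chi^{v}_{C}$ | 0.163 | 0.973* | 0.069 | 0.979 |
| W | 0.897* | 0.348 | 0.250 | 0.988 |
| % variance | 0.478 | 0.399 | 0.087 | 0.965 |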
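In an orthogonal factor solution, the communality of a variable is the sum of its squared loadings over the retained factors. As a quick illustration, the Python sketch below recomputes the communalities for three rows of Table 7. The dictionary keys and the helper name are made up for the example, and any small differences from the tabulated values reflect rounding in the printed loadings.

```python
# Loadings (Factor 1, Factor 2, Factor 3) and reported communality,
# copied from three rows of Table 7. Row labels are informal names.
table7_rows = {
    "pC": ((0.567, 0.647, 0.233), 0.795),
    "Balaban J": ((0.758, 0.538, -0.268), 0.936),
    "3chi_v_C": ((0.163, 0.973, 0.069), 0.979),
}


def communality(loadings):
    """Sum of squared loadings across the retained factors."""
    return sum(value ** 2 for value in loadings)


recomputed = {
    name: communality(loadings) for name, (loadings, _) in table7_rows.items()
}
for name, (_, reported) in table7_rows.items():
    print(f"{name}: recomputed {recomputed[name]:.3f}, reported {reported:.3f}")
```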

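Table 8. Factor loadings of the variables (topological and physicochemical) after VARIMAX rotation

| Variables | Factor 1 | Factor 2 | Communality |
|---|---|---|---|
| pC | 0.688 | 0.562 | 0.789 |
| $^{1}\kappa$ | 0.951* | 0.305 | 0.997 |
| $^{2}\kappa$ | 0.994* | 0.006 | 0.989 |
| $^{0}\chi$ | 0.936* | 0.344 | 0.994 |
| $^{1}\chi$ | 0.964* | 0.262 | 0.997 |
| $^{2}\chi$ | 0.927* | 0.361 | 0.989 |
| $^{3}\chi_{P}$ | 0.855* | 0.456 | 0.938 |
| $^{3}\chi_{C}$ | 0.834* | 0.505 | 0.951 |
| $^{0}\chi^{v}$ | 0.539 | 0.835* | 0.989 |
| $^{1}\chi^{v}$ | 0.519 | 0.842* | 0.979 |
| $^{2}\chi^{v}$ | 0.385 | 0.910* | 0.977 |
| $^{3}\chi^{v}_{P}$ | 0.389 | 0.848* | 0.870 |
| $^{3}\chi^{v}_{C}$ | 0.278 | 0.944* | 0.969 |
| MolRef | 0.700* | 0.699 | 0.979 |
| AlogP98 | -0.061 | 0.934* | 0.877 |
| % variance | 0.526 | 0.427 | 0.952 |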
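Tables 7 and 8 report loadings after VARIMAX rotation. For readers unfamiliar with the procedure, the following is a minimal NumPy sketch of a commonly used SVD-based varimax iteration. It is not the software routine used by the authors; the function name, the defaults and the small matrix of unrotated loadings are all assumptions made for illustration.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (variables x factors) loading matrix by the varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Matrix whose SVD gives the next orthogonal rotation.
        target = loadings.T @ (
            rotated ** 3 - rotated @ np.diag(np.sum(rotated ** 2, axis=0)) / p
        )
        u, s, vt = np.linalg.svd(target)
        rotation = u @ vt
        new_criterion = float(np.sum(s))
        if new_criterion < criterion * (1 + tol):
            break  # no meaningful improvement over the previous pass
        criterion = new_criterion
    return loadings @ rotation, rotation


# Made-up unrotated loadings for five variables on two factors.
unrotated = np.array([
    [0.80, 0.40],
    [0.75, 0.45],
    [0.70, -0.50],
    [0.65, -0.55],
    [0.60, 0.10],
])
rotated, rotation = varimax(unrotated)
communality_before = np.sum(unrotated ** 2, axis=1)
communality_after = np.sum(rotated ** 2, axis=1)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, each variable's communality is unchanged by it, which is why the communality column can be checked directly from the rotated loadings.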


$$pC = 0.668(\pm 0.092)\,\mathrm{AlogP98} + 2.290(\pm 0.309)\,[\eta'_{F}]_{NO_2} - 2.261(\pm 1.332)\,[\eta'_{F}]_{CH_3} + 3.054(\pm 1.967)\,[\eta'_{F}]_{Br/I} - 1.036(\pm 0.268) \tag{13}$$

n = 42, $Q^{2}$ = 0.892, $R_{a}^{2}$ = 0.911, $R^{2}$ = 0.920, R = 0.959, F = 106.0 (df 4, 37), s = 0.213, AVRES = 0.164, SDEP = 0.231, SPRESS = 0.247, PRESS = 2.249, Presav = 0.189

Eq. 13, involving both ETA and non-ETA descriptors, is slightly superior in quality to Eq. 10, which involves only ETA descriptors. The intercorrelation (r) among the descriptors used in Eq. 13 is given in Table 5.

The present data set was previously modeled by Estrada et al. [29] using fragmental contributions in the TOPS-MODE approach; the $R^{2}$ statistic for the best model was 0.910 ($Q^{2}$ = 0.901, F = 93.9 [df 4, 37], s = 0.22). The data set was also modeled by Cronin et al. [30] using physicochemical descriptors (1-octanol-water partition coefficient and molecular orbital parameters), and the $R^{2}$ value of the best model was 0.881 ($Q^{2}$ = 0.866, F = 154 [df 2, 39], s = 0.246). Eq. 10 developed in the present study is statistically comparable to the relations reported previously [29, 30]. Thus, it appears that ETA descriptors have significant potential in QSAR/QSPR/QSTR studies, which warrants extensive evaluation.

Table 9. Factor loadings of the variables (ETA and non-ETA) after VARIMAX rotation

| Variables | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Communality |
|---|---|---|---|---|---|---|
| pC | 0.687 | 0.580 | -0.161 | 0.283 | -0.196 | 0.952 |
| $\sum\alpha$ | 0.562 | 0.751* | -0.283 | 0.140 | 0.093 | 0.989 |
| $\eta^{local}_{F}$ | 0.945* | 0.160 | 0.164 | 0.157 | 0.148 | 0.991 |
| $\eta_{R}$ | 0.832* | 0.510 | 0.075 | 0.035 | 0.166 | 0.987 |
| $[\eta'_{F}]_{Cl}$ | 0.044 | 0.875* | 0.364 | 0.298 | -0.075 | 0.995 |
| $[\eta'_{F}]_{NO_2}$ | 0.973* | -0.197 | 0.013 | 0.105 | 0.024 | 0.997 |
| $[\eta'_{F}]_{CH_3}$ | -0.201 | -0.143 | 0.089 | -0.960* | -0.054 | 0.993 |
| $[\eta'_{F}]_{Br/I}$ | -0.110 | -0.069 | -0.985* | 0.089 | -0.047 | 0.998 |
| $[\eta'_{F}]_{CH_2OH}$ | 0.233 | -0.198 | 0.043 | 0.049 | 0.947* | 0.996 |
| $\eta'_{B}$ | 0.501 | 0.810* | 0.029 | -0.047 | -0.012 | 0.910 |
| $^{2}\kappa$ | 0.934* | 0.154 | 0.050 | 0.076 | 0.304 | 0.998 |
| AlogP98 | -0.164 | 0.946* | 0.003 | 0.012 | -0.264 | 0.992 |
| % variance | 0.382 | 0.302 | 0.105 | 0.097 | 0.097 | 0.983 |

Acknowledgement

A financial grant from the J.U. Research Fund is thankfully acknowledged.

References

[1] R. Perkins, H. Fang, W. Tong, W. J. Welsh, Quantitative Structure-Activity Relationship Methods: Perspectives on Drug Discovery and Toxicology, Environ. Toxicol. Chem. 2003, 22, 1666-1679.

[2] S. P. Bradbury, C. L. Russom, G. T. Ankley, T. W. Schultz, J. D. Walker, Overview of Data and Conceptual Approaches for Derivation of Quantitative Structure-Activity Relationships for Ecotoxicological Effects of Organic Chemicals, Environ. Toxicol. Chem. 2003, 22, 1789-1798.

[3] J. D. Walker, J. Jaworska, M. H. Comber, T. W. Schultz, J. C. Dearden, Guidelines for Developing and Using Quantitative Structure-Activity Relationships, Environ. Toxicol. Chem. 2003, 22, 1653-1665.

[4] M. T. D. Cronin, J. S. Jaworska, J. D. Walker, M. H. Comber, C. D. Watts, A. P. Worth, Use of QSARs in International Decision-Making Frameworks to Predict Health Effects of Chemical Substances, Environ. Health Perspect. 2003, 111, 1391-1401.

[5] M. H. Comber, J. D. Walker, C. Watts, J. Hermens, Quantitative Structure-Activity Relationships for Predicting Potential Ecological Hazard of Organic Chemicals for Use in Regulatory Risk Assessment, Environ. Toxicol. Chem. 2003, 22, 1822-1828.

[6] K. Rose, L. H. Hall, E-State Modeling of Fish Toxicity Independent of 3-D Structure Information, SAR QSAR Environ. Res. 2003, 14, 113-129.

[7] P. Mazzatorta, E. Benfenati, C. D. Neagu, G. Gini, Tuning Neural and Fuzzy-Neural Networks for Toxicity Modeling, J. Chem. Inf. Comput. Sci. 2003, 43, 513-518.

[8] J. Huuskonen, QSAR Modeling with Electrotopological State: Predicting the Toxicity of Organic Chemicals, Chemosphere 2003, 50, 949-953.

[9] S. A. Kulkarni, D. V. Raje, T. Chakraborti, Quantitative Structure-Activity Relationships Based on Functional and Structural Characteristics of Organic Chemicals, SAR QSAR Environ. Res. 2001, 12, 565-591.

[10] A. R. Katritzky, D. B. Tatham, U. Maran, Theoretical Descriptors for the Correlation of Aquatic Toxicity of Environmental Pollutants by Quantitative Structure-Toxicity Relationships, J. Chem. Inf. Comput. Sci. 2001, 41, 1162-1176.

[11] W. Di Marzio, S. Galassi, R. Todeschini, F. Consolaro, Traditional versus WHIM Molecular Descriptors in QSAR Approaches Applied to Fish Toxicity Studies, Chemosphere 2001, 44, 401-406.

[12] M. Vighi, P. Gramatica, F. Consolaro, R. Todeschini, QSAR and Chemometric Approaches for Setting Water Quality Objectives for Dangerous Chemicals, Ecotoxicol. Environ. Saf. 2001, 49, 206-220.

[13] S. C. Basak, G. D. Grunwald, B. D. Gute, K. Balasubramanian, D. Opitz, Use of Statistical and Neural Net Approaches in Predicting Toxicity of Chemicals, J. Chem. Inf. Comput. Sci. 2000, 40, 885-890.

[14] J. Devillers, A General QSAR Model for Predicting the Acute Toxicity of Pesticides to Lepomis macrochirus, SAR QSAR Environ. Res. 2001, 11, 397-417.

[15] T. W. Schultz, T. I. Netzeva, M. T. D. Cronin, Selection of Data Sets for QSARs: Analysis of Tetrahymena Toxicity from Aromatic Compounds, SAR QSAR Environ. Res. 2003, 14, 59-81.

[16] A. A. Toropov, T. W. Schultz, Prediction of Aquatic Toxicity: Use of Optimization of Correlation Weights of Local Graph Invariants, J. Chem. Inf. Comput. Sci. 2003, 43, 560-567.

[17] G. Klopman, S. E. Stuart, Multiple Computer-Automated Structure Evaluation Study of Aquatic Toxicity. III. Vibrio fischeri, Environ. Toxicol. Chem. 2003, 22, 466-472.

[18] R. L. Yu, G. R. Hu, Y. H. Zhao, Comparative Study of Four QSAR Models of Aromatic Compounds to Aquatic Organisms, J. Environ. Sci. (China) 2002, 14, 552-557.

[19] J. R. Seward, T. W. Schultz, QSAR Analyses of the Toxicity of Aliphatic Carboxylic Acids and Salts to Tetrahymena pyriformis, SAR QSAR Environ. Res. 1999, 10, 557-567.

[20] E. Argese, C. Bettiol, G. Giurin, P. Miana, Quantitative Structure-Activity Relationships for the Toxicity of Chlorophenols to Mammalian Submitochondrial Particles, Chemosphere 1999, 38, 2281-2292.

[21] M. T. D. Cronin, J. C. Dearden, A. J. Dobbs, QSAR Studies of Comparative Toxicity in Aquatic Organisms, Sci. Total Environ. 1991, 109-110, 431-439.

[22] X. Liu, Z. Yang, L. Wang, CoMFA of Acute Toxicity of Phenylsulfonyl Carboxylates to Vibrio fischeri, SAR QSAR Environ. Res. 2003, 14, 183-190.

[23] X. Liu, Z. Yang, L. Wang, Three-Dimensional Quantitative Structure-Activity Relationship Study for Phenylsulfonyl Carboxylates Using CoMFA and CoMSIA, Chemosphere 2003, 53, 945-952.

[24] D. W. Gao, P. Wang, H. Liang, Y. Z. Peng, A Study on Prediction of the Bio-toxicity of Substituted Benzene Based on Artificial Neural Network, J. Environ. Sci. Health B 2003, 38, 571-579.

[25] S. Cui, X. Wang, S. Liu, L. Wang, Predicting Toxicity of Benzene Derivatives by Molecular Hologram Derived Quantitative Structure-Activity Relationships (QSARs), SAR QSAR Environ. Res. 2003, 14, 223-231.

[26] P. Gramatica, V. Consonni, M. Pavan, Prediction of Aromatic Amines Mutagenicity from Theoretical Molecular Descriptors, SAR QSAR Environ. Res. 2003, 14, 237-250.

[27] K. Roy, G. Ghosh, Introduction of Extended Topochemical Atoms (ETA) Indices in the Valence Electron Mobile (VEM) Environment as Tool for QSAR/QSPR Studies, Internet Electron. J. Mol. Des. 2003, 2, 599-620, http://www.biochempress.com.

[28] K. Roy, G. Ghosh, QSTR with Extended Topochemical Atom Indices. 2. Fish Toxicity of Substituted Benzenes, J. Chem. Inf. Comput. Sci. 2004, 44, in press.

[29] E. Estrada, E. Uriarte, Quantitative Structure Activity Relationships Using TOPS-MODE. 1. Nitrobenzene Toxicity to Tetrahymena pyriformis, SAR QSAR Environ. Res. 2001, 12, 309-324.

[30] M. T. D. Cronin, B. W. Gregory, T. W. Schultz, Quantitative Structure-Activity Analyses of Nitrobenzene Toxicity to Tetrahymena pyriformis, Chem. Res. Toxicol. 1998, 11, 902-908.

[31] D. K. Pal, C. Sengupta, A. U. De, A New Topochemical Descriptor (TAU) in Molecular Connectivity Concept: Part I - Aliphatic Compounds, Indian J. Chem. 1988, 27B, 734-739.

[32] D. K. Pal, C. Sengupta, A. U. De, Introduction of a Novel Topochemical Index and Exploitation of Group Connectivity Concept to Achieve Predictability in QSAR and RDD, Indian J. Chem. 1989, 28B, 261-267.

[33] D. K. Pal, M. Sengupta, C. Sengupta, A. U. De, QSAR with TAU ($\tau$) Indices: Part I - Polymethylene Primary Diamines as Amebicidal Agents, Indian J. Chem. 1990, 29B, 451-454.

[34] D. K. Pal, S. K. Purkayastha, C. Sengupta, A. U. De, Quantitative Structure-Property Relationships with TAU Indices: Part I - Research Octane Numbers of Alkane Fuel Molecules, Indian J. Chem. 1992, 31B, 109-114.

[35] K. Roy, D. K. Pal, A. U. De, C. Sengupta, Comparative QSAR with Molecular Negentropy, Molecular Connectivity, STIMS and TAU Indices: Part I. Tadpole Narcosis of Diverse Functional Acyclic Compounds, Indian J. Chem. 1999, 38B, 664-671.

[36] K. Roy, D. K. Pal, A. U. De, C. Sengupta, Comparative QSAR Studies with Molecular Negentropy, Molecular Connectivity, STIMS and TAU Indices. Part II: General Anaesthetic Activity of Aliphatic Hydrocarbons, Halocarbons and Ethers, Indian J. Chem. 2001, 40B, 129-135.

[37] K. Roy, A. Saha, Comparative QSPR Studies with Molecular Connectivity, Molecular Negentropy and TAU Indices. Part I: Molecular Thermochemical Properties of Diverse Functional Acyclic Compounds, J. Mol. Model. 2003, 9, 259-270.

[38] K. Roy, A. Saha, Comparative QSPR Studies with Molecular Connectivity, Molecular Negentropy and TAU Indices. Part 2: Lipid-Water Partition Coefficient of Diverse Functional Acyclic Compounds, Internet Electron. J. Mol. Des. 2003, 2, 288-305, http://www.biochempress.com.

[39] K. Roy, A. Saha, QSPR with TAU Indices: Water Solubility of Diverse Functional Acyclic Compounds, Internet Electron. J. Mol. Des. 2003, 2, 475-491, http://www.biochempress.com.

[40] K. Roy, S. Chakroborty, C. C. Ghosh, A. Saha, QSPR with TAU Indices: Molar Thermochemical Properties of Diverse Functional Acyclic Compounds, J. Indian Chem. Soc. 2004, 81, in press.

[41] I. Moriguchi, Y. Kanada, K. Komatsu, van der Waals Volume and the Related Parameters for Hydrophobicity in Structure-Activity Studies, Chem. Pharm. Bull. 1976, 24, 1799-1806.

[42] P. J. Lewi, Multivariate data analysis in structure-activity relationships, in: E. J. Ariens (Ed.), Drug Design, vol. 10, Academic Press, New York 1980, pp. 307-342.

[43] R. Franke, A. Gruska, Principal component and factor analysis, in: H. van de Waterbeemd (Ed.), Chemometric Methods in Molecular Design, vol. 2, VCH, Weinheim 1995, pp. 113-163.

[44] STATISTICA is a statistical software of Statsoft Inc., USA.

[45] The GW-BASIC programs RRR98, KRETA1, KRETA2, KRPRES1 and KRPRES2 were developed by Kunal Roy and standardized using known data sets.

[46] Cerius 2 version 4.6 is a product of Accelrys Inc., San Diego, CA.


[47] G. W. Snedecor, W. G. Cochran, Statistical Methods, Oxford and IBH Publishing Co. Pvt. Ltd., New Delhi 1967, pp. 381-418.

[48] S. Wold, L. Eriksson, Statistical Validation of QSAR Results, in: H. van de Waterbeemd (Ed.), Chemometric Methods in Molecular Design, VCH, Weinheim 1995, pp. 312-317.

[49] A. K. Debnath, Quantitative Structure-Activity Relationship (QSAR): A Versatile Tool in Drug Design, in: A. K. Ghose, V. N. Viswanadhan (Eds.), Combinatorial Library Design and Evaluation, Marcel Dekker, Inc., New York 2001, pp. 73-129.

Received on January 16, 2004; Accepted on February 11, 2004
