
“INTELLIGENT” CONSENSUS PREDICTIONS FOR DAPHNIA TOXICITY OF AGROCHEMICALS

Pathan Mohsin Khan 1, Kunal Roy 2,3, Emilio Benfenati 3

1 Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Chunilal Bhawan, 168, Maniktala Main Road, 700054 Kolkata, India

2 Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, 188 Raja S C Mullick Road, 700032 Kolkata, India

3 Laboratory of Environmental Chemistry and Toxicology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy

Poster sections: Introduction; Overall Methodology; Key Points; Why Consensus?; Error estimation and predictivity comparison; References

[Flowchart labels: Molecular descriptors (Dragon + PaDEL); Validation parameters (MAE, Tropsha, rm2); External validation (Q2F1, Q2F2, rm2 test, MAE95%); Internal validation (R2, Q2, rm2 LOO, MAE95%)]

Agrochemicals: a broad class of chemical products widely used in agriculture to prevent, destroy, or control harmful organisms (insects, fungi, microbes and weeds) or diseases, or to protect crops before and after harvesting, in order to minimize losses or enhance production yield.

Over the last few years, the ecotoxicological hazard potential of agrochemicals has received much attention from industry and regulatory agencies.

Only limited experimental ecotoxicological data are available for such compounds.

Quantitative structure-toxicity relationship (QSTR) modeling is a ligand-based statistical approach that has proved useful for filling such data gaps.

In the present work, we generated QSTR models for the Daphnia toxicity of different classes of agrochemicals (fungicides, herbicides, insecticides and microbiocides) using only simple and interpretable two-dimensional descriptors, and then rigorously validated them using test set compounds.

The validated individual and global models were used for “intelligent” consensus model generation with the ICP tool (http://dtclab.webs.com/software-tools), with the objective of improving prediction quality and reducing prediction errors.

The individual and consensus models were used to predict the toxicity of an external dataset of biocides to determine the predictive ability of the models.

According to the developed models, toxicity generally increases with lipophilicity, the number of halogen atoms (X) on an aromatic ring, the number of substituted benzene C(sp2) atoms, the number of chlorine atoms, the frequency of C-Cl at topological distance 5, the number of multiple bonds, the number of heavy atoms, the number of rotatable bonds, and carbon chain length. Toxicity decreases with polarity, the presence of an ether moiety in an aliphatic chain, the presence of two oxygen atoms at topological distance 8, branching in the molecule, the count of hydrogen bond acceptor atoms, and/or polar surface area.

[Flowchart labels: descriptor classes (ring descriptors, E-state descriptors, molecular properties, connectivity indices, ETA descriptors, functional group counts, 2D atom pairs); Validation; External set prediction (Biocides dataset); ECOSAR comparison]

Summary of features responsible for the toxicity of agrochemicals

Comparison between our models and ECOSAR prediction

A single model cannot guarantee the best-quality predictions for all compounds.

A single model does not cover the entire chemical space, whereas a consensus combines features of different models and so covers a wider range.

Consensus helps to reduce prediction errors.

Four types of consensus are proposed:

I. CM0: Simple average of predictions

II. CM1: Average of predictions from the 'qualified' individual models

III. CM2: Weighted average of predictions from the 'qualified' individual models

IV. CM3: Best selection of predictions (compound-wise) from the 'qualified' individual models
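The four consensus types listed above can be illustrated with a minimal Python sketch. It is not the ICP tool's implementation: the poster does not define how a model becomes 'qualified' for a compound or how the weights are derived, so the function name `consensus_predictions`, the boolean `qualified` mask, and the `weights` array (higher means more trusted) are all hypothetical inputs supplied by the caller.

```python
import numpy as np

def consensus_predictions(preds, qualified, weights):
    """Illustrative CM0-CM3 consensus (assumed inputs, not the ICP tool).

    preds     : (n_models, n_compounds) predicted values
    qualified : (n_models, n_compounds) bool, True if the model is treated
                as 'qualified' for that compound (criterion is an assumption)
    weights   : (n_models, n_compounds) non-negative trust scores (assumption)
    """
    preds = np.asarray(preds, dtype=float)
    qualified = np.asarray(qualified, dtype=bool)
    weights = np.asarray(weights, dtype=float)
    n_compounds = preds.shape[1]
    cols = np.arange(n_compounds)

    # CM0: simple average over all models
    cm0 = preds.mean(axis=0)

    # CM1: average over qualified models only (NaN if none qualifies)
    n_qual = qualified.sum(axis=0)
    has_qual = n_qual > 0
    cm1 = np.full(n_compounds, np.nan)
    cm1[has_qual] = (preds * qualified).sum(axis=0)[has_qual] / n_qual[has_qual]

    # CM2: weighted average over qualified models only
    w = weights * qualified
    w_sum = w.sum(axis=0)
    ok = w_sum > 0
    cm2 = np.full(n_compounds, np.nan)
    cm2[ok] = (preds * w).sum(axis=0)[ok] / w_sum[ok]

    # CM3: per compound, take the qualified model with the highest weight
    best = np.where(qualified, weights, -np.inf).argmax(axis=0)
    cm3 = np.where(has_qual, preds[best, cols], np.nan)

    return {"CM0": cm0, "CM1": cm1, "CM2": cm2, "CM3": cm3}
```

With three models and two compounds, for example, `consensus_predictions([[1, 2], [3, 4], [5, 9]], [[True, False], [True, True], [False, True]], [[1, 1], [3, 1], [1, 3]])` averages all three models for CM0 but uses only the two qualified models per compound for CM1 to CM3.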

Predictions of a model are not reliable unless they are compared with standards and tested on external dataset compounds.

We employed an external dataset of 67 biocides. The prediction quality (R2pred) of the three individual models was 0.47, 0.50 and 0.47, with mean absolute errors of 1.407, 1.395 and 1.422, respectively. For consensus model 3 (CM3), the prediction quality was 0.49, while the mean absolute error was reduced to 1.37.
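As a rough illustration of the two statistics quoted above, the sketch below computes R2pred and the mean absolute error for an external set. It assumes the commonly used definition of R2pred (one minus the ratio of the squared prediction errors to the squared deviations of the observed values from the training set mean); the poster does not spell out the formula, and the function names are hypothetical.

```python
import numpy as np

def r2_pred(y_obs, y_pred, y_train_mean):
    """External R2pred, assuming the training-set mean as reference."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    press = np.sum((y_obs - y_pred) ** 2)         # squared prediction errors
    ss_ref = np.sum((y_obs - y_train_mean) ** 2)  # spread around training mean
    return 1.0 - press / ss_ref

def mean_absolute_error(y_obs, y_pred):
    """Mean of the absolute residuals."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_obs - y_pred)))
```

Both functions take the observed and predicted pEC50 values of the external compounds as equal-length sequences. Comparing a consensus model with an individual model then amounts to calling them once per model on the same observed values.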

The prediction error (RMSEp) was compared with that of ECOSAR.

ECOSAR is widely used for ecotoxicological prediction of organic chemicals.

The comparison was made only on the test sets of the models.

Our models offered better predictive efficiency and a larger chemical domain.

Consensus models offered better predictivity than simple QSTR models.

[Flowchart labels: Global models; Individual models (fungicides models, microbiocides models, herbicides models, insecticides models). Set sizes shown for the individual models: Ntrain = 81 and Ntest = 26; Ntrain = 36 and Ntest = 12; Ntrain = 112 and Ntest = 35; Ntrain = 111 and Ntest = 36]

Waxman MF, The Agrochemical and Pesticides Safety Handbook, CRC Press, 1998.

Roy K, Kar S, Das RN, Understanding the Basics of QSAR for Applications in Pharmaceutical Sciences and Risk Assessment, Academic Press, NY, 2015.

Roy K, Ambure P, Kar S, Ojha PK, Is it possible to improve the quality of predictions from an “intelligent” use of multiple QSAR/QSPR/QSTR models? J Chemom 32, 2018, e2992.

US EPA, The ECOSAR (ECOlogical Structure Activity Relationship) Class Program, 2012.

Acknowledgement

PMK thanks the Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India for a fellowship.

KR thanks the European Commission for financial assistance under the project VERMEER [LIFE16 ENV/IT/000167].

[Flowchart labels: GADCV; PLS; Ntrain = 313 and Ntest = 105]

Daphnia toxicity data (pEC50 values)

Flutianil: pEC50 = 6.98
Propiconazole: pEC50 = 4.02
Pentachlorophenol: pEC50 = 5.95
Acetic Acid: pEC50 = 2.08
Cyfluthrin: pEC50 = 9.23
Propylene glycol: pEC50 = 1.83
Tributyltin methacrylate: pEC50 = 2.95
Tributyltin oxide: pEC50 = 7.90
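To put the pEC50 scale in context, the sketch below converts between pEC50 and EC50. It assumes the usual convention that pEC50 is the negative base-10 logarithm of EC50 and that EC50 is expressed in mol/L; the poster does not state the concentration unit, so both the unit and the function names are assumptions.

```python
import math

def pec50_to_ec50_molar(pec50):
    """EC50 in mol/L, assuming pEC50 = -log10(EC50 in mol/L)."""
    return 10.0 ** (-pec50)

def ec50_molar_to_pec50(ec50_molar):
    """pEC50 from an EC50 given in mol/L (assumed unit)."""
    return -math.log10(ec50_molar)
```

Under this convention a higher pEC50 means a lower effective concentration and hence higher toxicity, and each pEC50 unit corresponds to a tenfold change in EC50.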