This work is licensed under a Creative Commons Attribution 4.0
International License.
ORIGINAL SCIENTIFIC PAPER
Croat. Chem. Acta 2020, 93(4), 311–319 Published online: September
2, 2021 DOI: 10.5562/cca3776
QSPR Models for Prediction of Aqueous Solubility: Exploring the Potency of Randić-type Indices
Janja Sluga,1,2 Katja Venko,1 Viktor Drgan,1 Marjana Novič1,*
1 National Institute of Chemistry, Theory Department, Laboratory for Cheminformatics, Hajdrihova ulica 19, Ljubljana, Slovenia
2 University of Ljubljana, Faculty of Pharmacy, Chair of Pharmaceutical Chemistry, Aškerčeva cesta 7, Ljubljana, Slovenia
* Corresponding author’s e-mail address: [email protected]
RECEIVED: April 26, 2021 REVISED: June 28, 2021 ACCEPTED: June 29,
2021
THIS PAPER IS DEDICATED TO PROF. MILAN RANDIĆ ON THE OCCASION OF HIS 90TH BIRTHDAY, AND TO THE MEMORY OF PROF. MIRCEA DIUDEA
Abstract: The development of QSPR models to predict aqueous solubility (logS) is presented. A structurally diverse set of over 1600 compounds with experimentally determined solubility values (AqSolDB database) is used to build data-driven models based on multiple linear regression (MLR) and artificial neural network (ANN) methods. Molecular structures are encoded by numerous structural descriptors, including the connectivity index developed by Randić in 1975 and many later derived variations. To evaluate the potency of Randić-like descriptors in the structure-property relationship, we developed models based on two sets of descriptors: the first using only Randić-like descriptors calculated with Dragon, and the second using 17 commonly applied descriptors available in the AqSolDB database. All models were validated with external prediction sets, with the RMSE ranging from 0.8 to 1.1. Interestingly, the RMSE values of predicted logS for models based only on the Randić-like descriptors were on average just 0.1 larger than those of the models with 17 descriptors preselected as suitable for modelling logS.
Keywords: aqueous solubility, QSPR model, MLR, ANN, connectivity index, Randić-like indices.
INTRODUCTION
Aqueous solubility is an important physical property describing the complex interaction of a solute with water. It is of special importance in the pharmaceutical industry, as drug discovery relies upon solubility data to help improve drug delivery systems. In our previous work we found a potential autolysin E (AtlE) inhibitor,[1] which has to be subjected to a structure optimization procedure due to its low solubility. Several models are available to predict aqueous solubility (logS). They follow two main approaches: either calculation of the solution free energy using physics-based theory alone, or machine learning/quantitative structure−property relationship (QSPR) models. Theoretical calculations of aqueous solubility by molecular simulation are extremely demanding. Physics-based solubility computations equate the free energy of a molecule in the crystal lattice with the solvation energy of a molecule in saturated aqueous solution. The crystal lattice free energy is experimentally observed as the free energy of sublimation, while the free energy for transfer of the molecule from the gas phase to the saturated solution is the solvation free energy. Both values can be calculated by molecular dynamics simulation in conjunction with one of the methods for free energy calculation, such as thermodynamic integration, thermodynamic perturbation, or metadynamics. Solvation free energy calculation[2,3] is especially demanding since, in the case of solubility simulation, it depends on the solute concentration. Therefore, the solvation free energy should be determined for several values of solute concentration. Such a large computational effort for a single species makes the first-principles approach impractical for purposes like drug design, where hundreds of drug candidates have to be screened. Even for much simpler tasks like octanol/water partition calculations, where solvent reaction field methods should work, only a few solvation models reproduce experimental values; e.g. SMD (solvation model density) of Truhlar and Cramer works, while PCM (polarizable continuum model) of Tomasi does not.[2]
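The thermodynamic cycle described above can be written compactly. The relations below are only a sketch under simplifying assumptions that are not stated in the text: the same gas-phase standard state is used in both terms, the solution standard state is 1 mol/L, the saturated solution behaves as ideal-dilute, and S is expressed in mol/L. The symbol names are ours.

```latex
\begin{align}
\Delta G^{\circ}_{\mathrm{sol}} &= \Delta G^{\circ}_{\mathrm{sub}} + \Delta G^{\circ}_{\mathrm{solv}} \\
\log S &= -\frac{\Delta G^{\circ}_{\mathrm{sol}}}{RT \ln 10} \\
RT \ln 10 &\approx 5.71\ \mathrm{kJ\,mol^{-1}} \approx 1.36\ \mathrm{kcal\,mol^{-1}}
\qquad (T = 298.15\ \mathrm{K})
\end{align}
```

Under these assumptions, an error of roughly 1.4 kcal/mol in the computed solution free energy shifts the predicted logS by a full log unit, which gives a sense of the accuracy a physics-based calculation would have to reach.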
Often the pragmatic approach of machine or deep learning outperforms theoretical calculations, particularly as more and more data become available. J. L. McDonagh et al.[3] reported that direct theoretical calculation did not give accurate results, while machine learning was able to predict logS with a root mean squared error (RMSE) of ∼1.1 log units in a 10-fold cross-validation for 100 drug-like molecules. The group of Schneider reviewed drug discovery with explainable artificial intelligence.[4] They concluded that deep learning bears promise for drug discovery, but there is a demand for ‘explainable’ deep learning methods to address the need for a new narrative of the machine language of the molecular sciences.
QSPRs are widely used in silico methods to predict chemical properties of untested chemicals, since they offer a time- and cost-effective approach that is in most cases a sufficient alternative to experimental testing. Building and validating models requires appropriate statistical algorithms and a data matrix that includes numerical representations of chemical structures and empirical values of the property. In the literature and on web platforms, QSPR models for aqueous solubility based on various sets of molecules have been published.[3,5–16] The models were developed using various numbers of compounds and linear or non-linear methodologies (e.g. multiple linear regression, partial least squares, ordinary least squares, multivariate adaptive regression splines, hierarchical clustering, group contribution, nearest neighbour, support vectors, random forest, or artificial neural networks). By utilizing information derived only from SMILES strings, the available models predict aqueous solubility (logS) in simple and straightforward procedures that do not require molecular geometry optimization. The models generate predictions from inputs ranging from molecular descriptors (Chemistry Development Kit (CDK), PaDEL, RDKit, Dragon, alvaDesc, SASA) to molecular structure signatures (distance graph-based signatures (GBS), MACCS keys, etc.). Most of the models include the logP parameter,[3,5,7,9,12,17–19] which is also a core variable in the General Solubility Equation (GSE).[20] While the GSE is the method of choice only when the melting point is estimated, the other models, in which variables are calculated from the molecular structure, can be used without limitations.
On the SwissADME platform,[13] three aqueous solubility models are available: ESOL,[5] Ali[12] and SILICOS-IT. The ESOL model by Delaney[5] was developed from a set of 2874 compounds using multiple linear regression and nine molecular properties such as logP, molecular weight, proportion of heavy atoms in aromatic systems, and number of rotatable bonds. The model has good performance
($R^2_{TR} = 0.69$, $R^2_{V} = 0.85$) and is competitive with the General Solubility Equation for medicinal/agrochemical molecules. The Ali model is based on a set of 1256 compounds and uses partial least squares, MACCS keys, TPSA and logP, with performance $R^2_{TR} = 0.81$, $R^2_{V} = 0.83$.[12] The SILICOS-IT model[21] is based on a fragmental method corrected by molecular weight, with $R^2_{TR} = 0.75$. The models of McElroy and Jurs[14] were generated with 399 heteroatom-containing organic compounds by using multiple linear regression (MLR) and a computational neural network (CNN). The best results were obtained with non-linear CNN models ($\mathrm{RMSE}_{TR} = 0.6$, $\mathrm{RMSE}_{V} = 1.5$; subscripts TR and V refer to the training and validation sets, respectively). The models available on the VEGA and EPA platforms are based on 5020 compounds from the EPI Suite database. The VEGA water solubility model v.1.0.0[7] is based on an artificial neural network algorithm and 15 DRAGON descriptors, with performance $\mathrm{RMSE}_{TR} = 0.84$ and $\mathrm{RMSE}_{V} = 0.93$. The EPA model is available in the Toxicity Estimation Software Tool (T.E.S.T. v5.1) and makes a consensus prediction from various modelling algorithms (hierarchical clustering, group contribution, nearest neighbour) with an estimated $\mathrm{RMSE}_{V} = 0.87$.[6] Another prediction model is available on the pkCSM platform;[11] it was generated with 1708 compounds and graph-based signatures ($R^2_{TR} = 0.82$, $R^2_{V} = 0.73$). The model from admetSAR 2.0 is also based on the same set of compounds ($R^2 = 0.81$).[9,10] On the Alvascience platform, a model ($R^2_{TR} = 0.76$, $R^2_{V} = 0.76$) based on ordinary least squares, 8825 compounds and five alvaDesc descriptors is presented, and its predictions are in high correlation (> 0.9) with the ESOL model.
The
molecular connectivity index developed by Milan Randić[22] was shown to correlate almost perfectly with the boiling points of alkane isomers having two to seven carbon atoms. Hall et al.[23] demonstrated its relationship to water solubility and boiling point. After 25 years, in 2001, Randić published a comprehensive review of the development of connectivity indices as molecular descriptors in multiple linear regression analysis in structure–property–activity studies.[24] The review focuses on the elaboration of higher-order connectivity indices and the valence connectivity indices. The discussion shed light on further developments in chemical graph theory and on novel directions in the mathematical characterization of chemical, biochemical, and biological systems, all stimulated by the connectivity index. Connectivity-based molecular descriptors were later applied in many QSPR models, including those for prediction of aqueous solubility.[25–29] A few years ago, Gutman et al.[30] described an interesting connection between the degree-based information content of a (molecular) graph and the Randić index. However, detailed inspection of the correlation studies revealed that I(G) (degree-based information content), in contrast to R(G) (Randić index), carries information only on the degree distribution in graphs and not on their mutual relationship, which results in the insensitivity of vertex-degree-based
information of a graph to subtle structural differences among graphs. This additionally illustrates the advantage of the Randić index and its application potential in chemistry. An interesting use of connectivity indices is the estimation of stability constants of metal complexes.[31,32]
The aim of this study was to evaluate the potency of Randić-like descriptors in the structure-property relationship regarding aqueous solubility. In particular, our goal is to develop QSPR (Quantitative Structure-Property Relationship) models to predict the aqueous solubility (logS) of potential AtlE inhibitors in order to optimize the initial chemical structure of the poorly soluble hit compounds obtained in previous work.[1] Therefore, we developed and compared models based on two sets of descriptors: the first using only Randić-like descriptors calculated with Dragon, and the second using 17 commonly applied descriptors, as described in the literature and available in the AqSolDB database.[33]
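For readers less familiar with the connectivity index discussed above, the first-order Randić index of a hydrogen-suppressed molecular graph is the sum of (d_i d_j)^(-1/2) over all bonds, where d_i is the number of heavy-atom neighbours of atom i. The short sketch below illustrates only this basic definition; the function name and the edge-list input are our own choices for illustration, and the Dragon descriptors used in this work include many further variants that are not reproduced here.

```python
from math import sqrt


def randic_index(edges):
    """First-order connectivity (Randic) index of a hydrogen-suppressed graph.

    edges: list of (i, j) pairs, one per bond between heavy atoms.
    """
    # Vertex degree = number of bonds in which the atom takes part.
    degree = {}
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    # Sum 1/sqrt(d_i * d_j) over all bonds.
    return sum(1.0 / sqrt(degree[i] * degree[j]) for i, j in edges)


# Carbon skeletons of two C4 isomers (atom labels are arbitrary integers).
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (1, 2), (1, 3)]

print(round(randic_index(n_butane), 3))   # 1.914
print(round(randic_index(isobutane), 3))  # 1.732
```

The branched isomer receives the lower value, which is the kind of sensitivity to branching that makes the index useful for isomer-dependent properties such as boiling point.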
MATERIALS AND METHODS
Dataset
The dataset for modelling included 1674 compounds. The solubility data were obtained from the AqSolDB database.[33] In this database, logS values were collected from different sources.[5,20,34–41] AqSolDB consists of over 9900 unique compounds, which are coded with SMILES strings. For our modelling, we chose only compounds from the most reliable groups G3 and G5, i.e. groups composed of compounds with logS values found more than once in the merged dataset and having a reliability label of standard deviation < 0.5. In this way, 1818 compounds were obtained; this number was further reduced because of limitations in the calculation of molecular descriptors (MDs) in the Dragon software,[42] which led to 1674 compounds in our dataset. Compounds were classified into four classes according to the thresholds reported in AqSolDB: insoluble compounds (logS < −4), moderately soluble compounds (logS −4 to −2), soluble compounds (logS −2 to 0) and highly soluble compounds (logS > 0).[33] The classes are labelled as I, L, S and H (see Supplementary file Figure S1 for the distribution of the classes on a Kohonen top-map). Among the 1674 compounds included in this study, 458 are insoluble (I), 563 have moderately low solubility (L), 519 are soluble (S) and 134 are highly soluble (H). The chemical structures of all compounds were numerically coded with molecular descriptors (MDs). SMILES representations of the structures and experimental values of all compounds are listed in Table S1 (Supplementary Material). The experimental values available as standardized logS units in AqSolDB[33] were obtained from aqueous solubility assays that followed the OECD guidelines for testing of chemicals.
The 1674 compounds were then divided into three datasets: 1004 compounds in the training set TR (60 %), 335 in the test set TS (20 %) and 335 in the validation set V (20 %). The split was based on mapping and visualization of the compounds on a top-map with the CP-ANNatNIC software.[43] Several Kohonen maps of different network sizes (19 × 19, 20 × 20, 21 × 21) were tested for mapping the compounds according to their structural similarity and logS values. After statistical analysis and inspection of occupied/empty neurons, the optimal neural network for splitting was of size 20 × 20 (Table S2). The most similar compounds, i.e. those located on the same neuron, were then split among the training, test and validation datasets. For the methodology of splitting data using a Kohonen ANN, see the paper of Minovski et al.[46] and Refs. [36,37] ibid. The Kohonen map of the optimal network is visualized in Figure S1.
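The idea of distributing compounds that share a neuron over the three sets can be sketched as follows. This is not the procedure implemented in CP-ANNatNIC or described by Minovski et al.; it is a minimal illustration that assumes the neuron index of each compound is already known (here the hypothetical array `neuron_ids`) and that a repeating 3:1:1 assignment pattern is an acceptable way to approximate the 60/20/20 ratio.

```python
import numpy as np


def split_by_neuron(neuron_ids, pattern=("TR", "TR", "TR", "TS", "V"), seed=0):
    """Label each compound TR, TS or V so that compounds sharing a
    Kohonen neuron are spread over the three sets (about 60/20/20)."""
    rng = np.random.default_rng(seed)
    neuron_ids = np.asarray(neuron_ids)
    labels = np.empty(len(neuron_ids), dtype=object)
    k = 0  # running position in the repeating 3:1:1 pattern
    for neuron in np.unique(neuron_ids):
        members = np.flatnonzero(neuron_ids == neuron)
        rng.shuffle(members)  # random order within one neuron
        for idx in members:
            labels[idx] = pattern[k % len(pattern)]
            k += 1
    return labels


# Hypothetical example: 1000 compounds mapped onto a 20 x 20 grid (400 neurons).
demo_ids = np.random.default_rng(1).integers(0, 400, size=1000)
demo_labels = split_by_neuron(demo_ids)
counts = {s: int(np.sum(demo_labels == s)) for s in ("TR", "TS", "V")}
print(counts)  # {'TR': 600, 'TS': 200, 'V': 200}
```

Because the pattern position carries over from one neuron to the next, neighbouring compounds on the same neuron tend to land in different sets while the overall proportions stay close to the target ratio.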
Molecular Descriptors
Two sets of molecular descriptors were used in the development of our QSPR predictive models. The first set, the so-called AqSol set, was composed of 17 topological and physico-chemical 2D MDs that are published in AqSolDB.[33] Originally, they were calculated with the RDKit software[44] and are listed in Table S8 (Supplementary Material). The second set, the so-called Dragon set, was generated with the Dragon 7.0 software.[42] Molecular structures in the form of SMILES strings were the input, and the program calculated over 3000 molecular descriptors for each molecule. Therefore, reducing the number of initially calculated descriptors was crucial for QSPR modelling. We focused on the Randić connectivity index and other Randić-like descriptors, and in this way ended up with 94 Randić descriptors (Table S3). Before being used for building QSPR models, both sets of descriptors were normalised to zero mean and unit standard deviation for each descriptor.
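The normalisation step can be illustrated with a few lines of NumPy. The text does not say on which compounds the mean and standard deviation were computed; the sketch below assumes, as is common practice, that they are taken from the training set and then reused for the other sets. The function name and the toy data are hypothetical.

```python
import numpy as np


def autoscale(X_train, X_other):
    """Scale each descriptor (column) to zero mean and unit standard
    deviation, using statistics of the training set only (an assumption)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant descriptor: avoid division by zero
    return (X_train - mean) / std, (X_other - mean) / std


# Toy descriptor matrix: 100 training and 20 validation compounds, 3 descriptors.
rng = np.random.default_rng(0)
X_tr = rng.normal(loc=[5.0, -2.0, 100.0], scale=[1.0, 0.5, 30.0], size=(100, 3))
X_val = rng.normal(loc=[5.0, -2.0, 100.0], scale=[1.0, 0.5, 30.0], size=(20, 3))

Z_tr, Z_val = autoscale(X_tr, X_val)
print(np.allclose(Z_tr.mean(axis=0), 0.0), np.allclose(Z_tr.std(axis=0, ddof=1), 1.0))
```

Scaling matters here because Kohonen-type networks rely on Euclidean distances, so descriptors with large numerical ranges would otherwise dominate the mapping.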
Generation of Models
The inputs for model building were m-dimensional vectors representing the chemical structure, m being the number of selected MDs (independent variables), and the target (property, i.e. dependent variable) corresponding to the logS of the compounds. For model fitting, linear and non-linear regression approaches were used. Firstly, we used the supervised learning algorithm of the counter-propagation artificial neural network (CP-ANN) and in-house software (CP-ANNatNIC).[43] During model optimisation, the size of the CP-ANN (number of neurons), the number of epochs, the minimal and maximal coefficients used for correction of the CP-ANN weights, and the selection of descriptors were
simultaneously optimised. Secondly, multiple linear regression models were developed with the QSARINS software (DiSTA, Varese, Italy, www.qsar.it).[45] In the optimization, a genetic algorithm (GA), which is available in QSARINS and was combined with the CP-ANN in-house program, was used to select the influential descriptors and to improve the predictive ability and robustness of the models. All models were externally validated and had a defined applicability domain (AD). The AD was applied to evaluate the reliability of model predictions within the established limits of the chemical space. Two approaches were used for AD evaluation: the cumulative distributions of Euclidean distances to central neurons (MEDS) for the CP-ANN models[46] and the leverage values (hat values) for the MLR models.[45] Finally, the models with the best performance parameters were selected and are presented in this paper as models NN-AqSol (NN-A), Q-AqSol (Q-A), NN-Dragon (NN-D) and Qsarins-Dragon (Q-D). These optimized regression models can further be used to predict the logS value of any chemical of interest, considering the limitations specified with the models, such as electro-neutrality or the structural applicability domain.
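The leverage-based applicability domain mentioned for the MLR models can be sketched as follows. The leverage of a compound is h = x^T (X^T X)^(-1) x, where X is the training descriptor matrix with an intercept column. The warning threshold h* = 3(p + 1)/n used below is the value commonly applied in Williams plots and is an assumption on our part, since the text does not state the cut-off; function names and data are illustrative only.

```python
import numpy as np


def leverages(X_train, X_query):
    """Hat values h = x^T (X^T X)^-1 x, with an intercept column added."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    xtx_inv = np.linalg.pinv(A.T @ A)
    # Row-wise quadratic form q_i^T (A^T A)^-1 q_i.
    return np.einsum("ij,jk,ik->i", Q, xtx_inv, Q)


def warning_leverage(n_train, n_descriptors):
    """Commonly used cut-off h* = 3 (p + 1) / n (assumed, not from the paper)."""
    return 3.0 * (n_descriptors + 1) / n_train


# Toy data: 200 training compounds described by 3 scaled descriptors.
rng = np.random.default_rng(0)
X_tr = rng.normal(size=(200, 3))
h_tr = leverages(X_tr, X_tr)
h_star = warning_leverage(*X_tr.shape)

# A compound far from the training space gets a leverage above h*.
h_far = leverages(X_tr, np.array([[10.0, 10.0, 10.0]]))[0]
print(h_far > h_star)  # True
```

A compound with h above h* lies outside the descriptor space spanned by the training set, so its prediction is an extrapolation and should be treated with caution.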
[Equation (1) is not recoverable from the source; only the symbols h and y and two summation signs survive.]
RESULTS AND DISCUSSION
Splitting of the Data
Initially, we followed the recommended methodology for building QSAR models[51] and performed a careful splitting of the initial data to avoid inconsistent results. The division of compounds into the training set (TR), test set (TS), and validation set (V) was done according to the optimal distribution of the 1674 compounds, described with the 17 AqSol molecular descriptors, on the Kohonen top-map. In the Kohonen neural network, compounds (represented by their molecular descriptors) were mapped according to similarity, and consequently some of them were placed on the same neurons. The network with the optimal distribution of compounds had the following parameters: 20 × 20 neuron grid, 100 learning epochs, maximal learning rate 0.5, minimal learning rate 0.01, non-toroidal boundary conditions, and a triangular neighbourhood correction function; it gave RMSE = 0.397 and R = 0.918 (Table S2). Splitting of the 1674 compounds was performed following the optimal ratio (60 % training set / 20 % test set / 20 % validation set) to cover as much information as possible. In models using Dragon descriptors, the number of compounds was reduced to 1665, since for nine compounds some descriptors were not calculated. The validation set (n = 335; 334 for Dragon models) was the same for all developed models. The training set for the MLR models was composed of 1339 compounds (1331 for Dragon models), i.e. the merged TR and TS sets, while in the CP-ANN models the training and test sets were composed of 1004 (996 for Dragon models) and 335 compounds, respectively (Table 1).
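To make the network settings listed above more concrete (grid size, epochs, learning rate decreasing from 0.5 to 0.01, non-toroidal boundaries, triangular neighbourhood), the following is a compact sketch of a counter-propagation network. It is not the CP-ANNatNIC implementation: the weight initialisation, the linear shrinking of the neighbourhood radius and the use of the Chebyshev distance on the grid are our assumptions, and `train_cpann` and `predict_cpann` are hypothetical names.

```python
import numpy as np


def train_cpann(X, y, nx=20, ny=20, epochs=100, eta_max=0.5, eta_min=0.01, seed=0):
    """Minimal counter-propagation ANN: Kohonen layer W plus output layer out."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.uniform(X.min(axis=0), X.max(axis=0), size=(nx * ny, m))
    out = np.full(nx * ny, float(np.mean(y)))
    gx, gy = np.divmod(np.arange(nx * ny), ny)  # grid coordinates of neurons
    total, step = epochs * n, 0
    r_max = max(nx, ny) - 1
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / max(total - 1, 1)
            eta = eta_max - (eta_max - eta_min) * frac  # linear decrease
            radius = r_max * (1.0 - frac)               # shrinking neighbourhood
            c = np.argmin(((W - X[i]) ** 2).sum(axis=1))  # winning neuron
            # Non-toroidal topological distance to the winner on the grid.
            d = np.maximum(np.abs(gx - gx[c]), np.abs(gy - gy[c]))
            a = np.clip(1.0 - d / (radius + 1.0), 0.0, None)  # triangular function
            W += (eta * a)[:, None] * (X[i] - W)  # Kohonen layer correction
            out += eta * a * (y[i] - out)         # output layer correction
            step += 1
    return W, out


def predict_cpann(W, out, X):
    """Prediction = output weight of the neuron closest to each input."""
    winners = np.argmin(((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2), axis=1)
    return out[winners]


# Toy example on a small 5 x 5 grid with a synthetic target.
rng = np.random.default_rng(1)
X_demo = rng.uniform(size=(200, 2))
y_demo = X_demo[:, 0] + X_demo[:, 1]
W, out = train_cpann(X_demo, y_demo, nx=5, ny=5, epochs=30)
rmse_fit = float(np.sqrt(np.mean((predict_cpann(W, out, X_demo) - y_demo) ** 2)))
print(rmse_fit < float(np.std(y_demo)))
```

The same Kohonen layer serves both purposes described in the text: the position of the winning neuron groups similar compounds for data splitting, and the output layer stores the property value associated with that neuron.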
Distribution of logS
Analysis of the logS distribution reveals that the compounds have solubility values in the range between −12.1 and 1.5. Figure 1 shows the distribution of solubility values across the solubility classes according to the AqSolDB classification.[33] The proportions of compounds in the individual aqueous solubility classes are comparable with those in the whole AqSolDB.
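Assigning the four solubility classes from logS values is a simple binning operation, sketched below. The thresholds (−4, −2, 0) are those quoted from AqSolDB in the text; how values lying exactly on a threshold are treated is not stated, so the convention used here (a boundary value goes to the more soluble class) is an assumption.

```python
import numpy as np


def solubility_class(logS):
    """Map logS to classes I (< -4), L (-4 to -2), S (-2 to 0), H (>= 0).

    Boundary handling is an assumption made for this sketch.
    """
    logS = np.asarray(logS, dtype=float)
    idx = np.digitize(logS, [-4.0, -2.0, 0.0])  # 0..3, bins closed on the left
    return np.array(["I", "L", "S", "H"])[idx]


values = [-12.1, -3.0, -1.0, 1.5]
classes = solubility_class(values)
print(list(classes))  # ['I', 'L', 'S', 'H']
```

Counting the labels returned for a whole dataset gives the class distribution that is plotted in Figure 1.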
Development of Predictive Models
The logS values of the 1674 compounds selected from AqSolDB and the algorithms of multiple linear regression (MLR) or counter-propagation artificial neural networks (CP-ANN) with a genetic algorithm (GA) were applied to develop regression models for predicting solubility in water. During the optimization process hundreds of models were generated, but only the best four models were selected and are presented in this study (models NN-A, Q-A, NN-D, Q-D) for two sets of descriptors (A: AqSol 17 descriptors and D:
Dragon Randić-like 94 descriptors) and two modelling methods (NN for neural networks and Q for MLR). The QSARINS and CP-ANNatNIC software are well-known and frequently used tools for building linear and non-linear QSAR models.[43,45] The best models were chosen by using the root mean square error for the training set ($\mathrm{RMSE}_{TR}$) and the validation set ($\mathrm{RMSE}_{V}$) as optimization criteria. In the building and optimization of the CP-ANN models, numerous GA runs were performed with different settings of parameters such as the number of neurons and learning epochs, the learning rate, the minimal and maximal coefficients used for correction of the CP-ANN weights, and the number of descriptors. The final selection of the best models was based on the optimal values of the performance indices (Table 1, Table S13). The best four regression models are presented in Table 1. The performance statistics for the eight logS predictive QSPR models obtained with the QSARINS or CP-ANN tools are given in Table 2. The best results were obtained with the NN-A model ($\mathrm{RMSE}_{TR} = 0.69$, $\mathrm{RMSE}_{V} = 0.76$), which has a good ability to predict aqueous solubility. The next most reliable model, NN-D, also performs well ($\mathrm{RMSE}_{TR} = 0.81$, $\mathrm{RMSE}_{V} = 0.96$). The linear Q-A model has $\mathrm{RMSE}_{TR} = 1.38$ and $\mathrm{RMSE}_{V} = 1.12$, while the other linear model, Q-D, has $\mathrm{RMSE}_{TR} = 1.22$ and $\mathrm{RMSE}_{V} = 1.08$ (Table 2). If several models are developed, a consensus approach can be applied according to the OECD guidance. The consensus models cons4 and aver4 include the predictions of all four chosen models, but do not show better results than the single models NN-A and NN-D. On the other hand, the consensus models consNN and averNN, which include predictions from both CP-ANN models (NN-A and NN-D), are the best according to the validation parameters. The consensus approach in consNN decreased $\mathrm{RMSE}_{TR}$ to 0.59, compared with the single models NN-A ($\mathrm{RMSE}_{TR} = 0.69$) and NN-D ($\mathrm{RMSE}_{TR} = 0.81$). The newly developed models have a broad applicability domain and cover a wide chemical space. Almost all compounds are in the AD of our models. Therefore, the models are robust and their predictions are reliable for compounds from a similar structural domain. The best CP-ANN models, NN-A (7 AqSol MDs)
has $R^2_{all} = 0.90$ and NN-D (22 Dragon MDs) has $R^2_{all} = 0.84$, while the models from the literature have lower $R^2$.[6,7,9–11,13] We were able to compare our results with the models available on the VEGA ($R^2_{TR} = 0.86$, $R^2_{V} = 0.83$) and EPA ($R^2_{V} = 0.84$) platforms, the study of Ali et al. ($R^2_{TR} = 0.81$, $R^2_{V} = 0.83$),[12] the pkCSM model ($R^2_{TR} = 0.82$, $R^2_{V} = 0.73$) and the admetSAR model ($R^2 = 0.81$). Analysis of the $R^2$ parameters also shows that our models Q-A (9 AqSol MDs) with $R^2_{all} = 0.65$ and Q-D (12 Dragon MDs) with $R^2_{all} = 0.72$ are comparable with the ESOL model[5] ($R^2_{TR} = 0.69$, $R^2_{V} = 0.85$), SILICOS-IT[21] ($R^2_{TR} = 0.75$) and the Alvascience model[17] ($R^2_{TR} = 0.76$, $R^2_{V} = 0.76$). In general, our CP-ANN models have better validation parameters than other publicly available models. Furthermore, X1 (connectivity index of order 1, the Randić connectivity index) together with other connectivity indices proves to be a very good choice of molecular descriptors for predicting solubility in water. The $\mathrm{RMSE}_{TR} = 0.81$ of the NN-D model with Randić-like descriptors was only about 0.1 log units larger than the $\mathrm{RMSE}_{TR} = 0.69$ of the NN-A model, which uses 7 of the 17 AqSol descriptors preselected as suitable for predicting solubility in water.
Table 1. Summary of results for regression models.
Model ID  No. of
NN-A  1674 (1004/335/335)  7 AqSol  CP-ANN/RTR+RTS  20 × 20/328  0.69  0.72  0.76
Q-A   1674
NN-D  1665
Q-D   1665
* RMSE for cross-validation of training set
Figure 1. Distribution of compounds according to aqueous solubility
ranges (logS).
The graphs in Figure 2 show the correlation between predicted and experimental logS values for six models. The best correlation is observed for the consensus models consNN ($R^2 = 0.922$) and averNN ($R^2 = 0.910$). Among the single models, the non-linear CP-ANN models (NN-D, $R^2 = 0.815$, and NN-A, $R^2 = 0.901$) perform better than the linear MLR models (Q-D, $R^2 = 0.705$, and Q-A, $R^2 = 0.654$).
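Why combining two models can lower the error is easy to illustrate with a simple averaging consensus. The sketch below uses synthetic predictions with independent errors and takes the arithmetic mean of the two; this is an assumption made for illustration, since the text does not specify how the "cons" and "aver" models combine the individual predictions, and real models usually have partly correlated errors.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))


def average_consensus(*predictions):
    """Arithmetic mean of the predictions of several models."""
    return np.mean(np.vstack(predictions), axis=0)


# Synthetic "experimental" logS values and two models with independent errors.
rng = np.random.default_rng(0)
y_exp = rng.uniform(-10.0, 1.0, size=1000)
pred_a = y_exp + rng.normal(scale=0.7, size=y_exp.size)
pred_b = y_exp + rng.normal(scale=0.9, size=y_exp.size)
pred_avg = average_consensus(pred_a, pred_b)

print(rmse(y_exp, pred_a), rmse(y_exp, pred_b), rmse(y_exp, pred_avg))
```

With independent errors the averaged prediction has a lower RMSE than either single model; the gain shrinks as the errors of the combined models become more strongly correlated.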
Selection of Influential Descriptors
The descriptors selected in the NN-A and Q-A models are shown in Table S10 (Supplementary Material). Model NN-A is built on 7 AqSol MDs: MolLogP, MolMR, HeavyAtomCount, NumHeteroatoms, NumAliphaticRings, RingCount, and BertzCT. In model Q-A, 9 AqSol MDs were selected: MolLogP, HeavyAtomCount, NumHeteroatoms, NumRotatableBonds, NumValenceElectrons, NumAromaticRings, RingCount, LabuteASA, and BertzCT. The MDs selected in the NN-D and Q-D models are listed in Table S5 (Supplementary Material). For model NN-D, 22 Dragon MDs were selected: PW2, X1v, X4v, X0Av, X1Av, X0sol, X3sol, X5sol, RDCHI, X1Kup, X1Per, X1MulPer, Chi_H2, Chi_Dt, ChiA_Dt, Chi_Dz(Z), Chi_Dz(p), Chi_Dz(i), Chi_B(p), ChiA_B(p), VR3_B(p), and VR2_B(i). In model Q-D, 12 Dragon MDs were selected: CID, X0Av, X1sol, Chi_D, ChiA_X, ChiA_Dt, Chi_B(m), Chi_B(e), Chi_B(p), ChiA_B(p), VR2_B(i), and ChiA_B(s). Several descriptors presented in Table S5 are correlated with the original Randić connectivity index (X1), namely the connectivity indices X1v, X0sol, X3sol, X5sol, RDCHI, X1Kup, X1Per, X1MulPer and the 2D matrix-based descriptor Chi_H2 in model NN-D. We also observed high correlation with the walk and path counts descriptor CID,
Figure 2. Correlation between experimental and predicted logS
values of all compounds in the joint dataset.
Table 2. Statistical parameters of the best four single models and consensus models.
Model ID  RMSE_TR  R2_TR  QF3_TR  CCC_TR  RMSE_V  R2_V  QF3_V  CCC_V  RMSE_all  R2_all  QF3_all
NN-A      0.69     0.90   0.90    0.95    0.76    0.88  0.88   0.94   0.71      0.90    0.90
Q-A       1.38     0.62   0.62    0.77    1.12    0.53  0.75   0.84   1.33      0.65    0.65
NN-D      0.81     0.87   0.87    0.93    0.96    0.80  0.81   0.91   0.90      0.84    0.83
Q-D       1.22     0.58   0.70    0.83    1.08    0.69  0.77   0.87   1.19      0.72    0.72
cons4     1.10     0.82   0.76    0.85    1.04    0.63  0.78   0.87   1.09      0.77    0.76
aver4     0.83     0.86   0.86    0.92    0.78    0.83  0.88   0.93   0.82      0.87    0.86
consNN    0.59     0.93   0.93    0.96    0.70    0.90  0.90   0.95   0.63      0.92    0.92
averNN    0.63     0.92   0.92    0.96    0.73    0.88  0.90   0.95   0.68      0.91    0.91
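Three of the statistics reported in Table 2 can be computed as sketched below. The formulas are the commonly used definitions of the RMSE, of the external validation coefficient Q²F3 and of Lin's concordance correlation coefficient (CCC); we assume, but cannot confirm from the text, that the same definitions were used by the software in this study. The function names are ours.

```python
import numpy as np


def rmse(y, y_hat):
    """Root mean square error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def q2_f3(y, y_hat, y_train):
    """Q2F3 = 1 - mean squared prediction error / variance of training responses."""
    y, y_hat, y_train = (np.asarray(a, float) for a in (y, y_hat, y_train))
    mse = np.mean((y - y_hat) ** 2)
    var_train = np.mean((y_train - y_train.mean()) ** 2)
    return float(1.0 - mse / var_train)


def ccc(y, y_hat):
    """Lin's concordance correlation coefficient."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    s_xy = np.sum((y - y.mean()) * (y_hat - y_hat.mean()))
    s_xx = np.sum((y - y.mean()) ** 2)
    s_yy = np.sum((y_hat - y_hat.mean()) ** 2)
    return float(2.0 * s_xy / (s_xx + s_yy + len(y) * (y.mean() - y_hat.mean()) ** 2))


# Worked example: predictions shifted by exactly one log unit.
y_obs = np.array([0.0, 1.0, 2.0, 3.0])
y_pred = y_obs + 1.0
print(rmse(y_obs, y_pred))                    # 1.0
print(round(q2_f3(y_obs, y_pred, y_obs), 3))  # 0.2
print(round(ccc(y_obs, y_pred), 3))           # 0.714
```

The example shows why CCC is a useful complement to a correlation coefficient: the shifted predictions are perfectly correlated with the observations, yet the systematic offset lowers the CCC to about 0.71.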
connectivity index X1sol, and 2D matrix-based descriptors Chi_B(e) and Chi_B(p) in model Q-D. Table S5 also shows that five Dragon descriptors (X0Av, ChiA_Dt, Chi_B(p), ChiA_B(p), VR2_B(i)) are present in both models, NN-D and Q-D. The ten Dragon MDs most frequently selected during model optimization are summarized in Table S4: X0Av, ChiA_B(p), X1Per, X1Av, X1Kup, X1MulPer, X3sol, RDCHI, X1v, and Chi_H2 for the non-linear approach, and X0Av, CID, ChiA_B(p), Chi_D, ChiA_X, VR2_B(s), Chi_B(m), Chi_B(s), Chi_B(p), and Chi_B(e) for the linear methodology. The ten most frequently selected AqSol MDs are listed in Table S9: MolLogP, MolMR, BertzCT, NumHAcceptors, NumHeteroatoms, MolWt, TPSA, NumHDonors, RingCount, and NumAromaticRings in the CP-ANN models, and NumAromaticRings, MolLogP, NumRotatableBonds, NumHDonors, NumHeteroatoms, NumAliphaticRings, MolWt, BalabanJ, NumHAcceptors, and RingCount in the QSARINS models. The correlation analysis for the Dragon descriptors (correlation coefficient (CC) > 0.8) is presented in Table S6, where we can see that the descriptors X1, CID, X1A, PW4, X4v, X0Av, Chi_D, ChiA_Dt, ChiA_B(p), and VR3_B(p) have many correlated molecular descriptors. MolMR, BertzCT, NumAliphaticRings, TPSA, and NumHAcceptors from the AqSol descriptor set are also highly correlated with a few descriptors (Table S11). The correlation matrices for all 94 Dragon and 17 AqSol MDs are presented in Tables S7 and S12, respectively.
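A correlation analysis of the kind summarized in Tables S6 and S11 can be sketched with NumPy as follows. The descriptor names and data below are synthetic placeholders, and taking the absolute value of the correlation coefficient before comparing it with 0.8 is our assumption; the text only states the threshold CC > 0.8.

```python
import numpy as np


def correlated_pairs(X, names, threshold=0.8):
    """Return (name_i, name_j, cc) for descriptor pairs with |cc| > threshold."""
    cc = np.corrcoef(X, rowvar=False)  # columns are descriptors
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(cc[i, j]) > threshold:
                pairs.append((names[i], names[j], float(cc[i, j])))
    return pairs


# Synthetic example: d2 is nearly a copy of d1, d3 is unrelated.
rng = np.random.default_rng(0)
d1 = rng.normal(size=500)
d2 = d1 + rng.normal(scale=0.1, size=500)
d3 = rng.normal(size=500)
X_demo = np.column_stack([d1, d2, d3])

pairs = correlated_pairs(X_demo, ["d1", "d2", "d3"])
print([(a, b) for a, b, _ in pairs])  # [('d1', 'd2')]
```

Listing the highly correlated pairs in this way helps to interpret why different models select different, but mutually correlated, descriptors.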
The correlation coefficients (CC) between the aqueous solubility predictions of our models, the experimental logS values and the predictions generated with six publicly available models are presented in a correlation matrix (Table S14). We observe high correlation (CC > 0.9) of the models NN-A, NN-D, aver4, consNN, averNN and VEGA with the experimental logS values. High correlation (CC > 0.8) was also observed between the predictions of NN-A, NN-D, cons4, aver4, consNN and averNN and the predictions of the VEGA and pkCSM models, while the predictions of the other three public models (ESOL, Ali and SILICOS-IT) are less correlated with our models and with experimental logS, but the correlation is still reasonably high (CC > 0.7).
To summarize, our aim was to select a large set of reliable solubility data for structurally diverse compounds from different data sets, all collected in the AqSol database. The data compiled in this way (chemical diversity), combined with the CP-ANN method, which is able to automatically organize smaller clusters (subsets) of compounds from which the predictions are made, give better results than any of the available models compared in the discussion, with a large applicability domain. The software, together with the aqueous solubility model, is available from the authors upon request. Software written in Java is also freely available; the user can download it from: https://www.ki.si/en/departments/d01-theory-department/laboratory-for-cheminformatics/software/ (SOM tool developed within LIFE+ project LIFE PROSIL).
CONCLUSIONS
In this work, linear and non-linear QSPR models were constructed to predict solubility in water. A dataset of 1674 chemicals (split into 60 % training set, 20 % test set and 20 % validation set) and their experimentally measured aqueous solubility values (logS) was used for model development. Dragon and AqSol molecular descriptors and the MLR and CP-ANN algorithms were used in the modelling process. Multiple calculations were performed in order to obtain the optimal model (consNN with $R^2 = 0.92$, $\mathrm{RMSE}_{TR} = 0.59$). Non-linear models were shown to give better results and to predict the water solubility of chemicals more accurately than the linear ones. An interesting conclusion can be drawn from the comparison of models based on descriptors derived from the connectivity index (Randić-like indices) on one side, and on AqSol descriptors (recognized as the most suitable descriptors for logS modelling) on the other. The RMSE values of the models based only on the Randić-like descriptors were on average just 0.1 log units larger than those of the models with AqSol descriptors, which demonstrates the great potential of the connectivity index in capturing molecular structural properties that may correlate with physico-chemical as well as biochemical properties of compounds.
In drug design, solubility is one of the key properties that have to be considered. For drug design purposes it is worth mentioning protonation states, which depend on the solution pH and the solute pKa values. The latter are also concentration dependent. Besides, protonation states in the crystal may differ from those in solution. The methodology applied in this work, however, cannot explicitly consider such effects. Nevertheless, some underlying information about these effects is present in the solubility data used for training and developing data-driven models. Once the database of tested chemicals is large enough and covers an extensive area of chemical structure space, the models should become more reliable, also regarding the protonation state, but its interpretation definitely remains a challenge for the future.
Based on the results obtained in this work, we are confident that these newly developed models could provide valuable guidance for the design and optimization of more soluble compounds in the future. Since the models are robust and reliable, we hope they will be very useful in our further development of autolysin E inhibitors.
Data Availability: The data used to support the findings of this study are included within the article and the Supporting Information.
Conflicts of Interest: The authors declare that there are no conflicts of interest regarding the publication of this paper.
Funding: This work was financially supported by grants from the Slovenian Research Agency (national program P1-0017 and MR 39008).
Acknowledgments: We thank Prof. Paola Gramatica for providing the QSARINS license.
Supplementary Information. Supporting information to the paper is attached to the electronic version of the article at: https://doi.org/10.5562/cca3776.
REFERENCES [1] J. Borišek, S. Pintar, M. Ogrizek, S. G. Grdadolnik,
V.
Hodnik, D. Turk, A. Perdih, M. Novi, J. Enzyme. Inhib. Med. Chem.
2018, 33(1):1239–1247.
https://doi.org/10.1080/14756366.2018.1493474
[2] A.V. Marenich, C.J. Cramer, D.G. Truhlar, J. Phys. Chem. B,
2009, 113(18), 6378–6396. https://doi.org/10.1021/jp810292n
[3] J. L. McDonagh, N. Nath, L. D. Ferrari, T. V. Mourik, J. B. O.
Mitchell, J. Chem. Inf. Model. 2014, 54, 844–856.
https://doi.org/10.1021/ci4005805
[4] J. J. Luna, F. Grisoni, G. Schneider, Nat. Mach. Intell. 2020,
2, 573–584. https://doi.org/10.1038/s42256-020-00236-4
[5] J. S. Delaney, J. Chem. Inf. Comput. Sci. 2004, 44, 1000–1005.
https://doi.org/10.1021/ci034243x
[6] T. M Martin; Toxicity Estimation Software Tool (TEST); U.S.
Environmental Protection Agency, 2020, Washington, DC,
https://www.epa.gov/chemical-
research/toxicity-estimation-software-tool-test
[7] E. Benfenati, A. Roncaglioni, A. Lombardo, A. Manganaro,
Integrating QSAR, Read-Across, and Screening Tools: The VEGAHUB
Platform as an Example in: (ed.: H. Hong) Advances in Computational
Toxicology. Challenges and Advances in Computational Chemistry and
Physics, vol 30., 2019, Springer, Cham. https://www.vegahub.eu/
https://doi.org/10.1007/978-3-030-16443-0_18
[8] P. J. Nathan, J. Clarke, J. Lloyd, C. W. Hutchison, L. Downey,
C. Stough, Hum. Psychopharmacol. 2001 16(4), 345–351.
https://doi.org/10.1002/hup.306
[9] J. Wang, G. Krudy, T. Hou, W. Zhang, G. Holland, X. Xu, J.
Chem. Inf. Model. 2007, 47, 1395–1404.
https://doi.org/10.1021/ci700096r
[10] H. Yang, C. Lou, L. Sun, J. Li, Y. Cai, Z. Wang, W. Li, G.
Liu, Y. Tang, Bioinform. 2019, 35(6), 1067–1069.
https://doi.org/10.1093/bioinformatics/bty707
[11] D. E. V. Pires, T. L. Blundell, D. B. Ascher, J. Med. Chem.
2015, 58(9), 4066. http://biosig.unimelb.edu.au/pkcsm/prediction
https://doi.org/10.1021/acs.jmedchem.5b00104
[12] J. Ali, P. Camilleri, M. B. Brown, A. J. Hutt, S. B. Kirton,
J. Chem. Inf. Model. 2012, 52, 420–428.
https://doi.org/10.1021/ci200387c
[13] A. Daina, O. Michielin, V. Zoete, Sci. Rep. 2017, 7., 42717.
https://doi.org/10.1038/srep42717
[14] N. R. McElroy, P. C. Jurs, J. Chem. Inf. Comput. Sci. 2001,
41, 1237–1247. https://doi.org/10.1021/ci010035y
[15] M. Przybyek, T. Jeliski, P. Cysewski, J. Chem. 2019, 9858371.
https://doi.org/10.1155/2019/9858371
[16] A. L. Perryman, D. Inoyama, J. S. Patel, S. Ekins, J. S.
Freundlich, ACS Omega. 2020, 5(27), 16562–16567.
https://doi.org/10.1021/acsomega.0c01251
[17] Alvascience models for aqueous solubility (LogS).
https://www.alvascience.com/tutorial-build-
models-for-aqueous-solubility-logs/
[18] D. Butina, J. M. R Gola, J. Chem. Inf. Comput. Sci. 2003, 43,
837–841. https://doi.org/10.1021/ci020279y
[19] A. Cheng, K. M. Merz, J. Med. Chem. 2003, 46, 3572– 3580.
https://doi.org/10.1021/jm020266b
[20] N. Jain, S. H. Yalkowsky, J. Pharm. Sci. 2001, 90(2), 234–252.
https://doi.org/10.1002/1520-6017(200102)90:
2%3C234::AID-JPS14%3E3.0.CO;2-V
[21] SILICOS-IT – aqueous solubility predictor
http://silicos-it.be.s3-website-eu-west-
1.amazonaws.com/software/filter-it/1.0.2/filter-it.html
[22] M. Randi, J. Am. Chem. Soc. 1975, 97(23), 6609– 6615.
https://doi.org/10.1021/ja00856a001
[23] L. H. Hall, L. B. Kier, W. J. Murray, J. Pharm. Sci. 1975,
64(12), 1974–1977. https://doi.org/10.1002/jps.2600641215
[24] M. Randi, J. Mol. Graph. Model. 2001, 20, 19–35.
https://doi.org/10.1016/S1093-3263(01)00098-5
[25] M. Hewitt, M. T. Cronin, S. J. Enoch, J. C. Madden, D. W.
Roberts, J. C. Dearden, J. Chem. Inf. Model. 2009, 49(11):
2572–2587. https://doi.org/10.1021/ci900286s
[26] S. Nikoli, N. Trinajsti, D. Amic, D. Beslo, S. Basak in
QSAR/QSPR Studies by Molecular Descriptors, Vol. 1 (Ed.: M. V.
Diudea), Nova Science Publishers, New York, 2001, pp. 63–81.
[27] Y. D. Hu, Y. L. Wang, Asian J. Chem. 2007, 19, 407– 416.
[28] C. Zhong, Q. Hu, J. Pharm. Sci. 2003, 92, 2284–2294.
https://doi.org/10.1002/jps.10499
DOI: 10.5562/cca3776 Croat. Chem. Acta 2020, 93(4), 311–319
[29] E. Estrada, E. J. Delgado, J. B. Alderete, G. A. Jaña, J.
Comput. Chem. 2004, 25(14), 1787–1796.
https://doi.org/10.1002/jcc.20099
[30] I. Gutman, B. Furtula, V. Katani, AKCE Int. J. of Graphs Comb.
2018, 15(3), 307–312.
https://doi.org/10.1016/j.akcej.2017.09.006
[31] N. Raos, G. Branica, A. Milievi, Croatica Chemica Acta, 2008,
81(3), 511–517.
[32] A. Milievi, N. Raos, J. Phys. Chem. A, 2008, 112(33),
7745–7749. https://doi.org/10.1021/jp802018m
[33] M. C. Sorkun, A. Khetan, S. Er, Sci Data. 2019, 6, 143.
https://doi.org/10.1038/s41597-019-0151-1
[34] OECD. eChemPortal - The Global Portal to Information on
Chemical Substances, 2019.
https://www.echemportal.org/echemportal/proper
tysearch/addblock_input.action
[35] US EPA. EPI Suite Data. WATERNT (Water Solubility Fragment)
Program Methodology & Validation Documents, 1995.
http://esc.syrres.com/interkow/Download/WaterFr
agmentDataFiles.zip
[36] US EPA. EPI Suite Data. WSKOWWIN Program Methodology and
Validation Documents, 1994.
http://esc.syrres.com/interkow/Download/WSKO
WWIN_Datasets.zip
[37] O. A. Raevsky, V. Y. Grigor’ev, D. E. Polianczyk, O. E.
Raevskaja, J. C. Dearden, J. Chem. Inf. Comput. Sci. 2014, 54,
683–691. https://doi.org/10.1021/ci400692n
[38] J. Huuskonen, J. Chem. Inf. Comput. Sci. 2000, 40, 773–777.
https://doi.org/10.1021/ci9901338
[39] J. Wang, T. Hou, X. Xu, J. Chem. Inf. Model. 2009, 49,
571–581. https://doi.org/10.1021/ci800406y
[40] Goodman Group website, http://www-
jmg.ch.cam.ac.uk/data/solubility/
[41] A. Llinas, R. C. Glen, J. M. Goodman, J. Chem. Inf. Model.
2008, 48, 1289–1303. https://doi.org/10.1021/ci800058v
[42] Dragon 7.0 – Software for molecular descriptors calculation,
2016, Kode srl., Pisa, Italy, https://chm.kode-
solutions.net/products_dragon.php
[43] V. Drgan, Š. uperl, M. Vrako, C. I. Cappelli, M. Novi, J.
Cheminform. 2017, 9, 30.
https://doi.org/10.1186/s13321-017-0218-y
[44] RDKit software, 2019, http://wwww.rdkit.org [45] P. Gramatica,
N. Chirico, E. Papa, S. Kovarich, S.
Cassani, J. Comput. Chem. 2013, 34, 2121–2132.
https://doi.org/10.1002/jcc.23361
[46] N. Minovski, Š. uperl, V. Drgan, M. Novi, Anal. Chim. Acta.
2013, 759, 28–42. https://doi.org/10.1016/j.aca.2012.11.002
[47] V. Consonni, D. Ballabio, R. Todeschini, J. Chem. Inf. Model.
2009, 49(7), 1669–1678. https://doi.org/10.1021/ci900115y
[48] N. Chirico, P. Gramatica, J. Chem. Inf. Model. 2012, 52,
2044–2058. https://doi.org/10.1021/ci300084j
[49] X. Wang, Q. Wang, M. E.Morris, APPS J. 2008, 10, 47–55.
https://doi.org/10.1208/s12248-007-9001-8
[50] R. Todeschini, V. Consonni, P. Gramatica in Comprehensive
Chemometrics, Vol. 4 (Eds.: S.D. Brown, R. Tauler, B. Walczak),
Elsevier, Oxford, 2009, pp. 129–172.
https://doi.org/10.1016/B978-044452701-1.00007-7
[51] P. Gramatica, QSAR Comb. Sci. 2007, 26, 694–701.
https://doi.org/10.1002/qsar.200610151
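In the supplementary table of predictions, the "average 4" and the first "SD" columns appear to be the arithmetic mean and the sample standard deviation of the four model predictions (Q-D, NN-D, Q-A, NN-A). This reading is inferred from the tabulated numbers and is not stated explicitly in the source. The short Python sketch below checks it for compound 1; the function name `summarize` is introduced here only for illustration.

```python
from statistics import mean, stdev

# Compound 1: predicted logS from the four models (Q-D, NN-D, Q-A, NN-A),
# copied from the table with decimal commas converted to points.
row1_predictions = [-4.626, -2.932, -3.929, -4.121]


def summarize(predictions):
    """Return (mean, sample standard deviation), rounded to three decimals."""
    return round(mean(predictions), 3), round(stdev(predictions), 3)


avg4, sd4 = summarize(row1_predictions)
print(f"average 4 = {avg4:.3f}, SD = {sd4:.3f}")
```

For compound 1 this agrees with the tabulated values of -3,902 and 0,710. The "consensus" columns follow a different rule that cannot be recovered from the table alone.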
Columns: id | SMILES | split CP-ANN | split Qsarins | Solubility class | Experimental logS | Q-D | HAT i/i (h*=0,0293) | NN-D | dist (EDcrit=4,195) | Q-A | HAT i/i (h*=0,0224) | NN-A | dist (EDcrit=6,783) | consensus 4 | average 4 | SD (<0.5) | consensus NN | average NN | SD (<0.5) | VEGA | pkCSM | Silicos-IT | Ali | ESOL | average-6
1 FC1CCC(CC1)C(=O)C2CCC(F)CC2 TR TR insoluble -4,397 -4,626 0,006
-2,932 0,327 -3,929 0,005 -4,121 0,097 -4,228 -3,902 0,710 -3,848
-3,526 0,841 -3,720 -2,944 -2,81 -3,14 -3,1 -3,26
2 O=C(OCCCOCCCOC(=O)C1CCCCC1)C2CCCCC2 TR TR insoluble -4,596 -4,494
0,004 -4,501 0,367 -4,366 0,010 -4,849 0,327 -4,458 -4,552 0,207
-4,685 -4,675 0,246 -4,590 -3,639 -4,26 -5,77 -4,23 -4,53
3
CC1CCC(O)C(C)C1,CC2CCC(C)C(O)C2,CC3CC(C)CC(O)C3,CC4CCC(O)CC4C,CC5CCCC(O)C5C,CC6CCCC(C)C6O TR
TR soluble -1,980 -2,959 0,127 -3,455 3,548 -5,754 0,086 -1,980
0,000 -2,031 -3,537 1,600 -1,981 -2,718 1,043 / -5,380 -1,01 -15,58
-12,79 -7,35
4 CC(C)=CCC\C(C)=C/CO TR TR moderately soluble -2,321 -2,354 0,003
-2,760 0,404 -1,943 0,002 -2,676 0,087 -2,126 -2,433 0,370 -2,691
-2,718 0,059 -2,300 -2,817 -1,84 -3,67 -2,78 -2,68
5 CC(OC1CCC(Cl)CC1C)C(O)=O TR TR moderately soluble -2,466 -3,150
0,001 -2,953 0,228 -3,006 0,001 -2,608 0,104 -3,065 -2,929 0,230
-2,716 -2,780 0,244 -2,290 -1,585 -1,42 -3,06 -2,55 -2,27
6 CC(C)=C1CCC(=CC1)C TR TR insoluble -4,287 -2,752 0,004 -4,123
0,315 -2,695 0,004 -4,926 0,151 -2,762 -3,624 1,091 -4,666 -4,525
0,568 -4,420 -3,625 -2,46 -4,19 -3,5 -3,81
7 CCCCCCCC\C=C/CCCCCCCC(O)=O,NCCNCCO TR TR insoluble -4,809 -3,918
0,012 -3,771 0,651 -3,147 0,013 -4,851 0,115 -3,620 -3,922 0,704
-4,689 -4,311 0,764 / -2,891 -5,39 -5,22 -3,21 -4,28
8 CCCCCCCCCCCCOC(=O)C(C)O TR TR insoluble -4,634 -3,614 0,005
-4,116 0,226 -2,806 0,005 -3,883 0,303 -3,230 -3,605 0,571 -4,017
-3,999 0,165 -4,010 -4,576 -4,27 -6,18 -4,02 -4,51
9 C=CC(=O)OCCOC1CCCCC1 TR TR moderately soluble -2,564 -2,617 0,002
-3,289 0,421 -2,812 0,003 -2,306 0,080 -2,693 -2,756 0,412 -2,463
-2,797 0,695 -2,410 -1,768 -2,01 -2,76 -2,17 -2,26
10 OC(=O)CCCCCCCCC=C TR TR moderately soluble -3,553 -3,063 0,005
-3,671 0,326 -3,078 0,005 -3,686 0,078 -3,093 -3,374 0,351 -3,683
-3,678 0,010 -3,190 -3,318 -2,94 -4,34 -2,82 -3,38
11 CC1CCC(CC1)C(C)(C)C TR TR insoluble -4,472 -3,304 0,004 -4,225
0,196 -2,284 0,004 -4,739 0,087 -2,827 -3,638 1,080 -4,581 -4,482
0,364 -5,040 -5,636 -2,67 -4,68 -3,84 -4,41
12 OC(=O)C1CCCCC1 TR TR soluble -1,445 -1,784 0,003 -1,102 0,182
-2,137 0,002 -0,708 0,166 -1,979 -1,433 0,647 -0,896 -0,905 0,279
-1,360 -1,481 -0,65 -2,37 -1,8 -1,43
13 CCCN(CCC)C(=O)SCC TS TR moderately soluble -2,703 -3,223 0,004
-2,233 0,700 -2,612 0,002 -2,454 0,236 -2,816 -2,631 0,424 -2,398
-2,344 0,156 -2,590 -2,340 -2,46 -3,84 -2,57 -2,70
14 CC(CC1=CC(=CC(=O)N1O)C)CC(C)(C)C,NCCO TR TR moderately soluble
-3,906 -2,543 0,013 -3,904 0,002 -0,785 0,011 -2,775 0,220 -3,250
-2,502 1,290 -3,892 -3,339 0,798 / -2,186 -3,38 -3,23 -2,67
-3,07
15 SCCC(=O)OCC(COC(=O)CCS)(COC(=O)CCS)COC(=O)CCS TR TR insoluble
-5,123 -6,032 0,014 -5,125 0,003 -3,607 0,029 -5,123 0,000 -5,131
-4,971 1,006 -5,124 -5,124 0,001 -2,410 -3,083 -3,82 -5,84 -2,05
-3,72
16 CCOC(=O)CC(C)CC(C)(C)C TR TR moderately soluble -3,823 -2,722
0,004 -3,598 0,612 -1,825 0,003 -3,686 0,099 -2,248 -2,958 0,872
-3,674 -3,642 0,062 -3,480 -3,023 -2,82 -3,71 -2,79 -3,25
17
C[C@]12C[C@H](O)[C@H]3[C@@H](CCC4=CC(=O)C=C[C@]34C)[C@@H]1CC[C@]2(O)C(=O)CO
TR TR moderately soluble -3,178 -4,540 0,006 -3,316 0,197 -2,546
0,013 -3,591 0,247 -3,913 -3,498 0,823 -3,438 -3,454 0,195 -3,390
-3,416 -2,17 -3,22 -2,96 -3,10
18 CCCCOCCOCCOCC1CC2OCOC2CC1CCC TS TR insoluble -4,151 -3,753 0,006
-4,501 0,518 -3,922 0,008 -5,029 0,643 -3,841 -4,301 0,581 -4,737
-4,765 0,373 -2,870 -3,457 -4,39 -4,01 -3,23 -3,78
19 CC1=CC[C@H]2C[C@@H]1C2(C)C TR TR insoluble -4,773 -2,677 0,007
-2,952 0,424 -3,417 0,010 -2,982 0,621 -2,987 -3,007 0,306 -2,964
-2,967 0,021 -3,860 -3,733 -2,23 -4,2 -3,51 -3,42
20 C[Si](C)(C)O[Si](C)(C)C TR TR insoluble -4,910 -5,422 0,022
-5,174 3,117 -2,258 0,004 -2,818 0,326 -2,707 -3,918 1,613 -3,041
-3,996 1,667 -4,180 -1,963 -2,14 -2,78 -2,55 -2,78
21 CCCCCCCC\C=C/CCCCCCCC(O)=O,NCCNCCN TR TR insoluble -4,410 -4,130
0,012 -3,771 0,631 -3,134 0,013 -4,851 0,132 -3,713 -3,971 0,716
-4,663 -4,311 0,764 / -2,891 -5,39 -5,08 -3,04 -4,21
22 CCCCOC(=O)C1CCCCC1C(=O)OCCCC TR TR insoluble -4,388 -3,580 0,003
-4,763 0,585 -3,567 0,003 -4,070 0,128 -3,583 -3,995 0,563 -4,194
-4,416 0,490 -4,540 -3,520 -3,66 -4,83 -3,48 -4,04
23 CC(C)CC(C)CC(C)\C=C(/C)C1CC(=O)OC1=O TR TR insoluble -4,426
-3,872 0,004 -4,425 0,001 -2,997 0,003 -4,907 0,266 -4,107 -4,050
0,820 -4,426 -4,666 0,341 -4,200 -4,466 -3,26 -5,31 -4,04
-4,28
24 CCCCCCCCCCCCCCCC(=O)NCC(O)=O TR TR insoluble -5,593 -4,434 0,007
-5,189 0,450 -3,705 0,010 -5,251 0,397 -4,144 -4,645 0,728 -5,222
-5,220 0,044 -3,930 -4,277 -5,7 -7,65 -4,73 -5,25
25 C\1C\C=C/CC\C=C/CC/C=C1 TR TR insoluble -5,365 -3,605 0,005
-3,706 0,640 -3,285 0,008 -4,926 0,243 -3,499 -3,881 0,720 -4,590
-4,316 0,863 -5,420 -3,789 -1,81 -5,26 -4,31 -4,20
26 C1CCC(CC1)C2CCC(CC2)C3CCCCC3 TR TR insoluble -6,183 -5,634 0,006
-6,200 0,485 -4,990 0,010 -5,474 0,242 -5,396 -5,575 0,499 -5,716
-5,837 0,513 -8,440 -6,661 -3,96 -8,02 -6,39 -6,53
27 C1CC2CCC3CCCC4CCC(C1)C2C34 TS TR insoluble -6,179 -5,078 0,008
-7,345 0,741 -4,586 0,012 -6,669 0,599 -4,908 -5,920 1,301 -6,971
-7,007 0,478 -7,240 -5,829 -2,94 -6,07 -5,15 -5,70
28 CC(C)CCCCCCCOC(=O)CCCCCCCCC(=O)OCCCCCCCC(C)C TR TR insoluble
-7,252 -5,868 0,012 -5,770 0,459 -5,789 0,025 -6,486 0,388 -5,854
-5,978 0,341 -6,158 -6,128 0,507 -7,000 -5,331 -9,65 -12,26 -8,09
-8,08
29 CCCCN(CCCC)C1CCC2C(OC3CC(C)C(NC4CCCCC4)CC3C25OC(=O)C6CCCCC56)C1
TR TR insoluble -7,404 -7,395 0,014 -7,796 1,461 -7,773 0,020
-7,404 0,000 -7,421 -7,592 0,222 -7,404 -7,600 0,277 -5,920 -4,369
-6,88 -8,94 -7,75 -6,88
30
CCCCC(CC)COC(=O)C1CCC(NC2nC(NC3CCC(CC3)C(=O)NC(C)(C)C)nC(NC4CCC(CC4)C(=O)OCC(CC)CCCC)n2)CC1 TR
TR insoluble -8,185 -6,432 0,022 -8,200 0,708 -8,554 0,025 -6,950
0,925 -7,433 -7,534 1,007 -7,658 -7,575 0,884 -2,620 -2,962 -9,65
-10,84 -8,27 -7,00
31 CC1CCCC(C)C1OCC(O)=O TR TR moderately soluble -2,257 -2,430
0,001 -2,371 0,181 -2,336 0,001 -2,385 0,083 -2,381 -2,380 0,039
-2,380 -2,378 0,009 -1,920 -0,975 -1,15 -2,98 -2,28 -1,95
32 NC1CC(CCC1O)[N+]([O-])=O V V moderately soluble -2,206 -1,524
0,002 -2,595 0,473 -2,432 0,003 -1,136 0,241 -1,946 -1,922 0,705
-1,628 -1,866 1,032 -1,520 -0,433 0,71 -0,87 -0,4 -0,69
33 CCCCCCCl TR TR moderately soluble -3,122 -1,965 0,006 -2,874
0,364 -2,411 0,003 -2,837 0,112 -2,266 -2,522 0,426 -2,846 -2,855
0,026 -3,170 -3,454 -2,86 -3,32 -2,61 -3,04
34 CCCCCCC(C)=O TR TR moderately soluble -2,154 -2,124 0,004 -1,821
0,281 -2,128 0,003 -2,192 0,113 -2,126 -2,066 0,167 -2,085 -2,006
0,262 -1,600 -2,778 -2,59 -2,37 -1,8 -2,20
35 CCS TR TR soluble -0,846 -1,198 0,026 -0,741 0,960 -1,284 0,004
-0,749 0,112 -1,254 -0,993 0,288 -0,749 -0,745 0,006 -0,080 -0,500
-0,61 -1,28 -0,78 -0,67
36 CCCCBr TR TR moderately soluble -2,198 -1,387 0,011 -1,763 0,891
-2,143 0,004 -2,052 0,091 -1,945 -1,836 0,341 -2,025 -1,907 0,204
-1,940 -2,296 -2,3 -2,4 -2,29 -2,21
37 [O-][N+](=O)C1CC(Cl)CCC1Cl TR TR moderately soluble -3,364
-2,949 0,002 -2,776 0,279 -4,110 0,003 -2,284 0,132 -3,427 -3,030
0,773 -2,443 -2,530 0,348 -2,520 -3,027 -1,47 -2,82 -2,4
-2,45
38 CC(C)(C1CCC(O)CC1)C2CCC(O)CC2,OC(=O)C=C,ClCC3CO3 TR TR
moderately soluble -3,680 -3,130 0,040 -3,679 0,002 -4,289 0,007
-3,681 0,000 -3,720 -3,694 0,473 -3,680 -3,680 0,001 / -3,650 -1,97
-5,59 -4,6 -3,90
39 CCCCCOC(=O)CC TR TR moderately soluble -2,251 -1,607 0,003
-1,896 0,199 -1,956 0,002 -2,818 0,084 -1,832 -2,069 0,522 -2,543
-2,357 0,651 -2,070 -1,705 -2,34 -2,98 -2,08 -2,29
40 FC(F)=C(F)Cl TR TR moderately soluble -2,161 -0,886 0,022 -2,159
0,002 -2,876 0,006 -1,938 0,283 -2,224 -1,965 0,823 -2,158 -2,049
0,156 -1,200 -1,639 -1,37 -1,47 -1,73 -1,59
41 NCCCCCCCCCCC(O)=O TR TR moderately soluble -2,401 -3,067 0,005
-3,671 0,453 -2,410 0,005 -2,454 0,124 -2,715 -2,900 0,595 -2,715
-3,063 0,861 -2,310 -2,842 -2,95 0,19 0,22 -1,73
42 C\C=C1/CC2CC1C=C2 TR TR moderately soluble -3,177 -2,338 0,007
-3,177 0,001 -3,082 0,006 -2,982 0,304 -3,047 -2,895 0,379 -3,176
-3,079 0,138 -3,410 -3,629 -1,26 -3,52 -2,99 -3,00
43 CCO TR TR highly soluble 1,234 1,124 0,028 1,232 0,002 -0,471
0,004 0,971 0,139 0,680 0,714 0,797 1,228 1,101 0,185 0,430 0,782
0,13 0,12 -0,07 0,44
44 COC1CCC(OC)CC1 TR TR moderately soluble -2,246 -1,259 0,004
-1,330 0,521 -2,320 0,002 -1,429 0,052 -1,974 -1,585 0,495 -1,420
-1,380 0,070 -0,850 -1,093 -1,32 -1,26 -1,4 -1,22
45 CN=C=S TR TR soluble -0,983 -0,292 0,011 -1,478 1,006 -1,904
0,003 -0,042 0,148 -1,547 -0,929 0,903 -0,225 -0,760 1,016 -0,300
-0,526 -0,33 -1,46 -0,89 -0,62
46 CC(C)CCCCCCCO TR TR moderately soluble -3,324 -2,648 0,005
-3,324 0,001 -2,020 0,004 -2,676 0,153 -3,132 -2,667 0,532 -3,322
-3,000 0,458 -2,790 -3,820 -2,95 -3,91 -2,75 -3,26
47 CCC(=O)OCC1CCCCC1 TR TR moderately soluble -2,345 -2,655 0,002
-2,080 0,339 -2,597 0,002 -2,445 0,115 -2,621 -2,444 0,259 -2,353
-2,263 0,259 -2,940 -2,741 -2,2 -3,18 -2,5 -2,65
48 CCN(CC)CC TR TR soluble -0,138 -0,565 0,016 -0,637 0,526 -1,132
0,003 -0,618 0,096 -1,030 -0,738 0,264 -0,621 -0,628 0,013 -0,770
-0,714 -1,51 -1,12 -1,18 -0,99
49 CSC TS TR soluble -0,931 -1,596 0,033 -0,741 3,047 -1,294 0,004
-0,749 0,107 -1,308 -1,095 0,422 -0,749 -0,745 0,006 -1,120 -0,697
-0,6 -1 -0,78 -0,82
50 ClC1CCCC(n1)C(Cl)(Cl)Cl TR TR moderately soluble -3,506 -4,268
0,005 -3,989 0,493 -4,455 0,005 -3,969 0,141 -4,354 -4,170 0,234
-3,973 -3,979 0,014 -3,330 -2,719 -3,38 -3,24 -3,33 -3,33
51 C\C=C/C=C,C/C=C/C=C,C1CC=CC1 TR TR moderately soluble -2,454
-1,852 0,050 -2,876 2,890 -3,782 0,008 -5,005 0,348 -3,545 -3,379
1,341 -4,776 -3,940 1,506 / -5,931 -0,65 -5,86 -4,81 -4,41
52 CCCCC TR TR moderately soluble -3,007 -0,626 0,013 -3,491 0,563
-1,660 0,004 -2,648 0,080 -1,472 -2,106 1,238 -2,752 -3,070 0,596
-2,890 -2,288 -1,73 -3,07 -2,29 -2,50
53 CCCCOC(C)=O TR TR soluble -1,220 -0,911 0,003 -1,296 0,177
-1,523 0,002 -0,826 0,108 -1,306 -1,139 0,327 -1,004 -1,061 0,332
-0,870 -0,651 -1,5 -1,95 -1,42 -1,23
54 NC1CCC(C2CCC(N)CC2[S](O)(=O)=O)C(C1)[S](O)(=O)=O TR TR
moderately soluble -2,692 -3,993 0,008 -2,695 0,001 -2,419 0,010
-3,344 0,342 -2,826 -3,113 0,703 -2,698 -3,020 0,459 -1,290 -2,286
0,23 2,92 1,91 -0,20
55 CC(C)=CCC\C(C)=C\CO TS TR moderately soluble -2,352 -2,354 0,003
-2,760 0,404 -1,943 0,002 -2,676 0,087 -2,126 -2,433 0,370 -2,691
-2,718 0,059 -2,300 -2,817 -1,84 -3,67 -2,78 -2,68
56 CCCCOCCO[P](=O)(OCCOCCCC)OCCOCCCC TR TR moderately soluble
-2,779 -3,628 0,010 -4,108 0,665 -4,223 0,020 -3,310 0,282 -3,812
-3,817 0,425 -3,548 -3,709 0,564 -3,240 -3,491 -5,5 -5,17 -3,29
-4,04
57 OC1CCCC2CCCnC12 TR TR moderately soluble -2,350 -2,465 0,004
-2,310 0,497 -2,771 0,003 -2,127 0,108 -2,643 -2,418 0,273 -2,160
-2,218 0,129 -1,330 -0,552 -1,15 -0,85 -1,33 -1,23
58 C\C=C\C TR TR soluble -1,940 -0,572 0,013 -1,478 0,939 -1,471
0,003 -1,909 0,071 -1,323 -1,357 0,562 -1,879 -1,693 0,305 -2,190
-1,460 -0,56 -1,97 -1,66 -1,62
59 CC(=O)C1CCCCC1 TR TR soluble -1,280 -2,232 0,003 -1,413 0,277
-2,436 0,002 -1,255 0,073 -2,339 -1,834 0,587 -1,288 -1,334 0,112
-1,640 -2,318 -1,61 -2,02 -1,84 -1,79
60 CCCCOC(=O)CCC TR TR moderately soluble -2,361 -1,607 0,003
-1,896 0,158 -1,956 0,002 -2,818 0,084 -1,832 -2,069 0,522 -2,496
-2,357 0,651 -2,060 -1,573 -2,34 -2,37 -1,71 -2,09
61 OC(=O)C1CCCCC1C(O)=O TR TR soluble -1,425 -1,908 0,002 -2,595
0,170 -2,551 0,003 -2,074 0,112 -2,189 -2,282 0,343 -2,280 -2,335
0,369 -0,770 -1,150 0,17 -2,01 -1,32 -1,23
62 CC1CCCC(C)C1O TR TR soluble -1,292 -1,781 0,004 -1,413 0,443
-2,060 0,002 -1,255 0,051 -1,945 -1,627 0,363 -1,271 -1,334 0,112
-0,940 -1,337 -1,01 -2,16 -1,96 -1,45
63 NC1CCC(Cl)CC1Cl TR TR moderately soluble -2,417 -2,899 0,003
-2,012 0,173 -3,276 0,002 -2,328 0,061 -3,102 -2,629 0,566 -2,245
-2,170 0,223 -1,430 -1,454 -1,72 -1,66 -1,83 -1,72
64 CC/C=C/CC/C=C/C=C V V moderately soluble -3,342 -2,740 0,005
-3,110 1,039 -2,932 0,004 -4,766 0,136 -2,881 -3,387 0,932 -4,574
-3,938 1,171 -4,940 -5,029 -2,08 -3,44 -2,72 -3,80
65 CCCCCCCCCC(O)=O TR TR moderately soluble -3,445 -2,765 0,005
-3,671 0,354 -2,721 0,004 -3,686 0,110 -2,764 -3,211 0,540 -3,682
-3,678 0,010 -2,990 -3,092 -2,87 -4,58 -2,96 -3,36
66 CC(C)C TR TR moderately soluble -2,978 -1,034 0,018 -0,781 0,834
-1,199 0,004 -1,909 0,104 -1,190 -1,231 0,484 -1,784 -1,345 0,798
-2,530 -1,871 -0,91 -1,72 -1,52 -1,72
67 CCCC TS TR moderately soluble -2,978 -0,391 0,016 -1,763 2,139
-1,431 0,004 -1,909 0,152 -1,233 -1,373 0,685 -1,899 -1,836 0,103
-2,500 -1,651 -1,28 -2,55 -1,96 -1,97
68 OC(=O)CC1CCC(O)CC1 TR TR soluble -0,399 -2,128 0,002 -1,085
0,202 -2,109 0,002 -1,161 0,057 -2,099 -1,621 0,576 -1,144 -1,123
0,053 -0,960 -0,408 -0,28 -1,35 -1,05 -0,87
69 COC1CCC(COC(C)=O)CC1 TR TR moderately soluble -2,474 -2,297
0,002 -2,080 0,441 -2,391 0,002 -2,385 0,047 -2,355 -2,288 0,145
-2,355 -2,232 0,216 -1,430 -1,321 -1,7 -1,84 -1,67 -1,72
70 NC1nC(N)nC(n1)C2CCCCC2 TR TR moderately soluble -2,494 -2,931
0,003 -2,449 0,409 -2,775 0,004 -2,228 0,275 -2,858 -2,595 0,317
-2,316 -2,338 0,157 -2,150 -2,951 -0,9 -0,77 -0,92 -1,67
71 CC(=C)C(=O)OC1CC2CCC1(C)C2(C)C TR TR insoluble -4,544 -3,630
0,004 -4,011 0,398 -2,982 0,004 -2,920 0,302 -3,315 -3,386 0,526
-3,391 -3,466 0,772 -3,690 -3,653 -3,03 -5,56 -4,33 -3,94
72 OC1CCCCC1C(=O)OC2CCCCC2 TR TR insoluble -4,554 -3,550 0,002
-3,538 0,292 -3,258 0,002 -2,521 0,304 -3,398 -3,217 0,483 -3,040
-3,030 0,719 -2,830 -2,227 -1,64 -3,86 -3,07 -2,78
73 CC(=C)[C@@H]1CCC(=CC1)C TS TR insoluble -4,347 -2,825 0,003
-4,123 0,350 -2,749 0,004 -4,926 0,151 -2,819 -3,656 1,056 -4,684
-4,525 0,568 -4,250 -3,568 -2,26 -4,29 -3,5 -3,76
74 CC\C=C/CCOC(=O)C\C=C/CC TR TR insoluble -4,510 -2,863 0,004
-3,671 0,557 -2,810 0,003 -4,057 0,067 -2,866 -3,350 0,614 -4,015
-3,864 0,273 -3,380 -3,259 -2,55 -3,53 -2,61 -3,22
75 CCCCCCCCCCCC(=O)OCC(O)CO TR TR insoluble -4,660 -3,332 0,005
-4,116 0,297 -2,293 0,006 -3,883 0,178 -2,877 -3,406 0,812 -3,970
-3,999 0,165 -3,150 -3,540 -3,71 -5,27 -3,24 -3,81
76 COC1CCC(C(O)C1)C(=O)C2CCCCC2 TR TR insoluble -4,580 -3,580 0,002
-3,532 0,149 -3,302 0,003 -3,892 0,043 -3,471 -3,577 0,243 -3,811
-3,712 0,254 -1,770 -2,525 -1,8 -3,32 -2,83 -2,68
77 CCOC(=O)C1CCCCC1 TR TR moderately soluble -2,400 -2,209 0,002
-2,080 0,419 -2,470 0,002 -2,118 0,047 -2,339 -2,219 0,176 -2,114
-2,099 0,027 -2,610 -2,164 -1,78 -2,76 -2,22 -2,27
78 OC(=O)CCC1CCCCC1 TR TR soluble -1,406 -2,683 0,003 -2,080 0,352
-2,466 0,002 -2,118 0,048 -2,538 -2,337 0,289 -2,113 -2,099 0,027
-2,330 -2,367 -1,49 -3,26 -2,39 -2,33
79 CC(C)=CCC\C(C)=C\C=O TR TR moderately soluble -2,412 -2,577
0,004 -2,760 0,258 -2,397 0,003 -2,676 0,124 -2,474 -2,602 0,156
-2,703 -2,718 0,059 -3,190 -3,326 -1,96 -3,05 -2,43 -2,78
80 CNCCCN1C2CCCCC2CCC3CCCCC13 TR TR moderately soluble -3,658
-5,223 0,005 -3,612 0,257 -3,848 0,004 -3,802 0,128 -4,493 -4,121
0,741 -3,739 -3,707 0,134 -3,950 -3,168 -3,83 -4,13 -3,89
-3,78
81 CC1CCCCC1 TR TR moderately soluble -2,206 -1,753 0,007 -1,599
0,458 -2,224 0,002 -2,731 0,228 -2,116 -2,077 0,511 -2,355 -2,165
0,800 -3,670 -3,478 -1,62 -3,3 -2,72 -2,86
82 OC(=O)CCCCC(O)=O,C1CNCCN1 TR TR soluble -0,623 -1,490 0,014
-0,624 0,001 -1,333 0,004 -0,623 0,000 -0,734 -1,018 0,459 -0,623
-0,624 0,000 / -0,815 -0,23 3,92 2,41 0,93
83 NC(N)=O TR TR highly soluble 0,958 0,299 0,021 0,957 0,002
-0,549 0,003 -0,618 0,050 0,283 0,022 0,750 0,896 0,169 1,113 0,300
1,140 1,07 0,48 0,69 0,76
84 NC1CCCC(Cl)C1 TR TR soluble -1,373 -2,182 0,004 -0,886 0,233
-2,670 0,002 -1,515 0,088 -2,504 -1,813 0,779 -1,343 -1,200 0,444
-1,070 -0,933 -1,31 -1,35 -1,43 -1,24
85 CCCCCCCCCCCCCCC(C)C1CC(C)CC(C)C1O TR TR insoluble -7,336 -5,972
0,009 -7,804 0,524 -4,983 0,010 -5,654 0,428 -5,523 -6,103 1,207
-6,621 -6,729 1,520 -5,630 -8,064 -6,87 -10,67 -7,6 -7,58
86 C1CC=CC1 TR TR moderately soluble -2,105 -0,510 0,015 -1,777
1,045 -2,247 0,004 -2,304 0,147 -1,909 -1,710 0,834 -2,240 -2,041
0,373 -2,140 -1,474 -0,65 -1,54 -1,47 -1,59
87 CC(C)C1CCC(C)CC1O TR TR moderately soluble -2,488 -2,660 0,002
-3,080 0,138 -1,672 0,004 -2,199 0,079 -2,301 -2,402 0,606 -2,520
-2,639 0,623 -2,460 -2,411 -1,48 -3,5 -2,88 -2,54
88 CNC1CCC(O)CC1,CNC2CCC(O)CC2,O[S](O)(=O)=O TR TR soluble -0,838
-1,455 0,045 -0,839 0,004 -2,661 0,015 -0,838 0,000 -0,943 -1,448
0,859 -0,838 -0,839 0,001 / -1,533 -0,91 0,22 -0,18 -0,65
89 CC1CC(C)CC(O)C1 TR TR soluble -1,388 -1,950 0,003 -1,413 0,318
-2,139 0,002 -1,255 0,037 -2,023 -1,689 0,422 -1,271 -1,334 0,112
-1,320 -1,353 -1,01 -2,29 -2,04 -1,55
90 CC(C)CC(=O)CC(C)C TS TR moderately soluble -2,454 -2,466 0,004
-2,002 0,719 -1,884 0,002 -2,192 0,163 -2,088 -2,136 0,254 -2,157
-2,097 0,134 -2,180 -2,946 -2,26 -2,48 -2,02 -2,34
91 NC1CCC2C(O)CC(CC2C1)[S](O)(=O)=O TR TR moderately soluble -2,680
-3,341 0,003 -4,011 0,695 -2,251 0,006 -3,133 0,246 -3,030 -3,184
0,726 -3,363 -3,572 0,621 -2,120 -0,949 0,07 1,01 0,41 -0,82
92
COC1CCC(CC1O)[C@@H]2CC(=O)C3C(O)CC(O[C@@H]4O[C@H](CO[C@@H]5O[C@@H](C)[C@H](O)[C@@H](O)[C@H]5O)[C@@H](O)[C@H](O)[C@H]4O)CC3O2 TR
TR insoluble -5,011 -3,210 0,023 -3,611 0,609 -1,326 0,018 -3,976
0,616 -2,206 -3,031 1,179 -3,793 -3,793 0,258 -0,190 -1,414 2,75
-0,45 -0,79 -0,65
93 CC1(C)[C@H]2CC[C@]1(C)C(C2)OC(=O)C=C TR TR insoluble -4,583
-3,337 0,004 -4,011 0,324 -2,954 0,004 -2,920 0,238 -3,151 -3,306
0,507 -3,383 -3,466 0,772 -3,160 -3,187 -2,64 -5,18 -4,01
-3,59
94 CC(C)[C@@H]1CC[C@@H](C)CC1=O V V moderately soluble -2,492
-2,747 0,003 -3,080 0,265 -2,106 0,003 -2,199 0,048 -2,451 -2,533
0,461 -2,333 -2,639 0,623 -2,370 -2,871 -2,17 -3,07 -2,65
-2,58
95 CC1CC(C)CC(C)C1 TS TR moderately soluble -3,383 -2,436 0,004
-2,012 0,578 -2,272 0,003 -2,731 0,142 -2,340 -2,363 0,301 -2,590
-2,372 0,508 -4,360 -3,972 -1,98 -3,71 -3,15 -3,29
96 CC(=O)OC(C)(C)CCCC(=C)C=C TS TR moderately soluble -3,597 -3,018
0,005 -3,598 0,799 -2,600 0,003 -4,057 0,134 -2,775 -3,318 0,640
-3,991 -3,827 0,324 -3,070 -3,193 -2,89 -3,83 -2,86 -3,31
97 CCCN(CCC)CCC TR TR moderately soluble -2,472 -2,071 0,006 -4,501
0,486 -1,805 0,003 -2,192 0,148 -1,917 -2,642 1,250 -2,730 -3,346
1,633 -1,950 -1,894 -2,78 -2,51 -2,09 -2,33
98 CCCCCC(CO)CCC TS TR moderately soluble -3,518 -2,361 0,005
-3,324 0,421 -2,015 0,004 -2,676 0,148 -2,183 -2,594 0,557 -2,845
-3,000 0,458 -2,940 -2,765 -2,95 -3,87 -2,72 -3,01
99 FC(F)C(F)(F)F TR TR soluble -1,417 -1,096 0,038 -2,539 0,820
-2,232 0,014 -2,778 0,318 -1,954 -2,161 0,745 -2,711 -2,658 0,169
-2,060 -1,458 -1,32 -1,82 -1,9 -1,88
100 CC1CCCC(O)C1C TS TR soluble -1,424 -1,781 0,004 -1,413 0,444
-2,060 0,002 -1,255 0,051 -1,945 -1,627 0,363 -1,271 -1,334 0,112
-1,020 -1,117 -1,01 -2,29 -2,04 -1,46
101 CC(=O)OC1CCC(CC1)[N+]([O-])=O TR TR moderately soluble -2,599
-2,155 0,002 -2,583 0,341 -2,964 0,003 -2,124 0,154 -2,497 -2,457
0,398 -2,267 -2,354 0,324 -2,150 1,765 -0,55 -2,33 -1,57
-1,18
102 CC(C)COC(=O)C1CC(N)C(Cl)C(N)C1 TR TR moderately soluble -2,738
-3,518 0,001 -2,739 0,001 -2,808 0,002 -3,515 0,152 -2,950 -3,145
0,430 -2,742 -3,127 0,549 -1,980 -1,380 -1,45 -2,04 -1,63
-1,87
103 ClC1nC(Cl)nC(Cl)n1 TR TR moderately soluble -2,622 -2,175 0,007
-2,621 0,001 -3,962 0,005 -2,622 0,000 -2,722 -2,845 0,774 -2,622
-2,622 0,001 -1,740 -1,471 -2,02 -1,12 -1,81 -1,80
104 CC12CCC3C(CCC4=CC(=O)C=CC34C)C1CCC2=O TR TR moderately soluble
-2,821 -5,076 0,006 -3,819 0,256 -4,197 0,015 -3,858 0,182 -4,804
-4,237 0,584 -3,842 -3,839 0,028 -4,570 -4,817 -3,71 -3,34 -3,47
-3,96
105 CC[N+](CC)(CC(=O)NC1C(C)CCCC1C)CC2CCCCC2,[O-]C(=O)C3CCCCC3 TR
TR soluble -1,021 -5,196 0,009 -1,026 0,003 -3,610 0,011 -1,021
0,000 -1,496 -2,713 2,056 -1,022 -1,023 0,003 / -3,919 -5,48 -8,56
-6,71 -5,14
106 CCCCCCO TR TR soluble -1,383 -0,897 0,006 -0,759 0,554 -1,391
0,003 -0,618 0,072 -1,205 -0,916 0,336 -0,635 -0,689 0,099 -0,410
-1,286 -1,64 -2,08 -1,49 -1,26
107 CC(C)[C@@H]1CC[C@@H](C)C[C@H]1O TS TR moderately soluble -2,571
-2,660 0,002 -3,080 0,138 -1,672 0,004 -2,199 0,079 -2,301 -2,402
0,606 -2,520 -2,639 0,623 -2,460 -2,411 -1,48 -3,5 -2,88
-2,54
108 COC(C)(C)C TR TR soluble -0,324 -0,424 0,007 -0,091 0,328
-0,960 0,004 -0,618 0,096 -0,746 -0,523 0,363 -0,499 -0,355 0,373
0,090 -1,191 -1,12 -0,72 -0,91 -0,73
109
CCC1CCCCC1,CCC2CC(CC)C(CC)CC2CC,CCC(C)(C3CCCCC3)C4CCCCC4,CC5CCCCC5,CC6CCCCC6C7CCCCC7,CC8CCCC9CCCCC89,C=CC%10CCCCC%10,C%11CCC(CC%11)C%12CCCCC%12,C(CC%13CCCCC%13)C%14CCCCC%14,C%15CCCCC%15,C%16CCC(CC%16)C%17CCCCC%17 TR
TR insoluble -4,588 -5,844 0,430 -4,602 0,617 -17,563 0,222 -4,922
5,174 -11,700 -8,233 6,242 -4,636 -4,762 0,226 / -2,892 / / /
-3,76
110 COC(=O)C1CCCCC1C(=O)OC TR TR soluble -1,686 -1,848 0,003 -2,371
0,520 -2,307 0,003 -0,922 0,146 -2,051 -1,862 0,668 -1,240 -1,647
1,025 -2,060 -1,531 -1,24 -2,22 -1,77 -1,68
111 CCOC(=O)CC(=O)C(F)(F)F TS TR soluble -1,663 -2,197 0,011 -0,544
0,704 -1,968 0,006 -1,433 0,431 -2,036 -1,535 0,734 -1,096 -0,988
0,629 -1,290 -0,820 -1,6 -1,88 -1,51 -1,37
112 [O-][N+](=O)OCCOCCO[N+]([O-])=O TR TR soluble -1,691 0,010
0,012 -1,606 0,658 -2,325 0,009 -2,537 0,300 -1,335 -1,614 1,154
-2,246 -2,071 0,658 -1,810 -1,296 0,51 -3,07 -1,15 -1,51
113 CCC(C)(C)C1CCCCC1 V V insoluble -4,501 -3,131 0,004 -3,080
0,484 -2,621 0,002 -4,739 0,086 -2,854 -3,393 0,926 -4,489 -3,909
1,173 -5,480 -5,324 -2,92 -5,34 -4,18 -4,62
114 CC(CCC=O)C1C(=CCCC1(C)C)C TR TR insoluble -4,653 -3,666 0,004
-4,065 0,396 -2,890 0,003 -5,005 0,369 -3,247 -3,907 0,880 -4,552
-4,535 0,665 -4,540 -4,367 -3,42 -3,44 -3,01 -3,89
115 CC1(C)C2CCC(C2)C1=C V V insoluble -4,472 -2,707 0,007 -2,952
0,382 -2,902 0,005 -2,982 0,302 -2,821 -2,886 0,123 -2,969 -2,967
0,021 -3,850 -4,291 -2,48 -3,93 -3,34 -3,48
116 CCCCCCCCCC(=O)OC TR TR insoluble -4,627 -2,669 0,004 -3,671
0,300 -2,603 0,003 -3,686 0,058 -2,668 -3,157 0,602 -3,683 -3,678
0,010 -4,200 -3,941 -3,58 -4,68 -3,18 -3,88
117 CCCC[Sn](=O)CCCC TR TR insoluble -4,794 -4,548 0,010 -4,794
0,004 -2,916 0,003 -2,818 0,236 -3,922 -3,769 1,047 -4,765 -3,806
1,398 -5,840 -2,922 -3,41 -3,16 -2,96 -3,84
118 COC(=O)C1CCC(C)CC1 TR TR moderately soluble -2,577 -2,082 0,002
-1,888 0,199 -2,238 0,002 -2,118 0,028 -2,155 -2,081 0,145 -2,089
-2,003 0,163 -2,690 -2,088 -1,53 -2,65 -2,22 -2,21
119 NC1CCCCC1N TR TR soluble -0,440 -1,509 0,004 -0,492 0,340
-1,803 0,002 -0,490 0,102 -1,689 -1,074 0,683 -0,491 -0,491 0,001
-0,080 0,445 -0,31 -0,31 -0,35 -0,18
120 O=C1N[S](=O)(=O)C2CCCCC12 TR TR soluble -1,672 -2,465 0,005
-3,253 0,711 -2,126 0,004 -1,258 0,146 -2,270 -2,275 0,827 -1,596
-2,255 1,411 -1,230 -1,375 -1,23 -1,38 -1,21 -1,34
121 CC(C)=CCC\C(C)=C\CCC(C)=O TR TR moderately soluble -3,697
-3,524 0,006 -3,598 0,477 -2,856 0,003 -3,697 0,000 -3,508 -3,419
0,382 -3,697 -3,648 0,070 -3,460 -4,917 -3,18 -3,75 -2,98
-3,66
122 CCCCCC(=O)OCC=C V V moderately soluble -2,611 -2,054 0,003
-1,896 0,631 -2,317 0,002 -2,818 0,142 -2,208 -2,271 0,403 -2,648
-2,357 0,651 -2,380 -2,156 -2,41 -2,87 -2,03 -2,42
123 CCCCCCCCCCO V V moderately soluble -3,631 -2,647 0,006 -3,324
0,740 -2,285 0,005 -2,676 0,209 -2,444 -2,733 0,432 -2,819 -3,000
0,458 -2,840 -3,829 -3,32 -4,72 -3,17 -3,45
124 O=C(OCCOCCOC(=O)C1CCCCC1)C2CCCCC2 TR TR moderately soluble
-3,938 -4,124 0,004 -4,501 0,506 -3,951 0,009 -3,717 0,155 -4,069
-4,073 0,330 -3,901 -4,109 0,555 -4,120 -3,445 -3,47 -5,02 -3,74
-3,95
125 ClC1CCC(Cl)C(Cl)C1 TR TR moderately soluble -3,703 -3,499 0,004
-3,863 0,150 -4,165 0,003 -4,310 0,103 -3,887 -3,959 0,359 -4,129
-4,087 0,316 -3,170 -3,888 -2,71 -2,71 -2,92 -3,25
126 COC TS TR highly soluble 0,884 1,915 0,034 1,232 1,147 -0,670
0,004 0,971 0,118 -0,349 0,862 1,096 0,995 1,101 0,185 0,010 0,313
-0,17 0,17 -0,18 0,19
127 C1CCC2CCCCC2C1 TR TR moderately soluble -3,567 -3,041 0,006
-2,285 0,658 -2,866 0,003 -2,808 0,202 -2,917 -2,750 0,325 -2,685
-2,546 0,369 -4,530 -5,073 -2,17 -4,35 -3,61 -3,74
128 C=CC(=O)OCCCCCCOC(=O)C=C TR TR moderately soluble -2,819 -2,445
0,004 -3,637 0,298 -2,758 0,006 -2,508 0,171 -2,574 -2,837 0,550
-2,919 -3,072 0,798 -2,770 -1,897 -2,59 -3,37 -2,17 -2,62
129 NC1CCC(CC1)N=NC2CCCCC2 TR TR moderately soluble -3,760 -3,944
0,003 -3,538 0,177 -4,530 0,004 -4,121 0,127 -4,187 -4,033 0,411
-3,877 -3,829 0,412 -3,510 -2,791 -2,27 -3,1 -2,51 -3,01
130 CCCCC(O)=O TR TR soluble -0,491 -0,811 0,004 -0,744 0,415
-1,637 0,002 -0,826 0,074 -1,327 -1,005 0,423 -0,814 -0,785 0,058
-0,470 -0,610 -0,78 -1,78 -1,15 -0,93
131 C\C=C\OC1CCC(CC1)O\C=C\C TR TR moderately soluble -3,904 -2,814
0,002 -3,289 0,254 -3,172 0,003 -3,469 0,107 -2,976 -3,186 0,276
-3,416 -3,379 0,128 -1,540 -2,911 -1,52 -3,28 -2,82 -2,58
132 CC(C)(C)C1CCC(O)C(C1)C(C)(C)C TR TR moderately soluble -3,796
-3,892 0,006 -4,079 0,091 -2,135 0,007 -4,608 0,062 -3,241 -3,679
1,073 -4,393 -4,344 0,374 -3,580 -3,653 -2,71 -4,56 -3,81
-3,78
133 CC(CC1CCC(CC1)C(C)(C)C)C=O TR TR moderately soluble -3,792
-3,976 0,003 -4,065 0,330 -2,486 0,003 -3,792 0,000 -3,548 -3,580
0,738 -3,793 -3,928 0,193 -4,800 -5,641 -3,09 -4,86 -3,89
-4,35
134 N#CCN(CCN(CC#N)CC#N)CC#N V V moderately soluble -2,820 -2,528
0,006 -3,637 0,997 -2,837 0,025 -2,300 0,400 -2,585 -2,825 0,584
-2,683 -2,968 0,945 -1,240 -1,736 -1,33 -1,11 -0,37 -1,41
135 CC(F)F V V soluble -1,315 0,465 0,022 0,957 2,093 -1,514 0,005
-1,165 0,228 -1,154 -0,314 1,209 -0,956 -0,104 1,500 -0,280 -0,785
-0,72 -1,06 -1,16 -0,83
136 CCC(C)(O)CCCC(C)C TR TR moderately soluble -2,694 -2,727 0,004
-2,002 0,669 -1,713 0,004 -2,676 0,092 -2,216 -2,279 0,502 -2,595
-2,339 0,477 -2,460 -2,511 -2,56 -3,37 -2,55 -2,67
137 CC(C)(N=NC(C)(C)C#N)C#N TR TR moderately soluble -2,713 -2,295
0,007 -2,713 0,002 -2,966 0,007 -2,700 0,005 -2,692 -2,668 0,278
-2,710 -2,707 0,009 -1,730 -2,211 -1,79 -2,21 -1,42 -2,01
138 COC(=O)C1CCC(CC1)C(=O)OC TS TR moderately soluble -3,797 -1,952
0,002 -2,371 0,478 -2,352 0,003 -0,922 0,175 -2,109 -1,899 0,679
-1,311 -1,647 1,025 -1,960 -1,492 -1,24 -1,85 -1,54 -1,57
139 C[C@@H](CCO)CCC=C(C)C TR TR moderately soluble -2,707 -2,395
0,003 -3,186 0,382 -1,864 0,003 -2,676 0,044 -2,130 -2,530 0,552
-2,728 -2,931 0,360 -2,430 -2,532 -2,21 -4,03 -2,94 -2,81
140 s1CCCC1 TR TR soluble -1,445 -1,379 0,016 -1,777 1,060 -2,607
0,003 -0,329 0,150 -2,399 -1,523 0,946 -0,509 -1,053 1,024 -1,230
-2,178 -1 -1,49 -1,24 -1,27
141 CC(C)(C1CCC(O)CC1)C2CCC(O)CC2 TR TR moderately soluble -2,880
-4,245 0,002 -2,960 0,254 -3,297 0,004 -2,832 0,218 -3,892 -3,334
0,639 -2,891 -2,896 0,091 -3,260 -2,691 -1,97 -3,62 -3,15
-2,93
142 CC(C)CCO TR TR soluble -0,468 -0,416 0,005 -0,637 0,590 -0,918
0,003 -0,141 0,137 -0,704 -0,528 0,330 -0,234 -0,389 0,351 0,130
-0,670 -0,84 -1,18 -0,99 -0,63
143 s1CnC2CCCCC12 TS TR soluble -1,654 -2,984 0,008 -2,286 0,761
-3,344 0,004 -2,127 0,428 -3,218 -2,685 0,576 -2,184 -2,206 0,112
-2,140 -1,381 -1,5 -1,79 -1,76 -1,79
144 OC(=O)C1CCCC(C1)[N+]([O-])=O V V soluble -1,746 -1,911 0,002
-2,595 0,280 -3,020 0,004 -2,124 0,160 -2,347 -2,413 0,495 -2,296
-2,360 0,333 -1,800 -1,302 0,16 -2,13 -1,29 -1,44
145 OC(C1CCCCC1)C(=O)C2CCCCC2 TR TR moderately soluble -2,850
-4,036 0,002 -3,532 0,328 -3,299 0,004 -2,836 0,118 -3,747 -3,426
0,499 -3,020 -3,184 0,492 -2,740 -3,631 -2,18 -4,26 -3,41
-3,21
146 CCCCCCCCO TR TR moderately soluble -2,638 -1,811 0,006 -2,356
0,195 -1,840 0,004 -2,192 0,058 -1,848 -2,049 0,268 -2,229 -2,274
0,116 -1,950 -2,556 -2,49 -3,09 -2,14 -2,41
147 COC(=O)COC1nC(F)C(Cl)C(N)C1Cl TR TR moderately soluble -3,951
-3,111 0,005 -3,950 0,001 -3,275 0,003 -2,661 0,246 -3,723 -3,249
0,535 -3,946 -3,305 0,912 -2,570 -1,469 -1,54 -1,32 -1,57
-2,07
148 CC(C)(C)C1CCCC(C1O)C(C)(C)C TR TR insoluble -4,714 -3,770 0,006
-4,079 0,124 -2,220 0,008 -4,608 0,086 -3,203 -3,669 1,027 -4,391
-4,344 0,374 -3,290 -4,044 -2,71 -4,75 -3,92 -3,85
149 CC(C1CCCCC1)C2CCCCC2 TR TR insoluble -4,693 -4,391 0,004 -5,248
0,313 -3,736 0,005 -5,253 0,154 -4,145 -4,657 0,736 -5,251 -5,251
0,003 -7,240 -6,536 -3,19 -6,3 -5,01 -5,59
150 CC1CC(C)C(C(C)C1)C(=O)[P](=O)(C2CCCCC2)C3CCCCC3 TR TR insoluble
-5,011 -6,529 0,007 -4,990 1,232 -4,371 0,005 -5,011 0,000 -5,093
-5,225 0,918 -5,011 -5,000 0,015 -5,380 -4,201 -4,58 -6,69 -5,62
-5,25
151 CCCCC(CC)COC(=O)C(C)=C TR TR insoluble -4,806 -2,764 0,003
-3,598 0,453 -2,531 0,002 -4,057 0,034 -2,687 -3,237 0,713 -4,024
-3,827 0,324 -4,300 -3,385 -3,25 -4,81 -3,4 -3,86
152 CCCCC(CC)COC(=O)CS[Sn](C)(SCC(=O)OCC(CC)CCCC)SCC(=O)OCC(CC)CCCC
TR TR insoluble -5,394 -9,178 0,058 -5,394 0,008 -6,920 0,029
-5,395 0,000 -5,428 -6,722 1,789 -5,394 -5,394 0,000 -7,210 -4,339
-9,33 -15,46 -10,16 -8,65
153 OCC1CCCC(OC2CCCCC2)C1 TR TR moderately soluble -2,826 -3,643
0,003 -3,538 0,336 -3,477 0,003 -2,836 0,082 -3,547 -3,373 0,365
-2,973 -3,187 0,497 -2,660 -2,516 -2,1 -3 -2,68 -2,65
154 O=C1CCCCCCCCCCCN1 V V moderately soluble -2,820 -3,623 0,004
-3,120 0,350 -2,131 0,009 -3,469 0,150 -3,178 -3,086 0,670 -3,364
-3,294 0,247 -2,780 -2,600 -3,17 -3,19 -2,9 -3,00
155 COC(=O)\C=C\C1CCCCC1 TR TR moderately soluble -2,734 -2,567
0,002 -2,080 0,205 -2,552 0,002 -2,118 0,159 -2,556 -2,329 0,267
-2,101 -2,099 0,027 -3,010 -2,672 -1,48 -3,33 -2,64 -2,54
156 CSC1CC(SC)C(N)C(C)C1N TR TR moderately soluble -2,857 -4,548
0,009 -1,975 0,730 -3,025 0,002 -3,181 0,098 -3,305 -3,182 1,056
-3,038 -2,578 0,853 -2,140 -1,223 -1,07 -2,57 -1,6 -1,94
157 C=CC1CCCCC1 TR TR moderately soluble -2,555 -2,137 0,006 -1,599
0,468 -2,698 0,002 -2,731 0,022 -2,543 -2,291 0,536 -2,680 -2,165
0,800 -3,730 -3,963 -1,7 -3,3 -2,73 -3,02
158 ClC1CC(Cl)C(Cl)nC1Cl TR TR moderately soluble -3,861 -3,703
0,005 -3,989 0,381 -4,830 0,005 -3,969 0,225 -4,261 -4,123 0,490
-3,976 -3,979 0,014 -2,340 -2,345 -2,87 -2,4 -2,95 -2,81
159 CC(=O)OC\C=C\C1CCCCC1 TS TR moderately soluble -2,793 -3,046
0,003 -3,289 0,412 -2,761 0,002 -2,445 0,123 -2,895 -2,885 0,364
-2,639 -2,867 0,596 -3,050 -3,185 -1,89 -3,38 -2,7 -2,81
160
C[C@]12CCC(=O)C=C1CC[C@H]3[C@@H]4CC[C@](O)(C(=O)CO)[C@@]4(C)C[C@H](O)[C@H]23
TR TR moderately soluble -3,112 -4,654 0,006 -3,316 0,147 -2,457
0,010 -3,591 0,274 -3,859 -3,505 0,906 -3,412 -3,454 0,195 -3,670
-3,511 -2,63 -3,21 -2,97 -3,23
161 CC(C)(C)C1CCC(O)CC1 TS TR moderately soluble -3,103 -2,864
0,003 -1,945 0,341 -1,669 0,006 -2,199 0,091 -2,463 -2,169 0,511
-2,145 -2,072 0,179 -2,290 -2,538 -1,71 -3,07 -2,62 -2,40
162 [O-][N+](=O)C1CCCC(Cl)C1 TR TR moderately soluble -2,761 -2,334
0,002 -2,313 0,250 -3,465 0,002 -2,868 0,027 -2,882 -2,745 0,544
-2,814 -2,590 0,393 -2,330 -2,469 -1,07 -2,51 -2 -2,20
163 CC(Cl)CCl TR TR soluble -1,622 -1,341 0,006 -1,798 0,243 -2,122
0,004 -2,337 0,152 -1,844 -1,899 0,433 -2,130 -2,067 0,382 -1,740
-1,780 -1,89 -1,61 -1,72 -1,81
164 CCNC1CCCCC1 TR TR soluble -1,652 -2,164 0,004 -1,330 0,396
-2,494 0,002 -1,255 0,099 -2,374 -1,810 0,614 -1,270 -1,292 0,053
-1,210 -1,257 -2,12 -1,85 -1,74 -1,57
165 CC(C)CCOCCC(C)C TR TR moderately soluble -3,760 -2,503 0,005
-3,186 0,537 -2,001 0,003 -2,676 0,175 -2,206 -2,591 0,489 -2,801
-2,931 0,360 -3,190 -3,739 -2,87 -4,16 -3,1 -3,31
166 CC1CCC(Cl)C(Cl)C1 TS TR moderately soluble -3,776 -3,141 0,003
-3,863 0,291 -3,486 0,002 -4,133 0,253 -3,354 -3,656 0,434 -4,008
-3,998 0,191 -3,500 -3,879 -2,47 -2,85 -2,88 -3,26
167 NC1CCCC(O)C1 TS TR soluble -0,623 -1,368 0,004 -0,492 0,173
-1,867 0,002 -0,490 0,053 -1,674 -1,054 0,681 -0,491 -0,491 0,001
0,620 0,456 -0,1 -0,46 -0,52 -0,08
168 O=C1CCCCCCCCCCC1 V V moderately soluble -3,846 -3,695 0,004
-3,120 0,360 -2,609 0,008 -3,469 0,273 -3,330 -3,223 0,473 -3,318
-3,294 0,247 -3,800 -3,213 -3,11 -4,26 -3,61 -3,55
169 CC1CCC(N)CC1Cl TR TR soluble -1,753 -2,549 0,002 -2,012 0,130
-2,599 0,002 -1,851 0,027 -2,552 -2,253 0,377 -1,878 -1,932 0,114
-1,550 -1,164 -1,48 -1,81 -1,79 -1,61
170 CC1CCC(Cl)CC1Cl V V moderately soluble -3,809 -3,141 0,003
-3,863 0,323 -3,486 0,002 -4,133 0,253 -3,354 -3,656 0,434 -4,015
-3,998 0,191 -3,280 -3,941 -2,47 -2,85 -2,88 -3,24
171 OC(=O)CC1CCC(Cl)CC1Cl TS TR moderately soluble -2,919 -3,554
0,002 -2,993 0,538 -3,504 0,001 -3,492 0,077 -3,526 -3,386 0,263
-3,430 -3,243 0,353 -2,540 -2,445 -1,87 -2,56 -2,36 -2,53
172 CC(C)(C1CCC(O)CC1)C2CCC(O)CC2,ClCC3CO3 TR TR insoluble -5,029
-4,110 0,013 -5,027 0,002 -4,128 0,005 -5,029 0,000 -4,885 -4,573
0,525 -5,028 -5,028 0,001 / -3,974 -1,97 -4,41 -3,98 -3,87
173 [O-][N+](=O)C1CCC(Cl)CC1 TR TR moderately soluble -2,812 -2,358
0,002 -2,313 0,227 -3,510 0,002 -2,868 0,040 -2,901 -2,762 0,559
-2,786 -2,590 0,393 -2,420 -2,474 -1,07 -2,51 -2 -2,21
174 CC(C)C1CCC(C=O)CC1 TS TR moderately soluble -2,785 -2,761 0,002
-1,888 0,472 -2,685 0,002 -2,813 0,219 -2,716 -2,537 0,436 -2,520
-2,350 0,654 -3,140 -3,482 -1,84 -2,87 -2,46 -2,72
175 OC(=O)CCCCCCCC(O)=O TR TR soluble -1,895 -2,334 0,005 -2,105
0,197 -2,494 0,003 -1,702 0,127 -2,412 -2,159 0,344 -1,859 -1,903
0,285 -1,340 -1,687 -1,46 -2,75 -1,47 -1,76
176 S=C(NC1CCCCC1)NC2CCCCC2 TS TR moderately soluble -3,972 -4,751
0,004 -3,538 0,650 -4,442 0,006 -4,121 0,348 -4,629 -4,213 0,518
-3,918 -3,829 0,412 -3,810 -3,760 -3,02 -4,56 -3,39 -3,74
177 C[C@]12CC[C@H]3[C@@H](CCC4=C3CCC(=O)C4)[C@@H]1CCC2=O V V
insoluble -4,055 -5,010 0,005 -3,819 0,418 -4,164 0,012 -3,858
0,155 -4,731 -4,213 0,553 -3,847 -3,839 0,028 -4,040 -4,520 -4,03
-2,23 -2,72 -3,56
178 CCCBr TR TR soluble -1,711 -1,063 0,014 -1,763 1,228 -1,915
0,004 -2,052 0,111 -1,727 -1,698 0,439 -2,028 -1,907 0,204 -1,220
-1,670 -1,88 -1,73 -1,86 -1,73
179
C[C@@H]1C[C@H]2[C@@H]3C[C@H](F)C4=CC(=O)C=C[C@]4(C)[C@H]3[C@@H](O)C[C@]2(C)[C@H]1C(=O)CO TS
TR moderately soluble -3,204 -5,158 0,006 -3,316 0,294 -3,110 0,012
-3,591 0,285 -4,501 -3,794 0,930 -3,456 -3,454 0,195 -3,620 -4,183
-2,72 -3,59 -3,54 -3,52
180 O=C(C1CCCCC1)C2CCCCC2 TR TR moderately soluble -3,882 -4,059
0,003 -2,936 0,212 -3,637 0,006 -3,181 0,082 -3,883 -3,453 0,498
-3,112 -3,058 0,174 -3,960 -5,190 -2,71 -4,17 -3,5 -3,77
181 CCC(C)C1CCCCC1 TS TR moderately soluble -3,759 -2,797 0,004
-1,330 0,889 -2,738 0,002 -4,739 0,259 -2,767 -2,901 1,400 -3,970
-3,034 2,411 -5,020 -4,961 -2,52 -4,57 -3,63 -4,11
182 CC(C)=CCC\C(C)=C\CCC(C)(O)C=C TR TR moderately soluble -3,985
-3,849 0,006 -5,268 0,692 -2,815 0,004 -4,508 0,107 -3,253 -4,110
1,040 -4,609 -4,888 0,538 -3,730 -4,838 -3,15 -4,99 -3,8
-4,19
183 CCCCOCCOCCOC(=O)C(C)=C TR TR moderately soluble -2,016 -2,158
0,006 -2,597 0,267 -2,168 0,004 -2,508 0,178 -2,172 -2,358 0,228
-2,544 -2,553 0,063 -2,210 -1,728 -3,11 -2,86 -1,98 -2,41
184 COC1CCC(N)CC1 TS TR soluble -0,748 -1,436 0,003 -0,917 0,181
-2,022 0,002 -0,843 0,102 -1,794 -1,304 0,546 -0,870 -0,880 0,052
-0,380 -0,153 -0,82 -0,78 -0,87 -0,65
185 CCCCCO TR TR soluble -0,603 -0,314 0,007 0,111 0,578 -1,167
0,003 -0,618 0,139 -0,902 -0,497 0,538 -0,477 -0,254 0,516 -0,020
-0,689 -1,21 -1,54 -1,14 -0,85
186 CC(=C)C(=O)OCC=C TR TR soluble -1,759 -1,351 0,004 -1,184 0,484
-1,860 0,003 -1,666 0,192 -1,655 -1,515 0,304 -1,529 -1,425 0,340
-1,390 -0,716 -1,21 -1,89 -1,44 -1,36
187 ClC\C=C/Cl TS TR soluble -1,707 -1,124 0,006 -2,543 0,987
-2,536 0,003 -2,337 0,147 -2,070 -2,135 0,681 -2,364 -2,440 0,146
-1,760 -2,026 -1,54 -1,67 -1,75 -1,85
188 CCO[Si](CC(C)C)(OCC)OCC TR TR moderately soluble -3,017 -3,395
0,006 -2,233 0,999 -2,348 0,003 -3,197 0,127 -2,754 -2,793 0,588
-3,088 -2,715 0,682 -2,190 -2,278 -2,82 -3,34 -2,63 -2,72
189 S=C=S TS TR soluble -1,559 -1,928 0,047 -1,934 6,448 -2,447
0,005 -1,165 0,224 -2,379 -1,868 0,528 -1,190 -1,549 0,544 -1,750
-0,845 -0,01 -3,06 -1,62 -1,41
190 COC(=O)C1CCCCC1 TS TR soluble -1,812 -1,723 0,002 -1,739 0,354
-2,259 0,002 -1,694 0,072 -2,020 -1,854 0,271 -1,702 -1,717 0,032
-2,330 -1,879 -1,37 -2,39 -1,98 -1,94
191
COC(=O)CCCCCCCCC(=O)OC1CC(C)(C)N(C)C(C)(C)C1,CN2C(C)(C)CC(CC2(C)C)OC(=O)CCCCCCCCC(=O)OC3CC(C)(C)N(C)C(C)(C)C3 TR
TR insoluble -4,611 -5,743 0,024 -4,613 0,001 -5,187 0,045 -6,950
1,487 -4,691 -5,623 0,998 -4,615 -5,782 1,653 / -3,967 -7,12 -13,37
-10,56 -7,93
192 CC(C)CC1CCC(CC1)C(C)C(O)=O TS TR moderately soluble -3,992
-3,638 0,002 -4,065 0,453 -2,691 0,002 -3,792 0,306 -3,090 -3,546
0,597 -3,902 -3,928 0,193 -3,650 -3,822 -2,13 -5,13 -3,8
-3,74
193 CCNC1nC(NC(C)C)nC(SC)n1 TR TR moderately soluble -3,047 -3,447
0,003 -3,693 0,336 -3,044 0,002 -2,969 0,085 -3,213 -3,288 0,342
-3,114 -3,331 0,512 -2,190 -2,619 -2,4 -1,31 -1,23 -2,14
194 COC1CCCC(C1)C(C)=O TR TR soluble -1,869 -2,025 0,002 -1,888
0,240 -2,397 0,001 -2,118 0,077 -2,234 -2,107 0,215 -2,062 -2,003
0,163 -0,830 -1,360 -1,53 -1,18 -1,33 -1,38
Table S1: SMILES strings of compounds and predictions of logS
values with QSPR models
195 O=C1NC(=O)NC(=O)N1 TR TR soluble -1,808 -0,303 0,006 -1,808
0,001 -0,348 0,004 -1,974 0,389 -1,413 -1,108 0,907 -1,808 -1,891
0,118 -1,870 -2,399 -0,96 -0,41 -0,4 -1,31
196 CCC(COC(=O)C(C)=C)(COC(=O)C(C)=C)COC(=O)C(C)=C V V insoluble
-4,226 -2,803 0,008 -4,348 0,985 -2,837 0,011 -2,775 0,752 -2,824
-3,191 0,772 -3,456 -3,562 1,112 -4,050 -2,630 -3,47 -5,76 -3,85
-3,87
197 CC1=CCC2C(C1)C2(C)C TS TR insoluble -4,670 -2,631 0,008 -2,952
0,619 -2,877 0,005 -2,982 0,299 -2,778 -2,860 0,159 -2,972 -2,967
0,021 -3,190 -3,557 -2,23 -4,1 -3,44 -3,25
198 CCC(CC1CCC(C)CC1)(N(C)C)C(=O)C2CCC(CC2)N3CCOCC3 TS TR insoluble
-5,133 -5,573 0,006 -6,430 1,275 -3,601 0,005 -3,681 0,673 -4,530
-4,821 1,407 -4,630 -5,055 1,944 -4,310 -3,103 -4,1 -5,45 -4,97
-4,43
199 NC1nC(N)nC(N)n1,O=C2NC(=O)NC(=O)N2 TS TR insoluble -4,976
-0,498 0,016 -0,624 2,224 -0,366 0,016 -4,225 0,716 -0,473 -1,428
1,868 -3,348 -2,425 2,547 / -2,877 0,94 1,16 1,14 -0,60
200 CC(C)CCCC(C)CCCC(C)(O)C#C TR TR insoluble -4,953 -4,004 0,005
-5,268 0,436 -2,443 0,004 -4,508 0,091 -3,146 -4,056 1,194 -4,639
-4,888 0,538 -3,910 -5,018 -3,52 -5,05 -3,78 -4,32
201 CC(=C)CC(C)(C)C,N(C1CCCCC1)C2CCCCC2 V V insoluble -5,088 -4,828
0,010 -3,904 1,807 -4,576 0,009 -5,315 0,606 -4,692 -4,656 0,588
-4,961 -4,609 0,998 / -4,397 -2,8 -7,17 -5,87 -5,04
202
CC1=CCC2CC1C2(C)C,CC3(C)OC4(C)CCC3CC4,CC5(C)C6CCC(C6)C5=C,CC7(C)C8CCC7(C)C(=O)C8
TR TR insoluble -5,438 -5,047 0,068 -3,455 4,195 -6,976 0,053
-5,438 0,000 -5,460 -5,229 1,446 -5,438 -4,447 1,402 / -6,317 -2,45
-11,04 -10,07 -7,06
203 CC(C)CCCCCCCCCCO TR TR insoluble -5,001 -3,567 0,006 -5,565
0,360 -2,672 0,006 -4,372 0,184 -3,151 -4,044 1,229 -4,775 -4,969
0,844 -3,950 -5,547 -4,18 -5,56 -3,81 -4,64
204 CCCCCCC(O)CCCCCCCCCCCO TS TR insoluble -5,156 -4,599 0,008
-5,189 0,480 -3,336 0,012 -5,693 0,391 -4,119 -4,704 1,016 -5,467
-5,441 0,356 -3,550 -4,911 -5,64 -7,68 -4,98 -5,37
205 FC(F)=C(F)C(F)(F)F TR TR moderately soluble -2,889 -2,185 0,040
-2,539 0,750 -2,972 0,021 -2,778 0,182 -2,708 -2,618 0,339 -2,731
-2,658 0,169 -2,310 -1,798 -1,68 -1,53 -1,91 -1,99
206 CC(F)(F)Cl TR TR soluble -1,723 -0,719 0,018 -1,722 0,002
-2,160 0,005 -1,938 0,199 -1,771 -1,635 0,636 -1,724 -1,830 0,153
-1,310 -1,299 -1,41 -1,57 -1,69 -1,50
207 FCF TR TR soluble -1,437 1,912 0,104 -0,204 2,309 -1,407 0,005
-0,183 0,191 -1,219 0,030 1,379 -0,185 -0,193 0,014 -0,380 -0,441
-0,64 -0,6 -0,8 -0,51
208 ClC(=O)C1CCC(CC1)C(Cl)=O TR TR insoluble -4,029 -3,533 0,002
-3,055 0,409 -3,803 0,003 -3,492 0,038 -3,630 -3,471 0,310 -3,455
-3,274 0,309 -4,830 -2,364 -2,22 -2,82 -2,55 -3,04
209 C[C@]12CCC(=O)C=C1CC[C@@H]3[C@@H]2CC[C@@]4(C)[C@H]3C(O)CC4=O TR
TR moderately soluble -2,213 -4,895 0,005 -4,078 0,294 -3,430 0,011
-4,006 0,235 -4,408 -4,102 0,603 -4,038 -4,042 0,052 -3,320 -4,011
-3,37 -2,53 -2,83 -3,35
210 ClCCCBr TR TR soluble -1,940 -1,625 0,011 -1,763 0,863 -2,441
0,005 -2,337 0,071 -2,197 -2,041 0,408 -2,293 -2,050 0,406 -1,620
-2,175 -2,55 -1,81 -2,06 -2,08
211 OC(O)C(Cl)(Cl)Cl TR TR highly soluble 1,037 -1,465 0,011 -0,692
0,433 -1,788 0,005 0,555 0,138 -1,620 -0,847 1,042 0,254 -0,068
0,882 0,290 0,056 -0,67 -1,43 -1,42 -0,49
212 OC(=O)C1CC(O)CC(C1)C(O)=O TS TR moderately soluble -2,005
-1,890 0,002 -2,595 0,511 -2,391 0,003 -1,136 0,167 -2,109 -2,003
0,649 -1,495 -1,866 1,032 -0,870 -0,551 0,97 -1,04 -0,57
-0,59
213 OC1CCCCCCCCCCC1 TR TR moderately soluble -3,989 -3,597 0,004
-3,120 0,374 -2,168 0,010 -3,469 0,257 -3,154 -3,088 0,646 -3,327
-3,294 0,247 -3,310 -2,809 -2,42 -3,87 -3,35 -3,18
214 ClCC(CCl)O[P](=O)(OC(CCl)CCl)OC(CCl)CCl TR TR insoluble -4,377
-6,530 0,027 -4,963 1,202 -5,884 0,018 -4,501 0,213 -6,055 -5,469
0,911 -4,571 -4,732 0,326 -4,780 -4,196 -5,28 -4,48 -4,02
-4,55
215 ClC1CCCC(Cl)C1Cl TR TR insoluble -4,004 -3,311 0,004 -3,863
0,131 -4,218 0,003 -4,310 0,107 -3,858 -3,926 0,452 -4,110 -4,087
0,316 -3,450 -3,935 -2,71 -3,16 -3,19 -3,43
216 CC(C)C1(O)CCC(=CC1)C TR TR soluble -1,941 -2,697 0,003 -1,945
0,247 -1,946 0,003 -2,199 0,063 -2,356 -2,197 0,354 -2,147 -2,072
0,179 -1,760 -2,039 -1,91 -3,36 -2,78 -2,33
217 COC1CCC(OC)C(C1)[N+]([O-])=O V V moderately soluble -3,017
-1,322 0,003 -2,371 0,581 -2,816 0,002 -2,124 0,154 -2,195 -2,158
0,627 -2,176 -2,248 0,175 -1,630 -1,380 -0,49 -1,67 -1,28
-1,44
218 CCCCNC(=O)OCC#CI TR TR moderately soluble -3,203 -2,972 0,006
-3,186 0,724 -2,780 0,006 -1,702 0,168 -2,863 -2,660 0,660 -1,981
-2,444 1,049 -2,680 -1,690 -2,66 -2,55 -2,52 -2,35
219 CC1CC2NC(=O)NC2CC1N V V soluble -1,977 -2,332 0,003 -2,002
0,674 -1,873 0,004 -1,641 0,315 -2,136 -1,962 0,288 -1,756 -1,821
0,255 -1,470 -1,273 -0,91 -0,63 -0,69 -1,12
220 CC(C)(C)OOC(C)(C)C TR TR moderately soluble -2,932 -1,673 0,011
-2,337 1,031 -1,440 0,006 -2,818 0,066 -1,599 -2,067 0,628 -2,788
-2,577 0,340 -3,160 -2,576 -1,76 -2,11 -1,87 -2,38
221 CCOC(=O)\C=C\C1CCCCC1 V V moderately soluble -3,013 -2,907
0,002 -3,289 0,391 -2,761 0,002 -2,445 0,123 -2,844 -2,851 0,350
-2,647 -2,867 0,596 -3,320 -2,970 -1,89 -3,71 -2,9 -2,91
222 O=C1NSC2CCCCC12 TR TR soluble -1,947 -3,045 0,005 -2,285 0,564
-2,669 0,004 -2,049 0,104 -2,812 -2,512 0,438 -2,086 -2,167 0,167
-1,430 -2,292 -1,4 -1,92 -1,56 -1,78
223 CC1(C)C2CCC3(C2)C1C(=O)CCC3(C)C TR TR insoluble -4,098 -4,162
0,007 -4,817 0,717 -3,289 0,008 -4,511 0,112 -3,772 -4,195 0,661
-4,553 -4,664 0,216 -4,010 -4,707 -3,88 -4,06 -3,73 -4,16
224 CCCCCCCC(C)=O TS TR moderately soluble -2,930 -2,534 0,004
-3,110 0,575 -2,345 0,003 -2,676 0,178 -2,424 -2,666 0,325 -2,779
-2,893 0,307 -2,370 -3,428 -3,01 -3,17 -2,3 -2,84
225 CC1CCC(CC1[N+]([O-])=O)[N+]([O-])=O TR TR moderately soluble
-3,040 -1,926 0,003 -2,201 0,360 -3,390 0,005 -1,755 0,052 -2,398
-2,318 0,738 -1,811 -1,978 0,316 -2,810 -2,856 -0,01 -3,06 -1,84
-2,06
226 OCCCNC1CCC(O)CC1[N+]([O-])=O TS TR moderately soluble -2,111
-2,121 0,002 -1,635 0,768 -2,600 0,003 -2,687 0,217 -2,351 -2,261
0,486 -2,455 -2,161 0,744 -0,950 -0,897 -0,36 -1,77 -0,96
-1,23
227 CC1CCCCC1C TS TR moderately soluble -2,819 -2,049 0,005 -2,827
0,277 -2,220 0,002 -2,731 0,039 -2,189 -2,457 0,381 -2,743 -2,779
0,068 -4,110 -3,760 -1,8 -3,44 -2,9 -3,13
228 CC1CCC(C)CC1 TR TR moderately soluble -2,819 -2,132 0,005
-2,827 0,277 -2,265 0,003 -2,731 0,071 -2,232 -2,489 0,342 -2,751
-2,779 0,068 -3,950 -3,727 -1,8 -4,5 -3,54 -3,38
229 CC(C)C(=O)OCCC1CCCCC1 TR TR moderately soluble -3,080 -3,291
0,002 -2,694 0,158 -2,512 0,002 -2,383 0,058 -2,882 -2,720 0,402
-2,466 -2,538 0,220 -3,960 -3,820 -2,64 -4,51 -3,42 -3,47
230 CC(=O)/C=C/C1=C(C)CCCC1(C)C TS TR moderately soluble -3,080
-3,506 0,004 -3,055 0,639 -2,821 0,004 -3,469 0,444 -3,172 -3,213
0,332 -3,300 -3,262 0,293 -3,070 -3,552 -3,1 -2,93 -2,73
-3,11
231 CCCCCCCCCCN(C)C TS TR insoluble -4,045 -3,413 0,006 -4,149
0,527 -2,427 0,005 -4,372 0,054 -2,950 -3,590 0,877 -4,351 -4,261
0,158 -4,440 -4,426 -4,02 -5,11 -3,73 -4,35
232 CCCCCCC(C)(C)S TS TR moderately soluble -3,990 -3,479 0,005
-2,002 0,987 -2,569 0,004 -5,309 0,181 -2,963 -3,340 1,447 -4,797
-3,655 2,338 -3,400 -4,299 -3,21 -4,53 -3,03 -3,88
233 CCCS[P](=O)(OCC)OC1CCC(Br)CC1Cl TR TR insoluble -4,366 -6,420
0,010 -4,963 0,404 -5,907 0,007 -5,227 0,115 -6,082 -5,629 0,660
-5,169 -5,095 0,187 -3,920 -4,748 -3,86 -4,74 -3,98 -4,40
234 CCCCCCCCC(C)C=O TR TR insoluble -4,027 -3,070 0,004 -4,645
0,377 -2,695 0,004 -4,372 0,183 -2,894 -3,695 0,958 -4,461 -4,508
0,193 -4,330 -4,596 -3,48 -4,44 -3,12 -4,07
235 CC1CCC(C)C(O)C1C V V soluble -1,936 -2,140 0,003 -1,945 0,504
-2,050 0,002 -1,644 0,176 -2,084 -1,945 0,216 -1,722 -1,794 0,213
-1,840 -1,521 -1,19 -2,74 -2,4 -1,90
236 CC1CCCC(C)C1 V V moderately soluble -2,833 -2,112 0,005 -2,827
0,334 -2,210 0,002 -2,731 0,036 -2,202 -2,470 0,361 -2,740 -2,779
0,068 -4,010 -3,712 -1,8 -3,44 -2,9 -3,10
237 O=[Si]=O / TR moderately soluble -3,603 / / / / -1,399 0,003
-0,618 0,182 -1,387 -1,009 0,664 -0,618 -0,618 0,437 -1,180 1,207
-1,8 -3,44 -2,9 -1,46
238 CC1CCCC(C1C)[N+]([O-])=O TR TR moderately soluble -3,003 -2,102
0,003 -2,313 0,446 -2,762 0,002 -2,645 0,135 -2,496 -2,455 0,303
-2,568 -2,479 0,235 -2,900 -2,811 -1 -3,29 -2,44 -2,50
239 O=C(CC1CCCCC1)OCCC2CCCCC2 TR TR insoluble -4,189 -4,639 0,004
-3,661 0,418 -3,487 0,005 -2,832 0,111 -4,112 -3,654 0,747 -3,005
-3,246 0,586 -5,390 -5,612 -3,66 -6,22 -4,72 -4,77
240 C=CCOC(=O)C1CCCCC1C(=O)OCC=C TR TR moderately soluble -3,221
-2,993 0,002 -3,103 0,340 -3,414 0,007 -2,216 0,175 -3,097 -2,932
0,509 -2,517 -2,660 0,628 -3,050 -2,562 -2,17 -3,56 -2,64
-2,75
241 CC1CC(O)CC(C)(C)C1 TR TR soluble -1,992 -2,341 0,003 -1,945
0,442 -1,570 0,004 -2,199 0,191 -2,029 -2,014 0,338 -2,122 -2,072
0,179 -2,060 -1,482 -1,64 -2,69 -2,37 -2,06
242 CN(C)C1CCCCC1 V V soluble -1,922 -2,108 0,003 -1,413 0,529
-2,087 0,002 -1,255 0,099 -2,083 -1,716 0,446 -1,280 -1,334 0,112
-1,210 -1,265 -1,39 -1,59 -1,76 -1,42
243 O=C/C=C/C1CCCCC1 TR TR soluble -1,969 -2,546 0,003 -1,739 0,455
-2,701 0,003 -1,644 0,081 -2,600 -2,157 0,543 -1,658 -1,691 0,067
-2,680 -3,297 -1,33 -2,73 -2,28 -2,33
244 CC(=O)NC1CCC(O)CC1 TS TR soluble -1,033 -2,186 0,002 -1,085
0,291 -2,239 0,002 -1,161 0,182 -2,205 -1,668 0,630 -1,132 -1,123
0,053 -0,290 -0,598 -0,89 -0,78 -0,8 -0,75
245 CCCCCCCCCO TR TR moderately soluble -3,013 -2,208 0,006 -2,356
0,410 -2,066 0,004 -2,192 0,157 -2,126 -2,205 0,119 -2,237 -2,274
0,116 -2,420 -3,198 -2,91 -3,89 -2,65 -2,88
246 O=C1CCCCCCCCCCCC(=O)OCCO1 TR TR insoluble -4,262 -3,670 0,004
-4,763 0,589 -2,333 0,019 -4,907 0,289 -3,449 -3,918 1,193 -4,860
-4,835 0,102 -3,550 -2,871 -3,32 -4,96 -4,13 -3,95
247 OC1CC(Cl)CCC1OC2CCC(Cl)CC2 TR TR insoluble -4,239 -4,757 0,003
-2,960 0,494 -4,980 0,002 -4,309 0,234 -4,872 -4,252 0,905 -3,876
-3,635 0,954 -2,510 -3,236 -2,46 -3,23 -3,22 -3,09
248 NC1nC(N)nC(NC2nC(N)nC(N)n2)n1 TR TR insoluble -4,225 -2,013
0,007 -3,422 0,570 -2,425 0,013 -4,225 0,000 -3,841 -3,021 0,997
-4,224 -3,824 0,568 -2,560 -2,892 0,91 3,14 2,2 -0,57
249 CC1CCCCC1 TR TR moderately soluble -3,846 -1,753 0,007 -1,599
0,458 -2,185 0,004 -4,044 0,077 -2,096 -2,395 1,127 -3,691 -2,821
1,729 -3,670 -3,478 -1,62 -3,3 -2,72 -3,08
250 COC(=O)OC TR TR highly soluble 0,185 1,532 0,015 0,083 0,561
-1,182 0,003 0,197 0,070 -0,764 0,158 1,109 0,184 0,197 0,139
-0,020 0,545 0,01 -0,84 -0,59 -0,12
251 CC(=C)C(=O)OC1CCCCC1 TR TR moderately soluble -3,089 -2,679
0,002 -2,371 0,413 -2,518 0,002 -2,057 0,076 -2,606 -2,406 0,264
-2,106 -2,214 0,222 -2,970 -1,978 -1,83 -3,1 -2,51 -2,42
252 CCCC(=O)OCC1CCCCC1 TR TR moderately soluble -3,117 -2,985 0,002
-3,289 0,277 -2,807 0,002 -2,445 0,098 -2,884 -2,882 0,352 -2,665
-2,867 0,596 -3,500 -3,238 -2,61 -3,54 -2,74 -3,05
253 OC(=O)C1CC(CC(C1)[N+]([O-])=O)[N+]([O-])=O TR TR moderately
soluble -2,208 -1,749 0,005 -1,340 0,220 -3,737 0,010 -2,661 0,168
-2,353 -2,372 1,065 -2,089 -2,000 0,934 -2,400 -2,136 1 -2,54 -1,16
-1,55
254 NNC(N)=S TR TR soluble -0,846 0,362 0,008 0,064 0,880 -0,574
0,004 -0,495 0,183 -0,272 -0,161 0,449 -0,399 -0,215 0,395 -0,840
0,474 0,73 -0,35 0,4 0,00
255 CCC(=O)/C=C/C1=C(C)CCCC1(C)C V V insoluble -4,188 -3,728 0,004
-3,600 0,548 -3,030 0,004 -5,005 0,432 -3,404 -3,841 0,833 -4,386
-4,302 0,994 -3,790 -4,040 -3,5 -3,42 -3,05 -3,70
256 ClC1CCCC(Cl)C1 V V moderately soluble -3,072 -2,801 0,004
-2,827 0,357 -3,557 0,002 -4,133 0,235 -3,316 -3,330 0,640 -3,615
-3,480 0,924 -2,950 -3,637 -2,3 -2,39 -2,52 -2,90
257 CCC(=O)C1CCCCC1 TR TR moderately soluble -2,049 -2,458 0,003
-1,739 0,418 -2,648 0,002 -1,644 0,155 -2,569 -2,122 0,505 -1,669
-1,691 0,067 -2,260 -2,803 -2,03 -2,5 -2,15 -2,24
258 CC(C)(C)C1CCC(C=O)CC1 TR TR moderately soluble -3,131 -3,151
0,002 -3,080 0,424 -2,452 0,002 -2,813 0,058 -2,803 -2,874 0,317
-2,845 -2,946 0,188 -3,660 -4,066 -2,24 -3,28 -2,8 -3,15
259 S=C1NC2CCCCC2S1 V V moderately soluble -3,152 -3,720 0,006
-5,184 1,129 -3,890 0,004 -2,049 0,380 -3,815 -3,711 1,286 -2,839
-3,616 2,217 -2,500 -2,446 -1,61 -3,28 -2,29 -2,49
260 CC(=C)C1CCCCC1 TR TR moderately soluble -3,008 -2,578 0,004
-1,413 0,299 -2,734 0,002 -2,731 0,192 -2,671 -2,364 0,638 -2,216
-2,072 0,932 -4,270 -4,303 -2,11 -3,94 -3,21 -3,34
261