

Journal of Molecular Structure 1049 (2013) 315–325

Contents lists available at SciVerse ScienceDirect

Journal of Molecular Structure

journal homepage: www.elsevier.com/locate/molstruc

A combined pharmacophore modeling, 3D-QSAR and molecular docking study of substituted bicyclo-[3.3.0]oct-2-enes as liver receptor homolog-1 (LRH-1) agonists

Manisha Lalit 1, Rahul P. Gangwal 1, Gaurao V. Dhoke, Mangesh V. Damre, Kanchan Khandelwal, Abhay T. Sangamwar *

Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Sect-67, S.A.S. Nagar, Punjab 160 062, India

* Corresponding author. Tel.: +91 0172 2214682. E-mail address: [email protected] (A.T. Sangamwar).

1 These authors contributed equally to this work.

0022-2860/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.molstruc.2013.06.035

Highlights

• A ligand-based pharmacophore model was developed and validated for LRH-1 agonists.
• Three different databases were screened using the validated pharmacophore model.
• CoMFA and CoMSIA models were developed and validated.
• Docking analysis shows that His390 and Arg393 play an important role in the binding of agonists.

Article info

Article history:
Received 25 March 2013
Received in revised form 13 June 2013
Accepted 13 June 2013
Available online 22 June 2013

Keywords:
CoMFA
CoMSIA
Liver receptor homolog-1
Molecular docking
Pharmacophore
Virtual screening

Abstract

A combined pharmacophore modelling, 3D-QSAR and molecular docking approach was employed to reveal the structural and chemical features essential for the development of small molecules as LRH-1 agonists. The best HypoGen pharmacophore hypothesis (Hypo1) consists of one hydrogen-bond donor (HBD), two general hydrophobic (H), one hydrophobic aromatic (HYAr) and one hydrophobic aliphatic (HYA) feature. It exhibited a high correlation coefficient of 0.927, a cost difference of 85.178 bits and a low RMS value of 1.411. This pharmacophore hypothesis was cross-validated using the test set, a decoy set and the Cat-Scramble methodology. Subsequently, the validated pharmacophore hypothesis was used to screen small-molecule chemical databases. Further, 3D-QSAR models were developed based on the alignment obtained by substructure alignment. The best CoMFA and CoMSIA models exhibited excellent $r^2_{ncv}$ values of 0.991 and 0.987, and $r^2_{cv}$ values of 0.767 and 0.703, respectively. The CoMFA $r^2_{pred}$ of 0.87 and the CoMSIA $r^2_{pred}$ of 0.78 showed that the predicted values were in good agreement with the experimental values. Molecular docking analysis reveals that a π–π interaction with His390 and hydrogen bond interactions with His390/Arg393 are essential for LRH-1 agonistic activity. The results from pharmacophore modelling, 3D-QSAR and molecular docking are complementary to each other and could serve as a powerful tool for the discovery of potent small molecules as LRH-1 agonists.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Nuclear receptors (NRs) are a class of intracellular proteins that act as transcription factors. They regulate the expression of many genes and control the development, homeostasis and metabolism of the organism. Their activity is regulated by the binding of small lipophilic compounds, including hormones, metabolites and a few synthetic ligands. NRs are activated upon ligand binding and/or phosphorylation, which causes a conformational change that results in the dissociation of co-repressor complexes and the recruitment of co-activator complexes. According to sequence homology, the NR superfamily is classified into seven subfamilies (NR0-NR6) [1]. NRs for which no ligand has been identified are categorized as orphan NRs. Liver receptor homolog-1 (LRH-1, NR5A2) is one such orphan NR, which belongs to the NR5A subfamily. In adult mammals, it is mainly confined to the liver, pancreas and intestine. It is also expressed in the ovary and pre-adipocytes, and at lower levels in the placenta. It plays an important role in early development and also in the regulation of bile acid synthesis, cholesterol metabolism and steroidogenesis in the adult. It also regulates the expression of aromatase in the breast and ovaries, which suggests its utility as a target in cancer therapy [2–7].

LRH-1 was found to be constitutively active when expressed in a variety of cell types [8]. However, phospholipids have been found in the ligand binding pocket of LRH-1 [9–11], and their role in the regulation of LRH-1 activity remains to be established. Two phosphatidylcholines, dilauroyl and diundecanoyl phosphatidylcholine, have been identified as LRH-1 agonists, implicating LRH-1 as a target for type-2 diabetes [12]. A medicinal chemistry approach to the development of LRH-1 agonists has been pursued by Whitby et al. [13,14]. They used the cis-bicyclo-[3.3.0]oct-2-ene skeleton and designed several series of compounds: 1-anilino-, 1-alkoxy- and 1-alken-2-yl-substituted bicyclo-[3.3.0]oct-2-enes. Among the 1-anilino series, compound GSK8470 showed a good pEC50, but its primary limitation is acid instability. The compound RJW100 was found to have good activity and stability. In cell-based studies it was found to be an active agonist of LRH-1, and it is now under preclinical toxicity evaluation. To our knowledge, no compound has yet reached clinical trials, which underscores the need for novel LRH-1 agonists.

The aim of this study is to identify the basic structural requirements for LRH-1 agonistic activity and thereby to design novel and potent agonists. A combination of pharmacophore modelling, 3D-QSAR and molecular docking was employed to achieve this goal. A pharmacophore hypothesis was generated with the HypoGen algorithm. The validated pharmacophore hypothesis was subsequently used in a virtual screening process to identify novel and potent LRH-1 agonists. CoMFA and CoMSIA models were developed and validated based on the substructure alignment. Further, molecular docking analysis provided information complementary to the pharmacophore and 3D-QSAR studies.

2. Materials and methods

2.1. Pharmacophore modelling

2.1.1. Data sets

Over the last half decade, a number of small molecules have been reported as LRH-1 agonists. Of these, 47 agonists assayed by the same biological method were selected for the pharmacophore modelling study [13,14]. The agonistic activity of these compounds was expressed in terms of EC50 (i.e., the concentration of a compound at which 50% of its maximal effect is observed) and pEC50 (pEC50 = −log EC50) values. The most important step in pharmacophore modelling is the selection of a suitable training set with a wide activity range of at least four orders of magnitude, because the training set determines the quality of the generated pharmacophore. The reported pEC50 values of the dataset spanned a small range, from 5.32 to 7.92. These activity values were therefore rescaled to a range of four log units to develop a statistically reliable pharmacophore. For HypoGen pharmacophore model generation, a training set of 16 compounds (Fig. 1) was selected based on the principles of structural diversity and activity range (according to the manual of Discovery Studio 2.5 (DS2.5), a minimum of 16 compounds is required in the training set for the development of a HypoGen-based pharmacophore hypothesis). The remaining 31 compounds (Fig. S1) of the dataset were used as a test set for pharmacophore model validation.
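The rescaling procedure is not spelled out in the text. A minimal sketch, assuming a simple linear stretch that keeps the highest training-set pEC50 fixed and widens the range to four log units, is given below. The function name and its defaults are illustrative; the end points 5.30 and 7.92 are the lowest and highest experimental values of the training set in Table 2.

```python
def rescale_pec50(pec50, old_min=5.30, old_max=7.92, span=4.0):
    """Linearly stretch [old_min, old_max] onto [old_max - span, old_max].

    Assumption: a plain min-max rescale; the paper does not state the
    exact transformation that was used.
    """
    new_min = old_max - span
    return new_min + (pec50 - old_min) * span / (old_max - old_min)


for experimental in (7.92, 7.52, 6.00, 5.30):
    print(experimental, round(rescale_pec50(experimental), 2))
```

Under this assumption the experimental values 7.52, 6.00 and 5.30 map to 7.31, 4.99 and 3.92, which agrees with the rescaled column of Table 2.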

2.1.2. Pharmacophore model generation

The 3D QSAR Pharmacophore Generation and Ligand Pharmacophore Mapping modules within the DS2.5 software package [15] were used to carry out the pharmacophore modelling studies. Conformations of all training set compounds were generated by the Cat-Conf program within DS2.5. The BEST method was employed to generate multiple acceptable conformations. It provides complete and improved coverage of conformational space by performing a rigorous energy minimization and optimizing the conformations in both torsional and Cartesian space using the poling algorithm [16]. The hydrogen bond donor (HBD), general hydrophobic (H), hydrophobic aliphatic (HYA), hydrophobic aromatic (HYAr) and ring aromatic (RA) features were included in pharmacophore generation, based on the common features present in the studied compounds. The uncertainty value during pharmacophore generation was set to two, because the activity range of the training set compounds barely spans the minimum requirement of four orders of magnitude, and in order to correlate the training set compounds with their activity values accurately [17]. An uncertainty value of two means that the biological activity of a particular agonist is assumed to lie within a factor of two (two times higher to two times lower) of its true value. The top ten pharmacophore hypotheses were generated using the training set of LRH-1 agonists. The best pharmacophore hypothesis was selected based on statistical parameters (high correlation coefficient ($r^2$), lowest total cost, highest cost difference and low RMS value).
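The uncertainty window can be made concrete with a short sketch. The helper name is hypothetical, and it is an assumption here that the factor is applied on the EC50 scale; the bounds are converted back to pEC50 only for readability.

```python
import math


def uncertainty_window(pec50, uncertainty=2.0):
    """Return (low, high) pEC50 bounds for an EC50 known within a factor.

    Assumption: the uncertainty factor multiplies and divides the EC50.
    """
    ec50 = 10.0 ** (-pec50)            # molar EC50
    weakest = ec50 * uncertainty       # larger EC50, smaller pEC50
    strongest = ec50 / uncertainty     # smaller EC50, larger pEC50
    return -math.log10(weakest), -math.log10(strongest)


low, high = uncertainty_window(7.92)
print(round(low, 2), round(high, 2))
```

On the log scale a factor of two corresponds to about ±0.30 pEC50 units, so a compound with pEC50 7.92 is treated as lying between roughly 7.62 and 8.22.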

2.1.3. Validation of pharmacophore model

The developed pharmacophore model was validated to determine whether it is capable of differentiating between active, least active and inactive compounds [18]. Three different methods were employed to validate the best pharmacophore hypothesis: test set prediction, a decoy test and the Fischer randomization test. The cost functions calculated during pharmacophore generation, such as the weight cost, configuration cost and error cost, were initially used to assess the best pharmacophore hypothesis. In the test set validation method, a test set of 31 diverse LRH-1 agonists was used. The Ligand Pharmacophore Mapping protocol with the BEST flexible search option was employed to map the test set compounds onto the best pharmacophore hypothesis, and the pEC50 value of each test set compound was estimated. In the decoy set validation method, a small database of decoys was generated using DecoyFinder1.1. Five active LRH-1 agonists were included in the decoy database to calculate the goodness of hit score (GH) and the enrichment factor (E value), the two major parameters used to judge how well the generated pharmacophore hypothesis identifies active compounds. Finally, the Fischer randomization methodology was employed as the third validation procedure, with the goal of checking whether there is a strong correlation between the chemical structures and the biological activity in the training set. In this validation method, 19 random spreadsheets (hypotheses) were generated by randomizing the activity data of the training set compounds to achieve a 95% confidence level.
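The link between 19 random runs and the 95% confidence level can be sketched with a commonly quoted significance formula for this kind of randomization test, S = [1 − (1 + x)/y] × 100, where x is the number of randomized hypotheses that score better than the original and y is the total number of hypotheses (random runs plus the original). The formula and the function name are stated here as an assumption rather than taken from the paper.

```python
def fischer_significance(n_random_runs, n_better_than_original=0):
    """Significance (%) of a hypothesis after a randomization test.

    Assumption: S = [1 - (1 + x) / y] * 100, with y counting the random
    runs plus the original hypothesis.
    """
    total = n_random_runs + 1
    return (1.0 - (1 + n_better_than_original) / total) * 100.0


for runs in (19, 49, 99):
    print(runs, round(fischer_significance(runs), 1))
```

With no randomized hypothesis outperforming the original, 19 runs give 95%, 49 runs give 98% and 99 runs give 99%.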

2.1.4. Virtual screening

Virtual screening of chemical databases is a fast and accurate method to find potential leads suitable for further development [19]. In our study, the validated pharmacophore hypothesis (Hypo1) was used as a 3D query in database screening. Three commercially available databases (ChemDiv, Specs and NCI) of diverse chemical compounds were screened to identify novel LRH-1 agonists. Of the available Fast/Flexible and Best/Flexible search options, the Best/Flexible option was used for screening the databases. The Maximum Omitted Features option was set to '−1' during screening of the chemical databases. The biological activity of the hit compounds was then predicted using the Hypo1 pharmacophore model. Compounds with estimated pEC50 values greater than 7 were selected and subsequently subjected to molecular docking analysis using Glide5.5.

Fig. 1. Chemical structures of training set compounds and their rescaled pEC50 values in parentheses.

Fig. 2. Alignment of training set molecules used for generation of CoMFA and CoMSIA molecular fields.


2.2. 3D-QSAR (CoMFA and CoMSIA)

2.2.1. Data set

The collected LRH-1 agonists were divided into a training set of 38 compounds and a test set of nine compounds. All molecular modelling calculations were performed using the SYBYL 7.1 package installed on a Silicon Graphics Fuel workstation running IRIX 6.5. The 3D structures of all agonists were drawn using the Sketch Molecule module. All compounds were built using the docked conformation coordinates of the template agonist (compound 1). The molecules were minimized by applying the Tripos molecular mechanics force field with the conjugate gradient method. The minimization was terminated when either the energy gradient convergence criterion of 0.05 kcal/mol was reached or the limit of 10,000 minimization steps was exceeded. Fig. S2 shows the fragment used as the common structure to sketch and align all agonists. Gasteiger–Hückel charges were applied to all compounds of the dataset.

2.2.2. Molecular alignment for 3D-QSAR analysis

In molecular alignment, the selection of the bioactive conformation of the template molecule and the superposition of all molecules on the template compound are the two most critical steps of a 3D-QSAR study. The docked conformation of the most potent agonist was used as the template. A common substructure-based alignment method was used to align all molecules of the dataset with the template molecule. The aligned molecules of the training set are shown in Fig. 2.

2.2.3. Calculation of CoMFA and CoMSIA descriptors

To derive the CoMFA and CoMSIA descriptor fields, a 3D cubic lattice with a grid spacing of 2 Å was created around the aligned molecules. The steric (Lennard–Jones potential) and electrostatic (Coulombic potential) CoMFA fields at each lattice intersection were calculated using the default probe, an sp3 carbon atom with a charge of +1 and a van der Waals radius of 1.52 Å. The SYBYL default energy cut-off of 30 kcal/mol was used. The signal-to-noise ratio was improved by setting the minimum column filtering value to 2.0 kcal/mol, which omits those lattice points whose energy variation is below this threshold. For the regression analysis, the leave-one-out (LOO) cross-validation method was used. The optimum number of components N to be used in model derivation was chosen from the analysis with the highest cross-validated correlation coefficient $r^2_{cv}$. The optimal number of components was then employed in a non-cross-validated PLS analysis to obtain the final model parameters: the conventional correlation coefficient $r^2_{ncv}$, its standard error of estimate (SEE) and the F value.

In the CoMSIA study, five physicochemical properties were evaluated: steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor. The study was carried out using a probe atom with a radius of 1.0 Å and a charge of +1. The attenuation factor was set to the default value of 0.3. The minimum column filtering was set to 2.0 kcal/mol. The statistical evaluation of ten different CoMSIA models was carried out in the same way as described for CoMFA.
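The effect of the energy cut-off and of minimum column filtering can be illustrated with a small sketch. It is not SYBYL code: the function names are hypothetical, "energy variation" is assumed to mean the standard deviation of a lattice-point column across molecules, and clipping is applied symmetrically at ±30 kcal/mol as a simplification.

```python
from statistics import pstdev


def clip_energy(energy, cutoff=30.0):
    """Truncate a field energy to the range [-cutoff, +cutoff] (assumed)."""
    return max(-cutoff, min(cutoff, energy))


def filter_columns(field_matrix, min_sigma=2.0, cutoff=30.0):
    """Keep lattice-point columns whose spread is at least min_sigma.

    field_matrix holds one row per molecule and one column per lattice point.
    Returns the kept column indices and the reduced matrix.
    """
    capped = [[clip_energy(e, cutoff) for e in row] for row in field_matrix]
    n_columns = len(capped[0])
    keep = [j for j in range(n_columns)
            if pstdev([row[j] for row in capped]) >= min_sigma]
    return keep, [[row[j] for j in keep] for row in capped]


# Illustrative energies (kcal/mol) for three molecules at three lattice points.
fields = [[0.1, 5.0, 80.0],
          [0.2, -3.0, 45.0],
          [0.1, 12.0, 30.0]]
kept, reduced = filter_columns(fields)
print(kept, reduced)
```

In this toy matrix only the middle column survives: the first varies too little, and the third is saturated at the cut-off for every molecule, so it carries no information for the PLS regression.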

The agonistic activity (pEC50 values) of the test set compounds was predicted using the CoMFA and CoMSIA models. The predictive correlation coefficient ($r^2_{pred}$), based on the molecules of the test set, was calculated according to the equation shown below:

$$r^2_{pred} = \frac{\mathrm{SD} - \mathrm{PRESS}}{\mathrm{SD}}$$

where SD is the sum of squared deviations between the agonistic activities of the test set molecules and the mean activity of the training set molecules, and PRESS is the sum of squared deviations between the predicted and actual activity values of each molecule in the test set [20,21].
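The definition of the predictive correlation coefficient translates directly into a few lines of code. The sketch below follows the wording above (SD measured against the training-set mean, PRESS over the test set); the function name and the numbers in the example are made up for illustration and are not data from the study.

```python
def r2_pred(train_actual, test_actual, test_predicted):
    """Predictive r^2 = (SD - PRESS) / SD for an external test set."""
    mean_train = sum(train_actual) / len(train_actual)
    # SD: squared deviations of test activities from the training-set mean.
    sd = sum((y - mean_train) ** 2 for y in test_actual)
    # PRESS: squared deviations between actual and predicted test activities.
    press = sum((y - p) ** 2 for y, p in zip(test_actual, test_predicted))
    return (sd - press) / sd


# Hypothetical pEC50 values, for illustration only.
train = [5.0, 6.0, 7.0]
actual = [5.5, 6.5, 7.5]
predicted = [5.6, 6.3, 7.4]
print(round(r2_pred(train, actual, predicted), 3))
```

A value close to 1 means that the test-set residuals are small compared with the spread of the test-set activities around the training-set mean.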

The Y-randomization test, a widely used approach to establish the robustness of QSAR models, was used for further validation of the developed models. In this test, new QSAR models were developed by randomly shuffling the dependent variable vector while keeping the original independent variables unchanged. The new QSAR models are expected to have low $r^2_{ncv}$ and $r^2_{cv}$ values over several trials, which confirms that the developed QSAR models are robust [22,23].
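The idea behind Y-randomization can be sketched without a PLS implementation. In the example below a one-descriptor least-squares fit stands in for the CoMFA/CoMSIA PLS model, which is a deliberate simplification; all names and the synthetic data are illustrative.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, i.e. r^2 of a one-descriptor linear fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def y_randomization(x, y, n_trials=100, seed=0):
    """Return r^2 of the real model and of models built on shuffled activities."""
    rng = random.Random(seed)
    scrambled_scores = []
    for _ in range(n_trials):
        shuffled = list(y)
        rng.shuffle(shuffled)          # break the structure-activity link
        scrambled_scores.append(r_squared(x, shuffled))
    return r_squared(x, y), scrambled_scores


descriptor = [float(i) for i in range(10)]
activity = [2.0 * v + 1.0 for v in descriptor]   # synthetic, perfectly linear
original, scrambled = y_randomization(descriptor, activity)
print(original, sum(scrambled) / len(scrambled))
```

A robust model shows a clear gap between the r^2 obtained with the real activities and the distribution of r^2 values obtained with the shuffled ones.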

2.3. Molecular docking studies

To investigate the detailed intermolecular interactions between the substituted bicyclo-[3.3.0]oct-2-enes and LRH-1, the molecular docking program Glide5.5 [24] was used. In combination with virtual screening, molecular docking was also used to identify active compounds by filtering out those that do not fit into the binding site. The three-dimensional structure of the target protein was taken from the Protein Data Bank (PDB ID: 3PLZ). The docking protocol for the known LRH-1 agonists and the virtual hits included protein preparation and ligand preparation. During protein preparation, ligand molecules and solvent molecules were deleted, hydrogen atoms were added, bond orders of the crystal structure were adjusted, and the protein was minimized up to an RMSD of 0.30 Å. An active site of 10 Å was defined around the co-crystallized ligand. Standard precision (SP) mode and other default parameters of the Glide software were used for the docking studies. To validate the docking protocol, the minimized conformation of the co-crystallized ligand was re-docked into the LRH-1 ligand binding domain. The final hits were selected based on the binding mode and the molecular interactions observed in the active site.

3. Results and discussion

3.1. Pharmacophore modelling

3.1.1. Pharmacophore model generation

In the pharmacophore modelling studies, the top ten hypotheses were generated using a diverse training set of LRH-1 agonists. Table 1 summarizes the results of pharmacophore hypothesis generation. The best pharmacophore hypothesis, Hypo1, was characterized by the best correlation coefficient of 0.927, the highest cost difference of 85.178 and the lowest RMS value of 1.411. Hypo1 consists of a spatial arrangement of five features: one hydrogen-bond donor (HBD), two general hydrophobic (H), one hydrophobic aromatic (HYAr) and one hydrophobic aliphatic (HYA) feature (Fig. 3). Table 2 lists the activity values estimated by Hypo1 for all training set compounds along with their corresponding error values (i.e., the ratio between calculated and experimental activity). Among the 16 training set compounds, all highly active (+++) compounds were predicted accurately, one least active (+) compound was overestimated as moderately active (++), and one moderately active (++) compound was underestimated as least active (+). Interestingly, highly active compounds were mapped to all pharmacophore features of Hypo1, while in moderately active and least active compounds one or two features were missing. All compounds in the training set were mapped to the HYAr and HYA features, which reveals that these two features are mainly responsible for the activity. The most active compound 1 showed a fitness score of 9.11, whereas the least active compound 16 showed a fitness score of 6.31 when mapped to Hypo1. In compound 1, the HBD feature corresponds to the amide NH, whereas for molecules with lower activity (10–16) at least one feature is missing.

Table 1. Statistical parameters of the top 10 pharmacophore hypotheses of LRH-1 agonists generated using the HypoGen algorithm.

| Hypo No. | Total cost | Cost difference (a) | RMS (b) | Correlation | Features (c) |
|---|---|---|---|---|---|
| Hypo1 | 75.995 | 85.178 | 1.411 | 0.927 | 1HBD,2H,1HYA,1HYAr |
| Hypo2 | 76.308 | 84.863 | 1.428 | 0.926 | 1HBD,3H,1HYAr |
| Hypo3 | 76.881 | 84.290 | 1.437 | 0.925 | 1HBD,1HYA,1HYAr |
| Hypo4 | 77.606 | 83.566 | 1.465 | 0.922 | 1HBD,1H,1HYA,1HYAr |
| Hypo5 | 81.231 | 79.941 | 1.618 | 0.904 | 1HBD,2H,1HYAr |
| Hypo6 | 82.672 | 78.499 | 1.681 | 0.895 | 1HBD,2HYA,1HYAr |
| Hypo7 | 84.243 | 76.928 | 1.727 | 0.889 | 1HBD,1H,1HYA,1HYAr |
| Hypo8 | 85.693 | 75.478 | 1.754 | 0.886 | 1HBD,1H,2HYA,1HYAr |
| Hypo9 | 85.950 | 75.221 | 1.800 | 0.879 | 1HBD,2H,1HYA,1HYAr |
| Hypo10 | 86.439 | 74.732 | 1.814 | 0.877 | 1HBD,1H,2HYA,1HYAr |

The null cost, the fixed cost and the configuration cost are 161.172, 59.993 and 12.419, respectively.
(a) Cost difference between the null cost and the total cost.
(b) RMS, root mean square deviation.
(c) Abbreviations used for features: HBD, hydrogen bond donor; H, hydrophobic; HYA, hydrophobic aliphatic; HYAr, hydrophobic aromatic.

Fig. 3. The chemical features of the best pharmacophore hypothesis (Hypo1) with their inter-feature distance constraints in angstrom (Å). The hydrogen-bond donor is shown as magenta vectored spheres and the hydrophobic features as cyan spheres.

Table 2. Experimental and predicted pEC50 activities of the LRH-1 agonists in the training set based on the Hypo1 pharmacophore model.

| Compound No. | Fit value | Exp. pEC50 | Rescaled exp. pEC50 | Rescaled pred. pEC50 | Error (a) | Exp. scale (b) | Pred. scale (b) |
|---|---|---|---|---|---|---|---|
| 1 | 9.11 | 7.92 | 7.92 | 7.41 | +3.25 | +++ | +++ |
| 2 | 8.37 | 7.52 | 7.31 | 6.67 | +4.41 | +++ | +++ |
| 3 | 9.10 | 7.22 | 6.85 | 7.39 | −3.50 | +++ | +++ |
| 4 | 8.42 | 7.00 | 6.52 | 6.71 | −1.56 | +++ | +++ |
| 5 | 8.13 | 6.82 | 6.25 | 6.43 | −1.50 | +++ | +++ |
| 6 | 7.32 | 6.70 | 6.06 | 5.62 | +2.77 | ++ | ++ |
| 7 | 7.34 | 6.60 | 5.90 | 5.63 | +1.86 | ++ | ++ |
| 8 | 7.94 | 6.49 | 5.74 | 6.24 | −3.15 | ++ | ++ |
| 9 | 6.12 | 6.10 | 5.14 | 4.42 | +5.26 | ++ | + |
| 10 | 6.76 | 6.00 | 4.99 | 5.06 | −1.17 | + | ++ |
| 11 | 6.21 | 5.90 | 4.84 | 4.51 | +2.14 | + | + |
| 12 | 6.33 | 5.80 | 4.68 | 4.62 | +1.14 | + | + |
| 13 | 6.29 | 5.70 | 4.53 | 4.59 | −1.14 | + | + |
| 14 | 6.33 | 5.60 | 4.38 | 4.62 | −1.74 | + | + |
| 15 | 6.32 | 5.50 | 4.23 | 4.62 | −2.44 | + | + |
| 16 | 6.31 | 5.30 | 3.92 | 4.61 | −4.89 | + | + |

(a) Error factor, given as the ratio between the predicted and experimental activity values. Judging from the tabulated values, '+' indicates that the predicted EC50 is higher than the experimental EC50 (predicted pEC50 lower), and '−' indicates that the predicted EC50 is lower than the experimental EC50 (predicted pEC50 higher); a value of 1 indicates that the predicted and experimental values are equal.
(b) Activity scale: pEC50 ≥ 6.00 = +++ (highly active); 6.00 > pEC50 > 5.00 = ++ (moderately active); pEC50 ≤ 5.00 = + (least active).

3.1.2. Pharmacophore model validation

The derived pharmacophore hypotheses were validated based on cost analysis, test set prediction, the decoy test and the Fischer randomization test. The various cost values were calculated during pharmacophore generation. A difference of 40–60 bits between the total cost and the null cost of a hypothesis indicates a 75–90% chance that it represents a true correlation in the data. The total cost and null cost values for the generated pharmacophore hypothesis Hypo1 were 75.995 and 161.172, respectively. The cost difference between the null cost and the total cost for Hypo1 was 85.178, which indicates a greater than 90% chance that Hypo1 represents a true correlation in the data. Hypo1 showed the highest correlation coefficient value of 0.927, which indicates its high predictive ability. In addition, its RMS value of less than 1.5 Å (1.411) further supports the predictive ability of the pharmacophore hypothesis. Among the top ten pharmacophore hypotheses, Hypo1 had the best statistical values, including higher correlation, greater cost difference, lower RMS value and low configuration cost. Based on this analysis, Hypo1 was considered the best pharmacophore hypothesis for further analysis. Test set prediction was employed as the first validation method. Hypo1 was used to estimate the activity of the test set compounds, and the biological activity of most compounds was predicted correctly. A coefficient of determination ($r^2$) of 0.764 shows a good correlation between the actual and estimated activities (Fig. 4).

Fig. 4. Scatter plot of predicted rescaled pEC50 value against experimental rescaled pEC50 value for training (blue triangles) and test set (red diamonds) compounds.

Table 3. Statistical parameters obtained from the decoy test.

| Sr. No. | Parameter | Hypo1 |
|---|---|---|
| 1 | Total compounds in database (D) | 185 |
| 2 | Total number of actives in database (A) | 5 |
| 3 | Total hits (Ht) | 7 |
| 4 | Active hits (TP) | 5 |
| 5 | True negatives (TN) | 178 |
| 6 | Enrichment factor or enhancement (E) | 26.42 |
| 7 | False negatives (FN = A − TP) | 0 |
| 8 | False positives (FP = Ht − TP) | 2 |
| 9 | GH score (goodness of hit list) | 0.776 |
| 10 | Accuracy = (TP + TN)/(TP + TN + FP + FN) | 0.989 |
| 11 | Precision = TP/(TP + FP) | 0.714 |
| 12 | Sensitivity = TP/(TP + FN) | 1.00 |
| 13 | Specificity = TN/(TN + FP) | 0.988 |
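The decoy-test statistics in Table 3 can be re-derived from four counts (D, A, Ht and TP) using the enrichment factor and goodness-of-hit definitions given in this section together with the standard confusion-matrix ratios. The sketch below is a cross-check only; the function name and the dictionary layout are illustrative.

```python
def decoy_metrics(D, A, Ht, TP):
    """Derive decoy-test statistics from database size, actives and hits."""
    FP = Ht - TP                 # decoys retrieved as hits
    FN = A - TP                  # actives that were missed
    TN = D - A - FP              # decoys correctly rejected
    E = (TP * D) / (Ht * A)      # enrichment factor
    GH = (TP * (3 * A + Ht)) / (4 * Ht * A) * (1 - FP / (D - A))
    return {
        "TN": TN, "FP": FP, "FN": FN, "E": E, "GH": GH,
        "accuracy": (TP + TN) / (TP + TN + FP + FN),
        "precision": TP / (TP + FP),
        "sensitivity": TP / (TP + FN),
        "specificity": TN / (TN + FP),
    }


for name, value in decoy_metrics(D=185, A=5, Ht=7, TP=5).items():
    print(name, round(value, 3))
```

For the counts in Table 3 this gives E ≈ 26.43 and GH ≈ 0.777, which matches the tabulated 26.42 and 0.776 to within the last digit; the tabulated values appear to be truncated rather than rounded.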


In detail, 6 of 7 highly active, 6 of 13 moderately active and 8 of 11 least active compounds were predicted correctly. One highly active compound was underestimated as moderately active. One moderately active compound was underestimated as least active and six moderately active compounds were overestimated as highly active. Two least active compounds were overestimated as moderately active and one as highly active. Mapping of the most active test set compound (17) onto Hypo1 (Fig. 5a) shows that all features were mapped accurately, whereas in the least active compound (47) the HBD and HYAr features were missing (Fig. 5b). In the second validation method, the decoy test, a small database (D) of 185 compounds, including five compounds active as LRH-1 agonists, was used to check whether the best pharmacophore hypothesis (Hypo1) is able to discriminate active from inactive compounds. Database screening was performed using Hypo1 as a 3D structural query. For further analysis, the enrichment factor (E value) and goodness of hit score (GH) were calculated using the following formulae:

$$E = \frac{\mathrm{TP} \times D}{\mathrm{Ht} \times A} \tag{1}$$

$$\mathrm{GH} = \left(\frac{\mathrm{TP}\,(3A + \mathrm{Ht})}{4\,\mathrm{Ht}\,A}\right) \times \left(1 - \frac{\mathrm{Ht} - \mathrm{TP}}{D - A}\right) \tag{2}$$

where Ht, TP, D and A represent the total number of hits retrieved by the pharmacophore model, the number of active compounds among those hits, the total number of compounds in the database and the total number of actives in the database, respectively. Hypo1 showed an E value of 26.42 and a GH score of 0.776, which is greater than 0.5 and indicates that the quality of the developed pharmacophore hypothesis is significant (Table 3). From the overall validation results, we conclude that Hypo1 was able to discriminate between the actives and the decoys. Finally, the Fischer randomization test was applied as the third validation process to achieve a 95% confidence level. The pharmacophore hypotheses generated from the 19 random runs were compared with the original pharmacophore hypothesis, Hypo1. Fig. 6 shows that none of the randomly generated hypotheses scored better statistical results than Hypo1. These results clearly show that Hypo1 was not generated by chance correlation.

Fig. 5. (a) Pharmacophore model aligned with the most active test set compound 17 (rescaled pEC50 = 7.52). (b) Pharmacophore model aligned with the least active compound 47 (rescaled pEC50 = 4.23).

3.1.3. Pharmacophore model based virtual screening

Fig. 7 shows the sequential virtual screening workflow followed during identification of novel LRH-1 agonists. The validated pharmacophore hypothesis (Hypo1) was used as a query to search the NCI, ChemDiv and Specs databases, which comprise 87,374, 843,113 and 276,807 compounds, respectively. After filtering all the databases by Lipinski's rule of five, 868,725 lead-like compounds were obtained. Pharmacophore-based virtual screening of the databases using Hypo1 yielded 862,929 hits. A set of 352 hit compounds that had estimated pEC50 values above 7 and conformed to Lipinski's rule was selected.

Fig. 6. The difference in the cost value of hypotheses between the initial spreadsheet and 19 random spreadsheets after the Fischer randomization test (CatScramble run).

Fig. 7. Flow chart showing the sequential virtual screening techniques followed during identification of novel LRH-1 agonists.

Table 4. Summary of results from CoMFA and CoMSIA analysis. Columns S, E, H, D and A give the relative % contributions of the fields.

| Model | r²cv (k) | r²ncv (l) | SEE (m) | ONC (n) | F (o) | r²pred (p) | S | E | H | D | A |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CoMFA | 0.767 | 0.991 | 0.098 | 7 | 495.33 | 0.87 | 70 | 30 | – | – | – |
| CoMSIA (a) | 0.705 | 0.97 | 0.181 | 6 | 167.127 | 0.73 | 59.1 | 40.9 | – | – | – |
| CoMSIA (b) | 0.703 | 0.987 | 0.123 | 7 | 315.452 | 0.78 | 31.1 | 41 | 27.9 | – | – |
| CoMSIA (c) | 0.59 | 0.966 | 0.194 | 7 | 123.058 | 0.70 | 45.6 | 33 | – | 21.4 | – |
| CoMSIA (d) | 0.644 | 0.977 | 0.159 | 7 | 185.472 | 0.80 | 25.8 | 24.5 | 38.1 | 11.7 | – |
| CoMSIA (e) | 0.614 | 0.986 | 0.13 | 8 | 246.836 | 0.81 | 27.6 | 19.1 | 34.7 | – | 18.6 |
| CoMSIA (f) | 0.665 | 0.953 | 0.229 | 7 | 87.86 | 0.76 | 48.3 | 25.9 | – | – | 25.8 |
| CoMSIA (g) | 0.404 | 0.853 | 0.39 | 4 | 46.603 | 0.84 | 30.1 | – | 43 | 26.9 | – |
| CoMSIA (h) | 0.6 | 0.918 | 0.295 | 5 | 71.203 | 0.81 | 33.2 | – | 39.8 | – | 27 |
| CoMSIA (i) | 0.492 | 0.885 | 0.354 | 6 | 39.621 | 0.75 | 41.2 | – | – | 32.6 | 26.2 |
| CoMSIA (j) | 0.593 | 0.979 | 0.158 | 8 | 165.75 | 0.80 | 22.5 | 16.4 | 31.5 | 11.7 | 17.7 |

S – Steric; E – Electrostatic; H – Hydrophobic; D – Hydrogen Donor; A – Hydrogen Acceptor.
Field combinations: (a) SE; (b) SEH; (c) SED; (d) SEHD; (e) SEHA; (f) SEA; (g) SHD; (h) SHA; (i) SDA; (j) SEHDA.
(k) Cross-validated correlation coefficient. (l) Non-cross-validated correlation coefficient. (m) Standard error of estimate. (n) Optimal number of components. (o) F-test value. (p) Predictive correlation coefficient.


The selected hit compounds were further analyzed by molecular docking to eliminate false-positive hits from virtual screening.
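The hit-selection step described above (estimated pEC50 above 7 and compliance with Lipinski's rule of five) can be sketched as a simple filter. This is a minimal illustration, not the workflow used in this study: the `Hit` record, its field names and the example compounds are hypothetical, and the sketch assumes that compliance means zero violations of the four classical criteria (molecular weight ≤ 500, logP ≤ 5, hydrogen bond donors ≤ 5, hydrogen bond acceptors ≤ 10).

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one screened compound."""
    name: str
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int
    est_pec50: float  # activity estimated by the pharmacophore model


def lipinski_violations(hit):
    """Count violations of the four rule-of-five criteria."""
    return sum([
        hit.mol_weight > 500,
        hit.logp > 5,
        hit.h_donors > 5,
        hit.h_acceptors > 10,
    ])


def select_hits(hits, min_pec50=7.0, max_violations=0):
    """Keep hits above the activity cutoff that pass the rule of five."""
    return [
        h for h in hits
        if h.est_pec50 > min_pec50 and lipinski_violations(h) <= max_violations
    ]


if __name__ == "__main__":
    # Hypothetical compounds for illustration only.
    candidates = [
        Hit("hit-A", 420.5, 4.1, 1, 3, 7.6),
        Hit("hit-B", 512.0, 4.0, 1, 3, 7.9),  # fails the molecular weight limit
        Hit("hit-C", 390.0, 3.2, 2, 4, 6.5),  # fails the activity cutoff
    ]
    print([h.name for h in select_hits(candidates)])
```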

3.2. 3D-QSAR (CoMFA and CoMSIA) modelling

Fig. 8. Graph of experimental versus predicted rescaled pEC50 value of the training set and the test set compounds using validated: (a) CoMFA and (b) CoMSIA model.

3.2.1. CoMFA and CoMSIA model generation

The statistical parameters obtained from the CoMFA and CoMSIA analysis are listed in Table 4. The analysis of the resulting 3D-QSAR models showed that the best CoMFA model was obtained with combined steric and electrostatic fields. It yielded a cross-validated r²cv of 0.767 with seven as the optimum number of components, a non-cross-validated r²ncv of 0.991, a high F value of 495.33 and a low standard error of estimate (SEE) of 0.098. In this model, the steric and electrostatic contributions were found to be 70% and 30%, respectively.
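For readers who wish to reproduce statistics of this kind, the sketch below shows how r², SEE and the F value are conventionally computed for a regression model with a given number of components. It assumes the standard definitions (r² = 1 − RSS/TSS, SEE = sqrt(RSS/(n − c − 1)), F = (r²/c)/((1 − r²)/(n − c − 1))); the exact expressions used by the modelling software are not given in the text, and the activity values are hypothetical. Passing leave-one-out predictions to `r_squared` gives the cross-validated analogue (r²cv).

```python
import math


def r_squared(observed, predicted):
    """r2 = 1 - RSS / TSS. With leave-one-out predictions this is r2cv."""
    mean_obs = sum(observed) / len(observed)
    rss = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    tss = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - rss / tss


def standard_error_of_estimate(observed, predicted, n_components):
    """SEE = sqrt(RSS / (n - c - 1)) for a model with c components."""
    n = len(observed)
    rss = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return math.sqrt(rss / (n - n_components - 1))


def f_value(r2, n_compounds, n_components):
    """F = (r2 / c) / ((1 - r2) / (n - c - 1))."""
    return (r2 / n_components) / ((1.0 - r2) / (n_compounds - n_components - 1))


if __name__ == "__main__":
    # Hypothetical observed and fitted activities, for illustration only.
    obs = [1.0, 2.0, 3.0, 4.0]
    fit = [1.1, 1.9, 3.2, 3.8]
    r2 = r_squared(obs, fit)
    print("r2  =", round(r2, 3))
    print("SEE =", round(standard_error_of_estimate(obs, fit, 1), 3))
    print("F   =", round(f_value(r2, len(obs), 1), 1))
```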

Using various combinations of CoMSIA descriptor fields, ten different CoMSIA models were developed. The model consisting of steric and electrostatic CoMSIA fields exhibited a high cross-validated r²cv of 0.705 with six as the optimum number of components. The best CoMSIA model was selected based on the r²ncv, SEE and F values. A model consisting of steric, electrostatic and hydrophobic fields, with a cross-validated r²cv of 0.703, seven as the optimum number of components and a conventional r²ncv of 0.987, was selected for further analysis. The F value and SEE of this model were 315.452 and 0.123, respectively. The relative contributions of the steric, electrostatic and hydrophobic fields were found to be 31.1%, 41% and 27.9%, respectively. The scatter plots of experimental pEC50 against the CoMFA- and CoMSIA-predicted pEC50 for both the training set and the test set are shown in Fig. 8. These results indicate that the models thus developed were consistent, and they were supported by bootstrapping results for 100 runs, with r²bs of 1.000 for CoMFA and r²bs of 0.997 for CoMSIA.

Fig. 10. CoMSIA hydrophobic field distribution around the highly active compound 1.

Table 5. Ten cross-validated (r²cv) and non-cross-validated (r²ncv) correlation coefficients from the activity (Y) randomization test study.

| Sr. No. | CoMFA r²ncv | CoMFA r²cv | CoMSIA r²ncv | CoMSIA r²cv |
|---|---|---|---|---|
| 1 | 0.214 | −0.159 | 0.171 | −0.068 |
| 2 | 0.315 | −0.182 | 0.249 | −0.059 |
| 3 | 0.265 | −0.113 | 0.329 | −0.190 |
| 4 | 0.312 | 0.157 | 0.315 | 0.187 |
| 5 | 0.347 | −0.083 | 0.301 | −0.070 |
| 6 | 0.249 | −0.159 | 0.213 | −0.237 |
| 7 | 0.282 | −0.088 | 0.460 | −0.072 |
| 8 | 0.195 | −0.036 | 0.232 | −0.085 |
| 9 | 0.202 | −0.030 | 0.246 | −0.075 |
| 10 | 0.234 | −0.157 | 0.189 | −0.211 |

Further, non-cross validated CoMFA and CoMSIA models were

used to generate the 3D contour maps. The contour maps illustrate lattice points and the variation in the molecular field values at those lattice points. Variation in the molecular field is mainly associated with variation in the receptor binding affinity. Molecular field analysis helps in finding the favorable or unfavorable interaction energies of aligned molecules with the aid of a probe atom surrounding the molecules. These 3D color contour maps provide clues for the modifications required to design new molecules with better agonistic activity.

Fig. 9a and b show the distribution of steric and electrostatic fields, respectively, generated using the validated CoMFA model. The green and yellow contours (80% and 20% contributions) indicate the favorable and unfavorable steric interactions, respectively. A green contour found around the 3rd position of the hexahydropentalene ring indicates that bulky groups at this position favor the activity. The agonists having less bulky substitutions such as propoxy (46), butoxy (16), 2-methyl-butoxy (14), 2-propoxy (15) and hex-1-en-2-yl (44) at the 3rd position of the hexahydropentalene ring were weakly active. The agonists having bulky substitutions such as phenylamine, styrene, benzoxy and cyclohexyloxy at the 3rd position of the hexahydropentalene ring were moderately to highly active. Another green contour is located around the 5th position of the hexahydropentalene ring. Activity increased upon introduction of bulk (oct-4-en-4-yl) at the 5th position of the hexahydropentalene ring (1, 3, 17 and 20), whereas the compound with a less bulky substitution such as hex-3-en-3-yl (41) is weakly active. One yellow region is located in the steric contour plot at the 4th position of the phenyl–vinyl/phenyl–amine ring, suggesting that bulky groups in this region disfavor the agonistic activity (10, 13 and 37), whereas the compound without substitution showed higher activity (7). Two more yellow regions are located around the 5th position of the hexahydropentalene ring, suggesting that bulky aliphatic substitutions (decan-2-yl, octyl, hexyl) in these regions disfavor the activity (8, 12 and 13). A small aliphatic chain substitution (butyl) in the same regions resulted in an increase in the agonistic activity (6).

The electrostatic effects of the substituents were analyzed using the blue and red contour maps (80% and 20% contributions). The blue and red contours show the favorable and unfavorable positive electrostatic interactions, respectively. One big blue contour covering the nitrogen of the phenyl–amine group indicates that electropositive potential favors the activity (18). The presence of a less electropositive group (vinyl) at the same position results in moderate activity (7), whereas the presence of an electronegative group (oxygen) at the same position results in a decrease in activity (36). The blue contour in the proximity of the nitrogen of the phenyl–amine group corresponds to the hydrogen bond donor feature of the best pharmacophore hypothesis and also to the hydrogen bond interaction with His390/Arg393. A small red contour is located in the proximity of the 5th position of the hexahydropentalene ring, suggesting that electronegative substitutions are favorable for agonistic activity. The moderately electronegative substitution (dimethyl phenyl silane) at this position showed moderate activity (22), whereas compound 12, which has no electronegative substitution, was less active.

Fig. 9. CoMFA contour maps for LRH-1 agonists: (a) the steric field distribution and (b) the electrostatic field distribution around the highly active compound 1.

The CoMSIA contour maps consist of steric, electrostatic and hydrophobic fields. The steric and electrostatic contour maps generated by the validated CoMSIA and CoMFA models were found to be comparatively similar. Fig. 10 shows contour plots for the CoMSIA hydrophobic fields. The cyan and pink contours (80% and 20% contributions) indicate the favorable and unfavorable hydrophobic interactions, respectively. A big cyan contour covering the benzene ring of the phenylamine and styrene groups indicates that hydrophobic substitution favors the agonistic activity. The agonists with no hydrophobic substituents at the same position were found to be least active (14–16 and 44–47). One big pink contour located at the 4th position of the phenylamine and styrene groups indicates that hydrophobic substitutions there are unfavorable for agonistic activity. The compound having a hydrophobic substituent such as chlorine at this position is less potent (37), whereas the compound with no substitution is more potent (18). Three pink contours are located around substitutions at the 5th position of the hexahydropentalene ring, suggesting that an increase in hydrophobic substitution at this position decreases the agonistic activity. The compound having a more hydrophobic substitution such as octyl (12) was found to be less potent, whereas the compounds bearing less hydrophobic substitutions such as hexyl (7) and butyl (6) were found to be moderately active.

Fig. 11. The binding pocket of LRH-1 (PDB ID: 3PLZ) with docked conformation of: (a) highly active compound 1 and (b) top screened virtual hit (ZINC20355647).

The derived CoMFA and CoMSIA models were validated by predicting the activity of the test set agonists. All the test set agonists were constructed, minimized and aligned with template compound 1 in the same manner as the training set. The predicted activities of the test set compounds are in good agreement with the experimental activities within an acceptable error range, as verified by the CoMFA r²pred value of 0.87 and the CoMSIA r²pred value of 0.78. The CoMSIA model consisting of steric, hydrophobic and hydrogen bond donor fields exhibited a high r²pred value of 0.84, whereas the CoMSIA model consisting of steric, electrostatic and hydrogen bond donor fields showed the lowest prediction value (r²pred of 0.70) among all the developed CoMSIA models. Thus, the CoMSIA contour map analysis suggests that the hydrophobic field plays an important role in LRH-1 agonistic activity.
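The predictive correlation coefficient quoted above is commonly defined as r²pred = (SD − PRESS)/SD, where SD is the sum of squared deviations between the test set activities and the mean training set activity, and PRESS is the sum of squared deviations between predicted and experimental test set activities. The sketch below assumes this conventional definition, which the text does not spell out, and uses hypothetical pEC50 values.

```python
def predictive_r2(train_observed, test_observed, test_predicted):
    """r2pred = (SD - PRESS) / SD, using the training set mean as reference."""
    train_mean = sum(train_observed) / len(train_observed)
    sd = sum((y - train_mean) ** 2 for y in test_observed)
    press = sum((y - p) ** 2 for y, p in zip(test_observed, test_predicted))
    return (sd - press) / sd


if __name__ == "__main__":
    # Hypothetical pEC50 values, for illustration only.
    train_obs = [5.0, 6.0, 7.0]
    test_obs = [5.0, 7.0]
    test_pred = [5.5, 6.5]
    print("r2pred =", predictive_r2(train_obs, test_obs, test_pred))
```

Because the reference is the training set mean rather than the test set mean, r²pred can become negative when a model predicts the test set worse than that mean does.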

Finally, a Y-randomization test was performed to assess the robustness of the developed 3D-QSAR models. Ten random shuffles of the pEC50 values were performed and models were generated. Table 5 shows the results obtained for the Y-randomization test. The low r²ncv and r²cv values show that the good results of the original models are not due to chance correlation. These results show that both the CoMFA and CoMSIA models are reliable and can be useful in designing new potent LRH-1 agonists.
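The logic of the Y-randomization test can be illustrated with a deliberately simplified stand-in for the 3D-QSAR models: a one-descriptor least-squares fit replaces the PLS models of the actual study, and the descriptor and activity values are hypothetical. The activities are shuffled ten times, the model is refitted each time, and the resulting r² values are compared with that of the unshuffled data.

```python
import random


def fit_r2(x, y):
    """r2 of a one-descriptor least-squares line (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def y_randomization(x, y, n_shuffles=10, seed=0):
    """Refit the model on shuffled activities and collect the r2 values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_shuffles):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the descriptor-activity pairing
        scores.append(fit_r2(x, shuffled))
    return scores


if __name__ == "__main__":
    # Hypothetical descriptor and pEC50 values, for illustration only.
    descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    activity = [4.2, 4.9, 5.3, 6.1, 6.4, 7.2, 7.4, 8.1]
    print("original r2:", round(fit_r2(descriptor, activity), 3))
    print("shuffled r2:", [round(s, 3) for s in y_randomization(descriptor, activity)])
```

A model that reflects a real structure-activity relationship should lose most of its r² once the activities are scrambled, which is the pattern reported in Table 5.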

3.3. Molecular docking studies

All the LRH-1 agonists and screened compounds were docked into the active site of LRH-1 using Glide 5.5. The active site of LRH-1 mainly consists of hydrophobic residues (Phe342, Met345, Trp382, Ile387, Leu424, Ile403, Met428, Ala513 and Leu517), the positively charged residues His390 and Arg393, and the nucleophilic residue Ser383. The docking protocol was validated by re-docking the minimized conformation of the co-crystallized ligand, which gave an RMSD of 0.994 Å. The amide NH group of the most potent agonists (compounds 1–5) formed a hydrogen bond with the imidazole side chain of His390. All the agonists showed a π–π interaction with His390. All the screened hits were analyzed for binding orientation and docking score. Finally, nine hits were selected (Fig. S3) that showed the hydrogen bond interaction with His390/Arg393 and the π–π interaction with His390 (Fig. S4), and that also mapped to all the features of Hypo1 (Fig. S5). The selected hits showed better docking scores (reported as > −8.654) than the most potent agonist (−7.535). The docked conformations of the most potent agonist and the top virtual hit are shown in Fig. 11. Further, searches with the PubChem and SciFinder Scholar tools confirmed that these compounds have not been reported as LRH-1 agonists. Hence, we suggest that these nine compounds could be novel scaffolds as LRH-1 agonists.
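The re-docking validation above rests on the root-mean-square deviation between the crystallographic and re-docked ligand poses. A minimal sketch of that calculation follows. It assumes that both poses list the same atoms in the same order and share the receptor coordinate frame, so no superposition is applied; the coordinates are hypothetical, and the 2.0 Å acceptance threshold is a commonly used convention rather than a criterion stated in the text.

```python
import math


def rmsd(coords_a, coords_b):
    """In-place RMSD (Å) between two poses with identical atom ordering."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must contain the same, non-zero number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical heavy-atom coordinates (Å), for illustration only.
    crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.4, 0.0)]
    redocked_pose = [(0.3, 0.1, 0.0), (1.6, 0.4, 0.2), (1.2, 1.5, 0.5)]
    value = rmsd(crystal_pose, redocked_pose)
    print("RMSD (Å):", round(value, 3))
    print("within the assumed 2.0 Å threshold:", value <= 2.0)
```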

4. Conclusions

In this study, we have developed quantitative pharmacophore, CoMFA and CoMSIA models for a series of LRH-1 agonists as anti-diabetic agents. The best pharmacophore hypothesis, Hypo1, was validated by different methods to evaluate its predictive power over the diverse test set compounds. The highly predictive hypothesis was then used in virtual screening of three diverse chemical databases for the identification of novel LRH-1 agonists. The hits from the virtual screening were filtered based on the estimated activity values, docking score and binding orientation. Combining all these results, nine new compounds are presented as possible lead candidates for novel and potent LRH-1 agonists. As far as we are aware from searches with the PubChem and SciFinder Scholar tools, the LRH-1 agonistic activity of these compounds has not been reported in the literature.

The CoMFA and CoMSIA study correlated the LRH-1 agonistic activities of 47 bicyclo-[3.3.0]oct-2-ene derivatives with several stereo-chemical parameters representing steric, electrostatic, hydrophobic, hydrogen bond donor and acceptor fields. Stable and statistically reliable CoMFA and CoMSIA models were developed, which suggested that the hydrophobic field plays an important role in LRH-1 agonistic activity. Molecular docking analysis revealed that the π–π interaction with His390 and the hydrogen bond interaction with His390/Arg393 are essential for LRH-1 agonistic activity. Further in vitro testing of the virtual hits obtained would be necessary to confirm the success rate of this work and to optimize the hits subsequently.

Acknowledgement

The authors acknowledge financial support from the Department of Science and Technology (DST), New Delhi.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.molstruc.2013.06.035.

References

[1] E. Fayard, J. Auwerx, K. Schoonjans, Trends Cell Biol. 14 (2004) 250–260.
[2] D.J. Shin, T.F. Osborne, J. Biol. Chem. 283 (2008) 15089–15096.
[3] E. Fayard, K. Schoonjans, J.S. Annicotte, J. Auwerx, J. Biol. Chem. 278 (2003) 35725–35731.
[4] Y. Miki, C.D. Clyne, T. Suzuki, T. Moriya, R. Shibuya, Y. Nakamura, T. Ishida, N. Yabuki, K. Kitada, S. Hayashi, Cancer Lett. 244 (2006) 24–33.
[5] T. Yazawa, Y. Inaoka, R. Okada, T. Mizutani, Y. Yamazaki, Y. Usami, M. Kuribayashi, M. Orisaka, A. Umezawa, K. Miyamoto, Mol. Endocrinol. 24 (2010) 485–496.
[6] C.D. Clyne, C.J. Speed, J. Zhou, E.R. Simpson, J. Biol. Chem. 277 (2002) 20591–20597.
[7] C.D. Clyne, A. Kovacic, C.J. Speed, J. Zhou, V. Pezzi, E.R. Simpson, Mol. Cell. Endocrinol. 215 (2004) 39–44.
[8] E.P. Sablin, I.N. Krylova, R.J. Fletterick, H.A. Ingraham, Mol. Cell 11 (2003) 1575–1585.
[9] I.N. Krylova, E.P. Sablin, J. Moore, R.X. Xu, G.M. Waitt, J.A. MacKay, D. Juzumiene, J.M. Bynum, K. Madauss, V. Montana, Cell 120 (2005) 343–355.
[10] E.A. Ortlund, Y. Lee, I.H. Solomon, J.M. Hager, R. Safi, Y. Choi, Z. Guan, A. Tripathy, C.R.H. Raetz, D.P. McDonnell, Nat. Struct. Mol. Biol. 12 (2005) 357–363.
[11] W. Wang, C. Zhang, A. Marimuthu, H.I. Krupka, M. Tabrizizad, R. Shelloe, U. Mehra, K. Eng, H. Nguyen, C. Settachatgul, Proc. Natl. Acad. Sci. USA 102 (2005) 7505–7510.
[12] J.M. Lee, Y.K. Lee, J.L. Mamrosh, S.A. Busby, P.R. Griffin, M.C. Pathak, E.A. Ortlund, D.D. Moore, Nature 474 (2011) 506–510.
[13] R.J. Whitby, J. Stec, R.D. Blind, S. Dixon, L.M. Leesnitzer, L.A. Orband-Miller, S.P. Williams, T.M. Willson, R. Xu, W.J. Zuercher, J. Med. Chem. 54 (2011) 2266–2281.
[14] R.J. Whitby, S. Dixon, P.R. Maloney, P. Delerive, B.J. Goodwin, D.J. Parks, T.M. Willson, J. Med. Chem. 49 (2006) 6652–6655.
[15] Discovery Studio, Version 2.5; Accelrys Inc: San Diego, CA, 2009.
[16] A. Smellie, S.L. Teig, P. Towbin, J. Comput. Chem. 16 (1995) 171–187.
[17] A. Bhadauriya, G.V. Dhoke, R.P. Gangwal, M.V. Damre, A.T. Sangamwar, Mol. Divers. 17 (2013) 139–149.
[18] G.V. Dhoke, R.P. Gangwal, A.T. Sangamwar, J. Mol. Struct. 1028 (2012) 22–30.
[19] U. Singh, R.P. Gangwal, R. Prajapati, G.V. Dhoke, A.T. Sangamwar, Mol. Simul. 39 (2013) 385–396.
[20] P. Gupta, P. Garg, N. Roy, Mol. Divers. 15 (2011) 733–750.
[21] S. Nandi, M.C. Bagchi, Mol. Divers. 14 (2010) 27–38.
[22] P.S. Ambure, R.P. Gangwal, A.T. Sangamwar, Mol. Divers. 16 (2012) 377–388.
[23] U. Singh, R.P. Gangwal, G.V. Dhoke, R. Prajapati, M. Damre, A.T. Sangamwar, AJC (2012).
[24] Glide, Version 5.5, Schrödinger, LLC, New York, NY, 2009.