
CoMSIA Study on Substituted Aryl Alkanoic Acid Analogs as GPR40 Agonists


Aaditya Bhatt, Pallav D. Patel, Maulik R. Patel, Satyakam Singh, Cesar A. Lau-Cam and Tanaji T. Talele*

Department of Pharmaceutical Sciences, College of Pharmacy and Allied Health Professions, St John's University, 8000 Utopia Parkway, Queens, NY 11439, USA
*Corresponding author: Tanaji T. Talele, [email protected]

GPR40, a G-protein-coupled receptor, has been well established to play a crucial role in regulating blood glucose levels. Hence, GPR40 is a potential target for future antidiabetic agents. The present 3D QSAR study is aimed at delineating structural parameters governing GPR40 agonistic activity. To meet this objective, a comparative molecular similarity indices analysis for 63 different GPR40 agonists was performed using two methods: a ligand-based 3D QSAR model employing the atom fit alignment method and a receptor-based 3D QSAR model derived from the predicted binding conformations obtained by docking all the GPR40 agonists at the active site of GPR40. The results of these studies showed the ligand-based model to be superior (r²cv value of 0.610) to the receptor-based model (r²cv value of 0.519) in terms of statistical data. The predictive ability of these models was evaluated using a test set of 15 compounds not included in the preliminary training set of 48 compounds. The predictive r² values for the ligand- and the receptor-based models were found to be 0.863 and 0.599, respectively. Further, interpretation of the comparative molecular similarity indices analysis contour maps with reference to the active site of GPR40 provided an insight into GPR40–agonist interactions.

Key words: aryl alkanoic acid, CoMSIA, GPR40, type 2 diabetes

Received 29 November 2010, revised 12 January 2011 and accepted for publication 20 February 2011

Fatty acids can bind to and activate a G-protein-coupled receptor (GPCR) variously designated as GPR40 or free fatty acid receptor 1 (FFAR1) to increase glucose-dependent insulin secretion (1–5) through the inositol triphosphate/diacylglycerol (IP3/DAG) secondary messenger system (6). Recent phase II clinical experience with a GPR40 agonist (TAK-875) illustrates the tractable nature of this receptor for the development of antidiabetic drugs (7). Hence, GPR40 is a potential target for the treatment of type 2 diabetes. Aryl alkanoic acid derivatives possessing GPR40 agonistic activity represent a promising group of compounds for the treatment of diabetes that deserves computational investigation. A high-throughput screening of the GlaxoSmithKline compound library generated a lead compound bearing a phenylpropionic acid pharmacophore with a potency comparable to that of linoleic acid when determined by the luciferase reporter assay (8,9). Efforts to improve the activity of this lead compound by chemical modification were initially focused on the three-carbon-long polar acid head group, considered to be essential for optimal GPR40 agonistic activity. However, both homologation and shortening of this head group reduced the agonistic activity seen in the luciferase assay (9). In contrast, the introduction of the optically pure trans-cyclopropyl moiety with a –COOH function as the head group led to an increase in activity. A further modification of the lead compound was the replacement of the acid group with a primary or secondary amide in an attempt to prevent the generation of the acyl glucuronide and the plasma protein binding of the acid analog. Although the resulting primary and secondary amides retained the GPR40 agonistic activity, they were less potent than the free acid form (8).

Taking into account the existing experimental evidence, combining ligand- and receptor-based molecular modeling approaches should make it possible to identify and optimize the design of new aryl alkanoic acid derivatives possessing a high GPR40 agonistic activity. For this reason, this study was conceived to apply the comparative molecular similarity indices analysis (CoMSIA) (10) 3D QSAR method to the aryl alkanoic acid derivatives possessing GPR40 agonistic activity. The CoMSIA approach calculates similarity indices in the space surrounding each of the aligned molecules in the data set. CoMSIA is believed to be less affected by changes in molecular alignment and provides smooth and interpretable contour maps as a result of the Gaussian-type distance dependence of the molecular similarity indices it uses. Furthermore, in addition to the steric and electrostatic fields of comparative molecular field analysis (CoMFA) (11), CoMSIA defines explicit hydrophobic, hydrogen bond donor, and hydrogen bond acceptor fields (10). In addition, docking simulations were performed using the homology model of the complex of GPR40 with a phenylpropionic acid derivative to elucidate the probable binding modes of these agonists at the active site of the receptor (12). By performing the computational analysis on the aryl alkanoic acid scaffolds and by examining the topographical features of the active site in the context of the best CoMSIA contour maps, we have confirmed the identity of the ligand-binding amino acid residues of GPR40 previously revealed by other techniques (12). This type of analysis should be of great help in predicting the binding behavior of novel GPR40 agonists to GPR40.


Chem Biol Drug Des 2011; 77: 361–372

Research Article

© 2011 John Wiley & Sons A/S

doi: 10.1111/j.1747-0285.2011.01112.x


Materials and Methods

Data sets and biological activity

The training set and the test set used in the study consisted of a series of 1,4-disubstituted aryl alkanoic acid analogs that have been found to exhibit GPR40 agonistic activity. The pEC50 (−log EC50) values, reported as the average of two to four experiments (8,9), were used as the dependent variables in the CoMSIA QSAR analysis. Ideally, the pEC50 values of the training data set should span approximately 3 log units; the pEC50 values of the training set described in this report span 3.81 log units. The structures and the range of activities of the compounds included in the initial test set (15 compounds) and the training set (48 compounds) are shown in Table 1. The compounds were distributed such that each compound in the test set had a representative compound in the training set, so that the training set model could be expected to predict the test set compounds well.
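The pEC50 definition and the activity span quoted above can be illustrated with a minimal Python sketch. The function names and the EC50 values are hypothetical and serve only to show the conversion; they are not taken from the reported data set.

```python
import math


def pec50_from_ec50(ec50_molar):
    """Return pEC50 = -log10(EC50) for an EC50 given in mol/L."""
    if ec50_molar <= 0:
        raise ValueError("EC50 must be a positive concentration")
    return -math.log10(ec50_molar)


def activity_span(pec50_values):
    """Return the spread (max - min) of a set of pEC50 values in log units."""
    return max(pec50_values) - min(pec50_values)


# Hypothetical EC50 values (mol/L), chosen only to illustrate the conversion.
example_ec50 = [5e-9, 1e-6, 3.2e-5]
example_pec50 = [pec50_from_ec50(value) for value in example_ec50]
print([round(p, 2) for p in example_pec50])
print(round(activity_span(example_pec50), 2))
```

Applied to a training set whose highest and lowest pEC50 values are 8.31 and 4.50, `activity_span` gives the 3.81 log units quoted in the text.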

Molecular modeling and alignment

The CoMSIA study was carried out using SYBYL version 7.2a on a DELL Precision 470n workstation with the RHEL 4.0 operating system. The position of each atom is important for CoMSIA because the descriptors are calculated on a 3D grid. The 3D structures of the compounds in the training and the test sets were constructed using the Sketch Molecule function in SYBYL. Of the initial set of 70 compounds in the series, characterized by 1H NMR and LC/MS (purity >95%) and tested using the reporter assay expressing human GPR40 in a CHO cell line (8), seven compounds were dropped from the initial model generation because their activities were reported for racemic mixtures; they were later assigned to an additional test set, each as its best-aligned enantiomer, to validate the predictive power, accommodative capacity, and robustness of the training set model in the CoMSIA analysis (8,9). Compounds 15, 16, 17, 18, and 26–31, bearing the cyclopropyl ring on their side chain, were reported as pure enantiomers based on vibrational circular dichroism analysis (8,9). They were therefore modeled as the S,S enantiomer in the case of compounds 15, 16, and 26–31, the R,R isomer in the case of compound 17, and the R,S isomer in the case of compound 18 (Table 1). Here the first chiral label always refers to the carbon atom of the cyclopropyl ring linked to the phenyl core, and the second to the carbon atom of the cyclopropyl ring linked to the carboxylic acid group. As the absolute configurations of compounds 34 and 40 were not reported, they were individually modeled as the R and S enantiomers at the carbon α to the carboxylic acid group in the polar side chain. Finally, the S enantiomer of these two compounds was included in the study, as it was found to fit well in the alignment model selected for the model generation.

The energy minimizations were performed with the Tripos force field (13) and Gasteiger–Marsili charges (14), using steepest descent followed by the conjugate gradient method with a convergence criterion of 0.001 kcal/(mol·Å). For the ligand-based model, the most potent molecule (compound 16) was chosen as the template onto which the rest of the training and test set compounds were fitted using the SYBYL fit atom function. The reference atoms C1, C2, C3, and C4 in compound 16 used for alignment are shown in Figure 1A. The resulting alignment model (Figure 1B) generated by the atom fit alignment rule was then subjected to the CoMFA and CoMSIA studies. The CoMFA (steric and electrostatic) and the CoMSIA descriptor fields (steric, electrostatic, hydrophobic, hydrogen bond donor, and hydrogen bond acceptor) were generated for the training set of 48 compounds using SYBYL 7.2, which yielded cross-validated and non-cross-validated correlation coefficients. Cross-validation used partial least squares (PLS) analysis with the leave-one-out (LOO) method available in SYBYL. The analysis was performed using a column-filtering criterion of 2 kcal/mol with five optimum components. Moreover, as both the steric and electrostatic parameters of CoMFA are included in the CoMSIA analysis, only the statistically significant CoMSIA study was selected for further computational analysis and contour map interpretation (statistical data of the CoMFA analysis are available in Table S1).

The second alignment model was the receptor-based model, in which the molecules were aligned according to the bioactive conformations obtained from the docking experiments (Figure S1). The molecules to be docked were first imported from SYBYL into the MAESTRO v8.0 (Schrödinger, LLC., New York, NY, USA) environment. The details of the methodology used for protein preparation, grid generation, and Glide docking are discussed in our previous report (15).

Based on an rms deviation of 0.18 Å obtained by superimposition of the reference atoms shown in Figure 1C, the conformation of the bound agonist (compound 1) (12) was found to be similar to the conformation of compound 16 used in the atom fit alignment approach. Therefore, to gain a better understanding of the interactions of the ligands with the active site amino acids of GPR40, the bound agonist within the active site of the receptor (12) was superimposed onto the most potent compound 16 using the SYBYL atom fit function and overlaid with the individual contour maps (steric, electrostatic, hydrophobic, hydrogen bond donor, and hydrogen bond acceptor). The reference atoms used to superimpose the bound agonist (compound 1) within the receptor active site (12) onto compound 16 were (i) C1 on the thiazole nucleus, (ii) C2 linked to the thiazole nucleus, and (iii) C3 and (iv) C4 on the phenyl ring (Figure 1C). The amino acids lying more than 8 Å from compound 16 and the bound agonist (compound 1) were removed using the undisplay atoms function available in SYBYL, as amino acids beyond this distance are unlikely to contribute directly to the ligand–receptor interaction. The various contour maps were individually displayed onto the different ligands from the ligand-based model at the active site and were interpreted in terms of the possible amino acid interactions playing an important role in their binding to GPR40.
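The two geometric checks described above, the rms deviation over paired reference atoms and the 8 Å distance filter, can be sketched in plain Python. This is only an illustration under stated assumptions: the coordinates are taken to be already superimposed (no fitting is performed here), and the function names and example coordinates are hypothetical rather than part of SYBYL.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the two coordinate sets are already superimposed and that
    atoms are paired by position in the lists.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def within_cutoff(ligand_atoms, residue_atoms, cutoff=8.0):
    """True if any residue atom lies within `cutoff` angstroms of any ligand atom."""
    return any(
        math.dist(lig, res) <= cutoff
        for lig in ligand_atoms
        for res in residue_atoms
    )


# Hypothetical coordinates (angstroms) for four paired reference atoms.
template = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (2.1, 1.2, 0.0), (3.5, 1.2, 0.0)]
bound = [(0.1, 0.0, 0.0), (1.4, 0.1, 0.0), (2.1, 1.2, 0.1), (3.4, 1.2, 0.0)]
print(round(rmsd(template, bound), 3))
print(within_cutoff(template, [(0.0, 0.0, 9.5)]))
```

A residue would be kept for display when `within_cutoff` returns `True` for at least one of its atoms.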

CoMSIA 3D QSAR models

In deriving the CoMSIA descriptor fields, a 3D cubic lattice with a grid spacing of 2 Å in the x, y, and z directions was created to encompass the aligned molecules. CoMSIA descriptors were derived according to Klebe et al. (10). The CoMSIA similarity indices (AF) for a molecule j with atoms i at a grid point q were calculated using eqn 1:


Table 1: Structures of the training and the test set compounds along with the residuals of their observed and predicted pEC50 values (8,9)

[Structure drawings not reproduced: the three core scaffolds (compounds 1–31 and 64–70; compounds 32–41; compounds 42–63) and the R1, R2, and R3 substituents of each compound.]

Compd.       Observed pEC50   Predicted pEC50   Residual
1            7.19             7.11               0.08
2            7.17             7.10               0.07
3            5.00             5.11              -0.11
4            5.00             5.07              -0.07
5            5.00             5.21              -0.21
6            5.00             4.85               0.15
7            5.95             5.41               0.54
8            5.00             5.52              -0.52
9 [a]        6.84             7.01              -0.17
10 [a,b]     7.44             7.12               0.32
11           5.69             6.43              -0.74
12           6.54             6.37               0.17
13           5.00             4.90               0.10
14           6.49             6.33               0.16
15 [c]       7.91             7.67               0.24
16 [c]       8.31             7.86               0.45
17 [d]       6.25             6.69              -0.44
18 [e]       6.52             7.20              -0.68
19           7.29             7.61              -0.32
20 [b]       7.89             7.33               0.56
21           7.31             7.43              -0.12
22           7.40             7.37               0.03
23 [b]       7.00             7.27              -0.27
24           6.76             6.74               0.02
25           6.94             6.66               0.28
26 [c]       7.96             7.78               0.18
27 [c]       8.04             8.09              -0.05
28 [b,c]     7.27             6.83              -0.56
29 [c]       6.69             6.51               0.18
30 [c]       7.83             7.60               0.23
31 [b,c]     8.16             7.33               0.83
32           6.40             6.67              -0.27
33 [b]       5.50             5.06               0.44
34 [g]       4.78             4.62               0.16
35           5.40             5.29               0.11
36 [b]       6.60             6.65              -0.05
37 [f]       7.40             7.08               0.32
38           6.30             6.06               0.24
39 [b]       5.80             4.92               0.88
40 [g]       4.50             4.59              -0.09
41 [b]       4.91             4.58               0.33
42 [b]       6.60             6.35               0.25
43           6.10             6.34              -0.24
44           6.10             5.74               0.36
45           5.70             5.51               0.19
46           6.00             5.96               0.04
47 [b]       5.70             6.17              -0.47
48 [b]       5.40             5.25               0.15
49           5.40             5.75              -0.35
50 [b]       5.30             5.46              -0.16
51           5.30             4.94               0.36
52           5.10             5.19              -0.09
53           4.50             4.40               0.10
54           4.50             4.51              -0.01
55           4.50             5.19              -0.69
56           4.50             4.74              -0.24
57           4.50             4.55              -0.05
58           4.50             4.82              -0.32
59           5.90             5.50               0.40
60 [b]       4.50             4.26               0.24
61 [b]       4.50             4.79              -0.29
62           4.50             4.09               0.41
63           6.71             6.79              -0.08
64 [h]       7.00             6.69               0.01
65 [h]       6.45             6.45               0.00
66 [h]       6.90             5.44               1.46
67 [h]       7.23             7.30              -0.07
68 [c,h]     8.35             6.84               1.51
69 [d,h]     7.90             7.03               0.87
70 [d,h]     7.18             6.72               0.36


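$$A^{q}_{F,k}(j) = -\sum_{i} \omega_{\mathrm{probe},k}\,\omega_{ik}\,e^{-\alpha r_{iq}^{2}} \qquad (1)$$

where ω_probe,k is the value of property k for the probe atom, ω_ik is the actual value of property k for atom i, r_iq is the distance between the probe atom at grid point q and atom i, and α is the attenuation factor.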

The CoMSIA method incorporates five physicochemical properties, i.e., steric, electrostatic, hydrophobic, hydrogen bond donor, and hydrogen bond acceptor, which are denoted as k in eqn 1 and are evaluated using a probe atom. A Gaussian-type distance dependence was used between the grid point q and each atom i in the molecule. A default value of 0.3 was used as the attenuation factor α.
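As a rough illustration of eqn 1, the similarity index of one molecule at one grid point can be evaluated directly from atom positions and per-atom property values. The sketch below is not the SYBYL implementation; the function name, the probe value of 1.0, and the example atoms are assumptions made for illustration, while the attenuation factor of 0.3 follows the text.

```python
import math


def comsia_index(grid_point, atoms, probe_value=1.0, alpha=0.3):
    """Similarity index for one property k at one grid point (eqn 1).

    grid_point  : (x, y, z) of grid point q, in angstroms
    atoms       : iterable of ((x, y, z), w_ik) pairs, where w_ik is the
                  value of property k on atom i
    probe_value : value of property k on the probe atom (assumed 1.0 here)
    alpha       : Gaussian attenuation factor
    """
    total = 0.0
    for position, w_ik in atoms:
        # Squared distance between the grid point and atom i.
        r_sq = sum((g - a) ** 2 for g, a in zip(grid_point, position))
        total += probe_value * w_ik * math.exp(-alpha * r_sq)
    return -total


# Hypothetical two-atom molecule with arbitrary property values.
molecule = [((0.0, 0.0, 0.0), 0.5), ((1.5, 0.0, 0.0), -0.2)]
print(comsia_index((2.0, 2.0, 0.0), molecule))
```

Because the Gaussian term never diverges, the index stays finite even when a grid point coincides with an atom, which is why no energy cutoffs are needed in this formulation.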

The CoMSIA descriptors were used as the independent variables and the pEC50 values as the dependent variables in the PLS (16) regression analysis to derive the 3D QSAR models using the standard implementation in the SYBYL package. The predictive value of the models was first evaluated by LOO cross-validation (17,18). The cross-validated coefficient, r²cv, was calculated using eqn 2:

$$r^{2}_{cv} = 1 - \frac{\sum (Y_{\mathrm{predicted}} - Y_{\mathrm{observed}})^{2}}{\sum (Y_{\mathrm{observed}} - Y_{\mathrm{mean}})^{2}} \qquad (2)$$

where Ypredicted, Yobserved, and Ymean are the predicted, observed, and mean values of the target property (pEC50), respectively, and Σ(Ypredicted − Yobserved)² is the predicted residual sum of squares (PRESS). The optimal number of components (ONC) obtained from the cross-validated PLS analysis was used to derive the final QSAR model using the compounds in the training set without cross-validation. The resulting non-cross-validated correlation coefficient (r²ncv) served as a measure of the quality of the model.
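The leave-one-out procedure behind the cross-validated coefficient can be sketched as follows. To keep the example self-contained, an ordinary least-squares line on a single descriptor stands in for the PLS model; this substitution, the function names, and the data are assumptions for illustration and do not reproduce the SYBYL calculation.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one descriptor (stand-in for PLS)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_r2cv(xs, ys):
    """Leave-one-out cross-validated r2: 1 - PRESS / sum((y - mean_y)^2)."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model with compound i left out, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (slope * xs[i] + intercept - ys[i]) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptor values and activities.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [4.6, 5.1, 5.9, 6.2, 7.1, 7.4]
print(round(loo_r2cv(descriptor, activity), 3))
```

Unlike the fitted r², this quantity can become negative when the left-out predictions are worse than simply predicting the mean activity.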

Bootstrapping analysis (19) was carried out for 100 runs, and cross-validation with two and five groups was repeated 50 times each, with the results reported as the average over the 50 runs. To test the utility of the model as a predictive tool, the activities of an external set of compounds (test set) with known activities, not used in model generation, were predicted. The predictive r² (r²pred), calculated using eqn 3, was used to evaluate the predictive power of the CoMSIA models:

$$r^{2}_{pred} = 1 - \frac{\mathrm{PRESS}}{\mathrm{SD}} \qquad (3)$$

where SD is the sum of squared deviations between the actual activities of the compounds in the test set and the mean activity of the compounds in the training set, and PRESS is the sum of the squared deviations between the predicted and actual activities for every compound in the test set.
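Eqn 3 translates into a few lines of Python. The sketch below assumes the test set predictions have already been obtained from a model; the function name and the numerical values are hypothetical.

```python
def predictive_r2(observed_test, predicted_test, training_mean):
    """Predictive r2 = 1 - PRESS / SD (eqn 3).

    PRESS: sum of squared differences between predicted and observed
           activities of the test set compounds.
    SD:    sum of squared differences between the observed test set
           activities and the mean activity of the training set.
    """
    if len(observed_test) != len(predicted_test):
        raise ValueError("observed and predicted lists must have equal length")
    press = sum((o - p) ** 2 for o, p in zip(observed_test, predicted_test))
    sd = sum((o - training_mean) ** 2 for o in observed_test)
    return 1.0 - press / sd


# Hypothetical test set activities, predictions, and training set mean.
observed = [7.4, 6.6, 5.5, 4.9]
predicted = [7.1, 6.7, 5.1, 4.6]
print(round(predictive_r2(observed, predicted, training_mean=6.0), 3))
```

Note that the reference mean comes from the training set, not the test set, so a model that merely reproduces the training mean scores zero.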

Results and Discussion

Despite attempts with the various 3D QSAR tools available in SYBYL, the CoMFA technique yielded a statistically inferior model (Table S1). Varying the charge calculation method and the steric and electrostatic cutoffs also failed to produce a statistically significant CoMFA model (data not shown).

Figure 1: (A) Compound 16, bearing the highest GPR40 agonistic activity, was used as a template for the manual atom fit-based alignment. The atoms indicated in the figure were used for aligning all the molecules (Note: the atom numbering does not follow IUPAC rules). (B) Ligand-based alignment of all the compounds. (C) The atoms of the template molecule 16 indicated in the figure were used to superimpose it onto the bound agonist compound 1 within the active site of GPR40 (Note: the atom numbering does not follow IUPAC rules).

Footnotes to Table 1:
a Compounds having E (geometrical) isomerism at the R2 position.
b Compounds included in the test set.
c Compounds having the (S,S) configuration on the cyclopropyl ring.
d Compounds having the (R,R) configuration on the cyclopropyl ring.
e Compounds having the (R,S) configuration on the cyclopropyl ring.
f Compounds that were modeled as the (S,R) isomer at the chiral carbons on the cyclopropyl ring.
g Compounds that were individually modeled as the R and S isomers, of which the S isomer was selected for the model generation.
h Additional test set modeled as pure enantiomers and added to the original test set of 15 compounds.


Finally, because CoMSIA is an extension of the CoMFA method and includes both descriptors (steric and electrostatic) of the CoMFA technique, we focused on the statistically significant CoMSIA model for further computational analysis.

CoMSIA statistical results

Of the initial set of 70 compounds, the activities of seven compounds were reported for racemic mixtures. Hence, these seven compounds were dropped from the initial model generation, but were later modeled as best-aligned enantiomers and included in the test set to validate the predictive power, accommodative capacity, and robustness of the CoMSIA training model. The CoMSIA analysis was performed for both the ligand- and the receptor-based alignment models. The ligand-based model for the training set of 48 compounds initially yielded r²cv = 0.490 and r²ncv = 0.952, whereas the receptor-based model gave r²cv = 0.442 and r²ncv = 0.951 (Tables 2 and S2). The ligand-based model was found to be better than the receptor-based model and was considered for further study.

In an effort to generate a statistically significant 3D QSAR model in terms of the cross-validated correlation coefficient with the least standard deviation, we used a variety of 3D QSAR tools available in SYBYL, namely different charge calculation methods, field fit, and region focusing. To evaluate the effect of the charge calculation method on the statistical results, various charges, such as Gasteiger–Hückel, Hückel, Del Re, and Pullman, were computed for the molecules in the training set. The results from these computations were inferior to those obtained by the Gasteiger–Marsili method, except for the Gasteiger–Hückel charges, which yielded r²cv = 0.561, SEP = 0.807, r²ncv = 0.955, and SEE = 0.258 and thus proved slightly better than the Gasteiger–Marsili method. To select the better of the two models generated by these charge calculation methods, field fit and region focusing were applied. The field fit alignment model with the Gasteiger–Marsili charges yielded r²cv = 0.523 and r²ncv = 0.953 (Table 2). However, field fit yielded poorer results when applied to the model generated with the Gasteiger–Hückel charges (r²cv = 0.434 and r²ncv = 0.945).

To further optimize the model, region focusing was applied to the models generated with either charge calculation technique, which led to the best model with a good predictive ability. Region focusing applied to the 48 compounds with charges computed by the Gasteiger–Marsili method yielded the best model, with r²cv = 0.610, r²ncv = 0.935, SEP = 0.761, SEE = 0.311, and r²pred = 0.863 (Table 2). Region focusing applied to the compounds with charges computed by the Gasteiger–Hückel method gave r²cv = 0.603, r²ncv = 0.907, SEP = 0.768, SEE = 0.372, and r²pred = 0.802, which was inferior. Finally, region focusing was also performed on the receptor-based model to evaluate its effect on the statistical data; it yielded r²cv = 0.519, SEP = 0.845, r²ncv = 0.933, and SEE = 0.316, along with a poor r²pred of 0.599 (Table S2). Therefore, the ligand-based model with the Gasteiger–Marsili charges and region focusing proved to be the best model. The detailed experimental and predicted pEC50 values based on this CoMSIA model for the training set are shown in Table 1 and are plotted in Figure 2.

Figure 2: Plot of observed versus predicted pEC50 values for the training, original test, and additional test set compounds based on the comparative molecular similarity indices analysis model.

Table 2: Summary of CoMSIA statistical results for the GPR40 activity

                         CoMSIA models
PLS statistics           Atom fit alignment   Field fit alignment   Region focusing
r²ncv [a]                0.952                0.953                 0.935
SEE [b]                  0.267                0.263                 0.311
Ftest [c]                116.561              172.128               120.883
r²cv [d]                 0.490                0.523                 0.610
SEP [e]                  0.871                0.838                 0.761
r²pred [f]               0.826                0.831                 0.863
PLS components [g]       5                    5                     5
Contribution
  Steric                 0.13                 0.10                  0.16
  Electrostatic          0.23                 0.28                  0.21
  Hydrophobic            0.24                 0.21                  0.26
  Donor                  0.17                 0.17                  0.15
  Acceptor               0.23                 0.24                  0.22
r²boot [h]               0.950                0.965                 0.968
SEEboot [i]              0.267                0.224                 0.213
r²LHO [j]                0.439                0.403                 0.449
SDLHO [k]                0.875                0.856                 0.865
r²5cv [l]                0.525                0.475                 0.574
SD5cv [m]                0.835                0.850                 0.815

CoMSIA, comparative molecular similarity indices analysis; PLS, partial least squares.
a Correlation coefficient.
b Standard error of estimate.
c Ratio of r² explained to unexplained = r²/(1 − r²).
d Cross-validated correlation coefficient after the leave-one-out procedure.
e Standard error of prediction.
f Predicted correlation coefficient for the test set of compounds.
g Optimal number of principal components.
h Average correlation coefficient for 100 samplings using the bootstrapped procedure.
i Average standard error of estimate for 100 samplings using the bootstrapped procedure.
j Average cross-validated correlation coefficient for 50 runs using the leave-half-out (LHO) group.
k Standard deviation of the average cross-validated correlation coefficient for 50 runs.
l Average cross-validated correlation coefficient for 50 runs using five cross-validation groups.
m Standard deviation of the average cross-validated correlation coefficient for 50 runs.


To further assess the robustness of the derived model, bootstrapping analysis for 100 runs was performed. Bootstrapping generates many new data sets by randomly sampling, with replacement, from the original data set. The bootstrapped r² (r²boot) of 0.968 obtained through the CoMSIA analysis suggests good internal consistency within the underlying data set (Table 2).
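The resampling idea behind bootstrapping can be sketched as follows. As a simplification, the squared correlation of a single descriptor with activity stands in for the r² of the refitted PLS model; that substitution, the function names, the seed, and the data are all assumptions made for illustration.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation; None if either variable has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None
    return sxy * sxy / (sxx * syy)


def bootstrap_r2(xs, ys, runs=100, seed=0):
    """Average r2 over `runs` data sets drawn with replacement from (xs, ys)."""
    rng = random.Random(seed)
    n = len(xs)
    values = []
    for _ in range(runs):
        # Draw n compound indices with replacement to build one new data set.
        picks = [rng.randrange(n) for _ in range(n)]
        value = r_squared([xs[i] for i in picks], [ys[i] for i in picks])
        if value is not None:  # skip degenerate resamples with no variance
            values.append(value)
    return sum(values) / len(values)


# Hypothetical descriptor values and activities.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
activity = [4.6, 5.1, 5.9, 6.2, 7.1, 7.4, 7.6, 8.3]
print(round(bootstrap_r2(descriptor, activity), 3))
```

A bootstrapped average close to the r² of the original fit indicates that no small subset of compounds dominates the correlation.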

Furthermore, a cross-validation analysis was performed for the compounds in the training set to determine the stability of the CoMSIA models. The training set model was cross-validated using two (leave-half-out) and five (leave-20%-out) cross-validation groups, 50 times each. The two-group cross-validation yielded an average r²cv of 0.449 with a standard deviation of 0.865, and the five-group cross-validation yielded an average r²cv of 0.574 with a standard deviation of 0.815 (Table 2).
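Repeated group cross-validation of this kind can be sketched by randomly partitioning the compounds into groups, predicting each group from the others, and averaging the resulting r²cv over many random partitions. As before, a single-descriptor least-squares line is used as a stand-in for the PLS model, and the function names, seed, and data are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares slope and intercept for one descriptor (stand-in for PLS)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def group_r2cv(xs, ys, n_groups, rng):
    """Cross-validated r2 for one random partition into n_groups groups."""
    order = list(range(len(xs)))
    rng.shuffle(order)
    groups = [order[g::n_groups] for g in range(n_groups)]
    press = 0.0
    for held_out in groups:
        held = set(held_out)
        train = [i for i in order if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        press += sum((slope * xs[i] + intercept - ys[i]) ** 2 for i in held_out)
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


def repeated_group_cv(xs, ys, n_groups, repeats=50, seed=0):
    """Mean and standard deviation of r2cv over `repeats` random partitions."""
    rng = random.Random(seed)
    scores = [group_r2cv(xs, ys, n_groups, rng) for _ in range(repeats)]
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical descriptor values and activities.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
activity = [4.4, 5.0, 5.3, 6.1, 6.4, 7.2, 7.3, 8.1, 8.2, 9.0]
for k in (2, 5):  # leave-half-out and leave-20%-out
    mean_r2, sd_r2 = repeated_group_cv(descriptor, activity, n_groups=k)
    print(k, round(mean_r2, 3), round(sd_r2, 3))
```

Leaving out larger groups is a harsher test than leave-one-out, so a lower average r²cv for two groups than for five groups is the expected pattern.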

Predictive power of the CoMSIA model

The predictive ability of the CoMSIA model was determined using the set of 15 test compounds not included in the model generation. The high predictive r² value of 0.863 for this set indicates good predictive ability of the model. Observed and predicted pEC50 values for the test set compounds based on the best CoMSIA model are shown in Table 1 and are plotted in Figure 2. Furthermore, to validate the predictive power of the training model more rigorously, a set of seven additional test compounds (compounds 64–70) was added as a supplemental set, bringing the final number of test compounds to 22 (Table 1). The activity of these seven compounds was reported for their corresponding racemic mixtures. It was assumed a priori that only one of the enantiomers in each racemic mixture would align properly onto the template molecule and thus give the smallest residual value. In this way, the CoMSIA technique was used to distinguish the active enantiomer (eutomer) from the inactive enantiomer (distomer), which also lowered the ratio of training compounds to test compounds to approximately two to one. Interestingly, although this test set was fairly large, the CoMSIA model still produced a high predictive r² (0.675 for 22 test compounds). The high predictive power of the CoMSIA model suggests that the model possesses a high accommodating capacity and, hence, wide applicability in the development of novel GPR40 agonists. Furthermore, to validate the training model more rigorously, we predicted the activity of a structurally diverse test set of three compounds reported in the literature (20–22). Although the structures of these three compounds were only remotely related to the training set, their predicted pEC50 values were close to the reported values, supporting the wide applicability of the generated model (Table S3).

CoMSIA contour map

In the CoMSIA steric field, the green (sterically favorable) and yellow (sterically unfavorable) contours represent 81% and 8.5% contributions, respectively. The CoMSIA steric contour map in the presence of the most potent compound 16 (pEC50 = 8.31) is depicted in Figure 3A. A green contour (G) indicates that steric bulk is favored at this position. The truncated bioisosteric R1 substitutions among the aryl propionic acid derivatives exhibiting poor activity (pEC50 < 6.0), for example, aryl halide (compounds 45–48, 50, and 51), 3-pyridyl (compound 53), phenol (compound 57), aryl nitrile (compound 58), benzamide (compound 60), and 2-imidazolyl (compound 61) groups, prevented the exposure of the R1 substituent to G. However, among the phenoxyacetic acid and the arylpropionic acid derivatives, compounds 12 (pEC50 = 6.54) and 63 (pEC50 = 6.71) are found to be considerably potent, as the para-trifluoromethylphenyl and the para-phenoxy substituents at their R1 positions, respectively, are oriented near G. Changing the R1 substituent from the 4-methyl-2-(4-trifluoromethylphenyl)thiazole, which is oriented into G, to the 3-phenoxyphenyl group, which is oriented away from G, preserved the agonistic activity in several compounds [compound 2 (pEC50 = 7.17) versus 1 (pEC50 = 7.19); 10 (pEC50 = 7.44) versus 9 (pEC50 = 6.84); 16 (pEC50 = 8.31) versus 15 (pEC50 = 7.91); 20 (pEC50 = 7.89) versus 19 (pEC50 = 7.29); 23 (pEC50 = 7.0) versus 22 (pEC50 = 7.40); and 31 (pEC50 = 8.16) versus 30 (pEC50 = 7.83)], probably because of the consistently favorable edge-to-face pi–pi interaction of the 3-phenoxy moiety with His137. The importance of the His137 residue has already been established by site-directed mutagenesis and molecular docking studies (12,23,24).

Figure 3: (A) Comparative molecular similarity indices analysis (CoMSIA) stdev*coeff steric contour map. The green (G) contour indicates the region where bulky groups increase the agonistic activity, and yellow (Y1 and Y2) contours indicate the regions where bulky groups decrease the agonistic activity. The most potent compound 16 (ball and stick) overlaid onto the steric contour map is mapped within the active site (stick) of GPR40. (B) CoMSIA stdev*coeff electrostatic and hydrophobic contour maps. The most potent compound 16 (ball and stick) overlaid onto the electrostatic and hydrophobic contour maps is mapped within the active site (stick) of GPR40. Blue (B) contour indicates the region where electropositive groups increase the agonistic activity, whereas the red (R) contour indicates the region where electronegative groups increase the agonistic activity. The orange (O) contour indicates the region where hydrophobic groups increase the agonistic activity, whereas the white (W) contour indicates the region where hydrophilic groups increase the agonistic activity.
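To make "preserved the agonistic activity" concrete, the pEC50 pairs quoted in the steric-field discussion can be converted to EC50 ratios: since pEC50 = −log10(EC50), a pEC50 difference of Δ corresponds to a 10^Δ-fold difference in EC50. The short sketch below does this arithmetic for the six pairs as listed in the text; the function and variable names are illustrative.

```python
# pEC50 values for each "A versus B" pair, as quoted in the text
pairs = {
    (2, 1): (7.17, 7.19),
    (10, 9): (7.44, 6.84),
    (16, 15): (8.31, 7.91),
    (20, 19): (7.89, 7.29),
    (23, 22): (7.00, 7.40),
    (31, 30): (8.16, 7.83),
}


def ec50_fold_change(pec50_a: float, pec50_b: float) -> float:
    """Return EC50(b) / EC50(a); values above 1 mean compound a is more potent."""
    return 10 ** (pec50_a - pec50_b)


fold_changes = {pair: ec50_fold_change(*values) for pair, values in pairs.items()}

for (a, b), fold in fold_changes.items():
    print(f"compound {a} versus {b}: {fold:.2f}-fold")
```

On these numbers every pair stays within roughly a fourfold potency difference in either direction, which is the sense in which activity is preserved.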

The medium-sized yellow contour (Y1) indicates that steric bulk at this position is detrimental to GPR40 agonistic activity. One of the methyl groups of the N,N-dimethylbenzylamine substituent at the R1 position of the poorly active compound 54 (pEC50 = 4.50) was found to undergo steric hindrance with the side chain amide group of Asn244 in the active site. The phenyl ring attached to the methylamine substituent in the moderately potent compounds 32 (pEC50 = 6.40) and 38 (pEC50 = 6.30) occupies Y1. The phenyl ring of these compounds undergoes potential steric hindrance with the side chain of Asn244 and the guanidine group of Arg258. None of the groups of the potent molecules 16 (pEC50 = 8.31), 20 (pEC50 = 7.89), and 31 (pEC50 = 8.16) are found to occupy Y1. Another large yellow contour (Y2) was found to surround the propionic acid head group of the least active compound 53 (pEC50 = 4.50). The polar head groups of the poorly active compounds 6 (pEC50 = 5.0), 7 (pEC50 = 5.95), 8 (pEC50 = 5.0), 33 (pEC50 = 5.50), and 39 (pEC50 = 5.80) completely penetrate into Y2. Structure-based analysis confirms that the poor activity of these compounds stems from the steric hindrance of their polar head groups with the side chains of active site residues Ala146, Ala173, Trp174, Asp175, and Arg183 (12,23,24), all of which are confined to Y2. The methyl branching at the α-carbon of the R2 substituent in compound 13 (pEC50 = 5.0) was found to exhibit steric hindrance with the side chains of Ala146, Trp174, and Arg183 (10), which is reflected in its poor activity. In the case of compound 25 (pEC50 = 6.94), the pyrrolidine ring at the R3 position was found to partially occupy Y2, which accounts for its moderate activity. Structure-based analysis revealed steric hindrance of the pyrrolidine ring with the amino acid residues surrounding Y2. The N,N-dimethylamine at the R3 position of the moderately active compound 24 (pEC50 = 6.76) was also found to partially occupy Y2. Here, one of the methyl groups of the N,N-dimethylamine function was found to undergo steric hindrance with the amino acids circumscribing Y2. None of the groups of the highly active compounds 15 (pEC50 = 7.91), 16 (pEC50 = 8.31), 27 (pEC50 = 8.04), and 31 (pEC50 = 8.16) were found to penetrate into Y2.

In the CoMSIA electrostatic field, the red (electronegative charge favorable) and blue (electropositive charge favorable) contours represent 30% and 70% contributions, respectively. The CoMSIA electrostatic contour map in the presence of the most potent compound 16 (pEC50 = 8.31) is depicted in Figure 3B. The small red contour (R) suggests that an electronegative group at this position is favorable for GPR40 agonistic activity. The potent compounds 1 (pEC50 = 7.19), 9 (pEC50 = 6.84), 15 (pEC50 = 7.91), 19 (pEC50 = 7.29), 22 (pEC50 = 7.40), and 30 (pEC50 = 7.83), bearing the meta-phenoxy substituted phenyl ring at the R1 position, were found to locate their electronegative ether oxygen atom in close proximity to R. Compound 1 (pEC50 = 7.19) exhibited higher activity than compound 63 (pEC50 = 6.71) even though both carry the same R1 substitution. The phenoxy substituent is at the meta position in compound 1 and at the para position in compound 63. This regioisomeric difference allows the ether oxygen atom in compound 1 to localize nearer to R than in compound 63. Moreover, the ether oxygen atom linking the two phenyl rings at the R1 position of the moderately active compounds 14 (pEC50 = 6.49) and 24 (pEC50 = 6.76) tends to orient close to R.

The blue contour (B) indicates that electropositive groups at this position are favorable for GPR40 agonistic activity. The potent compounds 1 (pEC50 = 7.19), 2 (pEC50 = 7.17), 15 (pEC50 = 7.91), 16 (pEC50 = 8.31), 20 (pEC50 = 7.89), 26 (pEC50 = 7.96), 27 (pEC50 = 8.04), 30 (pEC50 = 7.83), and 31 (pEC50 = 8.16) have their α-carbon atoms (carbon atoms linked to the electronegative carbonyl group) oriented into B. The carbon atoms of the cyclopropyl ring in compound 16 (pEC50 = 8.31) were found to occupy B. The carbonyl group attached to the cyclopropane ring, being electronegative in nature, tends to pull electrons from the cyclopropane ring carbon atoms by an inductive effect, thus making the cyclopropyl ring slightly electropositive. The poorly active compounds (pEC50 < 6) 3 (carbonyl oxygen), 4 (carbonyl oxygen), 13 (ether oxygen), 34 (carbonyl oxygen), 35 (carbonyl oxygen), 40 (carbonyl oxygen), and 41 (carbonyl oxygen) have the indicated electronegative oxygen atom occupying B. This feature suggests that shortening the acid or substituted amide side chain relative to the one present in the reference molecule is detrimental to GPR40 agonistic activity, as it tends to orient the electronegative carbonyl oxygen atom into B. The higher activity of compound 14 (pEC50 = 6.49) compared with compound 11 (pEC50 = 5.69) may be attributed to a difference in electronegativity of the heteroatom at their R2 position, which is surrounded by B. The isosteric replacement of the oxygen atom (more electronegative) in compound 11 with the sulfur atom (less electronegative) in compound 14 at the R2 position may be responsible for the higher potency of compound 14. The α-carbon atom linked to the electronegative carbonyl group of the polar head group of the moderately active compounds 25 (pEC50 = 6.94), 42 (pEC50 = 6.60), 44 (pEC50 = 6.10), and 46 (pEC50 = 6.00) was found to penetrate into B. The cyclopropane ring carbon atoms of the moderately active compounds 17 (pEC50 = 6.25) and 18 (pEC50 = 6.52) were also surrounded by B.
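For readers less familiar with how a CoMSIA field value arises at a single grid point, the sketch below evaluates the Gaussian-type similarity index of Klebe et al. (10), A = −Σ w_probe · w_i · exp(−α r_i²), where w_i is the atomic property (the partial charge for the electrostatic field) and r_i is the distance from atom i to the grid point. The attenuation factor α = 0.3 and the +1 probe are the commonly used defaults and are assumptions here, since this passage does not state them. The atom coordinates and charges are hypothetical, not taken from the data set.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def comsia_index(grid_point: Point,
                 atoms: Iterable[Tuple[Point, float]],
                 probe_value: float = 1.0,
                 alpha: float = 0.3) -> float:
    """Gaussian-type similarity index at one grid point for one property."""
    total = 0.0
    for position, atom_value in atoms:
        # Squared distance between the atom and the grid point (in Å^2)
        r_squared = sum((a - g) ** 2 for a, g in zip(position, grid_point))
        # Gaussian distance dependence: no singularity at r = 0, no cutoff needed
        total += probe_value * atom_value * math.exp(-alpha * r_squared)
    return -total


# Hypothetical two-atom fragment: (coordinates in Å, partial charge)
carbonyl_oxygen = ((0.0, 0.0, 0.0), -0.5)
alpha_carbon = ((1.5, 0.0, 0.0), 0.1)

value = comsia_index((0.0, 0.0, 0.0), [carbonyl_oxygen, alpha_carbon])
print(f"electrostatic similarity index at the grid point: {value:.3f}")
```

The contour maps discussed in the text are then drawn from the PLS coefficients multiplied by the standard deviation of these field values (stdev*coeff), not from the raw indices themselves.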

In the CoMSIA hydrophobic field, the orange (hydrophobic favorable) and the white (hydrophobic unfavorable) contours represent 88% and 5% contributions, respectively. For visualization, the most potent compound 16 (pEC50 = 8.31) is overlaid onto the CoMSIA hydrophobic contour map shown in Figure 3B. The orange contour (O) suggests that a hydrophobic group at this position will favor GPR40 agonistic activity. Structure-based analysis revealed that the hydrophobic environment at the active site contributed significantly to the binding of the lipophilic tails of the agonists. Compounds bearing a 4-methyl-2-(4-trifluoromethylphenyl)thiazole lipophilic tail at the R1 position have this substituent occupying the hydrophobic pocket formed by the side chains of Leu90, Tyr91, and Leu190, as described previously (12,23,24). On the other hand, analogs bearing a meta-phenoxyphenyl lipophilic tail have this substituent oriented into the hydrophobic pocket formed by the side chains of Leu90, Tyr91, His137, and Leu190. Structure-based analysis revealed that the imidazole ring of His137 largely contributes to the edge-to-face pi–pi interactions (12,23,24) with the terminal phenyl ring of agonists bearing the meta-phenoxyphenyl group at their R1 position. In the case of the moderately active compounds 42 (3,4-dichlorophenyl), 44 (m-bromophenyl), and 46 (p-chlorophenyl), the smaller hydrophobic substitution at the R1 position limits the exposure of these groups to O. However, in the case of the poorly active compounds 54 (N,N-dimethylamine), 55 (1,3-dioxolane), 56 (carboxylic acid), 57 (hydroxyl), 60 (amide), and 62 (N-substituted methanesulfonamide), all exhibiting a pEC50 of 4.5, the hydrophilic groups at R1 are oriented into O. The polar R1 substituents in these compounds are found oriented into the hydrophobic pocket formed by the side chains of Phe87 and Leu90. The analysis of the active site revealed that the methyl of the methoxy group at the R1 position in compound 59 is oriented into the hydrophobic pocket of the receptor, making it more potent than compound 57, which instead bears a hydroxyl group at that position. The para-trifluoromethylphenyl group and the terminal phenyl ring at the R1 positions of compounds 12 (pEC50 = 6.54) and 32 (pEC50 = 6.40), respectively, as well as the phenyl ring at the R3 position of compound 38 (pEC50 = 6.30), are found in close proximity to O. Upon analysis of the active site amino acids in the immediate surroundings of the aforesaid substituents, it was determined that the para-trifluoromethylphenyl and the phenyl ring of the phenoxy substituent at R1 of the moderately active compounds 12 and 32, respectively, were oriented into the hydrophobic pocket formed by the side chains of Leu90, Tyr91, His137, and Leu190, whereas the R3 substituent of compound 38 was found to occupy the hydrophobic pocket formed by the side chains of His137, Val141, and Leu190. Also, the terminal phenyl rings of the R1 substituent of the moderately active compounds 24 (pEC50 = 6.76) and 25 (pEC50 = 6.94) were found to undergo edge-to-face pi–pi interactions with the imidazole ring of His137 and the phenyl ring of Tyr91 (12,23,24) because the substituted phenyl rings of compounds 24 and 25 at R1 are oriented close to O.

The white contour (W) indicates that hydrophilic bulk is tolerated at this position. The terminal polar carboxylic acid group of compound 16 (pEC50 = 8.31) and the N-hydroxyacetamide group of compounds 30 (pEC50 = 7.83) and 31 (pEC50 = 8.16) were oriented into W. The guanidine group of Arg183, mapped close to the polar head group of the data set, is known to exist in the positively charged guanidinium ion state in the receptor environment and thus enters into ionic interactions with the polar head groups of the agonists (12,23,24). The terminal polar acid moiety of the least potent compound 53 was oriented away from W.

The hydrogen bond donor contour maps from the CoMSIA analyses are depicted in Figure 4A in the presence of the most potent compound 16. The cyan (donor favorable) and the purple (donor unfavorable) contours represent 80% and 20% contributions, respectively. The cyan contour (C) indicates that hydrogen bond donating groups at this position are favorable. The potent compounds 16 (carboxylic acid), 19 (primary amide), 20 (primary amide), 21 (N-methylpropanamide), and 22 (N-isopropylpropanamide) have at least one hydrogen bond donor group oriented toward C, with the potential for a hydrogen bonding interaction with the backbone of Ser178. In the case of compound 16, the carboxyl moiety forms a hydrogen bond with the backbone of Ser178 (COOH-O=C; 2.48 Å), whereas in the case of compounds 21 and 22, the terminal secondary amide group was found to be 3.2–3.5 Å away from the backbone of Ser178. The –NH2 group of the terminal amide moiety in compounds 19 and 20 was found to be in close proximity (2.71–3.20 Å) to the backbone of Ala173, thus exhibiting potential for a hydrogen bonding interaction. In the potent compounds 16 (pEC50 = 8.31), 19 (pEC50 = 7.29), 20 (pEC50 = 7.89), 21 (pEC50 = 7.31), and 22 (pEC50 = 7.40), a three-carbon long polar head group was found to be optimal for placing the donor group in C. In contrast, in the poorly active compounds 3 (pEC50 = 5.0), 4 (pEC50 = 5.0), 5 (pEC50 = 5.0), 6 (pEC50 = 5.0), 34 (pEC50 = 4.78), 35 (pEC50 = 5.40), 40 (pEC50 = 4.50), and 41 (pEC50 = 4.91), the shorter polar head group placed the donor groups away from C. The purple contour (P) indicates a region where donor groups are unfavorable. The carboxyl moiety in the poorly active compound 13 (pEC50 = 5.0) is oriented toward P, which accounts for its lower activity. None of the donor groups of the potent compounds 16 (pEC50 = 8.31), 19 (pEC50 = 7.29), 20 (pEC50 = 7.89), 21 (pEC50 = 7.31), and 22 (pEC50 = 7.40) were found to occupy P.

Figure 4: (A) Comparative molecular similarity indices analysis (CoMSIA) stdev*coeff hydrogen bond donor contour maps. The template molecule 16 (ball and stick) along with the contours is mapped within the active site (stick) of GPR40. Cyan (C) and purple (P) contours indicate the favorable and unfavorable donor groups, respectively, for GPR40 agonistic activity. (B) CoMSIA stdev*coeff hydrogen bond acceptor contour maps. The reference compound 16 (ball and stick) displayed with the acceptor contour maps is mapped within the active site (stick) of GPR40. Magenta (M) contour indicates a region where acceptor groups are favored, and the red (R1, R2 and R3) contours indicate regions where acceptor groups do not favor GPR40 agonistic activity.
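The donor analysis rests on donor–acceptor distances measured in the docked poses. A minimal sketch of such a distance check follows. The 3.5 Å cutoff is a commonly used geometric criterion and is an assumption here, not a value stated in the text; the coordinates are hypothetical. Only the distances in `reported` are taken from the passage, with the upper bound used where a range is quoted.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]

HBOND_CUTOFF = 3.5  # Å; assumed donor-acceptor distance criterion


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two atoms (coordinates in Å)."""
    return math.dist(a, b)


def is_potential_hbond(d: float, cutoff: float = HBOND_CUTOFF) -> bool:
    """Flag a donor-acceptor pair as a potential hydrogen bond by distance only."""
    return d <= cutoff


# Distances quoted in the text (Å); upper bound of each quoted range
reported = {
    "compound 16 COOH to Ser178 backbone": 2.48,
    "compounds 21 and 22 amide to Ser178 backbone": 3.5,
    "compounds 19 and 20 NH2 to Ala173 backbone": 3.20,
}

for label, d in reported.items():
    print(f"{label}: {d:.2f} Å, potential hydrogen bond: {is_potential_hbond(d)}")

# Hypothetical coordinates, to show the distance calculation itself
donor = (1.0, 2.0, 3.0)
acceptor = (1.0, 2.0, 5.48)
print(f"{distance(donor, acceptor):.2f} Å")
```

A distance test alone ignores the donor–hydrogen–acceptor angle, so it flags candidates rather than confirming hydrogen bonds.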

The hydrogen bond acceptor contour map from CoMSIA in the presence of the most potent compound 16 is depicted in Figure 4B. The magenta (acceptor favorable) and the red (acceptor unfavorable) contours account for 81% and 25% contributions, respectively. A magenta contour (M) points to a region where hydrogen bond acceptor groups are favorable. The oxygen atoms of the terminal carboxyl group and the oxygen and nitrogen atoms of the terminal substituted amide of the ligands undergo multiple hydrogen bonding/ionic interactions with the guanidine group of Arg183 (C=O-HN; 2.90–3.20 Å), as described previously (12,23,24). The carbonyl oxygen atom of the potent compounds 16 (pEC50 = 8.31), 26 (pEC50 = 7.96), 27 (pEC50 = 8.04), and 28 (pEC50 = 7.27) has the potential to enter into hydrogen bonding with the guanidine group of Arg183 (C=O-HN; 2.51–3.03 Å), as suggested by the presence of M above the carbonyl groups of these compounds. Additionally, the second oxygen atom of the acid moiety in the potent compound 16 may enter into ionic interactions with the guanidine group of Arg183. In compounds 16, 26, 27, and 28, the S,S configuration at the cyclopropyl ring aligned their carbonyl groups toward M. In contrast, the poorly active compounds 3 (pEC50 = 5.0) and 4 (pEC50 = 5.0) oriented the carbonyl group of their acid moiety away from M. Structure-based analysis revealed that the carbonyl moieties of the terminal acid groups of these two compounds were oriented away from the guanidine group of Arg183, thereby decreasing the potential for hydrogen bonding and ionic interactions. The moderately active compounds 17 (pEC50 = 6.25) and 18 (pEC50 = 6.52), having the R,R and R,S configurations at the R2 cyclopropyl ring, respectively, oriented their carboxylic carbonyl groups away from M.

The three acceptor-unfavorable red contours (R1, R2 and R3) are displayed in Figure 4B along with the most potent compound 16. The carbonyl oxygen atoms of the poorly active compounds 5 (pEC50 = 5.0), 6 (pEC50 = 5.0), 34 (pEC50 = 4.78), 35 (pEC50 = 5.40), 40 (pEC50 = 4.50), and 41 (pEC50 = 4.91), bearing the acetic acid head group, fall into R1. The poor activity of compounds exhibiting a butanoic acid head group, namely 7 (pEC50 = 5.95), 8 (pEC50 = 5.0), 33 (pEC50 = 5.50), and 39 (pEC50 = 5.80), may be because of their carbonyl oxygen atom occupying the R2 contour. The carbonyl oxygen atom of the phenoxyacetic acid derivatives 11 (pEC50 = 5.69) and 12 (pEC50 = 6.54) and the phenylthioacetic acid derivative 14 (pEC50 = 6.49) was found to orient toward R3, a feature that lowers their activity. The higher activity of compound 26 (pEC50 = 7.96) relative to compound 23 (pEC50 = 7.0) is attributed to the S,S configuration of the cyclopropyl ring in compound 26, which orients the carbonyl oxygen atom toward M. The absence of the cyclopropyl ring in compound 23, however, oriented the carbonyl oxygen atom away from M and toward R3.

Conclusions

The CoMSIA 3D QSAR method was used to build a statistically significant model with excellent correlative and predictive power for probing the GPR40 agonistic activities of aryl alkanoic acid derivatives. The predictive power of the model was also evident from its ability to accurately predict the biological activities of ligands bearing diverse scaffolds, as well as its ability to distinguish the eutomer from the distomer in the data set. The r2cv and r2ncv values of the CoMSIA model were good enough to suggest that all the reported agonists bind to GPR40 in an almost similar fashion. The major contributors to the agonistic activity are a lipophilic tail pointing toward the hydrophobic pocket of the GPR40 active site, together with electrostatic and hydrogen bond acceptor groups. A comparison of the CoMSIA contour maps with the previously proposed GPR40 active site structural features showed good correlation between the two analyses. The results derived from the structure-based CoMSIA analysis of GPR40 agonists provide a better understanding of important ligand–receptor interactions, can serve as guidelines for ligand design, and can be used as a predictive model for optimizing the design of new GPR40 agonistic molecules.

Acknowledgments

Support to TT in the form of start-up funds and resources from the College of Pharmacy, St. John's University is gratefully acknowledged. We thank Mr. Shridhar Kulkarni and Dr Santosh Khedkar for critical readings of the manuscript and for helpful discussions.

Conflict of Interest

The authors declare no competing interests.

References

1. Nunez E.A. (1997) Biological complexity is under the 'strange attraction' of non-esterified fatty acids. Prostaglandins Leukot Essent Fatty Acids;57:107–110.

2. Briscoe C.P., Tadayyon M., Andrews J.L., Benson W.G., Chambers J.K., Eilert M.M., Ellis C. et al. (2003) The orphan G-protein-coupled receptor GPR40 is activated by medium and long chain fatty acids. J Biol Chem;278:11303–11311.

3. Itoh Y., Kawamata Y., Harada M., Kobayashi M., Fujii R., Fukusumi S., Ogi K. et al. (2003) Free fatty acids regulate insulin secretion from pancreatic β cells through GPR40. Nature;422:173–176.

4. Kotarsky K., Nilsson N.E., Flodgren E., Owman C., Olde B. (2003) A human cell surface receptor activated by free fatty acids and thiazolidinedione drugs. Biochem Biophys Res Commun;301:406–410.

5. Steneberg P., Rubins N., Shifman R.B., Walker M.D., Edlund H. (2005) The FFA receptor GPR40 links hyperinsulinemia, hepatic steatosis, and impaired glucose homeostasis in mouse. Cell Metab;1:245–258.

6. Gromada J. (2006) The free fatty acid receptor GPR40 generates excitement in pancreatic β-cells. Endocrinology;147:672–673.

7. Negoro N., Sasaki S., Mikami S., Ito M., Suzuki M., Tsujihata Y., Iti R. et al. (2010) Discovery of TAK-875: a potent, selective, and orally bioavailable GPR40 agonist. ACS Med Chem Lett;1:290–294.

8. Garrido D.M., Corbett D.F., Dwornik K.A., Goetz A.S., Littleton T.R., McKeown S.C., Mills W.Y., Smalley T.L. Jr, Briscoe C.P., Peat A.J. (2006) Synthesis and activity of small molecule GPR40 agonists. Bioorg Med Chem Lett;16:1840–1845.

9. McKeown S.C., Corbett D.F., Goetz A.S., Littleton T.R., Bigham E., Briscoe C.P., Peat A.J., Watson S.P., Hickey D.M. (2007) Solid phase synthesis and SAR of small molecule agonists for the GPR40 receptor. Bioorg Med Chem Lett;17:1584–1589.

10. Klebe G., Abraham U., Mietzner T. (1994) Molecular similarity indices in a comparative analysis (CoMSIA) of drug molecules to correlate and predict their biological activity. J Med Chem;37:4130–4146.

11. Cramer R.D. III, Patterson D.E., Bunce J.D. (1988) Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J Am Chem Soc;110:5959–5967.

12. Tikhonova I.G., Sum C.S., Neumann S., Thomas C.J., Raaka B.M., Costanzi S., Gershengorn M.C. (2007) Bidirectional iterative approach to the structural delineation of the functional "Chemoprint" in GPR40 for agonist recognition. J Med Chem;50:2981–2989.

13. Clark M., Cramer R.D. III, Opdenbosch N.V. (1989) Validation of the general purpose Tripos 5.2 force field. J Comput Chem;10:982–1012.

14. Gasteiger J., Marsili M. (1980) Iterative partial equalization of orbital electronegativity: a rapid access to atomic charges. Tetrahedron;36:3219–3228.

15. Patel P.D., Patel M.R., Basu N.K., Talele T.T. (2008) 3D QSAR and molecular docking studies of benzimidazole derivatives as hepatitis C virus NS5B polymerase inhibitors. J Chem Inf Model;48:42–55.

16. Staahle L., Wold S. (1987) Partial least squares analysis with cross-validation for the two-class problem: a Monte Carlo study. J Chemom;1:185–196.

17. Cramer R.D. III, Bunce J.D., Patterson D.E. (1988) Crossvalidation, bootstrapping and partial least squares compared with multiple regression in conventional QSAR studies. Quant Struct Act Relat;7:18–25.

18. Wold S. (1978) Cross validatory estimation of the number of components in factor and principal components models. Technometrics;4:397–405.

19. Wong R., Geladi P., Wold S., Esbensen K. (1988) Source contributions to ambient aerosol calculated by discriminant partial least squares regression (PLS). J Chemom;2:281–296.

20. Zhou C., Tang C., Chang E., Ge M., Lin S., Cline E., Tan C. et al. (2010) Discovery of 5-aryloxy-2,4-thiazolidinediones as potent GPR40 agonists. Bioorg Med Chem Lett;20:1298–1301.

21. Christiansen E., Urban C., Merten N., Libescher K., Karlsen K.K., Hamacher A., Spinrath A., Bond A.D., Drewke C., Ullrich S., Kassack M.U., Kostenis E., Ulven T. (2008) Discovery of potent and selective agonists of free fatty acid receptor 1 (FFA1/GPR40), a potent target for treatment of type II diabetes. J Med Chem;51:7061–7064.

22. Song F., Lu S., Gunnet J., Xu J., Wines P., Proost J., Liang Y., Baumann C., Lenhard J., Murray W.V., Demarest K.T., Kuo G.H. (2007) Synthesis and biological evaluation of 3-aryl-3-(4-phenoxy)-propionic acid as a novel series of G protein-coupled receptor 40 agonists. J Med Chem;50:2807–2817.

23. Lu S.-Y., Jiang Y.-J., Lv J., Wu T.-X., Yu Q.-S., Zhu W.-L. (2010) Molecular docking and molecular dynamics simulation studies of GPR40 receptor–agonist interactions. J Mol Graph Model;28:766–774.

24. Sum C.S., Tikhonova I.G., Neumann S., Engel S., Raaka B.M., Costanzi S., Gershengorn M.C. (2007) Identification of residues important for agonist recognition and activation in GPR40. J Biol Chem;282:2948–2955.

Note

a Tripos Inc., 1699 South Hanley Rd., St Louis, MO, USA.

Supporting Information

Additional Supporting Information may be found in the online version of this article:

Figure S1. Receptor Based Model where the dataset was aligned onto the template molecule 16 using the manual atom fit alignment.

Table S1. Summary of CoMFA statistical results for the GPR40 activity.

Table S2. Comparison of CoMSIA results for the ligand- and the receptor-based models.

Table S3. Observed and predicted pEC50 values of the diverse set of GPR40 agonists.

Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

Bhatt et al.

372 Chem Biol Drug Des 2011; 77: 361–372