Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies
Anders Karlén, Uppsala University
Aim of study
• Derive a “benchmark data set”
– Drug-like
– Physicochemically diverse
– Commercially available and inexpensive
– Amenable to analytical measurements
• Start the generation of benchmark data
– Derive good-quality data from the same lab
Possible use of the data set
• General description of drugs
• Developing ADME/TOX filters (permeability, solubility, plasma protein binding, etc.)
• To validate novel experimental techniques
Generation of a “benchmark” data set based on the list of drugs in Sweden (FASS 2001)
799 cpds
• Remove compounds: molecular weight >900; polymers, polypeptides; inorganic and metal containing
691 cpds
• Select commercially available
450 cpds
• Select < $800/g
370 cpds
• Select only oral, nasal, pulmonary, ocular, parenteral and rectal administered drugs
332 cpds
• Remove “odd” ATC classes, e.g. A01 (Mouth and teeth), A05 (Bile acids), A06 (Laxative)…
284 cpds
• Exp. design
24-compound data set
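The selection steps on this slide can be read as a chain of filters. The following is a minimal Python sketch of that idea, not the procedure actually used: the `Compound` record, its field names and the category labels are hypothetical, and the list of “odd” ATC classes is incomplete because the slide truncates it.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record layout; all field names are illustrative assumptions.
@dataclass
class Compound:
    name: str
    mol_weight: float
    kind: str                        # e.g. "small_molecule", "polymer", "polypeptide"
    price_per_gram: Optional[float]  # None means not commercially available
    routes: frozenset                # administration routes
    atc: str                         # ATC code, e.g. "N03"

ALLOWED_ROUTES = {"oral", "nasal", "pulmonary", "ocular", "parenteral", "rectal"}
EXCLUDED_KINDS = {"polymer", "polypeptide", "inorganic", "metal"}
ODD_ATC = ("A01", "A05", "A06")  # the slide lists further classes ("...")
MAX_MW = 900.0
MAX_PRICE = 800.0  # $/g

def select_candidates(compounds):
    """Apply the filters in order and record how many compounds survive each."""
    steps = [
        ("structure", lambda c: c.mol_weight <= MAX_MW and c.kind not in EXCLUDED_KINDS),
        ("available", lambda c: c.price_per_gram is not None),
        ("price", lambda c: c.price_per_gram < MAX_PRICE),
        ("route", lambda c: bool(c.routes & ALLOWED_ROUTES)),
        ("atc", lambda c: not c.atc.startswith(ODD_ATC)),
    ]
    kept = list(compounds)
    counts = [("start", len(kept))]
    for label, keep in steps:
        kept = [c for c in kept if keep(c)]
        counts.append((label, len(kept)))
    return kept, counts
```

Recording the count after each step makes it easy to compare a re-implementation against the compound numbers given on the slide.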
Cost and availability of the 691-compound data set
[Figure: histogram of the number of compounds per binned price per gram ($); bins: 0.0284 - 24.9, 24.9 - 50.2, 50.2 - 79.6, 79.6 - 100, 100 - 995, 995 - 3,228,000]
450 of the 691 compounds can be bought. Price range: $0.03/gram to $3,228,000/gram (2001)
[Figure: structures of Methenamine and Calcitriol]
Principal component analysis
28 molecular descriptors
• General descriptors
• General hydrogen bonding descriptors
• Hydrogen bond donor descriptors
• Hydrogen bond acceptor descriptors
[Figure: PCA score plot (SIMCA-P 11) with the directions Lipophilicity, Size and Polarity marked]
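As a rough illustration of how score plots like these are obtained from a descriptor table, here is a minimal pure-Python sketch of PCA by the NIPALS iteration. It assumes the descriptors are autoscaled (mean-centered, unit variance) first; the scaling and settings actually used in SIMCA-P are not stated on the slides, and the function names are hypothetical.

```python
import math

def autoscale(rows):
    """Center each column and scale it to unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def nipals_pca(rows, n_components=2, tol=1e-16, max_iter=1000):
    """Return (scores, loadings), one list per component."""
    E = autoscale(rows)
    n, p = len(E), len(E[0])
    scores, loadings = [], []
    for _component in range(n_components):
        # Start from the residual column with the largest sum of squares.
        j0 = max(range(p), key=lambda j: sum(r[j] ** 2 for r in E))
        t = [r[j0] for r in E]
        for _iteration in range(max_iter):
            tt = sum(v * v for v in t)
            # Loading vector: project the columns onto t, then normalize.
            pvec = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
            norm = math.sqrt(sum(v * v for v in pvec))
            pvec = [v / norm for v in pvec]
            # Score vector: project the rows onto the loading vector.
            t_new = [sum(E[i][j] * pvec[j] for j in range(p)) for i in range(n)]
            diff = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if diff < tol * sum(v * v for v in t):
                break
        scores.append(t)
        loadings.append(pvec)
        # Deflate: remove the extracted component from the residual matrix.
        E = [[E[i][j] - t[i] * pvec[j] for j in range(p)] for i in range(n)]
    return scores, loadings
```

Plotting `scores[1]` against `scores[0]` corresponds to a t[2] versus t[1] score plot; the loadings show which descriptors drive each direction.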
Principal component analysis
[Figure: PCA score plots, t[2] versus t[1] (SIMCA-P+ 11), colored by MOL_WEIGHT (bins 0 - 200, 200 - 400, 400 - 600, 600 - 800, 800 - 1000), by MLOGP (bins -7 to -4, -4 to -1, -1 to 2, 2 to 5, 5 to 8) and by PSASAVOL (bins 0 - 100, 100 - 200, 200 - 300, 300 - 400), followed by the score plot with the directions Polarity, Size and Lipophilicity marked]
The factorial design: “A face-centered central composite design”
Corner points: (+ + -), (+ - -), (+ - +), (- + -), (+ + +), (- + +), (- - +), (- - -)
20 protolytes, 4 nonprotolytes
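For readers unfamiliar with the design, a face-centered central composite design in three factors consists of the 8 corner points, the 6 face-center (axial) points and a center point, all in coded units. A minimal sketch that enumerates them follows; the function name is hypothetical, and how compounds were assigned to the design points is not covered here.

```python
from itertools import product

def face_centered_ccd(n_factors=3):
    """Enumerate design points in coded units (-1, 0, +1)."""
    # Corner points: every combination of low/high levels.
    corners = [tuple(p) for p in product((-1, 1), repeat=n_factors)]
    # Axial points sit on the face centers (alpha = 1), one factor at a time.
    axial = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level
            axial.append(tuple(point))
    center = [tuple([0] * n_factors)]
    return corners, axial, center
```

With three factors this gives 8 corner points, 6 axial points and 1 center point, matching the sign patterns listed on the slide for the corners.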
24-compound data set
[Figure: chemical structures of the 24 compounds]
Captopril
Bendroflumethiazide^a
Glipizide
Levodopa
Levothyroxine
Thiamazole
Amantadine
Sulindac
Amiloride
Carbamazepine
Chlorprothixene
Hydrochlorothiazide
Chlorzoxazone
Prednisone
Tinidazole
Flupenthixol
Metoclopramide
Fenofibrate
Tetracycline
Folic acid
Carisoprodol^a
Meclizine^a
Terfenadine^b
Erythromycin
The cost of buying the entire data set (at least 1 gram of each compound) is less than $1,500
Comparison of the data sets with respect to some common molecular descriptors
               691-compound data set       24-compound data set
Descriptor     Min      Max     Mean       Min     Max     Mean
MW             60       854     347        114     777     349
PSA            0        373     93         8       246     99
logPMor        -6.4     7.6     1.9        -2.0    5.3     1.9
logDACD_6.5    -10.6    12.3    0.74       -5.0    4.8     0.94
HBD            0        19      2.4        0       8       2.7
HBA            0        19      4.9        1       14      4.7
[Figure: structures of the two extreme compounds]
Candesartan cilexetil: logPMor = 7.6
Neomycin: HBD = 19
Comparison of the data sets with respect to functional groups
[Figure: bar chart of the percent of compounds containing each functional group (y-axis 0% to 75%), comparing the 24-set with the 691-set (FASS, druglike only). Functional groups: aliphatic q-amine, aliphatic t-amine, aliphatic s-amine, aliphatic p-amine, COOH, benzene, aliphatic OH, aromatic t-amine, aromatic s-amine, aromatic p-amine, aromatic OH, ester, heterocyclic]
Comparison of the data sets with respect to ATC classes
The Anatomical Therapeutic Chemical (ATC) classification system is the most commonly used classification system for drug substances.
Distribution in ATC
                      Number of substances    Percent of data set
ATC  Description      24-set    691-set       24-set    691-set
A    GI               1         69            4.2%      9.99%
B    Blood            0         21            0.0%      3.04%
C    Cardio           2         89            8.3%      12.88%
D    Topical          0         36            0.0%      5.21%
G    Gen.hormones     1         38            4.2%      5.50%
H    Hormones         3         14            12.5%     2.03%
J    Infection        5         89            20.8%     12.88%
L    Tum.,immuno      1         53            4.2%      7.67%
M    Muscle,mov.      3         37            12.5%     5.35%
N    Nervous          6         134           25.0%     19.39%
P    Antiparasite     0         13            0.0%      1.88%
R    Respiration      1         52            4.2%      7.53%
S    Eye,ear          1         24            4.2%      3.47%
V    Various          0         22            0.0%      3.18%
Start the generation of benchmark data. Derive good-quality data from the same lab
1. Measurement of pKa by pH-metric or pH-UV technique (n=20)
2. Measurement of lipophilicity
(a) pH-metric logP (n=18)
(b) capacity factors by RP-HPLC (n=21)
3. Measurement of intrinsic and kinetic solubility: pH-metric solubility (CheqSol technique) or shake-plate solubility (n=17)
4. Measurement of permeability across Caco-2 cells, A to B direction (n=22)
2. Lipophilicity: pH-metric measurement of logP and logD
[Figure: bar chart of logP (neutral) and logD (pH 7.4), y-axis from -3 to 7, for Amantadine, Amiloride, Bendroflumethiazide, Captopril, Chlorprothixene, Chlorzoxazone, Erythromycin, Fenofibrate, Flupenthixol, Glipizide, Hydrochlorothiazide, Levodopa, Levothyroxine, Meclizine, Metoclopramide, Sulindac, Terfenadine, Tetracycline, Thiamazole, Tinidazole]
logP missing for:
• Folic acid
• Carbamazepine
• Prednisone
• Carisoprodol
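The relation between logP (neutral species) and logD at a given pH can be illustrated with the textbook expression for a monoprotic compound. The sketch below assumes that only the neutral species partitions into the organic phase and that there is a single ionizable group; it is not the pH-metric procedure used for the measurements, and the function name is hypothetical.

```python
import math

def log_d(log_p, pka, ph, kind):
    """Estimate logD for a monoprotic acid or base from logP and pKa."""
    if kind == "acid":
        exponent = ph - pka   # the acid ionizes above its pKa
    elif kind == "base":
        exponent = pka - ph   # the base ionizes below its pKa
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    # The neutral fraction is 1 / (1 + 10**exponent).
    return log_p - math.log10(1.0 + 10.0 ** exponent)
```

For a compound with several ionizable groups, or one whose ion pairs partition appreciably, this simple expression is no longer adequate.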
2. Lipophilicity: experimental logP vs calculated logP
[Figure: four scatter plots of calculated logP against experimental logP (logPexp)]
• Crippen logP: R2 = 0.70
• ACD/LogP: R2 = 0.88
• ClogP (BioByte): R2 = 0.89
• Moriguchi logP: R2 = 0.80
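An R2 value of this kind can be computed for any pair of experimental and calculated logP lists. The sketch below assumes that R2 is the squared Pearson correlation, which equals the coefficient of determination of a straight-line fit with intercept; the slides do not state how R2 was calculated, and the function name is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)
```

Note that a high R2 only measures linear association; a calculated logP can correlate well with experiment and still be systematically offset.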
2. Lipophilicity: correlation between the measured HPLC capacity factor (k) and pH-metric logD (pH 6.8)
• Compounds from the 8 corner points have different colors
• The 2 compounds at each corner point have the same color
• The axis points are colored black
• The center point is pink
R2 = 0.92 (pH = 6.8)
3. Solubility: measurement of intrinsic solubility using CheqSol (24-compound data set)
[Figure: bar chart of Log(µg/mL), y-axis from -3 to 4, for Terfenadine, Meclizine, Chlorprothixene, Fenofibrate, Glipizide, Folic acid, Sulindac, Bendroflumethiazide, Levothyroxine, Flupenthixol, Metoclopramide, Carbamazepine, Prednisone, Tetracycline, Hydrochlorothiazide, Chlorzoxazone, Amantadine]
Solubility ranges from 0.009 µg/mL to 2119 µg/mL
3. Solubility
http://www.cheqsol.com/download%20files/download01.pdf
19 of the compounds studied are also present in the 691-compound data set
CheqSol solubility ranges from 0.9 µg/mL to 3500 µg/mL in these 19 compounds
Legend: compound not present in the 691-compound data set
Table columns: No., Name, Equilibrium solubility (CheqSol, Shake-Flask, Literature), Kinetic solubility (Chaser, Non-chaser)
1 Phthalic Acid 5330 5950 8462
2 Quinine 363 201 491 391
3 Trazodone 134.6 138.0 435
4 Nitrofurantoin 112.5 109.5 78.9 319
5 Nortriptyline 27.0 49.3 20.0 27.3
6 Verapamil 48.5 48.5 9.7 47.8
7 Niflumic Acid 9.53 29.5 59
8 Imipramine 17.2 21.7 18.1 17.3
9 Flumequine 34.2 20.7 121
10 Furosemide 19.7 20.4 5.9 96
11 Maprotiline 5.80 8.05 3.49 77
12 Piroxicam 5.92 5.95 3.16 233
13 Warfarin 5.30 5.25 5.60 120
14 Chlorpromazine 2.70 2.41 1.71 2.70
15 Lidocaine 3500 3810 4600
16 Famotidine 740 1100 5900
17 Hydrochlorothiazide 630 700 2400
18 Chlorpheniramine 608.3 615.2 668
19 Sulfamerazine 200.3 203.0 701
20 Ketoprofen 130.6 178.0 336
21 Propranolol 81.0 70.0 340
22 Ibuprofen 50.0 49.0 180
23 Pindolol 41.7 32.7 1424
24 Miconazole 1.00 0.67
25 Diclofenac 0.90 0.80 45
26 Amodiaquin 0.41 8.8
27 Pamoic acid 0.0003 0.019
All results in µg/mL
In the 24-compound data set the solubility ranges from 0.009 µg/mL to 2119 µg/mL
The 24-compound data set is structurally diverse
[Figure: PCA score plot, t[2] versus t[1] (SIMCA-P+ 11), with the directions Polarity, Size and Lipophilicity marked; legend: No class, 19-data set, 24-data set]
4. Permeability/absorption
[Figure: log-log plot of human jejunum permeability (x 10^-4 cm/s) at pH 6.5 (Y) against Caco-2 permeability (x 10^-6 cm/s) at pH 6.5 (X)]
Compounds: Furosemide, Hydrochlorothiazide, Atenolol, Cimetidine, Mannitol, Terbutaline, Amoxicillin (C), Lisinopril (C), Metoprolol, Cephalexin (C), Enalapril (C), Propranolol, Phenylalanine (C), Desipramine, Antipyrine, Piroxicam, Verapamil (C), Ketoprofen, Naproxen, D-Glucose (C)
• logY = 0.6532 logX - 0.3036, R2 = 0.7276 (all drugs)
• logY = 0.7524 logX - 0.5441, R2 = 0.8492 (passively diffusive)
• logY = 0.542 logX + 0.06, R2 = 0.7854 (carrier-mediated)
Sun, D. et al. Comparison of Human and Caco-2 Gene Expression Profiles for 12,000 Genes and the Permeabilities of 26 Drugs in the Human Intestine and Caco-2 Cells. Pharm Res 2002, 19, 1398-1413
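The regression lines quoted from Sun et al. can be applied directly to a Caco-2 value. The sketch below assumes base-10 logarithms, X as Caco-2 permeability in units of 10^-6 cm/s and Y as human jejunum permeability in units of 10^-4 cm/s, as read from the axis labels; the function and dictionary names are hypothetical.

```python
import math

# (slope, intercept) of logY = slope * logX + intercept, as quoted on the slide.
FITS = {
    "all": (0.6532, -0.3036),
    "passive": (0.7524, -0.5441),
    "carrier": (0.542, 0.06),
}

def predict_jejunum_permeability(caco2_papp, fit="all"):
    """Map Caco-2 Papp (x 10^-6 cm/s) to human jejunum permeability (x 10^-4 cm/s)."""
    if caco2_papp <= 0:
        raise ValueError("permeability must be positive")
    slope, intercept = FITS[fit]
    return 10.0 ** (slope * math.log10(caco2_papp) + intercept)
```

Given R2 values between roughly 0.73 and 0.85, such a prediction is best treated as an order-of-magnitude estimate.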
4. Permeability/absorption: in vitro Papp values in human Caco-2 cells
[Figure: Papp values grouped as Low, Medium and High]
Suggestions on the “Uppsala diverse data set” usage
• The 24 compounds can be used
– as a test set for testing already derived models of permeability, lipophilicity, solubility etc.
– as a validation set for new experimental techniques
– on its own for building and validating models by dividing it into a training set and a test set
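As an illustration of the last suggestion, the sketch below splits the 24 compounds into a training set and a test set at random. The 75/25 ratio, the fixed seed and the function name are arbitrary assumptions; a split that respects the experimental design (for example, keeping one compound per corner point in each set) may be preferable.

```python
import random

COMPOUNDS = [
    "Captopril", "Bendroflumethiazide", "Glipizide", "Levodopa",
    "Levothyroxine", "Thiamazole", "Amantadine", "Sulindac",
    "Amiloride", "Carbamazepine", "Chlorprothixene", "Hydrochlorothiazide",
    "Chlorzoxazone", "Prednisone", "Tinidazole", "Flupenthixol",
    "Metoclopramide", "Fenofibrate", "Tetracycline", "Folic acid",
    "Carisoprodol", "Meclizine", "Terfenadine", "Erythromycin",
]

def split_train_test(items, test_fraction=0.25, seed=0):
    """Shuffle reproducibly and return (training set, test set)."""
    rng = random.Random(seed)  # local generator; global random state is untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

With only 24 compounds, results will depend noticeably on the split, so repeating the split with several seeds is a reasonable precaution.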
We hope that other groups are willing to help us supplement the characterization started here
“Benchmark data set”
J. Med. Chem.; (ASAP); 2006; 49(23); 6660-6671
Acknowledgements
AstraZeneca R&D Mölndal: Susanne Winiwarter, Anna-Lena Ungell, Johan Wernevik, Fredrik Bergström, Leif Engström
Sirius Analytical Instruments Ltd: John Comer, Karl Box, Ruth Allen, Jon Mole
Faculty of Pharmacy, Uppsala University: Christian Sköld, Torbjörn Lundstedt, Anders Hallberg, Hans Lennernäs