Drug Discovery Today: Technologies Vol. 3, No. 4 2006
Editors-in-Chief
Kelvin Lam – Pfizer, Inc., USA
Henk Timmerman – Vrije Universiteit, The Netherlands
Techniques for rational design
Proteomic approaches in drug discovery
Timothy D. Veenstra
Laboratory of Proteomics and Analytical Technologies, SAIC-Frederick Inc., NCI-Frederick, P.O. Box B, Frederick, MD 21702, USA
To find a new drug against a chosen target usually
involves high-throughput screening, wherein large
libraries of chemicals are tested to determine their
ability to modify the target. Before a target can be
chosen, however, it must first be discovered. The omics
era has brought unprecedented abilities to screen cells
at the gene, transcript, protein, and metabolite level in
search of novel drug targets. Of the big four classes of
biomolecules, proteins remain the principal target of
drug discovery. The recent developments in proteomic
technologies have brought with them the ability to
comparatively screen large numbers of proteins within
clinically distinct samples. This capability has enabled
non-biased studies in which the goal is to discover
proteins that may act as suitable diagnostic biomarkers
or therapeutic drug targets. Although proteomics
technology has brought with it much hope, there are
still many challenges associated with leveraging the
experimental data into the discovery of novel drug
targets.
E-mail address: T.D. Veenstra ([email protected])
1740-6749/$ © 2006 Elsevier Ltd. All rights reserved. DOI: 10.1016/j.ddtec.2006.10.001
Section Editor: Hugo Kubinyi – University of Heidelberg, Heidelberg, Germany
Introduction
Drug discovery can be defined as a research process that
identifies and develops a molecule that produces a desired
effect in a living organism. Although the human cell is made
up of a large number of genes, transcripts, proteins, and
metabolites, most often a drug is designed to act upon a
protein [1]. Although on the surface the process may seem
straightforward – find a deranged protein that is causing an
adverse effect and then use a molecule to block its effect –
there are challenges, both technical and physiological, that
make drug discovery a daunting task.
The first challenge is to find the protein target. Although
this article will not discuss this issue at length, the initial need,
before any instrumental analysis can be implemented, is the
selection of suitable samples to be used in the discovery
of the target. Fundamentally, the sample set should
include materials acquired from patients who are affected by
a specific disorder and those acquired from healthy, matched
controls. Although human samples will be necessary at some
point if a drug is to be approved for human use, drug discovery
can often begin with samples that are much easier to obtain and
manipulate, such as cell cultures or a suitable animal
model. Although the efficacy of a drug in a non-human
system is often a poor predictor of its efficacy in a human,
issues such as husbandry and genetic background can be
controlled in non-human models.
Once a suitable model has been found, the next step is to
identify the deranged protein(s) that is (are) responsible for
the adverse condition being studied. This step is where the
technological developments made in surveying the protein
content of cells, tissues, and organisms have changed the
design of drug discovery studies. In the past (and to a large
extent currently), protein science was dominated by
hypothesis-driven studies in which a specific protein or a small number of
proteins are studied to determine if they play a role in a
particular cell phenotype. Today’s technologies allow
discovery-driven studies in which the aim is to gather as much
information as possible on as many proteins as possible to
determine which proteins are contributing to the observed
phenotype. As will be discussed later, although the ability to gather
more information at the protein level would seem to simplify
the problem and enable the identification of large numbers of
novel drug targets, it has also resulted in a whole new set of
questions that need to be considered and new hurdles that
need to be cleared.
Proteomic technologies for the discovery of drug
targets
The mention of proteomics most often invokes images of
two-dimensional gels and mass spectrometers. If a nonbiased
approach is to be taken, the attributes of mass spectrometry
(MS) make it arguably the most powerful technology for the
discovery of protein drug targets. Although both two-dimen-
sional gels and MS play a major role in proteomics, they are
not the only technologies available, or necessary, for the
discovery of drug targets (Fig. 1). The successful discovery
of drug targets relies on a variety of techniques, including
appropriate sample preparation, fractionation, protein
measurement, and bioinformatics. Although much of the credit
for the ability to characterize proteomes to the extent possible
today goes to the development of more powerful
mass spectrometers, the contributions of sample preparation
and protein fractionation should not be overlooked. After the
clinical sample set has been acquired, the design of the
sample preparation steps is probably the most critical
factor in determining success or failure. The
sample preparation steps need to be designed according to
the level of information available concerning the possible drug
target. For example, if there is evidence, empirical or otherwise,
that the target is a membrane receptor, ultracentrifugation
should be incorporated into the sample preparation
steps to isolate membranes from the samples (if possible).
In the cases of serum and plasma, it is wise to remove high
abundance proteins, such as albumin and immunoglobulins,
because they can interfere with downstream analyses [2].
Unfortunately, in too many cases very little is known about
the potential drug target. In this case, the aim is to characterize
as many proteins within the sample as possible.
Figure 1. A partial view of various proteomic technologies important in drug discovery.
Figure 2. High-throughput peptide identification using liquid chromatography (LC) coupled on-line with tandem mass spectrometry (MS). The mass spectrometer takes an MS scan and measures the intensity of various peptide ions observed temporally during a separation of a complex peptide mixture (a and b). The most abundant peptide ion (c) is isolated and subjected to collision-induced dissociation (d). The resulting tandem MS spectrum is analyzed by the appropriate software to identify the peptide sequence that would most probably give rise to this fragmentation pattern (e). This peptide sequence is then correlated back to its protein of origin. Modern mass spectrometers conduct steps (b) through (d) in a rapid cyclical fashion, enabling hundreds of peptides within complex mixtures to be identified per hour.
The next decision point concerns which type of separation is
best for the samples of interest. Two-dimensional polyacrylamide
gel electrophoresis (2D-PAGE) has been widely used in
comparing proteomes extracted from comparative samples
[3]. In 2D-PAGE, samples are fractionated based on their
isoelectric point (pI) and molecular mass. After staining of
the proteins, spots that are more or less intense within the
comparative samples are excised from the gel and identified.
Two-dimensional PAGE enables the relative abundances of
proteins from different samples to be compared within the gel
on the basis of intensity of the protein staining. Mass spectro-
metry is the tool of choice for protein identification because
of its throughput, sensitivity, and ability to identify proteins
based on sequence-related information [4].
Another approach that is commonly used when the goal is
to characterize large numbers of proteins is to circumvent
2D-PAGE and directly analyze the samples by MS [5]. A common
misconception about this type of MS-based proteomics is that
proteins are measured directly; in most studies it is peptides,
rather than proteins, that are characterized. In these bottom-up
studies, the entire proteome is digested into tryptic peptides
that are introduced
into the mass spectrometer for identification. The digestion
of potentially thousands of proteins results in potentially tens
to hundreds of thousands of peptides. Therefore, it is neces-
sary to fractionate this mixture before MS analysis. The most
commonly used prefractionation technique is strong cation
exchange (SCX) followed by reversed-phase liquid chromatography.
This combination can be performed either on-line,
using a biphasic column, or off-line, with fractions
collected from the SCX column. The reversed-phase
separation is always performed directly on-line so that
peptides elute directly from this column into the mass
spectrometer.
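
As a rough illustration of the bottom-up digestion step described above, the sketch below performs an in-silico tryptic digest. It assumes the commonly used simplified rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P). The function name `tryptic_digest`, the `missed_cleavages` parameter, and the toy sequence are illustrative assumptions and do not come from the article.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return tryptic peptides of a protein sequence (one-letter codes).

    Simplified rule (an assumption for this sketch): cleave after K or R
    unless the following residue is P.
    """
    if not sequence:
        return []
    # Positions at which the chain is cut (index of the residue after the cut).
    sites = [i + 1 for i, aa in enumerate(sequence[:-1])
             if aa in "KR" and sequence[i + 1] != "P"]
    bounds = [0] + sites + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        # Allow up to `missed_cleavages` skipped cut sites per peptide.
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end >= len(bounds):
                break
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


# Hypothetical toy sequence, not a real protein.
print(tryptic_digest("AAKRPGGRCCKDD"))
print(tryptic_digest("AAKRPGGRCCKDD", missed_cleavages=1))
```

Even this toy case shows why fractionation is needed: allowing a single missed cleavage nearly doubles the number of peptide species, and a real proteome contains thousands of proteins.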
The ability of the mass spectrometer to identify proteins
rapidly is arguably the parameter that makes this instrumen-
tation the driving force in proteomics today. How exactly
does a mass spectrometer identify peptides? As shown in
Fig. 2, peptides are being constantly eluted from a
reversed-phase column into the mass spectrometer
(Fig. 2a). During this separation, the instrument records
the mass-to-charge (m/z) ratios of the peptides that are elut-
ing at a specific time point (Fig. 2b). The instrument then
selects and isolates the most intense ion observed in the
previous scan (Fig. 2c) and fragments it, in a process referred
to as tandem MS, to create a series of sequence ladders
(Fig. 2d). After this fragmentation event the instrument
proceeds to isolate and fragment the next most abundant
peptide ion. It repeats this sequential ion selection and
fragmentation for the 3 to 10 most abundant
peptide ions (depending on the operator setting). Today’s mass
spectrometers are able to collect approximately 7000 tandem
mass spectra in a single hour. All of these spectra are then
analyzed using the appropriate software and a protein or
genome database to identify the peptides that gave rise to the
individual spectra (Fig. 2e). In a typical analysis, 10–20% of
the spectra will give a hit, allowing between 700 and 1400
peptides to be identified confidently and then correlated to their
proteins of origin.
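
The selection cycle and the yield estimate above can be sketched in a few lines. This is a minimal illustration, assuming an MS scan is represented as a list of (m/z, intensity) pairs. The function names and the peak values are hypothetical, and refinements used on real instruments are deliberately left out.

```python
def select_precursors(ms_scan, top_n=5):
    """Pick the top_n most intense peaks of one MS scan for fragmentation.

    ms_scan: list of (m/z, intensity) tuples. top_n mirrors the operator
    setting of 3 to 10 ions mentioned in the text.
    """
    return sorted(ms_scan, key=lambda peak: peak[1], reverse=True)[:top_n]


def expected_identifications(spectra_per_hour, hit_rate):
    """Number of tandem spectra expected to yield a confident peptide hit."""
    return round(spectra_per_hour * hit_rate)


# Hypothetical peak list for a single MS scan.
scan = [(445.12, 1.2e5), (512.30, 8.9e5), (623.84, 3.4e5), (701.41, 6.1e4)]
print(select_precursors(scan, top_n=3))

# Figures quoted in the text: ~7000 spectra per hour, 10-20% giving a hit.
print(expected_identifications(7000, 0.10), expected_identifications(7000, 0.20))
```

The second call simply reproduces the arithmetic behind the range of 700 to 1400 peptide identifications per hour.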
Detection of changes in protein abundance
There are many attributes that can make a protein a potential
drug target. Protein phosphorylation, which controls many
aspects of cell physiology, is an important target for drug
design. For example, Gleevec works by targeting a constitutively
active tyrosine kinase, BCR-Abl, and shutting off the
uncontrolled cell growth associated with the mutated gene
product. G-protein-coupled receptors have historically been
the most important group of drug targets. Not surprisingly,
protein kinases are now the second most important class of
drug target [6]. Other modifications on proteins, such as
prenylation [7], methylation [8], and sulfation [9], among others,
have also been explored as potential drug targets. Although
MS-based proteomics is capable of detecting such modifications,
outside of phosphorylation the science is not mature
enough to identify such changes on the scale and with the
reliability necessary for drug target discovery. The major
focus in MS-based proteomics is to identify changes in the
relative abundances of proteins between comparative sample
sets.
Figure 3. Quantitative methods for comparative proteomics. (a) Stable isotope labeling and (b) subtractive proteomics.
As mentioned above, 2D-PAGE provides direct measurement
of changes in protein relative abundances courtesy of
the protein staining intensity. In non-gel based approaches
other means must be used to identify those proteins that are
differentially abundant between the sample cohorts. There
are essentially two main strategies used to gain a measure of a
protein’s relative abundance in different proteome samples:
differential stable isotope labeling [10] and subtractive
proteomics (Fig. 3) [11]. There are many different methods in
which differential stable isotope labeling is used in quantitative
proteomics; however, they all share the same basic
premise: label amino acids within one proteome with a light
isotope of a common element (e.g. 12C, 14N, and so forth) and
label the other proteome with the matching heavy isotope
(e.g. 13C, 15N). This labeling can be done either chemically
(e.g. in the case of isotope-coded affinity tags or iTRAQ), or
metabolically (e.g. culturing of cells in medium enriched for a
particular heavy stable isotope). Although there are subtle
differences in the sample processing steps depending on the
type of stable isotope labeling approach used, in either case
the differentially labeled proteome samples are combined
and digested into tryptic peptides. The peptide mixture is
then analyzed through a combination of multidimensional
chromatography coupled directly on-line with data-depen-
dent tandem MS, as shown previously in Fig. 2. The relative
abundance of the peptides within the different samples is
measured in the MS scan, and MS/MS is used for identifica-
tion. The result is a list of the relative abundances of proteins
among samples being compared. The hope is that a protein(s)
that has an observable abundance difference between two (or
among more) sets of samples is an intriguing candidate as a
potential drug target and can be graduated to further valida-
tion and future clinical development.
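
To make the quantitation step concrete, the following sketch turns paired light and heavy peptide intensities from the MS scan into one relative abundance per protein. It is only an illustration under stated assumptions: the input layout, the function name `protein_ratios`, and the choice of the median to summarize peptide ratios are assumptions of this example, not the procedure of any particular labeling kit or software package.

```python
from collections import defaultdict
from statistics import median


def protein_ratios(peptide_measurements):
    """Summarize heavy/light peptide ratios per protein.

    peptide_measurements: iterable of (protein, light_intensity,
    heavy_intensity) tuples, one per identified peptide pair.
    Pairs with a missing (zero) signal are skipped.
    """
    ratios = defaultdict(list)
    for protein, light, heavy in peptide_measurements:
        if light > 0 and heavy > 0:
            ratios[protein].append(heavy / light)
    # Median is an assumed, outlier-tolerant summary of the peptide ratios.
    return {protein: median(values) for protein, values in ratios.items()}


# Hypothetical intensities, for illustration only.
data = [
    ("P1", 100.0, 200.0),
    ("P1", 50.0, 150.0),
    ("P1", 10.0, 20.0),
    ("P2", 80.0, 40.0),
    ("P3", 0.0, 10.0),   # no light signal, so no ratio can be formed
]
print(protein_ratios(data))
```

A ratio above 1 indicates higher abundance in the heavy-labeled sample, and a ratio below 1 indicates higher abundance in the light-labeled sample.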
Although stable isotope labeling methods enable the quan-
titation of thousands of proteins in complex clinical samples,
they are low throughput, requiring days to compare even a
couple of samples. They are generally limited to the compar-
ison of no more than four samples, and metabolic stable
isotope labeling methods are not applicable to the study of
human samples. Although they have made a major impact in
the analysis of cellular and tissue proteomes, stable isotope
labeling methods have not been widely used in the study of
biofluids. Although the reasons for this are not readily
obvious, it is possible that the domination of serum and
plasma by a few highly abundant proteins impairs the ability
to chemically modify lower-abundance proteins.
Subtractive proteomic approaches have been recently
developed to simplify and increase the throughput of analyz-
ing clinically important samples [11]. These methods do not
rely on gels or stable isotopes, but quantitate proteins based
on the number of peptides identified for each species
(Fig. 3b). In this method, proteomes are extracted from a
series of biological samples and digested into tryptic peptides.
The peptide mixtures are then individually analyzed using
multidimensional chromatography coupled directly on-line
with a mass spectrometer operating in a data-dependent
tandem MS mode (Fig. 2). The relative abundance of each
protein across a set of samples is determined by the number of
peptides identified for that specific protein.
This quantitative method is based on the observation that
the number of unique peptides identified for a protein is
related to its abundance in the mixture. For example, albumin,
which is present at ~60–80 mg/mL, is consistently
detected by large numbers of peptides (i.e. >20) in the MS
analysis of serum, whereas lower abundance proteins such as
cytokines are generally identified by one or two peptides [2].
This result is directly related to the concentration of albumin
(i.e. ~60 mg/mL) compared with cytokine proteins (i.e. in the
ng/mL range). The subtractive approach is an attractive
approach to screening for changes in protein abundances
across many samples because of its inherent simplicity and
the fact that an unlimited number of samples can be inter-
compared, whereas stable isotope labeling methods in prac-
tice have been limited to two-way (e.g. ICAT) or four-way (e.g.
iTRAQ) comparisons. Like most techniques, however, it also
has its disadvantages. It is relatively low throughput. Each
sample would take a minimum of one day to acquire the
necessary data even if the whole process was automated. The
quantitative comparison method is imprecise compared with
stable isotope labeling methods and, therefore, changes less
than threefold cannot be accurately determined with a high
confidence level. Low abundance proteins, although detect-
able, may not provide enough unique peptide identifications
to be quantitated using this method.
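
The peptide-counting comparison and its limits can be sketched as follows. The threefold cut-off comes from the text above. The minimum peptide count of three, the function name `count_based_changes`, and the example counts are assumptions made purely for illustration.

```python
def count_based_changes(counts_a, counts_b, min_fold=3.0, min_peptides=3):
    """Classify proteins by comparing unique-peptide counts in two samples.

    counts_a, counts_b: dicts mapping protein name -> number of unique
    peptides identified. min_peptides is a hypothetical floor below which
    a protein is considered too sparsely sampled to quantitate.
    """
    results = {}
    for protein in sorted(set(counts_a) | set(counts_b)):
        a = counts_a.get(protein, 0)
        b = counts_b.get(protein, 0)
        if max(a, b) < min_peptides:
            results[protein] = "too few peptides"
        elif min(a, b) == 0:
            results[protein] = "detected in one sample only"
        else:
            fold = max(a, b) / min(a, b)
            results[protein] = "changed" if fold >= min_fold else "unchanged"
    return results


# Hypothetical peptide counts for two serum samples.
sample_a = {"albumin": 25, "protein_x": 12, "cytokine_y": 1, "protein_z": 4}
sample_b = {"albumin": 22, "protein_x": 3, "cytokine_y": 2}
print(count_based_changes(sample_a, sample_b))
```

In this toy case the cytokine falls below the counting floor in both samples, which mirrors the limitation noted above for low abundance proteins.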
Challenges in drug target discovery
Although MS-based methods are routinely able to detect
hundreds of differences between biological samples, this
ability is somewhat of a blessing and a curse. The blessing
is in the ability to detect so many differences and the curse is
in trying to determine which differences are most important
and likely to survive downstream pre-clinical validation.
Obviously many differences, such as inflammatory or
acute-phase response proteins, can be ruled out as potential
drug targets, but how to determine the best candidates is still
a difficult chore. One approach that is now routinely used is to
compare changes in the proteome with those observed in an
mRNA array. Unfortunately, numerous studies have now
shown that the correlation between the amount of a protein
and its transcript’s abundance is poor. For example, in a
study conducted in our laboratory comparing changes in the
abundances of proteins and their transcripts during osteo-
blast differentiation, we found that the correlation was an
abysmal 0.09 [12]. There are many potential reasons for this
lack of correlation ranging from post-transcriptional proces-
sing events to temporal differences in mRNA and protein
expression. The data were then re-examined by binning
proteins and their transcripts into functional pathways
and calculating the correlation between these groups.
As shown in Table 1, a series of different functional
pathways, including cell cycle regulation and apoptosis
induction, showed significant correlation. This comparison
allows potential drug targets to be localized
within specific functional pathways that can be examined
using hypothesis-driven studies directed towards the indi-
vidual proteins.
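
The difference between gene-level and pathway-level agreement can be illustrated with a small numerical sketch. It shows one plausible reading of the binning idea, in which abundance changes are averaged within each pathway before the Pearson correlation is computed across pathways. This is an assumption of the example and not necessarily the procedure used in [12]. All gene names, pathway labels, and values below are invented.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pathway_means(changes, pathway_of):
    """Average the per-gene changes within each pathway."""
    grouped = defaultdict(list)
    for gene, value in changes.items():
        grouped[pathway_of[gene]].append(value)
    return {name: sum(vals) / len(vals) for name, vals in grouped.items()}


# Invented toy values (e.g. log2 fold changes); not data from the study.
pathway_of = {"g1": "A", "g2": "A", "g3": "B", "g4": "B", "g5": "C", "g6": "C"}
protein_change = {"g1": 2.0, "g2": 0.0, "g3": -2.0, "g4": 0.0, "g5": 1.0, "g6": -1.0}
mrna_change = {"g1": 0.0, "g2": 2.0, "g3": 0.0, "g4": -2.0, "g5": -1.0, "g6": 1.0}

genes = sorted(protein_change)
gene_r = pearson([protein_change[g] for g in genes],
                 [mrna_change[g] for g in genes])

prot_means = pathway_means(protein_change, pathway_of)
mrna_means = pathway_means(mrna_change, pathway_of)
pathways = sorted(prot_means)
pathway_r = pearson([prot_means[p] for p in pathways],
                    [mrna_means[p] for p in pathways])

print(f"gene-level r = {gene_r:.2f}, pathway-level r = {pathway_r:.2f}")
```

In this contrived case individual proteins and transcripts disagree, yet the pathway averages move together, which is the qualitative pattern the pathway-level comparison is meant to reveal.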
Table 1. Pearson correlation values comparing overall functional pathways of proteins and their transcripts during osteoblast differentiation
BioCarta pathway                  Pearson correlation (protein vs. mRNA)   P-value
GO pathway                        0.501                                    0.047
Cell cycle                        0.829                                    0.048
Integrin-mediated signaling       0.763                                    0.046
G-protein-coupled receptor        0.963                                    0.046
Induction of apoptosis            0.963                                    0.050
Mitosis                           0.825                                    0.050
Rho protein signal transduction   0.831                                    0.049
Although poor correlation was observed at the individual protein/transcript level, good correlation was observed when the overall abundance changes seen within functional pathways were compared [12].
Figure 4. Proteomic technologies in the discovery of a biomarker and possible drug target for interstitial cystitis (IC). A series of chromatography steps were performed in which desired fractions were graduated based on their activity in a cell-based assay. The antiproliferative factor (APF) was identified by tandem mass spectrometry (MS) of a simplified fraction that still retained the desired activity. A biotinylated version of APF was synthesized and coupled to an avidin column to serve as a bait to isolate its receptor. The receptor was identified, and validated, by MS and Western blotting as CKAP4.
Let us assume that global screening has yielded
potential drug targets. It is at this point that many of the
other technologies highlighted in Fig. 1, such as structural
proteomics and binding measurements, become relevant.
Obviously the standard approach of conducting high-throughput
screening of combinatorial libraries of compounds
against the proposed target will play a critical role,
but it is advantageous to have a purified version of the drug
target available to determine its binding characteristics. The
determination of protein structures has seen a tremendous
increase in throughput in the last few years as automated
methods of testing for the optimal expression conditions of
recombinant proteins in different cell types have been devel-
oped [13]. Automation has also positively impacted the abil-
ity to purify expressed proteins, and more powerful X-ray
beams and higher field nuclear magnetic resonance spectro-
meters along with the development of better software and
faster hardware have increased the rate at which protein
structures can be solved [14,15]. Knowledge of a drug target’s
structure can be used to determine if it possesses homology to
any other class of protein. This homology mapping can aid in
either the selection or the design of an appropriate drug to
inhibit the protein’s activity.
Application of proteomics to discovery of anti-
interstitial cystitis drug target
Although few drug targets have been identified in the
academic proteomics world, there have been
successes. In our own laboratory, we have been working over
the past few years on interstitial cystitis (IC), a chronic and
painful bladder disorder that is characterized by thinning of
the bladder epithelial lining. Our initial interest in IC was the
discovery of a diagnostic biomarker as it had been shown that
urine from these patients contained a factor, named anti-
proliferative factor (APF), that inhibited bladder epithelial
cell growth in vitro. By using a series of separation methods
and testing each fraction for growth inhibition, we were able
to isolate an active molecule that was identified using tandem
MS as a sialoglycopeptide made up of a three-sugar
moiety bound to a nine-residue hydrophobic peptide, as
shown in Fig. 4 [16]. On the basis of the structure of APF, we
hypothesized that it exerted its effects on the bladder epithe-
lial lining through binding to a membrane receptor. To find
this receptor, a biotinylated form of APF was synthesized and
coupled to an avidin column. A membrane fraction
prepared from explanted bladder epithelial cells from IC
patients was solubilized and passed over the column. The
column was equilibrated and bound material was eluted
from the column using solutions containing increasing salt
concentrations. Each of these fractions was then analyzed by
SDS-PAGE. Two faint bands were detected on a silver-stained
gel of the eluate collected at the highest salt concentration.
These two bands were identified as CKAP4, a single-pass
membrane receptor, and vimentin [17]. Reducing CKAP4
expression in bladder epithelial cells by siRNA diminished
the growth inhibitory effects of APF on these cells. Incubation
of epithelial cells with an anti-CKAP4 antibody also
prevented the growth inhibitory effects of APF. These results
suggest that CKAP4 may be a possible druggable target to
treat patients suffering the adverse effects of IC.
Although this project demonstrates the use of proteomics
technology for finding a possible drug target, careful analysis
shows that many more technologies beyond MS were critical
in the discovery. For instance, a significant amount of chro-
matography was used to simplify the final mixture enabling
APF to be recognized, and a cell-based assay was critical for
screening for the desired activity. In the search for the APF
receptor, subcellular fractionation to obtain a membrane
preparation was instrumental
in the identification of CKAP4 as a receptor for APF and a
potential druggable target. Finally, functional studies to
block CKAP4 activity in the presence of APF are critical to
proving a link between APF and CKAP4. Although MS will
continue to play a key role, the inclusion of other technolo-
gical assays will bolster the chances of finding clinically valid
protein drug targets in the future.
Conclusion
The scientific community is able to survey proteins like never
before. The two most pressing needs for this type of technology
are to find more effective biomarkers for disease detection
and to discover proteins to which therapeutic drugs can be
targeted. One sentiment that is often expressed in the MS
community is that if we had more sensitive instruments, we
could do better at identifying biomarkers or drug targets.
Frankly, I disagree with this thinking. We are not only able
to identify orders of magnitude more proteins
than just ten years ago, but can also do so in a fraction of the time.
Unfortunately, this capability has resulted in too many stu-
dies that rely too heavily on MS for the discovery of drug
targets. One hurdle that must be overcome is to find ways to
complement high-throughput MS data with other types of
studies that cull the number of possible targets found in a
global screening into those targets that are most likely to pass
future clinical trials.
Acknowledgements
This project has been funded in whole or in part with federal
funds from the National Cancer Institute, National Institutes
of Health, under Contract NO1-CO-12400. The content of
this publication does not necessarily reflect the views or
policies of the Department of Health and Human Services,
nor does mention of trade names, commercial products, or
organizations imply endorsement by the United States
Government.
References
1 Hofstadler, S.A. and Sannes-Lowery, K.A. (2006) Application of ESI-MS in
drug discovery: interrogation of noncovalent complexes. Nat. Rev. Drug
Discov. 5, 585–595
2 Conrads, T.P. et al. (2006) Sampling and analytical strategies for biomarker
discovery using mass spectrometry. Biotechniques 40, 799–805
3 Pietrogrande, M.C. et al. (2006) Decoding 2D-PAGE complex maps:
relevance to proteomics. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci.
833, 51–62
4 Domon, B. and Aebersold, R.A. (2006) Mass spectrometry and protein
analysis. Science 312, 212–217
5 Liu, H. et al. (2002) Multidimensional separations for protein/peptide
analysis in the post-genomic era. Biotechniques 32, 898–902
6 Cohen, P. (2002) Protein kinases – the major drug targets of the 21st
century? Nat. Rev. Drug Discov. 1, 309–315
7 Glenn, J.S. (2006) Prenylation of HDAg and antiviral drug development.
Curr. Top. Microbiol. Immunol. 307, 133–149
8 Abbosh, P.H. et al. (2006) Dominant-negative histone H3 lysine 27 mutant
derepresses silenced tumor suppressor genes and reverses the drug
resistant phenotype in cancer cells. Cancer Res. 66, 5582–5591
9 Farzan, M. et al. (1999) Tyrosine sulfation of the amino terminus of CCR5
facilitates HIV-1 entry. Cell 96, 667–676
10 Aggarwal, K. et al. (2006) Shotgun proteomics using the iTRAQ isobaric
tags. Brief. Funct. Genomic. Proteomic. 5, 112–120
11 Oh, P. et al. (2004) Subtractive proteomic mapping of the endothelial surface
in lung and solid tumour for tissue-specific therapy. Nature 429, 629–635
12 Conrads, K.A. et al. (2005) A combined proteome and microarray
investigation of inorganic phosphate-induced pre-osteoblast cells. Mol.
Cell. Proteomics 4, 1284–1296
13 Vinarov, D.A. and Markley, J.L. (2005) High-throughput automated
platform for nuclear magnetic resonance-based structure proteomics.
Expert Rev. Proteomics 2, 49–55
14 Scapin, G. (2006) Structural biology and drug discovery. Curr. Pharm. Des.
12, 2087–2097
15 Tugarinov, V. et al. (2004) Nuclear magnetic resonance spectroscopy of
high-molecular-weight proteins. Annu. Rev. Biochem. 73, 107–146
16 Keay, S. et al. (2004) An antiproliferative factor from interstitial cystitis
patients is a frizzled 8 protein-related sialoglycopeptide. Proc. Natl. Acad.
Sci. U S A 101, 11803–11808
17 Conrads, T.P. et al. CKAP4/p63 is a receptor for the frizzled-8 protein-
related antiproliferative factor from interstitial cystitis patients. J. Biol.
Chem. (in press)