Corresponding Author: Stefanie S. Jeffrey

Molecular Profiling of Breast Cancer

Vincent A. Funari, Ph.D.1 and Stefanie S. Jeffrey, M.D.2

1Department of Surgery, Stanford University School of Medicine, Medical School Lab-Surge Bldg. Room P229, 1201 Welch Road, Stanford, California 94305-5494, USA; phone (650) 724-3519, fax (650) 724-3229

2Department of Surgery, Stanford University School of Medicine, Medical School Lab-Surge Bldg. Room P214, 1201 Welch Road, Stanford, California 94305-5494, USA; phone (650) 723-0799, fax (650) 724-3229

Email addresses for authors: Vincent A. Funari - [email protected]; Stefanie S. Jeffrey - [email protected]

Corresponding author: Stefanie S. Jeffrey (email: [email protected])

The human genome project and the development of high throughput technology over the

last 5-10 years have thrust biology and medicine into a new era. Characterizing, diagnosing, and

treating breast cancer using new molecular profiling techniques is a powerful patient-specific

approach to treating and even preventing breast cancer. The technology is advancing rapidly and

changes in the field occur often. This chapter will focus on the promises, progress and problems

of molecular profiling in breast cancer.

PROBLEMS WITH CURRENT METHODS

At the present time, only a limited set of tumor parameters is used to estimate prognosis

for a patient with breast cancer. In general, these include tumor type (ductal, lobular, medullary,

mucinous, etc.), size of the invasive component, grade of the invasive component, the expression of

hormone receptors including Estrogen Receptor (ER) and Progesterone Receptor (PR), the

expression of the growth factor receptor HER2/neu (ERBB2), the presence and number of lymph

node metastases, and any evidence of distant disease. In many areas of the U.S., measures of

tumor proliferation, such as S-phase analysis or Ki67 expression are also determined. From these

data, risk of distant relapse is assessed and recommendations for systemic therapy are given.

Generally, almost all patients with lymph node metastases and the great majority of patients with

lymph node negative invasive tumors greater than one centimeter will be candidates for systemic

therapy.1 As a result, many women with stage I and II breast cancer, who may be cured by

surgery and/or radiation alone, are overtreated with systemic therapy. Other women are treated

with systemic therapies that are ineffective against their specific tumor type. Further, many

chemotherapeutic agents are non-specific, killing rapidly dividing cells in other organs (eg, bone

marrow or the GI tract). In general, currently used tumor parameters do not provide sufficient

tumor-specific predictions for survival, need for systemic therapy, and drug response.

Although individual gene measurements (such as ER, PR, HER2/neu) have provided

insightful information, it is now possible to measure global genetic changes using new

technologies that provide a unique molecular profile or a fingerprint of the tumor. These multiple

gene measurements represent a more comprehensive tumor signature that should provide more

precise insights into a tumor’s clinical behavior, response to systemic therapy, or offer possible

targets for the development of novel tumor-specific therapeutics.

CONSTRUCTING A MOLECULAR PROFILE

For precise characterization, breast tumors must be analyzed at all molecular levels:

DNA, RNA, and protein. The goal is to identify tumor-specific features that molecularly subtype

a tumor and then to correlate clinical outcome with molecular features. This would enable a

patient and her physician to make specific decisions as to whether systemic therapy is indicated,

and if it is, to use targeted therapies for treatment specifically aimed to kill or immobilize the

molecular type-specific tumor cells.

There are four major steps in achieving accurate molecular profiling data. The first step is

to obtain samples (from cell cultures, human tissues, blood, or body fluids) and purify the

molecules of interest (DNA, RNA, protein). In the second step the DNA, RNA, or protein from

the sample is measured. This usually involves constructing or purchasing a high throughput

assay device, such as a microarray or protein chip, that can measure the presence or absence of

hundreds to tens of thousands of genes, expressed genes, or proteins in a single sample. The third

step involves data analysis using bioinformatics tools. This entails information storage and

application of data processing algorithms to analyze and visualize the complex data. Finally,

conclusions must be reached, validated, and translated into clinical applications.

DNA MOLECULAR PROFILES

Changes in chromosomal DNA occur in breast cancer. Identifying specific sites of DNA

copy number change may identify candidate oncogenes or tumor suppressor genes. In contrast to

methods such as loss of heterozygosity (LOH) and sequencing that traditionally have measured

genetic modifications in specific genes or loci, comparative genomic hybridization (CGH)2 and

array based CGH3-6 map chromosomal or gene copy number changes on a global genomic scale.

In CGH, tumor DNA and control DNA (isolated from peripheral blood lymphocytes from a

healthy donor) are differentially labeled with fluorescent dyes and cohybridized onto normal

metaphase chromosomes, also obtained from peripheral blood lymphocytes stimulated in vitro.

The image is digitized and bioinformatic tools calculate the fluorescence ratio of tumor to normal

genomic DNA. The ratio of fluorescence along the chromosome identifies regions of

amplifications (gains) and deletions (losses) in the tumor DNA.
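The ratio calculation described above can be illustrated with a minimal sketch. It assumes paired tumor and normal fluorescence intensities measured at successive positions along a chromosome; the intensity values and the log2 thresholds of +0.25 and -0.25 are hypothetical choices made for illustration, not values taken from any CGH study or software package.

```python
import math

def call_copy_number(tumor, normal, gain=0.25, loss=-0.25):
    """Label each position 'gain', 'loss', or 'neutral' from paired intensities.

    The gain/loss thresholds are illustrative assumptions, not published cutoffs.
    """
    calls = []
    for t, n in zip(tumor, normal):
        log_ratio = math.log2(t / n)  # 0 means equal tumor and normal signal
        if log_ratio >= gain:
            calls.append("gain")
        elif log_ratio <= loss:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls

# Hypothetical intensities at four positions along one chromosome.
tumor_signal = [100.0, 210.0, 95.0, 48.0]
normal_signal = [100.0, 100.0, 100.0, 100.0]
cgh_calls = call_copy_number(tumor_signal, normal_signal)
print(cgh_calls)
```

Working on a log2 scale makes a doubling and a halving of signal symmetric around zero, which is why ratios are commonly log-transformed before thresholds are applied.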

Chromosomal imbalances in breast cancer

CGH has identified multiple regions of chromosomal gains and losses in breast cancer. In

primary breast cancers, chromosomal gains have been most frequently identified as whole arm

gains in 1q and 8q and regional copy number increases at 17q and 20q.7 These data are, for

example, consistent with known breast cancer oncogenes on chromosomes 8q (MYC) and 17q

(HER2/neu [ERBB2]). In DCIS, chromosomal gains are observed in 1q, 8q, and 17q, whereas

losses are most common in 8p, 11q, 13q, 14q, and 16q.8 In invasive breast cancer, gains of 1q,

6p, 8q, 11q, 16p, 17q, and 20q are most common. Chromosomal losses have been identified in

1p, 8p, 11q, 16q, 18q, and 22.9 Using CGH, Forozan and colleagues10 compared 38 established

tumor cell lines to a meta analysis of CGH results from 698 primary tumors. In addition to the

chromosomal gains and losses mentioned for invasive tumors above, gains at 3q, 5p, 7p, 7q, 20p

and losses at 4p, 18p, Xp, Xq were also found.

CGH may also be used to study tumor biology. Jain and colleagues11 studied the

statistical relationship between CGH loci ratios and survival. Alterations in two loci, a gain at

8q24 and loss at 9q13, were associated with poor survival and were also associated with

mutations in TP53, the tumor suppressor gene that codes for p53 protein. To study tamoxifen

resistance, CGH has been used to compare a tamoxifen sensitive breast cancer cell line (MCF-7)

and a tamoxifen resistant clone (CL-9).12 CGH findings revealed differential gains on

chromosomes 2p, 2q, 3p, 12q, 13q, 17q, 20q, 21q and differential losses on chromosomes 6p, 7q,

11p, 13q, 17p, 18q, 19p, 22q. Neither ER-alpha on 6q25.1 nor ER-beta on 14q were involved in

the differences. The authors suggest that this technique may be useful for identifying candidate

genes involved in tamoxifen resistance.

Characterizing cancer cell progression

Beginning with usual ductal hyperplasia, there is evidence of accumulation of

chromosomal aberrations that lead to invasive breast cancer.13-15 Progression from hyperplasia to

atypical hyperplasia to DCIS and finally to invasive breast cancer is thought to occur in a multi-

step fashion.16 Consistent with this linear progression theory is that higher grade DCIS lesions

demonstrate increased chromosomal aberrations with loss of differentiation.17

However, others have argued against this linear continuum, and instead suggest

alternative differentiation pathways of progenitor cells in the glandular tissue.18, 19 In support of

non-linear, independent pathways of genetic evolution in breast cancer, Buerger8 used CGH to

study DCIS samples including all differentiation grades and some with associated invasive breast

cancer. All cases showed chromosomal imbalances, identifying DCIS as a genetically advanced

lesion, with identical genetic lesions between the DCIS and invasive components in 83% of the

cases. The most frequent chromosomal changes in well-differentiated DCIS were losses at 16q

and gains at 1q. In contrast, high grade DCIS demonstrated losses at 8p, 11q, 13q, 14q and gains

at 1q, 8q, 17q. Moreover, in 30% of DCIS cases with an invasive component, a gain of 11q13

was identified which was not present in pure DCIS. CGH was then performed on a larger

population of intermediate and high grade invasive cancer.20 Chromosomal gains of 1q and 8q

were seen in all invasive tumor grades. The loss of 16q, seen in well-differentiated DCIS, was

not observed in the majority of poorly differentiated invasive cancers whereas more than half of

intermediate grade DCIS showed this loss, suggesting that a subset evolved from well-

differentiated DCIS and another subset evolved from poorly differentiated DCIS. Other

chromosomal alterations, including gains at 8q, 17q and 20q and losses of 13q were found to be

associated with poorly differentiated invasive carcinoma. Overall, these data suggest that invasive

carcinoma recapitulates the genetic differentiation pattern of its precursor DCIS (low grade DCIS

progresses to low grade invasive cancer and high grade DCIS progresses to high grade invasive

cancer). Intermediate grade carcinoma may represent a mixture of DCIS subtypes evolving along

different genetic pathways.

Array-based CGH analysis

While CGH provides a genome-wide view of chromosomal changes, its resolution is

limited to measuring chromosomal imbalances of 10-20 megabases or more. Assuming about 10

genes per megabase, the resolution of conventional CGH spans about a 100-200 gene range.

Array-based CGH is a high resolution alternative that can measure DNA copy number changes at

the kilobase or gene level. For array CGH, tumor and normal genomic DNA are labeled with two

different fluorescent dyes. The differentially labeled DNA is cohybridized to a microarray which

is a glass slide containing thousands of DNA elements. These elements can include either

cDNAs (individual genes) or larger chromosomal segments that contain one or more genes with

known chromosomal location, such as bacterial artificial chromosomes. The fluorescence ratio of

tumor to normal DNA at each gene represents the copy number ratio between the two samples.

Since gene expression studies may also be performed on similarly configured microarrays (see

below), it is possible to directly correlate DNA copy number change and gene expression.5

Array-based CGH has been used to investigate previously recognized areas of

amplification, such as chromosome 20q13, which had been characterized extensively by other

techniques. The increased resolution of array CGH was able to identify two potential oncogenes,

CYP24 and ZNF217, the former not previously associated with breast cancer.21 Pollack and

colleagues22 used array CGH to study gene copy number changes and their correlation to gene

expression. Interrogating 6,691 mapped human genes in locally advanced primary breast tumors

and ten breast cancer cell lines, they found DNA chromosomal alterations in all samples, with

aberrations found in every chromosome. Gains were identified within 1q, 8q, 17q and 20q in a

large proportion of tumors and cell lines; losses were observed within 1p, 3p, 8p, and 13q. A

strong relationship between DNA copy number and gene expression was found, well exemplified

by chromosome 17. Although gene amplification does not always yield an increase in gene

expression, for highly amplified DNA regions, 42% were associated with high gene expression

and 62% were associated with moderately high gene expression. This suggests that a tumor’s

molecular phenotype is in large part impacted by underlying variation in DNA copy number. The

authors estimate that overall 7-12% of variation in gene expression in breast tumors is due to

variation in gene copy number. A study by Kallioniemi and colleagues23 had similar findings.

Comparing DNA copy number and mRNA expression levels of 13,824 genes in 14 breast cancer

cell lines, they showed that 44% of highly amplified genes were overexpressed and 10.5% of the

genes with high-level expression were amplified.
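The two percentages quoted above are conditional proportions taken in opposite directions, which is why they can differ so widely. A minimal sketch follows, using hypothetical gene identifiers and set sizes chosen only to show the asymmetry; they are not data from the cited studies.

```python
def conditional_fractions(amplified, overexpressed):
    """Return (fraction of amplified genes that are overexpressed,
    fraction of overexpressed genes that are amplified)."""
    both = amplified & overexpressed
    return len(both) / len(amplified), len(both) / len(overexpressed)

# Hypothetical gene sets: few genes are amplified, many are overexpressed.
amplified_genes = {"G1", "G2", "G3", "G4"}
overexpressed_genes = {"G3", "G4", "G5", "G6", "G7", "G8", "G9", "G10"}

frac_amp_over, frac_over_amp = conditional_fractions(amplified_genes, overexpressed_genes)
print(frac_amp_over, frac_over_amp)
```

Because far more genes are overexpressed than amplified, amplification can be a fairly strong predictor of overexpression while still accounting for only a small share of all overexpressed genes.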

RNA MOLECULAR PROFILES

Since only a fraction of genes in a cell are expressed at any given time, the set of

expressed genes (the gene expression profile) provides a snapshot reflecting that cell’s

physiology and response to environmental influences. Differences in gene expression profiles

can be used to define different molecular phenotypes of breast cancer, to predict the need for and

responsiveness to systemic therapies, and to identify novel targets for tumor-specific therapies.

There are a number of reasons why RNA expression profiling has dominated the

molecular profiling arena: (1) RNA is the product of an expressed gene and usually contains

more functional significance than DNA, (2) protein assays are still in their infancy and

sensitivity and precision require further optimization and validation, (3) classical RNA

technologies were easily adapted to high throughput systems, and (4) conserved RNA properties

facilitate amplification and measurement of minute amounts.

Before the genome project began, scientific methodology was candidate gene dependent;

discovering and identifying one gene at a time was knowledge driven. High throughput

technologies developed as part of the Human Genome Project changed this systematic

methodology. Using these technologies, global gene expression profiles for thousands of known

and unknown genes were determined in tissues before the genome was even sequenced.24, 25 The

initial gene discovery methods, including Expressed Sequence Tags (ESTs, explained below),

subtractive hybridization,26, 27 serial analysis of gene expression (SAGE),28 and differential

display (DD),29 were developed based on universal RNA properties and available laboratory

techniques without needing prior knowledge of an expressed gene’s function, sequence, or

chromosomal location. Using this technology, novel genes were identified at a more rapid pace

than functions could be assigned. Today, approximately half of the expressed sequences (ie,

genes) still have no assigned function, yet the abundance of gene sequence knowledge available

from these techniques has enabled scientific focus to change from gene discovery to gene

function. While these methods are powerful, they are technically difficult, require large-scale

robotic sequencing instruments, and only allow study of a few different biological samples at one

time. In contrast, DNA microarrays were developed in the mid-1990s and have been used to

measure RNA expression of thousands of genes from multiple samples at one time.30, 31 They

represent the quickest, easiest, and least expensive method to relate expressed genes to clinical

data.

ESTs

An EST is a sequence of nucleotides that represents a portion of an expressed gene. It is

obtained from automated sequencing of a cDNA library. A cDNA library is constructed by first

isolating mRNA from a tissue sample of interest. The mRNA is reverse transcribed into

complementary DNA (cDNA), which is then inserted into plasmids that are replicated in E. coli

colonies on a nutrient-enriched plate. The colonies are randomly picked and the amplified cDNA

is isolated and sequenced using an automated sequencer. A set of sequences from the same tissue

sample is called an EST library.

If every cDNA clone is picked and sequenced, the entire transcript population of the cell

(called a transcriptome) will be represented quantitatively and qualitatively in the EST library.

The ESTs are matched by sequence identity to a database of known genes to determine if the

expressed sequences have been previously identified. Thousands of unidentified genes have been

discovered using EST technology. ESTs were the first successful functional molecular profiling

project of the human genome era. They represented a paradigm shift in scientific methodology

because huge amounts of expression data were collected without having any prior information about

genes. EST technology yielded the publication of many transcriptomes, and as of mid-2003, 17.8

million ESTs were deposited in GenBank (with many times this number available in the private

sector). New functional genomic technologies such as microarrays depend on EST sequences.
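The matching and counting of ESTs described in this section can be sketched as follows. The gene names and sequences are hypothetical, and exact substring matching stands in for the sequence alignment tools that real EST annotation relies on; the sketch only illustrates how an EST library yields both gene identities and relative transcript counts.

```python
from collections import Counter

# Hypothetical database of known gene sequences (illustration only).
known_genes = {
    "GENE_A": "ATGGCCTTAGGCTAACGT",
    "GENE_B": "ATGCCGTTAACCGGTTAA",
}

def annotate_est(est, database):
    """Return the name of the first known gene containing the EST,
    or 'unidentified' if no known gene contains it."""
    for name, sequence in database.items():
        if est in sequence:  # exact match; a simplification of alignment
            return name
    return "unidentified"

# Hypothetical EST library sequenced from one tissue sample.
est_library = ["GCCTTAGG", "CCGTTAAC", "TTAGGCTA", "GGGGGGGG"]
est_counts = Counter(annotate_est(est, known_genes) for est in est_library)
print(est_counts)
```

ESTs that fall into the "unidentified" group correspond to the candidate novel genes discussed above, while the counts for matched genes give a rough measure of transcript abundance.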

DNA Microarrays

Microarrays produce a gene expression profile by simultaneously measuring gene

expression of hundreds to thousands of genes from a single sample. Known gene sequences are

attached to membrane-based or glass arrays. Although more expensive than membrane-based

arrays, glass arrays are smaller, easier to use, and allow a higher density of gene spots.

There are two types of glass arrays. One type is constructed using short 20-80 nucleotide

fragments (oligonucleotides) to represent each gene. The oligonucleotides are synthesized in situ

on the glass slide, using special lithographic32 or ink-jet printing33 technologies that were

developed by Affymetrix Corporation or Agilent Technologies. Oligonucleotides can also be

synthesized in batches prior to immobilization onto an array, which can reduce the cost. The

second type of array, cDNA microarrays,30 contains partial to full length cDNAs, 500-5,000

nucleotides in length that are “spotted” on histological slides using robotics and fine print tips

and then immobilized. The cDNAs consist of known and unknown genes, identified using EST

technology. Oligonucleotide array technology is more expensive but has, in general, been shown to

be more precise and sensitive.

Total RNA (approximately 50 µg) or mRNA (3 µg) is used to measure expressed

transcripts on cDNA microarrays. In general, total RNA or mRNA is isolated and reverse

transcribed with fluorescently-tagged nucleotides to label the cDNA. For samples that do not

contain sufficient amounts of RNA for microarray hybridization, RNA amplification techniques

can be employed.34

Each spot (or feature) on a microarray corresponds to a specific gene or EST. Labeled

cDNA from an experimental sample (eg, cDNA prepared from breast cancers and containing

unknown quantities of specific genes, such as HER2/neu) is hybridized to the microarray. Excess

or non-hybridized cDNA is washed off. Because of the specificity of base pairing at each feature,

the abundance of a gene in the sample is measured. It is difficult to measure an absolute gene

expression value on cDNA microarrays due to systematic differences in gene printing and

hybridization kinetics. Therefore, reference RNA is used to generate a relative abundance ratio

between the sample and a reference that allows gene-to-gene comparisons between different

samples. Sample and reference RNA are labeled with different fluorophores (usually Cy5, which

fluoresces red at 635 nm, and Cy3, which fluoresces green at 525 nm) and cohybridized to the

microarray (Figure 86-1). The hybridized fluorescence signals can be read with an optical

scanner. Using bioinformatics software, a fluorescence signal intensity ratio between the sample

and the reference is computed. Signal intensity ratios provide a relative measure of gene

abundance. Correlations can be made based on the gene expression similarity between

independent samples.35 Genes or samples that demonstrate similar expression patterns are called

clusters (Figure 86-2). Statistical analyses36-38 can be performed and related to pathological and

clinical data to define samples or reactions to treatments.
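A minimal sketch of the ratio and similarity calculations described above is given here. The intensities and sample names are hypothetical, and Pearson correlation of log2 ratios is used as one common similarity measure; it is an assumption of this sketch rather than the specific method of the cited analyses.

```python
import math
from itertools import combinations

def log_ratios(sample, reference):
    """Per-gene log2 ratio of sample signal to reference signal."""
    return [math.log2(s / r) for s, r in zip(sample, reference)]

def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical intensities for four genes; the same reference is used on every array.
reference = [200.0, 200.0, 200.0, 200.0]
tumors = {
    "tumor_A": [800.0, 400.0, 100.0, 50.0],
    "tumor_B": [400.0, 400.0, 100.0, 100.0],
    "tumor_C": [50.0, 100.0, 400.0, 800.0],
}

profiles = {name: log_ratios(signal, reference) for name, signal in tumors.items()}
similarity = {
    (a, b): pearson(profiles[a], profiles[b]) for a, b in combinations(profiles, 2)
}
closest_pair = max(similarity, key=similarity.get)  # the first pair a clustering would join
print(closest_pair, round(similarity[closest_pair], 3))
```

Hierarchical clustering repeats this step, merging the most similar samples (or genes) first and then treating each merged group as a single profile.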

Characterizing breast cancer subtypes

In 1999, human breast cancers were the first solid tumor to undergo global transcription

analysis using microarrays.39, 40 Before these studies, it was not known whether the genetic and

cellular diversity of solid tumors would preclude identifying gene expression patterns in breast

cancer. Despite the limited number of tumor samples consisting of different breast cancer types

and grades, the small number of genes assayed, and lack of usual breast cancer-associated genes

on the array (eg, HER2/neu and ER), Perou and colleagues39 identified multiple genes that were

similarly expressed and implicated in the molecular phenotype of solid tumors. In a follow-up

study, cDNA microarrays were used to molecularly subtype normal, benign and malignant breast

tumors.41 Variations in growth rate, activity of specific signaling pathways, and cellular

composition of the tumors were all reflected in gene expression profiles. This and follow-up

studies42, 43 identified genes that divided the tumors into distinct molecular subtypes: two ER-

overexpressing subtypes (denoted “Luminal A and B” due to presence of luminal epithelial

cytokeratin markers) and three ER-negative subtypes: “basal-like” tumors that expressed

cytokeratin markers characteristic of basal epithelial cells, “ERBB2 (HER2/neu)-

overexpressing” tumors, and “normal-like” tumors that showed relatively high expression of

genes characteristic of basal epithelial cells and adipocytes which clustered with normal breast

tissue samples. The expression of known luminal and basal cytokeratin epithelial cell markers

suggests that breast cancers may arise from at least two progenitor cell types through different

mechanisms. Other studies44-48 have since demonstrated that ER and ER co-regulated gene

expression (or lack thereof) provides a pervasive molecular signature marked by an abundant and

robust gene expression. c-myc is amplified in 15% of breast cancers and is highly expressed in

“basal-like” tumors, possibly regulating the expression of genes that play a role in the behavior

of these tumors.48 Overall, these data suggest that groups of genes better characterize and refine

tumor subtypes than single gene markers, like ER or HER2/neu.

Using 43,000 feature cDNA microarrays to profile histologically varied tumors from

more racially diverse patient populations, our lab has identified additional molecular subtypes of

breast cancer. We have also shown that invasive lobular carcinomas may be classified into

“typical” and “ductal-like” lobular tumors by their expression profiles.

Subtype profiling of hereditary breast cancers

Mutations in the breast cancer susceptibility genes, BRCA1 and BRCA2, influence DNA

repair and transcriptional regulation differently. Using microarrays, multiple genes were

identified that distinguished BRCA1 from BRCA2 subtypes.49 Interestingly, a patient without a

BRCA1 mutation whose tumor expressed a BRCA1 molecular phenotype, had DNA

hypermethylation of the BRCA1 promoter, silencing its expression. In another expression

profiling study,46 16 of 18 BRCA1 tumors from lymph node negative patients under age 55 were

characterized by downregulation of ER co-regulated genes and upregulation of lymphocytic

genes, including those primarily expressed by B and T cells. All the BRCA1 tumors from this

study were also demonstrated in a different study43 to have a “basal-like” gene-expression

phenotype, consistent with classical studies characterizing BRCA1 tumors as mostly high grade

ER, PR, and HER2/neu negative tumors that stain positive for basal cytokeratins and are often

associated with a lymphocytic infiltrate.50-52 Tumors from patients with BRCA2 mutations,

however, appeared to have a luminal estrogen receptor positive expression profile,43 consistent

with ER positive status and luminal keratin overexpression also found in another study.49

Prophylactic tamoxifen therapy significantly reduces the incidence of breast cancer in patients

with BRCA2 mutations and only modestly, if at all, in patients with BRCA1 mutations,53, 54

further supporting the hypothesis that these tumors arise from different epithelial origins, luminal

ER-expressing and basal ER-negative cell types. Global profiling studies are also being used to

evaluate familial non-BRCA1/2 breast cancers, with preliminary studies suggesting a partition

into at least two subtypes that do not share gene expression profiles with BRCA1 or BRCA2

tumors.55 In summary, molecular profiling data suggest that BRCA1 and BRCA2 hereditary

breast cancers originate from different progenitor cell populations, with independent malignant

mechanisms, different prognosis, and different response to prophylactic tamoxifen treatment.

Characterizing cancer cell progression

Specific changes in DCIS, atypical hyperplasias, usual hyperplasias, and normal lobules or

ducts can be measured by isolating these cell populations from neighboring cells by

microdissection. This can be done manually with a dissecting microscope13 or with newer

techniques that, under microscopic guidance, apply laser energy to excise the cells of interest

(laser microdissection, LMD) or melt a polymer onto the cells to be captured and extract only the

targeted cells from the surrounding tissue (laser capture microdissection, LCM).56

LCM has been used to extract pure populations of epithelial cells from normal lobules

from reduction mammoplasties or breasts with associated cancer, atypical ductal hyperplasia

(ADH), DCIS, and invasive ductal carcinoma (IDC)57 for microarray analysis. Expression

profiling demonstrated that normal epithelial cells distant from cancers had similar

transcriptional signatures to normal epithelial cells from reduction mammoplasties. Significant

expression changes were observed in ADH and persisted in DCIS and IDC dissected from the

same patient, showing patient-specific phenotypes and suggesting that ADH and DCIS are

precursors to IDC. The authors found that Grade I expression signatures generally differed from

Grade III signatures, but intermediate grade lesions shared either a hybrid signature or a distinct

low grade or high grade signature.

In sum, global RNA profiling studies at the invasive41, 43 and preinvasive57 stages suggest

that breast cancers originate from progenitor cells with specific molecular subtypes. This

corroborates earlier studies by Warnberg58 with traditional immunohistochemical (IHC)

techniques and by Buerger8, 59 who used CGH, fluorescence in situ hybridization (FISH), and

IHC analyses.

Molecular profiling in clinical use

Tailoring patient treatments using microarrays

Several groups have shown that molecular profiling can be performed on minimally

invasive breast biopsies taken prior to primary chemotherapy or from non-palpable lesions

identified by breast imaging. Fine needle aspiration (FNA) biopsies and core needle biopsies

have been used47, 60-63 to successfully isolate RNA for microarray studies. In a small pilot study

using core needle biopsies taken before and within the first 48 hours of different regimens of

neoadjuvant chemotherapy, Buchholz and colleagues60 showed that expression profiles of tumors

with and without a good pathological response clustered distinctly. Sotiriou and colleagues61

used FNA biopsies performed on ten patients before and during neoadjuvant chemotherapy to

monitor patient response to doxorubicin and cyclophosphamide. Candidate gene expression

profiles were identified that distinguished responders from nonresponders. Interestingly, the

responders also showed expression changes in ten times as many genes as the

nonresponders after the first cycle of chemotherapy. Chang and colleagues63 performed core

needle biopsies on 24 patients with locally advanced breast cancer. Using microarray analysis,

they were able to define a set of 92 differentially expressed genes that characterized docetaxel

sensitive tumors, defined as those that had 25% or less residual disease following treatment. This

gene set showed a positive predictive value of 92% and negative predictive value of 83% and is

currently being applied in a larger clinical trial.
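For readers less familiar with these measures, positive and negative predictive values are computed from a two-by-two table of predicted versus observed response. The counts in the sketch below are hypothetical, chosen only so that the results round to the percentages quoted above for a 24-patient series; they are not taken from the study itself.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (positive predictive value, negative predictive value).

    tp/fp: predicted sensitive tumors that were / were not actually sensitive.
    tn/fn: predicted resistant tumors that were / were not actually resistant.
    """
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

# Hypothetical counts for 24 patients (illustration only, not study data).
ppv, npv = predictive_values(tp=11, fp=1, tn=10, fn=2)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

Unlike sensitivity and specificity, both values depend on how common drug-sensitive tumors are in the population tested, which is one reason validation in a larger trial matters.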

Prognosis Profiling

Currently, most lymph node negative breast cancer patients with tumors over 1 cm and

all lymph node positive patients are candidates for adjuvant systemic treatment,64 yet only 2-15%

will benefit.65 Better diagnostic methods are necessary to successfully identify the patients that

require treatment, predict who will benefit from specific therapies, and discover targets to serve

as the basis for new therapies. Patients whose breast cancers are stratified by expression profiling

into five molecular subtypes (ER-positive “luminal A and B”; ER-negative “basal”, “ERBB2

over-expressing” and “normal” breast subtypes) or simply into luminal and basal phenotypes

demonstrate independent relapse-free survival curves.42, 43, 48 Of the five major subtypes, the

basal-like and ERBB2 subtypes reveal the poorest prognosis. Although luminal A and B

subtypes share gene expression similarities and both overexpress ER co-regulated genes, luminal

A tumors show the best prognosis of all the subtypes, even in patients with locally advanced

breast cancer, while luminal B tumors demonstrate poorer survival. Luminal B tumors express

groups of known and unknown genes that are also expressed in ERBB2+ and basal-like tumors,

and like these other subtypes, also exhibit TP53 mutations, possibly influencing the poorer

prognosis of these three subtypes in initial studies. Since long-term survival in locally advanced

breast cancer patients treated with 16 weeks of doxorubicin and tamoxifen was better for the

luminal A phenotype, these results suggest either that the tumors possessed a favorable biology

or that they were responsive to doxorubicin and/or tamoxifen treatment.

Breast cancer staging criteria are based on tumor size and the presence of lymph node

metastases. Recent data, however, suggest that current staging criteria may need reevaluation. In

expression profiling studies, nodal status and tumor size appear to have less impact on gene

expression and survival than tumor biology. Hormone receptor status and grade, however, appear

to strongly impact gene expression.46, 48, 66 Metastatic potential may be pre-programmed in the

biology of the tumor.46, 67, 68

Using a 70 gene expression profile, van’t Veer and colleagues46 were able to successfully

predict outcome in 81% of women aged less than 55 years with lymph node negative Stage I and

II breast cancers, most of whom did not receive systemic therapy: 91% of the good prognosis

group and only 27% of the poor prognosis group were disease-free at five years. In a follow-up

study by van de Vijver and colleagues,66 the 70 gene profile was retrospectively tested on tumors

from patients less than 53 years of age, but this time with lymph node negative and positive

Stage I and II disease, many of whom received treatment. Lymph node positive patients were

evenly divided between good- and poor-prognosis signatures, suggesting that lymph node

metastasis may be an independent event distinct from systemic metastasis. After ten years, 85%

in the good prognosis set remained free of distant metastases compared to 51% in the poor

prognosis group, offering improvement over St. Gallen69 and NIH70 criteria. A clinical trial is

now underway in Europe to prospectively compare this 70 gene profile to standard classification

criteria as the basis for treatment decisions.
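As a rough illustration of how a fixed gene signature can sort tumors into prognosis groups, the sketch below correlates a tumor's expression values for the signature genes with the average profile of tumors known to have done well, then applies a cutoff. The function names, the five-gene toy vectors, and the 0.4 cutoff are all assumptions made for illustration; the published 70 gene classifier and its threshold are not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def classify(tumor_profile, good_template, cutoff=0.4):
    """Label a tumor by its correlation with a good-outcome template.

    The cutoff is a hypothetical value, not a published threshold.
    """
    r = pearson(tumor_profile, good_template)
    label = "good prognosis" if r > cutoff else "poor prognosis"
    return label, r


# Toy log-ratio values for five signature genes (hypothetical data).
good_template = [1.2, -0.8, 0.5, -1.1, 0.9]
tumor_a = [1.0, -0.6, 0.7, -0.9, 1.1]    # resembles the template
tumor_b = [-0.9, 0.7, -0.2, 1.3, -0.8]   # roughly the opposite pattern

label_a, r_a = classify(tumor_a, good_template)
label_b, r_b = classify(tumor_b, good_template)
print(label_a, round(r_a, 2))
print(label_b, round(r_b, 2))
```

In practice the cutoff would be chosen on a training set and checked on independent tumors, which is the role the follow-up and prospective studies described above play.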

Although ER status of the tumors was not an independent prognostic factor in the van de

Vijver study, it has been shown to be the most important clinico-pathological discriminator of

expression subtype by Sotiriou and colleagues,48 who also showed that lymph node status has a

minimal influence on expression profiling. Using overlapping gene expression data from the

van’t Veer study, Sorlie and colleagues43 showed that basal-like tumors were a prominent subtype with rapid

development of metastases within five years. It is possible that the relatively homogeneous

expression pattern shared by basal-like tumors strongly influenced the 70 gene poor-prognosis

signature.

Validation

The advantage of high throughput global gene expression profiling is the ability to measure

thousands of genes simultaneously; precision at the individual gene level, however,

can sometimes be sacrificed to perform global assessments. Therefore, validation must be

performed to confirm gene expression and, if desired, to identify the cell type expressing the

gene.

A high throughput validation technology is the tissue microarray (TMA). This is a

paraffin block made up of hundreds of cores from paraffin-embedded tissues from different

patients.71 When the TMA is sectioned, placed on a slide and combined with traditional

validation technologies like IHC and RNA in situ hybridization, gene expression can be

validated over hundreds of samples at one time. Another validation tool is real-time quantitative

polymerase chain reaction (qPCR) (also called TaqMan PCR). In this technique, RNA from a

tissue sample is purified and amplified under optimized gene-specific conditions. Fluorescent

molecules are released with each amplification cycle, and the amount of fluorescence detected

depends on the abundance of RNA in the sample.72 Using this technology with plates

containing multiple sample wells, hundreds of genes can be rapidly measured with high precision

and sensitivity.
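Because the fluorescence signal roughly doubles with each cycle during exponential amplification, the cycle at which it crosses a fixed threshold (Ct) is lower for more abundant transcripts. One widely used way to turn Ct values into relative expression is the comparative Ct calculation (2 to the power of minus delta-delta Ct). It is not described in this chapter and assumes near-perfect amplification efficiency. The sketch below uses hypothetical Ct values and a hypothetical reference (housekeeping) gene.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample versus a control tissue.

    Each Ct is first normalized to a reference gene measured in the same
    specimen; the difference of those normalized values is delta-delta Ct.
    Assumes product doubles every cycle (100% efficiency).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_sample - delta_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier in
# the tumor than in normal tissue, with identical reference-gene Ct values.
fold = relative_expression(ct_target_sample=22.0, ct_ref_sample=18.0,
                           ct_target_control=25.0, ct_ref_control=18.0)
print(fold)  # 8.0, i.e. about eightfold higher in the tumor
```

A difference of three cycles corresponds to two doublings cubed, which is why small Ct shifts translate into large fold changes.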

Proteomics

Proteomics is the study of expressed proteins from a genome. The proteome is potentially the

most important molecular profile because proteins are the actuators of the genome and a cell’s proteins

should determine its phenotype at a given moment. Like DNA and RNA, the proteome can

be studied in a quantitative (abundance) and qualitative (presence or absence) manner. Unlike DNA and

RNA, protein function is also influenced by other factors that shape protein activity, such as protein-

protein interactions, subcellular location, conformational changes, half-life changes, and post-

translational modifications. To resolve these changes, proteomic techniques include separation and

identification techniques.73 Due to the increased biochemical and structural diversity of proteins relative

to DNA and RNA, these two tasks are difficult. Current techniques are still in development and have not

been able to construct a genome-wide proteome to describe breast cancer phenotypes. However,

proteomic patterns in serum and ductal fluid from breast cancer patients already show promise for clinical use in

early diagnosis of breast cancer.

Proteomic Techniques—Protein separation and identification

Two-dimensional gel electrophoresis (2-DE) sequentially separates proteins by their charge and

mass. The separation on a single gel can show thousands of proteins, including proteins that may

undergo post-translational modification (such as by phosphorylation, glycosylation, lipid attachment, or

peptide cleavage) and be represented by multiple spots on a gel. 2-DE can be utilized to identify protein

patterns or to separate proteins prior to identification by mass spectrometry.

Mass spectrometry (MS) is a sensitive and precise approach to identify proteins that are first

separated, digested into peptides, and then ionized. Protein separation can be accomplished with 2-DE or

other methods such as high performance liquid chromatography (HPLC), 2-D liquid chromatography

(2D-LC or LC/LC), capillary electrophoresis, or biochip chromatography. Proteins are then

individually ionized into a protonated gas phase using one of several techniques. Electrospray ionization (ESI)

creates a fine spray of charged droplets from a liquid sample that evaporates, producing gaseous ionized

molecules. For samples in a solid state, matrix-assisted laser desorption/ionization (MALDI) is a

technique that mixes proteins digested by sequence-specific proteases with a light-absorbing organic

acid matrix that catapults the peptides into an ionized form when irradiated by an ultraviolet laser.

Surface-enhanced laser desorption/ionization (SELDI) uses resin biochips with different

chromatographic properties on their surface to fractionate and isolate proteins through affinity capture.

After washing, retained proteins are mixed with energy absorbing molecules and ionized by laser

pulsation. A newer modification places the energy absorbing molecules directly on the chip. After

ionization by any of these methods, protein fragments are propelled and accelerated by magnetic or

electrostatic forces through a time of flight (TOF) mass spectrometer, which separates them by their

specific mass to charge (m/z) ratio, forming a peptide mass fingerprint. For MALDI, protein

identification is typically accomplished by searching large protein databases and comparing the masses

of collections of peptides (peptide mass fingerprint) to those predicted from digestion of protein

sequences. For LC-ESI analysis, tandem mass spectrometry (MS/MS), in which individual peptides are

fragmented in the mass spectrometer, is utilized to determine the identity of proteins by their amino acid

sequences. For SELDI, in which proteins are analyzed in intact form, there is as yet no

straightforward method to identify proteins from mass spectra.
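The database search behind a peptide mass fingerprint can be sketched in a few lines: digest each candidate sequence in silico, predict the singly protonated peptide masses, and count how many observed masses fall within a tolerance of a predicted one. The example below is a minimal illustration under simplifying assumptions (trypsin as the protease, no missed cleavages, no modifications, a plain match count instead of a statistical score). The two protein sequences and all function names are hypothetical.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # singly protonated ion, [M+H]+


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Predicted monoisotopic m/z of the singly charged peptide ion."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def fingerprint_score(observed_masses, sequence, tolerance=0.2):
    """Count observed masses that match any predicted tryptic peptide."""
    predicted = [mh_plus(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(m - p) <= tolerance for p in predicted)
        for m in observed_masses
    )


# Hypothetical two-entry "database" and a simulated spectrum.
database = {"protein_X": "MKTAYIAKQR", "protein_Y": "GASPVKLLNDR"}
observed = [mh_plus("TAYIAK"), mh_plus("QR")]

scores = {name: fingerprint_score(observed, seq)
          for name, seq in database.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Real search tools also allow for missed cleavages and modifications and report a probability-based score; the match count here only conveys the idea.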

Characterizing breast tissue proteomes and identifying biomarkers and targets

2-DE has been used to differentiate protein patterns in normal breast tissue, benign breast tissue,

and breast cancer.74 A 2-DE technique called difference gel electrophoresis (DIGE), which compares

samples from multiple sources differentially labeled with fluorescent dyes by using post-run fluorescent

imaging, has been used to differentiate lysates of breast cancer cell lines to identify proteins associated

with ERBB2 overexpression.75 Bergman and colleagues76 used 2-DE combined with ESI-MS and

MALDI-MS to identify polypeptides differentially expressed in solid tumor cell extracts made from

scrapings of benign and malignant breast tumors. Some of the overexpressed proteins in breast cancer

included nuclear matrix proteins, cytoskeletal and redox proteins, while the known oncogene product

DJ-1 was identified in breast fibroadenoma, not in malignant tissue. Truncated forms of overexpressed

proteins were also identified, suggesting proteolytic processing in both benign and malignant tissue.

Cell type heterogeneity in breast tissue adds complexity to the characterization of protein

populations. Page and colleagues77 grew primary epithelial cell cultures derived from reduction

mammoplasties and used cell sorting techniques to separate luminal and myoepithelial cells. Protein

differences were studied with 2-DE, MALDI and MS/MS technology; a fraction of the differentially

expressed proteins were annotated. Many of these corresponded to known cytokeratin markers that

distinguish the two cell types. Luminal and myoepithelial cell types also demonstrated significant global

homology in their protein profiles, which the authors believed was consistent with derivation from a

common stem cell. Several groups have purified epithelial cells from breast cancers and normal tissue

using LCM and then performed comparative proteomic analyses.78, 79 Wulfkuhle and colleagues80

isolated DCIS and normal ductal epithelium by LCM and identified proteins in DCIS involved in

intracellular trafficking of lipids, vesicles, and membranes. They also found changes in proteins

involved in cell motility and genomic instability, suggesting that DCIS is an already advanced

preinvasive lesion.

In the future, sets of cancer-associated biomarkers identified in nipple aspirate fluid and serum

may prove useful as clinical diagnostic tools. Varnum and colleagues81 collected nipple aspirate fluid

(NAF) in healthy women and identified 64 proteins, showing that NAF is a highly concentrated source

of biomarkers. Paweletz and colleagues82 used SELDI-TOF to analyze NAF, and found protein profiles

that appeared to distinguish women with breast cancer from healthy controls. Reasoning that the breast

is a paired organ, Kuerer and colleagues83 found much higher spot variation comparing protein profiles

by 2-DE of paired NAF samples between matched malignant and normal breasts in women with

unilateral breast cancer. Applying SELDI-TOF technology to NAF, Sauter and colleagues84 identified

five proteins differentially expressed in women with and without breast cancer that are now being tested

in a prospective clinical trial.

At present, investigators are searching for accurate blood tests to diagnose breast cancer. They

are hoping that serum protein profiles may eventually be applied to clinical practice. Using SELDI

technology, Li and colleagues85 identified three biomarkers in breast cancer serum. Together these

markers can differentiate over 90% of serum samples obtained from women with and without breast

cancer. This test did not, however, discriminate serum samples on the basis of tumor size or lymph node

metastases. Following up on studies suggesting distinct serum markers in women with ovarian cancer,86

Petricoin, Liotta and colleagues are using serum protein profiles to develop a blood test to screen for

early breast cancer.

New techniques to more accurately characterize subpopulations of the proteome

Since current technologies are not able to measure the entire proteome, scientists have also

focused on developing proteomic technologies to analyze protein subpopulations. These techniques

promise a more detailed and complete view of interesting proteins (membrane proteins or biomarkers) or

protein characteristics (protein activity).

A protein microarray measuring comparative fluorescence can be constructed based on protein

(eg, antibody) and ligand interactions, analogous to a high throughput enzyme-linked immunosorbent

assay (ELISA).87, 88 Used as an antibody array to detect antigens or an autoantigen array to detect

antibodies,89 this high density array can separate and identify proteins related to breast cancer in

complex solutions such as serum90 or NAF81 in a fast, efficient, and cost-effective manner.

Adam and colleagues91 combined membrane isolation techniques, gel electrophoresis and mass

spectrometry to gain insight into the enriched membrane protein fractions of breast cancer cell lines,

which traditionally have been poorly defined by current global proteomic techniques because of their

hydrophobic properties. In addition to many membrane proteins with known significance in breast

cancer, such as MUC1 and the HER2/neu and EGF receptors, three novel genes were identified:

BCMP11, BCMP84, and BCMP101. Protein and mRNA expression of BCMP101 was low in normal tissues,

in contrast to high levels in many breast cancers, confirming BCMP101 as a potential breast cancer

marker.

Le Naour and colleagues92 used a novel proteomics approach to identify secreted breast cancer

proteins in serum using antibodies from patients’ serum. Antibodies in breast cancer serum identified a

reactive protein in lysates of human breast cancer tissues and cell lines spotted on a 2-D gel. MALDI-

TOF was then used to identify the protein as RS/DJ-1, which was detected at high levels in the sera of

37% of patients diagnosed with breast cancer. The combined use of autoantibodies and proteomics to

discover and identify secreted proteins in cancer remains a promising methodology.

In contrast to other proteomic techniques that measure protein abundance, Jessani and

colleagues93 used a technique called activity-based protein profiling (ABPP) that detects enzymes only

in their active states. Specific active site-directed probes that covalently labeled serine hydrolases, a

large enzyme superfamily that comprises approximately 1% of all proteins in the human proteome,

allowed detection of activity in different subcellular locations and glycosylation states in various cancer

cell lines. The authors identified proteases, lipases and esterases whose regulation was specific to tissue

of origin, including breast cancer. The most invasive cell lines, as demonstrated by matrigel assay, showed

downregulation of these enzyme activities while a different set of secreted and membrane-associated

serine hydrolases showed activation, possibly representing new markers of tumor aggressiveness.

Conclusions

The progress achieved through molecular profiling tools has allowed us to reevaluate

concepts involved in breast tumor evolution, diagnosis, and treatment. DNA molecular profiles

have shown tumor progression is associated with accumulating genetic alterations and have

exposed DCIS as an advanced lesion; one model suggests specific genetic lesions in DCIS can

determine progression of invasive carcinomas; ie, that the differentiation status of the invasive

cancer recapitulates that of the in situ lesion. In breast tumors, when RNA expression was

compared to changes in the DNA, gene expression signatures were most often related to

increases in DNA copy number. Furthermore, single mutations or events are probably not

entirely culpable for carcinogenesis since global DNA profiling shows that among multiple

breast cancers, a wide range of tumor genotypes (different chromosomal amplifications and

deletions) exist. RNA molecular profiles are not quite as diverse, and at least five different

expressed phenotypes exist, each with independent survival characteristics. This evidence

suggests breast cancer treatments may need to be tailored to different tumor biologies.

RNA expression profiles indicate breast cancers may arise from progenitor cells that

occur along basal or luminal differentiation pathways, with basal-like tumors associated with a

worse prognosis. BRCA1 breast cancers exclusively carry a basal-like expression signature that is

easily identified using molecular profiling. The profiling also takes into account BRCA1

methylation, which is not measured by mutation analysis. Importantly, expression profiling also

shows that a tumor’s ability to metastasize may not be reliably measured by lymph node

metastasis or size. This is in contradistinction to hormone receptor status and grade, which play

greater roles in distinguishing expression phenotypes.

Promising proteomic studies have utilized nipple aspirate fluid and serum to identify

several breast cancer biomarkers. These non-invasive approaches are being tested in clinical

trials. Functional proteomics, a new field that measures protein activity within tumor specimens,

may identify biomarkers and therapeutic targets not discoverable by other techniques.

Despite the clear impact molecular profiling has had on our understanding of

breast cancer, there is still a great deal of work ahead. It is important to note that nucleotide

mutations in many key genes associated with breast cancer (eg TP53) are not distinguished using

the global DNA, RNA, or protein molecular profiling methodologies discussed here, but are

being studied using other techniques such as single nucleotide polymorphism (SNP) arrays. SNP

arrays may also augment our understanding of the effects of chromosomal vs. nucleotide

instability on tumor evolution and progression. Furthermore, other areas that may strongly

impact breast cancer biology, such as racial/ethnic differences or stromal-epithelial cell

interactions, are now being explored.

RNA expression profiling currently holds the most translational promise in breast cancer,

but may ultimately be superseded by proteomic techniques. Expression profiling appears to predict

clinical outcome and response to systemic therapy better than classical staging criteria in initial

studies. Recruitment for large prospective clinical trials to better assess molecular prognostic and

predictive gene lists is now underway. It is anticipated that as new global profiling technologies

are applied to clinical care, breast cancer diagnosis and care will be more precise and

individualized than current methods and will lead to the development of novel tumor-specific

therapeutics.

Figure Legends

Figure 1. A general illustration of a cDNA microarray protocol. A cDNA microarray can be

used to determine either gene expression (RNA) or gene copy number (DNA) changes. After

purification, the tumor and reference samples are labeled with Cy5 and Cy3, respectively. The

mixture is hybridized to a microarray and scanned with two wavelengths to measure the relative

intensities of red and green fluorescence at each feature. The relative intensities of features can

be compared among tumors to identify changes in expression associated with a tumor subtype.

Reproduced with permission from the American Society for Pharmacology and Experimental

Therapeutics (ASPET) 94

Figure 2. Gene expression patterns of 85 breast samples. Seventy-eight carcinomas, three

benign tumors, and four normal breast tissues cluster into 5 subtypes: Luminal A (ER positive,

favorable survival); Luminal B (ER positive, poor survival); Normal breast-like; ERBB2

(HER2/Neu) amplicon; Basal epithelial-like cluster. (A) Tumor clusters are represented by the

branched dendrogram at the top of the figure, which indicates the degree of similarity between samples.

Genes are clustered by rows with genes that are expressed most similarly clustered together. Red

indicates high relative gene expression compared to reference; green indicates more expression

in reference RNA than in tumor sample (low relative expression). Representative gene clusters

expressed by the five tumor subtypes above are shown: (B) the ERBB2 amplicon cluster; (C)

genes coexpressed by the Luminal B tumors and the basal and ERBB2 tumors; (D) basal

epithelial cluster containing keratins 5, 17; (E) normal breast-like cluster; and (F) Luminal A

cluster containing ER-associated genes with lower relative expression of these genes by the

Luminal B tumors.

Permission requested from the Proceedings of the National Academy of Sciences 42

1. Carlson RW, Edge SB, Theriault RL. NCCN: Breast cancer. Cancer Control 2001;8(6 Suppl 2):54-61.

2. Kallioniemi A, Kallioniemi OP, Sudar D, et al. Comparative genomic hybridization for molecular cytogenetic analysis of solid tumors. Science 1992;258(5083):818-21.

3. Solinas-Toldo S, Lampel S, Stilgenbauer S, et al. Matrix-based comparative genomic hybridization: biochips to screen for genomic imbalances. Genes Chromosomes Cancer 1997;20(4):399-407.

4. Pinkel D, Segraves R, Sudar D, et al. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat Genet 1998;20(2):207-11.

5. Pollack JR, Perou CM, Alizadeh AA, et al. Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nat Genet 1999;23(1):41-6.

6. Albertson DG. Profiling breast cancer by array CGH. Breast Cancer Res Treat 2003;78(3):289-98.

7. Kallioniemi A, Kallioniemi OP, Piper J, et al. Detection and mapping of amplified DNA sequences in breast cancer by comparative genomic hybridization. Proc Natl Acad Sci U S A 1994;91(6):2156-60.

8. Buerger H, Otterbach F, Simon R, et al. Comparative genomic hybridization of ductal carcinoma in situ of the breast-evidence of multiple genetic pathways. J Pathol 1999;187(4):396-402.

9. Gunther K, Merkelbach-Bruse S, Amo-Takyi BK, et al. Differences in genetic alterations between primary lobular and ductal breast cancers detected by comparative genomic hybridization. J Pathol 2001;193(1):40-7.

10. Forozan F, Mahlamaki EH, Monni O, et al. Comparative genomic hybridization analysis of 38 breast cancer cell lines: a basis for interpreting complementary DNA microarray data. Cancer Res 2000;60(16):4519-25.

11. Jain AN, Chin K, Borresen-Dale AL, et al. Quantitative analysis of chromosomal CGH in human breast tumors associates copy number abnormalities with p53 status and patient survival. Proc Natl Acad Sci U S A 2001;98(14):7952-7.

12. Achuthan R, Bell SM, Roberts P, et al. Genetic events during the transformation of a tamoxifen-sensitive human breast cancer cell line into a drug-resistant clone. Cancer Genet Cytogenet 2001;130(2):166-72.

13. O'Connell P, Pekkel V, Fuqua SA, et al. Analysis of loss of heterozygosity in 399 premalignant breast lesions at 15 genetic loci. J Natl Cancer Inst 1998;90(9):697-703.

14. Gong G, DeVries S, Chew KL, et al. Genetic changes in paired atypical and usual ductal hyperplasia of the breast by comparative genomic hybridization. Clin Cancer Res 2001;7(8):2410-4.

15. Jones C, Merrett S, Thomas VA, et al. Comparative genomic hybridization analysis of bilateral hyperplasia of usual type of the breast. J Pathol 2003;199(2):152-6.

16. Lakhani SR. The transition from hyperplasia to invasive carcinoma of the breast. J Pathol 1999;187(3):272-8.

17. Tirkkonen M, Tanner M, Karhu R, et al. Molecular cytogenetics of primary breast cancer by CGH. Genes Chromosomes Cancer 1998;21(3):177-84.

18. Boecker W, Moll R, Dervan P, et al. Usual ductal hyperplasia of the breast is a committed stem (progenitor) cell lesion distinct from atypical ductal hyperplasia and ductal carcinoma in situ. J Pathol 2002;198(4):458-67.

19. Korsching E, Packeisen J, Agelopoulos K, et al. Cytogenetic alterations and cytokeratin expression patterns in breast cancer: integrating a new model of breast differentiation into cytogenetic pathways of breast carcinogenesis. Lab Invest 2002;82(11):1525-33.

20. Buerger H, Mommers EC, Littmann R, et al. Ductal invasive G2 and G3 carcinomas of the breast are the end stages of at least two different lines of genetic evolution. J Pathol 2001;194(2):165-70.

21. Albertson DG, Ylstra B, Segraves R, et al. Quantitative mapping of amplicon structure by array CGH identifies CYP24 as a candidate oncogene. Nat Genet 2000;25(2):144-6.

22. Pollack JR, Sorlie T, Perou CM, et al. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci U S A 2002;99(20):12963-8.

23. Hyman E, Kauraniemi P, Hautaniemi S, et al. Impact of DNA amplification on gene expression patterns in breast cancer. Cancer Res 2002;62(21):6240-5.

24. Adams MD, Dubnick M, Kerlavage AR, et al. Sequence identification of 2,375 human brain genes. Nature 1992;355(6361):632-4.

25. Adams MD, Kelley JM, Gocayne JD, et al. Complementary DNA sequencing: expressed sequence tags and human genome project. Science 1991;252(5013):1651-6.

26. Bonaldo MF, Lennon G, Soares MB. Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res 1996;6(9):791-806.

27. Diatchenko L, Lau YF, Campbell AP, et al. Suppression subtractive hybridization: a method for generating differentially regulated or tissue-specific cDNA probes and libraries. Proc Natl Acad Sci U S A 1996;93(12):6025-30.

28. Velculescu VE, Zhang L, Vogelstein B, et al. Serial analysis of gene expression. Science 1995;270(5235):484-7.

29. Liang P, Pardee AB. Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science 1992;257(5072):967-71.

30. Schena M, Shalon D, Davis RW, et al. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995;270(5235):467-70.

31. Lockhart DJ, Dong H, Byrne MC, et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol 1996;14(13):1675-80.

32. Fodor SP, Read JL, Pirrung MC, et al. Light-directed, spatially addressable parallel chemical synthesis. Science 1991;251(4995):767-73.

33. Hughes TR, Mao M, Jones AR, et al. Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotechnol 2001;19(4):342-7.

34. Zhao H, Hastie T, Whitfield ML, et al. Optimization and evaluation of T7 based RNA linear amplification protocols for cDNA microarray analysis. BMC Genomics 2002;3(1):31.

35. Eisen MB, Spellman PT, Brown PO, et al. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 1998;95(25):14863-8.

36. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 2001;98(9):5116-21.

37. Tibshirani R, Hastie T, Narasimhan B, et al. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci U S A 2002;99(10):6567-72.

38. Slonim DK. From patterns to pathways: gene expression data analysis comes of age. Nat Genet 2002;32 Suppl:502-8.

39. Perou CM, Jeffrey SS, van de Rijn M, et al. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc Natl Acad Sci U S A 1999;96(16):9212-7.

40. Sgroi DC, Teng S, Robinson G, et al. In vivo gene expression profile analysis of human breast cancer progression. Cancer Res 1999;59(22):5656-61.

41. Perou CM, Sorlie T, Eisen MB, et al. Molecular portraits of human breast tumours. Nature 2000;406(6797):747-52.

42. Sorlie T, Perou CM, Tibshirani R, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 2001;98(19):10869-74.

43. Sorlie T, Tibshirani R, Parker J, et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A 2003;100(14):8418-23.

44. Gruvberger S, Ringner M, Chen Y, et al. Estrogen receptor status in breast cancer is associated with remarkably distinct gene expression patterns. Cancer Res 2001;61(16):5979-84.

45. West M, Blanchette C, Dressman H, et al. Predicting the clinical status of human breast cancer by using gene expression profiles. Proc Natl Acad Sci U S A 2001;98(20):11462-7.

46. van 't Veer LJ, Dai H, van de Vijver MJ, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002;415(6871):530-6.

47. Pusztai L, Ayers M, Stec J, et al. Gene expression profiles obtained from fine-needle aspirations of breast cancer reliably identify routine prognostic markers and reveal large-scale molecular differences between estrogen-negative and estrogen-positive tumors. Clin Cancer Res 2003;9(7):2406-15.

48. Sotiriou C, Neo SY, McShane LM, et al. Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc Natl Acad Sci U S A 2003.

49. Hedenfalk I, Duggan D, Chen Y, et al. Gene-expression profiles in hereditary breast cancer. N Engl J Med 2001;344(8):539-48.

50. Lakhani SR, Gusterson BA, Jacquemier J, et al. The pathology of familial breast cancer: histological features of cancers in families not attributable to mutations in BRCA1 or BRCA2. Clin Cancer Res 2000;6(3):782-9.

51. Olopade OI, Grushko T. Gene-expression profiles in hereditary breast cancer. N Engl J Med 2001;344(26):2028-9.

52. Grushko TA, Blackwood MA, Schumm PL, et al. Molecular-cytogenetic analysis of HER-2/neu gene in BRCA1-associated breast cancers. Cancer Res 2002;62(5):1481-8.

53. King MC, Wieand S, Hale K, et al. Tamoxifen and breast cancer incidence among women with inherited mutations in BRCA1 and BRCA2: National Surgical Adjuvant Breast and Bowel Project (NSABP-P1) Breast Cancer Prevention Trial. JAMA 2001;286(18):2251-6.

54. Duffy SW, Nixon RM. Estimates of the likely prophylactic effect of tamoxifen in women with high risk BRCA1 and BRCA2 mutations. Br J Cancer 2002;86(2):218-21.

55. Hedenfalk I, Ringner M, Ben-Dor A, et al. Molecular classification of familial non-BRCA1/BRCA2 breast cancer. Proc Natl Acad Sci U S A 2003;100(5):2532-7.

56. Emmert-Buck MR, Bonner RF, Smith PD, et al. Laser capture microdissection. Science1996;274(5289):998-1001.

57. Ma XJ, Salunga R, Tuggle JT, et al. Gene expression profiles of human breast cancerprogression. Proc Natl Acad Sci U S A 2003;100(10):5974-9.

58. Warnberg F, Casalini P, Nordgren H, et al. Ductal carcinoma in situ of the breast: a newphenotype classification system and its relation to prognosis. Breast Cancer Res Treat2002;73(3):215-21.

59. Buerger H, Otterbach F, Simon R, et al. Different genetic pathways in the evolution ofinvasive breast cancer are associated with distinct morphological subtypes. J Pathol1999;189(4):521-6.

60. Buchholz TA, Stivers DN, Stec J, et al. Global gene expression changes during neoadjuvantchemotherapy for human breast cancer. Cancer J 2002;8(6):461-8.

61. Sotiriou C, Powles TJ, Dowsett M, et al. Gene expression profiles derived from fine needleaspiration correlate with response to systemic chemotherapy in breast cancer. BreastCancer Res 2002;4(3):R3.

62. Symmans WF, Ayers M, Clark EA, et al. Total RNA yield and microarray gene expressionprofiles from fine-needle aspiration biopsy and core-needle biopsy samples of breastcarcinoma. Cancer 2003;97(12):2960-71.

63. Chang JC, Wooten EC, Tsimelzon A, et al. Gene expression profiling for the prediction oftherapeutic response to docetaxel in patients with breast cancer. Lancet2003;362(9381):362-9.

64. National Institutes of Health Consensus Development Conference statement: adjuvanttherapy for breast cancer, November 1-3, 2000. J Natl Cancer Inst Monogr 2001(30):5-15.

65. Polychemotherapy for early breast cancer: an overview of the randomised trials. Early BreastCancer Trialists' Collaborative Group. Lancet 1998;352(9132):930-42.

66. van de Vijver MJ, He YD, van't Veer LJ, et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 2002;347(25):1999-2009.

67. Poste G, Fidler IJ. The pathogenesis of cancer metastasis. Nature 1980;283(5743):139-46.

68. Ramaswamy S, Ross KN, Lander ES, et al. A molecular signature of metastasis in primary solid tumors. Nat Genet 2003;33(1):49-54.

69. Goldhirsch A, Glick JH, Gelber RD, et al. Meeting highlights: International Consensus Panel on the Treatment of Primary Breast Cancer. J Natl Cancer Inst 1998;90(21):1601-8.

70. Eifel P, Axelson JA, Costa J, et al. National Institutes of Health Consensus Development Conference Statement: adjuvant therapy for breast cancer, November 1-3, 2000. J Natl Cancer Inst 2001;93(13):979-89.

71. Kononen J, Bubendorf L, Kallioniemi A, et al. Tissue microarrays for high-throughput molecular profiling of tumor specimens. Nat Med 1998;4(7):844-7.

72. Heid CA, Stevens J, Livak KJ, et al. Real time quantitative PCR. Genome Res 1996;6(10):986-94.

73. Pandey A, Mann M. Proteomics to study genes and genomes. Nature 2000;405(6788):837-46.

74. Dwek MV, Ross HA, Leathem AJ. Proteome and glycosylation mapping identifies post-translational modifications associated with aggressive breast cancer. Proteomics 2001;1(6):756-62.

75. Gharbi S, Gaffney P, Yang A, et al. Evaluation of two-dimensional differential gel electrophoresis for proteomic expression analysis of a model breast cancer cell system. Mol Cell Proteomics 2002;1(2):91-8.

76. Bergman AC, Benjamin T, Alaiya A, et al. Identification of gel-separated tumor marker proteins by mass spectrometry. Electrophoresis 2000;21(3):679-86.

77. Page MJ, Amess B, Townsend RR, et al. Proteomic definition of normal human luminal and myoepithelial breast cells purified from reduction mammoplasties. Proc Natl Acad Sci U S A 1999;96(22):12589-94.

78. Wulfkuhle JD, McLean KC, Paweletz CP, et al. New approaches to proteomic analysis of breast cancer. Proteomics 2001;1(10):1205-15.

79. Xu BJ, Caprioli RM, Sanders ME, et al. Direct analysis of laser capture microdissected cells by MALDI mass spectrometry. J Am Soc Mass Spectrom 2002;13(11):1292-7.

80. Wulfkuhle JD, Sgroi DC, Krutzsch H, et al. Proteomics of human breast ductal carcinoma in situ. Cancer Res 2002;62(22):6740-9.

81. Varnum SM, Covington CC, Woodbury RL, et al. Proteomic characterization of nipple aspirate fluid: identification of potential biomarkers of breast cancer. Breast Cancer Res Treat 2003;80(1):87-97.

82. Paweletz CP, Trock B, Pennanen M, et al. Proteomic patterns of nipple aspirate fluids obtained by SELDI-TOF: potential for new biomarkers to aid in the diagnosis of breast cancer. Dis Markers 2001;17(4):301-7.

83. Kuerer HM, Goldknopf IL, Fritsche H, et al. Identification of distinct protein expression patterns in bilateral matched pair breast ductal fluid specimens from women with unilateral invasive breast carcinoma. High-throughput biomarker discovery. Cancer 2002;95(11):2276-82.

84. Sauter ER, Zhu W, Fan XJ, et al. Proteomic analysis of nipple aspirate fluid to detect biologic markers of breast cancer. Br J Cancer 2002;86(9):1440-3.

85. Li J, Zhang Z, Rosenzweig J, et al. Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer. Clin Chem 2002;48(8):1296-304.

86. Petricoin EF, Ardekani AM, Hitt BA, et al. Use of proteomic patterns in serum to identify ovarian cancer. Lancet 2002;359(9306):572-7.

87. Haab BB, Dunham MJ, Brown PO. Protein microarrays for highly parallel detection and quantitation of specific proteins and antibodies in complex solutions. Genome Biol 2001;2(2):RESEARCH0004.

88. MacBeath G. Protein microarrays and proteomics. Nat Genet 2002;32 Suppl:526-32.

89. Robinson WH, DiGennaro C, Hueber W, et al. Autoantigen microarrays for multiplex characterization of autoantibody responses. Nat Med 2002;8(3):295-301.

90. Woodbury RL, Varnum SM, Zangar RC. Elevated HGF levels in sera from breast cancer patients detected using a protein microarray ELISA. J Proteome Res 2002;1(3):233-7.

91. Adam PJ, Boyd R, Tyson KL, et al. Comprehensive Proteomic Analysis of Breast Cancer Cell Membranes Reveals Unique Proteins with Potential Roles in Clinical Cancer. J Biol Chem 2003;278(8):6482-9.

92. Le Naour F, Misek DE, Krause MC, et al. Proteomics-based identification of RS/DJ-1 as a novel circulating tumor antigen in breast cancer. Clin Cancer Res 2001;7(11):3328-35.

93. Jessani N, Liu Y, Humphrey M, et al. Enzyme activity profiles of the secreted and membrane proteome that depict cancer cell invasiveness. Proc Natl Acad Sci U S A 2002;99(16):10335-40.

94. Jeffrey SS, Fero MJ, Borresen-Dale A-L, et al. Expression Array Technology in the Diagnosis and Treatment of Breast Cancer. Mol Interv 2002;2(2):101-9.